
Network operations for packages should be done in parallel #467

Closed
metajack opened this issue Aug 28, 2014 · 12 comments
Labels
A-git Area: anything dealing with git E-hard Experience: Hard

Comments

@metajack

Currently this is serialized and is quite slow when you have many dependencies.

Note that I expect initial fetches to be slow, but this is the case where the git repos are all unchanged. This is happening to me all the time because I'm constantly failing builds while getting Servo ported over.

@alexcrichton
Member

Now that a Cargo.lock is generated as soon as resolve is completed, I don't think that this is as much of an issue any more. I'm worried about updating git repos in parallel because they're almost always I/O bound and I'm not sure that you're going to get much higher throughput by doing it all in parallel.

I'm going to close this for now, but if it comes back as a pressing issue, then we can definitely reopen!

@metajack
Author

metajack commented Sep 8, 2014

I think you make a few assumptions there about it being I/O bound that may not hold.

  1. It is likely network bound, not disk bound.
  2. The network bottleneck is for a single source location, but the git repos may be synced from several sources.

The fact that browsers make N requests in parallel from the same domain leads me to believe that being network bound on a single socket != being network bound.

Point 2 won't help Servo, since everything is pointed at GitHub. But parallel sockets may still have higher throughput.

That said, we did this in serial before, so this is not really a pressing issue.
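The overlap being argued for here can be sketched in a few lines. This is not Cargo's code; `fetch_repo` is a hypothetical stand-in for a network fetch, and the repo names are made up. The point is only that per-fetch latency overlaps instead of accumulating serially:

```rust
use std::thread;
use std::time::Duration;

// Hypothetical stand-in for fetching one git dependency; in Cargo this
// would be a real network round-trip to the repo's remote.
fn fetch_repo(name: &str) -> String {
    thread::sleep(Duration::from_millis(10)); // simulate network latency
    format!("{}: up to date", name)
}

fn main() {
    let repos = vec!["rust-openssl", "html5ever", "hyper"];
    // One thread per repo: three 10ms fetches complete in roughly 10ms
    // of wall time rather than 30ms.
    let handles: Vec<_> = repos
        .into_iter()
        .map(|r| thread::spawn(move || fetch_repo(r)))
        .collect();
    for h in handles {
        println!("{}", h.join().unwrap());
    }
}
```

With a serial loop the total time is the sum of the latencies; with one socket per source it is roughly the maximum, which is the browser behavior metajack describes.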

@alexcrichton
Member

Reopening, I'd like to track this into the future.

@alexcrichton alexcrichton reopened this Oct 16, 2014
@alexcrichton alexcrichton added A-git Area: anything dealing with git E-hard Experience: Hard labels Oct 20, 2014
@SimonSapin
Contributor

I believe that browsers making N requests in parallel is more about latency (for opening a new connection or making a new request) than about throughput. However, git-clone does multiple things after fetching packfiles (bound by network throughput): it extracts and resolves them (bound by CPU or disk). So having multiple git-clones at different phases at the same time could help. Maybe.
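The phase overlap described above is a classic producer/consumer pipeline. A minimal sketch with standard-library channels (the packfile names and stage bodies are placeholders, not Cargo internals): a downloader thread feeds the network-bound stage while the receiving end does the CPU/disk-bound extraction, so one pack downloads while the previous one extracts.

```rust
use std::sync::mpsc;
use std::thread;

fn main() {
    let (tx, rx) = mpsc::channel();
    // Network-bound stage: pretend each send is a packfile coming off the wire.
    let downloader = thread::spawn(move || {
        for pack in ["pack-a", "pack-b", "pack-c"] {
            tx.send(pack.to_string()).unwrap();
        }
        // tx is dropped here, which ends the receiver's loop below.
    });
    // CPU/disk-bound stage: runs concurrently with the downloads above.
    for pack in rx {
        println!("extracting {}", pack);
    }
    downloader.join().unwrap();
}
```

Even with a single network connection, this structure hides extraction time behind download time, which is the "different phases at the same time" idea.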

@alexcrichton alexcrichton changed the title updating git repos should be done in parallel Network operations for packages should be done in parallel Jan 14, 2015
@alexcrichton
Member

Clarifying that I'd like to download packages in parallel as well. I'd basically like to have the ability to perform all of our network operations in parallel, even if we don't necessarily take advantage of it by default.

@almereyda

If multiple dependencies share the same origin and are served via HTTP/2, a parallelized network IO stack should be able to make use of multiplexing.

@ishitatsuyuki
Contributor

I'm planning to take this. Some insights:

The point of parallelizing (or multiplexing) is to reduce the impact of network latency. Multiple connections have their own pros and cons, but I'm planning to maintain only one connection per server, since multiplexing over a single connection is supported as mentioned above.

Another thing I'd like to do is to add file sizes to the crates.io-index, so that we know the total download size beforehand and can display a unified progress bar. I'm not sure if the migration can be done smoothly, though.
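The unified-progress-bar idea reduces to simple arithmetic once per-crate sizes are known up front. A sketch, with entirely made-up crate names and byte counts (the index does not carry these sizes in the scheme being discussed; that is the proposed change):

```rust
fn main() {
    // Hypothetical (name, size-in-bytes) pairs as they might come from an
    // index that records file sizes. The numbers are illustrative only.
    let sizes: &[(&str, u64)] = &[("serde", 75_000), ("libc", 60_000), ("log", 12_000)];

    // Knowing the total before the first byte arrives is what makes a
    // single overall progress bar possible.
    let total: u64 = sizes.iter().map(|&(_, s)| s).sum();

    let mut downloaded = 0u64;
    for &(name, size) in sizes {
        downloaded += size; // pretend this crate just finished downloading
        println!(
            "{}: {}/{} bytes ({}%)",
            name,
            downloaded,
            total,
            downloaded * 100 / total
        );
    }
}
```

Without sizes in the index, the total is unknown until every response's headers arrive, so only per-file progress can be shown.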

@ishitatsuyuki
Contributor

Close?

@SimonSapin
Contributor

Why?

@dwijnand
Member

Because of #6005? (That's the question; I'm not sure, myself.)

@SimonSapin
Contributor

Oh, nice! I haven’t tried it yet but yes, it seems this can be closed as fixed by #6005. (The back link shows up just above @ishitatsuyuki’s comment in the web view, but not in email notifications.)

@alexcrichton
Member

Ah yes, we can indeed close! There's also a call for testing on internals.
