-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[fast-deps] Parallelize wheel download #8771
Conversation
46b364c
to
83ccf72
Compare
2a76451
to
846ec17
Compare
Sorry for being lazy, but one question without reading the implement: Does this single progress bar implementation only apply to |
It only applies to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
cdc6c38 provides I suppose that for this to be merged, we only need to consider the following points:
For convenience purposes, I've made a small write-up on the behavior of pip with this patch. |
Gentil ping! Also I hesitate to mark this as something that solves GH-825: should I? |
Gentle ping! Off-topic: Wow it has been 23 days from last ping, time flies like crazy since I got back to school 😞 |
Pong. /cc @cosmicexplorer I guess updating this PR is a good idea. :P |
Nah, let it be for now -- we'll make a big-ish celebratory post once all the little things are wrapped up nicely. :) |
Hmmm the conflict was introduced 9 days ago, why didn't brown truck tell me anything? BTW do you have 2FA enabled and if so how do you authenticate pushes? I'm putting a token in
Woo-hoo 😄 |
SSH keys - using the ssh variant of the git URLs and adding my public key to the GitHub account. |
Thanks, that's much better! |
Would you mind taking a look at this, @uranusjr? I'd be really happy to get this to land in 20.3! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is extremely cool!!! 🔥 🔥 🔥 I am currently making others excited about this!!
@@ -125,12 +133,24 @@ def _get_http_response_filename(resp, link): | |||
return filename | |||
|
|||
|
|||
def _http_get_download(session, link): | |||
# type: (PipSession, Link) -> Response | |||
def _http_get_download(session, link, head=False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The result of HEAD and GET seem so different that I would wonder if instead we would want to make this method's error handling into a @contextmanager
. This is also driven by the thought that we might want to extract link.url.split('#', 1)[0]
into its own named method, because it's not immediately clear what it's doing if you don't know or remember that pypi urls are structured this way.
To avoid breaking other callers, we could keep _http_get_download()
undisturbed, making use of the new url splitting method and wrapping the session.get(...)
call in the new contextmanager. That is just what first came to my mind when thinking of how to reuse the error-handling mechanisms in this method -- not a blocking review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand the utility of factoring this away into a method though, so def _http_head(session, link):
, which wraps its session.head(...)
in the error-handling contextmanager could perhaps be a way to keep the _FileToDownload
object clean. Just suggestions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review!
we might want to extract
link.url.split('#', 1)[0]
into its own named method, because it's not immediately clear what it's doing if you don't know or remember that pypi urls are structured this way.
I also think that this is a good idea—the 1/0 numbers looks really cryptic. I'll add an reference to PEP 503 to clarify it for future readers. Edit: many spots in the codebase uses this same expression. I'd prefer to do this in a follow-up since it does not exactly fit in the scope of this PR.
Regarding splitting a separate method for HEAD, however, I don't feel that it would make the callers' code any cleaner, i.e. this level of abstraction feel just right for me, although it's just personal taste. I'll reword s/head=/just_head=/ to hopefully makes it make more sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding head
, it won’t make the caller’s code cleaner, but is still a good idea since having a boolean flag switching between behaviours is a smell in general. it is better to extract the session.method(...)
call out of this function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cosmicexplorer and @uranusjr, does 307c4ae make it look better to you (sorry for the delay though)?
BatchDownloader will need to handle the progress bars by itself. In addition, this commit changes the minimum response size to have a progress bar from 40000 bytes to 4 times chunk size.
By using head requests for information about the file, we can avoid opening too many connections at the same time and making urllib3 connection pool full.
Unfortunately HEAD responses does not carry caching information, so all files will be assumed to be downloaded from the index. Hence, the speed report might be inaccurate if there are cached files.
How difficult would it be to have some tests around this? pip does not currently have a lot of tests around downloading packages via HTTP (since it currently just uses requests, which is quite mature, and only in very simple ways), which this would significantly change. |
Hello! I am an automated bot and I have noticed that this pull request is not currently able to be merged. If you are able to either merge the |
Closing as this is significantly out of date, and has merge conflicts. Please do feel free to resubmit an updated PR for this! ^>^ |
Example output: