[WIP] internal wheel cache #1572
Nice. |
I've not had a chance to try this out yet, but the UI looks good to me. I guess Agreed that this doesn't make |
I still need to come back to this, and work on a few things: logging, tests etc... just been busy. My high level concerns were:
|
So what exactly is more awesome about this than the existing |
Download Cache == you rebuild lxml every time you pip install it, Wheel Cache == you build it once and install from wheel from there on out. |
and supposing theoretically, everything you needed was already a wheel on pypi, this is still faster than it has been a frequent complaint, that |
# TODO: quiet the failure logging from the finder
cache_finder = PackageFinder(find_links=[self.wheel_cache_dir], index_urls=[])
try:
    url = cache_finder.find_requirement(req, upgrade=upgrade)
upgrades and non-versioned reqs need to go through the normal finder
well, I guess it's debatable whether a non-versioned req should just use what's in the cache, or check for the latest. maybe just use cache, unless `--upgrade` is set. that matches the way #878 was written up.
I'm not sure I follow this. If I do `pip install foo` and foo 1.2 is the latest version on PyPI (sdist only), I'd expect to build a wheel in the cache and install it. If I later did the same install to a different venv, I'd expect to check PyPI, find there's no newer version, and use the wheel from cache (even if the project had added a 1.2 wheel to PyPI in the meantime). But if there is a newer version, why wouldn't I use that?
Your comment "it's debatable whether a non-versioned req should just use what's in the cache, or check for the latest" implies that you're thinking that `pip install foo` will get a 1.2 wheel from the cache even if 1.3 has been released on PyPI. I don't think that's right.
Looking at #878 I see that this may well be what you mean. Maybe there's a use case for it, but I do not think it's the right default. A "cache" should speed things up but not change behaviour.
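A minimal sketch of the behaviour argued for in this comment, under stated assumptions: `pick_source`, `parse_version`, and the dict of cached wheels are hypothetical names for illustration, not pip's API. The index decides which version gets installed; the cache only decides whether that version has to be built again. The `no_index` branch approximates the `--no-index` case raised in the reply.

```python
from typing import Dict, List, Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Very simplified version parsing: '1.10' -> (1, 10). Illustration only."""
    return tuple(int(part) for part in text.split("."))


def pick_source(
    index_versions: List[str],
    cached_wheels: Dict[str, str],
    no_index: bool = False,
) -> Tuple[str, Optional[str]]:
    """Return (version to install, cached wheel path or None).

    Hypothetical helper, not pip code.
    """
    if no_index:
        # Index is skipped on purpose: the newest cached wheel wins.
        if not cached_wheels:
            raise LookupError("nothing in the wheel cache and the index is disabled")
        best = max(cached_wheels, key=parse_version)
        return best, cached_wheels[best]

    # Default: the index picks the version, exactly as it would without a cache.
    latest = max(index_versions, key=parse_version)
    # The cache is only a shortcut for that exact version.
    return latest, cached_wheels.get(latest)
```

With foo 1.2 cached and 1.3 on the index, this returns `("1.3", None)`, so a new build is needed and the cached 1.2 wheel is not silently reused.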
noted. I think I agree with you. just trying to cover all the angles. just realize that if foo is installed, `pip install foo` doesn't check for the latest version, so that's what I was mulling over.
`pip install --wheel-cache --no-index foo` is probably the answer for re-using the cache without an expensive pypi check, or maybe a more explicit `--use-cache` is needed
and moreover, the idea is that eventually pip will only know how to install from wheel. this is the opt-in step towards that. |
... which makes it even more important that enabling the cache doesn't result in changed behaviour. |
Just a note here, Stefan Krah and Marc-Andre Lemburg have commented in a thread on distutils-sig (the relevant portion of the thread starts at around https://mail.python.org/pipermail/distutils-sig/2014-May/024251.html) that this may be an issue for some projects (that's heavily summarised, read the thread for details if it matters). In my view:
But we should probably announce this feature on distutils-sig (I'll write the email) before it goes in, and we should definitely ensure there's chance for a discussion there before it becomes the default behaviour (assuming it does at some point). |
I've found that although not many do this, only most of the bigger-and-extremely-well-used ones do this, like half of the numpy stack, and twisted :D |
note that one of the proposed options (from the description) was |
# 'pip install --wheel-cache'
elif (url.filename.endswith(wheel_ext)
        and self.wheel_cache_dir
        ):
Since wheel_cache_dir has a default value, self.wheel_cache should also be checked.
This also raises the question of what should happen with `pip install --wheel-cache --download-dir=some_place`.
Should the wheel go to the default wheel-cache-dir, to "some_place", or to both?
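A small sketch of the first point, assuming the flag and the directory are passed in as plain values. `should_cache_wheel` is a hypothetical helper, and `WHEEL_EXT` stands in for the `wheel_ext` name used in the diff; neither is taken from pip itself. Because the directory option has a default, a non-empty directory alone says nothing about whether the user opted in.

```python
WHEEL_EXT = ".whl"  # assumed value of `wheel_ext` in the diff


def should_cache_wheel(filename: str, wheel_cache: bool, wheel_cache_dir: str) -> bool:
    """Hypothetical check: cache only when the user opted in AND a directory is set."""
    return bool(wheel_cache and wheel_cache_dir and filename.endswith(WHEEL_EXT))
```

The sketch deliberately says nothing about `--download-dir`; where the wheel should land in that case is the open question above.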
closing this to signal that I don't have the time (or motivation) right now to take this from a POC to a ready-to-merge feature. good luck to whoever picks up this idea. |
:( |
I'm quite interested in implementing this feature. |
@qwcode the lack of this annoyed me yet again today, so I have some motivation to work on it (maybe not enough to complete it, but who knows? ;-)) What in your view remains to be done? You mention docs and tests in the original comment, and there will obviously be work around any bugs that may appear as tests get written and/or users play with it, but is there anything more that you're aware of? Also, what's the best way to get the current state of this work? Should I just pull the qwcode:wheel_cache branch and work from there? |
@qwcode Heh. The other key bit is probably "fix all the merge conflicts". Might take a bit longer than I was hoping :-) |
Hmm, personally (without reading the code at all), I think it would be smart to refactor pip so that pip always builds Wheels if the only thing that is available is a sdist, and then installs from that built Wheel. This has a few benefits:
And some downsides:
|
@dstufft btw, what you said would be smart, is exactly what the PR was trying to do. : ) as for everything not installing correctly, that was the intent of the |
@pfmoore one specific thing that needed work was the logging, but you would see that once you got into it. |
@dstufft @qwcode the problem with the smart approach is that you don't really get the benefits unless the wheel cache is required, rather than opt-in (if it's opt-in you need to retain the current code paths as well as the simplified one). I'd very much like to know how things ("half of the numpy stack, and twisted" were the ones @Ivoz mentioned) fail, because I think we should be working on addressing those failures independently of this PR. We'll never switch to a "via wheels only" mode until the failures are sorted, after all. For example, I thought the numpy issues were around architecture tags and things like that, which only really apply for distributed wheels, not for a cache (even if the tags aren't specific enough, the cache will never be used anywhere else anyway). |
No, you'd still get the benefits because we'd build a wheel regardless of whether the wheel cache is enabled or not. You always build wheels and install from that. The cache just controls whether we need to build a wheel every time someone does |
I also think that the wheel cache should be enabled by default, but we might want to wait a release or two to enable it by default so we don't get bad wheels cached that we want to purge since this is a fairly new/big change in steps. |
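A rough sketch of the "always build a wheel, the cache only avoids rebuilding" idea, under loud assumptions: `obtain_wheel` and the `build` callback are hypothetical, and the `name-version.whl` filename is a simplification (real wheel filenames carry Python, ABI and platform tags).

```python
import os
import tempfile
from typing import Callable, Optional


def obtain_wheel(
    name: str,
    version: str,
    build: Callable[[str], str],
    cache_dir: Optional[str],
) -> str:
    """Return the path of a wheel to install from.

    `build(dest_dir)` is assumed to build the wheel into `dest_dir` and
    return its path. Hypothetical helper, not pip code.
    """
    wheel_name = f"{name}-{version}.whl"  # simplified filename, no tags

    if cache_dir is None:
        # Cache disabled: still go via a wheel, just throw it away afterwards.
        return build(tempfile.mkdtemp())

    cached = os.path.join(cache_dir, wheel_name)
    if os.path.exists(cached):
        # Cache hit: no rebuild needed.
        return cached

    os.makedirs(cache_dir, exist_ok=True)
    return build(cache_dir)
```

Either way the install step only ever sees a wheel, which is what would let the sdist-specific install path be dropped later.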
@qwcode WELP, I should probably read PRs before I comment, but I just wanted to toss that out there :) |
@dstufft no worries |
Oh, I see. Yes, that's a good idea. I can't tell from reading the diff if that's what the code here does - I'll have to read it in context. Actually, build-via-wheel is likely the easiest bit (modulo issues with packages that don't play nicely with wheels) because it avoids all the complexities of managing/rebuilding/etc the cache. I'd be tempted to start by adding a |
I wouldn't bother with the We probably need to figure out how to handle Numpy and friends in that though. If those don't work with the new path we might want to add those to the list of exclusions inside the code, and raise a warning about it that in the future they'll need to exclude them. We could even go one better, depending on what the failure modes look like, we could attempt to build a Wheel, and if that fails fall back to the old way with a deprecation warning? |
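The fallback floated at the end of this comment could look roughly like the following. This is only a sketch: `install_via_wheel` and its three callbacks are hypothetical stand-ins, not functions from pip, and the warning text is invented for illustration.

```python
import warnings
from typing import Callable


def install_via_wheel(
    name: str,
    build_wheel: Callable[[str], str],
    install_wheel: Callable[[str], None],
    legacy_install: Callable[[str], None],
) -> str:
    """Try the wheel path first; fall back to the old path with a warning.

    Returns "wheel" or "legacy" so callers can see which path was taken.
    """
    try:
        wheel_path = build_wheel(name)
    except Exception as exc:  # broad on purpose: any build failure triggers the fallback
        warnings.warn(
            f"Could not build a wheel for {name} ({exc}); "
            "falling back to the legacy install path, which may go away.",
            DeprecationWarning,
            stacklevel=2,
        )
        legacy_install(name)
        return "legacy"

    install_wheel(wheel_path)
    return "wheel"
```

Whether a broad `except` is acceptable depends on what the real failure modes look like, which is exactly the open question for numpy and similar projects.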
Ugh, that code's ugly... I thought of just patching I can't even describe what (conceptually) an OK, rant over. I'll post this as a record should I get back to looking at this and need a reminder. But I doubt I'll be able to make much progress other than simply rebasing @qwcode's patch and seeing how we go from there. Time to give up on grand plans... |
Whatever is downloaded goes to the download-dir, and a wheel is put in the The download-dir may or may not gain a wheel, but the wheel cache always On May 21, 2014 2:07 PM, "xavfernandez" notifications@github.com wrote:
|
Yup, pip's code isn't simple... |
What was the conclusion here? I can't tell from the commentary. |
@bukzor Yes and we need a volunteer that is able to implement this feature. |
@thedrow Thanks. It seems like the design has changed since the OP. @dstufft Have you decided that the wheel-caching should be always-on, with a (soon deprecated) option to opt-out on a per-package basis? Is there a reason to use a new wheel-cache-dir rather than re-using the already extant wheel-dir? I don't see that the semantics are significantly different between the two. |
At any rate, we need a new, open ticket for this issue, since there's still It would be best if someone that knows the plan (better than I) would write
|
Love the idea. Hope this gets revived at some point. |
We'd still need egg_info because of dependency detection. |
here's a WIP implementation of an inline wheel build cache, that's opt-in.
here's the new `pip install` options:
much of the diff is shuffle. the new logic is primarily in 2 new functions:
`InstallRequirement.build`: this builds wheels into the cache
`RequirementSet.find_requirement`: this knows how to look in the cache first.
I still need to clean up some things, fix/add tests, update docs, but wanted to put the basic implementation out there for review.
Over email, I mentioned doing the cache in a general way for sdists as well, but atm, thinking that would just be more complex, and mostly pointless in the long run. (i.e. the `--package-cache` idea, #878, would go away)
A historical note: an early version of the wheel support had a `--build-wheel=DIR` option, but it wasn't a full cache solution. see the discussion in #684 as to how we ended up with `pip wheel`. btw, I think `pip wheel` still has a place, after we have wheel caching.