Repeatable installs via hashing #3137
Conversation
We purposely keep it off the CLI for now. optparse isn't really geared to expose interspersed args and options, so a more heavy-handed approach will be necessary to support things like `pip install SomePackage --sha256=abcdef... OtherPackage --sha256=012345...`.
Add checks against requirements-file-dwelling hashes for most kinds of packages. Close pypa#1175.

* Add --require-hashes option. This is handy in deployment scripts to force application authors to hash their requirements. It is also a convenient way to get pip to show computed hashes for a virgin, unhashed requirements file. Eventually, additions to `pip freeze` should fill a superset of this use case.
* In --require-hashes mode, at least one hash is required to match for each requirement.
* Option-based requirements (--sha256=...) turn on --require-hashes mode implicitly.
* Internet-derived, URL-based hashes are "necessary but not sufficient": they do not satisfy --require-hashes mode when they match, but they are still used to guard against transmission errors.
* Other URL-based requirements (#md5=...) are treated just like flag-based ones, except they don't turn on --require-hashes.
* Complain informatively, with the most devastating errors first, so you don't chase your tail all day only to run up against a brick wall at the end. This also means we don't complain that a hash is missing, only for the user to find, after fixing it, that we have no idea how to even compute a hash for that type of requirement.
* Complain about unpinned requirements when hash-checking mode is on, lest they cause the user surprise later.
* Complain about missing hashes.
* Complain about requirement types we don't know how to hash (like VCS ones and local dirs).
* Have InstallRequirement keep its original Link around (original_link) so we can differentiate between URL hashes from requirements files and ones downloaded from the (untrustworthy) internet.
* Remove test_download_hashes, which is obsolete. Similar coverage is provided in test_utils.TestHashes and the various hash cases in test_req.py.
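As an illustration of the option-based syntax this commit describes (package names and digests below are placeholders, not real values), a requirements file in this mode might look like:

```text
# requirements.txt — hypothetical example; digests are placeholders
SomePackage==1.4.2 --sha256=6fb54f7a...    # pinned and hashed: installs normally
OtherPackage --sha256=deadbeef...          # unpinned: rejected in --require-hashes mode
```

The presence of any --sha256 option turns on --require-hashes mode implicitly, so the unpinned second line would be reported as an error.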
Everybody seems to favor this. Spelled -H, it's still pretty short. And it is less unusual programmatically.
Hmm, those unicode encoding test failures don't occur locally. Curious. [They do now!]
Didn't review this, but I think it shouldn't imply
Oh, in addition, this is obviously not going to protect against
Another thought: Would it make sense to do
Well, conforming to GNU conventions, it'd be
Yea, that's a bit weird and it's probably not a major use case. I'm fine saying it's not worth it.
@@ -523,6 +524,47 @@ def only_binary():
    )


def _good_hashes():
    """Return names of hashlib algorithms at least as strong as sha256."""
    # Remove getattr when 2.6 dies.
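A standalone sketch of the idea behind `_good_hashes()` above: keep only the hashlib algorithms considered at least as strong as sha256. The getattr guard mirrors the 2.6 workaround in the diff, since `hashlib.algorithms` appeared only in Python 2.7; names here are illustrative, not pip's actual implementation.

```python
import hashlib

# Algorithms treated as strong enough for hash-checking mode.
STRONG = {'sha256', 'sha384', 'sha512'}

def good_hashes():
    """Return available hashlib algorithms at least as strong as sha256."""
    # getattr guards against interpreters lacking the attribute,
    # as the "Remove getattr when 2.6 dies" comment hints.
    available = set(getattr(hashlib, 'algorithms_available', STRONG))
    return sorted(STRONG & available)

print(good_hashes())  # ['sha256', 'sha384', 'sha512'] on any modern Python
```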
I don't think 2.7 has this before like, 2.7.9 so probably we can't get rid of it until we get rid of 2.7.
https://docs.python.org/3/whatsnew/2.7.html suggests it was added in 2.7.0; it lists things added in 2.7.x separately.
Oh, I'm stupid; I was confusing it with guaranteed_hashes.
A few comments; will give it a more in-depth review once I wake up more.
dstufft is nervous about blowing a single-char option on something that will usually be copied and pasted anyway. We can always put it back later if it proves to be a pain.
For dependencies that are properly pinned and hashed (not really dependencies at all, if you like, since they're explicit, root-level requirements), we install them as normal. For ones that are not pinned and hashed, we raise the errors typical of any unhashed requirement in --require-hashes mode. Since the stanza under "if not ignore_dependencies" doesn't actually add anything if it's already in the RequirementSet, not much has to be done in the way of code: the unhashed deps don't have any hashes, so we complain about them as usual. Also...

* Revise wording of HashUnpinned errors. They can be raised even if no hash is specified, so the previous wording was misleading.
* Make wording of HashMissing less awkward.
Previously, Hash Verification, Editable Installs, Controlling setup_requires, and Build System Interface were all getting placed under it.
Those commands already checked hashes, since they use RequirementSet, where the hash-checking is done. Reorder some options so pre, no-clean, and require-hashes are always in the same order.
    For example, some-package==1.2 is pinned; some-package>1.2 is not.
    """
    specifiers = self.specifier
    return len(specifiers) == 1 and next(iter(specifiers)).operator == '=='
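A hypothetical standalone version of the check above, working on a plain specifier string rather than pip's SpecifierSet objects (a simplification for illustration): a requirement is pinned when it has exactly one specifier and that specifier's operator is `==`.

```python
def is_pinned(specifier_string):
    # Split a comma-separated specifier string like '>=1.2,<2.0'.
    specifiers = [s.strip() for s in specifier_string.split(',') if s.strip()]
    # Exactly one specifier, and its operator is '==' (not '===',
    # '>=', '>', '~=', etc.).
    return (len(specifiers) == 1
            and specifiers[0].startswith('==')
            and not specifiers[0].startswith('==='))

print(is_pinned('==1.2'))       # True
print(is_pinned('>1.2'))        # False
print(is_pinned('>=1.2,<2.0'))  # False
```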
you can also pin with '==='
That's a new one to me. What's that mean?
It's a little bit odd here, because with PEP 440, `==` doesn't actually guarantee you'll always get the exact same thing. Local versions are not considered when checking `==`, so if someone uploads `1.0+mylocalversion`, that will match `==1.0`. However, PyPI does not allow uploads of local versions, so it might be a moot point.

The `===` is a way to essentially escape PEP 440's strict parsing requirements for legacy versions that can't be successfully parsed as a PEP 440 version. It also provides a way to escape the `==` behavior around local versions.
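The local-version behavior described above can be demonstrated with the third-party `packaging` library (the same specifier implementation pip vendors); this is a sketch, not code from the PR:

```python
from packaging.specifiers import SpecifierSet

exact = SpecifierSet("==1.0")      # PEP 440 version matching
arbitrary = SpecifierSet("===1.0") # arbitrary (string) equality

# A local version like 1.0+mylocalversion still satisfies ==1.0 ...
print(exact.contains("1.0+mylocalversion"))      # True
# ... but === compares the version string itself, so it does not match.
print(arbitrary.contains("1.0+mylocalversion"))  # False
```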
Hashes from PyPI
~~~~~~~~~~~~~~~~

PyPI provides an md5 hash in the fragment portion of each package download
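Reading that fragment can be sketched with the standard library alone (the URL below is illustrative, not a real package link):

```python
from urllib.parse import urlparse

def fragment_hash(url):
    """Return (hash_name, hex_digest) from a '#md5=...' style URL
    fragment, or None if the URL carries no such fragment."""
    name, sep, value = urlparse(url).fragment.partition('=')
    return (name, value) if sep else None

print(fragment_hash(
    'https://pypi.python.org/packages/source/f/foo/foo-1.0.tar.gz'
    '#md5=d41d8cd98f00b204e9800998ecf8427e'))
# ('md5', 'd41d8cd98f00b204e9800998ecf8427e')
```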
This is not documenting the ability to check other hash algos, as the previous docs did.
to be clear, your intention would be for this to mean "trust my cache", or "use my cache, but check its hashes"?
"Use my cache, but check its hashes." IOW, "I have put hashes of the locally built wheels into my reqs file. If there is a locally built wheel, check it and use it; don't avoid it." In light of the Horrible Truth, though, this option isn't such a good idea as I once thought, since those hashes turn out not to be very durable. I'll think about that some more.
here's a user story:
the last point is where I'm not sure. I'm thinking what people will want is just a flag to trust their cache (and not have to maintain their own requirements file), knowing that later stages of the pipeline won't be so trusting... but ultimately, that's a debate for a later PR. Although I'm still mulling this over a bit, I think I'm settled on the wheel cache being turned off in this PR for this feature.
Removed the mention of "package index options" in the docs, because they don't all fit that category anymore. Not even --no-binary and --only-binary do; they're "install options".
It depends on what the "platform-neutral" version is. Is it a pure-Python wheel? Then the wheel cache doesn't even come into play. The wheel is fetched into the HTTP cache, it's installed from there every time thereafter, and everybody's happy. If it's an sdist, then the wheel cache comes into play. But in the wake of the Horrible Truth, I'm leaning toward trashing
…f C compiler nondeterminism.
in this story, it was sdist and truly platform-neutral wheels
roger
+1 for merging
:-D
@dstufft What do you think? Ready to merge?
yea, I'd like to merge before this gets too stale.... cc @pfmoore @dstufft @rbtcollins @xavfernandez @Ivoz
I'll go over it again in a little bit.
I don't have the time to do a code review (nor is the subject something I'm particularly expert in anyway), but the discussion here seems thorough and I agree with the way things appear to have turned out, so it's OK with me.
Will review shortly. This week is OpenStack Summit, so it may not be this week - sorry.
Just did another pass over the code and still +1 on merging this. Agreed about using a later PR to add support for locally cached wheel files.
This looks good to me. It has a merge conflict; once it's resolved I can merge (or I'll resolve it in a bit).
And there was much rejoicing! Thanks for the merge! :-D
woohoo!
Add checks against requirements-file-dwelling hashes for most kinds of packages. Close #1175.

This lets you add --hash=sha256:abcde… to a line in a requirements file to verify a package against a good local hash, guaranteeing repeatable installs without needing to run your own index mirrors, in the vein of https://pypi.python.org/pypi/peep/.

* `pip freeze` should fill a superset of this use case.
* `pip hash` command for hashing package archives.
* `pip download` and `pip wheel`.

For a later release:

* `pip freeze`
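The verification step this PR performs can be sketched with hashlib: hash the downloaded archive and compare it against the digest pinned in the requirements file. This is an illustration of the principle, not pip's actual implementation; archive_bytes stands in for a real sdist or wheel.

```python
import hashlib

archive_bytes = b'example archive contents'
# In practice this digest comes from a --hash=sha256:... requirements line.
pinned_digest = hashlib.sha256(archive_bytes).hexdigest()

def matches(data, algorithm, hex_digest):
    """Return True if data hashes to hex_digest under the named algorithm."""
    return hashlib.new(algorithm, data).hexdigest() == hex_digest

print(matches(archive_bytes, 'sha256', pinned_digest))         # True
print(matches(b'tampered contents', 'sha256', pinned_digest))  # False
```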