Join GitHub today
GitHub is home to over 31 million developers working together to host and review code, manage projects, and build software together.Sign up
Use ZipFile.open instead of ZipFile.read #5848
To avoid huge memory usage in unusual situations (e.g. the TensorFlow wheel on a Raspberry Pi), use
referenced this pull request
Oct 17, 2018
This PR looks okay to me now. Thanks, @waveform80!
However, in the course of reviewing the PR, I noticed that if you comment out the lines changed by this PR, the
A couple other random things I noticed (but not for this PR as it's out of scope):
Yup - I considered adding that change to this PR too, but I'm always wary of unintended consequences, so I wound up keeping things as straight-forward as I could.
As I understand it, such a change should only affect pathological zip-files containing multiple members with the same path and name, which I think one could reasonably assume would never happen with a normally constructed wheel (I hope?).
The question then becomes what happens in each case and which behaviour is desirable; my assumption (unconfirmed) is that when using
Still, it might be worth considering the change anyway as it's probably a little more efficient (fewer filename lookups) and cleaner code.
That was another change I considered but decided was probably out of scope for this PR. It's definitely worth thinking about though, given pip is sometimes run as root and I don't currently see much defence against a maliciously constructed wheel that could be used to overwrite stuff in, say,
Anyway, there's some random thoughts on future changes for this; I'm happy to bash together a separate PR for them if you'd like.
Certainly. You could create one or more PR's for each of those things if you'd like. We'll have to decide whether the risk outweighs the "reward" though.
A few other things that come to mind:
They both do:
Ah, beat me to it!
Just had a look at this; I don't have a Mac but I can take a fair guess what's going on here: someone's expected symlinks in zips to "just work".
The zip spec itself has no clue about symlinks (unsurprising given its DOS origins). There's an "external attributes" field in the archive member meta-data in zips which several tools (infozip, and Python's ZipFile class) use to store the file mode (and thus the fact that a member might actually be a symlink). However, in order to re-create a symlink an extractor would still need to check the external attributes represent a symlink and call
If pip's expecting symlinks in a zip to work (which seems to be the case, given that test case, and the contents of
I'll take a look