New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sort file lists #1068

Merged
merged 3 commits into from Sep 3, 2017

Conversation

Projects
None yet
2 participants
@bmwiedemann
Contributor

bmwiedemann commented Jun 21, 2017

to generate reproducible zip files
that do not differ depending on (random) filesystem order

See https://reproducible-builds.org/ for why this matters.

@bmwiedemann bmwiedemann force-pushed the bmwiedemann:sortzip branch from ffe351a to aae00f4 Aug 23, 2017

@jaraco

Overall, this change is reasonable, though I'd really like to see the functionality encapsulated in a single place that might have a docstring to explain why it's there. The PR would also be substantially better with a test case to capture the expectation (as otherwise this change could get optimized away without any test failures).

Importantly, a change like this needs an entry in the CHANGES.txt file (probably 0.1 bump).

@@ -303,7 +303,8 @@ def get_ext_outputs(self):
paths = {self.bdist_dir: ''}
for base, dirs, files in os.walk(self.bdist_dir):
for filename in files:
dirs.sort()
for filename in sorted(files):

This comment has been minimized.

@jaraco

jaraco Aug 26, 2017

Member

Why use sorted for files but .sort for dirs?

@@ -331,6 +332,8 @@ def walk_egg(egg_dir):
"""Walk an unpacked egg's contents, skipping the metadata directory"""
walker = os.walk(egg_dir)
base, dirs, files = next(walker)
dirs.sort()

This comment has been minimized.

@jaraco

jaraco Aug 26, 2017

Member

It's starting to feel like this change is getting sprinkled around a few places. Maybe it would make more sense to have a "sorted walk" function or similar.

@bmwiedemann bmwiedemann force-pushed the bmwiedemann:sortzip branch from aae00f4 to b38964a Aug 28, 2017

Sort file lists
to generate reproducible zip files
that do not differ depending on (random) filesystem order

See https://reproducible-builds.org/ for why this matters.

@bmwiedemann bmwiedemann force-pushed the bmwiedemann:sortzip branch from b38964a to 2968882 Aug 28, 2017

@bmwiedemann

This comment has been minimized.

Contributor

bmwiedemann commented Aug 28, 2017

Thanks for your nice feedback.

I added a sorted_walk function with docstring, but I have not fully tested it
and I could use some help with adding a test, because my python-foo is not the best.
Something like:

  1. create numbered file entries 1..9 (in random order)
  2. create an .egg from it
  3. check that entries in the zipfile are ordered 1..9
@jaraco

jaraco approved these changes Sep 3, 2017

@jaraco

This comment has been minimized.

Member

jaraco commented Sep 3, 2017

This new PR is much cleaner. And given that eggs are largely deprecated in favor of wheels, I'm not going to be a stickler about the test.

@jaraco jaraco merged commit 24f240e into pypa:master Sep 3, 2017

0 of 2 checks passed

continuous-integration/appveyor/pr Waiting for AppVeyor build to complete
Details
continuous-integration/travis-ci/pr The Travis CI build is in progress
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment