bpo-30461: glob: sort the resulting list #1794
Closed
Because POSIX readdir does not guarantee any order,
glob often returned results in an unexpected, seemingly random order.
This change makes it behave similarly to POSIX glob(3).
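
To illustrate the intended effect, here is a minimal sketch in Python. It assumes the change is equivalent to sorting the returned list; `sorted_glob` and `demo` are hypothetical names used only for this illustration and are not part of the glob module or of this patch.

```python
import glob
import os
import tempfile


def sorted_glob(pattern):
    """Return glob matches in sorted order.

    Hypothetical helper: it approximates the proposed behaviour by
    sorting the list that glob.glob() returns.
    """
    return sorted(glob.glob(pattern))


def demo():
    # Create the files in a deliberately non-alphabetical order; the
    # order in which the filesystem lists them is not guaranteed.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("c.cpp", "a.cpp", "b.cpp"):
            open(os.path.join(tmp, name), "w").close()
        matches = sorted_glob(os.path.join(tmp, "*.cpp"))
        # Strip the temporary directory so the result is comparable.
        return [os.path.basename(m) for m in matches]
```

With sorting applied, the result no longer depends on the order in which the filesystem happens to return directory entries.
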
Some background:
For openSUSE Linux, we build packages in the Open Build Service (OBS),
which tracks dependencies, so when e.g. a new glibc is submitted,
all packages depending on glibc are rebuilt,
and if the resulting binaries changed,
the new version is pushed to the mirrors.
Many Python modules build their .so files from a
glob.glob("*.cpp")
The old glob behaviour would often lead to the linker
randomly ordering functions in the resulting object files.
As a result, we were not able to auto-detect
that the package did not actually change,
which wastes the bandwidth of distribution mirrors and users.
See also https://reproducible-builds.org/ on that topic.
This change should not break existing software
because there were no guarantees on ordering of glob results.
Measurements with 'perf' show the new code to be 4ms / 1.07x slower
(for /usr/*/* with 9854 files).
The alternative would be to patch each package individually,
but that would be quite some effort and not be as nice to use,
as can be seen in
https://www.riverbankcomputing.com/pipermail/pyqt/2017-May/039214.html
and there are plenty of others out there:
https://github.com/pytries/datrie/blob/master/setup.py#L10
https://github.com/jonashaag/bjoern/blob/master/setup.py#L6
https://github.com/scipy/scipy/blob/master/scipy/sparse/linalg/dsolve/setup.py#L28
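
For comparison, a per-package workaround might look roughly like the following in a setup.py. This is only a sketch: `collect_sources` is a hypothetical helper, not code taken from any of the projects linked above.

```python
import glob
import os


def collect_sources(directory, pattern="*.cpp"):
    """Return the matching source files of `directory` in sorted order.

    Hypothetical setup.py helper: each package would need an explicit
    sorted() like this one to get a reproducible source list.
    """
    # glob.escape() protects special characters in the directory name.
    return sorted(glob.glob(os.path.join(glob.escape(directory), pattern)))
```

Every package that globs for its sources would have to carry such a change, which is the effort the patch to glob itself avoids.
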