pkg_resources: do not call stat() and access() #1135

Merged
merged 3 commits into from Sep 3, 2017

@jd
Contributor

jd commented Aug 25, 2017

The current code in find_on_path makes many stat() calls that are
unnecessary and prone to race conditions.

As described in the Python documentation
(https://docs.python.org/3/library/os.html#os.access), os.access should not be
used to check access before opening a file. The same goes for a directory.

This patch removes those checks by handling exceptions correctly when using
os.listdir() instead, which improves pkg_resources import time.
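
A minimal sketch of the idea, not the patch itself: instead of asking os.access() or os.path.isdir() first, call os.listdir() directly and treat the expected failures as "nothing to list". The helper name `safe_listdir` and the exact set of ignored errno values are assumptions made for this illustration.

```python
import errno
import os


def safe_listdir(path):
    """Return the entries of `path`, or [] if it cannot be listed.

    Hypothetical helper: no os.access()/os.stat() pre-check is made, so
    there is no window between the check and the actual listing.
    """
    try:
        return os.listdir(path)
    except OSError as exc:
        # Missing path, non-directory, or unreadable directory: nothing to scan.
        if exc.errno in (errno.ENOENT, errno.ENOTDIR, errno.EACCES):
            return []
        # Anything else is unexpected and should not be hidden.
        raise
```

A caller can then iterate over `safe_listdir(path_item)` without any prior check, which also saves the extra system calls.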

path_item, metadata=PathMetadata(
path_item, os.path.join(path_item, 'EGG-INFO')
)
if _is_unpacked_egg(path_item):

@jaraco

jaraco Aug 26, 2017

Member

If the code heads down this path now, but path_item is not a dir or is not readable, you'll get a different result than before. That seems to be a backward-incompatible change. Is this code path not expected to be impacted by this change?

@jd

jd Aug 28, 2017

Contributor

You're right, it probably needs some more work too. I'll dig into it.

@jd

jd Aug 28, 2017

Contributor

Actually… the code checks that path_item/EGG-INFO/PKG-INFO is a file, so if that's the case, path_item has to be a directory. So the directory check seems useless.

The permission check is not that useful either: the fact that the directory is readable only matters when calling os.listdir, but no code in that path uses os.listdir – and if it did, it would need to learn to handle errors. What matters is that the directory has +x permission, and that is verified by os.path.isfile in _is_unpacked_egg: it returns False if it can't traverse the directory.

So… I can't think of a case where things would go wrong. If there is one, we should definitely add a test!
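
To illustrate the argument above, here is a small sketch of an egg check built on os.path.isfile. The function name `looks_like_unpacked_egg` is hypothetical and only mirrors the behaviour described in this comment; it is not claimed to be the body of `_is_unpacked_egg`. The point is that os.path.isfile returns False instead of raising when a parent component is missing, is not a directory, or cannot be traversed, so no separate isdir() or access() check is needed.

```python
import os


def looks_like_unpacked_egg(path):
    """Hypothetical stand-in for the check discussed above."""
    # os.path.isfile() swallows OSError internally and returns False,
    # so a non-directory or untraversable `path` simply fails the test.
    return (
        path.lower().endswith('.egg')
        and os.path.isfile(os.path.join(path, 'EGG-INFO', 'PKG-INFO'))
    )
```

If this returns True, `path` necessarily is a directory that can be traversed, which is why the earlier directory and permission checks add nothing on this code path.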

@jaraco

Member

jaraco commented Aug 26, 2017

I like your instinct here. I'm concerned about the test failure on Windows, which does seem implicated in the change. What's your thought?

@jd

Contributor

jd commented Aug 28, 2017

Interesting… The Windows failure is exactly why Python 3 has a NotADirectoryError. There seems to be no smart way to catch that in Python 2. I've updated the patch to catch that too. It's ugly, but if you know a better way, let me know.
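
For readers following along, a rough sketch of one way to recognise "not a directory" without relying on NotADirectoryError. This is an illustration under assumptions, not necessarily what the updated patch does: the helper name `is_not_a_directory_error` is hypothetical, and it assumes the Windows case surfaces as Windows error code 267 (ERROR_DIRECTORY, "The directory name is invalid") on the exception's `winerror` attribute.

```python
import errno

# Windows system error code for "The directory name is invalid".
ERROR_DIRECTORY = 267


def is_not_a_directory_error(exc):
    """Return True if `exc` (an OSError) means the path is not a directory."""
    # POSIX, and Python 3 in general, report ENOTDIR.
    if exc.errno == errno.ENOTDIR:
        return True
    # Assumption: Python 2 on Windows does not map the error to ENOTDIR,
    # so fall back to the raw Windows error code when it is present.
    return getattr(exc, "winerror", None) == ERROR_DIRECTORY
```

An `except OSError as exc:` handler around os.listdir() can call this and re-raise when it returns False, which works the same way on Python 2 and Python 3.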

@jd jd force-pushed the jd:less-stat branch from 37f2977 to df23966 Aug 28, 2017

@jd

Contributor

jd commented Aug 28, 2017

Pull-request updated, HEAD is now df23966

@jaraco jaraco merged commit 6d8381a into pypa:master Sep 3, 2017
