Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ensures cache dir does exist #319

Merged
merged 14 commits into from
Mar 23, 2020
Merged

ensures cache dir does exist #319

merged 14 commits into from
Mar 23, 2020

Conversation

holgi
Copy link
Contributor

@holgi holgi commented Feb 20, 2020

I ran into a problem when installing another project with flit for development purpose using flit install --pth-file. The user had a non-existing home directory set (`/nonexistent') and therefore the download of the trove classifiers failed.

The /nonexistent "home" directory is used on POSIX systems to denote a user without a home directory, e.g. user nobody or users only set up for services.

These proposed changes ensures that a valid directory is returned by the flit.validate.get_cache_dir() function.

If the flit-subdirectory of the cache-directory does not exist and could not be created, a temporary directory is used instead. To ensure that always the same temporary directory is used, the function is wrapped with the functors.lru_cache().

A tests for this new behavior is also included. The tests passed so far on Mac OS X (Python 3.7 and 2.7) and on FreeBSD (Python 3.6)

If the "flit" sub-directory could not be created, a temproary directory is used instead
this hopefully fixes the build errors on appveyor for windows builds
Windows allows the creation of the path "/dev/null/nonexistent/flit" - this leads to a failing test.
Since I don't know any path that is protected under Windows, this test is marked as skipped.
Copy link
Member

@takluyver takluyver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this makes sense. Just a few small suggestions.

tests/test_validate.py Outdated Show resolved Hide resolved
tests/test_validate.py Outdated Show resolved Hide resolved
flit/validate.py Outdated
pass
except (OSError, PermissionError):
cache_dir = Path(tempfile.TemporaryDirectory().name)
cache_dir.mkdir(parents=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK, TemporaryDirectory() always creates the directory, so we don't need to do so here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought so too, but in one of my test setups this was not working reliably. Since I'm not at home and can't remember by heart which one it was, I'd need to check on this later.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha, I think I've got it. TemporaryDirectory() cleans itself up when the object is destroyed - which is immediately, because the code doesn't keep a reference to it. Use mkdtemp instead to get a directory which won't clear itself up.

Of course, this does mean each run of flit will create a new temp directory and not clear it up, which isn't ideal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

m(
Thanks for pointing this out. I ran head over heels into this kind of race condition in the first place and made things worse afterwards. Feeling kind of stupid now ;-).
Do you have any idea how to solve this without falling back on a module scope variable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately adding a new attribute to a pathlib.Path instance doesn't work.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To do what you're currently doing, with a temporary directory that never gets cleaned up, the easiest thing is to use the mkdtemp() function instead of TemporaryDirectory().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other possibility would be - as you described - to use mkdtemp directly, but the cleanup afterwards must also be managed somehow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for spamming you with comments.
I think I have a fix in commit bef5f2b using a context manager. I had to slightly rewrite validate_classifiers also. #

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@takluyver Sorry, I didn't see your reply – my connection is a little bit flaky, sitting on a train today. I actually want an automatic cleanup. Circumventing it was just a stupid mistake on my side.

@takluyver
Copy link
Member

That solves the cleanup problem, but if we create a new 'cache directory' each time the function is used, we're not really caching anything, so we can probably simplify things.

@holgi
Copy link
Contributor Author

holgi commented Feb 24, 2020

I know that it's not caching anything. But this will only be in effect for (posix) users that don't have a (writable) home directory set, so probably related to some service / daemon. And also only while using flit to install the project (not if the install is done via pip).

Since I can't think of an posix path that has guaranteed write access for all users, I think not caching in this simple case is acceptable.

Currently an install in such a case fails, so this is an improvement IMHO.

Another possibility would be to introduce and check for a "FLIT_CACHE_DIR" environmental variable (similar to the suggestion in #320). In a case where there is no writable automagical cache path, the flit command could be prefixed like FLIT_CAHCH_DIR=/vatr/tmp/cache flit install --pth-file.

@holgi
Copy link
Contributor Author

holgi commented Feb 24, 2020

Another possibility would be to move the check for FLIT_NO_NETWORK to the start of the validate_classifiers() function, so when FLIT_NO_NETWORK is set, the trove classifier check is passed completely.

@takluyver
Copy link
Member

Sorry to be unclear. I was thinking, if there's nothing being cached anyway, we don't need to bother creating a temporary directory and saving a file to disk - we can just get the data from the network and use it directly. If it's not clear how to do that, I can have a go myself next time I have a bit of spare time to work on it.

@holgi
Copy link
Contributor Author

holgi commented Feb 25, 2020

Since I had some time this morning, I removed the caching. Should I add the changes to this pull request or if you want to close this I'll open a new one.

@takluyver
Copy link
Member

But it can still cache when there is a normal home directory, right?

If you're happy with the changes, feel free to push them to this PR. Or if you're not sure, it's fine to make another PR and we can work out what's best.

@holgi
Copy link
Contributor Author

holgi commented Feb 25, 2020

Yes. With the changes I made today (not pushed so far), caches would be created and used as before if it is possible.

After a download of the classifiers an attempt is made to create a cache file for later use, but it doesn't matter if this is successful or not.

@takluyver takluyver added this to the 2.3 milestone Mar 1, 2020
Copy link
Member

@takluyver takluyver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I'm happy with the overall structure of this, so I'm reviewing the details now.

flit/validate.py Outdated
@@ -34,17 +34,15 @@ def get_cache_dir() -> Path:
or os.path.expanduser('~\\AppData\\Local')
return Path(local, 'flit')

def _verify_classifiers_cached(classifiers):

def _read_classifiers_cached():
"""Check classifiers against the downloaded list of known classifiers"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring here needs to be updated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fixed in commit 909f7b4

flit/validate.py Outdated


def _download_classifiers():
def _download_and_chache_classifiers():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def _download_and_chache_classifiers():
def _download_and_cache_classifiers():

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo is fixed in commit 909f7b4

flit/validate.py Outdated
@@ -54,10 +52,27 @@ def _download_classifiers():
cache_dir = get_cache_dir()
try:
cache_dir.mkdir(parents=True)
except FileExistsError:
except (FileExistsError, PermissionError, OSError):
# readonly mounted file raises OSError
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really like catching any OSError - that's quite a broad category. OSError objects normally have an e.errno attribute which you can check against constants in the errno module to look for specific errors. E.g. it sounds like this might be EROFS (read only file system) - but check if that's what you're hitting.

This pattern was more common in Python 2 code, before subclasses like FileExistsError and PermissionError were split out as their own subclasses.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was running into this problem on a Mac with 10.15 "Catalina" running Python 3.7. The root and other system directories are read-only.

Trying to create the dir "/nonexistent/" raises a OSError for a readonly file system – unfortunately not a subclassed error like PermissionError but an OSError itself:

>>> open("/foo", "w")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OSError: [Errno 30] Read-only file system: '/foo'

I changed the try-except-block to be a little bit more verbose and check for an explicit errno.EROFS

tests/test_validate.py Show resolved Hide resolved


def test_download_and_chache_classifiers():
classifiers = fv._download_and_chache_classifiers()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, I like it to be possible to run the tests without requiring network access. There are two approaches for this:

  1. Mark tests which need network access, so there's a clear way to exclude them. I don't think there are any examples in Flit, but here's one from another of my projects: https://github.com/takluyver/pynsist/blob/bf2408bbd01b5da63b93107f48435a5a57f9048b/nsist/tests/test_pypi.py#L11-L14
  2. Mock out the network responses, e.g. see the test_upload module.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you used responses already, I've choosen the second option.

@holgi
Copy link
Contributor Author

holgi commented Mar 2, 2020

Thanks for your patience with me and my changes ;-)

Copy link
Member

@takluyver takluyver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, the actual code is looking code, I've just got a couple more comments about the tests.

Sorry that it's taken me a while to get back to this.

tests/test_validate.py Outdated Show resolved Hide resolved
tests/test_validate.py Outdated Show resolved Hide resolved
tests/test_validate.py Outdated Show resolved Hide resolved
@holgi
Copy link
Contributor Author

holgi commented Mar 21, 2020

Sure no problem. You are probably doing this in your spare time 😄 👍

@takluyver
Copy link
Member

I am indeed doing this in my spare time - though there might be more spare time with all these social distancing measures.

@holgi
Copy link
Contributor Author

holgi commented Mar 23, 2020

I removed the two tests as you proposed. Thanks again for your work and the patience with this pull request :-)

@takluyver
Copy link
Member

Thanks for your patience with my reviewing. 🙂

@takluyver takluyver merged commit 359114b into pypa:master Mar 23, 2020
@holgi
Copy link
Contributor Author

holgi commented Mar 24, 2020

👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants