-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pip 19.x --upgrade breaks on nfs #6327
Comments
This looks like the same as issue #6321 which was recently filed, but this issue has more information. Thanks a lot for the detailed report and analysis, @ntextreme3! |
So... is this basically saying that Windows can rename an open DLL but can't delete it, and Linux+NFS can delete an open SO but can't rename it? That's... unfortunate. In theory, the fix should be to swallow the failure to delete the folder (turn it into a warning rather than a crash). It's already been renamed and is out of the way - Python can't import it, and pip won't try to reuse the filename in future. All by design. Preventing this scenario from happening in the first place is beyond my expertise. All pip has done is rename a directory, which has apparently succeeded. I have no idea why NFS behaves this way. |
@zooba A couple questions: why wasn’t this issue happening before, and what were the changes meant to fix? Does this mean it would be better to be using the old code path in some situations if this particular case is a regression? |
No idea why it wasn't happening before. The changes fixed the performance impact of copying files multiple times to move them out of the way vs. renaming the directory, and inadvertently fixed (in most cases) the problem of hitting the maximum path length on Windows. Given the files are already properly out of the way at this point, it's hard to choose a different code path. If you want to keep the old one on Linux because performance doesn't matter so much there, I'm not going to put up a fight, but copying files on Windows is far slower than renaming a directory and I'd rather keep that impact minimized there :) |
Could it be because of what the original poster said? Namely--
I'm thinking of a scenario where |
Apparently it is possible :) Hence my comment about NFS allowing you to delete files in use but not rename them (since the old method was deleting it rather than renaming its parent directory) But I know nothing about NFS apart from what's been said in this thread. Maybe put out a call for help on Twitter to find someone who knows what it's doing? |
It's not the delete vs rename, it's that the old method was deleting from temp directory (non-nfs in my case) which works. And the new one does the rename-then-delete from Adjacent (nfs, in my case). When you delete the file in use on nfs, it gets renamed to |
It must still have been deleting it from the NFS as part of moving it to the temp directory. Otherwise the original file would be left behind. |
Can the |
Apparently no:
In another terminal:
Result: |
Would it be helpful if I post an |
That's what I used originally -- you can see here below (example with markupsafe instead of simplejson but it's the same -- anything that has a C extension basically) The in-use directory is renamed. When it comes time to remove it, it's fails for Also, when pip was using relevant strace sectionrename("/env/lib/python3.6/site-packages/markupsafe/", "/env/lib/python3.6/site-packages/~arkupsafe") = 0
ioctl(1, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 0x7ffd73bea8d0) = -1 EINVAL (Invalid argument)
write(1, " Successfully uninstalled M"..., 48) = 48
... # bunch of lines
rmdir("/env/lib/python3.6/site-packages/~arkupsafe") = -1 ENOTEMPTY (Directory not empty)
stat("/env/lib/python3.6/site-packages/~arkupsafe", {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
chmod("/env/lib/python3.6/site-packages/~arkupsafe", 0200) = 0
rmdir("/env/lib/python3.6/site-packages/~arkupsafe") = -1 ENOTEMPTY (Directory not empty)
close(4) = 0
select(0, NULL, NULL, NULL, {0, 500000}) = 0 (Timeout)
lstat("/env/lib/python3.6/site-packages/~arkupsafe", {st_mode=S_IFDIR|0200, st_size=4096, ...}) = 0
open("/env/lib/python3.6/site-packages/~arkupsafe", O_RDONLY|O_CLOEXEC) = -1 EACCES (Permission denied)
stat("/env/lib/python3.6/site-packages/~arkupsafe", {st_mode=S_IFDIR|0200, st_size=4096, ...}) = 0
select(0, NULL, NULL, NULL, {0, 500000}) = 0 (Timeout)
lstat("/env/lib/python3.6/site-packages/~arkupsafe", {st_mode=S_IFDIR|0200, st_size=4096, ...}) = 0
open("/env/lib/python3.6/site-packages/~arkupsafe", O_RDONLY|O_CLOEXEC) = -1 EACCES (Permission denied)
stat("/env/lib/python3.6/site-packages/~arkupsafe", {st_mode=S_IFDIR|0200, st_size=4096, ...}) = 0
select(0, NULL, NULL, NULL, {0, 500000}) = 0 (Timeout)
lstat("/env/lib/python3.6/site-packages/~arkupsafe", {st_mode=S_IFDIR|0200, st_size=4096, ...}) = 0
open("/env/lib/python3.6/site-packages/~arkupsafe", O_RDONLY|O_CLOEXEC) = -1 EACCES (Permission denied)
stat("/env/lib/python3.6/site-packages/~arkupsafe", {st_mode=S_IFDIR|0200, st_size=4096, ...}) = 0
select(0, NULL, NULL, NULL, {0, 500000}) = 0 (Timeout)
lstat("/env/lib/python3.6/site-packages/~arkupsafe", {st_mode=S_IFDIR|0200, st_size=4096, ...}) = 0
open("/env/lib/python3.6/site-packages/~arkupsafe", O_RDONLY|O_CLOEXEC) = -1 EACCES (Permission denied)
stat("/env/lib/python3.6/site-packages/~arkupsafe", {st_mode=S_IFDIR|0200, st_size=4096, ...}) = 0
select(0, NULL, NULL, NULL, {0, 500000}) = 0 (Timeout)
lstat("/env/lib/python3.6/site-packages/~arkupsafe", {st_mode=S_IFDIR|0200, st_size=4096, ...}) = 0
open("/env/lib/python3.6/site-packages/~arkupsafe", O_RDONLY|O_CLOEXEC) = -1 EACCES (Permission denied)
stat("/env/lib/python3.6/site-packages/~arkupsafe", {st_mode=S_IFDIR|0200, st_size=4096, ...}) = 0
select(0, NULL, NULL, NULL, {0, 500000}) = 0 (Timeout)
lstat("/env/lib/python3.6/site-packages/~arkupsafe", {st_mode=S_IFDIR|0200, st_size=4096, ...}) = 0
open("/env/lib/python3.6/site-packages/~arkupsafe", O_RDONLY|O_CLOEXEC) = -1 EACCES (Permission denied)
stat("/env/lib/python3.6/site-packages/~arkupsafe", {st_mode=S_IFDIR|0200, st_size=4096, ...}) = 0
ioctl(2, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 0x7ffd73beb050) = -1 EINVAL (Invalid argument)
write(2, "Could not install packages due t"..., 233) = 233 |
Personally, I'm still weighing how to proceed. What options do we have? I know one option is to swallow those errors. For that option, would we be able to detect just that error (and not swallowing similar errors that we'd want to bubble up), and what consequences would that concretely have? Alternatively, is there a way to detect this scenario and fall back to using normal Any other options? [Edit: I didn't write this comment until after reading @ntextreme3's comment which does describe some more options.] |
To be clear, what pip was doing was a fairly expensive backup operation, which is made considerably cheaper without changing the semantics by doing an in-place rename. So by advocating for "configurability", you're also advocating for "much slower and less reliable uninstalls" (current issue notwithstanding - copying files across filesystems is in general less reliable than a single rename). I'm not saying configurability is wrong, but both aspects have to be balanced. (Also note that the only way the rename can be done is by doing it in the directory you're uninstalling from - anything else introduces permission/ACL concerns or simply not being able to move files across drives.) |
Maybe a good starting point on this issue would be to address the "raise a more appropriate exception" part, since that would help in any case and I think we can all agree. |
Others may completely understand the situation already, but I didn't, so I'll write up a summary here (which I hope is accurate, please correct if not): I looked at the strace for 18.x, and pip just gives up after the final But the idea with the Adjacent temp dir is that it wants to install to a temp directory that's right there, and then rename that entire temp directory to |
A couple of ideas about getting past the situation... Step one, it should stop triggering the windows error handler, or fix the chmod for unix compatibility or something. After that... If pip renames the existing directory out of the way, could an Rename the unwanted directory to a new directory called Another option would be to rename the files into place piecewise. If a target directory already exists, try renaming the children into place inside it. If one of them already exists (e.g. a shared library is in a package two levels deep), recursively do the same thing. This would be annoying to code, but still faster than copying everything, I assume. |
That's the eventual hope, yeah, but that's not what happens right now. It's only implemented for uninstallation - so basically the rename of the directory fails because one of the files inside is locked open, and presumably NFS does a copy instead of that file and renames the locked one to The weird part is that the Windows file system behaves the opposite here, where it can rename the file in use to anywhere on the same drive, but can't delete it. And apparently most Linux file systems behave like this too (or at least work with this assumption). So you can't protect against both scenarios with one approach. I'd be fine with detecting Windows and using the slower approach on other platforms (though I do have Linux machines at work where we care about the slowness of installation, but those won't suffer from the "trying to uninstall something in use" problem). The point of renaming the unwanted directory by replacing the leading characters in its name is to avoid making the path longer, as that's the cause of a whole other range of issues. Again, I'd be fine with detecting Windows to handle this case and using a different approach elsewhere. (Just reread the post and realised you may have missed that it renames the directory first, and then tries to install the new version (if any), and then |
I’ll repeat that I think the exception-masking issue should be fixed first as it shouldn’t be happening in any case. |
Did this issue get addressed? |
No. |
Pip leaves behind directories with names starting with `~` in the site-packages directory due to a bug that occurs when uninstalling packages from an NFS filesystem. See pypa/pip#6327. Fixes #206.
I think I'm hitting this issue, but I'm confused about the affected version and/or I'm hitting something related. I'm on an ubuntu machine, with nfs mounted /home and trying to create a conda environment from a environment.yml file. The conda part of the installation works, but pip fails on version 19.3.1 with the .nfsxxx device or resource is busy error. I tried downgrading to several previous versions of pip, but it then produces the "directory is not empty" error. Has this ever worked? Is there some workaround available? |
@kottmast were you able to find a fix for it? we are using pip 20.x and facing this. |
Environment
Description
pip19 uses
AdjacentTempDirectory
, where previous pip versions usedtempfile
. Because of nfs "silly rename", if working on an nfs mount and pip is called via a subprocess from a python script that happens to use one of the packages being upgraded, pip fails when trying to cleanup (since the C extensions are still mapped to interpreter -- can be seen in/proc/$PID/maps
)Expected behavior
Expect this either to not fail (as in previous versions of pip) or just raise more appropriate exception. Right now, this causes pip to enter this error handler (that seems to be designed for windows). chmod'ing the path to 200 works for Windows, but is not enough to delete on Linux. The
chmod
done here ultimately causes pip to raisePermission Error
. Maybe that should be chmod tostat.S_IRWXU
?How to Reproduce
in some nfs location, install simplejson (or anything that uses a C extension)
python3.6 -m virtualenv env source env/bin/activate pip install simplejson==3.16.0
then import it, and call pip in a subprocess
You can also run the script interactively and see the mapped files. I saved above code as 'mini.py'
Output
Taking a look at that location:
The text was updated successfully, but these errors were encountered: