Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-43153: Don't mask PermissionError with NotADirectoryError during tempdirectory cleanup #29940

Merged
merged 12 commits into from
Dec 5, 2023

Conversation

Fidget-Spinner
Copy link
Member

@Fidget-Spinner Fidget-Spinner commented Dec 6, 2021

Fidget-Spinner and others added 2 commits December 6, 2021 22:10
…y cleanup

Co-Authored-By: andrei kulakov <andrei.avk@gmail.com>
@Fidget-Spinner
Copy link
Member Author

I've tested locally and it works on Windows.

Lib/tempfile.py Outdated Show resolved Hide resolved
Lib/tempfile.py Outdated Show resolved Hide resolved
Lib/tempfile.py Outdated Show resolved Hide resolved
Copy link
Contributor

@akulakov akulakov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving with a minor request to update a comment

Lib/tempfile.py Outdated
# PermissionError is raised on FreeBSD for directories
except (IsADirectoryError, PermissionError):
except PermissionError:
# On Windows, calling _rmtree again will raise
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would also happen on other OSes, not just Windows. The bug can only be reproduced on windows because PermissionError in the bug was caused by an open file, but PermissionError in theory can be caused by actual permission / ownership settings. It shouldn't matter much because this is a temp dir (and files in it) created by current user, but this comment can be misunderstood, so it's better to drop ref. to Windows here.

It may be left in the if case below because it can avoid the isdir() check.

It would be good to also add "Issue #43153" to the comment.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that calling unlink() on a directory in Windows always raises PermissionError. Windows maps the NT status code STATUS_FILE_IS_A_DIRECTORY to the WinAPI error code ERROR_ACCESS_DENIED.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would also happen on other OSes, not just Windows. The bug can only be reproduced on windows

I'm confused, should we continue the _os.name == 'nt' check then? I'm not worried about the additional performance hit from isdir (though I would like to avoid it). Correctness is more important. If this does happen on other OSes, then we should remove that OS check.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can remove the os check. It can happen on other OSes. With the check removed, the logic becomes "if we are not permitted to remove the file, return or reraise", which makes perfect sense.

@akulakov
Copy link
Contributor

akulakov commented Dec 6, 2021

It might also be good to run this on all windows build bots to confirm that unit test doesn't fail on any windows version we support. The code itself should be fine but unit test may fail if some windows version doesn't raise permission error in this case.

@akulakov
Copy link
Contributor

akulakov commented Dec 6, 2021

Please also update the PR title IsAdir.. => NotADir...

@Fidget-Spinner Fidget-Spinner changed the title bpo-43153: Don't mask PermissionError with IsADirectoryError during tempdirectory cleanup bpo-43153: Don't mask PermissionError with NotADirectoryError during tempdirectory cleanup Dec 7, 2021
@Fidget-Spinner
Copy link
Member Author

Azure pipelines failure is unrelated and fixed by GH-29963.

@akulakov
Copy link
Contributor

akulakov commented Dec 7, 2021

We can probably just keep the unit test for windows and skip it on other OSes. Other OSes would need different ways of causing a PermissionError, so it's not worth the effort. [Edit: for example in my recent testing, on MacOS it's not enough to have a file with no permissions, it is the containing dir that has to lack permissions for the file to raise PermissionError].

@Fidget-Spinner
Copy link
Member Author

We can probably just keep the unit test for windows and skip it on other OSes. Other OSes would need different ways of causing a PermissionError, so it's not worth the effort. [Edit: for example in my recent testing, on MacOS it's not enough to have a file with no permissions, it is the containing dir that has to lack permissions for the file to raise PermissionError).

Yep, I noticed the other tests are starting to fail. Seems like it's not possible to repro this on the other OSes. I'll make sure to test with buildbots before we merge this PR (and once that docfix PR lands).

BTW, thanks for guiding me throughout this PR :). I'm admittedly not familiar with this part of the stdlib. Thanks to Victor and Eryk too!

@akulakov
Copy link
Contributor

akulakov commented Dec 7, 2021

BTW, thanks for guiding me throughout this PR :). I'm admittedly not familiar with this part of the stdlib.

Glad I could help! I haven't done that much PR reviews yet so it was great to get that experience too, especially with the logic I was already a bit familiar with. Please add me as a reviewer in the future on similar PRs!

@Fidget-Spinner Fidget-Spinner added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Dec 7, 2021
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @Fidget-Spinner for commit 499e424 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Dec 7, 2021
Lib/tempfile.py Outdated Show resolved Hide resolved
Lib/tempfile.py Outdated
# bpo-43153: Calling _rmtree again may
# raise NotADirectoryError and mask the PermissionError.
# So we must re-raise the current PermissionError.
if not _os.path.isdir(path):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

os.path.isdir() returns True for a symlink to directory. Either check os.path.islink() before os.path.isdir() or use os.stat() explicitly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Failing to delete a symlink should be rare in a temp directory. But if it's worth handling here, then it should be consistent with shutil._rmtree_isdir(). Unfortunately the latter is written for os.DirEntry instances. Here's an updated implementation for shutil that can also check a path. I factored out the check for a mount point into a common _st_ismount(st) function, which avoids duplicated code and, I think, makes it easier to read.

if not hasattr(os.stat_result, 'st_reparse_tag'):
    def _rmtree_islink(path):
        return os.path.islink(path)

    def _rmtree_isdir(path):
        try:
            if isinstance(path, os.DirEntry):
                return entry.is_dir(follow_symlinks=False)
            return stat.S_ISDIR(os.lstat(path).st_mode)
        except OSError:
            return False
else:
    # In general mount points in Windows are directories, but os.unlink()
    # supports them, so rmtree() handles them as symlinks.
    def _st_ismount(st):
        return (st.st_file_attributes & stat.FILE_ATTRIBUTE_REPARSE_POINT
                and st.st_reparse_tag == stat.IO_REPARSE_TAG_MOUNT_POINT)

    def _rmtree_islink(path):
        try:
            if isinstance(path, os.DirEntry):
                st = entry.stat(follow_symlinks=False)
            else:
                st = os.lstat(path)
            return stat.S_ISLNK(st.st_mode) or _st_ismount(st)
        except OSError:
            return False

    def _rmtree_isdir(path):
        try:
            if isinstance(path, os.DirEntry):
                st = entry.stat(follow_symlinks=False)
            else:
                st = os.lstat(path)
            return stat.S_ISDIR(st.st_mode) and not _st_ismount(st)
        except OSError:
            return False

Lib/tempfile.py Outdated
Comment on lines 835 to 837
if _os.path.islink(path) or not _os.path.isdir(path):
if ignore_errors:
return
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this check is inconsistent with shutil.rmtree(). os.unlink() deletes a mount point in Windows, but otherwise a mount point is handled as a directory, i.e. islink() will be false and isdir() will be true. However, they're special cased by shutil._rmtree_isdir() and shutil._rmtree_islink(), and thus shutil.rmtree() will immediately fail with OSError("Cannot call rmtree on a symbolic link"), as would happen for a regular symbolic link, which is exactly what this check is supposed to prevent.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify, so are you suggesting we:

  1. Modify shutil._rmtree_isdir with your suggested changes here bpo-43153: Don't mask PermissionError with NotADirectoryError during tempdirectory cleanup #29940 (comment).
  2. Use shutil._rmtree_isdir for a more consistent check instead of our current method?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know it's a bad practice to rely on a private function from another module. But duplicating the code would be a maintenance burden, and it's not useful enough to expose in os.path. Maybe _rmtree_isdir() and _rmtree_islink() should be refactored into a common private module. I'm curious to know what @serhiy-storchaka thinks should be done about the consistency problem between these functions and os.path.islink() and os.path.isdir().

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm perfectly fine with using private functions from another module as long as it's within the stdlib itself and not exposed somewhere :). Thank you for the clarification, I wanted to summarize your suggestions since I wasn't sure I understood you correctly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be okay to use shutil._rmtree_isdir() here because we're asking shutil if it's valid to use rmtree on a path, and then telling shutil to use shutil._rmtree(). So these two steps of logic stay completely internal to shutil even though used outside of it. And it's also consistent with current logic of using internal shutil._rmtree().

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_rmtree_isdir() and _rmtree_islink() was changed between versions, and this fix should be applied to multiple Python versions. So it is better to inline them here. If possible, we should use public API (add it if it does not exist).

@serhiy-storchaka serhiy-storchaka added needs backport to 3.11 only security fixes needs backport to 3.12 bug and security fixes labels Dec 5, 2023
@serhiy-storchaka serhiy-storchaka merged commit 8cdfee1 into python:main Dec 5, 2023
33 checks passed
@miss-islington-app
Copy link

Thanks @Fidget-Spinner for the PR, and @serhiy-storchaka for merging it 🌮🎉.. I'm working now to backport this PR to: 3.11, 3.12.
🐍🍒⛏🤖

@bedevere-app
Copy link

bedevere-app bot commented Dec 5, 2023

GH-112753 is a backport of this pull request to the 3.12 branch.

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Dec 5, 2023
…ing tempdirectory cleanup (pythonGH-29940)

(cherry picked from commit 8cdfee1)

Co-authored-by: Ken Jin <kenjin@python.org>
Co-authored-by: andrei kulakov <andrei.avk@gmail.com>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
@bedevere-app bedevere-app bot removed the needs backport to 3.12 bug and security fixes label Dec 5, 2023
@bedevere-app
Copy link

bedevere-app bot commented Dec 5, 2023

GH-112754 is a backport of this pull request to the 3.11 branch.

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Dec 5, 2023
…ing tempdirectory cleanup (pythonGH-29940)

(cherry picked from commit 8cdfee1)

Co-authored-by: Ken Jin <kenjin@python.org>
Co-authored-by: andrei kulakov <andrei.avk@gmail.com>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
@bedevere-app bedevere-app bot removed the needs backport to 3.11 only security fixes label Dec 5, 2023
serhiy-storchaka added a commit that referenced this pull request Dec 5, 2023
…or` during tempdirectory cleanup (GH-29940) (GH-112754)

(cherry picked from commit 8cdfee1)

Co-authored-by: Ken Jin <kenjin@python.org>
Co-authored-by: andrei kulakov <andrei.avk@gmail.com>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
serhiy-storchaka added a commit that referenced this pull request Dec 5, 2023
…or` during tempdirectory cleanup (GH-29940) (GH-112753)

(cherry picked from commit 8cdfee1)

Co-authored-by: Ken Jin <kenjin@python.org>
Co-authored-by: andrei kulakov <andrei.avk@gmail.com>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
@Fidget-Spinner Fidget-Spinner deleted the bpo-43153 branch December 5, 2023 14:13
@Fidget-Spinner
Copy link
Member Author

Thanks @serhiy-storchaka for bringing this over the finish line!

@serhiy-storchaka
Copy link
Member

Unfortunately, I failed to create tests for junction points (which would fail without special handling of junction points in tempfile). But the code looks right.

Comment on lines 891 to 918
try:
_os.unlink(path)
# PermissionError is raised on FreeBSD for directories
except (IsADirectoryError, PermissionError):
except IsADirectoryError:
cls._rmtree(path, ignore_errors=ignore_errors)
except PermissionError:
# The PermissionError handler was originally added for
# FreeBSD in directories, but it seems that it is raised
# on Windows too.
# bpo-43153: Calling _rmtree again may
# raise NotADirectoryError and mask the PermissionError.
# So we must re-raise the current PermissionError if
# path is not a directory.
try:
st = _os.lstat(path)
except OSError:
if ignore_errors:
return
raise
if (_stat.S_ISLNK(st.st_mode) or
not _stat.S_ISDIR(st.st_mode) or
(hasattr(st, 'st_file_attributes') and
st.st_file_attributes & _stat.FILE_ATTRIBUTE_REPARSE_POINT and
st.st_reparse_tag == _stat.IO_REPARSE_TAG_MOUNT_POINT)
):
if ignore_errors:
return
raise
cls._rmtree(path, ignore_errors=ignore_errors)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not go with an "ask-for-forgiveness-rather-than-permission" type of solution, e.g.:

try:
    _os.unlink(path)
except IsADirectoryError:
    cls._rmtree(path, ignore_errors=ignore_errors)
except PermissionError as pe:
    # PermissionError is raised on FreeBSD for directories
    # and by Windows on lock files used by other processes
    try:
        cls._rmtree(path, ignore_errors=ignore_errors)
    except NotADirectoryError:
        # NOTE: This is raised if PermissionError did not
        # correspond to a IsADirectoryError, e.g. on
        # Windows.
        if not ignore_errors:
            raise pe

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rmtree() is a complex function, and we cannot be sure whether NotADirectoryError was raised by the toplevel scandir() or somewhere deeper in the tree.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mmmh...

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a responsibility inversion to me. Sure there could be a bug deep down in _rmtree that ends up bubbling an unrelated NotADirectoryError, but that would technically be a bug in _rmtree. Maybe there are valid cases where such an unrelated error would be raised?

Also I see the diff I commented on is not the final version so I'll look at the final version.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessary a bug. It may be a race condition. Although the LBYL approach is also prone to race conditions, I think it has less chance to override error with a wrong exception.

I myself prefer the EAFP approach, but I do not think that it has advantages in this case.

aisk pushed a commit to aisk/cpython that referenced this pull request Feb 11, 2024
…ing tempdirectory cleanup (pythonGH-29940)

Co-authored-by: andrei kulakov <andrei.avk@gmail.com>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Glyphack pushed a commit to Glyphack/cpython that referenced this pull request Sep 2, 2024
…ing tempdirectory cleanup (pythonGH-29940)

Co-authored-by: andrei kulakov <andrei.avk@gmail.com>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants