Skip to content

os.removedirs() should define whether it follows symlinks #134161

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
calestyo opened this issue May 17, 2025 · 5 comments
Open

os.removedirs() should define whether it follows symlinks #134161

calestyo opened this issue May 17, 2025 · 5 comments
Labels
3.13 bugs and security fixes 3.14 bugs and security fixes 3.15 new features, bugs and security fixes docs Documentation in the Doc dir

Comments

@calestyo
Copy link
Contributor

Documentation

Many functions in os do not properly specify whether or not they follow symlinks. The Files and Directories even slightly implies that the default is more to follow symlinks (and only with follow_symlinks=False not).

For some functions (like os.rmdir() one might argue this is not important because they're the counterparts of well-known POSIX/C functions and for them it's well-defined.

But e.g. os.removedirs() seem to have no such counterpart and since the documentation doesn't mention anything, the current behaviour (which AFAICS is not following symlinks) might just be some implementation detail.

For os.removedirs() there are even two interesting cases in e.g. os.removedirs("a/b/c/d"):

  • if d (i.e. the final pathname component) is a symlink to a directory the referred directory is not removed (at least not in the current code, which uses os.rmdir() on the pathname)
  • if e.g. b is a symlink to a directory, which contains only c/d, then b is followed when rmdiring a/b/c/d and a/b/c but is not followed when rmdiring a/b (and the target of b isn’t removed).

All this kinda follows the spirit of POSIX' pathname resolution, but still it would IMO be nice to have it clearly defined.

Cheers,
Chris.

@calestyo calestyo added the docs Documentation in the Doc dir label May 17, 2025
@ZeroIntensity ZeroIntensity added 3.13 bugs and security fixes 3.14 bugs and security fixes 3.15 new features, bugs and security fixes labels May 18, 2025
@serhiy-storchaka
Copy link
Member

I'm afraid that symbolic links were not taken into account when these functions were written. There is no design behind os.removedirs() and os.renames(), they were added simply by analogy with os.makedirs(). Their behavior can be unexpected. os.removedirs()` can remove more than expected if the parent directories have only one subdirectory.

The current behavior is just an artifact of implementation. It can be changed if we approach their design thoughtfully. But now they are simply neglected.

@ryan-duve
Copy link
Contributor

@serhiy-storchaka Would it be better to document the functionality as it's currently implemented? If not, would it be better than nothing to add what you described above to the docs, i.e., there is no design behavior? I'd be willing to put up an initial PR to get either started.

@calestyo
Copy link
Contributor Author

calestyo commented May 19, 2025

@serhiy-storchaka

os.removedirs() can remove more than expected if the parent directories have only one subdirectory.

What exactly do you mean by that?

@ryan-duve IMO, proper documentation would suffice for now, and a future version could add options whether or nor not follow symlinks (and perhaps even which), should that be needed.

AFAICS, right now it won't follow any symlinks.

That is, in os.removedirs("foo/bar/baz"):

  • if baz is a symlink (even if to a directory) it won't remove the symlink nor the target but git NotADirectoryError
  • if bar is a symlink to a directory bla that contains an empty directory baz, it will remove baz, but won’t remove the symlink bar nor the directory bla
  • if baz does not exist, it will give a `FileNotFundError?
  • if baz contains another file it will give an OSError with errno ENOTEMPTY - not so if e.g. bar or foo contain other files

IMO that's all quite reasonable and should be even the default, if removedirs() would ever get options that control symlink behaviour.
In particular, even if the leaf pathname component (baz) would be a symlink to an empty dir, it should IMO not be followed.

Cheers,
Chris.

@serhiy-storchaka
Copy link
Member

What exactly do you mean by that?

os.removedirs("foo/bar/baz") can remove not only "foo/bar/baz" and "foo/bar", but also "foo" if it only contained "bar", even if you did not mean this. os.removedirs() does not know where to stop.

@calestyo
Copy link
Contributor Author

Ah, I see what you mean.

I have only had a brief glance over the code, but from what I saw, it would always try to delete the full given pathname, that is if you give it foo/bar/baz it will try to remove them up to (including) foo.

But if foo was in another directory parent and even if that had only foo it wouldn't go further up and parent would not be removed.

So I'd say the behaviour is well-defined (and reasonable), the only problem is that you cannot easily remove empty-dir hierarchies that are outside of your current directory. Of course you can do a os.chdir() (before and after) but that may have side-effects (especially when being multi-threaded).

Perhaps a future update of removedirs() should get a relative_to parameter or something like that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.13 bugs and security fixes 3.14 bugs and security fixes 3.15 new features, bugs and security fixes docs Documentation in the Doc dir
Projects
Status: Todo
Development

No branches or pull requests

4 participants