-
-
Notifications
You must be signed in to change notification settings - Fork 30.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
shutil.rmtree is vulnerable to a symlink attack #48739
Comments
Race condition in the rmtree function in the shutils module allows local See also http://bugs.debian.org/286922 Attack: --- # emulate removing /etc
$ sudo cp -a /etc /root/etc/
$ sudo python2.6
>>> for i in xrange(0, 50000):
... with open("/root/etc/" + str(i), "w") as f:
... f.write("0")
...
$ ls /root/etc > orig_list.txt $ mkdir /tmp/attack
$ cp -a /root/etc/* /tmp/attack
$ sudo python2.6
>>> from shutil import rmtree
>>> rmtree('/tmp/attack')
>>> # press ctrl-z to suspend execution
^Z
[1]+ Stopped sudo python2.6 $ mv /tmp/attack /tmp/dummy; ln -s /root/etc /tmp/attack
$ fg
sudo python2.6
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python2.6/shutil.py", line 225, in rmtree
onerror(os.rmdir, path, sys.exc_info())
File "/usr/local/lib/python2.6/shutil.py", line 223, in rmtree
os.rmdir(path)
OSError: [Errno 20] Not a directory: '/tmp/attack' $ ls /root/etc > new_list.txt
$ diff -q orig_list.txt new_list.txt
Files orig_list.txt and new_list.txt differ If the attack wasn't successful, /root/etc would not be modified and |
What course of action do you suggest? First chmod 0700 on the directory? |
Mmmh, very recent Linux kernels (>= 2.6.16) seem to have specific |
A shameless copy of the Perl fix for the bug Somebody has to examine the fix though, I'm afraid I'm not currently |
The Perl patch is here: It is a recursive implementation of rmtree. What it does is 1) get the |
Mmmh, the problem with Perl's approach is that it changes the current |
Using chdir() makes sense and it doesn't look like a too big problem to me: def rmtree(...):
...
curdir = os.getcwd()
try:
call chdir() as required
finally:
try:
os.chdir(curdir)
except:
warnings.warn("Unable to chdir to previous current dir")
... |
It's a problem if another thread in the process is making file Since shutil functions have until now been safe against this |
Ah, right you are. Attaching an initial alpha-quality patched shutil.py Run the script by sourcing it with . test_issue4489.sh, not by executing |
And here's the diff so you can review what I was up to. Note that this does not yet fix the problem (although the logic looks |
Aha, got it -- while removing /a/b/c/d, there's no easy way to detect I.e. given directory tree a
I'm afraid the solution for the Perl bug is susceptible to the same problem. |
A blunt, ineffective solution would be to walk the tree before removing If no clever bulletproof fix emerges, perhaps this should be added as |
There's no way to do the "check inode then remove" sequence atomically. |
Fixed a minor bug in test script and added Perl test as well. Perl with File-Path-2.07 passes the test. |
Antoine, what if we add another function, rmtree_safe() that uses |
Replying to previous comment:
Right, although the attack window would be tiny, this is not a real |
I don't have any strong opinion on it, maybe it should be brought on the |
FWIW, I've opened a separate bug entry for the creation of the openat(), |
Mart, I took over the maintenance of shutil, if you are interested in contributing a bullet-proof version of rmtree, please drop a line at python-ideas, and let's see what we can do in a new function maybe. I'll be happy to review and apply patches if we get a consensus |
How does "rm -rf" address this issue? Or does it? shutils.rmtree should probably do the same thing. |
Here is a draft patch. It uses the *at functions and fdlistdir consequently it only makes it safe if those functions are available. It works using a recursive implementation and an open file descriptor pointing to a directory, instead of maintaining state by changing the current directory. If the *at functions are unavailable, it falls back to the unsafe implementation. It requires the patches from bpo-4761 and bpo-10755 to work. |
Thanks for the patch. There seems to be a race remaining here: + try: Someone could change Some other things:
I haven't looked at the tests yet. |
Updated patch removes the race condition. Since an open follows symlinks, you can't just fstat the fd to see if it is a link. I followed the following to overcome this: |
Le mercredi 05 janvier 2011 à 16:58 +0000, Ross Lagerwall a écrit :
Nice. I am unsure about the following piece of code: + if stat.S_ISDIR(mode): If rmtree() encounters a symlink *inside* the tree, I would expect it to |
I think I misread the original implementation. Here is an updated version with that code just taken out. |
I made two comments on rietveld but the email was rejected. |
Updated patch based on Eric's comments: |
Excellent Nick! I discussed the very same thing with David yesterday but somehow we didn't come up with a good name. So we kept it on "safe" (which predates the secure discussion). |
Bikeshedding: (os.unlink in os.supports_dir_fd and os.open in os.supports_dir_fd) could be rewritten as { os.open, os.unlink } <= os.supports_dir_fd As you were! |
The code using set operations seems slightly too cryptic for me, even though I’m comfortable with sets and functions as first-class objects. A matter of taste I guess. BTW it sounds a bit strange to me to have the verb in the singular in supports_dir_fd when its meaning is “the functions in this set support dir_fd”. I guess it’s too late to propose “os.open.supports_dir_fd and os.unlink.supports_dir_fd” (and I don’t know if that is feasible with C functions) :) |
New changeset c2be81151994 by Nick Coghlan in branch 'default': |
Where were you when "is_implemented" was being savagely torn apart last week? ;-) |
I was at work, and moving out of my apartment, and desperating at the subthreads spawned by my mail about packaging, and agreeing with the people being -1 on is_implemented on Signature objects :-) |
Éric - there's almost certainly going to be a PEP for 3.4 about doing this kind of feature advertisement in a cleaner and more consistent way. |
Yep, that is promising. At worst we’ll have a new cool API and three obsolete sets in the os module, not a big deal. |
Okay everyone, let's call it day – after 3,5 years. :) Further enhancement requests please in separate tickets. Thanks to everybody who took part! |
The fix for this issue broke support for bytes in shutil.rmtree: $ mkdir -p /tmp/a/b
$ python3.2 -c 'import shutil; shutil.rmtree(b"/tmp/a")'
$ mkdir -p /tmp/a/b
$ python3.3 -c 'import shutil; shutil.rmtree(b"/tmp/a")'
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/lib64/python3.3/shutil.py", line 444, in rmtree
_rmtree_safe_fd(fd, path, onerror)
File "/usr/lib64/python3.3/shutil.py", line 381, in _rmtree_safe_fd
fullname = os.path.join(path, name)
File "/usr/lib64/python3.3/posixpath.py", line 78, in join
if b.startswith(sep):
TypeError: startswith first arg must be str or a tuple of str, not bytes
$ |
What platform? Windows, or non-Windows? It'll probably be obvious regardless, but that might help. |
Tinkering with os.path.join, that traceback means that "name" is a str instance, while "path" is a bytes instance. The culprit actually appears to be the fact that the type returned by os.listdir (et al) when handed a file descriptor is always a string (this is not explicitly documented, a problem in itself, but that's the behaviour I see here on linux). One way to handle this would be to use the filesystem encoding with surrogateescape to decode any bytes path passed in to shutil.rmtree (at least in the _rmtree_safe_fd case) and handle the actual removal with Unicode throughout. So long as the original encoding of the supplied bytes path is compatible with that of the underlying filesystem, surrogateescape should ensure that everything round trips correctly. |
Your deduction is correct. listdir can't tell what the original argument type was based on the output--path_converter abstracts away those details. So it separately tests the type of the first argument. Staring at it again it's about as clear as mud, but the goal was, the output is always strings unless the user specified "path" as bytes. I'll make a separate issue regarding making the code easier to read and adding a clarification to the documentation. We should spare future programmers from having to guess at this behavior :) |
New changeset 2e2329aeb5c1 by Hynek Schlawack in branch 'default': |
The fix (c910af2e3c98 + 53fc7f59c7bb) for this issue broke deletion of directories, which contain symlinks to directories. $ mkdir -p /tmp/a/b
$ ln -s b /tmp/a/c
$ python3.2 -c 'import shutil; shutil.rmtree("/tmp/a")'
$ mkdir -p /tmp/a/b
$ ln -s b /tmp/a/c
$ python3.3 -c 'import shutil; shutil.rmtree("/tmp/a")'
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/lib64/python3.3/shutil.py", line 447, in rmtree
_rmtree_safe_fd(fd, path, onerror)
File "/usr/lib64/python3.3/shutil.py", line 395, in _rmtree_safe_fd
_rmtree_safe_fd(dirfd, fullname, onerror)
File "/usr/lib64/python3.3/shutil.py", line 406, in _rmtree_safe_fd
onerror(os.rmdir, path, sys.exc_info())
File "/usr/lib64/python3.3/shutil.py", line 404, in _rmtree_safe_fd
os.rmdir(path)
NotADirectoryError: [Errno 20] Not a directory: '/tmp/a/c'
$ |
New changeset f9f798f1421e by Hynek Schlawack in branch 'default': |
Thanks you for catching that! It seems we did something really stupid: followed symlinks. I’ve fixed that and added regression tests. This one is a facepalm gentlepeople. |
I'm not a security guy, but: shouldn't the os.unlink call when it isn't a directory specify follow_symlinks=False? And wouldn't it be safer if the os.rmdir() call also used dir_fd=? Additionally, I think you missed some stuff for shutil._use_fd_functions. Assuming I'm right on both of the above, you should also check:
I'd spell that Finally, up to you, but I'd be tempted to change the "lstat" "and "fstat" calls to "stat" calls using the relevant parameters. |
os.unlink has no follow_symlinks argument. Imagine what would happen if
Unfortunately, os.rmdir('.', dir_fd=topfd) doesn’t work. As in the worst
Interestingly, os.listdir is not in os.supports_dir_fd although it works: False Will you fix it right away or shall I open a ticket?
It would be: _use_fd_functions = ({os.listdir, os.open, os.stat, os.unlink} <=
os.supports_dir_fd and
os.stat in os.supports_follow_symlinks) But currently can’t do.
That's not 3.3 fodder IMHO, feel free to open an enhancement ticket. |
I'm pretty busy right now, please open a ticket for listdir. _rmtree_safe_fd could remove the directory just after the recursive step using the parent's dirfd. Of course you'd also have to add a rmdir for the very-tippy-top after the original call in shutil.rmtree too. But this would prevent the malicious user from even removing empty directories. |
done
Okay I looked into it and it seems okay. IIRC, when Martin wrote the |
New changeset 9134bb4d0578 by Hynek Schlawack in branch 'default': |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: