tmpfiles: unsafe handling of hard links and a race condition #7736

Closed
orlitzky opened this issue Dec 24, 2017 · 12 comments

@orlitzky

These issues only affect a vanilla kernel, so for any of this to make sense on a patched distro kernel, you'll want to disable the following:

$ sudo sysctl -w fs.protected_hardlinks=0
$ sudo sysctl -w kernel.grsecurity.linking_restrictions=0

The tmpfiles.d specification for the Z type more or less implies some kind of recursive chown. The spec heads off one type of vulnerability by saying that symlinks should not be followed; however, hard links are still a problem. Consider the following:

$ cat /etc/tmpfiles.d/exploit-recursive.conf
d /var/lib/systemd-exploit-recursive 0755 mjo mjo
Z /var/lib/systemd-exploit-recursive 0755 mjo mjo

The first time that tmpfiles is run, everything is fine. But then my "mjo" user owns the directory in question, and I can create a hard link...

$ ln /etc/passwd /var/lib/systemd-exploit-recursive/x

and re-run tmpfiles...

$ sudo ./build/systemd-tmpfiles --create

to take ownership of /etc/passwd:

$ /bin/ls -l /etc/passwd
-rwxr-xr-x 2 mjo mjo 1504 Dec 20 14:27 /etc/passwd

Now, I said that everything was fine the first time that tmpfiles was run, but I lied. The recursive chown moves from the top down, meaning that systemd-exploit-recursive/x is chowned after systemd-exploit-recursive. There is a race condition there that can be exploited. In another terminal, you can run,

while true; do ln /etc/passwd /var/lib/systemd-exploit-recursive/x; done;

and if you're lucky, the hard link will get created after you own the systemd-exploit-recursive directory, but before chown is called on x. This particular race condition isn't unique to the Z type. For another example, consider,

$ cat /etc/tmpfiles.d/exploit-race.conf
d /var/lib/systemd-exploit-race 0755 mjo mjo
f /var/lib/systemd-exploit-race/foo 0644 mjo mjo

Here, the same thing happens, and the "mjo" user has some time to replace foo with a hard link.

@orlitzky
Author

FWIW, OpenRC's checkpath helper had to deal with this same problem, and the workaround can be seen at https://github.com/OpenRC/openrc/blob/master/src/rc/checkpath.c#L161

@floppym
Contributor

floppym commented Dec 24, 2017

systemd has been enabling fs.protected_hardlinks since 8f27a22. The sysadmin would have to go out of his way to disable it.

@poettering
Member

poettering commented Dec 25, 2017

FWIW, OpenRC's checkpath helper had to deal with this same problem, and the workaround can be seen at https://github.com/OpenRC/openrc/blob/master/src/rc/checkpath.c#L161

Hmm, unless I am mistaken that code is full of TOCTOU, no? The "workaround" you are referring to is the st.st_nlink > 1 check, I presume? But that check is done on stat data that might be out of date by the time chown() is ultimately called, hence it doesn't fix anything, does it?
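To make the time-of-check/time-of-use gap concrete, here is a small sketch, assuming a Linux/POSIX system and scratch files under /tmp. It is not the checkpath code; the function names are hypothetical, and the "attacker" step is simply performed inline between the check and the point where a path-based chown() would run:

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create an empty regular file; returns 0 on success, -1 on error. */
static int touch_file(const char *path) {
        int fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0644);
        if (fd < 0)
                return -1;
        return close(fd);
}

/* Returns 1 if the name checked with stat() refers to a different,
 * multiply-linked inode by the time it would be used; 0 if not; -1 on error. */
static int path_swapped_after_check(void) {
        char dir[] = "/tmp/toctou-demo-XXXXXX";
        char secret[64], foo[64];
        struct stat checked, used;
        int r = -1;

        if (!mkdtemp(dir))
                return -1;
        snprintf(secret, sizeof secret, "%s/secret", dir);
        snprintf(foo, sizeof foo, "%s/foo", dir);
        if (touch_file(secret) < 0 || touch_file(foo) < 0)
                goto out;

        /* Time of check: foo is an ordinary file with a single link. */
        if (stat(foo, &checked) < 0)
                goto out;
        assert(checked.st_nlink == 1);

        /* The window: the directory owner swaps foo for a hard link. */
        if (unlink(foo) < 0 || link(secret, foo) < 0)
                goto out;

        /* Time of use: chown(foo, ...) would now act on this inode instead. */
        if (stat(foo, &used) < 0)
                goto out;
        r = used.st_ino != checked.st_ino && used.st_nlink > 1;
out:
        unlink(foo);
        unlink(secret);
        rmdir(dir);
        return r;
}
```

The stale `checked` data still says "one link", which is why a check on path-based stat() output cannot vouch for what a later path-based chown() touches.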

I figure we could add a similar check unless protected_hardlinks is set. I see no way how we could ever deal with this without protected_hardlinks and still allow hardlinked stuff, right?

@poettering poettering added this to the v237 milestone Dec 25, 2017
@orlitzky
Author

Hmm, unless I am mistaken that code is full of TOCTOU, no?

Ugh, yes, and it looks like even the symlink check (which is exploitable with protected_symlinks) is still vulnerable too :(

I figure we could add a similar check unless protected_hardlinks is set. I see no way how we could ever deal with this without protected_hardlinks and still allow hardlinked stuff, right?

It's up to you. I wasn't aware that systemd set protected_hardlinks by default, and that's obviously a very wise thing to do. That leaves only a few classes of people who would be affected:

  • People who don't boot systemd but who, for some reason, are running its tmpfiles implementation.
  • Admins who have disabled protected_hardlinks.

You can still mount a filesystem with the outlawed hard links on it even with protected_hardlinks=1, but I don't see any way to abuse that.

More to the point, I don't know how to fix it properly in the scenario where the user re-runs tmpfiles and a Z entry is present.

For the race condition, instead of modifying the parent and then its children, could it be done the other way around? That's how GNU chown avoids the race. I haven't tested this yet (Merry Christmas :) but I'm referring to,

        STRV_FOREACH(fn, g.gl_pathv) {
                k = action(i, *fn);
                if (k < 0 && r == 0)
                        r = k;

                if (recursive) {
                        k = item_do_children(i, *fn, action);
                        if (k < 0 && r == 0)
                                r = k;
                }
        }

and

        FOREACH_DIRENT_ALL(de, d, r = -errno) {
                ...
                q = action(i, p);
                if (q < 0 && q != -ENOENT && r == 0)
                        r = q;

                if (IN_SET(de->d_type, DT_UNKNOWN, DT_DIR)) {
                        q = item_do_children(i, p, action);
                        if (q < 0 && r == 0)
                                r = q;
                }
        }

If you called chown on foo/bar before foo, then at least the first time around, the new owner wouldn't be able to replace foo/bar at the last second. That doesn't help when there are two d entries, but it makes Z a tiny bit safer.
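As a rough illustration of the children-first order, here is a sketch that uses POSIX nftw() with FTW_DEPTH instead of systemd's own item_do_children(); that substitution is an assumption made to keep the example self-contained. The callback only records the visiting order where a real one would chown, and all names are hypothetical:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <ftw.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

#define MAX_SEEN 8

static char seen[MAX_SEEN][128];  /* paths in the order nftw() reported them */
static int n_seen;

/* Callback: a real implementation would chown here; this one only logs. */
static int record(const char *path, const struct stat *st, int type, struct FTW *ftw) {
        (void) st; (void) type; (void) ftw;
        assert(path);
        if (n_seen < MAX_SEEN)
                snprintf(seen[n_seen++], sizeof seen[0], "%s", path);
        return 0;  /* keep walking */
}

/* Walk root so that a directory is reported after everything inside it.
 * FTW_PHYS keeps nftw() from following symlinks. */
static int walk_children_first(const char *root) {
        n_seen = 0;
        return nftw(root, record, 16, FTW_DEPTH | FTW_PHYS);
}

/* Build foo/bar under /tmp and check that bar is visited before foo.
 * Returns 1 if so, 0 if not, -1 on setup error. */
static int children_visited_first(void) {
        char foo[] = "/tmp/postorder-demo-XXXXXX";
        char bar[128];
        int r = -1;

        if (!mkdtemp(foo))
                return -1;
        snprintf(bar, sizeof bar, "%s/bar", foo);
        if (mkdir(bar, 0755) == 0 && walk_children_first(foo) == 0)
                r = n_seen == 2 && strcmp(seen[0], bar) == 0 && strcmp(seen[1], foo) == 0;

        rmdir(bar);
        rmdir(foo);
        return r;
}
```

With this order the parent directory stays root-owned until all of its children have been handled, so the new owner cannot plant a link during the first pass.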

@orlitzky
Author

After some more digging, it looks like the existing Z behavior might match what the spec says for files and directories? I'm having some trouble interpreting this line:

When two lines are prefix and suffix of each other, then the prefix is always processed first, the suffix later.

If that means that /foo will always be created before /foo/bar, then at least the top-down recursion is consistent, but also the init system is limited in what it can do to avoid doing stuff in a directory that isn't owned by root.

@poettering
Member

For the race condition, instead of modifying the parent and then its children, could it be done the other way around?

So, regarding this race condition: If I understood you right you mean the race between us figuring out whether a file needs chown()ing and is safe to chown() (i.e. nlinks is 1 and so on) and us actually chowning it? If the user has write access to the dir, he could replace the file in between?

I don't think this is really an issue for us: we open the inode with O_PATH, then do a stat() check on the open fd, and chown() the thing via /proc/self/fd/$fd. This means we have the guarantee that our safety checks and our chowning operate on the same inode which cannot be replaced in between.
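The pattern described here can be sketched roughly as follows. This is not the tmpfiles code: `chown_pinned` and its demo are hypothetical, the -EMLINK return value is an arbitrary choice, and the sketch assumes Linux with /proc mounted:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Pin the inode behind path, run the link-count check on that inode, then
 * chown the very same inode. Returns 0 or a negative errno value. */
static int chown_pinned(const char *path, uid_t uid, gid_t gid) {
        char fn[64];
        struct stat st;
        int fd, r = 0;

        assert(path);
        /* O_PATH gives a handle on the inode without opening its contents;
         * with O_NOFOLLOW a trailing symlink is pinned itself, not followed. */
        fd = open(path, O_PATH | O_NOFOLLOW | O_CLOEXEC);
        if (fd < 0)
                return -errno;

        if (fstat(fd, &st) < 0)
                r = -errno;
        else if (!S_ISDIR(st.st_mode) && st.st_nlink > 1)
                r = -EMLINK;  /* assumed policy: refuse hard-linked files */
        else {
                snprintf(fn, sizeof fn, "/proc/self/fd/%d", fd);
                if (chown(fn, uid, gid) < 0)
                        r = -errno;
        }

        close(fd);
        return r;
}

/* Usage sketch: chown a scratch file to ourselves (allowed without root),
 * then add a second link and check that the same call is refused.
 * Returns 1 if both calls behaved that way. */
static int chown_pinned_demo(void) {
        char file[] = "/tmp/chown-pinned-XXXXXX";
        char alias[64];
        int fd, first, linked, second;

        fd = mkstemp(file);
        if (fd < 0)
                return -1;
        close(fd);
        snprintf(alias, sizeof alias, "%s.link", file);

        first = chown_pinned(file, geteuid(), getegid());
        linked = link(file, alias);
        second = chown_pinned(file, geteuid(), getegid());

        unlink(alias);
        unlink(file);
        return linked == 0 && first == 0 && second == -EMLINK;
}
```

Renaming or replacing the path after open() does not change which inode the fstat() and the chown() see. It does not stop a link from being added to that same inode between the two calls, though.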

Or am I misunderstanding you?

@orlitzky
Author

So, regarding this race condition: If I understood you right you mean the race between us figuring out whether a file needs chown()ing and is safe to chown() (i.e. nlinks is 1 and so on) and us actually chowning it?

That whole comment was referring to the Z type. There are two exploits there,

  1. The easy one, where the service is eventually restarted (everything is re-chown'd) and you have all the time in the world to place a hard link in the directory.
  2. The hard one, where you try to exploit the recursive create/chown the first time the service is started, by sticking a hardlink into a directory the instant that the directory is created and chown'd to you.

The race condition I was referring to is (2). IIRC it's when between,

 k = action(i, *fn);

and

if (recursive) {
    k = item_do_children(i, *fn, action);

I'm able to add a child that is a hard link to /etc/passwd or whatever.

poettering added a commit to poettering/systemd that referenced this issue Jan 23, 2018
…s protected_hardlinks sysctl is on

Let's add some extra safety.

Fixes: systemd#7736
@poettering
Member

I prepped a fix for this now, see #7964. PTAL.

With that fix the race you are referring to is gone too, no? With that fix, we refuse to chown()/chmod()/ACL anything that has a hardlink count > 1 unless the protected hardlinks feature is on.

@orlitzky
Author

I don't understand the /proc/pid/fd magic well enough at this point so I'll just try to describe what I see.

First, the situation that I was worried about. After the PR, we're doing...

fd = open(path, O_NOFOLLOW|O_CLOEXEC|O_PATH);
...
if (fstatat(fd, "", &st, AT_EMPTY_PATH) < 0)
...
if (hardlink_vulnerable(&st))

and I was concerned that the owner of the directory might be able to delete the hard link after you've acquired the FD, but before you stat it.

But then I see...

xsprintf(fn, "/proc/self/fd/%i", fd);
...
chown(fn, ...);

which looks very clever, because if I delete the hard link, then the symlink will still point to the old path. But let's see what happens. I introduced these two lines right before the stat,

printf("opened fd %i for %s, sleeping...\n", fd, path);
sleep(10);

if (fstatat(fd, "", &st, AT_EMPTY_PATH) < 0)

And then I ran tmpfiles on this:

d /var/lib/systemd-exploit-recursive 0755 mjo mjo
Z /var/lib/systemd-exploit-recursive 0755 mjo mjo

After the first run, I hardlinked /var/lib/systemd-exploit-recursive/passwd to /etc/passwd as my mjo user, and re-ran tmpfiles. It gets to this point,

opened fd 3 for /var/lib/systemd-exploit-recursive, sleeping...
opened fd 4 for /var/lib/systemd-exploit-recursive/passwd, sleeping...

which is what you'd expect so far. And then in another terminal, I erased /var/lib/systemd-exploit-recursive/passwd, again as my mjo user. Afterwards I see,

$ sudo ls -l --color /proc/7239/fd
total 0
lrwx------ 1 root root 64 Jan 23 10:37 0 -> /dev/pts/0
lrwx------ 1 root root 64 Jan 23 10:37 1 -> /dev/pts/0
lrwx------ 1 root root 64 Jan 23 10:37 2 -> /dev/pts/0
lr-x------ 1 root root 64 Jan 23 10:37 3 -> /var/lib/systemd-exploit-recursive
l--------- 1 root root 64 Jan 23 10:37 4 -> '/var/lib/systemd-exploit-recursive/passwd (deleted)'

which looks very promising! How can you call chown on that (deleted) path? It should fail, right?

Well, as far as I can tell, it doesn't, because when tmpfiles is finished, I own /etc/passwd.

Maybe I screwed up the test, or maybe there's some kernel documentation that explains what happened under /proc, I dunno.

@poettering
Member

Oh, note that /proc/$pid/fd/$fd is magic. It's not really a symlink, it's some magic shit that is just exposed as symlink. It's like /proc/$pid/root, which also shows up as symlink but in reality is some magic shit that is just exposed as one.

And because it is that, /proc/$pid/fd/$fd always points to the original inode, regardless if it got renamed or deleted or anything. Hence you can "pin" an inode through O_PATH, and then execute operations on it this way, without fearing that it will be invalidated while doing so.
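This pinning behaviour can be checked with a short sketch, assuming Linux with /proc mounted and a scratch file under /tmp (the function name is made up for illustration). It deletes the only name of a pinned file and then reaches the inode again through /proc/self/fd:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Pin a scratch file with O_PATH, delete its only name, then stat() it
 * through /proc/self/fd/N. Returns 1 if that still reaches the original
 * inode, 0 if not, -1 on setup error. */
static int pinned_inode_survives_unlink(void) {
        char file[] = "/tmp/pin-demo-XXXXXX";
        char fn[64];
        struct stat before, after;
        int tmp, fd, n, r = -1;

        tmp = mkstemp(file);
        if (tmp < 0)
                return -1;
        close(tmp);

        fd = open(file, O_PATH | O_NOFOLLOW | O_CLOEXEC);
        if (fd < 0) {
                unlink(file);
                return -1;
        }

        if (fstat(fd, &before) == 0 && unlink(file) == 0) {
                n = snprintf(fn, sizeof fn, "/proc/self/fd/%d", fd);
                assert(n > 0 && (size_t) n < sizeof fn);
                /* stat() follows the magic link straight to the pinned inode,
                 * even though readlink() would now show "... (deleted)". */
                if (stat(fn, &after) == 0)
                        r = before.st_dev == after.st_dev && before.st_ino == after.st_ino;
        }

        unlink(file);  /* harmless if the name is already gone */
        close(fd);
        return r;
}
```

This matches the experiment above: removing the link in the directory left fd 4 attached to the inode that /etc/passwd also names, so the chown still went through.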

@orlitzky
Author

Oh, note that /proc/$pid/fd/$fd is magic. It's not really a symlink, it's some magic shit that is just exposed as symlink... And because it is that, /proc/$pid/fd/$fd always points to the original inode, regardless if it got renamed or deleted or anything.

Ok, that explains what I saw. This is still an improvement because it makes the attacker win a race rather than just create a hard link whenever he feels like it. And I still have no idea how to fix it in general, so there's that.

@hartwork

hartwork commented Jan 31, 2018

FYI CVE-2017-18078 was assigned to this issue.

SergioAtSUSE pushed a commit to SergioAtSUSE/systemd_systemd that referenced this issue Jun 7, 2018
…s protected_hardlinks sysctl is on

Let's add some extra safety.

Fixes: systemd#7736
(cherry picked from commit 5579f85)

[fbui: fixes bsc#1077925]
[fbui: fixes CVE-2017-18078]
Werkov pushed a commit to Werkov/systemd that referenced this issue Nov 27, 2018
…s protected_hardlinks sysctl is on

Let's add some extra safety.

Fixes: systemd#7736
(cherry picked from commit 5579f85)

[fbui: fixes bsc#1077925]
[fbui: fixes CVE-2017-18078]
mikhailnov pushed a commit to mikhailnov/systemd that referenced this issue Aug 14, 2019
…s protected_hardlinks sysctl is on

Let's add some extra safety.

Fixes: systemd#7736

Backport of commit 5579f85 to systemd-230