privilege escalation via ptrace (CVE-2016-8659) #107
Comments
As a short-term fix for Debian, I've reverted #94 so there is no way to induce the privileged process to issue the See also http://www.openwall.com/lists/oss-security/2016/10/13/4
The check described above should be fine for that exact exploit, but we should also try to figure out the minimal part that needs DUMPABLE.
This is normally verified on argument validation, but it may happen if someone managed to send custom priv-sep operations via e.g. ptrace. See containers#107
So, for the dumpable, if you remove the code that sets this, then you get:
Because it's trying to write to /proc/self/uid_map, which is root-owned for a non-dumpable app.
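The effect described above can be observed from any unprivileged process. The following is a minimal sketch, not bubblewrap's code: it toggles the dumpable flag with `prctl()` through `ctypes` and reports who owns `/proc/self/uid_map`. The helper names (`get_dumpable`, `set_dumpable`, `proc_file_owner`) are made up for illustration, and it assumes a Linux system with `/proc` mounted and user namespace support.

```python
import ctypes
import os

# Constants from <linux/prctl.h>.
PR_GET_DUMPABLE = 3
PR_SET_DUMPABLE = 4

_libc = ctypes.CDLL(None, use_errno=True)


def _prctl(option: int, arg2: int = 0) -> int:
    # prctl() is variadic and reads its extra arguments as unsigned long.
    result = _libc.prctl(option, ctypes.c_ulong(arg2), ctypes.c_ulong(0),
                         ctypes.c_ulong(0), ctypes.c_ulong(0))
    if result < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return result


def get_dumpable() -> int:
    """Return the dumpable flag of this process (0 = not dumpable, 1 = dumpable)."""
    return _prctl(PR_GET_DUMPABLE)


def set_dumpable(enabled: bool) -> None:
    """Set or clear the dumpable flag of this process."""
    _prctl(PR_SET_DUMPABLE, 1 if enabled else 0)


def proc_file_owner(name: str = "uid_map") -> int:
    """Return the uid that owns /proc/self/<name> (hypothetical helper)."""
    return os.stat(f"/proc/self/{name}").st_uid


if __name__ == "__main__":
    print("dumpable:", get_dumpable(), "owner:", proc_file_owner())
    set_dumpable(False)
    # While non-dumpable, the kernel is expected to report root as the owner.
    print("dumpable:", get_dumpable(), "owner:", proc_file_owner())
    set_dumpable(True)  # restore the usual state for a non-setuid process
```

Run as a regular user, the second line should show the owner changing away from your own uid, which is why a parent running as the user uid cannot write the map of a non-dumpable child.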
The root cause of containers#107 aka CVE-2016-8659 is that we were explicitly turning on the dumpable flag, which allows the caller to `ptrace()` us.

In fact, Linux already introduced `setfsuid()` for the NFS server for a very similar reason; see `man setfsuid`:

```
At the time when this system call was introduced, one process could
send a signal to another process with the same effective user ID.
This meant that if a privileged process changed its effective user ID
for the purpose of file permission checking, then it could become
vulnerable to receiving signals sent by another (unprivileged) process
with the same user ID.
```

Let's make use of this, which makes us the same as other setuid binaries, without introducing any additional risk from being potentially `ptrace()`able.
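As a rough illustration of the mechanism (not the actual patch), the sketch below switches only the filesystem uid around a single file write and then restores it. The names `current_fsuid`, `fsuid` and `write_with_fsuid` are hypothetical. It relies on two documented properties of `setfsuid()`: it always returns the previous fsuid, and passing an invalid id such as -1 leaves the value unchanged. It also assumes a single-threaded Linux process.

```python
import contextlib
import ctypes

_libc = ctypes.CDLL(None, use_errno=True)
_libc.setfsuid.argtypes = [ctypes.c_uint]
_libc.setfsuid.restype = ctypes.c_int

_INVALID_UID = 0xFFFFFFFF  # (uid_t)-1, never a valid user id


def current_fsuid() -> int:
    """Query the filesystem uid without changing it."""
    return _libc.setfsuid(_INVALID_UID)


@contextlib.contextmanager
def fsuid(uid: int):
    """Temporarily use `uid` for file permission checks only."""
    previous = _libc.setfsuid(uid)
    try:
        # setfsuid() reports no errors, so verify that the change took effect.
        if current_fsuid() != uid:
            raise PermissionError(f"could not switch fsuid to {uid}")
        yield
    finally:
        _libc.setfsuid(previous)


def write_with_fsuid(path: str, data: str, uid: int) -> None:
    """Write `data` to `path` with permission checks done as `uid`."""
    with fsuid(uid):
        with open(path, "w") as f:
            f.write(data)
```

The effective uid never changes here, so the set of processes allowed to signal or trace the caller stays the same as before the write.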
See #109
So, here is the issue. When we're using user namespaces, the privileged parent needs to set the uid mapping in the child user namespace, because there is not necessarily a 1-1 mapping. However, the child, as well as the parent, is non-dumpable due to the initial setuid bit. Non-dumpable processes have uid 0 owning their /proc/pid/ files though, so the parent (running as the user uid) isn't allowed to set the uid map unless the child is made dumpable.

However, this is only needed in the case when we need user namespaces. If we don't, then being non-dumpable is fine. So, my solution is to only make the child dumpable in the case of using user namespaces.

But, I hear you, then we'll be able to ptrace the child in the case of using user namespaces, causing this bug to re-appear! That is not true though, because in a user namespace we lose all capabilities in the parent namespace, so we will not be able to do anything affecting the host namespace anyway (i.e. sethostname gets EPERM unless you unshared the UTS ns). In fact, there is not much we can do here, because any process in the parent user namespace (i.e. the regular host namespace) has all caps in the user namespace, including CAP_SYS_PTRACE, which overrides the value of dumpable anyway.
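For readers unfamiliar with the step being discussed, here is a minimal sketch of what "setting the uid mapping" amounts to. The function names are invented for illustration and this is not how bubblewrap structures its code. A mapping line has the form `<id inside namespace> <id outside namespace> <count>`, and the kernel expects the whole map in a single `write()`.

```python
import os


def format_id_map(inside_id: int, outside_id: int, count: int = 1) -> str:
    """Build one mapping line as expected by /proc/<pid>/uid_map."""
    return f"{inside_id} {outside_id} {count}\n"


def write_uid_map(pid: int, inside_uid: int, outside_uid: int, count: int = 1) -> None:
    """Write a single-extent uid mapping for the user namespace of `pid`.

    The open() below is where the ownership of the /proc/<pid>/ files matters:
    it fails with EACCES when the file is owned by root and the caller is not.
    """
    data = format_id_map(inside_uid, outside_uid, count).encode()
    fd = os.open(f"/proc/{pid}/uid_map", os.O_WRONLY)
    try:
        os.write(fd, data)  # the map must be supplied in one write
    finally:
        os.close(fd)
```

For example, `write_uid_map(child_pid, 0, 1000)` would ask for uid 1000 on the host to appear as uid 0 inside the child's namespace, where `child_pid` is a placeholder for a process that has already unshared its user namespace.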
Here is my approach instead:
Because the fsuid patch gives somewhat increased privileges (fsuid 0) which are not needed.
In particular, it is doing file writes as uid 0.
So the risk with fsuid is things like - what if somehow the pid got reused and we happened to write to the uid map for another pid? Another aspect is that suddenly we're trusting The risk with dumpable is - now all of a sudden we have to be concerned about the fact that someone could ptrace us. And yes, we are in a user namespace...but in fact, your argument:
appears to me to be wrong, because the dumpable flag isn't set, and we need that and the capability to ptrace. If I add in to your patch:
And try to strace before setting dumpable, I get So the aspect of being in a user namespace obviously mitigates things a lot...but we need to consider risks such as having an open file descriptor outside of the namespace. I doubt we have anything open that's writable, but we'd have to be careful in the future. If you're arguing that we can enable dumpable, it follows that everything else after that could be arbitrary user code, and we could do:
Or actually, hypothetically:
Which I think is where some of the patches in #101 are going. Basically, in the userns path (whether privileged or not) we rely solely on the kernel for security post- |
Here's a good way to compare the patches - on kernels vulnerable to https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2016-3135 (the one we cite in Basically, your patch makes bubblewrap's security equivalent to that of the full kernel user namespace feature set, whereas mine keeps things restricted to what we expose on the bwrap command line. (But on the other hand, exposing the full userns feature set is where #101 is going) |
This sounds like it's heading towards a sysadmin configuration mechanism for "how much exposure to kernel userns bugs am I willing to tolerate?"...
This sounds to me like a good argument for taking the fsuid approach, at least short-term.
@cgwalters Your test case is wrong, because getpid() is cached, so it returns the old value.
So, yeah, I get your point, but even with the fsuid approach we are ptraceable. However, I mentioned this to @eparis and Eric Biederman, and he said:

> ebiederm: alex, eparis That corner case where we let user namespace caps override dumpable looks buggy. I am not certain how that got in there.

So, maybe this will change over time.
Maybe the end result is that we just can't expose user namespaces securely (above full kernel access to user namespaces) at all?
Closed via #110
Sebastian Krahmer reported this to the oss-security mailing list.