PodSecurityPolicy allowedHostPaths does not effectively restrict to a subpath #61043
The allowedHostPaths feature limits which paths can be specified in a hostPath volume, but it does not restrict symlink creation and traversal within that subpath.
To prevent this, either of the following is required:
Until those changes are made, PodSecurityPolicy objects designed to limit container permissions must completely disable hostPath volumes.
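To make the gap concrete, here is a minimal sketch in Python rather than the actual Kubernetes code. It assumes the policy check is a purely lexical prefix comparison; `lexically_allowed` and `demo` are hypothetical names invented for illustration. A symlink planted inside the allowed directory passes the comparison even though it resolves somewhere else.

```python
import os
import tempfile


def lexically_allowed(path, allowed_prefix):
    """Hypothetical stand-in for a prefix-style allowedHostPaths check.

    It compares path strings only and never looks at the filesystem.
    """
    path = os.path.normpath(path)
    allowed_prefix = os.path.normpath(allowed_prefix)
    return os.path.commonpath([path, allowed_prefix]) == allowed_prefix


def demo():
    base = os.path.realpath(tempfile.mkdtemp())
    allowed = os.path.join(base, "allowed")
    secret = os.path.join(base, "secret")
    os.mkdir(allowed)
    os.mkdir(secret)

    # Something with write access to the allowed directory plants a symlink.
    link = os.path.join(allowed, "escape")
    os.symlink(secret, link)

    passes_check = lexically_allowed(link, allowed)
    # Resolving the symlink shows where the path actually leads.
    stays_inside = lexically_allowed(os.path.realpath(link), allowed)
    return passes_check, stays_inside


if __name__ == "__main__":
    print(demo())
```

The first value reports whether the path passes the string comparison, the second whether the resolved target is still under the allowed directory.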
Mar 12, 2018
tl;dr: I have an idea but it will require some layer violations between security policies and volumes.
I have a couple of ideas on how to solve this (I implemented quite a few of the symlink-related protections in Docker and runc), and I've started working on some preliminary patches. Unfortunately the more obvious ways of solving this problem aren't secure enough. One of the simpler ways of solving this problem would be:
However, the main problem I see at the moment is that there is a trivial TOCTTOU with the above setup. If one of the components of the expanded path is changed between these two steps, then the path is no longer safe. This can be handled in a more complicated way by holding a reference to the directory and then applying the checks to that reference (using the reference in the
Why is it so convoluted? Because we need a real reference to the path to avoid race conditions from mutating a path that we operate on multiple times. Our recursion of
Sorry if that's a bit of a brain-dump, but it's how I would suggest tackling this issue.
Also note that while https://github.com/cyphar/filepath-securejoin (the symlink escape handling code that Docker and runc use) is a fairly simple solution to this problem, it was explicitly not designed for cases where you may have adversaries that are modifying the filesystem in a way that makes TOCTTOU viable.
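The "hold a reference and check the reference" idea from this comment can be sketched roughly as follows. This is an illustration under stated assumptions, not what the kubelet, Docker, or runc actually do: it is Linux-only because it reads `/proc/self/fd`, and `open_checked_dir` and `demo` are hypothetical names. The directory is opened once, the check is applied to what the descriptor refers to, and a later swap of the path does not change what the descriptor names.

```python
import os
import tempfile


def open_checked_dir(path, allowed_root):
    """Open `path` once, then validate the object the descriptor refers to.

    Hypothetical helper; assumes Linux with /proc mounted.
    """
    fd = os.open(path, os.O_RDONLY | os.O_DIRECTORY)
    try:
        # Ask the kernel where the already-opened directory really lives.
        real = os.readlink("/proc/self/fd/%d" % fd)
        root = os.path.realpath(allowed_root)
        if os.path.commonpath([real, root]) != root:
            raise PermissionError("%s resolves outside %s" % (path, root))
    except BaseException:
        os.close(fd)
        raise
    return fd


def demo():
    base = os.path.realpath(tempfile.mkdtemp())
    allowed = os.path.join(base, "allowed")
    secret = os.path.join(base, "secret")
    vol = os.path.join(allowed, "vol")
    os.makedirs(vol)
    os.mkdir(secret)
    open(os.path.join(vol, "marker"), "w").close()

    fd = open_checked_dir(vol, allowed)
    try:
        # Simulated attacker: swap the checked directory for a symlink.
        os.rename(vol, vol + ".old")
        os.symlink(secret, vol)
        # Re-resolving the path follows the swap...
        path_target = os.path.realpath(vol)
        # ...while the descriptor still names the directory that was checked.
        fd_listing = os.listdir(fd)
    finally:
        os.close(fd)
    return path_target == secret, fd_listing


if __name__ == "__main__":
    print(demo())
```

For this to help, every later operation has to go through the descriptor (for example via `dir_fd=`) instead of re-resolving the path. The sketch also still resolves `allowed_root` by path, so it does not cover the allowed root itself being moved.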
PodSecurityPolicy enforcement is done at admission time, by the API server, not at container start time on the node. The only options I see available at admission time (without adding API fields to the pod spec and plumbing more information down to the kubelet) are letting PSP require the volume to be readonly or an exact path match.
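As a rough illustration of what such an admission-time rule could look like, the sketch below checks a requested hostPath mount against policy entries that may demand a read-only mount or an exact path match. All names here (`AllowedHostPath`, `HostPathMount`, `admit`, and the `exact` flag) are hypothetical and are not the Kubernetes API. The check sees only the strings in the pod spec, never the node's filesystem.

```python
import posixpath
from dataclasses import dataclass


@dataclass
class AllowedHostPath:
    """Hypothetical policy entry."""
    path: str
    read_only: bool = False  # if set, only read-only mounts are admitted
    exact: bool = False      # if set, the path must match exactly


@dataclass
class HostPathMount:
    """Hypothetical view of a hostPath volume as requested by a pod."""
    path: str
    read_only: bool


def admit(mount, rules):
    """Return True if at least one rule admits the requested mount."""
    requested = posixpath.normpath(mount.path)
    for rule in rules:
        allowed = posixpath.normpath(rule.path)
        if rule.exact:
            matches = requested == allowed
        else:
            # Match whole path elements, so /var/log does not admit /var/logs.
            matches = (requested == allowed
                       or requested.startswith(allowed.rstrip("/") + "/"))
        if matches and (mount.read_only or not rule.read_only):
            return True
    return False


if __name__ == "__main__":
    rules = [AllowedHostPath("/var/log", read_only=True),
             AllowedHostPath("/data/app", exact=True)]
    print(admit(HostPathMount("/var/log/pods", read_only=True), rules))
    print(admit(HostPathMount("/var/log/pods", read_only=False), rules))
```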
Exact path matches won't help against symlink components changing. If you want to fix the issue in a complete way, you need to pass a file descriptor which can be used for mounting in some fashion (it's a shame that this will cause a layer violation, but the current design doesn't provide the security it claims to
From my point of view, neither is really an adequate solution if you want to protect against races with the kubelet. If races against the kubelet aren't an issue, then you can just do the first proposal I listed. That proposal will still do enforcement at admission time, but the kubelet will take an additional precaution as soon as it can to avoid further symlink changes.
The presumption is that the restricted user does not have the ability to change parent components of the allowed path.
This is not intended to protect against a user with host access.