Configure JupyterHub spawner to allow root operations by default #300
Seems useful, as this would also allow people to install packages. Any info on what the security implications are?
I agree that this seems worth considering and potentially problematic, but I don't have any expert understanding in that area to contribute. From my cursory reading, it seems that running containers with root credentials makes it more likely that a kernel vulnerability could be exploited to gain access to the node running that container, given that with Docker the kernel is shared?
I think the use cases above are convincing enough that it's worth doing until we have more reason to believe it's unsafe and should disable it.
cc @mayakacz
Containers that don't need to run as root shouldn't run as root. See for example: https://docs.docker.com/engine/security/security/#linux-kernel-capabilities

My concern is the earlier comment, "its worth doing until we have more reason to believe its unsafe" - I strongly disagree. Making this kind of change later would be breaking, and it'll be painful to get users to migrate from an unsafe state to a safe state. If we can prevent this initially, that is preferred. Ideally, we provide the container with only the capabilities it needs, following the principle of least privilege.
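The least-privilege direction could be sketched as a JupyterHub config fragment. This is a hypothetical example only: it assumes kubespawner's `extra_container_config` option and UID 1000 (the `jovyan` user in the Jupyter docker-stacks images); verify the option names against your installed kubespawner version.

```python
# Hypothetical jupyterhub_config.py fragment (not the actual kubeflow
# config): run single-user pods as a non-root user and drop all Linux
# capabilities, per the least-privilege argument above.
c.KubeSpawner.extra_container_config = {
    "securityContext": {
        "runAsUser": 1000,               # jovyan in docker-stacks images
        "allowPrivilegeEscalation": False,
        "capabilities": {"drop": ["ALL"]},
    }
}
```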
I agree with seeking to provide the least privilege possible, preventing this initially if possible, and avoiding breaking changes. I think the debate should be whether installing system packages is a requirement of the developer experience, because people can pip install into a virtual environment, we can mount NFS on startup, and we can probably solve the Docker problem without root (e.g. we could launch notebook pods with a docker daemon sidecar, a la dind).

On the no side, perhaps we want to provide a rich base image with all the common tools already installed (e.g. ffmpeg for rendering) and expect that installation of packages requiring system-level privileges will be infrequent and will involve re-building the workspace container (followed by non-root users launching the container). On the yes side, it seems like it would be nice to have a development environment with the full capabilities of local development.

The worst-case scenario seems to be that the user installs malicious packages that lead to an attacker gaining access to the Kubernetes credentials for the underlying cluster. It doesn't render the issue moot, perhaps just raising another, but currently this can already happen since people are pulling kube and gcloud credentials into the notebook so they can use kubectl, gcloud, gsutil, etc.
A few questions (apologies if I am missing context):
No worries, sorry if I'm missing experience with security! Yeah, that's a good question. In terms of system packages, these should be pretty common: everyone debugging their TensorFlow Agents renders in the notebook is going to need libav-tools and ffmpeg; see also extensions of the base notebook image. Users will often need to install pip packages or change the version of those already installed, but I'm guessing this should just be done in a conda or other dependency env.

@danijar I'm guessing you don't very often need to run sudo-level operations and that you're primarily installing and changing versions of pip packages, is that correct? How would you feel about developing in an environment without sudo?

@mayakacz Is there any concern about allowing users to install arbitrary python packages at the user level? Perhaps this gets at your second point. Here's an example of such a notebook where `gcloud container clusters --project={PROJECT} --zone={ZONE} get-credentials {CLUSTER}` was called to obtain kube credentials, which were stored in the default location. Also, this particular cluster was deployed with the cloud platform scope:

```shell
gcloud beta container clusters create \
  $CLUSTER \
  --project ${PROJECT} \
  --zone=$ZONE \
  ...
  --scopes=https://www.googleapis.com/auth/cloud-platform
```

which is why you're already authenticated to make gcloud and gsutil calls. Kubernetes has a mechanism for storing secrets, but the end result is just the credential saved to disk, the same as results from `gcloud container clusters get-credentials ...`.

Given the above, would you want to do something like mirror a bunch of popular ML packages and regularly scan those?
@jlewi You mentioned NFS; do you anticipate people needing to mount additional NFS volumes to existing notebook containers that aren't pre-attached? Maybe you're thinking of a variety of different datasets on different NFS volumes available to mount?
@cwbeitel "Is there any concern about allowing users to install arbitrary python packages at the user level?"

Re: credentials storage in secrets, yes, I believe that is the best way to do it with what's available today.

"Given the above would you want to do something like mirror a bunch of popular ML packages and regularly scan those?"
Thing is, we don't have this security right now - if a person wants to run the notebook as root, they can specify their own image that'll run as root. In fact, we explicitly run the pod as root now and su to the notebook user as part of start.sh. I'd go for running it all as root, tbh.
Ok cool, so far it sounds like providing a rich base container without sudo is the way forward, but I'm interested to hear from the others.

Also @jlewi, perhaps it's not necessary to even build training containers - perhaps we could just share model code with training containers via NFS, and either also share dependency code that way or have training containers pip install a requirements.txt file as their first step. This would extend whatever is available in the workspace environment with the user's collection of custom packages. The strategy for keeping this from slowing down the initiation of training jobs could be to have the workspace container already include packages like pybullet and tensorflow that take the longest to build, with most other python packages being quick to install.

This relates to whether users are doing more than installing pip packages in their workspace container, because if they're not, we don't really need to mess with all this container-building infrastructure.
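The "pip install a requirements.txt as the first step" idea above could look something like this. A minimal sketch, assuming a shared volume mounted at a hypothetical path; the function name and layout are illustrations, not anything in the kubeflow codebase.

```python
# Hypothetical first step of a training container: install the user's
# extra dependencies from a requirements.txt shared with the workspace
# (the /mnt/workspace mount path is an assumption for illustration).
import subprocess
import sys


def install_user_requirements(req_path="/mnt/workspace/requirements.txt"):
    """Pip-install the shared requirements file before importing model code.

    Raises CalledProcessError if pip fails, so the training job fails fast
    rather than running with missing dependencies.
    """
    subprocess.check_call(
        [sys.executable, "-m", "pip", "install", "-r", req_path]
    )
```

Heavy packages (tensorflow, pybullet) would already be baked into the base image, so this step only covers the user's lightweight additions.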
@inc0 Posted my last before seeing yours. Right, users aren't currently prevented from running root containers. Are you advocating root to avoid restricting your/the development experience? That seems like an important thing to consider.
@cwbeitel as I said on slack, we can do 2 things:
I'd go with root, at least for now; it will make for a quick simplification and I can't think of any real security issue (if someone jailbreaks out of a container, that's bad tho).

Re: the NFS share - yeah, I was kinda wondering why we want to build container images in containers :) not a typical usage. I'd rather put the model on a PV than an NFS share. More k8s'y, and it would allow us to use proper storage infra. NFS can be an option that we might not need to worry about: if someone creates a StorageClass with NFS, they're good.
So I'd add cleanups to this issue as well - we have a lot of messing with perms and users in our start.sh script and others. If we run as root, both the jupyter Dockerfile and this script could lose a lot of code.
@mayakacz convinced me that I was being too permissive. I think we should avoid granting sudo privileges so that we don't create more work in terms of locking down the container in the future.
Closes StatCan#31. This --allow-root option seems unnecessary, as we don't run as root, and the option only seems to be suggested for that case. See kubeflow/kubeflow#300 too.
Hi, sorry for this comment, the issue is already closed, but I really need your help on my issue: jupyterhub/the-littlest-jupyterhub#805
Regarding notebook server spawner options (https://github.com/kubeflow/kubeflow/blob/master/kubeflow/core/jupyterhub_spawner.py#L79): currently c.KubeSpawner.args is being passed the arg "--allow-root", presumably because the original implementer meant for users to have root access in containers. But currently this is not available.

By adding "GRANT_SUDO=yes" to the environment, users do get sudo privileges in spawned notebooks, e.g. they can successfully run

```shell
sudo apt-get update
```

If making this addition, we should determine whether there is anything already in c.KubeSpawner.environment to overwrite, and if so append the variable instead of setting the whole dict.

Related to #289
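The "append instead of setting all" suggestion could be sketched as below. A hypothetical jupyterhub_config.py fragment, not the actual kubeflow config; note that GRANT_SUDO is only honored by the docker-stacks start.sh when the container itself starts as root.

```python
# Hypothetical jupyterhub_config.py fragment: grant sudo in spawned
# notebooks by merging into any environment variables already configured,
# rather than replacing the whole dict.
existing = dict(c.KubeSpawner.environment or {})
existing["GRANT_SUDO"] = "yes"
c.KubeSpawner.environment = existing
```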