Support for userns in k8s >= 1.27 #8286

Closed
rata opened this issue Mar 17, 2023 · 0 comments · Fixed by #8287

rata commented Mar 17, 2023

What is the problem you're trying to solve

We have reworked the implementation of userns in Kubernetes and now we rely on idmap mounts even for stateless pods (stateful pods are not yet supported, but coming soon).

This means the kubelet will now send a mapping to use for the mounts and containerd needs to pass these mappings down to the OCI runtime.

Idmap mounts were added to the runtime-spec in opencontainers/runtime-spec#1143.
runc support is expected in v1.2: opencontainers/runc#3717
crun has supported this since v1.8.1.

Describe the solution you'd like

I propose the following solution, which seems straightforward:

  • Update the cri-api to vendor it from k8s 1.27, which has the new field for the mappings in the CRI API
  • Pass the mappings received in the CRI mounts down to the config.json that we pass to the OCI runtime
  • For that, we only need to modify these functions: WithMounts() from pkg/cri/opts/spec_linux_opts.go; volumeMounts() from pkg/cri/server/container_create.go, so the mappings also apply to volumes from the image (VOLUME keyword in the Dockerfile); containerMounts() from pkg/cri/server/container_create_linux.go
  • Update script/setup/runc-version to use runc 1.2 once it is released.
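
To make the second step concrete, here is a minimal sketch of the translation, not the actual containerd change. The types `criMount`, `criIDMapping`, `ociMount` and `ociIDMapping` and the helpers `toOCIMappings` and `toOCIMount` are hypothetical local stand-ins, so the example compiles on its own. The real code would use the vendored cri-api and runtime-spec Go types instead. The JSON field names (`uidMappings`, `gidMappings`, `containerID`, `hostID`, `size`) are the ones I understand the runtime-spec to use for idmap mounts.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// criIDMapping is a hypothetical stand-in for the CRI ID mapping message.
type criIDMapping struct {
	HostID      uint32
	ContainerID uint32
	Length      uint32
}

// criMount is a hypothetical stand-in for a CRI mount carrying ID mappings.
type criMount struct {
	ContainerPath string
	HostPath      string
	UIDMappings   []*criIDMapping
	GIDMappings   []*criIDMapping
}

// ociIDMapping mirrors the JSON shape of an ID mapping in config.json.
type ociIDMapping struct {
	ContainerID uint32 `json:"containerID"`
	HostID      uint32 `json:"hostID"`
	Size        uint32 `json:"size"`
}

// ociMount mirrors the JSON shape of a mount entry in config.json.
type ociMount struct {
	Destination string         `json:"destination"`
	Type        string         `json:"type,omitempty"`
	Source      string         `json:"source,omitempty"`
	Options     []string       `json:"options,omitempty"`
	UIDMappings []ociIDMapping `json:"uidMappings,omitempty"`
	GIDMappings []ociIDMapping `json:"gidMappings,omitempty"`
}

// toOCIMappings copies CRI mappings into the OCI shape, skipping nil entries.
// It returns nil when there is nothing to copy, so the field is omitted.
func toOCIMappings(in []*criIDMapping) []ociIDMapping {
	var out []ociIDMapping
	for _, m := range in {
		if m == nil {
			continue
		}
		out = append(out, ociIDMapping{
			ContainerID: m.ContainerID,
			HostID:      m.HostID,
			Size:        m.Length,
		})
	}
	return out
}

// toOCIMount builds a bind mount entry and passes the mappings through.
// The mount type and options are illustrative choices for this sketch.
func toOCIMount(m criMount) ociMount {
	return ociMount{
		Destination: m.ContainerPath,
		Type:        "bind",
		Source:      m.HostPath,
		Options:     []string{"rbind"},
		UIDMappings: toOCIMappings(m.UIDMappings),
		GIDMappings: toOCIMappings(m.GIDMappings),
	}
}

func main() {
	mapping := &criIDMapping{HostID: 65536, ContainerID: 0, Length: 65536}
	mount := toOCIMount(criMount{
		ContainerPath: "/data",
		HostPath:      "/var/lib/example/volume",
		UIDMappings:   []*criIDMapping{mapping},
		GIDMappings:   []*criIDMapping{mapping},
	})
	b, err := json.MarshalIndent(mount, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```

Running it prints the mount entry as it would appear in config.json. Mounts without mappings serialize without the two mapping fields, so pods that do not use userns would be unaffected.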

There is one tricky aspect: the runtime-spec mandates that unrecognized fields are ignored, but if we start a container with userns and the idmap mounts are ignored (this can happen with runc 1.1, for example), then files created in the volumes will have garbage UIDs/GIDs, because they will be owned by the host UID/GID that the userns maps to.
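
To illustrate where the garbage ownership comes from, here is a small sketch of the ID translation a user namespace performs. The `idMapping` type, the `toHostID` helper and the 65536 range are made up for the example. Without an idmap mount on the volume, a file created by a container ID is stored on disk under the translated host ID.

```go
package main

import "fmt"

// idMapping is a hypothetical stand-in for one userns mapping range.
type idMapping struct {
	ContainerID uint32
	HostID      uint32
	Size        uint32
}

// toHostID translates an ID inside the user namespace to the host ID
// it maps to. It returns false when no range covers the ID.
func toHostID(id uint32, maps []idMapping) (uint32, bool) {
	for _, m := range maps {
		if id >= m.ContainerID && id-m.ContainerID < m.Size {
			return m.HostID + (id - m.ContainerID), true
		}
	}
	return 0, false
}

func main() {
	// Example range only: container IDs 0..65535 map to host IDs 65536..131071.
	userns := []idMapping{{ContainerID: 0, HostID: 65536, Size: 65536}}

	for _, uid := range []uint32{0, 1000} {
		host, ok := toHostID(uid, userns)
		if !ok {
			fmt.Printf("container UID %d is unmapped\n", uid)
			continue
		}
		// If the runtime ignores the idmap mount, this host ID is the
		// owner recorded on the volume's filesystem.
		fmt.Printf("container UID %d -> on-disk UID %d\n", uid, host)
	}
}
```

With the mappings honored by the runtime, the mount translates the ownership back, which is why silently ignoring them matters.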

Can we really trust that distros will not update to containerd 1.8 without also updating runc to 1.2? Or should we worry about that case and try to detect it in some way?

Additional context

No response
