Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for userns in stateless and stateful pods with idmap mounts (KEP-127, k8s >= 1.27) #8287

Merged
merged 11 commits into from Sep 14, 2023

Conversation

rata
Copy link
Contributor

@rata rata commented Mar 17, 2023

This adds support in containerd for k8s stateless and stateful pods with user namespaces as implemented in k8s >= 1.27. Kubernetes 1.28 added stateful pod support, but no other changes are needed in containerd, we just use idmap mounts for all volumes (stateless, like configmaps, or stateful, like hostPath volumes).

We have some requirements:

  • The filesystems should support idmap mounts. The most late adition was tmpfs, that we merged support in Linux 6.3 for idmap mounts, so in practice you will need Linux 6.3 for most stateless pods to work with userns.
  • The OCI runtime needs to support idmap mounts too.

This requires runc with this PR applied to support idmap mounts: opencontainers/runc#3717. This is expected to be part of runc 1.2 (yet to be released). I'm changing the runc version to that commit, we can update it again when this is included in a release.

I also check if idmap mounts are used, then if the runc version supports that, to avoid silently ignoring the feature and creating issues to users (see the commit msg for more details). If it is not supported, we throw a clear error to the user.

Note I didn't update the runc go module, as that broke the unit tests. It probably needs to be done with more care. Furthermore, we don't need that for this, the runtime-spec module is already up to date with what we need, IIUC.

I've added this in both cri/server and cri/sbserver.

@mikebrow as you requested, here is the output if we try to use it with ctr and an old runc version:

$ sudo bin/ctr --address /run/containerd-rata/containerd.sock container create --config tmp/config-userns-idmap.json rata-test
$ sudo bin/ctr --address /run/containerd-rata/containerd.sock t start rata-test
ctr: failed to create shim task: failed to detect OCI runtime features: OCI runtime doesn't support idmap mounts: missing `mountExtensions.idmap` entry in `features` command: unknown

And with a new runc (that supports idmap mounts):

$ sudo bin/ctr --address /run/containerd-rata/containerd.sock container create  --config tmp/config-userns-idmap.json rata-test 
$ sudo bin/ctr --address /run/containerd-rata/containerd.sock t start rata-test
root@runc:/# exit

TODO:

  • Update script/setup/runc-version to point to the runc version with the PR merged. EDIT: updated crun-version, that has a realease with the features. We keep the same runc version and will open a new PR when runc makes a new minor release.
  • Add unit test
  • Add integration tests.

Fixes: #8286

@k8s-ci-robot
Copy link

Hi @rata. Thanks for your PR.

I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@rata rata force-pushed the rata/userns-stateless-idmap branch from 0e5d719 to fd201ac Compare March 17, 2023 17:26
@fuweid fuweid self-requested a review March 18, 2023 02:21
@rata rata force-pushed the rata/userns-stateless-idmap branch from fd201ac to ac56187 Compare March 21, 2023 12:01
@rata
Copy link
Contributor Author

rata commented Mar 21, 2023

Rebased, the test fail seems like a flake (is on windows, that is untouched)

@dims
Copy link
Member

dims commented Mar 29, 2023

/ok-to-test

@k8s-ci-robot
Copy link

@dims: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/ok-to-test

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@rata rata force-pushed the rata/userns-stateless-idmap branch from ac56187 to ba89783 Compare April 13, 2023 09:12
@rata rata changed the title Add support for userns in stateless pods with idmap mounts (KEP-127, k8s >= 1.27) Add support for userns in stateless and stateful pods with idmap mounts (KEP-127, k8s >= 1.27) Jul 11, 2023
@rata rata force-pushed the rata/userns-stateless-idmap branch from ba89783 to ef903cd Compare July 11, 2023 13:39
@rata
Copy link
Contributor Author

rata commented Jul 11, 2023

Updated this to include support for sbserver too :)

@rata rata force-pushed the rata/userns-stateless-idmap branch from ef903cd to fee0126 Compare July 12, 2023 11:33
@rata
Copy link
Contributor Author

rata commented Jul 17, 2023

The runc PR was just merged. I'll see how to update the runc dependency in containerd and add unit and e2e tests here :)

@rata rata force-pushed the rata/userns-stateless-idmap branch 2 times, most recently from d6154d6 to f4fdd49 Compare July 19, 2023 14:12
@rata rata marked this pull request as ready for review July 19, 2023 14:12
@rata rata force-pushed the rata/userns-stateless-idmap branch from f4fdd49 to 072c2dd Compare July 19, 2023 14:20
@rata
Copy link
Contributor Author

rata commented Jul 19, 2023

@fuweid I see you self assigned for this PR. Now it is ready for review! Take a look at the PR description, I've updated it now :)

@rata
Copy link
Contributor Author

rata commented Jul 19, 2023

It seems I need to skip the tests on old kernels and that is causing the CI failures.

fuweid
fuweid previously approved these changes Jul 20, 2023
Copy link
Member

@fuweid fuweid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good to me.

@fuweid fuweid dismissed their stale review July 20, 2023 05:27

need to solve the CI issue

@rata rata force-pushed the rata/userns-stateless-idmap branch 4 times, most recently from f51f6c4 to 71da691 Compare July 20, 2023 13:52
This reverts commit 7e6ab84.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
Future patches will use that field.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
runc, as mandated by the runtime-spec, ignores unknown fields in the
config.json. This is unfortunate for cases where we _must_ enable that
feature or fail.

For example, if we want to start a container with user namespaces and
volumes, using the uidMappings/gidMappings field is needed so the
UID/GIDs in the volume don't end up with garbage. However, if we don't
fail when runc will ignore these fields (because they are unknown to
runc), we will just start a container without using the mappings and the
UID/GIDs the container will persist to volumes the hostUID/GID, that can
change if the container is re-scheduled by Kubernetes.

This will end up in volumes having "garbage" and unmapped UIDs that the
container can no longer change. So, let's avoid this entirely by just
checking that runc supports idmap mounts if the container we are about
to create needs them.

Please note that the "runc features" subcommand is only run when we are
using idmap mounts. If idmap mounts are not used, the subcommand is not
run and therefore this should not affect containers that don't use idmap
mounts in any way.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
@rata rata force-pushed the rata/userns-stateless-idmap branch from 1825db1 to a678ab7 Compare September 13, 2023 14:45
@rata
Copy link
Contributor Author

rata commented Sep 13, 2023

@AkihiroSuda pushed fixing your comments (all documentation changes), so all should be green again

@estesp
Copy link
Member

estesp commented Sep 13, 2023

One small nit in the docs with a link rendering issue due to typo; once we fix that, LGTM and hopefully we can go ahead and merge this in

@rata rata force-pushed the rata/userns-stateless-idmap branch 3 times, most recently from 95f37be to 0e4645b Compare September 13, 2023 21:29
@rata
Copy link
Contributor Author

rata commented Sep 13, 2023

@estesp thanks, fixed! PTAL :)

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
@rata rata force-pushed the rata/userns-stateless-idmap branch from 0e4645b to cbad93c Compare September 13, 2023 21:37
@rata rata requested a review from estesp September 13, 2023 21:39
@rata rata force-pushed the rata/userns-stateless-idmap branch 2 times, most recently from 79cbecd to 52e5a37 Compare September 14, 2023 07:47
crun 1.9 was just released with fixes and exposes idmap mounts support
via the "features" sub-command.

We use that feature to throw a clear error to users (if they request
idmap mounts and the OCI runtime doesn't support it), but also to skip
tests on CI when the OCI runtime doesn't support it.

Let's bump it so the CI runs the tests with crun.

Signed-off-by: Rodrigo Campos <rodrigoca@microsoft.com>
@rata rata force-pushed the rata/userns-stateless-idmap branch from 52e5a37 to 83240a4 Compare September 14, 2023 09:07
Copy link
Member

@fuweid fuweid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@fuweid fuweid merged commit fe17f65 into containerd:main Sep 14, 2023
45 checks passed
@rata rata deleted the rata/userns-stateless-idmap branch September 14, 2023 10:25
@samuelkarp samuelkarp added impact/changelog area/cri Container Runtime Interface (CRI) labels Nov 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/cri Container Runtime Interface (CRI) impact/changelog ok-to-test
Projects
Development

Successfully merging this pull request may close these issues.

Support for userns in k8s >= 1.27
9 participants