Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

namespaces: by default create cgroupns on cgroups v2 #4374

Conversation

giuseppe
Copy link
Member

change the default on cgroups v2 and create a new cgroup namespace.

When a cgroup namespace is used, processes inside the namespace are
only able to see cgroup paths relative to the cgroup namespace root
and not have full visibility on all the cgroups present on the
system.

The previous behaviour is maintained on a cgroups v1 host, where a
cgroup namespace is not created by default.

Closes: #4363

Signed-off-by: Giuseppe Scrivano gscrivan@redhat.com

@openshift-ci-robot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: giuseppe

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/S labels Oct 30, 2019
@giuseppe giuseppe force-pushed the create-cgroupns-by-default-on-cgroupsv2 branch from 439b243 to 245dc68 Compare October 30, 2019 08:25
@giuseppe giuseppe force-pushed the create-cgroupns-by-default-on-cgroupsv2 branch from 245dc68 to 9eeb4e6 Compare October 30, 2019 09:08
@giuseppe
Copy link
Member Author

tests is failing with: [+0318s] Error: could not get runtime: please update to v2.0.1 or later: outdated conmon version

@vrothberg
Copy link
Member

That's very curious. We merged the PR yesterday and all checks were green -> https://github.com/containers/libpod/pull/3792/checks

@vrothberg
Copy link
Member

@cevich @haircommander PTAL

@haircommander
Copy link
Collaborator

So I had to manually push a version of the in_podman image that updated conmon. The in_podman image is updated as a post merge task on every PR. It seems any PR that isn't rebased to the new Dockerfile (with the new conmon) updates this image, and breaks subsequent PRs. I am not sure if there is a better work around other than rebasing ever live PR, and pushing the image manually until it's fixed... wdyt @cevich

@rhatdan
Copy link
Member

rhatdan commented Oct 30, 2019

LGTM

@cevich
Copy link
Member

cevich commented Oct 30, 2019

and pushing the image manually until it's fixed... wdyt @cevich

That was my bad, totally forgot that quay.io was set to re-build that image from Dockerfile.fedora instead of Dockerfile. That's fixed now and in_podman tests are passing with it.

@rhatdan
Copy link
Member

rhatdan commented Oct 31, 2019

@vrothberg @mheon @baude @TomSweeneyRedHat @QiWang19 @jwhonce Tests are passing, can we get this merged.

@@ -132,7 +132,7 @@ func getCreateFlags(c *cliconfig.PodmanCommand) {
"Drop capabilities from the container",
)
createFlags.String(
"cgroupns", "host",
"cgroupns", "",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we dynamically set the default like we do for a few other things? A bit less ugly than special-casing the empty string

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about that, but one potential issue would be, if we default the CLI based on when the container was created, and the user switched cgroups, it could cause issues with existing containers.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Existing containers are an issue either way - this is all done at spec generation time, it'll be in the final, immutable spec that we put in the DB

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer we don't add this logic to the cmd/ package if possible.

Also, we need to be able to catch errors in the cgroups v2 detection. If we move it here, we would need to either ignore the error or panic.

@rh-atomic-bot
Copy link
Collaborator

☔ The latest upstream changes (presumably #4354) made this pull request unmergeable. Please resolve the merge conflicts.

@giuseppe giuseppe force-pushed the create-cgroupns-by-default-on-cgroupsv2 branch from 9eeb4e6 to 1ba9902 Compare November 5, 2019 07:42
@giuseppe
Copy link
Member Author

giuseppe commented Nov 5, 2019

rebased

@AkihiroSuda
Copy link
Collaborator

What should be the default value for --privileged mode?
I think private is ok, but some people may expect host for privileged containers? (Is there any such use-case?)

@giuseppe giuseppe force-pushed the create-cgroupns-by-default-on-cgroupsv2 branch from 1ba9902 to 8a4ee14 Compare November 5, 2019 10:53
@giuseppe
Copy link
Member Author

giuseppe commented Nov 5, 2019

I think private is ok, but some people may expect host for privileged containers? (Is there any such use-case?)

Good point! I think we should keep --cgroupns=host when --privileged is used. Privileged containers are expected to have full control on the host. I've changed the patch to address that.

@AkihiroSuda
Copy link
Collaborator

Privileged containers are expected to have full control on the host.

True, but privileged containers still unshare other namespaces except User Namespace?
So I think it makes more sense to use private ns for privileged containers, unless there is specific usecase to require host ns

@rhatdan
Copy link
Member

rhatdan commented Nov 5, 2019

I agree with @AkihiroSuda. --privileged should not modify the default behaviour of namespaces.
If a user wants to work on the host cgroupns, then they have to set the option.

change the default on cgroups v2 and create a new cgroup namespace.

When a cgroup namespace is used, processes inside the namespace are
only able to see cgroup paths relative to the cgroup namespace root
and not have full visibility on all the cgroups present on the
system.

The previous behaviour is maintained on a cgroups v1 host, where a
cgroup namespace is not created by default.

Closes: containers#4363

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
@giuseppe giuseppe force-pushed the create-cgroupns-by-default-on-cgroupsv2 branch from 8a4ee14 to b8514ca Compare November 5, 2019 16:29
@giuseppe
Copy link
Member Author

giuseppe commented Nov 5, 2019

I agree with @AkihiroSuda. --privileged should not modify the default behaviour of namespaces.
If a user wants to work on the host cgroupns, then they have to set the option.

ok I've reverted the change

@rhatdan
Copy link
Member

rhatdan commented Nov 5, 2019

LGTM

@rhatdan
Copy link
Member

rhatdan commented Nov 5, 2019

@mheon
Copy link
Member

mheon commented Nov 5, 2019

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Nov 5, 2019
@openshift-merge-robot openshift-merge-robot merged commit 7eda1b0 into containers:master Nov 5, 2019
AkihiroSuda added a commit to AkihiroSuda/docker that referenced this pull request Dec 31, 2019
For cgroup v1, we were unable to change the default because of
compatibility issue.

For cgroup v2, we should change the default right now because switching
to cgroup v2 is already breaking change.

See also containers/podman#4363 containers/podman#4374

Privileged containers also use cgroupns=private by default.
containers/podman#4374 (comment)

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
AkihiroSuda added a commit to AkihiroSuda/cri-containerd that referenced this pull request Jan 9, 2020
In cgroup v1 container implementations, cgroupns is not used by default because
it was not available in the kernel until kernel 4.6 (May 2016), and the default
behavior will not change on cgroup v1 environments, because changing the
default will break compatibility and surprise users.

For cgroup v2, implementations are going to unshare cgroupns by default
so as to hide /sys/fs/cgroup from containers.

* Discussion: containers/podman#4363
* Podman PR (merged): containers/podman#4374
* Moby PR: moby/moby#40174

This PR enables cgroupns for containers, but pod sandboxes are untouched
because probably there is no need to do.

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
docker-jenkins pushed a commit to docker/docker-ce that referenced this pull request Jan 9, 2020
For cgroup v1, we were unable to change the default because of
compatibility issue.

For cgroup v2, we should change the default right now because switching
to cgroup v2 is already breaking change.

See also containers/podman#4363 containers/podman#4374

Privileged containers also use cgroupns=private by default.
containers/podman#4374 (comment)

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Upstream-commit: 19baeaca267d5710907ac1b3c3972d44725fe8ad
Component: engine
@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 26, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 26, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

cgroupns=private should be enabled by default for unified mode?
10 participants