New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cgroup2: use new fstype for unified hierarchy #2112

Closed
wants to merge 4 commits into
base: master
from

Conversation

5 participants
@alban
Member

alban commented Dec 7, 2015

Since Linux v4.4-rc1, __DEVEL__sane_behavior does not exist anymore and
is replaced by a new fstype "cgroup2".

torvalds/linux@67e9c74


Question: should systemd support unified hierarchy both before and after
Linux v4.4-rc1? If so, systemd needs to support both
__DEVEL__sane_behavior and cgroup2.

By the way, systemd-cgls is broken with the unified hierarchy.

cgroup2: use new fstype for unified hierarchy
Since Linux v4.4-rc1, __DEVEL__sane_behavior does not exist anymore and
is replaced by a new fstype "cgroup2".

torvalds/linux@67e9c74

alban added some commits Dec 7, 2015

nspawn: refuse to start a container out of it cgroup namespace root
When the systemd-nspawn is in a cgroup that is not reachable from the
current cgroup namespace, we cannot mount the cgroup for the container
in any meaningful way.

systemd or systemd-nspawn does not lead to this situation. However, the
user could use "unshare --cgroup" and then migrate the process outside
of its cgroup namespace's root, then run systemd-nspawn.
nspawn: implement cgroup namespaces
Disabled by default. Users can opt-in by setting the environment
variable CGROUP_NAMESPACE=1.
@alban

This comment has been minimized.

Member

alban commented Dec 8, 2015

More commits added in this branch implementing cgroup namespaces (CLONE_NEWCGROUP).

@poettering

This comment has been minimized.

Member

poettering commented Dec 9, 2015

I figure we should drop the old __DEVEL__sane_behaviour as soon as 4.4 is out, and probably merge this patch only then too.

@poettering

This comment has been minimized.

poettering commented on src/basic/cgroup-util.c in d8c6e5f Dec 9, 2015

I figure we should drop the check for the old magic entirely.

This comment has been minimized.

Member

alban replied Dec 28, 2015

That would be easier but...

What if the host is systemd < v229 with Linux < 4.4 booted with systemd.unified_cgroup_hierarchy=1, then starting a container with systemd-nspawn < v229 (preparing an unified /sys/fs/cgroup with __DEVEL__sane_behavior) and the container has a new systemd that does not check the old magic CGROUP_SUPER_MAGIC?

This comment has been minimized.

poettering replied Dec 29, 2015

well, the kernel unified hierarchy stuff wasn't stable so far, and hence nothing officially supported there. Thus, it shouldn't really be something officially supported in systemd either, and it's fine if we break compat hard on it. Thus, if people use the unified hierarchy they better update the kernel and systemd in sync, as well as all containers. As soon as the kernel declares the unified hierarchy stable this will change, but until then hard breaks are OK and what we should do as long as we mention them in NEWS.

This comment has been minimized.

Member

alban replied Dec 29, 2015

Ok, I'll do that.

@poettering

This comment has been minimized.

poettering commented on src/nspawn/nspawn-mount.c in 69395c2 Dec 9, 2015

We only log error codes that we receive, not those we generate. Hence, please do not pass EACCESS to log_error_errno() like this. Instead write this as:

{
        log_error("Own control group unreach...");
        return -EINVAL;
}

or so.

@poettering

This comment has been minimized.

poettering commented on a35b159 Dec 9, 2015

lgtm.

@poettering

This comment has been minimized.

poettering commented on 4a8d16c Dec 9, 2015

Hmm, has this stuff been merged upstream? I wasn't aware...

Is presume this is specific to the unified hierarchy? Or is it not?

Why make this an option? Sounds OK to just do this always in the unified hierarchy.

This comment has been minimized.

Member

alban replied Dec 9, 2015

Right, only the new fstype "cgroup2" has made it to Linus' tree, but not the cgroup namespace code. I should have made a separate PR, sorry.

I think cgroupns is not specific to the unified hierarchy, but I don't know if it is worth implementing cgroupns support in nspawn for non-unified hierarchy hosts. I have not yet tested cgroupns on non-unified hierarchy.

I made it an option because of an incorrect reasoning: I want nspawn to support starting old containers before cgroupns existed. But I now realize that it does not make sense because cgroupns is transparent to the container. It's not like the option for unified hierarchy, which is not transparent.

I can remove the option, and handle the error in unshare() more gracefully.

@poettering

This comment has been minimized.

Member

poettering commented Dec 9, 2015

I added the "postponed" label, to indicate that this should wait until 4.4 is released.

@poettering

This comment has been minimized.

Member

poettering commented Dec 10, 2015

Hmm I don't see how cgroupns could even work in the legacy hierarchy, as the top-level dir in cgroupfs was always special in the legacy hierarchy, and hat a couple of settings the others didn't have, such as notifier support.

Anyway, happy to see cgroups2 and cgroupsns supported, but if both of these only apply to the unified hiearchy anyway, then i'd prefer if we'd just make use of them unconditionally, after all the stuff is in flux anyway, and the fewer options the better.

@keszybz

This comment has been minimized.

keszybz commented on src/nspawn/nspawn-mount.c in 69395c2 Dec 25, 2015

What exactly is being matched here? Paths with a ".." component? If yes, then it should probably be "/../". Please add a comment that explains what is being matched and why.

This comment has been minimized.

Member

alban replied Dec 28, 2015

Paths starting with a ".." component. The second "/" in "/../" is not necessary because path_startswith does not match partial components in paths (e.g. path_startswith("/component1/path", "/comp") will not match).

I will add a comment with a reference on the cgroup doc explaining the "/../".

@zonque

This comment has been minimized.

Member

zonque commented Jan 15, 2016

Replaced by #2332

@zonque zonque closed this Jan 15, 2016

@poettering

This comment has been minimized.

Member

poettering commented Jan 15, 2016

@zonque I think you closed the wrong bug here...

@poettering poettering reopened this Jan 15, 2016

@zonque

This comment has been minimized.

Member

zonque commented Jan 15, 2016

Oh, sorry. Indeed.

@poettering

This comment has been minimized.

Member

poettering commented Jan 26, 2016

I guess #2271 replaces this one.

@poettering poettering closed this Jan 26, 2016

@alban

This comment has been minimized.

Member

alban commented Apr 6, 2016

For the record, the status of this branch:

commit 4a8d16c
nspawn: implement cgroup namespaces

Not merged. TODO.

commit a35b159
nspawn: userns and cgroup2: chown cgroup.events

Replaced by ab2c386

commit 69395c2
nspawn: refuse to start a container out of it cgroup namespace root

Not merged. TODO.

commit d8c6e5f
cgroup2: use new fstype for unified hierarchy

Merged as 0996199

@keszybz

This comment has been minimized.

Member

keszybz commented Apr 6, 2016

commit 4a8d16c
nspawn: implement cgroup namespaces

Can we make this the default when we switch to unified hierarchy?

@alban

This comment has been minimized.

Member

alban commented Apr 6, 2016

Can we make this the default when we switch to unified hierarchy?

Yes we could. We should however make sure the kernel supports cgroupns by checking /proc/self/ns/cgroup, and just not use cgroupns if not supported. The commit above does not do this.

@brauner brauner referenced this pull request Jun 23, 2016

Merged

Cgroup namespace #3589

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment