Migrate to cgroupsv2 #73800

Closed
deliciouslytyped opened this issue Nov 20, 2019 · 10 comments

@deliciouslytyped
Contributor

deliciouslytyped commented Nov 20, 2019

#68096 (comment)

For the uninitiated: https://www.kernel.org/doc/Documentation/cgroup-v2.txt (see also section R, which states the rationale for v2).

This is not an immediately actionable issue; the first major blocker seems to be waiting for Docker to migrate.

Some more keywords for searchability (maybe): cgroups cgroups-v2

Currently, AFAICT, all cgroup controllers are claimed by the v1 system, because a controller can only be attached to (mounted on) one cgroup hierarchy at a time.
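A rough way to see which controllers v1 is holding on to is to read `/proc/cgroups`, where a hierarchy ID of 0 means the controller is not mounted on any v1 hierarchy. The sketch below is only an illustration; the function name `v1_controllers` is made up here and is not part of any tool.

```bash
# Print the controllers that are currently bound to a cgroup v1 hierarchy.
# Reads /proc/cgroups-formatted text on stdin:
#   #subsys_name  hierarchy  num_cgroups  enabled
# A hierarchy ID of 0 means the controller is not mounted on a v1 hierarchy.
v1_controllers() {
    awk '!/^#/ && NF >= 2 && $2 != 0 { print $1 }'
}

# Run against the live system when the file is readable.
if [ -r /proc/cgroups ]; then
    v1_controllers < /proc/cgroups
fi
```

An empty result suggests no controller is tied to v1, so all of them should be free for the unified hierarchy.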

https://web.archive.org/web/20191120061819/https://medium.com/nttlabs/cgroup-v2-596d035be4d7 suggests:

> Docker / Moby will gain the support for cgroup v2, as soon as runc
> and containerd gains the support.
> Docker/Moby+containerd+runc will follow soon. If everything goes well,
> we might be able to get nightly binaries for cgroup v2 by the end of 2019.

cc @arianvp

@deliciouslytyped
Contributor Author

deliciouslytyped commented Nov 20, 2019

Presumably, omitting this setting should be sufficient to use v2: https://github.com/andir/nixpkgs/blob/9c06aae94ad42aba50c7ff3c503ddcb362f4a80e/pkgs/os-specific/linux/systemd/default.nix#L109. It was added in #68096 precisely so that we would not default to v2.

Though there's also something about setting the systemd.unified_cgroup_hierarchy=1 kernel parameter?
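To check whether that parameter was actually passed at boot, one can look for it in `/proc/cmdline`. This is a minimal sketch: the helper name `unified_hierarchy_param` is hypothetical, and it only reports what is on the command line, not what systemd ended up doing with it.

```bash
# Print the value of systemd.unified_cgroup_hierarchy found in a kernel
# command line string ($1), or "unset" when the parameter is absent.
# If the parameter appears more than once, the last occurrence is printed.
unified_hierarchy_param() {
    value=$(printf '%s\n' "$1" | tr ' ' '\n' \
        | sed -n 's/^systemd\.unified_cgroup_hierarchy=//p' | tail -n 1)
    echo "${value:-unset}"
}

# Inspect the command line of the running kernel, if available.
if [ -r /proc/cmdline ]; then
    unified_hierarchy_param "$(cat /proc/cmdline)"
fi
```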

@deliciouslytyped
Contributor Author

I had some issues with systemd-run in hybrid mode: https://discourse.nixos.org/t/ram-limiting-firefox-for-pathological-tabbers/5117/

Using cgroupsv2 proper appears to fix it.
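For anyone unsure which of the modes a machine is in, the filesystem type at `/sys/fs/cgroup` is a reasonable hint. The sketch below assumes the layout systemd uses: `cgroup2fs` at `/sys/fs/cgroup` for unified (v2 proper), and `tmpfs` there for v1, with an additional `cgroup2fs` mount at `/sys/fs/cgroup/unified` in hybrid mode. The names `cgroup_mode` and `fs_type` are made up for illustration.

```bash
# Classify the cgroup setup from two filesystem-type strings.
# $1: type of /sys/fs/cgroup, $2: type of /sys/fs/cgroup/unified (may be empty).
cgroup_mode() {
    case "$1" in
        cgroup2fs) echo unified ;;
        tmpfs)
            if [ "$2" = cgroup2fs ]; then echo hybrid; else echo legacy; fi ;;
        *) echo unknown ;;
    esac
}

# Filesystem type of a path; prints nothing if the path cannot be queried.
fs_type() { stat -fc %T "$1" 2>/dev/null || true; }

cgroup_mode "$(fs_type /sys/fs/cgroup)" "$(fs_type /sys/fs/cgroup/unified)"
```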

@mikroskeem
Member

mikroskeem commented Oct 23, 2020

One of the major blockers is indeed Docker: the Docker derivation is using an old containerd. containerd gained CGroups v2 support in release 1.4.0, and runc, I think, in v1.0-rc91.

However, Docker's current release (19.03.13 at the time of writing) does not seem to contain the changes required to support CGroups v2. Building Docker from commit 3b9fb515ce3a39e2d9a1dcd7f094eb3ed511581d gets it working (tested with systemd.unified_cgroup_hierarchy=1 and cgroup_no_v1=all set, to ensure the old cgroups are not present).

image

Image shows crun (personal preference), but I tested with latest runc (1.0-rc92) & it worked as expected.
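To double-check that no v1 hierarchies are left after booting with those parameters, counting the mounts by filesystem type in `/proc/mounts` is one option (`cgroup` is v1, `cgroup2` is v2). This is only a sketch, and `count_cgroup_mounts` is a hypothetical helper name.

```bash
# Count cgroup v1 and v2 mounts in /proc/mounts-formatted text on stdin.
# The third field is the filesystem type: "cgroup" (v1) or "cgroup2" (v2).
count_cgroup_mounts() {
    awk '$3 == "cgroup" { v1++ } $3 == "cgroup2" { v2++ }
         END { printf "v1=%d v2=%d\n", v1, v2 }'
}

# Report on the running system, if the mount table is readable.
if [ -r /proc/mounts ]; then
    count_cgroup_mounts < /proc/mounts
fi
```

On a system where the old cgroups are really gone, the `v1` count should be 0.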

flokli added this to To Do in systemd via automation on Nov 17, 2020
andir added this to the 21.03 milestone on Nov 17, 2020
@flokli
Contributor

flokli commented Nov 17, 2020

I think this is mature enough for now. I'll draft a PR :-)

Other distros have already switched to the unified cgroup hierarchy, and people who want to keep using docker can add systemd.unified_cgroup_hierarchy=0 to their cmdline, as documented for Fedora etc.

We should switch to this in unstable soon-ish, so potential issues can be sorted out in unstable, and it gets less stressful for the 21.03 release.

It'd be nice if we could also sort out #77925, so using other container runtimes gets simpler.

@archseer
Member

archseer commented Dec 4, 2020

Note: security.hideProcessInformation doesn't work with systemd and is now completely broken with cgroupsv2:

systemd/systemd#12955

We might want to use an assertion for it in the security module because I just learned it the hard way after upgrading.

@arianvp
Member

arianvp commented Dec 4, 2020

This was brought up in the original PR. We poked the maintainer of the module for feedback and advice, but they removed themselves from review. I interpreted that as "This is not a blocker".

I don't think we can realistically keep supporting hideProcessInformation. But I'm fine with merging a PR that falls back to cgroupsv1 when it's enabled (just like we do with the docker module). Would you mind making one?

@archseer
Member

archseer commented Dec 4, 2020

I think we should probably just remove the option completely. It's apparently subtly broken on v1 as well, it just became more obvious with the migration to v2. I'll make a PR removing it?

@flokli
Contributor

flokli commented Dec 4, 2020

SGTM!

@archseer
Member

archseer commented Dec 6, 2020

Hmm so apparently this is solvable: https://wiki.archlinux.org/index.php/security#hidepid

> For user sessions to work correctly, an exception needs to be added for systemd-logind:
>
> /etc/systemd/system/systemd-logind.service.d/hidepid.conf
>
> [Service]
> SupplementaryGroups=proc

The patches mentioned in the systemd issue also landed in 5.8 so it might get resolved in upstream systemd

Edit: The supplementary group is apparently already applied under nixos/modules/security/hidepid.nix, hmm. I guess it doesn't fix the issue
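When debugging this, it can help to confirm how `/proc` is actually mounted. The sketch below pulls the `hidepid` option out of `/proc/mounts`-style text; the function name `proc_hidepid` is made up for illustration, and it says nothing about whether the logind exception is in effect.

```bash
# Print the mount point and hidepid value of each proc mount found in
# /proc/mounts-formatted text on stdin ("none" if the option is not set).
proc_hidepid() {
    awk '$3 == "proc" {
        value = "none"
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] ~ /^hidepid=/) value = substr(opts[i], 9)
        print $2, value
    }'
}

# Report on the running system, if the mount table is readable.
if [ -r /proc/mounts ]; then
    proc_hidepid < /proc/mounts
fi
```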

@xaverdh
Contributor

xaverdh commented Feb 2, 2021

> I think we should probably just remove the option completely. It's apparently subtly broken on v1 as well, it just became more obvious with the migration to v2. I'll make a PR removing it?

Since this is currently broken and people are now stumbling over it (cf. #111629), should we disable the option for now, or at least mark it as broken somehow?

xaverdh added a commit to xaverdh/nixpkgs that referenced this issue Feb 21, 2021
This has been in an unusable state since the switch to cgroups-v2.
See NixOS#73800 for details.
YodaEmbedding added a commit to YodaEmbedding/nixos that referenced this issue Mar 24, 2022
Took an entire day to find the linked comment [1] by @biggs, which says:

> Fix on NixOS (where cgroup v2 is also now default): add
> `systemd.enableUnifiedCgroupHierarchy = false;`
> and restart.

Indeed, after applying this commit and then running
`sudo systemctl restart docker`, any of the following commands works:

```bash
sudo docker run --gpus=all nvidia/cuda:10.0-runtime nvidia-smi
sudo docker run --runtime=nvidia nvidia/cuda:10.0-runtime nvidia-smi
sudo nvidia-docker run nvidia/cuda:10.0-runtime nvidia-smi
```

ARGH!!!1

Links:
[1] NVIDIA/nvidia-docker#1447 (comment)
[2] NixOS/nixpkgs#127146
[3] NixOS/nixpkgs#73800
[4] https://blog.zentria.company/posts/nixos-cgroupsv2/

P.S.
I use Colemak, but typing arstarstarst doesn't have the same ring to it.