Memory stats and freezer management with cgroupv2 #10251
Comments
Just upgraded to Debian “bullseye” and they are using |
Also Ubuntu will change to |
As a workaround, you can switch the kernel to "hybrid" cgroup hierarchy. It fixes the issue. For debian-based distros, use the following script:
Details: |
The workaround works for this, but I noticed that the memory usage reporting in the Nomad dashboard only represents RSS memory, whereas Docker considers both RSS and cache. As a result, tasks that look fine memory-wise in Nomad can be OOM-killed by the kernel, which is not obvious without looking at dmesg. Running |
Any update on this issue? There's a PR open for this (timdaman/check_docker#82). |
is there a workaround to disable cgroups for everything except the pid list? the raw executor is currently nonfunctional due to #10551 unless you disable cgroups entirely. unfortunately it doesn't use sessions so you leak processes without cgroups |
Any updates on this one? |
I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues. |
Nomad's cgroup-v2 integration is incomplete, as it still has some cgroupv1-isms. Cgroups v2 changed the filesystem representation and changed the memory metrics that Nomad has relied on, so Nomad reports a memory summary metric of 0 across almost all drivers.
First, Nomad memory reporting relies on cgroup-v1 metrics. Nomad defaults to using RSS as the top-line memory summary value, and reports Kernel Max Usage, Kernel Usage, Max Usage, and RSS, none of which are reported in cgroup v2. You can view the libcontainer reporting difference by comparing cgroup v1 memory stats with cgroup v2. This is pretty confusing.
Also, the executor DestroyCgroup method uses libcontainer cgroup v1. This needs to be updated to account for v2 and ideally select the relevant cgroup backend.
It's not clear what the state of cgroup-v2 adoption is. Fedora and Arch Linux seem to use it by default. Other distros, like RHEL and Ubuntu, provide it as an option but not as the default.
Sample metrics of cgroup v2
Running on Fedora 33, I see the following stats info:
Also, here is docker memory stats for cgroup v1 and v2
Cgroup v2
Cgroup v1
Links