bug: sysbox-runc does not copy cpu.cfs_{quota,period}_us to syscont-cgroup-root #582
Comments
Hi @johnstcn, thanks for filing the issue and for the excellent description. The issue has a fairly simple fix, but I am not sure that making the cpu.cfs_[quota|period]_us values visible inside the container is always the "right thing" to do. Without cgroup delegation (i.e., without a cgroup manager inside the container), it makes sense to expose them inside the container, since there is a single cgroup manager at host level. But with cgroup delegation (e.g., with a cgroup manager such as systemd inside the Sysbox container), it's less clear: there is a valid argument for keeping them opaque inside the container, so that the inner cgroup manager believes it has full control of the CPU bandwidth (even though it's constrained by the parent cgroup). Since Sysbox containers are most often used as "system containers", we decided to go with the latter approach. But maybe we need a knob to control the behavior. Curious about your thoughts on this.
Out of curiosity, are you aware of any programs that read the cpu.cfs_[quota|period]_us values?
In a container environment (e.g. Kubernetes), not respecting CPU limits can cause an application to not respond to liveness probes and be killed just as if it had consumed too much memory (with a different smoking gun in each case, of course). This is especially true on more powerful systems with e.g. 64 or more logical cores -- an application may decide to spin up that many separate threads to do its work without knowing that it's "effectively" constrained to far fewer.
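As a rough illustration of the point above, here is a minimal Go sketch of how an application might size its worker pool from the CFS quota and period rather than from the raw core count. The helper names (`effectiveCPUs`, `readInt64`) and the cgroup v1 path `/sys/fs/cgroup/cpu` are assumptions made for this example, not something Sysbox or any particular runtime is known to use.

```go
package main

import (
	"fmt"
	"math"
	"os"
	"runtime"
	"strconv"
	"strings"
)

// effectiveCPUs returns how many worker threads to start, given a CFS quota
// and period (both in microseconds) and the number of logical cores.
// A non-positive quota (the kernel reports -1 for "no limit") or a
// non-positive period is treated as "unconstrained".
// Hypothetical helper for illustration only.
func effectiveCPUs(quotaUs, periodUs int64, logicalCPUs int) int {
	if quotaUs <= 0 || periodUs <= 0 {
		return logicalCPUs
	}
	// Round up so that a 1.5-CPU quota yields 2 workers.
	n := int(math.Ceil(float64(quotaUs) / float64(periodUs)))
	if n > logicalCPUs {
		n = logicalCPUs
	}
	return n
}

// readInt64 reads a single integer from a cgroup control file.
func readInt64(path string) (int64, error) {
	b, err := os.ReadFile(path)
	if err != nil {
		return 0, err
	}
	return strconv.ParseInt(strings.TrimSpace(string(b)), 10, 64)
}

func main() {
	// Assumed cgroup v1 mount point; it may differ on a given system.
	const base = "/sys/fs/cgroup/cpu"
	quota, errQ := readInt64(base + "/cpu.cfs_quota_us")
	period, errP := readInt64(base + "/cpu.cfs_period_us")
	if errQ != nil || errP != nil {
		// Files missing or unreadable: fall back to "no limit".
		quota, period = -1, 0
	}
	fmt.Println("worker threads:", effectiveCPUs(quota, period, runtime.NumCPU()))
}
```

If the quota and period are not visible inside the container, which is the behaviour this issue describes, the sketch reads "no limit" and falls back to `runtime.NumCPU()`, i.e. all 64 cores in the scenario above.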
Agreed, it does seem that there are some competing use cases here with different behaviours.
Yes! Best example I can think of is the OpenJDK JVM (since JDK-8146115, which seems to have been backported to all major JVM versions in use). For example,
It's likely other JREs support this, but I haven't tested all of them. There's more recent work in supporting cgroup v2 as well (link). Back in Go-land, there's also an open issue for the Go runtime to automatically set
Thanks @johnstcn for the detailed response. Not arguing against anything you said, but one thing still confuses me about applications that rely on the CFS quota/period: cgroups are hierarchical in nature, so it seems to me an app can't really tell how much CPU it has without looking at the corresponding limits on its parent (and further ancestor) cgroups. For example, if the parent cgroup limits the app to 25% CPU via CFS quota/period, then it does not matter if the cgroup for the app itself is configured with a 200% CPU quota/period; it won't go over the 25% limit imposed by the parent. Now when such an app runs in a container, it won't even have access to the cgroups of its parent (and further ancestors), so I am not sure how a clear determination can be made. Does this make sense, or am I missing something basic here?
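The hierarchy argument can be sketched in Go as follows. This is a simplified model, assuming the effective allowance is just the smallest quota/period ratio along the ancestor chain; the `limit` type and `effectiveCPULimit` function are hypothetical names for this example, and the kernel's actual bandwidth accounting is more involved than a ratio comparison.

```go
package main

import "fmt"

// limit holds one cgroup's CFS settings in microseconds.
// QuotaUs <= 0 means the cgroup imposes no limit of its own.
// Hypothetical type for illustration only.
type limit struct {
	QuotaUs  int64
	PeriodUs int64
}

// effectiveCPULimit walks a chain of cgroups (root first, leaf last) and
// returns the tightest allowance, expressed in CPUs. limited is false when
// no cgroup in the chain sets a quota.
func effectiveCPULimit(chain []limit) (cpus float64, limited bool) {
	for _, l := range chain {
		if l.QuotaUs <= 0 || l.PeriodUs <= 0 {
			continue // this level is unconstrained
		}
		c := float64(l.QuotaUs) / float64(l.PeriodUs)
		if !limited || c < cpus {
			cpus, limited = c, true
		}
	}
	return cpus, limited
}

func main() {
	chain := []limit{
		{QuotaUs: 25000, PeriodUs: 100000},  // parent: 25% of one CPU
		{QuotaUs: 200000, PeriodUs: 100000}, // app's own cgroup: 200%
	}
	cpus, limited := effectiveCPULimit(chain)
	fmt.Println(cpus, limited) // prints "0.25 true"
}
```

A process that can only see the last entry of the chain would conclude it has 2 CPUs, while the parent caps it at 0.25.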
@ctalledo Yep, no arguments from me there either :-) Just what you said there also applies to the
bug: sysbox-runc does not copy cpu.cfs_{quota,period}_us to syscont-cgroup-root
Summary
Similar to #303, sysbox-runc is not copying the cgroup CPU quota and period values from the parent cgroup to syscont-cgroup-root.
As a result, processes running inside the container cannot tell how much CPU they have to work with.
Impact
Low. The container still gets limited, but the limit is opaque and not visible inside the container.
Steps to reproduce
Reproduced on sysbox-ce version 0.5.0 and version 0.5.2.