-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add cgroup v2 support to coreclr #34334
Conversation
885ea76
to
16b3a75
Compare
I don't understand the macOS failures. Can someone help me understand the build issue? |
Crossgen of System.Private.CoreLib.dll is crashing with SIGSEGV. Crossgen uses PAL too, so maybe it crashes in the newly added code due to the fact that there are no cgroups on OSX and somewhere we don't handle properly the case when the cgroup initialization failed? |
Aaah. I think I see it. I was hoping the assertion message would be printed at that point if such a condition triggers :( |
result = ReadMemoryValueFromFile(mem_usage_filename, &temp); | ||
if (result) | ||
{ | ||
if (temp > std::numeric_limits<size_t>::max()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is always going to be false on a 64-bit system. Do we force size_t
to be 64-bit on 32-bit systems?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure. This code was added as part of dotnet/coreclr#25724. I only moved it around.
Edit: As I see it now, ReadMemoryValueFromFile
returns a 64-bit integer. On 32-bit platforms, it needs to be clamped correctly to fit into a size_t
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, that was the intent of this check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a couple of nits, but it looks good otherwise.
Upstream cgroup v2 documentation is available at: https://www.kernel.org/doc/Documentation/cgroup-v2.txt Some notable differences between cgroup v1 and v2, from a coreclr point of view, include: - cgroup v2 has a single hierarchy, so we just look for a single "cgroup2" entry in /proc/self/mountinfo (without looking for a subsystem match). - Since cgroup v2 has a single hierarchy, /proc/self/cgroup generally has a single line "0::/path". There's no need to match subsystems or hierarchy ids here. - "memory.limit_in_bytes" is now "memory.max". It can contain the literal "max" to indicate no limit. - "memory.usage_in_bytes" is now "memory.current" - "cpu.cfs_quota_us" and "cpu.cfs_period_us" have been combined into a single "cpu.max" file with the format "$MAX $PERIOD". The max value can be a literal "max" to indicate a limit is not active. It is possible to have both cgroup v1 and v2 enabled on a host (but not inside a container, AFAIK). In that case, this change will pick one based on /sys/fs/cgroup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you!
Thanks for the review, your great suggestions and for merging this! |
@omajid You're welcome, thank you for adding the support! |
This commit brings in two changes from coreclr to libraries: 1. dotnet#980 "Fix named cgroup handling in docker" This fixes getting cgroup information for named cgroups inside containers. 2. dotnet#34334 "Add cgroup v2 support to coreclr" This is essentially the same change pushed to corefx (now libraries) to add cgroupv2 support, but this newer coreclr change has one major difference: it determines whether the system is using cgroup v1 or cgroup v2 once, and then explicitly uses that (only). This avoids issues on systems where both cgroup v1 and v2 are enabled, (but only one is being used by default).
This commit brings in two changes from coreclr to libraries: 1. #980 "Fix named cgroup handling in docker" This fixes getting cgroup information for named cgroups inside containers. 2. #34334 "Add cgroup v2 support to coreclr" This is essentially the same change pushed to corefx (now libraries) to add cgroupv2 support, but this newer coreclr change has one major difference: it determines whether the system is using cgroup v1 or cgroup v2 once, and then explicitly uses that (only). This avoids issues on systems where both cgroup v1 and v2 are enabled, (but only one is being used by default).
Is this blog post a fair assessment of the cgroup v2 landscape? If not, can you suggest something that is? |
Yes, I think it is. I have been using it as the source of truth myself 😄 I haven't found anything in it that's wrong. For a more technical details, https://systemd.io/CGROUP_DELEGATION/ is also a useful resource as is https://www.kernel.org/doc/Documentation/cgroup-v2.txt. |
Upstream cgroup v2 documentation is available at: https://www.kernel.org/doc/Documentation/cgroup-v2.txt
Some notable differences between cgroup v1 and v2, from a coreclr point of view, include:
cgroup v2 has a single hierarchy, so we just look for a single "cgroup2" entry in /proc/self/mountinfo (without looking for a subsystem match).
Since cgroup v2 has a single hierarchy, /proc/self/cgroup generally has a single line "0::/path". There's no need to match subsystems or hierarchy ids here.
"memory.limit_in_bytes" is now "memory.max". It can contain the literal "max" to indicate no limit.
"memory.usage_in_bytes" is now "memory.current"
"cpu.cfs_quota_us" and "cpu.cfs_period_us" have been combined into a single "cpu.max" file with the format "$MAX $PERIOD". The max value can be a literal "max" to indicate a limit is not active.
It is possible to have both cgroup v1 and v2 to both be enabled on a host (but not inside a container). In that case, this change will pick one, which may not be the correct one.We should be able to find the right one by checking the system configuration as described here.