-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JDK-8266490: Extend the OSContainer API to support the pids controller of cgroups #4518
Conversation
👋 Welcome back mbaesken! A progress list of the required criteria for merging this PR into |
Webrevs
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this work. How did you test this? Did you run container tests on a cgroups v1 and cgroups v2 system?
@@ -122,9 +133,10 @@ bool CgroupSubsystemFactory::determine_type(CgroupInfo* cg_infos, | |||
char buf[MAXPATHLEN+1]; | |||
char *p; | |||
bool is_cgroupsV2; | |||
// true iff all controllers, memory, cpu, cpuset, cpuacct are enabled | |||
// true iff all required controllers, memory, cpu, cpuset, cpuacct enabled |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are enabled, please.
is_cgroupsV2 = is_cgroupsV2 && cg_infos[i]._hierarchy_id == 0; | ||
all_required_controllers_enabled = all_required_controllers_enabled && cg_infos[i]._enabled; | ||
} | ||
if (! cg_infos[i]._enabled) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This if is only present for debug logging and should be guarded to that effect.
for (int i = 0; i < CG_INFO_LENGTH; i++) { | ||
is_cgroupsV2 = is_cgroupsV2 && cg_infos[i]._hierarchy_id == 0; | ||
all_controllers_enabled = all_controllers_enabled && cg_infos[i]._enabled; | ||
// the pids controller is not there on older Linux distros |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: Change the code comment to // pids controller is optional. All other controllers are required
@@ -387,10 +421,13 @@ bool CgroupSubsystemFactory::determine_type(CgroupInfo* cg_infos, | |||
*flags = INVALID_CGROUPS_V1; | |||
return false; | |||
} | |||
if (!cg_infos[PIDS_IDX]._data_complete) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, this if should be guarded with debug logging being enabled.
int err2; | ||
err2 = subsystem_file_line_contents(_pids, "/pids.max", NULL, "%1023s", myline); | ||
if (err2 != 0) { | ||
if (strncmp(myline, "max", 3) == 0) return -3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We use -1
for "unlimited" elsewhere and should probably do the same here.
char myline[1024]; | ||
int err2; | ||
err2 = subsystem_file_line_contents(_pids, "/pids.max", NULL, "%1023s", myline); | ||
if (err2 != 0) { | ||
if (strncmp(myline, "max", 3) == 0) return -3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like it should use GET_CONTAINER_INFO_CPTR
macro and then limit_from_str
from cgroups v2 code. Perhaps move limit_from_str
method to the base class.
jlong CgroupV2Subsystem::pids_max() { | ||
// we have to handle the special "max" value | ||
GET_CONTAINER_INFO(jlong, _unified, "/pids.max", | ||
"Maximum number of tasks is: " JLONG_FORMAT, JLONG_FORMAT, pidsmax); | ||
// not a number -> could be "max" | ||
if (pidsmax < 0) { | ||
char myline[1024]; | ||
int err2; | ||
err2 = subsystem_file_line_contents(_unified, "/pids.max", NULL, "%1023s", myline); | ||
if (err2 != 0) { | ||
if (strncmp(myline, "max", 3) == 0) return -3; | ||
} | ||
} | ||
return pidsmax; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have this pattern of needing to handle max
elsewhere in cgroups v2 code. See for example: CgroupV2Subsystem::cpu_quota()
. We should handle it similarly here.
src/hotspot/os/linux/os_linux.cpp
Outdated
if (j == -3) { | ||
st->print_cr("max"); | ||
} else { | ||
st->print_cr("%s", j == OSCONTAINER_ERROR ? "not supported" : "unlimited"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should treat the unlimited case similar to how we handle them elsewhere. I'm not sure this magic constant of -3
gives us any more info that we'd get with -1
that we use elsewhere.
* pids subsystem | ||
****************************************************************/ | ||
public long getPidsMax() { | ||
return CgroupV1SubsystemController.longValOrUnlimited(getLongValue(pids, "pids.max")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this value may be max
we should use the same logic than for v2. I.e.:
String pidsMaxStr = CgroupSubsystemController.getStringValue(pids, "pids.max");
return CgroupSubsystemController.limitFromString(pidsMaxStr);
@@ -168,7 +169,7 @@ | |||
"perf_event 4 1 1\n" + | |||
"net_prio 5 1 1\n" + | |||
"hugetlb 6 1 1\n" + | |||
"pids 3 80 1"; | |||
"pids 9 80 1"; // the 3 did not match 9 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment leaves the reader none the wiser. I think you are alluding to controller id matching between /proc/cgroups
and /proc/self/cgroup
. If so, please use that info.
Hi Severin , thanks for all the comments. I prepared a second version with those changes Regarding your questions about tests, I run the exisiting docker/cgroup related tests; and also checked |
Have a look at Then consider adding the pids limit to the |
Hi Severin, thanks for the suggestions . Best regards, Matthias |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks pretty good now. Looking forward to seeing container tests for this new code.
if (limit_str == NULL) { | ||
return OSCONTAINER_ERROR; | ||
} | ||
// Unlimited memory in Cgroups V2 is the literal string 'max' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't add version specific comments to version agnostic code. Suggestion: "Unlimited memory in cgroups is the literal string 'max' for some controllers, for example the pids controller."
jlong pidsmax = limit_from_str(pidsmax_str); | ||
return pidsmax; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this local variable? Consider using return limit_from_str(pidsmax_str);
instead.
jlong pidsmax = limit_from_str(pidsmax_str); | ||
return pidsmax; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here. Use return limit_from_str(pidsmax_str);
Hi Severin , I did some adjustments following your suggestions. |
OK. Please also add a test on the hotspot side. You may want to add relevant parts to |
Thanks for the suggestion, I will look into TestMisc.java . |
I added some HS testing code in the latest commit. Best regards, Matthias |
Thanks, I'll test it and review once again. Meanwhile, please merge the master branch into your |
@MBaesken
|
I think we should do something like this, which seems to solve the podman case (that's in-line with the docker-run man page):
Note that for docker the default value for not setting it via
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the hotspot container test there is no "unlimited" test. Is that intentional? Perhaps align with the JDK side?
@@ -167,19 +178,30 @@ bool CgroupSubsystemFactory::determine_type(CgroupInfo* cg_infos, | |||
cg_infos[CPUACCT_IDX]._name = os::strdup(name); | |||
cg_infos[CPUACCT_IDX]._hierarchy_id = hierarchy_id; | |||
cg_infos[CPUACCT_IDX]._enabled = (enabled == 1); | |||
} else if (strcmp(name, "pids") == 0) { | |||
log_info(os, container)("Detected optional pids controller entry in %s", proc_cgroups); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a particular reason why this is level INFO? We have log_debug() or log_trace() for similar debugging in the same file. I don't think this will need a higher level than debug.
@@ -405,12 +405,23 @@ public static void printSystemMetrics() { | |||
limit = c.getMemoryAndSwapLimit(); | |||
ostream.println(formatLimitString(limit, INDENT + "Memory & Swap Limit: ", longRetvalNotSupported)); | |||
|
|||
limit = c.getPidsMax(); | |||
ostream.println(formatLimitString(limit, INDENT + "Maximum number of tasks available to the process: ", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This uses some different capitalization that for other metrics. Could we perhaps shorten this to: Maximum Processes Limit:
@@ -37,6 +37,7 @@ public static void main(String[] args) { | |||
System.out.println("Metrics.getMemoryLimit() == " + metrics.getMemoryLimit()); | |||
System.out.println("Metrics.getMemoryAndSwapUsage() == " + metrics.getMemoryAndSwapUsage()); | |||
System.out.println("Metrics.getMemoryUsage() == " + metrics.getMemoryUsage()); | |||
System.out.println("Metrics.getPidsMax() == " + metrics.getPidsMax()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason why this got added? It seems unrelated to the change at hand.
} | ||
|
||
private static void testPidsLimit(String pidsLimit) throws Exception { | ||
Common.logNewTestCase("TestPidsLimit"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice if the test case log included the expected limit. Something like:
Common.logNewTestCase("TestPidsLimit (limit: " + pidsLimit + ")");
Hi Severin, thanks for the comments. I added a commit with a number of adjustments src/hotspot/os/linux/cgroupSubsystem_linux.cpp src/java.base/share/classes/sun/launcher/LauncherHelper.java test/hotspot/jtreg/containers/docker/CheckOperatingSystemMXBean.java test/hotspot/jtreg/containers/docker/TestPids.java test/jdk/jdk/internal/platform/docker/TestPidsLimit.java |
@MBaesken Thanks. We need a solution for #4518 (comment) though.
|
Hi Severin, that's a pity and looks like a bug, because the docker documentation says
Do you have an idea what to set with docker 20 on your setup? I did not find much about this in the docker 20 release notes https://docs.docker.com/engine/release-notes/ . |
No, I don't know what to do about it. All I can see it comes back with a pids limit of |
Very strange indeed, I have docker 20.10.2 on my Ubuntu test server and there the tests work and no "38019" is coming back for -1/unlimited . What distro are you using? I did a quick search but did not find much about the mysterious |
I'm on Fedora 34 and have the moby distro build of docker: |
Possibly. Not sure where to look, though. |
I ask some local experts about that issue. |
This seems most reasonable. I'd suggest to accept a limit of
This seems overkill and prone to failures, IMHO. |
Hi Severin, I adjusted the tests so that in case of Unlimited, an integer value > 20000 is accepted as well. |
@MBaesken Thanks. I'll test it and will report back. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me now and passes tests on my cgroup v1 and cgroup v2 setups.
if (value.equals("Unlimited")) { | ||
opts.addDockerOpts("--pids-limit=-1"); | ||
} else { | ||
opts.addDockerOpts("--pids-limit="+value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style nit: Space before and after +
.
if (pidsLimit.equals("Unlimited")) { | ||
opts.addDockerOpts("--pids-limit=-1"); | ||
} else { | ||
opts.addDockerOpts("--pids-limit="+pidsLimit); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style nit: Space before and after +
.
@MBaesken This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 224 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
Thanks for the review and approval . Best regards, Matthias |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes look good to me.
I have one question/suggestion re. copyright comments. The files
test/hotspot/jtreg/containers/docker/TestPids.java
and
test/jdk/jdk/internal/platform/docker/TestPidsLimit.java
are new with this PR. Shouldn't the copyright then just specify the current year and not a range?
And how about adding the author's company as additional copyright holder?
/integrate |
Going to push as commit 089e83b.
Your commit was automatically rebased without conflicts. |
Hello, please review this PR; it extend the OSContainer API in order to also support the pids controller of cgroups.
I noticed that unlike the other controllers "cpu", "cpuset", "cpuacct", "memory" on some older Linux distros (SLES 12.1, RHEL 7.1) the pids controller might not be there (or not fully supported) so it was added as optional , see the coding
Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/4518/head:pull/4518
$ git checkout pull/4518
Update a local copy of the PR:
$ git checkout pull/4518
$ git pull https://git.openjdk.java.net/jdk pull/4518/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 4518
View PR using the GUI difftool:
$ git pr show -t 4518
Using diff file
Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/4518.diff