8253435: Cgroup: 'stomping of _mount_path' crash if manually mounted cpusets exist #295
👋 Welcome back simonis! A progress list of the required criteria for merging this PR into |
@simonis The following label will be automatically applied to this pull request: When this pull request is ready to be reviewed, an RFR email will be sent to the corresponding mailing list. If you would like to change these labels, use the |
Did you run container tests with this? |
Mailing list message from Bob Vandette on hotspot-runtime-dev: Yuk. I just fixed a bug which caused us to use the mount source for the cgroup type. Not fixing Are there any hints in /proc/self/cgroup or /proc/self/mounts that we could use to eliminate this manual mount? I'd be tempted to eliminate mountinfo entries that are 1) duplicate controllers and 2) not in “/sys/fs/cgroup” mount point. Bob.
Yes. It took me some time to find out that I have to set For release builds they all pass with and without the change (except |
Sorry, but I don't understand. Which bug are you speaking of, and has it been fixed in the jdk already?
Not that I'm aware of. I couldn't find any.
It's not easy to remove the right duplicate :) Checking for So what about the following solution:
I think this is the best we can do if we don't want to parse What do you think?
Yes, this is a bit painful. I should have said that.
Hmm, which ones did you run? It seems odd that they fail to run in fastdebug config. FWIW, I've crafted a regression test for this issue. Please include something like that if you can: A fix like this should make it pass (uses the
Sounds sensible to me. |
@bobvandette probably meant https://bugs.openjdk.java.net/browse/JDK-8252359. A little correction, though. We didn't use the mount source as the cgroup type before JDK-8252359, but we relied on the mount source to be Bob is right, though, prior to JDK-8252359, you wouldn't have hit the assert because of what I just said above. Your extra cpuset entries have |
For the record, the important one to run is this one:
It's independent of your host's cgroup files. I believe that test broke with your proposed v1 fix because after your patch any |
Mailing list message from Bob Vandette on hotspot-runtime-dev:
That's the bug I was referring to. Bob.
Mailing list message from Bob Vandette on hotspot-runtime-dev:
I'm ok with that approach as well. Bob.
568c489 to 2949e5f
Hi Severin, Bob, so here comes the new version as discussed. @jerboaa thanks for the additional test. I've merged it and extended it such that it checks both variants, when the manual csets controller comes before and when it comes after all the other controllers, to exercise both code paths in the detection. All container tests pass now (except Are you OK with the change? Thank you and best regards, |
I'm not sure this extra gymnastics will be worth the added complexity. This makes the code somewhat harder to follow and, essentially, we end up checking whether or not the cgroup controller is being mounted under /sys/fs/cgroup. The code tracks any interesting controller, like memory, cpu, cpuset and cpuacct and records the mount point. It's going to be /sys/fs/cgroup for almost all cases, would it not?
The skip for non-/sys/fs/cgroup controllers ends up being either /fo/bar/baz or whatever the lead-up path to the first seen "interesting" controller is or /sys/fs/cgroup. I'm not convinced it's really anything other than /sys/fs/cgroup. How about a simpler solution like this?
This seems to work for me.
I was expecting to see some logic in this "else if" section that recorded the first occurrence but did the validation on the second pass (cg_infos[CPUSET_IDX]._mount_path != NULL). When this situation is detected, we accept the mount with the /sys/fs/cgroup.
2949e5f to 7753bc5
You're right, the logic to ignore a However, the problem is that we don't know if the first or the second occurrence of Otherwise, my current solution tries to be conservative and does not assume a predefined mount point for Cgroup controllers. Instead it records the mount point of the first controller out of |
In my suggestion, it doesn't matter which entry is first. If we see the manual one first, record it. When the second one comes around, replace the first one if it's /sys/fs/cgroup. In the very unlikely event that there isn't a second non-manual one, I think we still want to record the manual mount point since there could be a cpuset limit set up which we should respect. As for the second point about /sys/fs/cgroup, I think that using this string is just as good if not better than assuming they are all mounted in the same subdirectory. If you follow my suggestion, then we will only be subjected to a failure to mount if there are two cpuset mount entries AND neither is mounted on /sys/fs/cgroup/cpuset.
Like this perhaps? I tend to think that in actual container workloads it might be unlikely to actually see multiple cpuset mounts. So in a sense that's a special case. The general case, before this bug, shouldn't be penalized. So perhaps the above would be worth considering. |
Yes, that's exactly what I was thinking. In the manual case from your example, we'd record "/" as the mount point if it showed up first and then overwrite it if /sys/fs/cgroup came along. |
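The record-then-overwrite idea described in the two comments above can be sketched as follows. This is a minimal illustration, not the HotSpot code: `CpusetInfo`, `under_cgroup_root` and `record_cpuset_mount` are hypothetical names, and `/sys/fs/cgroup` is assumed to be the conventional mount root as discussed in the thread.

```cpp
#include <cstring>
#include <string>

// Hypothetical stand-in for the per-controller record kept during detection.
struct CpusetInfo {
  bool seen = false;
  std::string mount_path;
};

// Assumed conventional root for cgroup controller mounts.
static const char* const kCgroupRoot = "/sys/fs/cgroup";

// True if `path` starts with the assumed cgroup root.
static bool under_cgroup_root(const std::string& path) {
  return path.compare(0, std::strlen(kCgroupRoot), kCgroupRoot) == 0;
}

// Called once per cpuset entry found in mountinfo, in file order.
static void record_cpuset_mount(CpusetInfo& info, const std::string& mount_point) {
  if (!info.seen) {
    // First occurrence: always record it, even if it is a manual mount,
    // so a lone manually mounted cpuset is still respected.
    info.seen = true;
    info.mount_path = mount_point;
    return;
  }
  // Duplicate: overwrite only if the recorded path is outside the cgroup
  // root and the new one is inside it.
  if (!under_cgroup_root(info.mount_path) && under_cgroup_root(mount_point)) {
    info.mount_path = mount_point;
  }
}
```

With this shape the order of the two entries does not matter: a manual mount seen first is replaced when the `/sys/fs/cgroup` entry arrives, and a manual mount seen second is ignored.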
7753bc5 to b3d0f28
OK, I just want to get this done so I can finally use debug builds again. I've picked your version now in the hope that you'll review that :) I only changed the condition
to
otherwise you'd still choose the alternative cpuset if that was mounted on a mount point which is lexicographically after I've also adapted the logs to reflect the fact that the current solution will simply choose the cpusets from the second mount in the (unusual?) case where the "normal" Cgroups are not mounted to Hope that's still fine. |
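The remark about a mount point that sorts "lexicographically after" another path points at a general pitfall: an ordering comparison is not a prefix test. The sketch below only illustrates that difference. The two conditions from the comment above are not reproduced here, both helper names are hypothetical, and `/sys/fs/cgroup` and `/zcpusets` are used purely as example paths.

```cpp
#include <cstring>

// Hypothetical helper: true only if `path` begins with `prefix`.
static bool starts_with(const char* path, const char* prefix) {
  return std::strncmp(path, prefix, std::strlen(prefix)) == 0;
}

// Ordering comparison, shown for contrast: true if `path` sorts at or
// after `other`. This says nothing about whether `other` is a prefix.
static bool sorts_at_or_after(const char* path, const char* other) {
  return std::strcmp(path, other) >= 0;
}
```

A path such as `/zcpusets` sorts after `/sys/fs/cgroup` without starting with it, so only the prefix test rejects it.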
Looks fine to me. Thanks for your patience!
@simonis This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for more details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 85 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
Thanks Severin. I'll wait one more day to also give Bob a chance to look at the final version and push after that. Best regards, |
Looks good Volker. |
Mailing list message from Bob Vandette on hotspot-runtime-dev: Looks good Volker. Bob. |
Thanks Bob. |
/reviewer credit @bobvandette |
@simonis |
/integrate |
@simonis Since your change was applied there have been 85 commits pushed to the
Your commit was automatically rebased without conflicts. Pushed as commit 0054c15. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
Hi,
can I please have a review (or an idea for a better fix) for this PR?
If a tool like cpuset is used to manually create and manage cpusets, the cgroups detection will be confused and will crash in a debug build or behave unexpectedly in a product build.
The problem is that the additionally mounted cpuset will be interpreted as if it belonged to a Cgroup controller:
The current fix solves this problem for manually created cpusets which don't have a "mount source" but this is yet another heuristic. I'm open to better solutions for detecting cpusets which don't belong to a Cgroup.
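For readers unfamiliar with the input, here is a minimal sketch of reading one mountinfo line, assuming the field layout documented in proc(5). `MountEntry`, `parse_mountinfo_line` and `has_option` are hypothetical names rather than HotSpot code.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record holding the fields relevant to cgroup detection.
struct MountEntry {
  std::string mount_point;
  std::string fs_type;
  std::string mount_source;
  std::string super_options;
};

// Parses one line laid out as described in proc(5):
//   id parent major:minor root mount_point options [optional...] - fs_type source super_options
// Returns false if the line does not have that shape.
static bool parse_mountinfo_line(const std::string& line, MountEntry& out) {
  std::istringstream in(line);
  std::vector<std::string> fields;
  for (std::string f; in >> f; ) fields.push_back(f);

  // Optional fields start at index 6 and end at the "-" separator.
  std::size_t sep = 6;
  while (sep < fields.size() && fields[sep] != "-") ++sep;
  if (sep + 3 >= fields.size()) return false;

  out.mount_point   = fields[4];
  out.fs_type       = fields[sep + 1];
  out.mount_source  = fields[sep + 2];
  out.super_options = fields[sep + 3];
  return true;
}

// True if the comma-separated option list contains exactly `name`.
static bool has_option(const std::string& opts, const std::string& name) {
  std::istringstream in(opts);
  for (std::string tok; std::getline(in, tok, ','); ) {
    if (tok == name) return true;
  }
  return false;
}
```

As a made-up example, a line ending in `- cgroup none rw,cpuset` parses to file system type `cgroup`, mount source `none` and a `cpuset` super option. The `none` mount source is an assumption about what a manually mounted cpuset looks like, not something quoted from the report.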
Thanks,
Volker
Download
$ git fetch https://git.openjdk.java.net/jdk pull/295/head:pull/295
$ git checkout pull/295