-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
8321931: memory_swap_current_in_bytes reports 0 as "unlimited" #17314
Conversation
👋 Welcome back gziemski! A progress list of the required criteria for merging this PR into |
@gerard-ziemski The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
Webrevs
|
@iklam @poonamparhar @jerboaa can you please review this fix? It's pretty simple bug fix I think. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we sure 0 means unlimited? I looked at https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt
and did not find any remarks.
https://unix.stackexchange.com/questions/420906/what-is-the-value-for-the-cgroups-limit-in-bytes-if-the-memory-is-not-restricte seems to indicate the value is some very high number. Or is this handled by the JVMs cgroup layer?
st->print("%s: ", metrics); | ||
if (j > 0) { | ||
if ((j > 0) || ((j == 0) && (limit == false))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clearer would be a reversal like this:
if (j == 0 && limit) {
"unlimited"
} else {
print
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I decided to have it this way to draw the attention to the fact that the value 0
for a non-limit
kind should be treated as numerical value and not a special case.
This provides, in my opinion, a self-documenting code without the need for an explicit comment. That was my rationale at least, but if that doesn't come across, then we can simplify and add an actual comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Up to you. Both are okay, this was just a small nit :)
@@ -45,7 +45,7 @@ class OSContainer: AllStatic { | |||
public: | |||
static void init(); | |||
static void print_version_specific_info(outputStream* st); | |||
static void print_container_helper(outputStream* st, jlong j, const char* metrics); | |||
static void print_container_helper(outputStream* st, jlong j, const char* metrics, boolean limit = false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we name this parameter better, or add a comment at least?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe boolean zero_means_unlimited = false
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would isLimit
be any better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes
I'll have a look tomorrow. In general unlimited means |
Now I'm not 100% sure. I think it was a source code comment somewhere where I saw this special 0 treatment. I will have to refresh my memory... Thank you for taking a look! |
Thank you for taking a look. I think I saw this special treatment of 0 value somewhere in source code. I will have to look. (I wish I posted this PR sooner when all this was still fresh in my memory, but by the same token, I'm happy that at least we are taking a look at this right now and not a year from now...) |
According to
E.g.,
I don't think there's a file under /sys/fs/cgroup where the numberic value of "0" means "unlimited". Instead, "0" means zero or "not allowed at all". Otherwise, how can you express "don't allow any swapping for this container"? I think the proper fix is:
It's worthwhile scanning the source code to for such cases and possibly fixing them. In most cases, I see that -1 means unlimited, and -2 (OSCONTAINER_ERROR) means error:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The bug is a bit thin on the details. Can that be improved? I'm not aware that any of the APIs in hotspot return 0 for unlimited
. If it is, it's a bug. It would be good to have this captured in a test somewhere. Would a test with --memory 200M --memory-swap=200M
(no swap), trigger the said case? I think it would. For cg v1 the test case would probably also need a swappiness value of 0
.
My original fix was just being extra careful and was fixing non-limit type values only. However, looking at https://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git/tree/Documentation/admin-guide/cgroup-v2.rst I noticed a section on
so 0 is a valid value, with no special meaning, so your suggestion on accepting 0 in all cases looks valid to me. Looks like the original implementation got this incorrect. Thank you for the review and feedback! |
Actually hotspot is doing the opposite: it returns "unlimited" when it was supposed to return "0". You can see it right now if you run "jcmd PID VM.info" and look at "memory_swap_current_in_bytes". No special flags needed to see this. Thank you for taking a look. I also agree that we should treat "0" as "0" for all cases. |
@gerard-ziemski Please add a test case for this in
Sorry for not being clear. This particular case exercises
|
Thank you for the feedback and the suggestion. I chose to check that |
Why? If a container is being run with I agree, though, that it might be better to move to a separate test. Up to you. |
This bug fix is correcting the behavior where I'd really prefer to keep the test as simple as possible to cover this specific fix, which in my opinion it does. Setting What do you think? |
I disagree. What you are looking for is if the actual value is |
As long as the test gets run in an environment where we do get I feel like I don't have enough container knowledge to write such a test, without investing more time learning the basics. Would you kindly help and write the test you want please? At least the part that sets up the environment and starts the test (on Linux only?) with swap==0 ? |
Sure. Here you go: |
Thank you very much! I will incorporate it into this fix shortly... |
I have added your test with one change. Instead of checking for strict syntax:
we do:
I did this in anticipation of JDK-8321932 which will change the layout.
Ready for a re-review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test would be a bit easier to read if we extracted the loose matching to a function. Suggestion:
[...]
out.shouldContain("memory_swap_max_limit_in_bytes");
checkLineForValue(str, "memory_swap_max_limit_in_bytes", "0");
[...]
private static void checkLineForValue(String output, String match, String value) {
[...]
}
@gerard-ziemski This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 25 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
@@ -45,7 +45,7 @@ class OSContainer: AllStatic { | |||
public: | |||
static void init(); | |||
static void print_version_specific_info(outputStream* st); | |||
static void print_container_helper(outputStream* st, jlong j, const char* metrics); | |||
static void print_container_helper(outputStream* st, jlong j, const char* metrics, boolean limit = false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe boolean zero_means_unlimited = false
?
I have to beg out, too much to do before FOSDEM. But you have two reviewers already, so.. |
Thank you all for the reviews! I will re-run Mach5 teir1-5 tests, just in case, before I push it... |
I had to make a small fix in the test - I forgot to pass required argument. Tested it with MACH5 tier1-5 and that looks fine. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still fine.
/integrate |
Going to push as commit 7777eb5.
Your commit was automatically rebased without conflicts. |
@gerard-ziemski Pushed as commit 7777eb5. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
We make a distinction between 0 for a limit vs non-limit cgroup values.
For all values (limit or non-limit), 0 simply means 0.
Output before from
jcmd PID VM.info
:Output now from
jcmd PID VM.info
:Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/17314/head:pull/17314
$ git checkout pull/17314
Update a local copy of the PR:
$ git checkout pull/17314
$ git pull https://git.openjdk.org/jdk.git pull/17314/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 17314
View PR using the GUI difftool:
$ git pr show -t 17314
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/17314.diff
Webrev
Link to Webrev Comment