8263236: runtime/os/TestTracePageSizes.java fails on old kernels #3415
|
👋 Welcome back shade! A progress list of the required criteria for merging this PR into |
Webrevs
|
|
Anyone? :) |
|
Anyone? This unfortunately breaks |
dholmes-ora
left a comment
Hi Aleksey,
I can't really evaluate the changes as I'm not familiar with the information that is being queried. I get the gist of things and as this is a test the real question is whether the test now passes okay. So on that basis I'll approve it.
Thanks,
David
|
@shipilev This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be: You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 12 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
|
I am trying to understand this. So: AnonHugePages is the number of huge pages mapped into the area, THP or explicit. Therefore: !ht && AnonHugePages > 0 -> THP?
What confuses me is that the kernel patch you refer to, torvalds/linux@50f8b92, sounds like the flag is not passed down to the memory management layer. Would this not effectively switch off THP for the region, so that the resulting mapping uses small pages? In that case the failing test would have been correct, since we specified UseTransparentHugePages but it did not work.
Sorry for my confusion.
..Thomas |
|
(sighs) The recent pull made the test fail again even with this patch. Let me see what is up there... |
|
This test is in tier1. Shall we ProblemList it if we can't fix it in a short time? |
|
We cannot problemlist it for older kernels only, unfortunately. |
|
Sorry, totally missed this PR. I saw the bug-report a while back and hoped my recent refactoring would help the situation, but I guess it only made things worse? Will take a look. |
|
I am looking at it too, since I have a machine where this failure reproduces reliably. Looks like the remaining failures are intermittent. |
|
I notice that the failures are like this with debug turned on: So these probably are not committed yet, because |
|
@shipilev, do you know which change broke the test again? I did a few different cleanups that are related. |
|
@shipilev, that sounds like a valid theory. But not sure why that should have changed recently. |
Don't know yet. But I think this kind of failure highlights that tracking
I am now not even sure that the test passed reliably during my first attempt. All 6 subtests intermittently pass/fail with this patch. Let me mull over this a bit. Maybe the saner way out would be checking the kernel version and bailing on older kernels. |
That would be one approach. When I first started to think about supporting THP for the test, I looked at a few different ways including |
|
Ran out of ideas. The new version checks the kernel version and bails on kernels lower than 5.x. I shall try and see if I can find the more precise kernel version where this was fixed. Meanwhile, would a coarse check like this work if I cannot find a more precise version? |
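A minimal sketch of the coarse check described above, for illustration only. It assumes the kernel release is read from the `os.version` system property and that anything below major version 5 is skipped; the class and method names (`KernelVersionGuard`, `majorVersion`, `shouldRun`) are hypothetical and are not taken from the actual test.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not the code from this PR.
class KernelVersionGuard {
    // Matches the leading "major.minor" of strings such as "4.19.0-16-amd64".
    private static final Pattern VERSION = Pattern.compile("^(\\d+)\\.(\\d+)");

    // Returns the major kernel version, or -1 if the string cannot be parsed.
    static int majorVersion(String osVersion) {
        Matcher m = VERSION.matcher(osVersion);
        return m.find() ? Integer.parseInt(m.group(1)) : -1;
    }

    // Assumed policy: run only on Linux with kernel 5.x or newer.
    // An unparseable version is treated as "too old" and skipped.
    static boolean shouldRun(String osName, String osVersion) {
        return "Linux".equals(osName) && majorVersion(osVersion) >= 5;
    }

    public static void main(String[] args) {
        String name = System.getProperty("os.name");
        String version = System.getProperty("os.version");
        if (!shouldRun(name, version)) {
            System.out.println("Skipping on " + name + " " + version);
            return;
        }
        System.out.println("Kernel " + version + " is new enough, running test");
    }
}
```

A check this coarse trades precision for simplicity: it also skips 4.x kernels that may already behave correctly, which is why a more precise version was still being sought.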
kstefanj
left a comment
Looks good to me. The isLinux() check should not be needed since we require this to only run on Linux. But it won't hurt and makes it more clear so I'm good with it.
tstuefe
left a comment
Looks good to me too.
|
Just realized one thing... you might be able to solve it with just adding to the |
|
Should we just handle AlwaysPretouch directly in os::reserve_memory() instead of having each caller do this? Or would this interfere with concurrent pretouching? |
|
In theory this would be good, but as you say it would make parallel/concurrent pre-touch harder. Or at least we would have to extend a lot of APIs to pass down the needed work gang. |
|
Okay, I did a rough bisect over pre-built Debian kernels, and that points to |
kstefanj
left a comment
👍
Or make this an optional part of reservation, to be controlled by yet another argument (sigh - maybe not :)
All right. Here is the kicker. Vanilla It copies the mmap tags to the VMA flags. Now, VMA flags are getting printed to How's that a problem for this test? Stare at this regexp:
And this only happens on some Debian kernels, because you have to have that new unhandled flag. This also explains why So this deserves a simple fix in the regexp itself: Testing that now... EDIT: Of course it works. Gaaaa. |
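The regexp itself did not survive in this copy of the conversation, so the following is only a sketch of the failure mode. It assumes the test captured the VmFlags line with a character class of word characters and spaces, and that the kernel prints an unhandled flag as `??`. Both patterns, the sample smaps line, and the class name `VmFlagsRegexDemo` are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo, not the code from this PR.
class VmFlagsRegexDemo {
    // Assumed original shape: flags consist of word characters and spaces only.
    static final Pattern STRICT = Pattern.compile("VmFlags: ([\\w ]*)");
    // Assumed fixed shape: also accept '?' so an unknown flag does not end the match.
    static final Pattern LENIENT = Pattern.compile("VmFlags: ([\\w\\? ]*)");

    // Returns the captured flag list, or an empty string if the line does not match.
    static String flags(Pattern p, String line) {
        Matcher m = p.matcher(line);
        return m.find() ? m.group(1) : "";
    }

    // True if the space-separated flag list contains the given two-letter flag.
    static boolean hasFlag(String flags, String flag) {
        return Arrays.asList(flags.trim().split("\\s+")).contains(flag);
    }

    public static void main(String[] args) {
        // Hypothetical smaps line with an unhandled flag printed before "hg".
        String line = "VmFlags: rd wr mr mw me ac ?? hg";
        System.out.println("strict sees hg:  " + hasFlag(flags(STRICT, line), "hg"));
        System.out.println("lenient sees hg: " + hasFlag(flags(LENIENT, line), "hg"));
    }
}
```

Under these assumptions the strict pattern stops capturing at the first `?`, so every flag printed after the unknown one, including `hg`, is silently dropped, while the lenient pattern keeps the whole line.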
kstefanj
left a comment
Great dig @shipilev 👍
So the hg vmFlag is present for all kernels we care about?
Apparently. It is the test regexp that mismatches when kernel replies |
Got that. Just wanted to make sure no one else had this problem because they were on an even older kernel where |
The way I see the kernel sources, |
|
Hah. Beautiful. And somewhat ironic, what are the odds that the offending patch was actually a workaround for a hotspot issue. +1. |
|
/integrate |
|
@shipilev Since your change was applied there have been 14 commits pushed to the
Your commit was automatically rebased without conflicts. Pushed as commit 36e5ad6. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
See the bug report for details. On some kernels, we have trouble parsing madvise tags from /proc/smaps.
Additional testing:
Progress
Issue
Reviewers
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/3415/head:pull/3415
$ git checkout pull/3415
Update a local copy of the PR:
$ git checkout pull/3415
$ git pull https://git.openjdk.java.net/jdk pull/3415/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 3415
View PR using the GUI difftool:
$ git pr show -t 3415
Using diff file
Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/3415.diff