8263236: runtime/os/TestTracePageSizes.java fails on old kernels #3415
Anyone? :) |
Anyone? This unfortunately breaks |
Hi Aleksey,
I can't really evaluate the changes as I'm not familiar with the information that is being queried. I get the gist of things and as this is a test the real question is whether the test now passes okay. So on that basis I'll approve it.
Thanks,
David
@shipilev This change now passes all automated pre-integration checks. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 12 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.
|
I am trying to understand this. So:
AnonHugePages - number of huge pages mapped into area, THP or explicit
Therefore: !ht && AnonHugePages > 0 -> THP?
What confuses me is that the kernel patch you refer to, torvalds/linux@50f8b92, sounds like the flag is not passed down to the memory management layer. Would this not effectively switch off THP for the region, so that the resulting mapping would use small pages? In that case the failing test would have been correct, since we specified UseTransparentHugePages but it did not work?
Sorry for my confusion.
..Thomas |
(sighs) The recent pull made the test fail again even with this patch. Let me see what is up there... |
This test is in tier1. Shall we ProblemList it if we can't fix it in a short time? |
We cannot problemlist it for older kernels only, unfortunately. |
Sorry, totally missed this PR. I saw the bug-report a while back and hoped my recent refactoring would help the situation, but I guess it only made things worse? Will take a look. |
I am looking at it too, since I have a machine where this failure reproduces reliably. Looks like the remaining failures are intermittent. |
I notice that the failures are like this with debug turned on:
So these probably are not committed yet, because |
@shipilev, do you know which change broke the test again? I did a few different cleanups that are related. |
@shipilev, that sounds like a valid theory. But not sure why that should have changed recently. |
Don't know yet. But I think this kind of failure highlights that tracking
I am now not even sure that the test passed reliably during my first attempt. All 6 subtests intermittently pass or fail with this patch. Let me mull over this a bit. Maybe the saner way out would be checking the kernel version and bailing on older kernels. |
That would be one approach, when I first started to think about supporting THP for the test I looked at a few different ways including |
Ran out of ideas. New version checks for kernel version and bails on kernels lower than 5.x. I shall try and see if I can find the more precise kernel version where this was fixed. Meanwhile, would the coarse check like this work, if I could not find a more precise version? |
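For readers following along, a coarse version gate of the kind described above might look roughly like the sketch below. This is not the code from the PR: the class and method names are hypothetical, and it assumes that the `os.version` system property carries the kernel release string on Linux (for example `4.19.0-16-amd64`).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not the check added in this PR.
final class KernelVersionCheck {
    // Captures the leading major number of a release string such as "4.19.0-16-amd64".
    private static final Pattern MAJOR = Pattern.compile("^(\\d+)\\.");

    // Returns the major version, or -1 if the string does not start with "<digits>.".
    static int majorVersion(String release) {
        Matcher m = MAJOR.matcher(release);
        return m.find() ? Integer.parseInt(m.group(1)) : -1;
    }

    // Skips only when we are on Linux and the kernel major version is known to be below 5.
    static boolean shouldSkip(String osName, String release) {
        if (!osName.toLowerCase().contains("linux")) {
            return false;
        }
        int major = majorVersion(release);
        return major >= 0 && major < 5;
    }

    public static void main(String[] args) {
        String osName = System.getProperty("os.name");
        String release = System.getProperty("os.version");
        if (shouldSkip(osName, release)) {
            System.out.println("Skipping: kernel " + release + " is older than 5.x");
            return;
        }
        System.out.println("Running on " + osName + " " + release);
    }
}
```

A gate like this is deliberately coarse: it cannot tell a patched 4.x distribution kernel from an unpatched one, which is why a more precise version is still worth looking for.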
Looks good to me. The isLinux() check should not be needed since we require this to only run on Linux. But it won't hurt and makes it more clear so I'm good with it.
Just realized one thing... you might be able to solve it with just adding to the |
Should we just handle AlwaysPretouch directly in os::reserve_memory() instead of having each caller do this? Or would this interfere with concurrent pretouching? |
In theory this would be good, but as you say it would make parallel/concurrent pre-touch harder. Or at least we would have to extend a lot of APIs to pass down the needed work gang. |
Okay, I did a rough bisect over pre-built Debian kernels, and that points to |
Or make this an optional part of reservation, to be controlled by yet another argument (sigh - maybe not :) |
All right. Here is the kicker. Vanilla
It copies the mmap tags to the VMA flags. Now, VMA flags are getting printed to How's that a problem for this test? Stare at this regexp:
And this only happens on some Debian kernels, because you have to have that new unhandled flag. This also explains why So this deserves a simple fix in the regexp itself:
Testing that now... EDIT: Of course it works. Gaaaa. |
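To illustrate the class of bug described above, here is a small sketch of how a flags regexp can silently stop matching when the kernel emits a token it did not expect. The patterns are illustrative, not the ones in the test. The `??` placeholder is an assumption about what an affected kernel prints for a VMA flag it has no mnemonic for. The class name is hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for illustration; the real test's regexp is not reproduced here.
final class VmFlagsParser {
    // Accepts only word characters and spaces in the flag list.
    static final Pattern STRICT = Pattern.compile("VmFlags: ([\\w ]*)");
    // Additionally accepts '?', so an assumed "??" placeholder does not break the match.
    static final Pattern TOLERANT = Pattern.compile("VmFlags: ([\\w? ]*)");

    // Returns the flag tokens, or null if the whole line does not match the pattern.
    static List<String> flags(Pattern pattern, String line) {
        Matcher m = pattern.matcher(line);
        if (!m.matches()) {
            return null;
        }
        String body = m.group(1).trim();
        if (body.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(body.split(" +"));
    }

    public static void main(String[] args) {
        String line = "VmFlags: rd wr mr mw me ac ?? hg";
        System.out.println("strict:   " + flags(STRICT, line));
        System.out.println("tolerant: " + flags(TOLERANT, line));
    }
}
```

With the strict pattern the whole line is rejected, so a later flag such as `hg` is never seen even though it is present. The tolerant pattern keeps the unknown token as just another entry in the list.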
Apparently. It is the test regexp that mismatches when kernel replies |
Got that. Just wanted to make sure no one else had this problem because they were on an even older kernel where |
The way I see the kernel sources, |
Hah. Beautiful. And somewhat ironic, what are the odds that the offending patch was actually a workaround for a hotspot issue. +1. |
/integrate |
@shipilev Since your change was applied there have been 14 commits pushed to the
Your commit was automatically rebased without conflicts. Pushed as commit 36e5ad6. |
See the bug report for details. On some kernels, we have trouble parsing madvise tags from /proc/smaps.
Additional testing:
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/3415/head:pull/3415
$ git checkout pull/3415
Update a local copy of the PR:
$ git checkout pull/3415
$ git pull https://git.openjdk.java.net/jdk pull/3415/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 3415
View PR using the GUI difftool:
$ git pr show -t 3415
Using diff file
Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/3415.diff