8259271: gc/parallel/TestDynShrinkHeap.java still fails "assert(covered_region.contains(new_memregion)) failed: new region is not in covered_region" #127
👋 Welcome back kbarrett! A progress list of the required criteria for merging this PR into |
@kimbarrett Setting summary to |
@kimbarrett The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
Lgtm. Thanks.
@kimbarrett This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been no new commits pushed to the ➡️ To integrate this PR with the above commit message to the |
looks good
Looks good.
/integrate
@kimbarrett Since your change was applied there have been 4 commits pushed to the
Your commit was automatically rebased without conflicts. Pushed as commit 685c03d. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
Please review this fix for an intermittent crash when using ParallelGC on
aarch64. The problem is a mis-ordered pair of reads that permits an
algorithmic invariant to be violated. The mis-ordering is due to the lack
of any explicit ordering request (a missing barrier).
In MutableSpace::cas_allocate, we had
If end is read before top, other threads may advance top and end between
those reads. If, when top is read, current top > old end and current top +
size > current end, the range check will unexpectedly pass because of
underflow in pointer_delta. This will allow top to be advanced to a value
which is > current end, violating the algorithmic invariant, and likely
leading to crashes or memory corruption.
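The underflow can be illustrated with a minimal sketch. This is not the HotSpot code: byte_delta and range_check are hypothetical stand-ins for an unsigned pointer difference and the range check described above, and the addresses are made up.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for an unsigned pointer difference, in bytes.
// Unsigned subtraction wraps around when left < right.
constexpr std::size_t byte_delta(std::uintptr_t left, std::uintptr_t right) {
  return static_cast<std::size_t>(left - right);
}

// Hypothetical range check: passes when size bytes appear to fit
// between top and end.
constexpr bool range_check(std::uintptr_t top, std::uintptr_t end, std::size_t size) {
  return byte_delta(end, top) >= size;
}

// Consistent snapshot: only 0x40 bytes remain, so a 0x100-byte request fails.
static_assert(!range_check(0x1000, 0x1040, 0x100), "check should fail");

// Stale end (0x1000) paired with a newer top (0x1800): the difference wraps
// to a huge value and the check passes even though top is already past end.
static_assert(range_check(0x1800, 0x1000, 0x100), "check passes due to underflow");
```

The second assertion corresponds to the case where end was read first and top had moved past that old end by the time it was read.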
gcc for x86 doesn't reorder the reads, but for aarch64 it does, and is
permitted to do so. Even if it didn't, there is nothing to prevent the
hardware from doing so. The solution is to use a load_acquire for top, to
ensure the needed order.
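As a rough sketch of the ordering idea, using standard C++ atomics rather than HotSpot's Atomic API: the Space type below is a hypothetical simplification, not MutableSpace, and it shows only the reader side. How end is grown and published by other threads is assumed and not shown.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical simplified space; addresses are plain integers here.
struct Space {
  std::atomic<std::uintptr_t> top;
  std::atomic<std::uintptr_t> end;

  Space(std::uintptr_t t, std::uintptr_t e) : top(t), end(e) {}

  // Returns the start of the allocated block, or 0 if size does not fit.
  std::uintptr_t cas_allocate(std::size_t size) {
    for (;;) {
      // Acquire load of top: the read of end below cannot be reordered
      // before it, by either the compiler or the hardware.
      std::uintptr_t obj = top.load(std::memory_order_acquire);
      std::uintptr_t cur_end = end.load(std::memory_order_relaxed);
      if (cur_end - obj < size) {
        return 0;  // not enough room
      }
      std::uintptr_t new_top = obj + size;
      // Default (sequentially consistent) CAS; retry if top moved or the
      // weak CAS failed spuriously.
      if (top.compare_exchange_weak(obj, new_top)) {
        return obj;
      }
    }
  }
};
```

Reading top first means the end value seen afterwards is at least as new as the top value, which is the order the range check relies on.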
This may have been the actual root cause of JDK-8257999. However, the
change made there was and still is needed for the reasons described.
Testing:
mach5 tier1-3
Even with knowledge of the failure mode it's very hard to reproduce. I was
unable to catch the underflow case in over 1K attempts using machines in our
test farm, though StefanK caught it a few times on a personal machine.
/summary Use load_acquire to order reads of top and end.
Download
$ git fetch https://git.openjdk.java.net/jdk16 pull/127/head:pull/127
$ git checkout pull/127