-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
8330275: Crash in XMark::follow_array #18941
Conversation
Signed-off-by: Ashutosh Mehra <asmehra@redhat.com>
👋 Welcome back asmehra! A progress list of the required criteria for merging this PR into |
@ashu-mehra This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 275 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
@ashu-mehra The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
Signed-off-by: Ashutosh Mehra <asmehra@redhat.com>
Webrevs
|
I am currently trying to get access to aarch64 system and run the tests test/hotspot/jtreg/gc/z and test/hotspot/jtreg/gc/x. |
/label add hotspot-gc |
@ashu-mehra |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @ashu-mehra,
Thanks for fixing this issue.
There's a number of changes style changes I would like to make to make sure that the code looks more inline with what the rest of the ZGC code looks like. But before we start with that I would like to request that we skip making the changes to marking stack code and limit the changes to only the probing code. Doing so will make it easier to get this fix reviewed and delivered.
Signed-off-by: Ashutosh Mehra <asmehra@redhat.com>
@stefank I am trying to understand the reason behind your suggestion to remove the changes in marking stack code. Are they not correct or is it that they don't belong to this PR? |
To me, it was not bleeding obvious that they were the right thing to do, and given other changes that doesn't follow the grown ZGC coding style, I wanted suggest a way forward for you to get this bug fixed, with less resistance from us ZGC developers/maintainers. That was the reasoning. |
I agree with Stefan. I would keep the patch as minimal as possible to make it easier to follow the actual error that has been fixed, and to make it easier for backporters to decide what to downport. Code cleanups can happen in a separate RFE. Ashu, are the other platforms actually broken? If yes, which ones? If a platform is not broken, I would defer touching it up to a separate cleanup RFE. Again because of patch clarity. |
So, the absolute minimal point-fix would be to change the value 47 to 46, which would be very easy to backport, right? If we still want to make the change that is currently in the PR I would like to tweak the code along the lines of what I've in my branch here: The extra patch:
The last two points where done to (at least for me) see and understand why the various plus and minuses where performed. I didn't touch the PPC code, since it's quite difference and I don't want to risk messing it up. |
I agree from the point of view of backporting, point-fix is all we need in this PR. @tstuefe As for the other platforms (riscv and ppc), looking at their code they seem to be broken in the same way as aarch64 but then the problem only happens if the user runs with > 1TB heap size with more than 48 addressable bits. @tstuefe @stefank please let me know if you agree with just doing the point-fix to aarch64. |
Absolutely. We can do any platform testing on other platforms and cleanups in subsequent RFEs. |
Signed-off-by: Ashutosh Mehra <asmehra@redhat.com>
Signed-off-by: Ashutosh Mehra <asmehra@redhat.com>
Sorry for the long absence on this PR. I have updated the PR to just do a point fix for aarch64. I have also done tier1, tier2 and tier3 tests on aarch64. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
FWIW, the 128TB and 64GB numbers are just confusing when we are talking about a bit position. If the 46th bit succeeds the usable address range is 128TB, and the 46th bit will account for 64TB out of those 128TB. I wouldn't mind at all if we just ripped out the mentions of 64TB and 64GB here.
Signed-off-by: Ashutosh Mehra <asmehra@redhat.com>
Done |
@tstuefe does this look ok? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, question inline.
@@ -142,9 +142,9 @@ | |||
// * 63-48 Fixed (16-bits, always zero) | |||
// | |||
|
|||
// Default value if probing is not implemented for a certain platform: 128TB | |||
static const size_t DEFAULT_MAX_ADDRESS_BIT = 47; | |||
// Minimum value returned, if probing fails: 64GB |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason you removed the comment for MINIMUM_MAX_ADDRESS_BIT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh! I think I misunderstood stefank's suggestion. I should have just removed the values 64GB and 128TB mentioned in the comment. Let me restore the rest.
…l numbers that can be confusing Signed-off-by: Ashutosh Mehra <asmehra@redhat.com>
As the last commit is a trivial change to add the comments back, I am not requesting new review and integrating it as is. |
/integrate |
Going to push as commit 42b1d85.
Your commit was automatically rebased without conflicts. |
@ashu-mehra Pushed as commit 42b1d85. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
// Minimum value returned, if probing fails: 64GB | ||
// Default value if probing is not implemented for a certain platform | ||
// Max address bit is restricted by implicit assumptions in the code, for instance | ||
// the bit layout of XForwardingEntry or Partial array entry (see XMarkStackEntry) in mark stack |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment was copy-n-pasted without updating the names to ZForwardingEntry and ZMarkStackEntry.
/backport jdk21u-dev |
@ashu-mehra the backport was successfully created on the branch backport-ashu-mehra-42b1d858 in my personal fork of openjdk/jdk21u-dev. To create a pull request with this backport targeting openjdk/jdk21u-dev:master, just click the following link: The title of the pull request is automatically filled in correctly and below you find a suggestion for the pull request body:
If you need to update the source branch of the pull then run the following commands in a local clone of your personal fork of openjdk/jdk21u-dev:
|
/backport jdk22u |
@ashu-mehra the backport was successfully created on the branch backport-ashu-mehra-42b1d858 in my personal fork of openjdk/jdk22u. To create a pull request with this backport targeting openjdk/jdk22u:master, just click the following link: The title of the pull request is automatically filled in correctly and below you find a suggestion for the pull request body:
If you need to update the source branch of the pull then run the following commands in a local clone of your personal fork of openjdk/jdk22u:
|
This PR addresses the issue in ZGC where the number of address offset bits can go beyond the limit imposed by the encoding scheme in mark stack, thereby causing the encoding to fail.
Encoding of partial array offset in mark stack requires that the max address bit be no more than 46 bit.
But the current mechanism to probe maximum address offset bits on aarch64, riscv and ppc platforms can return value larger that 44 bits. This patch sets the maximum address offset bits to 44.I have updated the generational mode to avoid subtracting 3 bits from the maximum address offset bit probed by the system, as the generational mode does not use multi-mapping.I have also updated the code to set MarkPartialArrayMinSizeShift dynamically depending on the number of address offset bits used. This would avoid running into such problem again if in future maximum address offset bits is increased beyond 44.For some reason (that I can't comprehend from the code) the existing implementation for probing the max addressable bit for ppc in non-generation ZGC is very different from other platforms and from generational mode as well. I have kept the existing implementation as is and just fixed it to ensure it does not return value greater than 44 bits.Testing:
test/hotspot/jtreg/gc/z and test/hotspot/jtreg/gc/x on x86tier1, tier2 and tier3 on aarch64 using fastdebug build with options JTREG="EXTRA_PROBLEM_LISTS=ProblemList-zgc.txt;JAVA_OPTIONS=-XX:+UseZGC -XX:+ZVerifyOops;JOBS=4" (as per the suggestion in JDK-8330275)Update: Striked out the changes that are not relevant now that it is only doing a point fix for aarch64
Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/18941/head:pull/18941
$ git checkout pull/18941
Update a local copy of the PR:
$ git checkout pull/18941
$ git pull https://git.openjdk.org/jdk.git pull/18941/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 18941
View PR using the GUI difftool:
$ git pr show -t 18941
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/18941.diff
Webrev
Link to Webrev Comment