8325469: Freeze/Thaw code can crash in the presence of OSR frames #18637
👋 Welcome back pchilanomate! A progress list of the required criteria for merging this PR into |
@pchilano This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 5 new commits pushed to the
Please see this link for an up-to-date comparison between the source branch of this pull request and the ➡️ To integrate this PR with the above commit message to the |
/label remove core-libs |
This looks good, but have you considered computing the value every time instead of caching it in _num_stack_arg_slots and increasing the size of every nmethod?
@@ -801,6 +802,7 @@ nmethod::nmethod(

init_defaults();
_entry_bci = entry_bci;
_num_stack_arg_slots = entry_bci != InvocationEntryBci ? 0 : _method->constMethod()->num_stack_arg_slots();
If I understand correctly, is the condition on this line the actual fix?
Yes. The point is that _num_stack_arg_slots should not be fixed for a given Method as now but it should depend on the actual nmethod.
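To make the reply above concrete, here is a minimal sketch of the idea, not the actual HotSpot code. `MethodSketch` and `NMethodSketch` are hypothetical stand-ins for `Method` and `nmethod`; only `_entry_bci`, `_num_stack_arg_slots` and `InvocationEntryBci` come from the diff, and the value -1 for `InvocationEntryBci` is an assumption about the HotSpot constant.

```cpp
#include <cassert>

// Assumption: marker for a normal (non-OSR) entry point.
constexpr int InvocationEntryBci = -1;

// Hypothetical stand-in for Method: one value per Java method.
struct MethodSketch {
  int num_stack_arg_slots;
};

// Hypothetical stand-in for nmethod: the value is decided per compiled body.
class NMethodSketch {
 public:
  NMethodSketch(const MethodSketch& method, int entry_bci)
      : _entry_bci(entry_bci),
        // An OSR nmethod records 0, because no caller laid out stack
        // arguments above its frame.
        _num_stack_arg_slots(entry_bci != InvocationEntryBci
                                 ? 0
                                 : method.num_stack_arg_slots) {
    assert(method.num_stack_arg_slots >= 0);
  }

  bool is_osr_method() const { return _entry_bci != InvocationEntryBci; }
  int num_stack_arg_slots() const { return _num_stack_arg_slots; }

 private:
  int _entry_bci;
  int _num_stack_arg_slots;
};
```

With this shape, two compiled bodies of the same method can report different values: the normal-entry one reports the method's stack argument slots, while the OSR one reports zero.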
while (!cont.isDone()) {
    cont.run();
    if (freezeFast && !thawFast && fooCallCount == 2) {
        // All frames freezed in last yield should be compiled
freezed -> frozen
Fixed.
// provoke OSR compilation
for (int i = 0; i < 500_000 * fooCallCount; i++) {
}
fooCallCount++;
Perhaps use WhiteBox to check if we're OSRed?
I'll test using isMethodCompiled(m, true) as another condition to break the loop.
// provoke OSR compilation
for (int i = 0; i < 5_000_000 * fooCallCount; i++) {
}
fooCallCount++;
Ditto. Perhaps use WhiteBox to check if we're OSRed?
Since this is used in the thaw fast path too, I wanted to avoid the extra load of constMethod if possible, but I think either case is fine. Moving _is_unlinked to where the other booleans are defined actually keeps the size of the nmethod the same as before (368 bytes). What do you think?
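The size argument can be illustrated with a small layout sketch. This is not the real nmethod layout: only `_entry_bci`, `_is_unlinked` and `_num_stack_arg_slots` are names from this PR, while the other fields and the struct names are hypothetical. It assumes a common ABI where `bool` is 1 byte and `std::int32_t` is 4-byte aligned.

```cpp
#include <cstdint>

// Before the change: an isolated bool is followed by padding.
struct LayoutBefore {
  std::int32_t _entry_bci;
  bool         _is_unlinked;   // padding follows on common ABIs
  std::int32_t _comp_level;    // hypothetical field
  bool         _flag_a;        // hypothetical field
  bool         _flag_b;        // hypothetical field
};

// New field appended without moving anything: the struct grows.
struct LayoutNaive {
  std::int32_t _entry_bci;
  bool         _is_unlinked;
  std::int32_t _comp_level;
  bool         _flag_a;
  bool         _flag_b;
  std::int32_t _num_stack_arg_slots;
};

// New field added and the isolated bool moved next to the other bools:
// the new int occupies what used to be a bool plus its padding.
struct LayoutReordered {
  std::int32_t _entry_bci;
  std::int32_t _num_stack_arg_slots;
  std::int32_t _comp_level;
  bool         _is_unlinked;
  bool         _flag_a;
  bool         _flag_b;
};
```

Comparing `sizeof(LayoutReordered)` with `sizeof(LayoutBefore)` and `sizeof(LayoutNaive)` shows the effect under the stated ABI assumptions: grouping the booleans lets the extra field fit in existing padding.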
Can you do a performance measurement to see if the extra load actually makes a difference? I think @vnkozlov is also doing nmethod field reordering/compaction, so the relative overhead of an extra field might not remain 0.
It may be hard to do a proper measurement because the number of methods in our microbenchmarks is small. We're also talking about an extra branch, I think. This is code that can be called a million times per second per core. It's very performance sensitive. So I would prefer to first see if there's an impact on nmethod size, and only if there is, consider whether the speed implications are acceptable.
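For comparison, the compute-on-demand alternative discussed here could look roughly like the sketch below. All type names are hypothetical stand-ins, and `kInvocationEntryBci` is an assumed placeholder for HotSpot's `InvocationEntryBci`. The point is only to show where the extra branch and the dependent loads come from.

```cpp
#include <cassert>

// Assumption: placeholder for HotSpot's non-OSR entry marker.
constexpr int kInvocationEntryBci = -1;

// Hypothetical stand-ins for ConstMethod and Method.
struct ConstMethodModel { int num_stack_arg_slots; };
struct MethodModel { const ConstMethodModel* const_method; };

// Hypothetical nmethod variant that stores no extra field.
class OnDemandNMethodModel {
 public:
  OnDemandNMethodModel(const MethodModel* method, int entry_bci)
      : _method(method), _entry_bci(entry_bci) {
    assert(method != nullptr && method->const_method != nullptr);
  }

  int num_stack_arg_slots() const {
    // Extra branch on every call.
    if (_entry_bci != kInvocationEntryBci) {
      return 0;
    }
    // Two dependent loads: Method, then ConstMethod.
    return _method->const_method->num_stack_arg_slots;
  }

 private:
  const MethodModel* _method;
  int _entry_bci;
};
```

This variant keeps the object smaller but pays for the branch and the pointer chasing on each query, which is the trade-off being weighed against caching the value in a field.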
OK, let's go with the new nmethod field.
The parent pull request that this pull request depends on has now been integrated and the target branch of this pull request has been updated. This means that changes from the dependent pull request can start to show up as belonging to this pull request, which may be confusing for reviewers. To remedy this situation, simply merge the latest changes from the new target branch into this pull request by running commands similar to these in the local repository for your personal fork:
git checkout JDK-8325469
git fetch https://git.openjdk.org/jdk.git master
git merge FETCH_HEAD
# if there are conflicts, follow the instructions given by git merge
git commit -m "Merge master"
git push
@pchilano this pull request can not be integrated into
git checkout JDK-8325469
git fetch https://git.openjdk.org/jdk.git master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push
Thanks for the reviews @pron and @dean-long!
/integrate
Going to push as commit fd331ff.
Your commit was automatically rebased without conflicts.
Hi @pchilano, this change did affect my PR, which tries to reduce […]. I am fine with caching the value in […]. I am currently resolving conflicts in my PR with your changes and I am planning to use […]
Yes. I just used int because that was the return type of num_stack_arg_slots(), which I moved from method.hpp, but I missed that the field can just be defined as a u2 instead.
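A narrower field of this kind could be populated through a checked conversion, sketched below. The alias `u2` mirrors the 16-bit unsigned type named in the comment above; `checked_narrow_to_u2` is a hypothetical helper, not a HotSpot function.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// 16-bit unsigned alias, assumed equivalent to the u2 mentioned above.
using u2 = std::uint16_t;

// Hypothetical helper: convert an int-returning computation to u2,
// asserting in debug builds that the value fits.
inline u2 checked_narrow_to_u2(int value) {
  assert(value >= 0 && value <= std::numeric_limits<u2>::max());
  return static_cast<u2>(value);
}
```

The assert documents the assumption that the slot count always fits in 16 bits; whether that bound holds for every platform is something the real change would need to confirm.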
Okay. Thanks!
Freeze/thaw code assumes that a compiled frame for a method where num_stack_arg_slots() > 0 will always have the arguments set up above the metadata at the bottom of the frame. But when converting an interpreter frame to a compiled frame during OSR, we don't explicitly leave room for the stack arguments after popping the interpreter frame. All parameters needed will be read from the "buf" array and stored inside the frame before calling OSR_migration_end().
This mismatch between how the stack looks and what we assume can lead to different crashes. In particular, the issue happens when the OSR conversion happens for the bottom-most frame in the stack. If the OSR frame has a caller in the stack, then there is no issue on freezing/thawing. I added more details about this in the bug comments.
When the OSR conversion happens for the bottom-most frame, a future freeze/thaw can lead to crashes in all cases: freeze_fast/thaw_fast, freeze_fast/thaw_slow, freeze_slow/thaw_slow. When freezing fast, thawing either fast or slow can lead to trying to read past the bottom of the stackChunk or writing below the allocated space in the stack. The freeze slow case is almost okay, except that it uncovered an invalid assert that is triggered if the size of the OSR frame plus all the other frames we freeze takes less space than the size of locals minus parameters of the interpreter frame that was replaced via OSR. I also added more details about these in the bug comments.
I tested different fixes, but I think the most straightforward one is to add _num_stack_arg_slots to the nmethod class and initialize it depending on whether the nmethod is an OSR one or not.
The patch includes a new test that exercises all these possible combinations of OSR frame at bottom of stack or not, and then freezing fast/slow and thawing fast/slow. The bottom case where we freeze fast and thaw slow reproduces the originally reported crash. There are actually two different failure modes depending on whether this is a thaw top or return barrier case. The other bottom cases lead to the other crashes described in the bug comments.
The new test uncovers another bug besides the OSR issues, but since it's a different one I filed a separate JBS issue (JDK-8329665) and I made this a dependent PR.
I tested the current patch with the new test and also ran it through mach5 tiers1-6.
Thanks,
Patricio
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/18637/head:pull/18637
$ git checkout pull/18637
Update a local copy of the PR:
$ git checkout pull/18637
$ git pull https://git.openjdk.org/jdk.git pull/18637/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 18637
View PR using the GUI difftool:
$ git pr show -t 18637
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/18637.diff