-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
8286104: use aggressive liveness for unstable_if traps #8545
Conversation
👋 Welcome back xliu! A progress list of the required criteria for merging this PR into |
Webrevs
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you doing BCI and SP manipulation only to affect result of liveness_at_bci()
call in kill_dead_locals()
? May it can be done less disruptive way.
I am concern that different bci
and sp
may affect correctness of uncommon trap call generation (its debug info). It affects result of compute_stack_effects()
, too_many_recompiles()
and should_reexecute_implied_by_bytecode()
.
In addition logs about such uncommon trap will be different.
|
hi, Vladimir and Tobias, Thank you for reviewing this patch.
yes, that is my goal. I tried the less disruptive way. One corner case is that the if bytecode may reference an object which happens to be dead at the next bci. It would be wrong if we avoid restoring it in deoptimization. As a result, I have to parse the if-bytecode. Then I came up the idea that we can redefine the semantic of
I want to change debuginfo. This example shows that we don't need to save the scalarized object in debuginfo. We don't need to repush arguments to stack(sp) for comparison either. Both There are 2 cases in Parse::do_if(). We do use |
hi, @vnkozlov
|
Got it. So the idea is that we don't need to re-materialize Object in case the path which will be taken after deoptimization in Interpreter will not use it. This seems reasonable optimization. But I am still not sure that current implementation (shift uncommon trap to next bc) is valid.
First, for your information we re-execute I am not sure how your changes verify your assumptions.
If you re-execute
PreserveJVMState is used there only to provide correct starting JVM state for other branch processing. After merge JVM states from both branches are merged. |
Also in the test you pointed new uncommon trap is placed at |
Or not move uncommon trap - keep it at |
first, I have to admit that I overlook the fact that reexecution will update MDO of if bytecode. For item 2: reexecution in interpreter takes the other path, or next bci. Here is my thought: when HotSpot reinterprets the original bc, it must take the unstable path, otherwise this deoptimization shouldn't happen in the first place. In this patch, I just shift bci of uncommon_trap to next_bci, not only in IR, but also in ScopeDesc! Interpreter will start over with the shifted bci. Here is what I generated for the unstable_if of Test::foo using this patch. please note that the bci changed from 11 to 21. The message pasted in JDK-8276998 was from my initial patch. I retained bci for uncommon_trap initially. I think this change is simpler.
I admit my change misses to update MDO in deoptimization. I think it's fine because HotSpot won't recompile this method again until interpreter evaluates it thousands of times. The MDO still gets updated. |
Item 2. Yes, that is what will happened and that is why we may do this optimization. Your original words were confusing. Again, MDO may not be update for rare case even after running in Interpreter for some time. As result recompiled code will be the same and we again hit unc trap. In my additional comment I stated that placing uncommon trap to BC after merge point is wrong. You may not have all info in general cases (several branches merging to the same BC). |
hi, Vladimir, I really appreciate you look into this issue. your inputs are super helpful. C2's speculative compilation is fascinating. The motive we would like to prune an infrequent branch should be that we can simplify IR and shrink code size. Back to Test::foo(), the else block is blank... Pruning a trivial basic block actually makes IR more complex. We should not replace a trivial basic block with unstable_if . What Tobias reported is a great example. C2 compiled foo() before it became mature. therefore, C2 refrained his "heroic optimization. it turns out that c2 selects CMOVE and gets simpler, faster and smaller code... thanks, |
This feature not only folds two CMPI but also merge two uncommon_traps. it uses the dominating uncommon_trap and revaluate the two if in interpreter. currently, aggressiveliveness can't work for that.
|
I submitted new testing. |
2 tests failed so far. I put information into RFE. |
No other new failures in my tier1-7 testing. I think after you address found issue it will be ready to integrate (after second review by other Reviewer). But I would suggest to push it into JDK 20 after 19 is forked in one week to get more testing before release. |
Also add a regression test
hi, @vnkozlov, I file a new issue JDK-8287840(PR) for the new regression. It will allow some IfNodes carry out fold-compares in 1st IterGVN. Back to this PR, for safety, I added a mechanism to guarantee that fold-compares gives up if its dominating trap has been modified. This is a guardrail and should not happen frequent. I also add a regression test to cover fold-compare cases. for
sure. no problem. thanks, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update is good. I agree with avoiding folding if unc trap info was modified.
I have few comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update looks good.
You need second review.
@navyxliu This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 123 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall. Some comments/questions:
- Why can't we remove traps that have been modified?
- I'm wondering how useful
Compile::print_statistics()
really is. Is it worth extending it? Is anyone using it? - Do you need to check for unstable if traps in
Node::destruct
?
// Specialized uncommon_trap of unstable_if, we have 2 optimizations for them: | ||
// 1. suppress trivial Unstable_If traps | ||
// 2. use next_bci of _path to update live locals. | ||
class UnstableIfTrap { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about moving this information into CallStaticJavaNode
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think CallStaticJavaNode is popular. uncommon_trap/unstable_if is just a special case. that's why factor out and use a dedicated class for it.
src/hotspot/share/opto/compile.cpp
Outdated
CallStaticJavaNode* unc = trap->uncommon_trap(); | ||
int next_bci = trap->next_bci(); | ||
|
||
if (next_bci != -1 && !trap->modified()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How can it be already modified? We are only processing each trap once, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. there is only one pass right now. so it's redundant.
I discussed this with Vladimir Kozlov. Technically, we can run it after 'inline_incrementally()" as well. I think we can add it when we prove this optimization is stable. "!trap->modified()" can dedup for that.
src/hotspot/share/opto/ifnode.cpp
Outdated
@@ -838,7 +838,8 @@ bool IfNode::has_only_uncommon_traps(ProjNode* proj, ProjNode*& success, ProjNod | |||
ciMethod* dom_method = dom_unc->jvms()->method(); | |||
int dom_bci = dom_unc->jvms()->bci(); | |||
if (!igvn->C->too_many_traps(dom_method, dom_bci, Deoptimization::Reason_unstable_fused_if) && | |||
!igvn->C->too_many_traps(dom_method, dom_bci, Deoptimization::Reason_range_check)) { | |||
!igvn->C->too_many_traps(dom_method, dom_bci, Deoptimization::Reason_range_check) && | |||
igvn->C->remove_unstable_if_trap(dom_unc)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be moved to IfNode::merge_uncommon_traps
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried that in the first place. It turns out that situation is complex. The following 3 functions take place in order.
has_only_uncommon_traps() => fold_compares_helper() => merge_uncommon_traps()
IfNode::has_only_uncommon_traps
is the last stop before modifying code. if we moved the predicate "remove_unstable_if_trap(dom_unc)" to merge_uncommon_traps()
, it would be too late when it returns true. We have to rollback code change in fold_compares_helper
.
also changed the option from AggressiveLivenessForUnstableIf to OptimizeUnstableIf.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the clarifications.
You added the same test twice with a slightly different name.
test/hotspot/jtreg/compiler/c2/irTests/TestOptimizeUnstableIf.java
Outdated
Show resolved
Hide resolved
/* | ||
* @test | ||
* @bug 8286104 | ||
* @summary Test C2 uses aggressive liveness to get rid of the boxing object which is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* @summary Test C2 uses aggressive liveness to get rid of the boxing object which is | |
* @summary Test that C2 uses aggressive liveness to get rid of the boxing object which is |
Looks good! Let me re-submit testing. I'll report back once it passed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All tests passed.
hi, @vnkozlov |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New changes are good except the change to flag.
also update commments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. I ran it through tier1 to make sure this last change did not break it.
I take a look at the build failure on ubuntu/i386. The problem is that apt-get can't install gcc toolchains from a custom repo |
Good. |
/integrate |
Going to push as commit 31e50f2.
Your commit was automatically rebased without conflicts. |
I found that some phi nodes are useful because its uses are uncommon_trap nodes. In worse case, it would hinder boxing object eliminating and scalar replacement. Test.java of JDK-8286104 reveals this issue. This patch allows c2 parser to collect liveness based on next bci for unstable_if traps. In most cases, next bci is closer to exits, so live locals are diminishing. It helps to reduce the number of inputs of unstable_if traps.
This is not a REDO of Optimization of Box nodes in uncommon_trap(JDK-8261137). Two patches are orthogonal. I adapt test from TestEliminateBoxInDebugInfo.java, so part of credit goes to the original author. I found that Scalar replacement can take care of the object
Integer ii = Integer.valueOf(value)
in original test even it can't be removed by later inliner. I tweak the profiling data of Integer.valueOf() to hinder scalar replacement.This patch can cover the problem discovered by JDK-8261137. I ran the microbench and got 9x speedup on x86_64. It's almost same as JDK-8261137. Besides runtime, the codecache utilization reduces from 1648 bytes to 1192 bytes, or 27.6%
Before:
After:
Testing
I ran tier1 test with and without
-XX:+DeoptimizeALot
. No regression has been found yet.Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/8545/head:pull/8545
$ git checkout pull/8545
Update a local copy of the PR:
$ git checkout pull/8545
$ git pull https://git.openjdk.org/jdk pull/8545/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 8545
View PR using the GUI difftool:
$ git pr show -t 8545
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/8545.diff