-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
8293996: C2: fix and simplify IdealLoopTree::do_remove_empty_loop #10393
Conversation
👋 Welcome back epeter! A progress list of the required criteria for merging this PR into |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
@eme64 This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 95 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice analysis, looks good to me too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice analysis, looks good to me, too!
As we have discussed offline, I agree that we should investigate in a separate RFE if there are other cases where we miss to optimize nodes in IGVN due to not adding outputs of outputs back to the worklist. But removing the empty loop as in your fix is better and less fragile instead of relying on IGVN.
Thanks @rwestrel @TobiHartmann @chhagedorn for the discussions and reviews! |
Going to push as commit dd51f7e.
Your commit was automatically rebased without conflicts. |
Context
In
IdealLoopTree::do_remove_empty_loop
, we detect empty loops, and break them such thatIGVN
can clean away the loops.For this, we used to simply replace the tripcount
Phi
node with the final iv. This has two intended effects:CountedLoopEnd
node itself collapses, as the back edge is not taken.Problem
We assume that replacing the tripcount necessarily leads to Optimizations of the nodes down to the
CountedLoopEnd
. If this optimization is not done, we are left with a broken loop that IGVN does not clean up. Later optimizations might now find the remnants of the loop, and assume the tripcountPhi
still exists - but it does not.In this concrete example, I had an AddI node that did not do the constant folding. We had
Now, if
<y>
gets replaced with a constant<int:2>
, output node100 AddI
is notified viaPhaseIterGVN::add_users_to_worklist
, however node101 AddI
is an output of the output - we currently do not notify it. But we would expect that it is folded to:This optimization does not happen, and the propagation down does not take place, prohibiting that the loop condition is determined to be
false
during IGVN.Solution
I saw two approaches to fix this issue:
CountedLoopEnd
are performed. This is difficult as it is easy to miss a small pattern. Here I could simply notify outputs of outputs, if they are bothAddI
nodes. Probably there will eventually be other little Optimizations that were missed, and then we have to fix them all one by one.false
, why not directly replace the condition of theCountedLoopEnd
withint:0
? We still need to replace the tripcountPhi
with the final iv, to ensure exit values are correct for users after the loop. But we can simplify the code inIdealLoopTree::do_remove_empty_loop
substantially, as we do not need to clone thecmp
node any more.I picked solution 2. I suggest that we still perform solution 1, in an RFE like JDK-8257197, which catches all optimizations that IGVN could have done but failed to do.
I added to the regression test.
4fb2bb5#diff-6a59f91cb710d682247df87c75faf602f0ff9f87e2855ead1b80719704fbedffR3132-R3138
I ran a large test suite.
Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/10393/head:pull/10393
$ git checkout pull/10393
Update a local copy of the PR:
$ git checkout pull/10393
$ git pull https://git.openjdk.org/jdk pull/10393/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 10393
View PR using the GUI difftool:
$ git pr show -t 10393
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/10393.diff