8294540: Remove Opaque2Node: it is broken and triggers assert #11477
Nice summary! I agree with this solution. A workaround for the assert seems to be complicated, given that Opaque2 nodes no longer really fulfill their original purpose, as you've stated and as also discussed in #10306.
Must remove for now, maybe implement properly later
Right, we could still come back to this optimization and implement it properly if it turns out to be beneficial.
class Opaque3Node : public Node {
  int _opt; // what optimization it was used for
  virtual uint hash() const;
  virtual bool cmp(const Node &n ) const;
-  virtual bool cmp(const Node &n ) const;
+  virtual bool cmp(const Node &n) const;
will do that 👍
// this kind of a Node, we'll get slightly pessimal, but correct, code. Thus
// it's OK to be slightly sloppy on optimizations here.
Not sure if we should also add this comment again to the Opaque3 nodes, as it suggests that it is okay to be sloppy with Opaque3 as well. On the other hand, what exactly is the definition of being sloppy in the context of Opaque3 nodes? So it might be more confusing than actually helping.
I would leave it out, unless we are sure that it really applies to Opaque3 nodes.
if (loop_head->unrolled_count() == 1) { // only for first unroll
  // Separate limit by Opaque node in case it is an incremented
  // variable from previous loop to avoid using pre-incremented
  // value which could increase register pressure.
  // Otherwise reorg_offsets() optimization will create a separate
  // Opaque node for each use of trip-counter and as result
  // zero trip guard limit will be different from loop limit.
  assert(has_ctrl(opaq), "should have it");
  Node* opaq_ctrl = get_ctrl(opaq);
  limit = new Opaque2Node(C, limit);
  register_new_node(limit, opaq_ctrl);
}
Could directly be simplified into assert(loop_head->unrolled_count() != 1 || has_ctrl(opaq), "should have opaque").
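A minimal sketch of why the suggested one-liner matches the guarded form, assuming nothing else remains in the if-block once the Opaque2 creation is removed. The helper name guarded_assert_condition is hypothetical and is not part of HotSpot; it only evaluates the condition that the combined assert would check.

```cpp
#include <cassert>

// Hypothetical helper, for illustration only.
// "if (cond) { assert(x); }" checks x only when cond holds, which is the
// implication cond => x. That implication is equivalent to (!cond || x).
bool guarded_assert_condition(int unrolled_count, bool has_ctrl) {
  return unrolled_count != 1 || has_ctrl;
}
```

The condition can only be false when unrolled_count is 1 and has_ctrl is false, which is exactly the case in which the original guarded assert would have fired.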
will do that 👍
@eme64 This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.
I agree with the removal of Opaque2.
On a side note, please file an RFE to rename all Opaque nodes to something meaningful. With the removal of Opaque2, the numbering is broken. Originally there were only 2 such nodes, but now we have 4, and it has become a hassle to find out what they do and why we have so many.
That is a good idea. I've planned to rename |
Update looks good!
Thanks @chhagedorn for the previous investigation into Opaque2, and the review! Thanks @vnkozlov for the review!
Going to push as commit 619b68c.
Your commit was automatically rebased without conflicts.
As @chhagedorn nicely analyzed, Opaque2 nodes seem to be broken. #10306
Original idea behind Opaque2
The idea was to avoid keeping both the un-incremented phi value and the incremented one live at the loop exit. Having both may mean using 2 registers, where 1 register could be enough if we only carry the incremented value and simply decrement after the loop. Opaque2 was inserted to prevent optimizations that would reduce (Phi + inc - inc) to (Phi). We had the pattern (Phi + inc -> Opaque2 -> - inc), so the add/sub did not get collapsed. Many years back, that used to be fine, because we would only remove Opaque2 nodes once no further IGVN round was run. But now, we take Opaque2 nodes out of the graph and then run IGVN again, which undoes all of the effort.
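The effect described above can be sketched with a toy expression IR. This is only an illustration under stated assumptions: Kind, Expr, make, simplify and remove_opaque are hypothetical names, not HotSpot's Node classes or its IGVN implementation. The single rewrite rule stands in for the optimization that collapses (Phi + inc - inc).

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Toy expression IR; hypothetical types, NOT HotSpot's Node hierarchy.
enum class Kind { Phi, AddConst, SubConst, Opaque };

struct Expr {
  Kind kind;
  int con;                    // constant for AddConst/SubConst, unused otherwise
  std::shared_ptr<Expr> in;   // single input, null for Phi
};
using ExprP = std::shared_ptr<Expr>;

ExprP make(Kind k, int con = 0, ExprP in = nullptr) {
  return std::make_shared<Expr>(Expr{k, con, std::move(in)});
}

// Stand-in for one value-numbering rule: (x + c) - c  ==>  x.
// An Opaque node between the add and the sub hides the pattern from the rule.
ExprP simplify(const ExprP& e) {
  if (!e->in) return e;
  ExprP in = simplify(e->in);
  if (e->kind == Kind::SubConst && in->kind == Kind::AddConst && in->con == e->con) {
    return in->in;
  }
  return make(e->kind, e->con, in);
}

// Stand-in for taking opaque nodes out of the graph.
ExprP remove_opaque(const ExprP& e) {
  if (!e->in) return e;
  ExprP in = remove_opaque(e->in);
  if (e->kind == Kind::Opaque) return in;
  return make(e->kind, e->con, in);
}
```

Building Phi + 1 -> Opaque -> - 1 and calling simplify leaves the subtraction in place. Calling remove_opaque first and simplify afterwards returns the plain Phi again, which mirrors the problem of running another IGVN round after the opaque nodes are gone.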
Must remove for now, maybe implement properly later
For now we remove it, because it can trigger asserts and is simply a rotten optimization. We think that this optimization should probably be done at the very end anyway, after the last IGVN round.
Performance tests indicate that removing it now does not lead to slowdown.
Filed JDK-8298019 for future investigation / reimplementation.
Analysis of the assert
In PhaseIdealLoop::do_unroll, we check that the main loop has the same limit in the zero-trip-guard as in the loop-exit. For that, we find the Opaque1 node opaq for the zero-trip-guard limit and compare its input with the limit from the loop-exit. This assert is useful: if we change loop limits, we must ensure they stay in sync.
jdk/src/hotspot/share/opto/loopTransform.cpp, lines 2227 to 2228 in 8c472e4
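The check described above can be sketched as follows. This is a toy model, assuming the comparison is made by node identity; ToyNode and limits_in_sync are hypothetical names and do not reproduce the referenced HotSpot lines.

```cpp
#include <cassert>

// Toy node; a hypothetical stand-in, not HotSpot's Node, Opaque1Node or Opaque2Node.
struct ToyNode {
  const ToyNode* in;   // single input, nullptr for a leaf
  bool is_opaque;
};

// Mirrors the shape of the check: the opaque node of the zero-trip-guard must
// take its input from exactly the node that the loop-exit test uses as limit.
bool limits_in_sync(const ToyNode& guard_opaq, const ToyNode* exit_limit) {
  return guard_opaq.in == exit_limit;   // node identity, not value equality
}
```

If the guard and the exit test share one limit node, the check holds. If each use is wrapped in its own opaque node, the two wrappers are distinct objects, so the check fails even though both wrap the same underlying value.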
Unfortunately, we insert two separate Opaque2 nodes for the zero-trip-guard and loop-exit in this example.
Sequence of events:
[...] CmpI and the loop-exit CmpI both are direct uses of that pre-incremented Phi (this seems to be very rare; usually there are other operations in between, so there is a single use that eventually feeds into both the zero-trip-guard and the loop-exit check).
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/11477/head:pull/11477
$ git checkout pull/11477
Update a local copy of the PR:
$ git checkout pull/11477
$ git pull https://git.openjdk.org/jdk pull/11477/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 11477
View PR using the GUI difftool:
$ git pr show -t 11477
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/11477.diff