-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
8330819: C2 SuperWord: bad dominance after pre-loop limit adjustment with base that has CastLL after pre-loop #18892
Conversation
👋 Welcome back epeter! A progress list of the required criteria for merging this PR into |
@eme64 This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 250 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
Webrevs
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That looks good to me.
In a future RFE, I can then look into improving the VPointer parsing. The VPointer code is already very convoluted, and adding much more functionality will be difficult to manage. I want to completely refactor VPointer in a few months anyway. With improved parsing, this regression test could vectorize again.
Sounds good! Maybe you can link this bug to this RFE (if there is already one) to not forget about updating this test to check if it indeed vectorizes.
test/hotspot/jtreg/compiler/loopopts/superword/TestMemorySegmentMainLoopAlignment.java
Outdated
Show resolved
Hide resolved
JDK-8330991 C2 SuperWord: refactor VPointer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good.
Thanks @vnkozlov @chhagedorn for the reviews! |
Going to push as commit e681e9b.
Your commit was automatically rebased without conflicts. |
Summary: the address
adr
of the vector we want to align the main-loop for has aCastLL
after the pre-loop and before the main-loop. When we use this address to adjust the pre-loop limit, we create a use before theCastLL
, which leads to a "bad dominance" assert. Solution: make sure that all such base addressesadr
are not just invariant in the main-loop, but also are invariant of/before the pre-loop.Example where we get the "bad dominance"
This code shape comes from the attached regression tests (no matter if with Unsafe or MemorySegment).
The loop is PreMainPost-ed and the main-loop unrolled.
1326 CountedLoop
is the pre-loop, and1657 CountedLoop
is the main-loop, which contains the1648 LoadI
. DuringSuperWord
, we take this load's address to align the main-loop.The address is parsed into its components by
VPointer
:VPointer[mem: 1648 LoadI, base: 1, adr: 1669, base[ 1] + offset( 0) + invar( 0) + scale( 4) * iv]
We note that this is the access to native memory via Unsafe / MemorySegment, and so there is no array pointer base, and the
base = 1 TOP
.VPointer
tries still to find a "base" adressadr
by parsing the very left-most input to the chain ofAddP
s. Here, there is only a single1711 AddP
, and the left input isadr = 1669 CastX2P
. The right side of theAddP
is also parsed, and determined to be4 * iv
.The problematic part:
1669 CastX2P
is "pinned" down below the pre-loop by the1513CastLL
that is applied to11 Parm
(=long offset
parameter in the test).During
SuperWord
, we want to align the main-loop vectors. We do this by adjusting the pre-loop limit1439 Opaque1
:You can see the dark-green IR nodes, which compute the
new_limit = old_limit + adjustment
, where the adjustment is a modulo1734 AndI
of the address of the1738 LoadVector
for which we are aligning. In this computation we are also using theadr
of ourVPointer
, which depends on the1513 CastLL
which is pinned below the pre-loop. Thus, we are using a node inside the pre-loop that is pinned after the pre-loop. Hence the "bad dominance" assert.Why does this happen?
Usually, the
base
and/oradr
of aVPointer
are invariant not just of the main-loop but also the pre-loop. The pinning after pre-loop and before main-loop via1513 CastLL
seems to be quite rare. TheCastLL
comes from the longcheckIndex
for the Unsafe / MemorySegment load, somehow in combination with the constant-size array / constant range of the main-loop.Another realization: the
adr
is basically an additionraw_base + offset
. I would have expected theoffset
to be an invariant and end up in theinvar
. But theVPointer
parsing seems to only parse through theAddP
, and stop at theCastX2P
, and take this as theadr
, even though we could parse further through the1590 AddL
, and separate the11 Parm = long offset
and the602 LoadL = raw_base
.Solution
For now, and for the simplicity of backports, I simply check that the
adr
is not justis_loop_member
(of the main-loop), but that it isinvariant
(of the main-loop and the pre-loop). We also already do this check for invariantsinvar
, and the base/adr is essencially another invariant. That solution means that in our regression tests, we would mark the problematicVPointer
s as not valid, and they will not be vectorized. This case is very rare (after all, we have never hit this bad-dominance assert before), and therefore there should not be any relevant performance regression.In a future RFE, I can then look into improving the
VPointer
parsing. TheVPointer
code is already very convoluted, and adding much more functionality will be difficult to manage. I want to completely refactorVPointer
in a few months anyway. With improved parsing, this regression test could vectorize again.See: JDK-8330991 C2 SuperWord: refactor VPointer
Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/18892/head:pull/18892
$ git checkout pull/18892
Update a local copy of the PR:
$ git checkout pull/18892
$ git pull https://git.openjdk.org/jdk.git pull/18892/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 18892
View PR using the GUI difftool:
$ git pr show -t 18892
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/18892.diff
Webrev
Link to Webrev Comment