8274179: AArch64: Support SVE operations with encodable immediates #6115
for(int i = 0; i < LENGTH; i++) {
c[i] = a[i] + 2;
}
For the case shown above, after superword optimization with SVE and
without this patch, the vector add always takes two z-register inputs,
for example:
mov z16.s, #2
add z17.s, z17.s, z16.s
Since SVE supports basic binary operations with an immediate operand,
this pattern can be further optimized to:
add z16.s, z16.s, #2
To implement this, we added new match rules and assembler rules to
the AArch64 backend. We also extended some immediate types and
functions while keeping them backward compatible.
With the patch, the optimization covers only these binary integer
vector operations with an immediate: + (add), - (sub), & (and),
| (orr), and ^ (eor). Other vector operations are not currently supported.
Tested tier1 and test/hotspot/jtreg/compiler on an SVE-capable AArch64
CPU, with no new failures.
There is no obvious performance uplift, but the patch removes one
redundant mov instruction.
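The rewrite described above can be modelled roughly as follows. This is an illustrative sketch only, not the matcher code from the patch: `select_vector_add` and its `has_imm_rule` flag are hypothetical names, and treating only non-negative constants in the unshifted range 0..255 as encodable is a simplifying assumption.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Illustrative model, not HotSpot code: returns the instruction sequence a
// backend might emit for "vector + constant". All names are hypothetical.
inline std::vector<std::string> select_vector_add(int imm, bool has_imm_rule) {
  assert(imm >= 0 && "negative constants are outside this sketch");
  const std::string c = "#" + std::to_string(imm);
  // Simplifying assumption: only an unshifted 8-bit unsigned value is encodable.
  if (has_imm_rule && imm <= 255) {
    return {"add z16.s, z16.s, " + c};  // single instruction, no broadcast
  }
  // Fallback: broadcast the constant into a z-register, then add two registers.
  return {"mov z16.s, " + c, "add z17.s, z17.s, z16.s"};
}
```

Calling `select_vector_add(2, false)` in this model yields the two-instruction mov/add pair, while `select_vector_add(2, true)` yields the single add with an immediate.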
Change-Id: Iaec40e362918118691083fb171cc4dff390b35a2
@fg1417 I'm running this through our internal testing as we have had problems with other recent SVE changes. Thanks,
Thanks, @dholmes-ora. If you have any problems, please contact me.
Test run was successful - tiers 1-3.
Thanks for your effort :)
Node* imm_node = replicate_node->in(1);
if (!imm_node->is_Con() ||
    !(imm_node->bottom_type()->isa_int() || imm_node->bottom_type()->isa_long())) {
  return false;
I'd split this up a bit.
Please consider
if (!imm_node->is_Con()) return false;
const Type* t = imm_node->bottom_type();
if (!(t->isa_int() || t->isa_long())) return false;
Done
I'd like you to split this patch into two parts, please.
Change-Id: I52aa66d200b74ac312c5d40283b94854bc1142e6
…tirely untouched Change-Id: If8ddcef07b15615d7dd0c3063c44d2b705fac6f7
Force-pushed: 5ffd5cb to a7a915a
Thanks. Done.
}

unsigned Assembler::regVariant_to_elemBits(Assembler::SIMD_RegVariant T){
  return 1 << (T + 3);
Assert something about T here.
Done
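A minimal sketch of what such an assertion could look like, not necessarily what was committed. The enum here is a hypothetical stand-in for `Assembler::SIMD_RegVariant`, and the enumerator order (B first, INVALID last) is an assumption made so that the shift yields 8 bits for B.

```cpp
#include <cassert>

// Hypothetical stand-in for Assembler::SIMD_RegVariant; the enumerator
// order B, H, S, D, Q, INVALID is an assumption of this sketch.
enum SIMD_RegVariant { B, H, S, D, Q, INVALID };

// Element width in bits: B -> 8, H -> 16, S -> 32, D -> 64, Q -> 128.
inline unsigned regVariant_to_elemBits(SIMD_RegVariant T) {
  // Reject INVALID and any out-of-range value before shifting, since the
  // result would otherwise be a meaningless width.
  assert(T >= B && T < INVALID && "invalid register variant");
  return 1u << (T + 3);
}
```

With this guard, passing `INVALID` fails fast in debug builds instead of silently returning 256.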
uint64_t uimm = (uint64_t)uabs((jlong)imm);
if (uimm < (UCONST64(1) << nbits))
  return true;
if (uimm < (UCONST64(1) << (2 * nbits))
Assert something about nbits here. It has to be less than 32, I think.
Done
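The check under discussion can be sketched in standalone form as below. This is a hedged reconstruction rather than the patch itself: `fits_uimm_with_optional_shift` is a hypothetical name, and because the second condition is cut off in the quoted diff, the requirement that the low `nbits` bits be zero is an assumption. The assertion reflects the reviewer's point: `2 * nbits` is used as a shift count on a 64-bit value, so `nbits` must stay below 32.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if |imm| fits an unsigned nbits-wide field,
// either directly or as a value shifted left by nbits.
inline bool fits_uimm_with_optional_shift(int64_t imm, unsigned nbits) {
  // 2 * nbits is a shift count on a 64-bit value, so it must be below 64.
  assert(nbits > 0 && nbits < 32 && "nbits out of range");
  // Magnitude computed in unsigned arithmetic, safe even for INT64_MIN.
  const uint64_t uimm = imm < 0 ? 0 - static_cast<uint64_t>(imm)
                                : static_cast<uint64_t>(imm);
  const uint64_t limit = UINT64_C(1) << nbits;
  if (uimm < limit) {
    return true;  // fits without a shift
  }
  // Assumed completion of the truncated condition: the value fits in
  // 2 * nbits bits and its low nbits bits are all zero.
  return uimm < (UINT64_C(1) << (2 * nbits)) && (uimm & (limit - 1)) == 0;
}
```

For an 8-bit field, this accepts 2, 255, 256 and 0xFF00, and rejects 257 and 0x10000.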
theRealAph left a comment
Looks fine with those minor changes.
Change-Id: Ic9120902bd8f8a8ead2e3740435a40f35d21757c
@fg1417 This change now passes all automated pre-integration checks.
ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.
After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.
At the time when this comment was updated there had been 24 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.
As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@theRealAph, @nick-arm) but any other Committer may sponsor as well.
➡️ To flag this PR as ready for integration with the above commit message, type
@theRealAph, could you please help approve it? Thanks for your time :)
Change-Id: I2004dc45f7f0ab44bc22b48083b185e7b3bd5eea
Change-Id: I1292449268c73c8f84cc3ffa7a4c859cf79058eb
Hi @theRealAph, I rebased my patch and retested it internally. Can I have your review? :) Thanks.
theRealAph left a comment
Good job, well done.
Mailing list message from Andrew Haley on hotspot-dev:
On 11/17/21 09:57, Andrew Haley wrote:
Sorry, that was a mistake.
Thanks :) @theRealAph
/integrate
/sponsor
Going to push as commit 8193800.
Your commit was automatically rebased without conflicts.
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/6115/head:pull/6115
$ git checkout pull/6115
Update a local copy of the PR:
$ git checkout pull/6115
$ git pull https://git.openjdk.java.net/jdk pull/6115/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 6115
View PR using the GUI difftool:
$ git pr show -t 6115
Using diff file
Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/6115.diff