-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
8288294: [vector] Add Identity/Ideal transformations for vector logic operations #9211
Conversation
👋 Welcome back xgong! A progress list of the required criteria for merging this PR into |
@XiaohongGong The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
Hi, could anyone please help to take a look at this simple patch? Thanks a lot for your time! |
} | ||
// (AndV (Replicate zero) src) => (Replicate zero) | ||
// (AndVMask (MaskAll zero) src) => (MaskAll zero) | ||
if (VectorNode::is_all_zeros_vector(in(1))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why you expect it to be in(1)
instead of in(2)
as in previous case? Do we create inputs in such order based on mask value?
At least add comment explaining it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also have the case that the all zeros vector is in(2)
in the followed codes. Please see line 1817. The main reason to do the different handle is the consideration for the predicated vector operations in Vector API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is what confuses me. The comment there says: masked operation requires the unmasked lanes to save the same values in the first operand
.
I'm interpreting it as mask should be in(2)
:
https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/aarch64/aarch64_sve.ad#L379
But here you check in(1)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment means: for masked operations, the result of non-masked lanes should be from in(1)
, and the masked lanes are from the operation results.
For "AndV"
with zero, the results is zero. So if in(1)
is all zeros vector which is the expected result, no matter whether the AndV
is masked or not, the result is right (i.e. for masked AndV
, the non-masked lanes should be from in(1)
, and the masked lanes should be from the operation result which is also in(1)
). But if the all zeros vector is in(2)
, this transformation will the results not right for masked AndV
. That's why I added !is_predicated_vector()
limit for the second case while not for the first one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment means: for masked operations, the result of non-masked lanes should be from
in(1)
, and the masked lanes are from the operation results.
Okay, I got it finally. Thank you for explanation.
} | ||
// (OrV src (Replicate zero)) => src | ||
// (OrVMask src (MaskAll zero)) => src | ||
if (VectorNode::is_all_zeros_vector(in(2))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same question as for AndVNode
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same as AndVNode
.
// (XorV src src) => (Replicate zero) | ||
// (XorVMask src src) => (MaskAll zero) | ||
// |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need this? Is Replicate
asm instruction faster than XorV
? I understand it may help reduce registers pressure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for looking at this patch! I think the main benefit is "(Replicate zero)" is loop invariant which could be hoist outside of the loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay
@jatin-bhateja can you look on this. It is shared code and x86 is also affected. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tobias already ran testing and results are good.
@XiaohongGong This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been no new commits pushed to the ➡️ To integrate this PR with the above commit message to the |
We can also handle following vector constant folding cases :- |
Other than above patch looks good to me. |
Thanks so much for the advice @jatin-bhateja ! I basically agree with this idea! It seems the similar optimization can also be applied to other binary arithmetic vector operations like |
Thanks for the review and testing @vnkozlov ! |
Agree, addressing it in subsequent PR along with other operations should be ok. Thanks. LGTM. |
Thanks for the reivew @jatin-bhateja ! |
/integrate |
Going to push as commit 124c63c.
Your commit was automatically rebased without conflicts. |
@XiaohongGong Pushed as commit 124c63c. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
This patch adds the following transformations for vector logic operations such as "
AndV, OrV, XorV
", incuding:where "
m1
" is the integer constant -1, together with the same optimizations for vector mask operations like "AndVMask, OrVMask, XorVMask
".Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/9211/head:pull/9211
$ git checkout pull/9211
Update a local copy of the PR:
$ git checkout pull/9211
$ git pull https://git.openjdk.org/jdk pull/9211/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 9211
View PR using the GUI difftool:
$ git pr show -t 9211
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/9211.diff