8257483: C2: Split immediate vector rotate from RotateLeftV and RotateRightV nodes #1532
👋 Welcome back dongbo! A progress list of the required criteria for merging this PR into |
/label remove core-libs |
@dgbo |
Webrevs
|
Why do you introduce new ideal nodes (RotateLeftImmV/RotateRightImmV)? You already extended is_vector_rotate_supported to take the shift argument into account.
Also, you mention aarch64, but I see only x86 changes.
I think extended Aarch64 supports If extended
The variable rotations would not be vectorized by With
I tried to optimize immediate vector rotations with shift&insert in JDK-8256820 for aarch64. |
Thanks for the detailed answer, but I'm still confused why |
It looks like you implicitly depend on the check that there are rules present which match that particular opcode:
But you don't have to and I don't see why additional checks in |
As far as I considered, Taking the following test for example.
On aarch64, we have scalar and vector instructions to implement the loop core of this test:
As of now, the default code generated by C2 is Without
But the check also tells C2 that [1] An implementation with |
Thanks for the clarifications. I think I understand now the issue you are facing. As of now, it works as follows:
Your fix proposes to introduce
What I'm after is to keep existing nodes and adjust the checks in |
Yes, it works just as you described. Thanks a lot for making this so clear. However, the workflow relies on the creation of jdk/src/hotspot/share/opto/matcher.cpp Line 1669 in fa58671
IMHO, seems adjusting checks in |
But the degeneration step is specifically designed to cover the case when the rotate vector node is not supported by the matcher. |
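The degeneration step mentioned above can be pictured with a small scalar model. This is only a sketch, assuming the fallback rewrites a rotate as two shifts combined with an OR; the function names below are hypothetical and are not HotSpot APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scalar model of the fallback: a left rotate expressed as
// two shifts and an OR. Names here are illustrative, not HotSpot APIs.
inline uint32_t rotate_left_degenerate(uint32_t x, unsigned count) {
  unsigned s = count & 31u;            // rotate counts wrap modulo the lane width
  if (s == 0) return x;                // avoid the undefined shift by 32
  return (x << s) | (x >> (32u - s));
}

// Lane-wise application, standing in for a vector of 32-bit lanes.
inline std::vector<uint32_t> rotate_left_lanes(const std::vector<uint32_t>& src,
                                               unsigned count) {
  std::vector<uint32_t> dst(src.size());
  for (std::size_t i = 0; i < src.size(); ++i) {
    dst[i] = rotate_left_degenerate(src[i], count);
  }
  return dst;
}
```

When a platform has no usable rotate rule, a vectorizer can still emit this shape with plain vector shifts, which is why the fallback matters for the variable-count case discussed in this thread.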
5b23bd2
to
766a6a5
Compare
@dgbo this pull request can not be integrated into git checkout split_vector_rotate
git fetch https://git.openjdk.java.net/jdk master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push |
766a6a5
to
1f85f55
Compare
Sorry, my mistake, adjusting the checks would be enough. Seems we have to pass the
With this, I think it is also easy to extend this to cover other possible CPU-dependent checks for match rules in the future, like partial support for other |
Hi, @iwanowww I have updated a version, in which I adjusted the checks in function Thanks. |
@@ -1172,7 +1182,8 @@ Node* VectorNode::degenerate_vector_rotate(Node* src, Node* cnt, bool is_rotate_
 Node* RotateLeftVNode::Ideal(PhaseGVN* phase, bool can_reshape) {
   int vlen = length();
   BasicType bt = vect_type()->element_basic_type();
-  if (!Matcher::match_rule_supported_vector(Op_RotateLeftV, vlen, bt)) {
+  int extinfo = encode_rotate_vector_shift_type(in(2));
+  if (!Matcher::match_rule_supported_vector(Op_RotateLeftV, vlen, bt, extinfo)) {
I don't see much value in encoding and packing node-specific info in an int. Why not simply introduce a new platform-specific entry and pass a Node* instead, letting platform-specific code extract any useful information from it?
As an alternative, introduce a new capability (Matcher::supports_vector_variable_rotates) and check it in shared code (RotateLeftVNode::Ideal).
Introduced Matcher::supports_vector_variable_rotates(VectorNode *n) in the updated commit d2cb793.
Verified it on both AARCH64 and X86_64 platforms with fastdebug builds. Tests are good.
d826100
to
d2cb793
Compare
src/hotspot/cpu/x86/x86.ad
Outdated
@@ -1814,6 +1814,11 @@ bool Matcher::supports_vector_variable_shifts(void) {
   return (UseAVX >= 2);
 }
 
+bool Matcher::supports_vector_variable_rotates(VectorNode *n) {
Please turn it into a capability check (just bool Matcher::supports_vector_variable_rotates(void), no argument).
There's no need to wrap Matcher::match_rule_supported_vector since you already check it:
(in(2)->is_Con() && !Matcher::supports_vector_variable_rotates(this)) ||
!Matcher::match_rule_supported_vector(Op_RotateLeftV, vlen, bt)) {
Otherwise, looks good.
Done. Turned the capability check into bool Matcher::supports_vector_variable_rotates(void). On x86_64, it returns true directly now.
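The shape of the resulting check can be modeled in isolation. The sketch below is an assumption-laden illustration, not the code from the patch: `PlatformCaps`, `should_degenerate_rotate`, and the three sample platforms are hypothetical. It also assumes, from the PR description, that the fallback is wanted when the rotate count is not a constant and the platform reports no variable-rotate support, or when no rotate rule exists at all.

```cpp
#include <cassert>

// Hypothetical stand-in for the two Matcher queries discussed in this thread.
struct PlatformCaps {
  bool supports_variable_rotates;  // models Matcher::supports_vector_variable_rotates()
  bool has_rotate_rule;            // models Matcher::match_rule_supported_vector(...)
};

// Returns true when the rotate should fall back to the degenerated form.
// Assumption (from the PR description, not from the patch text): a
// non-constant count needs variable-rotate support; a constant count only
// needs a matching rule.
inline bool should_degenerate_rotate(const PlatformCaps& caps, bool count_is_constant) {
  if (!caps.has_rotate_rule) return true;
  if (!count_is_constant && !caps.supports_variable_rotates) return true;
  return false;
}

// Illustrative platforms. The x86_64 value follows the comment above; the
// aarch64 value is an assumption based on the PR description.
const PlatformCaps kX86Like{true, true};
const PlatformCaps kAArch64Like{false, true};
const PlatformCaps kNoRotateRule{true, false};
```

Keeping the capability query argument-free, as requested in the review, means the shared code owns the decision about the count operand and the platform only answers a yes/no question.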
Looks good.
@dgbo This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 90 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@iwanowww) but any other Committer may sponsor as well. ➡️ To flag this PR as ready for integration with the above commit message, type |
@iwanowww Thanks a lot for the review. |
/sponsor |
@RealFYang @dgbo Since your change was applied there have been 90 commits pushed to the
Your commit was automatically rebased without conflicts. Pushed as commit 026b09c. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
Currently, on all CPUs, optimizing RotateLeftV and RotateRightV with match rules in AD files requires implementing both the immediate and the variable versions.
On aarch64, with match rules for vector rotation, immediate vector rotation can be optimized with shift+insert instructions (i.e. SLI/SRI, ~20% improvement with an initial implementation).
However, there would be a performance regression for the variable version, because SLI/SRI have no register version in the NEON instruction set, and there is no register version of right shift either.
The instructions for match rules of vector rotate variable should be:
With this patch, immediate vector rotation can be matched alone and optimized on CPUs like aarch64.
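To see why shift+insert suits immediate rotation, here is a scalar model of a single 32-bit lane. It is a sketch under assumptions: `sli_lane` models the documented behavior of SLI (shift the source left and keep the low bits of the destination), while the two-step sequence in `rotate_left_imm_via_sli` is an illustration and not necessarily the exact sequence the match rules emit.

```cpp
#include <cassert>
#include <cstdint>

// Scalar model of one 32-bit lane of NEON SLI (shift left and insert):
// the source is shifted left by `imm`, and the low `imm` bits of the
// destination are preserved. `imm` must be in [0, 31].
inline uint32_t sli_lane(uint32_t dst, uint32_t src, unsigned imm) {
  uint32_t keep_mask = (1u << imm) - 1u;   // low bits kept from the destination
  return (src << imm) | (dst & keep_mask);
}

// Hypothetical rotate-left by an immediate built from a logical right
// shift followed by a shift-left-and-insert.
inline uint32_t rotate_left_imm_via_sli(uint32_t x, unsigned imm) {
  imm &= 31u;
  if (imm == 0) return x;                  // nothing to rotate
  uint32_t tmp = x >> (32u - imm);         // models an unsigned right shift by immediate
  return sli_lane(tmp, x, imm);            // models SLI inserting the shifted source
}
```

The insert step replaces the separate OR of the generic shift/shift/OR form, which is consistent with the improvement reported for the immediate case. Because the shift amount is encoded in the instruction, the same trick is not available when the count lives in a register.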
Verified with linux-x86_64-server-fastdebug tier1-3 and passed.
Also added immediate vector rotation tests to the micro benchmark test/micro/org/openjdk/bench/java/lang/RotateBenchmark.java.
Tested the micro benchmark on x86_64 and aarch64 servers and witnessed no regressions.
Download
$ git fetch https://git.openjdk.java.net/jdk pull/1532/head:pull/1532
$ git checkout pull/1532