8265321: Add Rearrange nodes implementation for Arm SVE #70

Closed

Conversation

Collaborator

@Wanghuang-Huawei Wanghuang-Huawei commented Apr 17, 2021

  • Add Rearrange node implementations for Arm SVE, such as rearrangeB/I/S/L.
  • Add sve_tbl, which reads each element of the second source (index) vector, uses its value to select an element from the first source (table) vector, and places the selected table element in the destination vector element corresponding to the index vector element. If an index value is greater than or equal to the number of vector elements, it places zero in the corresponding destination vector element. [1]

[1] https://developer.arm.com/documentation/ddi0596/2020-12/SVE-Instructions/TBL--Programmable-table-lookup-in-single-vector-table-?lang=en
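
The lookup behaviour described above can be modelled with a few lines of scalar code. This is only an illustrative sketch of the semantics quoted from [1], not code from the patch; the function name `sve_tbl` is reused for readability, and the list-based vectors and lane counts are assumptions made for the example.

```python
def sve_tbl(table, indices):
    """Scalar model of the table lookup described in the PR text.

    table   -- elements of the first source (table) vector
    indices -- elements of the second source (index) vector, treated as unsigned
    Both lists are assumed to have the same number of lanes.
    """
    if len(table) != len(indices):
        raise ValueError("table and index vectors must have the same lane count")
    lanes = len(table)
    # An index at or beyond the lane count yields zero instead of a table element.
    return [table[i] if 0 <= i < lanes else 0 for i in indices]


src = [10, 20, 30, 40]
reversed_lanes = sve_tbl(src, [3, 2, 1, 0])   # every index is in range
with_zero_lane = sve_tbl(src, [0, 0, 7, 1])   # index 7 is out of range
print(reversed_lanes)
print(with_zero_lane)
```

Each destination lane depends only on the index lane at the same position, which is why a single instruction can implement an arbitrary rearrange of one vector.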


Progress

  • Change must not contain extraneous whitespace
  • Change must be properly reviewed

Issue

  • JDK-8265321: Add Rearrange nodes implementation for Arm SVE

Reviewers

Contributors

  • Wang Huang <whuang@openjdk.org>
  • Ai Jiaming <aijiaming1@huawei.com>

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/panama-vector pull/70/head:pull/70
$ git checkout pull/70

Update a local copy of the PR:
$ git checkout pull/70
$ git pull https://git.openjdk.java.net/panama-vector pull/70/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 70

View PR using the GUI difftool:
$ git pr show -t 70

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/panama-vector/pull/70.diff

@bridgekeeper

bridgekeeper bot commented Apr 17, 2021

👋 Welcome back whuang! A progress list of the required criteria for merging this PR into vectorIntrinsics will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr label Apr 17, 2021
@Wanghuang-Huawei
Collaborator Author

/contributor add Wang Huang whuang@openjdk.org
/contributor add Ai Jiaming aijiaming1@huawei.com

@openjdk

openjdk bot commented Apr 17, 2021

@Wanghuang-Huawei
Contributor Wang Huang <whuang@openjdk.org> successfully added.

@openjdk

openjdk bot commented Apr 17, 2021

@Wanghuang-Huawei
Contributor Ai Jiaming <aijiaming1@huawei.com> successfully added.

@mlbridge

mlbridge bot commented Apr 17, 2021

Webrevs

Comment on lines 4054 to 4080
instruct rearrangeL(vReg dst, vReg src, vReg shuffle)
%{
  predicate(UseSVE > 0 &&
            n->bottom_type()->is_vect()->element_basic_type() == T_LONG);
  match(Set dst (VectorRearrange src shuffle));
  ins_cost(SVE_COST);
  format %{ "sve_tbl $dst, D, $src, $shuffle\t# vector rearrange (D)" %}
  ins_encode %{
    __ sve_tbl(as_FloatRegister($dst$$reg), __ D,
               as_FloatRegister($src$$reg), as_FloatRegister($shuffle$$reg));
  %}
  ins_pipe(pipe_slow);
%}

instruct rearrangeD(vReg dst, vReg src, vReg shuffle)
%{
  predicate(UseSVE > 0 &&
            n->bottom_type()->is_vect()->element_basic_type() == T_DOUBLE);
  match(Set dst (VectorRearrange src shuffle));
  ins_cost(SVE_COST);
  format %{ "sve_tbl $dst, D, $src, $shuffle\t# vector rearrange (D)" %}
  ins_encode %{
    __ sve_tbl(as_FloatRegister($dst$$reg), __ D,
               as_FloatRegister($src$$reg), as_FloatRegister($shuffle$$reg));
  %}
  ins_pipe(pipe_slow);
%}
Collaborator

At least merge long/double and float/int? The format looks the same.

Collaborator Author

I think that's my mistake. The comment for rearrangeL should be L instead of D. I will fix it in my next patch.

Collaborator

I also prefer merging them instead of having two separate rules. From the instruction point of view, there's only D, not L.

Collaborator Author

> I also prefer merging them instead of having two separate rules. From the instruction point of view, there's only D, not L.

Thank you for your review.

  • I think the comment belongs to the node (or rule) rather than to a single instruction. This rule is rearrangeL, which rearranges the long type, so I think the comment should be L instead of D.
  • In aarch64_neon.ad, the comments are B, I, F, D and so on.

Collaborator

> • I think that the comment is the node's comment (or rule's comment) instead of single instruction's comment. This rule is rearrangeL which means rearranging long type so I think that the comment should be L instead of D.
> • In aarch64_neon.ad, the comments are B, I, F, D and so on.

I assume you mean the comment in the format section. In aarch64_neon.ad, some are S while some are F, but I don't think there's a need to generate two rules just for different format comments. If there are any, I would suggest fixing them as well.

Collaborator Author

> > • I think that the comment is the node's comment (or rule's comment) instead of single instruction's comment. This rule is rearrangeL which means rearranging long type so I think that the comment should be L instead of D.
> > • In aarch64_neon.ad, the comments are B, I, F, D and so on.
>
> I assume you mean comment in format section. In aarch64_neon, some are S while some are F, but I don't think there's a need to generate two rules just for different format comments. If there's any, I would suggest to fix them as well.

Of course, in some cases L and D can be merged into one rule. However, in other cases L and D are different: D is a floating-point type. That is what I meant earlier when I said the comment is for the whole rule. The comment should show the data type. ;-)

Collaborator

Some cases are different, but I don't think we need to distinguish floating point data types for this operation as they are actually the same thing and generate the same code.
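
The point that long and double lanes "are actually the same thing" for this operation can be checked with a small scalar experiment. The sketch below is an illustration only, with hypothetical helper names; it assumes 64-bit IEEE 754 doubles and models a rearrange as a pure lane permutation.

```python
import struct


def rearrange(lanes, shuffle):
    # Lane i of the result is taken from lane shuffle[i] of the source.
    return [lanes[j] for j in shuffle]


def double_to_bits(x):
    # Reinterpret an IEEE 754 double as a 64-bit unsigned integer.
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def bits_to_double(b):
    # Reinterpret a 64-bit unsigned integer as an IEEE 754 double.
    return struct.unpack("<d", struct.pack("<Q", b))[0]


doubles = [1.5, -2.0, 0.25, 8.0]
shuffle = [2, 0, 3, 1]

# Rearranging the doubles directly...
direct = rearrange(doubles, shuffle)
# ...matches rearranging their raw 64-bit patterns and converting back.
via_bits = [bits_to_double(b)
            for b in rearrange([double_to_bits(x) for x in doubles], shuffle)]
print(direct == via_bits)
```

Because a rearrange only moves whole 64-bit lanes and never interprets their contents, the element type does not affect the generated instruction.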

Collaborator

Agree with @nsjian. I prefer to merge all these rules into one, as we have discussed many times before, since there is no instruction difference between the types. Anyway, we can have a separate patch to unify the styles in the ad file.
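
One way to picture the merge being proposed is to key rules on lane width rather than on element type. The following is a hypothetical sketch, not HotSpot code: the dictionaries and function names are invented for illustration, and only the Java primitive sizes and the B/H/S/D size specifiers are taken as given.

```python
# Lane width in bytes for each Java primitive element type (fixed by the language).
ELEMENT_BYTES = {"byte": 1, "short": 2, "int": 4, "float": 4, "long": 8, "double": 8}

# SVE element size specifiers keyed by lane width in bytes.
SIZE_SPECIFIER = {1: "B", 2: "H", 4: "S", 8: "D"}


def size_specifier(element_type):
    """Return the size specifier a lane-move instruction would use for this type."""
    return SIZE_SPECIFIER[ELEMENT_BYTES[element_type]]


def types_by_specifier():
    """Group element types that would share a rule if rules were keyed by lane width."""
    groups = {}
    for element_type in ELEMENT_BYTES:
        groups.setdefault(size_specifier(element_type), []).append(element_type)
    return groups


for specifier, types in types_by_specifier().items():
    print(specifier, types)
```

Under this view, six element types collapse into four instruction forms, with int/float sharing S and long/double sharing D.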

Collaborator Author

> Some cases are different, but I don't think we need to distinguish floating point data types for this operation as they are actually the same thing and generate the same code.

Yes, I think that for this rule L and D have the same form. However, the comment itself should be consistent with the others, and in other cases we do need to distinguish floating-point data types. That is why I chose to use L instead of D.

Collaborator

If you really want a correct comment, you can write something like "(L/D)", though I doubt it really makes any difference for your testing/debugging.

Collaborator

@nsjian nsjian left a comment

Thanks for the work! Overall looks good to me. Just some nits.

Comment on lines +2414 to +2423
switch (opcode) {
  case Op_VectorLoadShuffle:
  case Op_VectorRearrange:
    if (vlen < 4) {
      return false;
    }
    break;
  default:
    break;
}
Collaborator

I would prefer to put this inside op_sve_supported.

Collaborator Author

Yes, I will fix that.

@@ -4057,7 +4057,7 @@ instruct rearrangeL(vReg dst, vReg src, vReg shuffle)
             n->bottom_type()->is_vect()->element_basic_type() == T_LONG);
   match(Set dst (VectorRearrange src shuffle));
   ins_cost(SVE_COST);
-  format %{ "sve_tbl $dst, D, $src, $shuffle\t# vector rearrange (D)" %}
+  format %{ "sve_tbl $dst, D, $src, $shuffle\t# vector rearrange (L)" %}
Collaborator

I still think that it makes no sense to separate these rules. But anyway, I can do a cleanup later. FYI: Xiaohong and our Intel friends are working hard to avoid increasing the number of instruction selection patterns in the masking support design: http://openjdk.java.net/jeps/8261663.

@openjdk

openjdk bot commented Apr 22, 2021

@Wanghuang-Huawei This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8265321: Add Rearrange nodes implementation for Arm SVE

Co-authored-by: Wang Huang <whuang@openjdk.org>
Co-authored-by: Ai Jiaming <aijiaming1@huawei.com>
Reviewed-by: njian

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 1 new commit pushed to the vectorIntrinsics branch:

Please see this link for an up-to-date comparison between the source branch of this pull request and the vectorIntrinsics branch.
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the vectorIntrinsics branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready label Apr 22, 2021
@openjdk openjdk bot closed this Apr 23, 2021
@openjdk openjdk bot added integrated and removed ready rfr labels Apr 23, 2021
@openjdk

openjdk bot commented Apr 23, 2021

@Wanghuang-Huawei Since your change was applied there has been 1 commit pushed to the vectorIntrinsics branch:

Your commit was automatically rebased without conflicts.

Pushed as commit 6d5c8bd.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

3 participants