8259948: Aarch64: Add cast nodes for Aarch64 Neon backend #4839

Closed
Wanghuang-Huawei wants to merge 13 commits

@Wanghuang-Huawei Wanghuang-Huawei commented Jul 20, 2021

  • In this issue, we plan to add all the missing cast implementations for the AArch64 NEON backend, for example casts from Byte to Long and from Long to Byte.
  • It may resolve JDK-8269866, or part of it.
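
As a rough illustration of what such cast rules have to compute per lane, here is a scalar sketch of a widening cast (Byte to Long) and a narrowing cast (Long to Byte). The class and method names are hypothetical and do not come from the patch; the sketch only assumes that the vector casts follow Java's primitive conversion rules lane by lane (sign extension when widening, truncation when narrowing).

```java
import java.util.Arrays;

public class CastSketch {
    // Widening cast: each byte lane is sign-extended into a long lane.
    public static long[] byteToLong(byte[] src) {
        long[] dst = new long[src.length];
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[i]; // implicit widening conversion sign-extends
        }
        return dst;
    }

    // Narrowing cast: each long lane keeps only its low 8 bits.
    public static byte[] longToByte(long[] src) {
        byte[] dst = new byte[src.length];
        for (int i = 0; i < src.length; i++) {
            dst[i] = (byte) src[i]; // explicit narrowing conversion truncates
        }
        return dst;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(byteToLong(new byte[] {-1, 127})));   // [-1, 127]
        System.out.println(Arrays.toString(longToByte(new long[] {0x1FFL, 256L}))); // [-1, 0]
    }
}
```

A vectorized rule in the backend would be expected to produce the same lane values as these loops; that is the property a cast test can check against.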

Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8259948: Aarch64: Add cast nodes for Aarch64 Neon backend

Reviewers

Contributors

  • Wang Huang <whuang@openjdk.org>
  • Wu Yan <wuyan@openjdk.org>
  • Miao Zhuojun <mouzhuojun@huawei.com>

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/4839/head:pull/4839
$ git checkout pull/4839

Update a local copy of the PR:
$ git checkout pull/4839
$ git pull https://git.openjdk.java.net/jdk pull/4839/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 4839

View PR using the GUI difftool:
$ git pr show -t 4839

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/4839.diff

bridgekeeper bot commented Jul 20, 2021

👋 Welcome back whuang! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@Wanghuang-Huawei
Author

/contributor add Wang Huang whuang@openjdk.org
/contributor add Wu Yan wuyan@openjdk.org
/contributor add Miao Zhuojun mouzhuojun@huawei.com

openjdk bot commented Jul 20, 2021

@Wanghuang-Huawei
Contributor Wang Huang <whuang@openjdk.org> successfully added.

openjdk bot commented Jul 20, 2021

@Wanghuang-Huawei
Contributor Wu Yan <wuyan@openjdk.org> successfully added.

openjdk bot commented Jul 20, 2021

@Wanghuang-Huawei
Contributor Miao Zhuojun <mouzhuojun@huawei.com> successfully added.

openjdk bot commented Jul 20, 2021

@Wanghuang-Huawei The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label Jul 20, 2021

nsjian commented Jul 21, 2021

Thanks for the work! Some general comments:

It may be a solver of JDK-8269866, or part of it.

I would suggest not having a partial fix of JDK-8269866. I think you can still keep 8259948 as a duplicate while targeting this at JDK-8269866 and providing a complete fix. @theRealELiu may have some thoughts on how to get a clean fix: e.g. there may be some dependency on the mid-end part, like JDK-8265244?

@theRealELiu has marked the opcodes of those missing rules (with specific types/sizes) as unsupported in JDK-8268966, but I don't see that you have unmarked them in your patch, so your newly added rules cannot be tested. There are also some test cases included in JDK-8268966; could you please merge your test cases into the existing tests if the existing tests do not cover some cases?

P.S. could you please fix the jcheck error?

@openjdk openjdk bot added the rfr Pull request is ready for review label Aug 3, 2021
@Wanghuang-Huawei
Author

Thank you for your suggestions. We have fixed these errors and uploaded a new patch.

mlbridge bot commented Aug 3, 2021

Contributor

@theRealAph theRealAph left a comment

The big question for all of this is: what is the test coverage?

wuyan0 commented Aug 6, 2021

The big question for all of this is: what is the test coverage?

We tested the following two test cases:
test/hotspot/jtreg/compiler/vectorapi/VectorCastShape128Test.java
test/hotspot/jtreg/compiler/vectorapi/VectorCastShape64Test.java

@theRealAph
Contributor

Thank you, but that doesn't really answer my question. Does your testing cover all that is added in this patch? If so, how did you ascertain it?

ins_encode %{
// If registers are the same, no register move is required - the
// upper bits of "src" are expected to have been initialized
// to zero.
Contributor

I am a little concerned about this assumption. How do we ensure the upper bits are zero? The ReinterpretNode could be used on its own, not always together with a CastNode. E.g.

https://github.com/openjdk/jdk/blob/jdk-18%2B9/src/hotspot/share/opto/vectorIntrinsics.cpp#L844

Contributor

There are several places in the AArch64 back end where we expect the upper bits of a register to be zero, but we've never depended on it. This is not a good time to start, so let's clear the bits in order to be certain.

Copy link

This is the same as reinterpretD2X. The upper 96 bits of "src" are zero; I guess this is because the upper 96 bits are zeroed when the 128-bit register is initialized, and only its lower 32 bits are manipulated afterwards.

Contributor

I think it's better to clear the higher bits for such cases.

Copy link

OK, I'll clear the higher bits in the next commit.
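
The concern in this thread can be sketched with a toy model. This is not HotSpot code: the 128-bit register is modelled as four 32-bit lanes, and the class and method names are hypothetical. It only shows why a reinterpret that leaves the register untouched is correct solely when the upper lanes happen to be zero already, whereas clearing them explicitly is correct regardless of what produced the source value.

```java
import java.util.Arrays;

public class UpperBitsSketch {
    // Reinterpret without touching the register: any stale upper lanes survive.
    public static int[] reinterpretAssumingZero(int[] reg) {
        return reg.clone();
    }

    // Reinterpret and zero every lane above the live ones.
    public static int[] reinterpretClearingUpper(int[] reg, int liveLanes) {
        int[] out = new int[reg.length]; // Java zero-initializes new arrays
        System.arraycopy(reg, 0, out, 0, liveLanes);
        return out;
    }

    public static void main(String[] args) {
        // Lane 0 holds live data; lanes 1..3 hold leftovers from an earlier use.
        int[] stale = {42, 7, 7, 7};
        System.out.println(Arrays.toString(reinterpretAssumingZero(stale)));     // [42, 7, 7, 7]
        System.out.println(Arrays.toString(reinterpretClearingUpper(stale, 1))); // [42, 0, 0, 0]
    }
}
```

In this model the first variant leaks the leftover lanes into the wider value, which is the situation the reviewers wanted to rule out by clearing the higher bits.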

wuyan0 commented Aug 10, 2021

Does your testing cover all that is added in this patch? If so, how did you ascertain it?

Yes, these two test cases cover the code in the patch. First, I unmarked the opcodes that JDK-8268966 marked as unsupported. Then I ran the cases in both of the above test files one by one, implementing a rule whenever a test case failed, until all the test cases passed.
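
One hypothetical way to make that coverage claim easier to check is to tabulate the cast space up front and tick each entry off against a test case. The sketch below is an assumption about how such a table could be built, not something taken from the patch or from the tests: it enumerates element-type pairs for 64-bit and 128-bit source vectors and keeps those whose result still fits in a 128-bit NEON register. The class name and the label format are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class CastShapeSketch {
    // Element types (byte, short, int, long, float, double) and their sizes in bytes.
    private static final String[] NAMES = {"B", "S", "I", "L", "F", "D"};
    private static final int[] BYTES = {1, 2, 4, 8, 4, 8};
    private static final int MAX_VECTOR_BITS = 128; // width of a NEON vector register

    // Lists casts as "<lanes><from>-><lanes><to>", e.g. "4F->4B".
    public static List<String> candidateCasts() {
        List<String> out = new ArrayList<>();
        for (int srcBits : new int[] {64, 128}) {
            for (int from = 0; from < NAMES.length; from++) {
                int lanes = srcBits / (8 * BYTES[from]);
                for (int to = 0; to < NAMES.length; to++) {
                    if (to == from) {
                        continue; // same element type: nothing to cast
                    }
                    int dstBits = lanes * BYTES[to] * 8;
                    if (dstBits <= MAX_VECTOR_BITS) {
                        out.add(lanes + NAMES[from] + "->" + lanes + NAMES[to]);
                    }
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        candidateCasts().forEach(System.out::println);
    }
}
```

Whether every entry in such a list corresponds to a rule in aarch64_neon.ad is exactly what would need to be confirmed against the patch; the list is only a checklist, not evidence of coverage.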

@@ -712,6 +712,7 @@
__ stxp(r4, zr, zr, r5); // stxp w4, xzr, xzr, [x5]
__ stxpw(r6, zr, zr, sp); // stxp w6, wzr, wzr, [sp]
__ dup(v0, __ T16B, zr); // dup v0.16b, wzr
__ dups(v0, __ T2S, v1); // mov s0, v1.s[0]
Contributor

Please use the DUP AArch64 instruction here, rather than MOV.

Copy link

OK.

@theRealAph
Contributor

OK.

@openjdk openjdk bot added ready Pull request is ready to be integrated and removed merge-conflict Pull request has merge conflict with target branch labels Sep 27, 2021
Contributor

e1iu commented Oct 14, 2021

I triggered the tests; hotspot_all (no vmTestBase stress), langtools:tier1, jdk:tier1, tier2, and tier3 all passed on AArch64.

@openjdk openjdk bot added merge-conflict Pull request has merge conflict with target branch and removed ready Pull request is ready to be integrated labels Oct 14, 2021
@openjdk openjdk bot added ready Pull request is ready to be integrated and removed merge-conflict Pull request has merge conflict with target branch labels Oct 19, 2021

@nsjian nsjian left a comment

Looks good to me. Just some style nits.

Comment on lines 514 to 517
ins_pipe(pipe_slow);
%}
instruct vcvt4Fto4B(vecD dst, vecX src)
%{
Copy link

And here

Contributor

@e1iu e1iu left a comment

LGTM

wuyan0 commented Oct 26, 2021

Thanks @nsjian @theRealELiu for reviewing this.

wuyan0 commented Oct 26, 2021

Hi, @theRealAph do I need another reviewer to review this?

Contributor

@theRealAph theRealAph left a comment

Thanks everyone, for all your hard work.

@Wanghuang-Huawei
Author

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Oct 27, 2021

openjdk bot commented Oct 27, 2021

@Wanghuang-Huawei
Your change (at version f317a0b) is now ready to be sponsored by a Committer.

nsjian commented Oct 27, 2021

/sponsor

openjdk bot commented Oct 27, 2021

Going to push as commit 9f75d5c.
Since your change was applied there have been 127 commits pushed to the master branch:

  • d98b7c2: 8202926: Test java/awt/Focus/WindowUpdateFocusabilityTest/WindowUpdateFocusabilityTest.html fails
  • b0d1e4f: 8273585: String.charAt performance degrades due to JDK-8268698
  • 7addcd7: 8276034: ProblemList gtest dll_address_to_function_and_library_name on macosx-x64
  • 2448b3f: 8275874: [JVMCI] only support aligned reads in c2v_readFieldValue
  • f1f5e26: 8275872: Sync J2DBench run and analyze Makefile targets with build.xml
  • 19f76c2: 8275079: Remove unnecessary conversion to String in java.net.http
  • e5cd269: 8274944: AppCDS dump causes SEGV in VM thread while adjusting lambda proxy class info
  • 82f4aac: 8259609: C2: optimize long range checks in long counted loops
  • 574f890: 8275720: CommonComponentAccessibility.createWithParent isWrapped causes mem leak
  • 7c88a59: 8275809: crash in [CommonComponentAccessibility getCAccessible:withEnv:]
  • ... and 117 more: https://git.openjdk.java.net/jdk/compare/947d52c4c3deec1bdea43959c200201c614ae114...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot closed this Oct 27, 2021
@openjdk openjdk bot added integrated Pull request has been integrated and removed ready Pull request is ready to be integrated rfr Pull request is ready for review sponsor Pull request is ready to be sponsored labels Oct 27, 2021

openjdk bot commented Oct 27, 2021

@nsjian @Wanghuang-Huawei Pushed as commit 9f75d5c.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Labels: hotspot-compiler (hotspot-compiler-dev@openjdk.org), integrated (Pull request has been integrated)
6 participants