Conversation

@Bhavana-Kilambi
Contributor

@Bhavana-Kilambi Bhavana-Kilambi commented Feb 24, 2025

This patch adds an aarch64 backend for the scalar FP16 operations add, subtract, multiply, divide, fma, sqrt, min and max.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8345125: Aarch64: Add aarch64 backend for Float16 scalar operations (Enhancement - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/23748/head:pull/23748
$ git checkout pull/23748

Update a local copy of the PR:
$ git checkout pull/23748
$ git pull https://git.openjdk.org/jdk.git pull/23748/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 23748

View PR using the GUI difftool:
$ git pr show -t 23748

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/23748.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper

bridgekeeper bot commented Feb 24, 2025

👋 Welcome back bkilambi! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Feb 24, 2025

@Bhavana-Kilambi This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8345125: Aarch64: Add aarch64 backend for Float16 scalar operations

Reviewed-by: aph, haosun

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 18 new commits pushed to the master branch:

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@theRealAph, @shqking) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk openjdk bot added the rfr Pull request is ready for review label Feb 24, 2025
@openjdk

openjdk bot commented Feb 24, 2025

@Bhavana-Kilambi The following labels will be automatically applied to this pull request:

  • graal
  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added graal graal-dev@openjdk.org hotspot hotspot-dev@openjdk.org labels Feb 24, 2025
@mlbridge

mlbridge bot commented Feb 24, 2025

Webrevs

@Bhavana-Kilambi Bhavana-Kilambi changed the title 8345125: Aarch64: Add aarch64 backend for Float16 operations 8345125: Aarch64: Add aarch64 backend for Float16 scalar operations Feb 24, 2025
0xff03, 0xfffe]
0x7e0, 0xfc0, 0x1f80, 0x3ff0, 0x7e00, 0x8000,
0x81ff, 0xc1ff, 0xc003, 0xc7ff, 0xdfff, 0xe03f,
0xe1ff, 0xf801, 0xfc00, 0xfc07, 0xff03, 0xfffe]
Contributor

So here you've deleted the duplicated 0x7e00 (good) but also the not-duplicated 0xe10f. Is 0xe10f not valid?

Contributor Author

@Bhavana-Kilambi Bhavana-Kilambi Feb 25, 2025

Hi, yes, 0xe10f does not seem to be valid. When I tried generating asmtest.out.h, I ran into errors with this value:

aarch64ops.s:1105: Error: immediate out of range at operand 3 -- eor z6.h,z6.h,#0xe10f
aarch64ops.s:1123: Error: immediate out of range at operand 3 -- eor z3.h,z3.h,#0xe10f

So I looked it up here - https://gist.github.com/dinfuehr/51a01ac58c0b23e4de9aac313ed6a06a to see if this number is a legal immediate, and it looks like it isn't. Maybe it's just chance that this number wasn't generated before as an immediate operand, so these errors didn't show up until now.
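
For readers who want to check such values by hand, here is a minimal sketch of the usual rule for AArch64 logical (bitmask) immediates, applied to a 16-bit element: the value must be a replication of a 2-, 4-, 8- or 16-bit pattern that is itself a rotated contiguous run of ones, and all-zeros and all-ones are excluded. The class and method names are hypothetical and are not part of the JDK or of this patch; the assembler remains the authority on what is accepted.

```java
// Hypothetical helper for illustration only; not JDK or patch code.
final class LogicalImm16 {

    /** Returns true if the low 16 bits of value look like a legal bitmask immediate. */
    static boolean isValid(int value) {
        int v = value & 0xFFFF;
        if (v == 0 || v == 0xFFFF) {
            return false; // all-zeros and all-ones cannot be encoded
        }
        // Try each element size e; v must be an e-bit pattern repeated across 16 bits.
        for (int e = 2; e <= 16; e *= 2) {
            int mask = (1 << e) - 1;
            int pattern = v & mask;
            boolean replicated = true;
            for (int shift = e; shift < 16; shift += e) {
                if (((v >>> shift) & mask) != pattern) {
                    replicated = false;
                    break;
                }
            }
            if (!replicated) {
                continue;
            }
            // Rotate right by one within e bits, then count positions where a run of ones ends.
            int rotated = ((pattern >>> 1) | (pattern << (e - 1))) & mask;
            int runEnds = Integer.bitCount(pattern & ~rotated & mask);
            if (runEnds == 1) {
                return true; // exactly one cyclic run of ones
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] samples = {0x7e00, 0xe10f, 0xe1ff, 0xff03, 0xfffe};
        for (int s : samples) {
            System.out.printf("0x%04x -> %b%n", s, isValid(s));
        }
    }
}
```

Under this rule 0xe10f is rejected because its set bits form two separate cyclic runs (bit 8 on its own, and bits 13-15 joined with bits 0-3), while neighbours from the list such as 0xe1ff form a single run.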

@theRealAph
Contributor

Overall, this looks like a great piece of work. I only have a few changes in comments and a question, then we're good to go.

INSN(fnmuld, 0b000, 0b01, 0b100010, 0b1);

// Half-precision floating-point instructions
INSN(fabdh, 0b011, 0b11, 0b000101, 0b0);
Contributor

I suppose fabdh and fnmulh are added to keep aligned with the float and double ones, i.e. fabd(s|d) and fnmul(s|d).

I noticed that there are matching rules for fabd(s|d), i.e. absd(F|D)_reg. I wonder if we need to add the corresponding rule for fp16 here?

Contributor Author

Hi @shqking , thanks for your review comments. Yes I added fabdh and fnmulh to keep aligned with float and double types.
To add support for FP16 absd we need AbsHF to be supported (along with SubHF), but the AbsHF node is not implemented currently. The abs operation is executed directly from the Java code here -

public static Float16 abs(Float16 f16) {
and is not intrinsified or pattern matched like other FP16 operations. The same applies to the negate operation for FP16 -
return shortBitsToFloat16((short)(f16.value ^ (short)0x0000_8000));

On the Valhalla repo, while these operations were being developed, I tried adding support for AbsHF/NegHF which emitted fabs and fneg instructions, but the direct Java code (bit manipulation operations) was much faster (sorry, I don't remember the exact numbers), so we decided to go with the Java implementation instead.
I still added fabd here because op21 is 0 only in the fabd H variant, and I felt that it'd be better to handle it here as it belongs to this group of instructions. Please let me know your thoughts.
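
As an illustration of the bit-manipulation approach described above, here is a minimal sketch that works on raw IEEE 754 binary16 bit patterns held in a `short`. It assumes abs clears the sign bit (the thread only shows the method signature, not its body) and mirrors the sign-bit XOR from the quoted negate line. The class and method names are hypothetical and are not the `Float16` API.

```java
// Hypothetical illustration on raw binary16 bits; not the Float16 class from the JDK.
final class Float16Bits {

    /** Clears the sign bit (bit 15). Assumed to match what an abs on raw bits would do. */
    static short abs(short bits) {
        return (short) (bits & 0x7FFF);
    }

    /** Flips the sign bit (bit 15), like the XOR with 0x8000 quoted in the thread. */
    static short negate(short bits) {
        return (short) (bits ^ 0x8000);
    }

    public static void main(String[] args) {
        short minusTwo = (short) 0xC000; // -2.0 in binary16: sign 1, exponent 10000, mantissa 0
        System.out.printf("abs:    0x%04x%n", abs(minusTwo) & 0xFFFF);
        System.out.printf("negate: 0x%04x%n", negate(minusTwo) & 0xFFFF);
    }
}
```

Both operations are a single integer instruction on the bit pattern, which is consistent with the observation that a dedicated fabs/fneg path brought no benefit here.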

Contributor

According to the RM, fabd is in Advanced SIMD scalar three same FP16, but the rest are in Floating-point data-processing (2 source). The decoding scheme looks rather different. fabd, then, doesn't really fit here, but in a section with the rest of the three same FP16 instructions.
The encoding scheme for Advanced SIMD scalar three same FP16 is pretty simple, so I suggest you create a new group for them, and put fabd in there.
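
To make the encoding discussion concrete, here is a rough sketch that assembles a 32-bit instruction word from the field values in the `INSN(fabdh, 0b011, 0b11, 0b000101, 0b0)` line quoted earlier. The field positions used below (bits 31:29, fixed `11110` at 28:24, a two-bit field at 23:22, op21 at bit 21, Rm at 20:16, a six-bit field at 15:10, Rn at 9:5, Rd at 4:0) are my assumption about how those arguments map onto the instruction word and should be checked against the Arm Architecture Reference Manual. The class and method names are hypothetical and are not the HotSpot assembler API.

```java
// Hypothetical encoder sketch; field positions are assumptions, not taken from HotSpot.
final class Fp16ScalarThreeSame {

    /** Packs the fields into one 32-bit instruction word under the assumed layout. */
    static int encode(int op31, int type, int opcode, int op21, int rd, int rn, int rm) {
        return (op31 << 29)        // bits 31:29
             | (0b11110 << 24)     // bits 28:24, fixed
             | (type << 22)        // bits 23:22
             | (op21 << 21)        // bit 21
             | ((rm & 0x1F) << 16) // bits 20:16, second source register
             | (opcode << 10)      // bits 15:10
             | ((rn & 0x1F) << 5)  // bits 9:5, first source register
             | (rd & 0x1F);        // bits 4:0, destination register
    }

    /** fabd on half-precision scalars, using the constants from the quoted INSN line. */
    static int fabdh(int rd, int rn, int rm) {
        return encode(0b011, 0b11, 0b000101, 0b0, rd, rn, rm);
    }

    public static void main(String[] args) {
        // Register numbers chosen arbitrarily for the demonstration.
        System.out.printf("fabdh(0, 1, 2) = 0x%08x%n", fabdh(0, 1, 2));
    }
}
```

With op21 fixed at 0 and only three register fields varying, a separate group for these instructions needs very little code, which is the point of the suggestion above.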

Contributor

@Bhavana-Kilambi Thanks for your explanation of the missing AbsHF. It's okay with me to have fabdh and fnmulh in this patch.

Overall it looks good to me except for aph's comment above.

Contributor Author

@Bhavana-Kilambi Bhavana-Kilambi Apr 24, 2025

Hi @theRealAph Thanks again for the review and apologies for the delay in responding.
I moved the three fabd instructions out of their current place and added them in two separate sections - one for the single and double precision (Advanced SIMD scalar three same) and another for FP16 (Advanced SIMD scalar three same FP16). Please review the changes. Thank you!

@bridgekeeper

bridgekeeper bot commented Mar 26, 2025

@Bhavana-Kilambi This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@Bhavana-Kilambi
Contributor Author

Hello @shqking @theRealAph , sincere apologies for the delay in addressing the review comments. I am planning on uploading a patch soon addressing all review comments. Thank you !

@Bhavana-Kilambi
Contributor Author

Hello, I will not be able to respond to comments for the next couple of months or so due to some urgent tasks at work. Until then, I'll move this PR to draft status so that it is not closed due to lack of activity. Thank you for the review!

@Bhavana-Kilambi Bhavana-Kilambi marked this pull request as draft March 31, 2025 09:52
@openjdk

openjdk bot commented Mar 31, 2025

@Bhavana-Kilambi this pull request can not be integrated into master due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout JDK-8345125
git fetch https://git.openjdk.org/jdk.git master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push

@openjdk openjdk bot added merge-conflict Pull request has merge conflict with target branch and removed rfr Pull request is ready for review labels Mar 31, 2025
@mhaessig
Contributor

Can you please uncomment the following tests using aarch64 float16 in this PR?

https://github.com/mhaessig/jdk/blob/cec48ed270d3bdf704c389a091b42a32c2ed6440/test/hotspot/jtreg/compiler/floatingpoint/TestSubNodeFloatDoubleNegation.java#L58-L60

@openjdk openjdk bot removed the merge-conflict Pull request has merge conflict with target branch label Apr 24, 2025
@Bhavana-Kilambi
Contributor Author

> Can you please uncomment the following tests using aarch64 float16 in this PR?
>
> https://github.com/mhaessig/jdk/blob/cec48ed270d3bdf704c389a091b42a32c2ed6440/test/hotspot/jtreg/compiler/floatingpoint/TestSubNodeFloatDoubleNegation.java#L58-L60

Done. Thanks for notifying.

@Bhavana-Kilambi Bhavana-Kilambi marked this pull request as ready for review April 24, 2025 15:53
@openjdk openjdk bot added the rfr Pull request is ready for review label Apr 24, 2025
@shqking
Contributor

shqking commented Apr 25, 2025

Hi @Bhavana-Kilambi, I noticed there exists inconsistency between test/hotspot/gtest/aarch64/asmtest.out.h and test/hotspot/gtest/aarch64/aarch64-asmtest.py in the latest commit. We should resolve that.

@Bhavana-Kilambi
Contributor Author

> Hi @Bhavana-Kilambi, I noticed there exists inconsistency between test/hotspot/gtest/aarch64/asmtest.out.h and test/hotspot/gtest/aarch64/aarch64-asmtest.py in the latest commit. We should resolve that.

Thanks. I think I missed generating the asmtest.out.h file in the new commit. I'll update it.

@Bhavana-Kilambi
Contributor Author

Hi @shqking I have regenerated the test/hotspot/gtest/aarch64/asmtest.out.h to keep it consistent with the instructions in test/hotspot/gtest/aarch64/aarch64-asmtest.py. Can you please review now? Thanks!

Contributor

@theRealAph theRealAph left a comment

Looks good to me. Please summarize the tests that you've run before committing.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Apr 25, 2025
@Bhavana-Kilambi
Contributor Author

> Looks good to me. Please summarize the tests that you've run before committing.

Thank you for the approval.
All hotspot (hotspot_all), jdk (tiers 1-3) and langtools (tier1) pass on N1, V1 and V2 architectures.

Contributor

@shqking shqking left a comment

Thanks for your update. Looks good to me.

@theRealAph
Contributor

theRealAph commented Apr 25, 2025 via email

@Bhavana-Kilambi
Contributor Author

Bhavana-Kilambi commented Apr 25, 2025

> On 4/25/25 10:17, bkilambi wrote:
>
> > Looks good to me. Please summarize the tests that you've run before committing.
> >
> > Thank you for the approval. All hotspot (hotspot_all), jdk (tiers 1-3) and langtools (tier1) pass on N1, V1 and V2 architectures.
>
> But do any of these tests use Float16?
>
> -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://www.redhat.com https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

Yes they do.
The following FP16 tests are included in them and have also been tested separately. These are the tests that were added or modified in the commit that added initial scalar FP16 support to JDK mainline (4b463ee):


test/hotspot/jtreg/compiler/c2/irTests/ConvF2HFIdealizationTests.java
test/hotspot/jtreg/compiler/c2/irTests/MulHFNodeIdealizationTests.java
test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java
test/hotspot/jtreg/compiler/vectorization/TestFloat16VectorConvChain.java
test/jdk/jdk/incubator/vector/ScalarFloat16OperationsTest.java

Please let me know if you'd like me to test anything else. Thanks!

@Bhavana-Kilambi
Contributor Author

Hi @theRealAph , is it ok if I integrate this or would you like me to do any other testing on the patch?

@theRealAph
Contributor

> Hi @theRealAph , is it ok if I integrate this or would you like me to do any other testing on the patch?

As long as you've got test coverage for everything here, go ahead.

@Bhavana-Kilambi
Contributor Author

Thanks a lot! Can I ask you to please sponsor this patch?
/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Apr 25, 2025
@openjdk

openjdk bot commented Apr 25, 2025

@Bhavana-Kilambi
Your change (at version 6a49a11) is now ready to be sponsored by a Committer.

@shqking
Contributor

shqking commented Apr 28, 2025

/sponsor

@openjdk

openjdk bot commented Apr 28, 2025

Going to push as commit 3140de4.
Since your change was applied there have been 39 commits pushed to the master branch:

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Apr 28, 2025
@openjdk openjdk bot closed this Apr 28, 2025
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review sponsor Pull request is ready to be sponsored labels Apr 28, 2025
@openjdk

openjdk bot commented Apr 28, 2025

@shqking @Bhavana-Kilambi Pushed as commit 3140de4.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@TobiHartmann
Member

This caused a regression: JDK-8355708

Labels

graal graal-dev@openjdk.org hotspot hotspot-dev@openjdk.org integrated Pull request has been integrated
