8290249: Vectorize signum on AArch64 #9807

Bhavana-Kilambi · 2022-08-09T10:15:41Z

This patch auto-vectorizes the Math.signum intrinsic for float and
double types on AArch64 (Neon and SVE). On SVE-capable machines, Neon
code is emitted if MaxVectorSize <= 16, and SVE code is emitted if
MaxVectorSize > 16.

The performance data below is for this microbenchmark:
test/micro/org/openjdk/bench/vm/compiler/VectorSignum.java

Benchmark                  Size   A      B      C
VectorSignum.doubleSignum  256    1.79   1.70   3.18
VectorSignum.doubleSignum  512    1.86   1.73   3.69
VectorSignum.doubleSignum  1024   1.89   1.74   2.98
VectorSignum.doubleSignum  2048   1.92   1.75   3.04
VectorSignum.floatSignum   256    3.34   3.06   3.92
VectorSignum.floatSignum   512    3.63   3.22   5.27
VectorSignum.floatSignum   1024   3.76   3.35   4.77
VectorSignum.floatSignum   2048   3.85   3.47   5.59

Machines A, B and C are described below:
A : 128-bit Neon machine
B : 256-bit SVE machine
C : 512-bit SVE machine

The numbers in the table are gain ratios: the runtime (ns/op) of the
scalar, non-vectorized intrinsic code divided by the runtime of the
vectorized version of the intrinsic (this patch).
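
For readers unfamiliar with the pattern, the kind of loop this change targets can be sketched as below. This is a minimal illustration, not the contents of VectorSignum.java; the class and method names are hypothetical, and whether C2 actually vectorizes the loop depends on the hardware and JVM flags in use.

```java
import java.util.Arrays;

public class SignumLoopSketch {
    // Hypothetical kernel: a simple counted loop applying Math.signum
    // element-wise, the shape an auto-vectorizer can turn into vector code.
    static void signumAll(double[] in, double[] out) {
        for (int i = 0; i < in.length; i++) {
            out[i] = Math.signum(in[i]);
        }
    }

    public static void main(String[] args) {
        double[] in = {-2.5, -0.0, 0.0, 3.0, Double.NaN};
        double[] out = new double[in.length];
        signumAll(in, out);
        // Zeros and NaN pass through unchanged; everything else becomes +/-1.0.
        System.out.println(Arrays.toString(out));
    }
}
```

Note that Math.signum returns its argument unchanged for zeros and NaN, so a vectorized version has to preserve those lanes rather than force them to +1 or -1.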

Progress

Change must be properly reviewed (1 review required, with at least 1 Reviewer)
Change must not contain extraneous whitespace
Commit message must refer to an issue

Issue

JDK-8290249: Vectorize signum on AArch64

Reviewers

Andrew Haley (@theRealAph - Reviewer)
Nick Gasson (@nick-arm - Reviewer)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/9807/head:pull/9807
$ git checkout pull/9807

Update a local copy of the PR:
$ git checkout pull/9807
$ git pull https://git.openjdk.org/jdk pull/9807/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 9807

View PR using the GUI difftool:
$ git pr show -t 9807

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/9807.diff

bridgekeeper · 2022-08-09T10:17:02Z

👋 Welcome back Bhavana-Kilambi! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

openjdk · 2022-08-09T10:20:10Z

@Bhavana-Kilambi The following label will be automatically applied to this pull request:

hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

mlbridge · 2022-08-09T10:24:09Z

Webrevs

theRealAph · 2022-08-10T09:27:09Z

Please do not commit this until 9346 is in.

theRealAph · 2022-08-10T09:31:03Z

src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.cpp

+    switch (T) {
+    case S:
+      sve_and(vtmp, T, 0x80000000); // Extract the sign bit of float value in every lane of src
+      sve_orr(vtmp, T, 0x3f800000); // OR it with +1 to make the final result +1 or -1 depending


Suggested change

-      sve_orr(vtmp, T, 0x3f800000); // OR it with +1 to make the final result +1 or -1 depending
+      sve_orr(vtmp, T, jlong_cast(1.0)); // OR it with +1 to make the final result +1 or -1 depending

...everywhere
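
As an aside for readers following the bit manipulation: the literal 0x3f800000 is the IEEE 754 bit pattern of 1.0f, which is why a named conversion reads better than the magic number. The scalar Java sketch below illustrates the AND/OR step from the quoted lines. The class and method names are hypothetical, and the handling of zeros and NaN here is a plain scalar check based on the Math.signum specification; it is an assumption, not the patch's actual vector select.

```java
public class SignumBitsSketch {
    static final int SIGN_MASK = 0x80000000;
    // Bit pattern of 1.0f; equal to the literal 0x3f800000 in the quoted code.
    static final int ONE_BITS = Float.floatToRawIntBits(1.0f);

    // Extract the sign bit and OR it with the bits of +1.0f.
    // Only meaningful for inputs that are neither zero nor NaN.
    static float signedOne(float x) {
        int sign = Float.floatToRawIntBits(x) & SIGN_MASK;
        return Float.intBitsToFloat(sign | ONE_BITS);
    }

    // Assumed completion: zeros and NaN are returned unchanged,
    // as Math.signum specifies.
    static float signum(float x) {
        return (x == 0.0f || Float.isNaN(x)) ? x : signedOne(x);
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(ONE_BITS));
        System.out.println(signum(-3.5f) + " " + signum(42.0f));
    }
}
```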

theRealAph · 2022-08-10T09:41:20Z

src/hotspot/cpu/aarch64/assembler_aarch64.hpp

  INSN(sve_mls,   0b00000100, 0, 0b011); // multiply-subtract, writing addend: Zda = Zda + -Zn*Zm
 #undef INSN

+// SVE floating-point compare abs (predicated)


This should be handled by the "SVE Integer/Floating-Point Compare - Vectors" code.

Bhavana-Kilambi · 2022-08-16T10:02:50Z

Hello, thank you for reviewing my patch. I have made the changes as suggested and am waiting for the refactoring patch to be merged. I will then change my *ad files accordingly and put another patch for review in this PR.

theRealAph · 2022-08-16T10:18:36Z

> Hello, thank you for reviewing my patch. I have made the changes as suggested and am waiting for the refactoring patch to be merged. I will then change my *ad files accordingly and put another patch for review in this PR.

The change to assembler.hpp is still not done.

Bhavana-Kilambi · 2022-08-16T10:26:21Z

> > Hello, thank you for reviewing my patch. I have made the changes as suggested and am waiting for the refactoring patch to be merged. I will then change my *ad files accordingly and put another patch for review in this PR.
>
> The change to assembler.hpp is still not done.

I mean I will put up the entire patch (with changes to assembler_aarch64.hpp and c2_MacroAssembler_aarch64.cpp as suggested by you, and also the *ad files) once the refactoring patch is merged. I will need to make changes to the *ad files once the refactoring patch is merged, so I plan to put them all together in a single patch. Apologies if I wasn't clear.

Bhavana-Kilambi · 2022-08-16T13:18:40Z

Hi, I just pushed a new commit with the proposed changes (and a few others). Please review. Once the refactoring patch is merged, I will rebase/merge this patch accordingly. Thank you.

theRealAph · 2022-08-16T13:28:40Z

src/hotspot/cpu/aarch64/assembler_aarch64.hpp

      case NE: cond_op = (op2 << 2) | 0b11; break;                                     \
-      case GE: cond_op = (op2 << 2) | 0b00; break;                                     \
-      case GT: cond_op = (op2 << 2) | 0b01; break;                                     \
+      case GE: cond_op = (op2 << 2) | ((op2 == 0b11) ? 0b01 : 0b00); break;            \


Would something like this be easier to understand?

bool is_absolute = op2 == 0b11;
....

case GE: cond_op = (op2 << 2) | (is_absolute ? 0b01 : 0b00); break; \
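
For illustration, the suggestion can be read roughly as follows. This is a Java transliteration that covers only the NE and GE cases quoted in this comment; the class, enum and method names are hypothetical, and the remaining conditions are left out because the excerpt does not show them.

```java
public class CondOpSketch {
    enum Condition { NE, GE }

    // Mirrors the quoted macro fragment: op2 selects the instruction group,
    // and op2 == 0b11 is given a name instead of being tested inline.
    static int condOp(int op2, Condition cond) {
        boolean isAbsolute = (op2 == 0b11);
        switch (cond) {
            case NE:
                return (op2 << 2) | 0b11;
            case GE:
                return (op2 << 2) | (isAbsolute ? 0b01 : 0b00);
            default:
                throw new IllegalArgumentException("not covered by this sketch: " + cond);
        }
    }

    public static void main(String[] args) {
        System.out.println(Integer.toBinaryString(condOp(0b11, Condition.GE)));
        System.out.println(Integer.toBinaryString(condOp(0b10, Condition.GE)));
    }
}
```

The parentheses around the conditional expression matter: they keep the ternary from swallowing the `(op2 << 2) |` part of the expression.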

Bhavana-Kilambi · 2022-08-17T08:09:48Z

src/hotspot/cpu/aarch64/assembler_aarch64.hpp

-    } else {                                                                           \
-      assert(T != B && T != Q, "invalid size");                                        \
-      assert(cond != HI && cond != HS, "invalid condition for fcm");                   \
+    assert(T != Q, "invalid size");                                                    \


Thank you for reviewing. Could you please clarify what exactly you mean by "Please wrap all of this in #ifdef ASSERT"? Do you mean squashing the if conditions with the asserts? The assert macro calls are already inside a "#define".

Bhavana-Kilambi · 2022-08-19T11:00:30Z

The builds on aarch64 failed because I missed adding parentheses in the assembler.hpp file. I will update with a new patch shortly.

openjdk · 2022-08-19T16:41:29Z

@Bhavana-Kilambi This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8290249: Vectorize signum on AArch64

Reviewed-by: aph, ngasson

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 19 new commits pushed to the master branch:

27b0f77: 8292318: Memory corruption in remove_dumptime_info
9a65524: 8290300: Use standard String-joining tools where applicable
f9004fe: 8292561: Make "ReplayCompiles" a diagnostic product switch
2fbb936: 8292691: Move CompilerConfig::is_xxx() inline functions out of compilerDefinitions.hpp
3601e30: 8290909: MemoryPoolMBean/isUsageThresholdExceeded tests failed with "isUsageThresholdExceeded() returned false, and is still false, while threshold = MMMMMMM and used peak = NNNNNNN"
37c0a13: 8292350: Use static methods for hashCode/toString primitives
4453200: 8292628: x86: Improve handling of constants in trigonometric stubs
07c9ba7: 8292686: runtime/cds/appcds/TestWithProfiler.java SIGSEGV in TableStatistics ctr
235151e: 8292676: Remove two kerberos tests from problem list
df5209e: 8292683: Remove BadKeyUsageTest.java from Problem List
... and 9 more: https://git.openjdk.org/jdk/compare/f2f0cd86bf4dce4633f484476077fd090549780e...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@theRealAph, @nick-arm) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

Bhavana-Kilambi · 2022-08-22T08:23:47Z

/integrate

openjdk · 2022-08-22T08:25:28Z

@Bhavana-Kilambi
Your change (at version 221405e) is now ready to be sponsored by a Committer.

nick-arm · 2022-08-22T08:59:11Z

/sponsor

openjdk · 2022-08-22T09:01:24Z

Going to push as commit 07c7977.
Since your change was applied there have been 20 commits pushed to the master branch:

a3ec0bb: 8253413: [REDO] [REDO] G1 incorrectly limiting young gen size when using the reserve can result in repeated full gcs
27b0f77: 8292318: Memory corruption in remove_dumptime_info
9a65524: 8290300: Use standard String-joining tools where applicable
f9004fe: 8292561: Make "ReplayCompiles" a diagnostic product switch
2fbb936: 8292691: Move CompilerConfig::is_xxx() inline functions out of compilerDefinitions.hpp
3601e30: 8290909: MemoryPoolMBean/isUsageThresholdExceeded tests failed with "isUsageThresholdExceeded() returned false, and is still false, while threshold = MMMMMMM and used peak = NNNNNNN"
37c0a13: 8292350: Use static methods for hashCode/toString primitives
4453200: 8292628: x86: Improve handling of constants in trigonometric stubs
07c9ba7: 8292686: runtime/cds/appcds/TestWithProfiler.java SIGSEGV in TableStatistics ctr
235151e: 8292676: Remove two kerberos tests from problem list
... and 10 more: https://git.openjdk.org/jdk/compare/f2f0cd86bf4dce4633f484476077fd090549780e...master

Your commit was automatically rebased without conflicts.

openjdk · 2022-08-22T09:01:56Z

@nick-arm @Bhavana-Kilambi Pushed as commit 07c7977.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

openjdk bot added the rfr Pull request is ready for review label Aug 9, 2022

openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label Aug 9, 2022

theRealAph reviewed Aug 10, 2022

Bhavana-Kilambi added 2 commits August 16, 2022 13:46

Merge master

133396c

Merge sve_facgt with int/fp compare and few optimizations

b5d6931

theRealAph reviewed Aug 16, 2022

Bhavana-Kilambi commented Aug 17, 2022

Bhavana-Kilambi added 2 commits August 19, 2022 10:56

Merge master

25aeba6

Add signum implementation in the aarch64_vector.ad file

a2f6b17

Add parantheses in aarch_assembler.hpp code to fix operator precedence error

221405e

theRealAph approved these changes Aug 19, 2022

openjdk bot added the ready Pull request is ready to be integrated label Aug 19, 2022

nick-arm approved these changes Aug 22, 2022

openjdk bot added the sponsor Pull request is ready to be sponsored label Aug 22, 2022

openjdk bot added the integrated Pull request has been integrated label Aug 22, 2022

openjdk bot closed this Aug 22, 2022

openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review sponsor Pull request is ready to be sponsored labels Aug 22, 2022
