8312570: [TESTBUG] Jtreg compiler/loopopts/superword/TestDependencyOffsets.java fails on 512-bit SVE #15010

pfustc · 2023-07-25T07:42:59Z

Hotspot jtreg compiler/loopopts/superword/TestDependencyOffsets.java fails on AArch64 CPUs with 512-bit SVE. The reason is that many test loops in the code cannot be vectorized due to data dependence but IR tests assume they can.

On AArch64, these IR tests just check the CPU feature of asimd and incorrectly assumes AArch64 vectors are at most 256 bits. But actually, asimd on AArch64 only represents NEON vectors which are at most 128 bits. AArch64 CPUs may have another feature of sve which represents scalable vectors of at most 2048 bits. The vectorization won't succeed on 512-bit SVE CPUs if the memory offset between some read and write is less than 512 bits.

As this jtreg is auto-generated by a python script, we have updated the script and re-generated this jtreg. In this new version, we checked the auto-vectorization on both NEON-only and NEON+SVE platforms. Below is the diff of the generator script. We have also attached the new script to the JBS page.

@@ -321,7 +321,8 @@ class Type:
            p.append(Platform("avx512", ["avx512", "true"], 64))
         else:
            assert False, "type not implemented" + self.name
-        p.append(Platform("asimd", ["asimd", "true"], 32))
+        p.append(Platform("asimd", ["asimd", "true", "sve", "false"], 16))
+        p.append(Platform("sve", ["sve", "true"], 256))
         return p

 class Test:
@@ -457,7 +458,7 @@ class Generator:
         lines.append(" *   and various MaxVectorSize values, and +- AlignVector.")
         lines.append(" *")
         lines.append(" * Note: this test is auto-generated. Please modify / generate with script:")
-        lines.append(" *       https://bugs.openjdk.org/browse/JDK-8308606")
+        lines.append(" *       https://bugs.openjdk.org/browse/JDK-8312570")
         lines.append(" *")
         lines.append(" * Types: " + ", ".join([t.name for t in self.types]))
         lines.append(" * Offsets: " + ", ".join([str(o) for o in self.offsets]))
@@ -598,7 +599,8 @@ class Generator:
             # IR rules
             for p in test.t.platforms():
                 elements = p.vector_width // test.t.size
-                lines.append(f"    // CPU: {p.name} -> vector_width: {p.vector_width} -> elements in vector: {elements}")
+                max_pre = "max " if p.name == "sve" else ""
+                lines.append(f"    // CPU: {p.name} -> {max_pre}vector_width: {p.vector_width} -> {max_pre}elements in vector: {elements}")
                 ###############  -AlignVector
                 rule = PlatformIRRule(p)
                 rule.add_pre_constraint("AlignVector", IRBool.makeFalse())
@@ -694,8 +696,8 @@ class Generator:
 def main():
     g = Generator()
     g.generate("TestDependencyOffsets",
-               "/home/emanuel/Documents/fork7-jdk/open/test/hotspot/jtreg/compiler/loopopts/superword",
-               "8298935 8308606", # Big ID
+               "test/hotspot/jtreg/compiler/loopopts/superword",
+               "8298935 8308606 8312570", # Bug ID
                "compiler.loopopts.superword", # package
     )

We tested this on various of AArch64 CPUs.

Progress

Change must be properly reviewed (1 review required, with at least 1 Reviewer)
Change must not contain extraneous whitespace
Commit message must refer to an issue

Issue

JDK-8312570: [TESTBUG] Jtreg compiler/loopopts/superword/TestDependencyOffsets.java fails on 512-bit SVE (Bug - P4)

Reviewers

Emanuel Peter (@eme64 - Reviewer) ⚠️ Review applies to b62ade2b
Vladimir Kozlov (@vnkozlov - Reviewer) ⚠️ Review applies to b62ade2b

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/15010/head:pull/15010
$ git checkout pull/15010

Update a local copy of the PR:
$ git checkout pull/15010
$ git pull https://git.openjdk.org/jdk.git pull/15010/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 15010

View PR using the GUI difftool:
$ git pr show -t 15010

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/15010.diff

Webrev

Link to Webrev Comment

…fsets.java fails on 512-bit SVE Hotspot jtreg `compiler/loopopts/superword/TestDependencyOffsets.java` fails on AArch64 CPUs with 512-bit SVE. The reason is that many test loops in the code cannot be vectorized due to data dependence but IR tests assume they can. On AArch64, these IR tests just check the CPU feature of `asimd` and incorrectly assumes AArch64 vectors are at most 256 bits. But actually, `asimd` on AArch64 only represents NEON vectors which are at most 128 bits. AArch64 CPUs may have another feature of `sve` which represents scalable vectors of at most 2048 bits. The vectorization won't succeed on 512-bit SVE CPUs if the memory offset between some read and write is less than 512 bits. As this jtreg is auto-generated by a python script, we have updated the script and re-generated this jtreg. In this new version, we checked the auto-vectorization on both NEON-only and NEON+SVE platforms. Below is the diff of the generator script. We have also attached the new script to the JBS page. ``` @@ -321,7 +321,8 @@ class Type: p.append(Platform("avx512", ["avx512", "true"], 64)) else: assert False, "type not implemented" + self.name - p.append(Platform("asimd", ["asimd", "true"], 32)) + p.append(Platform("asimd", ["asimd", "true", "sve", "false"], 16)) + p.append(Platform("sve", ["sve", "true"], 256)) return p class Test: @@ -457,7 +458,7 @@ class Generator: lines.append(" * and various MaxVectorSize values, and +- AlignVector.") lines.append(" *") lines.append(" * Note: this test is auto-generated. Please modify / generate with script:") - lines.append(" * https://bugs.openjdk.org/browse/JDK-8308606") + lines.append(" * https://bugs.openjdk.org/browse/JDK-8312570") lines.append(" *") lines.append(" * Types: " + ", ".join([t.name for t in self.types])) lines.append(" * Offsets: " + ", ".join([str(o) for o in self.offsets])) @@ -598,7 +599,8 @@ class Generator: # IR rules for p in test.t.platforms(): elements = p.vector_width // test.t.size - lines.append(f" // CPU: {p.name} -> vector_width: {p.vector_width} -> elements in vector: {elements}") + max_pre = "max " if p.name == "sve" else "" + lines.append(f" // CPU: {p.name} -> {max_pre}vector_width: {p.vector_width} -> {max_pre}elements in vector: {elements}") ############### -AlignVector rule = PlatformIRRule(p) rule.add_pre_constraint("AlignVector", IRBool.makeFalse()) @@ -694,8 +696,8 @@ class Generator: def main(): g = Generator() g.generate("TestDependencyOffsets", - "/home/emanuel/Documents/fork7-jdk/open/test/hotspot/jtreg/compiler/loopopts/superword", - "8298935 8308606", # Big ID + "test/hotspot/jtreg/compiler/loopopts/superword", + "8298935 8308606 8312570", # Bug ID "compiler.loopopts.superword", # package ) ``` We tested this on various of AArch64 CPUs.

bridgekeeper · 2023-07-25T07:43:56Z

👋 Welcome back pli! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

openjdk · 2023-07-25T07:45:26Z

@pfustc The following label will be automatically applied to this pull request:

hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

mlbridge · 2023-07-25T07:49:46Z

Webrevs

01: Full (310cf9d1)
00: Full (b62ade2b)

pfustc · 2023-07-25T07:51:06Z

@eme64 Please help look at this. And, how about adding the test generator script you wrote into the jdk repo?

TobiHartmann · 2023-07-27T07:27:01Z

Emanuel is on vacation until Aug 9. Can this wait or should we problem list?

pfustc · 2023-07-31T07:54:16Z

Emanuel is on vacation until Aug 9. Can this wait or should we problem list?

Thanks for the info. It's ok for us to wait, since very few people are using 512-bit SVE today.

eme64 · 2023-08-03T16:50:02Z

@pfustc @TobiHartmann I just saw this on my emails. So I'll give a quick response:

We had this running on Aarch64 machines with asimd but without sve. Why do you think that this even passed with my 32 byte assumption (256 bit)? You say it should only have 128 bit.

What is the max_pre for? Is it necessary?

Adding the script to the jdk repo could be nice. But I think there may be some issues with adding python files. We may have to rewrite it in java. But I think for now adding and updating it via JBS is ok.

pfustc · 2023-08-04T03:18:14Z

We had this running on Aarch64 machines with asimd but without sve. Why do you think that this even passed with my 32 byte assumption (256 bit)? You say it should only have 128 bit.

Assuming NEON has larger vector size (256 bit, which is wrong) won't result in any failure on NEON-only machines. But it results in running less IR checks on 256-bit SVE. Let's take below IR condition change as an example.

-        applyIfAnd = {"AlignVector", "false", "MaxVectorSize", ">= 8", "MaxVectorSize", "<= 16"},
+        applyIfAnd = {"AlignVector", "false", "MaxVectorSize", ">= 8"},

Before this patch, the existence of vector IRs won't be checked on 256-bit SVE as we have MaxVectorSize <= 16. After this patch, it will be checked. The main reason of failures on 512-bit SVE is the lack of sve == false check so the IR tests will run on machines with vector length > 256 bits.

What is the max_pre for? Is it necessary?

It just adds a prefix to make the comment more precise, as SVE uses scalable vectors and the vector length ranges from 128 bits to 2048 bits.

eme64

@pfustc Thanks for the changes and explanations, looks good to me! :)

Ah. Just one more idea: Since you now have even longer vector widths with 2048 bits: Should we not add some cases with even larger dependency offsets? We should go further than -196, 196. We could consider adding 255, 256, 511, 512, 1024, 1536 (positive and negative). Of course the question is if that increases the runtime too much, what do you think?

openjdk · 2023-08-10T10:36:49Z

@pfustc This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8312570: [TESTBUG] Jtreg compiler/loopopts/superword/TestDependencyOffsets.java fails on 512-bit SVE

Reviewed-by: epeter, kvn

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been no new commits pushed to the master branch. If another commit should be pushed before you perform the /integrate command, your PR will be automatically rebased. If you prefer to avoid any potential automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

vnkozlov

Good.

eme64 · 2023-08-15T12:55:35Z

@pfustc you will have to merge the changes from JDK-8310308.

pfustc · 2023-08-28T08:58:13Z

Thanks @eme64 @vnkozlov for review. Patch is currently merged with master. The python generator file is also updated and attached to the JBS page.

Ah. Just one more idea: Since you now have even longer vector widths with 2048 bits: Should we not add some cases with even larger dependency offsets? We should go further than -196, 196. We could consider adding 255, 256, 511, 512, 1024, 1536 (positive and negative). Of course the question is if that increases the runtime too much, what do you think?

This is a good suggestion but I don't think it's necessary now and in the near future. Although Arm architecture document says SVE vector length can be at most 2048 bits, so far, the largest SVE vector length of real AArch64 CPUs in the world is 512 bits. AFAIK, there will not be real CPUs with 1024-bit SVE in the short term. So I prefer keeping current offset values for now.

eme64 · 2023-08-28T11:03:41Z

@pfustc ok, fair enough. If there is no real hardware, and not coming anytime soon, then we can leave the offsets as is. Thanks again for the fix!

pfustc · 2023-08-29T01:33:05Z

/integrate

openjdk · 2023-08-29T01:34:35Z

Going to push as commit e5ea9aa.
Since your change was applied there have been 16 commits pushed to the master branch:

1cb2cc6: 8308464: Shared array class should not always be loaded in boot loader
69d1feb: 8315060: Out of tree incremental build fails with ccache
8e2a533: 8315137: Add explicit override RecordComponentElement.asType()
b4b2fec: 8311081: KeytoolReaderP12Test.java fail on localized Windows platform
31e2681: 8315071: Modify TrayIconScalingTest.java, PrintLatinCJKTest.java to use new PassFailJFrame's builder pattern usage
21916f3: 8139208: [macosx] Issue with setExtendedState of JFrame
99ea8bf: 8315062: [GHA] get-bootjdk action should return the abolute path
acb24bf: 8315116: fix minor issue in copyright header introduced by JDK-8269957 that is breaking the build
11da15d: 8269957: facilitate alternate impls of NameTable and Name
725ec0c: 8315020: The macro definition for LoongArch64 zero build is not accurate.
... and 6 more: https://git.openjdk.org/jdk/compare/12de9b0225363377e9a76729b11698221d4f29f2...master

Your commit was automatically rebased without conflicts.

openjdk · 2023-08-29T01:34:43Z

@pfustc Pushed as commit e5ea9aa.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

openjdk bot added the rfr Pull request is ready for review label Jul 25, 2023

openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label Jul 25, 2023

eme64 approved these changes Aug 10, 2023

View reviewed changes

openjdk bot added the ready Pull request is ready to be integrated label Aug 10, 2023

vnkozlov approved these changes Aug 10, 2023

View reviewed changes

Merge branch 'master' into deptest

310cf9d

openjdk bot added the integrated Pull request has been integrated label Aug 29, 2023

openjdk bot closed this Aug 29, 2023

openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Aug 29, 2023

pfustc deleted the deptest branch August 29, 2023 01:35

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

8312570: [TESTBUG] Jtreg compiler/loopopts/superword/TestDependencyOffsets.java fails on 512-bit SVE #15010

8312570: [TESTBUG] Jtreg compiler/loopopts/superword/TestDependencyOffsets.java fails on 512-bit SVE #15010

pfustc commented Jul 25, 2023 •

edited by openjdk bot

Loading

bridgekeeper bot commented Jul 25, 2023

openjdk bot commented Jul 25, 2023

mlbridge bot commented Jul 25, 2023 •

edited

Loading

pfustc commented Jul 25, 2023

TobiHartmann commented Jul 27, 2023

pfustc commented Jul 31, 2023

eme64 commented Aug 3, 2023 •

edited

Loading

pfustc commented Aug 4, 2023

eme64 left a comment •

edited

Loading

openjdk bot commented Aug 10, 2023 •

edited

Loading

vnkozlov left a comment

eme64 commented Aug 15, 2023

pfustc commented Aug 28, 2023

eme64 commented Aug 28, 2023

pfustc commented Aug 29, 2023

openjdk bot commented Aug 29, 2023

openjdk bot commented Aug 29, 2023

8312570: [TESTBUG] Jtreg compiler/loopopts/superword/TestDependencyOffsets.java fails on 512-bit SVE #15010

8312570: [TESTBUG] Jtreg compiler/loopopts/superword/TestDependencyOffsets.java fails on 512-bit SVE #15010

Conversation

pfustc commented Jul 25, 2023 • edited by openjdk bot Loading

Progress

Issue

Reviewers

Reviewing

Webrev

bridgekeeper bot commented Jul 25, 2023

openjdk bot commented Jul 25, 2023

mlbridge bot commented Jul 25, 2023 • edited Loading

Webrevs

pfustc commented Jul 25, 2023

TobiHartmann commented Jul 27, 2023

pfustc commented Jul 31, 2023

eme64 commented Aug 3, 2023 • edited Loading

pfustc commented Aug 4, 2023

eme64 left a comment • edited Loading

Choose a reason for hiding this comment

openjdk bot commented Aug 10, 2023 • edited Loading

vnkozlov left a comment

Choose a reason for hiding this comment

eme64 commented Aug 15, 2023

pfustc commented Aug 28, 2023

eme64 commented Aug 28, 2023

pfustc commented Aug 29, 2023

openjdk bot commented Aug 29, 2023

openjdk bot commented Aug 29, 2023

pfustc commented Jul 25, 2023 •

edited by openjdk bot

Loading

mlbridge bot commented Jul 25, 2023 •

edited

Loading

eme64 commented Aug 3, 2023 •

edited

Loading

eme64 left a comment •

edited

Loading

openjdk bot commented Aug 10, 2023 •

edited

Loading