This repository has been archived by the owner. It is now read-only.

8259775: [Vector API] Incorrect code-gen for VectorReinterpret operation #122

Closed
wants to merge 5 commits into from

DamonFool
Member

@DamonFool DamonFool commented Jan 14, 2021

Hi all,

The code-gen for VectorReinterpret may be wrong on x86.

Here is the opto-assembly for the reproducer in JBS, which is based on @XiaohongGong's example in JDK-8259353 (many thanks to her).

066     B7: #   out( N1 ) <- in( B6 )  Freq: 0.999994
066     vector_reinterpret_expand XMM0,XMM0     !
066     store_vector [R12 + R11 << 3 + #16] (compressed oop addressing),XMM0

Please note that dst and src [1] share the same XMM0 register, and movdqu [2] should be generated for this case.
But when dst == src, movdqu actually generates nothing [3], which leads to an incorrect result.

In this case movdqu should not be empty, since the upper bits of dst must be zeroed.
A similar error also exists for vmovdqu [4].
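
To make the failure mode concrete, here is a minimal C++ sketch. It is a simplified model, not HotSpot code: `VecReg`, `move128_zero_upper`, `move128_elide_self`, `upper_is_zero` and `filled` are hypothetical names, and the model assumes, as described above, that the 128-bit register-to-register move is expected to clear the destination bits above the low 16 bytes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model of a 512-bit vector register as 64 bytes.
using VecReg = std::array<std::uint8_t, 64>;

// Model of a 128-bit move that is ASSUMED to clear every byte above the
// low 16. The result is built in a temporary, so dst may alias src.
inline void move128_zero_upper(VecReg& dst, const VecReg& src) {
    VecReg result{};  // value-initialized: all bytes are zero
    for (std::size_t i = 0; i < 16; ++i) {
        result[i] = src[i];
    }
    dst = result;
}

// Model of the problematic wrapper: it emits nothing when dst and src are
// the same register, so stale upper bytes survive.
inline void move128_elide_self(VecReg& dst, const VecReg& src) {
    if (&dst == &src) {
        return;  // the "optimization" that also drops the zeroing
    }
    move128_zero_upper(dst, src);
}

// Returns true when every byte above the low 16 is zero.
inline bool upper_is_zero(const VecReg& r) {
    for (std::size_t i = 16; i < r.size(); ++i) {
        if (r[i] != 0) {
            return false;
        }
    }
    return true;
}

// Builds a register in which every byte holds the marker value v.
inline VecReg filled(std::uint8_t v) {
    VecReg r;
    r.fill(v);
    return r;
}
```

In this model, calling `move128_elide_self(r, r)` on a register built with `filled(0xAB)` leaves the upper bytes untouched, while `move128_zero_upper(r, r)` clears them and keeps the low 16 bytes.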

I think we should also change movflt [5] to movss, but I don't understand why we have 4-byte vectors.
Aren't the shortest vectors on x86 8 bytes long?

Thanks.
Best regards,
Jie

[1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/x86.ad#L3354
[2] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/x86.ad#L3364
[3] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/macroAssembler_x86.cpp#L2490
[4] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/macroAssembler_x86.cpp#L2515
[5] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/x86.ad#L3379


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8259775: [Vector API] Incorrect code-gen for VectorReinterpret operation

Reviewers

Download

$ git fetch https://git.openjdk.java.net/jdk16 pull/122/head:pull/122
$ git checkout pull/122

@DamonFool
Member Author

@DamonFool DamonFool commented Jan 14, 2021

/issue add JDK-8259775
/test
/label add hotspot-compiler
/cc hotspot-compiler

@bridgekeeper

@bridgekeeper bridgekeeper bot commented Jan 14, 2021

👋 Welcome back jiefu! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr label Jan 14, 2021
@openjdk

@openjdk openjdk bot commented Jan 14, 2021

@DamonFool This issue is referenced in the PR title - it will now be updated.

@openjdk openjdk bot added the hotspot-compiler label Jan 14, 2021
@openjdk

@openjdk openjdk bot commented Jan 14, 2021

@DamonFool
The hotspot-compiler label was successfully added.

@openjdk

@openjdk openjdk bot commented Jan 14, 2021

@DamonFool The hotspot-compiler label was already applied.

@mlbridge

@mlbridge mlbridge bot commented Jan 14, 2021

Webrevs

@DamonFool
Member Author

@DamonFool DamonFool commented Jan 16, 2021

/test

@DamonFool
Member Author

@DamonFool DamonFool commented Jan 20, 2021

Hi all,

The reason for the wrong execution is that the upper bits of the vector registers fail to be zeroed.
This is because movdqu(XMMRegister dst, XMMRegister src) and vmovdqu(XMMRegister dst, XMMRegister src) were incorrectly optimized for the dst == src case after JDK-8223347 (Integration of Vector API, Oct 14 20:02:46 2020).
So this seems to be a regression introduced by JDK-8223347.

The 4-byte vector case is also fixed by using movfltz, since using movss directly is not recommended [1].
A jtreg test has also been added to reproduce this bug on both AVX256 and AVX512 machines.
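
As a rough illustration of what such a test has to check, here is a C++ sketch of a reference result for an expanding reinterpret. This is not the jtreg test from this PR: `expand_reference` and `matches_reference` are hypothetical helpers, and the sketch assumes the expected result is the source bytes followed by zero fill, which is what the zeroed upper bits discussed above imply.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference for an expanding reinterpret: the source bytes land
// in the low part of the result and the remainder is ASSUMED to be zero.
inline std::vector<std::uint8_t> expand_reference(
        const std::vector<std::uint8_t>& src, std::size_t dst_bytes) {
    assert(dst_bytes >= src.size());
    std::vector<std::uint8_t> out(dst_bytes, 0);
    std::copy(src.begin(), src.end(), out.begin());
    return out;
}

// Compares a candidate result with the reference, byte by byte.
// A candidate shorter than the source can never be a valid expansion.
inline bool matches_reference(const std::vector<std::uint8_t>& actual,
                              const std::vector<std::uint8_t>& src) {
    if (actual.size() < src.size()) {
        return false;
    }
    return actual == expand_reference(src, actual.size());
}
```

A check along these lines would run once with a 32-byte result and once with a 64-byte result, mirroring the AVX256 and AVX512 cases. A result whose upper bytes still hold stale data fails the comparison.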

Could you please review it?

Thanks.
Best regards,
Jie

[1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/macroAssembler_x86.hpp#L1048

@openjdk

@openjdk openjdk bot commented Jan 20, 2021

@DamonFool This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8259775: [Vector API] Incorrect code-gen for VectorReinterpret operation

Reviewed-by: rbackman, neliasso, kvn

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 12 new commits pushed to the master branch:

  • ede1bea: 8227695: assert(pss->trim_ticks().seconds() == 0.0) failed: Unexpected partial trimming during evacuation
  • 62eab50: 8255199: Catching a few NumberFormatExceptions in xmldsig
  • a5367cb: 8247619: Improve Direct Buffering of Characters
  • 0408b23: 8259757: add a regression test for 8259353 and 8259601
  • 0120510: 8259732: JDK 16 L10n resource file update - msg drop 10
  • f7b96d3: 8259796: timed CompletableFuture.get may swallow InterruptedException
  • bb0821e: 8258643: [TESTBUG] javax/swing/JComponent/7154030/bug7154030.java failed with "Exception: Failed to hide opaque button"
  • cd25bf2: 8259574: SIGSEGV in BFSClosure::closure_impl
  • d5ca3b3: 8259641: C2: assert(early->dominates(LCA)) failed: early is high enough
  • e85892b: 8258396: SIGILL in jdk.jfr.internal.PlatformRecorder.rotateDisk()
  • ... and 2 more: https://git.openjdk.java.net/jdk16/compare/5926d75fa1a96e001c4167a4352e55024644adad...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready label Jan 20, 2021

@neliasso neliasso left a comment

Good to know that the code was introduced in 16 so that no regression is introduced.

Approved.


@vnkozlov vnkozlov left a comment

Don't forget to request approval for JDK 16 fix integration:
http://openjdk.java.net/jeps/3#Fix-Request-Process

@@ -168,6 +168,9 @@ class MacroAssembler: public Assembler {
void movflt(XMMRegister dst, AddressLiteral src);
void movflt(Address dst, XMMRegister src) { movss(dst, src); }

// Move with zero extension
void movfltz(XMMRegister dst, XMMRegister src) { movss(dst, src); }

Seems movdbl(XMMRegister dst, XMMRegister src) has the same issue.

Member Author

@DamonFool DamonFool Jan 20, 2021

Seems movdbl(XMMRegister dst, XMMRegister src) has the same issue.

Good catch.
I will try to make a reproducer and fix it in another PR, since VectorReinterpret doesn't use it.
Thanks.

@DamonFool
Member Author

@DamonFool DamonFool commented Jan 21, 2021

Thanks @rickard, @neliasso and @vnkozlov for your review and comments.

@vnkozlov, jdk16-fix-request has been added in JBS.
Thanks.

@DamonFool
Member Author

@DamonFool DamonFool commented Jan 21, 2021

I will integrate it later, since the jdk16-fix-request will be approved after the PR is finished.
Thanks.


@vnkozlov vnkozlov left a comment

I approved this fix for JDK 16.
I misread your comment and thought you would also fix movdbl() here.
I am fine with fixing it in a separate PR.

@DamonFool
Member Author

@DamonFool DamonFool commented Jan 22, 2021

/integrate

@openjdk openjdk bot closed this Jan 22, 2021
@openjdk openjdk bot added the integrated label Jan 22, 2021
@openjdk openjdk bot removed ready rfr labels Jan 22, 2021
@openjdk

@openjdk openjdk bot commented Jan 22, 2021

@DamonFool Since your change was applied there have been 12 commits pushed to the master branch:

  • ede1bea: 8227695: assert(pss->trim_ticks().seconds() == 0.0) failed: Unexpected partial trimming during evacuation
  • 62eab50: 8255199: Catching a few NumberFormatExceptions in xmldsig
  • a5367cb: 8247619: Improve Direct Buffering of Characters
  • 0408b23: 8259757: add a regression test for 8259353 and 8259601
  • 0120510: 8259732: JDK 16 L10n resource file update - msg drop 10
  • f7b96d3: 8259796: timed CompletableFuture.get may swallow InterruptedException
  • bb0821e: 8258643: [TESTBUG] javax/swing/JComponent/7154030/bug7154030.java failed with "Exception: Failed to hide opaque button"
  • cd25bf2: 8259574: SIGSEGV in BFSClosure::closure_impl
  • d5ca3b3: 8259641: C2: assert(early->dominates(LCA)) failed: early is high enough
  • e85892b: 8258396: SIGILL in jdk.jfr.internal.PlatformRecorder.rotateDisk()
  • ... and 2 more: https://git.openjdk.java.net/jdk16/compare/5926d75fa1a96e001c4167a4352e55024644adad...master

Your commit was automatically rebased without conflicts.

Pushed as commit d90e06a.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@DamonFool DamonFool deleted the JDK-8259775 branch Jan 22, 2021
Labels
hotspot-compiler integrated
4 participants