This repository has been archived by the owner. It is now read-only.

8258703: Incorrect 512-bit vector registers restore on x86_32 #64

Closed
wants to merge 1 commit into from

@DamonFool DamonFool commented Dec 23, 2020

Hi all,

The following tests fail on our AVX512 machines with x86_32:

  • compiler/runtime/Test7196199.java
  • compiler/runtime/safepoints/TestRegisterRestoring.java
  • compiler/vectorization/TestVectorsNotSavedAtSafepoint.java

The reason is that 512-bit registers (zmm0 ~ zmm7) are restored incorrectly.

The current restore logic for 512-bit registers has two steps:

  1. restore zmm[511..256] [1]
  2. restore zmm[255..128] [2] <-- Wrong on AVX512 with avx512vl

On our AVX512 machine, Assembler::vinsertf128 [3] is called in step 2.
According to the Intel instruction set reference, the VEX-encoded vinsertf128 copies only the lower 256 bits of its source register into the destination, so the upper half of zmm (bits 511..256) restored in step 1 is lost.

   VINSERTF128 (VEX encoded version)
   TEMP[255:0] <- SRC1[255:0]
   CASE (imm8[0]) OF
   0: TEMP[127:0]   <- SRC2[127:0]
   1: TEMP[255:128] <- SRC2[127:0]
   ESAC
   DEST <- TEMP

The fix simply reverses the order of the restore steps for 512-bit registers:

  1. restore zmm[255..128]
  2. restore zmm[511..256]
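
To make the ordering issue concrete, here is a minimal C++ sketch that models a 512-bit register as four 128-bit lanes and replays both restore orders. It is not HotSpot code. The names `Zmm`, `vex_vinsertf128_high`, `insert_upper_256`, `restore_old_order` and `restore_fixed_order` are hypothetical, and the model assumes that the VEX-encoded VINSERTF128 clears destination bits 511..256, which is how the Intel manual describes VEX-encoded instructions in general.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Model: a 512-bit register as four 128-bit lanes, each reduced to one tag
// value. Lane 0 is bits 127..0, lane 3 is bits 511..384.
using Zmm = std::array<std::uint32_t, 4>;

// Hypothetical model of VEX-encoded VINSERTF128 with imm8 = 1 and dst as src1.
// Assumption: the VEX encoding clears destination bits 511..256.
inline void vex_vinsertf128_high(Zmm& dst, std::uint32_t mem128) {
    dst[1] = mem128;  // TEMP[255:128] <- SRC2[127:0]
    dst[2] = 0;       // bits above 255 are not carried over from dst
    dst[3] = 0;
}

// Hypothetical model of the step that restores zmm[511..256]; it leaves
// bits 255..0 untouched.
inline void insert_upper_256(Zmm& dst, std::uint32_t lane2, std::uint32_t lane3) {
    dst[2] = lane2;
    dst[3] = lane3;
}

// Old order: zmm[511..256] first, then zmm[255..128].
inline Zmm restore_old_order(const Zmm& saved) {
    Zmm reg{saved[0], 0xDEAD, 0xDEAD, 0xDEAD};  // low 128 bits already restored
    insert_upper_256(reg, saved[2], saved[3]);
    vex_vinsertf128_high(reg, saved[1]);        // wipes what was just restored
    return reg;
}

// Fixed order: zmm[255..128] first, then zmm[511..256].
inline Zmm restore_fixed_order(const Zmm& saved) {
    Zmm reg{saved[0], 0xDEAD, 0xDEAD, 0xDEAD};  // low 128 bits already restored
    vex_vinsertf128_high(reg, saved[1]);
    insert_upper_256(reg, saved[2], saved[3]);
    return reg;
}

// Checks both orders against one saved register image.
inline void self_check() {
    const Zmm saved{0xA0, 0xA1, 0xA2, 0xA3};
    assert((restore_old_order(saved) == Zmm{0xA0, 0xA1, 0, 0}));
    assert(restore_fixed_order(saved) == saved);
}
```

Calling `self_check()` from any `main` exercises both paths: under the stated assumption, the old order loses lanes 2 and 3, while the fixed order returns the saved image unchanged.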

Thanks.
Best regards,
Jie

[1] https://github.com/openjdk/jdk16/blob/master/src/hotspot/cpu/x86/sharedRuntime_x86_32.cpp#L320
[2] https://github.com/openjdk/jdk16/blob/master/src/hotspot/cpu/x86/sharedRuntime_x86_32.cpp#L326
[3] https://github.com/openjdk/jdk16/blob/master/src/hotspot/cpu/x86/macroAssembler_x86.hpp#L1463


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8258703: Incorrect 512-bit vector registers restore on x86_32

Reviewers

Download

$ git fetch https://git.openjdk.java.net/jdk16 pull/64/head:pull/64
$ git checkout pull/64

@bridgekeeper bridgekeeper bot commented Dec 23, 2020

👋 Welcome back jiefu! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@DamonFool DamonFool commented Dec 23, 2020

/issue add JDK-8258703
/test
/label add hotspot-compiler
/cc hotspot-compiler

@openjdk openjdk bot added the rfr label Dec 23, 2020

@openjdk openjdk bot commented Dec 23, 2020

@DamonFool This issue is referenced in the PR title - it will now be updated.

@openjdk openjdk bot commented Dec 23, 2020

@DamonFool
The hotspot-compiler label was successfully added.

@openjdk openjdk bot commented Dec 23, 2020

@DamonFool The hotspot-compiler label was already applied.

@mlbridge mlbridge bot commented Dec 23, 2020

Webrevs

@vnkozlov vnkozlov commented Dec 23, 2020

Someone from Intel should review this. @jatin-bhateja or @sviswa7, please.

}
// Restore upper half of YMM registers.
for (int n = 0; n < num_xmm_regs; n++) {
  __ vinsertf128_high(as_XMMRegister(n), Address(rsp, n*16));

dst register is used as src1:
https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/macroAssembler_x86.hpp#L1459
I would assume that it copies all bits from dst first, before inserting the 128 bits from src2, which is the stack in this case.

@DamonFool DamonFool Dec 23, 2020

The instruction reference says only src1[255..0] is copied.
http://ftp.neutrino.es/x86InstructionSet/VINSERTF128.html

So I would assume dst[511..256] will be unpredictable after VINSERTF128 is executed.

@sviswa7 sviswa7 commented Dec 28, 2020

@DamonFool @vnkozlov The patch looks good to me.

@DamonFool DamonFool commented Dec 29, 2020

> @DamonFool @vnkozlov The patch looks good to me.

Thanks @sviswa7 for your review.

/reviewer credit @sviswa7

@openjdk openjdk bot commented Dec 29, 2020

@DamonFool
Reviewer sviswanathan successfully credited.

@vnkozlov vnkozlov left a comment

Good.

@openjdk openjdk bot commented Jan 4, 2021

@DamonFool This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8258703: Incorrect 512-bit vector registers restore on x86_32

Reviewed-by: kvn, sviswanathan

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 9 new commits pushed to the master branch:

  • 73f5415: 8258955: (bf) slice(int, int) on view buffers fails to adjust index according to primitive size
  • 881bceb: 8258662: JDK 17ea: Crash compiling instanceof check involving sealed interface
  • fb607f1: 8245922: [macos] Taskbar.Feature.ICON_BADGE_NUMBER no longer supported on MacOS
  • 3f67afd: 8251377: [macos11] JTabbedPane selected tab text is barely legible
  • e2aa724: 8258941: Test specify the Classpath exception in the header
  • c398a82: 8258916: javac/doclint reports broken HTML on multiline mailto links
  • 23b83c5: 8253954: javac crash when compiling code with enhanced switch expressions with option -Xjcov
  • 8b37c2c: 8257468: runtime/whitebox/TestWBDeflateIdleMonitors.java fails with Monitor should be deflated.: expected true to equal false
  • 9cd8e38: 8257521: runtime/logging/MonitorInflationTest.java crashed in MonitorList::unlink_deflated

Please see this link for an up-to-date comparison between the source branch of this pull request and the master branch.
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready label Jan 4, 2021

@DamonFool DamonFool commented Jan 4, 2021

Thanks @vnkozlov for your review.
/integrate

@openjdk openjdk bot closed this Jan 4, 2021
@openjdk openjdk bot added integrated and removed ready rfr labels Jan 4, 2021

@openjdk openjdk bot commented Jan 4, 2021

@DamonFool Since your change was applied there have been 9 commits pushed to the master branch:

  • 73f5415: 8258955: (bf) slice(int, int) on view buffers fails to adjust index according to primitive size
  • 881bceb: 8258662: JDK 17ea: Crash compiling instanceof check involving sealed interface
  • fb607f1: 8245922: [macos] Taskbar.Feature.ICON_BADGE_NUMBER no longer supported on MacOS
  • 3f67afd: 8251377: [macos11] JTabbedPane selected tab text is barely legible
  • e2aa724: 8258941: Test specify the Classpath exception in the header
  • c398a82: 8258916: javac/doclint reports broken HTML on multiline mailto links
  • 23b83c5: 8253954: javac crash when compiling code with enhanced switch expressions with option -Xjcov
  • 8b37c2c: 8257468: runtime/whitebox/TestWBDeflateIdleMonitors.java fails with Monitor should be deflated.: expected true to equal false
  • 9cd8e38: 8257521: runtime/logging/MonitorInflationTest.java crashed in MonitorList::unlink_deflated

Your commit was automatically rebased without conflicts.

Pushed as commit 216c2ec.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@DamonFool DamonFool deleted the JDK-8258703 branch Jan 4, 2021