-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
8320206: Some intrinsics/stubs missing vzeroupper on x86_64 #16678
Conversation
👋 Welcome back sviswanathan! A progress list of the required criteria for merging this PR into |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sviswa7 can you remind me where SSE instructions are used which need vzeroupper
: intrinsics or code which calls it? I thought we replaced almost all places to use new instructions even for scalar operations in code. And almost all intrinsics have AVX variant.
So which transition we are guarding with vzeroupper
?
@vnkozlov It is the jitted code to jvm transitions (GC etc) that we are trying to guard. The JVM itself is compiled with base SSE2 as target for 64 bit platforms. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for fixing this @sviswa7, LGTM.
@sviswa7 This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 31 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
Okay. Then if intrinsic stub is called only from compiled code you don't need Arraycopy stubs could be used by VM's runtime, I think. At least they are called from I may forgot something about intrinsics. Why we need There was actually the issue with There are a lot of places in VM currently where |
@vnkozlov This PR is not deviating from that scheme. We settled on this scheme based during our discussion on JDK-8178811 and its followup JDK-8190934. The discussion thread for prior is at https://mail.openjdk.org/pipermail/hotspot-compiler-dev/2017-April/026049.html. As part of JDK-8190934 we restricted vzeroupper generation in the epilog of c2 jitted method from always to only when larger vectors are used in the method. This had resolved any over generation of vzeroupper. |
I have no question about generating vzeroupper at the end of JITed code because it could be called from Interpreter. |
I understand that originally it was good decision to generate vzeroupper in all places where we use avx512 wide instructions (C2 generated SSE instructions at that time). May be it is time to check if C2 still generate SSE now. If not we should fix it so that we can remove vzeroupper in intrinsic stubs used by C2 only. |
Also, as @cl4es noticed recently, There's an attempt to detect when @sviswa7 How expensive |
Okay. I think I understand what you said. We need vzeroupper in JITed code (because called from Interpreter) if it or intrinsics it calls use avx512 instructions. We don't generate vzeroupper in JITed code epilog if the code does not uses avx512. But intrinsic stub it calls may use avx512 so we delegate vzeroupper generation to stub. Saying that I accept your changes and will test it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My testing tier1-4,xcomp,stress for version v00 passed.
@sviswa7 please, file followup RFEs based on my and Vladimir's Ivanov comments. |
@vnkozlov @jatin-bhateja @iwanowww Thanks a lot for the reviews. I have filed a RFE (https://bugs.openjdk.org/browse/JDK-8320346). |
vzeroupper is not an expensive instruction but any optimizations is always helpful. |
/integrate |
Going to push as commit 9b372e2.
Your commit was automatically rebased without conflicts. |
/backport jdk21u |
@sviswa7 the backport was successfully created on the branch sviswa7-backport-9b372e28 in my personal fork of openjdk/jdk21u. To create a pull request with this backport targeting openjdk/jdk21u:master, just click the following link: The title of the pull request is automatically filled in correctly and below you find a suggestion for the pull request body:
If you need to update the source branch of the pull then run the following commands in a local clone of your personal fork of openjdk/jdk21u:
|
/backport jdk21u-dev |
@sviswa7 the backport was successfully created on the branch backport-sviswa7-9b372e28 in my personal fork of openjdk/jdk21u-dev. To create a pull request with this backport targeting openjdk/jdk21u-dev:master, just click the following link: The title of the pull request is automatically filled in correctly and below you find a suggestion for the pull request body:
If you need to update the source branch of the pull then run the following commands in a local clone of your personal fork of openjdk/jdk21u-dev:
|
The following intrinsics/stubs are missing vzeroupper:
adler32 (since JDK17)
count_positives (since JDK 9)
chacha20 (since JDK 20)
string indexOfChar (since JDK 9)
Adding the missing vzeroupper to avoid AVX-SSE transition penalties.
Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/16678/head:pull/16678
$ git checkout pull/16678
Update a local copy of the PR:
$ git checkout pull/16678
$ git pull https://git.openjdk.org/jdk.git pull/16678/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 16678
View PR using the GUI difftool:
$ git pr show -t 16678
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/16678.diff
Webrev
Link to Webrev Comment