
8261649: AArch64: Optimize LSE atomics in C++ code #2612

Closed
wants to merge 1 commit into from

Conversation

theRealAph
Contributor

@theRealAph theRealAph commented Feb 17, 2021

Now that we have support for LSE atomics in C++ HotSpot source, we can generate much better code for them. In particular, the sequence we generate for CMPXCHG with a full two-way barrier using two DMBs is way suboptimal.

Barrier-ordered-before, Arm Architecture Reference Manual B2.3:

| Barrier instructions order prior Memory effects before subsequent
| Memory effects generated by the same Observer. A read or a write RW1
| is Barrier-ordered-before a read or a write RW2 from the same Observer
| if and only if RW1 appears in program order before RW2 and any of the
| following cases apply:
|
| [...]
|
| * RW1 appears in program order before an atomic instruction with both
| Acquire and Release semantics that appears in program order before RW2.

So a prior load or store cannot be reordered with the load of an atomic swap with Acquire and Release semantics. This barrier-ordered-before relation, in combination with sequential consistency, gives us most of what we need for a full barrier. However, we still need a DMB after the cmpxchg to ensure that subsequent loads and stores cannot be reordered with the store in the atomic instruction.
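As a rough illustration of that shape (one acquire-release atomic followed by a single trailing barrier), here is a portable C++ sketch using `std::atomic`. It is not the HotSpot stub code: the helper name `cmpxchg_full_barrier` is hypothetical, and the instruction mapping mentioned in the comments (CASAL for the compare-exchange, DMB for the fence) is what compilers typically emit when targeting AArch64 with LSE, not something this code guarantees.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper, for illustration only (not HotSpot's Atomic::cmpxchg).
// Returns the value observed at *dest, whether or not the exchange happened.
inline uint64_t cmpxchg_full_barrier(std::atomic<uint64_t>* dest,
                                     uint64_t compare_value,
                                     uint64_t exchange_value) {
  assert(dest != nullptr);
  uint64_t observed = compare_value;
  // Acquire+release compare-and-swap. Assumption: when targeting AArch64
  // with LSE, compilers typically lower this to a single CASAL, which
  // orders prior loads and stores before it.
  dest->compare_exchange_strong(observed, exchange_value,
                                std::memory_order_acq_rel,
                                std::memory_order_acquire);
  // Trailing fence so that later loads and stores are not reordered with
  // the store of the atomic. Assumption: typically a DMB on AArch64.
  std::atomic_thread_fence(std::memory_order_seq_cst);
  return observed;
}
```

The point of the sketch is that only one explicit fence remains, after the atomic, instead of one on each side.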


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8261649: AArch64: Optimize LSE atomics in C++ code

Download

$ git fetch https://git.openjdk.java.net/jdk pull/2612/head:pull/2612
$ git checkout pull/2612

@bridgekeeper

bridgekeeper bot commented Feb 17, 2021

👋 Welcome back aph! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Feb 17, 2021

@theRealAph The following label will be automatically applied to this pull request:

  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot hotspot-dev@openjdk.org label Feb 17, 2021
@theRealAph theRealAph changed the title from "Now that we have support for LSE atomics in C++ HotSpot source, we can generate much better code for them. In particular, the sequence we generate for CMPXCHG with a full two-way barrier using two DMBs is way suboptimal." to "8261649: AArch64: Optimize LSE atomics in C++ code" Feb 17, 2021
@openjdk openjdk bot added the rfr Pull request is ready for review label Feb 17, 2021
@mlbridge

mlbridge bot commented Feb 17, 2021

Webrevs

@theRealAph
Contributor Author

This patch:

  • Moves memory barriers from the atomic_linux_aarch64 file into the stubs.
  • Rewrites the LSE versions of the stubs to be more efficient.
  • Fixes a race condition in stub generation.
  • Mostly leaves the pre-LSE stubs alone, except that I added a PRFM which, according to kernel engineers, improves performance.
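To give a feel for the last item, here is a loose C++ sketch of a retry-loop atomic add preceded by a write-prefetch hint. It only stands in for the idea: the real pre-LSE stubs are assembly, the name `fetch_add_with_prefetch` is hypothetical, and `__builtin_prefetch` is a GCC/Clang builtin whose mapping to a PRFM instruction on AArch64 is an assumption about the compiler, not something taken from the patch.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for a pre-LSE fetch-and-add stub, for illustration.
// Returns the value held at *dest before the addition.
inline uint64_t fetch_add_with_prefetch(std::atomic<uint64_t>* dest,
                                        uint64_t add_value) {
  // Hint that this location is about to be written (rw = 1) with low
  // temporal locality (locality = 0). Assumption: GCC and Clang emit a
  // PRFM store-prefetch for this on AArch64; elsewhere it may be a no-op.
  __builtin_prefetch(dest, 1, 0);
  uint64_t old_value = dest->load(std::memory_order_relaxed);
  // Retry loop, loosely mirroring a load-exclusive/store-exclusive sequence.
  // On failure, compare_exchange_weak refreshes old_value with the current
  // contents, so the next attempt uses up-to-date data.
  while (!dest->compare_exchange_weak(old_value, old_value + add_value,
                                      std::memory_order_seq_cst,
                                      std::memory_order_relaxed)) {
  }
  return old_value;
}
```

The prefetch is purely a performance hint, so removing it does not change the result of the function.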

@theRealAph
Contributor Author

Closing because this is a duplicate.

@theRealAph theRealAph closed this Feb 18, 2021