Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Invalid optimisation: atomicrmw xchg turned into atomic store #60418

Closed
RalfJung opened this issue Jan 31, 2023 · 2 comments
Closed

Invalid optimisation: atomicrmw xchg turned into atomic store #60418

RalfJung opened this issue Jan 31, 2023 · 2 comments

Comments

@RalfJung
Copy link
Contributor

This is a sibling issue to #56450: LLVM will sometimes replace atomicrmw xchg by atomic store, specifically when the ordering is relaxed or release, and the result of the xchg is unused.

However, replacing an RMW by a non-RMW operation like is not sound under the C++ memory model (and LLVM does not seem to implement a fundamentally different model in that regard). Here's a counterexample using pseudocode notation:

Y:=1 rlx
fence(rel)
SWAP(X,0,rlx)
fence(acq)
a := Z rlx
||
Z:=1 rlx
fence(rel)
SWAP(X,0,rlx)
fence(acq)
b := Y rlx

One of the two SWAP has to read-from the other. This is because two RMWs cannot both read-from the same store (the implicit initial store of 0). Without loss of generality, let's assume the SWAP in the 2nd thread reads-from the first. Therefore we now have a synchronization edge from the fence(rel) in the first thread to the fence(acq) in the 2nd thread. Therefore b must be 1. (In the symmetric case, a must be 1.)

After the optimization, it is possible for both a and b to be 0. This means the optimization introduced new program behavior, and is therefore wrong.

(Thanks to Ori Lahav and Jeehoon Kang.)

@RalfJung
Copy link
Contributor Author

This discussion has another counterexample

@nikic
Copy link
Contributor

nikic commented Jan 31, 2023

Candidate patch: https://reviews.llvm.org/D142097

CarlosAlbertoEnciso pushed a commit to SNSystems/llvm-debuginfo-analyzer that referenced this issue Feb 1, 2023
Following the discussion from https://reviews.llvm.org/D141277 and in
particular Ralf Jung's comment at
https://reviews.llvm.org/D141277#inline-1365148, replacing an unused `atomicrmw
xchg` into an `atomic store` is illegal even for release ordering.

Quoting Connor Horman from the rust lang discussion linked in that comment:
"An acquire operation A only synchronizes-with a release operation R if it
takes its value from R, or any store in the release sequence headed by R, which
is R, followed by the longest continuous sequence of read-modify-write
operations.
A regular store following R in the modification order would break the release
sequence, and if an acquire operation reads that store or something later, then
it loses any synchronization it might have already had."

This fixes llvm/llvm-project#60418

Differential Revision: https://reviews.llvm.org/D142097
skatrak pushed a commit to skatrak/llvm-project-rocm that referenced this issue Feb 10, 2023
Following the discussion from https://reviews.llvm.org/D141277 and in
particular Ralf Jung's comment at
https://reviews.llvm.org/D141277#inline-1365148, replacing an unused `atomicrmw
xchg` into an `atomic store` is illegal even for release ordering.

Quoting Connor Horman from the rust lang discussion linked in that comment:
"An acquire operation A only synchronizes-with a release operation R if it
takes its value from R, or any store in the release sequence headed by R, which
is R, followed by the longest continuous sequence of read-modify-write
operations.
A regular store following R in the modification order would break the release
sequence, and if an acquire operation reads that store or something later, then
it loses any synchronization it might have already had."

This fixes llvm/llvm-project#60418

Differential Revision: https://reviews.llvm.org/D142097
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants