Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use acquire-release memory order for mutex::try_lock #274

Merged

Conversation

louiswilliams
Copy link
Contributor

@louiswilliams louiswilliams commented Nov 23, 2023

Mutex locking needs to have memory-order acquire semantics. Imagine two threads incrementing a counter, protected by a mutex:

Preconditions:

  • Mutex atomic owner = nullptr
  • Non-atomic variable X = 0

Each thread:

  1. mutex::try_lock successfully performs owner.compare_exchange_strong(self, memory_order_release). This only guarantees that any preceding reads and writes are not reordered after this one.
  2. X = X + 1. This is not an atomic variable and therefore depends on the memory ordering of owner to synchronize reads and writes.
  3. mutex::unlock calls owner.store(nullptr) which has seq ordering. This guarantees no preceding or subsequent reads or writes will be reordered.

But because the CAS uses release memory ordering, other threads can reorder 2 before 1 and 3.

This ordering is therefore possible:

Thread 1 Thread 2
2. X=X+1, X=1
2. X=X+1, X=1
1. lock
3. unlock
1. lock
3. unlock

Both threads read X as 0 and both set X to 1. The end state is that X=1, but it should be 2.

Using acquire semantics for the try_lock guarantees that 2 can only happen after 1 and before 3.

@louiswilliams louiswilliams marked this pull request as ready for review November 23, 2023 04:30
@louiswilliams louiswilliams marked this pull request as draft November 23, 2023 04:37
@beef9999
Copy link
Collaborator

beef9999 commented Nov 23, 2023

I can recall the old version gcc has immature support for the memory_order_acq_rel . Don't know what's going on today...

@louiswilliams
Copy link
Contributor Author

x86 does not have this problem. Only ARM supports the acquire, release, and relaxed memory semantics. With this change, I am able to fix a bug in my workload.

@louiswilliams louiswilliams marked this pull request as ready for review November 23, 2023 05:09
@beef9999
Copy link
Collaborator

Is the change compatible to x86?
Can you merge this change to release/0.7?

@louiswilliams louiswilliams changed the base branch from main to release/0.7 November 23, 2023 05:16
@louiswilliams louiswilliams changed the base branch from release/0.7 to main November 23, 2023 05:17
@louiswilliams louiswilliams changed the base branch from main to release/0.7 November 23, 2023 05:21
@louiswilliams
Copy link
Contributor Author

This has no effect on x86. Updated to merge into release/0.7

@lihuiba
Copy link
Collaborator

lihuiba commented Nov 23, 2023

Is the change compatible to x86?

Yes

@lihuiba lihuiba merged commit 9c64050 into alibaba:release/0.7 Nov 23, 2023
7 checks passed
beef9999 pushed a commit that referenced this pull request Nov 23, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants