-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
A lock-free bounded queue #83
base: main
Are you sure you want to change the base?
Conversation
3bee128
to
79f3692
Compare
79f3692
to
d6bc981
Compare
d6bc981
to
98804fa
Compare
src/cue.ml
Outdated
| Null -> None | ||
| Node r as node -> | ||
let value = r.value in | ||
if Atomic.get t.head != node then peek_opt t else Some value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if Atomic.get t.head != node then ...
There is a small bug here, it should be :
if Atomic.get t.head != head then ...
Same in peek
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is still some bugs between peek_opt
and pop_opt
.
|
Try_push 42 : true
|
.------------------------------------.
| |
Peek_opt : Some (0) Pop_opt : Some (42)
I will continue investigating !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! Yes, this is a bug. What would ideally be needed here is a(n acquire) fence to prevent the r.value
read from being reordered after the Atomic.get
, but OCaml currently doesn't provide a direct way to do only that. Some alternatives:
value
could be turned into an atomic and thenr.value
would be replaced withAtomic.get r.value
. This would be quite expensive.- Instead of
Atomic.get t.head == head
one could useAtomic.compare_and_set t.head head head
. This is also somewhat expensive, because it means that thet.head
location is actually written to, which causes additional contention. - In practise, but not in the OCaml memory model, which doesn't match actual hardware, it is also possible to generate fences by accessing an unrelated location. For example, one could define
and use that to write
let fence () = Atomic.set (Atomic.make 0) 0
This is technically not guaranteed by the OCaml memory model. However, I'm not aware of any hardware where that would not work, because hardware doesn't track fences (or "frontiers") per memory location like the OCaml memory model does (my understanding of the physics is that it would be too expensive / impractical / wouldn't make any sense to do that at the hardware level).let value = r.value in fence (); if Atomic.get t.head != head then ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fence
trick seems to work !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may actually be a compiler bug.
The OCaml 5 compiler definitely has a bug related to an incorrect optimization of Atomic.get
operations.
What remains to consider is whether it is actually allowed to reorder a nonatomic load that precedes an atomic load to happen after the atomic load. I suspect it is not actually allowed. See the explanation of fences in this post about fences in C++11:
An acquire fence prevents the memory reordering of any read which precedes it in program order with any read or write which follows it in program order.
This, of course, is not exactly the same as the semantics of Atomic.get
in OCaml 5, but according to the above explanation the original code
let value = r.value in
(* The above read should not be allowed to happen after
the read below as the read below has an acquire fence. *)
if Atomic.get t.head != head then ...
would be correct. I'll try to redecipher the Bounding Data Races in Space and Time paper to find definite word on this.
I actually checked with Godbolt that, indeed, in this queue case, what happens is that the compiler optimizes the second Atomic.get t.head
away and that is what causes the code to misbehave.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming this is just a compiler bug, then the following workaround would be sufficient:
let value = r.value in
(* The [Sys.opaque_identity] below prevents OCaml 5 from
optimizing the repeated load away. *)
if Atomic.get (Sys.opaque_identity t.head) != head then ...
d4b7e51
to
e548e50
Compare
0084799
to
2aae21c
Compare
261eabb
to
928660b
Compare
12a77c7
to
46c2cae
Compare
b47847c
to
3749cbf
Compare
3749cbf
to
5e7d637
Compare
FYI, I will squash this PR to a single commit to make this PR a little bit easier to work with. |
734c367
to
47cdeb1
Compare
47cdeb1
to
4accc07
Compare
4accc07
to
3f75aea
Compare
There was a trivial bug using a wrong variable. Additionally, it turns out the OCaml 5 compiler incorrectly optimized away a repeated `Atomic.get`. This commit works around that compiler bug by using `Sys.opaque_identity` to provent the compiler from figuring out that the location has been read before.
The latest version of DLA includes bug fixes.
3f75aea
to
5565737
Compare
This implements a lock-free bounded queue — blocking using domain-local-await.
The data structure / algorithm is based on the Michael-Scott queue extended with length maintenance and caching of remaining capacity in such a way that additional contention is avoided in the happy paths.
TODO:
try_push
,length
(can be done inO(1)
),is_empty