Skip to content

CSE of non-atomic load around atomic load breaks SC #12825

@polytypic

Description

@polytypic

Here is an example (see also in godbolt):

let cse_breaks_sc non_atomic atomic =
  let before = !non_atomic in
  (* happens before *)
  let fence = Atomic.get atomic in
  (* happens before *)
  let after = !non_atomic in
  (* The above non_atomic read is optimized away, which breaks sequential consistency. *)
  before - after + fence

What can happen is that another domain executes something like:

non_atomic := also_new_value;
(* happens before *)
Atomic.set atomic new_value;

and due to the CSE on the second load of non_atomic the program can give results that are not consistent with the memory model.

Specifically, the Atomic.get atomic gives the new_value (which means the Atomic.set happens before the Atomic.get), but then the second read of the non_atomic does not return the also_new_value, because it has been optimized away.

The arithmetic at the end is just so that the compiler doesn't just optimize out all the reads. Also, the example is minimized. (In practical use cases there is typically more work between the reads of the non_atomic. The first read is done to see if it makes sense to try to do that work (which also includes the atomic read, which is typically more expensive (i.e. it is reading a contested memory location)) and the second read (after the atomic read) is to ensure that in fact no other domain running in parallel has completed the work before we did.)

I checked that this also happens with 5.2.0+trunk switch freshly created with opam.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions