Questionable race reported by miri and TSAN of write vs drop #2777

talchas · 2023-02-05T06:01:49Z

This is google/sanitizers#602, which was a questionable for TSAN call at the time, made even more questionable by atomic_ref. (AKA "atomic_ref is poorly specified")

A rust version is

use std::thread;
use std::sync::atomic::{AtomicU32, Ordering};
#[derive(Clone, Copy)]
struct Sendable<T>(T);
unsafe impl<T> Send for Sendable<T> {}

fn main() {
    thread::scope(move |s| {
        let x = Sendable(Box::into_raw(Box::new(AtomicU32::new(1))));
        s.spawn(move || {
            let x = x;
            thread::sleep_ms(5);
            unsafe {
                (*x.0).store(0, Ordering::Relaxed);
            }
        });
        while unsafe { (*x.0).load(Ordering::Relaxed) } == 1 {
        }
        unsafe { drop(Box::from_raw(x.0)); }
    });
}

Has miri and TSAN report a race of the the store against the drop, despite the fact that coherence says any atomic write would have to be ordered after the store. The issue is that drop/dealloc/free are not atomic writes, and the C/C++ spec is basically completely silent for how object destruction or atomic_ref work.

(Feel free to close this as another piece of "C++ atomics are broken, revisit if/when they get fixed")

The text was updated successfully, but these errors were encountered:

RalfJung · 2023-02-05T17:05:47Z

We consider deallocation to be basically a non-atomic write. Do you not think that is the most sensible choice? It is certainly not an atomic operation, and it is destructive, just like writes are (and even more so).

Or would you consider this just as questionable if we replace the drop by a non-atomic write?

Relaxed operations do not create any synchronization, that is kind of their point. So the notion of "after" does not really apply to relaxed atomics in a sensible way. This is easier to see when considering a non-atomic read: if we do one of those after the loop, the compiler would be allowed to move it up, since reordering two reads is fine and relaxed reads don't synchronize. (In contrast, reordering acquire reads is not fine.)

coherence is one of many relevant orders, but two operations being coherence ordered doesn't mean they are "ordered". The union of the various orders (coherence, reads-from, program order, synchronize-with, happens-before, ...) is allowed to have cycles after all. Crucially, coherence is not part of happens-before (and I wouldn't be surprised if coherence union happens-before would be allowed to have cycles). So the next atomic write would be coherence-ordered after the store, but that doesn't mean it happens after the store.

So, I don't even see anything broken here, at least not more than expected ;) . C++ relaxed atomics are certainly very surprising in their behavior, but if you want to fix that you have to go talk to hardware engineers (the people writing the ARM ISA spec, specifically)...

talchas · 2023-02-05T19:15:56Z

Deallocation should be a non-atomic write, yes; I consider the equivalent using AtomicN::from_mut or C++ atomic_ref to be a similar example of a spec bug (luckily rust from_mut is not unsound, because scoped threads will force you to provide an external synchronization edge).

My point is that if drop was a relaxed atomic write it would be guaranteed to occur later in the modification order, and that I think the spec should guarantee that this sort of non-atomic write does not race unless there is actually hardware where this would be a problem (particularly with atomic_ref being introduced to make this more plausibly noticeable). ARM is not such hardware - relaxed stores compile to the same code as normal stores. Compiler optimizations are the only thing in question, and once more seem questionable at best (and forcing unnecessary acquires is an actual performance cost on weak memory platforms).

RalfJung · 2023-02-05T20:10:19Z

Data races are defined via the happens-before order, and a relaxed reads-from does not create a happens-before edge -- neither compilers nor hardware permit that part. Maybe data races could be defined in a different way, I am not an expert on that. But it is pretty clear that Miri is implementing the spec correctly here, so I'll close this as "not a Miri bug".

Maybe one day we could have a project for "a better weak memory model for Rust"...

RalfJung closed this as completed Feb 5, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Questionable race reported by miri and TSAN of write vs drop #2777

Questionable race reported by miri and TSAN of write vs drop #2777

talchas commented Feb 5, 2023

RalfJung commented Feb 5, 2023 •

edited

Loading

talchas commented Feb 5, 2023

RalfJung commented Feb 5, 2023 •

edited

Loading

Questionable race reported by miri and TSAN of write vs drop #2777

Questionable race reported by miri and TSAN of write vs drop #2777

Comments

talchas commented Feb 5, 2023

RalfJung commented Feb 5, 2023 • edited Loading

talchas commented Feb 5, 2023

RalfJung commented Feb 5, 2023 • edited Loading

RalfJung commented Feb 5, 2023 •

edited

Loading

RalfJung commented Feb 5, 2023 •

edited

Loading