
Single-instruction-read/writes atomics #37

Closed
Ten0 opened this issue Aug 5, 2020 · 10 comments
@Ten0

Ten0 commented Aug 5, 2020

We have Mutex. However, opening a critical section (disabling interrupts) just to read or write a Mutex<Cell<u8>> seems a bit overkill: the read or write is a single instruction, which makes it impossible for an interrupt to hit in the middle.
How about this crate also providing some form of atomics with respect to what the Send and Sync markers mean in our context?

(I believe they would be implemented this way; I'd be very willing to open a PR.)

use core::cell::UnsafeCell;

pub struct SingleCoreAtomicUsize {
	inner: UnsafeCell<usize>,
}

/// Safety:
/// We only provide `get` and `set`, which each take a single instruction, so no interrupt may hit in the middle
unsafe impl Sync for SingleCoreAtomicUsize {}

impl SingleCoreAtomicUsize {
	pub const fn new(val: usize) -> Self {
		Self {
			inner: UnsafeCell::new(val),
		}
	}
	pub fn get(&self) -> usize {
		unsafe { *self.inner.get() }
	}
	pub fn set(&self, val: usize) {
		unsafe { *self.inner.get() = val }
	}
}
@jonas-schievink
Contributor

This is AtomicUsize

@Ten0
Author

Ten0 commented Aug 5, 2020

This is AtomicUsize

But is this available on all single-core platforms, e.g. Cortex M0? I thought it wasn't (https://rust-embedded.github.io/book/concurrency/index.html#atomic-access). Also if it does synchronization across cores, it's not exactly the same thing as I'm proposing: atomics across interrupts on the same core, with no cross-core synchronization overhead.

@jonas-schievink
Contributor

The atomic types are available everywhere (if not, that's a bug in Rust or LLVM), but CAS operations might not be. load and store are always available.

You still need to synchronize access between interrupt handlers and idle. As written, SingleCoreAtomicUsize allows data races between any get and any set, which is undefined behavior.
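To illustrate the load/store-only subset: the following is a minimal sketch, using only the operations that are available even on targets without compare-and-swap support (such as thumbv6m / Cortex-M0). The names `FLAG`, `set_flag`, and `get_flag` are illustrative, not from the thread.

```rust
use core::sync::atomic::{AtomicUsize, Ordering};

// A shared value; `load` and `store` are available on every target
// that has the atomic types at all, even where CAS/RMW operations
// (fetch_add, compare_exchange, ...) are missing.
static FLAG: AtomicUsize = AtomicUsize::new(0);

fn set_flag(val: usize) {
    // Compiles to a plain store on Cortex-M, but the compiler
    // treats it as a synchronizing access, unlike a raw pointer write.
    FLAG.store(val, Ordering::Relaxed);
}

fn get_flag() -> usize {
    FLAG.load(Ordering::Relaxed)
}
```

Unlike the proposed SingleCoreAtomicUsize, concurrent calls to these functions are defined behavior: the compiler may not assume the value is unchanged between loads.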

@Ten0
Author

Ten0 commented Aug 5, 2020

The atomic types are available everywhere (if not, that's a bug in Rust or LLVM), but CAS operations might not be. load and store are always available.

Thanks, I will double check this.

You still need to synchronize access between interrupt handlers and idle. As written, SingleCoreAtomicUsize causes data races between every get and set, which is undefined behavior.

I don't understand what data race you're talking about. Could you give an example of a scenario where the data race would trigger, and how it's UB?

Also, I'm unsure how I can get AtomicUsize to have the behaviour I was looking for: for platforms that do support multi-core (e.g your typical computer), how would you trigger a no-cross-core-synchronization read or write from an AtomicUsize? Just using Ordering::Relaxed?

@jonas-schievink
Contributor

The unsafe impl Sync allows use of SingleCoreAtomicUsize across threads, but there is no inner synchronization, so any usage of get or set from more than one thread would be a data race, as the stores and loads happen unsynchronized with each other.

This is still the case if you replace "thread" with "interrupt handler". For example, the compiler might assume that the value loaded by subsequent gets stays the same if there is no set in between. This is wrong in the presence of interrupts.

The Relaxed ordering is all you need as long as you only need to synchronize accesses of the AtomicUsize itself, not any other data (the other orderings are used when writing lock free data structures with atomics, and affect how the compiler synchronizes non-atomic accesses to the actual data stored in the data structure).
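A host-side sketch of that distinction: when the atomic itself is the only shared state, Relaxed is sufficient. (Note that fetch_add is a read-modify-write operation, i.e. exactly the kind that may be missing on Cortex-M0; this example is about ordering, not about target availability. The names here are illustrative.)

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

static COUNTER: AtomicUsize = AtomicUsize::new(0);

fn count_to(n: usize) -> usize {
    let handles: Vec<_> = (0..4)
        .map(|_| {
            thread::spawn(move || {
                for _ in 0..n {
                    // Relaxed is enough here: no other (non-atomic) data
                    // is published through COUNTER, so we only need the
                    // counter's own accesses to be atomic.
                    COUNTER.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    COUNTER.load(Ordering::Relaxed)
}
```

Release/Acquire would become necessary only if the flag were used to hand off other, non-atomic data between the writer and the reader.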

@Ten0
Author

Ten0 commented Aug 5, 2020

Thanks a lot for the explanation 👍 I think I understood almost all of it now! :)

The compiler might assume that the value loaded by subsequent gets stays the same if there is no set in between. This is wrong in the presence of interrupts.

From the UnsafeCell documentation:

UnsafeCell is a type that wraps some T and indicates unsafe interior operations on the wrapped type. Types with an UnsafeCell field are considered to have an 'unsafe interior'.
The compiler makes optimizations based on the knowledge that &T is not mutably aliased or mutated, and that &mut T is unique. UnsafeCell is the only core language feature to work around the restriction that &T may not be mutated.

I thought this meant it prevented that kind of optimization and enforced actually re-reading the value every time it's requested.
Does it actually only force a re-read if the compiler detects the value might have changed through some mutable aliasing, but only within the same execution flow?

@jonas-schievink
Contributor

That was just an example of an assumption a compiler might make. You'd have to ask that question in terms of Rust's language semantics to get a complete answer, and I'm not sure what the precise answer looks like there.

@pftbest

pftbest commented Aug 6, 2020

@Ten0

Here is a real example with your code https://godbolt.org/z/GsYx1q

You can see that

    // Wait for the flag
    while FOO.get() == 0 {
        // nop
    }

is compiled to

    // Wait for the flag
    if FOO.get() == 0 {
        loop {
            // nop
        }
    }

The compiler assumes there are no data races in the code, so the value of FOO can never change inside the loop. That means there is no need to read it every time; it can be checked just once.
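Replacing SingleCoreAtomicUsize with core's AtomicUsize fixes exactly this: each `load` is a real atomic access that the compiler may not hoist out of the loop. A host-side sketch, using a spawned thread to play the role of the interrupt handler in the godbolt example (the helper name is illustrative):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;
use std::time::Duration;

static FOO: AtomicUsize = AtomicUsize::new(0);

fn wait_for_flag() -> usize {
    // Stands in for the interrupt handler: sets the flag "later".
    let t = thread::spawn(|| {
        thread::sleep(Duration::from_millis(10));
        FOO.store(1, Ordering::Relaxed);
    });
    // Each iteration performs an actual load; the loop cannot be
    // collapsed into a single check as in the miscompiled version.
    while FOO.load(Ordering::Relaxed) == 0 {
        std::hint::spin_loop();
    }
    t.join().unwrap();
    FOO.load(Ordering::Relaxed)
}
```

With the original UnsafeCell-based get, this loop is permitted to become `if FOO.get() == 0 { loop {} }`; with an atomic load it is not.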

@adamgreig
Member

This exact problem actually can occur with the bare-metal Mutex, which is why we added documentation in #34 after discussion in #33 (comment). In the cortex-m crate, compiler fences or memory clobbers are used when entering and leaving critical sections to ensure the compiler does not move accesses outside the critical section, as otherwise @pftbest's example above would miscompile the same way.
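The fence placement can be sketched like this. This is a minimal sketch, not the actual cortex-m implementation: the interrupt enable/disable intrinsics are hypothetical and left as comments, and the real crate handles nesting and PRIMASK restoration.

```rust
use core::sync::atomic::{compiler_fence, Ordering};

// A minimal critical-section sketch. The interrupt-masking calls are
// hypothetical platform intrinsics, shown as comments only.
fn critical<R>(f: impl FnOnce() -> R) -> R {
    // disable_interrupts(); // e.g. `cpsid i` on Cortex-M (hypothetical helper)
    // The fence forbids the compiler from moving memory accesses
    // across this point, in either direction.
    compiler_fence(Ordering::SeqCst);
    let r = f();
    compiler_fence(Ordering::SeqCst);
    // enable_interrupts(); // e.g. `cpsie i` (hypothetical helper)
    r
}
```

Without the fences, the compiler would be free to move shared-memory accesses out of the closure and past the interrupt re-enable, reintroducing the hoisting problem shown above.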

@Ten0
Author

Ten0 commented Aug 6, 2020

Great, thank you for all the examples and references!
Closing this as it's already in core.

@Ten0 Ten0 closed this as completed Aug 6, 2020