fix(interrupts): add compiler fences to enable and disable #436

mkroening · 2023-09-15T14:26:13Z

This actually bit me on AArch64, but this should apply to x86-64 as well:

Without this PR, the compiler may reorder memory accesses over these asm! calls. While interrupts::enable and interrupts::disable don't make claims about synchronizations, the docs strongly imply synchronization of the calling thread with itself.

Essentially, without compiler fences, code after interrupts::disable may still execute while interrupts are enabled. Conversely, code before interrupts::enable may only execute after interrupts have been enabled. This also translates to without_interrupts: code inside the closure may actually execute with interrupts.

This bug may require very specific conditions to show up. In my case (the Hermit kernel), some aggressive inlining resulted in improper access to a RefCell from an interrupt handler.

What do you think? :)

Without this, the compiler may reorder memory accesses over the sti and cli calls. Signed-off-by: Martin Kröning <martin.kroening@eonerc.rwth-aachen.de>

Freax13 · 2023-09-15T16:20:41Z

I'm not yet sure that this is a bug: AFAIK the general consensus is that interrupt handlers are modeled as second threads (even though they execute on the same CPU thread). Under that model sharing a RefCell with an interrupt handler is unsound because it's not Sync. Also under that model there is no such thing as a thread synchronizing with itself. To make things worse the semantics of compiler_fence are unclear: rust-lang/unsafe-code-guidelines#347.

Can you provide an example of code where reordering with the enable and disable functions could cause problems even under the assumption that interrupt handlers are second threads?

mkroening · 2023-09-15T17:54:24Z

My understanding of compiler fences was based on its documentation. It's unfortunate that they are not properly specified yet.

AFAIK the general consensus is that interrupt handlers are modeled as second threads (even though they execute on the same CPU thread).

Coming from the embedded space, I was thinking that synchronization through compiler fences together with interrupt masking would be sound on the same hardware thread, that normal execution and interrupt handlers share.

AFAIK, the whole embedded Rust ecosystem is built on this assumption: Concurrency—The Embedded Rust Book. See the critical-section crate for more information:

"For bare-metal single core, disabling interrupts in the current (only) core."

The cortex-m crate also uses compiler fences to make this behave as expected: rust-embedded/cortex-m/src/interrupt.rs#L29-L53.

Also under that model there is no such thing as a thread synchronizing with itself.

I took that wording straight from the compiler_fence docs, which provide a motivation using a program being interrupted by a signal.

Can you provide an example of code where reordering with the enable and disable functions could cause problems even under the assumption that interrupt handlers are second threads?

Assuming that this may not be used to synchronize !Sync types across software threads on the same hardware thread, I don't think you can produce unsoundness, since Sync types are synchronized internally. It would still be quite counterintuitive for without_interrupts to run code with interrupts enabled. Furthermore, you might be able to produce deadlocks if you don't synchronize with atomic accesses, similar to what is shown in the compiler fence's docs.

Related issues: I just found rust-lang/rust#30003, which describes the same issue, specifically to RefCells, I think.
There is also rust-lang/unsafe-code-guidelines#444 (comment), which might go in a similar direction.

Finally, it is probably an opsem question if using compiler fences for synchronization like this is fine. On one hand, it seems difficult to ensure the RefCell's state is reread after the compiler fence. On the other hand, it would seem very unfortunate if you had to use atomics and atomic fences for this kind of synchronization on the same hardware thread, since the hardware would not require any special atomic instructions for consistent memory accesses.

I am a bit lost where to track this. Is this a new opsem issue? Does it fit to one that already exists? Maybe @RalfJung can help?

Leaving opsem aside for the moment, would you consider merging this even if the opsem question is unresolved? It does make without_interrupts consistent and aligns the behavior with common practice from embedded Rust. While not necessarily being fine for synchronization from opsem perspective, this would improve things somewhat, I think.

Alternatively, it might even be necessary to introduce atomic fences instead. If we handle interrupt handlers as different threads, the main thread might unlock a spinlock, enable interrupts and the interrupt handler may lock the spinlock. Without an atomic fence, the unlocking may be reordered to after interrupt enabling.

Sorry for such a big comment, this is quite a big topic. 👀

Freax13 · 2023-09-15T18:43:33Z

My understanding of compiler fences was based on its documentation. It's unfortunate that they are not properly specified yet.

AFAIK the general consensus is that interrupt handlers are modeled as second threads (even though they execute on the same CPU thread).

Coming from the embedded space, I was thinking that synchronization through compiler fences together with interrupt masking would be sound on the same hardware thread, that normal execution and interrupt handlers share.

AFAIK, the whole embedded Rust ecosystem is built on this assumption: Concurrency—The Embedded Rust Book. See the critical-section crate for more information:

"For bare-metal single core, disabling interrupts in the current (only) core."

The cortex-m crate also uses compiler fences to make this behave as expected: rust-embedded/cortex-m/src/interrupt.rs#L29-L53.

Also under that model there is no such thing as a thread synchronizing with itself.

I took that wording straight from the compiler_fence docs, which provide a motivation using a program being interrupted by a signal.

Oops, I wasn't aware the compiler_fence uses that wording, thanks for reminding me.

Can you provide an example of code where reordering with the enable and disable functions could cause problems even under the assumption that interrupt handlers are second threads?

Assuming that this may not be used to synchronize !Sync types across software threads on the same hardware thread, I don't think you can produce unsoundness, since Sync types are synchronized internally. It would still be quite counterintuitive for without_interrupts to run code with interrupts enabled. Furthermore, you might be able to produce deadlocks if you don't synchronize with atomic accesses, similar to what is shown in the compiler fence's docs.

Related issues: I just found rust-lang/rust#30003, which describes the same issue, specifically to RefCells, I think.

Yup, this one seems closed to the problem presented here.

There is also rust-lang/unsafe-code-guidelines#444 (comment), which might go in a similar direction.

Finally, it is probably an opsem question if using compiler fences for synchronization like this is fine. On one hand, it seems difficult to ensure the RefCell's state is reread after the compiler fence. On the other hand, it would seem very unfortunate if you had to use atomics and atomic fences for this kind of synchronization on the same hardware thread, since the hardware would not require any special atomic instructions for consistent memory accesses.

I am a bit lost where to track this. Is this a new opsem issue? Does it fit to one that already exists? Maybe @RalfJung can help?

rust-lang/rust#30003 seems like the closest match to me, but it's old and hasn't receive much attention. rust-lang/unsafe-code-guidelines#444 and rust-lang/unsafe-code-guidelines#347 are also relevant though neither of them talk about behavior around !Sync types specifically.

Leaving opsem aside for the moment, would you consider merging this even if the opsem question is unresolved? It does make without_interrupts consistent and aligns the behavior with common practice from embedded Rust. While not necessarily being fine for synchronization from opsem perspective, this would improve things somewhat, I think.

I agree that in practice this might improve things with how people are writing code and even though it's not guaranteed to be a perfect solution I don't think there are significant downsides to this either.

Alternatively, it might even be necessary to introduce atomic fences instead. If we handle interrupt handlers as different threads, the main thread might unlock a spinlock, enable interrupts and the interrupt handler may lock the spinlock. Without an atomic fence, the unlocking may be reordered to after interrupt enabling.

Another alternative would be to remove the no_mem option from the asm! block.

Sorry for such a big comment, this is quite a big topic. 👀

No worries, it's also an important topic. I image this issue must have been hell to debug.

RalfJung · 2023-09-15T21:07:33Z

Yeah rust-lang/unsafe-code-guidelines#347 still mostly summarizes what I know about compiler fences. Pretty much the only situation where I can make sense of them is for signal handlers, which in C++ are other threads kind of -- they are other threads except that compiler fences are sufficient for synchronization. Arguably on a single-core CPU in the kernel space with preemption disabled, interrupts are like signal handlers. So it could actually make sense to use them here. I do lack all the context to evaluate the details though.

mkroening · 2023-09-15T21:33:45Z

Thanks so much for clarifying!

compiler fences are sufficient for synchronization

That's precisely what I was hoping for! 🥳

Arguably on a single-core CPU in the kernel space with preemption disabled, interrupts are like signal handlers.

That's mostly our use case. In our case, we might also have multiple cores, but each core has independent interrupts and is only expected to synchronize with those. So that should be fine then.

RalfJung · 2023-09-16T06:24:55Z

Yes, that sounds about right. But this is a bit outside my comfort zone so I suggest asking on Zulip. :)

fix(interrupts): add compiler fences to enable and disable

659c871

Without this, the compiler may reorder memory accesses over the sti and cli calls. Signed-off-by: Martin Kröning <martin.kroening@eonerc.rwth-aachen.de>

mkroening force-pushed the interrupt-fence branch from 71375b1 to 659c871 Compare September 15, 2023 14:29

Freax13 approved these changes Sep 15, 2023

View reviewed changes

Freax13 merged commit a8165d4 into rust-osdev:master Sep 15, 2023
12 checks passed

Freax13 mentioned this pull request Sep 15, 2023

Release v0.14.11 #437

Merged

mkroening deleted the interrupt-fence branch September 15, 2023 21:28

mkroening mentioned this pull request Sep 18, 2023

wakeup_task: Trying to wake up task 1 which is not blocked hermit-os/kernel#898

Closed

This was referenced Oct 9, 2023

Potential UB due to data race between regular thread and interrupt handler mkroening/interrupt-ref-cell#5

Closed

fix(interrupts): replace compiler fences with potentially-synchronizing assembly #440

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(interrupts): add compiler fences to enable and disable #436

fix(interrupts): add compiler fences to enable and disable #436

mkroening commented Sep 15, 2023

Freax13 commented Sep 15, 2023 •

edited

Loading

mkroening commented Sep 15, 2023

Freax13 commented Sep 15, 2023

RalfJung commented Sep 15, 2023

mkroening commented Sep 15, 2023

RalfJung commented Sep 16, 2023

fix(interrupts): add compiler fences to enable and disable #436

fix(interrupts): add compiler fences to enable and disable #436

Conversation

mkroening commented Sep 15, 2023

Freax13 commented Sep 15, 2023 • edited Loading

mkroening commented Sep 15, 2023

Freax13 commented Sep 15, 2023

RalfJung commented Sep 15, 2023

mkroening commented Sep 15, 2023

RalfJung commented Sep 16, 2023

Freax13 commented Sep 15, 2023 •

edited

Loading