The implementation is using volatile operations to prevent reordering of memory operations, but as used this is wrong for two different reasons:
Volatile operations are not reordered (by the compiler) with respect to other volatile operations, but the compiler is free to move non-volatile operations around them.
For example, in `RttHeader::init` the second `ptr` call is a non-volatile operation, so the compiler is free to move that operation before the first, volatile `ptr` call. Whether you observe that reordering today is an implementation detail; any LLVM upgrade could cause these two operations to be ordered differently.
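A minimal sketch of the hazard (the struct layout, field names, and ID string here are assumptions for illustration, not the crate's actual code):

```rust
use core::ptr;

// Hypothetical header layout, loosely modeled on the init scenario
// described above; the field names and ID string are assumptions.
#[repr(C)]
pub struct Header {
    pub id: [u8; 16],
    pub max_up_channels: usize,
}

pub unsafe fn init(header: *mut Header) {
    // Volatile write: not reordered relative to other *volatile* ops.
    ptr::write_volatile(&mut (*header).max_up_channels, 1);
    // Plain (non-volatile) write: since there is no data dependency,
    // the compiler may legally move this before the volatile write.
    (*header).id = *b"SEGGER RTT\0\0\0\0\0\0";
}
```

If the debugger treats a valid `id` as "the header is ready", that reordering means it could observe the ID before the channel count is written.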
The correct abstraction to prevent both compiler-level and architecture-level reordering is atomic fences: either use `atomic::fence` or use `Atomic*::{load, store, ...}` -- the latter contain implicit atomic fences. How the different `Ordering`s prevent reordering is covered here.
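As a sketch of that approach (the names here are assumptions, not the crate's actual API), a write position shared with a debugger could be published like this:

```rust
use core::sync::atomic::{fence, AtomicUsize, Ordering};

// Hypothetical write-position word that the debugger polls.
static WRITE_POS: AtomicUsize = AtomicUsize::new(0);

pub fn publish(buffer: &mut [u8], data: &[u8]) -> usize {
    let n = data.len().min(buffer.len());
    buffer[..n].copy_from_slice(&data[..n]);
    // Explicit fence: orders the buffer writes above before any later
    // store, at both the compiler and the architecture level.
    fence(Ordering::Release);
    WRITE_POS.store(n, Ordering::Relaxed);
    // Equivalently, a single `store(n, Ordering::Release)` carries the
    // same release ordering implicitly, without a separate fence.
    n
}
```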
Volatile operations prevent compiler reordering, but the memory operations can still be reordered at the architecture level: the processor may execute instructions out of order, or the memory bus may commit writes in an order that doesn't match the source code; cache implementations may also be relevant here.
Again, the correct way to prevent reordering of memory operations is atomic fences. Using atomic fences will make this crate portable to e.g. (single-core) ARM Cortex-A and more complex architectures.
I suppose an acceptable course of action would be to get rid of the volatiles and just pepper `fence(SeqCst)` in some critical spots (e.g. when writing the header, and also when dealing with the buffer data) to enforce correct ordering between the core and the debugger (which is a somewhat weird situation to be dealing with anyway). I doubt it would have any performance impact in practice either.
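A minimal sketch of that shape, combining a volatile store with `SeqCst` fences (the function and parameter names are made up for illustration):

```rust
use core::ptr;
use core::sync::atomic::{fence, Ordering};

// Keep the store volatile so the compiler cannot elide it, and fence
// around it so neither the compiler nor the CPU moves surrounding
// buffer writes across the position update the debugger observes.
pub unsafe fn commit_write(write_pos: *mut usize, new_pos: usize) {
    fence(Ordering::SeqCst);
    ptr::write_volatile(write_pos, new_pos);
    fence(Ordering::SeqCst);
}
```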
Edit:
On second thought, fences don't prevent the compiler from completely removing operations if it looks like the results are not being used (which, here, they do), do they? So all writes still have to be volatile.
I don't suppose there's a volatile `memcpy` anywhere? I'd really rather not implement an efficient memcpy by hand.
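For what it's worth, `core` doesn't expose a stable volatile memcpy (I believe there is an unstable `volatile_copy_memory` intrinsic), but a byte-wise sketch is simple, at the cost of not being vectorized:

```rust
use core::ptr;

// Byte-wise volatile copy: every load and store is volatile, so the
// compiler cannot elide or merge them. Slower than a real memcpy,
// since it forbids wider or vectorized accesses.
pub unsafe fn volatile_copy(dst: *mut u8, src: *const u8, len: usize) {
    for i in 0..len {
        ptr::write_volatile(dst.add(i), ptr::read_volatile(src.add(i)));
    }
}
```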
cc @Yatekii