Skip to content

frengor/membarrier2

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

58 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Process-wide memory barrier

Build Status Cargo License Documentation

Fork of membarrier crate with lots of improvements and support for Linux, Windows, OSX, FreeBSD, Android, iOS, Miri and Loom.

Memory barrier is one of the strongest synchronization primitives in modern relaxed-memory concurrency. In relaxed-memory concurrency, two threads may have different viewpoint on the underlying memory system, e.g. thread T1 may have recognized a value V at location X, while T2 does not know of X=V at all. This discrepancy is one of the main reasons why concurrent programming is hard. Memory barrier synchronizes threads in such a way that after memory barriers, threads have the same viewpoint on the underlying memory system.

Unfortunately, memory barrier is not cheap. Usually, in modern computer systems, there's a designated memory barrier instruction, e.g. MFENCE in x86 and DMB SY in ARM, and they may take more than 100 cycles. Use of memory barrier instruction may be tolerable for several use cases, e.g. context switching of a few threads, or synchronizing events that happen only once in the lifetime of a long process. However, sometimes a memory barrier is necessary in the fast path, which significantly degrades performance.

In order to reduce the synchronization cost of memory barrier, some OSs provide process-wide memory barrier, which basically performs a memory barrier for every thread in the process. Provided that it's even slower than the ordinary memory barrier instruction, what's the benefit? At the cost of process-wide memory barrier, other threads may be exempted from issuing a memory barrier instruction at all! In other words, by using process-wide memory barrier, you can optimize the fast path at the performance cost of the slow path.

For process-wide memory barrier, Linux 4.14+ and FreeBSD 14.1+ provide the membarrier() system call. On older x86 and x86_64 Linux systems, the mprotect() system call (with appropriate arguments that provide process-wide memory barrier semantics) is used. Windows provides FlushProcessWriteBuffers() API. On Apple systems, thread_get_register_pointer_values() is called for every thread.

Usage

Use this crate as follows:

use std::sync::atomic::{fence, Ordering};

membarrier2::light();    // light-weight barrier
membarrier2::heavy();    // heavy-weight barrier
fence(Ordering::SeqCst); // normal barrier

Semantics

Formally, there are three kinds of memory barriers: the light one (membarrier2::light()), the heavy one (membarrier2::heavy()), and the normal one (fence(Ordering::SeqCst)). In an execution of a program, there is a total order over all instances of memory barriers. If thread A issues barrier X and thread B issues barrier Y and X is ordered before Y, then A's knowledge on the underlying memory system at the time of X is transferred to B after Y, provided that:

  • Either of A's or B's barrier is heavy; or
  • Both of A's and B's barriers are normal.

Reference

For more information, see the Linux man page for membarrier.

License

This project is licensed under either of

at your option.

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in this crate by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

About

Process-wide memory barrier for Linux, Windows, OSX, FreeBSD, Android, iOS and more.

Resources

License

Apache-2.0, MIT licenses found

Licenses found

Apache-2.0
LICENSE-APACHE
MIT
LICENSE-MIT

Stars

Watchers

Forks

Contributors

Languages