systemwide_memory_barrier: use madvise(MADV_DONTNEED) instead of mprotect()

Like mprotect() with a permission reduction, madvise(MADV_DONTNEED) sends
an IPI to all CPUs, inducing them to execute a memory barrier. Unlike
mprotect(), madvise(MADV_DONTNEED) does not take the mmap semaphore, as
it works on pages, not VMAs. That means that madvise(MADV_DONTNEED) calls
can be executed concurrently without serialization by the kernel.
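
For illustration, a minimal standalone sketch of the mechanism described above (not the Seastar code itself; the helper name and mapping flags are assumptions): touch a private anonymous page so it is resident, then discard it with madvise(MADV_DONTNEED); the resulting TLB shootdown sends an IPI to every CPU running a thread of the process, with the side effect of a memory barrier on those CPUs.

#include <sys/mman.h>
#include <unistd.h>
#include <cassert>

// Sketch only: one page shared by all callers, mapped lazily on first use.
static void madvise_based_barrier() {
    static char* mem = [] {
        void* p = mmap(nullptr, getpagesize(), PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        assert(p != MAP_FAILED);
        return reinterpret_cast<char*>(p);
    }();
    // Fault the page in so madvise() has something to tear down.
    *mem = 3;
    // Discarding the page triggers the IPI; no mmap semaphore is taken,
    // so concurrent callers are not serialized by the kernel.
    int r = madvise(mem, getpagesize(), MADV_DONTNEED);
    assert(r == 0);
}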

Replace the calls to mprotect() with a call to madvise(MADV_DONTNEED) and
drop the spinlock that guarded the whole thing, as we are now happy
with concurrent execution of systemwide_memory_barrier. This reduces
our CPU consumption when going to sleep on kernels that don't have
sys_membarrier().
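
For reference, a hedged sketch (not part of this commit) of the sys_membarrier() fast path the message alludes to: on kernels that provide the syscall (Linux 4.3 and later, with matching headers), a single MEMBARRIER_CMD_SHARED call replaces the madvise() trick. The function name try_native_membarrier_sketch is made up for illustration.

#include <linux/membarrier.h>
#include <sys/syscall.h>
#include <unistd.h>

// Sketch only: query once whether the kernel supports MEMBARRIER_CMD_SHARED,
// and if so issue the system-wide barrier through the dedicated syscall.
static bool try_native_membarrier_sketch() {
    static const bool available = [] {
        long cmds = syscall(__NR_membarrier, MEMBARRIER_CMD_QUERY, 0);
        return cmds >= 0 && (cmds & MEMBARRIER_CMD_SHARED) != 0;
    }();
    if (!available) {
        return false;  // caller falls back to the madvise()-based barrier
    }
    syscall(__NR_membarrier, MEMBARRIER_CMD_SHARED, 0);
    return true;
}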

While it appears we're reducing the system call count as well, that's
not true, as we have to fault in the page before the call to madvise().
In fact the fault is slower than the system call (but we still gain
overall on large machines from reduced spinning).
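
A rough, hypothetical micro-benchmark (an assumption, not from the commit) that makes this measurable: because each madvise(MADV_DONTNEED) discards the page, the write at the top of the next iteration takes a fresh page fault, so the loop measures the fault plus the system call together rather than the system call alone.

#include <sys/mman.h>
#include <unistd.h>
#include <cassert>
#include <chrono>
#include <cstdio>

int main() {
    char* mem = static_cast<char*>(mmap(nullptr, getpagesize(),
            PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0));
    assert(mem != MAP_FAILED);
    constexpr int iterations = 100000;
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        *mem = 3;                                    // page fault (after the first iteration)
        madvise(mem, getpagesize(), MADV_DONTNEED);  // discard page, send IPIs
    }
    auto elapsed = std::chrono::steady_clock::now() - start;
    std::printf("%.0f ns per fault+madvise\n",
        std::chrono::duration<double, std::nano>(elapsed).count() / iterations);
}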

Credit to Aliaksei Kandratsenka/gperftools for the idea.
Message-Id: <20180314121738.16336-1-avi@scylladb.com>
avikivity authored and tgrabiec committed Mar 14, 2018
1 parent a66cc34 commit 77a58e4
24 changes: 3 additions & 21 deletions core/systemwide_memory_barrier.cc
@@ -84,29 +84,15 @@ systemwide_memory_barrier() {
         assert(mem != MAP_FAILED);
         return reinterpret_cast<char*>(mem);
     }();
-    int r1 = mprotect(mem, getpagesize(), PROT_READ | PROT_WRITE);
-    assert(r1 == 0);
-    // Force page into memory to avoid next mprotect() attempting to be clever
+    // Force page into memory to make madvise() have real work to do
     *mem = 3;
-    // Force page into memory
-    // lower permissions to force kernel to send IPI to all threads, with
+    // Evict page to force kernel to send IPI to all threads, with
     // a side effect of executing a memory barrier on those threads
     // FIXME: does this work on ARM?
-    int r2 = mprotect(mem, getpagesize(), PROT_READ);
+    int r2 = madvise(mem, getpagesize(), MADV_DONTNEED);
     assert(r2 == 0);
 }

-struct alignas(cache_line_size) aligned_flag {
-    std::atomic<bool> flag;
-    bool try_lock() {
-        return !flag.exchange(true, std::memory_order_relaxed);
-    }
-    void unlock() {
-        flag.store(false, std::memory_order_relaxed);
-    }
-};
-static aligned_flag membarrier_lock;
-
 bool try_systemwide_memory_barrier() {
     if (try_native_membarrier()) {
         return true;
@@ -126,11 +112,7 @@ bool try_systemwide_memory_barrier() {
 
 #endif
 
-    if (!membarrier_lock.try_lock()) {
-        return false;
-    }
     systemwide_memory_barrier();
-    membarrier_lock.unlock();
     return true;
 }
