MDEV-25404: ssux_lock_low: Introduce a separate writer mutex
Having both readers and writers use a single lock word in
futex system calls caused a performance regression compared to
SRW_LOCK_DUMMY (a mutex and 2 condition variables).
A contributing factor is that we did not accurately keep
track of the number of waiting threads and thus had to invoke
system calls to wake up any waiting threads.

SUX_LOCK_GENERIC: Renamed from SRW_LOCK_DUMMY. This is the
original implementation, with rw_lock (std::atomic<uint32_t>),
a mutex and two condition variables. Using a separate writer
mutex (as described below) is not possible, because the mutex ownership
in a buf_block_t::lock must be able to transfer from a write submitter
thread to an I/O completion thread, and pthread_mutex_lock() may assume
that the submitter thread is recursively acquiring the mutex that it
already holds, while in reality the I/O completion thread is the real
owner. POSIX does not define an interface for requesting a mutex to
be non-recursive.
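
As a hedged illustration (not MariaDB code; the function names are
made up), the hand-off pattern that rules out pthread_mutex_t looks
like this:

  #include <pthread.h>

  static pthread_mutex_t block_latch = PTHREAD_MUTEX_INITIALIZER;

  static void write_submitter()
  {
    pthread_mutex_lock(&block_latch);  /* acquire the page latch and
                                          submit an asynchronous write */
    /* the latch now logically belongs to the I/O completion thread */
    pthread_mutex_lock(&block_latch);  /* may be treated as a recursive
                                          relock, because the submitter
                                          is still recorded as owner */
  }

  static void io_completion()
  {
    /* the write completed; release the latch for the submitter */
    pthread_mutex_unlock(&block_latch);  /* undefined behaviour in
                                            POSIX: this thread never
                                            locked the mutex */
  }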

On Microsoft Windows, srw_lock_low will remain a simple wrapper of
SRWLOCK. On 32-bit Microsoft Windows, sizeof(SRWLOCK)=4 while
sizeof(srw_lock_low)=8.

On other platforms, srw_lock_low is an alias of ssux_lock_low,
the Simple (non-recursive) Shared/Update/eXclusive lock.
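
As a rough sketch of that wrapper (assuming the Win32 SRWLOCK API;
this is not the actual MariaDB source):

  #include <windows.h>

  /* srw_lock_low as a thin SRWLOCK wrapper */
  class srw_lock_low_sketch
  {
    SRWLOCK lk = SRWLOCK_INIT;
  public:
    void rd_lock()   { AcquireSRWLockShared(&lk); }
    void rd_unlock() { ReleaseSRWLockShared(&lk); }
    void wr_lock()   { AcquireSRWLockExclusive(&lk); }
    void wr_unlock() { ReleaseSRWLockExclusive(&lk); }
  };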

In the futex-based implementation of ssux_lock_low (Linux, OpenBSD,
Microsoft Windows), we shall use a dedicated mutex for exclusive
requests (writer), and a WRITER flag in the 'readers' lock word to
indicate that a writer is holding the lock or waiting for it to be
granted. When the WRITER flag is set, all lock requests must acquire
the writer mutex. Normally, shared (S) lock requests simply perform a
compare-and-swap on the 'readers' word.
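
A condensed sketch of this scheme (illustrative only: std::mutex
stands in for the futex-based srw_mutex, and a spin loop stands in
for the futex wait):

  #include <atomic>
  #include <mutex>

  class ssux_sketch
  {
    std::mutex writer;                 // serializes X (and U) requests
    std::atomic<uint32_t> readers{0};  // reader count plus WRITER flag
    static constexpr uint32_t WRITER= 1U << 31;
  public:
    void rd_lock()
    {
      for (;;)
      {
        uint32_t l= readers.load(std::memory_order_relaxed);
        // fast path: no writer is holding or waiting for the lock
        if (!(l & WRITER) &&
            readers.compare_exchange_weak(l, l + 1,
                                          std::memory_order_acquire))
          return;
        writer.lock();   // slow path: queue behind the writer mutex
        writer.unlock();
      }
    }
    void rd_unlock() { readers.fetch_sub(1, std::memory_order_release); }
    void wr_lock()
    {
      writer.lock();
      readers.fetch_or(WRITER, std::memory_order_relaxed);
      // wait for pre-existing S locks to drain (really a futex wait)
      while (readers.load(std::memory_order_acquire) != WRITER) {}
    }
    void wr_unlock()
    {
      readers.fetch_and(~WRITER, std::memory_order_release);
      writer.unlock();
    }
  };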

Update locks are implemented as a combination of the writer mutex
and a normal counter in the 'readers' lock word. Mutual exclusion
between U and X locks is guaranteed by the writer mutex.
Unlike SUX_LOCK_GENERIC, wr_u_downgrade() will not wake up any pending
rd_lock() waits; they will wait until u_unlock() releases the writer
mutex.
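
Extending the ssux_sketch above with hypothetical members (again a
sketch, not the real code):

  // U conflicts with U/X via the writer mutex, but allows concurrent S
  void u_lock()
  {
    writer.lock();
    readers.fetch_add(1, std::memory_order_acquire);
  }
  void u_unlock()
  {
    readers.fetch_sub(1, std::memory_order_release);
    writer.unlock();   // pending rd_lock() waiters resume only here
  }
  void wr_u_downgrade()   // X -> U, keeping the writer mutex
  {
    // replace the WRITER flag with one ordinary counter reference;
    // deliberately wakes no pending rd_lock() waiters
    readers.store(1, std::memory_order_release);
  }

In this sketch, new rd_lock() callers can still take the fast path
after the downgrade; only the waiters already queued on the writer
mutex keep waiting, matching the behaviour described above.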

The ssux_lock_low is always wrapped by sux_lock (with a recursion count
of U and X locks), used for dict_index_t::lock and buf_block_t::lock.
Their memory footprint for the futex-based implementation will increase
by sizeof(srw_mutex), or 4 bytes.
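
The wrapping could be pictured roughly like this (a sketch assuming
recursion is tracked by an owner id and a count; the real sux_lock is
more elaborate):

  #include <atomic>
  #include <cstdint>
  #include <thread>

  template<typename ssux>
  class sux_sketch
  {
    ssux latch;                             // non-recursive S/U/X lock
    std::atomic<std::thread::id> writer{};  // thread holding U or X
    uint32_t recursive= 0;                  // U/X recursion count
  public:
    void x_lock()
    {
      const auto me= std::this_thread::get_id();
      if (writer.load(std::memory_order_relaxed) == me)
        recursive++;                        // recursive re-acquisition
      else
      {
        latch.wr_lock();
        writer.store(me, std::memory_order_relaxed);
        recursive= 1;
      }
    }
    void x_unlock()
    {
      if (!--recursive)
      {
        writer.store(std::thread::id(), std::memory_order_relaxed);
        latch.wr_unlock();
      }
    }
  };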

This change addresses a performance regression in read-only benchmarks,
such as sysbench oltp_read_only. Write performance was improved as well.

On 32-bit Linux and OpenBSD, lock_sys_t::hash_table will allocate
two hash table elements for each srw_lock (14 instead of 15 hash
table cells per 64-byte cache line on IA-32). On Microsoft Windows,
sizeof(SRWLOCK)==sizeof(void*) and there is no change.
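
The arithmetic behind the 14-versus-15 figure, under the stated
assumptions (64-byte cache line, 4-byte pointers on IA-32):

  #include <cstddef>

  constexpr std::size_t SLOTS= 64 / 4;         // 16 slots per cache line
  constexpr std::size_t OLD_CELLS= SLOTS - 1;  // 4-byte latch: 15 cells
  constexpr std::size_t NEW_CELLS= SLOTS - 2;  // 8-byte srw_lock: 14 cells
  static_assert(OLD_CELLS == 15 && NEW_CELLS == 14, "figures quoted above");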

Reviewed by: Vladislav Vaintroub
Tested by: Axel Schwenke and Vladislav Vaintroub
dr-m committed Apr 19, 2021
1 parent 040c16a commit 8751aa7
Showing 5 changed files with 323 additions and 149 deletions.
storage/innobase/include/lock0lock.h (12 additions, 7 deletions):

@@ -548,7 +548,7 @@ class lock_sys_t
 
   /** Hash table latch */
   struct hash_latch
-#if defined SRW_LOCK_DUMMY && !defined _WIN32
+#ifdef SUX_LOCK_GENERIC
   : private rw_lock
   {
     /** Wait for an exclusive lock */
@@ -577,15 +577,18 @@ class lock_sys_t
     { return memcmp(this, field_ref_zero, sizeof *this); }
 #endif
   };
-  static_assert(sizeof(hash_latch) <= sizeof(void*), "compatibility");
 
 public:
   struct hash_table
   {
+    /** Number of consecutive array[] elements occupied by a hash_latch */
+    static constexpr size_t LATCH= sizeof(void*) >= sizeof(hash_latch) ? 1 : 2;
+    static_assert(sizeof(hash_latch) <= LATCH * sizeof(void*), "allocation");
+
     /** Number of array[] elements per hash_latch.
-    Must be one less than a power of 2. */
+    Must be LATCH less than a power of 2. */
     static constexpr size_t ELEMENTS_PER_LATCH= CPU_LEVEL1_DCACHE_LINESIZE /
-      sizeof(void*) - 1;
+      sizeof(void*) - LATCH;
 
     /** number of payload elements in array[]. Protected by lock_sys.latch. */
     ulint n_cells;
@@ -608,11 +611,13 @@ class lock_sys_t
     /** @return the index of an array element */
     inline ulint calc_hash(ulint fold) const;
     /** @return raw array index converted to padded index */
-    static ulint pad(ulint h) { return 1 + (h / ELEMENTS_PER_LATCH) + h; }
+    static ulint pad(ulint h)
+    { return LATCH + LATCH * (h / ELEMENTS_PER_LATCH) + h; }
     /** Get a latch. */
     static hash_latch *latch(hash_cell_t *cell)
     {
-      void *l= ut_align_down(cell, (ELEMENTS_PER_LATCH + 1) * sizeof *cell);
+      void *l= ut_align_down(cell, sizeof *cell *
+                             (ELEMENTS_PER_LATCH + LATCH));
       return static_cast<hash_latch*>(l);
     }
     /** Get a hash table cell. */
@@ -646,7 +651,7 @@ class lock_sys_t
   /** Number of shared latches */
   std::atomic<ulint> readers{0};
 #endif
-#if defined SRW_LOCK_DUMMY && !defined _WIN32
+#ifdef SUX_LOCK_GENERIC
 protected:
   /** mutex for hash_latch::wait() */
   pthread_mutex_t hash_mutex;
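
To make the new padding concrete, a worked example (assuming the
IA-32 case from the commit message: LATCH=2, so ELEMENTS_PER_LATCH=
64/4 - 2 = 14, with pad() copied from the hunk above):

  #include <cstddef>
  #include <cstdio>

  static constexpr std::size_t LATCH= 2;
  static constexpr std::size_t ELEMENTS_PER_LATCH= 64 / 4 - LATCH;

  static std::size_t pad(std::size_t h)
  { return LATCH + LATCH * (h / ELEMENTS_PER_LATCH) + h; }

  int main()
  {
    // slots 0-1 of the first 16-slot line hold the latch, 2-15 the
    // cells; cell 14 lands at slot 18, after the latch at slots 16-17
    std::printf("%zu %zu %zu\n", pad(0), pad(13), pad(14));  // 2 15 18
  }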
storage/innobase/include/rw_lock.h (32 additions, 0 deletions):

@@ -20,7 +20,17 @@ this program; if not, write to the Free Software Foundation, Inc.,
 #include <atomic>
 #include "my_dbug.h"
 
+#if !(defined __linux__ || defined __OpenBSD__ || defined _WIN32)
+# define SUX_LOCK_GENERIC
+#elif 0 // defined SAFE_MUTEX
+# define SUX_LOCK_GENERIC /* Use dummy implementation for debugging purposes */
+#endif
+
+#ifdef SUX_LOCK_GENERIC
 /** Simple read-update-write lock based on std::atomic */
+#else
+/** Simple read-write lock based on std::atomic */
+#endif
 class rw_lock
 {
   /** The lock word */
@@ -35,8 +45,10 @@ class rw_lock
   static constexpr uint32_t WRITER_WAITING= 1U << 30;
   /** Flag to indicate that write_lock() or write_lock_wait() is pending */
   static constexpr uint32_t WRITER_PENDING= WRITER | WRITER_WAITING;
+#ifdef SUX_LOCK_GENERIC
   /** Flag to indicate that an update lock exists */
   static constexpr uint32_t UPDATER= 1U << 29;
+#endif /* SUX_LOCK_GENERIC */
 
   /** Start waiting for an exclusive lock.
   @return current value of the lock word */
@@ -54,22 +66,29 @@ class rw_lock
   @tparam prioritize_updater whether to ignore WRITER_WAITING for UPDATER
   @param l the value of the lock word
   @return whether the lock was acquired */
+#ifdef SUX_LOCK_GENERIC
   template<bool prioritize_updater= false>
+#endif /* SUX_LOCK_GENERIC */
   bool read_trylock(uint32_t &l)
   {
     l= UNLOCKED;
     while (!lock.compare_exchange_strong(l, l + 1, std::memory_order_acquire,
                                          std::memory_order_relaxed))
     {
       DBUG_ASSERT(!(WRITER & l) || !(~WRITER_PENDING & l));
+#ifdef SUX_LOCK_GENERIC
       DBUG_ASSERT((~(WRITER_PENDING | UPDATER) & l) < UPDATER);
       if (prioritize_updater
           ? (WRITER & l) || ((WRITER_WAITING | UPDATER) & l) == WRITER_WAITING
           : (WRITER_PENDING & l))
+#else /* SUX_LOCK_GENERIC */
+      if (l & WRITER_PENDING)
+#endif /* SUX_LOCK_GENERIC */
         return false;
     }
     return true;
   }
+#ifdef SUX_LOCK_GENERIC
   /** Try to acquire an update lock.
   @param l the value of the lock word
   @return whether the lock was acquired */
@@ -116,6 +135,7 @@ class rw_lock
     lock.fetch_xor(WRITER | UPDATER, std::memory_order_relaxed);
     DBUG_ASSERT((l & ~WRITER_WAITING) == WRITER);
   }
+#endif /* SUX_LOCK_GENERIC */
 
   /** Wait for an exclusive lock.
   @return whether the exclusive lock was acquired */
@@ -141,10 +161,15 @@ class rw_lock
   bool read_unlock()
   {
     auto l= lock.fetch_sub(1, std::memory_order_release);
+#ifdef SUX_LOCK_GENERIC
     DBUG_ASSERT(~(WRITER_PENDING | UPDATER) & l); /* at least one read lock */
+#else /* SUX_LOCK_GENERIC */
+    DBUG_ASSERT(~(WRITER_PENDING) & l); /* at least one read lock */
+#endif /* SUX_LOCK_GENERIC */
     DBUG_ASSERT(!(l & WRITER)); /* no write lock must have existed */
     return (~WRITER_PENDING & l) == 1;
   }
+#ifdef SUX_LOCK_GENERIC
   /** Release an update lock */
   void update_unlock()
   {
@@ -153,13 +178,18 @@ class rw_lock
     /* the update lock must have existed */
     DBUG_ASSERT((l & (WRITER | UPDATER)) == UPDATER);
   }
+#endif /* SUX_LOCK_GENERIC */
   /** Release an exclusive lock */
   void write_unlock()
   {
     IF_DBUG_ASSERT(auto l=,)
     lock.fetch_and(~WRITER, std::memory_order_release);
     /* the write lock must have existed */
+#ifdef SUX_LOCK_GENERIC
     DBUG_ASSERT((l & (WRITER | UPDATER)) == WRITER);
+#else /* SUX_LOCK_GENERIC */
+    DBUG_ASSERT(l & WRITER);
+#endif /* SUX_LOCK_GENERIC */
   }
   /** Try to acquire a shared lock.
   @return whether the lock was acquired */
@@ -176,9 +206,11 @@ class rw_lock
   /** @return whether an exclusive lock is being held by any thread */
   bool is_write_locked() const
   { return !!(lock.load(std::memory_order_relaxed) & WRITER); }
+#ifdef SUX_LOCK_GENERIC
   /** @return whether an update lock is being held by any thread */
   bool is_update_locked() const
   { return !!(lock.load(std::memory_order_relaxed) & UPDATER); }
+#endif /* SUX_LOCK_GENERIC */
   /** @return whether a shared lock is being held by any thread */
   bool is_read_locked() const
   {
[Diffs of the remaining three changed files were not loaded on this page.]
