
Commit 8751aa7

MDEV-25404: ssux_lock_low: Introduce a separate writer mutex
Having both readers and writers use a single lock word in futex system calls caused a performance regression compared to SRW_LOCK_DUMMY (a mutex and 2 condition variables). A contributing factor is that we did not accurately keep track of the number of waiting threads and thus had to invoke system calls to wake up any waiting threads.

SUX_LOCK_GENERIC: Renamed from SRW_LOCK_DUMMY. This is the original implementation, with rw_lock (std::atomic<uint32_t>), a mutex and two condition variables. Using a separate writer mutex (as described below) is not possible here, because the mutex ownership in a buf_block_t::lock must be able to transfer from a write submitter thread to an I/O completion thread, and pthread_mutex_lock() may assume that the submitter thread is recursively acquiring the mutex that it already holds, while in reality the I/O completion thread is the real owner. POSIX does not define an interface for requesting a mutex to be non-recursive.

On Microsoft Windows, srw_lock_low will remain a simple wrapper of SRWLOCK. On 32-bit Microsoft Windows, sizeof(SRWLOCK)=4 while sizeof(srw_lock_low)=8.

On other platforms, srw_lock_low is an alias of ssux_lock_low, the Simple (non-recursive) Shared/Update/eXclusive lock.

In the futex-based implementation of ssux_lock_low (Linux, OpenBSD, Microsoft Windows), we shall use a dedicated mutex for exclusive requests (writer), and have a WRITER flag in the 'readers' lock word to inform that a writer is holding the lock or waiting for the lock to be granted. When the WRITER flag is set, all lock requests must acquire the writer mutex. Normally, shared (S) lock requests simply perform a compare-and-swap on the 'readers' word.

Update locks are implemented as a combination of the writer mutex and a normal counter in the 'readers' lock word. The conflict between U and X locks is guaranteed by the writer mutex.

Unlike SUX_LOCK_GENERIC, wr_u_downgrade() will not wake up any pending rd_lock() waits. They will wait until u_unlock() releases the writer mutex.

The ssux_lock_low is always wrapped by sux_lock (with a recursion count of U and X locks), used for dict_index_t::lock and buf_block_t::lock. Their memory footprint for the futex-based implementation will increase by sizeof(srw_mutex), or 4 bytes.

This change addresses a performance regression in read-only benchmarks, such as sysbench oltp_read_only. Write performance was improved as well.

On 32-bit Linux and OpenBSD, lock_sys_t::hash_table will allocate two hash table elements for each srw_lock (14 instead of 15 hash table cells per 64-byte cache line on IA-32). On Microsoft Windows, sizeof(SRWLOCK)==sizeof(void*) and there is no change.

Reviewed by: Vladislav Vaintroub
Tested by: Axel Schwenke and Vladislav Vaintroub
1 parent 040c16a commit 8751aa7

File tree

5 files changed: +323 −149 lines changed

storage/innobase/include/lock0lock.h
12 additions & 7 deletions

@@ -548,7 +548,7 @@ class lock_sys_t
 
   /** Hash table latch */
   struct hash_latch
-#if defined SRW_LOCK_DUMMY && !defined _WIN32
+#ifdef SUX_LOCK_GENERIC
   : private rw_lock
   {
     /** Wait for an exclusive lock */
@@ -577,15 +577,18 @@ class lock_sys_t
     { return memcmp(this, field_ref_zero, sizeof *this); }
 #endif
   };
-  static_assert(sizeof(hash_latch) <= sizeof(void*), "compatibility");
 
 public:
   struct hash_table
   {
+    /** Number of consecutive array[] elements occupied by a hash_latch */
+    static constexpr size_t LATCH= sizeof(void*) >= sizeof(hash_latch) ? 1 : 2;
+    static_assert(sizeof(hash_latch) <= LATCH * sizeof(void*), "allocation");
+
     /** Number of array[] elements per hash_latch.
-    Must be one less than a power of 2. */
+    Must be LATCH less than a power of 2. */
     static constexpr size_t ELEMENTS_PER_LATCH= CPU_LEVEL1_DCACHE_LINESIZE /
-      sizeof(void*) - 1;
+      sizeof(void*) - LATCH;
 
     /** number of payload elements in array[]. Protected by lock_sys.latch. */
     ulint n_cells;
@@ -608,11 +611,13 @@ class lock_sys_t
     /** @return the index of an array element */
     inline ulint calc_hash(ulint fold) const;
     /** @return raw array index converted to padded index */
-    static ulint pad(ulint h) { return 1 + (h / ELEMENTS_PER_LATCH) + h; }
+    static ulint pad(ulint h)
+    { return LATCH + LATCH * (h / ELEMENTS_PER_LATCH) + h; }
     /** Get a latch. */
     static hash_latch *latch(hash_cell_t *cell)
     {
-      void *l= ut_align_down(cell, (ELEMENTS_PER_LATCH + 1) * sizeof *cell);
+      void *l= ut_align_down(cell, sizeof *cell *
+                             (ELEMENTS_PER_LATCH + LATCH));
       return static_cast<hash_latch*>(l);
     }
     /** Get a hash table cell. */
@@ -646,7 +651,7 @@ class lock_sys_t
     /** Number of shared latches */
     std::atomic<ulint> readers{0};
 #endif
-#if defined SRW_LOCK_DUMMY && !defined _WIN32
+#ifdef SUX_LOCK_GENERIC
 protected:
   /** mutex for hash_latch::wait() */
   pthread_mutex_t hash_mutex;

storage/innobase/include/rw_lock.h
32 additions & 0 deletions

@@ -20,7 +20,17 @@ this program; if not, write to the Free Software Foundation, Inc.,
 #include <atomic>
 #include "my_dbug.h"
 
+#if !(defined __linux__ || defined __OpenBSD__ || defined _WIN32)
+# define SUX_LOCK_GENERIC
+#elif 0 // defined SAFE_MUTEX
+# define SUX_LOCK_GENERIC /* Use dummy implementation for debugging purposes */
+#endif
+
+#ifdef SUX_LOCK_GENERIC
 /** Simple read-update-write lock based on std::atomic */
+#else
+/** Simple read-write lock based on std::atomic */
+#endif
 class rw_lock
 {
   /** The lock word */
@@ -35,8 +45,10 @@ class rw_lock
   static constexpr uint32_t WRITER_WAITING= 1U << 30;
   /** Flag to indicate that write_lock() or write_lock_wait() is pending */
   static constexpr uint32_t WRITER_PENDING= WRITER | WRITER_WAITING;
+#ifdef SUX_LOCK_GENERIC
   /** Flag to indicate that an update lock exists */
   static constexpr uint32_t UPDATER= 1U << 29;
+#endif /* SUX_LOCK_GENERIC */
 
   /** Start waiting for an exclusive lock.
   @return current value of the lock word */
@@ -54,22 +66,29 @@ class rw_lock
   @tparam prioritize_updater whether to ignore WRITER_WAITING for UPDATER
   @param l the value of the lock word
   @return whether the lock was acquired */
+#ifdef SUX_LOCK_GENERIC
   template<bool prioritize_updater= false>
+#endif /* SUX_LOCK_GENERIC */
   bool read_trylock(uint32_t &l)
   {
     l= UNLOCKED;
     while (!lock.compare_exchange_strong(l, l + 1, std::memory_order_acquire,
                                          std::memory_order_relaxed))
     {
       DBUG_ASSERT(!(WRITER & l) || !(~WRITER_PENDING & l));
+#ifdef SUX_LOCK_GENERIC
       DBUG_ASSERT((~(WRITER_PENDING | UPDATER) & l) < UPDATER);
       if (prioritize_updater
           ? (WRITER & l) || ((WRITER_WAITING | UPDATER) & l) == WRITER_WAITING
           : (WRITER_PENDING & l))
+#else /* SUX_LOCK_GENERIC */
+      if (l & WRITER_PENDING)
+#endif /* SUX_LOCK_GENERIC */
         return false;
     }
     return true;
   }
+#ifdef SUX_LOCK_GENERIC
   /** Try to acquire an update lock.
   @param l the value of the lock word
   @return whether the lock was acquired */
@@ -116,6 +135,7 @@ class rw_lock
     lock.fetch_xor(WRITER | UPDATER, std::memory_order_relaxed);
     DBUG_ASSERT((l & ~WRITER_WAITING) == WRITER);
   }
+#endif /* SUX_LOCK_GENERIC */
 
   /** Wait for an exclusive lock.
   @return whether the exclusive lock was acquired */
@@ -141,10 +161,15 @@ class rw_lock
   bool read_unlock()
   {
     auto l= lock.fetch_sub(1, std::memory_order_release);
+#ifdef SUX_LOCK_GENERIC
     DBUG_ASSERT(~(WRITER_PENDING | UPDATER) & l); /* at least one read lock */
+#else /* SUX_LOCK_GENERIC */
+    DBUG_ASSERT(~(WRITER_PENDING) & l); /* at least one read lock */
+#endif /* SUX_LOCK_GENERIC */
     DBUG_ASSERT(!(l & WRITER)); /* no write lock must have existed */
     return (~WRITER_PENDING & l) == 1;
   }
+#ifdef SUX_LOCK_GENERIC
   /** Release an update lock */
   void update_unlock()
   {
@@ -153,13 +178,18 @@ class rw_lock
     /* the update lock must have existed */
     DBUG_ASSERT((l & (WRITER | UPDATER)) == UPDATER);
   }
+#endif /* SUX_LOCK_GENERIC */
   /** Release an exclusive lock */
   void write_unlock()
   {
     IF_DBUG_ASSERT(auto l=,)
     lock.fetch_and(~WRITER, std::memory_order_release);
     /* the write lock must have existed */
+#ifdef SUX_LOCK_GENERIC
     DBUG_ASSERT((l & (WRITER | UPDATER)) == WRITER);
+#else /* SUX_LOCK_GENERIC */
+    DBUG_ASSERT(l & WRITER);
+#endif /* SUX_LOCK_GENERIC */
   }
   /** Try to acquire a shared lock.
   @return whether the lock was acquired */
@@ -176,9 +206,11 @@ class rw_lock
   /** @return whether an exclusive lock is being held by any thread */
   bool is_write_locked() const
   { return !!(lock.load(std::memory_order_relaxed) & WRITER); }
+#ifdef SUX_LOCK_GENERIC
   /** @return whether an update lock is being held by any thread */
   bool is_update_locked() const
   { return !!(lock.load(std::memory_order_relaxed) & UPDATER); }
+#endif /* SUX_LOCK_GENERIC */
   /** @return whether a shared lock is being held by any thread */
   bool is_read_locked() const
   {
