Skip to content

Commit d58734d

Browse files
committed
MDEV-34520 purge_sys_t::wait_FTS sleeps 10ms, even if it does not have to
There were two separate Atomic_counter<uint32_t>, purge_sys.m_SYS_paused and purge_sys.m_FTS_paused. In purge_sys.wait_FTS() we have to read both atomically. We used to use an overkill solution for this, acquiring purge_sys.latch and waiting 10 milliseconds between samples. To make matters worse, the 10-millisecond wait was unconditional, which would unnecessarily suspend the purge_coordinator_task every now and then. It turns out that we can fold both "reference counts" into a single Atomic_relaxed<uint32_t> and avoid the purge_sys.latch. To assess whether std::memory_order_relaxed is acceptable, we should consider the operations that read these "reference counts", that is, purge_sys_t::wait_FTS(bool) and purge_sys_t::must_wait_FTS(). Outside debug assertions, purge_sys.must_wait_FTS() is only invoked in trx_purge_table_acquire(), which is covered by a shared dict_sys.latch. We would increment the counter as part of a DDL operation, but before acquiring an exclusive dict_sys.latch. So, a purge_sys_t::close_and_reopen() loop could be triggered slightly prematurely, before a problematic DDL operation is actually executed. Decrementing the counter is less of an issue; purge_sys.resume_FTS() or purge_sys.resume_SYS() would mostly be invoked while holding an exclusive dict_sys.latch; ha_innobase::delete_table() does it outside that critical section. Still, this would only cause some extra wait in the purge_coordinator_task, just like at the start of a DDL operation. There are two calls to purge_sys_t::wait_FTS(bool): in the above mentioned purge_sys_t::close_and_reopen() and in purge_sys_t::clone_oldest_view(), both invoked by the purge_coordinator_task. There is also a purge_sys.clone_oldest_view<true>() call at startup when no DDL operation can be in progress. purge_sys_t::m_SYS_paused: Merged into m_FTS_paused, using a new multiplier PAUSED_SYS = 65536. purge_sys_t::wait_FTS(): Remove an unnecessary sleep as well as the access to purge_sys.latch. It suffices to poll purge_sys.m_FTS_paused. purge_sys_t::stop_FTS(): Do not acquire purge_sys.latch. Reviewed by: Debarun Banerjee
1 parent 9020baf commit d58734d

File tree

3 files changed

+19
-21
lines changed

3 files changed

+19
-21
lines changed

storage/innobase/include/trx0purge.h

Lines changed: 13 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -149,10 +149,11 @@ class purge_sys_t
149149
private:
150150
/** number of pending stop() calls without resume() */
151151
Atomic_counter<uint32_t> m_paused;
152-
/** number of stop_SYS() calls without resume_SYS() */
153-
Atomic_counter<uint32_t> m_SYS_paused;
154-
/** number of stop_FTS() calls without resume_FTS() */
155-
Atomic_counter<uint32_t> m_FTS_paused;
152+
/** PAUSED_SYS * number of stop_SYS() calls without resume_SYS() +
153+
number of stop_FTS() calls without resume_FTS() */
154+
Atomic_relaxed<uint32_t> m_FTS_paused;
155+
/** The stop_SYS() multiplier in m_FTS_paused */
156+
static constexpr const uint32_t PAUSED_SYS= 1U << 16;
156157

157158
/** latch protecting end_view */
158159
alignas(CPU_LEVEL1_DCACHE_LINESIZE) srw_spin_lock_low end_latch;
@@ -321,16 +322,21 @@ class purge_sys_t
321322
void wait_FTS(bool also_sys);
322323
public:
323324
/** Suspend purge in data dictionary tables */
324-
void stop_SYS() { m_SYS_paused++; }
325+
void stop_SYS()
326+
{
327+
ut_d(const auto p=) m_FTS_paused.fetch_add(PAUSED_SYS);
328+
ut_ad(p < p + PAUSED_SYS);
329+
}
325330
/** Resume purge in data dictionary tables */
326331
static void resume_SYS(void *);
327332

328333
/** Pause purge during a DDL operation that could drop FTS_ tables. */
329334
void stop_FTS();
330335
/** Resume purge after stop_FTS(). */
331-
void resume_FTS() { ut_d(const auto p=) m_FTS_paused--; ut_ad(p); }
336+
void resume_FTS()
337+
{ ut_d(const auto p=) m_FTS_paused.fetch_sub(1); ut_ad(p & ~PAUSED_SYS); }
332338
/** @return whether stop_SYS() is in effect */
333-
bool must_wait_FTS() const { return m_FTS_paused; }
339+
bool must_wait_FTS() const { return m_FTS_paused & ~PAUSED_SYS; }
334340

335341
private:
336342
/**

storage/innobase/srv/srv0srv.cc

Lines changed: 5 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -1298,10 +1298,9 @@ bool purge_sys_t::running()
12981298

12991299
void purge_sys_t::stop_FTS()
13001300
{
1301-
latch.rd_lock(SRW_LOCK_CALL);
1302-
m_FTS_paused++;
1303-
latch.rd_unlock();
1304-
while (m_active)
1301+
ut_d(const auto paused=) m_FTS_paused.fetch_add(1);
1302+
ut_ad(paused < PAUSED_SYS);
1303+
while (m_active.load(std::memory_order_acquire))
13051304
std::this_thread::sleep_for(std::chrono::seconds(1));
13061305
}
13071306

@@ -1335,8 +1334,8 @@ void purge_sys_t::stop()
13351334
/** Resume purge in data dictionary tables */
13361335
void purge_sys_t::resume_SYS(void *)
13371336
{
1338-
ut_d(auto paused=) purge_sys.m_SYS_paused--;
1339-
ut_ad(paused);
1337+
ut_d(auto paused=) purge_sys.m_FTS_paused.fetch_sub(PAUSED_SYS);
1338+
ut_ad(paused >= PAUSED_SYS);
13401339
}
13411340

13421341
/** Resume purge at UNLOCK TABLES after FLUSH TABLES FOR EXPORT */

storage/innobase/trx/trx0purge.cc

Lines changed: 1 addition & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -1065,15 +1065,8 @@ static void trx_purge_close_tables(purge_node_t *node, THD *thd)
10651065

10661066
void purge_sys_t::wait_FTS(bool also_sys)
10671067
{
1068-
bool paused;
1069-
do
1070-
{
1071-
latch.wr_lock(SRW_LOCK_CALL);
1072-
paused= m_FTS_paused || (also_sys && m_SYS_paused);
1073-
latch.wr_unlock();
1068+
for (const uint32_t mask= also_sys ? ~0U : ~PAUSED_SYS; m_FTS_paused & mask;)
10741069
std::this_thread::sleep_for(std::chrono::milliseconds(10));
1075-
}
1076-
while (paused);
10771070
}
10781071

10791072
__attribute__((nonnull))

0 commit comments

Comments
 (0)