MDEV-33278 Assertion failure in thd_get_thread_id at lock_wait_wsrep
The problem is that not every conflicting transaction has a THD object.
Therefore, we must check that the victim has a THD before its
identifier is added to the victim list, because the victim's thread
identifier is later requested with thd_get_thread_id(), which
requires a valid pointer to a THD object in trx->mysql_thd.
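
The guard described above can be sketched in isolation. This is a minimal illustration only, not the InnoDB code: MockThd, MockTrx and collect_victim_ids() are hypothetical stand-ins for THD, trx_t and the victim-list loop in lock_wait_wsrep(), and reading a thread_id field stands in for thd_get_thread_id().

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-ins for THD and trx_t; not the real server types.
struct MockThd { unsigned long thread_id; };
struct MockTrx
{
  unsigned long long id;
  const MockThd *mysql_thd; // nullptr models a transaction without a THD
};

// Collect (thread id, trx id) pairs only for lock holders that own a THD.
// A holder without a THD is skipped instead of being dereferenced later.
std::vector<std::pair<unsigned long, unsigned long long>>
collect_victim_ids(const std::vector<MockTrx> &holders)
{
  std::vector<std::pair<unsigned long, unsigned long long>> ids;
  for (const MockTrx &t : holders)
  {
    if (!t.mysql_thd)
      continue; // nothing to kill: no client connection is attached
    ids.emplace_back(t.mysql_thd->thread_id, t.id);
  }
  return ids;
}
```

In this sketch the filter and the identifier lookup happen in one loop; the actual change filters while scanning the lock lists and only asserts the invariant when the identifiers are collected.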

Victim might not have trx->mysql_thd in two cases:

(1) An incomplete transaction that was recovered from undo logs
on server startup (and not yet rolled back).

(2) Transaction that is in XA PREPARE state and whose client
connection was disconnected.

Neither of these can complete before lock_wait_wsrep()
releases lock_sys.latch.

(1) trx_t::commit_in_memory() is clearing both
trx_t::state and trx_t::is_recovered before it invokes
lock_release(trx_t*) (which would be blocked by the exclusive
lock_sys.latch that we are holding here). Hence, it is not
possible to write a debug assertion to document this scenario.

(2) If the victim is in XA PREPARE state, it would eventually be
rolled back and the lock conflict would be resolved when an XA COMMIT
or XA ROLLBACK statement is executed in some other connection.
janlindstrom authored and montywi committed Feb 27, 2024
1 parent e5c694a commit 5d4adea
Showing 1 changed file with 34 additions and 4 deletions.
38 changes: 34 additions & 4 deletions storage/innobase/lock/lock0lock.cc
@@ -970,8 +970,31 @@ static void lock_wait_wsrep(trx_t *trx)
for (lock_t *lock= UT_LIST_GET_FIRST(table->locks); lock;
lock= UT_LIST_GET_NEXT(un_member.tab_lock.locks, lock))
{
-/* if victim has also BF status, but has earlier seqno, we have to wait */
-if (lock->trx != trx &&
+/* Victim trx needs to be different from BF trx and it has to have a
+THD so that we can kill it. Victim might not have THD in two cases:
+(1) An incomplete transaction that was recovered from undo logs
+on server startup (and not yet rolled back).
+(2) Transaction that is in XA PREPARE state and whose client
+connection was disconnected.
+Neither of these can complete before lock_wait_wsrep() releases
+lock_sys.latch.
+(1) trx_t::commit_in_memory() is clearing both
+trx_t::state and trx_t::is_recovered before it invokes
+lock_release(trx_t*) (which would be blocked by the exclusive
+lock_sys.latch that we are holding here). Hence, it is not
+possible to write a debug assertion to document this scenario.
+(2) If is in XA PREPARE state, it would eventually be rolled
+back and the lock conflict would be resolved when an XA COMMIT
+or XA ROLLBACK statement is executed in some other connection.
+If victim has also BF status, but has earlier seqno, we have to wait.
+*/
+if (lock->trx != trx && lock->trx->mysql_thd &&
!(wsrep_thd_is_BF(lock->trx->mysql_thd, false) &&
wsrep_thd_order_before(lock->trx->mysql_thd, trx->mysql_thd)))
{
@@ -1003,8 +1026,11 @@ static void lock_wait_wsrep(trx_t *trx)
lock= lock_rec_get_next(heap_no, lock);
do
{
-/* if victim has also BF status, but has earlier seqno, we have to wait */
-if (lock->trx != trx &&
+/* This is similar case as above except here we have
+record-locks instead of table locks. See details
+from comment above.
+*/
+if (lock->trx != trx && lock->trx->mysql_thd &&
!(wsrep_thd_is_BF(lock->trx->mysql_thd, false) &&
wsrep_thd_order_before(lock->trx->mysql_thd, trx->mysql_thd)))
{
@@ -1030,8 +1056,12 @@ static void lock_wait_wsrep(trx_t *trx)

std::vector<std::pair<ulong,trx_id_t>> victim_id;
for (trx_t *v : victims)
+{
+/* Victim must have THD */
+ut_ad(v->mysql_thd);
victim_id.emplace_back(std::pair<ulong,trx_id_t>
{thd_get_thread_id(v->mysql_thd), v->id});
+}

DBUG_EXECUTE_IF("sync.before_wsrep_thd_abort",
{
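
The victim condition used in both lock loops of this diff combines three checks: the lock holder is not the BF transaction itself, it has a THD, and it is not a BF transaction that is ordered earlier. A minimal sketch of that predicate follows, under loudly stated assumptions: MockThd, MockTrx, order_before() and is_abort_candidate() are hypothetical names, and comparing a seqno field only models what wsrep_thd_is_BF() and wsrep_thd_order_before() decide in the server.

```cpp
#include <cassert>

// Hypothetical stand-ins; the real checks are wsrep_thd_is_BF() and
// wsrep_thd_order_before() on THD objects.
struct MockThd { bool is_bf; long long seqno; };
struct MockTrx { const MockThd *mysql_thd; };

// Assumed model of the ordering check: a smaller seqno is ordered first.
static bool order_before(const MockThd &a, const MockThd &b)
{
  return a.seqno < b.seqno;
}

// Returns true if `holder` may be aborted on behalf of the BF transaction
// `bf`. The sketch assumes that `bf` itself always has a THD.
static bool is_abort_candidate(const MockTrx &holder, const MockTrx &bf)
{
  if (&holder == &bf)
    return false; // never pick the BF transaction itself
  if (!holder.mysql_thd)
    return false; // recovered or detached XA PREPARE: no THD to kill
  const bool holder_goes_first= holder.mysql_thd->is_bf &&
    order_before(*holder.mysql_thd, *bf.mysql_thd);
  return !holder_goes_first; // an earlier BF holder is waited for instead
}
```

Checking the THD pointer before the ordering test matters because the ordering test dereferences it; that is the same short-circuit order as in the `&&` chain of the patched condition.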
