MDEV-32115: Log checkpoint race with wsrep_sst_method=rsync#5046
Conversation
|
|
There was a problem hiding this comment.
Code Review
This pull request refactors the InnoDB write-disabling logic during Galera SST by introducing pause and resume mechanisms for the FTS optimization subsystem and improving synchronization within the page cleaner and checkpoint processes. Feedback was provided to ensure that the FTS timer is explicitly stopped during pause and restarted during resume to prevent potential assertion failures and ensure background optimization continues as expected. Additionally, the reviewer noted that indentation should be updated to use tabs for consistency with the existing codebase.
| void fts_optimize_pause() | ||
| { | ||
| ut_ad(!srv_read_only_mode); | ||
| task.wait(); | ||
| } |
There was a problem hiding this comment.
The fts_optimize_pause function should stop the FTS timer before waiting for the task to complete. If the timer is left running, it could trigger a new execution of fts_optimize_callback while wsrep_sst_disable_writes is true, leading to an assertion failure (at line 2829) or unintended InnoDB writes during the SST process. Additionally, please use tabs for indentation to remain consistent with the InnoDB coding style in this file.
void fts_optimize_pause()
{
ut_ad(!srv_read_only_mode);
if (timer) {
timer->stop();
}
task.wait();
}There was a problem hiding this comment.
I do not see any stop() being declared in tpool/tpool.h:
class timer
{
public:
virtual void set_time(int initial_delay_ms, int period_ms) = 0;
virtual void disarm() = 0;
virtual ~timer(){}
};I did have timer->disarm() in a development version of this patch, but it looks like I accidentally removed that. The tests pass when that call is present. b66db0f adds it back.
Galera snapshot transfer (SST) using the default wsrep_sst_method=rsync is prone to creating corrupted snapshots. The probability for this is rather low and might only affect installations that include ENGINE=InnoDB tables that contain FULLTEXT INDEX. The function sst_disable_innodb_writes() aims to disable all InnoDB writes during the time a snapshot transfer (SST) is in progress using the default wsrep_sst_method=rsync. The logic based on invoking log_make_checkpoint() almost works, except for two things: We failed to ensure that fts_optimize_callback() has stopped executing, and we did not block updates of the log checkpoint header. log_checkpoint_low(): Assert that writes to the log are allowed. buf_flush_page_cleaner(): Do not try to advance the checkpoint while wsrep_sst_method=rsync is in progress. This prevents the assertion in log_checkpoint_low() from failing. fts_optimize_pause(), fts_optimize_resume(): Pause and resume the fts_optimize_callback(). sst_disable_innodb_writes(): Disable all background writers before initiating the log checkpoint. fts_optimize_callback(): Assert that wsrep_sst_method=rsync is not active, and remove the previous incorrect attempt at fixing this race.
This supercedes #5018.
Description
MDEV-32115: Log checkpoint race with
wsrep_sst_method=rsyncGalera snapshot transfer (SST) using the default
wsrep_sst_method=rsyncis prone to creating corrupted snapshots. The probability for this is rather low and might only affect installations that includeENGINE=InnoDBtables that containFULLTEXT INDEX.The function
sst_disable_innodb_writes()aims to disable all InnoDB writes during the time a snapshot transfer (SST) is in progress using the defaultwsrep_sst_method=rsync.The logic based on invoking
log_make_checkpoint()almost works, except for two things: We failed to ensure that fts_optimize_callback() has stopped executing, and we did not block updates of the log checkpoint header.log_checkpoint_low(): Assert that writes to the log are allowed.buf_flush_page_cleaner(): Do not try to advance the checkpoint whilewsrep_sst_method=rsyncis in progress. This prevents the assertion inlog_checkpoint_low()from failing.fts_optimize_pause(),fts_optimize_resume(): Pause and resume thefts_optimize_callback().sst_disable_innodb_writes(): Disable all background writers before initiating the log checkpoint.fts_optimize_callback(): Assert thatwsrep_sst_method=rsyncis not active, and remove the previous incorrect attempt at fixing this race.Release Notes
See the previous section.
How can this PR be tested?
When the assertions in
log_checkpoint_low()orfts_optimize_callback()are in place and the rest of this is not, the assertions would fail in several tests.Basing the PR against the correct MariaDB version
mainbranch.I believe that with some effort this could be reproduced in MariaDB Server 10.6 as well, but the logic around checkpoints is different there.