Skip to content

Commit

Permalink
MDEV-32168: Postpush fix for rpl_domain_id_filter_master_crash
Browse files Browse the repository at this point in the history
While a replica may be reading events from the
primary, the primary is killed. Left to its own
devices, the IO thread may or may not stop in
error, depending on what it is doing when its
connection to the primary is killed (e.g. a
failed read results in an error, whereas if the
IO thread is idly waiting for events when the
connection dies, it will enter into a reconnect
loop and reconnect). MDEV-32168 changed the test
to always wait for the reconnect, thus breaking
the error case, as the IO thread would be stopped
at a time of expecting it to be running.

The fix is to manually stop/start the IO thread
to ensure it is in a consistent state.

Note that rpl_domain_id_filter_master_crash.test
will need additional changes after fixing MDEV-33268

Reviewed By:
============
Kristian Nielsen <knielsen@knielsen-hq.org>
  • Loading branch information
bnestere committed Jan 22, 2024
1 parent 3cd8875 commit 01ca57e
Show file tree
Hide file tree
Showing 2 changed files with 21 additions and 4 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -38,8 +38,9 @@ connection master;
include/rpl_start_server.inc [server_number=1]
# Master has restarted successfully
connection slave;
include/wait_for_slave_io_to_start.inc
include/wait_for_slave_sql_to_start.inc
include/stop_slave_sql.inc
include/stop_slave_io.inc
include/start_slave.inc
select * from ti;
a
1
Expand Down
20 changes: 18 additions & 2 deletions mysql-test/suite/rpl/t/rpl_domain_id_filter_master_crash.test
Original file line number Diff line number Diff line change
Expand Up @@ -67,10 +67,26 @@ connection master;
save_master_pos;

--connection slave

# Left to its own devices, the IO thread may or may not stop in error,
# depending on what it is doing when its connection to the primary is killed
# (e.g. a failed read results in an error, whereas if the IO thread is idly
# waiting for events when the connection dies, it will enter into a reconnect
# loop and reconnect). So we manually stop/start the IO thread to ensure it is
# in a consistent state
#
# FIXME: We shouldn't need to stop/start the SQL thread here, but due to
# MDEV-33268, we have to. So after fixing 33268, this should only stop/start
# the IO thread. Note the SQL thread must be stopped first due to an invalid
# DBUG_ASSERT in the IO thread's stop logic that depends on the state of the
# SQL thread (also reported and to be fixed in the same ticket).
#
--source include/stop_slave_sql.inc
--let rpl_allow_error=1
--source include/wait_for_slave_io_to_start.inc
--source include/stop_slave_io.inc
--let rpl_allow_error=
--source include/wait_for_slave_sql_to_start.inc
--source include/start_slave.inc

sync_with_master;
select * from ti;
select * from tm;
Expand Down

0 comments on commit 01ca57e

Please sign in to comment.