BUG#17943188 SHOW SLAVE STATUS/RETRIEVED_GTID_SET MAY HAVE PARTIAL TRX OR MISS COMPLETE TRX

Problem:
=======
The SHOW SLAVE STATUS command contains the column RETRIEVED_GTID_SET. This is supposed to contain the set of GTIDs that exist in the relay log. However, the field is updated when the slave receiver thread (I/O thread) receives a Gtid_log_event, which happens at the beginning of the transaction.

If the I/O thread gets disconnected in the middle of a transaction, RETRIEVED_GTID_SET can contain a GTID for a transaction that is only partially received in the relay log. This transaction will subsequently be rolled back, so it is wrong to pretend that the transaction is there.

Typical fail-over algorithms use RETRIEVED_GTID_SET to determine which slave has received the most transactions, in order to promote that slave to master. This is true for e.g. the mysqlfailover utility. When RETRIEVED_GTID_SET can contain partially transmitted transactions, the fail-over utility can choose the wrong slave to promote. This can lead to data corruption later.

This means that even if semi-sync is enabled, transactions that have been acknowledged by one slave can be lost.

Fix:
===
A transaction boundary parser was implemented. It gives information about the transaction boundaries of an event stream based on the event types and their queries (when they are Query_log_event).

As events are queued by the I/O thread, it feeds the Master_info transaction boundary parser. The slave I/O recovery also uses the transaction parser to determine whether a given GTID can be added to the Retrieved_Gtid_Set or not.

When the event parser is in GTID state because a Gtid_log_event was queued, the event's GTID is not added to the retrieved list yet. It is stored in an auxiliary GTID variable.

After flushing an event into the relay log, the I/O thread checks whether the transaction parser is no longer inside a transaction (meaning that the last event of the transaction has been flushed). If the transaction parser is outside a transaction, the I/O thread checks whether a GTID was stored at the start of the transaction and, if so, adds it to the retrieved list. This ensures that the whole transaction has arrived and been flushed to the relay log.

Also, before this patch, after the I/O thread flushed a single received event into the relay log, it was possible to rotate the relay log if the current relay log file size exceeded max_binlog_size/max_relay_log_size. After this patch, when GTIDs are enabled, this rotation by size is only allowed if the transaction parser is not in the middle of a transaction.

Note: This patch removes the changes for BUG#17280176, as they dealt with a similar problem in a different way.
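The deferred-GTID rule and the rotation rule described in the commit message can be illustrated with a small model. This is only a sketch in Python, not the server's C++ code: the class names `BoundaryParser` and `Receiver`, the string event types, and the three-state machine are assumptions made for illustration, and the real parser handles more event types and error cases.

```python
from typing import List, Optional


class BoundaryParser:
    """Simplified, hypothetical model of a transaction boundary parser."""

    def __init__(self) -> None:
        # "NONE": outside a transaction
        # "GTID": a GTID event was seen, the transaction body is still pending
        # "EXPLICIT": inside a BEGIN ... COMMIT/XID block
        self.state = "NONE"

    @property
    def in_transaction(self) -> bool:
        return self.state != "NONE"

    def feed(self, ev_type: str, query: str = "") -> None:
        if ev_type == "GTID":
            self.state = "GTID"
        elif ev_type == "XID":
            self.state = "NONE"
        elif ev_type == "QUERY":
            q = query.strip().upper()
            if q == "BEGIN":
                self.state = "EXPLICIT"
            elif q in ("COMMIT", "ROLLBACK"):
                self.state = "NONE"
            elif self.state == "GTID":
                # A statement directly after the GTID, with no BEGIN,
                # is treated as a complete one-statement transaction.
                self.state = "NONE"
        # Any other event type leaves the state unchanged.


class Receiver:
    """Hypothetical stand-in for the I/O thread queueing events."""

    def __init__(self, max_relay_log_size: int) -> None:
        self.parser = BoundaryParser()
        self.max_relay_log_size = max_relay_log_size
        self.pending_gtid: Optional[str] = None  # the auxiliary GTID variable
        self.retrieved: List[str] = []           # stands in for Retrieved_Gtid_Set
        self.relay_log_size = 0
        self.rotations = 0

    def queue_event(self, ev_type: str, payload: str = "", size: int = 1) -> None:
        self.parser.feed(ev_type, payload)
        if ev_type == "GTID":
            self.pending_gtid = payload          # remember it, do not publish yet
        self.relay_log_size += size              # stands in for write + flush
        if self.parser.in_transaction:
            return                               # no GTID update, no rotation
        if self.pending_gtid is not None:
            self.retrieved.append(self.pending_gtid)
            self.pending_gtid = None
        if self.relay_log_size > self.max_relay_log_size:
            self.rotations += 1
            self.relay_log_size = 0
```

In this model, feeding a GTID event, `BEGIN`, and an `INSERT` leaves `retrieved` empty and performs no rotation even if the size limit is already exceeded; both happen only once the closing XID event has been queued.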
Joao Gramacho committed Jan 16, 2015
1 parent 25d1855, commit 9dab9da
Showing 42 changed files with 5,686 additions and 110 deletions.
@@ -0,0 +1,44 @@
# ==== Purpose ====
#
# This include will insert some data into a table at the master varying the
# debug sync point at slave that will be used to stop the IO thread in the
# middle of transaction event stream (trying to let partial transactions in
# the relay log).
#
# It will do this task (insert some data) twice.
#
# The first time with the SQL thread stopped, letting the IO thread do its job
# until all data is replicated, starting the SQL only at the end of the test.
#
# The second time, the SQL thread will be running all the time, syncing on each
# step of the test.
#
# ==== Usage ====
#
# [--let $storage_engine= InnoDB | MyISAM]
# --source extra/rpl_tests/rpl_trx_boundary_parser.inc
#
# Parameters:
#   $storage_engine
#     The storage engine that will be used in the CREATE TABLE statement.
#     If not specified, InnoDB will be used.
#

if ( `SELECT '$storage_engine' != '' AND UPPER('$storage_engine') <> 'INNODB' AND UPPER('$storage_engine') <> 'MYISAM'` )
{
  --die ERROR IN TEST: invalid value for mysqltest variable 'storage_engine': $storage_engine
}

--echo ## Running the test with the SQL thread stopped
--source include/rpl_connection_slave.inc
--source include/stop_slave_sql.inc
--source extra/rpl_tests/rpl_trx_boundary_parser_all_steps.inc

--echo ## Starting and syncing the SQL thread before next round
--source include/rpl_connection_slave.inc
--source include/start_slave_sql.inc
--source include/rpl_connection_master.inc
--source include/sync_slave_sql_with_master.inc

--echo ## Running the test with the SQL thread started
--source extra/rpl_tests/rpl_trx_boundary_parser_all_steps.inc
mysql-test/extra/rpl_tests/rpl_trx_boundary_parser_all_steps.inc
252 changes: 252 additions & 0 deletions
@@ -0,0 +1,252 @@
# ==== Purpose ====
#
# This include will insert some data into a table at the master varying the
# debug sync point at slave that will be used to stop the IO thread in the
# middle of transaction event stream (trying to let partial transactions in
# the relay log).
#
# ==== Usage ====
#
# [--let $storage_engine= InnoDB | MyISAM]
# --source extra/rpl_tests/rpl_trx_boundary_parser_all_steps.inc
#
# Parameters:
#   $storage_engine
#     The storage engine that will be used in the CREATE TABLE statement.
#     If not specified, InnoDB will be used.
#

if (!$storage_engine)
{
  --let $_storage_engine= INNODB
}
if ($storage_engine)
{
  --let $_storage_engine= `SELECT UPPER('$storage_engine')`
}
if ( `SELECT '$_storage_engine' <> 'INNODB' AND '$_storage_engine' <> 'MYISAM'` )
{
  --die ERROR IN TEST: invalid value for mysqltest variable 'storage_engine': $storage_engine
}

# Check if SQL thread is running
--source include/rpl_connection_slave.inc
--let $_is_sql_thread_running= query_get_value(SHOW SLAVE STATUS, Slave_SQL_Running, 1)

# If the SQL thread is stopped, we will assert GTIDs based on
# Retrieved_Gtid_Set
if ( $_is_sql_thread_running == No )
{
  --let $assert_on_retrieved_gtid_set= 1
  --let $gtid_step_assert_include=include/gtid_step_assert_on_retrieved.inc
  --let $gtid_step_reset_include=include/gtid_step_reset_on_retrieved.inc
}
if ( $_is_sql_thread_running == Yes )
{
  --let $assert_on_retrieved_gtid_set= 0
  --let $gtid_step_assert_include=include/gtid_step_assert.inc
  --let $gtid_step_reset_include=include/gtid_step_reset.inc
}

--source include/rpl_connection_master.inc
# GTID steps will be based on master's UUID
--let $gtid_step_uuid= `SELECT @@GLOBAL.SERVER_UUID`
--source include/rpl_connection_slave.inc
--source $gtid_step_reset_include

# Creating tables t1 and t2 using $_storage_engine
# Table t1 will log the testcase activity
# Table t2 will be used to insert data to be tested
--source include/rpl_connection_master.inc
--eval CREATE TABLE t1 (i INT NOT NULL AUTO_INCREMENT PRIMARY KEY, info VARCHAR(64)) ENGINE=$_storage_engine
--eval CREATE TABLE t2 (i INT) ENGINE=$_storage_engine

#
# First, we insert some data, restart the slave IO thread and
# sync slave SQL thread (if it is running) with master
# as a normal case just for control.
#

# Insert data without splitting transactions in the relay log
INSERT INTO t1 (info) VALUE ('Insert data without splitting transactions in the relay log');

BEGIN;
INSERT INTO t2 (i) VALUES (-6);
INSERT INTO t2 (i) VALUES (-5);
INSERT INTO t2 (i) VALUES (-4);
COMMIT;

# Check if SQL thread was running before to sync it
if ( $_is_sql_thread_running == Yes )
{
  # Sync SQL thread
  --source include/rpl_connection_master.inc
  --source include/sync_slave_sql_with_master.inc
  --let diff_tables= master:t1, slave:t1
  --source include/diff_tables.inc
}
# Else we sync only the IO thread
if ( $_is_sql_thread_running == No )
{
  # Sync IO thread
  --source include/rpl_connection_master.inc
  --source include/sync_slave_io_with_master.inc
}

# Restart the IO thread not in the middle of transaction
--source include/rpl_connection_slave.inc
--source include/stop_slave_io.inc
--source include/start_slave_io.inc

# Check if the IO thread retrieved the correct amount of GTIDs
--source include/rpl_connection_slave.inc
--let $gtid_step_count= 4
if ($_storage_engine == 'MYISAM')
{
  --let $gtid_step_count= 6
}
--source $gtid_step_assert_include

#
# Second, we make master rotate its binlog
#

# Insert data rotating master binlog between two transactions
--source include/rpl_connection_master.inc
INSERT INTO t1 (info) VALUE ('Insert data rotating master binlog between two transactions');

BEGIN;
INSERT INTO t2 (i) VALUES (-3);
INSERT INTO t2 (i) VALUES (-2);
COMMIT;
FLUSH LOGS;
INSERT INTO t1 (info) VALUE ('After FLUSH LOGS at master');
BEGIN;
INSERT INTO t2 (i) VALUES (-1);
INSERT INTO t2 (i) VALUES (0);
COMMIT;

# Check if SQL thread was running before to sync it
if ( $_is_sql_thread_running == Yes )
{
  # Sync SQL thread
  --source include/rpl_connection_master.inc
  --source include/sync_slave_sql_with_master.inc
  --let diff_tables= master:t1, slave:t1
  --source include/diff_tables.inc
}
# Else we sync only the IO thread
if ( $_is_sql_thread_running == No )
{
  # Sync IO thread
  --source include/rpl_connection_master.inc
  --source include/sync_slave_io_with_master.inc
}

# Restart the IO thread again, not in the middle of transaction
--source include/rpl_connection_slave.inc
--source include/stop_slave_io.inc
--source include/start_slave_io.inc

# Check if the IO thread retrieved the correct amount of GTIDs
--source include/rpl_connection_slave.inc
--let $gtid_step_count= 4
if ($_storage_engine == 'MYISAM')
{
  # We will expect a different amount of GTIDs, as the non-transactional
  # storage engine will "ignore" the BEGIN/COMMIT boundaries and will
  # generate one transaction for each INSERT statement.
  --let $gtid_step_count= 6
}
--source $gtid_step_assert_include

#
# Third, let's go with splitting transactions
#

--let $info_table= t1
--let $table= t2
--let $counter= 0

# Stop after GTID, just if GTIDs are enabled
--inc $counter
--let $debug_point= stop_io_after_reading_gtid_log_event
--let $gtids_after_stop= 1
--let $gtids_after_sync= 2
if ($_storage_engine == 'MYISAM')
{
  --let $gtids_after_sync= 3
}
--source extra/rpl_tests/rpl_trx_boundary_parser_one_step.inc

# Stop after BEGIN query
--inc $counter
--let $debug_point= stop_io_after_reading_query_log_event
--let $gtids_after_stop= 1
--let $gtids_after_sync= 2
if ($_storage_engine == 'MYISAM')
{
  --let $gtids_after_sync= 3
}
--source extra/rpl_tests/rpl_trx_boundary_parser_one_step.inc

# Stop after USER_VAR, just for SBR
if ( `SELECT @@GLOBAL.binlog_format = 'STATEMENT'` )
{
  --inc $counter
  --let $debug_point= stop_io_after_reading_user_var_log_event
  --let $gtids_after_stop= 1
  --let $gtids_after_sync= 2
  if ($_storage_engine == 'MYISAM')
  {
    --let $gtids_after_stop= 2
    --let $gtids_after_sync= 2
  }
  --source extra/rpl_tests/rpl_trx_boundary_parser_one_step.inc
}

# Stop after TABLE_MAP, just for RBR
if ( `SELECT @@GLOBAL.binlog_format = 'ROW'` )
{
  --inc $counter
  --let $debug_point= stop_io_after_reading_table_map_event
  --let $gtids_after_stop= 1
  --let $gtids_after_sync= 2
  if ($_storage_engine == 'MYISAM')
  {
    --let $gtids_after_stop= 1
    --let $gtids_after_sync= 3
  }
  --source extra/rpl_tests/rpl_trx_boundary_parser_one_step.inc
}

# Stop after XID, just for InnoDB tables
if ( $_storage_engine == 'INNODB' )
{
  --inc $counter
  --let $debug_point= stop_io_after_reading_xid_log_event
  --let $gtids_after_stop= 2
  --let $gtids_after_sync= 1
  --source extra/rpl_tests/rpl_trx_boundary_parser_one_step.inc
}

# Check if SQL thread was running before to sync it
if ( $_is_sql_thread_running == Yes )
{
  # Sync SQL thread
  --source include/rpl_connection_master.inc
  --source include/sync_slave_sql_with_master.inc
  --let diff_tables= master:t1, slave:t1
  --source include/diff_tables.inc
}

# Dropping tables t1 and t2
--source include/rpl_connection_master.inc
DROP TABLE t1,t2;

# Check if SQL thread was running before to sync it
if ( $_is_sql_thread_running == Yes )
{
  # Let the slave to sync with the master before exiting the include
  --source include/sync_slave_sql_with_master.inc
}
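
The `$gtid_step_count` values asserted in the first two steps of this include (4 for InnoDB, 6 for MyISAM) can be reproduced with a small counting sketch. It is written in Python for illustration and rests on assumptions that are not spelled out in the test itself: every autocommit statement (including each `CREATE TABLE`) gets one GTID, a `BEGIN ... COMMIT` block gets one GTID on a transactional engine and one per statement on a non-transactional engine (as the comment in the second step says), and `FLUSH LOGS` generates no GTID. The function name `expected_gtids` and the workload lists are hypothetical.

```python
from typing import List, Union

# A workload item is either one autocommit statement (str)
# or the statements of one BEGIN ... COMMIT block (list of str).
Item = Union[str, List[str]]


def expected_gtids(workload: List[Item], transactional: bool) -> int:
    """Count the GTIDs a workload is expected to generate (simplified model)."""
    count = 0
    for item in workload:
        if isinstance(item, list):
            # One GTID per block on a transactional engine,
            # one per statement otherwise.
            count += 1 if transactional else len(item)
        elif item.strip().upper().startswith("FLUSH LOGS"):
            continue  # assumed to rotate the binlog without a GTID
        else:
            count += 1
    return count


# First step: the two CREATE TABLE statements run after the GTID step reset.
STEP_ONE: List[Item] = [
    "CREATE TABLE t1",
    "CREATE TABLE t2",
    "INSERT INTO t1",
    ["INSERT -6", "INSERT -5", "INSERT -4"],
]

# Second step: two transactions separated by a binlog rotation.
STEP_TWO: List[Item] = [
    "INSERT INTO t1",
    ["INSERT -3", "INSERT -2"],
    "FLUSH LOGS",
    "INSERT INTO t1",
    ["INSERT -1", "INSERT 0"],
]
```

Under these assumptions both steps come out at 4 GTIDs for a transactional engine and 6 for a non-transactional one, matching the values the include asserts. The `$gtids_after_stop` and `$gtids_after_sync` values of the third step depend on `rpl_trx_boundary_parser_one_step.inc`, which is not shown here, so they are not modeled.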