Commit
fix missing big transaction with GTIDs
Summary: This diff ports Oracle's patch for the missing big transaction with GTIDs. It also includes a fix for the same problem when parallel replication is enabled.

The I/O thread may receive only part of a transaction before it is stopped with STOP SLAVE. This causes a partial transaction with a GTID to be logged in the relay log. When the slave is restarted, it misses the transaction because the GTID protocol assumes that a GTID logged in the relay log is complete. This is fixed by removing the last GTID in the relay log from gtid_retrieved_set, which causes the master to resend the whole transaction.

Possible cases:

1) If there is a partial transaction, the whole transaction is retrieved again into the next relay log, which is then executed by the SQL thread. The SQL thread rolls back the partial transaction after seeing the FDE (format description event) in the next relay log and starts executing the same transaction, which was retrieved again. In MTS mode, the SQL thread appends a ROLLBACK query to the queue of the slave worker that got the partial transaction.

2) The I/O thread has already retrieved the full transaction and the SQL thread has already executed it. In that case, we do not remove the last retrieved GTID from "Retrieved_gtid_set"; otherwise we would see gaps in the "Retrieved set".

3) The I/O thread retrieved the full transaction the first time, and the SQL thread had not applied it yet when the dump was requested but applied it after the I/O thread started receiving events from the master. In this case, retrieving the same transaction again does not cause a problem because the GTID number is the same, so the SQL thread will not commit it again.

Please note that partial transactions will be written in the relay log, but they do not cause any problem for transactional tables. For non-transactional tables, a partial transaction creates inconsistency between master and slave. In that case, users need to check manually. This is not a problem for us since we are using transactional tables.

Test Plan: Added a test to verify all the scenarios with and without MTS. Also ran mysqltest.sh --parallel=32 with and without valgrind.

Reviewers: steaphan, jtolmer

Reviewed By: steaphan
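The core idea of the summary (forget the GTID of a trailing partial transaction so that the master resends it) can be sketched as a small Python model. This is only an illustration under simplifying assumptions, not the server code: `RelayLog`, `prepare_for_reconnect`, and `gtids_to_request` are hypothetical names, GTIDs are plain integers, and the request to the master is modelled using the retrieved set alone.

```python
from dataclasses import dataclass, field

# Hypothetical event kinds; a real relay log carries many more event types.
GTID, BEGIN, STMT, COMMIT = "GTID", "BEGIN", "STMT", "COMMIT"


@dataclass
class RelayLog:
    events: list = field(default_factory=list)   # (kind, gtid) tuples in arrival order
    retrieved: set = field(default_factory=set)  # stands in for gtid_retrieved_set

    def append(self, kind, gtid):
        self.events.append((kind, gtid))
        if kind == GTID:
            # The GTID is recorded as soon as its event arrives,
            # before the rest of the transaction has been received.
            self.retrieved.add(gtid)

    def last_transaction_is_partial(self):
        """True if the log ends inside a transaction (no COMMIT seen yet)."""
        return bool(self.events) and self.events[-1][0] != COMMIT

    def prepare_for_reconnect(self):
        """Model of the fix: drop the GTID of a trailing partial transaction."""
        if self.last_transaction_is_partial():
            self.retrieved.discard(self.events[-1][1])


def gtids_to_request(master_gtids, relay_log):
    """Simplified auto-positioning: the master sends what the slave lacks."""
    return sorted(master_gtids - relay_log.retrieved)


log = RelayLog()
for kind in (GTID, BEGIN, STMT, COMMIT):   # transaction 1 arrives in full
    log.append(kind, 1)
for kind in (GTID, BEGIN):                 # I/O thread stops inside transaction 2
    log.append(kind, 2)

print(gtids_to_request({1, 2}, log))   # [] : transaction 2 would be skipped
log.prepare_for_reconnect()
print(gtids_to_request({1, 2}, log))   # [2] : the master resends it in full
```

A log that ends on a COMMIT is left untouched by `prepare_for_reconnect`, which corresponds to case 2 of the summary.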
1 parent 97c9324, commit 9bcc118
Showing 14 changed files with 558 additions and 49 deletions.
150 changes: 150 additions & 0 deletions
mysql-test/suite/rpl/r/rpl_gtid_missing_big_event.result
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,150 @@ | ||
include/master-slave.inc | ||
Warnings: | ||
Note #### Sending passwords in plain text without SSL/TLS is extremely insecure. | ||
Note #### Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information. | ||
[connection master] | ||
create table t1(a int) engine = innodb; | ||
include/stop_slave.inc | ||
change master to master_auto_position = 1; | ||
include/start_slave.inc | ||
== Testing scenario1 where a partial transaction is written in == | ||
== relay log and a stop slave; start slave; are executed == | ||
** Test scenario1 without MTS ** | ||
insert into t1 values(1); | ||
set global debug = "d,partial_relay_log_transaction"; | ||
insert into t1 values (2); | ||
include/wait_for_slave_io_to_stop.inc | ||
select * from t1; | ||
a | ||
1 | ||
set global debug = ``; | ||
include/start_slave_io.inc | ||
select * from t1; | ||
a | ||
1 | ||
2 | ||
** Test scenario1 with MTS ** | ||
include/stop_slave.inc | ||
set @@global.slave_parallel_workers = 2; | ||
include/start_slave.inc | ||
delete from t1; | ||
insert into t1 values(1); | ||
set global debug = "d,partial_relay_log_transaction"; | ||
insert into t1 values(2); | ||
include/wait_for_slave_io_to_stop.inc | ||
select * from t1; | ||
a | ||
1 | ||
set global debug = ``; | ||
include/start_slave_io.inc | ||
select * from t1; | ||
a | ||
1 | ||
2 | ||
include/stop_slave.inc | ||
set @@global.slave_parallel_workers = 0; | ||
include/start_slave.inc | ||
== Testing scenario2 where a complete transaction is == | ||
== retrieved by i/o thread and sql thread executed it == | ||
** Test scenario2 without MTS ** | ||
delete from t1; | ||
insert into t1 values(1); | ||
include/stop_slave.inc | ||
include/start_slave.inc | ||
insert into t1 values(2); | ||
select * from t1; | ||
a | ||
1 | ||
2 | ||
** Test scenario2 with MTS ** | ||
include/stop_slave.inc | ||
set @@global.slave_parallel_workers = 2; | ||
include/start_slave.inc | ||
delete from t1; | ||
insert into t1 values(1); | ||
include/stop_slave.inc | ||
include/start_slave.inc | ||
insert into t1 values(2); | ||
select * from t1; | ||
a | ||
1 | ||
2 | ||
include/stop_slave.inc | ||
set @@global.slave_parallel_workers = 0; | ||
include/start_slave.inc | ||
== Testing scenario3 where a complete transaction is == | ||
== retrieved by i/o thread but sql thread didn't execute it == | ||
== retrieving same transaction here is not a problem since == | ||
== sql thread just skips if a GTID is already committed == | ||
** Test scenario3 without MTS ** | ||
delete from t1; | ||
include/stop_slave_sql.inc | ||
insert into t1 values(1); | ||
include/sync_slave_io_with_master.inc | ||
include/stop_slave_io.inc | ||
include/start_slave.inc | ||
insert into t1 values(2); | ||
select * from t1; | ||
a | ||
1 | ||
2 | ||
** Test scenario3 with MTS ** | ||
include/stop_slave.inc | ||
set @@global.slave_parallel_workers = 2; | ||
include/start_slave.inc | ||
delete from t1; | ||
include/stop_slave_sql.inc | ||
insert into t1 values(1); | ||
include/sync_slave_io_with_master.inc | ||
include/stop_slave_io.inc | ||
include/start_slave.inc | ||
insert into t1 values(2); | ||
select * from t1; | ||
a | ||
1 | ||
2 | ||
include/stop_slave.inc | ||
set @@global.slave_parallel_workers = 0; | ||
include/start_slave.inc | ||
== Testing scenario4 where a gtid event is written in == | ||
== relay log and a stop slave; start slave; are executed == | ||
** Test scenario4 without MTS ** | ||
delete from t1; | ||
insert into t1 values(1); | ||
set global debug = "d,partial_relay_log_transaction_with_only_gtid"; | ||
insert into t1 values (2); | ||
include/wait_for_slave_io_to_stop.inc | ||
select * from t1; | ||
a | ||
1 | ||
set global debug = ``; | ||
include/start_slave_io.inc | ||
select * from t1; | ||
a | ||
1 | ||
2 | ||
** Test scenario4 with MTS ** | ||
include/stop_slave.inc | ||
set @@global.slave_parallel_workers=2; | ||
include/start_slave.inc | ||
delete from t1; | ||
insert into t1 values(1); | ||
set global debug = "d,partial_relay_log_transaction_with_only_gtid"; | ||
insert into t1 values (2); | ||
include/wait_for_slave_io_to_stop.inc | ||
select * from t1; | ||
a | ||
1 | ||
set global debug = ``; | ||
include/start_slave_io.inc | ||
select * from t1; | ||
a | ||
1 | ||
2 | ||
** Clean up ** | ||
include/stop_slave.inc | ||
set @@global.slave_parallel_workers = 0; | ||
change master to master_auto_position=0; | ||
include/start_slave.inc | ||
drop table t1; | ||
include/rpl_end.inc |
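Scenario 3 in the result above relies on the SQL thread skipping a transaction whose GTID has already been committed. A minimal Python sketch of that behaviour follows; the function and variable names are hypothetical, GTIDs are plain integers, and `executed` merely stands in for the set of already committed GTIDs.

```python
def apply_relay_log(transactions, executed, table):
    """Apply (gtid, row) pairs in order, skipping GTIDs already executed."""
    for gtid, row in transactions:
        if gtid in executed:
            continue            # committed once already; do not apply it again
        table.append(row)
        executed.add(gtid)
    return table


# Transaction 1 appears twice because it was retrieved again after the restart.
rows = apply_relay_log([(1, 1), (1, 1), (2, 2)], executed=set(), table=[])
print(rows)   # [1, 2]
```

The output matches the rows 1 and 2 that the test expects in t1 for this scenario.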
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
--gtid_mode=ON --enforce_gtid_consistency --log_slave_updates |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
--gtid_mode=ON --enforce_gtid_consistency --log_slave_updates |
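These option lines enable GTID mode for the test servers. Case 2 of the summary warns about gaps in the retrieved set; the following Python sketch shows how such a gap would appear if the last GTID were removed even though its transaction was complete. `format_gtid_set` is a hypothetical helper and `uuid` is a placeholder for a real server UUID.

```python
def format_gtid_set(uuid, numbers):
    """Render transaction numbers as intervals, for example uuid:1-4:6."""
    intervals = []
    for n in sorted(set(numbers)):
        if intervals and n == intervals[-1][1] + 1:
            intervals[-1][1] = n          # extend the current interval
        else:
            intervals.append([n, n])      # start a new interval
    if not intervals:
        return ""
    rendered = [str(lo) if lo == hi else f"{lo}-{hi}" for lo, hi in intervals]
    return ":".join([uuid] + rendered)


retrieved = {1, 2, 3, 4, 5}
print(format_gtid_set("uuid", retrieved))   # uuid:1-5

# Wrongly dropping the complete transaction 5: it is already executed, so it
# is not sent again, and the next transaction received is number 6.
retrieved.discard(5)
retrieved.add(6)
print(format_gtid_set("uuid", retrieved))   # uuid:1-4:6
```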
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,204 @@ | ||
source include/master-slave.inc; | ||
source include/have_gtid.inc; | ||
source include/have_debug.inc; | ||
source include/have_innodb.inc; | ||
source include/have_binlog_format_statement.inc; | ||
|
||
let $old_debug = `select @@global.debug;`; | ||
connection master; | ||
create table t1(a int) engine = innodb; | ||
sync_slave_with_master; | ||
source include/stop_slave.inc; | ||
change master to master_auto_position = 1; | ||
source include/start_slave.inc; | ||
|
||
--echo == Testing scenario1 where a partial transaction is written in == | ||
--echo == relay log and a stop slave; start slave; are executed == | ||
--echo ** Test scenario1 without MTS ** | ||
connection master; | ||
insert into t1 values(1); | ||
sync_slave_with_master; | ||
set global debug = "d,partial_relay_log_transaction"; | ||
|
||
connection master; | ||
insert into t1 values (2); | ||
connection slave; | ||
|
||
source include/wait_for_slave_io_to_stop.inc; | ||
select * from t1; | ||
eval set global debug = `$old_debug`; | ||
source include/start_slave_io.inc; | ||
|
||
connection master; | ||
sync_slave_with_master; | ||
select * from t1; | ||
|
||
--echo ** Test scenario1 with MTS ** | ||
source include/stop_slave.inc; | ||
set @@global.slave_parallel_workers = 2; | ||
source include/start_slave.inc; | ||
|
||
connection master; | ||
delete from t1; | ||
insert into t1 values(1); | ||
sync_slave_with_master; | ||
set global debug = "d,partial_relay_log_transaction"; | ||
|
||
connection master; | ||
insert into t1 values(2); | ||
connection slave; | ||
|
||
source include/wait_for_slave_io_to_stop.inc; | ||
select * from t1; | ||
eval set global debug = `$old_debug`; | ||
source include/start_slave_io.inc; | ||
|
||
let $count=2; | ||
let $table=t1; | ||
let $wait_timeout= 300; | ||
source include/wait_until_rows_count.inc; | ||
select * from t1; | ||
source include/stop_slave.inc; | ||
set @@global.slave_parallel_workers = 0; | ||
source include/start_slave.inc; | ||
|
||
--echo == Testing scenario2 where a complete transaction is == | ||
--echo == retrieved by i/o thread and sql thread executed it == | ||
--echo ** Test scenario2 without MTS ** | ||
connection master; | ||
delete from t1; | ||
insert into t1 values(1); | ||
sync_slave_with_master; | ||
|
||
source include/stop_slave.inc; | ||
source include/start_slave.inc; | ||
|
||
connection master; | ||
insert into t1 values(2); | ||
sync_slave_with_master; | ||
select * from t1; | ||
|
||
--echo ** Test scenario2 with MTS ** | ||
source include/stop_slave.inc; | ||
set @@global.slave_parallel_workers = 2; | ||
source include/start_slave.inc; | ||
connection master; | ||
delete from t1; | ||
insert into t1 values(1); | ||
sync_slave_with_master; | ||
|
||
source include/stop_slave.inc; | ||
source include/start_slave.inc; | ||
|
||
connection master; | ||
insert into t1 values(2); | ||
sync_slave_with_master; | ||
select * from t1; | ||
source include/stop_slave.inc; | ||
set @@global.slave_parallel_workers = 0; | ||
source include/start_slave.inc; | ||
|
||
--echo == Testing scenario3 where a complete transaction is == | ||
--echo == retrieved by i/o thread but sql thread didn't execute it == | ||
--echo == retrieving same transaction here is not a problem since == | ||
--echo == sql thread just skips if a GTID is already committed == | ||
--echo ** Test scenario3 without MTS ** | ||
connection master; | ||
delete from t1; | ||
sync_slave_with_master; | ||
source include/stop_slave_sql.inc; | ||
|
||
connection master; | ||
insert into t1 values(1); | ||
--let $use_gtids = 0 | ||
source include/sync_slave_io_with_master.inc; | ||
source include/stop_slave_io.inc; | ||
source include/start_slave.inc; | ||
|
||
connection master; | ||
insert into t1 values(2); | ||
sync_slave_with_master; | ||
select * from t1; | ||
|
||
--echo ** Test scenario3 with MTS ** | ||
source include/stop_slave.inc; | ||
set @@global.slave_parallel_workers = 2; | ||
source include/start_slave.inc; | ||
|
||
connection master; | ||
delete from t1; | ||
sync_slave_with_master; | ||
source include/stop_slave_sql.inc; | ||
|
||
connection master; | ||
insert into t1 values(1); | ||
--let $use_gtids = 0 | ||
source include/sync_slave_io_with_master.inc; | ||
source include/stop_slave_io.inc; | ||
source include/start_slave.inc; | ||
|
||
connection master; | ||
insert into t1 values(2); | ||
sync_slave_with_master; | ||
select * from t1; | ||
source include/stop_slave.inc; | ||
set @@global.slave_parallel_workers = 0; | ||
source include/start_slave.inc; | ||
|
||
|
||
--echo == Testing scenario4 where a gtid event is written in == | ||
--echo == relay log and a stop slave; start slave; are executed == | ||
--echo ** Test scenario4 without MTS ** | ||
connection master; | ||
delete from t1; | ||
insert into t1 values(1); | ||
sync_slave_with_master; | ||
set global debug = "d,partial_relay_log_transaction_with_only_gtid"; | ||
|
||
connection master; | ||
insert into t1 values (2); | ||
connection slave; | ||
|
||
source include/wait_for_slave_io_to_stop.inc; | ||
select * from t1; | ||
eval set global debug = `$old_debug`; | ||
source include/start_slave_io.inc; | ||
|
||
connection master; | ||
sync_slave_with_master; | ||
select * from t1; | ||
|
||
--echo ** Test scenario4 with MTS ** | ||
source include/stop_slave.inc; | ||
set @@global.slave_parallel_workers=2; | ||
source include/start_slave.inc; | ||
|
||
connection master; | ||
delete from t1; | ||
insert into t1 values(1); | ||
sync_slave_with_master; | ||
set global debug = "d,partial_relay_log_transaction_with_only_gtid"; | ||
|
||
connection master; | ||
insert into t1 values (2); | ||
connection slave; | ||
|
||
source include/wait_for_slave_io_to_stop.inc; | ||
select * from t1; | ||
eval set global debug = `$old_debug`; | ||
source include/start_slave_io.inc; | ||
|
||
let $count=2; | ||
let $table=t1; | ||
let $wait_timeout= 300; | ||
source include/wait_until_rows_count.inc; | ||
select * from t1; | ||
|
||
--echo ** Clean up ** | ||
source include/stop_slave.inc; | ||
set @@global.slave_parallel_workers = 0; | ||
change master to master_auto_position=0; | ||
source include/start_slave.inc; | ||
connection master; | ||
drop table t1; | ||
source include/rpl_end.inc; |
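Scenarios 1 and 4 of the test above depend on the SQL thread rolling back a partial transaction once it sees the format description event (FDE) of the next relay log. A rough single-threaded Python model of that rule is given below. The event names are hypothetical, GTID events are omitted for brevity, and the MTS variant, in which a ROLLBACK is appended to the worker queue, is not modelled.

```python
def sql_thread(events):
    """Replay relay log events and return the values that end up committed."""
    committed, pending, in_txn = [], [], False
    for kind, value in events:
        if kind == "FDE":
            # A new relay log begins: anything still open is a partial
            # transaction left over from the previous log, so roll it back.
            pending, in_txn = [], False
        elif kind == "BEGIN":
            in_txn = True
        elif kind == "STMT" and in_txn:
            pending.append(value)
        elif kind == "COMMIT" and in_txn:
            committed.extend(pending)
            pending, in_txn = [], False
    return committed


events = [
    ("FDE", None),
    ("BEGIN", None), ("STMT", 1), ("COMMIT", None),   # insert 1, complete
    ("BEGIN", None), ("STMT", 2),                     # insert 2, cut off
    ("FDE", None),                                    # next relay log starts
    ("BEGIN", None), ("STMT", 2), ("COMMIT", None),   # insert 2, resent in full
]
print(sql_thread(events))   # [1, 2]
```

The partial copy of the second insert never reaches the table, so the value 2 is committed exactly once.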