
--test-on-replica should not write to binlogs #646

Open
Jericon opened this issue Sep 27, 2018 · 6 comments

Jericon commented Sep 27, 2018

In our environment, we do not use GTIDs, and we run most clusters in an Active/Passive Master/Master configuration. gh-ost's --test-on-replica currently assumes the host is a leaf node with no replicas, yet it still writes the migration's changes to the binlog.

It would be beneficial to have an additional flag that suppresses writing to the binlog.

The specific situation I am in: I am compressing some large tables. Based on small tests and estimates, we should have enough disk space to complete the compression of one table. I had intended to run the migration on the passive master, which takes no traffic and where running out of disk space would have no real impact. Because the migration is written to the binlogs, though, it would cause the active master to run out of space as well.
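For illustration, the MySQL mechanism such a flag would most likely lean on is the session variable sql_log_bin. A minimal sketch of the idea, not an existing gh-ost feature; the table name and compression settings are made up for this example:

```sql
-- Sketch only: suppressing binlog writes for the current session.
-- Requires SUPER (or SESSION_VARIABLES_ADMIN on MySQL 8.0).
SET SESSION sql_log_bin = 0;

-- Hypothetical migration work on the test replica; with sql_log_bin = 0
-- these statements never reach the binary log, so they cannot replicate
-- to the active master or any downstream replicas.
ALTER TABLE mydb.big_table ROW_FORMAT = COMPRESSED, KEY_BLOCK_SIZE = 8;

SET SESSION sql_log_bin = 1;
```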

ggunson commented Sep 27, 2018

Note: @Jericon and I discussed this earlier.

From a GTID standpoint, even if --test-on-replica were run on a leaf node that was later promoted to master, its gtid_executed/gtid_purged would include the events from those local writes to tablename_gho, but its new replicas wouldn't have them (requiring some wrangling of gtid_purged to fix replication).

I can see the usefulness of being able to test on a replica both with and without writing to the binary logs, but we've listed several scenarios where it would be better to have the option not to.
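To make the promotion hazard concrete, here is a sketch of how the errant transactions would show up, assuming GTID mode. GTID_SUBTRACT returns whatever the first set contains that the second does not; the UUID and range below are placeholders:

```sql
-- Run on the leaf node used for --test-on-replica. The literal set is a
-- placeholder for the gtid_executed value copied from another server in
-- the topology. A non-empty result is the errant set: transactions this
-- node has (the local _gho writes) that its future replicas never saw.
SELECT GTID_SUBTRACT(
         @@GLOBAL.gtid_executed,
         '3e11fa47-71ca-11e1-9e33-c80aa9429562:1-5000'
       ) AS errant_transactions;
```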

@shlomi-noach

Related: #146, #149, #254

zmoazeni commented Jun 10, 2019

@shlomi-noach I read #254 (as well as the rest of the issues) and wanted to verify: is the approach we should take here to reset the GTID purged/executed sets on the replica, effectively doctoring the set of applied GTIDs as if the test never happened? (Only when the option is passed.)

For what it's worth, I ran into this same issue in production over a year ago: we promoted a replica, and the other replicas were blocked from starting replication. We realized after hours of digging that it was caused by our replica-only tests from long before.

@shlomi-noach

@zmoazeni yes, correct.
It's noteworthy that since then I've put a lot of GTID logic into orchestrator and have learned quite a few things myself. So yes, you will probably want to run a gtid_purged/RESET MASTER sequence (which is hard to like), or alternatively (which I like even less) apply those errant transactions on the master. I really can't recommend the latter for an operation as massive as a migration, given the sheer number of errant GTID transactions involved.
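For reference, a sketch of that reset sequence, run on the server holding the errant transactions. The GTID set is a placeholder; in practice it would be the server's real gtid_executed minus the errant set computed earlier:

```sql
STOP SLAVE;    -- STOP REPLICA on MySQL 8.0.22+
RESET MASTER;  -- discards local binlogs, clears gtid_executed/gtid_purged
-- Re-seed the GTID state without the errant transactions (placeholder set):
SET GLOBAL gtid_purged = '3e11fa47-71ca-11e1-9e33-c80aa9429562:1-5000';
START SLAVE;
```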

zmoazeni commented Jun 10, 2019

Yeah, we ended up doing the latter with a one-off script. But it did make us nervous.

shlomi-noach commented Jun 10, 2019

The latter (applying errant GTIDs on the master) is actually safer. However:

  • it is slower, because you may need to apply millions of transactions;
  • it bloats the gtid_executed set on all servers.
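For completeness, the usual way to "apply" an errant transaction on the master without replaying its data is to inject an empty transaction per errant GTID; a sketch with a placeholder GTID:

```sql
-- Claim one errant GTID on the master by committing an empty transaction;
-- it then replicates everywhere and joins every server's gtid_executed.
-- A migration can leave millions of these, hence the slowness and bloat.
SET GTID_NEXT = '3e11fa47-71ca-11e1-9e33-c80aa9429562:5001';
BEGIN;
COMMIT;
SET GTID_NEXT = 'AUTOMATIC';
```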

zmoazeni mentioned this issue Jun 22, 2019