
Add --until-binlog-last-modified-time option to `wal-g-mysql binlog-replay` #1154

Merged
merged 3 commits into from Nov 22, 2021

Conversation

ostinru
Contributor

@ostinru ostinru commented Nov 19, 2021

Add --until-binlog-last-modified-time option to `wal-g-mysql binlog-replay`. This may be useful for achieving exact clones of the same database in scenarios where new binlogs are uploaded while the replay process runs on different hosts.

@ostinru ostinru requested a review from a team as a code owner November 19, 2021 15:04
Contributor

@mialinx mialinx left a comment


didn't understand how this PR fixes the problem

@@ -24,7 +27,7 @@ var binlogFetchCmd = &cobra.Command{
 	Run: func(cmd *cobra.Command, args []string) {
 		folder, err := internal.ConfigureFolder()
 		tracelog.ErrorLogger.FatalOnError(err)
-		mysql.HandleBinlogFetch(folder, fetchBackupName, fetchUntilTS)
+		mysql.HandleBinlogFetch(folder, fetchBackupName, fetchUntilTS, fetchUntilBinlogLastModifiedTS)
Contributor


What's the difference between fetchUntilTS and fetchUntilBinlogLastModifiedTS?
Can we use fetchUntilTS instead of fetchUntilBinlogLastModifiedTS?

Contributor Author

@ostinru ostinru Nov 21, 2021


Here we should distinguish two timestamps:
fetchUntilTS - the time of individual transactions
fetchUntilBinlogLastModifiedTS - the time when the binlog was created

There are two properties:
fetchUntilTS < fetchUntilBinlogLastModifiedTS - because transactions are written to the binlog before the binlog is uploaded
--stop-datetime < fetchUntilBinlogLastModifiedTS - this is the new rule that fixes multi-host restores (see the detailed explanation in the other comment)

@@ -271,6 +271,9 @@ func getLogsCoveringInterval(folder storage.Folder, start time.Time, includeStar
 	})
 	var logsToFetch []storage.Object
 	for _, logFile := range logFiles {
+		if logFile.GetLastModified().After(endBinlogTS) {
+			continue // don't fetch binlogs from future
+		}
Contributor


This should not affect the final result, as mysqlbinlog is started as

mysqlbinlog --stop-datetime="$WALG_MYSQL_BINLOG_END_TS"

and should not replay transactions that happened after fetchUntilTS.

Contributor Author

@ostinru ostinru Nov 21, 2021


This patch fixes the following issue:

A live MySQL cluster A constantly uploads binlogs. We are trying to restore a new cluster (let's call it B) from S3 to some point in time, e.g. NOW() (or a future time). Then we can get the following scenario:
'Fast' servers restore MySQL to Snapshot + binlog.123 (binlog.123 is the latest binlog observed by the 'fast' hosts).
'Slow' servers may restore MySQL to Snapshot + binlog.124 (because cluster A uploaded one more binlog during the 'slow' restore).
As a result we get different GTIDs within one cluster. It is easier to prevent this scenario than to fix it afterwards.

The root cause of the issue is simple: a newly uploaded binlog file may contain transactions that happened before the user-specified --stop-datetime (which is an arbitrary time, not "$WALG_MYSQL_BINLOG_END_TS"). Imagine a setup where a binlog is uploaded once a day (after calling FLUSH BINARY LOGS): for any user-specified timestamp within that day, the newly uploaded binlog will contain transactions preceding it.

To prevent diverging GTID sets we should:

  • require the target point in time to be NOW() or in the past;
  • introduce a hard limit on which binlogs we replay.

These two rules guarantee that all restored clones replay the same binlog files and, as a result, end up with the same GTID sets.

@mialinx mialinx merged commit 700900a into master Nov 22, 2021