Resume Ghostferry after interruption #17

shuhaowu opened this issue May 1, 2018 · 0 comments

shuhaowu commented May 1, 2018

In order to handle schema changes on the source and target databases of the same application, we opted for a method where we pause Ghostferry at the beginning of a schema change and resume it after the schema change has completed on both the source and the target database.

In addition to being useful for schema changes, having a resumable Ghostferry is useful in general. An example would be if the target/source database becomes temporarily unavailable. As of right now, the data on the target must be cleaned up and Ghostferry restarted from the beginning.

Resume via Reconciliation

The main issue with Ghostferry being interrupted is that binlogs are no longer being streamed from the source to the target. If the binlogs are not streamed, the target database is no longer up to date and the data may not be valid. This issue is not exclusive to Ghostferry and also affects regular MySQL replication. The solution there involves starting replication at some user-specified binlog position. Implementing the same within Ghostferry would be difficult and inefficient:

  • Implementing the ability to follow the binlogs through an ALTER TABLE event will be difficult.
  • A row may change multiple times while Ghostferry is down. Updating the entire row multiple times will be inefficient.

Instead of replaying the binlogs as is, a different method can be employed to keep the target up to date:

  1. Resume Ghostferry with:
    1. a known good binlog position that has been streamed to the target,
    2. a known good cursor (PK) position where all rows with PK <= this position have been copied to the target, and
    3. a copy of the table schema cache valid at the known good binlog position.
  2. Loop through the binlogs from the known good position to the current position from SHOW MASTER STATUS. For each binlog event encountered, get the associated primary key and for that row:
    1. delete the row from the target database if it exists;
    2. copy the current row from the source database to the target database if the row has already been copied, which is determined by comparing the row’s PK with the known good cursor position.
    3. NOTE: this step will from here on be referred to as the reconciliation step (a rough sketch in Go follows this list).
  3. After the process is complete, we can simply start Ghostferry as normal.
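
To make the reconciliation step above a bit more concrete, here is a rough sketch in Go of what the loop could look like. This is an illustration only: ResumeState, Databases, and the method names (CurrentMasterPos, BinlogEventsBetween, DeleteRowOnTarget, CopyRowFromSource) are hypothetical and not part of the actual Ghostferry code.

```go
package reconcile

// ResumeState is the hypothetical state a previous Ghostferry run would save:
// a known good binlog position, a known good cursor (max copied PK) per table,
// and a copy of the table schema cache valid at that binlog position.
type ResumeState struct {
	KnownGoodBinlogPos BinlogPosition
	KnownGoodCursorPos map[string]uint64 // table name -> max PK known to be copied
	TableSchemaCache   map[string]TableSchema
}

type BinlogPosition struct {
	File string
	Pos  uint32
}

type TableSchema struct {
	Name    string
	Columns []string
}

// RowEvent is a simplified view of a row-level binlog event: the table it
// touches and the primary key of the affected row.
type RowEvent struct {
	Table string
	PK    uint64
}

// Databases abstracts the source/target operations the reconciliation step
// needs. The method names here are illustrative, not Ghostferry's real API.
type Databases interface {
	CurrentMasterPos() (BinlogPosition, error)                       // SHOW MASTER STATUS on the source
	BinlogEventsBetween(from, to BinlogPosition) ([]RowEvent, error) // events in (from, to]
	DeleteRowOnTarget(table string, pk uint64) error                 // delete the row on the target, no-op if missing
	CopyRowFromSource(table string, pk uint64) error                 // copy the current source row to the target
}

// Reconcile walks the binlogs between the known good position and the current
// master position. Instead of applying each event verbatim, it deletes the
// affected row on the target and recopies it from the source if that row is
// within the already-copied range (pk <= known good cursor position).
func Reconcile(db Databases, state ResumeState) error {
	currentPos, err := db.CurrentMasterPos()
	if err != nil {
		return err
	}

	events, err := db.BinlogEventsBetween(state.KnownGoodBinlogPos, currentPos)
	if err != nil {
		return err
	}

	for _, ev := range events {
		// Step 2.1: delete the row from the target database if it exists.
		if err := db.DeleteRowOnTarget(ev.Table, ev.PK); err != nil {
			return err
		}

		// Step 2.2: only recopy rows that a previous run is known to have
		// copied; rows beyond the known good cursor will be copied again by
		// the normal run after reconciliation.
		if ev.PK <= state.KnownGoodCursorPos[ev.Table] {
			if err := db.CopyRowFromSource(ev.Table, ev.PK); err != nil {
				return err
			}
		}
	}

	// Step 3: after this completes, Ghostferry can be started as normal,
	// resuming the data copy from the known good cursor positions.
	return nil
}
```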

Safety of the Reconciliation Step

To analyze the safety of the reconciliation step, we must first make the following assumptions:

  1. The known good binlog/cursor positions given have been copied by a previous Ghostferry run.
  2. The known good binlog/cursor position can be an underestimate of the actual good binlog/cursor position.
    1. In other words: it’s possible that we overcopied rows/binlogs from a previous run but didn’t manage to save the position because the process was shut down prematurely.

We can then analyze the situation where the known good binlog position is the same as the actual good binlog position (no overcopy of binlogs occurred):

  1. If a row that is known to have been copied is modified (pk <= knownGoodCursorPos):
    1. Suppose there are 4 versions of this row due to modifications: (v1) the state of the row at the time of the interruption, (v2) the state after a modification while Ghostferry is down, (v3) the state after a modification that occurred before we reached this row during reconciliation but after SHOW MASTER STATUS, and (v4) the state after a modification that occurs after the reconciliation process is done but before Ghostferry finishes.
    2. When we encounter the binlog entry that performs v1 -> v2, the row on the source is at v3 and the row on the target is at v1.
    3. The target row is deleted and recopied from the source, updating it to v3.
    4. After the reconciliation process is complete, the binlog streamer encounters the event v2 -> v3. With the current Ghostferry implementation as is, this event will not take effect on the target.
    5. When the BinlogStreamer finally encounters the event v3 -> v4, it will be executed.
  2. If a row that is copied but is not known to be copied is modified (knownGoodCursorPos < pk <= actualGoodCursorPos):
    1. the row is deleted from the target if it exists;
    2. after the reconciliation step, Ghostferry will resume from the knownGoodCursorPos and thus will copy the row over as part of its normal run. All the safety properties of regular Ghostferry apply.
  3. If a row that has not been copied is modified (pk > actualGoodCursorPos):
    1. same as case 2.

We can now simply extend this to the case where the known good binlog position is an underestimate of the actual position: if we reconcile a binlog entry that has already been streamed to the target, the affected row will simply be deleted from the target and recopied. Thus it does not pose a problem.

The safety of this reconciliation is also verified with an (unreviewed) TLA+ model.

Safety of Interruption

As demonstrated above, as long as Ghostferry saves at worst an underestimate of the binlog/cursor position, the reconciliation process is safe.

The current code only increments the last successful binlog/cursor position after the binlog/row is successfully streamed/copied. This means that if we were to panic the process at any time and get the saved values out, those values are at worst an underestimate, unless there’s something about Go that we don’t quite understand (?).
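
As an illustration of that ordering, here is a minimal sketch (with hypothetical names, not the actual Ghostferry code) of advancing the saved position only after the corresponding write succeeds, so that an interruption at any point leaves behind, at worst, an underestimate:

```go
package resume

import "sync"

// PositionTracker records the last binlog position that has been successfully
// written to the target. The names here are illustrative only.
type PositionTracker struct {
	mu              sync.Mutex
	lastStreamedPos uint32
}

// ApplyEvent writes a binlog event to the target and only advances the saved
// position if that write succeeded. If the process dies between the write and
// the position update, the persisted position is an underestimate, which the
// reconciliation step tolerates by design.
func (t *PositionTracker) ApplyEvent(pos uint32, writeToTarget func() error) error {
	if err := writeToTarget(); err != nil {
		return err
	}

	t.mu.Lock()
	defer t.mu.Unlock()
	if pos > t.lastStreamedPos {
		t.lastStreamedPos = pos
	}
	return nil
}

// LastStreamedPos is the value that would be persisted when the process is
// interrupted; at worst it lags behind what has actually been written.
func (t *PositionTracker) LastStreamedPos() uint32 {
	t.mu.Lock()
	defer t.mu.Unlock()
	return t.lastStreamedPos
}
```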

Handling Schema Changes with Reconciliation

If a schema change occurs on either the source or the target, we must interrupt Ghostferry and only resume it at some future point. We can asynchronously detect the schema change on either the source or the target and abort the process. If an error occurs elsewhere within Ghostferry because of the schema change but in a different thread, we stall that code until we can positively identify a schema change, OR we abort if some timeout has been reached and we still cannot positively identify the schema change.

We assume that:

  • The source and target databases are assumed to be for the same application. This means they must eventually have a consistent schema. We only resume Ghostferry once that consistent schema is reached.

Once resumed:

  • For each table that was in progress AND has a schema change applied during the interruption:
    • delete all application records of this table from the target database;
    • set the known good cursor position to 0 so the table gets recopied.
  • For all other tables, the regular reconciliation process must apply, as otherwise we might lose changes that occurred to those tables during the interruption.
    • Note that we need to do this not just for finished tables, but also for those that haven’t started: during the previous run, a row could have been INSERTed by the binlog streamer into a table that hasn’t started its copy process. If that row is updated during the downtime and we don’t do something about it, the data will be corrupted.
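
A rough sketch of that per-table decision on resume, again with hypothetical names (TableResumePlan and PlanTableResume are not part of Ghostferry): tables that were in progress and had a schema change applied are wiped on the target and restarted from cursor position 0, while every other table goes through the regular reconciliation step.

```go
package resume

// TableResumePlan describes what to do with a single table when resuming after
// an interruption that may have included a schema change. Illustrative only.
type TableResumePlan struct {
	Table          string
	CursorPos      uint64 // known good cursor position to resume the copy from
	NeedsRecopy    bool   // delete the table's rows on the target and recopy it
	NeedsReconcile bool   // run the regular reconciliation step for this table
}

// PlanTableResume decides whether a table must be recopied from scratch or can
// go through the normal reconciliation. inProgress reports whether the table's
// copy was in progress at the time of the interruption; schemaChanged reports
// whether a schema change was applied to it during the interruption.
func PlanTableResume(table string, knownGoodCursor uint64, inProgress, schemaChanged bool) TableResumePlan {
	if inProgress && schemaChanged {
		// Delete all application records of this table from the target and
		// reset the known good cursor position to 0 so it gets recopied.
		return TableResumePlan{
			Table:       table,
			CursorPos:   0,
			NeedsRecopy: true,
		}
	}

	// Every other table (finished, unchanged, or not yet started) must go
	// through reconciliation, because the binlog streamer may have written
	// rows into it during the previous run and those rows may have changed
	// during the downtime.
	return TableResumePlan{
		Table:          table,
		CursorPos:      knownGoodCursor,
		NeedsReconcile: true,
	}
}
```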