Improve MoveTables command #496
Conversation
var (
	sentinelCheckInterval   = 1 * time.Second
	tableStatUpdateInterval = 5 * time.Minute
	sentinelWaitLimit       = 48 * time.Hour
Should this be configurable?
If we understand the use cases, maybe. This timeout is pre-existing and was just copied to the move package from migration.
For now, we limit it to 48 hours, per "decisions, not options" in https://github.com/block/spirit/blob/main/.github/CONTRIBUTING.md
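For context, here is a minimal sketch of how a hard-coded wait limit typically bounds a sentinel polling loop; the function name, the closure signature, and the exact semantics are assumptions for illustration, not code copied from spirit:

package move

import (
	"context"
	"fmt"
	"time"
)

// Assumed values mirroring the quoted constants.
var (
	sentinelCheckInterval = 1 * time.Second
	sentinelWaitLimit     = 48 * time.Hour
)

// waitOnSentinel is hypothetical: poll for the sentinel table every
// sentinelCheckInterval and give up after sentinelWaitLimit.
func waitOnSentinel(ctx context.Context, sentinelExists func(context.Context) (bool, error)) error {
	deadline := time.NewTimer(sentinelWaitLimit)
	defer deadline.Stop()
	ticker := time.NewTicker(sentinelCheckInterval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-deadline.C:
			return fmt.Errorf("timed out after %s waiting for the sentinel table to be removed", sentinelWaitLimit)
		case <-ticker.C:
			exists, err := sentinelExists(ctx)
			if err != nil {
				return err
			}
			if !exists {
				return nil // sentinel removed; safe to proceed
			}
		}
	}
}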
if err != nil {
	return status.ErrWatermarkNotReady // it might not be ready, we can try again.
Do we care about logging these sorts of errors?
It will be logged. The error is returned to dumpCheckpointContinuously, which logs it (at line 999).
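A rough sketch of that error path, with the interval and function names assumed rather than taken from the actual code: the checkpoint dumper runs on a ticker, and a transient error such as status.ErrWatermarkNotReady is logged and retried on the next tick instead of aborting.

package move

import (
	"context"
	"log"
	"time"
)

// checkpointDumpInterval is an assumed value for illustration.
const checkpointDumpInterval = 50 * time.Second

// dumpCheckpointContinuously illustrates the described behaviour: a failure to
// write a checkpoint (e.g. the watermark is not ready yet) is logged and retried.
func dumpCheckpointContinuously(ctx context.Context, dump func(context.Context) error) {
	ticker := time.NewTicker(checkpointDumpInterval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			if err := dump(ctx); err != nil {
				// This is where the returned error surfaces in the logs.
				log.Printf("error dumping checkpoint (will retry): %v", err)
			}
		}
	}
}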
func (r *Runner) newCopy(ctx context.Context) error {
	// We are starting fresh:
	// For each table, fetch the CREATE TABLE statement from the source and run it on the target.
	if err := r.createTargetTables(); err != nil {
Should we add a timer log to see how long this took?
We can roughly tell from dumpStatus() as things move between stages. Unfortunately, I've not yet implemented dumpStatus() for the move runner.
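If an explicit timer log were wanted here, a minimal version could be a small helper like the one below; this is purely illustrative (timeStep is a hypothetical name, not part of spirit), showing the pattern rather than the actual implementation.

package move

import (
	"log"
	"time"
)

// timeStep is a hypothetical helper: run a step and log its wall-clock duration.
func timeStep(logger *log.Logger, name string, fn func() error) error {
	start := time.Now()
	err := fn()
	logger.Printf("%s finished in %s (err: %v)", name, time.Since(start), err)
	return err
}

Usage inside newCopy would then look roughly like:

	if err := timeStep(logger, "create target tables", r.createTargetTables); err != nil {
		return err
	}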
	return err
}

func (c *CutOver) algorithmRenameUnderLock(ctx context.Context) error {
Is it worth adding any info logs here for each of the phases? E.g.
c.logger.Info("Starting cutover: acquiring table locks")
tableLock, err := dbconn.NewTableLock(ctx, c.db, c.tables, c.dbConfig, c.logger)
...
c.logger.Info("Cutover: flushing changes under lock")
if err := c.feed.FlushUnderTableLock(ctx, tableLock); err != nil {
	return err
}
...
c.logger.Info("Cutover: performing table rename operations")
return tableLock.ExecUnderLock(ctx, renameStatement)
The flush phase does record its time and print it to the logger.
The table lock does something similar: it logs before it executes statements, so we can deduce the time from there.
I don't think we will have timing for the exec-under-lock step, but since the lock is already acquired, it should be instant.
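The timing the reply refers to follows the deferred-timer pattern sketched below; names and package layout are assumed here, so treat it as an illustration of why a separate info log just before the flush adds little.

package repl

import (
	"context"
	"log"
	"time"
)

// FlushUnderTableLock sketches the behaviour described above: the flush
// measures and logs its own duration while the table lock is held.
func FlushUnderTableLock(ctx context.Context, applyRemaining func(context.Context) error) error {
	start := time.Now()
	defer func() {
		log.Printf("flushed changes under table lock in %s", time.Since(start))
	}()
	return applyRemaining(ctx)
}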
A Pull Request should be associated with an Issue.
Fixes #495
It refactors Status so it can be shared between Move and Migration. There is still a lot of code duplication, though, with future opportunities to clean up, such as the dump checkpoint / resume-from-checkpoint code paths.
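Purely as an illustration of the sharing described above (the field names and package layout are assumed, not taken from the PR), the extracted Status could be a small struct that both runners populate:

package status

import (
	"fmt"
	"time"
)

// Status is an illustrative shape for progress reporting shared by the
// migration and move runners; the actual fields in spirit may differ.
type Status struct {
	State       string // e.g. "copyRows", "applyChangeset", "cutOver"
	RowsCopied  uint64
	StartedAt   time.Time
	LastUpdated time.Time
}

// Summary returns a one-line, human-readable progress string.
func (s Status) Summary() string {
	return fmt.Sprintf("%s: %d rows copied, running for %s",
		s.State, s.RowsCopied, time.Since(s.StartedAt).Round(time.Second))
}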