@morgo morgo commented Oct 21, 2025

A Pull Request should be associated with an Issue.

We wish to have discussions in Issues. A single issue may be targeted by multiple PRs.
If you're offering a new feature or fixing anything, we'd like to know beforehand in Issues,
and potentially we'll be able to point development in a particular direction.
Further notes in https://github.com/block/spirit/blob/main/.github/CONTRIBUTING.md

Fixes #495

This PR refactors Status so that it can be shared between Move and Migration. A lot of code duplication remains, with future opportunities to clean up, such as the dump-checkpoint and resume-from-checkpoint code paths.
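
To illustrate the idea of one status type shared by two runners, here is a minimal sketch. The state names, the `Status` methods, and both runner structs are hypothetical stand-ins for illustration and are not taken from the Spirit codebase.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// State is a hypothetical stage enum; the names are illustrative only.
type State int32

const (
	Initial State = iota
	CopyRows
	CutOver
)

// String returns a readable name so states can be printed in logs.
func (s State) String() string {
	switch s {
	case Initial:
		return "initial"
	case CopyRows:
		return "copyRows"
	case CutOver:
		return "cutOver"
	}
	return "unknown"
}

// Status stores the current state atomically so that a status-reporting
// goroutine can read it while the runner updates it.
type Status struct{ v atomic.Int32 }

func (s *Status) Get() State  { return State(s.v.Load()) }
func (s *Status) Set(n State) { s.v.Store(int32(n)) }

// Both runners embed the same type instead of each keeping its own copy
// of the state-tracking code. These structs are assumptions.
type MoveRunner struct{ Status }
type MigrationRunner struct{ Status }

func main() {
	var mv MoveRunner
	var mg MigrationRunner
	mv.Set(CopyRows)
	mg.Set(CutOver)
	fmt.Println(mv.Get(), mg.Get())
}
```

Embedding keeps the call sites unchanged (`r.Get()`, `r.Set(...)`) while the logic lives in one place.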

@morgo morgo requested a review from kolbe October 26, 2025 21:10
var (
    sentinelCheckInterval   = 1 * time.Second
    tableStatUpdateInterval = 5 * time.Minute
    sentinelWaitLimit       = 48 * time.Hour
Collaborator

Should this be configurable?

Collaborator Author

Maybe, if we understand the use-cases. This timeout is pre-existing and was just copied to the move package from migration.

For now, we limit it to 48 hours per "decisions, not options" in https://github.com/block/spirit/blob/main/.github/CONTRIBUTING.md

Comment on lines +591 to +592
if err != nil {
    return status.ErrWatermarkNotReady // it might not be ready, we can try again.
Collaborator

Do we care about logging these sorts of errors?

Collaborator Author

It will be logged. The error is returned to dumpCheckpointContinuously, which logs it on 999.

func (r *Runner) newCopy(ctx context.Context) error {
    // We are starting fresh:
    // For each table, fetch the CREATE TABLE statement from the source and run it on the target.
    if err := r.createTargetTables(); err != nil {
Collaborator

Should we add a timer log to see how long this took?

Collaborator Author

We can roughly tell from dumpStatus() as things move through the different stages. Unfortunately, I've not yet implemented dumpStatus() for the move runner.

return err
}

func (c *CutOver) algorithmRenameUnderLock(ctx context.Context) error {
Collaborator

Is it worth adding any info logs here for each of the phases? E.g.

  c.logger.Info("Starting cutover: acquiring table locks")
  tableLock, err := dbconn.NewTableLock(ctx, c.db, c.tables, c.dbConfig, c.logger)
 
  ...

  c.logger.Info("Cutover: flushing changes under lock")
  if err := c.feed.FlushUnderTableLock(ctx, tableLock); err != nil {
      return err
  }

  ...
  c.logger.Info("Cutover: performing table rename operations")
  return tableLock.ExecUnderLock(ctx, renameStatement)

Collaborator Author

@morgo morgo Oct 27, 2025

The flush phase already records its time and prints it to the logger.

The table lock does something similar: it prints before it executes statements, and we can deduce the time from there.

I don't think we will have timing for the exec under lock, but since the lock has already been acquired, it should be near-instant.

@morgo morgo enabled auto-merge October 27, 2025 18:38
@morgo morgo merged commit 8c86d7d into block:main Oct 27, 2025
7 checks passed
@morgo morgo deleted the mtocker-refine-move branch October 27, 2025 18:51
Development

Successfully merging this pull request may close these issues.

Implement sentinel/checkpoint/progress/cutover for Movetables

2 participants