Skip to content

mysql_cdc: write checkpoint after snapshot#4269

Merged
josephwoodward merged 2 commits intomainfrom
jw/mysqlsnapshotcp
Apr 16, 2026
Merged

mysql_cdc: write checkpoint after snapshot#4269
josephwoodward merged 2 commits intomainfrom
jw/mysqlsnapshotcp

Conversation

@josephwoodward
Copy link
Copy Markdown
Contributor

@josephwoodward josephwoodward commented Apr 16, 2026

Currently after the snapshot the first time we checkpoint is after the first batch has been processed. This change ensures we checkpoint right after snapshotted data is flushed before we start streaming change records.

image

Closes: #4257

Currently after the snapshot the first time we checkpoint is after the first batch has been processed. This change
ensures we checkpoint right after snapshotted data is flushed before we
start streaming change records.
@claude
Copy link
Copy Markdown

claude bot commented Apr 16, 2026

Commits
LGTM

Review
This PR adds a snapshot completion sentinel event to ensure the checkpoint cache is updated immediately after all snapshot rows are flushed, rather than deferring until the first streaming batch is processed. The implementation sends a messageOperationSnapshotComplete event through the existing rawMessageEvents channel, and readMessages handles it by flushing any partial batch, tracking a checkpoint entry with immediate resolution, and persisting the binlog position. The thread-safety reasoning is sound: snapshot batch ack functions never write to the cache (nil offset), and streaming has not started at this point.

LGTM

@josephwoodward josephwoodward marked this pull request as ready for review April 16, 2026 13:06
Comment thread internal/impl/mysql/input_mysql_stream.go
@claude
Copy link
Copy Markdown

claude bot commented Apr 16, 2026

Commits
LGTM

Review
This PR adds a snapshot-completion sentinel event to the MySQL CDC input so that the binlog checkpoint is persisted immediately after all snapshot rows are flushed, rather than waiting for the first streaming batch. The logic — flush remaining batch, track a checkpoint entry with immediate resolution, and clear the timed-batch timer — is sound.

  1. No tests cover the new snapshot-completion checkpoint path. Non-trivial logic (sentinel dispatch, batch flush, checkpoint tracking with immediate resolveFn() call) should have at least a unit test verifying the cached position advances after snapshot completion. See inline comment.

Comment thread internal/impl/mysql/input_mysql_stream.go
@claude
Copy link
Copy Markdown

claude bot commented Apr 16, 2026

Commits
LGTM

Review
This PR adds a snapshot completion sentinel event (messageOperationSnapshotComplete) that flushes any partial batch and writes a checkpoint to the cache immediately after the snapshot finishes, rather than deferring until the first CDC batch is acknowledged. The approach is sound — the Track-then-immediately-resolve pattern is consistent with the checkpoint library's API, and the concurrency reasoning (no mutex needed because snapshot batch ack callbacks never write to cache) holds.

  1. Missing tests for snapshot completion checkpoint logic — The new code path in readMessages (input_mysql_stream.go L704-L731) adds non-trivial checkpoint management (flush + track + resolve + cache write) with no corresponding test. Existing integration tests exercise snapshot+CDC flows but don't verify the checkpoint cache state after snapshot completion.

@josephwoodward josephwoodward merged commit 7ad9043 into main Apr 16, 2026
7 checks passed
@josephwoodward josephwoodward deleted the jw/mysqlsnapshotcp branch April 16, 2026 16:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

mysql_cdc Checkpoint never written after snapshot input_mysql_stream.go

3 participants