release-24.1.0-rc: changefeedccl: fix initial scan checkpointing #123968

Merged

Commits on May 4, 2024

  1. changefeedccl: fix initial scan checkpointing

    Originally, every span's initial resolved timestamp was zero when a job
    resumed, because initial resolved timestamps were set to the initial high
    water, which remains zero until the initial scan completes. However, since
    0eda540, we have reloaded checkpoint timestamps instead of setting them all
    to zero at the start. PR #102717 then introduced a mechanism to reduce
    duplicate messages by reloading job progress upon resuming, which greatly
    increased the likelihood of this bug. These errors could lead to an
    incorrect frontier and missing events during initial scans. This patch
    changes how we initialize the initial high water and frontier: both are
    initialized to zero if any of the initial resolved timestamps is zero.
    
    Fixes: #123371
    
    Release note (enterprise change): Fixed a bug in v22.2+ where long-running
    initial scans could incorrectly restore checkpoint job progress and drop
    events during a node or changefeed restart. This issue was most likely to
    occur in clusters with: 1) changefeed.shutdown_checkpoint.enabled (v23.2)
    set, 2) multiple table targets in a changefeed, or 3) a low
    changefeed.frontier_checkpoint_frequency or a low
    changefeed.frontier_highwater_lag_checkpoint_threshold.
    wenyihu6 committed May 4, 2024
    6ae9f81
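
The initialization rule described in the commit message can be illustrated with a minimal sketch. This is not the code from the patch: `Timestamp`, `initialHighWater`, and the string span keys are hypothetical simplifications, and the fallback to the minimum timestamp when no span is zero is an assumption about how a frontier is usually seeded. It only shows the idea that a single span with a zero resolved timestamp forces the initial high water back to zero, so the initial scan is not treated as complete.

```go
package main

import "fmt"

// Timestamp is a hypothetical stand-in for an HLC timestamp. Zero means the
// span has no resolved timestamp yet, i.e. its initial scan is unfinished.
type Timestamp int64

// initialHighWater returns zero if any span's restored resolved timestamp is
// zero. Otherwise it returns the minimum restored timestamp (an assumption
// made for this sketch, not taken from the patch).
func initialHighWater(resolved map[string]Timestamp) Timestamp {
	var hw Timestamp
	first := true
	for _, ts := range resolved {
		if ts == 0 {
			// At least one span still needs its initial scan.
			return 0
		}
		if first || ts < hw {
			hw = ts
			first = false
		}
	}
	return hw
}

func main() {
	// One span was checkpointed, the other was not: start from zero.
	fmt.Println(initialHighWater(map[string]Timestamp{"a": 10, "b": 0}))
	// Every span has a resolved timestamp: use the minimum.
	fmt.Println(initialHighWater(map[string]Timestamp{"a": 10, "b": 20}))
}
```

With a partially checkpointed set of spans the sketch yields zero, which corresponds to the case the commit message says could previously produce an incorrect frontier.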