Skip to content

[fix](streamingjob) Persist cdc_stream TVF offset across FE checkpoint#62902

Merged
JNSimba merged 2 commits into
apache:masterfrom
JNSimba:fix_cdc_tvf_checkpoint
May 15, 2026
Merged

[fix](streamingjob) Persist cdc_stream TVF offset across FE checkpoint#62902
JNSimba merged 2 commits into
apache:masterfrom
JNSimba:fix_cdc_tvf_checkpoint

Conversation

@JNSimba
Copy link
Copy Markdown
Member

@JNSimba JNSimba commented Apr 28, 2026

What problem does this PR solve?

Related PR: #62449

Problem Summary:

PR #62449 fixed streaming job offset state-loss after FE checkpoint restart for the S3 path, but the cdc_stream TVF path has the same root cause and worse impact: after a checkpoint restart in the binlog phase, the job replays from the very beginning of the binlog (because currentOffset == null falls through to a fresh BinlogSplit with no startingOffset).

Root cause: JdbcTvfSourceOffsetProvider.getPersistInfo() returns null, so offsetProviderPersist is never written into the FE image. After checkpoint, the pre-checkpoint journal is GC'd, neither journal-replayed currentOffset nor image-persisted state survives, and recovery falls back to a fresh provider with empty chw/bop.

Only the non-cloud mode is affected. Cloud mode is fine because replayOnCloudMode pulls a cumulative attachment from MS.

Fix — reuse the parent's existing chw/bop/ts @SerializedName persistence:

  • Drop the getPersistInfo() override so the parent's GsonUtils.GSON.toJson(this) writes chw/bop/ts into the image.
  • Add a restoreFromPersistInfo() override to read them back on FE startup (called from gsonPostProcess).
  • In updateOffset binlog branch, mirror startingOffset into binlogOffsetPersist so it survives the image (currentOffset has no @SerializedName).
  • In replayIfNeed currentOffset == null branch, rebuild BinlogSplit from bop, or apply chw (using the existing null.null remap) when restoring snapshot phase.

Release note

Fix cdc_stream TVF offset state loss after FE checkpoint restart (non-cloud mode).

Check List (For Author)

  • Test

    • Regression test (`test_streaming_job_cdc_stream_postgres_checkpoint_restart_fe.groovy` covers both mid-snapshot and binlog-phase checkpoint restart)
    • Unit Test
    • Manual test
    • No need to test
  • Behavior changed:

    • No.
  • Does this need documentation?

    • No.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@JNSimba JNSimba changed the title [fix](fe) Persist cdc_stream TVF offset across FE checkpoint [fix](streamingjob) Persist cdc_stream TVF offset across FE checkpoint Apr 28, 2026
@JNSimba JNSimba requested a review from Copilot April 28, 2026 08:58
@JNSimba
Copy link
Copy Markdown
Member Author

JNSimba commented Apr 29, 2026

/review

@JNSimba
Copy link
Copy Markdown
Member Author

JNSimba commented Apr 29, 2026

run buildall

@JNSimba
Copy link
Copy Markdown
Member Author

JNSimba commented May 6, 2026

run p0

@JNSimba
Copy link
Copy Markdown
Member Author

JNSimba commented May 6, 2026

run nonConcurrent

@JNSimba
Copy link
Copy Markdown
Member Author

JNSimba commented May 11, 2026

/review

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary:

I found one blocking test-coverage issue. The production change is focused on persisting and restoring the TVF JDBC offset provider state across checkpoint-image recovery, and the implementation generally follows the existing provider replay model, but the new regression test does not reliably exercise the mid-snapshot checkpoint recovery path that is the risky part of the change.

Critical checkpoint conclusions:

  • Goal/test proof: The goal is to recover cdc_stream TVF progress after checkpoint GC removes pre-checkpoint journals. The binlog-image path is covered, but the snapshot chw-image path is not reliably proven because the test waits while the job continues to run.
  • Scope: The code change is small and focused.
  • Concurrency/lifecycle: No new locks or threads are introduced. The relevant lifecycle is FE image load plus later scheduler replay; no direct lock-order issue found.
  • Persistence/transactions: The change relies on offsetProviderPersist in the image and txn commit attachments. The main uncovered risk is whether a partially completed snapshot is restored from image without txn replay.
  • Parallel paths: Cloud replay still uses the existing latest attachment path; no additional issue found in this PR scope.
  • Test results: Added regression output is ordered, but the new test can pass without exercising the intended mid-snapshot checkpoint scenario.
  • Observability/performance: Added logs are lightweight; no new hot-path performance issue found.

User focus: No additional user-provided review focus was specified.

Copy link
Copy Markdown
Contributor

@liaoxin01 liaoxin01 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@github-actions github-actions Bot added the approved Indicates a PR has been approved by one committer. label May 14, 2026
@github-actions
Copy link
Copy Markdown
Contributor

PR approved by at least one committer and no changes requested.

@JNSimba JNSimba merged commit 73b3f5e into apache:master May 15, 2026
33 checks passed
github-actions Bot pushed a commit that referenced this pull request May 15, 2026
#62902)

### What problem does this PR solve?

Related PR: #62449

Problem Summary:

PR #62449 fixed streaming job offset state-loss after FE checkpoint
restart for the S3 path, but the cdc_stream TVF path has the same root
cause and worse impact: after a checkpoint restart in the binlog phase,
the job replays from the very beginning of the binlog (because
`currentOffset == null` falls through to a fresh `BinlogSplit` with no
`startingOffset`).

Root cause: `JdbcTvfSourceOffsetProvider.getPersistInfo()` returns
`null`, so `offsetProviderPersist` is never written into the FE image.
After checkpoint, the pre-checkpoint journal is GC'd, neither
journal-replayed `currentOffset` nor image-persisted state survives, and
recovery falls back to a fresh provider with empty `chw`/`bop`.

Only the non-cloud mode is affected. Cloud mode is fine because
`replayOnCloudMode` pulls a cumulative attachment from MS.

Fix — reuse the parent's existing `chw`/`bop`/`ts` `@SerializedName`
persistence:
- Drop the `getPersistInfo()` override so the parent's
`GsonUtils.GSON.toJson(this)` writes `chw/bop/ts` into the image.
- Add a `restoreFromPersistInfo()` override to read them back on FE
startup (called from `gsonPostProcess`).
- In `updateOffset` binlog branch, mirror `startingOffset` into
`binlogOffsetPersist` so it survives the image (`currentOffset` has no
`@SerializedName`).
- In `replayIfNeed` `currentOffset == null` branch, rebuild
`BinlogSplit` from `bop`, or apply `chw` (using the existing `null.null`
remap) when restoring snapshot phase.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by one committer. dev/4.1.x

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants