[improve](streaming-job) avoid potential OOM when reading large snapshot splits#63833
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
/review |
|
run buildall |
TPC-H: Total hot run time: 31642 ms |
TPC-DS: Total hot run time: 171582 ms |
There was a problem hiding this comment.
Review result: no blocking issues found in the GitHub PR diff.
Critical checkpoint conclusions:
- Goal/test: The change targets CDC snapshot splits whose records span multiple Debezium batches, and the new MySQL/Postgres regression cases exercise the >2048-row single-split path plus incremental follow-up DML.
- Scope: The functional change is focused on snapshot completion detection and related config validation/defaulting.
- Concurrency: Snapshot polling already uses async futures; this PR does not add shared-state mutation from worker futures for completion tracking. The completed split set is updated on the polling path after high-watermark state is consumed.
- Lifecycle: No new static/global lifecycle concern found. Reader cleanup still happens through the existing finish/close paths.
- Config: A new skip_snapshot_backfill source property is added with FE validation and reader consumption; defaulting is explicit for from-to jobs.
- Compatibility: No storage format, thrift, or persisted edit-log compatibility issue found in the reviewed diff.
- Parallel paths: MySQL and Postgres reader paths both receive the snapshot-finished logic; TVF validation is also updated.
- Tests: New external regression tests cover the fat snapshot split behavior for both MySQL and Postgres. I did not run them in this Actions review environment.
- Observability: Existing logs around snapshot polling and completion remain sufficient for this change.
- Transactions/persistence/data correctness: No FE transaction/edit-log change; snapshot offset commit is still based on split high-watermarks after all snapshot splits finish.
- Performance: The change avoids prematurely finishing after the first batch without introducing an obvious hot-path regression.
User focus: no additional user-provided review focus was specified.
FE UT Coverage ReportIncrement line coverage |
|
run buildall |
|
/review |
TPC-H: Total hot run time: 31508 ms |
There was a problem hiding this comment.
Review result: request changes.
Critical checkpoint conclusions:
- Goal/test proof: The PR addresses large snapshot split memory/completion issues and adds external regression tests for fat snapshot splits, but the new default skip_snapshot_backfill=true is not proven safe for concurrent DML during snapshot. Existing handoff code builds the binlog split from finished snapshot high watermarks, so the skipped backfill window can lose source changes.
- Scope/focus: Most CDC changes are focused, but the effective semantics change for all CREATE JOB from-to initial snapshots is broad and needs the offset handoff adjusted or the default reconsidered.
- Concurrency: The snapshot poll future changes use concurrent collections and do not introduce an obvious new data race in the reviewed paths.
- Lifecycle/static initialization: No new static lifecycle issue identified.
- Config: New skip_snapshot_backfill is validated, but making it default true changes runtime behavior and requires correctness guarantees for snapshot-to-binlog transition.
- Compatibility/storage format: No storage format or FE-BE protocol incompatibility identified in the CDC changes.
- Parallel paths: MySQL and Postgres reader paths were both updated.
- Special checks: The high-watermark based completion check is directionally correct for pollWithoutBuffer, but it does not by itself protect the skipped backfill interval.
- Test coverage/results: Added tests cover large single-split draining and post-snapshot DML. They do not cover DML occurring while the snapshot split is open under the new default, which is the risky path.
- Observability: Existing logs are sufficient for the reviewed CDC polling paths.
- Transaction/persistence/data correctness: Blocking issue found: source updates/deletes/inserts committed between snapshot low/high watermark can be omitted.
- Performance: The memory/backpressure direction is reasonable, but correctness must be preserved before defaulting it on.
User focus: no additional user-provided review focus was present.
TPC-DS: Total hot run time: 170787 ms |
FE UT Coverage ReportIncrement line coverage |
FE Regression Coverage ReportIncrement line coverage |
|
PR approved by at least one committer and no changes requested. |
…hot splits (#63833) ## Summary - Default-skip flink-cdc's in-snapshot backfill on the from-to path so large splits no longer accumulate the entire chunk + backfill stream in the fetcher's outputBuffer; from-to is at-least-once and tolerates the duplicates this introduces. TVF (job-driven and standalone) keeps the standard `false` default for exactly-once via per-task offset commit. - Expose `skip_snapshot_backfill` as a user-facing property with strict `true`/`false` validation on both from-to (CREATE JOB) and TVF (SELECT FROM cdc_stream(...)) entry points. - Fix snapshot completion under `pollWithoutBuffer`: a split is now marked complete only after its high-watermark event has been consumed (`splitState.getHighWatermark() != null`), not on the first non-empty fetcher batch. Without this, enabling the new default truncates any split larger than debezium's `max.batch.size` and yields an NPE on offset extraction. - Read `streaming_task_timeout_multiplier` live in `StreamingMultiTblTask.isTimeout()` so `admin set frontend config` affects already-running tasks, matching the `@ConfField(mutable=true)` contract.
Summary
falsedefault for exactly-once via per-task offset commit.skip_snapshot_backfillas a user-facing property with stricttrue/falsevalidation on both from-to (CREATE JOB) and TVF (SELECT FROM cdc_stream(...)) entry points.pollWithoutBuffer: a split is now marked complete only after its high-watermark event has been consumed (splitState.getHighWatermark() != null), not on the first non-empty fetcher batch. Without this, enabling the new default truncates any split larger than debezium'smax.batch.sizeand yields an NPE on offset extraction.streaming_task_timeout_multiplierlive inStreamingMultiTblTask.isTimeout()soadmin set frontend configaffects already-running tasks, matching the@ConfField(mutable=true)contract.Test plan