[fix](streaming-job) Support delete sign in CDC stream TVF#65146
[fix](streaming-job) Support delete sign in CDC stream TVF#65146JNSimba wants to merge 5 commits into
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
/review |
There was a problem hiding this comment.
I found one blocking regression-test issue on the refreshed head.
Critical checkpoints:
- Goal: TVF schema exposure for CDC delete sign is implemented, but the new regression coverage cannot run in normal comparison mode because expected output files are missing.
- Scope/focus: implementation is small and localized to CdcStreamTableValuedFunction plus tests.
- Concurrency/lifecycle: no new concurrency or lifecycle management introduced.
- Config/compatibility:
include_delete_signis a TVF property validated as boolean; no new FE image/edit-log field found. - Parallel paths: MySQL and PostgreSQL delete-sign paths are covered by the new suites in intent, but both need generated
.outfiles. - Tests: FE unit test was attempted but blocked before Java execution by missing
thirdparty/installed/bin/protoc;git diff --checkpassed.
Subagent conclusions:
- optimizer-rewrite: no candidates.
- tests-session-config: no candidates beyond MAIN-1 after convergence.
- C1 was invalidated by the PR head moving from 173721a to 1b4595d.
- C2 ended with both live subagents reporting
NO_NEW_VALUABLE_FINDINGSfor the current ledger/comment set.
User focus: no additional focus was provided.
|
Codex automated review failed and did not complete. Error: Codex completed, but no new pull request review was submitted for the current head SHA. Please inspect the workflow logs and rerun the review after the underlying issue is resolved. |
|
/review |
|
run buildall |
There was a problem hiding this comment.
Automated review summary for PR 65146.
I did not find any additional non-duplicate inline issues in the effective four-file PR diff. The existing inline thread #65146 (comment) remains the blocking item: the two new CDC delete-sign regression suites add qt_* checks but the matching .out files are not present under regression-test/data/job_p0/streaming_job/cdc/tvf/.
Critical checkpoint conclusions:
- Goal/test proof: the FE change exposes
__DORIS_DELETE_SIGN__only wheninclude_delete_sign=true, and the new unit/regression tests target that behavior; regression expected outputs are still missing as already commented. - Scope: implementation is narrowly contained to CDC stream TVF schema exposure plus tests.
- Concurrency/lifecycle: no new shared mutable concurrency or lifecycle ownership path is introduced.
- Configuration/compatibility: the new TVF property is optional, strictly boolean-validated, defaults false, and is not persisted as new job state.
- Parallel paths: MySQL and PostgreSQL CDC TVF paths are both covered; CDC client already emits the delete-sign field on READ/CREATE/UPDATE/DELETE records.
- Test coverage/results: FE unit coverage exists, but the regression
.outfiles must be generated and committed before the suites can pass.
Subagent conclusions: optimizer-rewrite and tests-session-config subagents found no new valuable candidates beyond the existing missing-output thread. Convergence round 1 ended with both subagents replying NO_NEW_VALUABLE_FINDINGS for the same ledger and proposed final comment set.
TPC-H: Total hot run time: 30057 ms |
TPC-DS: Total hot run time: 175109 ms |
ClickBench: Total hot run time: 29.17 s |
FE UT Coverage ReportIncrement line coverage |
FE Regression Coverage ReportIncrement line coverage |
|
run buildall |
TPC-H: Total hot run time: 29836 ms |
TPC-DS: Total hot run time: 173741 ms |
ClickBench: Total hot run time: 25.26 s |
FE Regression Coverage ReportIncrement line coverage |
|
run external |
|
run cloud_p0 |
FE Regression Coverage ReportIncrement line coverage |
|
run external |
|
/review |
There was a problem hiding this comment.
Automated review result: I did not find additional blocking issues in the current head.
Critical checkpoint conclusions:
- Goal and coverage: the PR exposes
include_delete_signoncdc_stream, keeps the default schema unchanged, validates invalid values, and covers MySQL/PostgreSQL snapshot, insert, update, and delete-sign delete behavior. - Scope: the code change is focused on TVF schema exposure and targeted tests; it does not introduce persistence, edit-log, or FE-BE protocol changes.
- Data correctness: CDC client rows already emit
__DORIS_DELETE_SIGN__; the PR only exposes that field as an optional TVF column and the regression suites explicitly bind it into Unique Key MoW target tables. - Parallel paths: scheduled streaming tasks preserve the original TVF properties during
JdbcTvfSourceOffsetProvider.rewriteTvfParams(), soinclude_delete_signremains available after task-time offset rewrite. - Concurrency/lifecycle: no new shared mutable lifecycle or lock path was introduced.
- Compatibility/config: default
falsepreserves existing output schema; the new option is strict boolean validated and is intentionally removed from the HTTP payload because it is an FE schema switch. - Tests/results: FE unit coverage and MySQL/PostgreSQL regression suites with matching
.outfiles are present. The prior inline thread about missing.outfiles is stale for this head. - Validation: local FE/regression execution was not possible because
thirdparty/installedandthirdparty/installed/bin/protocare missing in this worktree. GitHub currently reports FE UT, CheckStyle, P0 regression, non-concurrent regression, and related checks passing; external regression is still pending. - User focus: no additional user-provided focus was supplied.
Subagent conclusions:
optimizer-rewrite: no candidate became an inline comment; suspicious ALTER/JDBC TVF rewrite points were dismissed with code evidence.tests-session-config: no candidate became an inline comment; the existing missing-output concern is addressed by the current.outfiles.- Final convergence round: both live subagents returned
NO_NEW_VALUABLE_FINDINGSfor the same current ledger/comment set after the proposed final comment set was empty.
|
run external |
FE Regression Coverage ReportIncrement line coverage |
include_delete_signparameter to the CDC stream TVF; when enabled, it outputs__DORIS_DELETE_SIGN__.