Skip to content

[improve](streaming-job) avoid potential OOM when reading large snapshot splits#63833

Merged
JNSimba merged 2 commits into
apache:masterfrom
JNSimba:fix_cdc_snapshot_completion
May 29, 2026
Merged

[improve](streaming-job) avoid potential OOM when reading large snapshot splits#63833
JNSimba merged 2 commits into
apache:masterfrom
JNSimba:fix_cdc_snapshot_completion

Conversation

@JNSimba
Copy link
Copy Markdown
Member

@JNSimba JNSimba commented May 28, 2026

Summary

  • Default-skip flink-cdc's in-snapshot backfill on the from-to path so large splits no longer accumulate the entire chunk + backfill stream in the fetcher's outputBuffer; from-to is at-least-once and tolerates the duplicates this introduces. TVF (job-driven and standalone) keeps the standard false default for exactly-once via per-task offset commit.
  • Expose skip_snapshot_backfill as a user-facing property with strict true/false validation on both from-to (CREATE JOB) and TVF (SELECT FROM cdc_stream(...)) entry points.
  • Fix snapshot completion under pollWithoutBuffer: a split is now marked complete only after its high-watermark event has been consumed (splitState.getHighWatermark() != null), not on the first non-empty fetcher batch. Without this, enabling the new default truncates any split larger than debezium's max.batch.size and yields an NPE on offset extraction.
  • Read streaming_task_timeout_multiplier live in StreamingMultiTblTask.isTimeout() so admin set frontend config affects already-running tasks, matching the @ConfField(mutable=true) contract.

Test plan

  • `mvn compile` passes for `fe-core` and `cdc_client`
  • New `test_streaming_postgres_job_snapshot_fat_split` / `test_streaming_mysql_job_snapshot_fat_split` pass: 2100 rows with `snapshot_split_size=3000` (single split exceeds `max.batch.size=2048`), asserting count=2100, distinct=2100, `id BETWEEN 2049 AND 2100`=52, and post-snapshot DML still flows
  • Existing `test_streaming_id_gap_completeness` / `test_streamingsnapshot` / `test_streaming_async_split` regressions still pass
  • Validator rejects `skip_snapshot_backfill=foo` at SQL analysis on both CREATE JOB and `cdc_stream` TVF
  • `admin set frontend config ("streaming_task_timeout_multiplier"="N")` while a from-to task is running takes effect on the running task

@hello-stephen
Copy link
Copy Markdown
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@JNSimba
Copy link
Copy Markdown
Member Author

JNSimba commented May 28, 2026

/review

@JNSimba
Copy link
Copy Markdown
Member Author

JNSimba commented May 28, 2026

run buildall

@hello-stephen
Copy link
Copy Markdown
Contributor

TPC-H: Total hot run time: 31642 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 41fd120977678cad7af8f166add762bbfe7c0d92, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17651	4032	3999	3999
q2	q3	10759	1364	792	792
q4	4686	479	350	350
q5	7586	2256	2095	2095
q6	237	174	136	136
q7	938	785	643	643
q8	9369	1741	1606	1606
q9	5174	4866	4872	4866
q10	6392	2200	1885	1885
q11	440	265	242	242
q12	630	430	299	299
q13	18118	3405	2744	2744
q14	270	257	237	237
q15	q16	825	777	708	708
q17	1003	965	934	934
q18	6886	5602	5692	5602
q19	1306	1292	1111	1111
q20	505	496	314	314
q21	6438	2806	2699	2699
q22	457	380	386	380
Total cold run time: 99670 ms
Total hot run time: 31642 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4833	4681	4686	4681
q2	q3	5024	5365	4681	4681
q4	2087	2180	1380	1380
q5	4960	4658	4666	4658
q6	232	177	138	138
q7	1933	1757	1572	1572
q8	2409	2094	2087	2087
q9	7830	7326	7331	7326
q10	4738	4656	4193	4193
q11	523	397	359	359
q12	721	726	526	526
q13	3005	3441	2792	2792
q14	269	289	251	251
q15	q16	675	694	599	599
q17	1279	1250	1250	1250
q18	7279	6978	6799	6799
q19	1120	1100	1125	1100
q20	2203	2213	1925	1925
q21	5217	4567	4335	4335
q22	519	468	441	441
Total cold run time: 56856 ms
Total hot run time: 51093 ms

@hello-stephen
Copy link
Copy Markdown
Contributor

TPC-DS: Total hot run time: 171582 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 41fd120977678cad7af8f166add762bbfe7c0d92, data reload: false

query5	4301	663	538	538
query6	335	216	201	201
query7	4282	589	309	309
query8	334	239	225	225
query9	8769	4066	4098	4066
query10	450	344	298	298
query11	5843	2529	2207	2207
query12	174	129	127	127
query13	1287	607	456	456
query14	6110	5466	5151	5151
query14_1	4492	4496	4466	4466
query15	218	204	190	190
query16	1052	464	456	456
query17	1179	766	625	625
query18	2775	492	366	366
query19	233	210	171	171
query20	145	143	134	134
query21	219	140	119	119
query22	13675	13569	13320	13320
query23	17461	16569	16227	16227
query23_1	16400	16274	16450	16274
query24	7531	1767	1298	1298
query24_1	1299	1291	1332	1291
query25	540	459	421	421
query26	1315	311	175	175
query27	2685	572	356	356
query28	4405	1980	2031	1980
query29	966	628	496	496
query30	314	235	196	196
query31	1133	1070	950	950
query32	92	75	71	71
query33	531	357	304	304
query34	1196	1147	677	677
query35	767	783	704	704
query36	1427	1410	1222	1222
query37	156	102	93	93
query38	3221	3147	3136	3136
query39	929	919	894	894
query39_1	887	904	880	880
query40	233	141	121	121
query41	65	64	63	63
query42	108	111	107	107
query43	335	339	290	290
query44	
query45	214	206	202	202
query46	1103	1232	764	764
query47	2380	2358	2315	2315
query48	411	460	288	288
query49	629	489	384	384
query50	1052	350	253	253
query51	4349	4287	4289	4287
query52	109	121	96	96
query53	256	287	197	197
query54	316	269	265	265
query55	94	90	84	84
query56	304	303	292	292
query57	1437	1430	1309	1309
query58	310	278	269	269
query59	1602	1698	1444	1444
query60	319	325	305	305
query61	154	154	148	148
query62	710	649	590	590
query63	250	211	202	202
query64	2375	805	648	648
query65	
query66	1689	488	367	367
query67	29740	29575	29509	29509
query68	
query69	512	348	302	302
query70	1024	974	1003	974
query71	311	280	269	269
query72	3020	2708	2403	2403
query73	891	762	431	431
query74	5150	4962	4785	4785
query75	2698	2596	2253	2253
query76	2310	1173	807	807
query77	397	414	341	341
query78	12396	12507	11850	11850
query79	1273	1094	735	735
query80	578	544	452	452
query81	454	276	233	233
query82	238	154	119	119
query83	267	277	242	242
query84	263	147	109	109
query85	887	647	526	526
query86	359	338	338	338
query87	3407	3373	3237	3237
query88	3642	2757	2764	2757
query89	436	405	356	356
query90	2195	184	188	184
query91	198	185	159	159
query92	84	81	79	79
query93	1499	1448	904	904
query94	561	373	340	340
query95	696	431	448	431
query96	1050	865	343	343
query97	2739	2740	2612	2612
query98	237	233	228	228
query99	1187	1157	1031	1031
Total cold run time: 253414 ms
Total hot run time: 171582 ms

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review result: no blocking issues found in the GitHub PR diff.

Critical checkpoint conclusions:

  • Goal/test: The change targets CDC snapshot splits whose records span multiple Debezium batches, and the new MySQL/Postgres regression cases exercise the >2048-row single-split path plus incremental follow-up DML.
  • Scope: The functional change is focused on snapshot completion detection and related config validation/defaulting.
  • Concurrency: Snapshot polling already uses async futures; this PR does not add shared-state mutation from worker futures for completion tracking. The completed split set is updated on the polling path after high-watermark state is consumed.
  • Lifecycle: No new static/global lifecycle concern found. Reader cleanup still happens through the existing finish/close paths.
  • Config: A new skip_snapshot_backfill source property is added with FE validation and reader consumption; defaulting is explicit for from-to jobs.
  • Compatibility: No storage format, thrift, or persisted edit-log compatibility issue found in the reviewed diff.
  • Parallel paths: MySQL and Postgres reader paths both receive the snapshot-finished logic; TVF validation is also updated.
  • Tests: New external regression tests cover the fat snapshot split behavior for both MySQL and Postgres. I did not run them in this Actions review environment.
  • Observability: Existing logs around snapshot polling and completion remain sufficient for this change.
  • Transactions/persistence/data correctness: No FE transaction/edit-log change; snapshot offset commit is still based on split high-watermarks after all snapshot splits finish.
  • Performance: The change avoids prematurely finishing after the first batch without introducing an obvious hot-path regression.

User focus: no additional user-provided review focus was specified.

@hello-stephen
Copy link
Copy Markdown
Contributor

FE UT Coverage Report

Increment line coverage 0.00% (0/13) 🎉
Increment coverage report
Complete coverage report

@JNSimba
Copy link
Copy Markdown
Member Author

JNSimba commented May 28, 2026

run buildall

@JNSimba
Copy link
Copy Markdown
Member Author

JNSimba commented May 28, 2026

/review

@hello-stephen
Copy link
Copy Markdown
Contributor

TPC-H: Total hot run time: 31508 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit c99f868bfac95c74ae612246262d8e48c7491646, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17705	4161	4028	4028
q2	q3	10789	1372	811	811
q4	4686	478	340	340
q5	7556	2255	2079	2079
q6	238	175	133	133
q7	914	782	635	635
q8	9519	1739	1527	1527
q9	5162	4948	4953	4948
q10	6438	2202	1896	1896
q11	436	276	251	251
q12	632	416	291	291
q13	18148	3363	2831	2831
q14	261	257	249	249
q15	q16	829	763	711	711
q17	999	866	955	866
q18	6907	5645	5506	5506
q19	1310	1215	1118	1118
q20	559	541	280	280
q21	6311	2849	2699	2699
q22	451	370	309	309
Total cold run time: 99850 ms
Total hot run time: 31508 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	5044	4784	4676	4676
q2	q3	4968	5264	4605	4605
q4	2149	2201	1365	1365
q5	4812	4860	4677	4677
q6	233	175	126	126
q7	1853	1818	1564	1564
q8	2395	2159	2174	2159
q9	7964	7844	7398	7398
q10	4726	4647	4210	4210
q11	542	381	348	348
q12	736	733	512	512
q13	3008	3385	2775	2775
q14	269	274	256	256
q15	q16	669	693	628	628
q17	1266	1241	1234	1234
q18	7364	6958	6751	6751
q19	1135	1060	1089	1060
q20	2224	2220	1954	1954
q21	5240	4569	4423	4423
q22	530	445	420	420
Total cold run time: 57127 ms
Total hot run time: 51141 ms

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review result: request changes.

Critical checkpoint conclusions:

  • Goal/test proof: The PR addresses large snapshot split memory/completion issues and adds external regression tests for fat snapshot splits, but the new default skip_snapshot_backfill=true is not proven safe for concurrent DML during snapshot. Existing handoff code builds the binlog split from finished snapshot high watermarks, so the skipped backfill window can lose source changes.
  • Scope/focus: Most CDC changes are focused, but the effective semantics change for all CREATE JOB from-to initial snapshots is broad and needs the offset handoff adjusted or the default reconsidered.
  • Concurrency: The snapshot poll future changes use concurrent collections and do not introduce an obvious new data race in the reviewed paths.
  • Lifecycle/static initialization: No new static lifecycle issue identified.
  • Config: New skip_snapshot_backfill is validated, but making it default true changes runtime behavior and requires correctness guarantees for snapshot-to-binlog transition.
  • Compatibility/storage format: No storage format or FE-BE protocol incompatibility identified in the CDC changes.
  • Parallel paths: MySQL and Postgres reader paths were both updated.
  • Special checks: The high-watermark based completion check is directionally correct for pollWithoutBuffer, but it does not by itself protect the skipped backfill interval.
  • Test coverage/results: Added tests cover large single-split draining and post-snapshot DML. They do not cover DML occurring while the snapshot split is open under the new default, which is the risky path.
  • Observability: Existing logs are sufficient for the reviewed CDC polling paths.
  • Transaction/persistence/data correctness: Blocking issue found: source updates/deletes/inserts committed between snapshot low/high watermark can be omitted.
  • Performance: The memory/backpressure direction is reasonable, but correctness must be preserved before defaulting it on.

User focus: no additional user-provided review focus was present.

@hello-stephen
Copy link
Copy Markdown
Contributor

TPC-DS: Total hot run time: 170787 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit c99f868bfac95c74ae612246262d8e48c7491646, data reload: false

query5	4308	664	533	533
query6	329	217	208	208
query7	4266	548	327	327
query8	333	235	221	221
query9	8813	4015	3994	3994
query10	477	355	292	292
query11	5762	2421	2194	2194
query12	188	130	125	125
query13	1285	608	402	402
query14	6114	5422	5132	5132
query14_1	4421	4429	4420	4420
query15	211	203	182	182
query16	1019	449	424	424
query17	1144	722	587	587
query18	2515	481	359	359
query19	218	203	157	157
query20	134	135	131	131
query21	216	132	117	117
query22	13662	13661	13394	13394
query23	17411	16566	16225	16225
query23_1	16381	16309	16388	16309
query24	7647	1755	1308	1308
query24_1	1324	1320	1321	1320
query25	600	491	438	438
query26	1338	330	179	179
query27	2656	600	356	356
query28	4492	2012	2029	2012
query29	1020	640	521	521
query30	302	240	203	203
query31	1126	1078	959	959
query32	89	77	77	77
query33	547	367	309	309
query34	1198	1155	624	624
query35	793	817	722	722
query36	1390	1414	1265	1265
query37	151	109	99	99
query38	3255	3177	3088	3088
query39	946	918	912	912
query39_1	873	875	877	875
query40	238	152	128	128
query41	70	68	67	67
query42	113	111	110	110
query43	332	331	290	290
query44	
query45	223	207	200	200
query46	1133	1169	723	723
query47	2384	2395	2253	2253
query48	409	425	314	314
query49	654	502	413	413
query50	945	357	257	257
query51	4393	4289	4360	4289
query52	105	104	97	97
query53	255	287	203	203
query54	337	282	278	278
query55	99	95	88	88
query56	329	323	315	315
query57	1459	1429	1314	1314
query58	307	276	275	275
query59	1642	1656	1462	1462
query60	316	317	302	302
query61	160	154	156	154
query62	702	653	582	582
query63	243	200	207	200
query64	2429	813	635	635
query65	
query66	1729	476	357	357
query67	29867	29854	29546	29546
query68	
query69	462	341	311	311
query70	1041	1004	1015	1004
query71	315	274	265	265
query72	3006	2717	2228	2228
query73	877	810	428	428
query74	5132	4984	4800	4800
query75	2721	2577	2277	2277
query76	2290	1127	751	751
query77	413	404	340	340
query78	12531	12444	11879	11879
query79	1480	1041	747	747
query80	643	539	448	448
query81	457	289	247	247
query82	1346	158	120	120
query83	356	277	249	249
query84	275	140	110	110
query85	903	537	467	467
query86	389	356	334	334
query87	3459	3377	3233	3233
query88	3593	2741	2687	2687
query89	450	396	352	352
query90	1944	181	176	176
query91	178	168	138	138
query92	82	82	75	75
query93	1448	1424	917	917
query94	536	375	306	306
query95	686	378	345	345
query96	1012	761	367	367
query97	2764	2708	2629	2629
query98	239	227	225	225
query99	1167	1147	1023	1023
Total cold run time: 254947 ms
Total hot run time: 170787 ms

@hello-stephen
Copy link
Copy Markdown
Contributor

FE UT Coverage Report

Increment line coverage 0.00% (0/13) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Copy link
Copy Markdown
Contributor

FE Regression Coverage Report

Increment line coverage 4.44% (2/45) 🎉
Increment coverage report
Complete coverage report

Copy link
Copy Markdown
Contributor

@liaoxin01 liaoxin01 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@github-actions github-actions Bot added the approved Indicates a PR has been approved by one committer. label May 29, 2026
@github-actions
Copy link
Copy Markdown
Contributor

PR approved by at least one committer and no changes requested.

@JNSimba JNSimba merged commit 6e27f11 into apache:master May 29, 2026
33 checks passed
github-actions Bot pushed a commit that referenced this pull request May 29, 2026
…hot splits (#63833)

## Summary
- Default-skip flink-cdc's in-snapshot backfill on the from-to path so
large splits no longer accumulate the entire chunk + backfill stream in
the fetcher's outputBuffer; from-to is at-least-once and tolerates the
duplicates this introduces. TVF (job-driven and standalone) keeps the
standard `false` default for exactly-once via per-task offset commit.
- Expose `skip_snapshot_backfill` as a user-facing property with strict
`true`/`false` validation on both from-to (CREATE JOB) and TVF (SELECT
FROM cdc_stream(...)) entry points.
- Fix snapshot completion under `pollWithoutBuffer`: a split is now
marked complete only after its high-watermark event has been consumed
(`splitState.getHighWatermark() != null`), not on the first non-empty
fetcher batch. Without this, enabling the new default truncates any
split larger than debezium's `max.batch.size` and yields an NPE on
offset extraction.
- Read `streaming_task_timeout_multiplier` live in
`StreamingMultiTblTask.isTimeout()` so `admin set frontend config`
affects already-running tasks, matching the `@ConfField(mutable=true)`
contract.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by one committer. dev/4.1.x

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants