@suxiaogang223
Contributor

What problem does this PR solve?

Problem Summary:
When querying external tables, a file_split_size that is set too small causes large files to be divided into a very large number of splits. In non-batch mode, all splits are loaded into memory at once, so an excessive split count can cause an OOM.

Solution

Add a max_file_splits_num session variable that limits the total number of splits across all files. Before generating splits, the system performs a global assessment:

  1. Collect file information: Gather file sizes before splitting
  2. Estimate total split count: Calculate estimated total splits based on total file size and file_split_size
  3. Dynamic adjustment: If the estimated split count exceeds max_file_splits_num, automatically increase file_split_size to ceil(total_file_size / max_file_splits_num)
  4. Execute split: Use the adjusted split size for splitting
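
The four steps above can be sketched in Java roughly as follows. This is a minimal illustration, not the code from this PR: the class name `SplitSizeSketch`, the method signature, the `ceilDiv` helper, and the treatment of negative limits as "disabled" are assumptions made for the example. Only the method name `adjustSplitSizeForTotalLimit`, the formula ceil(total_file_size / max_file_splits_num), and the "0 disables the limit" behavior come from this description.

```java
/** Hypothetical sketch of the split-size adjustment; not the actual Doris implementation. */
final class SplitSizeSketch {

    /**
     * Returns the split size to use so that the estimated split count
     * stays within maxFileSplitsNum. A limit of 0 (or, by assumption,
     * any non-positive value) leaves the configured split size unchanged.
     */
    static long adjustSplitSizeForTotalLimit(long[] fileSizes, long fileSplitSize, long maxFileSplitsNum) {
        if (maxFileSplitsNum <= 0 || fileSplitSize <= 0) {
            return fileSplitSize;
        }
        // Step 1: collect file information (here, just the sizes).
        long totalFileSize = 0;
        for (long fileSize : fileSizes) {
            totalFileSize += fileSize;
        }
        // Step 2: estimate the total split count from the total size.
        long estimatedSplits = ceilDiv(totalFileSize, fileSplitSize);
        if (estimatedSplits <= maxFileSplitsNum) {
            return fileSplitSize;
        }
        // Step 3: raise the split size so the estimate fits the limit.
        return ceilDiv(totalFileSize, maxFileSplitsNum);
    }

    /** Ceiling division for non-negative a and positive b. */
    static long ceilDiv(long a, long b) {
        return (a + b - 1) / b;
    }
}
```

For example, with files totaling 4000 bytes, a split size of 10, and a limit of 100, the estimate is 400 splits, so the sketch raises the split size to 40. If each file is then split independently, per-file rounding means the real split count can exceed the estimate by up to one split per file.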

Main Changes

  • SessionVariable.java: Add max_file_splits_num configuration (default: 100000)
  • FileQueryScanNode.java: Add adjustSplitSizeForTotalLimit() method to estimate and adjust split size
  • HiveScanNode.java: Collect file information and adjust split size before splitting
  • TVFScanNode.java: Collect file information and adjust split size before splitting
  • PaimonScanNode.java: Collect file information and adjust split size before splitting

Usage

-- Set the maximum total split count
SET max_file_splits_num = 1000000;

-- Set to 0 to disable the limit
SET max_file_splits_num = 0;

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Dec 5, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@suxiaogang223
Contributor Author

run buildall

@suxiaogang223
Contributor Author

run buildall

@suxiaogang223
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 34570 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit dd684a2f23bf54f84260040a028e891620b229bc, data reload: false

------ Round 1 ----------------------------------
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	17610	5044	4873	4873
q2	2043	314	192	192
q3	10273	1291	716	716
q4	10202	803	306	306
q5	7527	2403	2154	2154
q6	179	163	133	133
q7	967	790	632	632
q8	9338	1405	1070	1070
q9	7007	5310	5277	5277
q10	6832	2195	1791	1791
q11	520	310	294	294
q12	339	367	230	230
q13	17790	3681	3093	3093
q14	238	237	215	215
q15	598	524	517	517
q16	893	883	814	814
q17	669	796	486	486
q18	7441	7179	7739	7179
q19	1398	995	621	621
q20	420	371	245	245
q21	4215	4035	2691	2691
q22	1106	1096	1041	1041
Total cold run time: 107605 ms
Total hot run time: 34570 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	5262	5095	5165	5095
q2	337	417	323	323
q3	2337	2821	2513	2513
q4	1517	1874	1493	1493
q5	4584	4438	4445	4438
q6	218	167	125	125
q7	1957	1967	1845	1845
q8	2632	2605	2514	2514
q9	7602	7512	7154	7154
q10	2948	3097	2625	2625
q11	562	497	472	472
q12	619	694	549	549
q13	3236	3624	3009	3009
q14	261	287	285	285
q15	536	512	499	499
q16	879	913	860	860
q17	1112	1341	1321	1321
q18	7236	7270	6926	6926
q19	829	790	815	790
q20	1959	1967	1839	1839
q21	4675	4222	4230	4222
q22	1076	1062	965	965
Total cold run time: 52374 ms
Total hot run time: 49862 ms

@doris-robot

TPC-DS: Total hot run time: 180264 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit dd684a2f23bf54f84260040a028e891620b229bc, data reload: false

query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
query5	5156	654	509	509
query6	338	256	211	211
query7	4660	467	287	287
query8	312	269	237	237
query9	8747	2628	2610	2610
query10	556	308	285	285
query11	15616	15050	14626	14626
query12	176	116	116	116
query13	1676	498	396	396
query14	6179	3317	3057	3057
query14_1	2906	2965	2923	2923
query15	212	200	188	188
query16	7720	485	484	484
query17	1191	693	584	584
query18	2035	426	321	321
query19	204	190	157	157
query20	130	123	116	116
query21	215	132	110	110
query22	3892	3993	3857	3857
query23	16524	16169	15921	15921
query23_1	16214	16191	15912	15912
query24	7174	1641	1198	1198
query24_1	1202	1184	1222	1184
query25	611	477	414	414
query26	1248	289	208	208
query27	2878	466	304	304
query28	4379	2167	2160	2160
query29	813	549	453	453
query30	313	242	221	221
query31	810	689	620	620
query32	81	74	69	69
query33	653	347	297	297
query34	873	883	547	547
query35	808	832	733	733
query36	895	903	816	816
query37	122	90	84	84
query38	3840	3936	3754	3754
query39	760	741	734	734
query39_1	709	692	701	692
query40	221	130	118	118
query41	67	62	63	62
query42	129	100	96	96
query43	445	417	394	394
query44	1320	772	758	758
query45	195	190	186	186
query46	893	970	595	595
query47	1698	1725	1625	1625
query48	408	319	235	235
query49	797	430	347	347
query50	702	317	231	231
query51	3845	3894	3978	3894
query52	119	100	89	89
query53	233	235	179	179
query54	338	266	249	249
query55	100	80	76	76
query56	340	303	301	301
query57	1164	1129	1133	1129
query58	302	264	255	255
query59	2285	2355	2342	2342
query60	372	315	305	305
query61	194	193	190	190
query62	786	684	652	652
query63	237	180	184	180
query64	4675	1309	897	897
query65	4039	3945	3961	3945
query66	1172	447	342	342
query67	15499	15048	14966	14966
query68	4744	965	672	672
query69	526	295	271	271
query70	1122	1005	983	983
query71	431	297	280	280
query72	5981	4923	5114	4923
query73	699	581	303	303
query74	8613	8888	8763	8763
query75	3026	3026	2567	2567
query76	3296	1127	738	738
query77	531	397	308	308
query78	9501	9643	8835	8835
query79	1547	849	591	591
query80	1710	545	457	457
query81	545	270	244	244
query82	412	128	101	101
query83	385	272	259	259
query84	254	120	97	97
query85	964	512	454	454
query86	381	308	284	284
query87	4015	4068	3887	3887
query88	2943	2122	2124	2122
query89	398	326	283	283
query90	1840	175	174	174
query91	178	167	141	141
query92	73	67	66	66
query93	1191	1031	687	687
query94	767	285	281	281
query95	566	341	326	326
query96	549	489	212	212
query97	2621	2671	2632	2632
query98	241	203	195	195
query99	1321	1333	1227	1227
Total cold run time: 266819 ms
Total hot run time: 180264 ms

@doris-robot

ClickBench: Total hot run time: 27.19 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit dd684a2f23bf54f84260040a028e891620b229bc, data reload: false

query	cold run (s)	hot run 1 (s)	hot run 2 (s)
query1	0.05	0.05	0.05
query2	0.10	0.05	0.04
query3	0.25	0.09	0.09
query4	1.61	0.12	0.11
query5	0.27	0.24	0.27
query6	1.16	0.64	0.62
query7	0.03	0.02	0.03
query8	0.05	0.04	0.05
query9	0.57	0.51	0.50
query10	0.56	0.56	0.57
query11	0.15	0.10	0.11
query12	0.16	0.11	0.12
query13	0.64	0.59	0.60
query14	0.99	0.98	0.98
query15	0.82	0.80	0.80
query16	0.40	0.39	0.39
query17	1.07	1.01	1.04
query18	0.24	0.21	0.22
query19	1.91	1.81	1.74
query20	0.02	0.01	0.02
query21	15.46	0.28	0.14
query22	4.84	0.05	0.05
query23	16.03	0.29	0.10
query24	1.63	0.63	0.18
query25	0.07	0.05	0.06
query26	0.14	0.13	0.13
query27	0.06	0.05	0.04
query28	4.27	1.21	1.02
query29	12.57	3.99	3.26
query30	0.27	0.14	0.11
query31	2.81	0.61	0.40
query32	3.22	0.56	0.46
query33	3.04	3.09	3.12
query34	16.83	5.19	4.56
query35	4.60	4.55	4.58
query36	0.64	0.49	0.50
query37	0.10	0.06	0.07
query38	0.07	0.05	0.04
query39	0.04	0.03	0.03
query40	0.18	0.15	0.13
query41	0.08	0.04	0.03
query42	0.04	0.03	0.03
query43	0.04	0.03	0.04
Total cold run time: 98.08 s
Total hot run time: 27.19 s

@suxiaogang223
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 34415 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit 906053626d692934eef5a18e9be90e03cf87a863, data reload: false

------ Round 1 ----------------------------------
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	17625	4998	4889	4889
q2	2043	316	197	197
q3	10237	1314	733	733
q4	10207	800	328	328
q5	7530	2465	2152	2152
q6	208	172	135	135
q7	966	775	637	637
q8	9356	1399	1098	1098
q9	7064	5324	5383	5324
q10	6812	2215	1830	1830
q11	508	329	290	290
q12	352	377	226	226
q13	17766	3700	3057	3057
q14	234	235	212	212
q15	574	522	517	517
q16	885	867	807	807
q17	684	800	519	519
q18	7391	7078	7130	7078
q19	961	971	597	597
q20	380	348	219	219
q21	3979	3522	2600	2600
q22	1032	992	970	970
Total cold run time: 106794 ms
Total hot run time: 34415 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	4953	4911	4921	4911
q2	336	418	323	323
q3	2126	2716	2286	2286
q4	1322	1776	1285	1285
q5	4265	4821	4660	4660
q6	219	175	126	126
q7	2050	2003	1814	1814
q8	2664	2435	2559	2435
q9	7814	7599	7697	7599
q10	3028	3234	2875	2875
q11	618	507	501	501
q12	709	725	608	608
q13	3589	3954	3391	3391
q14	277	300	269	269
q15	552	502	498	498
q16	884	940	887	887
q17	1175	1414	1451	1414
q18	8149	7764	7702	7702
q19	859	894	916	894
q20	2045	2109	1932	1932
q21	4984	4421	4169	4169
q22	1085	1031	997	997
Total cold run time: 53703 ms
Total hot run time: 51576 ms

@doris-robot

TPC-DS: Total hot run time: 180789 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 906053626d692934eef5a18e9be90e03cf87a863, data reload: false

query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
query5	5139	663	501	501
query6	346	232	211	211
query7	4668	480	298	298
query8	308	270	245	245
query9	8735	2672	2666	2666
query10	578	332	274	274
query11	15345	14804	14796	14796
query12	197	123	123	123
query13	1687	470	369	369
query14	6320	3334	3053	3053
query14_1	2932	2908	2950	2908
query15	215	198	185	185
query16	7681	500	456	456
query17	1220	739	613	613
query18	2080	433	348	348
query19	227	193	170	170
query20	136	125	122	122
query21	225	137	115	115
query22	4095	3942	3835	3835
query23	16681	16288	16084	16084
query23_1	16118	16036	16132	16036
query24	7294	1639	1220	1220
query24_1	1247	1233	1219	1219
query25	658	512	454	454
query26	1273	310	179	179
query27	2876	490	326	326
query28	4403	2212	2203	2203
query29	848	623	436	436
query30	319	248	215	215
query31	844	699	620	620
query32	78	67	73	67
query33	668	342	296	296
query34	869	914	549	549
query35	800	824	725	725
query36	912	923	839	839
query37	124	91	79	79
query38	3841	3814	3809	3809
query39	749	755	720	720
query39_1	695	708	712	708
query40	228	132	116	116
query41	66	61	59	59
query42	128	99	97	97
query43	441	428	390	390
query44	1306	755	773	755
query45	201	191	185	185
query46	927	962	595	595
query47	1704	1715	1613	1613
query48	417	325	229	229
query49	772	436	376	376
query50	697	305	243	243
query51	3891	3863	3927	3863
query52	127	98	87	87
query53	239	232	177	177
query54	308	272	246	246
query55	105	81	80	80
query56	330	297	295	295
query57	1160	1152	1111	1111
query58	333	275	256	256
query59	2285	2347	2314	2314
query60	354	312	307	307
query61	164	154	157	154
query62	802	677	650	650
query63	234	178	176	176
query64	4535	1179	897	897
query65	4068	3991	4023	3991
query66	1208	444	339	339
query67	15538	14882	14805	14805
query68	8331	946	669	669
query69	523	299	265	265
query70	1124	994	961	961
query71	428	296	265	265
query72	5930	4954	4872	4872
query73	680	548	300	300
query74	8544	8848	8761	8761
query75	3034	3048	2516	2516
query76	3363	1126	745	745
query77	536	391	308	308
query78	9447	9646	8897	8897
query79	1283	870	584	584
query80	660	562	493	493
query81	521	268	243	243
query82	217	130	106	106
query83	283	275	260	260
query84	281	129	95	95
query85	905	496	456	456
query86	377	308	294	294
query87	4047	4085	4009	4009
query88	2948	2184	2116	2116
query89	394	325	282	282
query90	2066	166	160	160
query91	184	167	138	138
query92	89	69	66	66
query93	1693	1065	679	679
query94	774	309	287	287
query95	575	390	344	344
query96	545	479	212	212
query97	2631	2711	2576	2576
query98	247	210	207	207
query99	1357	1319	1203	1203
Total cold run time: 269955 ms
Total hot run time: 180789 ms

@doris-robot

ClickBench: Total hot run time: 27.57 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 906053626d692934eef5a18e9be90e03cf87a863, data reload: false

query	cold run (s)	hot run 1 (s)	hot run 2 (s)
query1	0.05	0.05	0.05
query2	0.11	0.04	0.05
query3	0.26	0.10	0.09
query4	1.61	0.11	0.11
query5	0.29	0.26	0.25
query6	1.16	0.65	0.64
query7	0.03	0.03	0.03
query8	0.06	0.05	0.04
query9	0.58	0.52	0.51
query10	0.56	0.55	0.57
query11	0.16	0.11	0.12
query12	0.15	0.11	0.12
query13	0.62	0.60	0.61
query14	1.00	0.98	0.98
query15	0.81	0.80	0.79
query16	0.40	0.39	0.39
query17	1.02	1.02	1.01
query18	0.22	0.21	0.22
query19	1.95	1.91	1.87
query20	0.02	0.01	0.02
query21	15.46	0.31	0.13
query22	4.68	0.05	0.05
query23	16.12	0.27	0.11
query24	0.99	0.62	0.47
query25	0.09	0.06	0.06
query26	0.13	0.14	0.13
query27	0.05	0.05	0.05
query28	3.86	1.23	1.03
query29	12.58	4.10	3.31
query30	0.28	0.14	0.12
query31	2.82	0.64	0.40
query32	3.23	0.57	0.46
query33	2.98	3.07	3.13
query34	16.93	5.17	4.44
query35	4.54	4.57	4.53
query36	0.66	0.51	0.48
query37	0.11	0.07	0.07
query38	0.07	0.04	0.03
query39	0.05	0.03	0.02
query40	0.17	0.14	0.13
query41	0.08	0.03	0.03
query42	0.04	0.03	0.02
query43	0.04	0.04	0.04
Total cold run time: 97.02 s
Total hot run time: 27.57 s

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 37.50% (36/96) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 67.71% (65/96) 🎉
Increment coverage report
Complete coverage report


// Calculate total file size
long totalFileSize = 0;
for (long fileSize : fileSizes) {
Contributor

You iterate over fileSizes twice.
A single pass is enough.
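
The quoted diff context is truncated, so what the second pass over fileSizes computes is not visible here. As a hedged illustration of the suggestion only, assume the second loop derives another aggregate from the same sizes; the largest file size is used below purely as a stand-in. Both values can then be accumulated in one loop. `FileSizeStats` and its fields are hypothetical names, not code from this PR.

```java
/** Hypothetical single-pass aggregation over file sizes; not the code under review. */
final class FileSizeStats {
    final long totalFileSize;
    final long maxFileSize; // stand-in for whatever the second loop computed

    private FileSizeStats(long totalFileSize, long maxFileSize) {
        this.totalFileSize = totalFileSize;
        this.maxFileSize = maxFileSize;
    }

    /** Computes both aggregates while iterating over fileSizes exactly once. */
    static FileSizeStats of(long[] fileSizes) {
        long total = 0;
        long max = 0;
        for (long fileSize : fileSizes) {
            total += fileSize;
            max = Math.max(max, fileSize);
        }
        return new FileSizeStats(total, max);
    }
}
```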
