[refactoring](multi-catalog) data_lake_reader_refactoring. #61783

Draft
kaka11chen wants to merge 2 commits into apache:master from kaka11chen:data_lake_reader_refactoring

@kaka11chen
Contributor

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Mar 26, 2026

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@kaka11chen
Contributor Author

run buildall

@hello-stephen
Contributor

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 78.67% (1796/2283)
Line Coverage 64.40% (32282/50128)
Region Coverage 65.28% (16163/24761)
Branch Coverage 55.78% (8620/15454)

@doris-robot

TPC-H: Total hot run time: 26855 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit b3338a3791cc4c2c64f44abadf924a0685d12f50, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17637	4444	4312	4312
q2	q3	10744	812	531	531
q4	4721	358	252	252
q5	8229	1251	1009	1009
q6	245	173	147	147
q7	818	845	685	685
q8	10867	1515	1330	1330
q9	6766	4771	4702	4702
q10	6379	1952	1803	1803
q11	481	248	241	241
q12	754	585	469	469
q13	18042	2697	1974	1974
q14	233	238	223	223
q15	q16	740	765	676	676
q17	724	865	440	440
q18	6010	5420	5261	5261
q19	1118	977	631	631
q20	551	497	379	379
q21	4738	2091	1506	1506
q22	392	357	284	284
Total cold run time: 100189 ms
Total hot run time: 26855 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4635	4507	4599	4507
q2	q3	3908	4349	3821	3821
q4	901	1185	774	774
q5	4075	4475	4374	4374
q6	199	180	137	137
q7	1769	1629	1543	1543
q8	2543	2737	2611	2611
q9	7586	7501	7459	7459
q10	3821	4001	3614	3614
q11	527	434	416	416
q12	491	593	433	433
q13	2622	3018	2041	2041
q14	312	318	300	300
q15	q16	773	768	717	717
q17	1201	1406	1273	1273
q18	7332	7009	6705	6705
q19	933	912	951	912
q20	2089	2163	1985	1985
q21	4051	3624	3333	3333
q22	455	466	375	375
Total cold run time: 50223 ms
Total hot run time: 47330 ms
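
The totals in these reports can be re-derived from the rows. The sketch below assumes the four numeric columns are one cold run, two hot runs, and the faster of the two hot runs; that reading is inferred from the data, not stated by the bot. It is consistent with the Round 1 table above, where the first and last numeric columns sum to the reported 100189 ms and 26855 ms. The helper names `parse_rows` and `totals` are hypothetical and not part of the Doris tooling.

```python
def parse_rows(text):
    """Map each query name to its four timings in milliseconds.

    Lines without exactly four trailing integers (separators, table-status
    lines, "Total ..." lines, queries with no result) are skipped.
    """
    results = {}
    for line in text.splitlines():
        fields = line.split()
        times = []
        # Peel integer fields off the end; whatever remains are name tokens.
        while fields and fields[-1].isdigit():
            times.insert(0, int(fields.pop()))
        if len(times) != 4 or not fields:
            continue
        # A row such as "q2  q3  ..." carries two names; the timings are
        # attributed to the last one (an assumption about the layout).
        results[fields[-1]] = times
    return results


def totals(results):
    """Return (cold_total, hot_total), taking hot as the last column."""
    cold = sum(t[0] for t in results.values())
    hot = sum(t[3] for t in results.values())
    return cold, hot


sample = (
    "q1\t17637\t4444\t4312\t4312\n"
    "q2\tq3\t10744\t812\t531\t531\n"
    "q4\t4721\t358\t252\t252\n"
    "Total cold run time: 100189 ms\n"
)
rows = parse_rows(sample)
print(totals(rows))  # (33102, 5095)
```

Feeding a full round into `parse_rows` allows the same comparison across commits, for example to check which queries account for a change in total hot run time.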

@doris-robot

TPC-DS: Total hot run time: 169048 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit b3338a3791cc4c2c64f44abadf924a0685d12f50, data reload: false

query5	4335	640	512	512
query6	334	231	200	200
query7	4231	469	263	263
query8	352	258	238	238
query9	8685	2737	2732	2732
query10	531	376	369	369
query11	7004	5096	4857	4857
query12	184	121	124	121
query13	1298	475	350	350
query14	5731	3752	3472	3472
query14_1	2970	2830	2874	2830
query15	209	195	180	180
query16	998	408	450	408
query17	909	729	635	635
query18	2453	456	357	357
query19	221	214	188	188
query20	131	126	126	126
query21	217	138	113	113
query22	13219	13990	14721	13990
query23	16612	16252	16088	16088
query23_1	16410	15833	15691	15691
query24	7188	1615	1221	1221
query24_1	1193	1233	1223	1223
query25	567	455	406	406
query26	1245	261	184	184
query27	2759	478	294	294
query28	4495	1820	1825	1820
query29	803	570	472	472
query30	302	217	188	188
query31	1019	952	869	869
query32	87	69	73	69
query33	509	318	275	275
query34	869	891	521	521
query35	660	681	590	590
query36	1096	1134	987	987
query37	142	98	86	86
query38	2968	2936	2833	2833
query39	845	822	793	793
query39_1	798	798	803	798
query40	235	153	136	136
query41	63	62	58	58
query42	261	251	252	251
query43	237	251	223	223
query44	
query45	190	190	180	180
query46	878	983	600	600
query47	2128	2158	2058	2058
query48	303	315	230	230
query49	633	464	380	380
query50	705	301	217	217
query51	4052	4069	4018	4018
query52	263	270	259	259
query53	286	332	293	293
query54	312	280	259	259
query55	87	88	83	83
query56	338	314	311	311
query57	1956	1719	1779	1719
query58	282	270	274	270
query59	2767	2943	2737	2737
query60	329	340	322	322
query61	186	155	149	149
query62	619	581	537	537
query63	307	284	282	282
query64	5011	1285	1021	1021
query65	
query66	1472	451	359	359
query67	24231	24252	24407	24252
query68	
query69	410	323	294	294
query70	986	981	967	967
query71	333	315	297	297
query72	2934	2902	2645	2645
query73	549	553	321	321
query74	9618	9553	9439	9439
query75	2845	2767	2503	2503
query76	2303	1030	670	670
query77	361	373	312	312
query78	10979	11070	10440	10440
query79	3110	772	591	591
query80	1729	624	510	510
query81	576	259	231	231
query82	998	158	121	121
query83	328	260	235	235
query84	302	121	106	106
query85	909	495	448	448
query86	423	299	294	294
query87	3134	3095	2954	2954
query88	3518	2611	2623	2611
query89	425	371	350	350
query90	2024	179	177	177
query91	168	168	135	135
query92	75	73	71	71
query93	1368	858	499	499
query94	668	321	294	294
query95	579	393	323	323
query96	647	511	231	231
query97	2436	2462	2413	2413
query98	237	217	226	217
query99	1005	1001	925	925
Total cold run time: 251988 ms
Total hot run time: 169048 ms

@kaka11chen force-pushed the data_lake_reader_refactoring branch from b3338a3 to 34767b8 on March 29, 2026 14:02
@kaka11chen
Contributor Author

run buildall

@hello-stephen
Contributor

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 78.67% (1796/2283)
Line Coverage 64.40% (32282/50128)
Region Coverage 65.30% (16170/24761)
Branch Coverage 55.76% (8617/15454)

@doris-robot

TPC-H: Total hot run time: 26534 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 34767b88726ba013e97e8d1f2385ccea1fd02453, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17637	4403	4319	4319
q2	q3	10742	797	530	530
q4	4721	381	256	256
q5	8247	1234	1023	1023
q6	247	172	145	145
q7	828	840	684	684
q8	10889	1505	1327	1327
q9	6956	4839	4730	4730
q10	6288	1944	1631	1631
q11	464	246	237	237
q12	712	586	458	458
q13	18046	2728	1958	1958
q14	233	238	215	215
q15	q16	735	761	663	663
q17	737	872	440	440
q18	6425	5417	5206	5206
q19	1102	979	615	615
q20	536	489	365	365
q21	4651	2109	1463	1463
q22	383	323	269	269
Total cold run time: 100579 ms
Total hot run time: 26534 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4640	4616	4617	4616
q2	q3	3926	4432	3837	3837
q4	881	1208	761	761
q5	4046	4433	4350	4350
q6	183	176	158	158
q7	1774	1709	1580	1580
q8	2502	2727	2624	2624
q9	7476	7486	7418	7418
q10	3796	3942	3825	3825
q11	507	431	417	417
q12	500	619	434	434
q13	2766	2899	2102	2102
q14	290	301	269	269
q15	q16	737	773	716	716
q17	1178	1380	1340	1340
q18	7269	6919	6762	6762
q19	895	952	901	901
q20	2085	2151	1986	1986
q21	3906	3487	3298	3298
q22	452	444	406	406
Total cold run time: 49809 ms
Total hot run time: 47800 ms

@doris-robot

TPC-DS: Total hot run time: 169391 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 34767b88726ba013e97e8d1f2385ccea1fd02453, data reload: false

query5	4348	620	507	507
query6	336	222	205	205
query7	4215	459	265	265
query8	346	233	232	232
query9	8714	2723	2686	2686
query10	506	385	346	346
query11	7028	5084	4889	4889
query12	179	131	132	131
query13	1277	475	353	353
query14	5697	3764	3448	3448
query14_1	2843	2819	2862	2819
query15	204	195	182	182
query16	1012	483	476	476
query17	1129	733	632	632
query18	2486	445	342	342
query19	210	211	201	201
query20	128	122	124	122
query21	214	132	107	107
query22	13242	14236	14584	14236
query23	16575	16249	16062	16062
query23_1	16338	16000	15540	15540
query24	7093	1628	1215	1215
query24_1	1219	1229	1209	1209
query25	558	458	452	452
query26	1238	262	147	147
query27	2784	476	292	292
query28	4510	1844	1850	1844
query29	877	568	466	466
query30	300	223	190	190
query31	1013	958	881	881
query32	89	77	66	66
query33	512	339	283	283
query34	909	857	526	526
query35	639	671	589	589
query36	1049	1146	961	961
query37	131	86	86	86
query38	2949	2927	2863	2863
query39	852	844	818	818
query39_1	798	796	797	796
query40	227	147	136	136
query41	63	59	58	58
query42	261	254	255	254
query43	234	246	223	223
query44	
query45	192	191	186	186
query46	883	985	635	635
query47	2134	2129	2061	2061
query48	299	314	227	227
query49	680	467	390	390
query50	679	274	218	218
query51	4100	4061	4003	4003
query52	260	276	253	253
query53	288	338	281	281
query54	297	268	261	261
query55	95	90	86	86
query56	310	317	319	317
query57	1924	1904	1643	1643
query58	282	274	272	272
query59	2777	2969	2762	2762
query60	340	332	310	310
query61	159	154	161	154
query62	631	574	539	539
query63	309	288	277	277
query64	5166	1283	987	987
query65	
query66	1463	449	363	363
query67	24265	24332	24345	24332
query68	
query69	408	314	292	292
query70	967	952	941	941
query71	348	312	293	293
query72	2979	2907	2627	2627
query73	559	564	320	320
query74	9607	9597	9406	9406
query75	2860	2798	2515	2515
query76	2291	1031	694	694
query77	376	427	335	335
query78	10954	11204	10494	10494
query79	2275	791	610	610
query80	1620	633	557	557
query81	537	260	226	226
query82	992	147	119	119
query83	328	264	249	249
query84	253	125	98	98
query85	899	494	472	472
query86	422	325	274	274
query87	3151	3215	3038	3038
query88	3530	2673	2651	2651
query89	430	368	344	344
query90	2022	183	179	179
query91	171	174	136	136
query92	77	75	72	72
query93	982	852	506	506
query94	638	282	291	282
query95	584	342	320	320
query96	642	508	227	227
query97	2465	2522	2413	2413
query98	240	219	219	219
query99	1001	983	899	899
Total cold run time: 251650 ms
Total hot run time: 169391 ms

@kaka11chen
Contributor Author

run buildall

@hello-stephen
Contributor

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 78.67% (1796/2283)
Line Coverage 64.40% (32282/50128)
Region Coverage 65.26% (16160/24761)
Branch Coverage 55.70% (8608/15454)

@kaka11chen force-pushed the data_lake_reader_refactoring branch from 1d321df to 2d94bf6 on March 30, 2026 01:51
@kaka11chen
Contributor Author

run buildall

@doris-robot

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 78.67% (1796/2283)
Line Coverage 64.40% (32280/50128)
Region Coverage 65.26% (16159/24761)
Branch Coverage 55.73% (8613/15454)

@doris-robot

TPC-H: Total hot run time: 26863 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 2d94bf6116ccff6027d9acea9d9747e1d4c7bb96, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17417	4486	4318	4318
q2	q3	10533	795	542	542
q4	4678	372	261	261
q5	7610	1255	1078	1078
q6	182	185	150	150
q7	805	879	691	691
q8	9376	1530	1379	1379
q9	4933	4755	4765	4755
q10	6302	1913	1667	1667
q11	470	268	270	268
q12	745	575	486	486
q13	18079	2728	1982	1982
q14	238	233	222	222
q15	q16	743	732	679	679
q17	736	853	442	442
q18	5927	5363	5218	5218
q19	1136	1001	602	602
q20	557	499	380	380
q21	4459	1902	1455	1455
q22	411	467	288	288
Total cold run time: 95337 ms
Total hot run time: 26863 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4866	4721	4679	4679
q2	q3	3897	4356	3825	3825
q4	877	1230	793	793
q5	4047	4430	4378	4378
q6	189	189	147	147
q7	1815	1670	1546	1546
q8	2554	2768	2689	2689
q9	7566	7453	7558	7453
q10	3762	4039	3586	3586
q11	506	470	416	416
q12	504	598	446	446
q13	2501	3018	2094	2094
q14	282	299	276	276
q15	q16	722	768	738	738
q17	1167	1327	1400	1327
q18	7302	6801	6694	6694
q19	933	911	911	911
q20	2151	2165	2035	2035
q21	3999	3477	3334	3334
q22	542	568	391	391
Total cold run time: 50182 ms
Total hot run time: 47758 ms

@doris-robot

TPC-DS: Total hot run time: 168909 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 2d94bf6116ccff6027d9acea9d9747e1d4c7bb96, data reload: false

query5	4357	661	515	515
query6	343	228	204	204
query7	4206	475	259	259
query8	333	245	227	227
query9	8684	2723	2683	2683
query10	490	397	331	331
query11	6985	5098	4908	4908
query12	182	130	122	122
query13	1301	470	335	335
query14	5733	3695	3410	3410
query14_1	2853	2796	2775	2775
query15	203	193	174	174
query16	952	463	452	452
query17	1000	698	593	593
query18	2450	447	340	340
query19	208	213	172	172
query20	132	123	122	122
query21	213	133	108	108
query22	13312	13905	14391	13905
query23	16806	16439	15893	15893
query23_1	16234	16006	15802	15802
query24	7151	1620	1222	1222
query24_1	1231	1229	1237	1229
query25	585	501	446	446
query26	1243	265	153	153
query27	2799	486	294	294
query28	4499	1884	1819	1819
query29	813	560	484	484
query30	301	230	191	191
query31	1019	951	868	868
query32	89	72	67	67
query33	520	344	299	299
query34	914	879	514	514
query35	644	671	604	604
query36	1105	1118	1002	1002
query37	141	93	91	91
query38	2939	2870	2901	2870
query39	865	832	807	807
query39_1	801	798	797	797
query40	258	153	139	139
query41	61	61	59	59
query42	263	253	255	253
query43	235	248	220	220
query44	
query45	193	188	181	181
query46	878	993	606	606
query47	2734	2727	2076	2076
query48	321	330	230	230
query49	632	457	380	380
query50	704	278	219	219
query51	4124	3985	4012	3985
query52	270	264	256	256
query53	288	327	283	283
query54	319	275	309	275
query55	92	90	84	84
query56	310	324	315	315
query57	1933	1732	1709	1709
query58	283	263	272	263
query59	2783	2972	2820	2820
query60	351	338	327	327
query61	159	155	156	155
query62	636	590	541	541
query63	306	277	274	274
query64	4974	1281	1010	1010
query65	
query66	1469	455	363	363
query67	24242	24259	24113	24113
query68	
query69	415	315	293	293
query70	961	971	941	941
query71	326	308	296	296
query72	2845	2728	2577	2577
query73	544	553	327	327
query74	9635	9549	9403	9403
query75	2884	2767	2514	2514
query76	2266	1055	683	683
query77	386	387	334	334
query78	10885	11155	10515	10515
query79	1134	792	571	571
query80	1037	654	574	574
query81	528	263	237	237
query82	1336	162	124	124
query83	341	282	286	282
query84	277	123	107	107
query85	900	521	458	458
query86	392	321	291	291
query87	3127	3140	3037	3037
query88	3543	2661	2602	2602
query89	432	369	348	348
query90	1887	175	168	168
query91	171	166	145	145
query92	80	76	69	69
query93	917	832	508	508
query94	521	323	297	297
query95	579	345	324	324
query96	643	528	225	225
query97	2497	2496	2411	2411
query98	233	219	217	217
query99	1024	1011	941	941
Total cold run time: 250788 ms
Total hot run time: 168909 ms

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 63.64% (7/11) 🎉
Increment coverage report
Complete coverage report

@kaka11chen force-pushed the data_lake_reader_refactoring branch from 2d94bf6 to 881d907 on March 30, 2026 15:40
@kaka11chen force-pushed the data_lake_reader_refactoring branch from 881d907 to 9613fb2 on March 30, 2026 17:18
@kaka11chen
Contributor Author

run buildall

@doris-robot

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 78.67% (1796/2283)
Line Coverage 64.44% (32304/50128)
Region Coverage 65.31% (16171/24761)
Branch Coverage 55.77% (8618/15454)

4 participants