zy-kkk (Member) commented Oct 9, 2025

What problem does this PR solve?

We are removing LakeSoul external catalog support from the latest Doris release due to several critical issues:

  1. Design Flaws – The current driver management design causes conflicts with other components, leading to instability and increased maintenance complexity.
  2. Security Vulnerabilities – LakeSoul dependencies contain multiple CVE-reported vulnerabilities, posing significant security risks.
  3. Lack of Maintenance – The LakeSoul catalog integration in Doris lacks active maintainers and has not received timely updates.
  4. No User Adoption – No user feedback or usage requests have been received, which suggests that the feature sees little or no use in practice.

Given these factors, maintaining this integration introduces unnecessary security and maintenance burdens without providing tangible value to users. Therefore, we have decided to remove LakeSoul catalog support from this release.


Code Removal

  • Removed all LakeSoul Java implementation code from FE (fe/fe-core/src/main/java/org/apache/doris/datasource/lakesoul/)
  • Removed LakeSoul scanner BE Java extension module (fe/be-java-extensions/lakesoul-scanner/)
  • Removed C++ JNI reader implementation (be/src/vec/exec/format/table/lakesoul_jni_reader.*)
  • Removed all unit tests and regression tests related to LakeSoul
  • Removed Docker deployment configurations

Dependency Cleanup

  • Removed lakesoul-io-java dependency from fe/fe-core/pom.xml
  • Removed lakesoul-scanner module from Maven reactor and build scripts

Code Reference Cleanup

  • Removed LakeSoul catalog factory logic from CatalogFactory.java
  • Removed LAKESOUL enum from TableFormatType.java and TableIf.TableType
  • Removed LakeSoul scan node references from planner and statistics modules
  • Removed GSON serialization registration for LakeSoul classes
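
As a rough illustration of the user-visible effect of removing the factory logic, the sketch below models catalog creation as a lookup by type name. Everything in it is an assumption made for illustration: `CatalogTypeRegistrySketch`, the type names, and the error message are hypothetical and do not reproduce the real `CatalogFactory.java`.

```java
import java.util.Locale;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical registry for illustration; the real CatalogFactory in Doris
// is structured differently and is not reproduced here.
class CatalogTypeRegistrySketch {
    // Illustrative type names and placeholder results. There is deliberately
    // no "lakesoul" entry, mirroring the state after the removal.
    static final Map<String, Supplier<String>> FACTORIES = Map.of(
            "hms", () -> "HMS catalog",
            "iceberg", () -> "Iceberg catalog");

    // Looks up the factory for a catalog type and rejects unregistered types.
    static String create(String type) {
        Supplier<String> factory = FACTORIES.get(type.toLowerCase(Locale.ROOT));
        if (factory == null) {
            throw new IllegalArgumentException("Unknown catalog type: " + type);
        }
        return factory.get();
    }

    public static void main(String[] args) {
        System.out.println(create("hms"));
        try {
            create("lakesoul");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model, a request for a `lakesoul` catalog fails at creation time instead of reaching any LakeSoul code path. The actual error text a Doris user would see is not specified in this PR.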

Backward Compatibility

  • Marked Thrift structs (TLakeSoulTable, TLakeSoulFileDesc) as deprecated instead of removing them
  • Marked LAKESOUL enum values in InitCatalogLog.Type and InitDatabaseLog.Type as @Deprecated
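
The PR does not spell out why these values are deprecated rather than deleted. One plausible reason, stated here as an assumption, is that previously persisted metadata may still carry the `LAKESOUL` name, and `Enum.valueOf` throws `IllegalArgumentException` for a name that no longer exists. The sketch below uses a hypothetical `LogType` enum and `parse` helper, not the real `InitCatalogLog.Type`.

```java
// Hypothetical stand-in for an enum such as InitCatalogLog.Type; the real
// constant list in Doris is different and is not reproduced here.
enum LogType {
    HMS,
    ICEBERG,
    @Deprecated
    LAKESOUL, // kept only so that previously persisted names still resolve
    UNKNOWN
}

class DeprecatedEnumSketch {
    // Resolves a persisted type name. Names that are no longer present in
    // the enum fall back to UNKNOWN (an illustrative choice, not Doris behavior).
    static LogType parse(String persistedName) {
        try {
            return LogType.valueOf(persistedName);
        } catch (IllegalArgumentException e) {
            return LogType.UNKNOWN;
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("LAKESOUL"));     // constant still exists
        System.out.println(parse("REMOVED_TYPE")); // no such constant
    }
}
```

Keeping the constant means old names continue to resolve without a special case, while the `@Deprecated` marker discourages new uses.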

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

zy-kkk requested a review from morrySnow as a code owner on Oct 9, 2025, 07:33
Thearas (Contributor) commented Oct 9, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

zy-kkk (Member, Author) commented Oct 9, 2025

run buildall

zy-kkk changed the title from "[chore](catalog) Remove LakeSoul External Catalog Support" to "branch-3.1: [chore](catalog) Remove LakeSoul External Catalog Support" on Oct 9, 2025
@doris-robot

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 82.35% (1236/1501)
Line Coverage 66.22% (22318/33704)
Region Coverage 67.56% (11169/16532)
Branch Coverage 57.32% (5913/10316)

zy-kkk (Member, Author) commented Oct 9, 2025

run buildall

@hello-stephen (Contributor)

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 82.35% (1236/1501)
Line Coverage 66.28% (22338/33704)
Region Coverage 67.62% (11179/16532)
Branch Coverage 57.39% (5920/10316)

@doris-robot

TPC-H: Total hot run time: 33034 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit d009c19086165dd3ff64c0ff0248b66722ee0ab4, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17895	5553	5531	5531
q2	2040	407	298	298
q3	12315	1279	753	753
q4	10275	882	449	449
q5	8961	2413	2168	2168
q6	238	163	131	131
q7	901	783	608	608
q8	9342	1481	1185	1185
q9	5341	5035	4997	4997
q10	6807	2297	1815	1815
q11	485	279	261	261
q12	344	370	216	216
q13	17795	3610	3032	3032
q14	230	229	211	211
q15	528	469	471	469
q16	421	441	375	375
q17	611	876	365	365
q18	7054	6573	6403	6403
q19	1487	962	548	548
q20	350	339	208	208
q21	2858	2184	1998	1998
q22	1070	1066	1013	1013
Total cold run time: 107348 ms
Total hot run time: 33034 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	5559	5517	5509	5509
q2	238	335	236	236
q3	2248	2660	2357	2357
q4	1328	1846	1394	1394
q5	4436	5140	5043	5043
q6	173	163	128	128
q7	2120	1955	1815	1815
q8	2692	2879	2755	2755
q9	7304	7392	7346	7346
q10	3047	3222	2795	2795
q11	563	512	516	512
q12	705	775	608	608
q13	3403	3795	3172	3172
q14	281	299	266	266
q15	525	481	477	477
q16	427	470	451	451
q17	1267	1758	1278	1278
q18	7544	7483	7451	7451
q19	833	1156	1071	1071
q20	2013	2055	1889	1889
q21	5400	4988	4482	4482
q22	1082	1071	1043	1043
Total cold run time: 53188 ms
Total hot run time: 52078 ms
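
The reported totals can be checked against the per-query rows. The sketch below assumes the four timing columns are the cold run, two hot runs, and the faster of the two hot runs. The report does not label its columns, so this reading is inferred from the numbers. `TpchTotals` and `sumColumn` are hypothetical names, not part of the Doris benchmark scripts.

```java
// Hypothetical helper, not part of Doris or its CI tooling.
class TpchTotals {
    // Round 1 rows copied from the report: query name, then four timings in ms.
    static final String[] ROUND1 = {
        "q1 17895 5553 5531 5531",
        "q2 2040 407 298 298",
        "q3 12315 1279 753 753",
        "q4 10275 882 449 449",
        "q5 8961 2413 2168 2168",
        "q6 238 163 131 131",
        "q7 901 783 608 608",
        "q8 9342 1481 1185 1185",
        "q9 5341 5035 4997 4997",
        "q10 6807 2297 1815 1815",
        "q11 485 279 261 261",
        "q12 344 370 216 216",
        "q13 17795 3610 3032 3032",
        "q14 230 229 211 211",
        "q15 528 469 471 469",
        "q16 421 441 375 375",
        "q17 611 876 365 365",
        "q18 7054 6573 6403 6403",
        "q19 1487 962 548 548",
        "q20 350 339 208 208",
        "q21 2858 2184 1998 1998",
        "q22 1070 1066 1013 1013",
    };

    // Sums one timing column over all rows (index 1 is the first timing,
    // index 0 is the query name).
    static long sumColumn(String[] rows, int column) {
        long total = 0;
        for (String row : rows) {
            total += Long.parseLong(row.trim().split("\\s+")[column]);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("cold: " + sumColumn(ROUND1, 1) + " ms");
        System.out.println("hot:  " + sumColumn(ROUND1, 4) + " ms");
    }
}
```

Under that assumption, the first timing column sums to 107348 ms and the last to 33034 ms, matching the Round 1 totals reported above.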

@doris-robot

TPC-DS: Total hot run time: 192890 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit d009c19086165dd3ff64c0ff0248b66722ee0ab4, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query1	991	399	404	399
query2	6143	1965	1893	1893
query3	8688	200	199	199
query4	33728	24225	23550	23550
query5	3893	610	469	469
query6	326	196	180	180
query7	4205	508	315	315
query8	303	243	233	233
query9	9435	2639	2629	2629
query10	489	334	263	263
query11	17966	15426	15229	15229
query12	159	110	109	109
query13	1572	565	425	425
query14	10368	6662	6749	6662
query15	226	195	182	182
query16	8044	674	536	536
query17	1550	773	598	598
query18	2196	421	340	340
query19	208	221	180	180
query20	142	125	124	124
query21	207	138	120	120
query22	4678	4643	4477	4477
query23	35432	34554	34221	34221
query24	7656	2719	2743	2719
query25	554	505	445	445
query26	1192	309	173	173
query27	2313	506	375	375
query28	5436	2283	2278	2278
query29	794	626	461	461
query30	254	200	168	168
query31	1050	931	837	837
query32	96	59	62	59
query33	497	367	309	309
query34	764	866	521	521
query35	821	824	743	743
query36	1004	1077	948	948
query37	106	93	70	70
query38	4036	4006	3973	3973
query39	1541	1492	1477	1477
query40	211	122	112	112
query41	51	51	46	46
query42	121	104	103	103
query43	497	521	495	495
query44	1362	850	834	834
query45	188	189	178	178
query46	915	1069	689	689
query47	2032	2003	1919	1919
query48	409	429	353	353
query49	769	499	421	421
query50	698	706	436	436
query51	7355	7386	7244	7244
query52	113	103	91	91
query53	238	264	191	191
query54	561	582	483	483
query55	78	81	81	81
query56	274	278	283	278
query57	1276	1263	1227	1227
query58	237	225	219	219
query59	3094	3205	3108	3108
query60	295	297	292	292
query61	112	112	120	112
query62	867	779	733	733
query63	240	198	196	196
query64	4470	1014	636	636
query65	3423	3287	3340	3287
query66	1030	423	309	309
query67	16205	16020	15729	15729
query68	7853	859	550	550
query69	486	308	269	269
query70	1180	1170	1106	1106
query71	417	300	262	262
query72	5788	3799	3789	3789
query73	647	756	359	359
query74	10465	9148	8954	8954
query75	3562	3183	2664	2664
query76	3299	1196	802	802
query77	784	371	272	272
query78	10326	10401	9579	9579
query79	3822	894	592	592
query80	744	521	443	443
query81	497	258	219	219
query82	594	116	90	90
query83	177	167	145	145
query84	289	102	81	81
query85	809	364	306	306
query86	393	330	295	295
query87	4353	4402	4256	4256
query88	4970	2437	2412	2412
query89	408	328	297	297
query90	1776	195	196	195
query91	141	151	110	110
query92	69	56	54	54
query93	2234	889	566	566
query94	654	410	311	311
query95	350	284	266	266
query96	503	620	287	287
query97	3228	3318	3180	3180
query98	225	218	202	202
query99	1548	1420	1286	1286
Total cold run time: 298775 ms
Total hot run time: 192890 ms

@doris-robot

ClickBench: Total hot run time: 29.36 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit d009c19086165dd3ff64c0ff0248b66722ee0ab4, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.03	0.03	0.03
query2	0.10	0.05	0.04
query3	0.23	0.06	0.05
query4	1.63	0.08	0.08
query5	0.53	0.51	0.52
query6	1.13	0.75	0.75
query7	0.02	0.02	0.02
query8	0.06	0.05	0.05
query9	0.57	0.51	0.50
query10	0.58	0.57	0.56
query11	0.17	0.12	0.12
query12	0.16	0.12	0.12
query13	0.62	0.60	0.59
query14	0.80	0.81	0.80
query15	0.86	0.85	0.85
query16	0.39	0.38	0.38
query17	1.08	1.08	1.03
query18	0.18	0.20	0.20
query19	1.94	1.84	2.02
query20	0.02	0.01	0.02
query21	15.35	0.99	0.67
query22	0.80	0.79	0.71
query23	14.72	1.49	0.71
query24	2.23	0.38	0.22
query25	0.15	0.10	0.09
query26	0.29	0.19	0.18
query27	0.09	0.08	0.08
query28	13.41	1.29	0.55
query29	12.67	4.09	3.33
query30	0.24	0.08	0.06
query31	2.82	0.60	0.40
query32	3.23	0.56	0.49
query33	3.03	3.05	3.04
query34	16.54	5.20	4.55
query35	4.65	4.63	4.64
query36	0.64	0.49	0.51
query37	0.18	0.16	0.16
query38	0.17	0.16	0.16
query39	0.06	0.05	0.04
query40	0.16	0.14	0.12
query41	0.09	0.06	0.05
query42	0.06	0.05	0.06
query43	0.05	0.05	0.05
Total cold run time: 102.73 s
Total hot run time: 29.36 s

@hello-stephen (Contributor)

FE UT Coverage Report

Increment line coverage 100.00% (4/4) 🎉
Increment coverage report
Complete coverage report

morrySnow marked this pull request as draft on Oct 11, 2025, 03:36
morningman closed this on Oct 21, 2025
