[feat](udf) support "prefer_udf_over_builtin" session variable #51195

Merged
merged 4 commits into from
May 27, 2025

Conversation

morningman
Contributor

@morningman morningman commented May 23, 2025

What problem does this PR solve?

Problem Summary:

Add a new session variable, prefer_udf_over_builtin. The default is false.
When it is set to true, the planner looks for a matching UDF first and then for a builtin function.

This is useful when users migrate from another system to Doris and a function with the same
signature behaves differently in the two systems.
In that case, users can create a UDF with the same signature to override the builtin function,
so that they do not need to change their SQL.
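
The lookup order described above can be illustrated with a minimal sketch. This is not the planner code from this PR: the registries, the `resolve` helper, and the sample `upper` functions are hypothetical, signature matching is reduced to a lookup by name, and the builtin-first order for the default case is an assumption inferred from the description.

```python
from typing import Callable, Dict

# Hypothetical registries keyed by function name (real signature matching is omitted).
BUILTINS: Dict[str, Callable[[str], str]] = {
    "upper": lambda s: s.upper(),
}
UDFS: Dict[str, Callable[[str], str]] = {
    # A user-defined "upper" with different behavior: it also trims whitespace.
    "upper": lambda s: s.strip().upper(),
    # A UDF with no builtin counterpart.
    "shout": lambda s: s.upper() + "!",
}


def resolve(name: str, prefer_udf_over_builtin: bool = False) -> Callable[[str], str]:
    """Return the first function found for `name`, searching in the configured order."""
    if prefer_udf_over_builtin:
        search_order = (UDFS, BUILTINS)   # UDF first, then builtin
    else:
        search_order = (BUILTINS, UDFS)   # assumed default: builtin first
    for registry in search_order:
        fn = registry.get(name)
        if fn is not None:
            return fn
    raise LookupError(f"function not found: {name}")
```

With the flag off, `resolve("upper")` returns the builtin; with the flag on, the same call returns the UDF, so the text of the query itself does not change. In Doris the switch would be made per session, for example with `SET prefer_udf_over_builtin = true;`.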

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@morningman morningman changed the title from [feat](udf) support prefer_udf_over_builtin session variable to [feat](udf) support "prefer_udf_over_builtin" session variable May 23, 2025
@morningman morningman added the usercase (Important user case type label) and dev/3.0.x labels May 23, 2025
@morningman
Contributor Author

run buildall

@morningman morningman marked this pull request as ready for review May 24, 2025 02:32
@doris-robot

TPC-H: Total hot run time: 34700 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit af6045600c95ee11af9d1e22f0155ea2dafb9a15, data reload: false

------ Round 1 ----------------------------------
q1	26028	5045	5055	5045
q2	2067	264	184	184
q3	10418	1238	724	724
q4	10461	982	519	519
q5	7583	2366	2590	2366
q6	181	165	138	138
q7	916	730	642	642
q8	9405	1252	1211	1211
q9	7026	5112	5046	5046
q10	6859	2305	1904	1904
q11	497	291	285	285
q12	342	348	221	221
q13	17779	3674	3101	3101
q14	228	234	210	210
q15	530	500	489	489
q16	409	435	389	389
q17	626	872	383	383
q18	7548	7214	7106	7106
q19	1480	964	572	572
q20	338	336	224	224
q21	4133	3409	2992	2992
q22	1018	996	949	949
Total cold run time: 115872 ms
Total hot run time: 34700 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5250	5073	5107	5073
q2	239	326	227	227
q3	2158	2662	2278	2278
q4	1382	1794	1490	1490
q5	4445	4434	4392	4392
q6	220	164	133	133
q7	2012	1937	1755	1755
q8	2632	2524	2542	2524
q9	7209	7162	6996	6996
q10	3026	3179	2805	2805
q11	599	511	501	501
q12	656	750	606	606
q13	3574	3867	3302	3302
q14	271	307	307	307
q15	526	490	463	463
q16	444	487	438	438
q17	1164	1557	1389	1389
q18	7636	7574	7400	7400
q19	875	906	985	906
q20	2046	2018	1907	1907
q21	4806	4567	4370	4370
q22	1128	1036	1042	1036
Total cold run time: 52298 ms
Total hot run time: 50298 ms

@doris-robot

TPC-DS: Total hot run time: 193272 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit af6045600c95ee11af9d1e22f0155ea2dafb9a15, data reload: false

query1	1417	1115	1054	1054
query2	6282	1854	1851	1851
query3	11055	4472	4588	4472
query4	53807	25117	23083	23083
query5	5170	509	459	459
query6	351	241	203	203
query7	4912	507	292	292
query8	322	258	244	244
query9	5785	2660	2656	2656
query10	456	330	277	277
query11	15027	15507	14976	14976
query12	169	110	107	107
query13	1050	537	414	414
query14	10093	6474	6452	6452
query15	202	195	180	180
query16	7048	673	506	506
query17	1092	736	575	575
query18	1545	404	325	325
query19	205	203	164	164
query20	137	167	137	137
query21	217	132	108	108
query22	4374	4558	4297	4297
query23	34269	33683	33639	33639
query24	6604	2398	2418	2398
query25	448	487	406	406
query26	650	270	155	155
query27	2257	515	350	350
query28	3020	2187	2149	2149
query29	594	588	479	479
query30	272	225	187	187
query31	874	868	796	796
query32	75	66	66	66
query33	492	367	333	333
query34	792	860	542	542
query35	790	839	773	773
query36	980	1004	894	894
query37	115	103	76	76
query38	4219	4306	4198	4198
query39	1513	1447	1518	1447
query40	213	121	120	120
query41	58	57	52	52
query42	127	105	111	105
query43	493	503	480	480
query44	1387	871	862	862
query45	182	176	171	171
query46	858	1030	682	682
query47	1841	1851	1796	1796
query48	433	452	331	331
query49	702	525	429	429
query50	689	706	419	419
query51	4185	4191	4280	4191
query52	113	118	101	101
query53	234	258	201	201
query54	606	597	539	539
query55	92	91	91	91
query56	339	322	300	300
query57	1157	1216	1146	1146
query58	279	264	262	262
query59	2612	2699	2697	2697
query60	345	330	317	317
query61	126	143	142	142
query62	746	764	663	663
query63	225	187	187	187
query64	1444	1054	720	720
query65	4330	4246	4196	4196
query66	721	412	300	300
query67	16130	15578	15522	15522
query68	5415	904	540	540
query69	518	333	274	274
query70	1146	1110	1088	1088
query71	474	325	295	295
query72	5784	4830	5045	4830
query73	1201	682	366	366
query74	9019	8855	8991	8855
query75	3429	3212	2729	2729
query76	3752	1188	746	746
query77	546	378	306	306
query78	10137	10234	9306	9306
query79	1529	794	573	573
query80	836	532	446	446
query81	517	252	220	220
query82	410	128	101	101
query83	269	247	241	241
query84	292	103	95	95
query85	754	358	316	316
query86	347	305	281	281
query87	4406	4451	4412	4412
query88	2919	2316	2293	2293
query89	392	333	279	279
query90	1735	203	208	203
query91	154	151	113	113
query92	65	62	65	62
query93	1341	987	591	591
query94	632	418	301	301
query95	374	293	298	293
query96	505	562	275	275
query97	2666	2764	2648	2648
query98	235	209	204	204
query99	1324	1375	1248	1248
Total cold run time: 292470 ms
Total hot run time: 193272 ms

@doris-robot

ClickBench: Total hot run time: 28.69 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit af6045600c95ee11af9d1e22f0155ea2dafb9a15, data reload: false

query1	0.04	0.04	0.03
query2	0.12	0.11	0.12
query3	0.25	0.19	0.19
query4	1.59	0.20	0.11
query5	0.44	0.42	0.41
query6	1.16	0.66	0.66
query7	0.02	0.02	0.02
query8	0.04	0.03	0.03
query9	0.59	0.51	0.51
query10	0.58	0.58	0.57
query11	0.15	0.11	0.11
query12	0.14	0.12	0.12
query13	0.62	0.59	0.60
query14	0.79	0.80	0.81
query15	0.88	0.86	0.87
query16	0.39	0.38	0.38
query17	1.03	1.04	1.08
query18	0.22	0.21	0.20
query19	1.95	1.80	1.81
query20	0.02	0.01	0.01
query21	15.40	0.86	0.52
query22	0.77	1.19	0.64
query23	14.90	1.37	0.62
query24	7.57	1.39	0.43
query25	0.50	0.22	0.07
query26	0.65	0.16	0.14
query27	0.06	0.04	0.04
query28	9.18	0.88	0.43
query29	12.61	4.12	3.47
query30	0.27	0.08	0.07
query31	2.81	0.58	0.39
query32	3.23	0.54	0.47
query33	3.02	3.02	3.04
query34	15.89	5.09	4.53
query35	4.50	4.52	4.49
query36	0.64	0.48	0.48
query37	0.09	0.06	0.06
query38	0.05	0.04	0.04
query39	0.03	0.02	0.03
query40	0.16	0.14	0.14
query41	0.09	0.03	0.02
query42	0.04	0.02	0.02
query43	0.03	0.04	0.03
Total cold run time: 103.51 s
Total hot run time: 28.69 s

@morningman
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 33903 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit d548d558d9b47e816965ee68005e3411d2d2f81a, data reload: false

------ Round 1 ----------------------------------
q1	26163	5025	4996	4996
q2	2068	281	193	193
q3	10501	1279	701	701
q4	10222	1031	515	515
q5	7636	2382	2323	2323
q6	178	162	131	131
q7	891	743	608	608
q8	9300	1270	1072	1072
q9	6765	5053	5042	5042
q10	6869	2313	1914	1914
q11	496	289	270	270
q12	348	346	211	211
q13	17780	3646	3064	3064
q14	226	223	219	219
q15	522	482	503	482
q16	418	431	364	364
q17	631	850	388	388
q18	7760	7331	7267	7267
q19	1708	942	553	553
q20	351	344	235	235
q21	3968	3195	2401	2401
q22	1041	1037	954	954
Total cold run time: 115842 ms
Total hot run time: 33903 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5149	5047	5074	5047
q2	243	332	233	233
q3	2150	2668	2264	2264
q4	1378	1769	1350	1350
q5	4550	4422	4357	4357
q6	215	176	125	125
q7	1988	1878	1718	1718
q8	2603	2486	2405	2405
q9	7115	7125	7173	7125
q10	2992	3196	2780	2780
q11	573	542	487	487
q12	680	775	637	637
q13	3586	3901	3199	3199
q14	281	284	277	277
q15	521	480	471	471
q16	436	494	427	427
q17	1115	1565	1401	1401
q18	7739	7532	7420	7420
q19	788	784	940	784
q20	1940	2015	1867	1867
q21	4785	4320	4288	4288
q22	1026	1020	992	992
Total cold run time: 51853 ms
Total hot run time: 49654 ms

@doris-robot

TPC-DS: Total hot run time: 186105 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit d548d558d9b47e816965ee68005e3411d2d2f81a, data reload: false

query1	1005	499	498	498
query2	6588	1834	1821	1821
query3	6745	253	224	224
query4	26463	23814	23051	23051
query5	4352	651	458	458
query6	312	219	201	201
query7	4626	501	291	291
query8	306	274	255	255
query9	8623	2642	2640	2640
query10	477	317	271	271
query11	15491	15085	14844	14844
query12	152	107	111	107
query13	1666	562	411	411
query14	8822	6098	6167	6098
query15	207	187	167	167
query16	7133	632	468	468
query17	1114	695	555	555
query18	1974	394	291	291
query19	200	185	194	185
query20	117	115	130	115
query21	217	126	104	104
query22	4148	4099	4211	4099
query23	33900	33092	33171	33092
query24	8414	2372	2407	2372
query25	527	473	390	390
query26	1234	266	153	153
query27	2787	509	337	337
query28	4319	2125	2095	2095
query29	801	566	464	464
query30	283	218	185	185
query31	934	829	784	784
query32	77	65	69	65
query33	561	383	327	327
query34	816	851	524	524
query35	804	823	729	729
query36	970	990	909	909
query37	112	101	80	80
query38	4194	4041	4104	4041
query39	1503	1407	1412	1407
query40	221	125	113	113
query41	64	98	55	55
query42	129	108	112	108
query43	518	510	477	477
query44	1304	854	825	825
query45	187	177	165	165
query46	848	1030	641	641
query47	1766	1779	1721	1721
query48	385	425	310	310
query49	776	516	464	464
query50	651	693	420	420
query51	4114	4105	4047	4047
query52	117	106	100	100
query53	219	250	180	180
query54	588	586	512	512
query55	86	87	93	87
query56	314	303	289	289
query57	1134	1124	1091	1091
query58	268	258	269	258
query59	2506	2674	2523	2523
query60	330	309	312	309
query61	130	120	119	119
query62	811	759	679	679
query63	228	184	184	184
query64	4327	1005	673	673
query65	4296	4265	4256	4256
query66	1173	416	309	309
query67	16028	15759	15462	15462
query68	7699	889	531	531
query69	469	310	271	271
query70	1168	1104	1120	1104
query71	472	323	294	294
query72	5784	4775	4773	4773
query73	678	618	353	353
query74	9205	8817	8788	8788
query75	3756	3191	2720	2720
query76	3583	1188	748	748
query77	798	373	317	317
query78	10181	10178	9447	9447
query79	2669	788	576	576
query80	619	511	444	444
query81	524	256	218	218
query82	469	128	97	97
query83	257	249	228	228
query84	265	103	87	87
query85	825	350	368	350
query86	391	314	298	298
query87	4504	4417	4414	4414
query88	3880	2316	2318	2316
query89	398	314	286	286
query90	1855	207	209	207
query91	143	147	114	114
query92	75	61	61	61
query93	2441	959	584	584
query94	648	403	308	308
query95	374	289	283	283
query96	496	572	284	284
query97	2698	2748	2649	2649
query98	229	214	203	203
query99	1367	1420	1267	1267
Total cold run time: 275412 ms
Total hot run time: 186105 ms

@doris-robot

ClickBench: Total hot run time: 29.11 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit d548d558d9b47e816965ee68005e3411d2d2f81a, data reload: false

query1	0.04	0.03	0.03
query2	0.14	0.10	0.12
query3	0.25	0.20	0.20
query4	1.59	0.19	0.20
query5	0.47	0.44	0.44
query6	1.15	0.67	0.66
query7	0.03	0.01	0.01
query8	0.05	0.03	0.04
query9	0.59	0.53	0.54
query10	0.58	0.58	0.55
query11	0.15	0.10	0.11
query12	0.15	0.11	0.11
query13	0.62	0.62	0.60
query14	0.80	0.80	0.82
query15	0.88	0.88	0.86
query16	0.37	0.39	0.39
query17	1.03	1.06	1.03
query18	0.22	0.21	0.21
query19	1.90	1.77	1.78
query20	0.01	0.01	0.01
query21	15.46	0.86	0.54
query22	0.76	1.19	0.70
query23	14.89	1.29	0.66
query24	6.75	1.32	0.55
query25	0.51	0.10	0.06
query26	0.52	0.17	0.15
query27	0.05	0.05	0.05
query28	9.30	0.94	0.46
query29	12.57	4.21	3.53
query30	0.25	0.10	0.06
query31	2.82	0.61	0.38
query32	3.23	0.56	0.47
query33	3.02	3.10	3.06
query34	15.85	5.09	4.52
query35	4.46	4.53	4.50
query36	0.67	0.50	0.49
query37	0.08	0.06	0.07
query38	0.05	0.04	0.03
query39	0.03	0.02	0.02
query40	0.17	0.12	0.12
query41	0.08	0.03	0.03
query42	0.03	0.02	0.02
query43	0.03	0.03	0.03
Total cold run time: 102.6 s
Total hot run time: 29.11 s

@morningman morningman requested a review from morrySnow May 26, 2025 13:29
Contributor

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label May 27, 2025
Contributor

PR approved by anyone and no changes requested.

@morningman morningman merged commit ae9453e into apache:master May 27, 2025
28 of 29 checks passed
github-actions bot pushed a commit that referenced this pull request May 27, 2025
koarz pushed a commit to koarz/doris that referenced this pull request Jun 4, 2025
…e#51195)

dataroaring pushed a commit that referenced this pull request Jun 11, 2025
…iable #51195 (#51275)

Cherry-picked from #51195

Co-authored-by: Mingyu Chen (Rayner) <morningman@163.com>
Labels
approved (Indicates a PR has been approved by one committer.), dev/3.0.7-merged, reviewed, usercase (Important user case type label)
7 participants