@zclllyybb
Contributor

@zclllyybb zclllyybb commented Jan 14, 2026

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

before:

mysql> SELECT count(*) FROM hits_100m WHERE URL LIKE concat('%', SearchPhrase, '%');
+----------+
| count(*) |
+----------+
| 90144150 |
+----------+
1 row in set (4 min 5.15 sec)

now:

mysql> SELECT count(*) FROM hits_100m WHERE URL LIKE concat('%', SearchPhrase, '%');
+----------+
| count(*) |
+----------+
| 90144150 |
+----------+
1 row in set (4.50 sec)

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Jan 14, 2026

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (ideally with the specific error message), and how it was fixed.
  2. Which behaviors were modified: what the previous behavior was, what it is now, why it was changed, and what the possible impacts are.
  3. What features were added, and why.
  4. Which code was refactored, and why.
  5. Which functions were optimized, and how they differ before and after the optimization.

@zclllyybb
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 31344 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 002661616f144bba38a44d83a6ccf970490d8a01, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	17624	4347	4083	4083
q2	2077	344	237	237
q3	10176	1263	713	713
q4	10204	821	305	305
q5	7518	2066	1787	1787
q6	193	174	141	141
q7	920	770	648	648
q8	9274	1371	1121	1121
q9	4980	4655	4531	4531
q10	6801	1810	1404	1404
q11	521	311	277	277
q12	702	773	616	616
q13	17786	3805	3100	3100
q14	285	294	282	282
q15	589	526	502	502
q16	703	670	641	641
q17	640	760	534	534
q18	6620	6513	6391	6391
q19	1107	960	612	612
q20	392	360	241	241
q21	3007	2388	2215	2215
q22	1024	1006	963	963
Total cold run time: 103143 ms
Total hot run time: 31344 ms
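
The report does not label its columns. The numbers are consistent with this reading, which is an inference from the data rather than something the bot states: the first value is the cold run, the next two are hot runs, the last is the smaller of the two hot runs, and each total is a column sum. A minimal sketch of that arithmetic, with hypothetical type and function names:

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

// Hypothetical row layout inferred from the report: one cold run, two hot runs.
struct QueryTimes {
    long cold_ms;
    long hot1_ms;
    long hot2_ms;
};

struct Totals {
    long cold_ms;
    long hot_ms;
};

// Sum the cold column, and sum the better (smaller) of the two hot runs per query.
template <std::size_t N>
constexpr Totals summarize(const std::array<QueryTimes, N>& rows) {
    Totals t{0, 0};
    for (const QueryTimes& r : rows) {
        t.cold_ms += r.cold_ms;
        t.hot_ms += std::min(r.hot1_ms, r.hot2_ms);
    }
    return t;
}

// First three rows of Round 1 (q1, q2, q3), copied from the report above.
constexpr std::array<QueryTimes, 3> sample{{
        {17624, 4347, 4083},
        {2077, 344, 237},
        {10176, 1263, 713},
}};
```

Applied to all 22 rows of Round 1, the same sums reproduce the reported totals of 103143 ms (cold) and 31344 ms (hot).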

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	4087	4025	4006	4006
q2	334	398	327	327
q3	2090	2593	2221	2221
q4	1338	1738	1329	1329
q5	4074	3979	4061	3979
q6	207	169	125	125
q7	1868	1828	1725	1725
q8	2833	2478	2496	2478
q9	7361	7261	7199	7199
q10	2501	2697	2317	2317
q11	557	476	473	473
q12	720	775	679	679
q13	3690	4062	3453	3453
q14	288	308	271	271
q15	547	517	497	497
q16	628	690	658	658
q17	1249	1370	1385	1370
q18	8149	7871	7536	7536
q19	840	848	878	848
q20	1984	2042	1981	1981
q21	4726	4142	4085	4085
q22	1081	1033	973	973
Total cold run time: 51152 ms
Total hot run time: 48530 ms

@doris-robot

TPC-DS: Total hot run time: 172554 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 002661616f144bba38a44d83a6ccf970490d8a01, data reload: false

query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
query5	4428	615	486	486
query6	349	221	210	210
query7	4224	447	243	243
query8	342	260	241	241
query9	8679	2841	2847	2841
query10	498	400	319	319
query11	15323	15299	14993	14993
query12	175	115	114	114
query13	1262	506	392	392
query14	6077	3037	2709	2709
query14_1	2620	2638	2660	2638
query15	201	190	178	178
query16	972	497	474	474
query17	1109	687	586	586
query18	2463	437	343	343
query19	233	227	200	200
query20	130	116	114	114
query21	227	140	119	119
query22	4017	3889	4001	3889
query23	16059	15635	15331	15331
query23_1	15440	15390	15340	15340
query24	7202	1532	1131	1131
query24_1	1177	1148	1138	1138
query25	553	474	419	419
query26	1226	265	171	171
query27	2762	433	268	268
query28	4593	2164	2175	2164
query29	780	552	444	444
query30	309	244	214	214
query31	800	643	561	561
query32	88	75	81	75
query33	530	370	312	312
query34	892	871	525	525
query35	731	758	727	727
query36	895	920	854	854
query37	122	89	84	84
query38	2753	2769	2658	2658
query39	783	742	732	732
query39_1	705	713	712	712
query40	218	132	119	119
query41	66	63	60	60
query42	107	101	102	101
query43	423	443	348	348
query44	1286	733	721	721
query45	189	179	178	178
query46	832	939	562	562
query47	1457	1456	1342	1342
query48	295	308	224	224
query49	606	468	338	338
query50	595	269	205	205
query51	3756	3726	3785	3726
query52	105	103	93	93
query53	281	315	264	264
query54	281	262	249	249
query55	80	81	73	73
query56	308	318	313	313
query57	1027	1004	921	921
query58	271	260	253	253
query59	2088	2101	2078	2078
query60	345	322	317	317
query61	196	162	161	161
query62	397	348	338	338
query63	300	260	262	260
query64	4910	1312	988	988
query65	3738	3690	3756	3690
query66	1452	434	325	325
query67	14994	15693	15062	15062
query68	6745	1026	691	691
query69	507	364	314	314
query70	1064	949	868	868
query71	363	295	295	295
query72	5827	3347	3392	3347
query73	789	726	314	314
query74	8778	8754	8514	8514
query75	2777	2791	2432	2432
query76	3406	1054	642	642
query77	537	375	336	336
query78	9546	9638	9153	9153
query79	1699	833	563	563
query80	612	573	485	485
query81	520	264	231	231
query82	499	149	113	113
query83	265	259	241	241
query84	261	107	96	96
query85	948	513	456	456
query86	398	299	270	270
query87	2913	2858	2750	2750
query88	4180	2107	2108	2107
query89	381	340	319	319
query90	2118	173	149	149
query91	171	164	140	140
query92	89	76	71	71
query93	1683	909	520	520
query94	587	292	303	292
query95	586	334	377	334
query96	546	473	203	203
query97	2346	2408	2289	2289
query98	215	205	198	198
query99	622	613	532	532
Total cold run time: 253309 ms
Total hot run time: 172554 ms

@doris-robot

ClickBench: Total hot run time: 26.62 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 002661616f144bba38a44d83a6ccf970490d8a01, data reload: false

query	cold (s)	hot run 1 (s)	hot run 2 (s)
query1	0.05	0.05	0.04
query2	0.10	0.05	0.04
query3	0.26	0.09	0.09
query4	1.61	0.11	0.11
query5	0.28	0.25	0.26
query6	1.14	0.66	0.64
query7	0.04	0.03	0.02
query8	0.06	0.04	0.04
query9	0.57	0.51	0.51
query10	0.54	0.55	0.56
query11	0.14	0.09	0.10
query12	0.15	0.10	0.11
query13	0.60	0.59	0.59
query14	0.95	0.95	0.96
query15	0.79	0.78	0.78
query16	0.39	0.39	0.40
query17	1.05	1.01	1.06
query18	0.23	0.21	0.21
query19	1.95	1.77	1.86
query20	0.02	0.01	0.01
query21	15.47	0.27	0.14
query22	5.16	0.04	0.04
query23	15.75	0.30	0.10
query24	1.66	0.36	0.62
query25	0.11	0.06	0.08
query26	0.14	0.13	0.14
query27	0.09	0.05	0.06
query28	4.72	1.07	0.88
query29	12.52	3.88	3.13
query30	0.28	0.13	0.12
query31	2.81	0.62	0.40
query32	3.24	0.56	0.45
query33	2.97	3.09	2.97
query34	16.16	5.10	4.37
query35	4.38	4.42	4.56
query36	0.64	0.50	0.47
query37	0.11	0.07	0.06
query38	0.07	0.04	0.04
query39	0.04	0.03	0.02
query40	0.16	0.14	0.13
query41	0.09	0.04	0.03
query42	0.04	0.03	0.03
query43	0.05	0.04	0.03
Total cold run time: 97.58 s
Total hot run time: 26.62 s

@zclllyybb
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 32201 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 70d2bfa21e948b54c6ace9d8a2e5a1da5c8e1d61, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	17601	4184	4046	4046
q2	2106	370	283	283
q3	10051	1254	742	742
q4	10218	851	306	306
q5	7882	2067	1862	1862
q6	225	169	139	139
q7	918	806	663	663
q8	9270	1391	1126	1126
q9	5095	4717	4630	4630
q10	6819	1792	1414	1414
q11	540	307	289	289
q12	731	734	582	582
q13	17792	3826	3119	3119
q14	299	296	274	274
q15	592	520	502	502
q16	719	673	645	645
q17	662	792	511	511
q18	6608	6597	7130	6597
q19	1590	1063	647	647
q20	414	384	254	254
q21	3299	2552	2636	2552
q22	1148	1133	1018	1018
Total cold run time: 104579 ms
Total hot run time: 32201 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	4546	4299	4267	4267
q2	332	441	326	326
q3	2384	2695	2439	2439
q4	1475	1836	1471	1471
q5	4486	4501	4362	4362
q6	215	171	125	125
q7	1989	2003	1795	1795
q8	2671	2370	2359	2359
q9	7255	7261	7167	7167
q10	2525	2841	2214	2214
q11	526	465	447	447
q12	705	791	637	637
q13	3551	3884	3081	3081
q14	279	277	263	263
q15	518	476	474	474
q16	622	653	603	603
q17	1075	1302	1295	1295
q18	7485	7396	7282	7282
q19	831	773	804	773
q20	1884	1983	1859	1859
q21	4449	4278	4009	4009
q22	1043	1020	994	994
Total cold run time: 50846 ms
Total hot run time: 48242 ms

@doris-robot

TPC-DS: Total hot run time: 173885 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 70d2bfa21e948b54c6ace9d8a2e5a1da5c8e1d61, data reload: false

query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
query5	4545	623	497	497
query6	339	224	204	204
query7	4220	454	265	265
query8	360	248	227	227
query9	8713	2910	2876	2876
query10	485	376	322	322
query11	15077	15331	15016	15016
query12	172	112	114	112
query13	1255	477	386	386
query14	6410	3004	2719	2719
query14_1	2683	2671	2713	2671
query15	199	198	178	178
query16	984	488	480	480
query17	1116	694	583	583
query18	2556	447	347	347
query19	236	223	199	199
query20	123	116	116	116
query21	217	141	117	117
query22	3974	4125	4105	4105
query23	16078	15802	15545	15545
query23_1	15804	15455	15691	15455
query24	7117	1538	1158	1158
query24_1	1175	1166	1191	1166
query25	562	475	420	420
query26	1246	272	150	150
query27	2752	451	275	275
query28	4563	2129	2125	2125
query29	827	516	429	429
query30	315	239	204	204
query31	781	620	553	553
query32	82	74	72	72
query33	513	354	299	299
query34	944	884	509	509
query35	717	790	685	685
query36	876	914	814	814
query37	132	95	82	82
query38	2672	2660	2610	2610
query39	770	752	732	732
query39_1	708	733	709	709
query40	218	135	118	118
query41	69	63	63	63
query42	102	99	100	99
query43	431	461	422	422
query44	1303	727	744	727
query45	187	182	173	173
query46	829	953	558	558
query47	1377	1467	1380	1380
query48	318	316	243	243
query49	599	430	340	340
query50	611	267	201	201
query51	3791	3715	3747	3715
query52	105	110	97	97
query53	295	321	272	272
query54	285	262	266	262
query55	78	84	77	77
query56	306	302	304	302
query57	1020	1072	972	972
query58	263	256	247	247
query59	2069	2132	2060	2060
query60	339	342	332	332
query61	166	163	159	159
query62	414	354	303	303
query63	293	258	266	258
query64	4962	1280	985	985
query65	3846	3817	3733	3733
query66	1432	417	317	317
query67	15295	16072	14726	14726
query68	4806	997	729	729
query69	520	361	324	324
query70	1013	956	936	936
query71	338	307	288	288
query72	5752	3336	3637	3336
query73	768	726	307	307
query74	8777	8785	8619	8619
query75	2801	2829	2453	2453
query76	3490	1073	647	647
query77	518	374	301	301
query78	9760	9867	9221	9221
query79	1583	886	578	578
query80	671	582	509	509
query81	523	271	227	227
query82	223	145	111	111
query83	266	261	249	249
query84	259	115	99	99
query85	903	520	453	453
query86	337	293	279	279
query87	2900	2847	2783	2783
query88	3475	2582	2558	2558
query89	380	355	335	335
query90	2026	171	167	167
query91	174	168	149	149
query92	75	71	73	71
query93	1039	911	523	523
query94	573	327	303	303
query95	594	330	373	330
query96	651	517	227	227
query97	2429	2396	2319	2319
query98	221	201	197	197
query99	590	585	504	504
Total cold run time: 250615 ms
Total hot run time: 173885 ms

@doris-robot

ClickBench: Total hot run time: 26.92 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 70d2bfa21e948b54c6ace9d8a2e5a1da5c8e1d61, data reload: false

query	cold (s)	hot run 1 (s)	hot run 2 (s)
query1	0.06	0.05	0.05
query2	0.10	0.05	0.05
query3	0.25	0.08	0.08
query4	1.61	0.11	0.11
query5	0.27	0.26	0.25
query6	1.14	0.67	0.64
query7	0.03	0.03	0.02
query8	0.05	0.04	0.04
query9	0.57	0.50	0.48
query10	0.56	0.54	0.54
query11	0.15	0.10	0.09
query12	0.14	0.10	0.11
query13	0.60	0.60	0.59
query14	0.97	0.96	0.95
query15	0.80	0.77	0.79
query16	0.39	0.40	0.38
query17	1.06	1.09	1.07
query18	0.23	0.22	0.22
query19	1.96	1.82	1.78
query20	0.02	0.02	0.01
query21	15.47	0.30	0.14
query22	5.08	0.05	0.05
query23	15.97	0.28	0.11
query24	2.40	0.44	0.31
query25	0.06	0.09	0.09
query26	0.14	0.13	0.14
query27	0.09	0.08	0.06
query28	3.38	1.08	0.90
query29	12.51	3.95	3.20
query30	0.28	0.14	0.12
query31	2.82	0.65	0.39
query32	3.25	0.57	0.46
query33	3.02	3.02	3.11
query34	16.34	5.13	4.43
query35	4.42	4.45	4.44
query36	0.65	0.50	0.48
query37	0.11	0.06	0.06
query38	0.07	0.05	0.04
query39	0.05	0.03	0.03
query40	0.17	0.14	0.14
query41	0.09	0.04	0.03
query42	0.05	0.03	0.03
query43	0.05	0.04	0.04
Total cold run time: 97.43 s
Total hot run time: 26.92 s

@github-actions github-actions bot added the "approved" label (Indicates a PR has been approved by one committer.) Jan 15, 2026
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

@BiteTheDDDDt BiteTheDDDDt requested a review from Copilot January 15, 2026 02:09

Copilot AI left a comment

Pull request overview

This PR optimizes the LIKE function for non-literal patterns by introducing fast-path pattern analysis that avoids regex compilation for simple patterns. For the query in the PR description, which uses LIKE concat('%', SearchPhrase, '%'), run time drops from 4 min 5.15 sec to 4.50 sec (a ~54x speedup).

Changes:

  • Added LikeFastPath enum to categorize pattern types (ALLPASS, EQUALS, STARTS_WITH, ENDS_WITH, SUBSTRING, REGEX)
  • Introduced extract_like_fast_path function for lightweight pattern analysis without regex compilation
  • Modified like_fn_scalar to use fast-path implementations before falling back to regex
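
The lightweight analysis described in the list above can be sketched as follows. This is a minimal illustration, not the code from like.h. Only the six category names come from the overview. The function name `classify_like_pattern`, the `LikeClassification` struct, and the conservative rule that any `_` wildcard or backslash escape falls back to `REGEX` are assumptions made for the example.

```cpp
#include <cstddef>
#include <string_view>

// Category names as listed in the PR overview.
enum class LikeFastPath { ALLPASS, EQUALS, STARTS_WITH, ENDS_WITH, SUBSTRING, REGEX };

// Hypothetical result type: the chosen category plus the literal part of the pattern.
// The literal is a view into the pattern, so the pattern must outlive the result.
struct LikeClassification {
    LikeFastPath kind;
    std::string_view literal;
};

// Hypothetical classifier: a single scan over the pattern, no regex compilation.
constexpr LikeClassification classify_like_pattern(std::string_view pattern) {
    // Assumption: be conservative and send '_' wildcards and escapes to the regex path.
    if (pattern.find_first_of("_\\") != std::string_view::npos) {
        return {LikeFastPath::REGEX, {}};
    }
    const std::size_t first = pattern.find_first_not_of('%');
    if (first == std::string_view::npos) {
        // Either the empty pattern (matches only the empty string) or nothing but '%'.
        return {pattern.empty() ? LikeFastPath::EQUALS : LikeFastPath::ALLPASS, {}};
    }
    const std::size_t last = pattern.find_last_not_of('%');
    const std::string_view middle = pattern.substr(first, last - first + 1);
    if (middle.find('%') != std::string_view::npos) {
        return {LikeFastPath::REGEX, {}};  // interior '%', e.g. "a%b"
    }
    const bool leading = first > 0;                  // pattern starts with '%'
    const bool trailing = last + 1 < pattern.size(); // pattern ends with '%'
    const LikeFastPath kind = leading && trailing ? LikeFastPath::SUBSTRING
                              : leading           ? LikeFastPath::ENDS_WITH
                              : trailing          ? LikeFastPath::STARTS_WITH
                                                  : LikeFastPath::EQUALS;
    return {kind, middle};
}
```

Under these assumptions, a pattern built as concat('%', SearchPhrase, '%') classifies as `SUBSTRING` whenever SearchPhrase itself contains no `%`, `_`, or backslash.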

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File	Description
be/src/vec/functions/like.h	Added LikeFastPath enum and extract_like_fast_path function for pattern analysis; updated includes to C++ standard headers
be/src/vec/functions/like.cpp	Modified like_fn_scalar to implement fast-path matching for simple patterns before falling back to regex
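
The "fast path first, regex as fallback" shape described for like.cpp could look roughly like this. It is a sketch under stated assumptions, not the actual like_fn_scalar: the helper name `try_fast_like` and its `std::optional` return convention are hypothetical, NULL handling is ignored, and the enum is repeated only so the snippet compiles on its own.

```cpp
#include <optional>
#include <string_view>

// Category names as listed in the PR overview.
enum class LikeFastPath { ALLPASS, EQUALS, STARTS_WITH, ENDS_WITH, SUBSTRING, REGEX };

// Hypothetical helper: returns the match result when a cheap string operation
// is enough, or std::nullopt to tell the caller to take the regex path instead.
constexpr std::optional<bool> try_fast_like(LikeFastPath kind, std::string_view value,
                                            std::string_view literal) {
    switch (kind) {
    case LikeFastPath::ALLPASS:
        return true;
    case LikeFastPath::EQUALS:
        return value == literal;
    case LikeFastPath::STARTS_WITH:
        return value.size() >= literal.size() &&
               value.compare(0, literal.size(), literal) == 0;
    case LikeFastPath::ENDS_WITH:
        return value.size() >= literal.size() &&
               value.compare(value.size() - literal.size(), literal.size(), literal) == 0;
    case LikeFastPath::SUBSTRING:
        return value.find(literal) != std::string_view::npos;
    case LikeFastPath::REGEX:
        return std::nullopt;  // caller compiles and runs the regex
    }
    return std::nullopt;
}
```

A caller would evaluate `try_fast_like` per row and only build a regex when it returns no value, so rows with simple patterns never pay for compilation.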

@zclllyybb
Contributor Author

run beut

@doris-robot

BE UT Coverage Report

Increment line coverage 98.98% (97/98) 🎉

Increment coverage report
Complete coverage report

Category	Coverage
Function Coverage	52.98% (18989/35842)
Line Coverage	39.03% (175957/450871)
Region Coverage	33.62% (136318/405460)
Branch Coverage	34.67% (58952/170058)

@zclllyybb
Contributor Author

run p0

@zclllyybb
Contributor Author

run external

@zclllyybb
Contributor Author

run vault_p0

@zclllyybb
Contributor Author

run cloud_p0

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 98.98% (97/98) 🎉

Increment coverage report
Complete coverage report

Category	Coverage
Function Coverage	72.20% (25354/35117)
Line Coverage	59.01% (265667/450213)
Region Coverage	53.90% (220948/409887)
Branch Coverage	55.52% (94837/170806)

@zclllyybb zclllyybb merged commit b94e802 into apache:master Jan 16, 2026
34 of 36 checks passed
@zclllyybb zclllyybb deleted the like branch January 16, 2026 00:44
github-actions bot pushed a commit that referenced this pull request Jan 16, 2026
yiguolei pushed a commit that referenced this pull request Jan 16, 2026
…des #59866 (#59943)

Cherry-picked from #59866

Co-authored-by: zclllyybb <zhaochangle@selectdb.com>
Labels

approved (Indicates a PR has been approved by one committer.), dev/4.0.3-merged, reviewed

7 participants