@airborne12
Member

@airborne12 airborne12 commented Jan 13, 2026

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #59394

Problem Summary:

This PR adds the fields and type parameters to the SEARCH function, so that a single query string can be searched across multiple fields. The behavior is similar to Elasticsearch's multi_match query with its best_fields and cross_fields types.

Multi-Field Search Support

-- Single term across multiple fields (best_fields mode - default)
SELECT * FROM docs WHERE search('hello', '{"fields":["title","content"]}');
-- Equivalent to: (title:hello) OR (content:hello)

-- Multi-term with AND operator (best_fields mode - default)
SELECT * FROM docs WHERE search('hello world', 
  '{"fields":["title","content"],"default_operator":"and"}');
-- Equivalent to: (title:hello AND title:world) OR (content:hello AND content:world)

-- Multi-term with cross_fields mode
SELECT * FROM docs WHERE search('hello world', 
  '{"fields":["title","content"],"default_operator":"and","type":"cross_fields"}');
-- Equivalent to: (title:hello OR content:hello) AND (title:world OR content:world)

-- Combined with Lucene mode
SELECT * FROM docs WHERE search('machine AND learning', 
  '{"fields":["title","content"],"mode":"lucene","minimum_should_match":0}');

Type Parameter Options

Type | Description | Behavior
best_fields (default) | All terms must match within the SAME field | "hello world" -> (title:hello AND title:world) OR (content:hello AND content:world)
cross_fields | Terms can match across DIFFERENT fields | "hello world" -> (title:hello OR content:hello) AND (title:world OR content:world)
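
The two expansions above can be sketched in a few lines. This is an illustrative Python model, not the Java code from this PR: the function name expand_multi_field and its signature are hypothetical, the default operator is assumed to be "or", and only plain whitespace-separated terms are handled (no functions, wildcards, or Lucene syntax).

```python
def expand_multi_field(query, fields, type_="best_fields", default_operator="or"):
    """Expand a multi-field query into an equivalent single-field DSL string.

    Hypothetical helper for illustration; it only mirrors the expansions
    listed in the PR description.
    """
    terms = query.split()
    if not terms or not fields:
        raise ValueError("query and fields must be non-empty")
    if default_operator not in ("and", "or"):
        raise ValueError(f"unknown default_operator: {default_operator!r}")
    op = f" {default_operator.upper()} "

    if type_ == "best_fields":
        # One group per field: every term is bound to that same field.
        groups = [op.join(f"{f}:{t}" for t in terms) for f in fields]
        return " OR ".join(f"({g})" for g in groups)
    if type_ == "cross_fields":
        # One group per term: each term may match in any of the fields.
        groups = [" OR ".join(f"{f}:{t}" for f in fields) for t in terms]
        return op.join(f"({g})" for g in groups)
    raise ValueError(f"unknown type: {type_!r}")


fields = ["title", "content"]
print(expand_multi_field("hello world", fields, "best_fields", "and"))
print(expand_multi_field("hello world", fields, "cross_fields", "and"))
```

The only difference between the two modes is which loop is outermost: best_fields groups by field, cross_fields groups by term.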

Key features:

  • type parameter controls how terms are matched across fields
  • best_fields (default): Finds documents where all terms appear in the same field - ideal for relevance ranking
  • cross_fields: Treats multiple fields as one big field - ideal for name searches across first_name/last_name
  • Compatible with both standard mode and Lucene boolean mode
  • fields and default_field are mutually exclusive
  • Supports functions (EXACT, ANY, ALL) across fields
  • Supports wildcard queries across fields

Behavior examples:

Query | Fields | Type | Expanded DSL
hello | ["title","content"] | best_fields | (title:hello) OR (content:hello)
hello world (AND) | ["title","content"] | best_fields | (title:hello AND title:world) OR (content:hello AND content:world)
hello world (AND) | ["title","content"] | cross_fields | (title:hello OR content:hello) AND (title:world OR content:world)
EXACT(foo bar) | ["title","content"] | any | (title:EXACT(foo bar) OR content:EXACT(foo bar))
hello AND category:tech | ["title","content"] | any | (title:hello OR content:hello) AND category:tech
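
The last row shows that a term with an explicit field (category:tech) is left alone while bare terms are expanded. A minimal Python sketch of that rule follows, assuming whitespace tokenization and the three boolean keywords AND, OR, NOT. The helper name expand_bare_terms is hypothetical, and the sketch does not handle functions such as EXACT(foo bar), which contain spaces and would need the real parser.

```python
OPERATORS = {"AND", "OR", "NOT"}


def expand_bare_terms(dsl, fields):
    """Expand terms without a field prefix across `fields`.

    Illustrative only: operators and already-qualified terms (field:term)
    are passed through unchanged.
    """
    out = []
    for token in dsl.split():
        if token in OPERATORS or ":" in token:
            out.append(token)
        else:
            out.append("(" + " OR ".join(f"{f}:{token}" for f in fields) + ")")
    return " ".join(out)


print(expand_bare_terms("hello AND category:tech", ["title", "content"]))
```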

Use case examples:

  • Product search: Use best_fields when searching product name and description - prefer products where query terms appear together
  • Person name search: Use cross_fields when searching first_name and last_name - "John Smith" should match documents with first_name:John and last_name:Smith
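
To make the difference between the two use cases concrete, here is a small in-memory matcher in Python. It is a sketch under simplifying assumptions (lower-cased whitespace tokens, AND semantics between terms) and does not reflect how Doris evaluates inverted indexes; the function name matches and the sample documents are made up for illustration.

```python
def matches(doc, terms, fields, type_="best_fields"):
    """Return True if `doc` satisfies all `terms` under the given type."""
    tokens = {f: set(str(doc.get(f, "")).lower().split()) for f in fields}
    terms = [t.lower() for t in terms]
    if type_ == "best_fields":
        # All terms must be found in one and the same field.
        return any(all(t in tokens[f] for t in terms) for f in fields)
    if type_ == "cross_fields":
        # Each term must be found in at least one field.
        return all(any(t in tokens[f] for f in fields) for t in terms)
    raise ValueError(f"unknown type: {type_!r}")


person = {"first_name": "John", "last_name": "Smith"}
name_fields = ["first_name", "last_name"]
print(matches(person, ["John", "Smith"], name_fields, "best_fields"))   # False
print(matches(person, ["John", "Smith"], name_fields, "cross_fields"))  # True
```

With best_fields, no single field contains both "john" and "smith", so the person record is not matched; cross_fields accepts it because each term is found in some field.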

Release note

  • Add multi-field search support for SEARCH function (fields parameter)
  • Add type parameter with best_fields (default) and cross_fields modes
  • best_fields: All terms must match within the same field (default, matches Elasticsearch behavior)
  • cross_fields: Terms can match across different fields
  • Compatible with Lucene mode for MUST/SHOULD/MUST_NOT semantics

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Jan 13, 2026

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@airborne12
Member Author

run buildall

@doris-robot

TPC-H: Total hot run time: 31901 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit a5a29caa374b585cb93f19a1fd8be3a3762c9112, data reload: false

------ Round 1 ----------------------------------
q1	17661	4240	4039	4039
q2	2053	381	245	245
q3	10103	1260	741	741
q4	10222	928	323	323
q5	7498	2070	1798	1798
q6	186	168	134	134
q7	974	795	666	666
q8	9269	1368	1161	1161
q9	4857	4592	4583	4583
q10	6753	1814	1425	1425
q11	532	306	278	278
q12	693	732	600	600
q13	17773	3835	3053	3053
q14	286	307	275	275
q15	561	519	518	518
q16	684	667	630	630
q17	645	739	539	539
q18	6810	6368	6828	6368
q19	1106	1004	650	650
q20	445	388	275	275
q21	3248	2638	2570	2570
q22	1082	1088	1030	1030
Total cold run time: 103441 ms
Total hot run time: 31901 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4323	4401	4256	4256
q2	345	429	320	320
q3	2410	2728	2437	2437
q4	1324	1843	1531	1531
q5	4514	4281	4326	4281
q6	221	173	130	130
q7	1969	1923	1763	1763
q8	2521	2502	2477	2477
q9	7243	7030	7172	7030
q10	2574	2776	2236	2236
q11	553	481	448	448
q12	693	758	684	684
q13	3499	4043	3370	3370
q14	287	358	300	300
q15	542	511	529	511
q16	660	707	663	663
q17	1109	1206	1231	1206
q18	7526	7455	7274	7274
q19	808	788	797	788
q20	1879	1929	1857	1857
q21	4486	4222	4105	4105
q22	1055	1032	961	961
Total cold run time: 50541 ms
Total hot run time: 48628 ms

@doris-robot

TPC-DS: Total hot run time: 175733 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit a5a29caa374b585cb93f19a1fd8be3a3762c9112, data reload: false

query5	4469	643	510	510
query6	361	260	246	246
query7	4240	480	266	266
query8	387	296	289	289
query9	8762	2904	2944	2904
query10	533	408	366	366
query11	15156	14987	14838	14838
query12	185	125	127	125
query13	1300	540	412	412
query14	6148	3117	2838	2838
query14_1	2755	2691	2746	2691
query15	205	204	177	177
query16	1039	530	492	492
query17	1166	739	621	621
query18	2484	470	371	371
query19	247	244	217	217
query20	129	128	123	123
query21	220	150	129	129
query22	4088	3939	3850	3850
query23	16171	15859	15243	15243
query23_1	15499	15465	15454	15454
query24	7258	1556	1172	1172
query24_1	1184	1177	1167	1167
query25	575	484	440	440
query26	1250	264	161	161
query27	2759	440	273	273
query28	4561	2175	2161	2161
query29	788	567	503	503
query30	318	247	221	221
query31	819	640	558	558
query32	91	82	78	78
query33	567	371	323	323
query34	920	877	548	548
query35	743	801	684	684
query36	910	909	886	886
query37	145	104	93	93
query38	2700	2699	2698	2698
query39	794	771	759	759
query39_1	719	733	715	715
query40	228	146	128	128
query41	82	77	76	76
query42	113	112	121	112
query43	447	420	398	398
query44	1306	745	749	745
query45	196	187	183	183
query46	836	949	576	576
query47	1392	1487	1417	1417
query48	307	321	239	239
query49	635	453	372	372
query50	628	283	221	221
query51	3768	3799	3822	3799
query52	109	113	100	100
query53	302	334	292	292
query54	318	293	298	293
query55	86	84	81	81
query56	341	340	338	338
query57	1016	1005	931	931
query58	293	282	288	282
query59	2148	2077	2102	2077
query60	367	361	345	345
query61	203	195	202	195
query62	405	370	319	319
query63	306	272	282	272
query64	5142	1478	1217	1217
query65	3846	3725	3791	3725
query66	1491	460	357	357
query67	15438	15188	15819	15188
query68	2779	1026	746	746
query69	513	375	346	346
query70	994	967	947	947
query71	333	314	300	300
query72	6011	3605	3743	3605
query73	588	730	306	306
query74	8856	8804	8629	8629
query75	2783	2832	2482	2482
query76	3051	1071	668	668
query77	557	398	317	317
query78	9619	9937	9210	9210
query79	943	914	564	564
query80	829	621	546	546
query81	510	267	243	243
query82	217	151	123	123
query83	392	278	252	252
query84	260	115	100	100
query85	1121	624	574	574
query86	363	299	323	299
query87	2839	2820	2744	2744
query88	2980	2184	2167	2167
query89	390	354	347	347
query90	1809	173	171	171
query91	200	188	178	178
query92	86	79	74	74
query93	940	922	520	520
query94	582	358	319	319
query95	608	406	338	338
query96	573	482	206	206
query97	2367	2402	2344	2344
query98	213	216	209	209
query99	599	590	536	536
Total cold run time: 248272 ms
Total hot run time: 175733 ms

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 68.64% (116/169) 🎉
Increment coverage report
Complete coverage report

@airborne12
Member Author

run buildall

@doris-robot

TPC-H: Total hot run time: 31243 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 0b426515c2cef003732c976a0d45875af3bf6be3, data reload: false

------ Round 1 ----------------------------------
q1	17612	4201	4031	4031
q2	2021	349	245	245
q3	10190	1269	720	720
q4	10224	930	343	343
q5	7568	2015	1883	1883
q6	191	168	136	136
q7	906	779	671	671
q8	9256	1405	1233	1233
q9	4903	4674	4518	4518
q10	6782	1788	1427	1427
q11	518	279	258	258
q12	693	734	603	603
q13	17796	3809	3057	3057
q14	282	290	276	276
q15	593	519	506	506
q16	668	666	643	643
q17	656	836	430	430
q18	6910	6336	6311	6311
q19	1096	979	597	597
q20	387	351	238	238
q21	2963	2437	2174	2174
q22	1022	1006	943	943
Total cold run time: 103237 ms
Total hot run time: 31243 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4066	4048	4033	4033
q2	329	403	323	323
q3	2081	2571	2205	2205
q4	1322	1742	1323	1323
q5	4049	3958	3981	3958
q6	210	167	127	127
q7	1876	1796	1726	1726
q8	2763	2515	2429	2429
q9	7147	7156	7198	7156
q10	2499	2741	2293	2293
q11	622	478	483	478
q12	714	735	613	613
q13	3592	4138	3528	3528
q14	293	305	267	267
q15	535	517	549	517
q16	687	692	649	649
q17	1181	1572	1337	1337
q18	8438	7845	7841	7841
q19	862	838	810	810
q20	1994	2174	1949	1949
q21	4687	4292	4108	4108
q22	1045	1056	990	990
Total cold run time: 50992 ms
Total hot run time: 48660 ms

@doris-robot

TPC-DS: Total hot run time: 172085 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 0b426515c2cef003732c976a0d45875af3bf6be3, data reload: false

query5	4405	614	457	457
query6	312	223	212	212
query7	4204	443	243	243
query8	347	240	244	240
query9	8719	2869	2880	2869
query10	529	371	341	341
query11	15132	15137	14755	14755
query12	175	123	110	110
query13	1260	481	370	370
query14	6235	2943	2724	2724
query14_1	2611	2594	2681	2594
query15	202	194	174	174
query16	993	489	457	457
query17	1099	679	576	576
query18	2498	433	340	340
query19	218	224	196	196
query20	117	115	116	115
query21	211	138	121	121
query22	3966	4160	3740	3740
query23	15865	15603	15333	15333
query23_1	15298	15287	15306	15287
query24	7228	1522	1166	1166
query24_1	1138	1168	1172	1168
query25	574	479	419	419
query26	1236	257	159	159
query27	2783	436	270	270
query28	4556	2146	2124	2124
query29	781	525	452	452
query30	306	245	212	212
query31	772	616	560	560
query32	94	74	73	73
query33	541	353	313	313
query34	882	870	510	510
query35	722	747	667	667
query36	907	935	875	875
query37	174	91	81	81
query38	2713	2666	2587	2587
query39	764	745	719	719
query39_1	707	705	698	698
query40	216	128	112	112
query41	66	72	66	66
query42	99	100	105	100
query43	430	420	420	420
query44	1300	728	736	728
query45	183	187	170	170
query46	818	924	551	551
query47	1401	1407	1433	1407
query48	306	300	217	217
query49	594	432	329	329
query50	591	262	206	206
query51	3772	3890	3736	3736
query52	100	105	96	96
query53	286	321	266	266
query54	280	259	253	253
query55	77	80	74	74
query56	304	289	309	289
query57	1007	1033	927	927
query58	262	248	244	244
query59	2175	2182	1985	1985
query60	322	361	298	298
query61	152	151	153	151
query62	390	351	351	351
query63	294	265	268	265
query64	4897	1264	945	945
query65	3847	3747	3836	3747
query66	1449	418	311	311
query67	14994	15179	15290	15179
query68	6624	959	688	688
query69	502	349	322	322
query70	1051	975	889	889
query71	364	303	276	276
query72	5736	3332	3387	3332
query73	745	703	297	297
query74	8804	8730	8585	8585
query75	2785	2774	2449	2449
query76	3316	1060	662	662
query77	522	362	300	300
query78	9686	9698	9194	9194
query79	1407	876	531	531
query80	640	585	467	467
query81	500	278	236	236
query82	212	146	109	109
query83	255	258	237	237
query84	253	106	97	97
query85	888	494	460	460
query86	393	261	267	261
query87	2886	2918	2746	2746
query88	2894	2113	2106	2106
query89	392	356	327	327
query90	2185	166	150	150
query91	167	155	133	133
query92	85	68	69	68
query93	1893	868	522	522
query94	566	333	299	299
query95	583	378	308	308
query96	552	477	198	198
query97	2313	2377	2348	2348
query98	224	201	193	193
query99	607	577	496	496
Total cold run time: 250993 ms
Total hot run time: 172085 ms

@doris-robot

ClickBench: Total hot run time: 26.74 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 0b426515c2cef003732c976a0d45875af3bf6be3, data reload: false

query1	0.06	0.05	0.05
query2	0.09	0.05	0.05
query3	0.26	0.08	0.08
query4	1.61	0.12	0.11
query5	0.27	0.25	0.25
query6	1.16	0.64	0.64
query7	0.03	0.02	0.03
query8	0.05	0.05	0.04
query9	0.58	0.50	0.49
query10	0.56	0.56	0.55
query11	0.14	0.10	0.10
query12	0.15	0.10	0.10
query13	0.60	0.59	0.59
query14	0.95	0.96	0.94
query15	0.78	0.77	0.80
query16	0.42	0.40	0.38
query17	0.95	1.02	1.05
query18	0.22	0.22	0.21
query19	1.96	1.80	1.88
query20	0.02	0.01	0.02
query21	15.44	0.25	0.13
query22	5.16	0.06	0.05
query23	16.01	0.27	0.09
query24	1.47	0.63	0.19
query25	0.12	0.08	0.07
query26	0.15	0.13	0.14
query27	0.09	0.05	0.05
query28	4.53	1.06	0.89
query29	12.59	4.01	3.15
query30	0.27	0.14	0.12
query31	2.81	0.63	0.39
query32	3.24	0.55	0.47
query33	2.94	3.04	3.08
query34	16.11	5.08	4.49
query35	4.42	4.45	4.49
query36	0.64	0.52	0.48
query37	0.10	0.07	0.06
query38	0.07	0.04	0.04
query39	0.04	0.03	0.03
query40	0.17	0.14	0.13
query41	0.08	0.03	0.03
query42	0.04	0.03	0.03
query43	0.05	0.03	0.03
Total cold run time: 97.4 s
Total hot run time: 26.74 s

airborne12 and others added 2 commits January 14, 2026 17:46
Add `fields` parameter to the search() function, allowing queries to search
across multiple fields with a single query term. Similar to Elasticsearch's
query_string `fields` parameter.

Usage examples:
- search('hello', '{"fields":["title","content"]}')
  -> Equivalent to: (title:hello OR content:hello)

- search('hello world', '{"fields":["title","content"],"default_operator":"and"}')
  -> Equivalent to: (title:hello OR content:hello) AND (title:world OR content:world)

- search('a AND b', '{"fields":["title","content"],"mode":"lucene"}')
  -> Multi-field with Lucene boolean semantics

Key changes:
- Extended SearchOptions with `fields` array and `isMultiFieldMode()`
- Added common helper methods for DRY compliance (parseWithVisitor, expandItemAcrossFields)
- Added FieldTrackingVisitor interface for polymorphic visitor handling
- Added parseDslMultiFieldMode() and parseDslMultiFieldLuceneMode()
- Added comprehensive unit tests (15+ test cases)
- Added regression tests (18 test cases)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…cross_fields behavior

- Add test data id=9 to verify cross_fields vs best_fields semantics
- Add Test 2b: multi_field_multi_term_and_lucene to test default_operator:and with mode:lucene
- Add Test 11b: multi_field_cross_fields_verify to explicitly verify cross_fields behavior
- Update expected output file with new test case results

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@airborne12
Member Author

run buildall

@doris-robot

TPC-H: Total hot run time: 31305 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 7c0d17897b2775d1826146cd7c2505e5fd40a022, data reload: false

------ Round 1 ----------------------------------
q1	17634	4279	4071	4071
q2	2058	377	238	238
q3	10126	1269	698	698
q4	10229	849	300	300
q5	7505	2161	1812	1812
q6	186	169	137	137
q7	921	791	650	650
q8	9275	1427	1144	1144
q9	4874	4658	4532	4532
q10	6736	1797	1414	1414
q11	517	296	295	295
q12	688	757	591	591
q13	17782	3848	3087	3087
q14	302	289	284	284
q15	595	521	503	503
q16	685	668	642	642
q17	660	705	558	558
q18	6674	6279	6336	6279
q19	1236	964	613	613
q20	383	356	237	237
q21	2999	2390	2254	2254
q22	1077	988	966	966
Total cold run time: 103142 ms
Total hot run time: 31305 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4093	4053	4040	4040
q2	323	415	323	323
q3	2121	2601	2242	2242
q4	1312	1729	1353	1353
q5	4087	4014	4096	4014
q6	214	168	127	127
q7	1863	1793	1669	1669
q8	2841	2538	2550	2538
q9	7284	7248	7206	7206
q10	2677	2795	2300	2300
q11	545	485	462	462
q12	720	752	646	646
q13	3721	4203	3379	3379
q14	286	309	280	280
q15	548	492	509	492
q16	667	682	622	622
q17	1153	1350	1398	1350
q18	8120	8080	7935	7935
q19	960	849	851	849
q20	2021	2093	1926	1926
q21	4718	4508	4439	4439
q22	1082	1046	964	964
Total cold run time: 51356 ms
Total hot run time: 49156 ms

@doris-robot

TPC-DS: Total hot run time: 173919 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 7c0d17897b2775d1826146cd7c2505e5fd40a022, data reload: false

query5	4406	631	509	509
query6	348	246	222	222
query7	4220	464	259	259
query8	345	247	241	241
query9	8733	2923	2876	2876
query10	532	384	329	329
query11	15220	15256	14988	14988
query12	190	115	114	114
query13	1264	513	396	396
query14	6653	3086	2789	2789
query14_1	2708	2734	2665	2665
query15	195	191	168	168
query16	1038	513	467	467
query17	1104	704	582	582
query18	2717	441	337	337
query19	225	229	200	200
query20	117	121	114	114
query21	212	135	120	120
query22	3777	3995	3970	3970
query23	16082	15675	15477	15477
query23_1	15509	15431	15311	15311
query24	7113	1562	1175	1175
query24_1	1183	1157	1150	1150
query25	559	491	432	432
query26	1247	276	154	154
query27	2749	441	280	280
query28	4536	2141	2129	2129
query29	796	554	452	452
query30	314	250	212	212
query31	814	619	554	554
query32	89	78	72	72
query33	541	357	314	314
query34	911	882	529	529
query35	731	822	685	685
query36	873	887	869	869
query37	133	97	85	85
query38	2690	2740	2598	2598
query39	764	768	715	715
query39_1	723	702	703	702
query40	216	137	116	116
query41	69	62	66	62
query42	105	102	103	102
query43	441	458	417	417
query44	1307	743	725	725
query45	187	181	174	174
query46	832	931	563	563
query47	1413	1470	1327	1327
query48	316	326	240	240
query49	597	417	336	336
query50	607	279	207	207
query51	3795	3812	3785	3785
query52	103	108	98	98
query53	288	324	268	268
query54	285	268	263	263
query55	86	79	74	74
query56	311	321	306	306
query57	1027	989	956	956
query58	279	259	292	259
query59	2079	2053	2080	2053
query60	335	325	316	316
query61	159	149	154	149
query62	416	365	311	311
query63	301	267	265	265
query64	4874	1282	988	988
query65	3769	3755	3785	3755
query66	1361	434	305	305
query67	15217	15058	15249	15058
query68	6574	974	706	706
query69	504	363	329	329
query70	1061	950	955	950
query71	366	329	303	303
query72	5709	3282	3305	3282
query73	798	715	310	310
query74	8721	8794	8582	8582
query75	2806	2793	2474	2474
query76	3472	1057	640	640
query77	511	369	307	307
query78	9738	9796	9113	9113
query79	1628	918	598	598
query80	695	580	488	488
query81	524	262	225	225
query82	391	160	111	111
query83	261	258	244	244
query84	258	113	105	105
query85	899	535	448	448
query86	400	295	290	290
query87	2885	2823	2742	2742
query88	3522	2591	2582	2582
query89	379	343	327	327
query90	2135	174	155	155
query91	168	165	138	138
query92	89	72	66	66
query93	1516	907	524	524
query94	564	335	310	310
query95	598	337	367	337
query96	642	519	235	235
query97	2385	2397	2324	2324
query98	237	201	190	190
query99	598	571	563	563
Total cold run time: 253075 ms
Total hot run time: 173919 ms

@doris-robot

ClickBench: Total hot run time: 26.8 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 7c0d17897b2775d1826146cd7c2505e5fd40a022, data reload: false

query1	0.06	0.04	0.05
query2	0.10	0.06	0.05
query3	0.26	0.08	0.08
query4	1.61	0.12	0.11
query5	0.28	0.25	0.26
query6	1.15	0.67	0.65
query7	0.04	0.02	0.02
query8	0.05	0.04	0.04
query9	0.57	0.51	0.49
query10	0.56	0.54	0.54
query11	0.15	0.09	0.10
query12	0.14	0.10	0.11
query13	0.61	0.59	0.60
query14	0.96	0.96	0.96
query15	0.80	0.77	0.77
query16	0.39	0.40	0.38
query17	1.07	0.98	1.06
query18	0.23	0.22	0.21
query19	1.97	1.87	1.90
query20	0.02	0.01	0.01
query21	15.44	0.23	0.14
query22	5.34	0.04	0.04
query23	15.91	0.28	0.10
query24	1.18	0.29	1.00
query25	0.11	0.06	0.07
query26	0.14	0.14	0.13
query27	0.09	0.06	0.09
query28	4.95	1.06	0.89
query29	12.56	3.89	3.14
query30	0.27	0.13	0.12
query31	2.82	0.62	0.38
query32	3.25	0.56	0.45
query33	3.07	3.11	3.03
query34	16.27	5.01	4.46
query35	4.47	4.45	4.51
query36	0.67	0.50	0.48
query37	0.10	0.07	0.06
query38	0.07	0.04	0.04
query39	0.04	0.03	0.02
query40	0.17	0.14	0.14
query41	0.08	0.04	0.03
query42	0.04	0.03	0.02
query43	0.04	0.04	0.04
Total cold run time: 98.1 s
Total hot run time: 26.8 s

…_fields/cross_fields modes

Add support for "type" option in multi-field search DSL to control how terms
are matched across fields:

- best_fields (default): All terms must match within the SAME field
  Example: "machine learning" -> (title:machine AND title:learning) OR
           (content:machine AND content:learning)

- cross_fields: Terms can match across DIFFERENT fields
  Example: "machine learning" -> (title:machine OR content:machine) AND
           (title:learning OR content:learning)

This aligns with Elasticsearch's default behavior where best_fields is the
default mode. Users can explicitly set type:"cross_fields" when they need
terms to be distributed across multiple fields.

Also includes:
- Input validation in setType() with IllegalArgumentException for invalid values
- Explicit type checking to prevent silent fallback behavior
- Unit tests for type parameter parsing and expansion
- Regression tests for both modes in standard and Lucene modes

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@airborne12
Member Author

run buildall

@doris-robot

TPC-H: Total hot run time: 32084 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 098956f72a6288468b7cb560a2df2480273ba758, data reload: false

------ Round 1 ----------------------------------
q1	17682	4298	4068	4068
q2	2001	352	235	235
q3	10178	1249	699	699
q4	10225	901	318	318
q5	7565	2072	1875	1875
q6	194	170	139	139
q7	979	761	663	663
q8	9309	1427	1200	1200
q9	4954	4514	4785	4514
q10	6789	1767	1388	1388
q11	533	294	290	290
q12	724	751	628	628
q13	17790	3868	3107	3107
q14	289	309	276	276
q15	575	525	514	514
q16	668	690	628	628
q17	649	751	558	558
q18	6633	6457	6766	6457
q19	1533	1040	685	685
q20	409	380	263	263
q21	3209	2621	2540	2540
q22	1131	1149	1039	1039
Total cold run time: 104019 ms
Total hot run time: 32084 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4434	4261	4325	4261
q2	342	406	310	310
q3	2361	2874	2429	2429
q4	1446	1833	1420	1420
q5	4469	4337	4494	4337
q6	213	174	130	130
q7	1967	1937	1821	1821
q8	2526	2412	2395	2395
q9	7493	7375	7113	7113
q10	2493	2722	2344	2344
q11	572	493	459	459
q12	699	679	616	616
q13	3362	3770	3087	3087
q14	267	286	257	257
q15	519	486	485	485
q16	597	637	591	591
q17	1088	1303	1338	1303
q18	7184	7202	7298	7202
q19	821	762	803	762
q20	1915	1948	1807	1807
q21	4454	4252	4166	4166
q22	1103	1055	960	960
Total cold run time: 50325 ms
Total hot run time: 48255 ms

@doris-robot

TPC-DS: Total hot run time: 173383 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 098956f72a6288468b7cb560a2df2480273ba758, data reload: false

query5	4817	638	500	500
query6	348	231	227	227
query7	4227	472	259	259
query8	345	253	250	250
query9	8736	2907	2890	2890
query10	561	398	382	382
query11	15427	15255	14736	14736
query12	174	115	113	113
query13	1275	487	377	377
query14	6475	3067	2787	2787
query14_1	2724	2726	2686	2686
query15	205	195	172	172
query16	1047	470	471	470
query17	1048	661	560	560
query18	2668	437	324	324
query19	216	212	194	194
query20	119	113	114	113
query21	219	133	123	123
query22	4005	3969	3868	3868
query23	15930	15587	15319	15319
query23_1	15388	15382	15333	15333
query24	7168	1563	1191	1191
query24_1	1160	1153	1146	1146
query25	530	440	404	404
query26	1238	265	157	157
query27	2764	437	267	267
query28	4566	2150	2159	2150
query29	764	531	447	447
query30	311	244	219	219
query31	824	627	552	552
query32	88	73	79	73
query33	529	336	305	305
query34	894	868	536	536
query35	730	759	671	671
query36	900	897	752	752
query37	123	96	83	83
query38	2738	2688	2691	2688
query39	772	757	721	721
query39_1	713	717	707	707
query40	220	132	116	116
query41	65	65	62	62
query42	102	100	100	100
query43	469	444	413	413
query44	1319	721	740	721
query45	186	185	173	173
query46	821	932	568	568
query47	1362	1482	1406	1406
query48	305	318	236	236
query49	613	428	348	348
query50	615	265	203	203
query51	3715	3809	3803	3803
query52	103	111	98	98
query53	278	325	270	270
query54	292	270	264	264
query55	83	81	75	75
query56	303	310	320	310
query57	995	1024	883	883
query58	265	257	261	257
query59	2047	2003	2124	2003
query60	325	322	319	319
query61	161	187	156	156
query62	395	351	318	318
query63	291	260	266	260
query64	4892	1270	986	986
query65	3779	3801	3718	3718
query66	1394	424	335	335
query67	15354	14576	14675	14576
query68	2712	1004	750	750
query69	467	369	325	325
query70	953	959	905	905
query71	326	303	297	297
query72	5607	3484	3546	3484
query73	607	723	318	318
query74	8748	8726	8586	8586
query75	2776	2853	2501	2501
query76	2447	1053	636	636
query77	354	377	304	304
query78	9838	9905	9155	9155
query79	1053	901	576	576
query80	1510	609	528	528
query81	533	266	240	240
query82	1359	149	118	118
query83	355	269	258	258
query84	255	120	108	108
query85	1142	507	449	449
query86	388	334	320	320
query87	2902	2915	2750	2750
query88	3516	2586	2567	2567
query89	400	350	326	326
query90	1856	180	164	164
query91	174	161	140	140
query92	83	74	73	73
query93	952	897	518	518
query94	589	307	290	290
query95	575	404	319	319
query96	636	512	229	229
query97	2318	2366	2300	2300
query98	223	218	224	218
query99	613	572	499	499
Total cold run time: 249339 ms
Total hot run time: 173383 ms

@doris-robot

ClickBench: Total hot run time: 26.81 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 098956f72a6288468b7cb560a2df2480273ba758, data reload: false

query1	0.05	0.05	0.05
query2	0.09	0.05	0.05
query3	0.26	0.09	0.09
query4	1.61	0.12	0.11
query5	0.28	0.25	0.24
query6	1.14	0.67	0.66
query7	0.04	0.03	0.03
query8	0.05	0.04	0.04
query9	0.57	0.52	0.50
query10	0.55	0.54	0.55
query11	0.15	0.10	0.10
query12	0.15	0.12	0.11
query13	0.61	0.58	0.59
query14	0.95	0.95	0.94
query15	0.80	0.78	0.79
query16	0.41	0.40	0.38
query17	1.04	1.07	1.02
query18	0.23	0.22	0.21
query19	1.85	1.90	1.86
query20	0.02	0.01	0.01
query21	15.45	0.26	0.13
query22	5.07	0.05	0.05
query23	16.10	0.29	0.10
query24	1.44	0.24	0.55
query25	0.12	0.12	0.07
query26	0.15	0.13	0.13
query27	0.08	0.05	0.05
query28	4.85	1.06	0.87
query29	12.57	3.88	3.12
query30	0.30	0.14	0.12
query31	2.82	0.63	0.41
query32	3.24	0.55	0.45
query33	2.94	3.07	3.05
query34	16.08	5.04	4.41
query35	4.42	4.50	4.55
query36	0.65	0.50	0.48
query37	0.10	0.06	0.07
query38	0.07	0.04	0.03
query39	0.04	0.04	0.03
query40	0.16	0.13	0.14
query41	0.09	0.04	0.03
query42	0.04	0.02	0.02
query43	0.04	0.04	0.03
Total cold run time: 97.67 s
Total hot run time: 26.81 s

@airborne12
Member Author

run cloud_p0

@airborne12
Member Author

run vault_p0

@airborne12
Member Author

run external

zzzxl1993
zzzxl1993 previously approved these changes Jan 15, 2026
Contributor

@zzzxl1993 zzzxl1993 left a comment

LGTM

@github-actions github-actions bot added the approved label (indicates a PR has been approved by one committer) Jan 15, 2026
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 65.67% (153/233) 🎉
Increment coverage report
Complete coverage report

airborne12 and others added 3 commits January 15, 2026 13:52
…sses

Make fields private with proper getter/setter methods in QsNode, QsPlan,
and QsFieldBinding classes. This follows Java encapsulation best practices
and allows for future validation or transformation logic.

- Make all fields private with @JsonProperty getters for serialization
- Add setters for QsNode.field and QsFieldBinding.fieldName (needed for
  field name normalization in RewriteSearchToSlots)
- Update all usages to use getter/setter methods instead of direct field access
- Add Javadoc to QsNode constructors

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…dling

- Add SearchDslSyntaxException for clearer DSL syntax error messages
- Add @Nullable annotations to public parseDsl method parameters
- Replace deprecated ANTLRInputStream with CharStreams.fromString()
- Remove duplicate fields validation in expandMultiFieldDsl methods
- Add context (actual fields value) to validation exception messages
- Improve error handling with specific catch blocks for different
  exception types (syntax errors, argument errors, internal errors)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…mize StringBuilder

- Add @JsonProperty/@JsonSetter annotations to SearchOptions for declarative JSON mapping
- Simplify parseOptions() from 58 lines to 20 lines using JSON_MAPPER.readValue()
- Add validate() method for mutual exclusion and range checks
- Add minimum_should_match negative value validation
- Reuse StringBuilder with setLength(0) instead of creating new instances in tokenizeDsl()
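
The declarative parsing and validate() step described above can be modeled roughly as follows. This is a Python sketch, not the Java SearchOptions class: the option names come from the PR description, while the class layout, the None defaults, and the error messages are assumptions made for illustration.

```python
import dataclasses
import json
from dataclasses import dataclass, field
from typing import List, Optional

VALID_TYPES = {"best_fields", "cross_fields"}


@dataclass
class SearchOptions:
    # Option names follow the PR description; defaults other than
    # type="best_fields" are assumptions.
    fields: List[str] = field(default_factory=list)
    default_field: Optional[str] = None
    default_operator: Optional[str] = None
    type: str = "best_fields"
    mode: Optional[str] = None
    minimum_should_match: Optional[int] = None

    def validate(self):
        if self.fields and self.default_field is not None:
            raise ValueError("fields and default_field are mutually exclusive")
        if self.type not in VALID_TYPES:
            raise ValueError(f"invalid type: {self.type!r}")
        if self.minimum_should_match is not None and self.minimum_should_match < 0:
            raise ValueError("minimum_should_match must not be negative")
        return self


def parse_options(raw):
    """Map a JSON options string onto SearchOptions, then validate it."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("options must be a JSON object")
    known = {f.name for f in dataclasses.fields(SearchOptions)}
    unknown = set(data) - known
    if unknown:
        raise ValueError(f"unknown options: {sorted(unknown)}")
    return SearchOptions(**data).validate()


opts = parse_options('{"fields":["title","content"],"type":"cross_fields"}')
print(opts.type, opts.fields)
```

Keeping the checks in one validate() method means every construction path applies the same rules, which is the motivation the commit message gives for moving away from hand-written field-by-field parsing.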

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@github-actions github-actions bot removed the approved label (indicates a PR has been approved by one committer) Jan 15, 2026
@airborne12
Member Author

run buildall

@doris-robot

TPC-H: Total hot run time: 31480 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit fc21cbce3d725aff907d6a5b660fec4059c5b7e5, data reload: false

------ Round 1 ----------------------------------
q1	17602	4181	4014	4014
q2	2059	355	247	247
q3	10148	1263	716	716
q4	10201	763	304	304
q5	7593	2066	1804	1804
q6	185	168	135	135
q7	911	788	659	659
q8	9280	1409	1125	1125
q9	4848	4599	4362	4362
q10	6725	1826	1433	1433
q11	517	302	286	286
q12	719	729	583	583
q13	17788	3813	3083	3083
q14	301	299	284	284
q15	568	524	502	502
q16	673	689	633	633
q17	636	811	474	474
q18	6622	6435	6861	6435
q19	1090	1084	673	673
q20	420	389	261	261
q21	3241	2607	2423	2423
q22	1104	1094	1044	1044
Total cold run time: 103231 ms
Total hot run time: 31480 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	4334	4405	4221	4221
q2	347	422	320	320
q3	2254	2740	2498	2498
q4	1419	1928	1514	1514
q5	4401	4290	4403	4290
q6	209	181	128	128
q7	2027	1977	1774	1774
q8	2661	2409	2396	2396
q9	7189	7232	7154	7154
q10	2528	2779	2262	2262
q11	546	461	454	454
q12	682	743	634	634
q13	3661	3966	3095	3095
q14	269	275	265	265
q15	528	481	487	481
q16	607	644	590	590
q17	1065	1232	1258	1232
q18	7422	7272	7190	7190
q19	812	787	766	766
q20	1850	1950	1866	1866
q21	4491	4203	4218	4203
q22	1086	1014	977	977
Total cold run time: 50388 ms
Total hot run time: 48310 ms

@doris-robot

TPC-DS: Total hot run time: 174157 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit fc21cbce3d725aff907d6a5b660fec4059c5b7e5, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query5	5757	618	480	480
query6	326	226	217	217
query7	4228	459	254	254
query8	352	253	253	253
query9	8736	2866	2865	2865
query10	564	366	346	346
query11	15149	15135	14903	14903
query12	184	115	123	115
query13	1271	497	390	390
query14	7792	3087	2805	2805
query14_1	2673	2640	2666	2640
query15	251	195	182	182
query16	999	490	464	464
query17	1322	650	544	544
query18	2688	422	333	333
query19	283	211	189	189
query20	120	112	112	112
query21	214	132	115	115
query22	3966	4100	4001	4001
query23	15996	15472	15279	15279
query23_1	15354	15477	15389	15389
query24	6834	1545	1185	1185
query24_1	1171	1183	1168	1168
query25	514	444	396	396
query26	1222	259	151	151
query27	2719	437	273	273
query28	4547	2129	2130	2129
query29	751	521	425	425
query30	310	244	203	203
query31	777	624	587	587
query32	84	75	69	69
query33	523	381	303	303
query34	852	896	532	532
query35	723	786	650	650
query36	866	865	841	841
query37	131	95	82	82
query38	2662	2666	2729	2666
query39	779	759	740	740
query39_1	735	717	726	717
query40	220	134	119	119
query41	67	60	64	60
query42	102	103	100	100
query43	441	451	411	411
query44	1319	733	722	722
query45	183	188	177	177
query46	826	949	572	572
query47	1416	1484	1428	1428
query48	311	313	243	243
query49	600	426	339	339
query50	633	263	206	206
query51	3806	3793	3774	3774
query52	101	104	96	96
query53	293	321	266	266
query54	288	263	259	259
query55	81	78	75	75
query56	315	308	292	292
query57	1035	1025	948	948
query58	282	256	260	256
query59	2151	2154	2083	2083
query60	332	349	301	301
query61	160	148	162	148
query62	378	345	310	310
query63	293	267	262	262
query64	4826	1254	958	958
query65	3726	3721	3735	3721
query66	1371	417	317	317
query67	15515	15613	15444	15444
query68	2419	1078	749	749
query69	424	371	330	330
query70	944	973	951	951
query71	326	308	285	285
query72	5428	3424	3569	3424
query73	591	720	323	323
query74	8679	8724	8590	8590
query75	2766	2813	2488	2488
query76	2248	1049	647	647
query77	376	379	311	311
query78	9832	10040	9113	9113
query79	1089	904	574	574
query80	765	601	541	541
query81	499	266	229	229
query82	1301	145	110	110
query83	327	248	245	245
query84	253	108	89	89
query85	861	501	446	446
query86	368	301	317	301
query87	2891	2897	2759	2759
query88	3497	2594	2571	2571
query89	376	355	323	323
query90	1654	164	153	153
query91	166	161	135	135
query92	72	76	65	65
query93	946	881	536	536
query94	453	327	289	289
query95	557	341	372	341
query96	666	497	225	225
query97	2343	2400	2320	2320
query98	209	198	194	194
query99	588	574	512	512
Total cold run time: 248442 ms
Total hot run time: 174157 ms

@doris-robot

ClickBench: Total hot run time: 26.75 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit fc21cbce3d725aff907d6a5b660fec4059c5b7e5, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.05	0.05	0.05
query2	0.09	0.04	0.05
query3	0.25	0.09	0.08
query4	1.61	0.11	0.11
query5	0.27	0.26	0.26
query6	1.15	0.65	0.65
query7	0.04	0.03	0.02
query8	0.06	0.04	0.04
query9	0.56	0.51	0.49
query10	0.56	0.55	0.55
query11	0.14	0.10	0.10
query12	0.14	0.10	0.11
query13	0.60	0.59	0.59
query14	0.95	0.94	0.95
query15	0.78	0.78	0.78
query16	0.40	0.39	0.38
query17	1.06	1.03	1.04
query18	0.23	0.22	0.21
query19	1.94	1.85	1.83
query20	0.02	0.01	0.01
query21	15.44	0.24	0.14
query22	5.41	0.06	0.05
query23	16.05	0.28	0.10
query24	1.36	0.59	0.22
query25	0.09	0.07	0.10
query26	0.14	0.13	0.14
query27	0.06	0.08	0.04
query28	4.33	1.08	0.88
query29	12.53	3.92	3.14
query30	0.28	0.14	0.12
query31	2.81	0.63	0.40
query32	3.23	0.57	0.46
query33	3.03	3.03	3.06
query34	15.93	5.01	4.46
query35	4.40	4.47	4.43
query36	0.64	0.49	0.50
query37	0.10	0.06	0.07
query38	0.07	0.04	0.03
query39	0.04	0.02	0.02
query40	0.17	0.15	0.12
query41	0.10	0.03	0.03
query42	0.04	0.03	0.03
query43	0.04	0.04	0.04
Total cold run time: 97.19 s
Total hot run time: 26.75 s

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 60.48% (228/377) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 58.62% (221/377) 🎉
Increment coverage report
Complete coverage report

Member

@eldenmoon eldenmoon left a comment

lgtm

@github-actions (bot) added the "approved" label on Jan 16, 2026
@github-actions
Contributor

PR approved by at least one committer and no changes requested.


Labels

approved, dev/4.0.x, reviewed
