[fix](inverted index) guard BM25 stats collection against non-fulltext variant subcolumn entries #63692

Open
airborne12 wants to merge 6 commits into apache:master from airborne12:fix-doris-25510-bm25-variant-coredump

@airborne12
Member

Proposed changes

Issue Number: close #N/A (Jira DORIS-25510)

What problem does this PR solve?

When a variant column has a parent INVERTED index with a parser, and a sub-column is materialized in some segment as a non-string value (e.g. {"c": false}), variant_util::inherit_index calls remove_parser_and_analyzer() and writes a BKD/numeric index for that sub-column. The on-disk entry for (parent_index_id, "<sub>") therefore exists but is not a Lucene fulltext segment.

MatchPredicateCollector::collect (called from BM25 stats collection in OlapScanner::_prepare_impl) does not have segment context, so when the predicate references a variant sub-column it clones the parent fulltext index meta and sets the sub-column path as suffix. In segments where the sub-column happens to be non-string, IndexFileReader::open(...) returns a valid DorisCompoundReader pointing at the BKD entry, and lucene::index::IndexReader::open(compound_reader.get()) throws CLuceneError("No segments* file found in DorisCompoundReader@...").

That CLuceneError, which derives from std::exception rather than doris::Exception, escapes CollectionStatistics::process_segment and bubbles through collect() and OlapScanner::_prepare_impl. The ASSIGN_STATUS_IF_CATCH_EXCEPTION wrapper in scanner_scheduler.cpp only catches doris::Exception, so the exception goes uncaught and the BE aborts with SIGABRT during scanner prepare.
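
The escape described above can be illustrated with a minimal C++ sketch. The types `EngineException` and `LibraryError` and the functions `open_reader`, `narrow_wrapper`, and `run` are hypothetical stand-ins, not the real `doris::Exception`, `CLuceneError`, or Doris call chain. The sketch only assumes what the description states: both exception types derive from `std::exception`, and the wrapper has a handler for just one of them.

```cpp
#include <cassert>
#include <exception>
#include <string>

// Hypothetical stand-ins; neither is the real doris::Exception or CLuceneError.
struct EngineException : std::exception {  // plays the role of doris::Exception
    const char* what() const noexcept override { return "engine error"; }
};
struct LibraryError : std::exception {     // plays the role of CLuceneError
    const char* what() const noexcept override { return "No segments* file found"; }
};

// Simulates opening a fulltext reader on an entry that may not be fulltext.
void open_reader(bool entry_is_fulltext) {
    if (!entry_is_fulltext) throw LibraryError();
}

// A wrapper that only handles EngineException. LibraryError is a sibling
// type, so this handler does not match and the exception keeps unwinding.
std::string narrow_wrapper(bool entry_is_fulltext) {
    try {
        open_reader(entry_is_fulltext);
        return "ok";
    } catch (const EngineException& e) {
        return std::string("status: ") + e.what();
    }
}

// Outer frame added only so the sketch can report the escape. Without any
// matching handler, std::terminate would run and, by default, call abort().
std::string run(bool entry_is_fulltext) {
    try {
        return narrow_wrapper(entry_is_fulltext);
    } catch (const std::exception& e) {
        return std::string("escaped: ") + e.what();
    }
}
```

Calling `run(false)` reports that the error passed through `narrow_wrapper` untouched; in a process with no outer handler, the same situation ends in `std::terminate`, whose default behavior is `std::abort()`.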

Minimal reproducer (from DORIS-25510):

create table t (
    `id` int(11) NULL,
    `v` variant NULL,
    INDEX idx_v (`v`) USING INVERTED PROPERTIES("parser" = "english")
) ENGINE=OLAP DUPLICATE KEY(`id`)
  DISTRIBUTED BY HASH(`id`) BUCKETS 1
  PROPERTIES ("replication_allocation" = "tag.location.default:1");

insert into t values(1, '{"a": "abc"}');
insert into t values(2, '{"b": "abc"}');
insert into t values(3, '{"c": false}');

select score() from t where v["c"] match "abc" order by score() limit 10;
-- BE coredumps

This PR wraps the IndexReader::open + searcher-cache-fill path in CollectionStatistics::process_segment with a try { ... } catch (CLuceneError& e) that logs and continues to the next field. Skipping contributes 0 to _total_num_tokens / _term_doc_freqs for the affected field in that segment, which is the intended semantics for "no fulltext data for this sub-column in this segment". Existing INVERTED_INDEX_FILE_NOT_FOUND / INVERTED_INDEX_BYPASS handling at CollectionStatistics::collect is unchanged and still applies when the entry is genuinely absent.
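
The log-and-skip behavior can be sketched roughly as follows. This is not the actual `CollectionStatistics` code: `LibraryError`, `Segment`, `Stats`, `read_token_count`, and this `process_segment` are hypothetical names, and a negative token count is an assumed marker for "the entry exists but is not a fulltext index". An absent entry is simplified to a count of 0 here, whereas the real code handles it through separate status codes.

```cpp
#include <cassert>
#include <cstdint>
#include <iostream>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for CLuceneError.
struct LibraryError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical segment model: field name -> token count. A negative value
// marks an entry that exists on disk but is not a fulltext index (e.g. BKD).
using Segment = std::map<std::string, std::int64_t>;

struct Stats {
    std::map<std::string, std::int64_t> total_num_tokens;
    int skipped = 0;  // number of (field, segment) pairs that were skipped
};

// Simulates opening the fulltext reader and reading its token count.
std::int64_t read_token_count(const Segment& seg, const std::string& field) {
    auto it = seg.find(field);
    if (it == seg.end()) return 0;  // simplification: absent entry counts as 0
    if (it->second < 0) throw LibraryError("No segments* file found");
    return it->second;
}

// Guarded per-field loop: a failing field is logged and skipped, so it adds
// nothing to the totals and the remaining fields are still processed.
void process_segment(const Segment& seg, const std::vector<std::string>& fields,
                     Stats& stats) {
    for (const auto& field : fields) {
        try {
            const std::int64_t n = read_token_count(seg, field);
            stats.total_num_tokens[field] += n;
        } catch (const LibraryError& e) {
            std::cerr << "skip field " << field << ": " << e.what() << '\n';
            ++stats.skipped;
            continue;
        }
    }
}
```

Because the count is read before the map is touched, a skipped (field, segment) pair leaves the totals exactly as they were, which mirrors the "contributes 0" semantics described above.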

The deeper schema-level fix — never cloning a fulltext parent meta for a sub-column whose actual segment-level index was written as BKD — needs segment context and is a follow-up. The defensive try/catch is enough to stop the abort and is the same shape Doris uses elsewhere when CLucene exceptions cross the BE / Doris boundary.

Release note

Fix BE crash when running score() / BM25-scoring queries against a variant sub-column whose data in some segments is non-string while the parent variant column has a fulltext INVERTED index.

Check List (For Author)

  • Test:
    • Regression test: regression-test/suites/inverted_index_p0/test_bm25_score_variant_boolean_subcolumn.groovy replays the exact DORIS-25510 reproducer (3 single-row inserts so each lands in its own segment, including the segment with the boolean sub-column) and asserts the query returns without crashing.
  • Behavior changed: No (only converts a crash into a logged warning + empty stats contribution for the affected sub-column / segment).
  • Does this need documentation: No

…t variant subcolumn entries

### What problem does this PR solve?

Issue Number: close #N/A (Jira DORIS-25510)

Problem Summary:

When a variant column has a parent INVERTED index with parser, and a
sub-column is materialized in some segment as a non-string (e.g. boolean)
value, `variant_util::inherit_index` calls `remove_parser_and_analyzer()`
and writes a BKD/numeric index for that sub-column. The on-disk entry
for (parent_index_id, "<sub>") therefore exists but is **not** a Lucene
fulltext segment.

`MatchPredicateCollector::collect` (called from BM25 stats collection in
`OlapScanner::_prepare_impl`) does not have segment context, so when the
predicate references a variant sub-column it clones the parent fulltext
index meta and sets the sub-column path as suffix. In segments where the
sub-column happens to be non-string, `IndexFileReader::open(...)` then
returns a valid `DorisCompoundReader` pointing at the BKD entry, and
`lucene::index::IndexReader::open(compound_reader.get())` throws
`CLuceneError("No segments* file found in DorisCompoundReader@...")`.

The exception escapes `CollectionStatistics::process_segment` (no
try/catch), bubbles through `collect()`, `OlapScanner::_prepare_impl`,
and the `ASSIGN_STATUS_IF_CATCH_EXCEPTION` wrapper in
`scanner_scheduler.cpp` only catches `doris::Exception` — not
`CLuceneError` (which derives from `std::exception`). Result: BE
SIGABRT during scanner prepare.

Minimal reproducer (from DORIS-25510):

```sql
create table t (
    `id` int(11) NULL,
    `v` variant NULL,
    INDEX idx_v (`v`) USING INVERTED PROPERTIES("parser" = "english")
) ENGINE=OLAP DUPLICATE KEY(`id`)
  DISTRIBUTED BY HASH(`id`) BUCKETS 1
  PROPERTIES ("replication_allocation" = "tag.location.default:1");

insert into t values(1, '{"a": "abc"}');
insert into t values(2, '{"b": "abc"}');
insert into t values(3, '{"c": false}');

select score() from t where v["c"] match "abc" order by score() limit 10;
-- BE coredumps
```

This PR wraps the `IndexReader::open` + searcher-cache fill path in
`CollectionStatistics::process_segment` with a `try { ... } catch
(CLuceneError& e)` that logs and skips this (field, segment). Skipping
contributes 0 to `_total_num_tokens` / `_term_doc_freqs` for the
affected field in that segment, which is the intended semantics for
"no fulltext data for this sub-column in this segment". Existing
`INVERTED_INDEX_FILE_NOT_FOUND` / `INVERTED_INDEX_BYPASS` handling at
`CollectionStatistics::collect` is unchanged and still kicks in for
segments where the entry is genuinely absent.

The deeper schema-level fix — never cloning a fulltext parent meta for a
sub-column whose actual segment-level index was written as BKD — needs
segment context and is a follow-up; the defensive try/catch is enough to
stop the abort and is the same shape Doris uses elsewhere when CLucene
exceptions cross the BE/Doris boundary.

### Release note

Fix BE crash when running `score()` / BM25-scoring queries against a
variant sub-column whose data in some segments is non-string while the
parent variant column has a fulltext INVERTED index.

### Check List (For Author)

- Test:
    - Regression test: `regression-test/suites/inverted_index_p0/test_bm25_score_variant_boolean_subcolumn.groovy`
      replays the exact DORIS-25510 reproducer (3 single-row inserts so
      each lands in its own segment, including the boolean sub-column
      seg) and asserts the query returns without crash.
- Behavior changed: No (only converts a crash into a logged warning +
  empty stats contribution for the affected sub-column / segment).
- Does this need documentation: No
@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@airborne12
Member Author

run buildall

@hello-stephen
Contributor

TPC-H: Total hot run time: 30966 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 8a2e925d06ada992754527af75a41d7f4f61e93d, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17662	3960	3948	3948
q2	q3	10769	1316	790	790
q4	4687	469	353	353
q5	7619	2256	2093	2093
q6	253	187	136	136
q7	956	785	658	658
q8	9382	1547	1549	1547
q9	6435	5007	4974	4974
q10	6429	2185	1856	1856
q11	441	271	239	239
q12	702	429	294	294
q13	18191	3798	2781	2781
q14	274	260	234	234
q15	q16	821	769	702	702
q17	1048	847	947	847
q18	6756	5596	5540	5540
q19	1271	1340	975	975
q20	519	379	263	263
q21	5726	2543	2433	2433
q22	422	365	303	303
Total cold run time: 100363 ms
Total hot run time: 30966 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4354	4264	4243	4243
q2	q3	4489	4928	4275	4275
q4	2114	2210	1412	1412
q5	4456	4284	4284	4284
q6	229	173	129	129
q7	2370	2001	1624	1624
q8	2479	2198	2101	2101
q9	8149	8065	7924	7924
q10	4826	4800	4317	4317
q11	759	430	391	391
q12	755	764	569	569
q13	3274	3588	2974	2974
q14	314	320	284	284
q15	q16	727	748	649	649
q17	1373	1341	1365	1341
q18	7887	7539	6901	6901
q19	1107	1074	1096	1074
q20	2234	2230	1966	1966
q21	5324	4617	4525	4525
q22	532	492	401	401
Total cold run time: 57752 ms
Total hot run time: 51384 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 170851 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 8a2e925d06ada992754527af75a41d7f4f61e93d, data reload: false

query5	4311	651	526	526
query6	324	212	192	192
query7	4235	577	303	303
query8	318	233	216	216
query9	8865	4007	4004	4004
query10	466	334	291	291
query11	5810	2502	2262	2262
query12	182	125	129	125
query13	1278	616	439	439
query14	6169	5528	5197	5197
query14_1	4525	4477	4491	4477
query15	212	202	184	184
query16	1008	442	428	428
query17	1111	723	584	584
query18	2530	491	348	348
query19	221	193	157	157
query20	133	127	125	125
query21	209	137	115	115
query22	13624	13728	13488	13488
query23	17289	16424	16037	16037
query23_1	16326	16169	16186	16169
query24	7463	1762	1283	1283
query24_1	1345	1312	1304	1304
query25	545	469	413	413
query26	1315	310	168	168
query27	2715	568	337	337
query28	4478	1978	1968	1968
query29	1028	639	512	512
query30	305	239	199	199
query31	1109	1083	957	957
query32	87	76	76	76
query33	560	368	317	317
query34	1193	1121	656	656
query35	772	790	717	717
query36	1429	1426	1224	1224
query37	157	108	90	90
query38	3207	3161	3082	3082
query39	930	913	888	888
query39_1	871	887	879	879
query40	234	148	131	131
query41	71	67	68	67
query42	110	111	113	111
query43	333	337	325	325
query44	
query45	210	202	197	197
query46	1110	1232	750	750
query47	2387	2364	2381	2364
query48	403	420	305	305
query49	630	507	391	391
query50	1103	355	256	256
query51	4461	4286	4363	4286
query52	101	102	90	90
query53	252	279	196	196
query54	307	280	248	248
query55	89	88	85	85
query56	303	324	292	292
query57	1452	1419	1305	1305
query58	289	268	258	258
query59	1599	1686	1446	1446
query60	312	313	305	305
query61	158	154	148	148
query62	688	654	583	583
query63	244	208	220	208
query64	2421	800	616	616
query65	
query66	1724	481	361	361
query67	29103	29659	29541	29541
query68	
query69	450	335	302	302
query70	1090	1036	1017	1017
query71	300	279	276	276
query72	3027	2741	2417	2417
query73	838	756	425	425
query74	5098	4944	4782	4782
query75	2675	2592	2272	2272
query76	2288	1108	797	797
query77	421	416	361	361
query78	12331	12328	11799	11799
query79	1334	1024	722	722
query80	563	544	453	453
query81	450	277	238	238
query82	235	157	121	121
query83	268	290	247	247
query84	254	143	109	109
query85	854	529	474	474
query86	373	331	328	328
query87	3465	3359	3231	3231
query88	3605	2726	2713	2713
query89	429	382	342	342
query90	2130	187	183	183
query91	183	160	137	137
query92	76	89	75	75
query93	1534	1439	838	838
query94	521	346	314	314
query95	679	482	348	348
query96	1077	788	324	324
query97	2709	2725	2625	2625
query98	230	225	225	225
query99	1174	1165	1052	1052
Total cold run time: 252372 ms
Total hot run time: 170851 ms

@airborne12
Member Author

Note: the BE-UT failure on build 953796 is a master-tree breakage (master commits #63049 + #63491 left be/test/exprs/function/geo/functions_geo_test.cpp:375 with a now-rejected same-type assert_cast<ColumnUInt8*>). It is not caused by this PR; the regular Compile and all regression configs are green here. A standalone fix is open at #63701, and this PR can be re-run once that fix lands and this branch is rebased onto it.

…t regression

### What problem does this PR solve?

Issue Number: close #N/A (follow-up to DORIS-25510 / PR apache#63692)

Problem Summary:

The first regression run for `test_bm25_score_variant_boolean_subcolumn`
failed with:

```
java.lang.IllegalStateException: Missing outputFile:
  regression-test/data/inverted_index_p0/
    test_bm25_score_variant_boolean_subcolumn.out
```

The test was written with `qt_<name>` SQL blocks, which require an
auto-generated `.out` file. Per the repo guideline (CLAUDE.md, "test
result files must not be handwritten; they must be auto-generated via
test scripts") and since the property under test is *"BE survives the
query and returns the expected row count"* — not exact BM25 score values
— switching to plain `sql` + `assertEquals` removes the .out dependency
without weakening the regression.

The DORIS-25510 fix itself is unchanged; the BE in build 953840 already
ran the queries without crashing.

### Release note

None (test-only refinement).

### Check List (For Author)

- Test:
    - Regression-test only update; no .out file needed.
- Behavior changed: No
- Does this need documentation: No
@airborne12
Member Author

run buildall

@hello-stephen
Contributor

TPC-H: Total hot run time: 31759 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit b20ebfbc07ee902da2e1c779c013e2a47aa51019, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17696	4139	4114	4114
q2	q3	10803	1408	841	841
q4	4685	474	347	347
q5	7541	2284	2094	2094
q6	240	177	141	141
q7	955	790	668	668
q8	9409	1681	1639	1639
q9	5145	5009	4955	4955
q10	6471	2163	1863	1863
q11	434	280	249	249
q12	628	426	301	301
q13	18158	3411	2789	2789
q14	264	257	237	237
q15	q16	823	774	702	702
q17	995	941	960	941
q18	6953	5896	5589	5589
q19	1373	1403	1123	1123
q20	571	444	287	287
q21	6221	2873	2572	2572
q22	577	366	307	307
Total cold run time: 99942 ms
Total hot run time: 31759 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4943	5020	4856	4856
q2	q3	4943	5312	4632	4632
q4	2169	2207	1408	1408
q5	5033	4804	4646	4646
q6	230	178	127	127
q7	1906	1762	1601	1601
q8	2731	2170	2201	2170
q9	7977	7477	7371	7371
q10	4770	4705	4261	4261
q11	545	385	359	359
q12	730	755	537	537
q13	2995	3431	2764	2764
q14	274	276	251	251
q15	q16	687	705	608	608
q17	1303	1283	1279	1279
q18	7399	6695	6723	6695
q19	1122	1072	1072	1072
q20	2227	2242	1971	1971
q21	5298	4652	4525	4525
q22	516	471	413	413
Total cold run time: 57798 ms
Total hot run time: 51546 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 171356 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit b20ebfbc07ee902da2e1c779c013e2a47aa51019, data reload: false

query5	4346	686	522	522
query6	326	216	196	196
query7	4218	593	310	310
query8	332	238	215	215
query9	8803	4122	4090	4090
query10	442	343	298	298
query11	5833	2545	2301	2301
query12	173	154	127	127
query13	1287	591	457	457
query14	6148	5525	5233	5233
query14_1	4512	4540	4481	4481
query15	218	207	185	185
query16	1012	457	419	419
query17	1141	724	580	580
query18	2493	476	348	348
query19	212	202	161	161
query20	139	132	129	129
query21	215	136	114	114
query22	13694	13679	13486	13486
query23	17391	16572	16209	16209
query23_1	16401	16333	16341	16333
query24	7459	1814	1341	1341
query24_1	1319	1312	1303	1303
query25	587	502	451	451
query26	1314	330	176	176
query27	2678	607	342	342
query28	4441	1972	1971	1971
query29	1029	643	518	518
query30	308	246	206	206
query31	1140	1105	960	960
query32	90	85	73	73
query33	556	363	308	308
query34	1166	1135	656	656
query35	780	802	713	713
query36	1429	1425	1254	1254
query37	173	116	96	96
query38	3210	3194	3055	3055
query39	930	898	903	898
query39_1	868	868	893	868
query40	244	152	130	130
query41	71	69	69	69
query42	111	112	120	112
query43	339	350	295	295
query44	
query45	218	209	203	203
query46	1122	1206	717	717
query47	2412	2384	2257	2257
query48	391	425	312	312
query49	656	510	407	407
query50	1079	355	263	263
query51	4400	4333	4325	4325
query52	104	107	95	95
query53	262	288	210	210
query54	337	287	267	267
query55	100	90	88	88
query56	321	330	309	309
query57	1436	1428	1325	1325
query58	305	282	275	275
query59	1605	1653	1446	1446
query60	318	318	306	306
query61	166	162	158	158
query62	702	647	589	589
query63	245	197	205	197
query64	2389	804	601	601
query65	
query66	1705	475	357	357
query67	29832	29121	29615	29121
query68	
query69	453	330	302	302
query70	1029	1020	958	958
query71	301	283	272	272
query72	3031	2781	2409	2409
query73	851	751	424	424
query74	5133	4953	4788	4788
query75	2684	2623	2260	2260
query76	2293	1146	792	792
query77	406	424	339	339
query78	12552	12442	11791	11791
query79	1471	1062	764	764
query80	651	539	461	461
query81	455	286	241	241
query82	1407	155	125	125
query83	387	280	247	247
query84	267	137	112	112
query85	876	544	462	462
query86	410	341	329	329
query87	3443	3398	3256	3256
query88	3652	2783	2710	2710
query89	448	406	345	345
query90	1973	190	191	190
query91	180	171	137	137
query92	89	80	74	74
query93	1508	1389	850	850
query94	536	355	314	314
query95	677	379	348	348
query96	1081	785	364	364
query97	2719	2706	2609	2609
query98	231	227	236	227
query99	1155	1173	1013	1013
Total cold run time: 254828 ms
Total hot run time: 171356 ms

@airborne12
Member Author

Heads up on the latest run: CloudP0 953920 reports 1 failed test — statistics.test_full_analyze_hot_value (Sample analyze should also collect hot_value → false at line 98). This is a known statistics flake addressed by the open PR #63625 ([fix](statistics) full analyze not collect hot value by default), authored by @yujun777 on 2026-05-25. It is unrelated to this PR — the BM25 / variant / inverted-index path doesn't touch hot_value collection. Skipping or rerunning after #63625 lands should clear it.

…h down

### What problem does this PR solve?

Issue Number: close #N/A (follow-up to DORIS-25510 / PR apache#63692)

Problem Summary:

The first SELECT in `test_bm25_score_variant_boolean_subcolumn` used
`order by score(), id limit 10`. FE refuses the score() TopN push-down
when there is more than one ordering expression:

```
SQLException: errCode = 2,
detailMessage = TopN must have exactly one ordering expression for
                score() push down optimization
```

That makes the SQL itself fail before the BM25 collection path runs,
which is exactly the path this test is supposed to exercise. Use
`order by score() limit 10` (single ordering expression) so the
push-down kicks in, the BM25 statistics collection on the variant
sub-column runs, and the assertion can verify that the BE survives.

### Release note

None (test-only refinement).

### Check List (For Author)

- Test:
    - Regression-test only update.
- Behavior changed: No
- Does this need documentation: No
@airborne12
Member Author

run buildall

@hello-stephen
Contributor

TPC-H: Total hot run time: 31235 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 4b7bca5c076ecf78d3b808aa8d68f2c06634044f, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17648	4022	3991	3991
q2	q3	10745	1354	804	804
q4	4703	470	348	348
q5	7593	2274	2132	2132
q6	251	177	135	135
q7	956	801	643	643
q8	9345	1756	1576	1576
q9	6698	4933	4940	4933
q10	6449	2220	1891	1891
q11	446	278	241	241
q12	695	426	295	295
q13	18200	3313	2784	2784
q14	266	259	237	237
q15	q16	821	779	695	695
q17	1006	961	936	936
q18	7232	6085	5635	5635
q19	1232	1286	1029	1029
q20	521	386	256	256
q21	5774	2554	2378	2378
q22	426	359	296	296
Total cold run time: 101007 ms
Total hot run time: 31235 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4356	4288	4295	4288
q2	q3	4505	4963	4310	4310
q4	2093	2243	1392	1392
q5	4458	4329	4737	4329
q6	256	202	146	146
q7	2027	1811	1683	1683
q8	2545	2134	2180	2134
q9	8048	7933	7975	7933
q10	4842	4776	4493	4493
q11	574	421	392	392
q12	753	773	544	544
q13	3254	3636	2982	2982
q14	299	318	282	282
q15	q16	728	741	655	655
q17	1385	1333	1373	1333
q18	7916	7327	6909	6909
q19	1097	1099	1083	1083
q20	2229	2217	1952	1952
q21	5345	4626	4489	4489
q22	547	462	422	422
Total cold run time: 57257 ms
Total hot run time: 51751 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 172628 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 4b7bca5c076ecf78d3b808aa8d68f2c06634044f, data reload: false

query5	4347	671	526	526
query6	333	217	201	201
query7	4246	544	333	333
query8	331	236	212	212
query9	8838	4145	4097	4097
query10	444	340	306	306
query11	5777	2491	2269	2269
query12	182	132	137	132
query13	1276	619	447	447
query14	6156	5578	5244	5244
query14_1	4552	4512	4597	4512
query15	215	207	180	180
query16	998	461	444	444
query17	1183	754	608	608
query18	2563	489	368	368
query19	216	207	169	169
query20	136	131	130	130
query21	232	139	120	120
query22	13703	13605	13354	13354
query23	17177	16571	16278	16278
query23_1	16449	16385	16345	16345
query24	7456	1773	1356	1356
query24_1	1342	1326	1360	1326
query25	570	500	439	439
query26	1311	327	182	182
query27	2753	557	347	347
query28	4423	2026	1995	1995
query29	1027	652	520	520
query30	308	241	202	202
query31	1136	1092	966	966
query32	92	75	74	74
query33	547	368	295	295
query34	1165	1168	670	670
query35	778	797	707	707
query36	1449	1437	1264	1264
query37	151	100	84	84
query38	3198	3182	3082	3082
query39	953	914	892	892
query39_1	876	871	894	871
query40	227	142	124	124
query41	64	63	62	62
query42	110	109	108	108
query43	328	342	303	303
query44	
query45	215	198	198	198
query46	1087	1187	748	748
query47	2422	2351	2243	2243
query48	380	432	291	291
query49	628	490	382	382
query50	987	345	247	247
query51	4369	4423	4311	4311
query52	104	105	92	92
query53	252	298	209	209
query54	313	310	260	260
query55	92	91	89	89
query56	300	292	290	290
query57	1428	1414	1337	1337
query58	303	266	269	266
query59	1626	1639	1443	1443
query60	313	325	305	305
query61	152	157	157	157
query62	698	658	588	588
query63	233	204	211	204
query64	2388	806	628	628
query65	
query66	1713	475	345	345
query67	29678	29750	29540	29540
query68	
query69	465	340	299	299
query70	1034	1033	967	967
query71	302	312	268	268
query72	3012	2665	2404	2404
query73	885	772	449	449
query74	5097	5003	4781	4781
query75	2687	2606	2276	2276
query76	2325	1146	789	789
query77	398	413	333	333
query78	12417	12467	11792	11792
query79	1531	1075	766	766
query80	1358	541	448	448
query81	528	281	239	239
query82	939	160	118	118
query83	361	277	247	247
query84	272	138	107	107
query85	947	528	454	454
query86	456	325	346	325
query87	3459	3401	3288	3288
query88	3619	2757	2720	2720
query89	451	392	341	341
query90	1920	186	182	182
query91	178	169	139	139
query92	82	76	70	70
query93	1580	1370	974	974
query94	734	350	293	293
query95	683	478	354	354
query96	1024	802	371	371
query97	2733	2730	2630	2630
query98	234	225	228	225
query99	1164	1152	1032	1032
Total cold run time: 255087 ms
Total hot run time: 172628 ms

…x in BM25 variant repro

### What problem does this PR solve?

Issue Number: close #N/A (follow-up to DORIS-25510 / PR apache#63692)

Problem Summary:

`test_bm25_score_variant_boolean_subcolumn` set
`enable_match_without_inverted_index=false`, which makes BE reject any
match on a column missing a fulltext inverted index BEFORE reaching the
BM25 collection path:

```
SQLException: errCode = 2,
detailMessage = match_any not support execute_match
                failed to initialize storage reader.
```

That short-circuits the entire test — we never exercise the
`process_segment` code path the DORIS-25510 fix is about. The original
reproducer in the Jira ticket did not set that flag, so its default
(true) is what triggers the BM25 stats collection on the variant
sub-column with the now-fixed try/catch.

Drop the strict-mode setting; the predicate still returns no rows in
segments where v.c has no fulltext index, and now BM25 collection runs
the path under test.

### Release note

None (test-only refinement).

### Check List (For Author)

- Test:
    - Regression-test only update.
- Behavior changed: No
- Does this need documentation: No
@airborne12
Member Author

run buildall

@hello-stephen
Contributor

TPC-H: Total hot run time: 31909 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 4b2e683b1dbe4fc37e3d651dd5892549b19927e3, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17683	4066	4089	4066
q2	q3	10788	1376	811	811
q4	4683	484	350	350
q5	7571	2298	2105	2105
q6	250	181	139	139
q7	932	785	652	652
q8	9363	1784	1609	1609
q9	5209	4973	4994	4973
q10	6401	2196	1837	1837
q11	439	279	251	251
q12	639	430	302	302
q13	18108	3380	2818	2818
q14	271	261	238	238
q15	q16	820	773	709	709
q17	1000	950	972	950
q18	7117	5715	5644	5644
q19	1348	1262	1239	1239
q20	594	488	286	286
q21	6253	2891	2616	2616
q22	596	365	314	314
Total cold run time: 100065 ms
Total hot run time: 31909 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4911	4876	4842	4842
q2	q3	4977	5227	4690	4690
q4	2128	2222	1440	1440
q5	5059	4730	4744	4730
q6	226	175	128	128
q7	1883	1805	1555	1555
q8	2437	2184	2175	2175
q9	7899	7429	7469	7429
q10	4737	4695	4271	4271
q11	538	382	356	356
q12	738	753	542	542
q13	3063	3426	2772	2772
q14	287	284	250	250
q15	q16	679	700	619	619
q17	1309	1286	1281	1281
q18	7295	6790	6739	6739
q19	1103	1085	1123	1085
q20	2243	2229	1955	1955
q21	5362	4604	4526	4526
q22	509	471	407	407
Total cold run time: 57383 ms
Total hot run time: 51792 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 171516 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 4b2e683b1dbe4fc37e3d651dd5892549b19927e3, data reload: false

query5	4324	668	534	534
query6	336	219	196	196
query7	4226	581	321	321
query8	323	235	220	220
query9	8834	4020	4016	4016
query10	457	345	294	294
query11	5739	2472	2283	2283
query12	188	127	125	125
query13	1263	610	435	435
query14	6092	5540	5210	5210
query14_1	4496	4512	4480	4480
query15	215	208	183	183
query16	991	482	439	439
query17	962	745	623	623
query18	2449	492	362	362
query19	230	214	167	167
query20	136	133	129	129
query21	213	137	118	118
query22	13635	13607	13435	13435
query23	17451	16618	16282	16282
query23_1	16341	16265	16353	16265
query24	7477	1783	1325	1325
query24_1	1316	1299	1332	1299
query25	584	496	442	442
query26	1336	338	170	170
query27	2689	572	345	345
query28	4464	1982	1997	1982
query29	1030	665	514	514
query30	311	251	198	198
query31	1130	1088	973	973
query32	83	75	77	75
query33	555	364	334	334
query34	1155	1163	671	671
query35	777	792	707	707
query36	1446	1480	1308	1308
query37	154	108	99	99
query38	3249	3167	3085	3085
query39	918	917	892	892
query39_1	888	871	890	871
query40	228	142	122	122
query41	64	62	62	62
query42	109	113	107	107
query43	335	331	288	288
query44	
query45	216	206	194	194
query46	1065	1235	740	740
query47	2374	2460	2245	2245
query48	407	426	307	307
query49	632	488	383	383
query50	1022	352	258	258
query51	4373	4271	4207	4207
query52	104	104	93	93
query53	250	287	201	201
query54	310	269	251	251
query55	93	91	87	87
query56	296	311	304	304
query57	1444	1437	1334	1334
query58	294	268	264	264
query59	1592	1687	1423	1423
query60	320	324	304	304
query61	156	163	157	157
query62	702	649	568	568
query63	249	207	204	204
query64	2415	804	607	607
query65	
query66	1739	485	350	350
query67	29747	29622	29563	29563
query68	
query69	465	339	301	301
query70	1063	994	990	990
query71	342	270	266	266
query72	2971	2716	2443	2443
query73	840	835	409	409
query74	5087	4925	4823	4823
query75	2707	2617	2275	2275
query76	2297	1138	774	774
query77	411	411	343	343
query78	12434	12503	11812	11812
query79	1492	1049	774	774
query80	636	534	451	451
query81	451	277	241	241
query82	1380	156	119	119
query83	364	284	248	248
query84	303	137	112	112
query85	882	547	447	447
query86	396	364	304	304
query87	3466	3390	3235	3235
query88	3657	2754	2752	2752
query89	447	394	346	346
query90	1970	180	182	180
query91	181	168	138	138
query92	82	79	71	71
query93	1449	1624	865	865
query94	569	359	316	316
query95	673	467	349	349
query96	1042	789	357	357
query97	2718	2728	2607	2607
query98	240	227	230	227
query99	1197	1159	1030	1030
Total cold run time: 253997 ms
Total hot run time: 171516 ms

…variant test

### What problem does this PR solve?

Issue Number: close #N/A (follow-up to DORIS-25510 / PR apache#63692)

Problem Summary:

The happy-path query in `test_bm25_score_variant_boolean_subcolumn`
(score() on the string sub-column `v.a`) used `order by id` with no
LIMIT. FE rejects it:

```
SQLException: errCode = 2,
detailMessage = score() function requires WHERE clause with MATCH
                function, ORDER BY and LIMIT for optimization
```

Switch to `order by score() limit 10` (same shape as the negative-case
query earlier in the file) so the score() TopN push-down is exercised
and the BM25 stats collection path on a fulltext sub-column is verified.

### Release note

None (test-only refinement).

### Check List (For Author)

- Test:
    - Regression-test only update.
- Behavior changed: No
- Does this need documentation: No
@airborne12
Member Author

run buildall

@hello-stephen
Contributor

TPC-H: Total hot run time: 31516 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 241377ea6c291cae6a3d9ce9022bb60c7fa1ee28, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17667	3959	3975	3959
q2	
q3	10802	1380	818	818
q4	4685	473	345	345
q5	7578	2282	2090	2090
q6	256	175	138	138
q7	952	771	628	628
q8	9478	1720	1675	1675
q9	6964	4977	4964	4964
q10	6453	2256	1877	1877
q11	433	272	253	253
q12	699	421	292	292
q13	18233	3465	2805	2805
q14	267	256	238	238
q15	
q16	827	791	707	707
q17	996	995	872	872
q18	6740	5728	5455	5455
q19	1184	1451	1108	1108
q20	551	428	271	271
q21	6075	2757	2716	2716
q22	461	383	305	305
Total cold run time: 101301 ms
Total hot run time: 31516 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4891	4741	5030	4741
q2	
q3	4899	5295	4622	4622
q4	2157	2199	1434	1434
q5	4913	4727	4751	4727
q6	235	185	132	132
q7	1929	1788	1601	1601
q8	2269	1989	1953	1953
q9	7460	7478	7458	7458
q10	4769	4693	4281	4281
q11	540	390	381	381
q12	733	750	540	540
q13	3099	3409	2781	2781
q14	278	289	257	257
q15	
q16	687	706	619	619
q17	1285	1268	1264	1264
q18	7237	7019	6916	6916
q19	1092	1107	1118	1107
q20	2235	2256	1962	1962
q21	5296	4604	4537	4537
q22	523	453	437	437
Total cold run time: 56527 ms
Total hot run time: 51750 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 171465 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 241377ea6c291cae6a3d9ce9022bb60c7fa1ee28, data reload: false

query5	4302	680	521	521
query6	325	218	201	201
query7	4225	611	301	301
query8	321	232	215	215
query9	8838	4047	4044	4044
query10	440	355	292	292
query11	5770	2498	2311	2311
query12	179	125	128	125
query13	1278	626	387	387
query14	6193	5550	5227	5227
query14_1	4497	4541	4486	4486
query15	208	204	184	184
query16	1005	449	433	433
query17	1117	720	585	585
query18	2647	468	348	348
query19	215	195	161	161
query20	137	131	129	129
query21	219	143	114	114
query22	13593	13632	13329	13329
query23	17355	16434	16160	16160
query23_1	16381	16352	16326	16326
query24	7511	1776	1314	1314
query24_1	1290	1328	1306	1306
query25	544	485	469	469
query26	1300	317	175	175
query27	2696	570	362	362
query28	4474	1991	2003	1991
query29	966	629	533	533
query30	303	247	193	193
query31	1135	1084	977	977
query32	93	77	71	71
query33	546	343	292	292
query34	1156	1103	666	666
query35	777	798	700	700
query36	1374	1395	1231	1231
query37	160	102	90	90
query38	3212	3144	3073	3073
query39	918	933	907	907
query39_1	870	912	899	899
query40	237	148	125	125
query41	66	65	61	61
query42	111	112	119	112
query43	334	334	289	289
query44	
query45	218	207	200	200
query46	1113	1232	767	767
query47	2447	2420	2337	2337
query48	413	451	326	326
query49	635	522	399	399
query50	1048	364	262	262
query51	4398	4310	4294	4294
query52	105	104	92	92
query53	259	290	203	203
query54	319	270	254	254
query55	91	91	87	87
query56	293	318	314	314
query57	1436	1432	1332	1332
query58	310	283	275	275
query59	1575	1688	1458	1458
query60	332	341	327	327
query61	180	182	181	181
query62	695	661	591	591
query63	248	205	209	205
query64	2420	860	693	693
query65	
query66	1700	497	369	369
query67	29758	29645	29614	29614
query68	
query69	481	377	338	338
query70	1059	1040	1002	1002
query71	307	279	278	278
query72	3145	2671	2370	2370
query73	833	723	448	448
query74	5150	5017	4777	4777
query75	2725	2607	2267	2267
query76	2285	1122	772	772
query77	414	415	340	340
query78	12442	12346	11940	11940
query79	1433	1061	771	771
query80	653	547	470	470
query81	453	286	241	241
query82	1352	157	123	123
query83	364	282	248	248
query84	256	140	113	113
query85	896	553	459	459
query86	404	333	311	311
query87	3443	3400	3247	3247
query88	3659	2732	2771	2732
query89	450	389	341	341
query90	1955	191	185	185
query91	181	172	145	145
query92	78	77	76	76
query93	1520	1485	954	954
query94	544	358	291	291
query95	716	377	437	377
query96	1071	836	326	326
query97	2738	2711	2650	2650
query98	231	225	226	225
query99	1191	1147	1033	1033
Total cold run time: 254743 ms
Total hot run time: 171465 ms

### What problem does this PR solve?

Issue Number: close #N/A (follow-up to DORIS-25510 / PR apache#63692)

Problem Summary:

The string-subcolumn happy-path query asserted `score() > 0`, but the
DORIS-25510 fix in `process_segment` is allowed to skip a segment whose
on-disk inverted index is BKD/numeric. That means the BM25 stats for the
sub-column can come from a subset of segments — including, in some
schedules, none of them — and the resulting score can legitimately be
0.0 / NaN / null without the BE having crashed.

This regression's purpose is "BE survives the query and returns the
expected row", not "BM25 produces a particular score for these three
rows". Replace `assertTrue(score > 0)` with `assertNotNull(score)`:
that still proves the score() pipeline didn't abort the BE, which is
what DORIS-25510 is about.
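
To illustrate why a strict `score() > 0` assertion is fragile, here is a minimal sketch of a textbook BM25 term score computed from statistics aggregated only over the segments that were not skipped. It is not the Doris implementation: `SegmentStats`, `CollectionStats`, `aggregate`, and `bm25` are hypothetical names, and `k1 = 1.2`, `b = 0.75` are the conventional defaults, assumed here for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical per-segment statistics; not Doris's actual types.
struct SegmentStats {
    bool fulltext;            // false when the on-disk entry is BKD/numeric
    std::uint64_t doc_count;  // documents in the segment
    std::uint64_t total_len;  // total number of terms in the field
    std::uint64_t doc_freq;   // documents containing the query term
};

// Hypothetical collection-wide totals used by the scorer.
struct CollectionStats {
    std::uint64_t doc_count = 0;
    std::uint64_t total_len = 0;
    std::uint64_t doc_freq = 0;
};

// Sum the statistics, skipping segments that have no fulltext index.
inline CollectionStats aggregate(const std::vector<SegmentStats>& segments) {
    CollectionStats out;
    for (const auto& s : segments) {
        if (!s.fulltext) continue;  // skipped segment contributes nothing
        out.doc_count += s.doc_count;
        out.total_len += s.total_len;
        out.doc_freq += s.doc_freq;
    }
    return out;
}

// Textbook BM25 score of one term in one document (Lucene-style idf).
inline double bm25(const CollectionStats& c, double tf, double doc_len,
                   double k1 = 1.2, double b = 0.75) {
    assert(k1 >= 0.0 && b >= 0.0 && b <= 1.0);
    const double n = static_cast<double>(c.doc_count);
    const double df = static_cast<double>(c.doc_freq);
    const double idf = std::log(1.0 + (n - df + 0.5) / (df + 0.5));
    // With no contributing segment this is 0/0, which is NaN under IEEE 754.
    const double avgdl = static_cast<double>(c.total_len) / n;
    const double norm = tf + k1 * (1.0 - b + b * doc_len / avgdl);
    return idf * tf * (k1 + 1.0) / norm;
}
```

In this sketch, one fulltext segment plus one skipped BKD segment still yields a positive score, while a collection in which every segment is skipped yields NaN through the undefined average document length. A test that only checks that the query returned a row with a score value is therefore insensitive to which segments contributed.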

### Release note

None (test-only refinement).

### Check List (For Author)

- Test:
    - Regression-test only update.
- Behavior changed: No
- Does this need documentation: No
@airborne12
Member Author

run buildall

@hello-stephen
Contributor

TPC-H: Total hot run time: 31432 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 45771025f8b234ca5ad677bc61de9a7ac104c641, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17711	4091	4119	4091
q2	
q3	10801	1401	791	791
q4	4685	479	345	345
q5	7598	2261	2086	2086
q6	243	178	139	139
q7	972	771	660	660
q8	9365	1761	1717	1717
q9	6742	4932	4966	4932
q10	6455	2287	1884	1884
q11	433	269	240	240
q12	687	436	294	294
q13	18241	3332	2772	2772
q14	269	261	241	241
q15	
q16	822	777	708	708
q17	982	992	932	932
q18	7272	5682	5524	5524
q19	1213	1247	1109	1109
q20	519	409	268	268
q21	5761	2597	2392	2392
q22	433	362	307	307
Total cold run time: 101204 ms
Total hot run time: 31432 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4434	4386	4444	4386
q2	
q3	4571	4950	4333	4333
q4	2142	2271	1484	1484
q5	4497	4359	4908	4359
q6	263	204	145	145
q7	2053	1935	1646	1646
q8	2538	2270	2253	2253
q9	8095	8100	8089	8089
q10	4881	4793	4335	4335
q11	624	424	386	386
q12	822	757	586	586
q13	3302	3680	2927	2927
q14	301	313	286	286
q15	
q16	730	754	645	645
q17	1418	1415	1375	1375
q18	7994	7480	6886	6886
q19	1093	1106	1127	1106
q20	2238	2229	1963	1963
q21	5409	4684	4598	4598
q22	538	472	405	405
Total cold run time: 57943 ms
Total hot run time: 52193 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 172030 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 45771025f8b234ca5ad677bc61de9a7ac104c641, data reload: false

query5	4320	651	507	507
query6	340	213	197	197
query7	4264	583	313	313
query8	348	246	217	217
query9	8819	4034	4014	4014
query10	457	329	320	320
query11	5764	2423	2265	2265
query12	179	127	126	126
query13	1286	574	425	425
query14	6144	5506	5245	5245
query14_1	4548	4535	4537	4535
query15	215	207	184	184
query16	995	447	453	447
query17	1138	743	583	583
query18	2723	495	346	346
query19	207	201	156	156
query20	137	132	133	132
query21	213	132	119	119
query22	13633	13669	13444	13444
query23	17372	16620	16245	16245
query23_1	16332	16437	16281	16281
query24	7440	1763	1329	1329
query24_1	1341	1308	1330	1308
query25	543	464	419	419
query26	1307	318	178	178
query27	2703	559	352	352
query28	4417	1984	1954	1954
query29	1015	622	489	489
query30	308	241	198	198
query31	1158	1087	948	948
query32	90	73	70	70
query33	522	342	298	298
query34	1192	1121	652	652
query35	782	820	709	709
query36	1402	1379	1289	1289
query37	153	112	90	90
query38	3207	3161	3084	3084
query39	938	905	905	905
query39_1	893	881	909	881
query40	221	143	126	126
query41	64	61	62	61
query42	109	108	109	108
query43	325	349	299	299
query44	
query45	213	207	198	198
query46	1090	1269	751	751
query47	2380	2321	2365	2321
query48	415	433	302	302
query49	640	493	393	393
query50	1065	366	264	264
query51	4453	4333	4284	4284
query52	103	106	95	95
query53	257	285	207	207
query54	317	277	267	267
query55	95	92	91	91
query56	299	310	316	310
query57	1477	1419	1394	1394
query58	327	278	267	267
query59	1607	1715	1435	1435
query60	326	324	311	311
query61	161	158	156	156
query62	689	654	586	586
query63	240	195	204	195
query64	2360	851	619	619
query65	
query66	1661	467	352	352
query67	29770	29747	28987	28987
query68	
query69	456	340	311	311
query70	1072	1016	970	970
query71	305	266	268	266
query72	3119	2751	2428	2428
query73	858	787	415	415
query74	5127	4954	4808	4808
query75	2728	2619	2274	2274
query76	2277	1171	788	788
query77	409	416	346	346
query78	12340	12356	11931	11931
query79	1240	1027	758	758
query80	579	563	462	462
query81	450	283	240	240
query82	241	157	120	120
query83	362	280	248	248
query84	262	141	113	113
query85	869	535	457	457
query86	383	352	334	334
query87	3433	3371	3273	3273
query88	3619	2741	2728	2728
query89	430	408	341	341
query90	2135	194	183	183
query91	178	163	138	138
query92	77	76	73	73
query93	1397	1420	908	908
query94	531	366	303	303
query95	670	473	342	342
query96	1130	806	348	348
query97	2736	2708	2630	2630
query98	243	231	236	231
query99	1185	1151	1034	1034
Total cold run time: 253316 ms
Total hot run time: 172030 ms
