[fix](be) Use debug point for string overflow conversion #63392

Merged
zclllyybb merged 1 commit into apache:master from zclllyybb:codex-d25531-debugpoint-overflow on May 19, 2026

@zclllyybb
Contributor

Remove the mutable BE config used to force string overflow behavior and keep the production string column limit as a named constant with the original uint32 max value. The conversion helper now reads a scoped debug point override only when debug points are enabled, so tests can exercise the ColumnString64 conversion path without changing global BE config.
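
The override pattern described above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the Doris code: `DebugPointRegistry`, `effective_string_limit`, `needs_wide_offsets`, and the debug point name `string_overflow_limit` are all hypothetical stand-ins. Only the idea is taken from the description: a constant equal to the uint32 maximum, overridden only while debug points are enabled.

```cpp
#include <cstdint>
#include <limits>
#include <map>
#include <optional>
#include <string>

// Production limit: the largest byte count addressable with 32-bit offsets.
constexpr uint64_t kMaxStringColumnBytes = std::numeric_limits<uint32_t>::max();

// Hypothetical stand-in for a debug point registry (not the Doris API).
class DebugPointRegistry {
public:
    void set_enabled(bool on) { _enabled = on; }
    void add(const std::string& name, uint64_t value) { _points[name] = value; }
    void remove(const std::string& name) { _points.erase(name); }

    // Returns the override only when debug points are enabled and the point is set.
    std::optional<uint64_t> lookup(const std::string& name) const {
        if (!_enabled) return std::nullopt;
        auto it = _points.find(name);
        if (it == _points.end()) return std::nullopt;
        return it->second;
    }

private:
    bool _enabled = false;
    std::map<std::string, uint64_t> _points;
};

// Limit the conversion check compares against: the constant unless overridden.
uint64_t effective_string_limit(const DebugPointRegistry& reg) {
    // "string_overflow_limit" is an assumed name used only for this sketch.
    return reg.lookup("string_overflow_limit").value_or(kMaxStringColumnBytes);
}

// True when the character buffer no longer fits behind 32-bit offsets.
bool needs_wide_offsets(uint64_t chars_size, const DebugPointRegistry& reg) {
    return chars_size > effective_string_limit(reg);
}
```

With this shape a test can add the point, run a query that hits the conversion path, and remove the point again, while production builds with debug points disabled always see the constant.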

The fault-injection regression now validates through qt_sql baselines before and after enabling the debug point. Its query builds the string join key from convert_tz output, which drives the same string-column conversion path while keeping the expected results deterministic.

Also keep the existing conversion-focused BE unit coverage on debug point injection rather than global config mutation, and preserve the overflow conversion loop guard so forced small limits do not read before the first offset.
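
To illustrate why a guard matters when the limit is forced very low, here is a hedged sketch of widening 32-bit end offsets to 64-bit. It is not the Doris conversion loop; `widen_offsets` is a hypothetical helper, and it assumes every individual row is shorter than 2^32 bytes. The previous offset is carried in a local variable, so the first iteration never reads an element before index 0.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: widen end offsets stored as uint32_t (which may have
// wrapped around) into uint64_t end offsets.
std::vector<uint64_t> widen_offsets(const std::vector<uint32_t>& offsets) {
    std::vector<uint64_t> wide(offsets.size());
    uint32_t prev_narrow = 0; // narrow end offset of the previous row (0 before row 0)
    uint64_t prev_wide = 0;   // widened end offset of the previous row
    for (size_t i = 0; i < offsets.size(); ++i) {
        // Unsigned subtraction yields the row length modulo 2^32, which is the
        // true length even if offsets[i] wrapped, given the assumption above.
        uint32_t len = offsets[i] - prev_narrow;
        wide[i] = prev_wide + len;
        prev_narrow = offsets[i];
        prev_wide = wide[i];
    }
    return wide;
}
```

Because the loop works for any input, including an empty column or one whose very first row already exceeds a forced small limit, a test-only limit cannot push it into an out-of-bounds read.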

Verification:

- build-support/clang-format.sh

- ./run-be-ut.sh --run --filter=ColumnStringTest.convert_column_if_overflow:HybridSetTest.StringValueSet:MinmaxPredicateTest.String

- ./build.sh --be --fe

- ./run-regression-test.sh --run -d fault_injection_p0 -s test_doris_25531_string_overflow_fault_injection -forceGenOut

- ./run-regression-test.sh --run -d fault_injection_p0 -s test_doris_25531_string_overflow_fault_injection
@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@zclllyybb
Contributor Author

run buildall

@zclllyybb
Contributor Author

/review

@hello-stephen
Contributor

TPC-H: Total hot run time: 31508 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 1e3921320cabaa5d29899f7ccc3e8be04cf63f7d, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17641	3943	3865	3865
q2
q3	10802	1415	823	823
q4	4708	477	345	345
q5	8278	2263	2136	2136
q6	375	182	139	139
q7	963	788	636	636
q8	9404	1695	1666	1666
q9	7068	4922	4921	4921
q10	6452	2065	1862	1862
q11	445	273	252	252
q12	668	443	301	301
q13	18124	3429	2816	2816
q14	261	258	239	239
q15
q16	819	788	712	712
q17	940	882	919	882
q18	7213	5807	5618	5618
q19	1160	1415	1091	1091
q20	527	412	277	277
q21	5872	2722	2613	2613
q22	458	373	314	314
Total cold run time: 102178 ms
Total hot run time: 31508 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4603	4554	4766	4554
q2
q3	4863	5233	4646	4646
q4	2387	2213	1446	1446
q5	4992	4689	4658	4658
q6	232	185	141	141
q7	1868	1759	1398	1398
q8	2210	1930	1902	1902
q9	7233	7268	7167	7167
q10	4491	4429	3956	3956
q11	549	394	352	352
q12	720	720	519	519
q13	3080	3417	2744	2744
q14	277	279	254	254
q15
q16	679	700	606	606
q17	1279	1252	1247	1247
q18	7355	6747	6922	6747
q19	1169	1097	1130	1097
q20	2218	2238	1937	1937
q21	5365	4634	4581	4581
q22	540	453	411	411
Total cold run time: 56110 ms
Total hot run time: 50363 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 170360 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 1e3921320cabaa5d29899f7ccc3e8be04cf63f7d, data reload: false

query5	4320	656	529	529
query6	333	225	206	206
query7	4326	582	286	286
query8	326	242	222	222
query9	8807	4066	4020	4020
query10	445	367	296	296
query11	5840	2386	2262	2262
query12	189	136	128	128
query13	1289	601	428	428
query14	5979	5359	5052	5052
query14_1	4338	4320	4356	4320
query15	217	201	185	185
query16	984	456	435	435
query17	1115	730	604	604
query18	2453	533	375	375
query19	226	213	174	174
query20	147	145	137	137
query21	214	148	125	125
query22	13621	13531	13372	13372
query23	17288	16431	16171	16171
query23_1	16120	16227	16126	16126
query24	7624	1773	1294	1294
query24_1	1305	1329	1349	1329
query25	575	508	440	440
query26	1300	324	183	183
query27	2721	577	346	346
query28	4445	1973	1988	1973
query29	1023	641	538	538
query30	300	242	211	211
query31	1142	1085	953	953
query32	94	81	81	81
query33	539	370	312	312
query34	1181	1118	664	664
query35	774	794	689	689
query36	1333	1372	1214	1214
query37	161	109	94	94
query38	3248	3167	3061	3061
query39	934	954	914	914
query39_1	894	890	900	890
query40	237	158	138	138
query41	73	70	69	69
query42	118	116	113	113
query43	331	342	297	297
query44	
query45	221	207	198	198
query46	1144	1232	747	747
query47	2363	2344	2152	2152
query48	412	433	292	292
query49	652	514	408	408
query50	1023	344	263	263
query51	4384	4298	4270	4270
query52	112	108	111	108
query53	269	284	216	216
query54	331	296	271	271
query55	96	93	89	89
query56	332	317	319	317
query57	1419	1409	1315	1315
query58	318	325	271	271
query59	1566	1607	1414	1414
query60	323	322	323	322
query61	160	154	157	154
query62	663	639	569	569
query63	246	201	205	201
query64	2433	826	629	629
query65	
query66	1722	479	361	361
query67	30012	29989	29941	29941
query68	
query69	467	347	310	310
query70	1011	1000	994	994
query71	305	278	271	271
query72	3052	2718	2486	2486
query73	846	807	406	406
query74	5057	4914	4727	4727
query75	2678	2635	2291	2291
query76	2273	1139	763	763
query77	414	417	335	335
query78	12343	12288	11648	11648
query79	1477	1055	750	750
query80	852	553	462	462
query81	483	282	244	244
query82	1310	163	129	129
query83	319	285	248	248
query84	257	139	108	108
query85	904	544	464	464
query86	448	334	327	327
query87	3453	3393	3202	3202
query88	3561	2691	2668	2668
query89	459	384	341	341
query90	1792	188	187	187
query91	177	165	145	145
query92	80	76	76	76
query93	1575	1518	881	881
query94	629	366	297	297
query95	671	487	349	349
query96	1036	757	352	352
query97	2706	2687	2548	2548
query98	239	233	227	227
query99	1128	1124	1009	1009
Total cold run time: 254178 ms
Total hot run time: 170360 ms

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (10/10) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.66% (27927/37914)
Line Coverage 57.59% (302797/525745)
Region Coverage 54.86% (253830/462687)
Branch Coverage 56.33% (109514/194403)

@zclllyybb merged commit 7ac8160 into apache:master on May 19, 2026
32 of 33 checks passed
@zclllyybb deleted the codex-d25531-debugpoint-overflow branch on May 19, 2026 11:49