[fix](expr) fix mixed const probe constant handling regressions #63810

Open
Mryange wants to merge 2 commits into apache:master from Mryange:fix-some-const-exec-dev-5.28

Mryange commented May 28, 2026

What problem does this PR solve?

Problem Summary:

The mixed const execution probe exposed several constant-handling problems in BE vectorized functions.

  • ColumnConst::clone_resized reused the original nested column, so cloned const columns could still alias the source data.
  • quantile_percent requires its percentile argument to stay constant, but the all-const probe path unpacked it and triggered a false constant-check failure.
  • regexp_count accessed string columns directly and did not handle mixed const inputs correctly.
  • uniform still went through the default constant implementation even though its result depends on per-row seed values.

This change fixes those behaviors and adds focused unit tests for the uncovered cases.
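The clone_resized aliasing issue can be illustrated with a minimal sketch. ToyColumn and ToyConstColumn are hypothetical stand-ins, not the real Doris classes; the sketch only assumes that a const column holds one nested value plus a logical row count, and contrasts a clone that shares the nested column with one that owns a copy.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a one-row nested data column.
struct ToyColumn {
    std::vector<int> data;
};

// Hypothetical stand-in for a const column: one nested value plus a row count.
struct ToyConstColumn {
    std::shared_ptr<ToyColumn> nested;
    std::size_t rows;

    // Shallow variant: the clone points at the same nested column as the source,
    // so a write through either one is visible through the other.
    ToyConstColumn clone_resized_shared(std::size_t new_rows) const {
        return ToyConstColumn{nested, new_rows};
    }

    // Deep variant: the clone gets its own copy of the nested column.
    ToyConstColumn clone_resized_owned(std::size_t new_rows) const {
        return ToyConstColumn{std::make_shared<ToyColumn>(*nested), new_rows};
    }
};
```

With the shallow variant, mutating the clone's nested value also changes the source; with the deep variant the source is unaffected. Which of the two the real fix resembles is not stated here beyond "no longer aliases the source data".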

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (ideally with the specific error message) and how it was fixed.
  2. Which behaviors were modified: the previous behavior, the new behavior, why it changed, and the possible impact.
  3. What features were added and why.
  4. Which code was refactored and why.
  5. Which functions were optimized and how they differ before and after the optimization.

@Mryange
Contributor Author

Mryange commented May 28, 2026

/review

@Mryange
Contributor Author

Mryange commented May 28, 2026

/review

Contributor

github-actions bot left a comment

Automated review summary:

No blocking PR-introduced issues were found in the changed lines. Existing inline review context was empty, and there was no additional user-provided focus.

Critical checkpoint conclusions:

  • Goal/test proof: The PR targets mixed-const constant handling for ColumnConst cloning, quantile_percent, regexp_count, and uniform. The added unit tests cover each targeted behavior.
  • Scope: The code changes are small and focused on the described BE constant-handling paths.
  • Concurrency/lifecycle: No new shared state, locking, threads, or non-trivial lifecycle management were introduced. regexp_count continues to use FunctionContext thread-local regex state.
  • Configuration/compatibility/protocol: No new configs, storage formats, or FE/BE protocol changes.
  • Parallel paths: The changed paths match the specific functions affected by the mixed-const probe; no additional equivalent PR-introduced path requiring the same fix was identified.
  • Error handling: New production Status-returning paths are propagated through existing function execution wrappers; no unchecked production Status was introduced.
  • Tests/results: Added BE unit tests cover ColumnConst clone_resized aliasing, quantile_percent const percentile preservation, regexp_count mixed const/null combinations, and uniform mixed-const execution.
  • Observability/transactions/data writes: Not applicable; this PR does not add operational paths, transactions, or data-write logic.
  • Performance: The new ColumnView use is in regexp_count where regex work dominates per-row access overhead; no obvious hot-path regression was found.
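The mixed-const access pattern mentioned for regexp_count can be sketched as follows. ToyStringArg is a hypothetical stand-in and is not the Doris ColumnView API; std::regex is used here only as a convenient substitute for whatever regex engine the BE uses. The idea is that every read goes through an accessor that maps any row to row 0 when the argument is const, so the counting loop does not care which inputs are const.

```cpp
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>
#include <vector>

// Hypothetical read-only view over a string argument. A const argument stores
// a single value that logically applies to every row.
struct ToyStringArg {
    const std::vector<std::string>* values;
    bool is_const;

    const std::string& at(std::size_t row) const {
        return (*values)[is_const ? 0 : row];
    }
};

// Count regex matches in one string (std::regex as a stand-in engine).
std::size_t count_matches(const std::string& text, const std::string& pattern) {
    const std::regex re(pattern);
    const auto first = std::sregex_iterator(text.begin(), text.end(), re);
    return static_cast<std::size_t>(std::distance(first, std::sregex_iterator()));
}

// Row loop that works for any const / non-const combination of the two inputs.
std::vector<std::size_t> count_rows(const ToyStringArg& text,
                                    const ToyStringArg& pattern,
                                    std::size_t rows) {
    std::vector<std::size_t> out(rows);
    for (std::size_t i = 0; i < rows; ++i) {
        out[i] = count_matches(text.at(i), pattern.at(i));
    }
    return out;
}
```

Compiling the pattern per row keeps the sketch short; a real implementation would presumably cache the compiled pattern when the pattern argument is const.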

Focus response: No additional user focus was provided.

@Mryange
Contributor Author

Mryange commented May 28, 2026

run buildall

Contributor

github-actions bot left a comment

Review result: no blocking findings found in the actual PR diff.

Critical checkpoint conclusions:

  • Goal/test proof: The patch targets mixed constant execution regressions for ColumnConst cloning, regexp_count, quantile_percent, and uniform. The added BE unit tests cover the changed paths.
  • Scope/focus: The actual PR is small and focused across BE expression/column handling plus targeted tests.
  • Concurrency/lifecycle: No new concurrency, static initialization, or non-obvious lifecycle management was introduced.
  • Config/compatibility/protocol: No new configuration items, storage format changes, or FE-BE protocol changes.
  • Parallel paths: The changes are localized to the affected functions and ColumnConst clone behavior; no missed parallel path was identified in this review.
  • Error handling/memory: No ignored Status or new untracked large allocation issue identified.
  • Data correctness: The changes preserve constant column semantics and avoid sharing the nested data column after clone_resized.
  • Test coverage: Targeted BE unit tests were added for the changed behavior. I did not run tests in this review environment.
  • Observability/transactions/persistence: Not applicable to this PR.
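The uniform item from the PR description can be illustrated with a toy sketch of why a result that depends on a per-row seed should not be evaluated once and replicated. Everything here is hypothetical: toy_uniform is a deliberately trivial, deterministic stand-in (not a real random generator and not the Doris function), and eval_folded only imitates the general idea of a "compute one row, reuse it for all rows" shortcut.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in: maps a seed into [lo, hi]. Deterministic on purpose so
// the effect is easy to see; it is not a statistically uniform generator.
std::int64_t toy_uniform(std::int64_t lo, std::int64_t hi, std::uint64_t seed) {
    const std::uint64_t span = static_cast<std::uint64_t>(hi - lo + 1);
    return lo + static_cast<std::int64_t>(seed % span);
}

// Correct shape: evaluate every row with that row's seed.
std::vector<std::int64_t> eval_per_row(std::int64_t lo, std::int64_t hi,
                                       const std::vector<std::uint64_t>& seeds) {
    std::vector<std::int64_t> out;
    out.reserve(seeds.size());
    for (std::uint64_t s : seeds) out.push_back(toy_uniform(lo, hi, s));
    return out;
}

// Shortcut shape: evaluate row 0 only and replicate it, which is only valid
// when the result does not vary by row.
std::vector<std::int64_t> eval_folded(std::int64_t lo, std::int64_t hi,
                                      const std::vector<std::uint64_t>& seeds) {
    if (seeds.empty()) return {};
    return std::vector<std::int64_t>(seeds.size(), toy_uniform(lo, hi, seeds[0]));
}
```

For bounds 10 and 19 with seeds 1, 2, 3, the per-row path yields 11, 12, 13 while the folded path yields 11, 11, 11, which is the kind of divergence a constant shortcut would introduce.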

User focus: No additional user-provided review focus was present.

@hello-stephen
Contributor

TPC-H: Total hot run time: 31788 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit 5dc83dfa6c37886b59b8842f06672bd8af907788, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17595	4030	4015	4015
q2
q3	10757	1438	841	841
q4	4692	479	355	355
q5	7608	2315	2180	2180
q6	241	176	137	137
q7	1020	770	661	661
q8	9354	1701	1680	1680
q9	5625	4958	5011	4958
q10	6474	2258	1901	1901
q11	453	277	242	242
q12	694	434	296	296
q13	18251	3353	2774	2774
q14	263	265	240	240
q15
q16	834	789	717	717
q17	1010	980	923	923
q18	6923	5911	5706	5706
q19	1172	1309	1148	1148
q20	536	403	274	274
q21	5737	2665	2439	2439
q22	449	362	301	301
Total cold run time: 99688 ms
Total hot run time: 31788 ms
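The bot does not label the columns, so the following is an inference from the numbers rather than a documented format: each row appears to be query name, cold run, two hot runs, and the faster of the two hot runs, and the totals appear to be column sums over the queries that reported results (q2 and q15 show no timings). A minimal sketch that reproduces the Round 1 totals under that assumption:

```cpp
#include <algorithm>
#include <vector>

// One reported row: cold run plus two hot runs, all in milliseconds.
struct QueryTiming {
    const char* name;
    long cold_ms;
    long hot1_ms;
    long hot2_ms;
};

// Round 1 rows copied from the table above (q2 and q15 have no timings).
std::vector<QueryTiming> round1_rows() {
    return {
        {"q1", 17595, 4030, 4015},  {"q3", 10757, 1438, 841},
        {"q4", 4692, 479, 355},     {"q5", 7608, 2315, 2180},
        {"q6", 241, 176, 137},      {"q7", 1020, 770, 661},
        {"q8", 9354, 1701, 1680},   {"q9", 5625, 4958, 5011},
        {"q10", 6474, 2258, 1901},  {"q11", 453, 277, 242},
        {"q12", 694, 434, 296},     {"q13", 18251, 3353, 2774},
        {"q14", 263, 265, 240},     {"q16", 834, 789, 717},
        {"q17", 1010, 980, 923},    {"q18", 6923, 5911, 5706},
        {"q19", 1172, 1309, 1148},  {"q20", 536, 403, 274},
        {"q21", 5737, 2665, 2439},  {"q22", 449, 362, 301},
    };
}

// Sum of the cold-run column.
long total_cold(const std::vector<QueryTiming>& rows) {
    long sum = 0;
    for (const QueryTiming& q : rows) sum += q.cold_ms;
    return sum;
}

// Sum of the faster hot run per query.
long total_hot(const std::vector<QueryTiming>& rows) {
    long sum = 0;
    for (const QueryTiming& q : rows) sum += std::min(q.hot1_ms, q.hot2_ms);
    return sum;
}
```

Under this reading, total_cold gives 99688 and total_hot gives 31788 for Round 1, matching the two totals reported above.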

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4394	4275	4298	4275
q2
q3	4591	4953	4410	4410
q4	2110	2209	1397	1397
q5	4422	4298	4659	4298
q6	256	214	149	149
q7	2191	1972	1681	1681
q8	2557	2162	2225	2162
q9	8271	8070	7955	7955
q10	4950	4777	4315	4315
q11	626	440	393	393
q12	752	783	560	560
q13	3256	3651	2867	2867
q14	311	298	280	280
q15
q16	726	743	643	643
q17	1532	1347	1324	1324
q18	7917	7370	7165	7165
q19	1116	1076	1130	1076
q20	2244	2236	1952	1952
q21	5331	4656	4517	4517
q22	541	478	409	409
Total cold run time: 58094 ms
Total hot run time: 51828 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 172277 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 5dc83dfa6c37886b59b8842f06672bd8af907788, data reload: false

query5	4343	662	515	515
query6	340	220	200	200
query7	4257	550	287	287
query8	325	234	221	221
query9	8791	4098	4091	4091
query10	450	348	300	300
query11	5768	2378	2203	2203
query12	188	130	125	125
query13	1311	623	403	403
query14	6044	5464	5123	5123
query14_1	4432	4444	4412	4412
query15	215	206	183	183
query16	997	429	439	429
query17	952	721	591	591
query18	2466	490	356	356
query19	214	200	158	158
query20	137	134	131	131
query21	217	140	121	121
query22	13663	13541	13366	13366
query23	17329	16778	16109	16109
query23_1	16321	16350	16504	16350
query24	7501	1759	1349	1349
query24_1	1340	1348	1350	1348
query25	587	507	453	453
query26	1317	325	175	175
query27	2692	569	371	371
query28	4557	2106	2065	2065
query29	1019	678	524	524
query30	308	244	200	200
query31	1126	1092	963	963
query32	109	82	78	78
query33	557	363	305	305
query34	1196	1161	661	661
query35	767	825	714	714
query36	1407	1400	1253	1253
query37	160	110	96	96
query38	3213	3158	3088	3088
query39	946	928	932	928
query39_1	882	873	887	873
query40	238	155	130	130
query41	72	69	68	68
query42	111	110	113	110
query43	333	340	301	301
query44	
query45	220	210	213	210
query46	1109	1167	744	744
query47	2423	2404	2256	2256
query48	410	419	308	308
query49	661	509	412	412
query50	943	336	263	263
query51	4416	4300	4255	4255
query52	107	109	96	96
query53	262	308	217	217
query54	327	295	286	286
query55	101	96	98	96
query56	326	337	324	324
query57	1451	1436	1343	1343
query58	327	295	285	285
query59	1629	1681	1476	1476
query60	336	373	332	332
query61	184	181	177	177
query62	707	686	598	598
query63	263	202	209	202
query64	2485	894	629	629
query65	
query66	1772	494	358	358
query67	29727	29714	29740	29714
query68	
query69	463	346	312	312
query70	1116	1002	1022	1002
query71	304	266	262	262
query72	3001	2486	2416	2416
query73	877	778	418	418
query74	5124	4976	4819	4819
query75	2699	2616	2300	2300
query76	2297	1143	814	814
query77	413	398	334	334
query78	12380	12485	11933	11933
query79	1531	1080	750	750
query80	661	540	463	463
query81	483	286	247	247
query82	1340	162	130	130
query83	351	270	284	270
query84	305	138	114	114
query85	874	544	452	452
query86	460	358	334	334
query87	3445	3381	3254	3254
query88	3692	2793	2783	2783
query89	452	391	339	339
query90	2011	191	183	183
query91	181	171	142	142
query92	75	82	77	77
query93	1436	1545	885	885
query94	558	377	317	317
query95	690	473	358	358
query96	1018	788	355	355
query97	2789	2721	2694	2694
query98	236	231	243	231
query99	1182	1155	1040	1040
Total cold run time: 254719 ms
Total hot run time: 172277 ms

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 100.00% (12/12) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.99% (20978/38857)
Line Coverage 37.54% (198874/529777)
Region Coverage 33.82% (155834/460840)
Branch Coverage 34.81% (67851/194897)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (12/12) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.78% (28078/38057)
Line Coverage 57.74% (305139/528427)
Region Coverage 54.82% (255049/465262)
Branch Coverage 56.41% (110350/195627)
