[refine](function) use concrete column pointers for local result columns #63938

Merged
yiguolei merged 1 commit into apache:master from Mryange:use-concrete-column-pointers-for-local-result-column
Jun 2, 2026

@Mryange
Contributor

@Mryange Mryange commented Jun 1, 2026

What problem does this PR solve?

Some BE expression and storage code creates a column of a concrete type and then immediately casts the generic ColumnPtr or MutableColumnPtr back to that same concrete type before writing data. These casts are unnecessary and make the ownership intent less direct.

Root cause: several local result columns were declared as generic column pointers even though the concrete column type was already known at creation time.

This PR refines those local variables to keep concrete column pointers where the type is explicit, and directly accesses the concrete column data. It also updates the explode-numbers table function member to use a concrete column pointer. The change is limited to local refactoring and does not change runtime behavior.
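The before/after shape of this cleanup can be sketched roughly as follows. This is a minimal illustration, not code from the PR: `IColumn`, `Int32Column`, `build_before`, and `build_after` are simplified, hypothetical stand-ins built on `std::unique_ptr`, and the real Doris column classes and pointer types differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-ins for a column hierarchy; NOT the Doris classes.
struct IColumn {
    virtual ~IColumn() = default;
    virtual std::size_t size() const = 0;
};

struct Int32Column final : IColumn {
    std::vector<std::int32_t> data;
    std::size_t size() const override { return data.size(); }
    std::vector<std::int32_t>& get_data() { return data; }
    static std::unique_ptr<Int32Column> create() {
        return std::make_unique<Int32Column>();
    }
};

// Assumed generic owning pointer, used only for this sketch.
using MutableColumnPtr = std::unique_ptr<IColumn>;

// Before: the concrete type is erased at creation, then recovered by a cast.
MutableColumnPtr build_before(std::int32_t n) {
    MutableColumnPtr res = Int32Column::create();
    auto& data = static_cast<Int32Column&>(*res).get_data();
    for (std::int32_t i = 0; i < n; ++i) data.push_back(i);
    return res;
}

// After: keep the concrete pointer while writing; convert only on return.
MutableColumnPtr build_after(std::int32_t n) {
    auto res = Int32Column::create();  // std::unique_ptr<Int32Column>
    auto& data = res->get_data();      // no cast needed
    for (std::int32_t i = 0; i < n; ++i) data.push_back(i);
    return res;  // converts to std::unique_ptr<IColumn> at the boundary
}
```

Both functions produce the same column contents; the second simply defers the conversion to the generic pointer until the value leaves the function.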

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@Mryange
Contributor Author

Mryange commented Jun 1, 2026

/review

Contributor

@github-actions github-actions Bot left a comment

Automated code review summary for PR 63938.

Overall opinion: approval. I did not find blocking correctness, lifecycle, concurrency, or compatibility issues in the changed BE refactor paths.

Critical checkpoint conclusions:

  • Goal and tests: The PR goal is to keep concrete column pointers where the concrete type is already known and remove immediate casts back from generic pointers. The changed code accomplishes that goal without apparent behavior changes. No new tests were added, which is acceptable for this narrow refactor because existing function/storage coverage should exercise these paths.
  • Scope and clarity: The actual GitHub PR file list is focused on 11 BE files and the changes are small/local. The raw local two-SHA diff is noisy due to shallow/grafted history, so review was based on GitHub PR metadata and patch.
  • Concurrency: The modified code does not introduce new shared state, locks, threads, or atomic/lifecycle signals. Existing function context/table-function object lifecycles are preserved.
  • Lifecycle/static initialization: No new global/static cross-translation-unit initialization dependency was introduced. The DateTrunc state remains per FunctionContext THREAD_LOCAL state and uses shared_ptr-managed storage as before.
  • Configuration/compatibility: No configuration, storage format, thrift/protocol, persisted metadata, or function symbol compatibility changes were introduced.
  • Parallel code paths: The PR consistently applies the same concrete-pointer cleanup to the changed function, table-function, and variant/storage local result-column creation sites. I did not identify a missing parallel path required for correctness.
  • Conditional checks/error handling: No new speculative defensive checks were added. Existing Status-returning paths remain checked, and no ignored Status was introduced.
  • Test coverage/result files: No test result changes are part of this PR. I did not run the full build or test suite in this review.
  • Observability: No new runtime behavior or distributed operation was introduced, so no additional logging or metrics appear necessary.
  • Transaction/persistence/data visibility: The changes do not affect transaction processing, visible versions, delete bitmap alignment, or persistence.
  • Data writes/modifications: The storage changes only adjust local column pointer types and preserve ColumnVariant insertion/padding behavior; no write atomicity or crash-recovery behavior is changed.
  • FE/BE variable passing: No FE/BE transmitted variables or send paths are changed.
  • Performance: The refactor removes redundant casts/assert_mutable calls and should be neutral or slightly cleaner on hot paths without adding allocations or copies.
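For the table-function member mentioned in the PR description, the same idea applies to a class field rather than a local variable. The sketch below assumes a hypothetical `ExplodeNumbersSketch` class with a hypothetical `_elements` member and simplified column types. It is not the Doris table function; it only illustrates why a concretely typed member lets methods read the data without a cast while still being usable through a generic pointer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-ins; NOT the Doris column classes.
struct IColumn {
    virtual ~IColumn() = default;
    virtual std::size_t size() const = 0;
};

struct Int32Column final : IColumn {
    std::vector<std::int32_t> data;
    std::size_t size() const override { return data.size(); }
    std::vector<std::int32_t>& get_data() { return data; }
};

// Hypothetical table-function-like class that precomputes 0..n-1.
class ExplodeNumbersSketch {
public:
    // Fill the member column with 0..n-1; the member type is concrete,
    // so no cast is needed to reach the underlying vector.
    void prepare(std::int32_t n) {
        auto& data = _elements->get_data();
        data.clear();
        for (std::int32_t i = 0; i < n; ++i) data.push_back(i);
    }

    // Append up to `len` precomputed values starting at `offset` to `out`.
    // Returns the number of values actually appended.
    std::size_t emit(std::size_t offset, std::size_t len, Int32Column& out) const {
        const auto& src = _elements->get_data();
        if (offset >= src.size()) return 0;
        const std::size_t n = std::min(len, src.size() - offset);
        const auto first = src.begin() + static_cast<std::ptrdiff_t>(offset);
        out.get_data().insert(out.get_data().end(), first,
                              first + static_cast<std::ptrdiff_t>(n));
        return n;
    }

    // The concrete pointer still converts to a generic one where needed.
    std::shared_ptr<const IColumn> as_generic() const { return _elements; }

private:
    std::shared_ptr<Int32Column> _elements = std::make_shared<Int32Column>();
};
```

Under these assumptions, calling `prepare(5)` and then `emit(3, 10, out)` appends the values 3 and 4 to `out`, and `as_generic()` hands the same storage to code that only knows the base type.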

User focus points: No additional user-provided review focus was specified.

@Mryange
Contributor Author

Mryange commented Jun 1, 2026

run buildall

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 38.71% (24/62) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 54.07% (21045/38922)
Line Coverage 37.63% (199540/530330)
Region Coverage 33.88% (156210/461079)
Branch Coverage 34.85% (67925/194932)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 96.77% (60/62) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.94% (28185/38118)
Line Coverage 57.89% (306278/529031)
Region Coverage 55.09% (256429/465491)
Branch Coverage 56.55% (110655/195662)

@hello-stephen
Contributor

TPC-H: Total hot run time: 29548 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit af5105a96394520b271f89e05ff70280a30847b8, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17739	4084	4110	4084
q2
q3	10878	1438	853	853
q4	4684	498	350	350
q5	7597	917	596	596
q6	189	174	142	142
q7	780	846	661	661
q8	9385	1621	1596	1596
q9	5902	4513	4511	4511
q10	6756	1863	1570	1570
q11	444	278	259	259
q12	632	433	299	299
q13	18138	3416	2797	2797
q14	265	261	242	242
q15
q16	820	775	708	708
q17	949	984	926	926
q18	7019	5810	5522	5522
q19	1312	1277	1101	1101
q20	525	417	274	274
q21	6314	2894	2737	2737
q22	474	376	320	320
Total cold run time: 100802 ms
Total hot run time: 29548 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	5017	4896	4831	4831
q2
q3	4944	5415	4645	4645
q4	2168	2242	1426	1426
q5	4808	4854	4792	4792
q6	233	177	132	132
q7	1851	1713	1643	1643
q8	2454	2174	2196	2174
q9	8085	7856	7357	7357
q10	4721	4670	4219	4219
q11	529	388	358	358
q12	751	738	527	527
q13	3050	3418	2763	2763
q14	287	287	270	270
q15
q16	675	704	613	613
q17	1319	1283	1281	1281
q18	7304	6864	6938	6864
q19	1135	1140	1123	1123
q20	2218	2222	1940	1940
q21	5386	4731	4450	4450
q22	523	450	413	413
Total cold run time: 57458 ms
Total hot run time: 51821 ms
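The column meanings are not labeled in the report. Judging from the numbers, each row appears to hold one cold run time, two hot run times, and the smaller of the two hot times, with the totals being the sums of the first and last columns. A minimal sketch of that arithmetic, under this assumption and with hypothetical names (`QueryTiming`, `best_hot`, `total_cold`, `total_hot`):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// One result row in milliseconds. Field meanings are an assumption
// inferred from the figures in the report, not documented by it.
struct QueryTiming {
    long cold_ms;
    long hot1_ms;
    long hot2_ms;
};

// Best hot time for a query: the smaller of the two hot runs.
long best_hot(const QueryTiming& q) {
    return std::min(q.hot1_ms, q.hot2_ms);
}

// Sum of the cold run times over all queries.
long total_cold(const std::vector<QueryTiming>& rows) {
    long sum = 0;
    for (const auto& q : rows) sum += q.cold_ms;
    return sum;
}

// Sum of the best hot times over all queries.
long total_hot(const std::vector<QueryTiming>& rows) {
    long sum = 0;
    for (const auto& q : rows) sum += best_hot(q);
    return sum;
}
```

Applied to the first three populated rows of Round 1 (q1, q3, q4), this gives a cold subtotal of 33301 ms and a hot subtotal of 5287 ms. Summing all rows of each round the same way reproduces the reported totals.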

@hello-stephen
Contributor

TPC-DS: Total hot run time: 171826 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit af5105a96394520b271f89e05ff70280a30847b8, data reload: false

query5	4308	672	529	529
query6	330	222	212	212
query7	4236	578	322	322
query8	334	246	220	220
query9	8816	4123	4121	4121
query10	454	343	310	310
query11	5785	2354	2155	2155
query12	193	136	129	129
query13	1274	654	456	456
query14	6109	5512	5141	5141
query14_1	4467	4443	4397	4397
query15	212	208	186	186
query16	1029	469	438	438
query17	959	729	600	600
query18	2464	495	359	359
query19	241	209	191	191
query20	143	138	135	135
query21	215	137	128	128
query22	13670	13746	13471	13471
query23	17353	16554	16258	16258
query23_1	16349	16304	16393	16304
query24	7609	1769	1316	1316
query24_1	1293	1344	1358	1344
query25	617	515	457	457
query26	1330	329	182	182
query27	2690	598	359	359
query28	4461	2058	2061	2058
query29	1034	664	528	528
query30	325	240	205	205
query31	1157	1098	961	961
query32	99	87	81	81
query33	548	378	334	334
query34	1179	1160	667	667
query35	804	856	731	731
query36	1410	1432	1226	1226
query37	155	110	96	96
query38	3221	3204	3081	3081
query39	940	919	909	909
query39_1	895	877	893	877
query40	237	156	136	136
query41	74	71	69	69
query42	117	117	118	117
query43	343	340	302	302
query44	
query45	219	207	201	201
query46	1135	1217	763	763
query47	2361	2392	2267	2267
query48	429	405	316	316
query49	667	519	401	401
query50	996	373	262	262
query51	4497	4354	4380	4354
query52	113	113	103	103
query53	258	297	218	218
query54	331	297	276	276
query55	102	97	90	90
query56	347	329	337	329
query57	1442	1410	1330	1330
query58	320	292	308	292
query59	1619	1691	1486	1486
query60	335	346	340	340
query61	215	158	149	149
query62	697	666	590	590
query63	263	208	210	208
query64	2409	835	621	621
query65	
query66	1791	489	378	378
query67	29794	29789	29568	29568
query68	
query69	465	347	306	306
query70	991	1045	1004	1004
query71	324	283	277	277
query72	3076	2845	2508	2508
query73	906	730	425	425
query74	5111	4965	4818	4818
query75	2680	2633	2299	2299
query76	2311	1189	812	812
query77	415	417	344	344
query78	12475	12467	11902	11902
query79	1505	1060	776	776
query80	668	546	484	484
query81	473	286	244	244
query82	1428	163	131	131
query83	349	277	257	257
query84	305	149	117	117
query85	865	543	452	452
query86	419	354	339	339
query87	3410	3390	3283	3283
query88	3712	2777	2759	2759
query89	451	396	344	344
query90	2003	210	189	189
query91	186	167	139	139
query92	87	83	76	76
query93	1544	1427	944	944
query94	568	350	339	339
query95	702	372	445	372
query96	1054	811	329	329
query97	2754	2765	2610	2610
query98	237	234	236	234
query99	1177	1181	1028	1028
Total cold run time: 255480 ms
Total hot run time: 171826 ms

@yiguolei merged commit a0a09b0 into apache:master on Jun 2, 2026
31 of 32 checks passed

3 participants