[refine](code) Add comments and benchmarks for ColumnView. #61678

Open
Mryange wants to merge 1 commit into apache:master from Mryange:benchmark-column-view
@Mryange
Contributor

Mryange commented Mar 24, 2026

What problem does this PR solve?

-----------------------------------------------------------------------------------------
Benchmark                                               Time             CPU   Iterations
-----------------------------------------------------------------------------------------
Handwritten_Unary_Plain                               326 ns          326 ns      2151459
ColumnView_Unary_Plain                                326 ns          326 ns      2146584
Handwritten_Unary_Nullable                           2067 ns         2067 ns       342110
ColumnView_Unary_Nullable                            2061 ns         2061 ns       341236
Handwritten_Binary_Plain_Plain                        680 ns          680 ns      1028990
ColumnView_Binary_Plain_Plain                         679 ns          679 ns      1025809
Handwritten_Binary_Plain_Const                        277 ns          277 ns      2534313
ColumnView_Binary_Plain_Const                         282 ns          282 ns      2484547
Handwritten_Binary_Plain_Nullable                     776 ns          776 ns       881182
ColumnView_Binary_Plain_Nullable                      779 ns          779 ns       897644
Handwritten_Binary_Nullable_Nullable                 3233 ns         3233 ns       217793
ColumnView_Binary_Nullable_Nullable                  4469 ns         4469 ns       157379
Handwritten_Ternary_Plain_Plain_Plain                1016 ns         1016 ns       688153
ColumnView_Ternary_Plain_Plain_Plain                 1017 ns         1017 ns       685327
Handwritten_Ternary_Const_Const_Plain                 278 ns          278 ns      2506171
ColumnView_Ternary_Const_Const_Plain                  285 ns          285 ns      2456870
Handwritten_Ternary_Plain_Const_Plain                 678 ns          678 ns      1026683
ColumnView_Ternary_Plain_Const_Plain                  681 ns          681 ns      1027665
Handwritten_Ternary_Nullable_Nullable_Nullable       4729 ns         4729 ns       149026
ColumnView_Ternary_Nullable_Nullable_Nullable        8608 ns         8608 ns        82746
  1. Expensive per-element operations (e.g. geo functions, complex string ops):
    Use ColumnView freely — its overhead is negligible relative to the work.

  2. Cheap per-element operations that the compiler can inline (e.g. simple arithmetic):

    a) Inputs are NOT nullable (e.g. the function framework already strips nullable):
    Safe to use. The compiler optimizes the is_const branch into code equivalent
    to hand-written direct array access (verified via assembly and benchmarks).

    b) Inputs involve nullable columns:

    • Unary operations: safe to use, the compiler still optimizes effectively.
    • Binary / ternary operations: the combined is_null_at checks across multiple
      columns inhibit compiler vectorization and branch optimization, causing
      significant regression in the all-nullable benchmarks above: about 1.4x for binary
      (4469 ns vs. 3233 ns) and about 1.8x for ternary (8608 ns vs. 4729 ns).
      In this case, hand-written column access is recommended for best performance.

In summary, ColumnView is designed to eliminate the combinatorial explosion of
handling 4 column forms. It is suitable for the vast majority of use cases.
Only the specific combination of "cheap computation + nullable + multi-column"
requires weighing whether to hand-write the access code.
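
To make the trade-off concrete, here is a minimal sketch of the two styles for a nullable + nullable binary add. It is not the Doris ColumnView API: `MiniView`, `add_with_views`, and `add_handwritten` are hypothetical names, and the two-pass hand-written loop is only one common shape such code might take. The view version handles every column form with one loop body but checks `is_null_at` on both inputs per row; the hand-written version computes values and the null map in separate branch-free loops, which a compiler can usually vectorize more easily.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a column view. NOT the Doris ColumnView API.
struct MiniView {
    const std::int32_t* data;      // values; a const column stores a single value
    const std::uint8_t* null_map;  // nullptr when the column is not nullable
    bool is_const;                 // true when one value is broadcast to every row

    std::size_t index(std::size_t row) const { return is_const ? 0 : row; }
    std::int32_t value_at(std::size_t row) const { return data[index(row)]; }
    bool is_null_at(std::size_t row) const {
        return null_map != nullptr && null_map[index(row)] != 0;
    }
};

// View style: one loop body covers plain, const, and nullable inputs,
// at the cost of a combined null check on every row.
void add_with_views(const MiniView& a, const MiniView& b, std::size_t rows,
                    std::vector<std::int64_t>& out,
                    std::vector<std::uint8_t>& out_null) {
    out.resize(rows);
    out_null.resize(rows);
    for (std::size_t i = 0; i < rows; ++i) {
        if (a.is_null_at(i) || b.is_null_at(i)) {
            out_null[i] = 1;
            out[i] = 0;
        } else {
            out_null[i] = 0;
            out[i] = static_cast<std::int64_t>(a.value_at(i)) + b.value_at(i);
        }
    }
}

// Hand-written style for the nullable + nullable case only: two branch-free
// passes. Values in null rows are still computed; callers must ignore them.
void add_handwritten(const std::vector<std::int32_t>& a,
                     const std::vector<std::uint8_t>& a_null,
                     const std::vector<std::int32_t>& b,
                     const std::vector<std::uint8_t>& b_null,
                     std::vector<std::int64_t>& out,
                     std::vector<std::uint8_t>& out_null) {
    const std::size_t rows = a.size();
    assert(a_null.size() == rows && b.size() == rows && b_null.size() == rows);
    out.resize(rows);
    out_null.resize(rows);
    for (std::size_t i = 0; i < rows; ++i) {
        out[i] = static_cast<std::int64_t>(a[i]) + b[i];
    }
    for (std::size_t i = 0; i < rows; ++i) {
        out_null[i] = static_cast<std::uint8_t>(a_null[i] | b_null[i]);
    }
}
```

Both functions produce the same null map and the same values in non-null rows. The hand-written variant has to be repeated for each combination of column forms, which is the combinatorial cost the summary above says ColumnView is meant to remove.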

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Mar 24, 2026

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@Mryange
Contributor Author

Mryange commented Mar 24, 2026

run buildall

@doris-robot

TPC-H: Total hot run time: 26598 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 2419829ded75b99e9e40c4c9a928dcd0b7266a78, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17578	4513	4303	4303
q2	q3	10648	789	524	524
q4	4676	349	251	251
q5	7569	1224	1028	1028
q6	174	176	145	145
q7	770	834	699	699
q8	9313	1464	1336	1336
q9	4903	4788	4696	4696
q10	6327	1926	1672	1672
q11	458	258	244	244
q12	757	581	473	473
q13	18046	2691	1931	1931
q14	222	237	217	217
q15	q16	748	739	662	662
q17	759	840	451	451
q18	5931	5376	5211	5211
q19	1137	979	619	619
q20	545	479	381	381
q21	4565	1851	1434	1434
q22	348	321	432	321
Total cold run time: 95474 ms
Total hot run time: 26598 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4676	4615	4672	4615
q2	q3	3924	4373	3853	3853
q4	888	1208	817	817
q5	4092	4352	4388	4352
q6	190	178	139	139
q7	1766	1637	1538	1538
q8	2497	2770	2653	2653
q9	7676	7382	7392	7382
q10	3736	3977	3782	3782
q11	512	443	464	443
q12	497	601	458	458
q13	2489	3046	2094	2094
q14	300	306	286	286
q15	q16	729	801	770	770
q17	1186	1593	1382	1382
q18	7350	6923	6732	6732
q19	935	1009	935	935
q20	2281	2171	2010	2010
q21	3974	3496	3415	3415
q22	453	428	384	384
Total cold run time: 50151 ms
Total hot run time: 48040 ms

@doris-robot

TPC-DS: Total hot run time: 168948 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 2419829ded75b99e9e40c4c9a928dcd0b7266a78, data reload: false

query5	4332	633	501	501
query6	334	240	206	206
query7	4205	462	270	270
query8	352	242	225	225
query9	8704	2711	2688	2688
query10	508	407	333	333
query11	6994	5080	4869	4869
query12	175	125	125	125
query13	1288	450	348	348
query14	5775	3725	3492	3492
query14_1	2874	2852	2781	2781
query15	197	188	176	176
query16	975	491	440	440
query17	859	702	610	610
query18	2443	446	345	345
query19	211	201	186	186
query20	133	128	125	125
query21	216	135	107	107
query22	13345	13888	15044	13888
query23	16756	16193	16000	16000
query23_1	16078	16235	15612	15612
query24	7197	1621	1208	1208
query24_1	1219	1222	1232	1222
query25	539	463	418	418
query26	1246	259	152	152
query27	2788	511	295	295
query28	4469	1820	1824	1820
query29	867	584	492	492
query30	297	227	191	191
query31	983	949	888	888
query32	85	73	72	72
query33	520	337	283	283
query34	889	884	532	532
query35	641	666	584	584
query36	1039	1119	991	991
query37	130	95	85	85
query38	2908	2903	2890	2890
query39	874	856	819	819
query39_1	800	796	794	794
query40	229	151	139	139
query41	62	59	60	59
query42	258	256	256	256
query43	242	261	218	218
query44	
query45	195	189	178	178
query46	904	974	616	616
query47	2121	2133	2050	2050
query48	311	329	231	231
query49	629	462	382	382
query50	705	290	212	212
query51	4068	4043	4059	4043
query52	265	272	263	263
query53	297	333	291	291
query54	311	278	280	278
query55	91	84	86	84
query56	309	330	314	314
query57	1898	1805	1667	1667
query58	284	272	273	272
query59	2807	2950	2736	2736
query60	336	343	335	335
query61	153	151	153	151
query62	627	588	543	543
query63	309	277	284	277
query64	5127	1355	1096	1096
query65	
query66	1491	471	406	406
query67	24192	24451	24162	24162
query68	
query69	405	315	299	299
query70	978	1002	951	951
query71	337	307	299	299
query72	2831	2746	2428	2428
query73	542	549	318	318
query74	9645	9590	9402	9402
query75	2859	2773	2469	2469
query76	2293	1049	689	689
query77	364	403	321	321
query78	10934	11081	10440	10440
query79	1137	777	575	575
query80	1349	648	557	557
query81	537	255	228	228
query82	1059	155	122	122
query83	342	272	252	252
query84	302	123	109	109
query85	1003	575	526	526
query86	416	312	297	297
query87	3129	3126	3012	3012
query88	3622	2719	2702	2702
query89	431	375	351	351
query90	2003	186	184	184
query91	200	183	158	158
query92	80	82	74	74
query93	915	852	493	493
query94	653	328	318	318
query95	609	364	407	364
query96	655	515	232	232
query97	2464	2491	2407	2407
query98	239	220	227	220
query99	1014	997	965	965
Total cold run time: 250272 ms
Total hot run time: 168948 ms

@Mryange
Contributor Author

Mryange commented Mar 25, 2026

run buildall

@doris-robot

TPC-H: Total hot run time: 26489 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 2419829ded75b99e9e40c4c9a928dcd0b7266a78, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17582	4434	4302	4302
q2	q3	10643	790	521	521
q4	4683	361	248	248
q5	7559	1222	1036	1036
q6	178	172	149	149
q7	789	859	671	671
q8	9304	1493	1345	1345
q9	4923	4601	4708	4601
q10	6251	1919	1658	1658
q11	470	261	250	250
q12	685	577	466	466
q13	18034	2695	1943	1943
q14	229	231	225	225
q15	q16	709	712	665	665
q17	726	824	468	468
q18	5974	5339	5258	5258
q19	1254	986	611	611
q20	536	486	379	379
q21	4637	1859	1407	1407
q22	348	286	431	286
Total cold run time: 95514 ms
Total hot run time: 26489 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4841	4741	4635	4635
q2	q3	3923	4381	3829	3829
q4	887	1190	770	770
q5	4068	4360	4376	4360
q6	190	175	143	143
q7	1776	1677	1543	1543
q8	2476	2693	2585	2585
q9	7955	7300	7273	7273
q10	3864	4008	3625	3625
q11	505	442	408	408
q12	494	589	442	442
q13	2469	2940	2143	2143
q14	287	316	291	291
q15	q16	754	777	748	748
q17	1174	1381	1321	1321
q18	7083	6882	6673	6673
q19	933	908	930	908
q20	2090	2217	1992	1992
q21	3930	3464	3335	3335
q22	446	456	435	435
Total cold run time: 50145 ms
Total hot run time: 47459 ms

@doris-robot

TPC-DS: Total hot run time: 169242 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 2419829ded75b99e9e40c4c9a928dcd0b7266a78, data reload: false

query5	4349	650	508	508
query6	339	230	213	213
query7	4215	463	260	260
query8	359	234	224	224
query9	8746	2736	2683	2683
query10	534	388	338	338
query11	7016	5162	4939	4939
query12	182	131	125	125
query13	1279	466	355	355
query14	5719	3747	3466	3466
query14_1	2889	2876	2841	2841
query15	205	193	179	179
query16	1023	473	372	372
query17	1119	740	637	637
query18	2547	460	362	362
query19	215	224	188	188
query20	134	129	126	126
query21	212	133	115	115
query22	13327	14078	14769	14078
query23	16842	16394	16030	16030
query23_1	15886	15680	15770	15680
query24	7123	1664	1220	1220
query24_1	1192	1228	1250	1228
query25	566	465	402	402
query26	1244	263	150	150
query27	2761	478	291	291
query28	4459	1850	1833	1833
query29	857	572	483	483
query30	302	232	191	191
query31	1018	945	882	882
query32	83	69	74	69
query33	527	331	291	291
query34	889	860	528	528
query35	650	679	618	618
query36	1073	1102	961	961
query37	141	100	92	92
query38	3017	2911	2898	2898
query39	871	821	817	817
query39_1	791	807	792	792
query40	239	155	136	136
query41	65	60	59	59
query42	266	253	256	253
query43	237	247	230	230
query44	
query45	194	195	183	183
query46	871	982	611	611
query47	2113	2854	2042	2042
query48	300	319	230	230
query49	633	465	395	395
query50	676	279	210	210
query51	4093	4066	4099	4066
query52	261	272	251	251
query53	299	347	291	291
query54	314	276	302	276
query55	92	88	92	88
query56	323	334	324	324
query57	1958	1658	1706	1658
query58	287	278	274	274
query59	2825	2972	2746	2746
query60	346	343	326	326
query61	163	161	162	161
query62	626	593	532	532
query63	317	283	281	281
query64	5059	1280	1043	1043
query65	
query66	1457	454	385	385
query67	24249	24306	24270	24270
query68	
query69	403	319	285	285
query70	936	978	953	953
query71	354	316	300	300
query72	2886	2706	2501	2501
query73	528	549	316	316
query74	9676	9575	9402	9402
query75	2895	2757	2476	2476
query76	2296	1038	693	693
query77	372	392	314	314
query78	10943	11122	10519	10519
query79	1123	773	574	574
query80	735	643	550	550
query81	482	258	227	227
query82	1350	155	124	124
query83	380	271	254	254
query84	261	119	99	99
query85	852	497	449	449
query86	357	306	298	298
query87	3168	3152	3014	3014
query88	3548	2658	2651	2651
query89	429	374	347	347
query90	1990	180	179	179
query91	171	162	138	138
query92	77	75	75	75
query93	899	876	489	489
query94	452	333	291	291
query95	599	346	328	328
query96	651	517	233	233
query97	2466	2460	2391	2391
query98	241	220	219	219
query99	1004	987	910	910
Total cold run time: 249903 ms
Total hot run time: 169242 ms

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 52.82% (19872/37619)
Line Coverage 36.33% (185660/511098)
Region Coverage 32.57% (143805/441501)
Branch Coverage 33.79% (62968/186372)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100% (0/0) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 71.26% (26249/36835)
Line Coverage 54.12% (275768/509524)
Region Coverage 51.14% (227879/445619)
Branch Coverage 52.72% (98554/186934)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100% (0/0) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 71.55% (26357/36836)
Line Coverage 54.44% (277372/509535)
Region Coverage 51.54% (229678/445627)
Branch Coverage 53.06% (99193/186936)
