[feat](be) add PLAIN_ENCODING_V3 binary plain page with contiguous data + lengths trailer #63570

Open
csun5285 wants to merge 1 commit into apache:master from csun5285:bench/v2-layout


@csun5285 csun5285 commented May 25, 2026

V3 layout: |data1..dataN|varuint_len1..varuint_lenN|data_block_size(u32)|num_elems(u32)|

Compared to V2, which interleaves each entry's length with its data, V3 lets the pre-decoder memcpy the entire binary payload in a single call and then walk the contiguous varuint length block once to fill the V1 offsets array. The two passes are independent: the length pointer never has to wait on the data pointer, or vice versa.
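
To make the layout line above concrete, here is a minimal Python sketch of a page builder that emits it. It is an illustration, not the PR's C++ code: the names `encode_varuint` and `build_v3_page` are hypothetical, the varuint is assumed to be the usual LEB128-style encoding (7 payload bits per byte, high bit as continuation flag), and the two u32 trailer fields are assumed to be little-endian.

```python
import struct
from typing import List


def encode_varuint(n: int) -> bytes:
    """Assumed LEB128-style varuint: 7 payload bits per byte, high bit = more."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # last byte, continuation bit clear
            return bytes(out)


def build_v3_page(values: List[bytes]) -> bytes:
    """|data1..dataN|varuint_len1..varuint_lenN|data_block_size(u32)|num_elems(u32)|"""
    data = b"".join(values)                                   # contiguous payload
    lengths = b"".join(encode_varuint(len(v)) for v in values)
    trailer = struct.pack("<II", len(data), len(values))      # assumed little-endian
    return data + lengths + trailer


page = build_v3_page([b"ab", b"", b"cde"])
```

For the three values above the page is 16 bytes: 5 bytes of payload, 3 one-byte lengths, and the 8-byte trailer.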

Benchmark (15-rep x 2s median, V3 / V2 speedup at 256 KiB page):
  8B:   3.56x   16B:  3.07x   32B:  2.46x   64B:  2.63x
  128B: 2.25x   256B: 1.39x   512B: 1.22x   1024B: 1.11x   4096B: 1.01x
V3 is at least as fast as V2 at every size in the tested grid.

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None


@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@csun5285
Contributor Author

run buildall

…ta + lengths trailer

V3 layout: |data1..dataN|varuint_len1..varuint_lenN|data_block_size(u32)|num_elems(u32)|

Compared to V2, which interleaves each entry's length with its data, V3 lets
the pre-decoder memcpy the entire binary payload in a single call and then
walk the contiguous varuint length block once to fill the V1 offsets array.
The two passes are independent: the length pointer never has to wait on the
data pointer, or vice versa.
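
The two independent passes described above can be sketched as follows. This is a Python illustration under stated assumptions, not the PR's pre-decoder: `predecode_v3` is a hypothetical name, the varuint is assumed to be LEB128-style, the trailer fields are assumed to be little-endian, and the sanity checks are guesses at the kind of corruption a real decoder would reject.

```python
import struct


def predecode_v3(page: bytes):
    """Return (data, offsets) where offsets[i] is the start of value i in data."""
    if len(page) < 8:
        raise ValueError("page shorter than trailer")
    lengths_end = len(page) - 8
    data_size, num_elems = struct.unpack_from("<II", page, lengths_end)
    if data_size > lengths_end:
        raise ValueError("data_block_size exceeds page")

    # Pass 1: one bulk copy of the whole payload (the memcpy in the C++ code).
    data = page[:data_size]

    # Pass 2: walk the contiguous length block once, accumulating offsets.
    offsets = []
    pos, running = data_size, 0
    for _ in range(num_elems):
        length, shift = 0, 0
        while True:
            if pos >= lengths_end:
                raise ValueError("truncated length block")
            byte = page[pos]
            pos += 1
            length |= (byte & 0x7F) << shift
            if not byte & 0x80:
                break
            shift += 7
        offsets.append(running)
        running += length
    if running != data_size or pos != lengths_end:
        raise ValueError("lengths do not match data block")
    return data, offsets
```

Pass 2 only reads the length block and never touches the payload, which is what removes the pointer dependency that an interleaved layout imposes.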

Backward-compat: V3 is registered as a new EncodingTypePB (= 9). Existing
segments persist their per-column encoding meta (V1=2, V2=8, V3=9), so the
read path dispatches to the matching pre-decoder. Old V2 segments continue
to be served by BinaryPlainPageV2PreDecoder.
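
As a rough picture of the read-path dispatch described above, the sketch below maps the persisted encoding id to a pre-decoder name. Only the ids (2, 8, 9), `PLAIN_ENCODING`, `PLAIN_ENCODING_V3`, and `BinaryPlainPageV2PreDecoder` come from this commit message; the enum name `PLAIN_ENCODING_V2`, the V3 decoder name, the V1 entry, and `select_pre_decoder` itself are assumptions made for illustration.

```python
from enum import IntEnum


class EncodingType(IntEnum):
    PLAIN_ENCODING = 2      # V1
    PLAIN_ENCODING_V2 = 8   # name assumed by analogy with V3
    PLAIN_ENCODING_V3 = 9


# Pre-decoder chosen per persisted encoding id.
PRE_DECODER_BY_ENCODING = {
    EncodingType.PLAIN_ENCODING: None,  # assumed: V1 pages need no pre-decoding
    EncodingType.PLAIN_ENCODING_V2: "BinaryPlainPageV2PreDecoder",
    EncodingType.PLAIN_ENCODING_V3: "BinaryPlainPageV3PreDecoder",  # assumed name
}


def select_pre_decoder(encoding_id: int):
    """Look up the pre-decoder for the encoding id stored in the column meta."""
    try:
        encoding = EncodingType(encoding_id)
    except ValueError:
        raise ValueError(f"unsupported encoding id {encoding_id}") from None
    return PRE_DECODER_BY_ENCODING[encoding]
```

Because the id is stored per column in each segment, old and new segments can coexist and each is routed to its own decoder.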

Wired through the entire write path:
- encoding_info.cpp: Hook 1b mirrors the V2 hook, rewriting PLAIN_ENCODING
  to PLAIN_ENCODING_V3 when the schema preference is BINARY_PLAIN_ENCODING_V3.
- segment_writer.cpp + vertical_segment_writer.cpp: the row-store-column path
  switches on BinaryPlainEncodingTypePB to pick PLAIN_ENCODING_V3.
- binary_dict_page.cpp: dict word page and fallback binary page use a small
  shared helper to map the preference to the on-disk encoding.
- tablet_meta.cpp: new TStorageFormat::V3 tablets default to
  BINARY_PLAIN_ENCODING_V3 for both data schema and row binlog schema.
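
The preference-to-encoding mapping that these write-path hooks share might look roughly like the sketch below. It is hypothetical: `PlainPreference`, `resolve_encoding`, and the V1/V2 preference and encoding names are assumptions, and only `BINARY_PLAIN_ENCODING_V3`, `PLAIN_ENCODING`, and `PLAIN_ENCODING_V3` appear in the commit message.

```python
from enum import Enum


class PlainPreference(Enum):
    V1 = "BINARY_PLAIN_ENCODING_V1"  # assumed name
    V2 = "BINARY_PLAIN_ENCODING_V2"  # assumed name
    V3 = "BINARY_PLAIN_ENCODING_V3"  # from the commit message


ON_DISK_ENCODING = {
    PlainPreference.V1: "PLAIN_ENCODING",
    PlainPreference.V2: "PLAIN_ENCODING_V2",  # assumed name
    PlainPreference.V3: "PLAIN_ENCODING_V3",
}


def resolve_encoding(requested: str, preference: PlainPreference) -> str:
    """Rewrite a generic PLAIN_ENCODING request to the preferred on-disk variant."""
    if requested != "PLAIN_ENCODING":
        return requested  # other encodings pass through unchanged
    return ON_DISK_ENCODING[preference]
```

Keeping the mapping in one helper means every writer that falls back to a plain page picks the same variant for a given schema preference.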

Tests: 15 BinaryPlainPageV3Test cases covering encode/decode roundtrip, seek,
read_by_rowids, empty page, page_full, large N, mixed lengths (including
Unicode), reset, varint length boundaries (values of 127/128/16383/16384
bytes, straddling the 1/2/3-byte varint bands), raw trailer layout
assertions, and two corruption-rejection cases. All pass under ASAN.
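
The boundary lengths in the test list follow from the varint width arithmetic. Assuming the usual 7-payload-bits-per-byte encoding, a value needs ceil(bits / 7) bytes, so the bands change at 2^7 and 2^14. The helper name `varuint_size` below is hypothetical.

```python
def varuint_size(n: int) -> int:
    """Bytes needed for n in a 7-bits-per-byte varuint (0 still takes one byte)."""
    return max(1, (n.bit_length() + 6) // 7)


# Lengths on either side of each band edge.
band_edges = {n: varuint_size(n) for n in (127, 128, 16383, 16384)}
```

Here 127 and 16383 are the largest lengths that fit in one and two bytes respectively, while 128 and 16384 are the first to need one byte more.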

Benchmark (15-rep x 2s median, V3 / V2 speedup at 256 KiB page):
  8B:   3.56x   16B:  3.07x   32B:  2.46x   64B:  2.63x
  128B: 2.25x   256B: 1.39x   512B: 1.22x   1024B: 1.11x   4096B: 1.01x
V3 is at least as fast as V2 at every size in the tested grid.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@csun5285
Contributor Author

run buildall

@hello-stephen
Contributor

TPC-H: Total hot run time: 31523 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 7f21734b8dcbbbf0b7d1992bda9c91c04ec17641, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17659	4105	4078	4078
q2	
q3	10796	1457	806	806
q4	4770	483	354	354
q5	9322	2296	2117	2117
q6	364	173	137	137
q7	976	781	647	647
q8	9578	1754	1611	1611
q9	7034	5039	4963	4963
q10	6484	2255	1902	1902
q11	443	279	256	256
q12	689	422	297	297
q13	18290	3370	2746	2746
q14	269	258	235	235
q15	
q16	831	803	705	705
q17	925	900	927	900
q18	6848	5900	5643	5643
q19	1215	1242	1062	1062
q20	519	403	270	270
q21	5633	2521	2485	2485
q22	442	353	309	309
Total cold run time: 103087 ms
Total hot run time: 31523 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	4385	4295	4284	4284
q2	
q3	4528	4984	4395	4395
q4	2118	2234	1428	1428
q5	4472	4353	5541	4353
q6	258	203	149	149
q7	2113	1859	1656	1656
q8	2611	2241	2417	2241
q9	8056	8110	8044	8044
q10	4868	4873	4343	4343
q11	598	413	379	379
q12	776	777	583	583
q13	3206	3634	2890	2890
q14	327	429	297	297
q15	
q16	734	728	646	646
q17	1387	1354	1345	1345
q18	8107	7480	7116	7116
q19	1094	1103	1159	1103
q20	2240	2238	1947	1947
q21	5389	4705	4525	4525
q22	517	456	403	403
Total cold run time: 57784 ms
Total hot run time: 52127 ms

@hello-stephen
Contributor

TPC-H: Total hot run time: 31963 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit c589ceabb09942f372b9c8ccd49d652b72dd9a71, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17593	4049	3997	3997
q2	
q3	10832	1465	801	801
q4	4758	488	349	349
q5	9819	2345	2124	2124
q6	392	176	140	140
q7	946	790	645	645
q8	9585	1870	1627	1627
q9	7117	4995	5005	4995
q10	6467	2251	1917	1917
q11	450	283	253	253
q12	700	442	308	308
q13	18190	3470	2857	2857
q14	266	265	246	246
q15	
q16	832	791	730	730
q17	942	893	957	893
q18	7076	5835	6304	5835
q19	1203	1295	1118	1118
q20	531	431	286	286
q21	5916	2810	2514	2514
q22	460	373	328	328
Total cold run time: 104075 ms
Total hot run time: 31963 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	4790	4813	4803	4803
q2	
q3	4998	5431	4619	4619
q4	2181	2221	1438	1438
q5	4817	4832	4752	4752
q6	239	181	135	135
q7	1879	1754	1580	1580
q8	2505	2143	1963	1963
q9	7474	7456	7505	7456
q10	4814	4690	4247	4247
q11	548	393	363	363
q12	749	755	539	539
q13	3065	3359	2767	2767
q14	269	277	268	268
q15	
q16	688	698	618	618
q17	1305	1325	1287	1287
q18	7436	6776	6861	6776
q19	1087	1097	1109	1097
q20	2253	2242	1940	1940
q21	5441	4660	4545	4545
q22	525	481	406	406
Total cold run time: 57063 ms
Total hot run time: 51599 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 172821 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 7f21734b8dcbbbf0b7d1992bda9c91c04ec17641, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query5	4322	667	507	507
query6	346	218	205	205
query7	4205	566	307	307
query8	329	244	222	222
query9	8856	4058	4071	4058
query10	457	353	316	316
query11	5804	2802	2208	2208
query12	179	130	126	126
query13	1265	640	471	471
query14	6130	5520	5242	5242
query14_1	4586	4472	4482	4472
query15	210	209	189	189
query16	1033	468	420	420
query17	1146	723	593	593
query18	2718	488	354	354
query19	221	195	166	166
query20	139	131	128	128
query21	222	137	120	120
query22	13793	13690	13404	13404
query23	17326	16646	16310	16310
query23_1	16385	16332	16373	16332
query24	7459	1788	1307	1307
query24_1	1321	1323	1335	1323
query25	559	486	442	442
query26	1320	317	175	175
query27	2697	550	337	337
query28	4413	2002	2008	2002
query29	989	614	494	494
query30	308	246	198	198
query31	1128	1081	947	947
query32	89	83	74	74
query33	546	350	303	303
query34	1178	1125	654	654
query35	774	805	688	688
query36	1416	1438	1294	1294
query37	152	108	94	94
query38	3247	3175	3058	3058
query39	932	929	911	911
query39_1	922	883	901	883
query40	240	147	123	123
query41	65	67	62	62
query42	111	110	107	107
query43	326	336	295	295
query44	
query45	213	206	207	206
query46	1108	1248	738	738
query47	2366	2417	2247	2247
query48	390	410	308	308
query49	630	493	404	404
query50	971	345	254	254
query51	4419	4365	4347	4347
query52	107	106	95	95
query53	250	281	215	215
query54	314	291	271	271
query55	95	94	86	86
query56	335	307	304	304
query57	1438	1418	1346	1346
query58	288	279	268	268
query59	1609	1679	1508	1508
query60	332	333	308	308
query61	159	152	156	152
query62	699	668	593	593
query63	245	206	210	206
query64	2399	805	638	638
query65	
query66	1677	489	353	353
query67	30118	30133	29944	29944
query68	
query69	493	351	304	304
query70	1104	1056	1005	1005
query71	316	275	270	270
query72	3050	2888	2604	2604
query73	864	775	429	429
query74	5128	4929	4809	4809
query75	2744	2662	2302	2302
query76	2283	1160	816	816
query77	419	422	343	343
query78	12453	12631	11951	11951
query79	1461	1034	763	763
query80	680	592	485	485
query81	457	291	250	250
query82	1347	165	130	130
query83	362	291	258	258
query84	266	149	120	120
query85	969	540	457	457
query86	401	353	366	353
query87	3443	3440	3255	3255
query88	3624	2855	2864	2855
query89	440	402	348	348
query90	1913	186	182	182
query91	180	175	141	141
query92	81	81	71	71
query93	1534	1474	843	843
query94	541	357	311	311
query95	681	399	340	340
query96	1102	831	351	351
query97	2749	2730	2608	2608
query98	246	237	223	223
query99	1186	1148	1039	1039
Total cold run time: 255331 ms
Total hot run time: 172821 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 173713 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit c589ceabb09942f372b9c8ccd49d652b72dd9a71, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query5	4308	653	522	522
query6	331	214	211	211
query7	4244	565	308	308
query8	334	237	232	232
query9	8819	4207	4163	4163
query10	464	357	303	303
query11	5804	2413	2187	2187
query12	192	128	129	128
query13	1285	630	436	436
query14	6135	5533	5233	5233
query14_1	4549	4572	4536	4536
query15	220	206	186	186
query16	1028	500	478	478
query17	1192	727	584	584
query18	2526	484	349	349
query19	220	196	157	157
query20	136	136	133	133
query21	219	139	119	119
query22	13645	13575	13421	13421
query23	17446	16643	16314	16314
query23_1	16435	16487	16540	16487
query24	7807	1791	1307	1307
query24_1	1348	1332	1322	1322
query25	541	486	425	425
query26	1329	326	172	172
query27	2704	552	350	350
query28	4442	1999	2035	1999
query29	988	641	498	498
query30	312	243	206	206
query31	1144	1085	952	952
query32	93	78	73	73
query33	543	366	301	301
query34	1220	1126	654	654
query35	798	796	704	704
query36	1398	1425	1264	1264
query37	164	108	95	95
query38	3216	3163	3092	3092
query39	930	938	939	938
query39_1	871	898	889	889
query40	247	144	122	122
query41	68	64	64	64
query42	110	111	114	111
query43	334	338	312	312
query44	
query45	218	205	203	203
query46	1096	1164	759	759
query47	2383	2441	2278	2278
query48	405	426	296	296
query49	631	502	395	395
query50	1036	347	258	258
query51	4409	4402	4318	4318
query52	114	107	94	94
query53	249	278	205	205
query54	327	278	256	256
query55	92	90	88	88
query56	298	316	313	313
query57	1444	1424	1348	1348
query58	300	278	274	274
query59	1593	1734	1500	1500
query60	322	323	307	307
query61	166	162	164	162
query62	704	689	595	595
query63	251	208	208	208
query64	2450	803	651	651
query65	
query66	1714	500	383	383
query67	30011	29993	29875	29875
query68	
query69	476	363	323	323
query70	1061	995	1025	995
query71	310	283	279	279
query72	3291	2946	2525	2525
query73	845	785	443	443
query74	5084	5003	4793	4793
query75	2701	2635	2293	2293
query76	2278	1131	789	789
query77	402	410	341	341
query78	12460	12558	11870	11870
query79	1472	1047	772	772
query80	645	538	451	451
query81	452	291	240	240
query82	1392	157	123	123
query83	363	279	252	252
query84	255	150	109	109
query85	915	552	468	468
query86	419	360	328	328
query87	3427	3393	3279	3279
query88	3913	2812	2778	2778
query89	452	400	345	345
query90	1959	189	187	187
query91	182	169	153	153
query92	83	79	73	73
query93	1646	1435	870	870
query94	535	356	300	300
query95	683	487	360	360
query96	1119	796	351	351
query97	2779	2704	2633	2633
query98	236	230	227	227
query99	1180	1158	1039	1039
Total cold run time: 256413 ms
Total hot run time: 173713 ms

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 74.75% (148/198) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.79% (20864/38791)
Line Coverage 37.37% (197699/529005)
Region Coverage 33.68% (154894/459847)
Branch Coverage 34.65% (67381/194440)
