
[Improve](variant) Add streaming compaction writer for NestedGroup #61383

Open
eldenmoon wants to merge 1 commit into apache:master from eldenmoon:stream-writer-ng

@eldenmoon
Member

Build a NestedGroup streaming write plan from input rowset metadata and use it to initialize root, regular subcolumn and nested-group writers during compaction, so Variant data can be written chunk by chunk instead of being fully materialized first.

Refactor shared subcolumn writer helpers, allow the NestedGroup provider to consume prebuilt groups or a streaming plan, and skip sparse/doc path-stat collection for nested-group variant columns.

Add tests for streaming plan construction and array subcolumn writes across multiple batches.
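
To make the difference between full materialization and chunk-by-chunk writing concrete, here is a minimal sketch. `ChunkSink`, `write_materialized`, and `write_streaming` are hypothetical stand-ins invented for illustration, not classes from this PR; integer vectors take the place of Variant column chunks.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sink that records the largest buffer it was ever handed.
struct ChunkSink {
    std::size_t rows_written = 0;
    std::size_t peak_buffered_rows = 0;

    void append(const std::vector<int>& chunk) {
        peak_buffered_rows = std::max(peak_buffered_rows, chunk.size());
        rows_written += chunk.size();
    }
};

// Materializing style: concatenate every input first, then write once.
inline void write_materialized(const std::vector<std::vector<int>>& inputs, ChunkSink& sink) {
    std::vector<int> all;
    for (const auto& in : inputs) {
        all.insert(all.end(), in.begin(), in.end());
    }
    sink.append(all);
}

// Streaming style: hand each input chunk to the sink as it arrives.
inline void write_streaming(const std::vector<std::vector<int>>& inputs, ChunkSink& sink) {
    for (const auto& in : inputs) {
        sink.append(in);
    }
}
```

Both functions write the same number of rows; in this toy model the streaming variant only ever holds one chunk at a time, so its peak buffer is bounded by the largest chunk rather than by the total input size.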

Copilot AI review requested due to automatic review settings March 16, 2026 09:07
@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@eldenmoon
Member Author

run buildall

@eldenmoon
Member Author

run buildall

Copilot AI left a comment

Pull request overview

Adds a streaming compaction write path for VARIANT columns when NestedGroup is enabled, building a streaming plan from input rowset metadata so compaction can write chunk-by-chunk (avoiding full VARIANT materialization). Also refactors shared subcolumn-writer helpers, extends the NestedGroup provider interface to accept prebuilt groups or a streaming plan, and disables sparse/doc path-stat handling for NestedGroup VARIANT columns.

Changes:

  • Introduce VariantStreamingCompactionWriter and NestedGroupStreamingWritePlan to drive chunked compaction writes for NestedGroup-enabled VARIANT.
  • Refactor common VARIANT subcolumn writer logic into variant_writer_helpers and update legacy writers to use it.
  • Update NestedGroup routing/provider/path utilities and add new unit tests for streaming plan construction and multi-batch array subcolumn writes.

Reviewed changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| be/test/storage/segment/variant_column_writer_reader_test.cpp | Adds compaction streaming tests + helpers for creating rowsets/readers and validating streamed writes. |
| be/test/storage/segment/nested_group_provider_test.cpp | Updates default provider test to match the new provider API / error behavior. |
| be/test/storage/segment/nested_group_path_test.cpp | Adds prefix/overlap unit tests for NestedGroup path utilities. |
| be/test/storage/segment/nested_group_path_filter_test.cpp | Adds descendant match test coverage for path filtering. |
| be/src/storage/segment/variant/variant_writer_helpers.h | New shared helpers for converting/writing columns and preparing subcolumn writers. |
| be/src/storage/segment/variant/variant_streaming_compaction_writer.h | Declares the streaming compaction writer and its lifecycle. |
| be/src/storage/segment/variant/variant_streaming_compaction_writer.cpp | Implements streaming chunk append + root/subcolumn writes + provider integration. |
| be/src/storage/segment/variant/variant_column_writer_impl.h | Hooks streaming writer into the existing VARIANT writer implementation. |
| be/src/storage/segment/variant/variant_column_writer_impl.cpp | Refactors finalize path, adds prebuilt-groups option, and uses shared writer helpers. |
| be/src/storage/segment/variant/variant_column_reader.h | Adjusts includes / iterator surface for VARIANT reader. |
| be/src/storage/segment/variant/variant_column_reader.cpp | Skips merging sparse/doc stats when NestedGroup disables path stats; adjusts NG index-path handling. |
| be/src/storage/segment/variant/nested_group_streaming_write_plan.h | New data structures for streaming write planning. |
| be/src/storage/segment/variant/nested_group_streaming_write_plan.cpp | Builds streaming plan from input rowset segment metadata, including conflict detection. |
| be/src/storage/segment/variant/nested_group_routing_plan.h | Exposes shared conflict formatting/validation and candidate-based routing build. |
| be/src/storage/segment/variant/nested_group_routing_plan.cpp | Reuses shared prefix/overlap helpers and shared conflict validation. |
| be/src/storage/segment/variant/nested_group_provider.h | Extends provider API for prebuilt groups + streaming plan and adjusts path filter logic. |
| be/src/storage/segment/variant/nested_group_provider.cpp | Implements default provider arg validation, real nested-group path matching, and no-op builder stub. |
| be/src/storage/segment/variant/nested_group_path.h | Adds prefix/overlap helpers used across routing/planning/filtering. |
| be/src/storage/segment/variant/nested_group_builder.h | Adds NestedGroup builder interface/data structures for building NGs from JSONB. |
| be/src/storage/segment/column_writer.h | Adds test accessor to reach VariantColumnWriterImpl. |
| be/src/exec/common/variant_util.h | Adds helpers to disable path-stats/binary-column behavior when NestedGroup is enabled. |
| be/src/exec/common/variant_util.cpp | Updates compaction stats/schema logic to skip work for NestedGroup-enabled VARIANT columns. |
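
The summary mentions prefix/overlap helpers for NestedGroup paths without showing them. As a rough illustration of what such helpers might look like, the sketch below assumes paths are dot-separated strings and that two paths overlap when one is an ancestor of (or equal to) the other. `is_path_prefix` and `paths_overlap` are hypothetical names, and the actual semantics in nested_group_path.h may differ.

```cpp
#include <cassert>
#include <string_view>

// Assumption: paths are dot-separated, e.g. "a.b.c".
// True when `prefix` equals `path` or ends exactly at a '.' boundary of `path`.
inline bool is_path_prefix(std::string_view prefix, std::string_view path) {
    if (prefix.size() > path.size()) {
        return false;
    }
    if (path.substr(0, prefix.size()) != prefix) {
        return false;
    }
    // Reject partial-segment matches such as "a.b" vs "a.bc".
    return prefix.size() == path.size() || path[prefix.size()] == '.';
}

// Two paths overlap when either one is a prefix of the other.
inline bool paths_overlap(std::string_view a, std::string_view b) {
    return is_path_prefix(a, b) || is_path_prefix(b, a);
}
```

The boundary check is the part most worth testing: a plain string-prefix comparison would wrongly treat `a.b` as an ancestor of `a.bc`.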

Comment on lines 450 to 457
    Status read_by_rowids(const rowid_t* rowids, const size_t count,
                          MutableColumnPtr& dst) override;

    ordinal_t get_current_ordinal() const override { return _inner_iter->get_current_ordinal(); }

    Status read_null_map(size_t* n, NullMap& null_map) override {
        return _inner_iter->read_null_map(n, null_map);
    }

    Status init_prefetcher(const SegmentPrefetchParams& params) override;
    void collect_prefetchers(
            std::map<PrefetcherInitMethod, std::vector<SegmentPrefetcher*>>& prefetchers,
Comment on lines +991 to +995
    for (const auto& column : output->tablet_schema()->columns()) {
        if (!column->is_variant_type()) {
            continue;
        }
    }
Comment on lines 681 to +682

    VariantColumnWriterImpl* impl_for_test() const { return _impl.get(); }
@eldenmoon
Member Author

/review

Contributor

@github-actions github-actions bot left a comment

Code Review Summary

This PR adds a streaming compaction writer for NestedGroup variant columns, enabling chunk-by-chunk writing during compaction instead of full materialization. The architecture is clean and well-structured overall. Below are the critical checkpoint conclusions and findings.

Critical Checkpoint Analysis (Part 1.3)

Goal & Correctness: The PR accomplishes its stated goal of enabling streaming writes for NestedGroup variant columns during compaction. The VariantStreamingCompactionWriter correctly orchestrates root column, regular subcolumn, and nested group writes per chunk. An end-to-end test (test_streaming_compaction_writer_streams_regular_array_paths_across_batches) verifies the round-trip. However, there is a dead code issue in variant_util.cpp (see inline comment).

Modification Focus: The change is well-scoped to the variant/nested-group write path with appropriate refactoring of shared helpers into variant_writer_helpers.h.

Concurrency: No new concurrency is introduced. The streaming writer operates single-threaded within the compaction write path. The existing concurrency test (test_concurrent_load_external_meta_and_get_subcolumn_meta) covers the read path. No issues found.

Lifecycle Management: VariantStreamingCompactionWriter has a clear phase-based lifecycle (UNINITIALIZED -> INITIALIZED -> APPENDING -> CLOSED) with proper state checks. The raw pointers to _nested_group_provider and _tablet_column are safe because their owner (VariantColumnWriterImpl) outlives the streaming writer. There are no circular references.
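
A phase-based lifecycle of this shape can be sketched as a small state machine. `PhasedWriter` is a hypothetical class written for illustration; it returns `bool` where the real writer presumably returns a `Status`, and whether closing directly from INITIALIZED (with no appended chunks) is allowed is an assumption made here.

```cpp
#include <cassert>
#include <cstddef>

enum class Phase { UNINITIALIZED, INITIALIZED, APPENDING, CLOSED };

// Hypothetical writer that rejects calls made in the wrong phase.
class PhasedWriter {
public:
    bool init() {
        if (_phase != Phase::UNINITIALIZED) return false;
        _phase = Phase::INITIALIZED;
        return true;
    }

    bool append(std::size_t rows) {
        // Appending is only legal after init() and before close().
        if (_phase != Phase::INITIALIZED && _phase != Phase::APPENDING) return false;
        _phase = Phase::APPENDING;
        _rows += rows;
        return true;
    }

    bool close() {
        // Assumption: closing an initialized but empty writer is permitted.
        if (_phase != Phase::INITIALIZED && _phase != Phase::APPENDING) return false;
        _phase = Phase::CLOSED;
        return true;
    }

    Phase phase() const { return _phase; }
    std::size_t rows() const { return _rows; }

private:
    Phase _phase = Phase::UNINITIALIZED;
    std::size_t _rows = 0;
};
```

Each entry point validates the current phase before doing any work, so misuse such as appending before `init()` or after `close()` fails immediately instead of corrupting state.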

Configuration: No new configuration items added. The streaming path is activated when variant_enable_nested_group() is true and it's a compaction write with input readers present.

Incompatible Changes: No storage format changes; the output format is identical to the legacy path.

Parallel Code Paths: The streaming writer correctly mirrors all operations from the legacy VariantColumnWriterImpl path (finish, write_data, write_ordinal_index, write_zone_map, write_inverted_index, write_bloom_filter_index). All delegation points in VariantColumnWriterImpl check for _streaming_compaction_writer != nullptr consistently.
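
The delegation pattern described here can be sketched as follows. `StreamingWriter` and `WriterImpl` are simplified, hypothetical stand-ins; only the null check on the streaming writer mirrors what the review describes, and string return values replace the real `Status` results.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical streaming writer with a single lifecycle operation.
struct StreamingWriter {
    std::string finish() { return "streaming"; }
};

// Hypothetical owner that forwards to the streaming writer when one exists.
class WriterImpl {
public:
    explicit WriterImpl(bool use_streaming) {
        if (use_streaming) {
            _streaming = std::make_unique<StreamingWriter>();
        }
    }

    std::string finish() {
        if (_streaming != nullptr) {
            return _streaming->finish();  // streaming path takes over entirely
        }
        return "legacy";  // fall back to the legacy materializing path
    }

private:
    std::unique_ptr<StreamingWriter> _streaming;
};
```

Every public operation would repeat the same check, which is why the review verifies that no delegation point was missed.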

Test Coverage: Good coverage with the new end-to-end streaming test and unit tests for the write plan construction. Negative cases for error paths are tested via phase checks. The find_in_nested_groups implementation has unit tests covering empty readers, but not the full search logic (acceptable since real NestedGroupReaders require EE wiring).

Observability: No new metrics or logging added. The streaming path reuses existing variant statistics infrastructure. For a first iteration this is acceptable, though a VLOG for streaming plan selection would help debugging.

Performance: The streaming approach avoids materializing all data at once, which is the core performance improvement. insert_range_from + finalize() per chunk in _append_input creates a temporary copy; this is acceptable for correctness but could be optimized later if chunks are already finalized. No redundant operations found in hot paths.

Other Issues: See inline comments for a dead code loop in variant_util.cpp and a minor observation about early-return semantics in check_path_stats.

        if (!column->is_variant_type()) {
            continue;
        }
    }

[Bug: Dead code / empty loop body]

This loop iterates over columns, skips non-variant types via continue, but then does nothing for variant-type columns — the loop body is entirely empty. This appears to be leftover code from an incomplete refactoring.

If the intent was to further filter variant columns (e.g., skip columns matching some condition before proceeding to the stats aggregation below), the filtering logic is missing. If the intent was just the early-return in the loop above (lines 986-990), then this loop should be removed entirely.

// This loop does nothing — remove it or add the intended logic
for (const auto& column : output->tablet_schema()->columns()) {
    if (!column->is_variant_type()) {
        continue;
    }
    // <-- empty body for variant columns
}

    for (const auto& column : output->tablet_schema()->columns()) {
        if (column->is_variant_type() && !should_check_variant_path_stats(*column)) {
            return Status::OK();
        }

[Correctness concern: early-return skips all variant columns]

This loop returns OK as soon as it finds any variant column where should_check_variant_path_stats() returns false (i.e., any column with nested groups enabled). This means if the schema has two variant columns — one with nested groups and one without — the path stats check for the non-nested-group variant is skipped entirely.

Was the intent to skip only the nested-group variant columns from path stats checking, while still checking the others? If so, this should collect which variant UIDs to skip rather than returning early for all columns.

Similarly, the equivalent check at lines 943-951 in the same function has the same early-return-for-all semantics from input rowset columns.
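
One possible shape for the per-column alternative suggested above is sketched below. `Column`, `should_check_path_stats`, and `variant_uids_to_check` are simplified, hypothetical stand-ins, not the types or helpers in variant_util.cpp; the sketch assumes the nested-group flag alone decides whether a column is skipped.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical, simplified column descriptor.
struct Column {
    int32_t uid;
    bool is_variant;
    bool nested_group_enabled;
};

// Assumption: path stats are skipped only when nested groups are enabled.
inline bool should_check_path_stats(const Column& c) {
    return !c.nested_group_enabled;
}

// Collect the UIDs to skip, then return the variant UIDs that still need checking,
// instead of returning early for the whole schema.
inline std::vector<int32_t> variant_uids_to_check(const std::vector<Column>& columns) {
    std::unordered_set<int32_t> skip;
    for (const auto& c : columns) {
        if (c.is_variant && !should_check_path_stats(c)) {
            skip.insert(c.uid);
        }
    }
    std::vector<int32_t> to_check;
    for (const auto& c : columns) {
        if (c.is_variant && skip.count(c.uid) == 0) {
            to_check.push_back(c.uid);
        }
    }
    return to_check;
}
```

With a schema that mixes a nested-group variant and a regular variant, only the regular one is returned, so its path stats would still be validated.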

@doris-robot

TPC-H: Total hot run time: 26818 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit b2f9151b280dcf43f469bf8e0cf5ceecd352417e, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17624	4435	4272	4272
q2	q3	10641	774	520	520
q4	4671	342	250	250
q5	7552	1236	999	999
q6	175	175	145	145
q7	801	876	664	664
q8	9289	1479	1334	1334
q9	4959	4800	4657	4657
q10	6247	1925	1660	1660
q11	479	264	237	237
q12	748	582	465	465
q13	18049	2930	2175	2175
q14	234	233	206	206
q15	q16	710	742	671	671
q17	735	807	478	478
q18	5980	5381	5339	5339
q19	1115	974	617	617
q20	540	499	371	371
q21	4349	1821	1389	1389
q22	341	479	369	369
Total cold run time: 95239 ms
Total hot run time: 26818 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4807	4586	4577	4577
q2	q3	3873	4333	3816	3816
q4	890	1199	773	773
q5	4109	4379	4404	4379
q6	186	172	146	146
q7	1766	1668	1531	1531
q8	2555	2689	2590	2590
q9	7834	7626	7366	7366
q10	3818	3975	3558	3558
q11	517	428	419	419
q12	488	569	455	455
q13	2722	3092	2320	2320
q14	282	386	398	386
q15	q16	746	787	713	713
q17	1254	1357	1358	1357
q18	7233	6964	6726	6726
q19	913	909	908	908
q20	2129	2242	2004	2004
q21	4030	3485	3338	3338
q22	470	456	381	381
Total cold run time: 50622 ms
Total hot run time: 47743 ms

@doris-robot

TPC-H: Total hot run time: 27146 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit b2f9151b280dcf43f469bf8e0cf5ceecd352417e, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17609	4430	4296	4296
q2	q3	10646	782	525	525
q4	4682	368	249	249
q5	7556	1231	1014	1014
q6	170	172	147	147
q7	815	839	677	677
q8	9311	1481	1381	1381
q9	4748	4714	4715	4714
q10	6307	1942	1670	1670
q11	484	275	247	247
q12	709	580	465	465
q13	18074	3022	2211	2211
q14	232	231	222	222
q15	q16	765	748	668	668
q17	715	877	425	425
q18	6006	5572	5358	5358
q19	1511	976	598	598
q20	547	502	375	375
q21	4807	1860	1609	1609
q22	448	377	295	295
Total cold run time: 96142 ms
Total hot run time: 27146 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4739	4594	4549	4549
q2	q3	3901	4404	3832	3832
q4	885	1192	750	750
q5	4098	4392	4386	4386
q6	180	177	141	141
q7	1786	1666	1512	1512
q8	2524	2706	2542	2542
q9	7813	7490	7349	7349
q10	3779	3968	3616	3616
q11	497	434	421	421
q12	490	618	441	441
q13	2727	3121	2350	2350
q14	294	310	280	280
q15	q16	764	766	721	721
q17	1134	1351	1327	1327
q18	7335	6930	6833	6833
q19	863	899	905	899
q20	2054	2169	1997	1997
q21	3986	3536	3281	3281
q22	465	437	374	374
Total cold run time: 50314 ms
Total hot run time: 47601 ms

@doris-robot

TPC-DS: Total hot run time: 167998 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit b2f9151b280dcf43f469bf8e0cf5ceecd352417e, data reload: false

query5	4318	640	496	496
query6	366	220	198	198
query7	4282	469	267	267
query8	352	252	236	236
query9	8726	2754	2752	2752
query10	568	394	329	329
query11	6980	5091	4880	4880
query12	182	133	128	128
query13	1278	461	349	349
query14	5643	3719	3426	3426
query14_1	2846	2776	2795	2776
query15	208	203	182	182
query16	979	479	455	455
query17	893	741	618	618
query18	2466	454	347	347
query19	224	210	189	189
query20	141	128	129	128
query21	216	134	111	111
query22	13449	13997	14748	13997
query23	16268	15927	15648	15648
query23_1	15615	15525	15321	15321
query24	7252	1613	1237	1237
query24_1	1236	1203	1228	1203
query25	545	504	399	399
query26	1231	254	144	144
query27	2811	478	296	296
query28	4516	1870	1826	1826
query29	791	559	474	474
query30	305	225	191	191
query31	1013	950	864	864
query32	89	73	70	70
query33	511	329	287	287
query34	919	880	516	516
query35	630	697	591	591
query36	1046	1075	993	993
query37	129	94	83	83
query38	2963	2917	2872	2872
query39	857	849	809	809
query39_1	794	806	796	796
query40	229	151	134	134
query41	66	57	58	57
query42	258	258	255	255
query43	237	251	218	218
query44	
query45	196	184	182	182
query46	869	977	598	598
query47	2723	2115	2057	2057
query48	308	315	232	232
query49	622	464	379	379
query50	681	287	215	215
query51	4082	4056	4026	4026
query52	259	266	261	261
query53	287	336	281	281
query54	297	264	263	263
query55	95	87	85	85
query56	331	328	313	313
query57	1972	1858	1768	1768
query58	288	279	270	270
query59	2800	2945	2741	2741
query60	342	336	330	330
query61	150	151	150	150
query62	634	567	543	543
query63	306	283	273	273
query64	4915	1272	991	991
query65	
query66	1451	451	355	355
query67	24218	24276	24152	24152
query68	
query69	403	303	291	291
query70	990	909	965	909
query71	343	315	302	302
query72	2869	2861	2591	2591
query73	558	541	321	321
query74	9643	9602	9395	9395
query75	2892	2783	2461	2461
query76	2301	1011	672	672
query77	371	382	304	304
query78	10872	11186	10475	10475
query79	1069	799	594	594
query80	867	604	530	530
query81	500	257	223	223
query82	1321	160	117	117
query83	348	258	240	240
query84	296	115	94	94
query85	894	483	439	439
query86	400	319	291	291
query87	3119	3103	3056	3056
query88	3502	2648	2597	2597
query89	429	367	350	350
query90	1804	186	181	181
query91	175	164	135	135
query92	76	72	72	72
query93	915	862	499	499
query94	508	309	296	296
query95	590	341	316	316
query96	643	508	228	228
query97	2476	2482	2431	2431
query98	237	222	228	222
query99	1029	971	884	884
Total cold run time: 248895 ms
Total hot run time: 167998 ms

@doris-robot

TPC-DS: Total hot run time: 167825 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit b2f9151b280dcf43f469bf8e0cf5ceecd352417e, data reload: false

query5	4400	648	499	499
query6	322	226	194	194
query7	4218	463	262	262
query8	337	244	220	220
query9	8689	2700	2721	2700
query10	527	413	348	348
query11	6965	5104	4895	4895
query12	183	127	122	122
query13	1256	457	336	336
query14	5764	3707	3473	3473
query14_1	2761	2739	2735	2735
query15	194	190	172	172
query16	988	478	439	439
query17	856	721	593	593
query18	2426	437	335	335
query19	211	201	176	176
query20	148	124	122	122
query21	213	136	106	106
query22	13202	13918	14758	13918
query23	16270	15896	15770	15770
query23_1	15753	15519	15291	15291
query24	7151	1610	1210	1210
query24_1	1234	1212	1198	1198
query25	528	458	448	448
query26	1243	261	148	148
query27	2785	480	293	293
query28	4537	1822	1815	1815
query29	831	568	468	468
query30	298	234	186	186
query31	992	939	868	868
query32	87	73	70	70
query33	517	329	285	285
query34	894	867	506	506
query35	630	683	602	602
query36	1090	1171	986	986
query37	133	97	81	81
query38	2926	2905	2872	2872
query39	868	841	809	809
query39_1	793	785	786	785
query40	225	150	133	133
query41	63	63	59	59
query42	259	256	256	256
query43	243	241	223	223
query44	
query45	196	192	182	182
query46	871	982	603	603
query47	2128	2164	2075	2075
query48	310	323	226	226
query49	625	455	378	378
query50	684	281	212	212
query51	4108	4077	3996	3996
query52	259	267	254	254
query53	306	337	283	283
query54	301	275	260	260
query55	93	91	86	86
query56	320	340	324	324
query57	1909	1707	1741	1707
query58	298	284	278	278
query59	2815	2925	2747	2747
query60	359	357	343	343
query61	179	172	180	172
query62	635	583	528	528
query63	310	291	276	276
query64	5155	1401	1095	1095
query65	
query66	1492	473	372	372
query67	24203	24267	24213	24213
query68	
query69	420	318	293	293
query70	989	983	868	868
query71	343	316	307	307
query72	2983	2715	2474	2474
query73	528	537	314	314
query74	9618	9529	9404	9404
query75	2852	2747	2462	2462
query76	2278	1047	670	670
query77	367	367	312	312
query78	11005	11182	10440	10440
query79	1074	826	564	564
query80	1350	622	542	542
query81	563	258	229	229
query82	1020	157	125	125
query83	337	278	245	245
query84	298	117	103	103
query85	925	494	439	439
query86	415	308	296	296
query87	3123	3093	3052	3052
query88	3522	2644	2623	2623
query89	424	370	339	339
query90	2080	178	174	174
query91	170	157	137	137
query92	75	73	70	70
query93	944	846	495	495
query94	639	286	293	286
query95	580	339	332	332
query96	629	516	224	224
query97	2457	2501	2392	2392
query98	232	221	219	219
query99	1042	1003	914	914
Total cold run time: 248669 ms
Total hot run time: 167825 ms

@doris-robot

BE UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 52.73% (19767/37486)
Line Coverage 36.28% (184715/509090)
Region Coverage 32.46% (142612/439396)
Branch Coverage 33.64% (62398/185508)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100% (0/0) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 71.58% (26264/36692)
Line Coverage 54.37% (275851/507381)
Region Coverage 51.58% (228739/443450)
Branch Coverage 53.06% (98692/185996)
