Skip to content

[improvement](fe) TopN lazy materialization support struct/variant nested column pruning#63736

Draft
englefly wants to merge 3 commits into
apache:masterfrom
englefly:lazy-mat-nest
Draft

[improvement](fe) TopN lazy materialization support struct/variant nested column pruning#63736
englefly wants to merge 3 commits into
apache:masterfrom
englefly:lazy-mat-nest

Conversation

@englefly
Copy link
Copy Markdown
Contributor

@englefly englefly commented May 27, 2026

Previously, TopN lazy materialization only worked for top-level scalar
columns. Complex-type projection expressions like struct_element() and
element_at() remained eagerly evaluated at the scan, losing the benefit
of reading the expensive base column for only the TopN-selected rows.

This PR extends TopN lazy materialization to recognize complex-type
projection expressions (struct_element, element_at, map_keys, etc.)
and defer reading their base columns (struct, variant, map, array)
until after the TopN filter step.

Motivation

For queries like:

SELECT id, struct_element(struct_col, 'city')
FROM t ORDER BY id LIMIT 10;

The struct column is very wide (many fields). Previously, struct_col
was read for ALL rows before sorting, even though only 10 rows survive
TopN. Now struct_col becomes a lazy slot — the scan outputs only the
columns needed for sorting (id) + rowId; struct_col is fetched remotely
only for the 10 winning rows, with nested column pruning further
limiting the read to just the 'city' sub-field.

The same applies to variant (element_at + subColPath), map/array
(element_at subscript), and map functions (map_keys, map_values, etc.).

Core Changes

1. LazyMaterializeTopN — restructure plan to pull up PPD/subPath exprs

  • findLeafProject() walks the standard TopN physical plan shape
    (MERGE_SORT → Distribute → LOCAL_SORT → Project) to locate the leaf
    project where complex-type expressions reside.

  • isNestedLazyExpression() identifies lazy candidates via two paths:
    a) containsType(PreferPushDownProject.class) — catches
    StructElement, ElementAt (map/array/variant subscript), MapKeys,
    MapValues, MapContainsKey, MapContainsValue, and other functions
    b) (SlotReference) slot.hasSubColPath() — catches variant sub-path
    slots generated by VariantSubPathPruning

  • The leaf project is split: PPD/subPath expressions are pulled up into
    a new project above Materialize, while their input slots (base columns
    like struct_col, payload, map_col, arr_col) become lazy candidates.

  • replaceLeafProject() rebuilds the plan chain bottom-up after removing
    pulled-up expressions. New PhysicalProject nodes use null
    LogicalProperties because outputs have changed.

  • createLazySlotPruning() returns an anonymous subclass that overrides
    shouldPruneChild() to always return true. This is necessary because
    intermediate nodes (localTopN, distribute) retain stale logical
    properties after plan restructuring that don't include the new lazy
    slots — the default containsAll check would skip the subtree.

  • computeMaterializeSource() gains a fallback: when the lazy candidate
    is not directly in the TopN output (it's a hidden base column only
    referenced by a pulled-up PPD expression), probing starts from the
    leaf project's child (the scan) instead of from the TopN.

  • Base slot access paths (subColPath / allAccessPaths) are carried
    from the baseSlot to the materialize slot so the BE receives the
    correct nested column / sub-path pruning metadata.

  • When moveRowIdsToTail returns null (stale logical properties),
    correctInput is built from materializedSlots + rowIds directly
    instead of using result.getOutput().

2. LazySlotPruning — extract shouldPruneChild() for override

  • The child.getOutput().containsAll(context.lazySlots) guard is
    extracted from visit() into a protected shouldPruneChild() method.
    This allows the caller to override it when logical properties are
    stale, without duplicating the entire traversal logic.

3. OperativeColumnDerive — skip PreferPushDownProject input slots

  • visitLogicalProject() is refactored: input slots are only propagated
    to operative when the output slot is actually needed upstream.
    Previously all input slots of complex expressions were unconditionally
    marked operative, blocking them from lazy materialization.

  • New addOperativeSlotsSkipPreferPushDownProject() traverses the
    expression tree adding Slot nodes to operative, but skips entire
    subtrees rooted at PreferPushDownProject nodes (StructElement,
    ElementAt, MapKeys, etc.). This prevents base columns of complex
    types from being marked operative when they only appear as inputs to
    PPD functions, allowing them to be lazy.

4. PhysicalLazyMaterialize — propagate access paths to lazy outputs

  • When constructing the lazy scan to column mapping, baseSlot access
    paths (allAccessPaths, predicateAccessPaths, display variants)
    are copied to the lazy output slot. This ensures nested column
    pruning metadata survives through the materialization pipeline and
    reaches the BE.

5. MaterializationNode — fix nested column info display

  • printNestedColumns() uses materializeTupleDescriptor instead of
    outputTupleDesc. The latter was never set, so nested column info
    was silently missing from EXPLAIN output.

6. PlanNode — add subColLables to nested columns display

  • printNestedColumns() now prints sub path: [a.b.c] for variant
    sub-column paths, making them visible in EXPLAIN output.

Type Coverage

Type Access Pattern Lazy + Pruning
struct struct_element(s, 'field') per-field pruning
struct struct_element(struct_element(s, 'a'), 'b') multi-level pruning
variant element_at(payload, 'key') subColPath pruning
variant payload['a']['b'] multi-level subPath
map element_at(map_col, 'key') lazy (ACCESS_ALL)
map map_keys(m), map_values(m) keys/values only
map map_contains_key(m, k) keys only
array element_at(arr_col, 0) lazy (ACCESS_ALL)

Note: map/array subscript uses ACCESS_ALL (entire column read) because
individual key/element lookup requires scanning the whole map/array.
The lazy materialization benefit comes from deferring this read to only
the TopN-surviving rows. Struct and variant additionally benefit from
sub-field/key pruning within the lazy read.

Constraints

  • Assumes standard OLAP single-table TopN physical plan shape:
    MERGE_SORT → Distribute → LOCAL_SORT → Project. If the plan shape
    differs, lazy mat is skipped gracefully.

  • Aggregate key tables are excluded (BE limitation).

  • topn_lazy_materialization_threshold must be >= query LIMIT.

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

…sted column pruning

Core changes:
- LazyMaterializeTopN: find projects below TopN with PreferPushDownProject
  or variant subColPath expressions, simplify them to expose base columns
  as lazy candidates. Pull up expressions above Materialize for lazy mat.
- LazySlotPruning: extract shouldPruneChild() for safe override when
  logical properties are stale after plan restructuring
- OperativeColumnDerive: skip PreferPushDownProject input slots from
  operative propagation so struct/variant columns can be lazy
- PhysicalLazyMaterialize: propagate access paths to lazy output slots
- MaterializationNode: use materializeTupleDescriptor for nested column
  info display (was outputTupleDesc, never set)
- PlanNode: add subColLables to printNestedColumns for variant

Tests cover struct (struct_element + pruned type) and variant
(element_at + sub path) lazy materialization with nested column pruning.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@hello-stephen
Copy link
Copy Markdown
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@englefly
Copy link
Copy Markdown
Contributor Author

/review

@englefly
Copy link
Copy Markdown
Contributor Author

run buildall

@englefly
Copy link
Copy Markdown
Contributor Author

run buildall

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found one blocking correctness issue that should be fixed before merge.

Critical checkpoint conclusions:

  • Goal/test coverage: The PR aims to enable TopN lazy materialization for nested/PPD expressions. It adds regression coverage for several id, nested_expr shapes, but misses output-order permutations where the pulled-up expression appears before other selected columns.
  • Scope/focus: The feature is generally focused, but the final TopN projection reconstruction changes user-visible column order in the new pulled-up expression path.
  • Concurrency/lifecycle: No new shared mutable concurrency or non-obvious lifecycle issue found in the reviewed FE paths.
  • Config/compatibility: No new config or incompatible storage/protocol format change found.
  • Parallel paths: Normal and using_index TopN lazy materialization cases were reviewed; the reported issue is specific to the new nested/PPD pull-up path.
  • Special checks/error handling: No ignored Status or FE exception-boundary issue found in the actual PR diff.
  • Tests/results: Regression tests are deterministic, but they do not cover pulled-up nested expressions in non-tail output positions.
  • Observability/performance/transactions: No additional observability, performance, transaction, or persistence blocker found for this PR.

User focus: No additional user-provided review focus was specified.

}
}
outputExprs.addAll(pulledUpExprs);
result = new PhysicalProject(ImmutableList.copyOf(outputExprs), null, result);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This rebuilds the final projection as all non-pulled slots followed by all pulled-up expressions, which changes the user-visible column order whenever a pulled-up nested expression is not already the last selected column. For example select substring(struct_element(struct_col, 'city'), 1) as city, id ... order by id limit 3 has userVisibleOutput = [city, id], but this code produces [id, city], so the result schema and row values are swapped relative to the SQL. Please preserve the original userVisibleOutput order by replacing each pulled slot in-place with its corresponding pulled-up expression, and add a regression case with the nested expression before another selected column.

@hello-stephen
Copy link
Copy Markdown
Contributor

TPC-H: Total hot run time: 32544 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 4897f0cfc299fd90b64a17d6658e5891a6bbc55b, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17656	4380	4235	4235
q2	q3	10791	1465	811	811
q4	4692	481	346	346
q5	7606	2333	2216	2216
q6	239	184	141	141
q7	974	811	636	636
q8	9396	1778	1792	1778
q9	5560	5082	5052	5052
q10	6424	2199	1905	1905
q11	445	276	256	256
q12	698	437	312	312
q13	18165	3446	2782	2782
q14	273	261	244	244
q15	q16	831	775	709	709
q17	1023	1031	1048	1031
q18	7032	5897	5483	5483
q19	1182	1495	1114	1114
q20	595	451	274	274
q21	6049	3012	2891	2891
q22	454	387	328	328
Total cold run time: 100085 ms
Total hot run time: 32544 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4974	4889	5177	4889
q2	q3	5013	5253	4648	4648
q4	2284	2309	1487	1487
q5	5106	4828	4736	4736
q6	249	189	135	135
q7	1914	1774	1642	1642
q8	2621	2422	2303	2303
q9	7889	7506	7453	7453
q10	4762	4722	4207	4207
q11	576	431	387	387
q12	750	755	550	550
q13	3063	3419	2815	2815
q14	271	276	260	260
q15	q16	703	704	617	617
q17	1334	1307	1300	1300
q18	7331	7142	6830	6830
q19	1133	1147	1151	1147
q20	2234	2230	1963	1963
q21	5432	4820	4730	4730
q22	528	475	422	422
Total cold run time: 58167 ms
Total hot run time: 52521 ms

@hello-stephen
Copy link
Copy Markdown
Contributor

TPC-DS: Total hot run time: 172873 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 4897f0cfc299fd90b64a17d6658e5891a6bbc55b, data reload: false

query5	4319	661	547	547
query6	338	233	199	199
query7	4305	539	317	317
query8	334	269	221	221
query9	8791	4119	4096	4096
query10	474	349	299	299
query11	5826	2537	2304	2304
query12	191	131	124	124
query13	1273	597	462	462
query14	6141	5511	5259	5259
query14_1	4571	4539	4581	4539
query15	219	215	186	186
query16	1034	469	442	442
query17	1155	760	637	637
query18	2701	500	377	377
query19	227	213	175	175
query20	145	137	135	135
query21	219	143	124	124
query22	13670	13655	13325	13325
query23	17531	16437	16206	16206
query23_1	16308	16462	16496	16462
query24	7426	1771	1371	1371
query24_1	1320	1350	1341	1341
query25	573	506	454	454
query26	1310	331	178	178
query27	2663	585	345	345
query28	4436	2059	2051	2051
query29	1013	662	515	515
query30	307	247	213	213
query31	1134	1079	955	955
query32	96	79	76	76
query33	557	357	336	336
query34	1241	1146	651	651
query35	785	783	715	715
query36	1396	1375	1224	1224
query37	154	108	90	90
query38	3217	3169	3096	3096
query39	934	923	902	902
query39_1	878	886	876	876
query40	228	146	125	125
query41	65	64	62	62
query42	114	116	108	108
query43	342	334	296	296
query44	
query45	209	209	199	199
query46	1071	1210	742	742
query47	2367	2333	2213	2213
query48	383	425	295	295
query49	638	497	412	412
query50	980	361	257	257
query51	4372	4388	4253	4253
query52	107	104	98	98
query53	258	284	209	209
query54	325	267	258	258
query55	95	91	90	90
query56	311	328	334	328
query57	1441	1421	1298	1298
query58	298	270	280	270
query59	1603	1663	1462	1462
query60	334	319	312	312
query61	155	151	155	151
query62	706	645	592	592
query63	252	205	211	205
query64	2397	842	622	622
query65	
query66	1678	492	360	360
query67	29246	29739	29545	29545
query68	
query69	473	344	310	310
query70	1000	1059	1069	1059
query71	306	284	275	275
query72	3048	2768	2420	2420
query73	890	788	435	435
query74	5111	4993	4809	4809
query75	2707	2615	2264	2264
query76	2274	1169	764	764
query77	414	420	343	343
query78	12467	12492	11823	11823
query79	1446	1004	761	761
query80	687	539	454	454
query81	467	282	240	240
query82	1374	157	124	124
query83	328	271	252	252
query84	263	142	115	115
query85	903	538	452	452
query86	392	341	336	336
query87	3454	3392	3256	3256
query88	3629	2761	2749	2749
query89	455	393	345	345
query90	1933	186	191	186
query91	186	170	141	141
query92	80	74	79	74
query93	1562	1422	830	830
query94	555	364	300	300
query95	670	383	437	383
query96	1117	794	375	375
query97	2744	2746	2606	2606
query98	235	229	226	226
query99	1181	1154	1041	1041
Total cold run time: 254570 ms
Total hot run time: 172873 ms

@hello-stephen
Copy link
Copy Markdown
Contributor

FE Regression Coverage Report

Increment line coverage 75.15% (127/169) 🎉
Increment coverage report
Complete coverage report

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants