[fix](be) struct_element(stru, 'str') offset optimization bug #63564
englefly wants to merge 5 commits into
### What problem does this PR solve?
Issue Number: None
Related PR: None
Problem Summary: Nested column pruning collects struct field access paths by field name, but expressions such as struct_element(s, 2) and element_at(s, 2) still use the original ordinal. After the scan slot type is pruned to only selected struct fields, that ordinal can refer to the pruned struct shape instead of the original struct shape. Convert ordinal-based StructElement expressions to field-name access while SlotTypeReplacer rewrites the scan slot type, so the expression keeps the original field semantics after nested column pruning and offset-only string reads.
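The ordinal hazard described above can be illustrated with a toy model. This is only a sketch: `StructShape`, `field_by_ordinal`, `prune`, and `normalize_ordinal` are hypothetical names, not Doris FE or BE APIs, and ordinals are assumed to be 1-based.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model of a struct type: just an ordered list of field names.
using StructShape = std::vector<std::string>;

// Ordinal access, assumed 1-based to mirror struct_element(s, 2).
const std::string& field_by_ordinal(const StructShape& shape, std::size_t ordinal) {
    assert(ordinal >= 1 && ordinal <= shape.size());
    return shape[ordinal - 1];
}

// Keep only the fields named in `wanted`, preserving the original order.
StructShape prune(const StructShape& shape, const std::vector<std::string>& wanted) {
    StructShape out;
    for (const auto& name : shape) {
        for (const auto& w : wanted) {
            if (name == w) {
                out.push_back(name);
                break;
            }
        }
    }
    return out;
}

// Resolve the ordinal against the ORIGINAL shape, before pruning, so the
// expression can carry a field name instead of a position.
std::string normalize_ordinal(const StructShape& original, std::size_t ordinal) {
    return field_by_ordinal(original, ordinal);
}
```

With an original shape of `{a, b, c}` pruned to `{b, c}`, ordinal 2 names `b` before pruning and `c` afterwards, while the normalized name stays `b` regardless of the pruned shape.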
### Release note
Fix incorrect struct field access for ordinal-based struct_element/element_at when nested column pruning is enabled.
### Check List (For Author)
- Test: Regression test / FE check
- tools/fast-compile-fe.sh
- cd fe && mvn checkstyle:check -pl fe-core -q
- ./run-regression-test.sh --run -d nereids_rules_p0/column_pruning -s string_length_column_pruning -forceGenOut
- ./run-regression-test.sh --run -d nereids_rules_p0/column_pruning -s string_length_column_pruning
- Behavior changed: Yes (ordinal-based struct field access is normalized to name-based access during nested column pruning)
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
### What problem does this PR solve?
Issue Number: None
Related PR: None
Problem Summary: IS NULL and IS NOT NULL on ordinal-based struct field access have the same nested column pruning risk as other struct_element(struct, ordinal) expressions. The access path is collected by original field name, but after scan slot pruning the expression would be unsafe if it still used the original ordinal. Add regression coverage for struct_element(struct_col, 2) and element_at(struct_col, 2) through NULL-only access paths.
### Release note
None
### Check List (For Author)
- Test: Regression test
- ./run-regression-test.sh --run -d nereids_rules_p0/column_pruning -s null_column_pruning -forceGenOut
- ./run-regression-test.sh --run -d nereids_rules_p0/column_pruning -s null_column_pruning
- Behavior changed: No
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
### What problem does this PR solve?
Issue Number: None
Related PR: None
Problem Summary: Revert the previous struct ordinal normalization fix and its regression tests because they do not address the original nested column pruning bug. The original plan already normalizes struct_element(struct_col, 5) to struct_element(struct_col, 'c_string'), so the incorrect result must be fixed in the OFFSET-only nested string read path instead of by rewriting ordinal access.
### Release note
None
### Check List (For Author)
- Test: Manual test
- Reproduced the original query with enable_prune_nested_column=true/false and confirmed the remaining issue is unrelated to ordinal normalization
- Behavior changed: No
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
### What problem does this PR solve?
Issue Number: None
Related PR: None
Problem Summary: length(struct_element(...)) can read a struct string field through an OFFSET-only nested column path. FunctionStructElement cloned the selected child column before wrapping it as nullable, which could break the OFFSET-only string column state in scan projection and aggregation expressions and produce wrong length group keys. Return the selected child column directly with nullable wrapping so length() consumes the original OFFSET-only column while preserving nested column pruning performance.
### Release note
Fix wrong length(struct_element(...)) results when nested column pruning uses OFFSET-only string access paths.
### Check List (For Author)
- Test: Regression test
- ./run-regression-test.sh --run -d nereids_rules_p0/column_pruning -s string_length_column_pruning -forceGenOut
- ./run-regression-test.sh --run -d nereids_rules_p0/column_pruning -s string_length_column_pruning
- ./build.sh --be
- Behavior changed: Yes (fix incorrect query result)
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
### What problem does this PR solve?
Issue Number: None
Related PR: None
Problem Summary: length(struct_element(...)) can use an OFFSET-only access path for a string field inside a struct. When the original struct also contains CHAR fields, scan output can run char padding shrink recursively on the pruned struct and recompute the OFFSET-only string offsets from placeholder chars, producing wrong length group keys. Fill OFFSET-only string placeholder chars with non-zero bytes so the offsets remain stable while preserving the .OFFSET read path.
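The interaction between placeholder bytes and padding shrink can be sketched with a toy column. This is a simplified model under stated assumptions, not the Doris `ColumnString` implementation: `ToyStringColumn`, `read_offsets_only`, and `shrink_padding` are hypothetical, and the shrink step is assumed to cut each value at its first zero byte and recompute offsets from what remains.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy string column: offsets[i] is the end position of row i in chars.
struct ToyStringColumn {
    std::vector<char> chars;
    std::vector<std::size_t> offsets;

    std::size_t length(std::size_t row) const {
        assert(row < offsets.size());
        return offsets[row] - (row == 0 ? 0 : offsets[row - 1]);
    }
};

// OFFSET-only read: the lengths are known, the payload is not.
// `fill` is the placeholder byte written in place of the real payload.
ToyStringColumn read_offsets_only(const std::vector<std::size_t>& lengths, char fill) {
    ToyStringColumn col;
    std::size_t end = 0;
    for (std::size_t len : lengths) {
        end += len;
        col.offsets.push_back(end);
    }
    col.chars.assign(end, fill);
    return col;
}

// Simplified stand-in for CHAR padding shrink: each value is cut at its
// first zero byte and the offsets are rebuilt from the surviving bytes.
void shrink_padding(ToyStringColumn& col) {
    std::vector<char> new_chars;
    std::vector<std::size_t> new_offsets;
    std::size_t begin = 0;
    for (std::size_t end : col.offsets) {
        const std::size_t len = end - begin;
        const char* p = col.chars.data() + begin;
        std::size_t kept = 0;
        while (kept < len && p[kept] != '\0') {
            ++kept;
        }
        new_chars.insert(new_chars.end(), p, p + kept);
        new_offsets.push_back(new_chars.size());
        begin = end;
    }
    col.chars = std::move(new_chars);
    col.offsets = std::move(new_offsets);
}
```

In this model, zero placeholders make every row collapse to length 0 after the shrink, whereas non-zero placeholders leave the lengths unchanged, at the cost of writing one byte per logical byte.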
### Release note
Fix wrong length(struct_element(...)) results when nested column pruning reads struct string fields through OFFSET-only paths.
### Check List (For Author)
- Test: Regression test / Manual test
- ./run-regression-test.sh --run -d nereids_rules_p0/column_pruning -s string_length_column_pruning -forceGenOut
- ./run-regression-test.sh --run -d nereids_rules_p0/column_pruning -s string_length_column_pruning
- ./build.sh --be
- Manual SQL comparison with enable_prune_nested_column=false/true for data_test/bug/b.sql and pure length group-by
- Behavior changed: Yes (fix incorrect query result)
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
TPC-H: Total hot run time: 31598 ms
TPC-DS: Total hot run time: 172931 ms
I found one blocking issue. The fix targets a struct-with-CHAR shrink path, but the implementation changes the global OFFSET_ONLY string reader to materialize and initialize synthetic chars for every logical byte. That turns length-only scans on large STRING/VARCHAR columns from offset-only metadata processing into O(total string bytes) memory writes, which defeats a key purpose of the optimization.
Critical checkpoints: goal is clear and tests cover the reported struct length/group-by case, but the implementation is broader than necessary. The change is small locally but not sufficiently focused because it affects all OFFSET_ONLY string reads. No new concurrency, lifecycle, configuration, protocol, persistence, transaction, or FE/BE variable-passing concerns were found. Data correctness is improved for the tested case, but performance has a blocking regression risk on hot scan paths. Test coverage includes the new positive regression; I did not run tests in this review runner. Existing review threads: none. User focus: no additional focus was provided.
// OFFSET_ONLY columns carry valid offsets but no real string payload. Use non-zero
// placeholders so char-padding shrink logic cannot recompute these offsets as zero-length
// strings when this column is nested under a struct that also contains CHAR fields.
chars.resize_fill(offsets[old_rows + num - 1], 1);
This makes every OFFSET_ONLY string read write one synthetic byte for every logical byte in the column. The previous resize() only advanced the chars size after allocating, while resize_fill(..., 1) touches the whole appended range. For a query such as select length(big_string_col) ... with nested pruning enabled, the BE still only needs offsets, but this now performs O(total string bytes) memory writes per block and can dominate the scan for large values, even though the CHAR/struct shrink issue only applies to the later shrink_padding_chars() path. Please keep the general OFFSET_ONLY path sparse and fix the shrink path more narrowly, e.g. by preventing shrink from recomputing offsets for offset-only string children or only materializing placeholders when that specific shrink path is actually required.
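One way to read the narrower fix suggested here is a guard on the shrink path, so the offset-only read never touches payload bytes. The sketch below is an assumption-laden toy, not Doris code: `SparseStringColumn`, its `offsets_only` flag, `make_offsets_only`, and `shrink_padding_guarded` are all hypothetical, and the toy leaves `chars` empty rather than resized.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct SparseStringColumn {
    std::vector<std::size_t> offsets;  // end position of each row
    std::vector<char> chars;           // left empty for offset-only reads in this toy
    bool offsets_only = false;         // hypothetical marker, not a Doris field
};

// Offset-only read that stays sparse: no placeholder bytes are written.
SparseStringColumn make_offsets_only(const std::vector<std::size_t>& lengths) {
    SparseStringColumn col;
    col.offsets_only = true;
    std::size_t end = 0;
    for (std::size_t len : lengths) {
        end += len;
        col.offsets.push_back(end);
    }
    return col;
}

// Guarded shrink: returns how many payload bytes were inspected.
// Offset-only columns are skipped, so their offsets are never recomputed.
std::size_t shrink_padding_guarded(SparseStringColumn& col) {
    if (col.offsets_only) {
        return 0;
    }
    assert(col.offsets.empty() || col.chars.size() == col.offsets.back());
    std::vector<char> kept_chars;
    std::vector<std::size_t> kept_offsets;
    std::size_t begin = 0;
    std::size_t inspected = 0;
    for (std::size_t end : col.offsets) {
        std::size_t kept = 0;
        while (begin + kept < end && col.chars[begin + kept] != '\0') {
            ++kept;
        }
        inspected += end - begin;
        const auto first = col.chars.begin() + static_cast<std::ptrdiff_t>(begin);
        kept_chars.insert(kept_chars.end(), first, first + static_cast<std::ptrdiff_t>(kept));
        kept_offsets.push_back(kept_chars.size());
        begin = end;
    }
    col.chars = std::move(kept_chars);
    col.offsets = std::move(kept_offsets);
    return inspected;
}
```

Under this design the cost of a length-only scan stays proportional to the number of rows, and only columns with a real payload pay for the per-byte shrink.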
### What problem does this PR solve?
ColumnString::insert_offsets_from_lengths() should fill \1, not \0, so that the offsets are not rewritten by a later ColumnString::shrink_padding_chars().
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
### Release note
None
### Check List (For Author)
- Test
- Behavior changed:
- Does this need documentation?