[Chore](be) Check column sanity across operator outputs#63518
[Chore](be) Check column sanity across operator outputs#63518BiteTheDDDDt wants to merge 1 commit into
Conversation
### What problem does this PR solve?
Issue Number: None
Related PR: None
Problem Summary: Invalid column internals, such as a nullable column whose nested column size differs from its null map size, may only fail later when a downstream operator reads the nested data. Run column sanity checks at pipeline operator output boundaries, including projection outputs, so malformed columns are reported closer to their producer. Also keep targeted nested loop join lazy materialization checks at the source-copy points that exposed the issue.
### Release note
None
### Check List (For Author)
- Test: Manual test
- build-support/clang-format.sh
- build-support/check-format.sh
- git diff --check
- ninja -C be/build_Release src/core/CMakeFiles/Core.dir/block/block.cpp.o src/core/CMakeFiles/Core.dir/column/column_string.cpp.o src/exec/CMakeFiles/Exec.dir/operator/operator.cpp.o src/exec/CMakeFiles/Exec.dir/operator/nested_loop_join_probe_operator.cpp.o
- Behavior changed: No
- Does this need documentation: No
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
/review |
|
run buildall |
There was a problem hiding this comment.
Pull request overview
This PR aims to surface malformed column internals (e.g., mismatched nested column sizes / null maps) earlier by running column sanity checks at pipeline operator output boundaries (including after projections), and by adding additional checks at nested loop join lazy materialization copy points.
Changes:
- Add
Block::check_type_and_column()calls atOperatorXBase::get_block_after_projects()outputs (both projected and non-projected paths). - Make
Block::check_type_and_column()runIColumn::sanity_check()in non-debug builds (while keeping type compatibility checks debug-only). - Add explicit
sanity_check()calls during nested loop join lazy materialization column copying/filtering, and adjustColumnStrsanity checking behavior.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| be/src/exec/operator/operator.cpp | Runs block sanity/type checks at operator output boundaries (including projection outputs). |
| be/src/exec/operator/nested_loop_join_probe_operator.cpp | Adds sanity checks at lazy materialization source-copy/filter points in nested loop join. |
| be/src/core/column/column_string.h | Makes sanity_check_simple() always active (previously debug-only). |
| be/src/core/column/column_string.cpp | Ensures ColumnStr::sanity_check() performs basic checks outside NDEBUG. |
| be/src/core/block/block.cpp | Runs column->sanity_check() in check_type_and_column() for release builds, with exception-to-Status conversion. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| constexpr int8_t MARK_NULL = -1; | ||
|
|
||
| ColumnPtr make_const_column_from_row(const ColumnWithTypeAndName& source, size_t row, size_t rows) { | ||
| source.column->sanity_check(); |
| void append_many_from_source(MutableColumnPtr& dst_column, const ColumnWithTypeAndName& src_column, | ||
| size_t row, size_t rows) { | ||
| src_column.column->sanity_check(); | ||
| if (!src_column.column->is_nullable() && dst_column->is_nullable()) { |
| void append_filtered_from_source(MutableColumnPtr& dst_column, | ||
| const ColumnWithTypeAndName& src_column, | ||
| const IColumn::Filter& filter, size_t selected_rows) { | ||
| if (selected_rows == 0) { | ||
| return; | ||
| } | ||
| src_column.column->sanity_check(); | ||
| auto filtered_column = src_column.column->filter(filter, selected_rows); | ||
| filtered_column->sanity_check(); | ||
| if (!src_column.column->is_nullable() && dst_column->is_nullable()) { |
| @@ -128,7 +127,6 @@ class ColumnStr final : public COWHelper<IColumn, ColumnStr<T>> { | |||
| if (offsets[-1] != 0) { | |||
| throw Exception(Status::InternalError("wrong offsets[-1]: {}", offsets[-1])); | |||
| } | |||
| #endif | |||
| } | |||
| RETURN_IF_ERROR(block->check_type_and_column()); | ||
| return Status::OK(); | ||
| } | ||
| status = get_block(state, block, eos); | ||
| RETURN_IF_ERROR(status); | ||
| RETURN_IF_ERROR(block->check_type_and_column()); |
There was a problem hiding this comment.
I reviewed the PR diff and the existing inline review context. I did not add duplicate inline comments, but the already-open review threads identify blocking issues that should be addressed before merge.
Critical checkpoint conclusions:
- Goal/test coverage: the PR appears to move selected column sanity checks into release/operator execution paths. Existing tests for these new runtime validation paths were not added in the PR diff.
- Scope/clarity: the code change is small, but it expands validation into hot operator/string access paths.
- Concurrency/lifecycle/config/compatibility: no new concurrency, lifecycle, config, wire protocol, or storage-format concerns found in the reviewed diff.
- Error handling: existing inline comments already cover unconverted
sanity_check()exceptions in nested loop join helper paths. - Parallel paths/performance: existing inline comments already cover duplicate output block validation and release-build string accessor overhead.
- Observability/transactions/data writes/FE-BE variable passing: not applicable to this PR diff.
User focus: no additional user-provided review focus was specified.
TPC-H: Total hot run time: 31382 ms |
TPC-DS: Total hot run time: 169023 ms |
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
What problem does this PR solve?
Issue Number: None
Related PR: None
Problem Summary: Invalid column internals, such as a nullable column whose nested column size differs from its null map size, may only fail later when a downstream operator reads nested data. This change runs column sanity checks at pipeline operator output boundaries, including projection outputs, so malformed columns are reported closer to their producer. It also keeps targeted nested loop join lazy materialization sanity checks at the source-copy points that exposed the issue.
Release note
None
Check List (For Author)