Fix column type mismatch exception in DirectJoinMergeTreeEntity #101046
alexey-milovidov merged 12 commits into master
When pulling multiple blocks from the pipeline in `executePlan`, columns from different blocks may have different types (e.g., `ColumnConst` in one block vs a regular column in another). The `insertRangeFrom` call triggers `assertTypeEquality`, which fails in debug/sanitizer builds because `typeid(*this) != typeid(rhs)`.

Fix by calling `convertToFullColumnIfConst` on columns before merging, ensuring consistent column types across blocks.

CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=100270&sha=59b1b5fefa389130a5d1328d4a47f9c7bea974f2&name_0=PR&name_1=Stress%20test%20%28arm_asan_ubsan%2C%20s3%29

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
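The failure mode described above can be sketched with a toy model. This is not ClickHouse code: `ToyColumn`, `ToyFullColumn`, `ToyConstColumn`, `materialize`, and `mergeNormalized` are hypothetical names invented for this illustration, and the type check throws instead of asserting so that it can be observed in any build. It only shows why appending a constant representation to a full one trips a `typeid` comparison, and why normalizing every block first avoids it.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <typeinfo>
#include <utility>
#include <vector>

// Toy stand-ins for illustration only; these are NOT ClickHouse's IColumn classes.
struct ToyColumn
{
    virtual ~ToyColumn() = default;
    virtual std::size_t size() const = 0;
    virtual int at(std::size_t row) const = 0;
};

// One value stored per row, like a regular (full) column.
struct ToyFullColumn final : ToyColumn
{
    std::vector<int> values;

    explicit ToyFullColumn(std::vector<int> v) : values(std::move(v)) {}
    std::size_t size() const override { return values.size(); }
    int at(std::size_t row) const override { return values.at(row); }

    // Rough analogue of insertRangeFrom with the type check always enabled:
    // the source must have exactly the same concrete type as the destination.
    void appendFrom(const ToyColumn & rhs)
    {
        if (typeid(*this) != typeid(rhs))
            throw std::logic_error("column type mismatch");
        const auto & src = static_cast<const ToyFullColumn &>(rhs);
        values.insert(values.end(), src.values.begin(), src.values.end());
    }
};

// A single value plus a row count, like a constant column.
struct ToyConstColumn final : ToyColumn
{
    int value;
    std::size_t rows;

    ToyConstColumn(int value_, std::size_t rows_) : value(value_), rows(rows_) {}
    std::size_t size() const override { return rows; }
    int at(std::size_t) const override { return value; }
};

// Rough analogue of convertToFullColumnIfConst: expand any column to one value per row.
ToyFullColumn materialize(const ToyColumn & col)
{
    std::vector<int> values;
    values.reserve(col.size());
    for (std::size_t row = 0; row < col.size(); ++row)
        values.push_back(col.at(row));
    return ToyFullColumn(std::move(values));
}

// Merge one column across several blocks, normalizing each block before appending.
ToyFullColumn mergeNormalized(const std::vector<std::shared_ptr<const ToyColumn>> & blocks)
{
    ToyFullColumn merged(std::vector<int>{});
    for (const auto & block : blocks)
    {
        assert(block != nullptr);
        merged.appendFrom(materialize(*block));
    }
    return merged;
}
```

In this model, calling `appendFrom` directly with a `ToyConstColumn` throws, while `mergeNormalized` accepts a mix of full and constant blocks because every block is expanded to the same concrete type before it is appended.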
    for (size_t i = 0; i < columns.size(); ++i)
    {
        auto new_col = new_columns[i]->convertToFullColumnIfConst();
Thanks for fixing the ColumnConst/full-column mismatch in executePlan.
Could we add a regression test for this path? Right now the fix is only covered implicitly, and this code is in join execution. A small stateless test that forces multi-block right-side output with mixed const/non-const columns would protect against future refactors reintroducing insertRangeFrom type mismatches.
…umn-type-mismatch
…eEntity::executePlan`

When pulling multiple blocks from the pipeline, columns from different blocks may have different types (e.g., `ColumnConst` from ALIAS columns vs regular columns). This test ensures the `convertToFullColumnIfConst` fix works correctly by using a small `max_block_size` to force multiple blocks and an ALIAS column that produces `ColumnConst`.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…onst

A column added via ALTER TABLE after data is already written is not stored in existing parts, so MergeTree fills it as `ColumnConst` on read. Combined with a small `max_block_size`, this reliably triggers the multi-block merging path in `executePlan` that was previously broken.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…umn-type-mismatch
… mismatch

Use two parts in the right table: one written before ALTER ADD COLUMN (missing column filled as `ColumnConst` on read) and one written after (column stored as a regular column). This creates a more realistic scenario with mixed column types across parts.

Note: the bug (assertion failure in `insertRangeFrom`) only manifests in debug/sanitizer builds where `assertTypeEquality` is checked. In release builds, the MergeTree reader materializes `ColumnConst` from missing columns before they reach `executePlan`, so bugfix validation cannot reproduce it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…umn-type-mismatch
The previous fix only called `convertToFullColumnIfConst` when merging blocks from the pipeline. However, with sparse serialization enabled (`ratio_of_defaults_for_sparse_serialization`), some blocks may contain `ColumnSparse` while others contain regular columns, causing the same `assertTypeEquality` failure in `insertRangeFrom`.

Add `convertToFullColumnIfSparse` after `convertToFullColumnIfConst` to handle both cases. Also add a sparse column test case to the regression test.

CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=101046&sha=0caf5030dd5d0811b8635d7b423b0dd2e85194c5&name_0=PR&name_1=Stateless%20tests%20%28arm_asan_ubsan%2C%20targeted%29

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
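The two-step normalization described in this commit can be sketched as follows. This is a self-contained toy under stated assumptions, not the ClickHouse implementation: `FullCol`, `ConstCol`, `SparseCol`, `fullIfConst`, `fullIfSparse`, and `mergeBlocks` are hypothetical names, the sparse default value is assumed to be 0, and a `std::variant` stands in for the real column class hierarchy.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical toy representations of one integer column inside a block.
struct FullCol { std::vector<int> values; };                  // one value per row
struct ConstCol { int value; std::size_t rows; };             // one value repeated for all rows
struct SparseCol                                              // only non-default rows stored
{
    std::size_t rows;
    std::map<std::size_t, int> non_default;                   // row index -> value, default assumed 0
};

using Col = std::variant<FullCol, ConstCol, SparseCol>;

// Toy analogue of convertToFullColumnIfConst: expand a constant, pass anything else through.
Col fullIfConst(Col col)
{
    if (const auto * c = std::get_if<ConstCol>(&col))
        return FullCol{std::vector<int>(c->rows, c->value)};
    return col;
}

// Toy analogue of convertToFullColumnIfSparse: fill in defaults, pass anything else through.
Col fullIfSparse(Col col)
{
    if (const auto * s = std::get_if<SparseCol>(&col))
    {
        std::vector<int> values(s->rows, 0);
        for (const auto & [row, value] : s->non_default)
            values.at(row) = value;
        return FullCol{std::move(values)};
    }
    return col;
}

// Merge the same column from several blocks after applying both conversions in order.
std::vector<int> mergeBlocks(const std::vector<Col> & blocks)
{
    std::vector<int> merged;
    for (const auto & block : blocks)
    {
        Col normalized = fullIfSparse(fullIfConst(block));
        // After both steps, every block must hold the same representation.
        assert(std::holds_alternative<FullCol>(normalized));
        const auto & full = std::get<FullCol>(normalized);
        merged.insert(merged.end(), full.values.begin(), full.values.end());
    }
    return merged;
}
```

Each conversion is a no-op for representations it does not target, so chaining them is safe regardless of which representation a given block happens to carry. Applying only the first step would leave sparse blocks unchanged, which is the gap this commit describes.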
…umn-type-mismatch
The SQL inserts `number + 100` where `number` is 0..4, producing values 100..104 for rows with id 5..9. The reference file incorrectly expected 105..109.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
    -- (written after ALTER) stores new_col as a regular column. With small
    -- max_block_size, blocks from both parts are merged in executePlan, which
    -- previously triggered an assertTypeEquality failure (debug/sanitizer builds)
    -- or produced wrong results (release builds) in insertRangeFrom.
This comment looks outdated/misleading now:

> or produced wrong results (release builds) in insertRangeFrom.

In this PR's own commit history, the failure mode is described as reproducible in debug/sanitizer builds (`assertTypeEquality`), while release builds materialize before this path. Please adjust this sentence to match the actually reproducible behavior to avoid confusing future triage.
The Stress test (arm_msan) failure is fixed by #101239, which should be merged first. After it is merged, please update the branch to include the fix.
    @@ -0,0 +1,77 @@
    -- Test that DirectJoinMergeTreeEntity handles ColumnConst columns correctly
The file and top-level comment still say column_const_mismatch / "ColumnConst columns", but the test now also validates mixed ColumnSparse/regular columns.
Please update the test description to mention both ColumnConst and ColumnSparse, so future triage does not miss the second scenario covered here.
LLVM Coverage Report
Changed lines: 100.00% (17/17) | lost baseline coverage: 1 line(s)
When pulling multiple blocks from the pipeline in `DirectJoinMergeTreeEntity::executePlan`, columns from different blocks may have different types (e.g., `ColumnConst` in one block vs a regular column in another). The `insertRangeFrom` call triggers `assertTypeEquality`, which fails in debug/sanitizer builds because `typeid(*this) != typeid(rhs)`.

Fix by calling `convertToFullColumnIfConst` on columns before merging, ensuring consistent column types across blocks.

CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=100270&sha=59b1b5fefa389130a5d1328d4a47f9c7bea974f2&name_0=PR&name_1=Stress%20test%20%28arm_asan_ubsan%2C%20s3%29
#100270
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
Fix exception in `DirectJoinMergeTreeEntity` when pipeline blocks contain `ColumnConst` columns that are merged with regular columns.

Documentation entry for user-facing changes