Skip to content

Fix column type mismatch exception in DirectJoinMergeTreeEntity#101046

Merged
alexey-milovidov merged 12 commits intomasterfrom
fix-direct-join-column-type-mismatch
Apr 9, 2026
Merged

Fix column type mismatch exception in DirectJoinMergeTreeEntity#101046
alexey-milovidov merged 12 commits intomasterfrom
fix-direct-join-column-type-mismatch

Conversation

@alexey-milovidov
Copy link
Copy Markdown
Member

@alexey-milovidov alexey-milovidov commented Mar 28, 2026

When pulling multiple blocks from the pipeline in DirectJoinMergeTreeEntity::executePlan, columns from different blocks may have different types (e.g., ColumnConst in one block vs a regular column in another). The insertRangeFrom call triggers assertTypeEquality which fails in debug/sanitizer builds because typeid(*this) != typeid(rhs).

Fix by calling convertToFullColumnIfConst on columns before merging, ensuring consistent column types across blocks.

CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=100270&sha=59b1b5fefa389130a5d1328d4a47f9c7bea974f2&name_0=PR&name_1=Stress%20test%20%28arm_asan_ubsan%2C%20s3%29
#100270

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fix exception in DirectJoinMergeTreeEntity when pipeline blocks contain ColumnConst columns that are merged with regular columns.

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

When pulling multiple blocks from the pipeline in `executePlan`, columns
from different blocks may have different types (e.g., `ColumnConst` in
one block vs regular column in another). The `insertRangeFrom` call
triggers `assertTypeEquality` which fails in debug/sanitizer builds
because `typeid(*this) != typeid(rhs)`.

Fix by calling `convertToFullColumnIfConst` on columns before merging,
ensuring consistent column types across blocks.

https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=100270&sha=59b1b5fefa389130a5d1328d4a47f9c7bea974f2&name_0=PR&name_1=Stress%20test%20%28arm_asan_ubsan%2C%20s3%29

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh Bot commented Mar 28, 2026

Workflow [PR], commit [9d8a538]

Summary:


AI Review

Summary

This PR fixes DirectJoinMergeTreeEntity::executePlan to normalize mixed column representations (ColumnConst/ColumnSparse) before insertRangeFrom, and adds a regression test covering both scenarios. The core fix is correct and targeted; I found one documentation-quality issue in the new test that should be clarified to match current scope.

Findings

💡 Nits

  • [tests/queries/0_stateless/04065_direct_join_column_const_mismatch.sql:1] The file name and top-level test description mention only ColumnConst, but the test also covers mixed ColumnSparse/regular columns. This can mislead future triage and maintenance.
    • Suggested fix: update the opening comment (and optionally file name in a follow-up) to explicitly mention both ColumnConst and ColumnSparse mismatch paths.
ClickHouse Rules
Item Status Notes
Deletion logging
Serialization versioning
Core-area scrutiny
No test removal
Experimental gate
No magic constants
Backward compatibility
SettingsChangesHistory.cpp
PR metadata quality
Safe rollout
Compilation time
Final Verdict
  • Status: ⚠️ Request changes
  • Minimum required actions:
    • Update the regression test description to reflect that it validates both ColumnConst and ColumnSparse mixed-type merge paths.

@clickhouse-gh clickhouse-gh Bot added the pr-bugfix Pull request with bugfix, not backported by default label Mar 28, 2026

for (size_t i = 0; i < columns.size(); ++i)
{
auto new_col = new_columns[i]->convertToFullColumnIfConst();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing the ColumnConst/full-column mismatch in executePlan.

Could we add a regression test for this path? Right now the fix is only covered implicitly, and this code is in join execution. A small stateless test that forces multi-block right-side output with mixed const/non-const columns would protect against future refactors reintroducing insertRangeFrom type mismatches.

alexey-milovidov and others added 10 commits March 29, 2026 15:29
…eEntity::executePlan`

When pulling multiple blocks from the pipeline, columns from different
blocks may have different types (e.g., `ColumnConst` from ALIAS columns
vs regular columns). This test ensures the `convertToFullColumnIfConst`
fix works correctly by using a small `max_block_size` to force multiple
blocks and an ALIAS column that produces `ColumnConst`.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…onst

A column added via ALTER TABLE after data is already written is not
stored in existing parts, so MergeTree fills it as ColumnConst on read.
Combined with small max_block_size, this reliably triggers the
multi-block merging path in executePlan that was previously broken.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… mismatch

Use two parts in the right table: one written before ALTER ADD COLUMN
(missing column filled as ColumnConst on read) and one written after
(column stored as regular column). This creates a more realistic scenario
with mixed column types across parts.

Note: the bug (assertion failure in `insertRangeFrom`) only manifests in
debug/sanitizer builds where `assertTypeEquality` is checked. In release
builds, the MergeTree reader materializes ColumnConst from missing columns
before they reach `executePlan`, so bugfix validation cannot reproduce it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The previous fix only called `convertToFullColumnIfConst` when merging
blocks from the pipeline. However, with sparse serialization enabled
(`ratio_of_defaults_for_sparse_serialization`), some blocks may contain
`ColumnSparse` while others contain regular columns, causing the same
`assertTypeEquality` failure in `insertRangeFrom`.

Add `convertToFullColumnIfSparse` after `convertToFullColumnIfConst` to
handle both cases. Also add a sparse column test case to the regression
test.

CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=101046&sha=0caf5030dd5d0811b8635d7b423b0dd2e85194c5&name_0=PR&name_1=Stateless%20tests%20%28arm_asan_ubsan%2C%20targeted%29

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The SQL inserts `number + 100` where `number` is 0..4, producing values
100..104 for rows with id 5..9. The reference file incorrectly expected
105..109.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
-- (written after ALTER) stores new_col as a regular column. With small
-- max_block_size, blocks from both parts are merged in executePlan, which
-- previously triggered an assertTypeEquality failure (debug/sanitizer builds)
-- or produced wrong results (release builds) in insertRangeFrom.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment looks outdated/misleading now:

or produced wrong results (release builds) in insertRangeFrom.

In this PR's own commit history, the failure mode is described as reproducible in debug/sanitizer builds (assertTypeEquality), while release builds materialize before this path. Please adjust this sentence to match the actually reproducible behavior to avoid confusing future triage.

@alexey-milovidov
Copy link
Copy Markdown
Member Author

The Stress test (arm_msan) failure is fixed by #101239, which should be merged first. After it is merged, please update the branch to include the fix.

@@ -0,0 +1,77 @@
-- Test that DirectJoinMergeTreeEntity handles ColumnConst columns correctly
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The file and top-level comment still say column_const_mismatch / "ColumnConst columns", but the test now also validates mixed ColumnSparse/regular columns.

Please update the test description to mention both ColumnConst and ColumnSparse, so future triage does not miss the second scenario covered here.

@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh Bot commented Apr 7, 2026

LLVM Coverage Report

Metric Baseline Current Δ
Lines 84.00% 84.00% +0.00%
Functions 90.90% 90.90% +0.00%
Branches 76.60% 76.50% -0.10%

Changed lines: 100.00% (17/17) | lost baseline coverage: 1 line(s) · Uncovered code

Full report · Diff report

@alexey-milovidov alexey-milovidov requested a review from vdimir April 9, 2026 19:38
@alexey-milovidov alexey-milovidov self-assigned this Apr 9, 2026
@alexey-milovidov alexey-milovidov merged commit 3a4a4e9 into master Apr 9, 2026
320 of 322 checks passed
@alexey-milovidov alexey-milovidov deleted the fix-direct-join-column-type-mismatch branch April 9, 2026 19:38
@robot-ch-test-poll3 robot-ch-test-poll3 added the pr-synced-to-cloud The PR is synced to the cloud repo label Apr 9, 2026
@clickgapai
Copy link
Copy Markdown
Contributor

Hi — this PR may need backporting to 26.3 (LTS), 26.2, 26.1, but no backport label was found.

Affected code: DirectJoinMergeTreeEntity::executePlan — introduced in 26.1.

Why: DirectJoinMergeTreeEntity was introduced in commit bc58326 (2025-12-15), after the 25.8 LTS release but before 26.1. The bug produces silent wrong results (garbage memory values) in release builds when one MergeTree part has a column stored with Sparse serialization and another has Default serialization. This is a P0 wrong-results bug that should be backported to all supported branches containing the affected code.

Other supported branches (25.8) predate 26.1 and do not contain this code.

If this should be backported, consider adding pr-must-backport or a version-specific label (e.g. v26.3-must-backport). Ignore this if backporting is not applicable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-bugfix Pull request with bugfix, not backported by default pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants