Fix wrong results for integer boolean predicates in WHERE on MergeTree #101287
When a large integer constant is used as a boolean predicate in a WHERE clause with AND on a MergeTree table, e.g. `WHERE (256 > b) AND 256`, the query incorrectly returns 0 rows instead of matching all rows.

Root cause: `splitFilterNodeForAllowedInputs()` in `VirtualColumnUtils.cpp` reduces `AND(greater(256, b), 256)` to just `256` when column `b` is not an allowed input for virtual column filtering. Since the remaining child's type (UInt16) differs from the AND's return type (UInt8), it applies a numeric cast: `CAST(256, 'UInt8')` = 0 (truncation). This makes `filterPartsByVirtualColumns()` see `always_false`, pruning all parts.

Fix: Replace the truncating numeric cast with `notEquals(x, 0)`, which correctly converts any numeric value to boolean (0 or 1) without truncation. Values like 256, 512, 65536, 2147483648 now correctly evaluate to true.

Closes ClickHouse#101269
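To make the root cause concrete, here is a minimal Python model of the single-child AND path. It is a sketch, not ClickHouse code: `cast_to_uint8`, `not_equals_zero`, and `reduce_and` are hypothetical names, and the model assumes that a cast to UInt8 keeps only the low 8 bits, as the description above states.

```python
def cast_to_uint8(value: int) -> int:
    """Model of a truncating numeric cast to UInt8 (keeps the low 8 bits)."""
    return value & 0xFF


def not_equals_zero(value: int) -> int:
    """Model of notEquals(x, 0): 1 for any nonzero value, 0 otherwise."""
    return 1 if value != 0 else 0


def reduce_and(children, allowed_inputs, to_bool):
    """Hypothetical model of reducing AND(...) to the children usable for pruning.

    Each child is (inputs, constant_value). Children that read a column outside
    allowed_inputs are dropped. If exactly one constant child survives, it is
    converted to the AND's UInt8 result type with to_bool.
    """
    kept = [value for inputs, value in children if set(inputs) <= set(allowed_inputs)]
    if len(kept) == 1:
        return to_bool(kept[0])
    return None  # other cases are out of scope for this sketch


# AND(greater(256, b), 256): the first child reads column b, the second is a constant.
predicate = [(["b"], None), ([], 256)]

buggy = reduce_and(predicate, allowed_inputs=[], to_bool=cast_to_uint8)
fixed = reduce_and(predicate, allowed_inputs=[], to_bool=not_equals_zero)
print(buggy, fixed)  # 0 1
```

With the truncating conversion the surviving constant becomes 0, which the pruning step reads as "always false"; with the `!= 0` conversion it becomes 1.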
Pre-PR Validation (session cron:clickhouse-ci-task-worker:20260331-001500)

a) Deterministic repro? Yes.
CREATE TABLE t (b Int8) ENGINE = MergeTree ORDER BY ();
INSERT INTO t VALUES (1), (0);
SELECT count() FROM t WHERE (256 > b) AND 256; -- returns 0 (should be 2)
Any integer where
b) Root cause explained?
c) Fix matches root cause? Yes. Replaced
d) New test added? Yes
e) Both directions demonstrated?
cc @CurtizJ @KochetovNicolai: could you review this? It fixes a wrong-results bug where
Workflow [PR], commit [d4801eb] Summary: ❌
AI Review
Summary
This PR fixes incorrect
ClickHouse Rules
Final Verdict
@@ -0,0 +1,31 @@
-- Tags: no-parallel
no-parallel should be avoided unless strictly required. This test uses a unique table name (t_large_int_bool) and does not rely on shared mutable global state, so it should be safe to run in parallel. Please remove the tag to keep stateless test concurrency high.
The performance-move-const-arg clang-tidy check flags std::move() on ColumnPtr (immutable_ptr<IColumn> wrapping boost::intrusive_ptr<const IColumn>). Copying the intrusive pointer is just an atomic refcount increment, so the move is unnecessary. Removing it fixes the arm_tidy CI failure. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
arm_tidy fix: Removed unnecessary
LLVM Coverage Report
Changed lines: 100.00% (9/9)
b5e6e90
Hi, this PR may need backporting to release branches.

Why: This is a P0 wrong-results bug in `VirtualColumnUtils.cpp` that has existed for a long time in the MergeTree part pruning path. The `splitFilterNodeForAllowedInputs` function and its CAST-based type conversion are present in all active release branches. Users hitting `WHERE expr AND large_const` patterns get silently wrong (zero) results.

If this should be backported, consider adding
Hi @groeneai @nikitamikhaylov @davenger — while reviewing this PR I found the following:
Happy to discuss. Close anything that's wrong or already addressed.
PR ClickHouse#101287 changed `addCast(res, UInt8)` to `notEquals(x, 0)` in the AND single-child path of `splitFilterNodeForAllowedInputs` to avoid truncating large values (e.g. `CAST(256, 'UInt8')` → 0). However, for Nullable types, `DataTypeNullable::getDefault()` returns a Null Field, so `notEquals(x, NULL)` always yields NULL (SQL three-valued logic), incorrectly filtering out all rows/parts. Issue ClickHouse#101433.

Fix: use `removeNullable()` to obtain the nested type's zero instead of NULL. For the special case of `Nullable(Nothing)` (a bare NULL literal), fall back to the Nullable default (Null field), since Nothing has no `getDefault()`; `notEquals(x, NULL)` correctly yields NULL → false.

The previous version of this fix (commit e9a6d5a) also incorrectly changed the index-hint path from `addCast` to `notEquals`; that path performs type coercion (not boolean conversion) and was never affected by the Nullable issue. This version leaves it unchanged.

Closes ClickHouse#101433
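The Nullable problem comes down to three-valued logic, which the following Python sketch models. It is illustrative only: `None` stands in for NULL, types are plain strings, and `not_equals`, `passes_filter`, and `comparison_zero` are hypothetical helpers rather than ClickHouse functions.

```python
from typing import Optional


def not_equals(left: Optional[int], right: Optional[int]) -> Optional[int]:
    """SQL-style notEquals: NULL (None) on either side yields NULL."""
    if left is None or right is None:
        return None
    return 1 if left != right else 0


def passes_filter(result: Optional[int]) -> bool:
    """A row or part is kept only when the predicate is non-NULL and nonzero."""
    return result is not None and result != 0


def comparison_zero(type_name: str, strip_nullable: bool) -> Optional[int]:
    """Pick the constant to compare against (hypothetical string-based type model).

    Without stripping, a Nullable type's default is NULL. With stripping, the
    nested type's zero is used, except for Nullable(Nothing), which has no
    default of its own and falls back to NULL.
    """
    if type_name.startswith("Nullable("):
        nested = type_name[len("Nullable("):-1]
        if not strip_nullable or nested == "Nothing":
            return None
    return 0


before = not_equals(256, comparison_zero("Nullable(UInt16)", strip_nullable=False))
after = not_equals(256, comparison_zero("Nullable(UInt16)", strip_nullable=True))
bare_null = not_equals(None, comparison_zero("Nullable(Nothing)", strip_nullable=True))
print(passes_filter(before), passes_filter(after), passes_filter(bare_null))
# False True False
```

Comparing against NULL rejects everything, while comparing against the nested type's zero keeps nonzero values; a bare NULL predicate is still rejected in both versions.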
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Fix wrong query results when a large integer constant (e.g. 256, 2147483648) is used as a boolean predicate in a WHERE clause with AND on MergeTree tables. For example, `SELECT count() FROM t WHERE (2147483648 > b) AND 2147483648` would incorrectly return 0 instead of matching all rows.

Summary

When a non-UInt8 integer constant is used as a boolean predicate in a WHERE clause with AND (e.g. `WHERE (256 > b) AND 256`), MergeTree incorrectly prunes all parts and returns 0 rows. This affects any nonzero integer value where `value % 256 == 0` (256, 512, 65536, 2147483648, etc.).

Root cause: `splitFilterNodeForAllowedInputs()` in `VirtualColumnUtils.cpp` reduces `AND(expr_with_column, constant)` to just `constant` when the column is not a virtual column. Since the constant's type (e.g. UInt16) differs from the AND function's return type (UInt8), a numeric cast `CAST(256, 'UInt8')` is applied, truncating 256 → 0 (256 mod 256). This makes `filterPartsByVirtualColumns()` see `always_false` and prune all parts.

Fix: Replace the truncating numeric cast with `notEquals(x, 0)`, which correctly converts any numeric value to boolean (UInt8 0/1) without truncation. Values where `value % 256 == 0` now correctly evaluate to true (1).

Closes #101269
Closes #99979
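As a quick arithmetic check of the `value % 256 == 0` condition from the summary, the Python sketch below lists which constants a wrapping cast to UInt8 would turn into 0. The candidate list and the name `truncates_to_false` are chosen here for illustration; the only assumption is that the cast behaves as reduction modulo 256.

```python
UINT8_MODULUS = 256  # number of distinct UInt8 values


def truncates_to_false(value: int) -> bool:
    """True when a nonzero constant becomes 0 after wrapping to UInt8."""
    return value != 0 and value % UINT8_MODULUS == 0


candidates = [1, 255, 256, 257, 512, 65536, 2147483648]
affected = [value for value in candidates if truncates_to_false(value)]
print(affected)  # [256, 512, 65536, 2147483648]
```

Constants such as 257 wrap to a nonzero value (1), so they were still treated as true before the fix; only nonzero multiples of 256 flipped to false.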
🤖 Generated with Claude Code