Company or project name
No response
Describe what's wrong
Summary
I discovered a silent correctness bug in ClickHouse where logically equivalent query formulations under De Morgan's laws return different results (cardinality mismatch: 1 vs 0).
The equivalence relation under test is:
$$\text{Query}(\text{NOT } (A \lor B)) \equiv \text{Query}(\text{NOT } A) \ \text{INTERSECT DISTINCT} \ \text{Query}(\text{NOT } B)$$
When the second query's independent filter $\text{NOT } B$ contains a function that produces NaN on negative values (such as sqrt(negative)), ClickHouse's partition pruner incorrectly evaluates NOT(NaN > X) as false and skips those partitions entirely. However, the first query's compound expression $\text{NOT}(A \lor B)$ prevents partition pruning from being applied, scanning the full table and evaluating the expression correctly on the CPU.
This execution path divergence leads to a direct silent correctness violation under our metamorphic testing framework.
Does it reproduce on the most recent release?
Yes
How to reproduce
ClickHouse 26.3.12.3
1. Setup Table and Data
DROP TABLE IF EXISTS t;
CREATE TABLE t (c0 Int32) ENGINE = MergeTree() ORDER BY c0 PARTITION BY c0;
INSERT INTO t VALUES (1), (-1);
2. Query 1: HAVING NOT (A OR B) — Returns 1 row (Correct)
Here, $A \equiv \text{SUM}(c0) > 0$, and $B \equiv \text{sqrt}(c0) > 10$.
SELECT c0 FROM t GROUP BY c0
HAVING NOT ((SUM(c0) > 0) OR (sqrt(c0) > 10))
SETTINGS aggregate_functions_null_for_empty=1, enable_optimize_predicate_expression=0;
-- Output:
-- ┌──c0─┐
-- │ -1 │
-- └─────┘
-- (1 row, correct. For c0 = -1, SUM(c0)>0 is false, sqrt(-1)>10 is NaN > 10 = false. false OR false = false. NOT(false) = true.)
3. Query 2: HAVING NOT A INTERSECT DISTINCT HAVING NOT B — Returns 0 rows (Incorrect!)
SELECT c0 FROM t GROUP BY c0
HAVING NOT (SUM(c0) > 0)
INTERSECT DISTINCT
SELECT c0 FROM t GROUP BY c0
HAVING NOT (sqrt(c0) > 10)
SETTINGS aggregate_functions_null_for_empty=1, enable_optimize_predicate_expression=0;
-- Output:
-- Empty set. (0 rows, incorrect! Mismatch with Query 1.)
Expected behavior
- Any comparison involving
NaN (such as NaN > X) should evaluate to false in IEEE 754 logic, meaning its negation NOT (NaN > X) must evaluate to true (represented as 1).
- Partitions containing negative values should NOT be skipped during partition pruning when the filter is
NOT (sqrt(c0) > X).
- Query 1 and Query 2 should both return
c0 = -1, satisfying the metamorphic equivalence relation.
Error message and/or stacktrace
No response
Related issues and pull requests
No response
Additional context
No response
Company or project name
No response
Describe what's wrong
Summary
I discovered a silent correctness bug in ClickHouse where logically equivalent query formulations under De Morgan's laws return different results (cardinality mismatch: 1 vs 0).
The equivalence relation under test is:
$$\text{Query}(\text{NOT } (A \lor B)) \equiv \text{Query}(\text{NOT } A) \ \text{INTERSECT DISTINCT} \ \text{Query}(\text{NOT } B)$$
When the second query's independent filter$\text{NOT } B$ contains a function that produces $\text{NOT}(A \lor B)$ prevents partition pruning from being applied, scanning the full table and evaluating the expression correctly on the CPU.
NaNon negative values (such assqrt(negative)), ClickHouse's partition pruner incorrectly evaluatesNOT(NaN > X)asfalseand skips those partitions entirely. However, the first query's compound expressionThis execution path divergence leads to a direct silent correctness violation under our metamorphic testing framework.
Does it reproduce on the most recent release?
Yes
How to reproduce
ClickHouse 26.3.12.31. Setup Table and Data
2. Query 1: HAVING NOT (A OR B) — Returns 1 row (Correct)
Here, $A \equiv \text{SUM}(c0) > 0$, and $B \equiv \text{sqrt}(c0) > 10$.
3. Query 2: HAVING NOT A INTERSECT DISTINCT HAVING NOT B — Returns 0 rows (Incorrect!)
Expected behavior
NaN(such asNaN > X) should evaluate tofalsein IEEE 754 logic, meaning its negationNOT (NaN > X)must evaluate totrue(represented as1).NOT (sqrt(c0) > X).c0 = -1, satisfying the metamorphic equivalence relation.Error message and/or stacktrace
No response
Related issues and pull requests
No response
Additional context
No response