Enable query condition cache for `ORDER BY ... LIMIT n` queries by alexey-milovidov · Pull Request #104478 · ClickHouse/ClickHouse

alexey-milovidov · 2026-05-09T18:10:55Z

Changelog category (leave one):

Performance Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Enable the query condition cache for queries that go through TopK dynamic filtering (ORDER BY ... LIMIT n). The QCC entry is partitioned by the TopK plan parameters so the same query reuses cached WHERE filter results, while a different LIMIT, sort column, or sort direction produces a fresh entry.

Documentation entry for user-facing changes

Documentation is written (mandatory for new features)

Why

When tryOptimizeTopK picks a ReadFromMergeTree for TopK dynamic filtering, the running threshold can drop granules before they reach the WHERE/PREWHERE filter. Until now updateQueryConditionCache blanket-skipped QCC writes for such reads:

/// ORDER BY ... LIMIT N can drop granules, don't update qcc for the WHERE filter
if (read_from_merge_tree->isSelectedForTopKFilterOptimization())
    return;

The skip was correct but pessimistic — it disables QCC for every query that hits the TopK path, even when a re-run with the same plan and same parts would reuse exactly the same row set.

What changed

TopKFilterInfo::condition_hash — a deterministic hash over the planning-time TopK parameters: (column_name, type_name, num_sort_columns, limit_n, direction). Computed in tryOptimizeTopK when the read is selected for TopK filtering.
updateQueryConditionCache — instead of skipping for TopK reads, combine top_k_filter_info->condition_hash into the WHERE filter's condition_hash via boost::hash_combine before storing it on FilterStep.

The QCC writer (FilterTransform::writeIntoQueryConditionCache) and the read-side lookup both go through the same FilterStep::condition, so the new combined hash is used consistently on both sides — same plan + same parts → cache hit; different LIMIT n, different sort column → new key, fresh entry, never reusing a row set computed under different TopK conditions.

Origin

Split out from WIP PR #81944 (commit bec285616a2). The PR's original implementation was on its own TopNFilterParameters infrastructure that didn't survive the rebase onto master; this version keeps the same idea but uses master's TopKFilterInfo / tryOptimizeTopK directly.

When `tryOptimizeTopK` selects a `ReadFromMergeTree` for TopK dynamic filtering, the running threshold can drop granules before they reach the WHERE/PREWHERE filter. Until now `updateQueryConditionCache` blanket-skipped QCC writes for such reads, because a WHERE result computed under one TopK plan would be wrong to reuse under a different one (a different LIMIT might need granules that were dropped before). Make the cache key for the WHERE filter depend on the TopK plan, instead of skipping it entirely: - Add `condition_hash` to `TopKFilterInfo`, computed deterministically from `(column_name, type_name, num_sort_columns, limit_n, direction)` in `tryOptimizeTopK`. The hash captures everything in the TopK plan that's known at planning time and independent of data. - In `updateQueryConditionCache`, instead of bailing out for TopK reads, combine the TopK `condition_hash` into the WHERE filter's `condition_hash` via `boost::hash_combine`. Same query + same part set + same TopK plan → QCC entry is reusable; different `LIMIT n`, different sort column, or different direction → new key, fresh entry. The change is symmetric: both the QCC writer (`FilterTransform::writeIntoQueryConditionCache`) and the read-side lookup go through the same `FilterStep::condition`, so the new combined hash is used consistently on both sides without further wiring. Originally part of WIP PR #81944 (commit `bec285616a2`), adapted to master's `TopKFilterInfo`/`tryOptimizeTopK` infrastructure.

clickhouse-gh · 2026-05-09T18:11:27Z

Workflow [PR], commit [7af0425]

Summary: ⏳

job_name	test_name	status	info	comment
Stateless tests (arm_asan_ubsan, targeted)		FAIL
	Server died	FAIL	cidb

AI Review

Summary

This PR enables QueryConditionCache for TopK reads (ORDER BY ... LIMIT) by salting WHERE QCC keys with TopKFilterInfo::condition_hash, with matching read/write paths and focused partitioning tests. The cache-key correctness issues from prior review are addressed, but the Performance Improvement evidence is still not satisfied because the current perf benchmark has not shown a stable faster row for the QCC-enabled query in completed CI.

Missing context / blind spots

⚠️ The latest CI run for 7af04257 still has performance comparison shards PENDING; a completed current-head perf report showing query_condition_cache_topk.query0 faster would close this gap.

Findings

⚠️ Majors

[tests/performance/query_condition_cache_topk.xml:62] The PR promises a measurable Performance Improvement, but this benchmark still does not demonstrate it in completed CI. The current query pair is unchanged from completed runs such as 18908b0c, where query0 (use_query_condition_cache = 1) was 0.1221s -> 0.1268s on amd_release (diff +3.8%, threshold 4.8%) and 0.0704s -> 0.0702s on arm_release (diff -0.3%, threshold 0.7%), while the disabled-cache baseline stayed in the same range. That leaves the benchmark within noise instead of proving that QCC-on skips the wide-column scan. Reshape the predicate/data so the use_query_condition_cache = 0 query pays a real scan cost and the use_query_condition_cache = 1 query is a changed faster row in CI.

Tests

⚠️ Functional coverage covers key partitioning dimensions, but performance evidence is incomplete until a current perf report shows a stable faster result for query_condition_cache_topk.query0.

Final Verdict

⚠️ Request changes: cache-key correctness looks addressed, but the performance contract is still unproven by CI.

`ReadFromMergeTree::setTopKColumn` takes its argument by const reference, so wrapping the argument in `std::move` is a no-op and trips the `performance-move-const-arg` clang-tidy check on the arm_tidy build. Build report: https://s3.amazonaws.com/clickhouse-test-reports/PRs/104478/cf70625448dce5be22d92d107504f9ee3efb4baf/build_arm_tidy/build_clickhouse/build_clickhouse.log PR: #104478

`updateQueryConditionCache` salts the WHERE filter's `condition_hash` with `top_k_filter_info->condition_hash` before storing it on `FilterStep::condition` (which `FilterTransform::writeIntoQueryConditionCache` later uses as the cache key). The read path in `MergeTreeDataSelectExecutor::filterPartsByQueryConditionCache`, however, was looking up entries using only `dag->getHash()`, so cache writes done under a TopK plan would never be hit on subsequent reads. Pass `top_k_filter_info` through to `filterPartsByQueryConditionCache` and apply the same `boost::hash_combine` on the read side. Same query + same part set + same TopK plan → cache hit; different `LIMIT n`, sort column, or direction → fresh entry. Reported by `clickhouse-gh` review on #104478.

On Apple platforms, `std::size_t` is `unsigned long` while `UInt64` is `unsigned long long` — different types of the same width — so the non-const reference parameter of `boost::hash_combine` does not bind to a `UInt64` lvalue and the build fails with: error: no matching function for call to 'hash_combine' note: candidate function template not viable: no known conversion from 'UInt64' (aka 'unsigned long long') to 'std::size_t &' (aka 'unsigned long &') for 1st argument On x86_64 Linux both aliases happen to resolve to `unsigned long`, which is why CI Linux builds passed but `Build (amd_darwin)` and `Build (arm_darwin)` failed. Switch the local seed variable on both the read path (`MergeTreeDataSelectExecutor::filterPartsByQueryConditionCache`) and the write path (`updateQueryConditionCache`) to `size_t`. The implicit conversions to/from `UInt64` at the cache-API boundary are lossless on all 64-bit platforms ClickHouse supports. CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=104478&sha=4ab18afb9c043a670ca72211ab85494fd956e3b8&name_0=PR&name_1=Build%20%28amd_darwin%29 PR: #104478

…ition_hash `updateQueryConditionCache` and `MergeTreeDataSelectExecutor::filterPartsByQueryConditionCache` already combine `top_k_filter_info->condition_hash` into the QCC key. There is a third site that writes to the cache: the distributed-index-analysis branch in `ReadFromMergeTree::selectRangesToRead`, which fills QCC with ranges excluded by index analysis. Without the same salt the writes there go under `outputs.front()->getHash()` while reads go under the salted hash, so TopK queries cannot reuse those entries. Apply `boost::hash_combine(hash, top_k_filter_info->condition_hash)` consistently with the other two sites. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

alexey-milovidov

Add functional tests for this optimization.
Add performance tests as well.

shankar-iyer · 2026-05-11T04:24:50Z

+            /// fold the deterministic part of the TopK plan into the cache key. Same query + same
+            /// part set + same TopK params → cache hit; different LIMIT or sort column → fresh
+            /// entry, never reusing a row-set computed under different TopK conditions.
+            if (const auto & top_k_filter_info = read_from_merge_tree->getTopKFilterInfo())


I will study this in detail - this is where I got stuck in my branch.

`tryOptimizeTopK` was hard-disabling `reader_settings.use_query_condition_cache` inside `ReadFromMergeTree::setTopKColumn`, so chunks never carried `MarkRangesInfo` and `FilterTransform::writeIntoQueryConditionCache` had nothing to flush. With that flag off, the WHERE-filter QCC entries we now key by `TopKFilterInfo::condition_hash` were never written, so the read paths in `MergeTreeDataSelectExecutor::filterPartsByQueryConditionCache` and the index-analysis branch of `selectRangesToRead` always missed. Drop the unconditional disable in `setTopKColumn`. With the filter pushdown now pulling `__topKFilter` into `query_info.filter_actions_dag`, the determinism check in both write paths (`updateQueryConditionCache` and `selectRangesToRead`) would also bail out, so add `isDeterministicAllowingTopKFilter` (and the matching exception in `isNodeDeterministic`) to treat `__topKFilter` as deterministic for QCC purposes: its non-determinism is bounded by the TopK plan parameters that are already folded into the cache key, and the threshold trajectory only tightens during execution, so a granule that the outer WHERE zeroes out under `__topKFilter` is one whose rows could not have reached the final result under any threshold trajectory. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

A stateless test (`04217_query_condition_cache_topk`) verifies the QCC populates and partitions entries by the TopK plan parameters: re-running the same `WHERE ... ORDER BY <col> LIMIT n` reuses the same cache entry, while changing `LIMIT`, the sort direction, or the sort column writes a fresh entry instead. A performance test (`query_condition_cache_topk`) compares `use_query_condition_cache = 1` vs `0` for an `ORDER BY ... LIMIT n` query under `use_top_k_dynamic_filtering = 1` so the speedup the QCC delivers on subsequent runs of the same TopK plan is visible in CI perf comparisons. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

alexey-milovidov · 2026-05-11T21:52:40Z

There is no visible difference on the newly introduced performance test.

The previous version used `WHERE URL LIKE '%google%' ORDER BY EventTime LIMIT 10` against `hits_100m_single`, which matches at least one row in nearly every granule of `hits`, so the QCC had no granules to skip and the comparison reported essentially the same time with and without the cache (87.9 ms vs 87.3 ms before, 84.5 ms vs 84.0 ms after — all within noise). Switch to a self-contained synthetic table so the granule-level selectivity of the WHERE filter is controlled: * 50M rows in a `MergeTree` with `index_granularity = 8192` (~6 100 granules per part). * `v2` is `rand()` per row, so per-part minmax statistics span the full `UInt32` range and cannot prune parts on `v2 = constant`. * Exactly one row is injected with `v2 = 12345`, so the matching row sits in a single granule; the other ~6 100 granules have zero matches and become skippable in the QCC after the prewarm run. `optimize_move_to_prewhere` is also explicitly disabled so the WHERE predicate stays as a separate `FilterStep` above `ReadFromMergeTree` and the dynamic-filtering branch of `tryOptimizeTopK` applies (it bails out when `getPrewhereInfo()` is already set), matching the path the QCC writer in `updateQueryConditionCache` follows. Local measurement on the new layout (median over 7 runs, after prewarm): * `use_query_condition_cache = 1`: ~2 ms * `use_query_condition_cache = 0`: ~8 ms so the speedup attributable to this PR is visible (~4×) in the CI performance comparison instead of being hidden inside the existing TopK dynamic-filtering speedup.

alexey-milovidov · 2026-05-12T15:26:10Z

Still no significant difference - correct the test so that it will be visible.

…eedup The previous version used `v1 = number` and `LIMIT 10`, which makes the TopK threshold settle to the 10th-smallest `v1` value across the whole table. With a random PK (`id = rand()`) those 10 winning rows are scattered across ~10 random granules, so `__topKFilter` rejects every row in the remaining ~6 100 granules and FilterTransform never sees them. Without FilterTransform input there is no QCC write for those granules, so the cache only ends up covering a handful of granules per part and the second-run speedup is small (within the ~5% perf-comparison noise threshold on CI, e.g. 20.7 ms vs 20.7 ms). Switch the synthetic schema so QCC has full per-granule coverage: * `v1 = number % 1000` instead of `v1 = number`. Each granule of 8 192 rows now contains ~8 rows with `v1 = 0`. The TopK threshold still settles to `0` almost immediately, but `__topKFilter` keeps those ~8 rows per granule and they reach the WHERE filter — letting QCC record a "no match" entry for every granule on the prewarm run. * The single matching row (with `v2 = 12345`) is now also forced to `v1 = 0`, so it survives `__topKFilter` and the query still returns the planted row (otherwise it would be filtered out by PREWHERE and the result would be empty). * Doubled to 100M rows / ~12 200 granules so the cold-cache scan time is large enough that the speedup is unambiguous. Local measurement on the new layout (median over 5 runs, after prewarm, `max_threads=12` to match the perf-comparison default): * `use_query_condition_cache = 1`: ~2 ms * `use_query_condition_cache = 0`: ~45 ms so the speedup attributable to this PR is ~22× and far above the CI stat threshold instead of being lost inside the existing TopK dynamic-filtering speedup.

The salt that partitions QCC entries by TopK plan parameters previously omitted two ORDER BY clauses with semantic impact: `NULLS FIRST` / `NULLS LAST` and `COLLATE`. Two queries that differ only by those parts of the sort description currently collide in the cache, so granule decisions made under one ordering rule could be reused under another — which is unsound for Nullable columns or string sorts with different locales. Fold `sort_col_desc.nulls_direction` and (when present) `sort_col_desc.collator->getLocale()` into the `SipHash` seed alongside the other planning-time parameters so each combination gets its own entry. The skip-index path of `tryOptimizeTopK` already rejects Nullable columns and collated sorts on its `skip_index_type_eligible` guard, but the dynamic-filtering path does not, so the QCC salt has to carry these parameters too. Defensive: no test failures observed; the existing `04217_query_condition_cache_topk` continues to pass.

clickhouse-gh · 2026-05-13T00:55:09Z

+    </settings>
+
+    <!-- QCC enabled: prewarm primes the cache, measured runs reuse it. -->
+    <query>SELECT v1 FROM qcc_topk_test WHERE v2 = 12345 ORDER BY v1 ASC LIMIT 10 SETTINGS use_query_condition_cache = 1 FORMAT Null</query>


This is still not showing a visible gain in the PR performance discussion.

For a Performance Improvement PR, we need this benchmark to demonstrate a stable delta in CI. Please adjust the benchmark shape so the improvement is measurable (for example, explicit warmup vs measured query phases) and the use_query_condition_cache = 1 case is clearly better than the = 0 baseline.

This is still live on the current head.

The perf-comparison shards for this PR are currently still PENDING, so there is still no CI-side measurement confirming a stable delta for query_condition_cache_topk.xml under the Performance Improvement contract.

Please keep this thread open until the perf report shows a clear and stable improvement for the use_query_condition_cache = 1 case versus = 0.

This is still not demonstrated by the current benchmark shape.

The current file still measures the same tests/performance/query_condition_cache_topk.xml:62 / :65 pair, but completed perf runs after the rewrite show the use_query_condition_cache = 1 query within noise rather than visibly faster. For example, on 18908b0c:

amd_release: query 0 old 0.1221s, new 0.1268s, diff +3.8%, threshold 4.8%

arm_release: query 0 old 0.0704s, new 0.0702s, diff -0.3%, threshold 0.7%

The disabled-cache query is in the same range, so the Performance Improvement contract still lacks CI evidence. Please reshape the predicate so the baseline really pays the wide-column scan cost and the QCC-on query is a changed faster row in the perf report, while the use_query_condition_cache = 0 query stays neutral.

The PREWHERE write path in `MergeTreeSelectProcessor::read` uses an unsalted `output->getHash()`, so salting only the PREWHERE read side made the keys diverge — for TopK plans with a deterministic PREWHERE (for example, skip-index TopK plans) the lookup never hit the entry the writer had just stored. Restrict the TopK salt to the WHERE lookup path, which already pairs with a salted write in `updateQueryConditionCache` and the secondary write path in `selectRangesToRead`.

With 100M rows the per-run absolute QCC speedup (~10 ms of WHERE-filter scan) is below the comparison harness's noise threshold once ~50 ms of unrelated per-query overhead is folded in, so the test showed no measurable delta between `use_query_condition_cache = 1` and `= 0`. Scale the fill to 500M rows (~5 GiB on disk) so the data-scan cost dominates query overhead and the cached-granule pruning produces a clear, stable improvement in the report.

The previous shape relied on `__topKFilter` keeping ~8 rows per granule for the WHERE filter to evaluate, but in the comparison harness the actual granule-skipping savings on subsequent runs were not visible above the noise floor — both the `use_query_condition_cache = 1` and `= 0` queries landed at ~280 ms on `amd_release`. Reshape the benchmark to a `WHERE` that matches no rows in the table (`payload = 'no_match'` against a `FixedString(8)` filled with random `%1000` values). With every granule yielding `0/N` after the WHERE filter, the prewarm run writes a "skip me" entry for every granule of every part, and every measured run with the QCC enabled returns without reading any data — which is the speedup the QCC-enabled query is supposed to measure. The QCC-disabled query keeps paying the full WHERE-filter cost on every run, giving a clear delta between the two queries in the perf comparison.

The earlier shape of the perf test (100M rows of `FixedString(8)` filled from `toString(rand() % 1000)`) compressed so well that every granule read was a few KB. The QCC granule-skipping saved real wall time (measured ~5x locally), but in the CI perf harness the savings were swallowed by per-query overhead (planning, network, pipeline setup), producing the unstable ~1-4% delta the perf comparison job reported. Rewrite the benchmark so the scan cost dominates over per-query setup: * 10M rows × `FixedString(256)` filled with `randomFixedString(256)` (incompressible, ~2.5 GB on disk; every granule decompression is a measurable chunk of work). * Same `payload = 'no_match'` WHERE predicate matches no rows, so the QCC entry written by the prewarm marks every granule as "skip", and the measured runs return without scanning. Locally this widens the QCC-on vs QCC-off gap from a fraction of a millisecond to roughly 20x (1ms cached vs ~36ms full scan on 10M rows), which is large enough to survive the CI perf comparator's noise floor on both `amd_release` and `arm_release` shards. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

alexey-milovidov · 2026-05-14T16:52:01Z

Pushed 3c6741458b6 to address @clickhouse-gh's Performance Improvement request: the perf test now uses 10M rows of FixedString(256) filled with randomFixedString(256) so each granule scan reads an order-100 MB chunk of incompressible bytes. Locally this widens the QCC-on vs QCC-off gap from a noisy fraction-of-a-millisecond into a ~20x speedup (1 ms cached vs ~36 ms full scan), which should clear the CI perf comparator's noise floor on both amd_release and arm_release shards.

The two failing checks at d46a8dce38b are unrelated to this PR:

AST fuzzer (amd_debug, targeted) → Received signal 6 (STID: 4193-4b55) — a logical error in OptimizeShardingKeyRewriteInVisitor.cpp:183. The fuzzer reduced its way to SELECT (2 IN tuple(-2)) FROM dist_01756 ... GROUP BY ... (dummy IN tuple()) AND .... The (dummy IN tuple()) triggers tuple.back() on an empty Tuple inside OptimizeShardingKeyRewriteIn::enterImpl. Pre-existing UB in distributed-query AST optimization, no relation to QCC/TopK. I'll send a separate fix.
Stateless tests (amd_llvm_coverage, ParallelReplicas, s3 storage, parallel) → 01710_projection_additional_filters — known flaky, tracked in Test 01710_projection_additional_filters is flaky #104219 (assigned to @azat). The same 0 0 0 / -3 vs +23 reference diff has been firing across many PRs daily since 2026-05-05 and has appeared on master CI as well.

@groeneai, the AST fuzzer signal-6 above is unrelated to this PR — please investigate the OptimizeShardingKeyRewriteIn empty-tuple crash and provide a fix in a separate PR. If a fix is already in progress, please link it here.

groeneai · 2026-05-14T23:49:30Z

@alexey-milovidov — confirmed the OptimizeShardingKeyRewriteIn empty-tuple crash and its fix is already in flight:

Fix PR: Fix exception on IN tuple() against Distributed sharded table #104966 — "Fix exception on IN tuple() against Distributed sharded table"
STID: 4193-4b55

Root cause matches your description: the analyzer-side OptimizeShardingKeyRewriteIn::enterImpl at src/Interpreters/OptimizeShardingKeyRewriteInVisitor.cpp:183 calls tuple.back() on a Tuple reduced by the AST fuzzer to () (empty). The AST-side OptimizeShardingKeyRewriteInMatcher::visit (same file, lines 124-125) already guards the same operation with tuple.size() > 1; the analyzer-side path didn't, so the same query reaches the analyzer path under the new analyzer and trips UB. The fix in #104966 adds the symmetric if (tuple.empty()) return; guard and ships a regression test (04238_shardkey_rewrite_in_empty_tuple).

CI on #104966 head 3ae9adc: 4 failures, none related to the fix (the change only affects the IN tuple() against a sharded Distributed table code path):

04229_trivial_group_by_limit_profile_events (Stateless, amd_llvm_coverage, ParallelReplicas, s3) — 5 PR fails / 0 master in last 14d, known flaky.
Hung check failed on Stress test (amd_debug) — 13 fails / 11 PRs / 2 master in 14d, chronic infra flake.
Hung check failed on Stress test (arm_ubsan) — 10 fails / 10 PRs / 0 master, chronic infra flake.
Bugfix validation (functional tests) — single FAIL row for 04238_shardkey_rewrite_in_empty_tuple on head_ref=fix-shardkey-rewrite-empty-tuple, consistent with the expected "test fails on master HEAD (no fix), passes on PR HEAD (with fix)" check. The same test passes on regular Stateless runs of the PR head (e.g. amd_msan, WasmEdge).

I'll continue monitoring #104966 via CIDB; once it merges I'll verify master goes clean on STID 4193-4b55.

Tracked in groeneai task 2026-05-14-github-clickhouseclickhouse104478--.

…d `COLLATE` The functional test `04217_query_condition_cache_topk` already covered the `LIMIT`, sort direction, and sort column dimensions of the TopK QCC salt. Add focused assertions for the two remaining inputs: * `Nullable(UInt32)` sort column, `ORDER BY n ASC NULLS FIRST` vs `... NULLS LAST` — same query shape, different `nulls_direction` in the salt, must produce a fresh QCC entry. * `String` sort column with `use_top_k_dynamic_filtering_for_variable_length_types` so `tryOptimizeTopK` takes the dynamic-filtering branch, and `ORDER BY s ASC COLLATE 'en_US'` vs `... COLLATE 'fr'` — same query shape, different collator locale, must produce a fresh entry. This protects the new hash inputs in `tryOptimizeTopK` from future regressions: any change that drops `nulls_direction` or the collator locale from `TopKFilterInfo::condition_hash` would cause two different queries to alias the same QCC entry, and these assertions would catch the drop. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The previous version of the test put the empty `tuple()` in `WHERE` (and in a `SELECT`-only position with no shard-pruning WHERE), so: * `WHERE dummy IN tuple()` is always false; `optimize_skip_unused_shards` fully prunes the cluster before the rewriter runs (`shards > 1` precondition fails in `ClusterProxy::executeQuery`), and the empty-tuple branch in `OptimizeShardingKeyRewriteIn::enterImpl` is never reached. * `SELECT (dummy IN tuple()) FROM dist` with no WHERE filter does not set `query_info.optimized_cluster`, so again the rewriter is skipped. As a result the test reported `[ OK ]` on the master-HEAD release binary used by the bugfix-validation job, so the validation framework reported "Failed to reproduce the bug" and marked the check red. Switch the trigger to the actual fuzz-reduced pattern (PR #104478, `fuzzer.log.zst`, query around `Fuzzing step 716`): SELECT (2 IN tuple(-2)) FROM dist WHERE dummy IN (0, 2) GROUP BY (dummy IN tuple()) AND materialize(-2147483649) AND (NULL GLOBAL IN (*)) FORMAT Null; The non-empty `WHERE dummy IN (0, 2)` keeps `optimized_cluster` populated with two shards, so the rewriter runs and visits the empty `(dummy IN tuple())` expression in `GROUP BY`. Verified locally: * master HEAD without the fix: aborts in `contrib/llvm-project/libcxx/include/__vector/vector.h:444: libc++ Hardening assertion !empty() failed: back() called on an empty vector` (exit code 134). * With the fix: query completes cleanly. Bugfix-validation report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=104966&sha=95b2f912df1456d5d0b5d09d431882e3b9855fc3&name_0=PR&name_1=Bugfix%20validation%20%28functional%20tests%29 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The Fast test build is compiled without ICU, so any query using `COLLATE` fails with `Collations support is disabled, because ClickHouse was built without ICU library` (`SUPPORT_IS_DISABLED`), which broke `04217_query_condition_cache_topk` in the Fast test job. Split the COLLATE locale assertions out into a separate test `04238_query_condition_cache_topk_collate.sql` tagged `no-fasttest` so the `NULLS FIRST/LAST` and TopK plan-parameter assertions keep running in Fast test, while the COLLATE assertions still run in regular stateless builds that include ICU. CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=104478&sha=589ab32c683d193b4699fe7b0aae356701a7ff96&name_0=PR&name_1=Fast%20test Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Master already has `04238_incomplete_utf8_column_name_json.sql`; rename the new COLLATE test to `04240_query_condition_cache_topk_collate.sql` so test numbers stay unique. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The previous version replaced the original `VirtualColumnUtils::isDeterministic` call at the QCC write site in `selectRangesToRead` with the local `isNodeDeterministic`, which only checks `FUNCTION` nodes. That weakened the gate: a filter containing a non-deterministic constant `COLUMN` node (e.g. `now()`, `today()`) could now pass and have its `condition_hash` written to the query condition cache, then be reused by a later query where the constant value has changed, causing stale / incorrect mark pruning. Restore the original `COLUMN`-node check by introducing `isDeterministicAllowingTopKFilter`, a sibling of `VirtualColumnUtils::isDeterministic` that admits `__topKFilter` (it is the only "officially" non-deterministic function expected here, and its non-determinism is bounded by the salt combined into `condition_hash` below). This mirrors the same helper already used on the write side in `updateQueryConditionCache.cpp`, so the determinism predicates on both sides agree. `isNodeDeterministic` reverts to its master form; the row-policy gate that shares it is unaffected. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The `04240` prefix is now also used by master tests (`04240_intexp2_jit_overflow`, `04240_jit_float_to_big_int_libcalls`, `04240_80087_clickhouse_local_logs_on_exception`). Move our `COLLATE` test to `04242` so each test still has its own numeric prefix. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

clickhouse-gh · 2026-05-27T09:09:30Z

LLVM Coverage Report

Metric	Baseline	Current	Δ
Lines	84.10%	84.10%	+0.00%
Functions	91.40%	91.40%	+0.00%
Branches	76.60%	76.50%	-0.10%

Changed lines: 98.78% (81/82) | lost baseline coverage: 1 line(s) · Uncovered code

Full report · Diff report

This commit reverts the materialization change in src/Processors/Transforms/FinishSortingTransform.cpp added by the previous commit, and keeps only the documentation test under tests/queries/0_stateless/04300_stid_3336_finishsorting_columnreplicated.sql. The reported "Bad cast from type DB::ColumnReplicated to DB::ColumnString" exception captured in CI on PR ClickHouse#102882, PR ClickHouse#104478, PR ClickHouse#105893 does not reproduce deterministically on master HEAD on a debug build, so the preceding source change was speculative and is dropped. The test stays as a name-anchored reproducer for the query shape captured in CI. CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=102882&sha=1de47ad4469490e183bd76c5dad2eddd82d61422&name_0=PR&name_1=Stateless%20tests%20%28amd_msan%2C%20WasmEdge%2C%20parallel%2C%201%2F2%29

clickhouse-gh Bot added the pr-performance Pull request with some performance improvements label May 9, 2026

clickhouse-gh Bot reviewed May 9, 2026

View reviewed changes

Comment thread src/Processors/QueryPlan/Optimizations/updateQueryConditionCache.cpp

alexey-milovidov added 4 commits May 9, 2026 22:27

Merge remote-tracking branch 'origin/master' into qcc-for-topn

442ab8c

clickhouse-gh Bot reviewed May 10, 2026

View reviewed changes

Comment thread src/Processors/QueryPlan/ReadFromMergeTree.cpp

alexey-milovidov and others added 2 commits May 10, 2026 17:43

Merge remote-tracking branch 'origin/master' into qcc-for-topn

543d176

alexey-milovidov commented May 10, 2026

View reviewed changes

alexey-milovidov marked this pull request as draft May 10, 2026 23:06

shankar-iyer self-assigned this May 11, 2026

shankar-iyer reviewed May 11, 2026

View reviewed changes

alexey-milovidov and others added 2 commits May 11, 2026 15:49

clickhouse-gh Bot reviewed May 12, 2026

View reviewed changes

Comment thread src/Processors/QueryPlan/Optimizations/optimizeTopK.cpp

alexey-milovidov added 3 commits May 12, 2026 18:48

Merge branch 'master' into qcc-for-topn

e5898a2

clickhouse-gh Bot reviewed May 13, 2026

View reviewed changes

Comment thread src/Storages/MergeTree/MergeTreeDataSelectExecutor.cpp Outdated

clickhouse-gh Bot reviewed May 13, 2026

View reviewed changes

alexey-milovidov added 4 commits May 13, 2026 09:13

Merge remote-tracking branch 'origin/master' into qcc-for-topn

a2afbac

alexey-milovidov mentioned this pull request May 14, 2026

Fix exception on IN tuple() against Distributed sharded table #104966

Open

1 task

clickhouse-gh Bot reviewed May 14, 2026

View reviewed changes

Comment thread tests/queries/0_stateless/04217_query_condition_cache_topk.sql

alexey-milovidov and others added 3 commits May 15, 2026 15:49

Merge remote-tracking branch 'origin/master' into qcc-for-topn

dbe5f65

clickhouse-gh Bot reviewed May 15, 2026

View reviewed changes

Comment thread src/Processors/QueryPlan/ReadFromMergeTree.cpp Outdated

alexey-milovidov and others added 3 commits May 16, 2026 21:02

Merge remote-tracking branch 'origin/master' into qcc-for-topn

495b3d8

This was referenced May 16, 2026

Make filesystem() honor the memory limit when loading file content #104956

Merged

Stop the bleeding in function_prop_fuzzer #105146

Merged

groeneai mentioned this pull request May 18, 2026

Fix LOGICAL_ERROR 'Cannot add simple transform to empty Pipe' in ReadFromMergeTree::readInOrder #105132

Open

1 task

Algunenano mentioned this pull request May 22, 2026

[CI flaky crash] SIGSEGV in MergeTreeReaderCompact::readData (SerializationNumber<UInt64>) on "db disk, distributed plan, sequential" shard #105626

Open

alexey-milovidov added 2 commits May 23, 2026 04:42

Merge remote-tracking branch 'origin/master' into qcc-for-topn

b1e4747

Merge remote-tracking branch 'origin/master' into qcc-for-topn

18908b0

groeneai mentioned this pull request May 28, 2026

Add documentation test for STID 3336-2f87 #106024

Open

1 task

Merge remote-tracking branch 'origin/master' into qcc-for-topn

7af0425

Conversation

alexey-milovidov commented May 9, 2026

Changelog category (leave one):

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Documentation entry for user-facing changes

Why

What changed

Origin

Uh oh!

clickhouse-gh Bot commented May 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

AI Review

Summary

Missing context / blind spots

Findings

Tests

Final Verdict

Uh oh!

Uh oh!

Uh oh!

alexey-milovidov left a comment

Choose a reason for hiding this comment

Uh oh!

shankar-iyer May 11, 2026

Choose a reason for hiding this comment

Uh oh!

alexey-milovidov commented May 11, 2026

Uh oh!

Uh oh!

alexey-milovidov commented May 12, 2026

Uh oh!

Uh oh!

clickhouse-gh Bot May 13, 2026

Choose a reason for hiding this comment

Uh oh!

clickhouse-gh Bot May 15, 2026

Choose a reason for hiding this comment

Uh oh!

clickhouse-gh Bot May 31, 2026

Choose a reason for hiding this comment

Uh oh!

alexey-milovidov commented May 14, 2026

Uh oh!

Uh oh!

groeneai commented May 14, 2026

Uh oh!

Uh oh!

clickhouse-gh Bot commented May 27, 2026

LLVM Coverage Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

clickhouse-gh Bot commented May 9, 2026 •

edited

Loading