Skip to content

Enable query condition cache for ORDER BY ... LIMIT n queries#104478

Draft
alexey-milovidov wants to merge 28 commits into
masterfrom
qcc-for-topn
Draft

Enable query condition cache for ORDER BY ... LIMIT n queries#104478
alexey-milovidov wants to merge 28 commits into
masterfrom
qcc-for-topn

Conversation

@alexey-milovidov
Copy link
Copy Markdown
Member

Changelog category (leave one):

  • Performance Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Enable the query condition cache for queries that go through TopK dynamic filtering (ORDER BY ... LIMIT n). The QCC entry is partitioned by the TopK plan parameters so the same query reuses cached WHERE filter results, while a different LIMIT, sort column, or sort direction produces a fresh entry.

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

Why

When tryOptimizeTopK picks a ReadFromMergeTree for TopK dynamic filtering, the running threshold can drop granules before they reach the WHERE/PREWHERE filter. Until now updateQueryConditionCache blanket-skipped QCC writes for such reads:

/// ORDER BY ... LIMIT N can drop granules, don't update qcc for the WHERE filter
if (read_from_merge_tree->isSelectedForTopKFilterOptimization())
    return;

The skip was correct but pessimistic — it disables QCC for every query that hits the TopK path, even when a re-run with the same plan and same parts would reuse exactly the same row set.

What changed

  1. TopKFilterInfo::condition_hash — a deterministic hash over the planning-time TopK parameters: (column_name, type_name, num_sort_columns, limit_n, direction). Computed in tryOptimizeTopK when the read is selected for TopK filtering.
  2. updateQueryConditionCache — instead of skipping for TopK reads, combine top_k_filter_info->condition_hash into the WHERE filter's condition_hash via boost::hash_combine before storing it on FilterStep.

The QCC writer (FilterTransform::writeIntoQueryConditionCache) and the read-side lookup both go through the same FilterStep::condition, so the new combined hash is used consistently on both sides — same plan + same parts → cache hit; different LIMIT n, different sort column → new key, fresh entry, never reusing a row set computed under different TopK conditions.

Origin

Split out from WIP PR #81944 (commit bec285616a2). The PR's original implementation was on its own TopNFilterParameters infrastructure that didn't survive the rebase onto master; this version keeps the same idea but uses master's TopKFilterInfo / tryOptimizeTopK directly.

When `tryOptimizeTopK` selects a `ReadFromMergeTree` for TopK dynamic
filtering, the running threshold can drop granules before they reach the
WHERE/PREWHERE filter. Until now `updateQueryConditionCache` blanket-skipped
QCC writes for such reads, because a WHERE result computed under one TopK
plan would be wrong to reuse under a different one (a different LIMIT
might need granules that were dropped before).

Make the cache key for the WHERE filter depend on the TopK plan, instead
of skipping it entirely:

- Add `condition_hash` to `TopKFilterInfo`, computed deterministically
  from `(column_name, type_name, num_sort_columns, limit_n, direction)`
  in `tryOptimizeTopK`. The hash captures everything in the TopK plan
  that's known at planning time and independent of data.
- In `updateQueryConditionCache`, instead of bailing out for TopK reads,
  combine the TopK `condition_hash` into the WHERE filter's
  `condition_hash` via `boost::hash_combine`. Same query + same part set
  + same TopK plan → QCC entry is reusable; different `LIMIT n`,
  different sort column, or different direction → new key, fresh entry.

The change is symmetric: both the QCC writer
(`FilterTransform::writeIntoQueryConditionCache`) and the read-side
lookup go through the same `FilterStep::condition`, so the new combined
hash is used consistently on both sides without further wiring.

Originally part of WIP PR #81944 (commit `bec285616a2`), adapted to
master's `TopKFilterInfo`/`tryOptimizeTopK` infrastructure.
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh Bot commented May 9, 2026

Workflow [PR], commit [7af0425]

Summary:

job_name test_name status info comment
Stateless tests (arm_asan_ubsan, targeted) FAIL
Server died FAIL cidb

AI Review

Summary

This PR enables QueryConditionCache for TopK reads (ORDER BY ... LIMIT) by salting WHERE QCC keys with TopKFilterInfo::condition_hash, with matching read/write paths and focused partitioning tests. The cache-key correctness issues from prior review are addressed, but the Performance Improvement evidence is still not satisfied because the current perf benchmark has not shown a stable faster row for the QCC-enabled query in completed CI.

Missing context / blind spots
  • ⚠️ The latest CI run for 7af04257 still has performance comparison shards PENDING; a completed current-head perf report showing query_condition_cache_topk.query0 faster would close this gap.
Findings

⚠️ Majors

  • [tests/performance/query_condition_cache_topk.xml:62] The PR promises a measurable Performance Improvement, but this benchmark still does not demonstrate it in completed CI. The current query pair is unchanged from completed runs such as 18908b0c, where query0 (use_query_condition_cache = 1) was 0.1221s -> 0.1268s on amd_release (diff +3.8%, threshold 4.8%) and 0.0704s -> 0.0702s on arm_release (diff -0.3%, threshold 0.7%), while the disabled-cache baseline stayed in the same range. That leaves the benchmark within noise instead of proving that QCC-on skips the wide-column scan. Reshape the predicate/data so the use_query_condition_cache = 0 query pays a real scan cost and the use_query_condition_cache = 1 query is a changed faster row in CI.
Tests
  • ⚠️ Functional coverage covers key partitioning dimensions, but performance evidence is incomplete until a current perf report shows a stable faster result for query_condition_cache_topk.query0.
Final Verdict

⚠️ Request changes: cache-key correctness looks addressed, but the performance contract is still unproven by CI.

@clickhouse-gh clickhouse-gh Bot added the pr-performance Pull request with some performance improvements label May 9, 2026
`ReadFromMergeTree::setTopKColumn` takes its argument by const
reference, so wrapping the argument in `std::move` is a no-op and
trips the `performance-move-const-arg` clang-tidy check on the
arm_tidy build.

Build report:
https://s3.amazonaws.com/clickhouse-test-reports/PRs/104478/cf70625448dce5be22d92d107504f9ee3efb4baf/build_arm_tidy/build_clickhouse/build_clickhouse.log
PR: #104478
`updateQueryConditionCache` salts the WHERE filter's `condition_hash`
with `top_k_filter_info->condition_hash` before storing it on
`FilterStep::condition` (which `FilterTransform::writeIntoQueryConditionCache`
later uses as the cache key). The read path in
`MergeTreeDataSelectExecutor::filterPartsByQueryConditionCache`,
however, was looking up entries using only `dag->getHash()`, so cache
writes done under a TopK plan would never be hit on subsequent reads.

Pass `top_k_filter_info` through to `filterPartsByQueryConditionCache`
and apply the same `boost::hash_combine` on the read side. Same query
+ same part set + same TopK plan → cache hit; different `LIMIT n`,
sort column, or direction → fresh entry.

Reported by `clickhouse-gh` review on
#104478.
On Apple platforms, `std::size_t` is `unsigned long` while `UInt64` is
`unsigned long long` — different types of the same width — so the
non-const reference parameter of `boost::hash_combine` does not bind to
a `UInt64` lvalue and the build fails with:

    error: no matching function for call to 'hash_combine'
    note: candidate function template not viable: no known conversion
          from 'UInt64' (aka 'unsigned long long') to 'std::size_t &'
          (aka 'unsigned long &') for 1st argument

On x86_64 Linux both aliases happen to resolve to `unsigned long`, which
is why CI Linux builds passed but `Build (amd_darwin)` and
`Build (arm_darwin)` failed.

Switch the local seed variable on both the read path
(`MergeTreeDataSelectExecutor::filterPartsByQueryConditionCache`) and
the write path (`updateQueryConditionCache`) to `size_t`. The implicit
conversions to/from `UInt64` at the cache-API boundary are lossless on
all 64-bit platforms ClickHouse supports.

CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=104478&sha=4ab18afb9c043a670ca72211ab85494fd956e3b8&name_0=PR&name_1=Build%20%28amd_darwin%29
PR: #104478
Comment thread src/Processors/QueryPlan/ReadFromMergeTree.cpp
alexey-milovidov and others added 2 commits May 10, 2026 17:43
…ition_hash

`updateQueryConditionCache` and `MergeTreeDataSelectExecutor::filterPartsByQueryConditionCache`
already combine `top_k_filter_info->condition_hash` into the QCC key. There is a third site
that writes to the cache: the distributed-index-analysis branch in
`ReadFromMergeTree::selectRangesToRead`, which fills QCC with ranges excluded by index
analysis. Without the same salt the writes there go under `outputs.front()->getHash()`
while reads go under the salted hash, so TopK queries cannot reuse those entries.

Apply `boost::hash_combine(hash, top_k_filter_info->condition_hash)` consistently with
the other two sites.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Member Author

@alexey-milovidov alexey-milovidov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add functional tests for this optimization.
Add performance tests as well.

@alexey-milovidov alexey-milovidov marked this pull request as draft May 10, 2026 23:06
@shankar-iyer shankar-iyer self-assigned this May 11, 2026
/// fold the deterministic part of the TopK plan into the cache key. Same query + same
/// part set + same TopK params → cache hit; different LIMIT or sort column → fresh
/// entry, never reusing a row-set computed under different TopK conditions.
if (const auto & top_k_filter_info = read_from_merge_tree->getTopKFilterInfo())
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will study this in detail - this is where I got stuck in my branch.

alexey-milovidov and others added 2 commits May 11, 2026 15:49
`tryOptimizeTopK` was hard-disabling `reader_settings.use_query_condition_cache`
inside `ReadFromMergeTree::setTopKColumn`, so chunks never carried `MarkRangesInfo`
and `FilterTransform::writeIntoQueryConditionCache` had nothing to flush. With
that flag off, the WHERE-filter QCC entries we now key by `TopKFilterInfo::condition_hash`
were never written, so the read paths in `MergeTreeDataSelectExecutor::filterPartsByQueryConditionCache`
and the index-analysis branch of `selectRangesToRead` always missed.

Drop the unconditional disable in `setTopKColumn`. With the filter pushdown now
pulling `__topKFilter` into `query_info.filter_actions_dag`, the determinism check
in both write paths (`updateQueryConditionCache` and `selectRangesToRead`) would
also bail out, so add `isDeterministicAllowingTopKFilter` (and the matching
exception in `isNodeDeterministic`) to treat `__topKFilter` as deterministic for
QCC purposes: its non-determinism is bounded by the TopK plan parameters that
are already folded into the cache key, and the threshold trajectory only tightens
during execution, so a granule that the outer WHERE zeroes out under
`__topKFilter` is one whose rows could not have reached the final result under
any threshold trajectory.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A stateless test (`04217_query_condition_cache_topk`) verifies the QCC populates
and partitions entries by the TopK plan parameters: re-running the same
`WHERE ... ORDER BY <col> LIMIT n` reuses the same cache entry, while changing
`LIMIT`, the sort direction, or the sort column writes a fresh entry instead.

A performance test (`query_condition_cache_topk`) compares
`use_query_condition_cache = 1` vs `0` for an `ORDER BY ... LIMIT n` query under
`use_top_k_dynamic_filtering = 1` so the speedup the QCC delivers on
subsequent runs of the same TopK plan is visible in CI perf comparisons.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@alexey-milovidov
Copy link
Copy Markdown
Member Author

There is no visible difference on the newly introduced performance test.

The previous version used `WHERE URL LIKE '%google%' ORDER BY EventTime
LIMIT 10` against `hits_100m_single`, which matches at least one row in
nearly every granule of `hits`, so the QCC had no granules to skip and
the comparison reported essentially the same time with and without the
cache (87.9 ms vs 87.3 ms before, 84.5 ms vs 84.0 ms after — all within
noise).

Switch to a self-contained synthetic table so the granule-level
selectivity of the WHERE filter is controlled:

* 50M rows in a `MergeTree` with `index_granularity = 8192`
  (~6 100 granules per part).
* `v2` is `rand()` per row, so per-part minmax statistics span the full
  `UInt32` range and cannot prune parts on `v2 = constant`.
* Exactly one row is injected with `v2 = 12345`, so the matching row
  sits in a single granule; the other ~6 100 granules have zero matches
  and become skippable in the QCC after the prewarm run.

`optimize_move_to_prewhere` is also explicitly disabled so the WHERE
predicate stays as a separate `FilterStep` above `ReadFromMergeTree`
and the dynamic-filtering branch of `tryOptimizeTopK` applies (it bails
out when `getPrewhereInfo()` is already set), matching the path the
QCC writer in `updateQueryConditionCache` follows.

Local measurement on the new layout (median over 7 runs, after
prewarm):
  * `use_query_condition_cache = 1`:  ~2 ms
  * `use_query_condition_cache = 0`:  ~8 ms

so the speedup attributable to this PR is visible (~4×) in the CI
performance comparison instead of being hidden inside the existing TopK
dynamic-filtering speedup.
Comment thread src/Processors/QueryPlan/Optimizations/optimizeTopK.cpp
@alexey-milovidov
Copy link
Copy Markdown
Member Author

Still no significant difference - correct the test so that it will be visible.

…eedup

The previous version used `v1 = number` and `LIMIT 10`, which makes the
TopK threshold settle to the 10th-smallest `v1` value across the whole
table. With a random PK (`id = rand()`) those 10 winning rows are
scattered across ~10 random granules, so `__topKFilter` rejects every
row in the remaining ~6 100 granules and FilterTransform never sees them.
Without FilterTransform input there is no QCC write for those granules,
so the cache only ends up covering a handful of granules per part and
the second-run speedup is small (within the ~5% perf-comparison noise
threshold on CI, e.g. 20.7 ms vs 20.7 ms).

Switch the synthetic schema so QCC has full per-granule coverage:

  * `v1 = number % 1000` instead of `v1 = number`. Each granule of
    8 192 rows now contains ~8 rows with `v1 = 0`. The TopK threshold
    still settles to `0` almost immediately, but `__topKFilter` keeps
    those ~8 rows per granule and they reach the WHERE filter — letting
    QCC record a "no match" entry for every granule on the prewarm run.
  * The single matching row (with `v2 = 12345`) is now also forced to
    `v1 = 0`, so it survives `__topKFilter` and the query still returns
    the planted row (otherwise it would be filtered out by PREWHERE and
    the result would be empty).
  * Doubled to 100M rows / ~12 200 granules so the cold-cache scan time
    is large enough that the speedup is unambiguous.

Local measurement on the new layout (median over 5 runs, after prewarm,
`max_threads=12` to match the perf-comparison default):

  * `use_query_condition_cache = 1`:   ~2 ms
  * `use_query_condition_cache = 0`:  ~45 ms

so the speedup attributable to this PR is ~22× and far above the CI
stat threshold instead of being lost inside the existing TopK
dynamic-filtering speedup.
The salt that partitions QCC entries by TopK plan parameters previously
omitted two ORDER BY clauses with semantic impact: `NULLS FIRST` /
`NULLS LAST` and `COLLATE`. Two queries that differ only by those parts
of the sort description currently collide in the cache, so granule
decisions made under one ordering rule could be reused under another —
which is unsound for Nullable columns or string sorts with different
locales.

Fold `sort_col_desc.nulls_direction` and (when present)
`sort_col_desc.collator->getLocale()` into the `SipHash` seed alongside
the other planning-time parameters so each combination gets its own
entry.

The skip-index path of `tryOptimizeTopK` already rejects Nullable
columns and collated sorts on its `skip_index_type_eligible` guard, but
the dynamic-filtering path does not, so the QCC salt has to carry these
parameters too.

Defensive: no test failures observed; the existing
`04217_query_condition_cache_topk` continues to pass.
Comment thread src/Storages/MergeTree/MergeTreeDataSelectExecutor.cpp Outdated
</settings>

<!-- QCC enabled: prewarm primes the cache, measured runs reuse it. -->
<query>SELECT v1 FROM qcc_topk_test WHERE v2 = 12345 ORDER BY v1 ASC LIMIT 10 SETTINGS use_query_condition_cache = 1 FORMAT Null</query>
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still not showing a visible gain in the PR performance discussion.

For a Performance Improvement PR, we need this benchmark to demonstrate a stable delta in CI. Please adjust the benchmark shape so the improvement is measurable (for example, explicit warmup vs measured query phases) and the use_query_condition_cache = 1 case is clearly better than the = 0 baseline.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still live on the current head.

The perf-comparison shards for this PR are currently still PENDING, so there is still no CI-side measurement confirming a stable delta for query_condition_cache_topk.xml under the Performance Improvement contract.

Please keep this thread open until the perf report shows a clear and stable improvement for the use_query_condition_cache = 1 case versus = 0.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still not demonstrated by the current benchmark shape.

The current file still measures the same tests/performance/query_condition_cache_topk.xml:62 / :65 pair, but completed perf runs after the rewrite show the use_query_condition_cache = 1 query within noise rather than visibly faster. For example, on 18908b0c:

  • amd_release: query 0 old 0.1221s, new 0.1268s, diff +3.8%, threshold 4.8%
  • arm_release: query 0 old 0.0704s, new 0.0702s, diff -0.3%, threshold 0.7%

The disabled-cache query is in the same range, so the Performance Improvement contract still lacks CI evidence. Please reshape the predicate so the baseline really pays the wide-column scan cost and the QCC-on query is a changed faster row in the perf report, while the use_query_condition_cache = 0 query stays neutral.

The PREWHERE write path in `MergeTreeSelectProcessor::read` uses an unsalted
`output->getHash()`, so salting only the PREWHERE read side made the keys
diverge — for TopK plans with a deterministic PREWHERE (for example, skip-index
TopK plans) the lookup never hit the entry the writer had just stored. Restrict
the TopK salt to the WHERE lookup path, which already pairs with a salted write
in `updateQueryConditionCache` and the secondary write path in
`selectRangesToRead`.
With 100M rows the per-run absolute QCC speedup (~10 ms of WHERE-filter
scan) is below the comparison harness's noise threshold once
~50 ms of unrelated per-query overhead is folded in, so the test
showed no measurable delta between `use_query_condition_cache = 1`
and `= 0`. Scale the fill to 500M rows (~5 GiB on disk) so the
data-scan cost dominates query overhead and the cached-granule
pruning produces a clear, stable improvement in the report.
The previous shape relied on `__topKFilter` keeping ~8 rows per granule for
the WHERE filter to evaluate, but in the comparison harness the actual
granule-skipping savings on subsequent runs were not visible above the
noise floor — both the `use_query_condition_cache = 1` and `= 0` queries
landed at ~280 ms on `amd_release`.

Reshape the benchmark to a `WHERE` that matches no rows in the table
(`payload = 'no_match'` against a `FixedString(8)` filled with random
`%1000` values). With every granule yielding `0/N` after the WHERE filter,
the prewarm run writes a "skip me" entry for every granule of every part,
and every measured run with the QCC enabled returns without reading any
data — which is the speedup the QCC-enabled query is supposed to measure.

The QCC-disabled query keeps paying the full WHERE-filter cost on every
run, giving a clear delta between the two queries in the perf comparison.
The earlier shape of the perf test (100M rows of `FixedString(8)` filled
from `toString(rand() % 1000)`) compressed so well that every granule
read was a few KB. The QCC granule-skipping saved real wall time
(measured ~5x locally), but in the CI perf harness the savings were
swallowed by per-query overhead (planning, network, pipeline setup),
producing the unstable ~1-4% delta the perf comparison job reported.

Rewrite the benchmark so the scan cost dominates over per-query setup:

  * 10M rows × `FixedString(256)` filled with `randomFixedString(256)`
    (incompressible, ~2.5 GB on disk; every granule decompression is a
    measurable chunk of work).
  * Same `payload = 'no_match'` WHERE predicate matches no rows, so the
    QCC entry written by the prewarm marks every granule as "skip",
    and the measured runs return without scanning.

Locally this widens the QCC-on vs QCC-off gap from a fraction of a
millisecond to roughly 20x (1ms cached vs ~36ms full scan on 10M rows),
which is large enough to survive the CI perf comparator's noise floor
on both `amd_release` and `arm_release` shards.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@alexey-milovidov
Copy link
Copy Markdown
Member Author

Pushed 3c6741458b6 to address @clickhouse-gh's Performance Improvement request: the perf test now uses 10M rows of FixedString(256) filled with randomFixedString(256) so each granule scan reads an order-100 MB chunk of incompressible bytes. Locally this widens the QCC-on vs QCC-off gap from a noisy fraction-of-a-millisecond into a ~20x speedup (1 ms cached vs ~36 ms full scan), which should clear the CI perf comparator's noise floor on both amd_release and arm_release shards.

The two failing checks at d46a8dce38b are unrelated to this PR:

  1. AST fuzzer (amd_debug, targeted) → Received signal 6 (STID: 4193-4b55) — a logical error in OptimizeShardingKeyRewriteInVisitor.cpp:183. The fuzzer reduced its way to SELECT (2 IN tuple(-2)) FROM dist_01756 ... GROUP BY ... (dummy IN tuple()) AND .... The (dummy IN tuple()) triggers tuple.back() on an empty Tuple inside OptimizeShardingKeyRewriteIn::enterImpl. Pre-existing UB in distributed-query AST optimization, no relation to QCC/TopK. I'll send a separate fix.
  2. Stateless tests (amd_llvm_coverage, ParallelReplicas, s3 storage, parallel) → 01710_projection_additional_filters — known flaky, tracked in Test 01710_projection_additional_filters is flaky #104219 (assigned to @azat). The same 0 0 0 / -3 vs +23 reference diff has been firing across many PRs daily since 2026-05-05 and has appeared on master CI as well.

@groeneai, the AST fuzzer signal-6 above is unrelated to this PR — please investigate the OptimizeShardingKeyRewriteIn empty-tuple crash and provide a fix in a separate PR. If a fix is already in progress, please link it here.

Comment thread tests/queries/0_stateless/04217_query_condition_cache_topk.sql
@groeneai
Copy link
Copy Markdown
Contributor

@alexey-milovidov — confirmed the OptimizeShardingKeyRewriteIn empty-tuple crash and its fix is already in flight:

Root cause matches your description: the analyzer-side OptimizeShardingKeyRewriteIn::enterImpl at src/Interpreters/OptimizeShardingKeyRewriteInVisitor.cpp:183 calls tuple.back() on a Tuple reduced by the AST fuzzer to () (empty). The AST-side OptimizeShardingKeyRewriteInMatcher::visit (same file, lines 124-125) already guards the same operation with tuple.size() > 1; the analyzer-side path didn't, so the same query reaches the analyzer path under the new analyzer and trips UB. The fix in #104966 adds the symmetric if (tuple.empty()) return; guard and ships a regression test (04238_shardkey_rewrite_in_empty_tuple).

CI on #104966 head 3ae9adc: 4 failures, none related to the fix (the change only affects the IN tuple() against a sharded Distributed table code path):

  • 04229_trivial_group_by_limit_profile_events (Stateless, amd_llvm_coverage, ParallelReplicas, s3) — 5 PR fails / 0 master in last 14d, known flaky.
  • Hung check failed on Stress test (amd_debug) — 13 fails / 11 PRs / 2 master in 14d, chronic infra flake.
  • Hung check failed on Stress test (arm_ubsan) — 10 fails / 10 PRs / 0 master, chronic infra flake.
  • Bugfix validation (functional tests) — single FAIL row for 04238_shardkey_rewrite_in_empty_tuple on head_ref=fix-shardkey-rewrite-empty-tuple, consistent with the expected "test fails on master HEAD (no fix), passes on PR HEAD (with fix)" check. The same test passes on regular Stateless runs of the PR head (e.g. amd_msan, WasmEdge).

I'll continue monitoring #104966 via CIDB; once it merges I'll verify master goes clean on STID 4193-4b55.

Tracked in groeneai task 2026-05-14-github-clickhouseclickhouse104478--.

…d `COLLATE`

The functional test `04217_query_condition_cache_topk` already covered
the `LIMIT`, sort direction, and sort column dimensions of the TopK QCC
salt. Add focused assertions for the two remaining inputs:

  * `Nullable(UInt32)` sort column, `ORDER BY n ASC NULLS FIRST` vs
    `... NULLS LAST` — same query shape, different `nulls_direction`
    in the salt, must produce a fresh QCC entry.
  * `String` sort column with `use_top_k_dynamic_filtering_for_variable_length_types`
    so `tryOptimizeTopK` takes the dynamic-filtering branch, and
    `ORDER BY s ASC COLLATE 'en_US'` vs `... COLLATE 'fr'` — same
    query shape, different collator locale, must produce a fresh
    entry.

This protects the new hash inputs in `tryOptimizeTopK` from future
regressions: any change that drops `nulls_direction` or the collator
locale from `TopKFilterInfo::condition_hash` would cause two different
queries to alias the same QCC entry, and these assertions would catch
the drop.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
alexey-milovidov added a commit that referenced this pull request May 15, 2026
The previous version of the test put the empty `tuple()` in `WHERE` (and in
a `SELECT`-only position with no shard-pruning WHERE), so:

  * `WHERE dummy IN tuple()` is always false; `optimize_skip_unused_shards`
    fully prunes the cluster before the rewriter runs (`shards > 1` precondition
    fails in `ClusterProxy::executeQuery`), and the empty-tuple branch in
    `OptimizeShardingKeyRewriteIn::enterImpl` is never reached.
  * `SELECT (dummy IN tuple()) FROM dist` with no WHERE filter does not set
    `query_info.optimized_cluster`, so again the rewriter is skipped.

As a result the test reported `[ OK ]` on the master-HEAD release binary used
by the bugfix-validation job, so the validation framework reported
"Failed to reproduce the bug" and marked the check red.

Switch the trigger to the actual fuzz-reduced pattern (PR #104478,
`fuzzer.log.zst`, query around `Fuzzing step 716`):

    SELECT (2 IN tuple(-2)) FROM dist
    WHERE dummy IN (0, 2)
    GROUP BY (dummy IN tuple()) AND materialize(-2147483649) AND (NULL GLOBAL IN (*))
    FORMAT Null;

The non-empty `WHERE dummy IN (0, 2)` keeps `optimized_cluster` populated with
two shards, so the rewriter runs and visits the empty `(dummy IN tuple())`
expression in `GROUP BY`. Verified locally:

  * master HEAD without the fix: aborts in
    `contrib/llvm-project/libcxx/include/__vector/vector.h:444:
     libc++ Hardening assertion !empty() failed: back() called on an empty vector`
    (exit code 134).
  * With the fix: query completes cleanly.

Bugfix-validation report:
https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=104966&sha=95b2f912df1456d5d0b5d09d431882e3b9855fc3&name_0=PR&name_1=Bugfix%20validation%20%28functional%20tests%29

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
alexey-milovidov and others added 3 commits May 15, 2026 15:49
The Fast test build is compiled without ICU, so any query using `COLLATE`
fails with `Collations support is disabled, because ClickHouse was built
without ICU library` (`SUPPORT_IS_DISABLED`), which broke
`04217_query_condition_cache_topk` in the Fast test job.

Split the COLLATE locale assertions out into a separate test
`04238_query_condition_cache_topk_collate.sql` tagged `no-fasttest` so
the `NULLS FIRST/LAST` and TopK plan-parameter assertions keep running
in Fast test, while the COLLATE assertions still run in regular
stateless builds that include ICU.

CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=104478&sha=589ab32c683d193b4699fe7b0aae356701a7ff96&name_0=PR&name_1=Fast%20test

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Master already has `04238_incomplete_utf8_column_name_json.sql`; rename
the new COLLATE test to `04240_query_condition_cache_topk_collate.sql`
so test numbers stay unique.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread src/Processors/QueryPlan/ReadFromMergeTree.cpp Outdated
alexey-milovidov and others added 3 commits May 16, 2026 21:02
The previous version replaced the original `VirtualColumnUtils::isDeterministic`
call at the QCC write site in `selectRangesToRead` with the local
`isNodeDeterministic`, which only checks `FUNCTION` nodes. That weakened the
gate: a filter containing a non-deterministic constant `COLUMN` node (e.g.
`now()`, `today()`) could now pass and have its `condition_hash` written to the
query condition cache, then be reused by a later query where the constant value
has changed, causing stale / incorrect mark pruning.

Restore the original `COLUMN`-node check by introducing
`isDeterministicAllowingTopKFilter`, a sibling of `VirtualColumnUtils::isDeterministic`
that admits `__topKFilter` (it is the only "officially" non-deterministic
function expected here, and its non-determinism is bounded by the salt combined
into `condition_hash` below). This mirrors the same helper already used on the
write side in `updateQueryConditionCache.cpp`, so the determinism predicates on
both sides agree.

`isNodeDeterministic` reverts to its master form; the row-policy gate that
shares it is unaffected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The `04240` prefix is now also used by master tests (`04240_intexp2_jit_overflow`,
`04240_jit_float_to_big_int_libcalls`, `04240_80087_clickhouse_local_logs_on_exception`).
Move our `COLLATE` test to `04242` so each test still has its own numeric prefix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh Bot commented May 27, 2026

LLVM Coverage Report

Metric Baseline Current Δ
Lines 84.10% 84.10% +0.00%
Functions 91.40% 91.40% +0.00%
Branches 76.60% 76.50% -0.10%

Changed lines: 98.78% (81/82) | lost baseline coverage: 1 line(s) · Uncovered code

Full report · Diff report

groeneai added a commit to groeneai/ClickHouse that referenced this pull request May 28, 2026
This commit reverts the materialization change in
src/Processors/Transforms/FinishSortingTransform.cpp added by the
previous commit, and keeps only the documentation test under
tests/queries/0_stateless/04300_stid_3336_finishsorting_columnreplicated.sql.

The reported "Bad cast from type DB::ColumnReplicated to DB::ColumnString"
exception captured in CI on PR ClickHouse#102882, PR ClickHouse#104478, PR ClickHouse#105893 does not
reproduce deterministically on master HEAD on a debug build, so the
preceding source change was speculative and is dropped. The test stays
as a name-anchored reproducer for the query shape captured in CI.

CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=102882&sha=1de47ad4469490e183bd76c5dad2eddd82d61422&name_0=PR&name_1=Stateless%20tests%20%28amd_msan%2C%20WasmEdge%2C%20parallel%2C%201%2F2%29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-performance Pull request with some performance improvements

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants