Fix incorrect or missed pruning when startsWith, LIKE, NOT LIKE used with FixedString column #99001

Merged
nihalzp merged 5 commits into ClickHouse:master from nihalzp:fix-fixedstring-prefix-keycondition
Mar 9, 2026

nihalzp (Member) commented Mar 8, 2026

Given,

CREATE TABLE test (fixed_string_col FixedString(40), text String) ENGINE = MergeTree ORDER BY fixed_string_col SETTINGS index_granularity = 1;

INSERT INTO test VALUES ('11','plain-11'),('110','plain-110'),('11z','plain-11z'),('12','plain-12'),('21','plain-21'),('999','plain-999');

Wrong answer:

SELECT count()
FROM test
WHERE startsWith(fixed_string_col, '11'); -- returns 1 but should be 3
SELECT count()
FROM test
WHERE fixed_string_col LIKE '11%'; -- returns 1 but should be 3
SELECT text
FROM test
WHERE fixed_string_col NOT LIKE '99%'; -- reads more than needed: the `LIKE` interval was incorrectly small, so the `NOT LIKE` interval became too large
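
One way to see why the count drops from 3 to 1 is to model the prefix-to-range conversion on raw bytes. The sketch below is a simplified illustration in Python, not ClickHouse's actual code: `pad`, `prefix_range`, and `count_in_range` are hypothetical helpers, and the assumption is that a prefix condition becomes a half-open range whose upper bound is the prefix with its last byte incremented. If the literal '11' is first padded to FixedString(40), that range collapses around the single padded value.

```python
# Simplified model of prefix-to-range conversion; not ClickHouse's actual code.
N = 40  # width of FixedString(40)

def pad(value: bytes, n: int = N) -> bytes:
    """Model FixedString(n) storage: right-pad the value with null bytes."""
    return value.ljust(n, b"\x00")

def prefix_range(prefix: bytes):
    """Half-open range [lo, hi) covering every byte string that starts with prefix.

    hi is the prefix with trailing 0xFF bytes dropped and the last remaining
    byte incremented; None means there is no upper bound.
    """
    trimmed = prefix.rstrip(b"\xff")
    if not trimmed:
        return prefix, None
    hi = trimmed[:-1] + bytes([trimmed[-1] + 1])
    return prefix, hi

def count_in_range(keys, lo, hi):
    """Count keys k with lo <= k < hi, using bytewise lexicographic order."""
    return sum(1 for k in keys if lo <= k and (hi is None or k < hi))

keys = [pad(v) for v in (b"11", b"110", b"11z", b"12", b"21", b"999")]

# Literal padded to FixedString(40) first: only the exact padded value b"11" fits.
narrow = count_in_range(keys, *prefix_range(pad(b"11")))
# Literal kept as a plain string: the range is [b"11", b"12").
correct = count_in_range(keys, *prefix_range(b"11"))

print(narrow, correct)  # 1 3
```

The same model applied to the prefix '99' matches no key when the literal is padded and one key ('999') when it is not, which is consistent with the `NOT LIKE` complement being larger than necessary before the fix.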

Additionally, this adds monotonicity information for the FixedString-to-String cast function. As a result, pruning now works for cases like this:

SELECT text
FROM test
WHERE startsWith(CAST(fixed_string_col, 'String'), '11');
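
The reason monotonicity matters for pruning can be sketched as follows. This is a hypothetical Python model, not ClickHouse's implementation: `to_string` stands in for the cast and is assumed to leave the bytes unchanged, the granule boundaries are invented for illustration, and `granule_may_match` is an illustrative name. The idea is that when a function is order-preserving, the minimum and maximum key of a granule bound the function's values inside it, so a range condition on the function's result can rule the granule out.

```python
# Hypothetical sketch of pruning through a monotonic function; every name is illustrative.
N = 40  # width of FixedString(40)

def pad(value: bytes, n: int = N) -> bytes:
    """Model FixedString(n) storage: right-pad the value with null bytes."""
    return value.ljust(n, b"\x00")

def to_string(fixed: bytes) -> bytes:
    # Assumption: the cast leaves the bytes unchanged, so ordering is preserved.
    return bytes(fixed)

def granule_may_match(key_min, key_max, lo, hi, f=to_string) -> bool:
    """Return False only when f(key) cannot fall in [lo, hi) for any key in the granule.

    Valid only if f is non-decreasing: then f(key) lies in [f(key_min), f(key_max)].
    """
    f_min, f_max = f(key_min), f(key_max)
    return f_min < hi and f_max >= lo

# startsWith(CAST(key, 'String'), '11') modeled as the range [b"11", b"12").
lo, hi = b"11", b"12"
# Invented (min, max) key pairs for three granules.
granules = [
    (pad(b"11"), pad(b"11z")),
    (pad(b"12"), pad(b"21")),
    (pad(b"21"), pad(b"999")),
]
kept = [granule_may_match(a, b, lo, hi) for a, b in granules]
print(kept)  # [True, False, False]
```

Without the monotonicity guarantee, the bounds `f(key_min)` and `f(key_max)` would say nothing about values in between, and every granule would have to be read.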

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fix incorrect or missed pruning when startsWith, LIKE, or NOT LIKE is used with a FixedString column. Additionally, the FixedString-to-String cast function can now prune granules when wrapped around a key column. Closes #98940.

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

Note

Medium Risk
Changes predicate-to-range conversion in KeyCondition and function monotonicity metadata, which can affect query correctness and index pruning behavior for MergeTree scans.

Overview
Fixes incorrect/overly-narrow primary-key range construction when startsWith, LIKE/NOT LIKE, or match are applied to FixedString key columns by preserving string literals instead of converting them to padded FixedString(N) during key-condition analysis.

Marks toString(FixedString) (including LowCardinality(FixedString)) as always strict monotonic, allowing MergeTree pruning when casts wrap the key column. Adds unit tests for the new monotonicity and stateless query tests covering the FixedString prefix/pattern pruning regressions.

Written by Cursor Bugbot for commit 30ae7ae. This will update automatically on new commits. Configure here.

clickhouse-gh bot (Contributor) commented Mar 8, 2026

Workflow [PR], commit [30ae7ae]

Summary:

clickhouse-gh bot added the pr-bugfix label (Pull request with bugfix, not backported by default) Mar 8, 2026

cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

clickhouse-gh bot (Contributor) commented Mar 8, 2026

LLVM Coverage Report

Metric      Baseline   Current   Δ
Lines       83.10%     83.80%    +0.70%
Functions   23.60%     23.90%    +0.30%
Branches    75.20%     76.30%    +1.10%

PR changed-lines coverage: 97.83% (45/46)
Diff coverage report
Uncovered code

Fgrtue self-assigned this Mar 9, 2026
SmitaRKulkarni assigned SmitaRKulkarni and Fgrtue and unassigned Fgrtue Mar 9, 2026
Fgrtue removed their assignment Mar 9, 2026

SmitaRKulkarni (Member) left a comment

LGTM

nihalzp added this pull request to the merge queue Mar 9, 2026
Merged via the queue into ClickHouse:master with commit 6a7811c Mar 9, 2026
161 checks passed
nihalzp deleted the fix-fixedstring-prefix-keycondition branch March 9, 2026 15:43
robot-clickhouse added the pr-synced-to-cloud label (The PR is synced to the cloud repo) Mar 9, 2026
zlareb1 added a commit to zlareb1/ClickHouse that referenced this pull request Mar 17, 2026
…(PR ClickHouse#99001)

PR ClickHouse#99001 fixed incorrect or missed pruning for startsWith, LIKE, NOT LIKE used with FixedString columns (KeyCondition.cpp:3101-3107). The fix keeps the original string constant when the key type (after stripping LowCardinality at line 3070) is String or FixedString.

The existing test (04032_fixedstring_prefix_key_condition) covers plain FixedString and String keys but not LowCardinality(FixedString), which is a reachable path through the same fix. This adds LowCardinality(FixedString(40)) key variants for startsWith, LIKE, NOT LIKE, match, and their CAST forms.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
zlareb1 added a commit to zlareb1/ClickHouse that referenced this pull request Mar 17, 2026
…(PR ClickHouse#99001)

The fix keeps the original string constant when the key type after LowCardinality
stripping is FixedString or LowCardinality(FixedString). The existing test covers
plain FixedString and String keys but not LowCardinality(FixedString), which is a
reachable code path through the same fix.

Extends 04032 with a LowCardinality(FixedString(40)) key table and exercises
startsWith, LIKE, NOT LIKE, and match — with direct column, CAST(..., 'String'),
and CAST(..., 'LowCardinality(String)') forms — all with force_primary_key = 1
to assert pruning is used and row-read counts are tight.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

primary key index for startsWith() uses an unexpected condition for FixedString

4 participants