Skip to content

Fix wrong results with distributed index analysis and query condition cache#98269

Merged
azat merged 2 commits intoClickHouse:masterfrom
azat:dia-qcc-fix
Mar 2, 2026
Merged

Fix wrong results with distributed index analysis and query condition cache#98269
azat merged 2 commits intoClickHouse:masterfrom
azat:dia-qcc-fix

Conversation

@azat
Copy link
Member

@azat azat commented Feb 27, 2026

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fix wrong results with distributed index analysis (experimental feature) and query condition cache

Writes to QCC depends on part_index_in_query which was incorrect after DIA

Also QCC may split part ranges, and we will have more ranges, but this is OK, since it will be still correct, but only do more work. To address DIA should support analysis of separate ranges.

This pops up during testing

@clickhouse-gh
Copy link
Contributor

clickhouse-gh bot commented Feb 27, 2026

Workflow [PR], commit [cfdab89]

Summary:

job_name test_name status info comment
BuzzHouse (amd_debug) failure
Logical error: Incorrect mark rows for part A for mark #B (compressed offset C, decompressed offset D), in-memory E, on disk F, total marks G (STID: 5629-58bc) FAIL cidb

@clickhouse-gh clickhouse-gh bot added the pr-bugfix Pull request with bugfix, not backported by default label Feb 27, 2026
@azat azat requested a review from KochetovNicolai February 27, 2026 20:47
… cache

Writes to QCC depends on part_index_in_query which was incorrect after DIA

Also QCC may split part ranges, and we will have more ranges, but this
is OK, since it will be still correct, but only do more work. To address
DIA should support analysis of separate ranges.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@KochetovNicolai KochetovNicolai self-assigned this Mar 2, 2026
Copy link
Member

@KochetovNicolai KochetovNicolai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine.
I guess this code used the invariant that ranges would be sorted

@azat
Copy link
Member Author

azat commented Mar 2, 2026

Yes, it assumes that both are sorted, otherwise it cannot be match here -

if (it_result != result.parts_with_ranges.end() && it_parts->part_index_in_query == it_result->part_index_in_query)

And will assume that the part was erased completely

@azat azat enabled auto-merge March 2, 2026 21:08
@azat azat added this pull request to the merge queue Mar 2, 2026
Merged via the queue into ClickHouse:master with commit 2ee5214 Mar 2, 2026
146 of 148 checks passed
@azat azat deleted the dia-qcc-fix branch March 2, 2026 21:21
@robot-ch-test-poll3 robot-ch-test-poll3 added the pr-synced-to-cloud The PR is synced to the cloud repo label Mar 2, 2026
@robot-ch-test-poll4 robot-ch-test-poll4 added the pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR label Mar 2, 2026
robot-clickhouse-ci-2 added a commit that referenced this pull request Mar 2, 2026
Cherry pick #98269 to 26.2: Fix wrong results with distributed index analysis and query condition cache
robot-clickhouse added a commit that referenced this pull request Mar 2, 2026
@robot-clickhouse-ci-1 robot-clickhouse-ci-1 added the pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore label Mar 2, 2026
clickhouse-gh bot added a commit that referenced this pull request Mar 3, 2026
Backport #98269 to 26.2: Fix wrong results with distributed index analysis and query condition cache
alexey-milovidov added a commit that referenced this pull request Mar 5, 2026
…ry condition cache

PR #82380 moved Query Condition Cache filtering before the distributed
index analysis code path. QCC can split a single mark range into multiple
ranges when it has cached data about which marks don't match. PR #98269
removed the same assertion from `distributedIndexAnalysis.cpp` but missed
this one in `ReadFromMergeTree.cpp`. The assertion is unnecessary since
the ranges are immediately replaced on the next line.

https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=98770&sha=ad7e094fdd4b057b0bac0acb5b505270f79f3b3d&name_0=PR&name_1=AST%20fuzzer%20%28amd_debug%29

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore pr-bugfix Pull request with bugfix, not backported by default pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR pr-synced-to-cloud The PR is synced to the cloud repo v26.2-must-backport

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants