Fix exception when finding minimum size column during merge #95073

Merged
alexey-milovidov merged 3 commits into master from fix-minimum-size-column-not-found
Jan 25, 2026

@alexey-milovidov
Member

@alexey-milovidov alexey-milovidov commented Jan 25, 2026

When no physical columns are needed for reading a part during merge, `injectRequiredColumns` tries to find a minimum-size column to determine the row count. Previously, it searched among the storage metadata columns, but the part might not have files for any of those columns if the table schema changed after the part was created.

The fix uses the part's own columns instead of metadata columns, ensuring we always find a column that exists in the specific part.
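
To illustrate the difference between the two candidate sets, here is a minimal, self-contained sketch. It is not the ClickHouse implementation: `PartColumnSizes`, `findMinimumSizeColumn`, and `columnsOf` are hypothetical names invented for this example, and a part is modeled as a plain map from column name to on-disk size.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a data part: column name -> on-disk size in bytes.
// Only columns that actually have files in this part appear in the map.
using PartColumnSizes = std::map<std::string, std::uint64_t>;

// Returns the smallest column among `candidates` that is present in the part,
// or std::nullopt if none of the candidates has files in the part.
std::optional<std::string> findMinimumSizeColumn(
    const std::vector<std::string> & candidates,
    const PartColumnSizes & part)
{
    std::optional<std::string> best;
    std::uint64_t best_size = 0;
    for (const auto & name : candidates)
    {
        auto it = part.find(name);
        if (it == part.end())
            continue; // the part has no files for this column
        if (!best || it->second < best_size)
        {
            best = name;
            best_size = it->second;
        }
    }
    return best;
}

// Lists the part's own columns, which is the candidate set the fix describes.
std::vector<std::string> columnsOf(const PartColumnSizes & part)
{
    std::vector<std::string> names;
    names.reserve(part.size());
    for (const auto & entry : part)
        names.push_back(entry.first);
    return names;
}
```

In this model, a part written with columns `a` and `b` and a table whose current metadata lists only `c` and `d` yields no result when the candidates come from the metadata, while passing `columnsOf(part)` always finds a column as long as the part is non-empty.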

Changelog category (leave one):

  • Not For Changelog

See https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=95065&sha=e047a3f739277d9272ec5108eb879df601c7aaf7&name_0=PR&name_1=BuzzHouse%20%28amd_ubsan%29
#95065

When no physical columns are needed for reading a part during merge,
`injectRequiredColumns` tries to find a minimum size column to determine
row count. Previously, it searched among storage metadata columns, but
the part might not have files for any of those columns if the table
schema changed after the part was created.

The fix uses the part's own columns instead of metadata columns,
ensuring we always find a column that exists in the specific part.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@clickhouse-gh
Contributor

clickhouse-gh Bot commented Jan 25, 2026

Workflow [PR], commit [8cd443f]

Summary:

| job_name | test_name | status | info | comment |
|---|---|---|---|---|
| BuzzHouse (amd_debug) | | failure | | |
| | Logical error: 'Inconsistent AST formatting: the query: (STID: 1941-1bfa) | FAIL | cidb, issue | |
| Finish Workflow | | failure | | |
| | python3 ./ci/jobs/scripts/workflow_hooks/pr_body_check.py | failure | | |

alexey-milovidov and others added 2 commits January 25, 2026 14:08
Test that reading from a MergeTree table works correctly after schema
changes (adding new columns, dropping original columns) when the part's
columns differ from the current table metadata.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The test passed but did not actually trigger the bug scenario that was
found by BuzzHouse fuzzer. The fix will be validated by CI (BuzzHouse
should not crash anymore after the fix).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@alexey-milovidov alexey-milovidov self-assigned this Jan 25, 2026
@alexey-milovidov
Member Author

Looks reasonable.

@alexey-milovidov alexey-milovidov merged commit 0dc0971 into master Jan 25, 2026
130 of 134 checks passed
@alexey-milovidov alexey-milovidov deleted the fix-minimum-size-column-not-found branch January 25, 2026 19:49
@robot-clickhouse-ci-1 robot-clickhouse-ci-1 added the pr-synced-to-cloud (The PR is synced to the cloud repo) label Jan 25, 2026
@alexey-milovidov
Member Author

The fix is strange. Now we can read from a column that no longer exists in the storage. But there should be no such cases.

Please revert if this fix has complications.

@clickgapai
Contributor

Hi @alexey-milovidov — while reviewing this PR I found the following:

Happy to discuss — close anything that's wrong or already addressed.
