Skip to content

Fix possible crash in aggregate functions after MEMORY_LIMIT_EXCEEDED#92390

Merged
azat merged 1 commit intoClickHouse:masterfrom
azat:aggregate-column-exception-safety
Dec 17, 2025
Merged

Fix possible crash in aggregate functions after MEMORY_LIMIT_EXCEEDED#92390
azat merged 1 commit intoClickHouse:masterfrom
azat:aggregate-column-exception-safety

Conversation

@azat
Copy link
Copy Markdown
Member

@azat azat commented Dec 17, 2025

Changelog category (leave one):

  • Critical Bug Fix (crash, data loss, RBAC) or LOGICAL_ERROR

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fix possible crash in aggregate functions after MEMORY_LIMIT_EXCEEDED

The problem is when ColumnAggregateFunction is reused after exception, this is the case of async_insert=1, where we are trying to load each INSERT batch, and if it fails, continue with other batches.

The problem was that once ColumnAggregateFunction::ensureOwnership() throws, it leaves the column in a broken state, since all aggregation states up to rollback_pos will be broken.

The fix is simple - just copy them from source column again.

And add a unit test.

@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Dec 17, 2025

Workflow [PR], commit [1cd46f9]

Summary:

job_name test_name status info comment
Stateless tests (amd_binary, old analyzer, s3 storage, DatabaseReplicated, sequential) failure
01079_parallel_alter_detach_table_zookeeper FAIL cidb, issue
Stateless tests (amd_msan, parallel) failure
02950_parallel_replicas_used_count FAIL cidb, issue
01548_query_log_query_execution_ms FAIL cidb, issue
BuzzHouse (amd_debug) failure
Logical error: 'Inconsistent AST formatting: the query: (STID: 1941-1bfa) FAIL cidb, issue
BuzzHouse (amd_msan) failure
Server unresponsive: memory limit exceeded FAIL cidb, issue
AST fuzzer (amd_ubsan) error

@novikd novikd self-assigned this Dec 17, 2025
@clickhouse-gh clickhouse-gh bot added pr-critical-bugfix pr-must-backport Pull request should be backported intentionally. Use this label with great care! labels Dec 17, 2025
@azat azat force-pushed the aggregate-column-exception-safety branch from ac79586 to a2747c2 Compare December 17, 2025 14:39
@azat azat force-pushed the aggregate-column-exception-safety branch 2 times, most recently from f19944d to a7fac1a Compare December 17, 2025 15:09
The problem is when ColumnAggregateFunction is reused after exception,
this is the case of async_insert=1, where we are trying to load each
INSERT batch, and if it fails, continue with other batches.

The problem was that once ColumnAggregateFunction::ensureOwnership()
throws, it leaves the column in a broken state, since all aggregation
states up to rollback_pos will be broken.

The fix is simple - just copy them from source column again.

And add a unit test.
@azat azat force-pushed the aggregate-column-exception-safety branch from a7fac1a to 1cd46f9 Compare December 17, 2025 15:33
Copy link
Copy Markdown
Member

@novikd novikd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@azat azat enabled auto-merge December 17, 2025 17:12
@azat azat added this pull request to the merge queue Dec 17, 2025
Merged via the queue into ClickHouse:master with commit 2ba6d59 Dec 17, 2025
125 of 131 checks passed
@azat azat deleted the aggregate-column-exception-safety branch December 17, 2025 18:36
@robot-ch-test-poll robot-ch-test-poll added the pr-synced-to-cloud The PR is synced to the cloud repo label Dec 17, 2025
@robot-ch-test-poll1 robot-ch-test-poll1 added the pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR label Dec 17, 2025
robot-clickhouse added a commit that referenced this pull request Dec 17, 2025
Cherry pick #92390 to 25.8: Fix possible crash in aggregate functions after MEMORY_LIMIT_EXCEEDED
robot-clickhouse added a commit that referenced this pull request Dec 17, 2025
robot-clickhouse added a commit that referenced this pull request Dec 17, 2025
Cherry pick #92390 to 25.9: Fix possible crash in aggregate functions after MEMORY_LIMIT_EXCEEDED
robot-clickhouse added a commit that referenced this pull request Dec 17, 2025
robot-clickhouse added a commit that referenced this pull request Dec 17, 2025
Cherry pick #92390 to 25.10: Fix possible crash in aggregate functions after MEMORY_LIMIT_EXCEEDED
robot-clickhouse added a commit that referenced this pull request Dec 17, 2025
robot-clickhouse added a commit that referenced this pull request Dec 17, 2025
Cherry pick #92390 to 25.11: Fix possible crash in aggregate functions after MEMORY_LIMIT_EXCEEDED
robot-clickhouse added a commit that referenced this pull request Dec 17, 2025
robot-ch-test-poll2 added a commit that referenced this pull request Dec 17, 2025
Cherry pick #92390 to 25.3: Fix possible crash in aggregate functions after MEMORY_LIMIT_EXCEEDED
robot-clickhouse added a commit that referenced this pull request Dec 17, 2025
@robot-ch-test-poll robot-ch-test-poll added the pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore label Dec 17, 2025
azat added a commit that referenced this pull request Dec 18, 2025
Backport #92390 to 25.8: Fix possible crash in aggregate functions after MEMORY_LIMIT_EXCEEDED
azat added a commit that referenced this pull request Dec 18, 2025
Backport #92390 to 25.10: Fix possible crash in aggregate functions after MEMORY_LIMIT_EXCEEDED
azat added a commit that referenced this pull request Dec 18, 2025
Backport #92390 to 25.3: Fix possible crash in aggregate functions after MEMORY_LIMIT_EXCEEDED
azat added a commit that referenced this pull request Dec 18, 2025
Backport #92390 to 25.11: Fix possible crash in aggregate functions after MEMORY_LIMIT_EXCEEDED
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore pr-critical-bugfix pr-must-backport Pull request should be backported intentionally. Use this label with great care! pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants