Do size checks when deserializing data from aggregation states and other sources #90031
Merged
Algunenano merged 8 commits into ClickHouse:master on Nov 19, 2025
Workflow [PR], commit [7b2820f] Summary: ❌
Pull Request Overview
This PR addresses a critical security issue by implementing proper size checks when deserializing data from aggregation states and other untrusted sources. The changes prevent potential crashes and data corruption by validating buffer boundaries before reading serialized data.
Key Changes:
- Modified deserialization methods across all column types to use `ReadBuffer` instead of raw pointers for safer boundary checking
- Added size validation checks before reading serialized data to prevent buffer overruns
- Updated test cases to verify proper error handling for malformed data
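The idea behind these changes can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this PR: `BoundedReader` is a hypothetical stand-in for `ReadBuffer`, and `readSizedString` with its `[uint64 length][bytes]` layout is invented for the example. It shows a size taken from untrusted input being checked against the remaining bytes before anything is copied.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a bounds-aware reader; NOT ClickHouse's ReadBuffer.
class BoundedReader
{
public:
    BoundedReader(const char * data, size_t size) : pos(data), end(data + size)
    {
        assert(data != nullptr || size == 0);
    }

    // Number of bytes that can still be read.
    size_t available() const { return static_cast<size_t>(end - pos); }

    // Copies exactly `n` bytes or throws; never reads past `end`.
    void readStrict(char * to, size_t n)
    {
        if (n > available())
            throw std::runtime_error("Cannot read all data: buffer too small");
        std::memcpy(to, pos, n);
        pos += n;
    }

private:
    const char * pos;
    const char * end;
};

// Reads a value laid out as [uint64 length][length bytes] (assumed layout).
// The length comes from untrusted input, so it is validated before use.
std::string readSizedString(BoundedReader & in)
{
    uint64_t len = 0;
    in.readStrict(reinterpret_cast<char *>(&len), sizeof(len));
    if (len > in.available())
        throw std::runtime_error("Declared size exceeds remaining data");
    std::string result(static_cast<size_t>(len), '\0');
    in.readStrict(&result[0], result.size());
    return result;
}
```

With a raw `const char *` there is no `end` to compare against, so a corrupted length would be trusted as-is; carrying the end of the buffer alongside the position is what makes the check possible.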
Reviewed Changes
Copilot reviewed 56 out of 56 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| src/Columns/*.{h,cpp} | Refactored deserialization methods to use ReadBuffer API with proper bounds checking |
| src/Interpreters/AggregationMethod.cpp | Updated key deserialization to use ReadBuffer for size validation |
| src/Dictionaries/*.{h,cpp} | Modified dictionary deserialization to validate buffer sizes before reading |
| src/Functions/array/arrayIntersect.cpp | Added buffer size checks when deserializing array elements |
| src/AggregateFunctions/*.{h,cpp} | Updated aggregate function deserialization to use safer ReadBuffer-based approach |
| tests/queries/0_stateless/03716_topk_bad_data.* | Added test cases to verify proper error handling for malformed aggregation state data |
Algunenano commented on Nov 14, 2025

    const char * IColumnDummy::skipSerializedInArena(const char * pos) const
    void IColumnDummy::skipSerializedInArena(ReadBuffer &) const
    {
        return pos;
This looks odd. It seems it was incorrect before (and consequently now too), since it should skip one byte.
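To illustrate the concern, here is a small sketch under a stated assumption: that serializing a dummy value writes a single placeholder byte, as the comment implies (this is not verified against the ClickHouse source). `ByteCursor`, `skipDummyNoop`, `skipDummyOneByte` and `nextByteAfterDummy` are hypothetical names made up for the example. If the skip does not consume that byte, whatever is read next starts one byte too early.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical minimal cursor over a byte range; NOT ClickHouse's ReadBuffer.
struct ByteCursor
{
    const char * pos;
    const char * end;

    size_t available() const { return static_cast<size_t>(end - pos); }

    // Advances by `n` bytes, refusing to move past the end.
    void ignore(size_t n)
    {
        if (n > available())
            throw std::runtime_error("Cannot skip: not enough data");
        pos += n;
    }
};

// No-op skip, mirroring the shape of the snippet above: the cursor does not move.
inline void skipDummyNoop(ByteCursor &) {}

// Skip that consumes the one placeholder byte assumed to be written per dummy value.
inline void skipDummyOneByte(ByteCursor & in) { in.ignore(1); }

// Returns the byte that follows a dummy value after applying `skip`.
template <typename Skip>
char nextByteAfterDummy(const char * data, size_t size, Skip skip)
{
    assert(data != nullptr);
    ByteCursor in{data, data + size};
    skip(in);
    if (in.available() == 0)
        throw std::runtime_error("No data after dummy value");
    return *in.pos;
}
```

For a buffer holding a placeholder byte followed by `'X'`, the one-byte skip lands on `'X'`, while the no-op skip still points at the placeholder.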
Avogar approved these changes on Nov 18, 2025
Merged via the queue into ClickHouse:master with commit e86b749 on Nov 19, 2025
125 of 132 checks passed
This was referenced Nov 19, 2025
- Nov 19, 2025: robot-clickhouse-ci-1 added a commit that referenced this pull request: Cherry pick #90031 to 25.11: Do size checks when deserializing data from aggregation states and other sources
- Nov 19, 2025: robot-clickhouse added a commit that referenced this pull request: … aggregation states and other sources
- Nov 19, 2025: clickhouse-gh bot added a commit that referenced this pull request: Backport #90031 to 25.11: Do size checks when deserializing data from aggregation states and other sources
- Nov 20, 2025: robot-ch-test-poll4 added a commit that referenced this pull request: Cherry pick #90031 to 25.8: Do size checks when deserializing data from aggregation states and other sources
- Nov 20, 2025: robot-clickhouse added a commit that referenced this pull request: …aggregation states and other sources
- Nov 20, 2025: robot-ch-test-poll4 added a commit that referenced this pull request: Cherry pick #90031 to 25.9: Do size checks when deserializing data from aggregation states and other sources
- Nov 20, 2025: robot-clickhouse added a commit that referenced this pull request: …aggregation states and other sources
- Nov 20, 2025: robot-ch-test-poll4 added a commit that referenced this pull request: Cherry pick #90031 to 25.10: Do size checks when deserializing data from aggregation states and other sources
- Nov 20, 2025: robot-clickhouse added a commit that referenced this pull request: … aggregation states and other sources
- Nov 20, 2025: Algunenano added a commit that referenced this pull request: Backport #90031 to 25.9: Do size checks when deserializing data from aggregation states and other sources
- Nov 20, 2025: clickhouse-gh bot added a commit that referenced this pull request: Backport #90031 to 25.8: Do size checks when deserializing data from aggregation states and other sources
- Nov 21, 2025: Algunenano added a commit that referenced this pull request: Backport #90031 to 25.10: Do size checks when deserializing data from aggregation states and other sources
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
Do size checks when deserializing data from aggregation states and other sources
Closes #86882
I'll do a review before unmarking it as draft