New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lower memory usage in vertical merges #59340
Lower memory usage in vertical merges #59340
Conversation
This is an automated comment for commit d3cdc88 with description of existing statuses. It's updated for the latest CI running ❌ Click here to open a full report in a separate page Successful checks
|
|
||
SELECT | ||
merge_algorithm, | ||
peak_memory_usage < 500 * 1024 * 1024 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
peak_memory_usage
before: 4.57 GiB
, after: 89.95 MiB
.
while (row_source_pos < row_sources_end && cur_size < cur_block_preferred_size) | ||
while (row_source_pos < row_sources_end | ||
&& column_res.size() < block_preferred_size_rows | ||
&& column_res.allocatedBytes() < block_preferred_size_bytes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
really minor. maybe it makes sense to use byteSize()
because I'd expect block_size_bytes
to refer to the actual data size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked that allocatedBytes
is used inMergedData
and did the same. But we can change it in both places.
Note, that I've checked this change by reverting ClickHouse#59340, i.e. diff --git a/src/Processors/Transforms/ColumnGathererTransform.h b/src/Processors/Transforms/ColumnGathererTransform.h index 4e56cffa46a..dd3c555f361 100644 --- a/src/Processors/Transforms/ColumnGathererTransform.h +++ b/src/Processors/Transforms/ColumnGathererTransform.h @@ -195,7 +195,7 @@ void ColumnGathererStream::gather(Column & column_res) } source.pos += len; - } while (column_res.size() < block_preferred_size_rows && column_res.byteSize() < block_preferred_size_bytes); + } while (true); } } And it fails: 02981_vertical_merges_memory_usage: [ FAIL ] - result differs with reference: --- /src/ch/clickhouse/tests/queries/0_stateless/02981_vertical_merges_memory_usage.reference 2024-02-13 10:37:54.771393874 +0100 +++ /src/ch/clickhouse/tests/queries/0_stateless/02981_vertical_merges_memory_usage.stdout 2024-02-13 10:41:06.781109565 +0100 @@ -1 +1 @@ -Vertical OK +Vertical FAIL: memory usage: 626.59 MiB Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
FWIW this patch changes serialization for LowCardinality, so you can get |
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Lower memory usage in vertical merges
Fixes #57773.