Implement memory bound merging of aggregation results #40588

Open
1 of 2 tasks
nickitat opened this issue Aug 24, 2022 · 0 comments

nickitat (Member) commented Aug 24, 2022

We already have a mechanism that reliably bounds memory consumption during the partial aggregation phase (when we consume table rows and accumulate per-thread aggregation states). This mechanism is called "external aggregation", and the bound is provided by the user through the max_bytes_before_external_group_by setting: once this limit is reached, the aggregation states accumulated so far are flushed to disk.
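To make the spilling idea concrete, here is a minimal Python sketch of it. It is an illustration only, not ClickHouse's implementation: the function names, the flat `BYTES_PER_STATE` cost per key, and the use of pickle files are all assumptions made for the example. Only the name of the limit parameter is taken from the setting above.

```python
import pickle
import tempfile

BYTES_PER_STATE = 64  # assumed flat cost per key; real states vary in size


def flush_to_disk(states):
    """Write the current partial states to a temporary file and return it."""
    spill = tempfile.TemporaryFile()
    pickle.dump(states, spill)
    spill.seek(0)
    return spill


def external_group_by_sum(rows, max_bytes_before_external_group_by):
    """Sum values per key, spilling partial states once the estimate hits the limit."""
    states, spills = {}, []
    for key, value in rows:
        states[key] = states.get(key, 0) + value
        if len(states) * BYTES_PER_STATE >= max_bytes_before_external_group_by:
            spills.append(flush_to_disk(states))
            states = {}

    # Naive merge: every spilled part is loaded back and combined in one dict.
    # This step is NOT memory bounded.
    result = dict(states)
    for spill in spills:
        with spill:
            for key, partial in pickle.load(spill).items():
                result[key] = result.get(key, 0) + partial
    return result, len(spills)
```

The accumulation loop never holds more than the configured number of bytes (under the assumed cost model), but the final merge in this sketch holds the whole result at once.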
During merging we have no such guarantee: the amount of data residing in memory is proportional to total_aggregation_state_size * (aggregation_memory_efficient_merge_threads / MAX_BUCKETS), where MAX_BUCKETS is a constant equal to 256.
This complicates running ClickHouse in memory-constrained environments.
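As a rough worked example of that proportionality, the helper below evaluates the expression directly. The function name is hypothetical, and the constant of proportionality is assumed to be 1, which the statement above does not claim; treat the result as an order-of-magnitude estimate.

```python
MAX_BUCKETS = 256  # constant named above


def merge_memory_estimate(total_aggregation_state_size,
                          aggregation_memory_efficient_merge_threads):
    """Estimate bytes resident during merging, assuming a proportionality factor of 1."""
    return (total_aggregation_state_size
            * aggregation_memory_efficient_merge_threads / MAX_BUCKETS)


GIB = 2 ** 30
# 64 GiB of total aggregation state merged by 16 threads.
estimate = merge_memory_estimate(64 * GIB, 16)
print(estimate / GIB)  # 4.0
```

Under these assumptions the estimate grows linearly with the total state size, so no fixed memory limit can hold once the state is large enough.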

Use case

An aggregation whose state is too large to fit within the memory limit during the merging phase.

Describe the solution you'd like

Implement a merging algorithm similar to the one we use for aggregation in order.
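One possible reading of this proposal, sketched in Python: if every input stream of partial states is sorted by key, the streams can be merged k-way and each key finalized as soon as the key changes, so only one state per input stream plus the key being merged needs to be in memory. The issue does not specify the algorithm, so the function name, the `(key, state)` stream shape, and the `combine` callback are all assumptions made for illustration.

```python
import heapq
from itertools import groupby
from operator import add, itemgetter


def merge_sorted_partial_states(streams, combine):
    """Yield (key, merged_state) from several key-sorted streams of (key, state)."""
    # heapq.merge is lazy: it keeps one pending item per input stream.
    merged = heapq.merge(*streams, key=itemgetter(0))
    # Equal keys are adjacent in the merged order, so each group is one key.
    for key, group in groupby(merged, key=itemgetter(0)):
        states = (state for _, state in group)
        result = next(states)
        for state in states:
            result = combine(result, state)
        yield key, result


streams = [
    iter([("a", 1), ("c", 4)]),
    iter([("a", 3), ("b", 2)]),
    iter([("b", 5)]),
]
print(list(merge_sorted_partial_states(streams, add)))
# [('a', 4), ('b', 7), ('c', 4)]
```

The sketch relies on the inputs being sorted by key; with unsorted inputs the same key could appear in several groups and would be emitted more than once.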

