feat: make sort pushdown BufferExec capacity configurable, default 1GB #21426
Changes from all commits: 3e34877, 130acd4, 626c9ff, f944ac6
```diff
@@ -557,6 +557,19 @@ config_namespace! {
     /// batches and merged.
     pub sort_in_place_threshold_bytes: usize, default = 1024 * 1024

+    /// Maximum buffer capacity (in bytes) per partition for BufferExec
+    /// inserted during sort pushdown optimization.
+    ///
+    /// When PushdownSort eliminates a SortExec under SortPreservingMergeExec,
+    /// a BufferExec is inserted to replace SortExec's buffering role. This
+    /// prevents I/O stalls by allowing the scan to run ahead of the merge.
+    ///
+    /// This uses strictly less memory than the SortExec it replaces (which
+    /// buffers the entire partition). The buffer respects the global memory
+    /// pool limit. Setting this to a large value is safe — actual memory
+    /// usage is bounded by partition size and global memory limits.
+    pub sort_pushdown_buffer_capacity: usize, default = 1024 * 1024 * 1024
```
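The doc comment above describes a byte-capped buffer sitting between the scan and the merge. As an illustrative sketch of that capacity check (hypothetical names, not DataFusion's actual `BufferExec` implementation):

```rust
use std::collections::VecDeque;

/// Illustrative byte-bounded buffer; not DataFusion's actual BufferExec.
/// When the byte budget is exhausted, the producer must back off until
/// the consumer drains some batches.
struct ByteBoundedBuffer {
    capacity_bytes: usize,
    used_bytes: usize,
    batches: VecDeque<Vec<u8>>, // stand-in for Arrow RecordBatches
}

impl ByteBoundedBuffer {
    fn new(capacity_bytes: usize) -> Self {
        Self { capacity_bytes, used_bytes: 0, batches: VecDeque::new() }
    }

    /// Accept the batch if it fits in the remaining byte budget; otherwise
    /// hand it back so the caller can wait for the consumer. An empty buffer
    /// always accepts, so a single oversized batch cannot stall the pipeline.
    fn try_push(&mut self, batch: Vec<u8>) -> Result<(), Vec<u8>> {
        if !self.batches.is_empty() && self.used_bytes + batch.len() > self.capacity_bytes {
            return Err(batch);
        }
        self.used_bytes += batch.len();
        self.batches.push_back(batch);
        Ok(())
    }

    fn pop(&mut self) -> Option<Vec<u8>> {
        let batch = self.batches.pop_front()?;
        self.used_bytes -= batch.len();
        Some(batch)
    }
}
```

Because admission is gated on bytes rather than batch count, actual memory use stays near the configured capacity regardless of row width.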
**Contributor:**

This PR increases the size of the buffer because

To be clear to anyone reading this: what will be on main is still better than 53.0.0, because prior to #21182 DataFusion would have sorted the entire thing (rather than just buffering it).

A fixed size like this is likely to buffer more than required for narrow cases. I suspect a better solution than a fixed-size buffer would be some calculation based on the actual size of the data (e.g. the number of rows to buffer). However, that is tricky to compute / constrain memory-wise when large strings are involved. We would probably need both a row limit and a memory cap, and pick the smaller of the two. We can perhaps do this as a follow-on issue/PR.
**Contributor (Author):**

Good point! A dynamic approach with both a row limit and a memory cap (whichever is reached first) would adapt better to different row widths. Created #21440 to track this.
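The dual-limit idea discussed in this exchange can be sketched as a simple admission check (hypothetical names; the actual design is tracked in #21440, not implemented here):

```rust
/// Hypothetical sketch of the dual-limit idea: stop buffering once either
/// the row limit or the byte cap is reached, whichever comes first.
/// Not actual DataFusion code.
fn should_stop_buffering(
    buffered_rows: usize,
    buffered_bytes: usize,
    max_rows: usize,
    max_bytes: usize,
) -> bool {
    buffered_rows >= max_rows || buffered_bytes >= max_bytes
}
```

For narrow rows the row limit binds first, avoiding over-buffering; for wide rows (e.g. large strings) the byte cap binds first, keeping memory constrained.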
```diff
     /// Maximum size in bytes for individual spill files before rotating to a new file.
     ///
     /// When operators spill data to disk (e.g., RepartitionExec), they write
```
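The rotation behavior this neighboring option describes (write until a per-file size cap is reached, then start a new file) can be modeled as follows. This is an illustrative sketch with in-memory stand-ins for files, not DataFusion's actual spill implementation:

```rust
/// Illustrative sketch of size-based spill file rotation; names are
/// hypothetical and files are modeled as in-memory byte vectors.
struct SpillWriter {
    max_file_bytes: usize,
    current_file_bytes: usize,
    files: Vec<Vec<u8>>, // each Vec<u8> stands in for one spill file
}

impl SpillWriter {
    fn new(max_file_bytes: usize) -> Self {
        Self {
            max_file_bytes,
            current_file_bytes: 0,
            files: vec![Vec::new()],
        }
    }

    /// Append data, rotating to a fresh file once the current one would
    /// exceed the cap (a non-empty file is required before rotating, so a
    /// single oversized write still lands somewhere).
    fn write(&mut self, data: &[u8]) {
        if self.current_file_bytes > 0
            && self.current_file_bytes + data.len() > self.max_file_bytes
        {
            self.files.push(Vec::new());
            self.current_file_bytes = 0;
        }
        self.files.last_mut().unwrap().extend_from_slice(data);
        self.current_file_bytes += data.len();
    }
}
```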