Which issue does this PR close?
SortExec with TopK causes OOM #17597.
Rationale for this change
The original code held one record batch reference for each row in the limit, which caused
arrow_select::interleave_record_batch to allocate vectors with billions of elements. For example, with 32 batches and LIMIT 50000, those 32 batches would be referenced 50k times. The new version uses only one reference per existing batch, which limits memory use and improves execution time.
What changes are included in this PR?
topk/mod.rs.
Are these changes tested?
Yes.
Are there any user-facing changes?
No.
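For reviewers, the idea in the rationale can be sketched with a simplified, std-only model. This is not the code in topk/mod.rs: `Batch`, `TopKRow`, `refs_per_row`, `refs_per_batch`, and `interleave` are hypothetical names, a single `Vec<i64>` column stands in for a record batch, and keying the store by a numeric batch id is an assumption. The only detail taken from the real API is the shape of the inputs, a list of batches plus `(batch_index, row_index)` pairs.

```rust
use std::collections::HashMap;
use std::rc::Rc;

// Hypothetical stand-in for a record batch: a single column of values.
type Batch = Rc<Vec<i64>>;

// One selected output row: the id of the stored batch and the row offset in it.
#[derive(Clone, Copy)]
struct TopKRow {
    batch_id: u32,
    row: usize,
}

// Before (sketch): push one batch reference per output row, so the batch
// list grows with the LIMIT rather than with the number of stored batches.
fn refs_per_row(
    store: &HashMap<u32, Batch>,
    rows: &[TopKRow],
) -> (Vec<Batch>, Vec<(usize, usize)>) {
    let mut batches = Vec::with_capacity(rows.len());
    let mut indices = Vec::with_capacity(rows.len());
    for (i, r) in rows.iter().enumerate() {
        batches.push(Rc::clone(&store[&r.batch_id]));
        indices.push((i, r.row));
    }
    (batches, indices)
}

// After (sketch): push each distinct batch once and remap every row to the
// slot of its batch, so the batch list is bounded by the stored batches.
fn refs_per_batch(
    store: &HashMap<u32, Batch>,
    rows: &[TopKRow],
) -> (Vec<Batch>, Vec<(usize, usize)>) {
    let mut slot_of: HashMap<u32, usize> = HashMap::new();
    let mut batches: Vec<Batch> = Vec::new();
    let mut indices = Vec::with_capacity(rows.len());
    for r in rows {
        let slot = *slot_of.entry(r.batch_id).or_insert_with(|| {
            batches.push(Rc::clone(&store[&r.batch_id]));
            batches.len() - 1
        });
        indices.push((slot, r.row));
    }
    (batches, indices)
}

// Toy interleave: gather rows by (batch slot, row offset), in output order.
fn interleave(batches: &[Batch], indices: &[(usize, usize)]) -> Vec<i64> {
    indices.iter().map(|&(b, r)| batches[b][r]).collect()
}
```

Both functions yield the same interleaved rows; only the length of the batch list differs. In the example from the rationale, the first form would pass 50,000 references and the second 32.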