Skip to content

perf: sort replace free()->try_grow() pattern with try_resize() to reduce memory pool interactions#20729

Merged
mbutrovich merged 1 commit intoapache:mainfrom
mbutrovich:sort_mem
Mar 5, 2026
Merged

perf: sort replace free()->try_grow() pattern with try_resize() to reduce memory pool interactions#20729
mbutrovich merged 1 commit intoapache:mainfrom
mbutrovich:sort_mem

Conversation

@mbutrovich
Copy link
Contributor

@mbutrovich mbutrovich commented Mar 5, 2026

Which issue does this PR close?

Rationale for this change

See full discussion in #20728, but the tl;dr is:

Commit 3f0b342 ("Improve sort memory resilience", #19494) added per-chunk reservation.try_grow(total_sorted_size) calls inside the sort's unfold closure (sort.rs:748-750). I believe this causes a performance regression for projects that implement custom MemoryPool backends where try_grow is not trivially cheap.

What changes are included in this PR?

Replace the pattern of free()->try_grow() with a try_resize() that in the common case doesn't do much work since a sorted output batch is likely the same size as the input batch.

Are these changes tested?

Existing tests, and we are running some longer benchmarks with Comet at the moment.

When I profile TPC-H Q21 SF100 locally I see the giant stack of memory allocations that occur on DF52...
Screenshot 2026-03-05 at 2 23 15 PM
...completely disappear...
Screenshot 2026-03-05 at 2 24 31 PM
... and TPC-H Q21 looks back to DF51 performance, at least locally.

Are there any user-facing changes?

No.

@github-actions github-actions bot added the physical-plan Changes to the physical-plan crate label Mar 5, 2026
@mbutrovich mbutrovich changed the title perf: sort replace reservation.free()->try_grow() pattern with try_resize() to reduce memory pool interactions perf: sort replace free()->try_grow() pattern with try_resize() to reduce memory pool interactions Mar 5, 2026
@mbutrovich mbutrovich marked this pull request as ready for review March 5, 2026 19:02
Copy link
Contributor

@EmilyMatt EmilyMatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, Code also looks much cleaner!

@mbutrovich mbutrovich requested a review from andygrove March 5, 2026 19:14
@mbutrovich mbutrovich added the performance Make DataFusion faster label Mar 5, 2026
@mbutrovich mbutrovich requested a review from comphead March 5, 2026 19:47
Copy link
Member

@andygrove andygrove left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix @mbutrovich!

@mbutrovich mbutrovich added this pull request to the merge queue Mar 5, 2026
Merged via the queue into apache:main with commit 631c918 Mar 5, 2026
39 checks passed
@mbutrovich mbutrovich deleted the sort_mem branch March 5, 2026 20:43
@comphead
Copy link
Contributor

comphead commented Mar 5, 2026

Thanks @mbutrovich I just checked that try_resize doesn't do the same free+try_grow

Reg to memory fragmentation resize should be at least not worse than free+grow

mbutrovich added a commit that referenced this pull request Mar 5, 2026
…size() to reduce memory pool interactions (#20732)

Backport #20729 to `branch-52`.
mbutrovich added a commit that referenced this pull request Mar 6, 2026
…size() to reduce memory pool interactions (#20733)

Backport #20729 to `branch-53`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance Make DataFusion faster physical-plan Changes to the physical-plan crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

perf: Sort memory reservation causes performance regression with custom MemoryPool implementations

4 participants