Dynamic BufferExec sizing: row limit + memory cap for sort pushdown #21440

@zhuqi-lucas

Description

Is your feature request related to a problem or challenge?

#21426 introduced a configurable fixed-size BufferExec capacity (default 1GB) for sort pushdown. While this is better than the SortExec it replaces (which buffers the entire partition), a fixed size is not optimal for all cases:

  • Wide rows (many columns, large strings): 1GB may buffer too few rows to be useful
  • Narrow rows (few small columns): 1GB buffers far more data than needed

As noted by @alamb in #21426 (comment):

I suspect a better solution than a fixed size buffer would be some calculation based on the actual size of the data (e.g. the number of rows to buffer). However, that is tricky to compute / constrain memory when large strings are involved. We probably would need to have both a row limit and a memory cap and pick the smaller of the two.

Describe the solution you'd like

Replace the fixed capacity with a dual-limit approach:

BufferExec stops buffering when EITHER limit is reached:

  • Row limit: e.g., 100K rows (prevents over-buffering narrow rows)
  • Memory cap: e.g., 1GB (prevents OOM for wide rows)

This adapts to different row widths automatically:

  • Narrow rows (100 bytes/row): row limit triggers at ~10MB
  • Wide rows (10KB/row): memory cap triggers at 1GB
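The dual-limit check itself is simple: stop as soon as either threshold is crossed. A minimal Rust sketch of the idea is below; `BufferLimits` and `should_stop` are illustrative names for this issue, not DataFusion's actual API.

```rust
// Hypothetical sketch of dual-limit buffering; names are illustrative,
// not DataFusion's actual API.

/// Buffering stops as soon as EITHER limit is reached.
struct BufferLimits {
    row_limit: usize,  // e.g. 100K rows
    memory_cap: usize, // e.g. 1 GiB, in bytes
}

impl BufferLimits {
    fn should_stop(&self, buffered_rows: usize, buffered_bytes: usize) -> bool {
        buffered_rows >= self.row_limit || buffered_bytes >= self.memory_cap
    }
}

fn main() {
    let limits = BufferLimits {
        row_limit: 100_000,
        memory_cap: 1 << 30, // 1 GiB
    };

    // Narrow rows (~100 bytes each): the row limit fires first,
    // after only ~10 MB of buffered data.
    assert!(limits.should_stop(100_000, 100_000 * 100));

    // Wide rows (~100 KB each): the memory cap fires first,
    // long before 100K rows are buffered.
    assert!(limits.should_stop(11_000, 11_000 * 100_000));

    // Neither limit reached yet: keep buffering.
    assert!(!limits.should_stop(50_000, 50_000 * 100));
}
```

Taking the first limit hit is equivalent to buffering min(row_limit, memory_cap / bytes_per_row) rows, which is what makes the behavior adapt to row width without any per-query tuning.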

Related issues:
