Optimize array_min, array_max for primitive types #21100

@neilconway

Is your feature request related to a problem or challenge?

In the current implementation, for each row in the batch we construct a PrimitiveArray for the row, feed it to the Arrow min / max kernel, and then collect the resulting ScalarValues in a Vec. We then construct the final PrimitiveArray for the result via ScalarValue::iter_to_array. This carries a fair amount of overhead: constructing N intermediate PrimitiveArray values plus the Vec, dynamic dispatch, Arc refcount bumps, etc.

We can do better by iterating over the flat values buffer directly: it is already a PrimitiveArray, so we can invoke the Arrow compute kernel ourselves on the appropriate slice of that array for each row, without materializing a per-row array or a ScalarValue.
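To illustrate the idea, here is a minimal standalone sketch in plain Rust. It does not use the Arrow or DataFusion APIs: `ListColumn`, `list_min`, and `list_max` are hypothetical names, the offsets are modelled as a `Vec<usize>`, and element-level nulls and float ordering are ignored. The point is only that each row's min / max can be read straight off a slice of the flat values buffer, with a single output allocation.

```rust
/// Simplified stand-in for a list column (hypothetical, not the Arrow type).
/// Row `i` holds `values[offsets[i]..offsets[i + 1]]`.
struct ListColumn {
    offsets: Vec<usize>,
    values: Vec<i64>,
}

/// Per-row minimum, computed directly over slices of the flat values buffer.
/// An empty row yields `None`, modelling a NULL result.
fn list_min(col: &ListColumn) -> Vec<Option<i64>> {
    col.offsets
        .windows(2) // each adjacent pair of offsets delimits one row
        .map(|w| col.values[w[0]..w[1]].iter().copied().min())
        .collect()
}

/// Per-row maximum, same approach as `list_min`.
fn list_max(col: &ListColumn) -> Vec<Option<i64>> {
    col.offsets
        .windows(2)
        .map(|w| col.values[w[0]..w[1]].iter().copied().max())
        .collect()
}
```

For a column with offsets `[0, 3, 3, 5]` and values `[4, 1, 7, 9, 2]`, the rows are `[4, 1, 7]`, `[]`, and `[9, 2]`. A real implementation would also need to respect the validity bitmaps of both the list array and its child values.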

Describe the solution you'd like

No response

Describe alternatives you've considered

No response

Additional context

No response

Metadata

Assignees

Labels

enhancement (New feature or request)

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests