Optimize GroupBy.Sum and GroupBy.Mean for large DataFrames #7554

I've analyzed the GroupBy.cs implementation and identified several opportunities for performance improvements:
- Iterate over each group's rows in a single pass.
- Introduce typed accumulators for double and long to avoid boxing and speed up numeric aggregation.
- Pre-allocate result columns (PrimitiveDataFrameColumn<T>) to avoid repeated resizing.
- Optimize the delegates invoked during iteration.
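
To make the first three points concrete, here is a minimal sketch of a single-pass grouped sum and mean with typed, pre-allocated buffers. It is illustrative only and is not the current GroupBy.cs code: plain arrays stand in for DataFrame columns, and the names `GroupedAggregationSketch`, `SumAndMean`, `groupIds`, and `groupCount` are hypothetical. It also assumes each row has already been mapped to a zero-based group index.

```csharp
using System;

// Hypothetical helper, not part of Microsoft.Data.Analysis.
// Plain arrays stand in for DataFrame columns so the sketch is self-contained.
public static class GroupedAggregationSketch
{
    // Computes the per-group sum and mean in one pass over the rows.
    // groupIds[i] is the zero-based group index of row i (assumed precomputed);
    // groupCount is the total number of groups.
    public static (double[] Sums, double[] Means) SumAndMean(
        int[] groupIds, double[] values, int groupCount)
    {
        if (groupIds.Length != values.Length)
            throw new ArgumentException("groupIds and values must have the same length.");

        // Typed buffers allocated once: no boxing and no resizing inside the loop.
        var sums = new double[groupCount];
        var counts = new long[groupCount];

        // Single pass: each row updates the accumulators of its own group.
        for (int i = 0; i < values.Length; i++)
        {
            int g = groupIds[i];
            sums[g] += values[i];
            counts[g]++;
        }

        // Derive the means from the accumulated sums and counts.
        var means = new double[groupCount];
        for (int g = 0; g < groupCount; g++)
        {
            // An empty group has no mean; NaN marks it (an assumption of this sketch).
            means[g] = counts[g] == 0 ? double.NaN : sums[g] / counts[g];
        }

        return (sums, means);
    }
}
```

For example, calling `SumAndMean(new[] { 0, 1, 0, 1, 0 }, new[] { 1.0, 2.0, 3.0, 4.0, 5.0 }, 2)` yields sums of 9 and 6 and a mean of 3 for both groups. A long accumulator would follow the same shape with `long[]` sums.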

I noticed issue #6824 tracks DataFrame performance improvements. Would performance enhancements for GroupBy operations be welcome as part of this effort?

I can provide benchmarks and a PR if there's interest.
