Implement `GroupsAccumulator` for `first_value` aggregate (speed up `first_value` and `DISTINCT ON` queries)

### Is your feature request related to a problem or challenge?

As reported in https://github.com/apache/datafusion/issues/16620 by @debajyoti-truefoundry, evaluating `DISTINCT ON` results in a query plan that uses `first_value` aggregates.

The current implementation of `first_value` appears to have only a basic `Accumulator` implementation, and not the faster `GroupsAccumulator`: https://github.com/apache/datafusion/blob/main/datafusion/functions-aggregate/src/nth_value.rs#L93

We can very likely improve the performance of such queries significantly by implementing a `GroupsAccumulator` (background in [Aggregating Millions of Groups Fast in Apache Arrow DataFusion](https://arrow.apache.org/blog/2023/08/05/datafusion_fast_grouping/)).



### Describe the solution you'd like

1. Add a benchmark (maybe add a query to the [clickbench_extended](https://github.com/apache/datafusion/tree/main/benchmarks/queries/clickbench) suite)
2. Implement a `GroupsAccumulator` for `first_value` (and maybe `nth_value`)

### Describe alternatives you've considered

I think the accumulator could be pretty straightforward: track which groups are new in each batch and copy the first row seen for each of them into the output (likely by using the `take` kernel).
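
To make the idea concrete, here is a minimal, dependency-free sketch. It is not DataFusion's actual `GroupsAccumulator` trait: the `FirstValueGroups` type, its fields, and its method signatures are hypothetical, values are plain `Option<i64>` instead of Arrow arrays, and any `ORDER BY` on the aggregate is ignored (the first row encountered wins). The second loop stands in for what a `take`-style gather would do on real arrays.

```rust
/// Hypothetical, simplified per-group state for `first_value`.
/// This only mirrors the general shape of a grouped accumulator;
/// it is NOT the DataFusion `GroupsAccumulator` trait.
struct FirstValueGroups {
    /// First value recorded for each group (`None` models SQL NULL).
    firsts: Vec<Option<i64>>,
    /// Whether a row has already been recorded for each group.
    seen: Vec<bool>,
}

impl FirstValueGroups {
    fn new() -> Self {
        Self { firsts: Vec::new(), seen: Vec::new() }
    }

    /// Process one batch, where `group_indices[i]` is the group of row `i`.
    /// Returns the row indices that were the first row of a not-yet-seen group.
    fn update_batch(
        &mut self,
        values: &[Option<i64>],
        group_indices: &[usize],
        total_num_groups: usize,
    ) -> Vec<usize> {
        assert_eq!(values.len(), group_indices.len());
        self.firsts.resize(total_num_groups, None);
        self.seen.resize(total_num_groups, false);

        // Pass 1: find the first row of every group that has no value yet.
        let mut first_rows = Vec::new();
        for (row, &group) in group_indices.iter().enumerate() {
            if !self.seen[group] {
                self.seen[group] = true;
                first_rows.push(row);
            }
        }

        // Pass 2: gather only the selected rows into the per-group state.
        // With Arrow arrays this gather is where a `take`-style kernel would fit.
        for &row in &first_rows {
            self.firsts[group_indices[row]] = values[row];
        }
        first_rows
    }

    /// Emit one value per group, in group-index order.
    fn evaluate(self) -> Vec<Option<i64>> {
        self.firsts
    }
}
```

For example, feeding values `[10, 20, NULL, 40]` with group indices `[0, 1, 0, 2]`, followed by a second batch `[50, 60]` with group indices `[1, 3]`, leaves the state at `[10, 20, 40, 60]`: group 0 keeps `10` even though a later row for it is NULL, and group 1 ignores `50`. Rows belonging to already-seen groups are skipped without touching their values, which is where the savings over a per-group `Accumulator` would be expected to come from.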

### Additional context

There is a similar issue for optimizing `array_agg` here:
- https://github.com/apache/datafusion/issues/10145


Implement `GroupsAccumulator` for `first_value` aggregate (speed up `first_value` and `DISTINCT ON` queries) #17899
