Open
Labels
enhancement (New feature or request)
Description
Is your feature request related to a problem or challenge?
We currently copy every input string in update_batch for StringAggGroupsAccumulator.
Instead, we could bump the Arc refcount on each input batch and record <group_id, batch_id, row_id> triples, then assemble the actual results in evaluate() (similar to #20504 for array_agg). This would be considerably more complicated than the current approach, but it could be worthwhile because it reduces the amount of data copied. It would also need bookkeeping to ensure that the right state is reclaimed after a partial emit.
Note that the current string_agg benchmark uses 3-byte strings, so it would underestimate the impact of this optimization.
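To make the idea concrete, here is a minimal sketch of the bookkeeping, not a proposed implementation. It assumes a plain `Arc<Vec<Option<String>>>` as a stand-in for an Arrow string array so that it compiles with the standard library alone. The names `DeferredStringAgg`, `assemble`, and `emit_first` are hypothetical and do not correspond to the existing accumulator or to the GroupsAccumulator trait signatures.

```rust
use std::sync::Arc;

/// Stand-in for an Arrow string array: one nullable string per row (assumption).
type Batch = Arc<Vec<Option<String>>>;

/// Hypothetical accumulator that defers string copies until results are assembled.
struct DeferredStringAgg {
    delimiter: String,
    /// Retained input batches; a slot becomes `None` once no triple refers to it.
    batches: Vec<Option<Batch>>,
    /// (group_id, batch_id, row_id) for every non-null input row, in arrival order.
    entries: Vec<(usize, usize, usize)>,
}

impl DeferredStringAgg {
    fn new(delimiter: &str) -> Self {
        Self { delimiter: delimiter.to_string(), batches: Vec::new(), entries: Vec::new() }
    }

    /// Record which rows belong to which group. Only the `Arc` is cloned here.
    fn update_batch(&mut self, values: &Batch, group_indices: &[usize]) {
        assert_eq!(values.len(), group_indices.len());
        let batch_id = self.batches.len();
        self.batches.push(Some(Arc::clone(values)));
        for (row_id, (value, &group_id)) in values.iter().zip(group_indices).enumerate() {
            if value.is_some() {
                self.entries.push((group_id, batch_id, row_id));
            }
        }
    }

    /// Build the output for groups `0..num_groups`; this is where strings are copied.
    /// Groups that saw no non-null input yield `None`.
    fn assemble(&self, num_groups: usize) -> Vec<Option<String>> {
        let mut out: Vec<Option<String>> = vec![None; num_groups];
        for &(group_id, batch_id, row_id) in &self.entries {
            if group_id >= num_groups {
                continue;
            }
            let batch = self.batches[batch_id].as_ref().expect("referenced batch is retained");
            let value = batch[row_id].as_deref().expect("only non-null rows are recorded");
            match out[group_id].as_mut() {
                Some(acc) => {
                    acc.push_str(&self.delimiter);
                    acc.push_str(value);
                }
                None => out[group_id] = Some(value.to_string()),
            }
        }
        out
    }

    /// Emit the first `n` groups, then reclaim the state they held.
    fn emit_first(&mut self, n: usize) -> Vec<Option<String>> {
        let emitted = self.assemble(n);
        // Drop the triples of emitted groups and renumber the remaining groups from zero.
        self.entries.retain(|&(group_id, _, _)| group_id >= n);
        for entry in &mut self.entries {
            entry.0 -= n;
        }
        // Release every batch that no surviving triple refers to.
        let mut live = vec![false; self.batches.len()];
        for &(_, batch_id, _) in &self.entries {
            live[batch_id] = true;
        }
        for (slot, keep) in self.batches.iter_mut().zip(live) {
            if !keep {
                *slot = None;
            }
        }
        emitted
    }
}

/// Helper for building a stand-in batch from string literals.
fn batch(values: &[Option<&str>]) -> Batch {
    Arc::new(values.iter().map(|v| v.map(str::to_string)).collect())
}
```

In this sketch a batch stays alive as long as any un-emitted group still refers to one of its rows, so a single long-lived group can pin an entire batch in memory. A real implementation would probably need a policy for that case, such as copying out the remaining rows once a batch is mostly dead.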
Describe the solution you'd like
No response
Describe alternatives you've considered
No response
Additional context
No response