
Improvements to BooleanGroupValueBuilder (grouping by boolean columns) #17860

@alamb


Is your feature request related to a problem or challenge?

@ashdnazg added an optimized BooleanGroupValueBuilder in #17726, and @rluvaton had several ideas for follow-on optimizations:

https://github.com/apache/datafusion/pull/17726/files#r2387673598

Because this is a slice and not a buffer, it limits what I can do in my optimization that creates an optimized version for the all-unique case. For example, for non-nullable input, checking whether 2 arrays are the same is a simple NOT XOR
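To illustrate the NOT XOR idea from the comment above, here is a minimal sketch using only the Rust standard library. It assumes the boolean values are bit-packed into 64-bit words, least-significant bit first, and that the column is non-nullable. The helpers `pack_bits` and `equal_mask` are hypothetical names made up for this example; this is not the code in BooleanGroupValueBuilder or in #17726.

```rust
/// Hypothetical helper for this sketch: pack bools into 64-bit words,
/// least-significant bit first.
fn pack_bits(values: &[bool]) -> Vec<u64> {
    let mut words = vec![0u64; (values.len() + 63) / 64];
    for (i, &v) in values.iter().enumerate() {
        if v {
            words[i / 64] |= 1u64 << (i % 64);
        }
    }
    words
}

/// Hypothetical helper for this sketch: element-wise equality of two
/// bit-packed, non-nullable boolean columns of `len` values each.
/// Bit i of the result is set iff lhs[i] == rhs[i], i.e. NOT (lhs XOR rhs).
fn equal_mask(lhs: &[u64], rhs: &[u64], len: usize) -> Vec<u64> {
    assert_eq!(lhs.len(), rhs.len());
    assert_eq!(lhs.len(), (len + 63) / 64);

    // One XOR and one NOT compare 64 values at a time.
    let mut out: Vec<u64> = lhs.iter().zip(rhs).map(|(a, b)| !(a ^ b)).collect();

    // NOT turns the unused padding bits of the last word into ones,
    // so clear everything at or beyond position `len`.
    let tail = len % 64;
    if tail != 0 {
        if let Some(last) = out.last_mut() {
            *last &= (1u64 << tail) - 1;
        }
    }
    out
}
```

A slice of `bool` cannot be compared this way without packing it first, which is why a bit-packed buffer would open up this kind of optimization. Nullable input would additionally need the validity bits taken into account, which this sketch does not cover.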

https://github.com/apache/datafusion/pull/17726/files#r2387686684

I will try to change it to MutableBooleanBuffer or something in the future to allow for more optimizations

Describe the solution you'd like

This ticket tracks improving the performance of the BooleanGroupValueBuilder, perhaps using @rluvaton's suggestions

Describe alternatives you've considered

No response

Additional context

No response

Metadata

Assignees

No one assigned

    Labels

    enhancement (New feature or request), performance (Make DataFusion faster)
