Make Binary Dictionary Operations Optional #4386

@tustvold

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

The dyn_cmp_dict and dyn_arith_dict features of arrow enable binary operations involving dictionary arrays and other dictionary arrays or scalar arrays.

They are, however, extremely expensive in both compilation time and code size, because the kernels must be generated for the combinatorial explosion of all dictionary key and value types. See apache/arrow-rs#2596 and apache/arrow-rs#2760.
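To make the scale of that explosion concrete, here is a rough back-of-the-envelope sketch. Arrow allows the eight signed and unsigned integer types as dictionary keys; the value-type and operator lists below are illustrative subsets chosen for this example, and `kernel_instantiations` is a hypothetical helper, not part of arrow. It assumes one monomorphized kernel per (key type, value type, operator) combination.

```rust
/// Dictionary key types permitted by Arrow (signed and unsigned integers).
const KEY_TYPES: [&str; 8] = [
    "Int8", "Int16", "Int32", "Int64", "UInt8", "UInt16", "UInt32", "UInt64",
];

/// Illustrative subset of dictionary value types (assumption: the real list is longer).
const VALUE_TYPES: [&str; 6] = ["Int32", "Int64", "Float32", "Float64", "Utf8", "Binary"];

/// Illustrative set of comparison operators (assumption for this sketch).
const OPS: [&str; 6] = ["eq", "neq", "lt", "lt_eq", "gt", "gt_eq"];

/// Hypothetical helper: number of kernels generated if each
/// (key type, value type, operator) combination gets its own instantiation.
fn kernel_instantiations(keys: usize, values: usize, ops: usize) -> usize {
    keys * values * ops
}
```

Even with these small subsets, `kernel_instantiations(KEY_TYPES.len(), VALUE_TYPES.len(), OPS.len())` comes to 288 kernels, and every additional value type adds another 48.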

These operations are also exceedingly rare in practice, as almost all queries instead use the scalar variant, e.g. adding a scalar value to a dictionary array or comparing a dictionary array against a scalar value.

Describe the solution you'd like

I would like a feature flag, e.g. binary_dict_op, that is not enabled by default and that enables the arrow features listed above (dyn_cmp_dict and dyn_arith_dict). The three or so tests that happen to need these features can then be gated on this flag, or possibly updated so that they no longer need it.
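As a minimal sketch of what the gating could look like, assuming the feature is named `binary_dict_op` and is declared in Cargo.toml as something like `binary_dict_op = ["arrow/dyn_cmp_dict", "arrow/dyn_arith_dict"]` (both the name and the exact forwarding are assumptions). The function `dict_dict_supported` and the test below are hypothetical placeholders, not existing code in this repository.

```rust
/// Built only when the (hypothetical) `binary_dict_op` feature is enabled.
#[cfg(feature = "binary_dict_op")]
fn dict_dict_supported() -> bool {
    true
}

/// Fallback when the feature is off: dictionary-vs-dictionary kernels
/// are not compiled in, so callers can report "unsupported" instead.
#[cfg(not(feature = "binary_dict_op"))]
fn dict_dict_supported() -> bool {
    false
}

/// Tests that need the expensive kernels are compiled only with the feature.
#[cfg(all(test, feature = "binary_dict_op"))]
mod tests {
    #[test]
    fn dict_dict_comparison() {
        // A test exercising dictionary-vs-dictionary kernels would go here.
        assert!(super::dict_dict_supported());
    }
}
```

With this layout, a plain `cargo test` skips the gated tests, while `cargo test --features binary_dict_op` builds the extra kernels and runs them.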

Describe alternatives you've considered

We could leave things as they are and keep the features enabled unconditionally.

Additional context

The feature was initially enabled as a dev dependency in #3363 by @avantgardnerio.

This was then changed to a physical-expr dev dependency in #4163 by @isidentical.

It was then enabled as a non-dev dependency in #4168 by @retikulum, possibly unintentionally.
