Add better and faster support for dictionary types #87

Open

1 of 3 tasks

alamb opened this issue Apr 26, 2021 · 1 comment

Labels

datafusion

Contributor

alamb commented Apr 26, 2021 (edited)

Note: migrated from original JIRA: https://issues.apache.org/jira/browse/ARROW-8464

Use cases: Efficiently process large columns of low-cardinality strings

BatchIterator should accept both DictionaryBatch and RecordBatch
Type Coercion optimizer rule should inject an expression that converts dictionary value types to index types (for equality expressions and IN (values, ...))
Use specialized dictionary compute kernels for binary PhysicalExpr evaluation #1178

alamb added the datafusion label

Contributor Author

alamb commented Apr 26, 2021

Comment from Andrew Lamb(alamb) @ 2020-10-06T12:25:16.639+0000:

FYI [~andygrove] -- I am doing part of this in ARROW-10159. However, the initial implementation effectively converts DictionaryArray --> PrimitiveArray / StringArray and then uses the existing processing.

To support the actual efficient-processing use case, I think significant work will be needed to add appropriate dictionary support to the arrow compute kernels.
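To make the difference concrete, here is a minimal sketch of the two strategies for an equality comparison against a string literal. It deliberately does not use the arrow crate: `DictColumn`, `from_rows`, `eq_scalar_decoded`, and `eq_scalar_on_keys` are hypothetical names invented for illustration, the dictionary values are assumed to be unique, and nulls are ignored.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for a dictionary-encoded string column.
/// This is NOT the arrow `DictionaryArray` API; it only illustrates the idea.
struct DictColumn {
    /// Distinct values; for a low-cardinality column this stays small.
    values: Vec<String>,
    /// One index into `values` per row.
    keys: Vec<u32>,
}

impl DictColumn {
    /// Build the column by interning each row, so `values` holds no duplicates.
    fn from_rows<'a>(rows: &[&'a str]) -> Self {
        let mut index: HashMap<&'a str, u32> = HashMap::new();
        let mut values: Vec<String> = Vec::new();
        let mut keys = Vec::with_capacity(rows.len());
        for &row in rows {
            let key = *index.entry(row).or_insert_with(|| {
                values.push(row.to_string());
                (values.len() - 1) as u32
            });
            keys.push(key);
        }
        DictColumn { values, keys }
    }

    /// Strategy 1: materialize a plain string column, then compare every row.
    /// This costs one string comparison per row.
    fn eq_scalar_decoded(&self, needle: &str) -> Vec<bool> {
        let decoded: Vec<&str> = self
            .keys
            .iter()
            .map(|&k| self.values[k as usize].as_str())
            .collect();
        decoded.iter().map(|&s| s == needle).collect()
    }

    /// Strategy 2: look the literal up in the dictionary once, then compare
    /// integer keys per row. Relies on `values` being unique.
    fn eq_scalar_on_keys(&self, needle: &str) -> Vec<bool> {
        match self.values.iter().position(|v| v.as_str() == needle) {
            Some(pos) => {
                let target = pos as u32;
                self.keys.iter().map(|&k| k == target).collect()
            }
            // The literal is not in the dictionary, so no row can match.
            None => vec![false; self.keys.len()],
        }
    }
}
```

Both methods return the same mask. The second does string comparisons proportional to the dictionary size rather than the row count, which is roughly the saving that dictionary-aware kernels would aim for. A real kernel would also have to handle nulls and dictionaries that contain duplicate values.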

alamb changed the title ~~[Rust] Add better and faster support for dictionary types~~ Add better and faster support for dictionary types

rdettai mentioned this issue

File partitioning for ListingTable #1141

Merged

6 tasks

alamb mentioned this issue

Use specialized dictionary compute kernels for binary PhysicalExpr evaluation #1178

Open

3 tasks

alamb mentioned this issue

Use eq_dyn, neq_dyn, lt_dyn, lt_eq_dyn, gt_dyn, gt_eq_dyn kernels from arrow #1475

Merged
