Add better and faster support for dictionary types #87

Open
1 of 3 tasks
alamb opened this issue Apr 26, 2021 · 1 comment
Labels
datafusion Changes in the datafusion crate

Comments


alamb commented Apr 26, 2021

Note: migrated from original JIRA: https://issues.apache.org/jira/browse/ARROW-8464

Use cases: efficiently process large columns of low-cardinality strings.
 

@alamb alamb added the datafusion Changes in the datafusion crate label Apr 26, 2021

alamb commented Apr 26, 2021

Comment from Andrew Lamb (alamb) @ 2020-10-06T12:25:16.639+0000:

FYI [~andygrove] -- I am doing part of this in ARROW-10159. However, the initial implementation effectively converts DictionaryArray --> PrimitiveArray / StringArray and then uses the existing processing.

To support the actual efficient-processing use case, I think significant work will be needed to add appropriate dictionary support to the arrow compute kernels.
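
To illustrate the difference between the two approaches, here is a minimal sketch in plain Rust. It does not use the arrow crate: `DictColumn`, `unpack`, `eq_scalar`, and `eq_scalar_plain` are hypothetical names made up for this example, and the real kernels would look different. It only shows why unpacking first loses the benefit of a low-cardinality dictionary, while a dictionary-aware kernel can evaluate a predicate once per distinct value.

```rust
/// Hypothetical stand-in for a dictionary-encoded string column.
/// This is NOT the arrow crate's `DictionaryArray`; it only models the idea.
struct DictColumn {
    /// Distinct values, each stored once.
    values: Vec<String>,
    /// One index into `values` per row.
    keys: Vec<u32>,
}

impl DictColumn {
    /// Approach 1: unpack into a plain string column so existing
    /// string kernels can be reused. Every row gets its own copy.
    fn unpack(&self) -> Vec<String> {
        self.keys
            .iter()
            .map(|&k| self.values[k as usize].clone())
            .collect()
    }

    /// Approach 2: a dictionary-aware kernel. The string comparison runs
    /// once per distinct value; each row is then a cheap lookup by key.
    fn eq_scalar(&self, needle: &str) -> Vec<bool> {
        let matches: Vec<bool> = self
            .values
            .iter()
            .map(|v| v.as_str() == needle)
            .collect();
        self.keys.iter().map(|&k| matches[k as usize]).collect()
    }
}

/// Stand-in for an existing kernel that works on unpacked strings:
/// one string comparison per row.
fn eq_scalar_plain(column: &[String], needle: &str) -> Vec<bool> {
    column.iter().map(|v| v.as_str() == needle).collect()
}
```

Both paths produce the same result, but with N rows and D distinct values the unpacking path does N string copies and N string comparisons, whereas the dictionary-aware path does D comparisons and N index lookups.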

@alamb alamb changed the title [Rust] Add better and faster support for dictionary types Add better and faster support for dictionary types Apr 26, 2021