Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-12426: [Rust] Fix concatentation of arrow dictionaries #15

Merged
merged 1 commit into from Apr 22, 2021

Conversation

tustvold
Copy link
Contributor

@tustvold tustvold commented Apr 21, 2021

This is the patchset from apache/arrow#10073 reapplyed to this repository.

From the original PR:

Currently calling concat on two dictionary encoded arrays blindly concatenates the keys arrays, and keeps only the values of the first. This can lead to invalid data, including keys referring to value indexes beyond the bounds of the values array.

This PR alters MutableArrayData to concatenate the child data of any dictionary arrays passed to it, and offset the source keys to correctly map to the relevant "slice" of the resulting child data. This does mean that the resulting dictionary array may contain duplicates, and will not be sorted, but I'm not sure if this is a problem?

@github-actions github-actions bot added the arrow Changes to the arrow crate label Apr 21, 2021
Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks as good as it did the first time around 👍

@jorgecarleitao
Copy link
Member

We had to perform a small re-write of master. The commits may look a bit odd, but it should not cause conflicts. Could you kindly rebase this against the latest master to make it easier to review?

@alamb alamb merged commit a5732e8 into apache:master Apr 22, 2021
@alamb
Copy link
Contributor

alamb commented Apr 22, 2021

Thanks @tustvold

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants