Improve performance of compare_dict_op #1371

Closed
alamb opened this issue Feb 28, 2022 · 0 comments · Fixed by #1372
Labels
arrow (Changes to the arrow crate), enhancement (Any new improvement worthy of an entry in the changelog), performance

alamb commented Feb 28, 2022

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

#1330 introduced a significant cleanup for comparing dictionary arrays 🎉

However, it may also reduce performance, because it indirectly calls value() repeatedly on an array, and each call checks the bounds. This bounds check is unnecessary for known good dictionary arrays (where all the keys are known to be valid indexes into the values array).

Describe the solution you'd like

As suggested by @viirya in #1330 (comment):

  1. A benchmark showing the speed of comparing dictionary arrays
  2. Implement something like unsafe take_iter_unchecked() that is used in compare_dict_op
  3. Demonstrate that the benchmark is faster with the specialized approach
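
To make the idea in step 2 concrete, here is a minimal sketch using plain slices rather than the arrow crate's types. The `Dict` struct, `try_new`, `take_iter`, and `compare_dict_eq` are hypothetical names for illustration only; `take_iter_unchecked` borrows the name suggested above, but its signature here is an assumption, not the crate's API. Keys are validated once at construction, so the per-element lookup can skip the bounds check.

```rust
/// Hypothetical stand-in for a dictionary array: `keys[i]` indexes into `values`.
/// This is NOT the arrow crate's `DictionaryArray`.
struct Dict<'a> {
    keys: &'a [usize],
    values: &'a [i64],
}

impl<'a> Dict<'a> {
    /// Validate once, up front, that every key is a valid index into `values`.
    fn try_new(keys: &'a [usize], values: &'a [i64]) -> Option<Self> {
        if keys.iter().all(|&k| k < values.len()) {
            Some(Self { keys, values })
        } else {
            None
        }
    }

    /// Checked lookup: every element pays for a bounds check (panics if out of range).
    fn take_iter(&self) -> impl Iterator<Item = i64> + '_ {
        self.keys.iter().map(move |&k| self.values[k])
    }

    /// Unchecked lookup: skips the per-element bounds check.
    ///
    /// # Safety
    /// Every key must be `< self.values.len()`. `try_new` establishes this.
    unsafe fn take_iter_unchecked(&self) -> impl Iterator<Item = i64> + '_ {
        self.keys
            .iter()
            .map(move |&k| unsafe { *self.values.get_unchecked(k) })
    }
}

/// Element-wise equality of two dictionaries of the same length,
/// in the spirit of `compare_dict_op` (hypothetical helper).
fn compare_dict_eq(left: &Dict<'_>, right: &Dict<'_>) -> Vec<bool> {
    // SAFETY: this sketch assumes both inputs were built with `try_new`,
    // which checked every key against `values.len()`.
    unsafe {
        left.take_iter_unchecked()
            .zip(right.take_iter_unchecked())
            .map(|(l, r)| l == r)
            .collect()
    }
}
```

Whether the unchecked path is actually faster depends on how much of the check the compiler already removes, which is why steps 1 and 3 ask for a benchmark before and after the change.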

Additional context
All the context is on #1330
