[Gandiva] support input selection vectors for both projector and filter #19828

Closed
asfimport opened this issue Oct 14, 2018 · 6 comments

Comments

@asfimport

The Gandiva filter module returns a selection vector representing the indices of the records in the batch that matched the filter. We can connect this to other modules by passing the selection vector as an input argument to the downstream projector or filter.
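
To illustrate the idea, here is a minimal sketch in plain Python. It does not use Gandiva's actual API: `run_filter`, `run_projector`, and the dict-of-lists batch are hypothetical stand-ins, assumed only for illustration. Each module accepts an optional input selection vector and evaluates only the rows it lists.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical stand-in for a record batch: column name -> list of values.
Batch = Dict[str, List[int]]


def run_filter(
    batch: Batch,
    predicate: Callable[[Batch, int], bool],
    num_rows: int,
    selection: Optional[List[int]] = None,
) -> List[int]:
    """Return the indices of rows that match, looking only at `selection` if given."""
    candidates = range(num_rows) if selection is None else selection
    return [i for i in candidates if predicate(batch, i)]


def run_projector(
    batch: Batch,
    expr: Callable[[Batch, int], int],
    num_rows: int,
    selection: Optional[List[int]] = None,
) -> List[int]:
    """Evaluate `expr` for every row, or only for the rows listed in `selection`."""
    candidates = range(num_rows) if selection is None else selection
    return [expr(batch, i) for i in candidates]


batch = {"a": [5, 12, 7, 30, 18], "b": [1, 2, 3, 4, 5]}
num_rows = 5

# First filter: a > 10. Its output selection vector feeds the next module.
sel1 = run_filter(batch, lambda b, i: b["a"][i] > 10, num_rows)
# Second filter: b is even, evaluated only on rows that survived the first.
sel2 = run_filter(batch, lambda b, i: b["b"][i] % 2 == 0, num_rows, sel1)
# Projector: a + b, evaluated only on the rows that passed both filters.
projected = run_projector(batch, lambda b, i: b["a"][i] + b["b"][i], num_rows, sel2)
```

In this sketch the indices always refer to positions in the original batch, so no intermediate batch has to be materialized between modules.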

 

Reporter: Pindikura Ravindra / @pravindra
Assignee: Praveen Krishna / @Praveen2112

PRs and other links:

Note: This issue was originally created as ARROW-3511. Please see the migration documentation for further details.

@asfimport

Francois Saint-Jacques / @fsaintjacques:
I'm curious to know why Gandiva makes primary use of selection vectors, as opposed to the bitmaps used in Arrow.

@asfimport

Wes McKinney / @wesm:
Selection integer vectors and boolean vectors are complementary; they are frequently used in pandas, and they are useful because you don't have to scan the boolean vector to determine the size of a filtered array. Others can comment further.
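
A small sketch of that point, in plain Python with hypothetical helper names (`mask_to_selection`, `selection_to_mask`): the two representations carry the same information, but the size of the filtered result is just the length of the selection vector, whereas the boolean vector has to be scanned to count it.

```python
from typing import List


def mask_to_selection(mask: List[bool]) -> List[int]:
    """Convert a boolean vector into a vector of selected row indices."""
    return [i for i, keep in enumerate(mask) if keep]


def selection_to_mask(selection: List[int], num_rows: int) -> List[bool]:
    """Convert a vector of selected row indices back into a boolean vector."""
    mask = [False] * num_rows
    for i in selection:
        mask[i] = True
    return mask


mask = [False, True, False, True, True, False]
selection = mask_to_selection(mask)

# Size of the filtered array: read directly from the selection vector ...
size_from_selection = len(selection)
# ... versus a full scan over the boolean vector.
size_from_mask = sum(mask)
```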

@asfimport

Pindikura Ravindra / @pravindra:
I picked up the idea of using selection vectors from Dremio. Iterating over a selection vector should be more efficient than iterating over bits in a bitmap, especially when the selectivity is low. However, I haven't benchmarked this.
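
The intuition can be sketched by counting loop iterations rather than measuring time. This is an illustrative model in plain Python, not a benchmark and not Gandiva code: the helper names are hypothetical, and the bitmap loop is the naive bit-by-bit version (a real implementation could skip all-zero words). The bitmap is assumed to use least-significant-bit-first ordering.

```python
from typing import List, Tuple


def sum_with_bitmap(values: List[int], bitmap: bytearray) -> Tuple[int, int]:
    """Sum selected values by testing every bit; returns (total, loop iterations)."""
    total, steps = 0, 0
    for i in range(len(values)):
        steps += 1
        if (bitmap[i // 8] >> (i % 8)) & 1:
            total += values[i]
    return total, steps


def sum_with_selection(values: List[int], selection: List[int]) -> Tuple[int, int]:
    """Sum selected values by visiting only listed indices; returns (total, loop iterations)."""
    total, steps = 0, 0
    for i in selection:
        steps += 1
        total += values[i]
    return total, steps


values = list(range(1000))
selection = [10, 500, 999]  # only 3 of 1000 rows pass the filter

# Build the equivalent bitmap for the same three rows.
bitmap = bytearray((len(values) + 7) // 8)
for i in selection:
    bitmap[i // 8] |= 1 << (i % 8)

bitmap_total, bitmap_steps = sum_with_bitmap(values, bitmap)
selection_total, selection_steps = sum_with_selection(values, selection)
```

Both loops produce the same total, but the naive bitmap loop runs once per row in the batch, while the selection loop runs once per selected row.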

 

 

@asfimport

Wes McKinney / @wesm:
Yes, for low selectivity selections it's also a lot more memory efficient.

Either way, I see this as part of the "algebra" of the compiler. So at some point we can probably augment the algebra and code generation with boolean selections.
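
As a rough worked example of the memory point above, assuming one bit per row for a bitmap and a fixed index width per selected row (the 2-byte and 4-byte widths here are assumptions for illustration):

```python
def bitmap_bytes(num_rows: int) -> int:
    """Bytes needed for a bitmap with one bit per row, rounded up to whole bytes."""
    return (num_rows + 7) // 8


def selection_bytes(num_selected: int, index_width_bytes: int) -> int:
    """Bytes needed for a selection vector with fixed-width indices."""
    return num_selected * index_width_bytes


def break_even_rows(num_rows: int, index_width_bytes: int) -> int:
    """Number of selected rows at which the selection vector is as large as the bitmap."""
    return bitmap_bytes(num_rows) // index_width_bytes


num_rows = 65536

bitmap_size = bitmap_bytes(num_rows)        # fixed cost, independent of selectivity
sparse_selection = selection_bytes(100, 2)  # 100 selected rows, 2-byte indices
break_even_16 = break_even_rows(num_rows, 2)
break_even_32 = break_even_rows(num_rows, 4)
```

Under these assumptions the bitmap always costs 8192 bytes for 65536 rows, while a selection vector with 2-byte indices is smaller whenever fewer than 1 in 16 rows are selected (1 in 32 for 4-byte indices). Above that selectivity the bitmap is the more compact form.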

@asfimport

Francois Saint-Jacques / @fsaintjacques:
I understand the differences and trade-offs; I was just curious about the reasoning behind the divergence.

@asfimport

Pindikura Ravindra / @pravindra:
Issue resolved by pull request 2789
#2789

@asfimport asfimport added this to the 0.13.0 milestone Jan 11, 2023