-
Notifications
You must be signed in to change notification settings - Fork 4k
ARROW-14069: [R] By default, filter out hash functions in list_compute_functions() #11363
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
|
r/R/compute.R
Outdated
| } | ||
| # TODO: Filtering of hash funcs will already happen in C++ with ARROW-13943 | ||
| funcs <- grep( | ||
| "^hash_|index_in_meta_binary|is_in_meta_binary", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why filter out index/is_in_meta_binary? You can call them, and we in fact use index_in_meta_binary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a comment on the ticket suggesting otherwise and I didn't look into it thoroughly enough; I've removed those from the things to filter out now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That comment came from me I assume. @nealrichardson can you explain what the difference is between is_in and is_in_meta_binary or index_in and index_in_meta_binary? If the _meta_binary variants truly are needed then we should add documentation for them. Currently they are missing doc strings of any kind in python.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC is_in is unary and takes the lookup table as an option; the _meta_binary version is binary instead. We don't use the _meta_binary in the query engine for performance concerns but it was convenient to use when calling more directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yep, I just tested them and they do appear to be binary variants of the unary kernels. I opened ARROW-14262 for discussion and to track documenting the kernels.
…e_functions() Filter the following functions from `list_compute_functions()`: * any which start with "hash_" * `index_in_meta_binary` * `is_in_meta_binary` Closes apache#11363 from thisisnic/ARROW-14069_list_compute Authored-by: Nic Crane <thisisnic@gmail.com> Signed-off-by: Neal Richardson <neal.p.richardson@gmail.com>
Filter the following functions from
list_compute_functions():index_in_meta_binaryis_in_meta_binary