thisisnic (Member) commented Oct 8, 2021

Filter the following functions from list_compute_functions():

  • any which start with "hash_"
  • index_in_meta_binary
  • is_in_meta_binary
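
A minimal sketch of the filter described above, using base R only. The `funcs` vector is a hypothetical sample standing in for the names that `list_compute_functions()` would see before filtering, and `hidden_pattern` and `visible` are names introduced here purely for illustration.

```r
# Hypothetical sample of compute function names; the real list comes
# from the Arrow C++ function registry and is much longer.
funcs <- c(
  "add", "hash_count", "hash_sum", "index_in",
  "index_in_meta_binary", "is_in", "is_in_meta_binary", "sum"
)

# Pattern covering the three cases in the description: names starting
# with "hash_", plus the two *_meta_binary functions.
hidden_pattern <- "^hash_|index_in_meta_binary|is_in_meta_binary"

# invert = TRUE keeps the names that do NOT match the pattern;
# value = TRUE returns the names themselves rather than their indices.
visible <- grep(hidden_pattern, funcs, value = TRUE, invert = TRUE)
print(visible)
```

Note that `index_in` and `is_in` survive the filter, because the pattern only matches the full `_meta_binary` names.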

github-actions bot commented Oct 8, 2021

⚠️ Ticket has not been started in JIRA, please click 'Start Progress'.

r/R/compute.R Outdated
}
# TODO: Filtering of hash funcs will already happen in C++ with ARROW-13943
funcs <- grep(
"^hash_|index_in_meta_binary|is_in_meta_binary",
Member

Why filter out index/is_in_meta_binary? You can call them, and we in fact use index_in_meta_binary.

Member Author

There was a comment on the ticket suggesting otherwise and I didn't look into it thoroughly enough; I've removed those from the things to filter out now.

Member

That comment came from me, I assume. @nealrichardson, can you explain the difference between is_in and is_in_meta_binary, or between index_in and index_in_meta_binary? If the _meta_binary variants truly are needed, then we should add documentation for them. Currently they have no docstrings of any kind in Python.

Member

IIRC is_in is unary and takes the lookup table as an option; the _meta_binary version is binary instead. We don't use the _meta_binary version in the query engine because of performance concerns, but it was convenient to use when calling the function more directly.
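
To make the unary/binary distinction concrete, here is a sketch in base R. Both functions are hypothetical stand-ins written for illustration, not Arrow's kernels: `is_in_unary()` takes a single data argument and receives the lookup table through an options list (the option name `value_set` is an assumption made for this sketch), while `is_in_binary()` takes the lookup table as a second data argument.

```r
# Unary shape: one data argument; the lookup table rides along in options.
# (Hypothetical helper, not part of any package.)
is_in_unary <- function(values, options) {
  values %in% options$value_set
}

# Binary shape: the lookup table is passed as a second data argument.
# (Hypothetical helper, not part of any package.)
is_in_binary <- function(values, value_set) {
  values %in% value_set
}

x <- c("a", "b", "c")
lookup <- c("b", "c", "d")

unary_result <- is_in_unary(x, options = list(value_set = lookup))
binary_result <- is_in_binary(x, lookup)

# Both shapes answer the same question; only the calling convention differs.
print(identical(unary_result, binary_result))
```

The sketch says nothing about the performance concern mentioned above; it only illustrates how the two calling conventions differ.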

Member

Ah, yep, I just tested them and they do appear to be binary variants of the unary kernels. I opened ARROW-14262 for discussion and to track documenting the kernels.

ViniciusSouzaRoque pushed a commit to s1mbi0se/arrow that referenced this pull request Oct 20, 2021
…e_functions()

Filter the following functions from `list_compute_functions()`:
* any which start with "hash_"
* `index_in_meta_binary`
* `is_in_meta_binary`

Closes apache#11363 from thisisnic/ARROW-14069_list_compute

Authored-by: Nic Crane <thisisnic@gmail.com>
Signed-off-by: Neal Richardson <neal.p.richardson@gmail.com>