Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEA] Reduce disparity between nested and non-nested column handling in lexicographic comparator #11667

Open
vyasr opened this issue Sep 8, 2022 · 0 comments
Labels
0 - Backlog In queue waiting for assignment feature request New feature or request improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code.

Comments

@vyasr
Copy link
Contributor

vyasr commented Sep 8, 2022

Is your feature request related to a problem? Please describe.
As part of #11129 the row lexicographic comparator became templated on whether or not the tables being compared contain any nested columns. This templating was necessary to guarantee that nested and non-nested code paths were compiled separately. That, in turn, was required because the compiler fails to completely optimize the non-nested code path due to the complexity of the nested code path, which slows down code even when no nested data is present. Unfortunately, this now means that calling code must dispatch to separate paths depending on whether the table being operated on has nested data. In addition to being cumbersome in and of itself, this requirement makes code using the lexicographic comparator different from code using the equality comparator.

Describe the solution you'd like
We should consider alternative APIs for the comparator that abstract the nested column dispatch away from the call-site. One option that I considered is to define a wrapper function that accepts a callable to apply (e.g. thrust::sort) that requires the comparator as an argument. This wrapper function could instantiate the appropriate comparator and then call the provided callable. @jrhemstad suggested that this could be accomplished using a visitor pattern. We should prototype this approach.

Describe alternatives you've considered
I have not yet come up with any alternative solutions, but it would be nice to try to find an even less invasive approach if possible since the visitor pattern does add an extra level of indirection that would be nice to avoid.

Additional context
N/A

@vyasr vyasr added feature request New feature or request 0 - Backlog In queue waiting for assignment improvement Improvement / enhancement to an existing function labels Sep 8, 2022
@vyasr vyasr added this to Needs prioritizing in Other Issues via automation Oct 26, 2022
@GregoryKimball GregoryKimball added the libcudf Affects libcudf (C++/CUDA) code. label Nov 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
0 - Backlog In queue waiting for assignment feature request New feature or request improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code.
Projects
No open projects
Other Issues
Needs prioritizing
Development

No branches or pull requests

2 participants