[FEA] Support ArrayIntersect on at least Arrays of String #4932

Closed
revans2 opened this issue Mar 10, 2022 · 1 comment · Fixed by #5958
Labels: cudf_dependency (an issue or PR with this label depends on a new feature in cudf); feature request (new feature or request)


revans2 commented Mar 10, 2022

Is your feature request related to a problem? Please describe.
This is a follow-on issue for #4900. Specifically, the following expression cannot run on the GPU:

!NOT_FOUND <ArrayIntersect> array_intersect(sort_array(map_keys(xx1#1134), true), [READ_MORE,EXPLORE_MORE,NEXT_UP]) cannot run on GPU because no GPU enabled version of expression class org.apache.spark.sql.catalyst.expressions.ArrayIntersect could be found 

Looking at CUDF, there is no existing functionality that supports this, but it should not be too difficult to add, as it is similar to the existing drop-duplicates code. If the lists are really long, we might need a drastically different approach, but either way I will file an issue in CUDF to support this type of functionality.
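To make the link to drop-duplicates concrete, here is a minimal CPU-side sketch of the per-row semantics being asked for, written in plain Python. The function names `array_intersect` and `array_intersect_column` are hypothetical and are not part of CUDF or the plugin. The sketch assumes the result keeps the order of the first array, drops duplicates, and is null when either input array is null.

```python
from typing import List, Optional

Row = Optional[List[Optional[str]]]


def array_intersect(left: Row, right: Row) -> Row:
    """Hypothetical reference: distinct elements of `left` that also occur in `right`."""
    # Assumption: a null input array yields a null result row.
    if left is None or right is None:
        return None
    right_values = set(right)  # membership lookup for the second array
    seen = set()               # drop-duplicates state for the output
    result = []
    for value in left:
        if value in right_values and value not in seen:
            seen.add(value)
            result.append(value)
    return result


def array_intersect_column(lefts: List[Row], rights: List[Row]) -> List[Row]:
    """Apply the row-wise sketch to two list columns of equal length."""
    return [array_intersect(l, r) for l, r in zip(lefts, rights)]
```

For the query above, intersecting sorted keys such as `["EXPLORE_MORE", "OTHER", "READ_MORE"]` (an illustrative input) with `["READ_MORE", "EXPLORE_MORE", "NEXT_UP"]` would give `["EXPLORE_MORE", "READ_MORE"]`. A GPU implementation would likely not loop per element like this; the sketch only pins down the expected output.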

It would be great to also look at supporting similar operators such as array_union and array_except.
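For comparison, the two related operators can be sketched the same way. This is a plain-Python illustration with hypothetical function names, not CUDF or plugin code. It assumes both operators return distinct elements in first-seen order and return null when either input array is null.

```python
from typing import List, Optional

Row = Optional[List[Optional[str]]]


def _distinct(values):
    """Order-preserving drop-duplicates over an iterable."""
    seen = set()
    result = []
    for value in values:
        if value not in seen:
            seen.add(value)
            result.append(value)
    return result


def array_union(left: Row, right: Row) -> Row:
    """Hypothetical reference: distinct elements of `left` followed by new ones from `right`."""
    if left is None or right is None:  # assumption: null in, null out
        return None
    return _distinct(list(left) + list(right))


def array_except(left: Row, right: Row) -> Row:
    """Hypothetical reference: distinct elements of `left` that do not occur in `right`."""
    if left is None or right is None:  # assumption: null in, null out
        return None
    excluded = set(right)
    return _distinct(v for v in left if v not in excluded)
```

All three operators reduce to a drop-duplicates pass plus a membership test against the other list, which is why they are natural candidates to implement together.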

revans2 added the feature request, ? - Needs Triage, and cudf_dependency labels Mar 10, 2022

revans2 commented Mar 10, 2022

I filed rapidsai/cudf#10409 as the CUDF dependency.
