[FEA] Support for non key nested types in cudf::merge #8050

Closed
revans2 opened this issue Apr 23, 2021 · 2 comments
Assignees
Labels
feature request (New feature or request), libcudf (Affects libcudf (C++/CUDA) code), Spark (Functionality that helps Spark RAPIDS)

Comments

@revans2
Contributor

revans2 commented Apr 23, 2021

Is your feature request related to a problem? Please describe.
To sort data in Spark, we use `cudf::merge` when the data is too large to fit in a single batch. We need to be able to sort data that contains nested types, including nested columns that are not part of the key we are sorting on.

Describe the solution you'd like
I would like `cudf::merge` to support the same types for non-key columns that gather supports, so that we can sort whatever data we need to sort.
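
One way to picture this request, as a CPU-only sketch in standard C++ rather than the libcudf API: the merged row order depends only on the key columns, so it can be computed as an index map and then applied to a payload of any type, much as a gather would. Every name below (`Source`, `merge_order`, `gather_merged`) is hypothetical and exists only for illustration; this is not how libcudf implements merge.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration only: plain host-side C++, not the libcudf API.
// Each output row is identified by the input it came from and its row index.
struct Source {
  int side;         // 0 = left input, 1 = right input
  std::size_t row;  // row index within that input
};

// Compute the merged row order from two key columns that are already sorted
// ascending. On ties the left row goes first, which keeps the merge stable.
template <typename Key>
std::vector<Source> merge_order(const std::vector<Key>& left_keys,
                                const std::vector<Key>& right_keys) {
  std::vector<Source> order;
  order.reserve(left_keys.size() + right_keys.size());
  std::size_t i = 0, j = 0;
  while (i < left_keys.size() && j < right_keys.size()) {
    if (right_keys[j] < left_keys[i]) {
      order.push_back({1, j++});
    } else {
      order.push_back({0, i++});
    }
  }
  while (i < left_keys.size()) order.push_back({0, i++});
  while (j < right_keys.size()) order.push_back({1, j++});
  return order;
}

// Apply the order to any non-key column. The payload is only copied, never
// compared, so it can be a nested value such as a list or a struct.
template <typename Payload>
std::vector<Payload> gather_merged(const std::vector<Source>& order,
                                   const std::vector<Payload>& left,
                                   const std::vector<Payload>& right) {
  std::vector<Payload> out;
  out.reserve(order.size());
  for (const Source& s : order) {
    const std::vector<Payload>& src = (s.side == 0) ? left : right;
    assert(s.row < src.size());  // the order must refer to existing rows
    out.push_back(src[s.row]);
  }
  return out;
}
```

In this sketch, calling `merge_order` on the key columns and then `gather_merged` on a `std::vector<std::vector<int>>` payload carries a list-like column through the merge without ever needing a comparison operator for it.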

Describe alternatives you've considered
As a workaround, we will concatenate the tables and sort the result, but that is much slower and not ideal.
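
To make the cost difference concrete, here is a minimal host-side sketch in standard C++ (not the libcudf API; `Row`, `key_less`, `concat_and_sort`, and `merge_sorted` are hypothetical names). Both functions produce the same ordering, but the concatenate-and-sort path ignores the fact that each input is already sorted and performs O(n log n) comparisons, while the merge path needs only a linear number.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical row type: an int sort key plus a nested (list-like) payload.
struct Row {
  int key;
  std::vector<int> payload;
};

// Rows are ordered by key only; the nested payload is never compared.
inline bool key_less(const Row& a, const Row& b) { return a.key < b.key; }

// Workaround: concatenate both inputs, then stable-sort the whole result.
inline std::vector<Row> concat_and_sort(const std::vector<Row>& left,
                                        const std::vector<Row>& right) {
  std::vector<Row> out(left);
  out.insert(out.end(), right.begin(), right.end());
  std::stable_sort(out.begin(), out.end(), key_less);
  return out;
}

// Merge: a single linear pass that relies on both inputs being sorted by key.
inline std::vector<Row> merge_sorted(const std::vector<Row>& left,
                                     const std::vector<Row>& right) {
  assert(std::is_sorted(left.begin(), left.end(), key_less));
  assert(std::is_sorted(right.begin(), right.end(), key_less));
  std::vector<Row> out;
  out.reserve(left.size() + right.size());
  std::merge(left.begin(), left.end(), right.begin(), right.end(),
             std::back_inserter(out), key_less);
  return out;
}
```

Because `std::merge` and `std::stable_sort` both keep left rows ahead of right rows on equal keys, the two functions return identical results for sorted inputs; only the amount of work differs.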

@revans2 added the feature request, Needs Triage, and Spark labels Apr 23, 2021
@github-actions bot added this to Needs prioritizing in Feature Planning Apr 23, 2021
@kkraus14 added the libcudf label and removed the Needs Triage label Apr 27, 2021
@nvdbaranec self-assigned this May 20, 2021
rapids-bot bot pushed a commit that referenced this issue Jun 16, 2021
Partially addresses #8050

Adds support for merging struct columns. Struct columns cannot be used as keys in the merge.

Authors:
  - https://github.com/nvdbaranec

Approvers:
  - Ram (Ramakrishna Prabhu) (https://github.com/rgsl888prabhu)
  - Christopher Harris (https://github.com/cwharris)
  - Conor Hoekstra (https://github.com/codereport)

URL: #8422
Feature Planning automation moved this from Needs prioritizing to Closed Jul 20, 2021
@firestarman
Contributor

Reopening this because support for the array and map types is still missing.

@vyasr
Contributor

vyasr commented Nov 21, 2023

Closing as resolved by #14250.

@vyasr closed this as completed Nov 21, 2023