Fix null check when comparing structs in min and max reduction/groupby operations #9994

Merged
merged 3 commits into from Jan 11, 2022


@ttnghia ttnghia commented Jan 7, 2022

When comparing structs, we need to flatten the structs column's view into a table_view and compare the table's rows. The null check for that comparison must look for nulls in the input structs column at all levels, not just in the top-level column.

This PR fixes a bug in which nulls were checked only at the top level. Unit tests designed specifically to detect this bug have also been added.
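
To illustrate the difference between a top-level null check and a check at all levels, here is a minimal host-side sketch. It does not use the libcudf API: the `Column` type and the functions `has_nulls_top_level`, `has_nested_nulls`, and `make_example` are hypothetical stand-ins invented for this example, and the actual fix in `struct_minmax_util.cuh` may be structured differently.

```cpp
#include <vector>

// Hypothetical, simplified stand-in for a (possibly nested) structs column.
// This is NOT a libcudf type; it only models per-row validity and children.
struct Column {
  std::vector<bool> valid;       // valid[i] == false means row i is null
  std::vector<Column> children;  // child columns of a struct column
};

// Looks at the validity of this column only (the top-level-only check
// that the description above identifies as insufficient).
bool has_nulls_top_level(Column const& col) {
  for (bool v : col.valid) {
    if (!v) { return true; }
  }
  return false;
}

// Looks at the validity of this column and, recursively, of every
// descendant, i.e. a null check "at all levels".
bool has_nested_nulls(Column const& col) {
  if (has_nulls_top_level(col)) { return true; }
  for (auto const& child : col.children) {
    if (has_nested_nulls(child)) { return true; }
  }
  return false;
}

// Builds a 3-row struct column with no top-level nulls whose first
// child has a null in row 1.
Column make_example() {
  Column child_a{{true, false, true}, {}};
  Column child_b{{true, true, true}, {}};
  return Column{{true, true, true}, {child_a, child_b}};
}
```

For the column returned by `make_example()`, `has_nulls_top_level` reports no nulls while `has_nested_nulls` finds the null in the first child. A comparison that decides whether to handle nulls based only on the former would treat the child null as ordinary data.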

@ttnghia ttnghia added bug Something isn't working 3 - Ready for Review Ready for review by team libcudf Affects libcudf (C++/CUDA) code. helps: Spark Functionality that helps Spark RAPIDS non-breaking Non-breaking change labels Jan 7, 2022
@ttnghia ttnghia self-assigned this Jan 7, 2022
@ttnghia ttnghia requested a review from a team as a code owner January 7, 2022 14:38
@ttnghia ttnghia added this to PR-WIP in v22.02 Release via automation Jan 7, 2022

ttnghia commented Jan 7, 2022

Reference: NVIDIA/spark-rapids#4434. That work, in the spark-rapids plugin, implements min and max operations for structs and uncovered the bug that this PR fixes.

@ttnghia ttnghia moved this from PR-WIP to PR-Needs review in v22.02 Release Jan 7, 2022
@ttnghia ttnghia requested a review from bdice January 7, 2022 20:36

@hyperbolic2346 hyperbolic2346 left a comment

Looks good to me outside of the copyrights.

Review comment on cpp/src/reductions/struct_minmax_util.cuh (resolved)
v22.02 Release automation moved this from PR-Needs review to PR-Reviewer approved Jan 11, 2022

@bdice bdice left a comment

LGTM - @ttnghia can you please update the PR description to cross-reference the issues, PRs, and/or comments that led to discovering this bug and its fix? You referenced NVIDIA/spark-rapids#4434 but I don't immediately see the relationship to this PR.

edit: I think this comment is the one that explains it. This PR fixes the difference in CPU/GPU behavior from this thread: #8974 (comment)

ttnghia commented Jan 11, 2022

@gpucibot merge

@rapids-bot rapids-bot bot merged commit d3282cb into rapidsai:branch-22.02 Jan 11, 2022
v22.02 Release automation moved this from PR-Reviewer approved to Done Jan 11, 2022
@ttnghia ttnghia changed the title Fix null check when comparing rows of structs in min and max reduction/groupby operations Fix null check when comparing structs in min and max reduction/groupby operations Jan 12, 2022
@ttnghia ttnghia deleted the fix_null_check branch March 28, 2022 23:48
Labels
3 - Ready for Review Ready for review by team bug Something isn't working helps: Spark Functionality that helps Spark RAPIDS libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
3 participants