Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add keep option to distinct nvbench #16497

Merged

Conversation

bdice
Copy link
Contributor

@bdice bdice commented Aug 5, 2024

Description

This PR adopts some work from @srinivasyadav18 with additional modifications. This is meant to complement #16484.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@github-actions github-actions bot added the libcudf Affects libcudf (C++/CUDA) code. label Aug 5, 2024
@bdice bdice added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Aug 5, 2024
@bdice bdice self-assigned this Aug 5, 2024
@bdice bdice marked this pull request as ready for review August 5, 2024 23:08
@bdice bdice requested a review from a team as a code owner August 5, 2024 23:08
@bdice bdice requested review from mythrocks and vuule August 5, 2024 23:08
.add_int64_axis("NumRows", {10'000, 100'000, 1'000'000, 10'000'000});
.add_string_axis("keep", {"any", "first", "last", "none"})
.add_int64_axis("cardinality",
{100, 1'000, 10'000, 100'000, 1'000'000, 10'000'000, 100'000'000, 1'000'000'000})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can just decrease the default values in the axis.

Copy link
Member

@PointKernel PointKernel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One non-blocking suggestion

LGTM.

cpp/benchmarks/stream_compaction/distinct.cpp Outdated Show resolved Hide resolved
@github-actions github-actions bot added the CMake CMake build issue label Aug 6, 2024
Comment on lines +38 to +41
if (cardinality > num_rows) {
state.skip("cardinality > num_rows");
return;
}
Copy link
Contributor

@GregoryKimball GregoryKimball Aug 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to omit this skipping condition. I recognize that we can't have 1M distinct elements in 1K rows, but this condition adds a lot of friction when sweeping NumRows for the high cardinality case. It forces me to run a full factorial of matching NumRows and Cardinality values and filter the outputs for the highest Cardinality unskipped for each NumRows.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll rewrite this logic. Thanks for the feedback!

Copy link
Contributor Author

@bdice bdice Aug 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. @GregoryKimball I reviewed the NVBench docs and I don't see a way to filter out certain jobs except by skipping them. https://github.com/NVIDIA/nvbench/blob/main/docs/benchmarks.md#beware-combinatorial-explosion-is-lurking

We might be able to use a string axis like {"100,100", "100,1000", ..., "1000000000,1000000000"} and parse it, but that's hard to maintain.

Copy link
Member

@PointKernel PointKernel Aug 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a way to filter out certain jobs except by skipping them.

NVIDIA/nvbench#80 can solve this issue but the PR has been stalled for a while.

Co-authored-by: Yunsong Wang <yunsongw@nvidia.com>
@GregoryKimball
Copy link
Contributor

GregoryKimball commented Aug 7, 2024

Thanks guys, here is a performance snapshot of these benchmarks on A100
image

I would like to merge this update ASAP and then have @srinivasyadav18 pull the changes into #16484

I'm noticing that these throughput numbers on A100 are about 10x lower than what @srinivasyadav18 posted for H100. This is another reason I'm interested in running wider benchmarks on #16484

Copy link
Contributor Author

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@srinivasyadav18 Can you approve?

Copy link
Contributor

@srinivasyadav18 srinivasyadav18 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CI fail is still failing due styling issues.
Otherwise LGTM! Thanks!

@bdice
Copy link
Contributor Author

bdice commented Aug 7, 2024

Just fixed the CI style check. I'll merge this when CI passes.

@bdice
Copy link
Contributor Author

bdice commented Aug 7, 2024

/merge

@rapids-bot rapids-bot bot merged commit 1bbe440 into rapidsai:branch-24.10 Aug 8, 2024
80 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CMake CMake build issue improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

4 participants