Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test cuda graph capture #1112

Merged
merged 3 commits into from
Nov 17, 2023

Conversation

gevtushenko
Copy link
Collaborator

Description

closes #321

This PR generalizes CDP wrapper to support multiple launch backends.
It also introduces graph capture backend and adds coverage for graph capture to basic variants of device-scope tests.

Checklist

  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@gevtushenko gevtushenko requested review from a team as code owners November 16, 2023 04:32
@gevtushenko gevtushenko requested review from alliepiper, miscco and elstehle and removed request for a team November 16, 2023 04:32
Copy link
Collaborator

@miscco miscco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

Only some minor nits

cub/test/catch2_test_launch_helper.h Show resolved Hide resolved
cub/test/catch2_test_launch_helper.h Outdated Show resolved Hide resolved
cub/test/catch2_test_launch_wrapper.cu Outdated Show resolved Hide resolved
cub/test/catch2_test_launch_wrapper.cu Outdated Show resolved Hide resolved
cub/test/catch2_test_launch_helper.h Show resolved Hide resolved
Copy link
Collaborator

@alliepiper alliepiper left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really cool stuff, impressively small patch for the amount of coverage this adds!

Some minor cleanups / discussion points inline, but this lgtm.

cub/test/CMakeLists.txt Outdated Show resolved Hide resolved
cub/test/CMakeLists.txt Outdated Show resolved Hide resolved
cub/test/CMakeLists.txt Outdated Show resolved Hide resolved
cub/test/CMakeLists.txt Show resolved Hide resolved
cub/test/CMakeLists.txt Show resolved Hide resolved
cub/test/catch2_test_launch_helper.h Outdated Show resolved Hide resolved
cub/test/test_cdp_variant_state.cu Outdated Show resolved Hide resolved
## cub_cdp_by_lid
#
# If given launcher id corresponds to a CDP launcher, set `out_var` to 1.
function(cub_cdp_by_lid out_var launcher_id)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: rename this function to _cub_launcher_requires_rdc.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please elaborate on this suggestion so that I can understand intention behind it? Is underscore used to add semantics of a "private" function? If so, why cub_get_test_params and cub_add_test do not have leading underscore in their names?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intention of the underscore was to distinguish between "heavy" functions that might be used repeatedly across multiple files as a sort of API vs. helper functions that just simplify code flow in our implementations (like these one-off functions that don't do anything interesting except map one value into another to make life a little easier).

This was more important for thrust, which exposes some functions in the cmake package (thrust_create_target) that are actual user APIs. My goal was to have the more useful functions appear in IDE autocompletion while the helpers wouldn't clutter up the list of completions. It doesn't have any actual impact on correctness, though, it was just a convention.

In this suggestion, the new name was more important with the underscore. launcher_requires_rdc makes it clear that this is used solely to map launcher values to RDC flags.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for elaborating on this! I've changed the function name per recommendation. The convention doesn't seem intuitive. If I'm not missing anything, it's not documented. Maybe it's worth adding cmake developer overview, describing it.

@gevtushenko gevtushenko merged commit 4035cde into NVIDIA:main Nov 17, 2023
516 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

[FEA]: CUDA Graph Capture support for CUB routines
3 participants