Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[REVIEW] Build only compute for the newest arch in CMAKE_CUDA_ARCHITECTURES #706

Conversation

robertmaynard
Copy link
Contributor

RMM doesn't need to build sm and compute for each architecture, but only for the newest arch.

@robertmaynard robertmaynard requested a review from a team as a code owner February 16, 2021 21:46
@github-actions github-actions bot added the CMake label Feb 16, 2021
@kkraus14 kkraus14 added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Feb 16, 2021
@kkraus14
Copy link
Contributor

@harrism any chance you could review before we merge? I'm not super familiar with nvcc gencode options 😄

# newest arch only to build that way while the rest built only for sm.
list(SORT CMAKE_CUDA_ARCHITECTURES ORDER ASCENDING)
list(POP_BACK CMAKE_CUDA_ARCHITECTURES latest_arch)
list(TRANSFORM CMAKE_CUDA_ARCHITECTURES APPEND "-real")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is -real a cmake command? What does this do?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-real and -virtual are special keywords that can be used with CMAKE_CUDA_ARCHITECTURES to provide abstractions around different CUDA compilers code generation API.

For nvcc:

input compiler invocation
80 --generate-code=arch=compute_80,code=[sm_80,compute_80]
80-virtual --generate-code=arch=compute_80,code=compute_80
80-real --generate-code=arch=compute_80,code=sm_80

Copy link
Member

@harrism harrism Feb 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I see the output of this command when CMAKE_CUDA_ARCHITECTURES is unset? I see above now.

We want SASS for all architectures we support, right? If we only include SASS ("-real"/) for 80, then users with anything but Ampere GPUs will experience looooong load/import times due to PTX-JIT to their present architecture. We do need to include PTX, but only for those who have GPUs we don't officially support (e.g. forward compatibility).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, I had it backwards. The -real is appended to all but the last entry. I thought it was only being appended to the last entry. All good.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want SASS for all architectures we support, right? If we only include SASS ("-real"/) for 80, then users with anything but Ampere GPUs will experience looooong load/import times

You are correct. The code above is sneaky, as what we do is remove the 'newest' and only apply -real to any existing values. So input 70,80 becomes 70-real, 80 and input 80 becomes 80

@harrism harrism added the 5 - Ready to Merge Testing and reviews complete, ready to merge label Feb 17, 2021
@harrism
Copy link
Member

harrism commented Feb 17, 2021

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 69a7541 into rapidsai:branch-0.19 Feb 17, 2021
@robertmaynard robertmaynard deleted the build_only_compute_for_newest_arch branch February 18, 2021 18:33
rapids-bot bot pushed a commit that referenced this pull request Feb 24, 2021
Based on #706 as they both modify `cmake/Modules/SetGPUArchs.cmake`

This brings rmm's handling of `CMAKE_CUDA_ARCHITECTURES` to match that is proposed for cudf in rapidsai/cudf#7391

Authors:
  - Robert Maynard (@robertmaynard)

Approvers:
  - Mark Harris (@harrism)
  - Keith Kraus (@kkraus14)

URL: #709
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
5 - Ready to Merge Testing and reviews complete, ready to merge CMake improvement Improvement / enhancement to an existing function non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants