[REVIEW] Build only `compute` for the newest arch in CMAKE_CUDA_ARCHITECTURES #706

robertmaynard · 2021-02-16T21:46:09Z

RMM doesn't need to build both `sm` and `compute` for every architecture; `compute` is only needed for the newest arch.

kkraus14 · 2021-02-17T20:56:49Z

@harrism any chance you could review before we merge? I'm not super familiar with nvcc gencode options 😄

harrism · 2021-02-17T21:58:08Z

cmake/Modules/SetGPUArchs.cmake

+# newest arch only to build that way while the rest built only for sm.
+list(SORT CMAKE_CUDA_ARCHITECTURES ORDER ASCENDING)
+list(POP_BACK CMAKE_CUDA_ARCHITECTURES latest_arch)
+list(TRANSFORM CMAKE_CUDA_ARCHITECTURES APPEND "-real")


Is -real a cmake command? What does this do?

https://cmake.org/cmake/help/git-stage/prop_tgt/CUDA_ARCHITECTURES.html

-real and -virtual are special keywords that can be used with CMAKE_CUDA_ARCHITECTURES to provide an abstraction over the code generation APIs of different CUDA compilers.

For nvcc:

| input | compiler invocation |
| --- | --- |
| 80 | `--generate-code=arch=compute_80,code=[sm_80,compute_80]` |
| 80-virtual | `--generate-code=arch=compute_80,code=compute_80` |
| 80-real | `--generate-code=arch=compute_80,code=sm_80` |
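To make the mapping above concrete, here is a minimal Python sketch that reproduces the three rows for any arch number. The function name `nvcc_gencode` is hypothetical and only mirrors the table; it is not CMake's actual implementation.

```python
def nvcc_gencode(entry: str) -> str:
    """Return the nvcc flag for one CMAKE_CUDA_ARCHITECTURES entry.

    Hypothetical helper: it only mirrors the three rows listed for nvcc.
    """
    arch, _, suffix = entry.partition("-")
    if suffix == "real":
        code = f"sm_{arch}"                      # SASS only
    elif suffix == "virtual":
        code = f"compute_{arch}"                 # PTX only
    elif suffix == "":
        code = f"[sm_{arch},compute_{arch}]"     # SASS and PTX
    else:
        raise ValueError(f"unknown suffix in {entry!r}")
    return f"--generate-code=arch=compute_{arch},code={code}"


for entry in ("80", "80-virtual", "80-real"):
    print(entry, nvcc_gencode(entry))
```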

~~Can I see the output of this command when CMAKE_CUDA_ARCHITECTURES is unset?~~ I see above now.

We want SASS for all architectures we support, right? If we only include SASS ("-real") for 80, then users with anything but Ampere GPUs will experience looooong load/import times due to PTX-JIT to their present architecture. We do need to include PTX, but only for those who have GPUs we don't officially support (e.g. forward compatibility).

I see, I had it backwards. The -real is appended to all but the last entry. I thought it was only being appended to the last entry. All good.

> We want SASS for all architectures we support, right? If we only include SASS ("-real") for 80, then users with anything but Ampere GPUs will experience looooong load/import times

You are correct. The code above is sneaky: it removes the newest arch from the list and applies -real only to the values that remain. So input 70,80 becomes 70-real,80 and input 80 stays 80.
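The list manipulation described here can be modeled in a few lines of Python. This is only a sketch of the three `list()` calls shown in the diff; the function name is hypothetical, and the final step of putting the newest arch back is inferred from the stated result, since that line is not part of the quoted diff.

```python
def real_for_all_but_newest(archs):
    """Model of the SetGPUArchs.cmake snippet; expects a non-empty list of arch strings."""
    # list(SORT ... ORDER ASCENDING). Sorted numerically here; CMake's default
    # comparison is by string, which gives the same order for two-digit archs.
    remaining = sorted(archs, key=int)
    # list(POP_BACK ... latest_arch): take the newest arch off the end.
    latest_arch = remaining.pop()
    # list(TRANSFORM ... APPEND "-real"): the older archs get SASS only.
    remaining = [f"{arch}-real" for arch in remaining]
    # Assumed step: the newest arch is re-added without a suffix,
    # so it is built for both SASS and PTX.
    remaining.append(latest_arch)
    return remaining


print(real_for_all_but_newest(["70", "80"]))
print(real_for_all_but_newest(["80"]))
```

With this shape, every supported arch still gets SASS, and only the newest one also carries PTX for forward compatibility.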

harrism · 2021-02-17T22:40:16Z

@gpucibot merge

@robertmaynard

Based on #706, as they both modify `cmake/Modules/SetGPUArchs.cmake`. This brings rmm's handling of `CMAKE_CUDA_ARCHITECTURES` in line with what is proposed for cudf in rapidsai/cudf#7391.

Authors:
- Robert Maynard (@robertmaynard)

Approvers:
- Mark Harris (@harrism)
- Keith Kraus (@kkraus14)

URL: #709

robertmaynard requested a review from a team as a code owner February 16, 2021 21:46

github-actions bot added the CMake label Feb 16, 2021

kkraus14 approved these changes Feb 16, 2021

kkraus14 added the improvement and non-breaking labels Feb 16, 2021

harrism assigned robertmaynard Feb 16, 2021

Build only compute_ for the newest arch in CMAKE_CUDA_ARCHITECTURES

468a8e5

robertmaynard force-pushed the build_only_compute_for_newest_arch branch from 84a6a7b to 468a8e5 Compare February 17, 2021 14:13

robertmaynard mentioned this pull request Feb 17, 2021

[REVIEW] Simplify cmake cuda architectures handling #709

Merged

harrism reviewed Feb 17, 2021

harrism approved these changes Feb 17, 2021

harrism added the 5 - Ready to Merge label Feb 17, 2021

rapids-bot bot merged commit 69a7541 into rapidsai:branch-0.19 Feb 17, 2021

robertmaynard deleted the build_only_compute_for_newest_arch branch February 18, 2021 18:33
