Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CUDA] Correctly set CUDA default architecture #84017

Merged
merged 1 commit into from
Mar 6, 2024

Conversation

jhuber6
Copy link
Contributor

@jhuber6 jhuber6 commented Mar 5, 2024

Summary:
We already had a special CUDA default that better tracked the state as
of modern CUDA installations. Recently this was bumped up to sm_52,
but there was a location that wasn't respecting this. Fix that.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' labels Mar 5, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Mar 5, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-driver

Author: Joseph Huber (jhuber6)

Changes

Summary:
We already had a special CUDA default that better tracked the state as
of modern CUDA installations. Recently this was bumped up to sm_52,
but there was a location that wasn't respecting this. Fix that.


Full diff: https://github.com/llvm/llvm-project/pull/84017.diff

1 Files Affected:

  • (modified) clang/lib/Driver/Driver.cpp (+1-1)
diff --git a/clang/lib/Driver/Driver.cpp b/clang/lib/Driver/Driver.cpp
index de8ceb2f0898bb..cecd34acbc92c0 100644
--- a/clang/lib/Driver/Driver.cpp
+++ b/clang/lib/Driver/Driver.cpp
@@ -3234,7 +3234,7 @@ class OffloadingActionBuilder final {
     CudaActionBuilder(Compilation &C, DerivedArgList &Args,
                       const Driver::InputList &Inputs)
         : CudaActionBuilderBase(C, Args, Inputs, Action::OFK_Cuda) {
-      DefaultCudaArch = CudaArch::SM_35;
+      DefaultCudaArch = CudaArch::CudaDefault;
     }
 
     StringRef getCanonicalOffloadArch(StringRef ArchStr) override {

Summary:
We already had a special CUDA default that better tracked the state as
of modern CUDA installations. Recently this was bumped up to `sm_52`,
but there was a location that wasn't respecting this. Fix that.

Fix tests
@jhuber6 jhuber6 merged commit 433b711 into llvm:main Mar 6, 2024
4 checks passed
// RUN: not %clang -### --target=x86_64-linux-gnu -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -c %s \
// RUN: %clang -### --target=x86_64-linux-gnu --offload-arch=sm_52 -nogpulib -nogpuinc -fopenmp=libomp -c %s \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please note that this completely dropped -fopenmp-targets=nvptx64-nvidia-cuda, so the test likely lost coverage...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

$ grep -R ../clang/test/ -e '-fopenmp-targets=nvptx64' -l | sort | uniq | wc -l
94

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, but this one is explicitly testing the (in)compatibility of debug options and these run lines are supposed to test them together with OpenMP offloading. Now they don't anymore...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--offload-arch and and -fopenmp-targets= are semantically equivalent in this context. They both enable the OpenMP offloding toolchain targeting NVPTX. The only difference is that --offload-arch sets the architecture manually while -fopenmp-targets= will look up the architecture through the nvptx-arch tool, which is extra work that I don't think is necessary for a test like this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants