[CUDA] Correctly set CUDA default architecture #84017
@llvm/pr-subscribers-clang @llvm/pr-subscribers-clang-driver
Author: Joseph Huber (jhuber6)
Changes
Summary:
Full diff: https://github.com/llvm/llvm-project/pull/84017.diff
1 Files Affected:
diff --git a/clang/lib/Driver/Driver.cpp b/clang/lib/Driver/Driver.cpp
index de8ceb2f0898bb..cecd34acbc92c0 100644
--- a/clang/lib/Driver/Driver.cpp
+++ b/clang/lib/Driver/Driver.cpp
@@ -3234,7 +3234,7 @@ class OffloadingActionBuilder final {
CudaActionBuilder(Compilation &C, DerivedArgList &Args,
const Driver::InputList &Inputs)
: CudaActionBuilderBase(C, Args, Inputs, Action::OFK_Cuda) {
- DefaultCudaArch = CudaArch::SM_35;
+ DefaultCudaArch = CudaArch::CudaDefault;
}
StringRef getCanonicalOffloadArch(StringRef ArchStr) override {
Summary: We already had a special CUDA default that better tracked the state as of modern CUDA installations. Recently this was bumped up to `sm_52`, but there was a location that wasn't respecting this. Fix that.
Fix tests
- // RUN: not %clang -### --target=x86_64-linux-gnu -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -c %s \
+ // RUN: %clang -### --target=x86_64-linux-gnu --offload-arch=sm_52 -nogpulib -nogpuinc -fopenmp=libomp -c %s \
Please note that this completely dropped `-fopenmp-targets=nvptx64-nvidia-cuda`, so the test likely lost coverage...
$ grep -R ../clang/test/ -e '-fopenmp-targets=nvptx64' -l | sort | uniq | wc -l
94
Sure, but this one is explicitly testing the (in)compatibility of debug options and these run lines are supposed to test them together with OpenMP offloading. Now they don't anymore...
`--offload-arch` and `-fopenmp-targets=` are semantically equivalent in this context. They both enable the OpenMP offloading toolchain targeting NVPTX. The only difference is that `--offload-arch` sets the architecture manually while `-fopenmp-targets=` will look up the architecture through the `nvptx-arch` tool, which is extra work that I don't think is necessary for a test like this.
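To make the distinction in the comment above concrete, here is a minimal sketch of the two paths: an explicitly supplied architecture is used as-is, while the other path asks a detection step for one. The function `resolveOffloadArch` and its callback parameter are hypothetical names made up for illustration and are not part of the clang driver; the callback merely stands in for whatever a tool such as `nvptx-arch` would report.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper (not clang's API): pick the offload architecture.
// ExplicitArch models an architecture given directly on the command line.
// DetectArchs models a query to an external detection tool.
std::optional<std::string>
resolveOffloadArch(const std::optional<std::string> &ExplicitArch,
                   const std::function<std::vector<std::string>()> &DetectArchs) {
  // Manual path: the architecture is known, so no detection runs at all.
  if (ExplicitArch)
    return *ExplicitArch;

  // Lookup path: ask the detection step and take the first result.
  std::vector<std::string> Found = DetectArchs();
  if (Found.empty())
    return std::nullopt; // Nothing detected; the caller decides what to do.
  return Found.front();
}
```

In this sketch the manual path never invokes the callback, which is the "extra work" a test can avoid by naming the architecture itself.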
Summary:
We already had a special CUDA default that better tracked the state as of modern CUDA installations. Recently this was bumped up to `sm_52`, but there was a location that wasn't respecting this. Fix that.
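The kind of drift this change fixes can be sketched in isolation. The enum and the two builder structs below are hypothetical, trimmed-down stand-ins and not the actual clang definitions; they only illustrate why referring to an alias enumerator keeps a call site in sync when the default is bumped, while a hard-coded value silently goes stale.

```cpp
#include <string_view>

// Hypothetical, reduced architecture enum (not clang's real one).
enum class Arch {
  SM_35,
  SM_52,
  SM_70,
  // Alias enumerator: the single place to edit when the default moves.
  Default = SM_52,
};

constexpr std::string_view archName(Arch A) {
  switch (A) {
  case Arch::SM_35: return "sm_35";
  case Arch::SM_52: return "sm_52"; // Also covers Arch::Default.
  case Arch::SM_70: return "sm_70";
  }
  return "unknown";
}

// Before: a hard-coded copy of an older default.
struct StaleBuilder { Arch DefaultArch = Arch::SM_35; };

// After: follows the alias, so it tracks future bumps automatically.
struct FixedBuilder { Arch DefaultArch = Arch::Default; };

static_assert(Arch::Default == Arch::SM_52, "alias resolves to sm_52");
```

With these definitions, `archName(FixedBuilder{}.DefaultArch)` yields `sm_52`, while the stale builder still reports `sm_35`.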