Skip to content

[Clang][NFC] Clean up OpenMP offload toolchain generation #145549

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jun 25, 2025

Conversation

jhuber6
Copy link
Contributor

@jhuber6 jhuber6 commented Jun 24, 2025

Summary:
Small cleanup prior to some larger changes in this area. I want to move
the offload arch detection to be done solely here.

This might change the order the targets show up in but shouldn't affect
anything else. Will look into that.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' labels Jun 24, 2025
@llvmbot
Copy link
Member

llvmbot commented Jun 24, 2025

@llvm/pr-subscribers-clang-driver

@llvm/pr-subscribers-clang

Author: Joseph Huber (jhuber6)

Changes

Summary:
Small cleanup prior to some larger changes in this area. I want to move
the offload arch detection to be done solely here.

This might change the order the targets show up in but shouldn't affect
anything else. Will look into that.


Full diff: https://github.com/llvm/llvm-project/pull/145549.diff

2 Files Affected:

  • (modified) clang/lib/Driver/Driver.cpp (+53-63)
  • (modified) clang/test/Driver/offload-Xarch.c (+2-2)
diff --git a/clang/lib/Driver/Driver.cpp b/clang/lib/Driver/Driver.cpp
index 2d055ffa17a8f..fd7fce114eca2 100644
--- a/clang/lib/Driver/Driver.cpp
+++ b/clang/lib/Driver/Driver.cpp
@@ -1044,82 +1044,72 @@ void Driver::CreateOffloadingDeviceToolChains(Compilation &C,
             << OpenMPTargets->getAsString(C.getInputArgs());
         return;
       }
-      for (StringRef T : OpenMPTargets->getValues())
-        OpenMPTriples.insert(T);
+      for (StringRef T : OpenMPTargets->getValues()) {
+        llvm::Triple TT(ToolChain::getOpenMPTriple(T));
+        std::string NormalizedName = TT.normalize();
+
+        // Make sure we don't have a duplicate triple.
+        auto [TripleIt, Inserted] =
+            FoundNormalizedTriples.try_emplace(NormalizedName, T);
+        if (!Inserted) {
+          Diag(clang::diag::warn_drv_omp_offload_target_duplicate)
+              << T << TripleIt->second;
+          continue;
+        }
+
+        // If the specified target is invalid, emit a diagnostic.
+        if (TT.getArch() == llvm::Triple::UnknownArch) {
+          Diag(clang::diag::err_drv_invalid_omp_target) << T;
+          continue;
+        }
+
+        auto &TC = getOffloadToolChain(C.getInputArgs(), Action::OFK_OpenMP, TT,
+                                       C.getDefaultToolChain().getTriple());
+        C.addOffloadDeviceToolChain(&TC, Action::OFK_OpenMP);
+      }
     } else if (C.getInputArgs().hasArg(options::OPT_offload_arch_EQ) &&
                ((!IsHIP && !IsCuda) || UseLLVMOffload)) {
       const ToolChain *HostTC = C.getSingleOffloadToolChain<Action::OFK_Host>();
-      auto AMDTriple = getHIPOffloadTargetTriple(*this, C.getInputArgs());
-      auto NVPTXTriple = getNVIDIAOffloadTargetTriple(*this, C.getInputArgs(),
-                                                      HostTC->getTriple());
+      llvm::Triple AMDTriple("amdgcn-amd-amdhsa");
+      llvm::Triple NVPTXTriple("nvptx64-nvidia-cuda");
 
       // Attempt to deduce the offloading triple from the set of architectures.
       // We can only correctly deduce NVPTX / AMDGPU triples currently.
-      // We need to temporarily create these toolchains so that we can access
-      // tools for inferring architectures.
-      llvm::DenseSet<StringRef> Archs;
-      for (const std::optional<llvm::Triple> &TT : {NVPTXTriple, AMDTriple}) {
-        if (!TT)
-          continue;
-
-        auto &TC =
-            getOffloadToolChain(C.getInputArgs(), Action::OFK_OpenMP, *TT,
-                                C.getDefaultToolChain().getTriple());
-        for (StringRef Arch :
-             getOffloadArchs(C, C.getArgs(), Action::OFK_OpenMP, &TC, true))
-          Archs.insert(Arch);
-      }
+      for (const llvm::Triple &TT : {AMDTriple, NVPTXTriple}) {
+        auto &TC = getOffloadToolChain(C.getInputArgs(), Action::OFK_OpenMP, TT,
+                                       C.getDefaultToolChain().getTriple());
+
+        llvm::DenseSet<StringRef> Archs =
+            getOffloadArchs(C, C.getArgs(), Action::OFK_OpenMP, &TC, true);
+        llvm::DenseSet<StringRef> ArchsForTarget;
+        for (StringRef Arch : Archs) {
+          bool IsNVPTX = IsNVIDIAOffloadArch(
+              StringToOffloadArch(getProcessorFromTargetID(NVPTXTriple, Arch)));
+          bool IsAMDGPU = IsAMDOffloadArch(
+              StringToOffloadArch(getProcessorFromTargetID(AMDTriple, Arch)));
+          if (!IsNVPTX && !IsAMDGPU && !Arch.equals_insensitive("native")) {
+            Diag(clang::diag::err_drv_failed_to_deduce_target_from_arch)
+                << Arch;
+            return;
+          }
 
-      for (StringRef Arch : Archs) {
-        if (NVPTXTriple && IsNVIDIAOffloadArch(StringToOffloadArch(
-                               getProcessorFromTargetID(*NVPTXTriple, Arch)))) {
-          DerivedArchs[NVPTXTriple->getTriple()].insert(Arch);
-        } else if (AMDTriple &&
-                   IsAMDOffloadArch(StringToOffloadArch(
-                       getProcessorFromTargetID(*AMDTriple, Arch)))) {
-          DerivedArchs[AMDTriple->getTriple()].insert(Arch);
-        } else {
-          Diag(clang::diag::err_drv_failed_to_deduce_target_from_arch) << Arch;
-          return;
+          if (TT.isNVPTX() && IsNVPTX) {
+            ArchsForTarget.insert(Arch);
+          } else if (TT.isAMDGPU() && IsAMDGPU) {
+            ArchsForTarget.insert(Arch);
+          }
+        }
+        if (!ArchsForTarget.empty()) {
+          C.addOffloadDeviceToolChain(&TC, Action::OFK_OpenMP);
+          KnownArchs[&TC] = ArchsForTarget;
         }
       }
 
       // If the set is empty then we failed to find a native architecture.
-      if (Archs.empty()) {
+      auto TCRange = C.getOffloadToolChains(Action::OFK_OpenMP);
+      if (TCRange.first == TCRange.second)
         Diag(clang::diag::err_drv_failed_to_deduce_target_from_arch)
             << "native";
-        return;
-      }
-
-      for (const auto &TripleAndArchs : DerivedArchs)
-        OpenMPTriples.insert(TripleAndArchs.first());
-    }
-
-    for (StringRef Val : OpenMPTriples) {
-      llvm::Triple TT(ToolChain::getOpenMPTriple(Val));
-      std::string NormalizedName = TT.normalize();
-
-      // Make sure we don't have a duplicate triple.
-      auto [TripleIt, Inserted] =
-          FoundNormalizedTriples.try_emplace(NormalizedName, Val);
-      if (!Inserted) {
-        Diag(clang::diag::warn_drv_omp_offload_target_duplicate)
-            << Val << TripleIt->second;
-        continue;
-      }
-
-      // If the specified target is invalid, emit a diagnostic.
-      if (TT.getArch() == llvm::Triple::UnknownArch) {
-        Diag(clang::diag::err_drv_invalid_omp_target) << Val;
-        continue;
-      }
-
-      auto &TC = getOffloadToolChain(C.getInputArgs(), Action::OFK_OpenMP, TT,
-                                     C.getDefaultToolChain().getTriple());
-      C.addOffloadDeviceToolChain(&TC, Action::OFK_OpenMP);
-      auto It = DerivedArchs.find(TT.getTriple());
-      if (It != DerivedArchs.end())
-        KnownArchs[&TC] = It->second;
     }
   } else if (C.getInputArgs().hasArg(options::OPT_fopenmp_targets_EQ)) {
     Diag(clang::diag::err_drv_expecting_fopenmp_with_fopenmp_targets);
diff --git a/clang/test/Driver/offload-Xarch.c b/clang/test/Driver/offload-Xarch.c
index 73943f3e9c7f8..6a17e1b8f3d24 100644
--- a/clang/test/Driver/offload-Xarch.c
+++ b/clang/test/Driver/offload-Xarch.c
@@ -20,12 +20,12 @@
 // RUN: | FileCheck -check-prefix=OPENMP %s
 
 // OPENMP: # "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[INPUT:.+]]"], output: "[[HOST_BC:.+]]"
-// OPENMP: # "amdgcn-amd-amdhsa" - "clang", inputs: ["[[INPUT]]", "[[HOST_BC]]"], output: "[[GFX1030_BC:.+]]"
-// OPENMP: # "amdgcn-amd-amdhsa" - "clang", inputs: ["[[INPUT]]", "[[HOST_BC]]"], output: "[[GFX90A_BC:.+]]"
 // OPENMP: # "nvptx64-nvidia-cuda" - "clang", inputs: ["[[INPUT]]", "[[HOST_BC]]"], output: "[[SM52_PTX:.+]]"
 // OPENMP: # "nvptx64-nvidia-cuda" - "NVPTX::Assembler", inputs: ["[[SM52_PTX]]"], output: "[[SM52_CUBIN:.+]]"
 // OPENMP: # "nvptx64-nvidia-cuda" - "clang", inputs: ["[[INPUT]]", "[[HOST_BC]]"], output: "[[SM60_PTX:.+]]"
 // OPENMP: # "nvptx64-nvidia-cuda" - "NVPTX::Assembler", inputs: ["[[SM60_PTX]]"], output: "[[SM60_CUBIN:.+]]"
+// OPENMP: # "amdgcn-amd-amdhsa" - "clang", inputs: ["[[INPUT]]", "[[HOST_BC]]"], output: "[[GFX1030_BC:.+]]"
+// OPENMP: # "amdgcn-amd-amdhsa" - "clang", inputs: ["[[INPUT]]", "[[HOST_BC]]"], output: "[[GFX90A_BC:.+]]"
 // OPENMP: # "x86_64-unknown-linux-gnu" - "Offload::Packager", inputs: ["[[GFX1030_BC]]", "[[GFX90A_BC]]", "[[SM52_CUBIN]]", "[[SM60_CUBIN]]"], output: "[[BINARY:.+]]"
 // OPENMP: # "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[HOST_BC]]", "[[BINARY]]"], output: "[[HOST_OBJ:.+]]"
 // OPENMP: # "x86_64-unknown-linux-gnu" - "Offload::Linker", inputs: ["[[HOST_OBJ]]"], output: "a.out"

Summary:
Small cleanup prior to some larger changes in this area. I want to move
the offload arch detection to be done solely here.

This might change the order the targets show up in but shouldn't affect
anything else. Will look into that.
@jhuber6 jhuber6 force-pushed the ReworkOpenMPLookup branch from 156da4b to 205affa Compare June 24, 2025 20:30
@jhuber6 jhuber6 changed the title [Clang] Clean up OpenMP offload toolchain generation [Clang][NFC] Clean up OpenMP offload toolchain generation Jun 24, 2025
Comment on lines 1098 to 1102
if (TT.isNVPTX() && IsNVPTX) {
ArchsForTarget.insert(Arch);
} else if (TT.isAMDGPU() && IsAMDGPU) {
ArchsForTarget.insert(Arch);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (TT.isNVPTX() && IsNVPTX) {
ArchsForTarget.insert(Arch);
} else if (TT.isAMDGPU() && IsAMDGPU) {
ArchsForTarget.insert(Arch);
}
if (TT.isNVPTX() && IsNVPTX)
ArchsForTarget.insert(Arch);
else if (TT.isAMDGPU() && IsAMDGPU)
ArchsForTarget.insert(Arch);

}
}

// If the set is empty then we failed to find a native architecture.
if (Archs.empty()) {
auto TCRange = C.getOffloadToolChains(Action::OFK_OpenMP);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No auto

Archs.insert(Arch);
}
for (const llvm::Triple &TT : {AMDTriple, NVPTXTriple}) {
auto &TC = getOffloadToolChain(C.getInputArgs(), Action::OFK_OpenMP, TT,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no auto

continue;
}

auto &TC = getOffloadToolChain(C.getInputArgs(), Action::OFK_OpenMP, TT,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no auto

@jhuber6 jhuber6 force-pushed the ReworkOpenMPLookup branch from cf7d158 to d341c6e Compare June 25, 2025 16:04
@jhuber6 jhuber6 merged commit 029823a into llvm:main Jun 25, 2025
anthonyhatran pushed a commit to anthonyhatran/llvm-project that referenced this pull request Jun 26, 2025
Summary:
Small cleanup prior to some larger changes in this area. I want to move
the offload arch detection to be done solely here.

This might change the order the targets show up in but shouldn't affect
anything else. Will look into that.
rlavaee pushed a commit to rlavaee/llvm-project that referenced this pull request Jul 1, 2025
Summary:
Small cleanup prior to some larger changes in this area. I want to move
the offload arch detection to be done solely here.

This might change the order the targets show up in but shouldn't affect
anything else. Will look into that.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants