Skip to content

clang/AMDGPU: Use TranslateArgs from the base toolchain instead of the host#198627

Open
arsenm wants to merge 2 commits into
mainfrom
users/arsenm/clang/amdgpu-fix-using-host-TranslateArgs-not-base-toolchain
Open

clang/AMDGPU: Use TranslateArgs from the base toolchain instead of the host#198627
arsenm wants to merge 2 commits into
mainfrom
users/arsenm/clang/amdgpu-fix-using-host-TranslateArgs-not-base-toolchain

Conversation

@arsenm
Copy link
Copy Markdown
Contributor

@arsenm arsenm commented May 19, 2026

This fixes -Xopenmp-target / -Xarch for arbitrary arguments. HIP and OpenMP
had cargo-cult broken implementations of TranslateArgs, which called the host
toolchain's implementation, and then special case transferred either -march
or -mcpu to the device argument list. The respective device forwarding flags
should work for any argument, not just this one. The main feature that needs
to be preserved is the shared filtering of unsupported sanitizers to degrade
them into warnings.

Most of the changes here are dealing with fallout observed when
the host target is darwin. The darwin toolchain happens to have
some hacky statefulness tracking the compile target version, which
gets written and rewritten on argument parsing. To maintain this hack,
there are a few unused calls to getArgsForToolChain; start passing OFK_Host
to these so the offload toolchains don't get confused and think they're in
a non-offload context.

Copy link
Copy Markdown
Contributor Author

arsenm commented May 19, 2026

@arsenm arsenm added backend:AMDGPU clang Clang issues not falling into any other category clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:openmp OpenMP related changes to Clang labels May 19, 2026 — with Graphite App
@arsenm arsenm requested review from jhuber6, lamb-j and ro-i May 19, 2026 20:04
@arsenm arsenm marked this pull request as ready for review May 19, 2026 20:04
@llvmorg-github-actions
Copy link
Copy Markdown

llvmorg-github-actions Bot commented May 19, 2026

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-driver

Author: Matt Arsenault (arsenm)

Changes

This fixes -Xopenmp-target / -Xarch for arbitrary arguments. HIP and OpenMP
had cargo-cult broken implementations of TranslateArgs, which called the host
toolchain's implementation, and then special case transferred either -march
or -mcpu to the device argument list. The respective device forwarding flags
should work for any argument, not just this one. The main feature that needs
to be preserved is the shared filtering of unsupported sanitizers to degrade
them into warnings.

Most of the changes here are dealing with fallout observed when
the host target is darwin. The darwin toolchain happens to have
some hacky statefulness tracking the compile target version, which
gets written and rewritten on argument parsing. To maintain this hack,
there are a few unused calls to getArgsForToolChain; start passing OFK_Host
to these so the offload toolchains don't get confused and think they're in
a non-offload context.


Full diff: https://github.com/llvm/llvm-project/pull/198627.diff

11 Files Affected:

  • (modified) clang/lib/Driver/Driver.cpp (+2-2)
  • (modified) clang/lib/Driver/ToolChain.cpp (+12-3)
  • (modified) clang/lib/Driver/ToolChains/AMDGPU.cpp (+35-8)
  • (modified) clang/lib/Driver/ToolChains/AMDGPU.h (+5-25)
  • (modified) clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp (-43)
  • (modified) clang/lib/Driver/ToolChains/AMDGPUOpenMP.h (-5)
  • (modified) clang/lib/Driver/ToolChains/Gnu.cpp (+15-8)
  • (modified) clang/lib/Driver/ToolChains/HIPAMD.cpp (+5-35)
  • (modified) clang/lib/Driver/ToolChains/HIPAMD.h (-1)
  • (added) clang/test/Driver/offload-darwin-host.hip (+8)
  • (added) clang/test/Driver/openmp-Xopenmp-target-forward-args.c (+10)
diff --git a/clang/lib/Driver/Driver.cpp b/clang/lib/Driver/Driver.cpp
index 4a968a4ce5cc0..7ee4163030a5e 100644
--- a/clang/lib/Driver/Driver.cpp
+++ b/clang/lib/Driver/Driver.cpp
@@ -2709,7 +2709,7 @@ bool Driver::HandleImmediateArgs(Compilation &C) {
     // FIXME: Remove when darwin's toolchain is initialized during construction.
     // FIXME: For some more esoteric targets the default toolchain is not the
     //        correct one.
-    C.getArgsForToolChain(&TC, Triple.getArchName(), Action::OFK_None);
+    C.getArgsForToolChain(&TC, Triple.getArchName(), Action::OFK_Host);
     RegisterEffectiveTriple TripleRAII(TC, Triple);
     switch (RLT) {
     case ToolChain::RLT_CompilerRT:
@@ -6084,7 +6084,7 @@ InputInfoList Driver::BuildJobsForActionNoCache(
     // computed. Get the default arguments for OFK_None to ensure that
     // initialization is performed before processing the offload action.
     // FIXME: Remove when darwin's toolchain is initialized during construction.
-    C.getArgsForToolChain(TC, BoundArch, Action::OFK_None);
+    C.getArgsForToolChain(TC, BoundArch, Action::OFK_Host);
 
     // The offload action is expected to be used in four different situations.
     //
diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp
index 21d61a60224cb..95413c57e7c06 100644
--- a/clang/lib/Driver/ToolChain.cpp
+++ b/clang/lib/Driver/ToolChain.cpp
@@ -519,11 +519,20 @@ SanitizerArgs
 ToolChain::getSanitizerArgs(const llvm::opt::ArgList &JobArgs,
                             StringRef BoundArch,
                             Action::OffloadKind DeviceOffloadKind) const {
+  // When -fno-gpu-sanitize is specified for GPU targets, don't emit
+  // diagnostics about unsupported sanitizers for specific GPU arches,
+  // since sanitizers are disabled for the GPU anyway.
+  bool DiagnoseBoundArchErrors =
+      BoundArchSanitizerArgsChecked.insert(BoundArch).second;
+  if (!BoundArch.empty() && getTriple().isGPU() &&
+      !JobArgs.hasFlag(options::OPT_fgpu_sanitize,
+                       options::OPT_fno_gpu_sanitize, true)) {
+    DiagnoseBoundArchErrors = false;
+  }
+
   SanitizerArgs SanArgs(*this, JobArgs,
                         /*DiagnoseErrors=*/!SanitizerArgsChecked,
-                        /*DiagnoseBoundArchErrors=*/
-                        BoundArchSanitizerArgsChecked.insert(BoundArch).second,
-                        BoundArch, DeviceOffloadKind);
+                        DiagnoseBoundArchErrors, BoundArch, DeviceOffloadKind);
 
   SanitizerArgsChecked = true;
   return SanArgs;
diff --git a/clang/lib/Driver/ToolChains/AMDGPU.cpp b/clang/lib/Driver/ToolChains/AMDGPU.cpp
index bb87817be225e..b6ae49802dc9e 100644
--- a/clang/lib/Driver/ToolChains/AMDGPU.cpp
+++ b/clang/lib/Driver/ToolChains/AMDGPU.cpp
@@ -717,17 +717,15 @@ Tool *AMDGPUToolChain::buildLinker() const {
 DerivedArgList *
 AMDGPUToolChain::TranslateArgs(const DerivedArgList &Args, StringRef BoundArch,
                                Action::OffloadKind DeviceOffloadKind) const {
-
   DerivedArgList *DAL =
       Generic_ELF::TranslateArgs(Args, BoundArch, DeviceOffloadKind);
-
-  const OptTable &Opts = getDriver().getOpts();
-
-  if (!DAL)
+  if (!DAL) {
     DAL = new DerivedArgList(Args.getBaseArgs());
+    for (Arg *A : Args)
+      DAL->append(A);
+  }
 
-  for (Arg *A : Args)
-    DAL->append(A);
+  const OptTable &Opts = getDriver().getOpts();
 
   // Replace -mcpu=native with detected GPU.
   Arg *LastMCPUArg = DAL->getLastArg(options::OPT_mcpu_EQ);
@@ -747,7 +745,13 @@ AMDGPUToolChain::TranslateArgs(const DerivedArgList &Args, StringRef BoundArch,
     }
   }
 
-  checkTargetID(*DAL);
+  if (!BoundArch.empty()) {
+    DAL->eraseArg(options::OPT_mcpu_EQ);
+    DAL->AddJoinedArg(nullptr, Opts.getOption(options::OPT_mcpu_EQ), BoundArch);
+    checkTargetID(*DAL);
+  } else if (DeviceOffloadKind == Action::OFK_None) {
+    checkTargetID(*DAL);
+  }
 
   if (Args.getLastArgValue(options::OPT_x) != "cl")
     return DAL;
@@ -838,6 +842,29 @@ ROCMToolChain::ROCMToolChain(const Driver &D, const llvm::Triple &Triple,
     RocmInstallation->detectDeviceLibrary();
 }
 
+DerivedArgList *
+ROCMToolChain::TranslateArgs(const DerivedArgList &Args, StringRef BoundArch,
+                             Action::OffloadKind DeviceOffloadKind) const {
+  DerivedArgList *DAL =
+      AMDGPUToolChain::TranslateArgs(Args, BoundArch, DeviceOffloadKind);
+
+  // Filter out sanitizer coverage options that are not supported for AMDGPU.
+  for (Arg *A : Args) {
+    // Sanitizer coverage is currently not supported for AMDGPU.
+    if (A->getOption().matches(options::OPT_fsan_cov_Group)) {
+      // Upgrade to error if the option was explicitly specified for device
+      bool IsExplicitDevice =
+          A->getBaseArg().getOption().matches(options::OPT_Xarch_device);
+      getDriver().Diag(IsExplicitDevice
+                           ? diag::err_drv_unsupported_option_for_target
+                           : diag::warn_drv_unsupported_option_for_target)
+          << A->getAsString(Args) << getTriple().str();
+    }
+  }
+
+  return DAL;
+}
+
 void AMDGPUToolChain::addClangTargetOptions(
     const llvm::opt::ArgList &DriverArgs,
     llvm::opt::ArgStringList &CC1Args,
diff --git a/clang/lib/Driver/ToolChains/AMDGPU.h b/clang/lib/Driver/ToolChains/AMDGPU.h
index 2a5e28224808c..5f938fa7e8cbb 100644
--- a/clang/lib/Driver/ToolChains/AMDGPU.h
+++ b/clang/lib/Driver/ToolChains/AMDGPU.h
@@ -141,6 +141,11 @@ class LLVM_LIBRARY_VISIBILITY ROCMToolChain : public AMDGPUToolChain {
 public:
   ROCMToolChain(const Driver &D, const llvm::Triple &Triple,
                 const llvm::opt::ArgList &Args);
+
+  llvm::opt::DerivedArgList *
+  TranslateArgs(const llvm::opt::DerivedArgList &Args, StringRef BoundArch,
+                Action::OffloadKind DeviceOffloadKind) const override;
+
   void
   addClangTargetOptions(const llvm::opt::ArgList &DriverArgs,
                         llvm::opt::ArgStringList &CC1Args,
@@ -151,31 +156,6 @@ class LLVM_LIBRARY_VISIBILITY ROCMToolChain : public AMDGPUToolChain {
   getCommonDeviceLibNames(const llvm::opt::ArgList &DriverArgs,
                           llvm::StringRef TargetID, llvm::StringRef GPUArch,
                           Action::OffloadKind DeviceOffloadingKind) const;
-
-  bool diagnoseUnsupportedOption(const llvm::opt::Arg *A,
-                                 const llvm::opt::DerivedArgList &DAL,
-                                 const llvm::opt::ArgList &DriverArgs,
-                                 const char *Value = nullptr) const {
-    auto &Diags = getDriver().getDiags();
-    bool IsExplicitDevice =
-        A->getBaseArg().getOption().matches(options::OPT_Xarch_device);
-
-    if (Value) {
-      unsigned DiagID =
-          IsExplicitDevice
-              ? clang::diag::err_drv_unsupported_option_part_for_target
-              : clang::diag::warn_drv_unsupported_option_part_for_target;
-      Diags.Report(DiagID) << Value << A->getAsString(DriverArgs)
-                           << getTriple().str();
-    } else {
-      unsigned DiagID =
-          IsExplicitDevice
-              ? clang::diag::err_drv_unsupported_option_for_target
-              : clang::diag::warn_drv_unsupported_option_for_target;
-      Diags.Report(DiagID) << A->getAsString(DAL) << getTriple().str();
-    }
-    return true;
-  }
 };
 
 } // end namespace toolchains
diff --git a/clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp b/clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
index 8c7efbf491187..1303e4e0e6799 100644
--- a/clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
+++ b/clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
@@ -33,8 +33,6 @@ AMDGPUOpenMPToolChain::AMDGPUOpenMPToolChain(const Driver &D,
 void AMDGPUOpenMPToolChain::addClangTargetOptions(
     const llvm::opt::ArgList &DriverArgs, llvm::opt::ArgStringList &CC1Args,
     Action::OffloadKind DeviceOffloadingKind) const {
-  HostTC.addClangTargetOptions(DriverArgs, CC1Args, DeviceOffloadingKind);
-
   assert(DeviceOffloadingKind == Action::OFK_OpenMP &&
          "Only OpenMP offloading kinds are supported.");
 
@@ -53,47 +51,6 @@ void AMDGPUOpenMPToolChain::addClangTargetOptions(
     return;
 }
 
-llvm::opt::DerivedArgList *AMDGPUOpenMPToolChain::TranslateArgs(
-    const llvm::opt::DerivedArgList &Args, StringRef BoundArch,
-    Action::OffloadKind DeviceOffloadKind) const {
-  DerivedArgList *DAL =
-      HostTC.TranslateArgs(Args, BoundArch, DeviceOffloadKind);
-
-  if (!DAL)
-    DAL = new DerivedArgList(Args.getBaseArgs());
-
-  const OptTable &Opts = getDriver().getOpts();
-
-  for (Arg *A : Args) {
-    // Sanitizer coverage is currently not supported for AMDGPU.
-    if (A->getOption().matches(options::OPT_fsan_cov_Group)) {
-      diagnoseUnsupportedOption(A, *DAL, Args);
-      continue;
-    }
-
-    if (A->getOption().matches(options::OPT_fsanitize_EQ) &&
-        !Args.hasFlag(options::OPT_fgpu_sanitize, options::OPT_fno_gpu_sanitize,
-                      true))
-      continue;
-
-    DAL->append(A);
-  }
-
-  if (!BoundArch.empty()) {
-    DAL->eraseArg(options::OPT_march_EQ);
-    DAL->AddJoinedArg(nullptr, Opts.getOption(options::OPT_march_EQ),
-                      BoundArch);
-  }
-
-  return DAL;
-}
-
-void AMDGPUOpenMPToolChain::addClangWarningOptions(
-    ArgStringList &CC1Args) const {
-  AMDGPUToolChain::addClangWarningOptions(CC1Args);
-  HostTC.addClangWarningOptions(CC1Args);
-}
-
 ToolChain::CXXStdlibType
 AMDGPUOpenMPToolChain::GetCXXStdlibType(const ArgList &Args) const {
   return HostTC.GetCXXStdlibType(Args);
diff --git a/clang/lib/Driver/ToolChains/AMDGPUOpenMP.h b/clang/lib/Driver/ToolChains/AMDGPUOpenMP.h
index d030246d02cbb..3eaf4e5e8af91 100644
--- a/clang/lib/Driver/ToolChains/AMDGPUOpenMP.h
+++ b/clang/lib/Driver/ToolChains/AMDGPUOpenMP.h
@@ -33,15 +33,10 @@ class LLVM_LIBRARY_VISIBILITY AMDGPUOpenMPToolChain final
     return &HostTC.getTriple();
   }
 
-  llvm::opt::DerivedArgList *
-  TranslateArgs(const llvm::opt::DerivedArgList &Args, StringRef BoundArch,
-                Action::OffloadKind DeviceOffloadKind) const override;
-
   void
   addClangTargetOptions(const llvm::opt::ArgList &DriverArgs,
                         llvm::opt::ArgStringList &CC1Args,
                         Action::OffloadKind DeviceOffloadKind) const override;
-  void addClangWarningOptions(llvm::opt::ArgStringList &CC1Args) const override;
   CXXStdlibType GetCXXStdlibType(const llvm::opt::ArgList &Args) const override;
   void AddClangCXXStdlibIncludeArgs(
       const llvm::opt::ArgList &Args,
diff --git a/clang/lib/Driver/ToolChains/Gnu.cpp b/clang/lib/Driver/ToolChains/Gnu.cpp
index 131dd725c7289..95453fd7076eb 100644
--- a/clang/lib/Driver/ToolChains/Gnu.cpp
+++ b/clang/lib/Driver/ToolChains/Gnu.cpp
@@ -3458,8 +3458,8 @@ llvm::opt::DerivedArgList *
 Generic_GCC::TranslateArgs(const llvm::opt::DerivedArgList &Args,
                            StringRef BoundArch,
                            Action::OffloadKind DeviceOffloadKind) const {
-  if (DeviceOffloadKind != Action::OFK_SYCL &&
-      DeviceOffloadKind != Action::OFK_OpenMP)
+  if (DeviceOffloadKind == Action::OFK_None ||
+      DeviceOffloadKind == Action::OFK_Host)
     return nullptr;
 
   DerivedArgList *DAL = new DerivedArgList(Args.getBaseArgs());
@@ -3485,12 +3485,19 @@ Generic_GCC::TranslateArgs(const llvm::opt::DerivedArgList &Args,
 
   // Add the bound architecture to the arguments list if present.
   if (!BoundArch.empty()) {
-    options::ID Opt =
-        getTriple().isARM() || getTriple().isPPC() || getTriple().isAArch64()
-            ? options::OPT_mcpu_EQ
-            : options::OPT_march_EQ;
-    DAL->eraseArg(Opt);
-    DAL->AddJoinedArg(nullptr, Opts.getOption(Opt), BoundArch);
+    if (getTriple().isAMDGPU()) {
+      for (options::ID OptID : {options::OPT_mcpu_EQ, options::OPT_march_EQ}) {
+        DAL->eraseArg(OptID);
+        DAL->AddJoinedArg(nullptr, Opts.getOption(OptID), BoundArch);
+      }
+    } else {
+      options::ID Opt =
+          getTriple().isARM() || getTriple().isPPC() || getTriple().isAArch64()
+              ? options::OPT_mcpu_EQ
+              : options::OPT_march_EQ;
+      DAL->eraseArg(Opt);
+      DAL->AddJoinedArg(nullptr, Opts.getOption(Opt), BoundArch);
+    }
   }
   return DAL;
 }
diff --git a/clang/lib/Driver/ToolChains/HIPAMD.cpp b/clang/lib/Driver/ToolChains/HIPAMD.cpp
index 9211803aa6a2f..1d833076b0a02 100644
--- a/clang/lib/Driver/ToolChains/HIPAMD.cpp
+++ b/clang/lib/Driver/ToolChains/HIPAMD.cpp
@@ -235,8 +235,6 @@ HIPAMDToolChain::HIPAMDToolChain(const Driver &D, const llvm::Triple &Triple,
 void HIPAMDToolChain::addClangTargetOptions(
     const llvm::opt::ArgList &DriverArgs, llvm::opt::ArgStringList &CC1Args,
     Action::OffloadKind DeviceOffloadingKind) const {
-  HostTC.addClangTargetOptions(DriverArgs, CC1Args, DeviceOffloadingKind);
-
   assert(DeviceOffloadingKind == Action::OFK_HIP &&
          "Only HIP offloading kinds are supported for GPUs.");
 
@@ -292,37 +290,14 @@ llvm::opt::DerivedArgList *
 HIPAMDToolChain::TranslateArgs(const llvm::opt::DerivedArgList &Args,
                                StringRef BoundArch,
                                Action::OffloadKind DeviceOffloadKind) const {
-  DerivedArgList *DAL =
-      HostTC.TranslateArgs(Args, BoundArch, DeviceOffloadKind);
-  if (!DAL)
-    DAL = new DerivedArgList(Args.getBaseArgs());
-
-  const OptTable &Opts = getDriver().getOpts();
-
-  for (Arg *A : Args) {
-    // Sanitizer coverage is currently not supported for AMDGPU.
-    if (A->getOption().matches(options::OPT_fsan_cov_Group)) {
-      diagnoseUnsupportedOption(A, *DAL, Args);
-      continue;
-    }
-
-    if (A->getOption().matches(options::OPT_fsanitize_EQ) &&
-        !Args.hasFlag(options::OPT_fgpu_sanitize, options::OPT_fno_gpu_sanitize,
-                      true))
-      continue;
-
-    DAL->append(A);
-  }
+  llvm::opt::DerivedArgList *DAL =
+      ROCMToolChain::TranslateArgs(Args, BoundArch, DeviceOffloadKind);
 
-  if (!BoundArch.empty()) {
-    DAL->eraseArg(options::OPT_mcpu_EQ);
-    DAL->AddJoinedArg(nullptr, Opts.getOption(options::OPT_mcpu_EQ), BoundArch);
-    checkTargetID(*DAL);
-  }
-
-  if (!Args.hasArg(options::OPT_flto_partitions_EQ))
+  if (!Args.hasArg(options::OPT_flto_partitions_EQ)) {
+    const OptTable &Opts = getDriver().getOpts();
     DAL->AddJoinedArg(nullptr, Opts.getOption(options::OPT_flto_partitions_EQ),
                       "8");
+  }
 
   return DAL;
 }
@@ -333,11 +308,6 @@ Tool *HIPAMDToolChain::buildLinker() const {
   return new tools::AMDGCN::Linker(*this);
 }
 
-void HIPAMDToolChain::addClangWarningOptions(ArgStringList &CC1Args) const {
-  AMDGPUToolChain::addClangWarningOptions(CC1Args);
-  HostTC.addClangWarningOptions(CC1Args);
-}
-
 ToolChain::CXXStdlibType
 HIPAMDToolChain::GetCXXStdlibType(const ArgList &Args) const {
   return HostTC.GetCXXStdlibType(Args);
diff --git a/clang/lib/Driver/ToolChains/HIPAMD.h b/clang/lib/Driver/ToolChains/HIPAMD.h
index ef1ebb83c1023..fd11da6d9096c 100644
--- a/clang/lib/Driver/ToolChains/HIPAMD.h
+++ b/clang/lib/Driver/ToolChains/HIPAMD.h
@@ -69,7 +69,6 @@ class LLVM_LIBRARY_VISIBILITY HIPAMDToolChain final : public ROCMToolChain {
   addClangTargetOptions(const llvm::opt::ArgList &DriverArgs,
                         llvm::opt::ArgStringList &CC1Args,
                         Action::OffloadKind DeviceOffloadKind) const override;
-  void addClangWarningOptions(llvm::opt::ArgStringList &CC1Args) const override;
   CXXStdlibType GetCXXStdlibType(const llvm::opt::ArgList &Args) const override;
   void
   AddClangSystemIncludeArgs(const llvm::opt::ArgList &DriverArgs,
diff --git a/clang/test/Driver/offload-darwin-host.hip b/clang/test/Driver/offload-darwin-host.hip
new file mode 100644
index 0000000000000..3a3ab7267bd70
--- /dev/null
+++ b/clang/test/Driver/offload-darwin-host.hip
@@ -0,0 +1,8 @@
+// Make sure the darwin host toolchain does not assert with offload
+// languages. The darwin toolchain has a platform initialization hack which was
+// broken by offload toolchains trying to perform extra host argument handling.
+
+// RUN: %clang -### --target=arm64-apple-darwin24.6.0 -mmacos-version-min=10.9 --offload-arch=gfx900 -nogpulib -nogpuinc %s 2>&1 | FileCheck %s
+
+// CHECK: "-cc1" "-triple" "amdgcn-amd-amdhsa" "-aux-triple" "arm64-apple-darwin24.6.0"
+// CHECK: "-cc1" "-triple" "arm64-apple-macosx10.9.0" "-aux-triple" "amdgcn-amd-amdhsa"
diff --git a/clang/test/Driver/openmp-Xopenmp-target-forward-args.c b/clang/test/Driver/openmp-Xopenmp-target-forward-args.c
new file mode 100644
index 0000000000000..5749df973ed78
--- /dev/null
+++ b/clang/test/Driver/openmp-Xopenmp-target-forward-args.c
@@ -0,0 +1,10 @@
+// Check that -Xopenmp-target forwards arbitrary arguments, not just -march
+
+// RUN:   %clang -### --target=x86_64-pc-linux -no-canonical-prefixes -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march=gfx900 -Xopenmp-target=amdgcn-amd-amdhsa -ffinite-math-only -nogpulib %s 2>&1 \
+// RUN:   | FileCheck %s
+
+// Flag should only apply to device, not the host.
+
+// CHECK-NOT: -ffinite-math-only
+// CHECK: "-triple" "amdgcn-amd-amdhsa" {{.*}} "-ffinite-math-only"
+// CHECK-NOT: -ffinite-math-only

Copy link
Copy Markdown
Contributor

@jhuber6 jhuber6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely in favor of cleaning this up

if (!DAL)
if (!DAL) {
DAL = new DerivedArgList(Args.getBaseArgs());
for (Arg *A : Args)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was there an issue when we did this unconditionally?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The base implementation is doing the same thing, so this was duplicating all the flags but I don't remember what the symptom was

Comment thread clang/lib/Driver/ToolChains/Gnu.cpp Outdated
: options::OPT_march_EQ;
DAL->eraseArg(Opt);
DAL->AddJoinedArg(nullptr, Opts.getOption(Opt), BoundArch);
if (getTriple().isAMDGPU()) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this need to be split on AMDGPU?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other targets were choosing -mcpu or -march. This is handling both, so

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really hate the fact that we use -march on AMDGPU for OpenMP. Do you think there's a way we could just rewrite that early on so we don't need to pretend like it every works? Alternatively, we just make it an error and make people update their compiler flags. It's been at least like 3 or 4 years since we started telling people to use --offload-arch instead, and -march only exists for the old OpenMP path.

Copy link
Copy Markdown
Contributor Author

@arsenm arsenm May 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC -march is the preferred one for x86. I don't see why we can't just handle both. Not sure where it would be replaced

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should only need to check -mcpu now. AMDGPU accepting -march seems to have mostly snuck in at some point and no one noticed, but I at least tried to make it consistent internally.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as naming -march kind of makes more sense for a GPU

@arsenm arsenm force-pushed the users/arsenm/clang/amdgpu-fix-using-host-TranslateArgs-not-base-toolchain branch from d42c150 to 7d4122a Compare May 21, 2026 12:15
@arsenm arsenm force-pushed the users/arsenm/clang/refactor-offload-sanitizer-arg-diagnostics branch from c8d00a6 to 446822b Compare May 21, 2026 12:15
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 21, 2026

✅ With the latest revision this PR passed the C/C++ code formatter.

@arsenm arsenm force-pushed the users/arsenm/clang/amdgpu-fix-using-host-TranslateArgs-not-base-toolchain branch 2 times, most recently from aad5469 to 3346c9f Compare May 21, 2026 12:43
arsenm and others added 2 commits May 21, 2026 16:20
Previously the AMDGPU toolchains hackily handled -fsanitize arguments.
They would lie and report that all host side sanitizers are available,
then TranslateArgs would filter out the device side cases that do not
work, providing diagnostics for the skipped cases. Move that logic
into the base sanitizer argument parsing.

This makes the produced diagnostics more consistent. Previously we
would get repeated warnings when a sanitizer is fully unsupported
by amdgpu, which should now be once for the toolchain. These could
be further improved; we're printing the specific field of -fsanitize
in more cases where it could be skipped. In other cases we have the
opposite problem, where we aren't reporting the exact sanitizer
from the -f flag in the case that depends on a subtarget feature.

This will help fix other broken target specific flag forwarding bugs
in the future.

Co-authored-by: Claude Sonnet 4 <noreply@anthropic.com>
…e host

This fixes -Xopenmp-target / -Xarch for arbitrary arguments. HIP and OpenMP
had cargo-cult broken implementations of TranslateArgs, which called the host
toolchain's implementation, and then special case  transferred either -march
or -mcpu to the device argument list. The respective device forwarding flags
should work for any argument, not just this one. The main feature that needs
to be preserved is the shared filtering of unsupported sanitizers to degrade
them into warnings.

Most of the changes here are dealing with fallout observed when
the host target is darwin. The darwin toolchain happens to have
some hacky statefulness tracking the compile target version, which
gets written and rewritten on argument parsing. To maintain this hack,
there are a few unused calls to getArgsForToolChain; start passing OFK_Host
to these so the offload toolchains don't get confused and think they're in
a non-offload context.
@arsenm arsenm force-pushed the users/arsenm/clang/amdgpu-fix-using-host-TranslateArgs-not-base-toolchain branch from 3346c9f to e48fb36 Compare May 21, 2026 15:21
@arsenm arsenm force-pushed the users/arsenm/clang/refactor-offload-sanitizer-arg-diagnostics branch from 446822b to 6a13b99 Compare May 21, 2026 15:21
Base automatically changed from users/arsenm/clang/refactor-offload-sanitizer-arg-diagnostics to main May 21, 2026 15:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend:AMDGPU clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:openmp OpenMP related changes to Clang clang Clang issues not falling into any other category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants