Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[clang][FatLTO][UnifiedLTO] Pass -enable-matrix to the LTO driver #77829

Merged
merged 1 commit into from
Jan 12, 2024

Conversation

ormris
Copy link
Collaborator

@ormris ormris commented Jan 11, 2024

Unified LTO and Fat LTO do not use the regular LTO prelink pipeline when -flto/-flto=full is specified on the command line, thus they require LowerMatrixIntrinsicsPass to be run during the link stage. To enable this, we pass -enable-matrix to the LTO driver, replicating ThinLTO behavior.

This fixes #77621.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' labels Jan 11, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Jan 11, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-driver

Author: Matthew Voss (ormris)

Changes

Unified LTO and Fat LTO do not use the regular LTO prelink pipeline when -flto/-flto=full is specified on the command line, thus they require LowerMatrixIntrinsicsPass to be run during the link stage. To enable this, we pass -enable-matrix to the LTO driver, replicating ThinLTO behavior.

This fixes #77621.


Full diff: https://github.com/llvm/llvm-project/pull/77829.diff

2 Files Affected:

  • (modified) clang/lib/Driver/ToolChains/CommonArgs.cpp (+5-2)
  • (modified) clang/test/Driver/matrix.c (+12)
diff --git a/clang/lib/Driver/ToolChains/CommonArgs.cpp b/clang/lib/Driver/ToolChains/CommonArgs.cpp
index 2340191ca97d98..385f66f3782bc1 100644
--- a/clang/lib/Driver/ToolChains/CommonArgs.cpp
+++ b/clang/lib/Driver/ToolChains/CommonArgs.cpp
@@ -736,6 +736,8 @@ void tools::addLTOOptions(const ToolChain &ToolChain, const ArgList &Args,
   const bool IsAMDGCN = ToolChain.getTriple().isAMDGCN();
   const char *Linker = Args.MakeArgString(ToolChain.GetLinkerPath());
   const Driver &D = ToolChain.getDriver();
+  const bool IsFatLTO = Args.hasArg(options::OPT_ffat_lto_objects);
+  const bool IsUnifiedLTO = Args.hasArg(options::OPT_funified_lto);
   if (llvm::sys::path::filename(Linker) != "ld.lld" &&
       llvm::sys::path::stem(Linker) != "ld.lld" &&
       !ToolChain.getTriple().isOSOpenBSD()) {
@@ -765,7 +767,7 @@ void tools::addLTOOptions(const ToolChain &ToolChain, const ArgList &Args,
   } else {
     // Tell LLD to find and use .llvm.lto section in regular relocatable object
     // files
-    if (Args.hasArg(options::OPT_ffat_lto_objects))
+    if (IsFatLTO)
       CmdArgs.push_back("--fat-lto-objects");
   }
 
@@ -825,7 +827,8 @@ void tools::addLTOOptions(const ToolChain &ToolChain, const ArgList &Args,
   // Matrix intrinsic lowering happens at link time with ThinLTO. Enable
   // LowerMatrixIntrinsicsPass, which is transitively called by
   // buildThinLTODefaultPipeline under EnableMatrix.
-  if (IsThinLTO && Args.hasArg(options::OPT_fenable_matrix))
+  if ((IsThinLTO || IsFatLTO || IsUnifiedLTO) &&
+        Args.hasArg(options::OPT_fenable_matrix))
     CmdArgs.push_back(
         Args.MakeArgString(Twine(PluginOptPrefix) + "-enable-matrix"));
 
diff --git a/clang/test/Driver/matrix.c b/clang/test/Driver/matrix.c
index 15b44ce5a4ec15..4d2624ad39c16e 100644
--- a/clang/test/Driver/matrix.c
+++ b/clang/test/Driver/matrix.c
@@ -6,3 +6,15 @@
 // RUN: %clang -flto=thin -fenable-matrix %t.o -### --target=x86_64-unknown-linux 2>&1 \
 // RUN:     | FileCheck %s -check-prefix=CHECK-THINLTO-MATRIX
 // CHECK-THINLTO-MATRIX: "-plugin-opt=-enable-matrix"
+
+// RUN: %clang -flto -ffat-lto-objects -fenable-matrix %t.o -### --target=x86_64-unknown-linux 2>&1 \
+// RUN:     | FileCheck %s -check-prefix=CHECK-THINLTO-MATRIX
+
+// RUN: %clang -flto=thin -ffat-lto-objects -fenable-matrix %t.o -### --target=x86_64-unknown-linux 2>&1 \
+// RUN:     | FileCheck %s -check-prefix=CHECK-THINLTO-MATRIX
+
+// RUN: %clang -flto -funified-lto -fenable-matrix %t.o -### --target=x86_64-unknown-linux 2>&1 \
+// RUN:     | FileCheck %s -check-prefix=CHECK-THINLTO-MATRIX
+
+// RUN: %clang -flto=thin -funified-lto -fenable-matrix %t.o -### --target=x86_64-unknown-linux 2>&1 \
+// RUN:     | FileCheck %s -check-prefix=CHECK-THINLTO-MATRIX

Copy link

github-actions bot commented Jan 11, 2024

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 2bb511e277e501d3faa0f2da0d1c98ea0b515507 b8d3ec5cde2b311747aec83d316af370202b37c1 -- clang/lib/Driver/ToolChains/CommonArgs.cpp clang/test/Driver/matrix.c
View the diff from clang-format here.
diff --git a/clang/lib/Driver/ToolChains/CommonArgs.cpp b/clang/lib/Driver/ToolChains/CommonArgs.cpp
index 385f66f378..6305085087 100644
--- a/clang/lib/Driver/ToolChains/CommonArgs.cpp
+++ b/clang/lib/Driver/ToolChains/CommonArgs.cpp
@@ -828,7 +828,7 @@ void tools::addLTOOptions(const ToolChain &ToolChain, const ArgList &Args,
   // LowerMatrixIntrinsicsPass, which is transitively called by
   // buildThinLTODefaultPipeline under EnableMatrix.
   if ((IsThinLTO || IsFatLTO || IsUnifiedLTO) &&
-        Args.hasArg(options::OPT_fenable_matrix))
+      Args.hasArg(options::OPT_fenable_matrix))
     CmdArgs.push_back(
         Args.MakeArgString(Twine(PluginOptPrefix) + "-enable-matrix"));
 

@MaskRay
Copy link
Member

MaskRay commented Jan 11, 2024

The description should mention https://reviews.llvm.org/D153583 ("[ThinLTO][Matrix] Forward -enable-matrix flag to the LTO plugin"). I think this is correct when used in the ThinLTO mode, but it's not clear whether regular LTO mode is correct (perhaps correct but redundant).

@ormris
Copy link
Collaborator Author

ormris commented Jan 12, 2024

Thanks! I'll make sure to include that review in the commit message. Unified LTO is using the ThinLTO pre-link pipeline, regardless of what's passed to the driver. Since the ThinLTO pre-link pipeline doesn't provide this pass, this option should be required when Unified LTO is used.

Unified LTO and Fat LTO do not use the regular LTO prelink pipeline when
-flto/-flto=full is specified on the command line, thus they require
LowerMatrixIntrinsicsPass to be run during the link stage. To enable
this, we pass -enable-matrix to the LTO driver, replicating ThinLTO
behavior. This fix was applied to ThinLTO in https://reviews.llvm.org/D153583.

This fixes llvm#77621.
@ormris ormris merged commit f626b1f into llvm:main Jan 12, 2024
2 of 4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FatLTO][UnifiedLTO] Unified lto pipeline sometimes misses running LowerMatrixIntrinsicsPass
3 participants