@tuliom tuliom commented Sep 5, 2025

This is not feature complete yet, but is able to link simple executables.

The feature is disabled by default and can be enabled via
-DLLD_LINK_GPL3:BOOL=ON -DLLD_ENABLE_GNU_LTO:BOOL=ON.

This implementation reuses GCC's plugin liblto_plugin.so, loading it at
runtime, when necessary.
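
As a rough illustration of what happens once the plugin has been loaded: the linker hands the plugin's `onload` entry point a "transfer vector" of tagged entries, and the plugin walks it until it reaches a null tag. The sketch below models only that handshake, in-process. The names `Tag`, `TvEntry`, `PluginState`, `buildTv` and this simplified `onload` are hypothetical stand-ins, not the real declarations from GCC's plugin-api.h, and the `dlopen`/`dlsym` step is deliberately left out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, reduced version of the tags a linker can pass to a plugin.
enum Tag { TAG_NULL = 0, TAG_API_VERSION, TAG_OPTION, TAG_OUTPUT_NAME };

// Hypothetical stand-in for one transfer-vector entry.
struct TvEntry {
  Tag tag;
  int val;            // used by TAG_API_VERSION
  const char *string; // used by TAG_OPTION and TAG_OUTPUT_NAME
};

// What the toy "plugin" learned from the linker.
struct PluginState {
  int apiVersion = 0;
  std::vector<std::string> options;
  std::string outputName;
};

// Stand-in for the plugin side: walk the entries until the null tag.
inline PluginState onload(const TvEntry *tv) {
  PluginState st;
  for (const TvEntry *e = tv; e->tag != TAG_NULL; ++e) {
    switch (e->tag) {
    case TAG_API_VERSION:
      st.apiVersion = e->val;
      break;
    case TAG_OPTION:
      st.options.push_back(e->string);
      break;
    case TAG_OUTPUT_NAME:
      st.outputName = e->string;
      break;
    case TAG_NULL:
      break;
    }
  }
  return st;
}

// Linker side: build the vector and always close it with the null tag.
// The strings must outlive the returned vector, which only stores pointers.
inline std::vector<TvEntry> buildTv(const std::vector<std::string> &opts,
                                    const std::string &output) {
  std::vector<TvEntry> tv;
  tv.push_back({TAG_API_VERSION, 1, nullptr});
  for (const std::string &o : opts)
    tv.push_back({TAG_OPTION, 0, o.c_str()});
  tv.push_back({TAG_OUTPUT_NAME, 0, output.c_str()});
  tv.push_back({TAG_NULL, 0, nullptr});
  return tv;
}
```

Growing the vector on demand is one possible way to avoid guessing a fixed capacity up front; whether that fits the patch is a design question for the author.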

Modify current tests in order to support -plugin.
Rename classes BitcodeFile and BitcodeCompiler to IRFile and IRCompiler
respectively.  This helps to reuse their code in the new GCC-related
classes while keeping their names semantically correct.

The part of IRCompiler that is specific to LLVM moves to a new class
called BitcodeCompiler.
The new class GccIRCompiler abstracts the interface to GCC's
liblto_plugin.so.
Implement the most basic functions from GccIRCompiler that let LLD link
simple executables.
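
The class split described above can be pictured with a small, self-contained sketch. Every name ending in `Sketch`, the `makeCompiler` helper and the output suffixes are hypothetical and exist only for illustration; the real classes carry far more state. The sketch assumes a virtual destructor on the base class so that a `std::unique_ptr` to the base can safely own either backend.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for an IR input file.
struct IRFileSketch {
  std::string name;
};

// Shared front end: common bookkeeping plus a backend-specific hook.
class IRCompilerSketch {
public:
  virtual ~IRCompilerSketch() = default; // safe deletion through the base
  void add(const IRFileSketch &f) {
    ++numAdded;   // work common to every backend
    addObject(f); // backend-specific step
  }
  virtual std::vector<std::string> compile() = 0;
  int numAdded = 0;

protected:
  virtual void addObject(const IRFileSketch &f) = 0;
};

// Backend that would hand the file to LLVM's LTO library.
class BitcodeCompilerSketch : public IRCompilerSketch {
public:
  std::vector<std::string> compile() override { return outputs; }

protected:
  void addObject(const IRFileSketch &f) override {
    outputs.push_back(f.name + ".lto.o");
  }

private:
  std::vector<std::string> outputs;
};

// Backend that would forward the file to the GCC plugin.
class GccIRCompilerSketch : public IRCompilerSketch {
public:
  std::vector<std::string> compile() override { return outputs; }

protected:
  void addObject(const IRFileSketch &f) override {
    outputs.push_back(f.name + ".gcc.o");
  }

private:
  std::vector<std::string> outputs;
};

// The driver only sees the base class, whichever backend is chosen.
inline std::unique_ptr<IRCompilerSketch> makeCompiler(bool useGccPlugin) {
  if (useGccPlugin)
    return std::make_unique<GccIRCompilerSketch>();
  return std::make_unique<BitcodeCompilerSketch>();
}
```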

Add a config.h file in order to detect at configure time whether
liblto_plugin supports version 2 of LDPT_REGISTER_CLAIM_FILE_HOOK.
Add LLD_LINK_GPL3 and LLD_ENABLE_GNU_LTO to control whether LLD may
link to or load GPLv3 code and whether support for the GNU LTO format
used by GCC is enabled.
The new GNU LTO test ensures that GCC and LLD are both used and that the
final binary has GNU LTO.
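
To make the configure-time check for version 2 of the claim-file hook more concrete, here is a minimal sketch of how such a macro could select between the two handler shapes. It assumes that the v2 handler takes one extra integer argument, and it defines the macro locally so the fragment compiles on its own; in a real build the value would come from the generated config.h. `InputFileSketch`, `Hooks`, `claim`, `toyV1` and `toyV2` are hypothetical names.

```cpp
#include <cassert>

// Assumption: normally provided by the generated config.h.
#ifndef HAVE_LDPT_REGISTER_CLAIM_FILE_HOOK_V2
#define HAVE_LDPT_REGISTER_CLAIM_FILE_HOOK_V2 1
#endif

// Hypothetical stand-in for the plugin API's input-file descriptor.
struct InputFileSketch {
  const char *name;
};

// Two handler shapes: the v2 form carries an extra "known used" flag.
using ClaimHandler = int (*)(const InputFileSketch *, int *claimed);
using ClaimHandlerV2 = int (*)(const InputFileSketch *, int *claimed,
                               int knownUsed);

struct Hooks {
  ClaimHandler claimFile = nullptr;
  ClaimHandlerV2 claimFileV2 = nullptr;
};

// Prefer the v2 handler when it was compiled in and registered; otherwise
// fall back to v1. Returns -1 when no handler is available.
inline int claim(const Hooks &h, const InputFileSketch &f, int *claimed) {
#if HAVE_LDPT_REGISTER_CLAIM_FILE_HOOK_V2
  if (h.claimFileV2)
    return h.claimFileV2(&f, claimed, /*knownUsed=*/1);
#endif
  if (h.claimFile)
    return h.claimFile(&f, claimed);
  return -1;
}

// Toy handlers that record which version ran.
inline int toyV1(const InputFileSketch *, int *claimed) {
  *claimed = 1;
  return 0;
}
inline int toyV2(const InputFileSketch *, int *claimed, int knownUsed) {
  *claimed = knownUsed ? 2 : 0;
  return 0;
}
```

Checking the pointer at run time as well as the macro at compile time keeps the fallback path usable when an older plugin never registers the v2 handler.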
llvmbot commented Sep 5, 2025

@llvm/pr-subscribers-lld

Author: Tulio Magno Quites Machado Filho (tuliom)

Changes

This is not feature complete yet, but is able to link simple executables.

The feature is disabled by default and can be enabled via
-DLLD_LINK_GPL3:BOOL=ON -DLLD_ENABLE_GNU_LTO:BOOL=ON.

This implementation reuses GCC's plugin liblto_plugin.so, loading it at
runtime, when necessary.


Patch is 29.04 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/157175.diff

18 Files Affected:

  • (modified) lld/CMakeLists.txt (+8)
  • (modified) lld/ELF/CMakeLists.txt (+9)
  • (modified) lld/ELF/Config.h (+6-1)
  • (modified) lld/ELF/Driver.cpp (+69-5)
  • (modified) lld/ELF/InputFiles.cpp (+3-3)
  • (modified) lld/ELF/InputFiles.h (+16-5)
  • (modified) lld/ELF/LTO.cpp (+241-5)
  • (modified) lld/ELF/LTO.h (+69-6)
  • (modified) lld/ELF/Options.td (+4-4)
  • (added) lld/cmake/modules/FindGNULTO.cmake (+38)
  • (added) lld/include/lld/config.h.cmake (+16)
  • (added) lld/test/ELF/Inputs/plugin.so ()
  • (added) lld/test/ELF/gnu-lto/hello.test (+28)
  • (removed) lld/test/ELF/ignore-plugin.test (-2)
  • (modified) lld/test/ELF/lto-plugin-ignore.s (-1)
  • (added) lld/test/ELF/plugin.test (+21)
  • (modified) lld/test/lit.cfg.py (+20)
  • (modified) lld/test/lit.site.cfg.py.in (+2)
diff --git a/lld/CMakeLists.txt b/lld/CMakeLists.txt
index 80e25204a65ee..9481d71b0ca57 100644
--- a/lld/CMakeLists.txt
+++ b/lld/CMakeLists.txt
@@ -177,6 +177,14 @@ if (LLD_DEFAULT_LD_LLD_IS_MINGW)
   add_definitions("-DLLD_DEFAULT_LD_LLD_IS_MINGW=1")
 endif()
 
+option(LLD_LINK_GPL3 "Allow LLD to link to GPLv3-licensed files")
+option(LLD_ENABLE_GNU_LTO "Enable support for GNU LTO plugin")
+if(NOT LLD_LINK_GPL3)
+  # Support for GNU LTO requires linking to liblto_plugin.so from GCC, which is
+  # licensed as GPLv3.
+  set(LLD_ENABLE_GNU_LTO FALSE)
+endif()
+
 if (MSVC)
   add_definitions(-wd4530) # Suppress 'warning C4530: C++ exception handler used, but unwind semantics are not enabled.'
   add_definitions(-wd4062) # Suppress 'warning C4062: enumerator X in switch of enum Y is not handled' from system header.
diff --git a/lld/ELF/CMakeLists.txt b/lld/ELF/CMakeLists.txt
index ec3f6382282b1..82121cb6b2422 100644
--- a/lld/ELF/CMakeLists.txt
+++ b/lld/ELF/CMakeLists.txt
@@ -18,6 +18,15 @@ if(LLVM_ENABLE_ZSTD)
   list(APPEND imported_libs ${zstd_target})
 endif()
 
+set(GNULTO_INCLUDE_DIR "" CACHE PATH "Additional directory, where CMake should search for plugin-api.h")
+if(LLD_ENABLE_GNU_LTO)
+  find_package(GNULTO REQUIRED)
+endif()
+
+configure_file(
+  ${CMAKE_CURRENT_SOURCE_DIR}/../include/lld/config.h.cmake
+  ${CMAKE_CURRENT_BINARY_DIR}/../include/lld/config.h)
+
 add_lld_library(lldELF
   AArch64ErrataFix.cpp
   Arch/AArch64.cpp
diff --git a/lld/ELF/Config.h b/lld/ELF/Config.h
index a83a4c1176f6f..bb9638387a15f 100644
--- a/lld/ELF/Config.h
+++ b/lld/ELF/Config.h
@@ -54,6 +54,7 @@ class TargetInfo;
 struct Ctx;
 struct Partition;
 struct PhdrEntry;
+class IRCompiler;
 
 class BssSection;
 class GdbIndexSection;
@@ -191,6 +192,7 @@ class LinkerDriver {
   void inferMachineType();
   template <class ELFT> void link(llvm::opt::InputArgList &args);
   template <class ELFT> void compileBitcodeFiles(bool skipLinkedOutput);
+  template <class ELFT> void compileGccIRFiles(bool skipLinkedOutput);
   bool tryAddFatLTOFile(MemoryBufferRef mb, StringRef archiveName,
                         uint64_t offsetInArchive, bool lazy);
   // True if we are in --whole-archive and --no-whole-archive.
@@ -199,7 +201,7 @@ class LinkerDriver {
   // True if we are in --start-lib and --end-lib.
   bool inLib = false;
 
-  std::unique_ptr<BitcodeCompiler> lto;
+  std::unique_ptr<IRCompiler> lto;
   SmallVector<std::unique_ptr<InputFile>, 0> files, ltoObjectFiles;
 
 public:
@@ -241,9 +243,12 @@ struct Config {
   llvm::StringRef optRemarksPasses;
   llvm::StringRef optRemarksFormat;
   llvm::StringRef optStatsFilename;
+  llvm::StringRef plugin;
+  llvm::SmallVector<std::string, 0> pluginOpt;
   llvm::StringRef progName;
   llvm::StringRef printArchiveStats;
   llvm::StringRef printSymbolOrder;
+  llvm::StringRef resolutionFile;
   llvm::StringRef soName;
   llvm::StringRef sysroot;
   llvm::StringRef thinLTOCacheDir;
diff --git a/lld/ELF/Driver.cpp b/lld/ELF/Driver.cpp
index 6c2f318ffe469..e0510edd59861 100644
--- a/lld/ELF/Driver.cpp
+++ b/lld/ELF/Driver.cpp
@@ -1786,6 +1786,14 @@ static void readConfigs(Ctx &ctx, opt::InputArgList &args) {
 
   cl::ResetAllOptionOccurrences();
 
+  // Ignore -plugin=LLVMgold.so because we don't need to load it.
+  StringRef v = args.getLastArgValue(OPT_plugin);
+  if (!v.empty() && !v.ends_with("LLVMgold.so"))
+    if (!llvm::sys::fs::exists(v))
+      ErrAlways(ctx) << "Cannot find plugin " << v;
+    else
+      ctx.arg.plugin = v;
+
   // Parse LTO options.
   if (auto *arg = args.getLastArg(OPT_plugin_opt_mcpu_eq))
     parseClangOption(ctx, ctx.saver.save("-mcpu=" + StringRef(arg->getValue())),
@@ -1796,14 +1804,31 @@ static void readConfigs(Ctx &ctx, opt::InputArgList &args) {
                      arg->getSpelling());
 
   // GCC collect2 passes -plugin-opt=path/to/lto-wrapper with an absolute or
-  // relative path. Just ignore. If not ended with "lto-wrapper" (or
-  // "lto-wrapper.exe" for GCC cross-compiled for Windows), consider it an
-  // unsupported LLVMgold.so option and error.
+  // relative path. If not ended with "lto-wrapper" (or "lto-wrapper.exe" for
+  // GCC cross-compiled for Windows), consider it an unsupported LLVMgold.so
+  // option and error.
   for (opt::Arg *arg : args.filtered(OPT_plugin_opt_eq)) {
     StringRef v(arg->getValue());
     if (!v.ends_with("lto-wrapper") && !v.ends_with("lto-wrapper.exe"))
       ErrAlways(ctx) << arg->getSpelling() << ": unknown plugin option '"
                      << arg->getValue() << "'";
+    else if (!ctx.arg.plugin.empty())
+#if LLD_ENABLE_GNU_LTO
+      ctx.arg.pluginOpt.push_back(v.str());
+#else
+      ErrAlways(ctx) << arg->getSpelling() << ": support for GNU LTO is disabled";
+#endif
+  }
+
+  // Parse GCC collect2 options.
+  if (!ctx.arg.plugin.empty()) {
+#if LLD_ENABLE_GNU_LTO
+    StringRef v = args.getLastArgValue(OPT_plugin_opt_fresolution);
+    if (!v.empty()) {
+      ctx.arg.resolutionFile = v;
+      ctx.arg.pluginOpt.push_back(std::string("-fresolution=" + v.str()));
+    }
+#endif
   }
 
   ctx.arg.passPlugins = args::getStrings(args, OPT_load_pass_plugins);
@@ -2683,7 +2708,7 @@ static void markBuffersAsDontNeed(Ctx &ctx, bool skipLinkedOutput) {
 }
 
 // This function is where all the optimizations of link-time
-// optimization takes place. When LTO is in use, some input files are
+// optimization takes place. When LLVM LTO is in use, some input files are
 // not in native object file format but in the LLVM bitcode format.
 // This function compiles bitcode files into a few big native files
 // using LLVM functions and replaces bitcode symbols with the results.
@@ -2732,6 +2757,39 @@ void LinkerDriver::compileBitcodeFiles(bool skipLinkedOutput) {
   }
 }
 
+#if LLD_ENABLE_GNU_LTO
+template <class ELFT>
+void LinkerDriver::compileGccIRFiles(bool skipLinkedOutput) {
+  llvm::TimeTraceScope timeScope("LTO");
+  // Compile files and replace symbols.
+  GccIRCompiler *c = GccIRCompiler::getInstance(ctx);
+  lto.reset(c);
+
+  for (ELFFileBase *file : ctx.objectFiles)
+    c->add(*file);
+
+  ltoObjectFiles = c->compile();
+  for (auto &file : ltoObjectFiles) {
+    auto *obj = cast<ObjFile<ELFT>>(file.get());
+    obj->parse(/*ignoreComdats=*/true);
+
+    // For defined symbols in non-relocatable output,
+    // compute isExported and parse '@'.
+    if (!ctx.arg.relocatable)
+      for (Symbol *sym : obj->getGlobalSymbols()) {
+        if (!sym->isDefined())
+          continue;
+        if (ctx.arg.exportDynamic && sym->computeBinding(ctx) != STB_LOCAL)
+          sym->isExported = true;
+        if (sym->hasVersionSuffix)
+          sym->parseSymbolVersion(ctx);
+      }
+    ctx.objectFiles.push_back(obj);
+  }
+  return;
+}
+#endif
+
 // The --wrap option is a feature to rename symbols so that you can write
 // wrappers for existing functions. If you pass `--wrap=foo`, all
 // occurrences of symbol `foo` are resolved to `__wrap_foo` (so, you are
@@ -3289,7 +3347,13 @@ template <class ELFT> void LinkerDriver::link(opt::InputArgList &args) {
   // except a few linker-synthesized ones will be added to the symbol table.
   const size_t numObjsBeforeLTO = ctx.objectFiles.size();
   const size_t numInputFilesBeforeLTO = ctx.driver.files.size();
-  compileBitcodeFiles<ELFT>(skipLinkedOutput);
+  if (ctx.arg.plugin.empty()) {
+    compileBitcodeFiles<ELFT>(skipLinkedOutput);
+#if LLD_ENABLE_GNU_LTO
+  } else {
+    compileGccIRFiles<ELFT>(skipLinkedOutput);
+#endif
+  }
 
   // Symbol resolution finished. Report backward reference problems,
   // --print-archive-stats=, and --why-extract=.
diff --git a/lld/ELF/InputFiles.cpp b/lld/ELF/InputFiles.cpp
index a5921feb18299..8b9b55c4a5d47 100644
--- a/lld/ELF/InputFiles.cpp
+++ b/lld/ELF/InputFiles.cpp
@@ -1845,8 +1845,8 @@ static bool dtltoAdjustMemberPathIfThinArchive(Ctx &ctx, StringRef archivePath,
   return true;
 }
 
-BitcodeFile::BitcodeFile(Ctx &ctx, MemoryBufferRef mb, StringRef archiveName,
-                         uint64_t offsetInArchive, bool lazy)
+IRFile::IRFile(Ctx &ctx, MemoryBufferRef mb, StringRef archiveName,
+               uint64_t offsetInArchive, bool lazy)
     : InputFile(ctx, BitcodeKind, mb) {
   this->archiveName = archiveName;
   this->lazy = lazy;
@@ -1958,7 +1958,7 @@ void BitcodeFile::parse() {
     addDependentLibrary(ctx, l, this);
 }
 
-void BitcodeFile::parseLazy() {
+void IRFile::parseLazy() {
   numSymbols = obj->symbols().size();
   symbols = std::make_unique<Symbol *[]>(numSymbols);
   for (auto [i, irSym] : llvm::enumerate(obj->symbols())) {
diff --git a/lld/ELF/InputFiles.h b/lld/ELF/InputFiles.h
index ba844ad18f637..551fbb85cb84f 100644
--- a/lld/ELF/InputFiles.h
+++ b/lld/ELF/InputFiles.h
@@ -16,6 +16,7 @@
 #include "lld/Common/Reproduce.h"
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/BinaryFormat/Magic.h"
+#include "llvm/LTO/LTO.h"
 #include "llvm/Object/ELF.h"
 #include "llvm/Support/MemoryBufferRef.h"
 #include "llvm/Support/Threading.h"
@@ -321,15 +322,25 @@ template <class ELFT> class ObjFile : public ELFFileBase {
   ArrayRef<Elf_Word> shndxTable;
 };
 
-class BitcodeFile : public InputFile {
+class IRFile : public InputFile {
 public:
-  BitcodeFile(Ctx &, MemoryBufferRef m, StringRef archiveName,
-              uint64_t offsetInArchive, bool lazy);
+  IRFile(Ctx &ctx, MemoryBufferRef m, StringRef archiveName, uint64_t offsetInArchive,
+         bool lazy);
   static bool classof(const InputFile *f) { return f->kind() == BitcodeKind; }
-  void parse();
+  virtual void parse() = 0;
   void parseLazy();
-  void postParse();
+  virtual void postParse() = 0;
   std::unique_ptr<llvm::lto::InputFile> obj;
+};
+
+class BitcodeFile : public IRFile {
+public:
+  BitcodeFile(Ctx &ctx, MemoryBufferRef m, StringRef archiveName,
+              uint64_t offsetInArchive, bool lazy)
+      : IRFile(ctx, m, archiveName, offsetInArchive, lazy) {}
+  static bool classof(const InputFile *f) { return f->kind() == BitcodeKind; }
+  void parse() override;
+  void postParse() override;
   std::vector<bool> keptComdats;
 };
 
diff --git a/lld/ELF/LTO.cpp b/lld/ELF/LTO.cpp
index 8d4a6c9e3a81e..dde63aa754975 100644
--- a/lld/ELF/LTO.cpp
+++ b/lld/ELF/LTO.cpp
@@ -26,6 +26,8 @@
 #include "llvm/Support/MemoryBuffer.h"
 #include "llvm/Support/Path.h"
 #include <cstddef>
+#include <cstring>
+#include <dlfcn.h>
 #include <memory>
 #include <string>
 #include <system_error>
@@ -165,7 +167,7 @@ static lto::Config createConfig(Ctx &ctx) {
   return c;
 }
 
-BitcodeCompiler::BitcodeCompiler(Ctx &ctx) : ctx(ctx) {
+BitcodeCompiler::BitcodeCompiler(Ctx &ctx) : IRCompiler(ctx) {
   // Initialize indexFile.
   if (!ctx.arg.thinLTOIndexOnlyArg.empty())
     indexFile = openFile(ctx.arg.thinLTOIndexOnlyArg);
@@ -215,9 +217,7 @@ BitcodeCompiler::BitcodeCompiler(Ctx &ctx) : ctx(ctx) {
   }
 }
 
-BitcodeCompiler::~BitcodeCompiler() = default;
-
-void BitcodeCompiler::add(BitcodeFile &f) {
+void IRCompiler::add(IRFile &f) {
   lto::InputFile &obj = *f.obj;
   bool isExec = !ctx.arg.shared && !ctx.arg.relocatable;
 
@@ -278,7 +278,7 @@ void BitcodeCompiler::add(BitcodeFile &f) {
     // their values are still not final.
     r.LinkerRedefined = sym->scriptDefined;
   }
-  checkError(ctx.e, ltoObj->add(std::move(f.obj), resols));
+  addObject(f, resols);
 }
 
 // If LazyObjFile has not been added to link, emit empty index files.
@@ -421,3 +421,239 @@ SmallVector<std::unique_ptr<InputFile>, 0> BitcodeCompiler::compile() {
   }
   return ret;
 }
+
+void BitcodeCompiler::addObject(IRFile &f,
+                                std::vector<llvm::lto::SymbolResolution> &r) {
+  checkError(ctx.e, ltoObj->add(std::move(f.obj), r));
+}
+
+#if LLD_ENABLE_GNU_LTO
+GccIRCompiler *GccIRCompiler::singleton = nullptr;
+
+GccIRCompiler *GccIRCompiler::getInstance() {
+  assert(singleton != nullptr);
+  return singleton;
+}
+
+GccIRCompiler *GccIRCompiler::getInstance(Ctx &ctx) {
+  if (singleton == nullptr) {
+    singleton = new GccIRCompiler(ctx);
+    singleton->loadPlugin();
+  }
+
+  return singleton;
+}
+
+GccIRCompiler::GccIRCompiler(Ctx &ctx) : IRCompiler(ctx) {
+  singleton = nullptr;
+
+  // TODO: Properly find the right size.
+  int tvsz = 100;
+  tv = new ld_plugin_tv[tvsz](); // Zero-filled, so it ends with LDPT_NULL.
+  initializeTv();
+}
+
+GccIRCompiler::~GccIRCompiler() {
+  singleton = nullptr;
+  delete[] tv;
+}
+
+void GccIRCompiler::loadPlugin() {
+  plugin = dlopen(ctx.arg.plugin.str().c_str(), RTLD_NOW);
+  if (plugin == NULL) {
+    error(dlerror());
+    return;
+  }
+  void *tmp = dlsym(plugin, "onload");
+  if (tmp == NULL) {
+    error("Plugin does not provide onload()");
+    return;
+  }
+
+  ld_plugin_onload onload;
+  // Ensure source and destination types have the same size.
+  assert(sizeof(ld_plugin_onload) == sizeof(void *));
+  std::memcpy(&onload, &tmp, sizeof(ld_plugin_onload));
+
+  (*onload)(tv);
+}
+
+enum ld_plugin_status regClaimFile(ld_plugin_claim_file_handler handler) {
+  GccIRCompiler *c = GccIRCompiler::getInstance();
+  return c->registerClaimFile(handler);
+}
+
+enum ld_plugin_status
+GccIRCompiler::registerClaimFile(ld_plugin_claim_file_handler handler) {
+  claimFileHandler = handler;
+  return LDPS_OK;
+}
+
+#if HAVE_LDPT_REGISTER_CLAIM_FILE_HOOK_V2
+enum ld_plugin_status regClaimFileV2(ld_plugin_claim_file_handler_v2 handler) {
+  GccIRCompiler *c = GccIRCompiler::getInstance();
+  return c->registerClaimFileV2(handler);
+}
+
+enum ld_plugin_status
+GccIRCompiler::registerClaimFileV2(ld_plugin_claim_file_handler_v2 handler) {
+  claimFileHandlerV2 = handler;
+  return LDPS_OK;
+}
+#endif
+
+enum ld_plugin_status regAllSymbolsRead(ld_plugin_all_symbols_read_handler handler) {
+  GccIRCompiler *c = GccIRCompiler::getInstance();
+  return c->registerAllSymbolsRead(handler);
+}
+
+enum ld_plugin_status
+GccIRCompiler::registerAllSymbolsRead(ld_plugin_all_symbols_read_handler handler) {
+  allSymbolsReadHandler = handler;
+  return LDPS_OK;
+}
+
+static enum ld_plugin_status addSymbols(void *handle, int nsyms,
+                                        const struct ld_plugin_symbol *syms) {
+  ELFFileBase *f = static_cast<ELFFileBase *>(handle);
+  if (f == nullptr)
+    return LDPS_ERR;
+
+  for (int i = 0; i < nsyms; i++) {
+    // TODO: Add symbols.
+    // TODO: Convert these symbols into ArrayRef<lto::InputFile::Symbol> and
+    // ArrayRef<Symbol *>?
+  }
+
+  return LDPS_OK;
+}
+
+static enum ld_plugin_status getSymbols(const void *handle, int nsyms,
+                                        struct ld_plugin_symbol *syms) {
+  for (int i = 0; i < nsyms; i++) {
+    syms[i].resolution = LDPR_UNDEF;
+    // TODO: Implement other scenarios.
+  }
+  return LDPS_OK;
+}
+
+ld_plugin_status addInputFile(const char *pathname) {
+  GccIRCompiler *c = GccIRCompiler::getInstance();
+
+  if (c->addCompiledFile(StringRef(pathname)))
+    return LDPS_OK;
+  else
+    return LDPS_ERR;
+}
+
+void GccIRCompiler::initializeTv() {
+  int i = 0;
+
+#define TVU_SETTAG(t, f, v)                                                    \
+  {                                                                            \
+    tv[i].tv_tag = t;                                                          \
+    tv[i].tv_u.tv_##f = v;                                                     \
+    i++;                                                                       \
+  }
+
+  TVU_SETTAG(LDPT_MESSAGE, message, message);
+  TVU_SETTAG(LDPT_API_VERSION, val, LD_PLUGIN_API_VERSION);
+  for (std::string &s : ctx.arg.pluginOpt) {
+    TVU_SETTAG(LDPT_OPTION, string, s.c_str());
+  }
+  ld_plugin_output_file_type o;
+  if (ctx.arg.pie)
+    o = LDPO_PIE;
+  else if (ctx.arg.relocatable)
+    o = LDPO_REL;
+  else if (ctx.arg.shared)
+    o = LDPO_DYN;
+  else
+    o = LDPO_EXEC;
+  TVU_SETTAG(LDPT_LINKER_OUTPUT, val, o);
+  TVU_SETTAG(LDPT_OUTPUT_NAME, string, ctx.arg.outputFile.data());
+  // Share the address of a C wrapper that is API-compatible with
+  // plugin-api.h.
+  TVU_SETTAG(LDPT_REGISTER_CLAIM_FILE_HOOK, register_claim_file, regClaimFile);
+#if HAVE_LDPT_REGISTER_CLAIM_FILE_HOOK_V2
+  TVU_SETTAG(LDPT_REGISTER_CLAIM_FILE_HOOK_V2, register_claim_file_v2,
+             regClaimFileV2);
+#endif
+
+  TVU_SETTAG(LDPT_ADD_SYMBOLS, add_symbols, addSymbols);
+  TVU_SETTAG(LDPT_REGISTER_ALL_SYMBOLS_READ_HOOK, register_all_symbols_read,
+             regAllSymbolsRead);
+  TVU_SETTAG(LDPT_GET_SYMBOLS, get_symbols, getSymbols);
+  TVU_SETTAG(LDPT_ADD_INPUT_FILE, add_input_file, addInputFile);
+}
+
+void GccIRCompiler::add(ELFFileBase &f) {
+  struct ld_plugin_input_file file = {};
+
+  std::string name = f.getName().str();
+  file.name = f.getName().data();
+  file.handle = const_cast<void *>(reinterpret_cast<const void *>(&f));
+
+  std::error_code ec = sys::fs::openFileForRead(name, file.fd);
+  if (ec) {
+    error("Cannot open file " + name + ": " + ec.message());
+    return;
+  }
+  file.offset = 0;
+  uint64_t size;
+  ec = sys::fs::file_size(name, size);
+  if (ec) {
+    error("Cannot get the size of file " + name + ": " + ec.message());
+    sys::fs::closeFile(file.fd);
+    return;
+  }
+  if (size > 0 && size <= INT_MAX)
+    file.filesize = size;
+
+  int claimed = 0;
+#if HAVE_LDPT_REGISTER_CLAIM_FILE_HOOK_V2
+  ld_plugin_status status = claimFileHandlerV2(&file, &claimed, 1);
+#else
+  ld_plugin_status status = claimFileHandler(&file, &claimed);
+#endif
+
+  if (status != LDPS_OK)
+    error("liblto returned " + std::to_string(status));
+
+  ec = sys::fs::closeFile(file.fd);
+  if (ec) {
+    error(ec.message());
+  }
+}
+
+SmallVector<std::unique_ptr<InputFile>, 0> GccIRCompiler::compile() {
+  SmallVector<std::unique_ptr<InputFile>, 0> ret;
+  ld_plugin_status status = allSymbolsReadHandler();
+  if (status != LDPS_OK)
+    error("The plugin returned an error after all symbols were read.");
+
+  for (auto& m : files) {
+    ret.push_back(createObjFile(ctx, m->getMemBufferRef()));
+  }
+  return ret;
+}
+
+void GccIRCompiler::addObject(IRFile &f,
+                              std::vector<llvm::lto::SymbolResolution> &r) {
+  // TODO: Implement this.
+}
+
+enum ld_plugin_status GccIRCompiler::message(int level, const char *format,
+                                             ...) {
+  // TODO: Implement this function.
+  return LDPS_OK;
+}
+
+bool GccIRCompiler::addCompiledFile(StringRef path) {
+  std::optional<MemoryBufferRef> mbref = readFile(ctx, path);
+  if (!mbref)
+    return false;
+  files.push_back(MemoryBuffer::getMemBuffer(*mbref));
+  return true;
+}
+#endif
diff --git a/lld/ELF/LTO.h b/lld/ELF/LTO.h
index acf3bcff7f2f1..3dac2eddfd532 100644
--- a/lld/ELF/LTO.h
+++ b/lld/ELF/LTO.h
@@ -21,40 +21,103 @@
 #define LLD_ELF_LTO_H
 
 #include "lld/Common/LLVM.h"
+#include "lld/config.h"
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/SmallString.h"
 #include "llvm/Support/raw_ostream.h"
 #include <memory>
+#include <plugin-api.h>
 #include <vector>
 
 namespace llvm::lto {
 class LTO;
+class SymbolResolution;
 }
 
 namespace lld::elf {
 struct Ctx;
 class BitcodeFile;
+class ELFFileBase;
 class InputFile;
+class IRFile;
+class BinaryFile;
+
+class IRCompiler {
+protected:
+  Ctx &ctx;
+  llvm::DenseSet<StringRef> thinIndices;
+  llvm::DenseSet<StringRef> usedStartStop;
+  virtual void addObject(IRFile &f,
+                         std::vector<llvm::lto::SymbolResolution> &r) = 0;
+
+public:
+  IRCompiler(Ctx &ctx) : ctx(ctx) {}
+  void add(IRFile &f);
+  virtual SmallVector<std::unique_ptr<InputFile>, 0> compile() = 0;
+};
+
+class BitcodeCompiler : public IRCompiler {
+protected:
+  void addObject(IRFile &f,
+                 std::vector<llvm::lto::SymbolResolution> &r) override;
 
-class BitcodeCompiler {
 public:
   BitcodeCompiler(Ctx &ctx);
   ~BitcodeCompiler();
 
-  void add(BitcodeFile &f);
-  SmallVector<std::unique_ptr<InputFile>, 0> compile();
+  void add(BinaryFile &f);
+  SmallVector<std::unique_ptr<InputFile>, 0> compile() override;
 
 private:
-  Ctx &ctx;
   std::unique_ptr<llvm::lto::LTO> ltoObj;
   // An array of (module name, native relocatable file content) pairs.
   SmallVector<std::pair<std::string, SmallString<0>>, 0> buf;
   std::vector<std::unique_ptr<MemoryBuffer>> files;
   SmallVector<std::string, 0> filenames;
-  llvm::DenseSet<StringRef> usedStartStop;
   std::unique_ptr<llvm::raw_fd_ostream> indexFile;
-  llvm::DenseSet<StringRef> thinIndices;
 };
+
+#if LLD_ENABLE_GNU_LTO
+class GccIRCompiler : public IRCompiler {
+protected:
+  void addObject(IR...
[truncated]
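
The comment in initializeTv about sharing "the address of a C wrapper" refers to a common pattern: a plugin expects plain function pointers, so free functions forward each call to the single compiler object. A minimal sketch of that pattern follows. It uses a function-local static instead of the raw singleton pointer in the patch, and `CompilerSketch`, `pluginHook` and `pluginOnload` are hypothetical names; only the forwarding idea is taken from the diff.

```cpp
#include <cassert>

// Plain function-pointer type, as a C plugin interface would require.
using AllSymbolsReadHandler = int (*)();

class CompilerSketch {
public:
  // Single shared instance, created on first use.
  static CompilerSketch &instance() {
    static CompilerSketch c;
    return c;
  }
  // Remember the hook the plugin wants called once all symbols are read.
  int registerAllSymbolsRead(AllSymbolsReadHandler h) {
    handler = h;
    return 0;
  }
  // Invoke the registered hook, or report -1 if none was registered.
  int compile() { return handler ? handler() : -1; }

private:
  AllSymbolsReadHandler handler = nullptr;
};

// Free function with a plugin-compatible signature; it only forwards to
// the shared instance, because a member function cannot be passed as a
// plain function pointer.
inline int regAllSymbolsRead(AllSymbolsReadHandler h) {
  return CompilerSketch::instance().registerAllSymbolsRead(h);
}

// Stand-in for the plugin: it receives the registration function and
// hands over its own hook.
inline int pluginHook() { return 42; }
inline void pluginOnload(int (*reg)(AllSymbolsReadHandler)) {
  reg(pluginHook);
}
```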

llvmbot commented Sep 5, 2025

@llvm/pr-subscribers-lld-elf

-  void parse();
+  virtual void parse() = 0;
   void parseLazy();
-  void postParse();
+  virtual void postParse() = 0;
   std::unique_ptr<llvm::lto::InputFile> obj;
+};
+
+class BitcodeFile : public IRFile {
+public:
+  BitcodeFile(Ctx &ctx, MemoryBufferRef m, StringRef archiveName,
+              uint64_t offsetInArchive, bool lazy)
+      : IRFile(ctx, m, archiveName, offsetInArchive, lazy) {};
+  static bool classof(const InputFile *f) { return f->kind() == BitcodeKind; }
+  void parse() override;
+  void postParse() override;
   std::vector<bool> keptComdats;
 };
 
diff --git a/lld/ELF/LTO.cpp b/lld/ELF/LTO.cpp
index 8d4a6c9e3a81e..dde63aa754975 100644
--- a/lld/ELF/LTO.cpp
+++ b/lld/ELF/LTO.cpp
@@ -26,6 +26,8 @@
 #include "llvm/Support/MemoryBuffer.h"
 #include "llvm/Support/Path.h"
 #include <cstddef>
+#include <cstring>
+#include <dlfcn.h>
 #include <memory>
 #include <string>
 #include <system_error>
@@ -165,7 +167,7 @@ static lto::Config createConfig(Ctx &ctx) {
   return c;
 }
 
-BitcodeCompiler::BitcodeCompiler(Ctx &ctx) : ctx(ctx) {
+BitcodeCompiler::BitcodeCompiler(Ctx &ctx) : IRCompiler(ctx) {
   // Initialize indexFile.
   if (!ctx.arg.thinLTOIndexOnlyArg.empty())
     indexFile = openFile(ctx.arg.thinLTOIndexOnlyArg);
@@ -215,9 +217,7 @@ BitcodeCompiler::BitcodeCompiler(Ctx &ctx) : ctx(ctx) {
   }
 }
 
-BitcodeCompiler::~BitcodeCompiler() = default;
-
-void BitcodeCompiler::add(BitcodeFile &f) {
+void IRCompiler::add(IRFile &f) {
   lto::InputFile &obj = *f.obj;
   bool isExec = !ctx.arg.shared && !ctx.arg.relocatable;
 
@@ -278,7 +278,7 @@ void BitcodeCompiler::add(BitcodeFile &f) {
     // their values are still not final.
     r.LinkerRedefined = sym->scriptDefined;
   }
-  checkError(ctx.e, ltoObj->add(std::move(f.obj), resols));
+  addObject(f, resols);
 }
 
 // If LazyObjFile has not been added to link, emit empty index files.
@@ -421,3 +421,239 @@ SmallVector<std::unique_ptr<InputFile>, 0> BitcodeCompiler::compile() {
   }
   return ret;
 }
+
+void BitcodeCompiler::addObject(IRFile &f,
+                                std::vector<llvm::lto::SymbolResolution> &r) {
+  checkError(ctx.e, ltoObj->add(std::move(f.obj), r));
+}
+
+#if LLD_ENABLE_GNU_LTO
+GccIRCompiler *GccIRCompiler::singleton = nullptr;
+
+ GccIRCompiler *GccIRCompiler::getInstance() {
+  assert(singleton != nullptr);
+  return singleton;
+}
+
+GccIRCompiler *GccIRCompiler::getInstance(Ctx &ctx) {
+  if (singleton == nullptr) {
+    singleton = new GccIRCompiler(ctx);
+    singleton->loadPlugin();
+  }
+
+  return singleton;
+}
+
+GccIRCompiler::GccIRCompiler(Ctx &ctx) : IRCompiler(ctx) {
+  singleton = nullptr;
+
+  // TODO: Properly find the right size.
+  int tvsz = 100;
+  tv = new ld_plugin_tv[tvsz];
+  initializeTv();
+}
+
+GccIRCompiler::~GccIRCompiler() {
+  singleton = nullptr;
+  delete[] tv;
+}
+
+void GccIRCompiler::loadPlugin() {
+  plugin = dlopen(ctx.arg.plugin.data(), RTLD_NOW);
+  if (plugin == NULL) {
+    error(dlerror());
+    return;
+  }
+  void *tmp = dlsym(plugin, "onload");
+  if (tmp == NULL) {
+    error("Plugin does not provide onload()");
+    return;
+  }
+
+  ld_plugin_onload onload;
+  // Ensure source and destination types have the same size.
+  assert(sizeof(ld_plugin_onload) == sizeof(void *));
+  std::memcpy(&onload, &tmp, sizeof(ld_plugin_onload));
+
+  (*onload)(tv);
+}
+
+enum ld_plugin_status regClaimFile(ld_plugin_claim_file_handler handler) {
+  GccIRCompiler *c = GccIRCompiler::getInstance();
+  return c->registerClaimFile(handler);
+}
+
+enum ld_plugin_status
+GccIRCompiler::registerClaimFile(ld_plugin_claim_file_handler handler) {
+  claimFileHandler = handler;
+  return LDPS_OK;
+}
+
+#if HAVE_LDPT_REGISTER_CLAIM_FILE_HOOK_V2
+enum ld_plugin_status regClaimFileV2(ld_plugin_claim_file_handler handler) {
+  GccIRCompiler *c = GccIRCompiler::getInstance();
+  return c->registerClaimFileV2(handler);
+}
+
+enum ld_plugin_status
+GccIRCompiler::registerClaimFileV2(ld_plugin_claim_file_handler_v2 handler) {
+  claimFileHandlerV2 = handler;
+  return LDPS_OK;
+}
+#endif
+
+enum ld_plugin_status regAllSymbolsRead(ld_plugin_all_symbols_read_handler handler) {
+  GccIRCompiler *c = GccIRCompiler::getInstance();
+  return c->registerAllSymbolsRead(handler);
+}
+
+enum ld_plugin_status
+GccIRCompiler::registerAllSymbolsRead(ld_plugin_all_symbols_read_handler handler) {
+  allSymbolsReadHandler = handler;
+  return LDPS_OK;
+}
+
+static enum ld_plugin_status addSymbols(void *handle, int nsyms,
+                                        const struct ld_plugin_symbol *syms) {
+  ELFFileBase *f = (ELFFileBase *) handle;
+  if(f == NULL)
+    return LDPS_ERR;
+
+  for (int i = 0; i < nsyms; i++) {
+    // TODO: Add symbols.
+    // TODO: Convert these symbols into ArrayRef<lto::InputFile::Symbol> and
+    // ArrayRef<Symbol *> ?
+  }
+
+  return LDPS_OK;
+}
+
+static enum ld_plugin_status getSymbols(const void *handle, int nsyms,
+                                        struct ld_plugin_symbol *syms) {
+  for (int i = 0; i < nsyms; i++) {
+    syms[i].resolution = LDPR_UNDEF;
+    // TODO: Implement other scenarios.
+  }
+  return LDPS_OK;
+}
+
+ld_plugin_status addInputFile(const char *pathname) {
+  GccIRCompiler *c = GccIRCompiler::getInstance();
+
+  if (c->addCompiledFile(StringRef(pathname)))
+    return LDPS_OK;
+  else
+    return LDPS_ERR;
+}
+
+void GccIRCompiler::initializeTv() {
+  int i = 0;
+
+#define TVU_SETTAG(t, f, v)                                                    \
+  {                                                                            \
+    tv[i].tv_tag = t;                                                          \
+    tv[i].tv_u.tv_##f = v;                                                     \
+    i++;                                                                       \
+  }
+
+  TVU_SETTAG(LDPT_MESSAGE, message, message);
+  TVU_SETTAG(LDPT_API_VERSION, val, LD_PLUGIN_API_VERSION);
+  for (std::string &s : ctx.arg.pluginOpt) {
+    TVU_SETTAG(LDPT_OPTION, string, s.c_str());
+  }
+  ld_plugin_output_file_type o;
+  if (ctx.arg.pie)
+    o = LDPO_PIE;
+  else if (ctx.arg.relocatable)
+    o = LDPO_REL;
+  else if (ctx.arg.shared)
+    o = LDPO_DYN;
+  else
+    o = LDPO_EXEC;
+  TVU_SETTAG(LDPT_LINKER_OUTPUT, val, o);
+  TVU_SETTAG(LDPT_OUTPUT_NAME, string, ctx.arg.outputFile.data());
+  // Share the address of a C wrapper that is API-compatible with
+  // plugin-api.h.
+  TVU_SETTAG(LDPT_REGISTER_CLAIM_FILE_HOOK, register_claim_file, regClaimFile);
+#if HAVE_LDPT_REGISTER_CLAIM_FILE_HOOK_V2
+  TVU_SETTAG(LDPT_REGISTER_CLAIM_FILE_HOOK_V2, register_claim_file_v2,
+             regClaimFileV2);
+#endif
+
+  TVU_SETTAG(LDPT_ADD_SYMBOLS, add_symbols, addSymbols);
+  TVU_SETTAG(LDPT_REGISTER_ALL_SYMBOLS_READ_HOOK, register_all_symbols_read,
+             regAllSymbolsRead);
+  TVU_SETTAG(LDPT_GET_SYMBOLS, get_symbols, getSymbols);
+  TVU_SETTAG(LDPT_ADD_INPUT_FILE, add_input_file, addInputFile);
+}
+
+void GccIRCompiler::add(ELFFileBase &f) {
+  struct ld_plugin_input_file file;
+
+  std::string name = f.getName().str();
+  file.name = f.getName().data();
+  file.handle = const_cast<void *>(reinterpret_cast<const void *>(&f));
+
+  std::error_code ec = sys::fs::openFileForRead(name, file.fd);
+  if (ec) {
+    error("Cannot open file " + name + ": " + ec.message());
+    return;
+  }
+  file.offset = 0;
+  uint64_t size;
+  ec = sys::fs::file_size(name, size);
+  if (ec) {
+    error("Cannot get the size of file " + name + ": " + ec.message());
+    sys::fs::closeFile(file.fd);
+    return;
+  }
+  if (size > 0 && size <= INT_MAX)
+    file.filesize = size;
+
+  int claimed;
+#if HAVE_LDPT_REGISTER_CLAIM_FILE_HOOK_V2
+  ld_plugin_status status = claimFileHandler(&file, &claimed, 1);
+#else
+  ld_plugin_status status = claimFileHandler(&file, &claimed);
+#endif
+
+  if (status != LDPS_OK)
+    error("liblto returned " + std::to_string(status));
+
+  ec = sys::fs::closeFile(file.fd);
+  if (ec) {
+    error(ec.message());
+  }
+}
+
+SmallVector<std::unique_ptr<InputFile>, 0> GccIRCompiler::compile() {
+  SmallVector<std::unique_ptr<InputFile>, 0> ret;
+  ld_plugin_status status = allSymbolsReadHandler();
+  if (status != LDPS_OK)
+    error("The plugin returned an error after all symbols were read.");
+
+  for (auto& m : files) {
+    ret.push_back(createObjFile(ctx, m->getMemBufferRef()));
+  }
+  return ret;
+}
+
+void GccIRCompiler::addObject(IRFile &f,
+                              std::vector<llvm::lto::SymbolResolution> &r) {
+  // TODO: Implement this.
+}
+
+enum ld_plugin_status GccIRCompiler::message(int level, const char *format,
+                                             ...) {
+  // TODO: Implement this function.
+  return LDPS_OK;
+}
+
+bool GccIRCompiler::addCompiledFile(StringRef path) {
+  std::optional<MemoryBufferRef> mbref = readFile(ctx, path);
+  if (!mbref)
+    return false;
+  files.push_back(std::move(MemoryBuffer::getMemBuffer(*mbref)));
+  return true;
+}
+#endif
diff --git a/lld/ELF/LTO.h b/lld/ELF/LTO.h
index acf3bcff7f2f1..3dac2eddfd532 100644
--- a/lld/ELF/LTO.h
+++ b/lld/ELF/LTO.h
@@ -21,40 +21,103 @@
 #define LLD_ELF_LTO_H
 
 #include "lld/Common/LLVM.h"
+#include "lld/config.h"
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/SmallString.h"
 #include "llvm/Support/raw_ostream.h"
 #include <memory>
+#include <plugin-api.h>
 #include <vector>
 
 namespace llvm::lto {
 class LTO;
+class SymbolResolution;
 }
 
 namespace lld::elf {
 struct Ctx;
 class BitcodeFile;
+class ELFFileBase;
 class InputFile;
+class IRFile;
+class BinaryFile;
+
+class IRCompiler {
+protected:
+  Ctx &ctx;
+  llvm::DenseSet<StringRef> thinIndices;
+  llvm::DenseSet<StringRef> usedStartStop;
+  virtual void addObject(IRFile &f,
+                         std::vector<llvm::lto::SymbolResolution> &r) = 0;
+
+public:
+  IRCompiler(Ctx &ctx) : ctx(ctx) {}
+  void add(IRFile &f);
+  virtual SmallVector<std::unique_ptr<InputFile>, 0> compile() = 0;
+};
+
+class BitcodeCompiler : public IRCompiler {
+protected:
+  void addObject(IRFile &f,
+                 std::vector<llvm::lto::SymbolResolution> &r) override;
 
-class BitcodeCompiler {
 public:
   BitcodeCompiler(Ctx &ctx);
   ~BitcodeCompiler();
 
-  void add(BitcodeFile &f);
-  SmallVector<std::unique_ptr<InputFile>, 0> compile();
+  void add(BinaryFile &f);
+  SmallVector<std::unique_ptr<InputFile>, 0> compile() override;
 
 private:
-  Ctx &ctx;
   std::unique_ptr<llvm::lto::LTO> ltoObj;
   // An array of (module name, native relocatable file content) pairs.
   SmallVector<std::pair<std::string, SmallString<0>>, 0> buf;
   std::vector<std::unique_ptr<MemoryBuffer>> files;
   SmallVector<std::string, 0> filenames;
-  llvm::DenseSet<StringRef> usedStartStop;
   std::unique_ptr<llvm::raw_fd_ostream> indexFile;
-  llvm::DenseSet<StringRef> thinIndices;
 };
+
+#if LLD_ENABLE_GNU_LTO
+class GccIRCompiler : public IRCompiler {
+protected:
+  void addObject(IR...
[truncated]
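
A note on `initializeTv()` in the diff above: it fills a fixed 100-entry array (see its TODO), and the visible part of the diff does not append a terminating entry. As far as I know, the plugin walks the transfer vector until it reaches a zero tag (`LDPT_NULL`), so one option is to build the vector in a `std::vector` and terminate it explicitly. The following is a minimal sketch with stand-in types. `SketchTag`, `SketchTv`, `buildTransferVector` and `countUntilTerminator` are hypothetical names that are not part of `plugin-api.h` or of this patch, and the non-zero tag values are placeholders.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-ins for the tag/value pair of the plugin API. Only "tag 0 terminates
// the vector" is meant to mirror the real interface; everything else here is
// a placeholder chosen for illustration.
enum SketchTag { SK_NULL = 0, SK_API_VERSION, SK_OPTION, SK_OUTPUT_NAME };

struct SketchTv {
  SketchTag tag;
  union {
    int val;
    const char *str;
  } u;
};

static SketchTv makeVal(SketchTag tag, int val) {
  SketchTv e{};
  e.tag = tag;
  e.u.val = val;
  return e;
}

static SketchTv makeStr(SketchTag tag, const char *s) {
  SketchTv e{};
  e.tag = tag;
  e.u.str = s;
  return e;
}

// Build a vector that is exactly as large as needed and always terminated.
// The stored C strings point into the caller's std::string objects, so those
// must outlive the returned vector.
std::vector<SketchTv>
buildTransferVector(const std::vector<std::string> &pluginOpts,
                    const std::string &outputName) {
  std::vector<SketchTv> tv;
  tv.reserve(pluginOpts.size() + 3);
  tv.push_back(makeVal(SK_API_VERSION, 1));
  for (const std::string &opt : pluginOpts)
    tv.push_back(makeStr(SK_OPTION, opt.c_str()));
  tv.push_back(makeStr(SK_OUTPUT_NAME, outputName.c_str()));
  tv.push_back(makeVal(SK_NULL, 0)); // terminator
  return tv;
}

// Walk the vector the way a consumer would: stop at the first zero tag.
std::size_t countUntilTerminator(const SketchTv *tv) {
  assert(tv != nullptr);
  std::size_t n = 0;
  while (tv[n].tag != SK_NULL)
    ++n;
  return n;
}
```

With this shape the number of `-plugin-opt` values no longer has to fit a guessed capacity, and `tv.data()` can be handed to the plugin entry point.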

github-actions bot commented Sep 5, 2025

✅ With the latest revision this PR passed the Python code formatter.

github-actions bot commented Sep 5, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@tuliom
Contributor Author

tuliom commented Sep 29, 2025

I believe this is now ready for review.

add_definitions("-DLLD_DEFAULT_LD_LLD_IS_MINGW=1")
endif()

option(LLD_LINK_GPL3 "Allow LLD to link to GPLv3-licensed files")
Contributor

This seems a bit confusing. It is common for LLVM to default to using libgcc and libstdc++, both of which are licensed under the GPLv3. This option is not related to that.

Contributor Author

@fweimer-rh Would "Allow LLD to link itself to GPLv3-licensed files" be clearer?

Contributor

@fweimer-rh fweimer-rh Sep 30, 2025

@tuliom I would mention the GCC compiler plugin specifically. Typical builds of LLD already link against GPL code via libgcc_s and libstdc++. Mentioning the plugin and not its licensing impact avoids taking a position on the implications of linking against other code from GCC.

But in the end, you don't need to convince me, but the LLVM community.

Contributor Author

In my last update I specified the GCC compiler plugin and tried to restrict the explanation to plugins.

Specify liblto_plugin.so and try to distinguish from the usual linking
that happens when linking to libstdc++ or libgcc_s.
@tuliom tuliom requested review from MaskRay, jrtc27 and smithp35 October 8, 2025 17:34
Collaborator

@smithp35 smithp35 left a comment

I've left some quick comments, I'll need to spend some time looking at how the plugin works.

My apologies, I'll be out of the office for the next 2 weeks and then at the US developer meeting. I can go through this in detail when I get back. I don't have any particularly strong opinions, so don't let this block anyone else.

Before I, or anyone else, spends a lot of time on this, I'd like to know if anyone has any fundamental objections. I know that in the past there has been some reluctance (#41791), but my reading of the RFC https://discourse.llvm.org/t/rfc-lld-add-support-for-gcc-lto-format/87172/1 is that positions have moved on since then.

With the obligatory "I am not a lawyer, but" disclaimer, I do think there is a material difference between using #include <plugin-api.h> and the GNU LTO plugin on the one hand and linking against libgcc and libstdc++ on the other, as the LTO header and library don't have the GCC RUNTIME LIBRARY EXCEPTION in their license. As I understand it from the RFC, the foundation's lawyers are OK with this (perhaps worth citing the RFC in the description).

}

#if LLD_ENABLE_GNU_LTO
#include <dlfcn.h>
Collaborator

Would this be available on Windows? I'm guessing that this could work with mingw?
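
For context on the portability question, here is a minimal sketch of how the load and lookup steps could be hidden behind a small platform switch. It is not what the patch does (the patch calls `dlopen` and `dlsym` directly), and `openLibrary`, `findSymbol` and `toFunctionPointer` are hypothetical helper names. In-tree code might prefer LLVM's existing `llvm::sys::DynamicLibrary` wrapper instead; whether it fits the plugin use case is an open question here.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#if defined(_WIN32)
#include <windows.h>
#else
#include <dlfcn.h>
#endif

// Convert an untyped symbol address into a typed function pointer. The size
// check happens at compile time rather than through a runtime assert.
template <typename Fn> Fn toFunctionPointer(void *address) {
  static_assert(sizeof(Fn) == sizeof(void *),
                "function and object pointers must have the same size");
  Fn fn;
  std::memcpy(&fn, &address, sizeof(fn));
  return fn;
}

// Open a shared library. Returns nullptr and fills `err` on failure.
void *openLibrary(const std::string &path, std::string &err) {
#if defined(_WIN32)
  HMODULE h = LoadLibraryA(path.c_str());
  if (!h) {
    err = "LoadLibraryA failed with code " + std::to_string(GetLastError());
    return nullptr;
  }
  return reinterpret_cast<void *>(h);
#else
  void *h = dlopen(path.c_str(), RTLD_NOW);
  if (!h) {
    const char *msg = dlerror();
    err = msg ? msg : "unknown dlopen error";
  }
  return h;
#endif
}

// Look up a symbol in an already opened library. Returns nullptr if missing.
void *findSymbol(void *lib, const char *name) {
  assert(lib != nullptr && name != nullptr);
#if defined(_WIN32)
  FARPROC p = GetProcAddress(static_cast<HMODULE>(lib), name);
  void *address = nullptr;
  static_assert(sizeof(p) == sizeof(address), "unexpected FARPROC size");
  std::memcpy(&address, &p, sizeof(address));
  return address;
#else
  return dlsym(lib, name);
#endif
}
```

A caller would then write something like `auto onload = toFunctionPointer<OnloadFn>(findSymbol(lib, "onload"));`, where `OnloadFn` stands for the plugin entry-point type. Whether a GCC-built plugin is usable on a given Windows toolchain is a separate question this sketch does not answer.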

};

#if LLD_ENABLE_GNU_LTO
#include <plugin-api.h>
Collaborator

Did you have any opinions on the recommendation raised in the RFC https://discourse.llvm.org/t/rfc-lld-add-support-for-gcc-lto-format/87172/13 to not use plugin-api.h, in order to avoid the source dependency?

This header is GPL licensed, which I think would make the LLD produced when LLD_ENABLE_GNU_LTO is defined GPL too, even when not dynamically linking to the plugin.

@@ -0,0 +1,28 @@
// REQUIRES: gnu_lto, gcc
Collaborator

@smithp35 smithp35 Oct 9, 2025

I think this may be the first test that we have that tries to execute the linked executable. I'm trying to think whether gnu_lto and gcc are sufficient to stop this test from running on platforms where it cannot work, for example when cross-compilation is enabled.

Would it be sufficient to use llvm-readelf and llvm-objdump, like the existing LTO tests do? This would also mean that a freestanding example could be used, without dependencies on a C library.

@@ -0,0 +1,21 @@
RUN: not ld.lld -plugin %S/Inputs/plugin.so 2>&1 | FileCheck %s
Member

@MaskRay MaskRay Oct 9, 2025

The option is defined as defm plugin: Eq<"plugin", ..., so we support many syntax variants, but there is no need to test every one of them.

We just need

-plugin %S/Inputs/plugin.so
--plugin=%S/Inputs/plugin.so

It's not necessary to also cd %S.
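
To illustrate why two spellings are enough for coverage: as I understand the `Eq` multiclass, it accepts a separate form and a joined `=` form, each with one or two leading dashes. The sketch below is a standalone matcher written only for illustration. `findPluginPath` and `demoPluginSpellings` are hypothetical names; LLD's real parser is generated from `Options.td` and does not look like this.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <optional>
#include <string>
#include <vector>

// Return the plugin path from an argument list, accepting
//   --plugin X, -plugin X, --plugin=X, -plugin=X
// The last occurrence wins. Other options (e.g. -plugin-opt=...) are ignored.
std::optional<std::string>
findPluginPath(const std::vector<std::string> &args) {
  std::optional<std::string> result;
  for (std::size_t i = 0; i < args.size(); ++i) {
    const std::string &a = args[i];
    for (const char *prefix : {"--plugin", "-plugin"}) {
      const std::string p = prefix;
      if (a == p) { // separate form: the value is the next argument
        if (i + 1 < args.size())
          result = args[++i];
        break;
      }
      if (a.compare(0, p.size() + 1, p + "=") == 0) { // joined form
        result = a.substr(p.size() + 1);
        break;
      }
    }
  }
  return result;
}

// Usage sketch: one separate and one joined spelling exercise both paths.
inline void demoPluginSpellings() {
  assert(findPluginPath({"ld.lld", "-plugin", "p.so"}) == "p.so");
  assert(findPluginPath({"ld.lld", "--plugin=p.so"}) == "p.so");
}
```

The separate and joined forms go through different branches, while the one-dash and two-dash variants only differ in the prefix string, which is why testing one of each form covers the interesting cases.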
