Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MC,llvm-readobj,yaml2obj] Support CREL relocation format #91280

Conversation

MaskRay
Copy link
Member

@MaskRay MaskRay commented May 6, 2024

CREL is a compact relocation format for the ELF object file format.

This patch adds integrated assembler support (using the RELA form)
available with llvm-mc -filetype=obj -crel a.s -o a.o.
A dependent patch will add clang -c -Wa,--crel,--allow-experimental-crel.

Also add llvm-readobj support (for both REL and RELA forms) to
facilitate testing the assembler. Additionally, yaml2obj gains support
for the RELA form to aid testing with llvm-readobj.

We temporarily assign the section type code 0x40000020 from the generic
range to SHT_CREL. We avoided using SHT_LLVM_ or SHT_GNU_ to
avoid code churn and maintain broader applicability for interested psABIs.
Similarly, DT_CREL is temporarily 0x40000026.

LLVM will change the code and break compatibility. This is not an issue
if all relocatable files using CREL are regenerated (aka no prebuilt
relocatable files).

Link: https://discourse.llvm.org/t/rfc-crel-a-compact-relocation-format-for-elf/77600

Created using spr 1.3.5-bogner
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:codegen mc Machine (object) code objectyaml llvm:binary-utilities labels May 6, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented May 6, 2024

@llvm/pr-subscribers-lld
@llvm/pr-subscribers-lld-elf
@llvm/pr-subscribers-clang-codegen
@llvm/pr-subscribers-objectyaml
@llvm/pr-subscribers-clang-driver

@llvm/pr-subscribers-clang

Author: Fangrui Song (MaskRay)

Changes

CREL is a compact relocation format for the ELF object file format.

This patch adds integrated assembler support (using the RELA form),
which can be enabled with clang -c -Wa,--crel,--experimental-crel.
-Wa,--experimental-crel indicates that CREL is experimental and not
a standard. -Wa,--experimental-crel might be removed in the future.

  • We take a section type code 20 from the generic range for SHT_CREL.
    We avoided using SHT_LLVM_ or SHT_GNU_ to maintain broader
    applicability for interested psABIs.
  • In the extremely unlikely case the generic ABI assigns 20 to another
    purpose, and existing toolchains generate relocatable files using that
    code (unlikely if we ensure GNU and LLVM don't do this), any conflicts
    would be the user's responsibility to resolve.

This patch also adds llvm-readobj support (for both REL and RELA forms)
to facilitate testing the assembler. Additionally, yaml2obj gains
support for the RELA form to aid testing with llvm-readobj.

Currently, linking the generated relocatable file with GNU ld will fail
because GNU ld doesn't support CREL yet.

unknown architecture of input file `a.o' is incompatible with i386:x86-64 output

lld will also produce an error when attempting to link a CREL relocatable file.
The support will come in a future change.

Link: https://discourse.llvm.org/t/rfc-crel-a-compact-relocation-format-for-elf/77600


Patch is 72.41 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/91280.diff

32 Files Affected:

  • (modified) clang/include/clang/Basic/CodeGenOptions.def (+1)
  • (modified) clang/include/clang/Basic/DiagnosticDriverKinds.td (+3)
  • (modified) clang/include/clang/Driver/Options.td (+3)
  • (modified) clang/lib/CodeGen/BackendUtil.cpp (+1)
  • (modified) clang/lib/Driver/ToolChains/Clang.cpp (+18)
  • (modified) clang/lib/Driver/ToolChains/CommonArgs.cpp (+21)
  • (added) clang/test/Driver/crel.c (+24)
  • (added) clang/test/Misc/cc1as-crel.s (+6)
  • (modified) clang/tools/driver/cc1as_main.cpp (+6)
  • (modified) llvm/include/llvm/BinaryFormat/DynamicTags.def (+2)
  • (modified) llvm/include/llvm/BinaryFormat/ELF.h (+1)
  • (modified) llvm/include/llvm/MC/MCTargetOptions.h (+3)
  • (modified) llvm/include/llvm/MC/MCTargetOptionsCommandFlags.h (+1)
  • (modified) llvm/include/llvm/Object/ELF.h (+5)
  • (modified) llvm/include/llvm/Object/ELFObjectFile.h (+64-6)
  • (modified) llvm/include/llvm/Object/ELFTypes.h (+25)
  • (modified) llvm/lib/MC/ELFObjectWriter.cpp (+84-26)
  • (modified) llvm/lib/MC/MCTargetOptionsCommandFlags.cpp (+6)
  • (modified) llvm/lib/Object/ELF.cpp (+63)
  • (modified) llvm/lib/ObjectYAML/ELFEmitter.cpp (+53-7)
  • (modified) llvm/lib/ObjectYAML/ELFYAML.cpp (+2)
  • (added) llvm/test/MC/ELF/crel-32.s (+16)
  • (added) llvm/test/MC/ELF/crel.s (+100)
  • (added) llvm/test/tools/llvm-readobj/ELF/crel.test (+180)
  • (modified) llvm/test/tools/llvm-readobj/ELF/dynamic-reloc.test (+25-6)
  • (modified) llvm/test/tools/llvm-readobj/ELF/relocation-errors.test (+18-1)
  • (modified) llvm/test/tools/yaml2obj/ELF/dynamic-relocations.yaml (+4-1)
  • (modified) llvm/test/tools/yaml2obj/ELF/reloc-sec-entry-size.yaml (+5)
  • (added) llvm/test/tools/yaml2obj/ELF/relocation-crel.yaml (+63)
  • (modified) llvm/test/tools/yaml2obj/ELF/relocation-missing-symbol.yaml (+2-1)
  • (modified) llvm/test/tools/yaml2obj/ELF/relocation-type.yaml (+3-1)
  • (modified) llvm/tools/llvm-readobj/ELFDumper.cpp (+75-6)
diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def
index 340b08dd7e2a33..3229f77eef1fcc 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -36,6 +36,7 @@ VALUE_CODEGENOPT(Name, Bits, Default)
 #endif
 
 CODEGENOPT(DisableIntegratedAS, 1, 0) ///< -no-integrated-as
+CODEGENOPT(Crel, 1, 0) ///< -Wa,--crel
 CODEGENOPT(RelaxELFRelocations, 1, 1) ///< -Wa,-mrelax-relocations={yes,no}
 CODEGENOPT(AsmVerbose        , 1, 0) ///< -dA, -fverbose-asm.
 CODEGENOPT(PreserveAsmComments, 1, 1) ///< -dA, -fno-preserve-as-comments.
diff --git a/clang/include/clang/Basic/DiagnosticDriverKinds.td b/clang/include/clang/Basic/DiagnosticDriverKinds.td
index 9781fcaa4ff5e9..e9cea8967c1334 100644
--- a/clang/include/clang/Basic/DiagnosticDriverKinds.td
+++ b/clang/include/clang/Basic/DiagnosticDriverKinds.td
@@ -809,6 +809,9 @@ def warn_drv_missing_multilib : Warning<
 def note_drv_available_multilibs : Note<
   "available multilibs are:%0">;
 
+def err_drv_experimental_crel : Error<
+  "-Wa,--experimental-crel must be specified to use -Wa,--crel. CREL is experimental and takes a non-standard section type code">;
+
 def warn_android_unversioned_fallback : Warning<
   "Using unversioned Android target directory %0 for target %1. Unversioned"
   " directories will not be used in Clang 19. Provide a versioned directory"
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index c9f7c4e5f718b4..55f0c9aff6a49f 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -6958,6 +6958,9 @@ def massembler_no_warn : Flag<["-"], "massembler-no-warn">,
 def massembler_fatal_warnings : Flag<["-"], "massembler-fatal-warnings">,
   HelpText<"Make assembler warnings fatal">,
   MarshallingInfoFlag<CodeGenOpts<"FatalWarnings">>;
+def crel : Flag<["--"], "crel">,
+  HelpText<"Enable CREL relocation format (ELF only)">,
+  MarshallingInfoFlag<CodeGenOpts<"Crel">>;
 def mrelax_relocations_no : Flag<["-"], "mrelax-relocations=no">,
     HelpText<"Disable x86 relax relocations">,
     MarshallingInfoNegativeFlag<CodeGenOpts<"RelaxELFRelocations">>;
diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp
index 22c3f8642ad8eb..ed37085d69ac53 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -474,6 +474,7 @@ static bool initTargetOptions(DiagnosticsEngine &Diags,
   Options.MCOptions.AsmVerbose = CodeGenOpts.AsmVerbose;
   Options.MCOptions.Dwarf64 = CodeGenOpts.Dwarf64;
   Options.MCOptions.PreserveAsmComments = CodeGenOpts.PreserveAsmComments;
+  Options.MCOptions.Crel = CodeGenOpts.Crel;
   Options.MCOptions.X86RelaxRelocations = CodeGenOpts.RelaxELFRelocations;
   Options.MCOptions.CompressDebugSections =
       CodeGenOpts.getCompressDebugSections();
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index 0a2ea96de73824..0f4d8aa3ad7938 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -2467,6 +2467,8 @@ static void CollectArgsForIntegratedAssembler(Compilation &C,
   // arg after parsing the '-I' arg.
   bool TakeNextArg = false;
 
+  const llvm::Triple &Triple = C.getDefaultToolChain().getTriple();
+  bool Crel = false, ExperimentalCrel = false;
   bool UseRelaxRelocations = C.getDefaultToolChain().useRelaxRelocations();
   bool UseNoExecStack = false;
   const char *MipsTargetFeature = nullptr;
@@ -2590,6 +2592,12 @@ static void CollectArgsForIntegratedAssembler(Compilation &C,
                  Value == "-nocompress-debug-sections" ||
                  Value == "--nocompress-debug-sections") {
         CmdArgs.push_back(Value.data());
+      } else if (Value == "--crel") {
+        Crel = true;
+      } else if (Value == "--no-crel") {
+        Crel = false;
+      } else if (Value == "--experimental-crel") {
+        ExperimentalCrel = true;
       } else if (Value == "-mrelax-relocations=yes" ||
                  Value == "--mrelax-relocations=yes") {
         UseRelaxRelocations = true;
@@ -2655,6 +2663,16 @@ static void CollectArgsForIntegratedAssembler(Compilation &C,
   }
   if (ImplicitIt.size())
     AddARMImplicitITArgs(Args, CmdArgs, ImplicitIt);
+  if (Crel) {
+    if (!ExperimentalCrel)
+      D.Diag(diag::err_drv_experimental_crel);
+    if (Triple.isOSBinFormatELF() && !Triple.isMIPS()) {
+      CmdArgs.push_back("--crel");
+    } else {
+      D.Diag(diag::err_drv_unsupported_opt_for_target)
+          << "-Wa,--crel" << D.getTargetTriple();
+    }
+  }
   if (!UseRelaxRelocations)
     CmdArgs.push_back("-mrelax-relocations=no");
   if (UseNoExecStack)
diff --git a/clang/lib/Driver/ToolChains/CommonArgs.cpp b/clang/lib/Driver/ToolChains/CommonArgs.cpp
index 6796b43a155020..8ba45aeafd0569 100644
--- a/clang/lib/Driver/ToolChains/CommonArgs.cpp
+++ b/clang/lib/Driver/ToolChains/CommonArgs.cpp
@@ -1072,6 +1072,27 @@ void tools::addLTOOptions(const ToolChain &ToolChain, const ArgList &Args,
 
   addMachineOutlinerArgs(D, Args, CmdArgs, ToolChain.getEffectiveTriple(),
                          /*IsLTO=*/true, PluginOptPrefix);
+
+  for (const Arg *A : Args.filtered(options::OPT_Wa_COMMA)) {
+    bool Crel = false;
+    for (StringRef V : A->getValues()) {
+      if (V == "--crel")
+        Crel = true;
+      else if (V == "--no-crel")
+        Crel = false;
+      else
+        continue;
+      A->claim();
+    }
+    if (Crel) {
+      if (Triple.isOSBinFormatELF() && !Triple.isMIPS()) {
+        CmdArgs.push_back(Args.MakeArgString(Twine(PluginOptPrefix) + "-crel"));
+      } else {
+        D.Diag(diag::err_drv_unsupported_opt_for_target)
+            << "-Wa,--crel" << D.getTargetTriple();
+      }
+    }
+  }
 }
 
 /// Adds the '-lcgpu' and '-lmgpu' libraries to the compilation to include the
diff --git a/clang/test/Driver/crel.c b/clang/test/Driver/crel.c
new file mode 100644
index 00000000000000..0dc0507705d72d
--- /dev/null
+++ b/clang/test/Driver/crel.c
@@ -0,0 +1,24 @@
+// RUN: not %clang -### -c --target=x86_64 -Wa,--crel %s 2>&1 | FileCheck %s --check-prefix=NOEXP
+
+// NOEXP: error: -Wa,--experimental-crel must be specified to use -Wa,--crel. CREL is experimental and takes a non-standard section type code
+
+// RUN: %clang -### -c --target=x86_64 -Wa,--crel,--experimental-crel %s 2>&1 | FileCheck %s
+// RUN: %clang -### -c --target=x86_64 -Wa,--crel,--no-crel,--experimental-crel %s 2>&1 | FileCheck %s --check-prefix=NO
+// RUN: not %clang -### -c --target=arm64-apple-darwin -Wa,--crel,--experimental-crel %s 2>&1 | FileCheck %s --check-prefix=ERR
+// RUN: not %clang -### -c --target=mips64 -Wa,--crel,--experimental-crel %s 2>&1 | FileCheck %s --check-prefix=ERR
+
+// RUN: %clang -### -c --target=aarch64 -Werror -Wa,--crel,--experimental-crel -x assembler %s -Werror 2>&1 | FileCheck %s --check-prefix=ASM
+// RUN: not %clang -### -c --target=mips64 -Wa,--crel,--experimental-crel -x assembler %s 2>&1 | FileCheck %s --check-prefix=ERR
+
+// CHECK: "-cc1" {{.*}}"--crel"
+// NO:     "-cc1"
+// NO-NOT: "--crel"
+// ASM:   "-cc1as" {{.*}}"--crel"
+// ERR: error: unsupported option '-Wa,--crel' for target '{{.*}}'
+
+/// Don't bother with --experimental-crel for LTO.
+// RUN: %clang -### --target=x86_64-linux -Werror -flto -Wa,--crel %s 2>&1 | FileCheck %s --check-prefix=LTO
+// LTO:       "-plugin-opt=-crel"
+
+// RUN: touch %t.o
+// RUN: not %clang -### --target=mips64-linux-gnu -flto -Wa,--crel %t.o 2>&1 | FileCheck %s --check-prefix=ERR
diff --git a/clang/test/Misc/cc1as-crel.s b/clang/test/Misc/cc1as-crel.s
new file mode 100644
index 00000000000000..953f2a1f3d3bcb
--- /dev/null
+++ b/clang/test/Misc/cc1as-crel.s
@@ -0,0 +1,6 @@
+// REQUIRES: x86-registered-target
+// RUN: %clang -cc1as -triple x86_64 %s -filetype obj --crel -o %t
+// RUN: llvm-readelf -S %t | FileCheck %s
+
+// CHECK: .crel.text CREL
+call foo
diff --git a/clang/tools/driver/cc1as_main.cpp b/clang/tools/driver/cc1as_main.cpp
index 86afe22fac24cc..9d53cca7a0bf5b 100644
--- a/clang/tools/driver/cc1as_main.cpp
+++ b/clang/tools/driver/cc1as_main.cpp
@@ -164,6 +164,9 @@ struct AssemblerInvocation {
   LLVM_PREFERRED_TYPE(bool)
   unsigned EmitCompactUnwindNonCanonical : 1;
 
+  LLVM_PREFERRED_TYPE(bool)
+  unsigned Crel : 1;
+
   /// The name of the relocation model to use.
   std::string RelocationModel;
 
@@ -204,6 +207,7 @@ struct AssemblerInvocation {
     EmbedBitcode = 0;
     EmitDwarfUnwind = EmitDwarfUnwindType::Default;
     EmitCompactUnwindNonCanonical = false;
+    Crel = false;
   }
 
   static bool CreateFromArgs(AssemblerInvocation &Res,
@@ -373,6 +377,7 @@ bool AssemblerInvocation::CreateFromArgs(AssemblerInvocation &Opts,
 
   Opts.EmitCompactUnwindNonCanonical =
       Args.hasArg(OPT_femit_compact_unwind_non_canonical);
+  Opts.Crel = Args.hasArg(OPT_crel);
 
   Opts.AsSecureLogFile = Args.getLastArgValue(OPT_as_secure_log_file);
 
@@ -429,6 +434,7 @@ static bool ExecuteAssemblerImpl(AssemblerInvocation &Opts,
   MCOptions.MCRelaxAll = Opts.RelaxAll;
   MCOptions.EmitDwarfUnwind = Opts.EmitDwarfUnwind;
   MCOptions.EmitCompactUnwindNonCanonical = Opts.EmitCompactUnwindNonCanonical;
+  MCOptions.Crel = Opts.Crel;
   MCOptions.X86RelaxRelocations = Opts.RelaxELFRelocations;
   MCOptions.CompressDebugSections = Opts.CompressDebugSections;
   MCOptions.AsSecureLogFile = Opts.AsSecureLogFile;
diff --git a/llvm/include/llvm/BinaryFormat/DynamicTags.def b/llvm/include/llvm/BinaryFormat/DynamicTags.def
index 1502d375f5c45d..d67a1e6de3678f 100644
--- a/llvm/include/llvm/BinaryFormat/DynamicTags.def
+++ b/llvm/include/llvm/BinaryFormat/DynamicTags.def
@@ -86,6 +86,8 @@ DYNAMIC_TAG(RELRSZ, 35)  // Size of Relr relocation table.
 DYNAMIC_TAG(RELR, 36)    // Address of relocation table (Relr entries).
 DYNAMIC_TAG(RELRENT, 37) // Size of a Relr relocation entry.
 
+DYNAMIC_TAG(CREL,  38)   // CREL relocation table
+
 DYNAMIC_TAG_MARKER(LOOS, 0x60000000)   // Start of environment specific tags.
 DYNAMIC_TAG_MARKER(HIOS, 0x6FFFFFFF)   // End of environment specific tags.
 DYNAMIC_TAG_MARKER(LOPROC, 0x70000000) // Start of processor specific tags.
diff --git a/llvm/include/llvm/BinaryFormat/ELF.h b/llvm/include/llvm/BinaryFormat/ELF.h
index f296acc2ca4bb9..b5c546d8b67e73 100644
--- a/llvm/include/llvm/BinaryFormat/ELF.h
+++ b/llvm/include/llvm/BinaryFormat/ELF.h
@@ -1079,6 +1079,7 @@ enum : unsigned {
   // Experimental support for SHT_RELR sections. For details, see proposal
   // at https://groups.google.com/forum/#!topic/generic-abi/bX460iggiKg
   SHT_RELR = 19,         // Relocation entries; only offsets.
+  SHT_CREL = 20,         // CREL relocation entries
   SHT_LOOS = 0x60000000, // Lowest operating system-specific type.
   // Android packed relocation section types.
   // https://android.googlesource.com/platform/bionic/+/6f12bfece5dcc01325e0abba56a46b1bcf991c69/tools/relocation_packer/src/elf_file.cc#37
diff --git a/llvm/include/llvm/MC/MCTargetOptions.h b/llvm/include/llvm/MC/MCTargetOptions.h
index 0cf2806bd48040..98317712250bf9 100644
--- a/llvm/include/llvm/MC/MCTargetOptions.h
+++ b/llvm/include/llvm/MC/MCTargetOptions.h
@@ -61,6 +61,9 @@ class MCTargetOptions {
 
   bool Dwarf64 : 1;
 
+  // Use CREL relocation format for ELF.
+  bool Crel = false;
+
   // If true, prefer R_X86_64_[REX_]GOTPCRELX to R_X86_64_GOTPCREL on x86-64
   // ELF.
   bool X86RelaxRelocations = true;
diff --git a/llvm/include/llvm/MC/MCTargetOptionsCommandFlags.h b/llvm/include/llvm/MC/MCTargetOptionsCommandFlags.h
index fc35eea09c4b3e..837f148e8f4ab4 100644
--- a/llvm/include/llvm/MC/MCTargetOptionsCommandFlags.h
+++ b/llvm/include/llvm/MC/MCTargetOptionsCommandFlags.h
@@ -49,6 +49,7 @@ bool getNoDeprecatedWarn();
 
 bool getNoTypeCheck();
 
+bool getCrel();
 bool getX86RelaxRelocations();
 
 std::string getABIName();
diff --git a/llvm/include/llvm/Object/ELF.h b/llvm/include/llvm/Object/ELF.h
index 78b6b40cbddf67..26e38075f8db00 100644
--- a/llvm/include/llvm/Object/ELF.h
+++ b/llvm/include/llvm/Object/ELF.h
@@ -321,6 +321,11 @@ class ELFFile {
 
   std::vector<Elf_Rel> decode_relrs(Elf_Relr_Range relrs) const;
 
+  uint64_t crelHeader(ArrayRef<uint8_t> Content) const;
+  using RelsOrRelas = std::pair<std::vector<Elf_Rel>, std::vector<Elf_Rela>>;
+  Expected<RelsOrRelas> decodeCrel(ArrayRef<uint8_t> Content) const;
+  Expected<RelsOrRelas> crels(const Elf_Shdr &Sec) const;
+
   Expected<std::vector<Elf_Rela>> android_relas(const Elf_Shdr &Sec) const;
 
   /// Iterate over program header table.
diff --git a/llvm/include/llvm/Object/ELFObjectFile.h b/llvm/include/llvm/Object/ELFObjectFile.h
index 8cc09e7fd7d551..505ea9ede313af 100644
--- a/llvm/include/llvm/Object/ELFObjectFile.h
+++ b/llvm/include/llvm/Object/ELFObjectFile.h
@@ -29,6 +29,7 @@
 #include "llvm/Support/ELFAttributes.h"
 #include "llvm/Support/Error.h"
 #include "llvm/Support/ErrorHandling.h"
+#include "llvm/Support/LEB128.h"
 #include "llvm/Support/MemoryBufferRef.h"
 #include "llvm/Support/ScopedPrinter.h"
 #include "llvm/TargetParser/SubtargetFeature.h"
@@ -292,6 +293,9 @@ template <class ELFT> class ELFObjectFile : public ELFObjectFileBase {
   const Elf_Shdr *DotSymtabSec = nullptr; // Symbol table section.
   const Elf_Shdr *DotSymtabShndxSec = nullptr; // SHT_SYMTAB_SHNDX section.
 
+  // Hold CREL relocations for SectionRef::relocations().
+  mutable SmallVector<SmallVector<Elf_Crel, 0>, 0> Crels;
+
   Error initContent() override;
 
   void moveSymbolNext(DataRefImpl &Symb) const override;
@@ -446,6 +450,7 @@ template <class ELFT> class ELFObjectFile : public ELFObjectFileBase {
 
   const Elf_Rel *getRel(DataRefImpl Rel) const;
   const Elf_Rela *getRela(DataRefImpl Rela) const;
+  Elf_Crel getCrel(DataRefImpl Rel) const;
 
   Expected<const Elf_Sym *> getSymbol(DataRefImpl Sym) const {
     return EF.template getEntry<Elf_Sym>(Sym.d.a, Sym.d.b);
@@ -1022,6 +1027,40 @@ ELFObjectFile<ELFT>::section_rel_begin(DataRefImpl Sec) const {
   uintptr_t SHT = reinterpret_cast<uintptr_t>((*SectionsOrErr).begin());
   RelData.d.a = (Sec.p - SHT) / EF.getHeader().e_shentsize;
   RelData.d.b = 0;
+  if (reinterpret_cast<const Elf_Shdr *>(Sec.p)->sh_type == ELF::SHT_CREL) {
+    if (RelData.d.a + 1 > Crels.size())
+      Crels.resize(RelData.d.a + 1);
+    if (Crels[RelData.d.a].empty()) {
+      ArrayRef<uint8_t> Content = cantFail(getSectionContents(Sec));
+      DataExtractor Data(Content, true, 8); // endian/class is irrelevant
+      DataExtractor::Cursor Cur(0);
+
+      const auto Hdr = Data.getULEB128(Cur);
+      const size_t Count = Hdr / 8, FlagBits = Hdr & 4 ? 3 : 2, Shift = Hdr % 4;
+      uintX_t Offset = 0, Addend = 0;
+      uint32_t Symidx = 0, Type = 0;
+      for (size_t i = 0; i != Count; ++i) {
+        const uint8_t B = Data.getU8(Cur);
+        Offset += B >> FlagBits;
+        if (B >= 0x80)
+          Offset +=
+              (Data.getULEB128(Cur) << (7 - FlagBits)) - (0x80 >> FlagBits);
+        if (B & 1)
+          Symidx += Data.getSLEB128(Cur);
+        if (B & 2)
+          Type += Data.getSLEB128(Cur);
+        if (B & 4)
+          Addend += Data.getSLEB128(Cur);
+        if (!Cur)
+          break;
+        Crels[RelData.d.a].push_back(
+            Elf_Crel{Offset << Shift, uint32_t(Symidx), Type,
+                     std::make_signed_t<typename ELFT::uint>(Addend)});
+      }
+      if (!Cur)
+        consumeError(Cur.takeError());
+    }
+  }
   return relocation_iterator(RelocationRef(RelData, this));
 }
 
@@ -1030,9 +1069,13 @@ relocation_iterator
 ELFObjectFile<ELFT>::section_rel_end(DataRefImpl Sec) const {
   const Elf_Shdr *S = reinterpret_cast<const Elf_Shdr *>(Sec.p);
   relocation_iterator Begin = section_rel_begin(Sec);
+  DataRefImpl RelData = Begin->getRawDataRefImpl();
+  if (S->sh_type == ELF::SHT_CREL) {
+    RelData.d.b = Crels[RelData.d.a].size();
+    return relocation_iterator(RelocationRef(RelData, this));
+  }
   if (S->sh_type != ELF::SHT_RELA && S->sh_type != ELF::SHT_REL)
     return Begin;
-  DataRefImpl RelData = Begin->getRawDataRefImpl();
   const Elf_Shdr *RelSec = getRelSection(RelData);
 
   // Error check sh_link here so that getRelocationSymbol can just use it.
@@ -1050,7 +1093,7 @@ Expected<section_iterator>
 ELFObjectFile<ELFT>::getRelocatedSection(DataRefImpl Sec) const {
   const Elf_Shdr *EShdr = getSection(Sec);
   uintX_t Type = EShdr->sh_type;
-  if (Type != ELF::SHT_REL && Type != ELF::SHT_RELA)
+  if (Type != ELF::SHT_REL && Type != ELF::SHT_RELA && Type != ELF::SHT_CREL)
     return section_end();
 
   Expected<const Elf_Shdr *> SecOrErr = EF.getSection(EShdr->sh_info);
@@ -1070,7 +1113,9 @@ symbol_iterator
 ELFObjectFile<ELFT>::getRelocationSymbol(DataRefImpl Rel) const {
   uint32_t symbolIdx;
   const Elf_Shdr *sec = getRelSection(Rel);
-  if (sec->sh_type == ELF::SHT_REL)
+  if (sec->sh_type == ELF::SHT_CREL)
+    symbolIdx = getCrel(Rel).r_symidx;
+  else if (sec->sh_type == ELF::SHT_REL)
     symbolIdx = getRel(Rel)->getSymbol(EF.isMips64EL());
   else
     symbolIdx = getRela(Rel)->getSymbol(EF.isMips64EL());
@@ -1087,6 +1132,8 @@ ELFObjectFile<ELFT>::getRelocationSymbol(DataRefImpl Rel) const {
 template <class ELFT>
 uint64_t ELFObjectFile<ELFT>::getRelocationOffset(DataRefImpl Rel) const {
   const Elf_Shdr *sec = getRelSection(Rel);
+  if (sec->sh_type == ELF::SHT_CREL)
+    return getCrel(Rel).r_offset;
   if (sec->sh_type == ELF::SHT_REL)
     return getRel(Rel)->r_offset;
 
@@ -1096,6 +1143,8 @@ uint64_t ELFObjectFile<ELFT>::getRelocationOffset(DataRefImpl Rel) const {
 template <class ELFT>
 uint64_t ELFObjectFile<ELFT>::getRelocationType(DataRefImpl Rel) const {
   const Elf_Shdr *sec = getRelSection(Rel);
+  if (sec->sh_type == ELF::SHT_CREL)
+    return getCrel(Rel).r_type;
   if (sec->sh_type == ELF::SHT_REL)
     return getRel(Rel)->getType(EF.isMips64EL());
   else
@@ -1117,9 +1166,11 @@ void ELFObjectFile<ELFT>::getRelocationTypeName(
 template <class ELFT>
 Expected<int64_t>
 ELFObjectFile<ELFT>::getRelocationAddend(DataRefImpl Rel) const {
-  if (getRelSection(Rel)->sh_type != ELF::SHT_RELA)
-    return createError("Section is not SHT_RELA");
-  return (int64_t)getRela(Rel)->r_addend;
+  if (getRelSection(Rel)->sh_type == ELF::SHT_RELA)
+    return (int64_t)getRela(Rel)->r_addend;
+  if (getRelSection(Rel)->sh_type == ELF::SHT_CREL)
+    return (int64_t)getCrel(Rel).r_addend;
+  return createError("Section is not SHT_RELA");
 }
 
 template <class ELFT>
@@ -1142,6 +1193,13 @@ ELFObjectFile<ELFT>::getRela(DataRefImpl Rela) const {
   return *Ret;
 }
 
+template <class ELFT>
+typename ELFObjectFile<ELFT>::Elf_Crel
+ELFObjectFile<ELFT>::getCrel(DataRefImpl Rel) const {
+  assert(getRelSection(Rel)->sh_type == ELF::SHT_CREL);
+  return Crels[Rel.d.a][Rel.d.b];
+}
+
 template <class ELFT>
 Expected<ELFObjectFile<ELFT>>
 ELFObjectFile<ELFT>::create(MemoryBufferRef Object, bool InitContent) {
diff --git a/llvm/include/llvm/Object/ELFTypes.h b/llvm/include/llvm/Object/ELFTypes.h
index 4ab23e4ea81b1a..aa07f4cacc4aba 100644
--- a/llvm/include/llvm/Object/ELFTypes.h
+++ b/llvm/include/llvm/Object/ELFTypes.h
@@ -32,6 +32,7 @@ template <class ELFT> struct Elf_Sym_Impl;
 template <class ELFT> struct Elf_Dyn_Impl;
 template <class ELFT> struct Elf_Phdr_Impl;
 template <class ELFT, bool isRela> struct Elf_Rel_Impl;
+template <bool Is64> struct Elf_Crel_Impl;
 template <class ELFT> struct Elf_Verdef_Impl;
 template <class ELFT> struct Elf_Verdaux_Impl;
 template <class ELFT> struct Elf_Verneed_Impl;
@@ -62,6 +63,7 @@ template <endianness E, bool Is64> struct ELFType {
   using Phdr = Elf_Phdr_Impl<ELFType<E, Is64>>;
   using Rel = Elf_Rel_Impl<ELFType<E, Is64>, false>;
   using Rela = Elf_Rel_Impl<ELFType<E, Is64>, true>;
+  using Crel = Elf_Crel_Impl<Is64>;
   using Relr = packed<uint>;
   using Verdef = Elf_Verdef_Impl<ELFType<E, Is64>>;
   using Verdaux = Elf_Verdaux_Impl<ELFType<E, Is64>>;
@@ -117,6 +119,7 @@ using ELF64BE = ELFType<llvm::endianness::big, true>;
   using Elf_Phdr = typename ELFT::Phdr;                                        \
   using Elf_Rel = typename ELFT::Rel;                                          \
   using Elf_Rela = typename ELFT::Rela;                                        \
+  using Elf_Crel = typename ELFT::Crel;                                        \
   using Elf_Relr = typename ELFT::Relr;                                     ...
[truncated]

@llvmbot
Copy link
Collaborator

llvmbot commented May 6, 2024

@llvm/pr-subscribers-mc

Author: Fangrui Song (MaskRay)

Changes

CREL is a compact relocation format for the ELF object file format.

This patch adds integrated assembler support (using the RELA form),
which can be enabled with clang -c -Wa,--crel,--experimental-crel.
-Wa,--experimental-crel indicates that CREL is experimental and not
a standard. -Wa,--experimental-crel might be removed in the future.

  • We take a section type code 20 from the generic range for SHT_CREL.
    We avoided using SHT_LLVM_ or SHT_GNU_ to maintain broader
    applicability for interested psABIs.
  • In the extremely unlikely case the generic ABI assigns 20 to another
    purpose, and existing toolchains generate relocatable files using that
    code (unlikely if we ensure GNU and LLVM don't do this), any conflicts
    would be the user's responsibility to resolve.

This patch also adds llvm-readobj support (for both REL and RELA forms)
to facilitate testing the assembler. Additionally, yaml2obj gains
support for the RELA form to aid testing with llvm-readobj.

Currently, linking the generated relocatable file with GNU ld will fail
because GNU ld doesn't support CREL yet.

unknown architecture of input file `a.o' is incompatible with i386:x86-64 output

lld will also produce an error when attempting to link a CREL relocatable file.
The support will come in a future change.

Link: https://discourse.llvm.org/t/rfc-crel-a-compact-relocation-format-for-elf/77600


Patch is 72.41 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/91280.diff

32 Files Affected:

  • (modified) clang/include/clang/Basic/CodeGenOptions.def (+1)
  • (modified) clang/include/clang/Basic/DiagnosticDriverKinds.td (+3)
  • (modified) clang/include/clang/Driver/Options.td (+3)
  • (modified) clang/lib/CodeGen/BackendUtil.cpp (+1)
  • (modified) clang/lib/Driver/ToolChains/Clang.cpp (+18)
  • (modified) clang/lib/Driver/ToolChains/CommonArgs.cpp (+21)
  • (added) clang/test/Driver/crel.c (+24)
  • (added) clang/test/Misc/cc1as-crel.s (+6)
  • (modified) clang/tools/driver/cc1as_main.cpp (+6)
  • (modified) llvm/include/llvm/BinaryFormat/DynamicTags.def (+2)
  • (modified) llvm/include/llvm/BinaryFormat/ELF.h (+1)
  • (modified) llvm/include/llvm/MC/MCTargetOptions.h (+3)
  • (modified) llvm/include/llvm/MC/MCTargetOptionsCommandFlags.h (+1)
  • (modified) llvm/include/llvm/Object/ELF.h (+5)
  • (modified) llvm/include/llvm/Object/ELFObjectFile.h (+64-6)
  • (modified) llvm/include/llvm/Object/ELFTypes.h (+25)
  • (modified) llvm/lib/MC/ELFObjectWriter.cpp (+84-26)
  • (modified) llvm/lib/MC/MCTargetOptionsCommandFlags.cpp (+6)
  • (modified) llvm/lib/Object/ELF.cpp (+63)
  • (modified) llvm/lib/ObjectYAML/ELFEmitter.cpp (+53-7)
  • (modified) llvm/lib/ObjectYAML/ELFYAML.cpp (+2)
  • (added) llvm/test/MC/ELF/crel-32.s (+16)
  • (added) llvm/test/MC/ELF/crel.s (+100)
  • (added) llvm/test/tools/llvm-readobj/ELF/crel.test (+180)
  • (modified) llvm/test/tools/llvm-readobj/ELF/dynamic-reloc.test (+25-6)
  • (modified) llvm/test/tools/llvm-readobj/ELF/relocation-errors.test (+18-1)
  • (modified) llvm/test/tools/yaml2obj/ELF/dynamic-relocations.yaml (+4-1)
  • (modified) llvm/test/tools/yaml2obj/ELF/reloc-sec-entry-size.yaml (+5)
  • (added) llvm/test/tools/yaml2obj/ELF/relocation-crel.yaml (+63)
  • (modified) llvm/test/tools/yaml2obj/ELF/relocation-missing-symbol.yaml (+2-1)
  • (modified) llvm/test/tools/yaml2obj/ELF/relocation-type.yaml (+3-1)
  • (modified) llvm/tools/llvm-readobj/ELFDumper.cpp (+75-6)
diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def
index 340b08dd7e2a33..3229f77eef1fcc 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -36,6 +36,7 @@ VALUE_CODEGENOPT(Name, Bits, Default)
 #endif
 
 CODEGENOPT(DisableIntegratedAS, 1, 0) ///< -no-integrated-as
+CODEGENOPT(Crel, 1, 0) ///< -Wa,--crel
 CODEGENOPT(RelaxELFRelocations, 1, 1) ///< -Wa,-mrelax-relocations={yes,no}
 CODEGENOPT(AsmVerbose        , 1, 0) ///< -dA, -fverbose-asm.
 CODEGENOPT(PreserveAsmComments, 1, 1) ///< -dA, -fno-preserve-as-comments.
diff --git a/clang/include/clang/Basic/DiagnosticDriverKinds.td b/clang/include/clang/Basic/DiagnosticDriverKinds.td
index 9781fcaa4ff5e9..e9cea8967c1334 100644
--- a/clang/include/clang/Basic/DiagnosticDriverKinds.td
+++ b/clang/include/clang/Basic/DiagnosticDriverKinds.td
@@ -809,6 +809,9 @@ def warn_drv_missing_multilib : Warning<
 def note_drv_available_multilibs : Note<
   "available multilibs are:%0">;
 
+def err_drv_experimental_crel : Error<
+  "-Wa,--experimental-crel must be specified to use -Wa,--crel. CREL is experimental and takes a non-standard section type code">;
+
 def warn_android_unversioned_fallback : Warning<
   "Using unversioned Android target directory %0 for target %1. Unversioned"
   " directories will not be used in Clang 19. Provide a versioned directory"
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index c9f7c4e5f718b4..55f0c9aff6a49f 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -6958,6 +6958,9 @@ def massembler_no_warn : Flag<["-"], "massembler-no-warn">,
 def massembler_fatal_warnings : Flag<["-"], "massembler-fatal-warnings">,
   HelpText<"Make assembler warnings fatal">,
   MarshallingInfoFlag<CodeGenOpts<"FatalWarnings">>;
+def crel : Flag<["--"], "crel">,
+  HelpText<"Enable CREL relocation format (ELF only)">,
+  MarshallingInfoFlag<CodeGenOpts<"Crel">>;
 def mrelax_relocations_no : Flag<["-"], "mrelax-relocations=no">,
     HelpText<"Disable x86 relax relocations">,
     MarshallingInfoNegativeFlag<CodeGenOpts<"RelaxELFRelocations">>;
diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp
index 22c3f8642ad8eb..ed37085d69ac53 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -474,6 +474,7 @@ static bool initTargetOptions(DiagnosticsEngine &Diags,
   Options.MCOptions.AsmVerbose = CodeGenOpts.AsmVerbose;
   Options.MCOptions.Dwarf64 = CodeGenOpts.Dwarf64;
   Options.MCOptions.PreserveAsmComments = CodeGenOpts.PreserveAsmComments;
+  Options.MCOptions.Crel = CodeGenOpts.Crel;
   Options.MCOptions.X86RelaxRelocations = CodeGenOpts.RelaxELFRelocations;
   Options.MCOptions.CompressDebugSections =
       CodeGenOpts.getCompressDebugSections();
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index 0a2ea96de73824..0f4d8aa3ad7938 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -2467,6 +2467,8 @@ static void CollectArgsForIntegratedAssembler(Compilation &C,
   // arg after parsing the '-I' arg.
   bool TakeNextArg = false;
 
+  const llvm::Triple &Triple = C.getDefaultToolChain().getTriple();
+  bool Crel = false, ExperimentalCrel = false;
   bool UseRelaxRelocations = C.getDefaultToolChain().useRelaxRelocations();
   bool UseNoExecStack = false;
   const char *MipsTargetFeature = nullptr;
@@ -2590,6 +2592,12 @@ static void CollectArgsForIntegratedAssembler(Compilation &C,
                  Value == "-nocompress-debug-sections" ||
                  Value == "--nocompress-debug-sections") {
         CmdArgs.push_back(Value.data());
+      } else if (Value == "--crel") {
+        Crel = true;
+      } else if (Value == "--no-crel") {
+        Crel = false;
+      } else if (Value == "--experimental-crel") {
+        ExperimentalCrel = true;
       } else if (Value == "-mrelax-relocations=yes" ||
                  Value == "--mrelax-relocations=yes") {
         UseRelaxRelocations = true;
@@ -2655,6 +2663,16 @@ static void CollectArgsForIntegratedAssembler(Compilation &C,
   }
   if (ImplicitIt.size())
     AddARMImplicitITArgs(Args, CmdArgs, ImplicitIt);
+  if (Crel) {
+    if (!ExperimentalCrel)
+      D.Diag(diag::err_drv_experimental_crel);
+    if (Triple.isOSBinFormatELF() && !Triple.isMIPS()) {
+      CmdArgs.push_back("--crel");
+    } else {
+      D.Diag(diag::err_drv_unsupported_opt_for_target)
+          << "-Wa,--crel" << D.getTargetTriple();
+    }
+  }
   if (!UseRelaxRelocations)
     CmdArgs.push_back("-mrelax-relocations=no");
   if (UseNoExecStack)
diff --git a/clang/lib/Driver/ToolChains/CommonArgs.cpp b/clang/lib/Driver/ToolChains/CommonArgs.cpp
index 6796b43a155020..8ba45aeafd0569 100644
--- a/clang/lib/Driver/ToolChains/CommonArgs.cpp
+++ b/clang/lib/Driver/ToolChains/CommonArgs.cpp
@@ -1072,6 +1072,27 @@ void tools::addLTOOptions(const ToolChain &ToolChain, const ArgList &Args,
 
   addMachineOutlinerArgs(D, Args, CmdArgs, ToolChain.getEffectiveTriple(),
                          /*IsLTO=*/true, PluginOptPrefix);
+
+  for (const Arg *A : Args.filtered(options::OPT_Wa_COMMA)) {
+    bool Crel = false;
+    for (StringRef V : A->getValues()) {
+      if (V == "--crel")
+        Crel = true;
+      else if (V == "--no-crel")
+        Crel = false;
+      else
+        continue;
+      A->claim();
+    }
+    if (Crel) {
+      if (Triple.isOSBinFormatELF() && !Triple.isMIPS()) {
+        CmdArgs.push_back(Args.MakeArgString(Twine(PluginOptPrefix) + "-crel"));
+      } else {
+        D.Diag(diag::err_drv_unsupported_opt_for_target)
+            << "-Wa,--crel" << D.getTargetTriple();
+      }
+    }
+  }
 }
 
 /// Adds the '-lcgpu' and '-lmgpu' libraries to the compilation to include the
diff --git a/clang/test/Driver/crel.c b/clang/test/Driver/crel.c
new file mode 100644
index 00000000000000..0dc0507705d72d
--- /dev/null
+++ b/clang/test/Driver/crel.c
@@ -0,0 +1,24 @@
+// RUN: not %clang -### -c --target=x86_64 -Wa,--crel %s 2>&1 | FileCheck %s --check-prefix=NOEXP
+
+// NOEXP: error: -Wa,--experimental-crel must be specified to use -Wa,--crel. CREL is experimental and takes a non-standard section type code
+
+// RUN: %clang -### -c --target=x86_64 -Wa,--crel,--experimental-crel %s 2>&1 | FileCheck %s
+// RUN: %clang -### -c --target=x86_64 -Wa,--crel,--no-crel,--experimental-crel %s 2>&1 | FileCheck %s --check-prefix=NO
+// RUN: not %clang -### -c --target=arm64-apple-darwin -Wa,--crel,--experimental-crel %s 2>&1 | FileCheck %s --check-prefix=ERR
+// RUN: not %clang -### -c --target=mips64 -Wa,--crel,--experimental-crel %s 2>&1 | FileCheck %s --check-prefix=ERR
+
+// RUN: %clang -### -c --target=aarch64 -Werror -Wa,--crel,--experimental-crel -x assembler %s -Werror 2>&1 | FileCheck %s --check-prefix=ASM
+// RUN: not %clang -### -c --target=mips64 -Wa,--crel,--experimental-crel -x assembler %s 2>&1 | FileCheck %s --check-prefix=ERR
+
+// CHECK: "-cc1" {{.*}}"--crel"
+// NO:     "-cc1"
+// NO-NOT: "--crel"
+// ASM:   "-cc1as" {{.*}}"--crel"
+// ERR: error: unsupported option '-Wa,--crel' for target '{{.*}}'
+
+/// Don't bother with --experimental-crel for LTO.
+// RUN: %clang -### --target=x86_64-linux -Werror -flto -Wa,--crel %s 2>&1 | FileCheck %s --check-prefix=LTO
+// LTO:       "-plugin-opt=-crel"
+
+// RUN: touch %t.o
+// RUN: not %clang -### --target=mips64-linux-gnu -flto -Wa,--crel %t.o 2>&1 | FileCheck %s --check-prefix=ERR
diff --git a/clang/test/Misc/cc1as-crel.s b/clang/test/Misc/cc1as-crel.s
new file mode 100644
index 00000000000000..953f2a1f3d3bcb
--- /dev/null
+++ b/clang/test/Misc/cc1as-crel.s
@@ -0,0 +1,6 @@
+// REQUIRES: x86-registered-target
+// RUN: %clang -cc1as -triple x86_64 %s -filetype obj --crel -o %t
+// RUN: llvm-readelf -S %t | FileCheck %s
+
+// CHECK: .crel.text CREL
+call foo
diff --git a/clang/tools/driver/cc1as_main.cpp b/clang/tools/driver/cc1as_main.cpp
index 86afe22fac24cc..9d53cca7a0bf5b 100644
--- a/clang/tools/driver/cc1as_main.cpp
+++ b/clang/tools/driver/cc1as_main.cpp
@@ -164,6 +164,9 @@ struct AssemblerInvocation {
   LLVM_PREFERRED_TYPE(bool)
   unsigned EmitCompactUnwindNonCanonical : 1;
 
+  LLVM_PREFERRED_TYPE(bool)
+  unsigned Crel : 1;
+
   /// The name of the relocation model to use.
   std::string RelocationModel;
 
@@ -204,6 +207,7 @@ struct AssemblerInvocation {
     EmbedBitcode = 0;
     EmitDwarfUnwind = EmitDwarfUnwindType::Default;
     EmitCompactUnwindNonCanonical = false;
+    Crel = false;
   }
 
   static bool CreateFromArgs(AssemblerInvocation &Res,
@@ -373,6 +377,7 @@ bool AssemblerInvocation::CreateFromArgs(AssemblerInvocation &Opts,
 
   Opts.EmitCompactUnwindNonCanonical =
       Args.hasArg(OPT_femit_compact_unwind_non_canonical);
+  Opts.Crel = Args.hasArg(OPT_crel);
 
   Opts.AsSecureLogFile = Args.getLastArgValue(OPT_as_secure_log_file);
 
@@ -429,6 +434,7 @@ static bool ExecuteAssemblerImpl(AssemblerInvocation &Opts,
   MCOptions.MCRelaxAll = Opts.RelaxAll;
   MCOptions.EmitDwarfUnwind = Opts.EmitDwarfUnwind;
   MCOptions.EmitCompactUnwindNonCanonical = Opts.EmitCompactUnwindNonCanonical;
+  MCOptions.Crel = Opts.Crel;
   MCOptions.X86RelaxRelocations = Opts.RelaxELFRelocations;
   MCOptions.CompressDebugSections = Opts.CompressDebugSections;
   MCOptions.AsSecureLogFile = Opts.AsSecureLogFile;
diff --git a/llvm/include/llvm/BinaryFormat/DynamicTags.def b/llvm/include/llvm/BinaryFormat/DynamicTags.def
index 1502d375f5c45d..d67a1e6de3678f 100644
--- a/llvm/include/llvm/BinaryFormat/DynamicTags.def
+++ b/llvm/include/llvm/BinaryFormat/DynamicTags.def
@@ -86,6 +86,8 @@ DYNAMIC_TAG(RELRSZ, 35)  // Size of Relr relocation table.
 DYNAMIC_TAG(RELR, 36)    // Address of relocation table (Relr entries).
 DYNAMIC_TAG(RELRENT, 37) // Size of a Relr relocation entry.
 
+DYNAMIC_TAG(CREL,  38)   // CREL relocation table
+
 DYNAMIC_TAG_MARKER(LOOS, 0x60000000)   // Start of environment specific tags.
 DYNAMIC_TAG_MARKER(HIOS, 0x6FFFFFFF)   // End of environment specific tags.
 DYNAMIC_TAG_MARKER(LOPROC, 0x70000000) // Start of processor specific tags.
diff --git a/llvm/include/llvm/BinaryFormat/ELF.h b/llvm/include/llvm/BinaryFormat/ELF.h
index f296acc2ca4bb9..b5c546d8b67e73 100644
--- a/llvm/include/llvm/BinaryFormat/ELF.h
+++ b/llvm/include/llvm/BinaryFormat/ELF.h
@@ -1079,6 +1079,7 @@ enum : unsigned {
   // Experimental support for SHT_RELR sections. For details, see proposal
   // at https://groups.google.com/forum/#!topic/generic-abi/bX460iggiKg
   SHT_RELR = 19,         // Relocation entries; only offsets.
+  SHT_CREL = 20,         // CREL relocation entries
   SHT_LOOS = 0x60000000, // Lowest operating system-specific type.
   // Android packed relocation section types.
   // https://android.googlesource.com/platform/bionic/+/6f12bfece5dcc01325e0abba56a46b1bcf991c69/tools/relocation_packer/src/elf_file.cc#37
diff --git a/llvm/include/llvm/MC/MCTargetOptions.h b/llvm/include/llvm/MC/MCTargetOptions.h
index 0cf2806bd48040..98317712250bf9 100644
--- a/llvm/include/llvm/MC/MCTargetOptions.h
+++ b/llvm/include/llvm/MC/MCTargetOptions.h
@@ -61,6 +61,9 @@ class MCTargetOptions {
 
   bool Dwarf64 : 1;
 
+  // Use CREL relocation format for ELF.
+  bool Crel = false;
+
   // If true, prefer R_X86_64_[REX_]GOTPCRELX to R_X86_64_GOTPCREL on x86-64
   // ELF.
   bool X86RelaxRelocations = true;
diff --git a/llvm/include/llvm/MC/MCTargetOptionsCommandFlags.h b/llvm/include/llvm/MC/MCTargetOptionsCommandFlags.h
index fc35eea09c4b3e..837f148e8f4ab4 100644
--- a/llvm/include/llvm/MC/MCTargetOptionsCommandFlags.h
+++ b/llvm/include/llvm/MC/MCTargetOptionsCommandFlags.h
@@ -49,6 +49,7 @@ bool getNoDeprecatedWarn();
 
 bool getNoTypeCheck();
 
+bool getCrel();
 bool getX86RelaxRelocations();
 
 std::string getABIName();
diff --git a/llvm/include/llvm/Object/ELF.h b/llvm/include/llvm/Object/ELF.h
index 78b6b40cbddf67..26e38075f8db00 100644
--- a/llvm/include/llvm/Object/ELF.h
+++ b/llvm/include/llvm/Object/ELF.h
@@ -321,6 +321,11 @@ class ELFFile {
 
   std::vector<Elf_Rel> decode_relrs(Elf_Relr_Range relrs) const;
 
+  uint64_t crelHeader(ArrayRef<uint8_t> Content) const;
+  using RelsOrRelas = std::pair<std::vector<Elf_Rel>, std::vector<Elf_Rela>>;
+  Expected<RelsOrRelas> decodeCrel(ArrayRef<uint8_t> Content) const;
+  Expected<RelsOrRelas> crels(const Elf_Shdr &Sec) const;
+
   Expected<std::vector<Elf_Rela>> android_relas(const Elf_Shdr &Sec) const;
 
   /// Iterate over program header table.
diff --git a/llvm/include/llvm/Object/ELFObjectFile.h b/llvm/include/llvm/Object/ELFObjectFile.h
index 8cc09e7fd7d551..505ea9ede313af 100644
--- a/llvm/include/llvm/Object/ELFObjectFile.h
+++ b/llvm/include/llvm/Object/ELFObjectFile.h
@@ -29,6 +29,7 @@
 #include "llvm/Support/ELFAttributes.h"
 #include "llvm/Support/Error.h"
 #include "llvm/Support/ErrorHandling.h"
+#include "llvm/Support/LEB128.h"
 #include "llvm/Support/MemoryBufferRef.h"
 #include "llvm/Support/ScopedPrinter.h"
 #include "llvm/TargetParser/SubtargetFeature.h"
@@ -292,6 +293,9 @@ template <class ELFT> class ELFObjectFile : public ELFObjectFileBase {
   const Elf_Shdr *DotSymtabSec = nullptr; // Symbol table section.
   const Elf_Shdr *DotSymtabShndxSec = nullptr; // SHT_SYMTAB_SHNDX section.
 
+  // Hold CREL relocations for SectionRef::relocations().
+  mutable SmallVector<SmallVector<Elf_Crel, 0>, 0> Crels;
+
   Error initContent() override;
 
   void moveSymbolNext(DataRefImpl &Symb) const override;
@@ -446,6 +450,7 @@ template <class ELFT> class ELFObjectFile : public ELFObjectFileBase {
 
   const Elf_Rel *getRel(DataRefImpl Rel) const;
   const Elf_Rela *getRela(DataRefImpl Rela) const;
+  Elf_Crel getCrel(DataRefImpl Rel) const;
 
   Expected<const Elf_Sym *> getSymbol(DataRefImpl Sym) const {
     return EF.template getEntry<Elf_Sym>(Sym.d.a, Sym.d.b);
@@ -1022,6 +1027,40 @@ ELFObjectFile<ELFT>::section_rel_begin(DataRefImpl Sec) const {
   uintptr_t SHT = reinterpret_cast<uintptr_t>((*SectionsOrErr).begin());
   RelData.d.a = (Sec.p - SHT) / EF.getHeader().e_shentsize;
   RelData.d.b = 0;
+  if (reinterpret_cast<const Elf_Shdr *>(Sec.p)->sh_type == ELF::SHT_CREL) {
+    if (RelData.d.a + 1 > Crels.size())
+      Crels.resize(RelData.d.a + 1);
+    if (Crels[RelData.d.a].empty()) {
+      ArrayRef<uint8_t> Content = cantFail(getSectionContents(Sec));
+      DataExtractor Data(Content, true, 8); // endian/class is irrelevant
+      DataExtractor::Cursor Cur(0);
+
+      const auto Hdr = Data.getULEB128(Cur);
+      const size_t Count = Hdr / 8, FlagBits = Hdr & 4 ? 3 : 2, Shift = Hdr % 4;
+      uintX_t Offset = 0, Addend = 0;
+      uint32_t Symidx = 0, Type = 0;
+      for (size_t i = 0; i != Count; ++i) {
+        const uint8_t B = Data.getU8(Cur);
+        Offset += B >> FlagBits;
+        if (B >= 0x80)
+          Offset +=
+              (Data.getULEB128(Cur) << (7 - FlagBits)) - (0x80 >> FlagBits);
+        if (B & 1)
+          Symidx += Data.getSLEB128(Cur);
+        if (B & 2)
+          Type += Data.getSLEB128(Cur);
+        if (B & 4)
+          Addend += Data.getSLEB128(Cur);
+        if (!Cur)
+          break;
+        Crels[RelData.d.a].push_back(
+            Elf_Crel{Offset << Shift, uint32_t(Symidx), Type,
+                     std::make_signed_t<typename ELFT::uint>(Addend)});
+      }
+      if (!Cur)
+        consumeError(Cur.takeError());
+    }
+  }
   return relocation_iterator(RelocationRef(RelData, this));
 }
 
@@ -1030,9 +1069,13 @@ relocation_iterator
 ELFObjectFile<ELFT>::section_rel_end(DataRefImpl Sec) const {
   const Elf_Shdr *S = reinterpret_cast<const Elf_Shdr *>(Sec.p);
   relocation_iterator Begin = section_rel_begin(Sec);
+  DataRefImpl RelData = Begin->getRawDataRefImpl();
+  if (S->sh_type == ELF::SHT_CREL) {
+    RelData.d.b = Crels[RelData.d.a].size();
+    return relocation_iterator(RelocationRef(RelData, this));
+  }
   if (S->sh_type != ELF::SHT_RELA && S->sh_type != ELF::SHT_REL)
     return Begin;
-  DataRefImpl RelData = Begin->getRawDataRefImpl();
   const Elf_Shdr *RelSec = getRelSection(RelData);
 
   // Error check sh_link here so that getRelocationSymbol can just use it.
@@ -1050,7 +1093,7 @@ Expected<section_iterator>
 ELFObjectFile<ELFT>::getRelocatedSection(DataRefImpl Sec) const {
   const Elf_Shdr *EShdr = getSection(Sec);
   uintX_t Type = EShdr->sh_type;
-  if (Type != ELF::SHT_REL && Type != ELF::SHT_RELA)
+  if (Type != ELF::SHT_REL && Type != ELF::SHT_RELA && Type != ELF::SHT_CREL)
     return section_end();
 
   Expected<const Elf_Shdr *> SecOrErr = EF.getSection(EShdr->sh_info);
@@ -1070,7 +1113,9 @@ symbol_iterator
 ELFObjectFile<ELFT>::getRelocationSymbol(DataRefImpl Rel) const {
   uint32_t symbolIdx;
   const Elf_Shdr *sec = getRelSection(Rel);
-  if (sec->sh_type == ELF::SHT_REL)
+  if (sec->sh_type == ELF::SHT_CREL)
+    symbolIdx = getCrel(Rel).r_symidx;
+  else if (sec->sh_type == ELF::SHT_REL)
     symbolIdx = getRel(Rel)->getSymbol(EF.isMips64EL());
   else
     symbolIdx = getRela(Rel)->getSymbol(EF.isMips64EL());
@@ -1087,6 +1132,8 @@ ELFObjectFile<ELFT>::getRelocationSymbol(DataRefImpl Rel) const {
 template <class ELFT>
 uint64_t ELFObjectFile<ELFT>::getRelocationOffset(DataRefImpl Rel) const {
   const Elf_Shdr *sec = getRelSection(Rel);
+  if (sec->sh_type == ELF::SHT_CREL)
+    return getCrel(Rel).r_offset;
   if (sec->sh_type == ELF::SHT_REL)
     return getRel(Rel)->r_offset;
 
@@ -1096,6 +1143,8 @@ uint64_t ELFObjectFile<ELFT>::getRelocationOffset(DataRefImpl Rel) const {
 template <class ELFT>
 uint64_t ELFObjectFile<ELFT>::getRelocationType(DataRefImpl Rel) const {
   const Elf_Shdr *sec = getRelSection(Rel);
+  if (sec->sh_type == ELF::SHT_CREL)
+    return getCrel(Rel).r_type;
   if (sec->sh_type == ELF::SHT_REL)
     return getRel(Rel)->getType(EF.isMips64EL());
   else
@@ -1117,9 +1166,11 @@ void ELFObjectFile<ELFT>::getRelocationTypeName(
 template <class ELFT>
 Expected<int64_t>
 ELFObjectFile<ELFT>::getRelocationAddend(DataRefImpl Rel) const {
-  if (getRelSection(Rel)->sh_type != ELF::SHT_RELA)
-    return createError("Section is not SHT_RELA");
-  return (int64_t)getRela(Rel)->r_addend;
+  if (getRelSection(Rel)->sh_type == ELF::SHT_RELA)
+    return (int64_t)getRela(Rel)->r_addend;
+  if (getRelSection(Rel)->sh_type == ELF::SHT_CREL)
+    return (int64_t)getCrel(Rel).r_addend;
+  return createError("Section is not SHT_RELA");
 }
 
 template <class ELFT>
@@ -1142,6 +1193,13 @@ ELFObjectFile<ELFT>::getRela(DataRefImpl Rela) const {
   return *Ret;
 }
 
+template <class ELFT>
+typename ELFObjectFile<ELFT>::Elf_Crel
+ELFObjectFile<ELFT>::getCrel(DataRefImpl Rel) const {
+  assert(getRelSection(Rel)->sh_type == ELF::SHT_CREL);
+  return Crels[Rel.d.a][Rel.d.b];
+}
+
 template <class ELFT>
 Expected<ELFObjectFile<ELFT>>
 ELFObjectFile<ELFT>::create(MemoryBufferRef Object, bool InitContent) {
diff --git a/llvm/include/llvm/Object/ELFTypes.h b/llvm/include/llvm/Object/ELFTypes.h
index 4ab23e4ea81b1a..aa07f4cacc4aba 100644
--- a/llvm/include/llvm/Object/ELFTypes.h
+++ b/llvm/include/llvm/Object/ELFTypes.h
@@ -32,6 +32,7 @@ template <class ELFT> struct Elf_Sym_Impl;
 template <class ELFT> struct Elf_Dyn_Impl;
 template <class ELFT> struct Elf_Phdr_Impl;
 template <class ELFT, bool isRela> struct Elf_Rel_Impl;
+template <bool Is64> struct Elf_Crel_Impl;
 template <class ELFT> struct Elf_Verdef_Impl;
 template <class ELFT> struct Elf_Verdaux_Impl;
 template <class ELFT> struct Elf_Verneed_Impl;
@@ -62,6 +63,7 @@ template <endianness E, bool Is64> struct ELFType {
   using Phdr = Elf_Phdr_Impl<ELFType<E, Is64>>;
   using Rel = Elf_Rel_Impl<ELFType<E, Is64>, false>;
   using Rela = Elf_Rel_Impl<ELFType<E, Is64>, true>;
+  using Crel = Elf_Crel_Impl<Is64>;
   using Relr = packed<uint>;
   using Verdef = Elf_Verdef_Impl<ELFType<E, Is64>>;
   using Verdaux = Elf_Verdaux_Impl<ELFType<E, Is64>>;
@@ -117,6 +119,7 @@ using ELF64BE = ELFType<llvm::endianness::big, true>;
   using Elf_Phdr = typename ELFT::Phdr;                                        \
   using Elf_Rel = typename ELFT::Rel;                                          \
   using Elf_Rela = typename ELFT::Rela;                                        \
+  using Elf_Crel = typename ELFT::Crel;                                        \
   using Elf_Relr = typename ELFT::Relr;                                     ...
[truncated]

@dwblaikie
Copy link
Collaborator

Presumably this'll be split out into separate reviews for the components here? (I'd guess llvm-mc then clang integrated assembler, with readobj and yaml2obj in any order/don't have dependencies, unless they're needed for testing, in which case I guess they come first?)

I hesitate to comment on the specifics while it's all together as I imagine it'll become hard to follow the different commentary across the larger patch/different project areas, etc.

(though the flag design seems strange to me - having --crel and --experimental-crel, rather than having only the latter (and renaming it in the future, rather than using two flags now and removing one in the future))

@dwblaikie
Copy link
Collaborator

Oh, and I take it there's no /official/ extension space in the SHT_* space? The intent is to standardize it where it lies? (like use 20 for the shipped version? Or eventually get some lower designation?) I understand tnhe hesitance to use "SHT_LLVM" or the like (though equally, it does seem suitable (to try to use a name less likely to collide), if this is an experiment ,but I understand other projects may have some not-invented-here negative sentiment to such a thing), perhaps "SHT_EXP" wouldn't be too bad?

@MaskRay
Copy link
Member Author

MaskRay commented May 7, 2024

Thanks for the comments.

The syntax -Wa,--crel -Wa,--experimental-crel (or -Wa,--crel,--experimental-crel) might seem cumbersome, it serves a purpose:

  • -Wa,--experimental-crel: no effect
  • -Wa,--crel: driver error. Make sure users acknowledge the non-standard status.
  • -Wa,--crel,--experimental-crel: CREL

As CREL adoption grows within the ecosystem, it has the potential to become a de-facto standard, similar to SHF_EXCLUDE.
Once that happens, the --experimental-crel flag can be deprecated and eventually removed.

This approahc parallels the strategy used for -enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang, albeit a nicer name.


I can split this PR into 5 components.

  • llvm-mc
  • clang driver
  • yaml2obj/obj2yaml
  • llvm-readobj
  • llvm-objdump

The assembler testing might look bare-bone without llvm-readobj support.


While ELF allows extension space for processor/OS/application-specific section types, using them introduces challenges
since we want linker errors when SHT_CREL is unsupported (see isKnownSpecificSectionType in #85173):

  • [SHT_LOUSER,SHT_HIUSER]: Not applicable as static relocation sections do not have the SHF_ALLOC flag.
  • [SHT_LOOS,SHT_HIOS]: Requires annoying SHF_OS_NONCONFORMING. We want psABI documents to be willing to accept CREL when toolchain becomes ready, especially if their relocatable files are huge due to linker relaxation (e.g. RISC-V, LoongArch), we want to avoid GNUism / LLVMism.
  • [SHT_LOPROC,SHT_HIPROC]: Processor-specific and not applicable.

We will unlikely change the format. I avoid the name SHT_EXP_* because the renaming would cause unnecessary churn.
We also don't want to support two section type codes, because testing would be unnecessarily complex.

Solaris-specific SHF_ORDERED and SHF_EXCLUDE, without a SunOS/Solaris name infix/prefix/suffix, make them into glibc elf.h.
SHF_EXCLUDE gets picked due to its linker semantics but SHF_ORDERED is never used.
A generic name SHT_CREL might not be fundamentally different from SHF_EXCLUDE.

@smithp35
Copy link
Collaborator

smithp35 commented May 8, 2024

I will take a look, but may take me a while to go through it. Some very quick comments:

  • Would --enable-experimental-crel be a better name than --experimental-crel that makes it clear that the option is just an unlock of another flag, and not a flag in of itself?
  • I agree that the section type should be in the generic range. While it is in the experimental stage it could be possible to put it in SHT_LOOS SHT_HIOS range temporarily. When there is an official ELF code in the generic range it could be changed (forcing a recompile, or a migration script), or just have the LLVM tools implicitly convert from the old code to the new. For the experimental range.
  • While I can give a personal opinion. This will touch multiple components of LLVM each with their own code-owners. Will probably need a signal, either on the review or via discourse threads that the community is happy to go ahead with the addition of CREL.

@dwblaikie
Copy link
Collaborator

This approahc parallels the strategy used for -enable-trivial-auto-var-init-zero-knowing-it-will-be-removed-from-clang, albeit a nicer name.

Fair enough - yeah, if we're wordsmithing this. Maybe "allow" would be a good word, rather than "enable" (or no verb at all), as in -allow-experimental-crel?

While ELF allows extension space for processor/OS/application-specific section types, using them introduces challenges
since we want linker errors when SHT_CREL is unsupported (see isKnownSpecificSectionType in #85173):

[SHT_LOUSER,SHT_HIUSER]: Not applicable as static relocation sections do not have the SHF_ALLOC flag.

I missed a step here - why is this range only applicable for SHF_ALLOC sections? Ah, from the patch you linked earlier, linkers don't fail on SHT_*USER range unless it's allocated... that seems a bit tenuous, though.

Seems like this the range intended for this kind of experimentation, with the cost that it doesn't produce an error from existing linkers (though it could, but they're trying to be conservative/allow for extensions that don't need linker-awareness).

Guess I'll leave that up to other folks to comment on.

[SHT_LOOS,SHT_HIOS]: Requires annoying SHF_OS_NONCONFORMING. We want psABI documents to be willing to accept CREL when toolchain becomes ready, especially if their relocatable files are huge due to linker relaxation (e.g. RISC-V, LoongArch), we want to avoid GNUism / LLVMism.

So in this case each OS would have to allocate their own number in in the SHT_*OS range? Yeah, that seems unfortunate.

[SHT_LOPROC,SHT_HIPROC]: Processor-specific and not applicable.

Similar to the *OS case above, I guess.

We will unlikely change the format.

I think we should start with the theory that the format will change - even if it doesn't, we wouldn't want to introduce this new thing and avoid change because we've made it hard to change if we discover in early testing that there's things we could do better.

Solaris-specific SHF_ORDERED and SHF_EXCLUDE, without a SunOS/Solaris name infix/prefix/suffix, make them into glibc elf.h.
SHF_EXCLUDE gets picked due to its linker semantics but SHF_ORDERED is never used.
A generic name SHT_CREL might not be fundamentally different from SHF_EXCLUDE.

Got links to the history here, about how SHF_ORDERED/EXCLUDE came to be?

Created using spr 1.3.5-bogner
@MaskRay MaskRay changed the title [MC,clang,llvm-readobj,yaml2obj] Support CREL relocation format [MC,llvm-readobj,yaml2obj] Support CREL relocation format May 8, 2024
Copy link

github-actions bot commented May 8, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@MaskRay
Copy link
Member Author

MaskRay commented May 8, 2024

  // TODO: Experimental CREL relocations. LLVM will change the value and
  // break compatibility in the future.
  SHT_CREL = 0x40000014,

Switched to 0x40000014 (generic range) to retain linker errors while making the
experimental status stand out. Add a comment to make the intention clearer.

The [SHT_LOOS, SHT_HIOS] range with a SHF_OS_NONCONFORMING flag would
change llvm-readelf output differences when we change the section type code.

Avoiding code churn, I'll keep the current SHT name.
A helper function like isCrelType is not ideal due to existing code using direct comparisons or switch statements.

Clang changes have been removed. I will use clang -c -Wl,--crel,--allow-experimental-crel in a dependent patch.

The labels (e.g. clang llvm:binary-utilities mc objectyaml) shall notify subscribed folks.

llvm:binary-utilities objectyaml are primarily reviewed by @jh7370 and me.
Almost all functionality changes to llvm/lib/MC/ELFObjectWriter.cpp in the past 5 years are from me.
I'll manually tag some folks on the discourse thread to give this a signal boost.

@smithp35
Copy link
Collaborator

smithp35 commented May 9, 2024

I think it would be useful to nominate a source as the canonical reference for the specification which is updated along with any implementation changes. I think https://discourse.llvm.org/t/rfc-crel-a-compact-relocation-format-for-elf/77600/3 is a good starting point.

Possible examples:

  • A patch to the LLVM documentation.
  • A blog-post that is kept up to date with the implementation.
  • A discourse post that is updated.

Copy link
Collaborator

@smithp35 smithp35 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some initial comments based on a read of the ELFObjectFile. I haven't got as far as the test and tools.

One general thought is whether it is worth adding in some more constants. For example:

if (B & 1)

Could be something like:

if (B & DeltaSymbolMask)

I'll keep looking, likely next week.

llvm/include/llvm/Object/ELF.h Show resolved Hide resolved
@@ -446,6 +450,7 @@ template <class ELFT> class ELFObjectFile : public ELFObjectFileBase {

const Elf_Rel *getRel(DataRefImpl Rel) const;
const Elf_Rela *getRela(DataRefImpl Rela) const;
Elf_Crel getCrel(DataRefImpl Rel) const;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps use CRel as the parameter name. Rel could give the idea that this is a Rel type relocation with no addend.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed

return (int64_t)getRela(Rel)->r_addend;
if (getRelSection(Rel)->sh_type == ELF::SHT_RELA)
return (int64_t)getRela(Rel)->r_addend;
if (getRelSection(Rel)->sh_type == ELF::SHT_CREL)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the addend bit in the "header" for the CREL section is 0 should this return an Error?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have reverted this file. Sorry for leaving it in the initial patch.

This code is used by .getAddend() in the files (not core binary utilities).

  • llvm/lib/Object/RelocationResolver.cpp
  • llvm/lib/XRay/InstrumentationMap.cpp
  • llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp

Since they are all for static relocations, we can assume r_addend.
However, I agree that having an error checking would be nice.

@@ -292,6 +293,9 @@ template <class ELFT> class ELFObjectFile : public ELFObjectFileBase {
const Elf_Shdr *DotSymtabSec = nullptr; // Symbol table section.
const Elf_Shdr *DotSymtabShndxSec = nullptr; // SHT_SYMTAB_SHNDX section.

// Hold CREL relocations for SectionRef::relocations().
mutable SmallVector<SmallVector<Elf_Crel, 0>, 0> Crels;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any thread-safety concerns here?

It looks like this can be modified when obtaining an iterator?

Just thinking of a use case where multiple threads given a const reference to an ELFObjectFile iterate through different CREL sections in the same object.

There may be no expectation that this would work though.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To the best of my knowledge, ELFObjectFile is not used by multi-threading. The mutable member is not nice, which is to comply with the interface requirement by llvm-objdump -r/-R.

@@ -1022,6 +1027,40 @@ ELFObjectFile<ELFT>::section_rel_begin(DataRefImpl Sec) const {
uintptr_t SHT = reinterpret_cast<uintptr_t>((*SectionsOrErr).begin());
RelData.d.a = (Sec.p - SHT) / EF.getHeader().e_shentsize;
RelData.d.b = 0;
if (reinterpret_cast<const Elf_Shdr *>(Sec.p)->sh_type == ELF::SHT_CREL) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there scope for a forwards iterator that decodes a CREL when dereferenced or incremented. That could be used in a thread-safe environment and wouldn't take up a lot of memory.

May need to be under a different interface.

@@ -1142,6 +1193,13 @@ ELFObjectFile<ELFT>::getRela(DataRefImpl Rela) const {
return *Ret;
}

template <class ELFT>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any way this can be called before Crels has been populated? If not then please ignore the rest of the comment.

It looks like getRelocatedSection() will need to be called first.

It looks like all the existing use cases here are safe, but if this is part of the general API then maybe worth some additional asserts and pre-conditions given in comments.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes to this file are for llvm-objdump support. The use patterns ensure that Crels will be populated.

I've added the assert to https://github.com/MaskRay/llvm-project/tree/demo-crel

@dwblaikie
Copy link
Collaborator

Switched to 0x40000014 (generic range) to retain linker errors while making the experimental status stand out. Add a comment to make the intention clearer.

Seems a bit weird/problematic, using something in the reserved range/not in a user extension space, but I guess so long as you're comfy breaking it in the future (& it's not like the current allocated numbers are anywhere near... so by the time anyone is using this allocated number for an official purposes, this experiment will be long gone & hopefully have an official number) until it gets an official number, I guess that works for me. Thanks!

@MaskRay
Copy link
Member Author

MaskRay commented May 9, 2024

I think it would be useful to nominate a source as the canonical reference for the specification which is updated along with any implementation changes. I think discourse.llvm.org/t/rfc-crel-a-compact-relocation-format-for-elf/77600/3 is a good starting point.

Possible examples:

  • A patch to the LLVM documentation.
  • A blog-post that is kept up to date with the implementation.
  • A discourse post that is updated.

Thanks. I frequently update my blog posts and https://maskray.me/blog/2024-03-09-a-compact-relocation-format-for-elf#crel-relocation-format will be kept up to date.

However, I feel not so good to name my personal website. Then, https://discourse.llvm.org/t/rfc-crel-a-compact-relocation-format-for-elf/77600/3 should serve the purpose well. discourse.llvm.org posts become non-editable after a period of time. I'll need to make further applies if changes are made.

Link: discourse.llvm.org/t/rfc-crel-a-compact-relocation-format-for-elf/77600

I can edit the link to append /3 to refer to the specification.

lravenclaw pushed a commit to lravenclaw/llvm-project that referenced this pull request Jul 3, 2024
CREL is a compact relocation format for the ELF object file format.

This patch adds integrated assembler support (using the RELA form)
available with `llvm-mc -filetype=obj -crel a.s -o a.o`.
A dependent patch will add `clang -c -Wa,--crel,--allow-experimental-crel`.

Also add llvm-readobj support (for both REL and RELA forms) to
facilitate testing the assembler. Additionally, yaml2obj gains support
for the RELA form to aid testing with llvm-readobj.

We temporarily assign the section type code 0x40000020 from the generic
range to `SHT_CREL`. We avoided using `SHT_LLVM_` or `SHT_GNU_` to
avoid code churn and maintain broader applicability for interested psABIs.
Similarly, `DT_CREL` is temporarily 0x40000026.

LLVM will change the code and break compatibility. This is not an issue
if all relocatable files using CREL are regenerated (aka no prebuilt
relocatable files).

Link: https://discourse.llvm.org/t/rfc-crel-a-compact-relocation-format-for-elf/77600

Pull Request: llvm#91280
MaskRay added a commit that referenced this pull request Jul 3, 2024
The two options are discussed in a few comments around
#91280 (comment)

* -Wa,--crel: error "-Wa,--allow-experimental-crel must be specified to use -Wa,--crel..."
* -Wa,--allow-experimental-crel: no-op
* -Wa,--crel,--allow-experimental-crel: enable CREL in the integrated assembler (#91280)

MIPS's little-endian n64 ABI messed up the `r_info` field in
relocations. While this could be fixed with CREL, my intention is to
avoid complication in assembler/linker. The implementation simply
doesn't allow CREL for MIPS.

Link: https://discourse.llvm.org/t/rfc-crel-a-compact-relocation-format-for-elf/77600

Pull Request: #97378
kbluck pushed a commit to kbluck/llvm-project that referenced this pull request Jul 6, 2024
CREL is a compact relocation format for the ELF object file format.

This patch adds integrated assembler support (using the RELA form)
available with `llvm-mc -filetype=obj -crel a.s -o a.o`.
A dependent patch will add `clang -c -Wa,--crel,--allow-experimental-crel`.

Also add llvm-readobj support (for both REL and RELA forms) to
facilitate testing the assembler. Additionally, yaml2obj gains support
for the RELA form to aid testing with llvm-readobj.

We temporarily assign the section type code 0x40000020 from the generic
range to `SHT_CREL`. We avoided using `SHT_LLVM_` or `SHT_GNU_` to
avoid code churn and maintain broader applicability for interested psABIs.
Similarly, `DT_CREL` is temporarily 0x40000026.

LLVM will change the code and break compatibility. This is not an issue
if all relocatable files using CREL are regenerated (aka no prebuilt
relocatable files).

Link: https://discourse.llvm.org/t/rfc-crel-a-compact-relocation-format-for-elf/77600

Pull Request: llvm#91280
kbluck pushed a commit to kbluck/llvm-project that referenced this pull request Jul 6, 2024
The two options are discussed in a few comments around
llvm#91280 (comment)

* -Wa,--crel: error "-Wa,--allow-experimental-crel must be specified to use -Wa,--crel..."
* -Wa,--allow-experimental-crel: no-op
* -Wa,--crel,--allow-experimental-crel: enable CREL in the integrated assembler (llvm#91280)

MIPS's little-endian n64 ABI messed up the `r_info` field in
relocations. While this could be fixed with CREL, my intention is to
avoid complication in assembler/linker. The implementation simply
doesn't allow CREL for MIPS.

Link: https://discourse.llvm.org/t/rfc-crel-a-compact-relocation-format-for-elf/77600

Pull Request: llvm#97378
MaskRay added a commit that referenced this pull request Jul 8, 2024
Extract the llvm-readelf decoder to `decodeCrel` (#91280) and reuse it
for llvm-objdump.

Because the section representation of LLVMObject (`SectionRef`) is
64-bit, insufficient to hold all decoder states, `section_rel_begin` is
modified to decode CREL eagerly and hold the decoded relocations inside
ELFObjectFile<ELFT>.

The test is adapted from llvm/test/tools/llvm-readobj/ELF/crel.test.

Pull Request: #97382
MaskRay added a commit that referenced this pull request Jul 8, 2024
llvm-objcopy may modify the symbol table and need to rewrite
relocations. For CREL, while we can reuse the decoder from #91280, we
need an encoder to support CREL.

Since MC/ELFObjectWriter.cpp has an existing encoder, and MC is at a
lower layer than Object, extract the encoder to a new header file
llvm/MC/MCELFExtras.h.

Link: https://discourse.llvm.org/t/rfc-crel-a-compact-relocation-format-for-elf/77600

Pull Request: #97521
MaskRay added a commit that referenced this pull request Jul 9, 2024
@MaskRay MaskRay mentioned this pull request Jul 9, 2024
MaskRay added a commit that referenced this pull request Jul 9, 2024
This decoder feature complements the yaml2obj encoder added by #91280.

Pull Request: #98116
AlexisPerry pushed a commit to llvm-project-tlp/llvm-project that referenced this pull request Jul 9, 2024
`IsRela` is used by lld to differentiate REL and RELA static
relocations. The proposed CREL patch will reuse `IsRela` for CREL
(llvm#91280). Rename `IsRela` to be more appropriate.

Pull Request: llvm#96592
aaryanshukla pushed a commit to aaryanshukla/llvm-project that referenced this pull request Jul 14, 2024
aaryanshukla pushed a commit to aaryanshukla/llvm-project that referenced this pull request Jul 14, 2024
This decoder feature complements the yaml2obj encoder added by llvm#91280.

Pull Request: llvm#98116
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category lld:ELF lld llvm:binary-utilities mc Machine (object) code objectyaml
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants