Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[lld/ELF] Warn when a PC32 relocation references a large section #71248

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

aeubanks
Copy link
Contributor

@aeubanks aeubanks commented Nov 3, 2023

Large sections are meant to be referenced with 64-bit relocations.
PC32 relocations to large section can still accidentally happen, typically due to hand-written assembly.
Add a flag to warn on these so they're easily findable. Off by default since this noticeably impacts link times.

perf stat instruction counts for linking bin/opt with --threads=1:
w/o patch (small code model): 1,917M
w/ patch (small code model): 1,918M
w/ patch, --warn-32-bit-reloc-to-large-section (small code model): 1,919M

w/o patch (medium code model): 1,918M
w/ patch, (medium code model): 1,919M
w/ patch, --warn-32-bit-reloc-to-large-section (medium code model): 1,967M

Large sections are meant to be referenced with 64-bit relocations.
PC32 relocations to large section can still accidentally happen, typically due to hand-written assembly.
Add a flag to warn on these so they're easily findable. Off by default since this noticeably impacts link times.

perf stat instruction counts for linking bin/opt with --threads=1:
w/o patch (small code model):                                       1,917M
w/ patch (small code model):                                        1,918M
w/ patch, --warn-32-bit-reloc-to-large-section (small code model):  1,919M

w/o patch (medium code model):                                      1,918M
w/ patch, (medium code model):                                      1,919M
w/ patch, --warn-32-bit-reloc-to-large-section (medium code model): 1,967M
@llvmbot
Copy link
Collaborator

llvmbot commented Nov 3, 2023

@llvm/pr-subscribers-lld

Author: Arthur Eubanks (aeubanks)

Changes

Large sections are meant to be referenced with 64-bit relocations.
PC32 relocations to large section can still accidentally happen, typically due to hand-written assembly.
Add a flag to warn on these so they're easily findable. Off by default since this noticeably impacts link times.

perf stat instruction counts for linking bin/opt with --threads=1:
w/o patch (small code model): 1,917M
w/ patch (small code model): 1,918M
w/ patch, --warn-32-bit-reloc-to-large-section (small code model): 1,919M

w/o patch (medium code model): 1,918M
w/ patch, (medium code model): 1,919M
w/ patch, --warn-32-bit-reloc-to-large-section (medium code model): 1,967M


Full diff: https://github.com/llvm/llvm-project/pull/71248.diff

8 Files Affected:

  • (modified) lld/ELF/Arch/X86_64.cpp (+24-1)
  • (modified) lld/ELF/Config.h (+3)
  • (modified) lld/ELF/Driver.cpp (+2)
  • (modified) lld/ELF/Options.td (+4)
  • (modified) lld/ELF/Relocations.cpp (+1-1)
  • (modified) lld/ELF/Target.h (+2)
  • (modified) lld/ELF/Writer.cpp (+5-1)
  • (added) lld/test/ELF/warn-large.s (+32)
diff --git a/lld/ELF/Arch/X86_64.cpp b/lld/ELF/Arch/X86_64.cpp
index 2135ac234864513..9fc7555beec28ab 100644
--- a/lld/ELF/Arch/X86_64.cpp
+++ b/lld/ELF/Arch/X86_64.cpp
@@ -757,6 +757,26 @@ int64_t X86_64::getImplicitAddend(const uint8_t *buf, RelType type) const {
 
 static void relaxGot(uint8_t *loc, const Relocation &rel, uint64_t val);
 
+static void warnIfRelocToLargeSection(uint8_t *loc, const Relocation &rel) {
+  if (!rel.sym || !rel.sym->getOutputSection())
+    return;
+  Symbol &sym = *rel.sym;
+  if (sym.getOutputSection()->flags & SHF_X86_64_LARGE) {
+    ErrorPlace errPlace = getErrorPlace(loc);
+    std::string hint;
+    if (!sym.isSection())
+      hint = "; references '" + lld::toString(sym) + '\'';
+    else if (auto *d = dyn_cast<Defined>(&sym))
+      hint = ("; references section '" + d->section->name + "'").str();
+    if (!errPlace.srcLoc.empty())
+      hint += "\n>>> referenced by " + errPlace.srcLoc;
+    if (!sym.isSection())
+      hint += getDefinedLocation(sym);
+    warn(errPlace.loc +
+         "Large section should not be addressed with PC32 relocation" + hint);
+  }
+}
+
 void X86_64::relocate(uint8_t *loc, const Relocation &rel, uint64_t val) const {
   switch (rel.type) {
   case R_X86_64_8:
@@ -779,11 +799,14 @@ void X86_64::relocate(uint8_t *loc, const Relocation &rel, uint64_t val) const {
     checkUInt(loc, val, 32, rel);
     write32le(loc, val);
     break;
+  case R_X86_64_PC32:
+    if (config->warnLarge && ctx.hasLargeSection)
+      warnIfRelocToLargeSection(loc, rel);
+    [[fallthrough]];
   case R_X86_64_32S:
   case R_X86_64_GOT32:
   case R_X86_64_GOTPC32:
   case R_X86_64_GOTPCREL:
-  case R_X86_64_PC32:
   case R_X86_64_PLT32:
   case R_X86_64_DTPOFF32:
   case R_X86_64_SIZE32:
diff --git a/lld/ELF/Config.h b/lld/ELF/Config.h
index 56229334f9a44ae..9c57ca95979f0c4 100644
--- a/lld/ELF/Config.h
+++ b/lld/ELF/Config.h
@@ -298,6 +298,7 @@ struct Config {
   bool warnCommon;
   bool warnMissingEntry;
   bool warnSymbolOrdering;
+  bool warnLarge;
   bool writeAddends;
   bool zCombreloc;
   bool zCopyreloc;
@@ -482,6 +483,8 @@ struct Ctx {
   // True if all native vtable symbols have corresponding type info symbols
   // during LTO.
   bool ltoAllVtablesHaveTypeInfos;
+  // True if any output section has the SHF_X86_64_LARGE flag set.
+  bool hasLargeSection;
 
   // Each symbol assignment and DEFINED(sym) reference is assigned an increasing
   // order. Each DEFINED(sym) evaluation checks whether the reference happens
diff --git a/lld/ELF/Driver.cpp b/lld/ELF/Driver.cpp
index 5f88389a5840824..0c05d3af68c84d5 100644
--- a/lld/ELF/Driver.cpp
+++ b/lld/ELF/Driver.cpp
@@ -111,6 +111,7 @@ void Ctx::reset() {
   scriptSymOrderCounter = 1;
   scriptSymOrder.clear();
   ltoAllVtablesHaveTypeInfos = false;
+  hasLargeSection = false;
 }
 
 llvm::raw_fd_ostream Ctx::openAuxiliaryFile(llvm::StringRef filename,
@@ -1440,6 +1441,7 @@ static void readConfigs(opt::InputArgList &args) {
   config->warnCommon = args.hasFlag(OPT_warn_common, OPT_no_warn_common, false);
   config->warnSymbolOrdering =
       args.hasFlag(OPT_warn_symbol_ordering, OPT_no_warn_symbol_ordering, true);
+  config->warnLarge = args.hasFlag(OPT_warn_large, OPT_no_warn_large, false);
   config->whyExtract = args.getLastArgValue(OPT_why_extract);
   config->zCombreloc = getZFlag(args, "combreloc", "nocombreloc", true);
   config->zCopyreloc = getZFlag(args, "copyreloc", "nocopyreloc", true);
diff --git a/lld/ELF/Options.td b/lld/ELF/Options.td
index 9a23f48350644a0..7dd01fe54fc9005 100644
--- a/lld/ELF/Options.td
+++ b/lld/ELF/Options.td
@@ -522,6 +522,10 @@ defm warn_symbol_ordering: BB<"warn-symbol-ordering",
     "Warn about problems with the symbol ordering file (default)",
     "Do not warn about problems with the symbol ordering file">;
 
+defm warn_large: BB<"warn-32-bit-reloc-to-large-section",
+    "Warn about PC32 relocations to a large section",
+    "Do not warn about PC32 relocations to a large section">;
+
 def warn_unresolved_symbols: F<"warn-unresolved-symbols">,
   HelpText<"Report unresolved symbols as warnings">;
 
diff --git a/lld/ELF/Relocations.cpp b/lld/ELF/Relocations.cpp
index f3fb0c71a8b3064..418847faeefc5a3 100644
--- a/lld/ELF/Relocations.cpp
+++ b/lld/ELF/Relocations.cpp
@@ -73,7 +73,7 @@ static std::optional<std::string> getLinkerScriptLocation(const Symbol &sym) {
   return std::nullopt;
 }
 
-static std::string getDefinedLocation(const Symbol &sym) {
+std::string elf::getDefinedLocation(const Symbol &sym) {
   const char msg[] = "\n>>> defined in ";
   if (sym.file)
     return msg + toString(sym.file);
diff --git a/lld/ELF/Target.h b/lld/ELF/Target.h
index 6264ab1a3da74a7..16406d222ada36c 100644
--- a/lld/ELF/Target.h
+++ b/lld/ELF/Target.h
@@ -242,6 +242,8 @@ TargetInfo *getTarget();
 
 template <class ELFT> bool isMipsPIC(const Defined *sym);
 
+std::string getDefinedLocation(const Symbol &sym);
+
 void reportRangeError(uint8_t *loc, const Relocation &rel, const Twine &v,
                       int64_t min, uint64_t max);
 void reportRangeError(uint8_t *loc, int64_t v, int n, const Symbol &sym,
diff --git a/lld/ELF/Writer.cpp b/lld/ELF/Writer.cpp
index 57e1aa06c6aa873..2c1ff97b4d5ac81 100644
--- a/lld/ELF/Writer.cpp
+++ b/lld/ELF/Writer.cpp
@@ -25,6 +25,7 @@
 #include "lld/Common/Filesystem.h"
 #include "lld/Common/Strings.h"
 #include "llvm/ADT/StringMap.h"
+#include "llvm/BinaryFormat/ELF.h"
 #include "llvm/Support/BLAKE3.h"
 #include "llvm/Support/Parallel.h"
 #include "llvm/Support/RandomNumberGenerator.h"
@@ -2233,8 +2234,11 @@ template <class ELFT> void Writer<ELFT>::finalizeSections() {
   // Fill other section headers. The dynamic table is finalized
   // at the end because some tags like RELSZ depend on result
   // of finalizing other sections.
-  for (OutputSection *sec : outputSections)
+  for (OutputSection *sec : outputSections) {
     sec->finalize();
+    if (sec->flags & SHF_X86_64_LARGE)
+      ctx.hasLargeSection = true;
+  }
 
   script->checkFinalScriptConditions();
 
diff --git a/lld/test/ELF/warn-large.s b/lld/test/ELF/warn-large.s
new file mode 100644
index 000000000000000..aec9dc8f2688ae6
--- /dev/null
+++ b/lld/test/ELF/warn-large.s
@@ -0,0 +1,32 @@
+# REQUIRES: x86
+# RUN: llvm-mc -filetype=obj -triple=x86_64 %s -o %t.o
+# RUN: ld.lld %t.o -o /dev/null 2>&1 --warn-32-bit-reloc-to-large-section | FileCheck %s
+# RUN: ld.lld %t.o -o /dev/null 2>&1 | FileCheck %s --check-prefix=NO --allow-empty
+
+# CHECK: warning: {{.*}}warn-large.s{{.*}}:(.text+{{.*}}): Large section should not be addressed with PC32 relocation; references 'hello'
+# CHECK-NEXT: >>> referenced by foo.c
+# CHECK-NEXT: >>> defined in {{.*}}warn-large.s{{.*}}
+
+# CHECK: warning: {{.*}}warn-large.s{{.*}}:(.text+{{.*}}): Large section should not be addressed with PC32 relocation; references section 'ldata'
+# CHECK-NEXT: >>> referenced by foo.c
+
+# NO-NOT: warning
+
+.text
+.file "foo.c"
+.globl _start
+.type _start, @function
+_start:
+  movq hello(%rip), %rax
+  movq ok(%rip), %rax
+
+.section ldata,"awl",@progbits
+
+.type   hello, @object
+.globl  hello
+.p2align        2, 0x0
+hello:
+.long   1
+
+ok:
+.long   1

@llvmbot
Copy link
Collaborator

llvmbot commented Nov 3, 2023

@llvm/pr-subscribers-lld-elf

Author: Arthur Eubanks (aeubanks)

Changes

Large sections are meant to be referenced with 64-bit relocations.
PC32 relocations to large section can still accidentally happen, typically due to hand-written assembly.
Add a flag to warn on these so they're easily findable. Off by default since this noticeably impacts link times.

perf stat instruction counts for linking bin/opt with --threads=1:
w/o patch (small code model): 1,917M
w/ patch (small code model): 1,918M
w/ patch, --warn-32-bit-reloc-to-large-section (small code model): 1,919M

w/o patch (medium code model): 1,918M
w/ patch, (medium code model): 1,919M
w/ patch, --warn-32-bit-reloc-to-large-section (medium code model): 1,967M


Full diff: https://github.com/llvm/llvm-project/pull/71248.diff

8 Files Affected:

  • (modified) lld/ELF/Arch/X86_64.cpp (+24-1)
  • (modified) lld/ELF/Config.h (+3)
  • (modified) lld/ELF/Driver.cpp (+2)
  • (modified) lld/ELF/Options.td (+4)
  • (modified) lld/ELF/Relocations.cpp (+1-1)
  • (modified) lld/ELF/Target.h (+2)
  • (modified) lld/ELF/Writer.cpp (+5-1)
  • (added) lld/test/ELF/warn-large.s (+32)
diff --git a/lld/ELF/Arch/X86_64.cpp b/lld/ELF/Arch/X86_64.cpp
index 2135ac234864513..9fc7555beec28ab 100644
--- a/lld/ELF/Arch/X86_64.cpp
+++ b/lld/ELF/Arch/X86_64.cpp
@@ -757,6 +757,26 @@ int64_t X86_64::getImplicitAddend(const uint8_t *buf, RelType type) const {
 
 static void relaxGot(uint8_t *loc, const Relocation &rel, uint64_t val);
 
+static void warnIfRelocToLargeSection(uint8_t *loc, const Relocation &rel) {
+  if (!rel.sym || !rel.sym->getOutputSection())
+    return;
+  Symbol &sym = *rel.sym;
+  if (sym.getOutputSection()->flags & SHF_X86_64_LARGE) {
+    ErrorPlace errPlace = getErrorPlace(loc);
+    std::string hint;
+    if (!sym.isSection())
+      hint = "; references '" + lld::toString(sym) + '\'';
+    else if (auto *d = dyn_cast<Defined>(&sym))
+      hint = ("; references section '" + d->section->name + "'").str();
+    if (!errPlace.srcLoc.empty())
+      hint += "\n>>> referenced by " + errPlace.srcLoc;
+    if (!sym.isSection())
+      hint += getDefinedLocation(sym);
+    warn(errPlace.loc +
+         "Large section should not be addressed with PC32 relocation" + hint);
+  }
+}
+
 void X86_64::relocate(uint8_t *loc, const Relocation &rel, uint64_t val) const {
   switch (rel.type) {
   case R_X86_64_8:
@@ -779,11 +799,14 @@ void X86_64::relocate(uint8_t *loc, const Relocation &rel, uint64_t val) const {
     checkUInt(loc, val, 32, rel);
     write32le(loc, val);
     break;
+  case R_X86_64_PC32:
+    if (config->warnLarge && ctx.hasLargeSection)
+      warnIfRelocToLargeSection(loc, rel);
+    [[fallthrough]];
   case R_X86_64_32S:
   case R_X86_64_GOT32:
   case R_X86_64_GOTPC32:
   case R_X86_64_GOTPCREL:
-  case R_X86_64_PC32:
   case R_X86_64_PLT32:
   case R_X86_64_DTPOFF32:
   case R_X86_64_SIZE32:
diff --git a/lld/ELF/Config.h b/lld/ELF/Config.h
index 56229334f9a44ae..9c57ca95979f0c4 100644
--- a/lld/ELF/Config.h
+++ b/lld/ELF/Config.h
@@ -298,6 +298,7 @@ struct Config {
   bool warnCommon;
   bool warnMissingEntry;
   bool warnSymbolOrdering;
+  bool warnLarge;
   bool writeAddends;
   bool zCombreloc;
   bool zCopyreloc;
@@ -482,6 +483,8 @@ struct Ctx {
   // True if all native vtable symbols have corresponding type info symbols
   // during LTO.
   bool ltoAllVtablesHaveTypeInfos;
+  // True if any output section has the SHF_X86_64_LARGE flag set.
+  bool hasLargeSection;
 
   // Each symbol assignment and DEFINED(sym) reference is assigned an increasing
   // order. Each DEFINED(sym) evaluation checks whether the reference happens
diff --git a/lld/ELF/Driver.cpp b/lld/ELF/Driver.cpp
index 5f88389a5840824..0c05d3af68c84d5 100644
--- a/lld/ELF/Driver.cpp
+++ b/lld/ELF/Driver.cpp
@@ -111,6 +111,7 @@ void Ctx::reset() {
   scriptSymOrderCounter = 1;
   scriptSymOrder.clear();
   ltoAllVtablesHaveTypeInfos = false;
+  hasLargeSection = false;
 }
 
 llvm::raw_fd_ostream Ctx::openAuxiliaryFile(llvm::StringRef filename,
@@ -1440,6 +1441,7 @@ static void readConfigs(opt::InputArgList &args) {
   config->warnCommon = args.hasFlag(OPT_warn_common, OPT_no_warn_common, false);
   config->warnSymbolOrdering =
       args.hasFlag(OPT_warn_symbol_ordering, OPT_no_warn_symbol_ordering, true);
+  config->warnLarge = args.hasFlag(OPT_warn_large, OPT_no_warn_large, false);
   config->whyExtract = args.getLastArgValue(OPT_why_extract);
   config->zCombreloc = getZFlag(args, "combreloc", "nocombreloc", true);
   config->zCopyreloc = getZFlag(args, "copyreloc", "nocopyreloc", true);
diff --git a/lld/ELF/Options.td b/lld/ELF/Options.td
index 9a23f48350644a0..7dd01fe54fc9005 100644
--- a/lld/ELF/Options.td
+++ b/lld/ELF/Options.td
@@ -522,6 +522,10 @@ defm warn_symbol_ordering: BB<"warn-symbol-ordering",
     "Warn about problems with the symbol ordering file (default)",
     "Do not warn about problems with the symbol ordering file">;
 
+defm warn_large: BB<"warn-32-bit-reloc-to-large-section",
+    "Warn about PC32 relocations to a large section",
+    "Do not warn about PC32 relocations to a large section">;
+
 def warn_unresolved_symbols: F<"warn-unresolved-symbols">,
   HelpText<"Report unresolved symbols as warnings">;
 
diff --git a/lld/ELF/Relocations.cpp b/lld/ELF/Relocations.cpp
index f3fb0c71a8b3064..418847faeefc5a3 100644
--- a/lld/ELF/Relocations.cpp
+++ b/lld/ELF/Relocations.cpp
@@ -73,7 +73,7 @@ static std::optional<std::string> getLinkerScriptLocation(const Symbol &sym) {
   return std::nullopt;
 }
 
-static std::string getDefinedLocation(const Symbol &sym) {
+std::string elf::getDefinedLocation(const Symbol &sym) {
   const char msg[] = "\n>>> defined in ";
   if (sym.file)
     return msg + toString(sym.file);
diff --git a/lld/ELF/Target.h b/lld/ELF/Target.h
index 6264ab1a3da74a7..16406d222ada36c 100644
--- a/lld/ELF/Target.h
+++ b/lld/ELF/Target.h
@@ -242,6 +242,8 @@ TargetInfo *getTarget();
 
 template <class ELFT> bool isMipsPIC(const Defined *sym);
 
+std::string getDefinedLocation(const Symbol &sym);
+
 void reportRangeError(uint8_t *loc, const Relocation &rel, const Twine &v,
                       int64_t min, uint64_t max);
 void reportRangeError(uint8_t *loc, int64_t v, int n, const Symbol &sym,
diff --git a/lld/ELF/Writer.cpp b/lld/ELF/Writer.cpp
index 57e1aa06c6aa873..2c1ff97b4d5ac81 100644
--- a/lld/ELF/Writer.cpp
+++ b/lld/ELF/Writer.cpp
@@ -25,6 +25,7 @@
 #include "lld/Common/Filesystem.h"
 #include "lld/Common/Strings.h"
 #include "llvm/ADT/StringMap.h"
+#include "llvm/BinaryFormat/ELF.h"
 #include "llvm/Support/BLAKE3.h"
 #include "llvm/Support/Parallel.h"
 #include "llvm/Support/RandomNumberGenerator.h"
@@ -2233,8 +2234,11 @@ template <class ELFT> void Writer<ELFT>::finalizeSections() {
   // Fill other section headers. The dynamic table is finalized
   // at the end because some tags like RELSZ depend on result
   // of finalizing other sections.
-  for (OutputSection *sec : outputSections)
+  for (OutputSection *sec : outputSections) {
     sec->finalize();
+    if (sec->flags & SHF_X86_64_LARGE)
+      ctx.hasLargeSection = true;
+  }
 
   script->checkFinalScriptConditions();
 
diff --git a/lld/test/ELF/warn-large.s b/lld/test/ELF/warn-large.s
new file mode 100644
index 000000000000000..aec9dc8f2688ae6
--- /dev/null
+++ b/lld/test/ELF/warn-large.s
@@ -0,0 +1,32 @@
+# REQUIRES: x86
+# RUN: llvm-mc -filetype=obj -triple=x86_64 %s -o %t.o
+# RUN: ld.lld %t.o -o /dev/null 2>&1 --warn-32-bit-reloc-to-large-section | FileCheck %s
+# RUN: ld.lld %t.o -o /dev/null 2>&1 | FileCheck %s --check-prefix=NO --allow-empty
+
+# CHECK: warning: {{.*}}warn-large.s{{.*}}:(.text+{{.*}}): Large section should not be addressed with PC32 relocation; references 'hello'
+# CHECK-NEXT: >>> referenced by foo.c
+# CHECK-NEXT: >>> defined in {{.*}}warn-large.s{{.*}}
+
+# CHECK: warning: {{.*}}warn-large.s{{.*}}:(.text+{{.*}}): Large section should not be addressed with PC32 relocation; references section 'ldata'
+# CHECK-NEXT: >>> referenced by foo.c
+
+# NO-NOT: warning
+
+.text
+.file "foo.c"
+.globl _start
+.type _start, @function
+_start:
+  movq hello(%rip), %rax
+  movq ok(%rip), %rax
+
+.section ldata,"awl",@progbits
+
+.type   hello, @object
+.globl  hello
+.p2align        2, 0x0
+hello:
+.long   1
+
+ok:
+.long   1

aeubanks added a commit to aeubanks/llvm-project that referenced this pull request Nov 21, 2023
…_LARGE section

Makes it clearer what the issue is when hand-written assembly doesn't follow medium code model assumptions in a medium code model build.

Alternative to llvm#71248 by only hinting on an overflow.
aeubanks added a commit that referenced this pull request Jan 17, 2024
…_LARGE section (#73045)

Makes it clearer what the issue is when hand-written assembly doesn't
follow medium code model assumptions in a medium code model build.

Alternative to #71248 by only hinting on an overflow.
ampandey-1995 pushed a commit to ampandey-1995/llvm-project that referenced this pull request Jan 19, 2024
…_LARGE section (llvm#73045)

Makes it clearer what the issue is when hand-written assembly doesn't
follow medium code model assumptions in a medium code model build.

Alternative to llvm#71248 by only hinting on an overflow.
@aeubanks aeubanks marked this pull request as draft April 22, 2024 16:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants