Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[lld][ELF] Demote symbols in discarded sections to Undefined. #68777

Closed
wants to merge 1 commit into from

Conversation

bevin-hansson
Copy link
Contributor

After linker script processing, it may be the case that sections
were discarded via /DISCARD/, but relocations to symbols in those
sections remain. Currently, such symbols are still considered
Defined even though their section is not live and will not be
placed anywhere. This results in a silently broken binary.

This patch goes through the symbols after placement and changes
them from Defined to Undefined if their section is no longer
live at that point. During relocation processing, we will catch
such undefined symbols and report an error.

See #58891.

@llvmbot
Copy link
Collaborator

llvmbot commented Oct 11, 2023

@llvm/pr-subscribers-lld-elf

@llvm/pr-subscribers-lld

Author: Bevin Hansson (bevin-hansson)

Changes

After linker script processing, it may be the case that sections
were discarded via /DISCARD/, but relocations to symbols in those
sections remain. Currently, such symbols are still considered
Defined even though their section is not live and will not be
placed anywhere. This results in a silently broken binary.

This patch goes through the symbols after placement and changes
them from Defined to Undefined if their section is no longer
live at that point. During relocation processing, we will catch
such undefined symbols and report an error.

See #58891.


Full diff: https://github.com/llvm/llvm-project/pull/68777.diff

5 Files Affected:

  • (modified) lld/ELF/Driver.cpp (+24)
  • (modified) lld/ELF/Relocations.cpp (+1-2)
  • (modified) lld/ELF/Symbols.cpp (+7-4)
  • (modified) lld/ELF/Writer.cpp (+16)
  • (added) lld/test/ELF/linkerscript/discard-symbol.s (+16)
diff --git a/lld/ELF/Driver.cpp b/lld/ELF/Driver.cpp
index 6272276e94b2d35..a49e72ee96ef466 100644
--- a/lld/ELF/Driver.cpp
+++ b/lld/ELF/Driver.cpp
@@ -3062,6 +3062,30 @@ void LinkerDriver::link(opt::InputArgList &args) {
     script->addOrphanSections();
   }
 
+  // Explicitly demote symbols which didn't get placed. If we don't, any
+  // symbol in a discarded section will still be considered defined, even
+  // though it didn't end up in the output, and we get silently broken
+  // binaries.
+  if (script->hasSectionsCommand) {
+    llvm::TimeTraceScope timeScope("Demote symbols in discarded sections");
+    parallelForEach(symtab.getSymbols(), [](Symbol *sym) {
+      if (Defined *d = dyn_cast<Defined>(sym))
+        if (d->section && !d->section->isLive()) {
+          uint32_t secIdx = 0;
+          if (d->file)
+            secIdx = d->file->getSections().drop_while([=](auto *s) {
+              return s != d->section;
+            }).size();
+          // If we don't change the binding from WEAK to GLOBAL here, the
+          // undefined symbol reporting will think this is undefined weak and
+          // not give a warning.
+          Undefined(d->file, sym->getName(), sym->isWeak() ? (uint8_t)STB_GLOBAL
+                                                           : sym->binding,
+                    sym->stOther, sym->type, secIdx).overwrite(*sym);
+        }
+    });
+  }
+
   {
     llvm::TimeTraceScope timeScope("Merge/finalize input sections");
 
diff --git a/lld/ELF/Relocations.cpp b/lld/ELF/Relocations.cpp
index ee27cc15e040a49..44111161619605a 100644
--- a/lld/ELF/Relocations.cpp
+++ b/lld/ELF/Relocations.cpp
@@ -507,8 +507,7 @@ int64_t RelocationScanner::computeMipsAddend(const RelTy &rel, RelExpr expr,
 template <class ELFT>
 static std::string maybeReportDiscarded(Undefined &sym) {
   auto *file = dyn_cast_or_null<ObjFile<ELFT>>(sym.file);
-  if (!file || !sym.discardedSecIdx ||
-      file->getSections()[sym.discardedSecIdx] != &InputSection::discarded)
+  if (!file || !sym.discardedSecIdx)
     return "";
   ArrayRef<typename ELFT::Shdr> objSections =
       file->template getELFShdrs<ELFT>();
diff --git a/lld/ELF/Symbols.cpp b/lld/ELF/Symbols.cpp
index 07061d3a1223e9e..3f16ede772cb4a4 100644
--- a/lld/ELF/Symbols.cpp
+++ b/lld/ELF/Symbols.cpp
@@ -321,7 +321,7 @@ void elf::maybeWarnUnorderableSymbol(const Symbol *sym) {
   //
   // Note, ld.bfd --symbol-ordering-file= does not warn on undefined symbols,
   // but we don't have to be compatible here.
-  if (sym->isUndefined() &&
+  if (sym->isUndefined() && !cast<Undefined>(sym)->discardedSecIdx &&
       config->unresolvedSymbols == UnresolvedPolicy::Ignore)
     return;
 
@@ -330,9 +330,12 @@ void elf::maybeWarnUnorderableSymbol(const Symbol *sym) {
 
   auto report = [&](StringRef s) { warn(toString(file) + s + sym->getName()); };
 
-  if (sym->isUndefined())
-    report(": unable to order undefined symbol: ");
-  else if (sym->isShared())
+  if (sym->isUndefined()) {
+    if (cast<Undefined>(sym)->discardedSecIdx)
+      report(": unable to order discarded symbol: ");
+    else
+      report(": unable to order undefined symbol: ");
+  } else if (sym->isShared())
     report(": unable to order shared symbol: ");
   else if (d && !d->section)
     report(": unable to order absolute symbol: ");
diff --git a/lld/ELF/Writer.cpp b/lld/ELF/Writer.cpp
index ce72189015348a4..7593bb14e1ffda8 100644
--- a/lld/ELF/Writer.cpp
+++ b/lld/ELF/Writer.cpp
@@ -714,6 +714,22 @@ template <class ELFT> void Writer<ELFT>::copyLocalSymbols() {
       // No reason to keep local undefined symbol in symtab.
       if (!dr)
         continue;
+
+      // Demote locals which did not end up in any partition. This is similar
+      // to what we do in Driver.cpp, but that only works on globals.
+      // Don't do this for section symbols; they are expected to be lost.
+      if (script->hasSectionsCommand && dr->section && !dr->isSection() &&
+          !dr->section->isLive()) {
+        uint32_t secIdx = 0;
+        if (dr->file)
+          secIdx = dr->file->getSections().drop_while([=](auto *s) {
+            return s != dr->section;
+          }).size();
+        Undefined(dr->file, b->getName(), b->binding,
+                  b->stOther, b->type, secIdx).overwrite(*b);
+        continue;
+      }
+
       if (includeInSymtab(*b) && shouldKeepInSymtab(*dr))
         in.symTab->addSymbol(b);
     }
diff --git a/lld/test/ELF/linkerscript/discard-symbol.s b/lld/test/ELF/linkerscript/discard-symbol.s
new file mode 100644
index 000000000000000..739cd585af00372
--- /dev/null
+++ b/lld/test/ELF/linkerscript/discard-symbol.s
@@ -0,0 +1,16 @@
+# REQUIRES: x86
+# RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux %s -o %t
+# RUN: echo "SECTIONS { /DISCARD/ : { *(.aaa*) } }" > %t.script
+# RUN: not ld.lld -o %t1 --script %t.script %t | FileCheck %s
+
+# CHECK: error: relocation refers to a symbol in a discarded section: aab
+# CHECK-NEXT:   defined in {{.*}}/discard-symbol.s.tmp
+# CHECK-NEXT:   referenced by {{.*}}/discard-symbol.s.tmp:(.zzz+0x0)
+
+.global aab
+.section .aaa,"a"
+aab:
+  .quad 0
+
+.section .zzz,"a"
+  .quad aab


// Demote locals which did not end up in any partition. This is similar
// to what we do in Driver.cpp, but that only works on globals.
// Don't do this for section symbols; they are expected to be lost.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit unsure about this. I had to prevent this for section symbols, because a few tests fail otherwise, mostly ones related to arm exidx sections. I don't understand the mechanisms at play enough to know exactly what is supposed to happen.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised about test failures in .ARM.exidx. As there shouldn't be any relocations to a .ARM.exidx input section. There will be relocations from a .ARM.exidx section, but given that when a section is discarded, any section with a SHF_LINK_ORDER dependency on it is discarded.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can disable the check and push so you can see what tests are failing on it.

@github-actions
Copy link

github-actions bot commented Oct 11, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Collaborator

@smithp35 smithp35 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does GNU ld permit this, or does it result in dangling relocations? If GNU ld does permit this then there's a chance that this might break some programs where the dangling relocations are benign. I guess that we may need to add a command line option to suppress the errors, although maybe not initially.

How does this work with uses of /DISCARD/ in a ld -r relocatable link? I guess a follow up question is what is it supposed to do for ld -r relocatable links. It would still be user error to /DISCARD/ a section where some other section had relocations to that section, but I don't think undefined symbols are checked for.

A possible alternative implementation could be to test each relocation in scanOne to see if the symbol is defined in a discarded section. You could then make a specific error message, or do the work to make the symbol undefined at that point. Not sure how well that would work out, but could be worth a prototype.

@@ -0,0 +1,16 @@
# REQUIRES: x86
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test looks only to check global symbols. The patch makes modifications for local symbols and symbol ordering files, please can you add test cases for these as well?

Will be worth adding a test case for relocatable links too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is true. The reason I couldn't add a test for the local symbol case is because when you have local symbols, llvm-mc seems to emit them as section symbols with an offset addend rather than an actual symbol relocation. So it runs afoul of the exception for section symbols, and doesn't trigger.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the file after Rebase on top of 557299c

// symbol in a discarded section will still be considered defined, even
// though it didn't end up in the output, and we get silently broken
// binaries.
if (script->hasSectionsCommand) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could it be worth adding a script->seenDiscard variable so that this potentially expensive operation is only done when someone has used /DISCARD/ ?


// Demote locals which did not end up in any partition. This is similar
// to what we do in Driver.cpp, but that only works on globals.
// Don't do this for section symbols; they are expected to be lost.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised about test failures in .ARM.exidx. As there shouldn't be any relocations to a .ARM.exidx input section. There will be relocations from a .ARM.exidx section, but given that when a section is discarded, any section with a SHF_LINK_ORDER dependency on it is discarded.

@bevin-hansson
Copy link
Contributor Author

Does GNU ld permit this, or does it result in dangling relocations? If GNU ld does permit this then there's a chance that this might break some programs where the dangling relocations are benign. I guess that we may need to add a command line option to suppress the errors, although maybe not initially.

When I tested a while back, GNU ld seems to keep the section if there is still a reference to it, even though it was marked /DISCARD/. I don't think this is possible in lld, since the relocation processing is done after the linker script handling, and we've already made the call to keep or not keep the section at that point.

How does this work with uses of /DISCARD/ in a ld -r relocatable link? I guess a follow up question is what is it supposed to do for ld -r relocatable links.

I haven't tested -r. I would assume that discarding something in an -r invocation today does not trigger these types of errors, so they wouldn't with this patch either.

It would still be user error to /DISCARD/ a section where some other section had relocations to that section, but I don't think undefined symbols are checked for.

Hm, would it? Could you not discard a section that contains something which is referenced from the current file, and have that reference be resolved from some other object at the final link?

A possible alternative implementation could be to test each relocation in scanOne to see if the symbol is defined in a discarded section. You could then make a specific error message, or do the work to make the symbol undefined at that point. Not sure how well that would work out, but could be worth a prototype.

Yeah, it doesn't have to go through this logic. It's just that the logic for emitting these errors already existed (and only worked for Undefined), so I went that path.

@smithp35
Copy link
Collaborator

Does GNU ld permit this, or does it result in dangling relocations? If GNU ld does permit this then there's a chance that this might break some programs where the dangling relocations are benign. I guess that we may need to add a command line option to suppress the errors, although maybe not initially.

When I tested a while back, GNU ld seems to keep the section if there is still a reference to it, even though it was marked /DISCARD/. I don't think this is possible in lld, since the relocation processing is done after the linker script handling, and we've already made the call to keep or not keep the section at that point.

I had a quick check and GNU ld on my machine at least gives a similar error message to what you are proposing.

How does this work with uses of /DISCARD/ in a ld -r relocatable link? I guess a follow up question is what is it supposed to do for ld -r relocatable links.

I haven't tested -r. I would assume that discarding something in an -r invocation today does not trigger these types of errors, so they wouldn't with this patch either.

It looks like GNU ld does trigger these errors with ld -r at least with a trivial test case. LLD seems to give a warning in this case and outputs a R_*_NONE relocation in its place.

It would still be user error to /DISCARD/ a section where some other section had relocations to that section, but I don't think undefined symbols are checked for.

Hm, would it? Could you not discard a section that contains something which is referenced from the current file, and have that reference be resolved from some other object at the final link?

I think it depends on whether the linker resolves the relocation at relocatable link time, and whether it is a global definition or not. For example if we make a local symbol undefined that is likely to be unrecoverable.

A possible alternative implementation could be to test each relocation in scanOne to see if the symbol is defined in a discarded section. You could then make a specific error message, or do the work to make the symbol undefined at that point. Not sure how well that would work out, but could be worth a prototype.

Yeah, it doesn't have to go through this logic. It's just that the logic for emitting these errors already existed (and only worked for Undefined), so I went that path.

I've found out why the tests are failing anyway. The ELF/linkerscript/discard-section.s contains a local reference that we would explicitly want to catch, so I think this test would need updating. The ARM.exidx tests are failing due a missing test for liveness in scanRelocations. The .ARM.exidx SyntheticSection has a list of input .ARM.exidx sections. When some, but not all of these input sections are discarded we need the liveness check.

This should fix that:

diff --git a/lld/ELF/Relocations.cpp b/lld/ELF/Relocations.cpp
index 441111616196..f3fb0c71a8b3 100644
--- a/lld/ELF/Relocations.cpp
+++ b/lld/ELF/Relocations.cpp
@@ -1574,7 +1574,8 @@ template <class ELFT> void elf::scanRelocations() {
         scanner.template scanSection<ELFT>(*sec);
       if (part.armExidx && part.armExidx->isLive())
         for (InputSection *sec : part.armExidx->exidxSections)
-          scanner.template scanSection<ELFT>(*sec);
+          if (sec->isLive())
+            scanner.template scanSection<ELFT>(*sec);
     }
   });
 }

With this you should be able remove the special case for section symbols. It should be possible to use yaml2obj to make a test case with a relocation to a local symbol, but with the special case removed I don't think you'll need them.

@bevin-hansson
Copy link
Contributor Author

Does GNU ld permit this, or does it result in dangling relocations? If GNU ld does permit this then there's a chance that this might break some programs where the dangling relocations are benign. I guess that we may need to add a command line option to suppress the errors, although maybe not initially.

When I tested a while back, GNU ld seems to keep the section if there is still a reference to it, even though it was marked /DISCARD/. I don't think this is possible in lld, since the relocation processing is done after the linker script handling, and we've already made the call to keep or not keep the section at that point.

I had a quick check and GNU ld on my machine at least gives a similar error message to what you are proposing.

Interesting. I'll have to try it again myself.

How does this work with uses of /DISCARD/ in a ld -r relocatable link? I guess a follow up question is what is it supposed to do for ld -r relocatable links.

I haven't tested -r. I would assume that discarding something in an -r invocation today does not trigger these types of errors, so they wouldn't with this patch either.

It looks like GNU ld does trigger these errors with ld -r at least with a trivial test case. LLD seems to give a warning in this case and outputs a R_*_NONE relocation in its place.

It would still be user error to /DISCARD/ a section where some other section had relocations to that section, but I don't think undefined symbols are checked for.

Hm, would it? Could you not discard a section that contains something which is referenced from the current file, and have that reference be resolved from some other object at the final link?

I think it depends on whether the linker resolves the relocation at relocatable link time, and whether it is a global definition or not. For example if we make a local symbol undefined that is likely to be unrecoverable.

Ah, that's true. Unresolved locals would be a problem.

A possible alternative implementation could be to test each relocation in scanOne to see if the symbol is defined in a discarded section. You could then make a specific error message, or do the work to make the symbol undefined at that point. Not sure how well that would work out, but could be worth a prototype.

Yeah, it doesn't have to go through this logic. It's just that the logic for emitting these errors already existed (and only worked for Undefined), so I went that path.

I've found out why the tests are failing anyway. The ELF/linkerscript/discard-section.s contains a local reference that we would explicitly want to catch, so I think this test would need updating. The ARM.exidx tests are failing due a missing test for liveness in scanRelocations. The .ARM.exidx SyntheticSection has a list of input .ARM.exidx sections. When some, but not all of these input sections are discarded we need the liveness check.

This should fix that:

diff --git a/lld/ELF/Relocations.cpp b/lld/ELF/Relocations.cpp
index 441111616196..f3fb0c71a8b3 100644
--- a/lld/ELF/Relocations.cpp
+++ b/lld/ELF/Relocations.cpp
@@ -1574,7 +1574,8 @@ template <class ELFT> void elf::scanRelocations() {
         scanner.template scanSection<ELFT>(*sec);
       if (part.armExidx && part.armExidx->isLive())
         for (InputSection *sec : part.armExidx->exidxSections)
-          scanner.template scanSection<ELFT>(*sec);
+          if (sec->isLive())
+            scanner.template scanSection<ELFT>(*sec);
     }
   });
 }

With this you should be able remove the special case for section symbols. It should be possible to use yaml2obj to make a test case with a relocation to a local symbol, but with the special case removed I don't think you'll need them.

Great, big thanks for finding this! I pushed the fix and test additions, along with some other suggested changes.

Copy link
Collaborator

@smithp35 smithp35 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks very much for the updates. This looks good to me know. I think it is worth the maintainer @MaskRay taking a look first before merging.

@MaskRay MaskRay self-requested a review October 14, 2023 02:31
lld/ELF/Driver.cpp Outdated Show resolved Hide resolved
lld/ELF/Driver.cpp Outdated Show resolved Hide resolved
After linker script processing, it may be the case that sections
were discarded via /DISCARD/, but relocations to symbols in those
sections remain. Currently, such symbols are still considered
Defined even though their section is not live and will not be
placed anywhere. This results in a silently broken binary.

This patch goes through the symbols after placement and changes
them from Defined to Undefined if their section is no longer
live at that point. During relocation processing, we will catch
such undefined symbols and report an error.

See llvm#58891.
@bevin-hansson
Copy link
Contributor Author

I rebased, adapted the test and addressed the comments.

secIdx = idx;
break;
}
idx++;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the section index code in both places. It was not giving the right index. I'm not sure why I didn't notice it earlier. I'm pretty sure it worked before.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a performance issue. If an object file has many discarded symbols, the quadratic time complexity will not be acceptable. I have fixed this in https://github.com/maskray/llvm-project/tree/lld-symbols-in-discard

@MaskRay
Copy link
Member

MaskRay commented Oct 17, 2023

I think we need a different approach to implement symbol demoting. I re-implemented the stuff in https://github.com/maskray/llvm-project/tree/lld-symbols-in-discard and added you as a co-author :)

secIdx = idx;
break;
}
idx++;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a performance issue. If an object file has many discarded symbols, the quadratic time complexity will not be acceptable. I have fixed this in https://github.com/maskray/llvm-project/tree/lld-symbols-in-discard


// Demote locals which did not end up in any partition. This is similar
// to what we do in Driver.cpp, but that only works on globals.
if (script->seenDiscard && dr->section && !dr->section->isLive()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not need this copy. This can be unified with Driver.cpp. See https://github.com/maskray/llvm-project/tree/lld-symbols-in-discard

# RUN: ld.lld -r -T a.lds a.o b.o -o /dev/null 2>&1 | count 0
# RUN: not ld.lld --threads=1 -T a.lds a.o b.o -z undefs -o /dev/null 2>&1 | FileCheck %s --check-prefix=SECTION --implicit-check-not=error:
# RUN: not ld.lld --threads=1 -T a.lds a.o b.o -o /dev/null 2>&1 | FileCheck %s --check-prefixes=SECTION,SYMBOL --implicit-check-not=error:
# RUN: ld.lld -r -T a.lds a.o b.o -o /dev/null 2>&1 | FileCheck %s --check-prefix=WARNING --implicit-check-not=error:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--implicit-check-not=warning:

exit code 0 means that there cannot be any error.

@@ -3061,6 +3061,37 @@ void LinkerDriver::link(opt::InputArgList &args) {
script->addOrphanSections();
}

// Demote symbols defined relative to input sections that are discarded by
// /DISCARD/ so that relocations referencing them will get reported.
if (script->seenDiscard) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a corner case but I think is worth testing. The demotion logic applies --gc-sections as well but we only run the code when /DISCARD/ is seen, so there will be some behavior differences.

I have updated lld/test/ELF/gc-sections-tls.s in https://github.com/MaskRay/llvm-project/tree/lld-symbols-in-discard to test this.

Actually, we probably should do #69295 so that all non-local symbols in discarded sections are demoted, regardless of whether /DISCARD/ is seen? @smithp35

@bevin-hansson
Copy link
Contributor Author

Your alternate approach seems good, I was also worried about the performance impact of looking through all the symbols like that.

The TLS case was a bit unexpected to me. I wouldn't have thought that the demotion could cause any issues related to --gc-sections, since sections with references to them cannot be discarded by it.

@MaskRay
Copy link
Member

MaskRay commented Oct 19, 2023

Superseded by #69295.

@MaskRay MaskRay closed this Oct 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants