Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[lld][ELF] Use the containing symbol, not the referenced symbol, for undefined symbol errors. #70800

Closed

Conversation

bevin-hansson
Copy link
Contributor

When reporting undefined symbols, the symbol being used to find
the source location for where the symbol was referenced from was
the symbol being referenced, not the symbol where the reference
was made from.

Use the relocation offset to locate the correct symbol.

With this, the following program:

extern int ex;

int * p[] = { &ex };

int main() {
  return 0;
}

When built with clang -fuse-ld=lld a.c -g, it goes from

ld.lld: error: undefined symbol: ex
>>> referenced by a.c
>>>               /tmp/a-0e7dc8.o:(p)

to

ld.lld: error: undefined symbol: ex
>>> referenced by a.c:3 ([...]/a.c:3)
>>>               /tmp/a-0e7dc8.o:(p)

I'm unsure how to construct a reasonable lit test for this.

…undefined symbol errors.

When reporting undefined symbols, the symbol being used to find
the source location for where the symbol was referenced from was
the symbol being referenced, not the symbol where the reference
was made from.

Use the relocation offset to locate the correct symbol.

With this, the following program:

```
extern int ex;

int * p[] = { &ex };

int main() {
  return 0;
}
```

When built with `clang -fuse-ld=lld a.c -g`, it goes from
```
ld.lld: error: undefined symbol: ex
>>> referenced by a.c
>>>               /tmp/a-0e7dc8.o:(p)
```
to
```
ld.lld: error: undefined symbol: ex
>>> referenced by a.c:3 ([...]/a.c:3)
>>>               /tmp/a-0e7dc8.o:(p)
```

I'm unsure how to construct a reasonable lit test for this.
@llvmbot
Copy link
Collaborator

llvmbot commented Oct 31, 2023

@llvm/pr-subscribers-lld

@llvm/pr-subscribers-lld-elf

Author: Bevin Hansson (bevin-hansson)

Changes

When reporting undefined symbols, the symbol being used to find
the source location for where the symbol was referenced from was
the symbol being referenced, not the symbol where the reference
was made from.

Use the relocation offset to locate the correct symbol.

With this, the following program:

extern int ex;

int * p[] = { &ex };

int main() {
  return 0;
}

When built with clang -fuse-ld=lld a.c -g, it goes from

ld.lld: error: undefined symbol: ex
>>> referenced by a.c
>>>               /tmp/a-0e7dc8.o:(p)

to

ld.lld: error: undefined symbol: ex
>>> referenced by a.c:3 ([...]/a.c:3)
>>>               /tmp/a-0e7dc8.o:(p)

I'm unsure how to construct a reasonable lit test for this.


Full diff: https://github.com/llvm/llvm-project/pull/70800.diff

1 Files Affected:

  • (modified) lld/ELF/Relocations.cpp (+29-1)
diff --git a/lld/ELF/Relocations.cpp b/lld/ELF/Relocations.cpp
index f3fb0c71a8b3064..bf4d452f3ed9f79 100644
--- a/lld/ELF/Relocations.cpp
+++ b/lld/ELF/Relocations.cpp
@@ -693,6 +693,34 @@ static const Symbol *getAlternativeSpelling(const Undefined &sym,
   return nullptr;
 }
 
+static Symbol &
+getSymAtOffset(InputSectionBase &sec, uint64_t off, Symbol &sym) {
+  ArrayRef<Symbol *> syms;
+  switch (config->ekind) {
+  case ELF32LEKind:
+    syms = sec.getFile<ELF32LE>()->getSymbols();
+    break;
+  case ELF32BEKind:
+    syms = sec.getFile<ELF32BE>()->getSymbols();
+    break;
+  case ELF64LEKind:
+    syms = sec.getFile<ELF64LE>()->getSymbols();
+    break;
+  case ELF64BEKind:
+    syms = sec.getFile<ELF64BE>()->getSymbols();
+    break;
+  default:
+    llvm_unreachable("");
+  }
+  // Get the symbol that contains this offset.
+  for (Symbol *b : syms)
+    if (auto *d = dyn_cast_or_null<Defined>(b))
+      if (d->section == &sec && d->value <= off && off < d->value + d->size)
+        return *d;
+  // Fall back to the supplied symbol otherwise.
+  return sym;
+}
+
 static void reportUndefinedSymbol(const UndefinedDiag &undef,
                                   bool correctSpelling) {
   Undefined &sym = *undef.sym;
@@ -739,7 +767,7 @@ static void reportUndefinedSymbol(const UndefinedDiag &undef,
     uint64_t offset = l.offset;
 
     msg += "\n>>> referenced by ";
-    std::string src = sec.getSrcMsg(sym, offset);
+    std::string src = sec.getSrcMsg(getSymAtOffset(sec, offset, sym), offset);
     if (!src.empty())
       msg += src + "\n>>>               ";
     msg += sec.getObjMsg(offset);

Copy link

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 9ca6bf3fb7b7df373723b3275730f101f9ff816b f2d5efdb7250eaf8481c86d13a45528cb77c612c -- lld/ELF/Relocations.cpp
View the diff from clang-format here.
diff --git a/lld/ELF/Relocations.cpp b/lld/ELF/Relocations.cpp
index bf4d452f3ed9..f5b78d51f0e1 100644
--- a/lld/ELF/Relocations.cpp
+++ b/lld/ELF/Relocations.cpp
@@ -693,8 +693,8 @@ static const Symbol *getAlternativeSpelling(const Undefined &sym,
   return nullptr;
 }
 
-static Symbol &
-getSymAtOffset(InputSectionBase &sec, uint64_t off, Symbol &sym) {
+static Symbol &getSymAtOffset(InputSectionBase &sec, uint64_t off,
+                              Symbol &sym) {
   ArrayRef<Symbol *> syms;
   switch (config->ekind) {
   case ELF32LEKind:

Copy link
Collaborator

@smithp35 smithp35 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code changes look good to me. I think it should be possible to write a lit test for this using the .file and .loc directives. The only example in the lld testsuite that does this is debug-line-obj.s

You may be able to adapt that, or something similar to make a test case.

@@ -693,6 +693,34 @@ static const Symbol *getAlternativeSpelling(const Undefined &sym,
return nullptr;
}

static Symbol &
getSymAtOffset(InputSectionBase &sec, uint64_t off, Symbol &sym) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would getSymContainingOffset() be a better name, as we're looking for the symbol with off in the range [sym->off,, sym->off + sym->size) .

@MaskRay
Copy link
Member

MaskRay commented Oct 31, 2023

Thanks for the improvement. Can you change the description to mention https://reviews.llvm.org/D31481 (for easier archaeology in the future)?

@MaskRay
Copy link
Member

MaskRay commented Oct 31, 2023

The sym parameter of getSrcMsg provides a fallback when line number information is unavailable.
The line number information covers cases when the undefined symbol is from a function.

int * p[] = { &ex };

This is about a case when the undefined symbol is from a variable with no line number information.
If DWARF DW_TAG_variable is available (-g2 instead of -gmlt), we can report the variable location.
I am nervous if we need to spend these lines of code to give a diagnostic for this case and I'd probably choose not.

Fortunately, I think we can refactor the existing code (getEnclosingFunction) to give this diagnostic improvement with nearly no additional complexity:
https://github.com/MaskRay/llvm-project/tree/lld-enclosing-function I haven't added a test yet.

@bevin-hansson
Copy link
Contributor Author

I am nervous if we need to spend these lines of code to give a diagnostic for this case and I'd probably choose not.

Fortunately, I think we can refactor the existing code (getEnclosingFunction) to give this diagnostic improvement with nearly no additional complexity: https://github.com/MaskRay/llvm-project/tree/lld-enclosing-function I haven't added a test yet.

The change in your patch seems reasonable, since we can reuse the existing function for finding the enclosing symbol. Do you want me to transplant those changes into this PR?

@bevin-hansson
Copy link
Contributor Author

Superseded by #70854

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants