Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion bolt/include/bolt/Core/BinaryFunction.h
Original file line number Diff line number Diff line change
Expand Up @@ -650,7 +650,8 @@ class BinaryFunction {
///
/// Prefer to use BinaryContext::getFunctionForSymbol(EntrySymbol, &ID)
/// instead of calling this function directly.
uint64_t getEntryIDForSymbol(const MCSymbol *EntrySymbol) const;
std::optional<uint64_t>
getEntryIDForSymbol(const MCSymbol *EntrySymbol) const;

/// If the function represents a secondary split function fragment, set its
/// parent fragment to \p BF.
Expand Down
11 changes: 9 additions & 2 deletions bolt/lib/Core/BinaryContext.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2374,8 +2374,15 @@ BinaryFunction *BinaryContext::getFunctionForSymbol(const MCSymbol *Symbol,
return nullptr;

BinaryFunction *BF = BFI->second;
if (EntryDesc)
*EntryDesc = BF->getEntryIDForSymbol(Symbol);
if (EntryDesc) {
const uint64_t Address = BF->getAddress() + Symbol->getOffset();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm.. I'm not sure Symbol->getOffset() will always produce the expected output.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking a look. Could you please elaborate on what might be the problem here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, sorry. Symbol->getOffset() is not guaranteed to return the offset from the start of the function at arbitrary point of execution.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I understand, if Symbol->getOffset() does not return the symbol offset, then it returns 0. In which case const uint64_t Address = BF->getAddress() + Symbol->getOffset(); becomes the same as BF->getAddress(). For the purpose of this patch, I think that is fine. Whether we're checking the function address or the symbol address, the point is to check if we're in a data area (constant island) before proceeding. This check is only reached in getFunctionForSymbol as a way to prevent calling getEntryIDForSymbol, and avoid hitting the llvm_unreachable, if we're in a data area.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I'd prefer to make getEntryIDForSymbol() return an optional (i.e. remove the unreachable) and not rely on the implicit behavior.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I'll look into making it an optional.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's now returning an optional. I did keep the Symbol->getOffset() check in order to give a warning to the user, but we are no longer relying on it to avoid thellvm_unreachable.

if (BF->isInConstantIsland(Address)) {
this->outs() << "BOLT-WARNING: symbol " << Symbol->getName()
<< " is in data region of function " << BF->getOneName()
<< "(0x" << Twine::utohexstr(BF->getAddress()) << ").\n";
}
*EntryDesc = BF->getEntryIDForSymbol(Symbol).value_or(0);
}

return BF;
}
Expand Down
6 changes: 3 additions & 3 deletions bolt/lib/Core/BinaryFunction.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -3855,7 +3855,8 @@ MCSymbol *BinaryFunction::getSymbolForEntryID(uint64_t EntryID) {
return nullptr;
}

uint64_t BinaryFunction::getEntryIDForSymbol(const MCSymbol *Symbol) const {
std::optional<uint64_t>
BinaryFunction::getEntryIDForSymbol(const MCSymbol *Symbol) const {
if (!isMultiEntry())
return 0;

Expand All @@ -3882,8 +3883,7 @@ uint64_t BinaryFunction::getEntryIDForSymbol(const MCSymbol *Symbol) const {
return NumEntries;
++NumEntries;
}

llvm_unreachable("symbol not found");
return std::nullopt;
}

bool BinaryFunction::forEachEntryPoint(EntryPointCallbackTy Callback) const {
Expand Down
36 changes: 36 additions & 0 deletions bolt/test/AArch64/check-symbol-area.s
Original file line number Diff line number Diff line change
@@ -0,0 +1,36 @@
// This test checks that when looking for a function corresponding to a
// symbol, BOLT is not looking through a data area (constant island).

# RUN: llvm-mc -filetype=obj -triple aarch64-unknown-unknown %s -o %t.o
# RUN: %clang %cflags %t.o -o %t.exe -Wl,-q
# RUN: llvm-bolt %t.exe -o %t.bolt 2>&1 | FileCheck %s

// Before adding a check for constant islands, BOLT would exit with an error
// of the form: "symbol not found" and throw an LLVM UNREACHABLE error.
# CHECK-NOT: symbol not found
# CHECK-NOT: UNREACHABLE

// Now BOLT throws a warning and does not crash.
# CHECK: BOLT-WARNING: symbol third_block/1 is in data region of function first_block(0x{{[0-9a-f]+}}).

.text
.global main
main:
add x0, x1, x1
bl first_block
ret

.global first_block
$d:
first_block:
add x0, x1, x1
bl second_block
ret
second_block:
add x0, x1, x1
bl third_block
ret
$x:
third_block:
add x0, x1, x1
ret
Loading