[lldb] Improve/fix base address selection in location lists
Summary:
Lldb's support for base address selection entries in location lists has
been broken for a long time. This wasn't noticed until llvm started
producing these kinds of entries more frequently with r374600.

In r374769, I made a quick patch which added sufficient support for them
to get the test suite to pass. However, I did not fully understand how
this code operates, and so the fix was not complete. Specifically, what
was lacking was the ability to handle modules which were not loaded at
their preferred load address (for instance, due to ASLR).

Now that I better understand how this code works, I've come to the
conclusion that the current setup does not provide enough information
to correctly process these entries. In the current setup the location
lists were parameterized by two addresses:
- the distance of the function start from the start of the compile unit.
  The purpose of this was to make the location ranges relative to the
  start of the function.
- the actual address at which the function was loaded. With this, the
  function-start-relative ranges can be translated to actual memory
  locations.

The reason for the two values, instead of just one (the load bias), is
(I think) MachO, where the debug info in the object files will appear to
be relative to address zero, but the actual code it refers to
can be moved and reordered by the linker. This means that the location
lists need to be "linked" to reflect the locations in the actual linked
file.

These two bits of information were enough to correctly process location
lists which do not contain base address selection entries (and so all
entries are relative to the CU base). However, they don't work with
such entries because, in theory, two base addresses can be completely
unrelated (as can happen, for instance, with hot/cold function
splitting, where the linker can reorder the two parts arbitrarily).
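
To make the limitation concrete, the old arithmetic can be modelled roughly
as below. This is a simplified sketch reconstructed from the lines this
commit removes, not the lldb code itself: `Entry`, `kBaseSelect` and
`OldResolve` are hypothetical names, entries are assumed to be 8-byte
address pairs, and the expression payload of each entry is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using addr_t = uint64_t;

// Hypothetical marker: an entry whose first value is all ones selects a new
// base address (its second value).
constexpr addr_t kBaseSelect = UINT64_MAX;

// Hypothetical, simplified location list entry (no expression payload).
struct Entry {
  addr_t lo;
  addr_t hi;
};

// Old scheme: `slide` is the distance of the function start from the CU
// base, `func_load_addr` is where the function actually sits in memory.
std::vector<Entry> OldResolve(const std::vector<Entry> &list, addr_t slide,
                              addr_t func_load_addr) {
  std::vector<Entry> out;
  addr_t base = func_load_addr;
  for (const Entry &e : list) {
    if (e.lo == 0 && e.hi == 0)
      break; // end-of-list entry
    if (e.lo == kBaseSelect) {
      base = e.hi + slide; // the slide added here is subtracted again below
      continue;
    }
    assert(e.lo <= e.hi); // well-formed range
    out.push_back({e.lo + base - slide, e.hi + base - slide});
  }
  return out;
}
```

With a CU base of 0x1000, a function at file address 0x1040 (slide 0x40)
and a load address of 0x11040, a plain entry {0x40, 0x50} resolves to
[0x11040, 0x11050) as expected. After a base address selection entry for
0x2000, however, the entry {0x0, 0x10} resolves to [0x2000, 0x2010): the
file address, with the load bias of 0x10000 lost.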

To fix that, I split the first parameter into two:
- the compile unit base address
- the function start address, as is known in the object file

The new algorithm becomes:
- the location lists are processed as they were meant to be processed.
  The CU base address is used as the initial base address value. Base
  address selection entries can set a new base.
- the difference between the "file" and "load" function start addresses
  is used to compute the load bias. This value is added to the final
  ranges to get the actual memory location.
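
The two steps above can be sketched as follows. This is a minimal model
under stated assumptions rather than the actual implementation: `Entry`,
`kBaseSelect` and `NewResolve` are hypothetical names, entries are assumed
to be 8-byte address pairs, and the expression payload is omitted. Only
`LoclistAddresses` and its two fields mirror names from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using addr_t = uint64_t;

// Hypothetical marker: an entry whose first value is all ones selects a new
// base address (its second value).
constexpr addr_t kBaseSelect = UINT64_MAX;

// Hypothetical, simplified location list entry (no expression payload).
struct Entry {
  addr_t lo;
  addr_t hi;
};

// Mirrors the struct added by the patch.
struct LoclistAddresses {
  addr_t cu_file_addr;   // initial base address for relative entries
  addr_t func_file_addr; // function start as known in the object file
};

std::vector<Entry> NewResolve(const std::vector<Entry> &list,
                              LoclistAddresses addrs, addr_t func_load_addr) {
  std::vector<Entry> out;
  // Step 1: process the list as written, starting from the CU base.
  addr_t base = addrs.cu_file_addr;
  // Step 2: the load bias is the "load" minus the "file" function start.
  const addr_t load_bias = func_load_addr - addrs.func_file_addr;
  for (const Entry &e : list) {
    if (e.lo == 0 && e.hi == 0)
      break; // end-of-list entry
    if (e.lo == kBaseSelect) {
      base = e.hi; // new base, still a file address
      continue;
    }
    assert(e.lo <= e.hi); // well-formed range
    out.push_back({e.lo + base + load_bias, e.hi + base + load_bias});
  }
  return out;
}
```

With a CU base of 0x1000, a function at file address 0x1040 and a load
address of 0x11040 (load bias 0x10000), a plain entry {0x40, 0x50}
resolves to [0x11040, 0x11050), and an entry {0x0, 0x10} following a base
address selection entry for 0x2000 resolves to [0x12000, 0x12010).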

This algorithm is correct for non-MachO debug info, as there the
location lists correctly describe the code in the final executable, and
the dynamic linker can just move the entire module, not pieces of it. It
will also be correct for MachO if the static linker preserves relative
positions of the various parts of the location lists -- I don't know
whether it actually does that, but judging by the lack of base address
selection support in dsymutil and lldb, this isn't something that has
come up in the past.

I add a test case which simulates the ASLR scenario and demonstrates
that base address selection entries now work correctly here.

Reviewers: JDevlieghere, aprantl, clayborg

Subscribers: dblaikie, lldb-commits

Tags: #lldb

Differential Revision: https://reviews.llvm.org/D70532
labath committed Dec 9, 2019
1 parent dba420b commit 329008f
Showing 6 changed files with 247 additions and 76 deletions.
43 changes: 24 additions & 19 deletions lldb/include/lldb/Expression/DWARFExpression.h
@@ -85,8 +85,8 @@ class DWARFExpression {

/// Search for a load address in the location list
///
/// \param[in] process
/// The process to use when resolving the load address
/// \param[in] func_load_addr
/// The actual address of the function containing this location list.
///
/// \param[in] addr
/// The address to resolve
@@ -98,7 +98,7 @@ class DWARFExpression {
// LocationListContainsLoadAddress (Process* process, const Address &addr)
// const;
//
bool LocationListContainsAddress(lldb::addr_t loclist_base_addr,
bool LocationListContainsAddress(lldb::addr_t func_load_addr,
lldb::addr_t addr) const;

/// If a location is not a location list, return true if the location
@@ -138,13 +138,15 @@ class DWARFExpression {

/// Tells the expression that it refers to a location list.
///
/// \param[in] slide
/// This value should be a slide that is applied to any values
/// in the location list data so the values become zero based
/// offsets into the object that owns the location list. We need
/// to make location lists relative to the objects that own them
/// so we can relink addresses on the fly.
void SetLocationListSlide(lldb::addr_t slide);
/// \param[in] cu_file_addr
/// The base address to use for interpreting relative location list
/// entries.
/// \param[in] func_file_addr
/// The file address of the function containing this location list. This
/// address will be used to relocate the location list on the fly (in
/// conjunction with the func_load_addr arguments).
void SetLocationListAddresses(lldb::addr_t cu_file_addr,
lldb::addr_t func_file_addr);

/// Return the call-frame-info style register kind
int GetRegisterKind();
@@ -158,8 +160,7 @@ class DWARFExpression {
/// Wrapper for the static evaluate function that accepts an
/// ExecutionContextScope instead of an ExecutionContext and uses member
/// variables to populate many operands
bool Evaluate(ExecutionContextScope *exe_scope,
lldb::addr_t loclist_base_load_addr,
bool Evaluate(ExecutionContextScope *exe_scope, lldb::addr_t func_load_addr,
const Value *initial_value_ptr, const Value *object_address_ptr,
Value &result, Status *error_ptr) const;

@@ -222,8 +223,8 @@ class DWARFExpression {
}

bool DumpLocationForAddress(Stream *s, lldb::DescriptionLevel level,
lldb::addr_t loclist_base_load_addr,
lldb::addr_t address, ABI *abi);
lldb::addr_t func_load_addr, lldb::addr_t address,
ABI *abi);

static bool PrintDWARFExpression(Stream &s, const DataExtractor &data,
int address_size, int dwarf_ref_size,
@@ -256,7 +257,7 @@ class DWARFExpression {
void DumpLocation(Stream *s, lldb::offset_t offset, lldb::offset_t length,
lldb::DescriptionLevel level, ABI *abi) const;

bool GetLocation(lldb::addr_t base_addr, lldb::addr_t pc,
bool GetLocation(lldb::addr_t func_load_addr, lldb::addr_t pc,
lldb::offset_t &offset, lldb::offset_t &len);

static bool AddressRangeForLocationListEntry(
@@ -266,6 +267,9 @@ class DWARFExpression {
bool GetOpAndEndOffsets(StackFrame &frame, lldb::offset_t &op_offset,
lldb::offset_t &end_offset);

void RelocateLowHighPC(lldb::addr_t base_address, lldb::addr_t func_load_addr,
lldb::addr_t &low_pc, lldb::addr_t &high_pc) const;

/// Module which defined this expression.
lldb::ModuleWP m_module_wp;

@@ -280,10 +284,11 @@ class DWARFExpression {
/// One of the defines that starts with LLDB_REGKIND_
lldb::RegisterKind m_reg_kind;

/// A value used to slide the location list offsets so that m_c they are
/// relative to the object that owns the location list (the function for
/// frame base and variable location lists)
lldb::addr_t m_loclist_slide;
struct LoclistAddresses {
lldb::addr_t cu_file_addr;
lldb::addr_t func_file_addr;
};
llvm::Optional<LoclistAddresses> m_loclist_addresses;
};

} // namespace lldb_private
109 changes: 58 additions & 51 deletions lldb/source/Expression/DWARFExpression.cpp
@@ -56,13 +56,13 @@ ReadAddressFromDebugAddrSection(const DWARFUnit *dwarf_cu,
// DWARFExpression constructor
DWARFExpression::DWARFExpression()
: m_module_wp(), m_data(), m_dwarf_cu(nullptr),
m_reg_kind(eRegisterKindDWARF), m_loclist_slide(LLDB_INVALID_ADDRESS) {}
m_reg_kind(eRegisterKindDWARF) {}

DWARFExpression::DWARFExpression(lldb::ModuleSP module_sp,
const DataExtractor &data,
const DWARFUnit *dwarf_cu)
: m_module_wp(), m_data(data), m_dwarf_cu(dwarf_cu),
m_reg_kind(eRegisterKindDWARF), m_loclist_slide(LLDB_INVALID_ADDRESS) {
m_reg_kind(eRegisterKindDWARF) {
if (module_sp)
m_module_wp = module_sp;
}
@@ -94,8 +94,9 @@ void DWARFExpression::DumpLocation(Stream *s, lldb::offset_t offset,
nullptr);
}

void DWARFExpression::SetLocationListSlide(addr_t slide) {
m_loclist_slide = slide;
void DWARFExpression::SetLocationListAddresses(addr_t cu_file_addr,
addr_t func_file_addr) {
m_loclist_addresses = LoclistAddresses{cu_file_addr, func_file_addr};
}

int DWARFExpression::GetRegisterKind() { return m_reg_kind; }
@@ -105,7 +106,7 @@ void DWARFExpression::SetRegisterKind(RegisterKind reg_kind) {
}

bool DWARFExpression::IsLocationList() const {
return m_loclist_slide != LLDB_INVALID_ADDRESS;
return bool(m_loclist_addresses);
}

void DWARFExpression::GetDescription(Stream *s, lldb::DescriptionLevel level,
@@ -614,46 +615,43 @@ bool DWARFExpression::LinkThreadLocalStorage(
return true;
}

bool DWARFExpression::LocationListContainsAddress(
lldb::addr_t loclist_base_addr, lldb::addr_t addr) const {
if (addr == LLDB_INVALID_ADDRESS)
bool DWARFExpression::LocationListContainsAddress(addr_t func_load_addr,
lldb::addr_t addr) const {
if (func_load_addr == LLDB_INVALID_ADDRESS || addr == LLDB_INVALID_ADDRESS)
return false;

if (IsLocationList()) {
lldb::offset_t offset = 0;

if (loclist_base_addr == LLDB_INVALID_ADDRESS)
return false;
if (!IsLocationList())
return false;

while (m_data.ValidOffset(offset)) {
// We need to figure out what the value is for the location.
addr_t lo_pc = LLDB_INVALID_ADDRESS;
addr_t hi_pc = LLDB_INVALID_ADDRESS;
if (!AddressRangeForLocationListEntry(m_dwarf_cu, m_data, &offset, lo_pc,
hi_pc))
break;
lldb::offset_t offset = 0;
lldb::addr_t base_address = m_loclist_addresses->cu_file_addr;
while (m_data.ValidOffset(offset)) {
// We need to figure out what the value is for the location.
addr_t lo_pc = LLDB_INVALID_ADDRESS;
addr_t hi_pc = LLDB_INVALID_ADDRESS;
if (!AddressRangeForLocationListEntry(m_dwarf_cu, m_data, &offset, lo_pc,
hi_pc))
break;

if (lo_pc == 0 && hi_pc == 0)
break;
if (lo_pc == 0 && hi_pc == 0)
break;

if ((m_data.GetAddressByteSize() == 4 && (lo_pc == UINT32_MAX)) ||
(m_data.GetAddressByteSize() == 8 && (lo_pc == UINT64_MAX))) {
loclist_base_addr = hi_pc + m_loclist_slide;
continue;
}
lo_pc += loclist_base_addr - m_loclist_slide;
hi_pc += loclist_base_addr - m_loclist_slide;
if ((m_data.GetAddressByteSize() == 4 && (lo_pc == UINT32_MAX)) ||
(m_data.GetAddressByteSize() == 8 && (lo_pc == UINT64_MAX))) {
base_address = hi_pc;
continue;
}
RelocateLowHighPC(base_address, func_load_addr, lo_pc, hi_pc);

if (lo_pc <= addr && addr < hi_pc)
return true;
if (lo_pc <= addr && addr < hi_pc)
return true;

offset += m_data.GetU16(&offset);
}
offset += m_data.GetU16(&offset);
}
return false;
}

bool DWARFExpression::GetLocation(addr_t base_addr, addr_t pc,
bool DWARFExpression::GetLocation(addr_t func_load_addr, addr_t pc,
lldb::offset_t &offset,
lldb::offset_t &length) {
offset = 0;
@@ -662,9 +660,8 @@ bool DWARFExpression::GetLocation(addr_t base_addr, addr_t pc,
return true;
}

if (base_addr != LLDB_INVALID_ADDRESS && pc != LLDB_INVALID_ADDRESS) {
addr_t curr_base_addr = base_addr;

if (func_load_addr != LLDB_INVALID_ADDRESS && pc != LLDB_INVALID_ADDRESS) {
addr_t base_address = m_loclist_addresses->cu_file_addr;
while (m_data.ValidOffset(offset)) {
// We need to figure out what the value is for the location.
addr_t lo_pc = LLDB_INVALID_ADDRESS;
@@ -678,13 +675,11 @@ bool DWARFExpression::GetLocation(addr_t base_addr, addr_t pc,

if ((m_data.GetAddressByteSize() == 4 && (lo_pc == UINT32_MAX)) ||
(m_data.GetAddressByteSize() == 8 && (lo_pc == UINT64_MAX))) {
curr_base_addr = hi_pc + m_loclist_slide;
base_address = hi_pc;
continue;
}

lo_pc += curr_base_addr - m_loclist_slide;
hi_pc += curr_base_addr - m_loclist_slide;

RelocateLowHighPC(base_address, func_load_addr, lo_pc, hi_pc);
length = m_data.GetU16(&offset);

if (length > 0 && lo_pc <= pc && pc < hi_pc)
@@ -700,12 +695,12 @@ bool DWARFExpression::GetLocation(addr_t base_addr, addr_t pc,

bool DWARFExpression::DumpLocationForAddress(Stream *s,
lldb::DescriptionLevel level,
addr_t base_addr, addr_t address,
ABI *abi) {
addr_t func_load_addr,
addr_t address, ABI *abi) {
lldb::offset_t offset = 0;
lldb::offset_t length = 0;

if (GetLocation(base_addr, address, offset, length)) {
if (GetLocation(func_load_addr, address, offset, length)) {
if (length > 0) {
DumpLocation(s, offset, length, level, abi);
return true;
@@ -936,7 +931,7 @@ bool DWARFExpression::Evaluate(ExecutionContextScope *exe_scope,

bool DWARFExpression::Evaluate(ExecutionContext *exe_ctx,
RegisterContext *reg_ctx,
lldb::addr_t loclist_base_load_addr,
lldb::addr_t func_load_addr,
const Value *initial_value_ptr,
const Value *object_address_ptr, Value &result,
Status *error_ptr) const {
@@ -958,15 +953,14 @@ bool DWARFExpression::Evaluate(ExecutionContext *exe_ctx,
pc = reg_ctx_sp->GetPC();
}

if (loclist_base_load_addr != LLDB_INVALID_ADDRESS) {
if (func_load_addr != LLDB_INVALID_ADDRESS) {
if (pc == LLDB_INVALID_ADDRESS) {
if (error_ptr)
error_ptr->SetErrorString("Invalid PC in frame.");
return false;
}

addr_t curr_loclist_base_load_addr = loclist_base_load_addr;

addr_t base_address = m_loclist_addresses->cu_file_addr;
while (m_data.ValidOffset(offset)) {
// We need to figure out what the value is for the location.
addr_t lo_pc = LLDB_INVALID_ADDRESS;
@@ -982,12 +976,11 @@ bool DWARFExpression::Evaluate(ExecutionContext *exe_ctx,
(lo_pc == UINT32_MAX)) ||
(m_data.GetAddressByteSize() == 8 &&
(lo_pc == UINT64_MAX))) {
curr_loclist_base_load_addr = hi_pc + m_loclist_slide;
base_address = hi_pc;
continue;
}
lo_pc += curr_loclist_base_load_addr - m_loclist_slide;
hi_pc += curr_loclist_base_load_addr - m_loclist_slide;

RelocateLowHighPC(base_address, func_load_addr, lo_pc, hi_pc);
uint16_t length = m_data.GetU16(&offset);

if (length > 0 && lo_pc <= pc && pc < hi_pc) {
@@ -2970,6 +2963,20 @@ bool DWARFExpression::GetOpAndEndOffsets(StackFrame &frame,
return true;
}

void DWARFExpression::RelocateLowHighPC(addr_t base_address,
addr_t func_load_addr, addr_t &low_pc,
addr_t &high_pc) const {
// How this works:
// base_address is the current base address, as known in the file. low_pc and
// high_pc are relative to that. First, we relocate the base address by
// applying the load bias (the difference between an address in the file and
// the actual address in memory). Then we relocate low_pc and high_pc based on
// that.
base_address += func_load_addr - m_loclist_addresses->func_file_addr;
low_pc += base_address;
high_pc += base_address;
}

bool DWARFExpression::MatchesOperand(StackFrame &frame,
const Instruction::Operand &operand) {
using namespace OperandMatchers;
6 changes: 3 additions & 3 deletions lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfoEntry.cpp
@@ -350,8 +350,8 @@ bool DWARFDebugInfoEntry::GetDIENamesAndRanges(
*frame_base = DWARFExpression(module, data, cu);
if (lo_pc != LLDB_INVALID_ADDRESS) {
assert(lo_pc >= cu->GetBaseAddress());
frame_base->SetLocationListSlide(lo_pc -
cu->GetBaseAddress());
frame_base->SetLocationListAddresses(cu->GetBaseAddress(),
lo_pc);
} else {
set_frame_base_loclist_addr = true;
}
@@ -379,7 +379,7 @@ bool DWARFDebugInfoEntry::GetDIENamesAndRanges(
if (set_frame_base_loclist_addr) {
dw_addr_t lowest_range_pc = ranges.GetMinRangeBase(0);
assert(lowest_range_pc >= cu->GetBaseAddress());
frame_base->SetLocationListSlide(lowest_range_pc - cu->GetBaseAddress());
frame_base->SetLocationListAddresses(cu->GetBaseAddress(), lowest_range_pc);
}

if (ranges.IsEmpty() || name == nullptr || mangled == nullptr) {
6 changes: 3 additions & 3 deletions lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
@@ -3362,9 +3362,9 @@ VariableSP SymbolFileDWARF::ParseVariableDIE(const SymbolContext &sc,
data = DataExtractor(data, offset, data.GetByteSize() - offset);
location = DWARFExpression(module, data, die.GetCU());
assert(func_low_pc != LLDB_INVALID_ADDRESS);
location.SetLocationListSlide(
func_low_pc -
attributes.CompileUnitAtIndex(i)->GetBaseAddress());
location.SetLocationListAddresses(
attributes.CompileUnitAtIndex(i)->GetBaseAddress(),
func_low_pc);
}
}
} break;
30 changes: 30 additions & 0 deletions lldb/test/Shell/SymbolFile/DWARF/Inputs/debug_loc-aslr.yaml
@@ -0,0 +1,30 @@
--- !minidump
Streams:
- Type: ThreadList
Threads:
- Thread Id: 0x00003E81
Context: 0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000B001000000000006CAE000000006B7FC05A0000C81D415A0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000A2BF9E5A6B7F0000000000000000000000000000000000008850C14BFD7F00009850C14BFD7F00000100000000000000B04AC14BFD7F0000000000000000000060812D01000000000800000000000000B065E05A6B7F00008004400000000000E050C14BFD7F00000000000000000000000000000000000001004700000000007F03FFFF0000FFFFFFFFFFFF000000000000000000000000801F00006B7F00000400000000000000B84CC14BFD7F0000304D405A6B7F0000C84DC14BFD7F0000C0AA405A6B7F00004F033D0000000000B84DC14BFD7F0000E84DC14BFD7F0000000000000000000000000000000000000070E05A6B7F000078629E5A6B7F0000C81D415A6B7F0000804F9E5A6B7F00000000000001000000E603000001000000E093115A6B7F0000804EC14BFD7F0000584EC14BFD7F000099ADC05A6B7F00000100000000000000AAAAD77D0000000002000000000000000800000000000000B065E05A6B7F0000E6B7C05A6B7F0000010000006B7F0000884DC14BFD7F0000106F7C5A6B7F0000984EC14BFD7F0000488B7C5A6B7F0000C4A71CB90000000001000000000000000800000000000000B065E05A6B7F000048B6C05A6B7F0000702AE25A6B7F0000D84DC14BFD7F000030489E5A6B7F0000E84EC14BFD7F0000E05E9E5A6B7F00000991F0460000000001000000000000000800000000000000B065E05A6B7F000048B6C05A6B7F00000100000000000000284EC14BFD7F000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
Stack:
Start of Memory Range: 0x00007FFCEB34A000
Content: DEAD
- Type: SystemInfo
Processor Arch: AMD64
Processor Level: 6
Processor Revision: 15876
Number of Processors: 40
Platform ID: Linux
CSD Version: 'Linux 3.13.0-91-generic'
CPU:
Vendor ID: GenuineIntel
Version Info: 0x00000000
Feature Info: 0x00000000
- Type: LinuxProcStatus
Text: |
Name: linux-x86_64
State: t (tracing stop)
Tgid: 29917
Ngid: 0
Pid: 29917
PPid: 29370
...
