Skip to content

[lldb] Add DWARFExpressionEntry and GetExpressionEntryAtAddress() to … #144238

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

UltimateForce21
Copy link

This patch introduces a new struct and helper API in DWARFExpressionList to expose
variable location metadata (base address, end address, and DWARFExpression pointer) for
a given PC address. It will be used in later patches to annotate disassembly instructions
with source-level variable locations.

New struct

/// Represents one entry in a DWARFExpressionList, with its range and expr.
struct DWARFExpressionEntry {
  lldb::addr_t base;           // file‐address start of this location range
  lldb::addr_t end;            // file‐address end of this range (exclusive)
  const DWARFExpression *expr; // the DWARF expression for this range
};

New API

/// Retrieve the DWARFExpressionEntry covering a particular instruction.
///
/// \param func_load_addr
///     The load address of the start of the function containing this location list;
///     used to translate between file offsets and load addresses. If this is
///     LLDB_INVALID_ADDRESS, the stored CU base (m_func_file_addr) is used.
///
/// \param load_addr
///     The load address of the *current* PC (i.e., the instruction for which
///     we want its variable‐location entry). We first convert this back into
///     the function’s file‐address space to find the correct DWARF range.
///
/// \returns
///     On success, an entry whose `[base,end)` covers this PC; else an Error.
llvm::Expected<DWARFExpressionEntry>
GetExpressionEntryAtAddress(lldb::addr_t func_load_addr,
                            lldb::addr_t load_addr) const;

Rationale

LLDB already provides:

const DWARFExpression *
GetExpressionAtAddress(lldb::addr_t func_load_addr,
                       lldb::addr_t load_addr) const;

However, this only returns the DWARF expression itself, without the file‐address start (base) and end (end) of the location range. Those bounds are crucial for:

  1. Detecting range beginnings: render a var = annotation exactly when a variable’s live‐range starts.

  2. Detecting range continuation: optionally display a “|” on subsequent instructions in the same range.

  3. Detecting state changes: know when a variable moves (e.g. from one register to another), becomes a constant, or goes out of scope.

These primitives form the foundation for the Rich Disassembler feature proposed for GSoC 25.

…DWARFExpressionList

This introduces a new API for retrieving DWARF expression metadata associated with variable location entries at a given PC address. It provides the base, end, and expression pointer for downstream consumers such as disassembler annotations.

Intended for use in richer instruction annotations in Instruction::Dump().
Copy link

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot llvmbot added the lldb label Jun 14, 2025
@llvmbot
Copy link
Member

llvmbot commented Jun 14, 2025

@llvm/pr-subscribers-lldb

Author: Abdullah Mohammad Amin (UltimateForce21)

Changes

This patch introduces a new struct and helper API in DWARFExpressionList to expose
variable location metadata (base address, end address, and DWARFExpression pointer) for
a given PC address. It will be used in later patches to annotate disassembly instructions
with source-level variable locations.

New struct

/// Represents one entry in a DWARFExpressionList, with its range and expr.
struct DWARFExpressionEntry {
  lldb::addr_t base;           // file‐address start of this location range
  lldb::addr_t end;            // file‐address end of this range (exclusive)
  const DWARFExpression *expr; // the DWARF expression for this range
};

New API

/// Retrieve the DWARFExpressionEntry covering a particular instruction.
///
/// \param func_load_addr
///     The load address of the start of the function containing this location list;
///     used to translate between file offsets and load addresses. If this is
///     LLDB_INVALID_ADDRESS, the stored CU base (m_func_file_addr) is used.
///
/// \param load_addr
///     The load address of the *current* PC (i.e., the instruction for which
///     we want its variable‐location entry). We first convert this back into
///     the function’s file‐address space to find the correct DWARF range.
///
/// \returns
///     On success, an entry whose `[base,end)` covers this PC; else an Error.
llvm::Expected&lt;DWARFExpressionEntry&gt;
GetExpressionEntryAtAddress(lldb::addr_t func_load_addr,
                            lldb::addr_t load_addr) const;

Rationale

LLDB already provides:

const DWARFExpression *
GetExpressionAtAddress(lldb::addr_t func_load_addr,
                       lldb::addr_t load_addr) const;

However, this only returns the DWARF expression itself, without the file‐address start (base) and end (end) of the location range. Those bounds are crucial for:

  1. Detecting range beginnings: render a var = <location> annotation exactly when a variable’s live‐range starts.

  2. Detecting range continuation: optionally display a “|” on subsequent instructions in the same range.

  3. Detecting state changes: know when a variable moves (e.g. from one register to another), becomes a constant, or goes out of scope.

These primitives form the foundation for the Rich Disassembler feature proposed for GSoC 25.


Full diff: https://github.com/llvm/llvm-project/pull/144238.diff

2 Files Affected:

  • (modified) lldb/include/lldb/Expression/DWARFExpressionList.h (+12)
  • (modified) lldb/source/Expression/DWARFExpressionList.cpp (+21)
diff --git a/lldb/include/lldb/Expression/DWARFExpressionList.h b/lldb/include/lldb/Expression/DWARFExpressionList.h
index d8f8ec247ed56..a329b37393018 100644
--- a/lldb/include/lldb/Expression/DWARFExpressionList.h
+++ b/lldb/include/lldb/Expression/DWARFExpressionList.h
@@ -59,6 +59,18 @@ class DWARFExpressionList {
 
   lldb::addr_t GetFuncFileAddress() { return m_func_file_addr; }
 
+  /// Represents an entry in the DWARFExpressionList with all needed metadata
+  struct DWARFExpressionEntry {
+    lldb::addr_t base;
+    lldb::addr_t end;
+    const DWARFExpression *expr;
+  };
+
+  /// Returns the entry (base, end, data) for a given PC address
+  llvm::Expected<DWARFExpressionEntry>
+  GetExpressionEntryAtAddress(lldb::addr_t func_load_addr,
+                              lldb::addr_t load_addr) const;
+
   const DWARFExpression *GetExpressionAtAddress(lldb::addr_t func_load_addr,
                                                 lldb::addr_t load_addr) const;
 
diff --git a/lldb/source/Expression/DWARFExpressionList.cpp b/lldb/source/Expression/DWARFExpressionList.cpp
index 04592a1eb7ff4..b55bc7120c4af 100644
--- a/lldb/source/Expression/DWARFExpressionList.cpp
+++ b/lldb/source/Expression/DWARFExpressionList.cpp
@@ -53,6 +53,27 @@ bool DWARFExpressionList::ContainsAddress(lldb::addr_t func_load_addr,
   return GetExpressionAtAddress(func_load_addr, addr) != nullptr;
 }
 
+llvm::Expected<DWARFExpressionList::DWARFExpressionEntry>
+DWARFExpressionList::GetExpressionEntryAtAddress(lldb::addr_t func_load_addr,
+                                                 lldb::addr_t load_addr) const {
+  if (const DWARFExpression *expr = GetAlwaysValidExpr()) {
+    return DWARFExpressionEntry{0, LLDB_INVALID_ADDRESS, expr};
+  }
+
+  if (func_load_addr == LLDB_INVALID_ADDRESS)
+    func_load_addr = m_func_file_addr;
+
+  addr_t addr = load_addr - func_load_addr + m_func_file_addr;
+  uint32_t index = m_exprs.FindEntryIndexThatContains(addr);
+  if (index == UINT32_MAX) {
+    return llvm::createStringError(llvm::inconvertibleErrorCode(),
+                                   "No DWARF expression found for address 0x%llx", addr);
+  }
+
+  const Entry &entry = *m_exprs.GetEntryAtIndex(index);
+  return DWARFExpressionEntry{entry.base, entry.GetRangeEnd(), &entry.data};
+}
+
 const DWARFExpression *
 DWARFExpressionList::GetExpressionAtAddress(lldb::addr_t func_load_addr,
                                             lldb::addr_t load_addr) const {

@UltimateForce21
Copy link
Author

@adrian-prantl @JDevlieghere I’ve opened my first PR for the new DWARFExpression API. Looking forward to your feedback!

@@ -59,6 +59,18 @@ class DWARFExpressionList {

lldb::addr_t GetFuncFileAddress() { return m_func_file_addr; }

/// Represents an entry in the DWARFExpressionList with all needed metadata
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: missing period (see LLVM Coding Style)

Suggested change
/// Represents an entry in the DWARFExpressionList with all needed metadata
/// Represents an entry in the DWARFExpressionList with all needed metadata.

const DWARFExpression *expr;
};

/// Returns the entry (base, end, data) for a given PC address
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Returns the entry (base, end, data) for a given PC address
/// Returns the DWARFExpressionEntry for a given PC address.

Why not reference the struct here so you don't have to keep the comment in sync if we ever add new fields?

Comment on lines +59 to +61
if (const DWARFExpression *expr = GetAlwaysValidExpr()) {
return DWARFExpressionEntry{0, LLDB_INVALID_ADDRESS, expr};
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: no braces around single-line if bodies (Coding Style)

Suggested change
if (const DWARFExpression *expr = GetAlwaysValidExpr()) {
return DWARFExpressionEntry{0, LLDB_INVALID_ADDRESS, expr};
}
if (const DWARFExpression *expr = GetAlwaysValidExpr())
return DWARFExpressionEntry{0, LLDB_INVALID_ADDRESS, expr};

Comment on lines +64 to +65
lldb::addr_t base;
lldb::addr_t end;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these file or load addresses? If you use lldb::addr_t (as opposed to the Address class), it's always a good idea to make that part of the variable name (like you did for func_load_addr) or at least add a Doxygen-style comment.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that if you do want to use Address, there is already an AddressRange class.

uint32_t index = m_exprs.FindEntryIndexThatContains(addr);
if (index == UINT32_MAX) {
return llvm::createStringError(llvm::inconvertibleErrorCode(),
"No DWARF expression found for address 0x%llx", addr);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally using Expected is desirable, especially if there is actionable information in the error message (and information that only this function can provide). Lookups are borderline cases. If it's expected during normal operation that client will call this API and the lookup fails, then I would return a std::optional instead. The client can turn this into a more targeted error message if needed.
In short — I would return a std::optional here.

const Entry &entry = *m_exprs.GetEntryAtIndex(index);
return DWARFExpressionEntry{entry.base, entry.GetRangeEnd(), &entry.data};
}

const DWARFExpression *
DWARFExpressionList::GetExpressionAtAddress(lldb::addr_t func_load_addr,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity: Do we still need this API, or is it just a subset of the function above?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants