Skip to content

Conversation

@felipepiovezan
Copy link
Contributor

This is a series of patches aimed at simplifying the main loop in UnwindAssemblyInstruction.
In a future PR, I would like to fix a bug whereby a mid function epilogue causes the unwinder to produce incorrect results, and these changes will make the fix easier to follow.

While this does pollute the git blame a bit, this code hasn't changed in close to ten years, and even the latest changes were already of a reformatting nature. As such, I believe this is acceptable.

Please look at each commit individually.

@llvmbot
Copy link
Member

llvmbot commented Nov 13, 2025

@llvm/pr-subscribers-lldb

Author: Felipe de Azevedo Piovezan (felipepiovezan)

Changes

This is a series of patches aimed at simplifying the main loop in UnwindAssemblyInstruction.
In a future PR, I would like to fix a bug whereby a mid function epilogue causes the unwinder to produce incorrect results, and these changes will make the fix easier to follow.

While this does pollute the git blame a bit, this code hasn't changed in close to ten years, and even the latest changes were already of a reformatting nature. As such, I believe this is acceptable.

Please look at each commit individually.


Full diff: https://github.com/llvm/llvm-project/pull/167914.diff

3 Files Affected:

  • (modified) lldb/include/lldb/Core/Disassembler.h (+4)
  • (modified) lldb/source/Plugins/UnwindAssembly/InstEmulation/UnwindAssemblyInstEmulation.cpp (+126-124)
  • (modified) lldb/source/Plugins/UnwindAssembly/InstEmulation/UnwindAssemblyInstEmulation.h (+6-2)
diff --git a/lldb/include/lldb/Core/Disassembler.h b/lldb/include/lldb/Core/Disassembler.h
index db186dd33d774..daa7b3d25f11b 100644
--- a/lldb/include/lldb/Core/Disassembler.h
+++ b/lldb/include/lldb/Core/Disassembler.h
@@ -297,6 +297,10 @@ class InstructionList {
 
   lldb::InstructionSP GetInstructionAtIndex(size_t idx) const;
 
+  llvm::ArrayRef<lldb::InstructionSP> Instructions() const {
+    return m_instructions;
+  }
+
   /// Get the instruction at the given address.
   ///
   /// \return
diff --git a/lldb/source/Plugins/UnwindAssembly/InstEmulation/UnwindAssemblyInstEmulation.cpp b/lldb/source/Plugins/UnwindAssembly/InstEmulation/UnwindAssemblyInstEmulation.cpp
index 397c693fc498a..68a18bc39249a 100644
--- a/lldb/source/Plugins/UnwindAssembly/InstEmulation/UnwindAssemblyInstEmulation.cpp
+++ b/lldb/source/Plugins/UnwindAssembly/InstEmulation/UnwindAssemblyInstEmulation.cpp
@@ -50,6 +50,33 @@ bool UnwindAssemblyInstEmulation::GetNonCallSiteUnwindPlanFromAssembly(
       range, function_text.data(), function_text.size(), unwind_plan);
 }
 
+static void DumpUnwindRowsToLog(Log *log, AddressRange range,
+                                    const UnwindPlan &unwind_plan) {
+  if (!log || !log->GetVerbose())
+    return;
+  StreamString strm;
+  lldb::addr_t base_addr = range.GetBaseAddress().GetFileAddress();
+  strm.Printf("Resulting unwind rows for [0x%" PRIx64 " - 0x%" PRIx64 "):",
+              base_addr, base_addr + range.GetByteSize());
+  unwind_plan.Dump(strm, nullptr, base_addr);
+  log->PutString(strm.GetString());
+}
+
+static void DumpInstToLog(Log *log, Instruction &inst,
+                          InstructionList inst_list) {
+  if (!log || !log->GetVerbose())
+    return;
+  const bool show_address = true;
+  const bool show_bytes = true;
+  const bool show_control_flow_kind = false;
+  StreamString strm;
+  lldb_private::FormatEntity::Entry format;
+  FormatEntity::Parse("${frame.pc}: ", format);
+  inst.Dump(&strm, inst_list.GetMaxOpcocdeByteSize(), show_address, show_bytes,
+            show_control_flow_kind, nullptr, nullptr, nullptr, &format, 0);
+  log->PutString(strm.GetString());
+}
+
 bool UnwindAssemblyInstEmulation::GetNonCallSiteUnwindPlanFromAssembly(
     AddressRange &range, uint8_t *opcode_data, size_t opcode_size,
     UnwindPlan &unwind_plan) {
@@ -82,11 +109,6 @@ bool UnwindAssemblyInstEmulation::GetNonCallSiteUnwindPlanFromAssembly(
   m_range_ptr = &range;
   m_unwind_plan_ptr = &unwind_plan;
 
-  const uint32_t addr_byte_size = m_arch.GetAddressByteSize();
-  const bool show_address = true;
-  const bool show_bytes = true;
-  const bool show_control_flow_kind = false;
-
   m_state.cfa_reg_info = *m_inst_emulator_up->GetRegisterInfo(
       unwind_plan.GetRegisterKind(), unwind_plan.GetInitialCFARegister());
   m_state.fp_is_cfa = false;
@@ -94,134 +116,114 @@ bool UnwindAssemblyInstEmulation::GetNonCallSiteUnwindPlanFromAssembly(
 
   m_pushed_regs.clear();
 
-  // Initialize the CFA with a known value. In the 32 bit case it will be
-  // 0x80000000, and in the 64 bit case 0x8000000000000000. We use the address
-  // byte size to be safe for any future address sizes
-  m_initial_sp = (1ull << ((addr_byte_size * 8) - 1));
   RegisterValue cfa_reg_value;
-  cfa_reg_value.SetUInt(m_initial_sp, m_state.cfa_reg_info.byte_size);
+  cfa_reg_value.SetUInt(m_initial_cfa, m_state.cfa_reg_info.byte_size);
   SetRegisterValue(m_state.cfa_reg_info, cfa_reg_value);
 
-  const InstructionList &inst_list = disasm_sp->GetInstructionList();
-  const size_t num_instructions = inst_list.GetSize();
-
-  if (num_instructions > 0) {
-    Instruction *inst = inst_list.GetInstructionAtIndex(0).get();
-    const lldb::addr_t base_addr = inst->GetAddress().GetFileAddress();
-
-    // Map for storing the unwind state at a given offset. When we see a forward
-    // branch we add a new entry to this map with the actual unwind plan row and
-    // register context for the target address of the branch as the current data
-    // have to be valid for the target address of the branch too if we are in
-    // the same function.
-    std::map<lldb::addr_t, UnwindState> saved_unwind_states;
-
-    // Make a copy of the current instruction Row and save it in m_state so
-    // we can add updates as we process the instructions.
-    m_state.row = *unwind_plan.GetLastRow();
-
-    // Add the initial state to the save list with offset 0.
-    auto condition_block_start_state =
-        saved_unwind_states.emplace(0, m_state).first;
-
-    // The architecture dependent condition code of the last processed
-    // instruction.
-    EmulateInstruction::InstructionCondition last_condition =
-        EmulateInstruction::UnconditionalCondition;
-
-    for (size_t idx = 0; idx < num_instructions; ++idx) {
-      m_curr_row_modified = false;
-      m_forward_branch_offset = 0;
-
-      inst = inst_list.GetInstructionAtIndex(idx).get();
-      if (!inst)
-        continue;
-
-      lldb::addr_t current_offset =
-          inst->GetAddress().GetFileAddress() - base_addr;
-      auto it = saved_unwind_states.upper_bound(current_offset);
-      assert(it != saved_unwind_states.begin() &&
-             "Unwind row for the function entry missing");
-      --it; // Move it to the row corresponding to the current offset
-
-      // If the offset of m_curr_row don't match with the offset we see in
-      // saved_unwind_states then we have to update current unwind state to
-      // the saved values. It is happening after we processed an epilogue and a
-      // return to caller instruction.
-      if (it->second.row.GetOffset() != m_state.row.GetOffset())
-        m_state = it->second;
-
-      m_inst_emulator_up->SetInstruction(inst->GetOpcode(), inst->GetAddress(),
-                                         nullptr);
-
-      if (last_condition != m_inst_emulator_up->GetInstructionCondition()) {
-        // If the last instruction was conditional with a different condition
-        // than the current condition then restore the state.
-        if (last_condition != EmulateInstruction::UnconditionalCondition) {
-          m_state = condition_block_start_state->second;
-          m_state.row.SetOffset(current_offset);
-          // The last instruction might already created a row for this offset
-          // and we want to overwrite it.
-          saved_unwind_states.insert_or_assign(current_offset, m_state);
-        }
+  InstructionList inst_list = disasm_sp->GetInstructionList();
 
-        // We are starting a new conditional block at the actual offset
-        condition_block_start_state = it;
-      }
+  if (inst_list.GetSize()) {
+    DumpUnwindRowsToLog(log, range, unwind_plan);
+    return unwind_plan.GetRowCount() > 0;
+  }
 
-      if (log && log->GetVerbose()) {
-        StreamString strm;
-        lldb_private::FormatEntity::Entry format;
-        FormatEntity::Parse("${frame.pc}: ", format);
-        inst->Dump(&strm, inst_list.GetMaxOpcocdeByteSize(), show_address,
-                   show_bytes, show_control_flow_kind, nullptr, nullptr,
-                   nullptr, &format, 0);
-        log->PutString(strm.GetString());
-      }
+  Instruction &first_inst = *inst_list.GetInstructionAtIndex(0);
+  const lldb::addr_t base_addr = first_inst.GetAddress().GetFileAddress();
 
-      last_condition = m_inst_emulator_up->GetInstructionCondition();
+  // Map for storing the unwind state at a given offset. When we see a forward
+  // branch we add a new entry to this map with the actual unwind plan row and
+  // register context for the target address of the branch as the current data
+  // have to be valid for the target address of the branch too if we are in
+  // the same function.
+  std::map<lldb::addr_t, UnwindState> saved_unwind_states;
 
-      m_inst_emulator_up->EvaluateInstruction(
-          eEmulateInstructionOptionIgnoreConditions);
+  // Make a copy of the current instruction Row and save it in m_state so
+  // we can add updates as we process the instructions.
+  m_state.row = *unwind_plan.GetLastRow();
 
-      // If the current instruction is a branch forward then save the current
-      // CFI information for the offset where we are branching.
-      if (m_forward_branch_offset != 0 &&
-          range.ContainsFileAddress(inst->GetAddress().GetFileAddress() +
-                                    m_forward_branch_offset)) {
-        if (auto [it, inserted] = saved_unwind_states.emplace(
-                current_offset + m_forward_branch_offset, m_state);
-            inserted)
-          it->second.row.SetOffset(current_offset + m_forward_branch_offset);
-      }
+  // Add the initial state to the save list with offset 0.
+  auto condition_block_start_state =
+      saved_unwind_states.emplace(0, m_state).first;
 
-      // Were there any changes to the CFI while evaluating this instruction?
-      if (m_curr_row_modified) {
-        // Save the modified row if we don't already have a CFI row in the
-        // current address
-        if (saved_unwind_states.count(current_offset +
-                                      inst->GetOpcode().GetByteSize()) == 0) {
-          m_state.row.SetOffset(current_offset +
-                                inst->GetOpcode().GetByteSize());
-          saved_unwind_states.emplace(
-              current_offset + inst->GetOpcode().GetByteSize(), m_state);
-        }
+  // The architecture dependent condition code of the last processed
+  // instruction.
+  EmulateInstruction::InstructionCondition last_condition =
+      EmulateInstruction::UnconditionalCondition;
+
+  for (InstructionSP inst : inst_list.Instructions()) {
+    if (!inst)
+      continue;
+    DumpInstToLog(log, *inst, inst_list);
+
+    m_curr_row_modified = false;
+    m_forward_branch_offset = 0;
+
+    lldb::addr_t current_offset =
+        inst->GetAddress().GetFileAddress() - base_addr;
+    auto it = saved_unwind_states.upper_bound(current_offset);
+    assert(it != saved_unwind_states.begin() &&
+           "Unwind row for the function entry missing");
+    --it; // Move it to the row corresponding to the current offset
+
+    // If the offset of m_state.row doesn't match with the offset we see in
+    // saved_unwind_states then we have to update current unwind state to
+    // the saved values. It is happening after we processed an epilogue and a
+    // return to caller instruction.
+    if (it->second.row.GetOffset() != m_state.row.GetOffset())
+      m_state = it->second;
+
+    m_inst_emulator_up->SetInstruction(inst->GetOpcode(), inst->GetAddress(),
+                                       nullptr);
+
+    if (last_condition != m_inst_emulator_up->GetInstructionCondition()) {
+      // If the last instruction was conditional with a different condition
+      // than the current condition then restore the state.
+      if (last_condition != EmulateInstruction::UnconditionalCondition) {
+        m_state = condition_block_start_state->second;
+        m_state.row.SetOffset(current_offset);
+        // The last instruction might already created a row for this offset
+        // and we want to overwrite it.
+        saved_unwind_states.insert_or_assign(current_offset, m_state);
       }
+
+      // We are starting a new conditional block at the actual offset
+      condition_block_start_state = it;
     }
-    for (auto &[_, state] : saved_unwind_states) {
-      unwind_plan.InsertRow(std::move(state.row),
-                            /*replace_existing=*/true);
+
+    last_condition = m_inst_emulator_up->GetInstructionCondition();
+
+    m_inst_emulator_up->EvaluateInstruction(
+        eEmulateInstructionOptionIgnoreConditions);
+
+    // If the current instruction is a branch forward then save the current
+    // CFI information for the offset where we are branching.
+    if (m_forward_branch_offset != 0 &&
+        range.ContainsFileAddress(inst->GetAddress().GetFileAddress() +
+                                  m_forward_branch_offset)) {
+      if (auto [it, inserted] = saved_unwind_states.emplace(
+              current_offset + m_forward_branch_offset, m_state);
+          inserted)
+        it->second.row.SetOffset(current_offset + m_forward_branch_offset);
     }
-  }
 
-  if (log && log->GetVerbose()) {
-    StreamString strm;
-    lldb::addr_t base_addr = range.GetBaseAddress().GetFileAddress();
-    strm.Printf("Resulting unwind rows for [0x%" PRIx64 " - 0x%" PRIx64 "):",
-                base_addr, base_addr + range.GetByteSize());
-    unwind_plan.Dump(strm, nullptr, base_addr);
-    log->PutString(strm.GetString());
+    // Were there any changes to the CFI while evaluating this instruction?
+    if (m_curr_row_modified) {
+      // Save the modified row if we don't already have a CFI row in the
+      // current address
+      if (saved_unwind_states.count(current_offset +
+                                    inst->GetOpcode().GetByteSize()) == 0) {
+        m_state.row.SetOffset(current_offset + inst->GetOpcode().GetByteSize());
+        saved_unwind_states.emplace(
+            current_offset + inst->GetOpcode().GetByteSize(), m_state);
+      }
+    }
   }
+
+  for (auto &[_, state] : saved_unwind_states)
+    unwind_plan.InsertRow(std::move(state.row),
+                          /*replace_existing=*/true);
+
+  DumpUnwindRowsToLog(log, range, unwind_plan);
   return unwind_plan.GetRowCount() > 0;
 }
 
@@ -382,7 +384,7 @@ size_t UnwindAssemblyInstEmulation::WriteMemory(
     if (reg_num != LLDB_INVALID_REGNUM &&
         generic_regnum != LLDB_REGNUM_GENERIC_SP) {
       if (m_pushed_regs.try_emplace(reg_num, addr).second) {
-        const int32_t offset = addr - m_initial_sp;
+        const int32_t offset = addr - m_initial_cfa;
         m_state.row.SetRegisterLocationToAtCFAPlusOffset(reg_num, offset,
                                                          /*can_replace=*/true);
         m_curr_row_modified = true;
@@ -549,7 +551,7 @@ bool UnwindAssemblyInstEmulation::WriteRegister(
                   sp_reg_info.kinds[m_unwind_plan_ptr->GetRegisterKind()];
               assert(cfa_reg_num != LLDB_INVALID_REGNUM);
               m_state.row.GetCFAValue().SetIsRegisterPlusOffset(
-                  cfa_reg_num, m_initial_sp - sp_reg_val.GetAsUInt64());
+                  cfa_reg_num, m_initial_cfa - sp_reg_val.GetAsUInt64());
             }
           }
         }
@@ -580,7 +582,7 @@ bool UnwindAssemblyInstEmulation::WriteRegister(
           reg_info->kinds[m_unwind_plan_ptr->GetRegisterKind()];
       assert(cfa_reg_num != LLDB_INVALID_REGNUM);
       m_state.row.GetCFAValue().SetIsRegisterPlusOffset(
-          cfa_reg_num, m_initial_sp - reg_value.GetAsUInt64());
+          cfa_reg_num, m_initial_cfa - reg_value.GetAsUInt64());
       m_curr_row_modified = true;
     }
     break;
@@ -593,7 +595,7 @@ bool UnwindAssemblyInstEmulation::WriteRegister(
           reg_info->kinds[m_unwind_plan_ptr->GetRegisterKind()];
       assert(cfa_reg_num != LLDB_INVALID_REGNUM);
       m_state.row.GetCFAValue().SetIsRegisterPlusOffset(
-          cfa_reg_num, m_initial_sp - reg_value.GetAsUInt64());
+          cfa_reg_num, m_initial_cfa - reg_value.GetAsUInt64());
       m_curr_row_modified = true;
     }
     break;
@@ -604,7 +606,7 @@ bool UnwindAssemblyInstEmulation::WriteRegister(
     if (!m_state.fp_is_cfa) {
       m_state.row.GetCFAValue().SetIsRegisterPlusOffset(
           m_state.row.GetCFAValue().GetRegisterNumber(),
-          m_initial_sp - reg_value.GetAsUInt64());
+          m_initial_cfa - reg_value.GetAsUInt64());
       m_curr_row_modified = true;
     }
     break;
diff --git a/lldb/source/Plugins/UnwindAssembly/InstEmulation/UnwindAssemblyInstEmulation.h b/lldb/source/Plugins/UnwindAssembly/InstEmulation/UnwindAssemblyInstEmulation.h
index 96a0881eaee9d..6c0492f5dfc66 100644
--- a/lldb/source/Plugins/UnwindAssembly/InstEmulation/UnwindAssemblyInstEmulation.h
+++ b/lldb/source/Plugins/UnwindAssembly/InstEmulation/UnwindAssemblyInstEmulation.h
@@ -63,13 +63,17 @@ class UnwindAssemblyInstEmulation : public lldb_private::UnwindAssembly {
   UnwindAssemblyInstEmulation(const lldb_private::ArchSpec &arch,
                               lldb_private::EmulateInstruction *inst_emulator)
       : UnwindAssembly(arch), m_inst_emulator_up(inst_emulator),
-        m_range_ptr(nullptr), m_unwind_plan_ptr(nullptr), m_initial_sp(0),
+        m_range_ptr(nullptr), m_unwind_plan_ptr(nullptr),
         m_curr_row_modified(false), m_forward_branch_offset(0) {
     if (m_inst_emulator_up) {
       m_inst_emulator_up->SetBaton(this);
       m_inst_emulator_up->SetCallbacks(ReadMemory, WriteMemory, ReadRegister,
                                        WriteRegister);
     }
+    // Initialize the CFA with a known value. In the 32 bit case it will be
+    // 0x80000000, and in the 64 bit case 0x8000000000000000. We use the address
+    // byte size to be safe for any future address sizes
+    m_initial_cfa = (1ull << ((m_arch.GetAddressByteSize() * 8) - 1));
   }
 
   static size_t
@@ -134,8 +138,8 @@ class UnwindAssemblyInstEmulation : public lldb_private::UnwindAssembly {
   lldb_private::AddressRange *m_range_ptr;
   lldb_private::UnwindPlan *m_unwind_plan_ptr;
   UnwindState m_state;
+  uint64_t m_initial_cfa;
   typedef std::map<uint64_t, uint64_t> PushedRegisterToAddrMap;
-  uint64_t m_initial_sp;
   PushedRegisterToAddrMap m_pushed_regs;
 
   // While processing the instruction stream, we need to communicate some state

@github-actions
Copy link

github-actions bot commented Nov 13, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

m_forward_branch_offset = 0;

Instruction *inst = inst_list.GetInstructionAtIndex(idx).get();
for (InstructionSP inst : inst_list.Instructions()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const InstructionSP&?

InstructionList inst_list = disasm_sp->GetInstructionList();

if (num_instructions == 0) {
if (inst_list.GetSize()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be !inst_list.GetSize()?

Copy link
Member

@Michael137 Michael137 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! (just left 2 questions)

@felipepiovezan felipepiovezan force-pushed the felipe/UnwindAssemblyInstEmulation branch from 717467b to e371803 Compare November 14, 2025 10:17
@felipepiovezan
Copy link
Contributor Author

addressed review comments

felipepiovezan added a commit that referenced this pull request Nov 14, 2025
…lation (#167914)

Also rename the "sp" suffix (originally intended to mean "Stack
Pointer") to "cfa", as "sp" generally means Shared Pointer.
@felipepiovezan
Copy link
Contributor Author

I've pushed these separately to preserve the separation of commits, and added links on each commit to this review.

@felipepiovezan felipepiovezan deleted the felipe/UnwindAssemblyInstEmulation branch November 14, 2025 16:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants