@felipepiovezan commented Dec 3, 2025

This cherry-picks all the patches for handling backwards branches in instruction emulation.

rdar://152694506

@felipepiovezan requested a review from a team as a code owner December 3, 2025 11:14
@felipepiovezan
Author

@swift-ci test

…wards branches (llvm#168398)

If we have a conditional branch, followed by an epilogue, followed by
more code, LLDB will incorrectly compute unwind information through
instruction emulation. Consider this:

```
// ...
<+16>: b.ne   ; <+52> DO_SOMETHING_AND_GOTO_AFTER_EPILOGUE

// epilogue start
<+20>: ldp    x29, x30, [sp, #0x20]
<+24>: add    sp, sp, #0x30
<+28>: ret
// epilogue end

AFTER_EPILOGUE:
<+32>: do something
// ...
<+48>: ret

DO_SOMETHING_AND_GOTO_AFTER_EPILOGUE:
<+52>: stp    x22, x23, [sp, #0x10]
<+56>: mov    x22, #0x1
<+64>: b      ; <+32> AFTER_EPILOGUE
```

LLDB will assume that the unwind state at +32 is the same as at +28. This is
wrong, because +32 _never_ executes after +28: the `ret` at +28 leaves the
function.

The root cause of the problem is the order in which instructions are
visited; they are visited in the order they appear in the text, with
unwind state always being forwarded to positive branch offsets, but
never to negative offsets.

In the example above, `AFTER_EPILOGUE` should inherit the state of the
branch at +64, but it doesn't because `AFTER_EPILOGUE` is visited right
after the `ret` at +28.

Fixing this should be simple: maintain a stack of instructions to visit.
While the stack is not empty, pop the next instruction off the stack and
visit it.
* After visiting a non-branching instruction, push the next instruction
and forward unwind state to it.
* After visiting a branch with one or more known targets, push the known
branch targets and forward state to them.
* In all other cases (ret, or branch to register), neither push nor
forward anything.

Never push an instruction that is already on the stack. Like today's
algorithm, the new algorithm assumes that, if two instructions branch
to the same target, the unwind state at both must be the same.
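
The traversal described above can be sketched as a small worklist loop. This is a minimal illustration under stated assumptions, not LLDB's implementation: `Kind`, `Inst`, `ExampleFunction`, and `ComputeStates` are hypothetical names, the unwind state is reduced to a single stack offset, instructions are addressed by index rather than byte offset, and a conditional branch is assumed to have both its target and its fall-through as known successors. "Never push an instruction already on the stack" is approximated by never pushing an instruction that has already received a state.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical toy model; none of these names come from LLDB.
enum class Kind { Plain, CondBranch, Branch, Return };

struct Inst {
  Kind kind;
  std::optional<std::size_t> target; // index of the known branch target, if any
  int sp_delta;                      // effect on the toy unwind state
};

// Mirrors the listing above. The prologue at index 0 is an assumption,
// since the original listing elides it.
std::vector<Inst> ExampleFunction() {
  return {
      {Kind::Plain, std::nullopt, 48},         // 0: (assumed) sub sp, sp, #0x30
      {Kind::CondBranch, std::size_t{7}, 0},   // 1: b.ne DO_SOMETHING_...
      {Kind::Plain, std::nullopt, 0},          // 2: ldp x29, x30, ...
      {Kind::Plain, std::nullopt, -48},        // 3: add sp, sp, #0x30
      {Kind::Return, std::nullopt, 0},         // 4: ret
      {Kind::Plain, std::nullopt, 0},          // 5: AFTER_EPILOGUE
      {Kind::Return, std::nullopt, 0},         // 6: ret
      {Kind::Plain, std::nullopt, 0},          // 7: stp x22, x23, ...
      {Kind::Branch, std::size_t{5}, 0},       // 8: b AFTER_EPILOGUE
  };
}

// Returns, for each instruction, the state in effect just before it executes
// (or nullopt if the instruction was never reached).
std::vector<std::optional<int>> ComputeStates(const std::vector<Inst> &insts) {
  std::vector<std::optional<int>> state_before(insts.size());
  std::vector<std::size_t> stack;

  // Push an instruction and attach the forwarded state, at most once.
  auto push = [&](std::size_t idx, int state) {
    if (idx >= insts.size() || state_before[idx].has_value())
      return;
    state_before[idx] = state;
    stack.push_back(idx);
  };

  push(0, 0);
  while (!stack.empty()) {
    const std::size_t i = stack.back();
    stack.pop_back();
    assert(state_before[i].has_value());
    const Inst &inst = insts[i];
    const int after = *state_before[i] + inst.sp_delta;

    if (inst.kind == Kind::Return)
      continue; // nothing to push, nothing to forward
    if (inst.target)
      push(*inst.target, after); // known branch target, forward or backward
    if (inst.kind == Kind::Plain || inst.kind == Kind::CondBranch)
      push(i + 1, after); // fall-through successor
    // A Branch without a known target (branch to register) pushes nothing.
  }
  return state_before;
}
```

In this model, the instruction at `AFTER_EPILOGUE` (index 5) receives the state forwarded by the backwards branch at index 8, not the post-epilogue state seen by the `ret` at index 4.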

(Note: yes, branch to register is also handled incorrectly today, and
will still be incorrect).

(cherry picked from commit cd13d9f)
…sit (llvm#169630)

Currently, UnwindAssemblyInstEmulation visits instructions in the order
in which they appear in a function. This commit makes an NFCI change to
UnwindAssemblyInstEmulation so that it follows the function's CFG:

1. The first instruction is enqueued.
2. While the queue is not empty:
2.1 Visit the instruction at the *back* of the queue to compute the new
    unwind state.
2.2 Push(+) the next instruction to the *back* of the queue.
2.3 If the instruction is a forward branch with a known branch target,
    push(+) the destination instruction to the *front* of the queue.

(+) Only push if this instruction hasn't been enqueued before.
(+) When pushing an instruction, the current unwind state is attached to
it.

Note that:
* the "next instruction" is pushed to the *back* of the queue,
* a branch target is pushed to the *front* of the queue, and
* we always dequeue from the *back* of the queue.
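
Steps 1 through 2.3 could look roughly like the following sketch. It is only an illustration under stated assumptions: `ToyInst`, `Visit`, and `VisitInQueueOrder` are hypothetical names that do not appear in LLDB, the unwind state is modeled as a single integer, and branch targets are instruction indices rather than byte offsets.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical toy instruction; not an LLDB type.
struct ToyInst {
  int state_delta;                          // effect on the toy unwind state
  std::optional<std::size_t> branch_target; // known branch target, if any
};

struct Visit {
  std::size_t index; // which instruction was visited
  int state_before;  // state attached to it when it was enqueued
};

std::vector<Visit> VisitInQueueOrder(const std::vector<ToyInst> &insts) {
  std::vector<Visit> visits;
  if (insts.empty())
    return visits;

  std::vector<bool> enqueued(insts.size(), false);
  std::deque<std::pair<std::size_t, int>> queue;

  // (+) Only enqueue once, and attach the current state when enqueuing.
  auto try_enqueue = [&](std::size_t idx, int state, bool at_front) {
    if (idx >= insts.size() || enqueued[idx])
      return;
    enqueued[idx] = true;
    if (at_front)
      queue.emplace_front(idx, state);
    else
      queue.emplace_back(idx, state);
  };

  try_enqueue(0, 0, /*at_front=*/false); // step 1
  while (!queue.empty()) {               // step 2
    const auto [i, before] = queue.back(); // step 2.1: dequeue from the back
    queue.pop_back();
    visits.push_back({i, before});
    const int after = before + insts[i].state_delta;

    try_enqueue(i + 1, after, /*at_front=*/false); // step 2.2
    const auto target = insts[i].branch_target;
    if (target && *target > i)                     // step 2.3: forward only
      try_enqueue(*target, after, /*at_front=*/true);
  }
  return visits;
}
```

For a four-instruction function whose second instruction is a forward branch to the fourth, this sketch still visits the instructions in textual order, but the state attached to the fourth instruction is the one forwarded by the branch, because the branch enqueued it first.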

This means that consecutive instructions are visited one after the
other; this is important to support "conditional blocks" [1] of
instructions (see the line with "if last_condition != new_condition").
This is arguably a very Thumb-specific thing, so maybe it shouldn't be
in the generic algorithm; that said, it is already in the code, so we
have to support it.

The main reason this patch is NFCI and not NFC is that the destination
of a forward branch is now visited at a slightly different moment than
before. This should not cause any changes in output: if a branch
destination is reachable through two different paths, any well-behaved
compiler will generate the same unwind state on both paths.

The motivation for this patch is to change step 2.2 so that it _only_
pushes the next instruction if the current instruction is not an
unconditional branch / return, and to change step 2.3 so that backwards
branches are also allowed, fixing the bug described by [2].

[1]:
https://developer.arm.com/documentation/dui0473/m/arm-and-thumb-instructions/it
[2]: llvm#168398

Part of a sequence of PRs:
* [lldb][NFCI] Rewrite UnwindAssemblyInstEmulation in terms of a CFG visit llvm#169630
* [lldb][NFC] Rename forward_branch_offset to branch_offset in UnwindAssemblyInstEmulation llvm#169631
* [lldb] Add DisassemblerLLVMC::IsBarrier API llvm#169632
* [lldb] Handle backwards branches in UnwindAssemblyInstEmulation llvm#169633

commit-id:dce6b515
(cherry picked from commit 5a32fd3)
…semblyInstEmulation (llvm#169631)

This will reduce the diff in subsequent patches.

commit-id:5e758a22

(cherry picked from commit 6638d59)
[lldb] Add DisassemblerLLVMC::IsBarrier API (llvm#169632)

This will allow the instruction emulation unwinder to reason about
instructions that prevent the subsequent instruction from executing.

commit-id:bb5df4aa
(cherry picked from commit 2b725ab)
…#169633)

This allows the unwinder to handle code with mid-function epilogues
where the subsequent code is reachable through a backwards branch.

Two changes are required to accomplish this:

1. Do not enqueue the subsequent instruction if the current instruction
   is a barrier(*).
2. When processing an instruction, stop ignoring branches with negative
   offsets.

(*) As per the definition in LLVM's MC layer, a barrier is any
instruction that "stops control flow from executing the instruction
immediately following it". See `MCInstrDesc::isBarrier` in `MCInstrDesc.h`.
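
The two changes can be illustrated by a function that computes which instructions get enqueued after visiting one. This is a hedged sketch, not the patch itself: `FlowInst` and `Successors` are hypothetical names, `is_barrier` is a plain flag standing in for whatever the disassembler reports, and `branch_offset` is measured in whole instructions rather than bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical toy instruction; not an LLDB or LLVM type.
struct FlowInst {
  bool is_barrier; // true if control never falls through to the next instruction
  std::optional<std::int64_t> branch_offset; // signed offset to a known target
};

// Returns the indices to enqueue after visiting instruction `i`.
std::vector<std::size_t> Successors(const std::vector<FlowInst> &insts,
                                    std::size_t i) {
  assert(i < insts.size());
  std::vector<std::size_t> result;
  const FlowInst &inst = insts[i];

  // Change 1: a barrier does not enqueue the subsequent instruction.
  if (!inst.is_barrier && i + 1 < insts.size())
    result.push_back(i + 1);

  // Change 2: negative offsets are no longer ignored.
  if (inst.branch_offset) {
    const std::int64_t target =
        static_cast<std::int64_t>(i) + *inst.branch_offset;
    if (target >= 0 && target < static_cast<std::int64_t>(insts.size()))
      result.push_back(static_cast<std::size_t>(target));
  }
  return result;
}
```

Under these assumptions, a conditional branch yields both its fall-through and its target, a `ret` yields nothing, and an unconditional backwards branch yields only its target.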

commit-id:fd266c13
(cherry picked from commit 4e4763a)
@felipepiovezan force-pushed the felipe/cherrypick_unwind_backwards_fixes branch from 95ea5b5 to d1a5a22 December 3, 2025 15:58
@felipepiovezan
Author

@swift-ci test

@felipepiovezan
Author

@swift-ci test macos platform

@adrian-prantl merged commit 1f3607d into swiftlang:stable/21.x Dec 4, 2025
3 checks passed
@felipepiovezan deleted the felipe/cherrypick_unwind_backwards_fixes branch December 4, 2025 16:58
adrian-prantl added a commit that referenced this pull request Dec 5, 2025
…d_backwards_fixes

🍒 [lldb] Handle backwards branches in UnwindAssemblyInstEmulation (PR #11920)