Skip to content

Fix data race on RspConnector lifetime in Gdb/Corellium/Esreven adapters#1075

Open
xusheng6 wants to merge 1 commit into
devfrom
fix-rspconnector-lifetime-race
Open

Fix data race on RspConnector lifetime in Gdb/Corellium/Esreven adapters#1075
xusheng6 wants to merge 1 commit into
devfrom
fix-rspconnector-lifetime-race

Conversation

@xusheng6
Copy link
Copy Markdown
Member

Summary

Properly fixes the race that #1073 only narrowed. DebuggerController::ExecuteAdapterAndWait deliberately uses two separate mutexes (debuggercontroller.cpp:2678-2695) so Pause/Quit/Detach can fire while another thread is blocked inside GoResponseHandler. That design is correct: a blocking recv on one thread has to coexist with a \x03 send on another. The bug is that m_rspConnector itself - the raw pointer field - was read and freed concurrently with no synchronization, so the operations could legitimately race against a use-after-free of the connector object.

Approach

Wrap m_rspConnector in a small AtomicRspConnector helper added to rspconnector.h: a mutex-guarded std::shared_ptr<RspConnector> with load() and store() methods. Every accessor calls load() once to obtain a strong reference, then operates on the local; the lock is held only for the duration of a shared_ptr copy (a refcount bump), never during actual I/O. Teardown sites just store(nullptr) - the connector destructs when the last strong reference drops, which is now guaranteed to be after every in-flight call returns.

The same pattern is applied to all three RSP-based adapters (CorelliumAdapter, GdbAdapter, EsrevenAdapter) since they share the identical raw-pointer-with-no-lock layout.

Why not std::atomic<std::shared_ptr<T>>: libc++ on macOS (Apple's stdlib) doesn't implement the C++20 specialization without a trivially-copyable T, so the direct C++20 form does not compile here. The mutex-wrapped equivalent is portable and has the same observable semantics for our use.

Tradeoff

Every read site now hoists a local at the top of the method. Roughly 30 call sites in corelliumadapter.cpp, 55 in gdbadapter.cpp, 85 in esrevenadapter.cpp are touched. Mechanical, no behavior change beyond the synchronization itself. The two-mutex design in DebuggerController is intact.

Fixes #1074
Refs #1066

Test plan

  • Build the debugger and confirm a clean connect/break/resume/detach cycle for each adapter.
  • Trigger Pause concurrently with target-exit (e.g., target crashes while user clicks Pause) and confirm no crash.
  • Trigger Pause without ever connecting (the original Crash inside BinaryNinjaDebugger::RspConnector::SendRaw #1066 scenario) and confirm a clean no-op.

🤖 Generated with Claude Code

DebuggerController::ExecuteAdapterAndWait uses two separate mutexes so
that Pause/Quit/Detach can run concurrently with a blocked Go/Receive
on another thread (debuggercontroller.cpp:2678-2695). That design is
intentional, but it means BreakInto races against ResponseHandler's
own teardown of m_rspConnector on target exit. The previous raw-pointer
field had no synchronization on either the load or the delete, so the
read side could deref a freed object.

Wrap the field in a small AtomicRspConnector helper (mutex-guarded
shared_ptr) and have every accessor load a strong reference before use.
Teardown sites now just store(nullptr) - the connector is destroyed when
the last strong reference drops, never under an in-flight call.

The same fix is applied to GdbAdapter and EsrevenAdapter since they
share the pattern.

Fixes #1074
Refs #1066

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Data race on m_rspConnector lifetime across Gdb/Corellium/Esreven adapters

1 participant