Add GDB-server debug endpoint #2
Port the DuetOS GDB-server system to MaNGOS Zero so a debugger or an AI agent can attach to the running mangosd and drive it: read process memory, inspect live game state, and run server commands over one endpoint.

Two layers, matching the DuetOS design:
- Native: a transport-agnostic GDB Remote Serial Protocol (RSP) stub over TCP (gdb / lldb / IDA). Framing, checksum, qSupported, ?, g/G (synthetic registers), guarded m/M memory, H, c/s/D/k, vCont, qRcmd.
- Semantic: 'monitor mangos <verb>' commands (status, players, tick, session, config, and a 'cmd' bridge to the full ChatCommand surface), also exposed on a plain-text TCP bridge for AI agents and non-RSP debuggers (WinDbg/CDB/x64dbg attach natively + use the bridge).

The kernel NMI-freeze stop is replaced by a cooperative world-tick stop: the world thread runs the RSP pump loop (pausing the 50ms tick) while the ACE network thread only shuttles bytes. The stop loop honours World::IsStopped() so a paused server can still shut down.

Memory access is guarded (Linux /proc/self/maps, Windows VirtualQuery) so a bad debugger address returns an RSP error instead of crashing.

New subsystem: src/game/Debug/GdbServer/ (RSP core, monitor, verbs, guarded memory, facade) + src/mangosd/GdbServerThread (ACE listener modeled on RAThread). Config-gated, disabled by default, localhost bind. See doc/GdbServer.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_016L1LfL8h1fGgLYjsfvyhEo
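The "framing, checksum" part of the native layer can be illustrated with a minimal sketch. It assumes only the publicly documented RSP packet shape (`$<payload>#<two hex digits>`, where the checksum is the payload byte sum modulo 256). The function names `RspChecksum`, `RspFrame`, `HexDigit` and `RspUnframe` are hypothetical and are not taken from this PR; payload escaping and run-length encoding are deliberately left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// RSP checksum: sum of the payload bytes, modulo 256.
inline std::uint8_t RspChecksum(const std::string& payload)
{
    unsigned sum = 0;
    for (unsigned char c : payload)
        sum += c;
    return static_cast<std::uint8_t>(sum & 0xFFu);
}

// Wrap a payload as "$<payload>#<two lowercase hex digits>".
// Escaping of '$', '#', '}' and '*' inside the payload is omitted here.
inline std::string RspFrame(const std::string& payload)
{
    static const char kHex[] = "0123456789abcdef";
    const std::uint8_t sum = RspChecksum(payload);
    std::string frame;
    frame.reserve(payload.size() + 4);
    frame += '$';
    frame += payload;
    frame += '#';
    frame += kHex[sum >> 4];
    frame += kHex[sum & 0x0F];
    return frame;
}

// Convert one hex character to its value, or -1 if it is not a hex digit.
inline int HexDigit(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Parse one complete frame. Returns true and fills payloadOut only when the
// frame is well-formed and the checksum matches; a real stub would then
// acknowledge with '+' (or '-' to request a resend).
inline bool RspUnframe(const std::string& frame, std::string& payloadOut)
{
    if (frame.size() < 4 || frame[0] != '$' || frame[frame.size() - 3] != '#')
        return false;
    const int hi = HexDigit(frame[frame.size() - 2]);
    const int lo = HexDigit(frame[frame.size() - 1]);
    if (hi < 0 || lo < 0)
        return false;
    const std::string payload = frame.substr(1, frame.size() - 4);
    if (RspChecksum(payload) != static_cast<std::uint8_t>((hi << 4) | lo))
        return false;
    payloadOut = payload;
    return true;
}
```

With this sketch, `RspFrame("OK")` yields `$OK#9a`, and `RspUnframe` rejects the same frame if either checksum digit is altered.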
Reword header comments and documentation to describe the debug endpoint as a native mangosd feature built on the GDB Remote Serial Protocol, removing references to the upstream project and kernel-debugging framing. No behavioural change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_016L1LfL8h1fGgLYjsfvyhEo
Build on the Phase 1 endpoint with the in-process debugging features that let a debugger or AI agent genuinely drive the server:
- Live register capture: at any stop the world thread's real registers are captured (Linux getcontext, Windows RtlCaptureContext on x86_64) and returned by the 'g' packet, so gdb can backtrace the actual call stack through the 'm' memory packets.
- Game-level breakpoints: arm pauses on a received opcode, on player map-entry, or on a named label. When one fires and a debugger is attached, the world thread stops inline at the call site, captures context, and waits for resume. Driven over the monitor surface (mangos break ...). Hot-path cost is a single relaxed atomic load when nothing is armed; an armed-but-unattended breakpoint never stalls the server.
- monitor mangos dump: world-thread backtrace (Linux symbols; addresses elsewhere).

Wired demonstration call sites: opcode dispatch (WorldSession::Update) and player map-entry (Player::AddToWorld), each behind a GDB_BREAK_* macro.

Native instruction breakpoints / hardware single-step are intentionally not implemented: patching int3 or self-setting debug registers in a live multi-threaded server is unsafe; the cooperative game-level breakpoints are the supported equivalent. See doc/GdbServer.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_016L1LfL8h1fGgLYjsfvyhEo
…overage

Replace the opcode/map/label-specific breakpoint engine with a generic (event, filter) model and wire breakpoints across every major gameplay subsystem, so a debugger or AI agent can pause the server on essentially any in-game event.

Engine (GdbBreakpoints):
- 21 event families in a single enum; armed set tracked as a per-event bitmask, so the call-site guard is one relaxed atomic load.
- Each breakpoint is an event plus an optional numeric filter (0 = any), e.g. spell id, map id, quest id, creature entry.
- Name<->event table drives the monitor surface: mangos break events|list|clear|<event> [filter]|del <event> [filter].
- Stops never nest (reentrancy guard) so a breakpoint that fires from a command run while already stopped is ignored.

Wired call sites: opcode dispatch, login, logout, map enter, map leave, spell cast, spell prepare, unit death, damage dealt, level up, loot, quest accept/complete/reward, chat, item use, gossip select, creature create, game-object use, command parse, and world tick (single-step the world loop). Each fires inline on the world thread, captures live registers, and waits, but only while a debugger is attached, so an armed-but-unattended breakpoint never stalls the server. New sites need a single GDB_BREAK(event, detail).

Verified the engine end-to-end (event listing, arm/list/clear, filter matching, register-published 'g', monitor encode/decode) with the standalone protocol test. Docs updated.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_016L1LfL8h1fGgLYjsfvyhEo
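A minimal sketch of the (event, filter) model described in this commit: a 64-bit armed mask read with one relaxed atomic load on the hot path, plus a registry of filters where 0 means "any". The namespace `bpsketch`, the three sample events, and the mutex around the registry are assumptions for illustration; the actual GdbBreakpoints code may be organised differently.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <vector>

namespace bpsketch
{
// Hypothetical subset of event families; each maps to one bit of the mask.
enum class Event : unsigned { Opcode = 0, SpellCast = 1, MapEnter = 2 };

struct Entry
{
    Event ev;
    std::uint64_t filter; // 0 = match any detail value
};

std::atomic<std::uint64_t> g_armedMask{0};
std::mutex g_lock;            // guards g_entries (an assumption of this sketch)
std::vector<Entry> g_entries;

inline std::uint64_t Bit(Event ev)
{
    return std::uint64_t(1) << static_cast<unsigned>(ev);
}

// Hot-path guard: a single relaxed atomic load, no lock taken.
inline bool Armed(Event ev)
{
    return (g_armedMask.load(std::memory_order_relaxed) & Bit(ev)) != 0;
}

// Arm a breakpoint on an event, optionally restricted to one detail value.
inline void Arm(Event ev, std::uint64_t filter)
{
    std::lock_guard<std::mutex> guard(g_lock);
    g_entries.push_back(Entry{ev, filter});
    g_armedMask.fetch_or(Bit(ev), std::memory_order_relaxed);
}

// Slow path, only reached when Armed() is true: compare against the filters.
inline bool Matches(Event ev, std::uint64_t detail)
{
    std::lock_guard<std::mutex> guard(g_lock);
    for (std::size_t i = 0; i < g_entries.size(); ++i)
        if (g_entries[i].ev == ev &&
            (g_entries[i].filter == 0 || g_entries[i].filter == detail))
            return true;
    return false;
}

// Remove every breakpoint and reset the mask.
inline void Clear()
{
    std::lock_guard<std::mutex> guard(g_lock);
    g_entries.clear();
    g_armedMask.store(0, std::memory_order_relaxed);
}
} // namespace bpsketch
```

A call site would test `Armed(ev)` first and only call `Matches(ev, detail)` when the bit is set, which keeps the unarmed cost to the single load the commit message describes.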
Prepare for wide subsystem coverage by relocating the breakpoint event enum to the shared layer and adding a lightweight bridge so lower-layer systems can raise game-level breakpoints.
- New shared/Debug/GdbEvents.h: GdbEvent enum (55 event families, <= 64 so the armed set stays a single bitmask), shared by both libraries.
- New shared/Debug/DebugBreakHook: a function-pointer shim (armed-mask getter + hit handler) that the game engine registers at startup, letting shared subsystems raise breakpoints without depending on game. Inert until registered. GDB_BREAK_SHARED() macro for shared call sites.
- GdbBp now aliases the shared enum, expands its name table to all 55 events, and registers the bridge in GdbServer::Init.
- Wire the first shared-layer hooks: SQL query (DbQuery) and statement (DbExecute) execution in MySQLConnection.

Verified the engine end-to-end with the standalone protocol test (all 55 events list/arm/match) and -fsyntax-only across the debug subsystem.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_016L1LfL8h1fGgLYjsfvyhEo
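The function-pointer shim idea can be sketched roughly as below: the lower layer stores two function pointers, does nothing while they are null, and otherwise consults the armed mask before calling the hit handler. All names here (`hooksketch`, `Register`, `Raise`, the typedefs) are hypothetical stand-ins and do not reflect the real DebugBreakHook interface.

```cpp
#include <atomic>
#include <cstdint>

namespace hooksketch
{
// The two callbacks the upper (game) layer would supply at startup.
typedef std::uint64_t (*ArmedMaskFn)();
typedef void (*HitFn)(unsigned event, std::uint64_t detail);

std::atomic<ArmedMaskFn> g_armedMaskFn{nullptr};
std::atomic<HitFn> g_hitFn{nullptr};

// Called once by the upper layer. The hit handler is published first so a
// reader that sees the mask getter also sees a valid handler.
inline void Register(ArmedMaskFn maskFn, HitFn hitFn)
{
    g_hitFn.store(hitFn, std::memory_order_release);
    g_armedMaskFn.store(maskFn, std::memory_order_release);
}

// Called from lower-layer code. Inert until Register() has run, so the
// shared library carries no link-time dependency on the game library.
inline void Raise(unsigned event, std::uint64_t detail)
{
    const ArmedMaskFn maskFn = g_armedMaskFn.load(std::memory_order_acquire);
    if (!maskFn)
        return; // nothing registered yet
    if ((maskFn() & (std::uint64_t(1) << event)) == 0)
        return; // this event is not armed
    const HitFn hitFn = g_hitFn.load(std::memory_order_acquire);
    if (hitFn)
        hitFn(event, detail);
}
} // namespace hooksketch
```

The design choice is that dependency flows only one way: the game layer knows about the shim, while the shim knows nothing about the game layer beyond two plain function pointers.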
Wire game-level breakpoint hooks across every major subsystem via the event model, taking total coverage to ~55 events / 53 call sites.

Added hooks:
- Netcode/auth: WorldSocket accept/close, HandleAuthSession, SendPacket.
- Warden: check request and violation handling.
- Scripting: creature AI selection (CreatureAISelector).
- Creature AI: enter/leave combat, AI update, respawn.
- Maps/instances: map create, grid load, dungeon create, instance reset.
- Economy/social: mail send/receive, auction list/bid, trade complete, group join.
- BG/pet/item/pvp/move: battleground start/end, pet summon, item equip/destroy, honorable kill, movement inform.
- Database (shared layer): SQL query/execute via the registered bridge.

These were discovered and applied by a parallel agent workflow, then hand-audited: three insertions that the literal anchor placement put inside an if-condition / constructor initializer lists were relocated into the function bodies, and three detail expressions referencing not-yet-declared packet locals were reduced to filter 0. Every call site was checked for correct include, scope-valid detail expression, and statement-boundary placement. Docs updated with the full event catalogue.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_016L1LfL8h1fGgLYjsfvyhEo
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2846cd77d0
    MonitorReq req;
    req.ctx = ctx;
    req.writer = writer;
    req.line = line;
    std::lock_guard<std::mutex> guard(m_monLock);
    m_monitor.push_back(std::move(req));
Keep monitor sockets alive while requests are queued
When a plain-text monitor client sends a complete line and then disconnects before the next world tick, the socket can be closed and its final ACE reference released while this queued request still only contains the raw GdbMonSocket*. DrainMonitorRequests() later calls req.writer(req.ctx, ...), so the reply path can dereference a freed socket; take/hold a reference for queued monitor work or cancel pending requests on close.
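One way to read the "take/hold a reference for queued monitor work" suggestion is sketched below. It uses `std::shared_ptr` as a stand-in for ACE reference counting, and every type here (`MonSocketSketch`, `MonitorReqSketch`, `MonitorQueueSketch`) is hypothetical rather than the PR's real `GdbMonSocket` or `MonitorReq`. The queued request owns a reference, so the socket object cannot be freed before the world thread drains the queue.

```cpp
#include <atomic>
#include <cstddef>
#include <deque>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical socket stand-in: records what would have been sent.
struct MonSocketSketch
{
    std::atomic<bool> closed{false};
    std::string sent;

    void Write(const std::string& reply)
    {
        if (!closed.load())
            sent += reply; // a closed connection silently drops the reply
    }
};

// The queued request holds an owning reference instead of a raw pointer.
struct MonitorReqSketch
{
    std::shared_ptr<MonSocketSketch> sock;
    std::string line;
};

class MonitorQueueSketch
{
public:
    // Network thread: enqueue a complete command line.
    void Enqueue(std::shared_ptr<MonSocketSketch> sock, std::string line)
    {
        MonitorReqSketch req{std::move(sock), std::move(line)};
        std::lock_guard<std::mutex> guard(m_lock);
        m_queue.push_back(std::move(req));
    }

    // World thread: run queued requests. Each socket stays alive at least
    // until its request has been processed, even if the peer disconnected.
    std::size_t Drain()
    {
        std::deque<MonitorReqSketch> batch;
        {
            std::lock_guard<std::mutex> guard(m_lock);
            batch.swap(m_queue);
        }
        for (std::size_t i = 0; i < batch.size(); ++i)
            batch[i].sock->Write("ok: " + batch[i].line + "\n");
        return batch.size();
    }

private:
    std::mutex m_lock;
    std::deque<MonitorReqSketch> m_queue;
};
```

The alternative named in the review, cancelling pending requests when the socket closes, avoids extending the socket's lifetime but needs the close path to walk the queue under the same lock.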
    if (GdbBp::Armed(GdbBp::Event::ev) && \
        GdbBp::Matches(GdbBp::Event::ev, static_cast<uint64>(detail))) \
        GdbBp::Hit(GdbBp::Event::ev, static_cast<uint64>(detail)); \
Route non-world breakpoints through the world thread
This macro is now used from non-world contexts such as WorldSocket::open in the ACE network reactor and the shared database layer, but GdbBp::Matches reads the unlocked g_entries vector while monitor commands can arm/disarm it, and GdbBp::Hit enters a stop loop documented as world-thread-only. With a debugger attached and a net/db breakpoint armed, those threads can race breakpoint edits or mutate RSP state while the world continues running; either dispatch these events to the world thread or make the registry and stop path thread-safe.
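The first option in this comment, dispatching events to the world thread, could look roughly like the mailbox below. Non-world threads only post (event, detail) pairs under a lock, and the world thread drains them once per tick and decides whether to stop. `PendingBreakSketch` and `BreakMailboxSketch` are hypothetical names, and this is a sketch of the idea rather than a proposed patch.

```cpp
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <vector>

struct PendingBreakSketch
{
    unsigned event;
    std::uint64_t detail;
};

class BreakMailboxSketch
{
public:
    // Any thread (network reactor, database worker): record the event and
    // return immediately. No RSP state is touched on this thread.
    void Post(unsigned event, std::uint64_t detail)
    {
        std::lock_guard<std::mutex> guard(m_lock);
        m_pending.push_back(PendingBreakSketch{event, detail});
    }

    // World thread, once per tick: take the whole batch and hand each entry
    // to the handler, which may match filters and enter the stop loop.
    template <typename Handler>
    std::size_t Drain(Handler onHit)
    {
        std::vector<PendingBreakSketch> batch;
        {
            std::lock_guard<std::mutex> guard(m_lock);
            batch.swap(m_pending);
        }
        for (std::size_t i = 0; i < batch.size(); ++i)
            onHit(batch[i].event, batch[i].detail);
        return batch.size();
    }

private:
    std::mutex m_lock;
    std::vector<PendingBreakSketch> m_pending;
};
```

The trade-off is that the posting thread does not pause at its call site: the stop happens on the next world tick, so captured registers would describe the world thread rather than the thread that raised the event.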
Implement a GDB-server system for MaNGOS Zero so a debugger or an AI
agent can attach to the running mangosd and drive it: read process memory,
inspect live game state, and run server commands over one endpoint.

Two layers:
- Native: a transport-agnostic GDB Remote Serial Protocol (RSP) stub over
  TCP (gdb / lldb / IDA). Framing, checksum, qSupported, ?, g/G (synthetic
  registers), guarded m/M memory, H, c/s/D/k, vCont, qRcmd.
- Semantic: 'monitor mangos <verb>' commands (status, players, tick,
  session, config, and a 'cmd' bridge to the full ChatCommand surface),
  also exposed on a plain-text TCP bridge for AI agents and non-RSP
  debuggers (WinDbg/CDB/x64dbg attach natively + use the bridge).

The stop is a cooperative world-tick stop:
the world thread runs the RSP pump loop (pausing the 50ms tick) while the
ACE network thread only shuttles bytes. The stop loop honours
World::IsStopped() so a paused server can still shut down.

Memory access is guarded (Linux /proc/self/maps, Windows VirtualQuery) so
a bad debugger address returns an RSP error instead of crashing.

New subsystem: src/game/Debug/GdbServer/ (RSP core, monitor, verbs,
guarded memory, facade) + src/mangosd/GdbServerThread (ACE listener
modeled on RAThread). Config-gated, disabled by default, localhost bind.
See doc/GdbServer.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_016L1LfL8h1fGgLYjsfvyhEo