Add GDB-server debug endpoint #2

Merged: Krilliac merged 6 commits into master from claude/gdb-server-mangos-port-3ugxqb on Jun 29, 2026

Krilliac (Owner) commented Jun 29, 2026

Add a GDB-server system to MaNGOS Zero so a debugger or an AI
agent can attach to the running mangosd and drive it: read process memory,
inspect live game state, and run server commands over one endpoint.

  • Native: a transport-agnostic GDB Remote Serial Protocol (RSP) stub over
    TCP (gdb / lldb / IDA). Framing, checksum, qSupported, ?, g/G (synthetic
    registers), guarded m/M memory, H, c/s/D/k, vCont, qRcmd.
  • Semantic: 'monitor mangos <verb>' commands (status, players, tick,
    session, config, and a 'cmd' bridge to the full ChatCommand surface),
    also exposed on a plain-text TCP bridge for AI agents and non-RSP
    debuggers (WinDbg/CDB/x64dbg attach natively + use the bridge).
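
As a rough illustration of the framing and checksum step named in the Native layer, here is a minimal sketch of RSP packet framing: a payload is sent as `$<payload>#<checksum>`, where the checksum is the modulo-256 sum of the payload bytes written as two hex digits. The function names (`RspChecksum`, `RspFrame`, `RspUnframe`) are hypothetical and not taken from this PR, and the sketch deliberately ignores RSP escaping and run-length encoding.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <optional>
#include <string>

// Modulo-256 sum of the payload bytes (the RSP checksum definition).
inline uint8_t RspChecksum(const std::string& payload)
{
    unsigned sum = 0;
    for (unsigned char c : payload)
        sum += c;
    return static_cast<uint8_t>(sum & 0xFF);
}

// Wrap a payload as "$<payload>#<two lowercase hex digits>".
inline std::string RspFrame(const std::string& payload)
{
    char tail[4];
    std::snprintf(tail, sizeof(tail), "#%02x",
                  static_cast<unsigned>(RspChecksum(payload)));
    return "$" + payload + tail;
}

// Value of one hex digit, or -1 if the character is not a hex digit.
inline int HexDigit(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Return the payload if `packet` is "$...#xx" with a matching checksum.
// Hypothetical helper: escaping and run-length encoding are not handled.
inline std::optional<std::string> RspUnframe(const std::string& packet)
{
    if (packet.size() < 4 || packet.front() != '$')
        return std::nullopt;
    const std::size_t hash = packet.size() - 3;
    if (packet[hash] != '#')
        return std::nullopt;
    const int hi = HexDigit(packet[hash + 1]);
    const int lo = HexDigit(packet[hash + 2]);
    if (hi < 0 || lo < 0)
        return std::nullopt;
    std::string payload = packet.substr(1, hash - 1);
    if (RspChecksum(payload) != static_cast<uint8_t>(hi * 16 + lo))
        return std::nullopt;
    return payload;
}
```

A stub would typically acknowledge a packet that unframes cleanly with `+` and request retransmission with `-` otherwise.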

Stops are cooperative: the world thread runs the RSP pump loop (pausing the 50ms tick) while the
ACE network thread only shuttles bytes. The stop loop honours
World::IsStopped() so a paused server can still shut down.

Memory access is guarded (Linux /proc/self/maps, Windows VirtualQuery) so
a bad debugger address returns an RSP error instead of crashing.
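
One way the Linux side of such a guard could work is sketched below: before touching memory, check that the whole requested range is covered by readable mappings in the `/proc/<pid>/maps` text format. `RangeReadable` is a hypothetical name, not the PR's function; the sketch assumes the listing is sorted by ascending address (as the kernel emits it) and takes any `std::istream` so it can be exercised without a live process. The Windows `VirtualQuery` path is not shown.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <istream>
#include <sstream>
#include <string>

// True if [addr, addr + len) lies entirely inside readable mappings listed
// in `maps`. Lines look like: "00400000-00452000 r-xp 00000000 08:02 1 /bin/x".
inline bool RangeReadable(std::istream& maps, uint64_t addr, uint64_t len)
{
    if (len == 0)
        return true;
    if (addr + len < addr)              // range wraps around the address space
        return false;

    uint64_t cursor = addr;             // first byte not yet proven readable
    const uint64_t end = addr + len;
    std::string line;
    while (cursor < end && std::getline(maps, line))
    {
        unsigned long long lo = 0, hi = 0;
        char perms[5] = {0};
        if (std::sscanf(line.c_str(), "%llx-%llx %4s", &lo, &hi, perms) != 3)
            continue;                   // skip lines that do not parse
        if (cursor >= lo && cursor < hi)
        {
            if (perms[0] != 'r')
                return false;           // mapped but not readable
            cursor = hi;                // continue into the next mapping
        }
    }
    return cursor >= end;               // false if a gap was hit
}
```

On Linux the caller would pass `std::ifstream("/proc/self/maps")`; when the check fails, the stub can answer the `m` packet with an RSP error reply (`E` followed by two hex digits) instead of dereferencing the address.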

New subsystem: src/game/Debug/GdbServer/ (RSP core, monitor, verbs,
guarded memory, facade) + src/mangosd/GdbServerThread (ACE listener
modeled on RAThread). Config-gated, disabled by default, localhost bind.
See doc/GdbServer.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_016L1LfL8h1fGgLYjsfvyhEo

claude added 6 commits June 29, 2026 06:53
Port the DuetOS GDB-server system to MaNGOS Zero so a debugger or an AI
agent can attach to the running mangosd and drive it: read process memory,
inspect live game state, and run server commands over one endpoint.

Two layers, matching the DuetOS design:
- Native: a transport-agnostic GDB Remote Serial Protocol (RSP) stub over
  TCP (gdb / lldb / IDA). Framing, checksum, qSupported, ?, g/G (synthetic
  registers), guarded m/M memory, H, c/s/D/k, vCont, qRcmd.
- Semantic: 'monitor mangos <verb>' commands (status, players, tick,
  session, config, and a 'cmd' bridge to the full ChatCommand surface),
  also exposed on a plain-text TCP bridge for AI agents and non-RSP
  debuggers (WinDbg/CDB/x64dbg attach natively + use the bridge).
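
For context on how the Semantic layer reaches the stub: when a user types `monitor mangos status` in gdb, the command text arrives as a `qRcmd,` packet whose argument is the command hex-encoded, two digits per byte, and console output is returned hex-encoded as well. The sketch below shows that encoding only; `HexEncode`, `HexDecode`, and `DecodeRcmd` are hypothetical helper names, not the PR's.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Encode each byte as two lowercase hex digits.
inline std::string HexEncode(const std::string& text)
{
    static const char digits[] = "0123456789abcdef";
    std::string out;
    out.reserve(text.size() * 2);
    for (unsigned char c : text)
    {
        out.push_back(digits[c >> 4]);
        out.push_back(digits[c & 0x0F]);
    }
    return out;
}

// Decode a hex string; fails on odd length or non-hex characters.
inline std::optional<std::string> HexDecode(const std::string& hex)
{
    auto nibble = [](char c) -> int {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    };
    if (hex.size() % 2 != 0)
        return std::nullopt;
    std::string out;
    out.reserve(hex.size() / 2);
    for (std::size_t i = 0; i < hex.size(); i += 2)
    {
        const int hi = nibble(hex[i]);
        const int lo = nibble(hex[i + 1]);
        if (hi < 0 || lo < 0)
            return std::nullopt;
        out.push_back(static_cast<char>(hi * 16 + lo));
    }
    return out;
}

// Extract the monitor command text from a "qRcmd,<hex>" payload.
inline std::optional<std::string> DecodeRcmd(const std::string& payload)
{
    static const std::string prefix = "qRcmd,";
    if (payload.compare(0, prefix.size(), prefix) != 0)
        return std::nullopt;
    return HexDecode(payload.substr(prefix.size()));
}
```

The decoded text (for example `mangos status`) is what a verb dispatcher would then parse.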

The kernel NMI-freeze stop is replaced by a cooperative world-tick stop:
the world thread runs the RSP pump loop (pausing the 50ms tick) while the
ACE network thread only shuttles bytes. The stop loop honours
World::IsStopped() so a paused server can still shut down.
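
The shape of such a cooperative stop loop might look like the sketch below. It is an assumption-laden illustration, not the PR's code: `RunStopLoop` and `StopResult` are hypothetical names, `pumpOnce` stands in for "process at most one debugger packet and report whether it asked to resume", and `isStopped` stands in for `World::IsStopped()`.

```cpp
#include <cassert>
#include <functional>

enum class StopResult { Resumed, Shutdown };

// Runs on the world thread while the server is paused. The shutdown flag is
// checked before every pump so a paused server can still exit.
inline StopResult RunStopLoop(const std::function<bool()>& pumpOnce,
                              const std::function<bool()>& isStopped)
{
    for (;;)
    {
        if (isStopped())
            return StopResult::Shutdown;   // server is shutting down
        if (pumpOnce())
            return StopResult::Resumed;    // debugger sent c / s / D / k
    }
}
```

A real loop would block or sleep briefly inside the pump rather than spin; the point here is only the ordering of the two checks.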

Memory access is guarded (Linux /proc/self/maps, Windows VirtualQuery) so
a bad debugger address returns an RSP error instead of crashing.

New subsystem: src/game/Debug/GdbServer/ (RSP core, monitor, verbs,
guarded memory, facade) + src/mangosd/GdbServerThread (ACE listener
modeled on RAThread). Config-gated, disabled by default, localhost bind.
See doc/GdbServer.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_016L1LfL8h1fGgLYjsfvyhEo

Reword header comments and documentation to describe the debug endpoint as
a native mangosd feature built on the GDB Remote Serial Protocol, removing
references to the upstream project and kernel-debugging framing. No
behavioural change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_016L1LfL8h1fGgLYjsfvyhEo

Build on the Phase 1 endpoint with the in-process debugging features that
let a debugger or AI agent genuinely drive the server:

- Live register capture: at any stop the world thread's real registers are
  captured (Linux getcontext, Windows RtlCaptureContext on x86_64) and
  returned by the 'g' packet, so gdb can backtrace the actual call stack
  through the 'm' memory packets.
- Game-level breakpoints: arm pauses on a received opcode, on player
  map-entry, or on a named label. When one fires and a debugger is attached,
  the world thread stops inline at the call site, captures context, and waits
  for resume. Driven over the monitor surface (mangos break ...). Hot-path
  cost is a single relaxed atomic load when nothing is armed; an
  armed-but-unattended breakpoint never stalls the server.
- monitor mangos dump: world-thread backtrace (Linux symbols; addresses
  elsewhere).

Wired demonstration call sites: opcode dispatch (WorldSession::Update) and
player map-entry (Player::AddToWorld), each behind a GDB_BREAK_* macro.

Native instruction breakpoints / hardware single-step are intentionally not
implemented: patching int3 or self-setting debug registers in a live
multi-threaded server is unsafe; the cooperative game-level breakpoints are
the supported equivalent. See doc/GdbServer.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_016L1LfL8h1fGgLYjsfvyhEo

…overage

Replace the opcode/map/label-specific breakpoint engine with a generic
(event, filter) model and wire breakpoints across every major gameplay
subsystem, so a debugger or AI agent can pause the server on essentially any
in-game event.

Engine (GdbBreakpoints):
- 21 event families in a single enum; armed set tracked as a per-event
  bitmask, so the call-site guard is one relaxed atomic load.
- Each breakpoint is an event plus an optional numeric filter (0 = any), e.g.
  spell id, map id, quest id, creature entry.
- Name<->event table drives the monitor surface: mangos break
  events|list|clear|<event> [filter]|del <event> [filter].
- Stops never nest (reentrancy guard) so a breakpoint that fires from a
  command run while already stopped is ignored.
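
To make the (event, filter) model concrete, here is a minimal sketch of an armed-set bitmask with a relaxed atomic guard and a filter where 0 means "any". The class and the three example events are hypothetical; the PR's `GdbBp` interface may differ. Note that this sketch protects the entry list with a mutex, which is a choice made here for safety, not something the commit message states.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <mutex>
#include <vector>

// Hypothetical subset of event families; the real enum is larger.
enum class Event : uint8_t { Opcode = 0, SpellCast = 1, MapEnter = 2 };

class Breakpoints
{
public:
    // Arm `ev`; filter 0 matches any detail value.
    void Arm(Event ev, uint64_t filter = 0)
    {
        std::lock_guard<std::mutex> guard(m_lock);
        m_entries.push_back({ev, filter});
        m_armed.fetch_or(Bit(ev), std::memory_order_relaxed);
    }

    void Clear()
    {
        std::lock_guard<std::mutex> guard(m_lock);
        m_entries.clear();
        m_armed.store(0, std::memory_order_relaxed);
    }

    // Call-site guard: one relaxed atomic load when nothing is armed.
    bool Armed(Event ev) const
    {
        return (m_armed.load(std::memory_order_relaxed) & Bit(ev)) != 0;
    }

    // Slow path, reached only when the event's bit is set.
    bool Matches(Event ev, uint64_t detail) const
    {
        std::lock_guard<std::mutex> guard(m_lock);
        for (const Entry& e : m_entries)
            if (e.ev == ev && (e.filter == 0 || e.filter == detail))
                return true;
        return false;
    }

private:
    struct Entry { Event ev; uint64_t filter; };

    static uint64_t Bit(Event ev)
    {
        return uint64_t{1} << static_cast<unsigned>(ev);
    }

    mutable std::mutex m_lock;
    std::vector<Entry> m_entries;
    std::atomic<uint64_t> m_armed{0};
};
```

A call site would test `Armed(ev)` first and only then call `Matches(ev, detail)`, so unarmed events cost a single load on the hot path.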

Wired call sites: opcode dispatch, login, logout, map enter, map leave,
spell cast, spell prepare, unit death, damage dealt, level up, loot, quest
accept/complete/reward, chat, item use, gossip select, creature create,
game-object use, command parse, and world tick (single-step the world loop).

Each fires inline on the world thread, captures live registers, and waits —
only while a debugger is attached, so an armed-but-unattended breakpoint
never stalls the server. New sites need a single GDB_BREAK(event, detail).
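
The two conditions described in this commit (stop only while a debugger is attached, and never nest stops) can be sketched as a small gate. `StopGate` is a hypothetical name, and the plain `bool` reentrancy flag assumes `Hit` is only ever called on the world thread.

```cpp
#include <atomic>
#include <cassert>
#include <functional>

class StopGate
{
public:
    void SetAttached(bool on)
    {
        m_attached.store(on, std::memory_order_release);
    }

    // Runs `stopLoop` and returns true only if a debugger is attached and
    // no stop is already in progress; otherwise returns false immediately.
    bool Hit(const std::function<void()>& stopLoop)
    {
        if (!m_attached.load(std::memory_order_acquire))
            return false;       // armed but unattended: never stall
        if (m_inStop)
            return false;       // fired from a command run while stopped
        m_inStop = true;
        stopLoop();             // capture context, wait for resume
        m_inStop = false;
        return true;
    }

private:
    std::atomic<bool> m_attached{false};
    bool m_inStop = false;      // world-thread only in this sketch
};
```

If `stopLoop` could throw, the flag reset would belong in an RAII guard; that is omitted here for brevity.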

Verified the engine end-to-end (event listing, arm/list/clear, filter
matching, register-published 'g', monitor encode/decode) with the standalone
protocol test. Docs updated.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_016L1LfL8h1fGgLYjsfvyhEo

Prepare for wide subsystem coverage by relocating the breakpoint event
enum to the shared layer and adding a lightweight bridge so lower-layer
systems can raise game-level breakpoints.

- New shared/Debug/GdbEvents.h: GdbEvent enum (55 event families, <= 64 so
  the armed set stays a single bitmask), shared by both libraries.
- New shared/Debug/DebugBreakHook: a function-pointer shim (armed-mask
  getter + hit handler) that the game engine registers at startup, letting
  shared subsystems raise breakpoints without depending on game. Inert
  until registered. GDB_BREAK_SHARED() macro for shared call sites.
- GdbBp now aliases the shared enum, expands its name table to all 55
  events, and registers the bridge in GdbServer::Init.
- Wire the first shared-layer hooks: SQL query (DbQuery) and statement
  (DbExecute) execution in MySQLConnection.
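
A function-pointer shim of the kind described for DebugBreakHook could look roughly like this. The names and signatures below are illustrative assumptions (the commit message does not give them), the pointers are plain rather than atomic, and C++17 inline variables are used so the sketch fits in one header.

```cpp
#include <cassert>
#include <cstdint>

namespace DebugBreakHookSketch
{
    using ArmedMaskFn = uint64_t (*)();
    using HitFn = void (*)(unsigned event, uint64_t detail);

    // Both pointers stay null until the game layer registers them,
    // so the shim is inert by default.
    inline ArmedMaskFn g_armedMask = nullptr;
    inline HitFn g_hit = nullptr;

    // Called once by the game layer at startup.
    inline void Register(ArmedMaskFn armedMask, HitFn hit)
    {
        g_armedMask = armedMask;
        g_hit = hit;
    }

    // Called from shared-layer code; `event` must be below 64 so it fits
    // the single-bitmask armed set.
    inline void Raise(unsigned event, uint64_t detail)
    {
        if (!g_armedMask || !g_hit || event >= 64)
            return;
        if (g_armedMask() & (uint64_t{1} << event))
            g_hit(event, detail);
    }
}
```

Because the shared library only sees two function pointers, it never needs to include or link against game-layer headers.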

Verified the engine end-to-end with the standalone protocol test (all 55
events list/arm/match) and -fsyntax-only across the debug subsystem.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_016L1LfL8h1fGgLYjsfvyhEo

Wire game-level breakpoint hooks across every major subsystem via the event
model, taking total coverage to ~55 events / 53 call sites.

Added hooks:
- Netcode/auth: WorldSocket accept/close, HandleAuthSession, SendPacket.
- Warden: check request and violation handling.
- Scripting: creature AI selection (CreatureAISelector).
- Creature AI: enter/leave combat, AI update, respawn.
- Maps/instances: map create, grid load, dungeon create, instance reset.
- Economy/social: mail send/receive, auction list/bid, trade complete,
  group join.
- BG/pet/item/pvp/move: battleground start/end, pet summon, item
  equip/destroy, honorable kill, movement inform.
- Database (shared layer): SQL query/execute via the registered bridge.

These were discovered and applied by a parallel agent workflow, then
hand-audited: three insertions that literal anchor placement had put inside
an if-condition or a constructor initializer list were relocated into the
function bodies, and three detail expressions that referenced
not-yet-declared packet locals were reduced to filter 0. Every call site was
checked for the correct include, a scope-valid detail expression, and
statement-boundary placement.

Docs updated with the full event catalogue.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_016L1LfL8h1fGgLYjsfvyhEo
@Krilliac Krilliac changed the title Add GDB-server debug endpoint (port from DuetOS) Add GDB-server debug endpoint Jun 29, 2026
@Krilliac Krilliac merged commit adae30e into master Jun 29, 2026
5 checks passed

chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2846cd77d0

Comment on lines +160 to +165
MonitorReq req;
req.ctx = ctx;
req.writer = writer;
req.line = line;
std::lock_guard<std::mutex> guard(m_monLock);
m_monitor.push_back(std::move(req));

P2: Keep monitor sockets alive while requests are queued

When a plain-text monitor client sends a complete line and then disconnects before the next world tick, the socket can be closed and its final ACE reference released while this queued request still only contains the raw GdbMonSocket*. DrainMonitorRequests() later calls req.writer(req.ctx, ...), so the reply path can dereference a freed socket; take/hold a reference for queued monitor work or cancel pending requests on close.
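
One possible shape of the fix this comment asks for is sketched below, using `std::shared_ptr` / `std::weak_ptr` as a stand-in for the ACE reference counting the real socket uses. `MonSocket` and `MonitorQueue` are hypothetical types written for this illustration. Each queued request holds a weak reference, so a request whose socket has already closed is dropped at drain time instead of dereferencing freed memory.

```cpp
#include <cassert>
#include <deque>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for the monitor socket; records what was written.
struct MonSocket
{
    std::string sent;
    void Write(const std::string& text) { sent += text; }
};

class MonitorQueue
{
public:
    // Called on the network thread when a full line has arrived.
    void Push(const std::shared_ptr<MonSocket>& sock, std::string line)
    {
        std::lock_guard<std::mutex> guard(m_lock);
        m_queue.push_back({sock, std::move(line)});
    }

    // Called on the world thread; returns how many replies were delivered.
    int Drain()
    {
        std::deque<Req> batch;
        {
            std::lock_guard<std::mutex> guard(m_lock);
            batch.swap(m_queue);
        }
        int delivered = 0;
        for (Req& req : batch)
        {
            // lock() yields null if the socket was destroyed meanwhile.
            if (std::shared_ptr<MonSocket> sock = req.sock.lock())
            {
                sock->Write("ok: " + req.line + "\n");
                ++delivered;
            }
        }
        return delivered;
    }

private:
    struct Req { std::weak_ptr<MonSocket> sock; std::string line; };

    std::mutex m_lock;
    std::deque<Req> m_queue;
};
```

Storing a `shared_ptr` in the request instead would implement the other option named in the comment, keeping the socket alive until its reply has been written.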

Comment on lines +100 to +102
if (GdbBp::Armed(GdbBp::Event::ev) && \
GdbBp::Matches(GdbBp::Event::ev, static_cast<uint64>(detail))) \
GdbBp::Hit(GdbBp::Event::ev, static_cast<uint64>(detail)); \

P2: Route non-world breakpoints through the world thread

This macro is now used from non-world contexts such as WorldSocket::open in the ACE network reactor and the shared database layer, but GdbBp::Matches reads the unlocked g_entries vector while monitor commands can arm/disarm it, and GdbBp::Hit enters a stop loop documented as world-thread-only. With a debugger attached and a net/db breakpoint armed, those threads can race breakpoint edits or mutate RSP state while the world continues running; either dispatch these events to the world thread or make the registry and stop path thread-safe.
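
The first option in this comment (dispatching events to the world thread) could be approached with a small mutex-protected queue, sketched below under stated assumptions: `HitQueue` and `PendingHit` are hypothetical names, and events are identified by a plain integer. Network or database threads only post; the world thread drains once per tick and is the only thread that enters the stop path.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <vector>

struct PendingHit
{
    unsigned event;
    uint64_t detail;
};

class HitQueue
{
public:
    // Safe to call from any thread; never blocks on the debugger.
    void Post(unsigned event, uint64_t detail)
    {
        std::lock_guard<std::mutex> guard(m_lock);
        m_pending.push_back({event, detail});
    }

    // Called by the world thread; `handle` runs outside the lock so it may
    // enter a stop loop without blocking posters.
    template <typename Handler>
    std::size_t DrainOnWorldThread(Handler&& handle)
    {
        std::vector<PendingHit> batch;
        {
            std::lock_guard<std::mutex> guard(m_lock);
            batch.swap(m_pending);
        }
        for (const PendingHit& hit : batch)
            handle(hit);
        return batch.size();
    }

private:
    std::mutex m_lock;
    std::vector<PendingHit> m_pending;
};
```

The trade-off is that the originating thread is not paused at the call site, so the stop happens at the next world tick rather than inline; whether that is acceptable depends on what the breakpoint is meant to inspect.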
