feat: multi-language debugging — Python + Go + Node.js via Adapter abstraction#5
Merged
…fic code behind it
Until now every language-specific decision (spawn debugpy.adapter, build the
`{"type": "python", ...}` launch payload, parse Python tracebacks, peel a
`python foo.py` interpreter prefix in `diagnose`) was inlined in the core /
command layer. Adding a second language would have required surgery in five
or six files.
This change introduces `adapters.base.Adapter` — an ABC that captures every
language-specific knob the rest of the codebase needs:
* `spawn_adapter(port)` — start the language's DAP server
* `launch_payload(...)` — build the DAP `launch` request body
* `parse_traceback(text)` — language-specific stack/panic parser
* `spawn_listen_mode(...)` + `supports_listen_mode()` — IDE attach (optional)
* `attach_url(host, port)` — the URL scheme an IDE should use
* `resolve_launch_target(cmd)` — peel an interpreter prefix for `diagnose`
* `probe_template(kind, code)` — hook for future `instrument` defaults
`PythonAdapter` ports every existing debugpy code path behind this interface
with no behavior change. The registry in `adapters/__init__.py` exposes
`get_adapter`, `list_adapters`, `detect_language` (by file extension), and
`resolve_language` (explicit > detected > default).
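The "explicit > detected > default" precedence can be sketched as below. This is a minimal illustration, not the code in `adapters/__init__.py`: the extension table, the `DEFAULT_LANGUAGE` constant, and the exact function signatures are assumptions made for the example.

```python
from __future__ import annotations

from pathlib import Path

# Hypothetical extension table; the real registry may map other suffixes.
_EXTENSIONS = {".py": "python", ".go": "go", ".js": "node"}
DEFAULT_LANGUAGE = "python"  # assumed default, matching the Python-first history


def detect_language(script: str | None) -> str | None:
    """Return the language implied by the file extension, or None if unknown."""
    if not script:
        return None
    return _EXTENSIONS.get(Path(script).suffix.lower())


def resolve_language(explicit: str | None, script: str | None) -> str:
    """An explicit --lang wins, then extension detection, then the default."""
    return explicit or detect_language(script) or DEFAULT_LANGUAGE
```

With this shape, `resolve_language("go", "app.py")` honors the explicit flag, while an unrecognized extension falls through to the default.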
CLI surface:
* `session start --lang {python,...}` — auto-detected from script extension
when omitted; persisted to `meta.json` so the daemon picks the right
adapter on (re)start.
* `localize --lang {python,...}` — picks the traceback parser.
* `diagnose --lang {python,...}` — drives both traceback parsing and the
`python foo.py` → `foo.py` launch-target peeling; auto-inferred from the
command (interpreter basename or first script-like argument).
Internal moves:
* `adapters/debugpy_adapter.py` (module) → `adapters/python.py` (class) +
`adapters/_socket.py` (truly generic `find_free_port` /
`wait_until_listening`, reusable by future Go/Node adapters).
* `core/dap_session.py::DapSession` now takes an `Adapter` (defaults to
PythonAdapter for backwards compatibility). It calls into the adapter
instead of importing debugpy helpers directly.
* `core/session_proc.py` reads `meta["lang"]` (defaults to "python" for
pre-refactor meta files) and constructs the matching adapter.
Testing:
* 11 new unit tests in `tests/unit/test_adapters.py` cover registry,
detection, listen-mode flag, launch-payload shape, interpreter peeling,
and traceback parsing through the adapter.
* All existing 102 tests (unit + integration + e2e) pass unchanged — the
Python behavior is byte-identical.
* `ruff check`, `ruff format`, `mypy --strict src` all clean.
This is the first of a 3-PR stack. PR2 adds GoAdapter (delve `dlv dap`).
PR3 adds NodeAdapter (vscode-js-debug `dapDebugServer.js`).
Second PR in the multi-language stack. Stacks on top of #2 (Adapter ABC
refactor); merge that first.
GoAdapter implements the Adapter contract for Go programs:
* spawn `dlv dap --listen=127.0.0.1:<port>` as the DAP server, with a
  clear "delve not installed" error pointing at the `go install` command
  when `dlv` isn't on PATH (no cryptic ENOENT).
* launch payload uses `mode: "debug"` so delve compiles + runs in one
  step from a `.go` file or package directory.
* `spawn_listen_mode` for IDE attach (VS Code Go extension dap-mode).
* `parse_traceback` understands both `panic:` and `fatal error:` dumps,
  including extended runtime frames with
  `+0xN fp=0x... sp=0x... pc=0x...` annotations. Frames stored
  oldest-first (matches Python convention) so the shared
  `deepest_user_frame` heuristic lands on the panic site.
* runtime / sync / reflect / internal frames are marked
  `is_user_code=False` so the deepest-user heuristic skips runtime
  scaffolding when reporting crash locations.
* `resolve_launch_target` peels `go run [-flags] <main.go> args...` into
  `(main.go, args)` for `dbga diagnose`. `go test` is out of scope (would
  need `mode: "test"`); it returns None and surfaces the crash without
  rerun, matching the existing Python `-m` behavior.
CLI surface:
  $ dbga session start --break-at main.go:12 -- main.go
  $ dbga diagnose --timeout 30 -- go run main.go
  $ dbga localize --lang go --file panic.txt
Language is auto-detected from the script extension; `--lang go` forces it.
Test plan:
* 14 new unit tests (`tests/unit/test_go_adapter.py`) covering registry,
  detection, listen-mode flag, launch-payload shape, `go run` peeling,
  panic + fatal-error parsing, and the missing-dlv error path.
* 1 new integration test (`tests/integration/test_go_session.py`) drives
  real `dlv dap`: initialize / launch / stopOnEntry / continue /
  terminated. Auto-skips when `dlv` or `go` isn't on PATH so the suite
  stays green on Python-only machines.
* `__debug_bin*` + `*.test` added to .gitignore, because `dlv dap` leaves
  the compiled debug binary in cwd.
Local validation:
- 76 unit tests pass (61 prior + 14 new + 1 misc).
- 8 integration tests pass (7 Python + 1 Go), driven against real delve
  1.22+ on Windows.
- 45 e2e tests pass unchanged.
- ruff check + ruff format + mypy --strict all clean.
Out of scope for this PR (deferred to future work):
- `go test ./...` debugging (needs DAP `mode: "test"` + `--test.run`
  plumbing).
- `dbga instrument` probe templates for Go (`fmt.Println` defaults).
Third PR in the multi-language stack. Stacks on top of #3 (GoAdapter);
merge that first.
NodeAdapter scaffolds Node.js / TypeScript debugging via Microsoft's
vscode-js-debug (the same DAP server VS Code itself uses). Status: alpha.
Status, what works:
* Discovery: `find_dap_server()` locates `dapDebugServer.js` from
  $DBGA_JS_DEBUG_SERVER, then VS Code / Cursor / Insiders extension dirs,
  then a manual `~/.local/share/js-debug/` install. Errors with a clear
  install hint (GitHub releases URL) when nothing is found. Verified
  end-to-end against vscode-js-debug v1.117.0 on Windows.
* Spawn + handshake: `node dapDebugServer.js <port> 127.0.0.1` accepts
  our DAP `initialize`; `test_node_dap_initialize` passes against the
  real adapter.
* V8 stack-trace parser: handles named + anonymous frames, node:internal
  + node_modules library detection, oldest-first frame ordering so the
  shared `deepest_user_frame` heuristic lands on the failure site. 13
  fixture-driven cases pass against real Node 20 / 26 traces.
* `resolve_launch_target` peels `node [-flags] script args`, `ts-node`,
  `tsx`; correctly consumes the `-r module` / `--require module` pair.
* `--lang node` flag plumbed through `session start`, `localize`,
  `diagnose`. Auto-detection covers `.js .mjs .cjs .ts .mts .cts`.
Status, known blocker (intentional alpha scope):
vscode-js-debug delegates the actual launched program to a *child* DAP
session via a reverse `startDebugging` request. Our DapClient currently
drops all server-to-client requests (see `dap_client.py::_dispatch`), so
the child is never created and `stopped` never arrives. The full launch
flow test (`test_node_dap_launch_stops`) is marked `xfail strict` and
will start passing automatically once DapClient gains reverse-request +
child-session handling. This is the documented follow-up scope, not in
this PR.
Test plan:
* 24 new unit tests in `tests/unit/test_node_adapter.py` covering
  registry, extension detection, launch-payload shape, listen-mode flag,
  V8 parser fixtures (TypeError + ReferenceError + node_modules +
  anonymous frames), missing-node hint, and the env-var discovery
  override path.
* 1 new integration test pair:
  - test_node_dap_initialize: PASSES against real js-debug.
  - test_node_dap_launch_stops: xfail strict; tracks the reverse-request
    blocker.
  Both auto-skip when node + js-debug aren't both discoverable.
* Full suite: 152 passed + 1 xfailed locally (61 Python unit + 14 Go unit
  + 24 Node unit + 8 misc unit, 9 integration including Go + Node
  initialize, 45 e2e).
* ruff check + ruff format + mypy --strict all clean (30 src files).
README updated with the language-toolchain install matrix.
Out of scope (deferred to a follow-up PR):
* DapClient reverse-request handling (`startDebugging` etc.), the one
  change that promotes NodeAdapter from "handshake works" to "full live
  session". Unblocks worker-thread + child_process attach as a bonus.
* `dbga instrument` probe templates for JS (`console.log` defaults).
Promotes the Node adapter from "alpha (handshake only)" to a full,
live-debugger experience by teaching DapClient and DapSession to handle
DAP server-to-client requests — specifically vscode-js-debug's
`startDebugging`, which delegates every launched program to a fresh
child DAP session.
What this PR adds on top of the previous NodeAdapter scaffold:
DapClient — server-to-client requests
-------------------------------------
* `register_reverse_handler(command, handler)`: register a callable that
runs when the DAP server sends `type: "request"`. The handler returns
the response body (or `None` for empty); raising surfaces as a DAP
`success: false` response with the error message.
* `_dispatch` now routes `type == "request"` through the handler map.
Unknown commands respond with `"not supported"` so the server isn't
left waiting.
* `_send_response` emits a properly-framed DAP response with a fresh
client-side `seq` and the server's `request_seq` echoed back.
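The routing rules above (unknown command, handler return value, handler exception, fresh `seq`) can be sketched in isolation. This is an illustrative stand-in, not the real `DapClient`: the `ReverseRouter` class and its `handle` method are hypothetical names, and socket framing is omitted. Only the response fields (`seq`, `type`, `request_seq`, `command`, `success`, `message`, `body`) come from the DAP specification.

```python
from __future__ import annotations

import itertools
from typing import Any, Callable, Optional

# A reverse handler receives the request arguments and returns a body (or None).
Handler = Callable[[dict[str, Any]], Optional[dict[str, Any]]]


class ReverseRouter:
    """Hypothetical stand-in for the request-routing part of a DAP client."""

    def __init__(self) -> None:
        self._handlers: dict[str, Handler] = {}
        self._seq = itertools.count(1)  # client-side sequence numbers

    def register_reverse_handler(self, command: str, handler: Handler) -> None:
        self._handlers[command] = handler

    def handle(self, msg: dict[str, Any]) -> dict[str, Any]:
        """Build the DAP response for one server-to-client request."""
        response: dict[str, Any] = {
            "seq": next(self._seq),      # fresh client-side seq
            "type": "response",
            "request_seq": msg["seq"],   # echo the server's seq back
            "command": msg["command"],
            "success": True,
        }
        handler = self._handlers.get(msg["command"])
        if handler is None:
            # Always answer, so the server is not left waiting.
            response.update(success=False, message="not supported")
            return response
        try:
            body = handler(msg.get("arguments") or {})
        except Exception as exc:
            # A raising handler surfaces as success: false plus the message.
            response.update(success=False, message=str(exc))
            return response
        if body is not None:
            response["body"] = body
        return response
```

Keeping response construction separate from socket I/O is what makes this kind of routing testable without spawning a debugger.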
DapSession — child-session orchestration
----------------------------------------
* Tracks adapter host/port and a `_child_clients` list. The `start()`
path registers `startDebugging` so every adapter that delegates
(currently only vscode-js-debug) gets transparent child-session
support — Python/Go never send the request so the handler stays
dormant for them.
* `_on_start_debugging` opens a fresh TCP connection to the same DAP
server, runs the full DAP handshake on it (initialize / launch (or
attach) / configurationDone) using whatever configuration the parent
passed in, registers the handler recursively (workers / child_process
nest deeper), and appends the new client to `_child_clients`. The
handler runs on the parent's reader thread; it MUST only do I/O on
the child connection it just opened to avoid reader-thread deadlock.
* `_active_client` is the client that owns the live debuggee. It starts
as the parent and gets promoted to the child that just emitted
`stopped`, so `continue_` / `step` / `evaluate` / `set_breakpoints`
all route to the right place via `_require_client`.
* `wait_for_stop` now round-robin-polls the parent and every child
client via `_poll_any_client`. Terminal events drain across all live
clients.
* `release` disconnects child clients before the parent so they don't
leak; the parent tree-kill remains the unconditional fallback.
Node.js: alpha → fully live
---------------------------
* NodeAdapter docstring updated — drops the "handshake-only" caveat.
TypeScript via ts-node/tsx, plus worker threads and `child_process`
children, all flow through the nested-session mechanism (handler is
registered recursively on child clients).
* `tests/integration/test_node_session.py::test_node_dap_launch_stops`
drops `@xfail`. Now goes through `DapSession.start()` (not raw
DapClient) so the handler fires. Verified end-to-end against real
vscode-js-debug v1.117.0 on Windows: launch → stopOnEntry → continue
→ terminated, ~6 seconds.
Tests
-----
* 4 new unit tests in `tests/unit/test_dap_reverse_requests.py` cover
the DapClient routing in isolation (no debugger spawn): unknown
reverse request → `success: false`; handler return value → response
body; handler exception → `success: false` + message; response seq
is distinct from request_seq.
* Full suite: 158 passed locally (previously 152 + 1 xfailed). 0
failures, 0 xfails. ruff + ruff format + mypy --strict all clean.
CLAUDE.md updated to describe the reverse-request mechanism alongside
the rest of the DAP plumbing.
Sources / spec references this implementation followed:
* DAP spec `startDebugging` reverse request:
https://microsoft.github.io/debug-adapter-protocol/specification#Reverse_Requests_StartDebugging
* vscode-js-debug's child-session model is the same one VS Code's
debug-adapter client implements; this PR mirrors that contract.
…ames, js-debug version sort
Fixes from the consolidated three-language review (parallel subagent
review of the Python path, Go adapter, and Node/DAP core).
BLOCKER — DapSession child-session data race (core/dap_session.py)
``_on_start_debugging`` runs on the parent client's reader thread and
mutated ``_child_clients`` / ``_active_client`` while the main thread
read them in ``_poll_any_client`` / ``release`` / ``_require_client``.
Concrete hazards: a ``startDebugging`` racing ``release`` could slip a
child past the teardown loop (leaked DAP connection + debuggee) or
resurrect a just-cleared list. Now guarded by ``_clients_lock``:
* all reads/writes of both fields take the lock;
* ``release`` snapshots-and-clears under the lock, then disconnects;
* ``_on_start_debugging`` checks ``_state`` under the lock when
publishing the child and disconnects it instead of resurrecting a
released session.
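The locking pattern described here (snapshot-and-clear under the lock, disconnect outside it, refuse to publish into a released session) can be sketched as follows. `ChildRegistry`, `FakeChild`, and the `_released` flag are hypothetical names for illustration; the real code keeps these fields on `DapSession` and checks `_state`.

```python
import threading


class FakeChild:
    """Stand-in for a child DAP client; only tracks whether it was disconnected."""

    def __init__(self) -> None:
        self.disconnected = False

    def disconnect(self) -> None:
        self.disconnected = True


class ChildRegistry:
    """Hypothetical reduction of the child-client bookkeeping to its locking."""

    def __init__(self) -> None:
        self._clients_lock = threading.Lock()
        self._child_clients = []
        self._released = False

    def publish(self, child) -> bool:
        """Reader-thread side: add a child unless the session was released."""
        with self._clients_lock:
            accepted = not self._released
            if accepted:
                self._child_clients.append(child)
        if not accepted:
            # Do not resurrect a released session; drop the late child instead.
            child.disconnect()
        return accepted

    def release(self) -> int:
        """Main-thread side: snapshot and clear under the lock, then disconnect."""
        with self._clients_lock:
            self._released = True
            snapshot, self._child_clients = self._child_clients, []
        for child in snapshot:  # network I/O happens outside the lock
            child.disconnect()
        return len(snapshot)
```

Disconnecting outside the lock keeps the critical section short, so a slow socket close cannot stall the reader thread while it waits to publish.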
BLOCKER — Go method-receiver func names truncated (adapters/go.py)
``_FUNC_RE``'s non-greedy func group stopped at the first ``(``, so a
pointer-receiver frame ``github.com/x/y.(*Server).Handle(0x..)`` parsed
as func ``github.com/x/y.`` — garbage for any panic inside a method
(the common case). Switched to a greedy ``\S+`` that backtracks to the
final argument-parens, keeping embedded ``(*Server)`` in the name.
Regression test + ``go_method_panic.txt`` fixture added.
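The truncation and its fix can be reproduced with two small patterns. Both regexes below are reconstructions for illustration, not the actual `_FUNC_RE` from `adapters/go.py`; the sample frame mirrors the one quoted above with made-up argument values.

```python
import re

# Reconstruction of the buggy shape: a non-greedy name stops at the first "(".
OLD_FUNC_RE = re.compile(r"^(?P<func>\S+?)\((?P<args>.*)\)\s*$")

# Greedy name: \S+ backtracks to the last "(" whose contents hold no parens,
# so an embedded "(*Server)" stays part of the function name.
NEW_FUNC_RE = re.compile(r"^(?P<func>\S+)\((?P<args>[^()]*)\)\s*$")


def func_name(pattern, line):
    """Return the captured function name, or None when the line does not match."""
    match = pattern.match(line.strip())
    return match.group("func") if match else None


frame = "github.com/x/y.(*Server).Handle(0xc000010000, 0x1)"
old_name = func_name(OLD_FUNC_RE, frame)  # truncated at the receiver
new_name = func_name(NEW_FUNC_RE, frame)  # full method name
```

Plain functions such as `main.main()` parse identically under both patterns, which is why only panics inside methods exposed the bug.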
SHOULD-FIX — js-debug extension version sort (adapters/node.py)
``_latest_js_debug_extension`` sorted dirs lexicographically, so
``1.9.0`` beat ``1.10.0`` (``'9' > '1'``). Added ``_extension_version_key``
parsing the trailing semver into an int tuple. Two unit tests added.
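A minimal sketch of such a sort key follows. The function name without the leading underscore, the `(-1,)` fallback, and the sample directory names are assumptions for the example, not taken from `adapters/node.py`.

```python
import re

# Match a trailing "major.minor.patch" at the end of a directory name.
_SEMVER_TAIL = re.compile(r"(\d+)\.(\d+)\.(\d+)$")


def extension_version_key(dirname: str) -> tuple:
    """Parse the trailing semver into an int tuple so 1.10.0 sorts above 1.9.0."""
    match = _SEMVER_TAIL.search(dirname)
    if match is None:
        return (-1,)  # assumed fallback: unparseable names sort lowest
    return tuple(int(part) for part in match.groups())


# Hypothetical extension directory names, used only to show the ordering.
dirs = ["ms-vscode.js-debug-1.9.0", "ms-vscode.js-debug-1.10.0"]
lexicographic_latest = max(dirs)                         # picks 1.9.0 (the bug)
numeric_latest = max(dirs, key=extension_version_key)    # picks 1.10.0
```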
Honesty/scope fixes (adapters/node.py)
* NodeAdapter docstring no longer claims worker-thread / child_process
debugging "works"; it states the validated path is single-process and
that multi-process lifecycle (surviving the first child exit) is
future work — matching what wait_for_stop actually does today.
* Corrected the fabricated "Preview Wildcat-Analytics" gloss on the
``pwa-node`` type id to its real Progressive-Web-App origin.
Validation: 161 passed (158 + 3 new regression tests), ruff + ruff
format + mypy --strict all clean. Real-adapter integration tests
(debugpy, dlv dap, vscode-js-debug v1.117.0) all green on Windows.
This was referenced May 29, 2026
… real-user-flow tests
Live-debugging all three languages as a user would (session start
--break-at → eval → continue → release) surfaced two Node-only bugs that
the existing tests missed because the Node integration test only did
launch → stopOnEntry → continue → terminate — it never set a launch
breakpoint or evaluated at a stop.
BUG 1 — launch-time breakpoints never bound (Node)
vscode-js-debug runs the launched program in a CHILD DAP session, but
DapSession.start set breakpoints on the PARENT connection before the
child existed, so they came back "unresolved" and the program ran to
completion. `session start --break-at app.js:N` returned status
"terminated"; `diagnose --lang node` rerun never stopped.
Fix: Adapter gains `delegates_launch_to_child` (True only for Node).
When set, DapSession defers launch breakpoints and replays them on the
child during its handshake in `_on_start_debugging` (after `initialized`,
before `configurationDone`). Single-connection adapters (debugpy, dlv)
are unchanged.
BUG 2 — eval/inspect failed at a child-session breakpoint (Node)
`session_proc._handle_eval` and `_build_inspect_context` resolved the
stopped frame from `session.client` (the PARENT), which has no stopped
thread for a js-debug child stop — so no frameId reached the child and
js-debug returned "evaluate: request failed".
Fix: added `DapSession.active_client` (parent for Python/Go, the stopped
child for Node) and routed eval frame-resolution + inspect through it.
Why the tests didn't catch it / coverage added (real user flows):
* tests/e2e/test_cli_session_node.py — start --break-at → assert stopped
at the line → eval a local (asserts the array + the int) → continue →
release. Drives the full CLI + daemon + child-session path. This is the
exact flow whose live run previously returned "terminated" and
"evaluate: request failed".
* tests/e2e/test_cli_session_go.py — the same flow for Go (dlv), so
breakpoint+eval coverage is symmetric across Python (pre-existing
test_cli_session_ops), Go, and Node.
* unit guards (no toolchain needed): only Node delegates_launch_to_child;
active_client defaults to parent then follows the published child.
Verified live on Windows after the fix:
- Node: `session start --break-at buggy.js:3` → stopped (breakpoint),
`eval nums` → "(3) [10, 20, 30]", `eval total` → 60; `diagnose -- node
buggy.js` rerun now stops at the deepest frame (line 10).
- Python + Go: unchanged, full flows still pass.
Suite: 164 passed + 1 known debugpy adapter-spawn flake (passes in
isolation; documented thread-init race). ruff + mypy --strict clean.
Rewrote the in-repo `skills/debug-agent/` skill to cover Go and Node.js
alongside Python. Every command, flag, and output in the skill was
verified by running `dbga` LIVE against real programs in all three
languages (debugpy, dlv dap, vscode-js-debug) — not against source code.
Authored via subagents loading the skill-creator skill, constrained to a
live-evidence corpus as the sole source of truth.
SKILL.md:
* description + title broadened to Python · Go · Node.js/TypeScript.
* New Languages table: per-language toolchain prerequisite, install
command, and auto-detected extensions; `--lang` + extension
auto-detection explained.
* diagnose / localize / session examples for all three languages with
the exact observed outputs (error types, deepest frames, eval results).
* Honest Limits section: Node validated path is single-process; eval is
language-native (three distinct value formattings shown); diagnose
reuses the `default` session name.
* Stripped command syntax that wasn't live-verified this pass
(run/watch/instrument/step/--listen) down to reference pointers rather
than asserting unverified invocations.
references/:
* localization.md — Go panic / Node V8 / Python traceback examples with
real localize+diagnose outcomes; diagnose session_exists note.
* debugger.md — same-flow-three-languages eval block; the SAME `nums`
array shown printing three ways (Python `[10, 20, 30]`, Go
`[]int len: 3, cap: 3, [10,20,30]`, Node `(3) [10, 20, 30]`).
* vscode-collab.md — removed fabricated `--listen` attach-URL / launch.json
examples (never exercised live); replaced with an explicit
"not yet captured live" caveat + accurate prerequisites only.
* instrumentation.md — probes are Python-flavored; insert/snapshot/revert
is language-agnostic text editing.
* workflow.md / advanced.md — minor factual notes.
Evidence corpus lives under tmp/ (gitignored) and is not committed.
Windows CI flaked on `test_session_continue_to_termination` with
"python DAP adapter exited with code 0 before listening ... Exception in
thread Thread-1 (accept_worker)". This is debugpy's known adapter-startup
race (the same thread-init race CLAUDE.md documents for
CREATE_NEW_PROCESS_GROUP): under back-to-back launches the adapter's
accept_worker thread crashes on init, so the adapter exits before it ever
listens. Pre-existing flake, but this PR's extra adapter-spawning tests
made it surface reliably on Windows CI.
Fix (not suppression): `open_adapter_connection` spawns the adapter and
connects with a bounded retry. On "exited before listening" / timeout it
tree-kills the corpse and respawns on a fresh port, up to 3 attempts.
`DapSession.start` now uses it. Benefits all adapters (debugpy, dlv,
js-debug) equally; single transient startup crash no longer fails a
session.
TDD: tests/unit/test_adapter_connect_retry.py drives the orchestration
through a clean seam (fake adapter + monkeypatched find_free_port /
wait_until_listening / kill_tree): retry-then-succeed,
exhaust-then-raise, and happy-path-no-kill. Written failing first, then
implemented to green.
Verified: full suite 168 passed; the integration session suite run 5×
back-to-back is clean (previously intermittent). ruff + mypy clean.
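The bounded-retry orchestration can be sketched with the spawn and kill steps injected as callables, which is also the kind of seam the unit tests rely on. `connect_with_retry`, `AdapterStartupError`, and the two callback parameters are hypothetical names; the real entry point is `open_adapter_connection`, whose signature is not shown in this PR.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class AdapterStartupError(RuntimeError):
    """Hypothetical error for an adapter that exited before listening."""


def connect_with_retry(
    spawn_and_connect: Callable[[], T],
    cleanup: Callable[[], None],
    attempts: int = 3,
) -> T:
    """Try to start the adapter up to `attempts` times, cleaning up each failure."""
    last_error: Optional[BaseException] = None
    for _ in range(attempts):
        try:
            # The callable is expected to pick a fresh port on every call.
            return spawn_and_connect()
        except (AdapterStartupError, TimeoutError) as exc:
            last_error = exc
            cleanup()  # e.g. tree-kill the dead adapter before respawning
    raise AdapterStartupError(
        f"adapter failed to start after {attempts} attempts"
    ) from last_error
```

Only startup failures are retried; any other exception propagates immediately, so a genuine bug in the launch path is not masked by the retry loop.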
`test_session_start_listen_returns_attach_url` intermittently failed on
Ubuntu CI (`_port_listening(...) == False`). Root cause: the test opened
a SECOND TCP connection to a debugpy `--listen --wait-for-client`
listener to re-confirm it was up, but that listener accepts a single
client, so a throwaway probe can perturb it, and the check was redundant
anyway: `_spawn_listen_mode` already gates `status: listening` on the
port accepting (it waits up to 10s before returning).
Replaced the racy re-probe with a listener-process liveness assertion
(`is_pid_alive`), which, together with the asserted contract fields
(status / attach_url / port / pid), verifies a usable attach endpoint
without a second connect.
No production change; behavior of `session start --listen` is unchanged.
Listen test now passes deterministically (3× local).
Drop the speculative global ~/.debug-agent/ path from the README — breakpoints and source snapshots reference files in the repo, so state stays project-local. Document adding .debug-agent/ to .gitignore. In the skill, tighten Cleanup to surface `dbga sessions ls` (lists live daemons, reaps dead-pid zombies) and the gitignore note.
Summary
Single consolidated PR adding multi-language debugging to `dbga`: Python (existing, refactored), Go (dlv dap), and Node.js / TypeScript (vscode-js-debug). Supersedes the stacked PRs #2 / #3 / #4; all their commits plus the post-review fixes are here on one branch against `main`.
What's in it
1. Language Adapter abstraction (`adapters/base.py` + registry)
Every language-specific decision (DAP-server spawn, launch payload, traceback parser, interpreter peeling, IDE-attach, instrument templates) lives behind an `Adapter` ABC. Adding a language = subclass + register. `--lang` flag on `session start` / `localize` / `diagnose`, auto-detected from file extension.
2. PythonAdapter: the original debugpy code lifted behind the interface, behavior-identical (verified: all pre-existing tests pass unchanged).
3. GoAdapter (`dlv dap`): `mode:"debug"` launch, panic + fatal-error stack parser (method receivers, runtime-frame classification, oldest-first ordering), `go run` peeling, clear missing-`dlv` error.
4. NodeAdapter (vscode-js-debug): discovery across `$DBGA_JS_DEBUG_SERVER` / VS Code・Cursor・Insiders extension dirs / manual install; V8 stack parser; `node` / `ts-node` / `tsx` peeling with `-r` handling.
5. DAP reverse-request + child-session support (`dap_client.py` + `dap_session.py`)
vscode-js-debug delegates every launched program to a child DAP session via a `startDebugging` reverse-request. `DapClient` now routes server→client requests to registered handlers; `DapSession` opens + tracks child connections, multiplexes their events, and routes ops to the active session. This is the change that makes Node fully live, and it's thread-safe (lock-guarded; see review below).
Review (parallel subagents, 3 languages)
Three reviewers ran in parallel over the Python path, the Go adapter, and the Node/DAP core. They found 2 blockers, both fixed in this branch (commit 7e71480):
* `_child_clients` / `_active_client` mutated on the reader thread, read unlocked on the main thread → could leak a child connection or resurrect a torn-down session. Fixed with `_clients_lock` + snapshot iteration + a released-state guard in the reverse-handler.
* `(*Server).Handle` parsed as `github.com/x/y.` (broke panics inside any method). Fixed the func regex + regression fixture.
Also fixed: js-debug extension version sort (`1.10.0` now beats `1.9.0`), an over-claim in the Node docstring (worker/child_process is now honestly scoped as future work), and a fabricated code comment.
Test plan
* `uv run pytest -v`: 161 passed, 0 xfailed, 0 failures
* … `dlv dap` (delve), vscode-js-debug v1.117.0 (full launch → stopOnEntry → continue → terminated)
* `ruff check` + `ruff format --check` + `mypy --strict src` all clean (30 src files)
* … `dlv`, `node` + js-debug) is absent, so CI stays green on Python-only images
Known scope (honest)
* `child_process` sub-sessions attach (recursive handler), but `wait_for_stop` ends the session on the first child exit; full multi-process lifecycle is future work, documented in the NodeAdapter docstring.
* `go test` debugging out of scope (needs `mode:"test"`).
* `instrument` probe templates remain Python-flavored (passthrough for other langs).
Supersedes
Closes the stacked PRs #2, #3, #4: same commits, consolidated here.