fix(daemon): keep a session alive when its daemon is restarted under it (#662) #713

Merged: colbymchenry merged 1 commit into main from fix/662-proxy-survives-daemon-restart on Jun 6, 2026
@colbymchenry
Owner

Problem (#662)

When multiple sessions run against the same project (reported on opencode / WSL2), starting a new session causes the shared daemon to receive SIGTERM and restart. The new session reconnects, but the previously-running session's proxy dies and never recovers — that session silently loses CodeGraph until it's restarted.

A reproduction confirmed the issue and surfaced a detail the report didn't: a tools/call in flight at the moment the daemon dies hangs with no response (the host waits on a reply that never comes), on top of the proxy exiting and all subsequent calls being lost.

The SIGTERM originates in the host's process-tree teardown, not in CodeGraph — nothing in the engine signals another process (verified). So the fix is proxy resilience, not chasing the signal source.

Fix

The local-handshake proxy now treats a daemon disconnect as recoverable, not terminal:

  • On daemon socket close/error (while the host is still attached), it falls back to its in-process engine for the rest of the session — the same path already used when no daemon is reachable at startup, and exactly what CODEGRAPH_NO_DAEMON=1 does (the reporter's own workaround, now automatic).
  • It tracks requests forwarded to the daemon and re-serves any that were in flight when the daemon died, so the host never hangs. Anything that can't be served locally is answered with an error rather than left hanging.
  • The proxy still exits when the host goes away (stdin close / the #277 PPID watchdog, where #277 is the issue "serve --mcp is not reaped when the parent Claude Code process is SIGKILL'd (Linux)"). Only daemon loss is now non-fatal.
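
The recovery behaviour described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `ResilientProxy`, `DaemonLink`, `LocalEngine`, and every method name are hypothetical, and the use of JSON-RPC error code -32603 (internal error) for requests the local engine cannot serve is an assumption.

```typescript
// Hypothetical message shapes; none of these names come from the CodeGraph source.
type RpcRequest = { id: number; method: string; params?: unknown };
type RpcResponse = {
  id: number;
  result?: unknown;
  error?: { code: number; message: string };
};

// Hypothetical daemon connection: only forwarding is modelled here.
interface DaemonLink {
  send(req: RpcRequest): void;
}

// Hypothetical in-process engine: returns a result or throws if it cannot serve.
type LocalEngine = (req: RpcRequest) => unknown;

class ResilientProxy {
  // Requests forwarded to the daemon that have not been answered yet.
  private inFlight = new Map<number, RpcRequest>();
  private daemon: DaemonLink | null;
  private local: LocalEngine;
  private reply: (res: RpcResponse) => void;

  constructor(daemon: DaemonLink | null, local: LocalEngine, reply: (res: RpcResponse) => void) {
    this.daemon = daemon;
    this.local = local;
    this.reply = reply;
  }

  handleHostRequest(req: RpcRequest): void {
    if (this.daemon) {
      this.inFlight.set(req.id, req);
      this.daemon.send(req);
    } else {
      this.serveLocally(req);
    }
  }

  handleDaemonResponse(res: RpcResponse): void {
    // Ignore replies for requests that were already re-served locally.
    if (this.inFlight.delete(res.id)) this.reply(res);
  }

  // Called on daemon socket close or error. Safe to call more than once.
  handleDaemonLost(): void {
    if (!this.daemon) return;
    this.daemon = null; // every later request goes to the in-process engine
    const pending = [...this.inFlight.values()];
    this.inFlight.clear();
    for (const req of pending) this.serveLocally(req);
  }

  private serveLocally(req: RpcRequest): void {
    try {
      this.reply({ id: req.id, result: this.local(req) });
    } catch (err) {
      // Answer with an error rather than leaving the host waiting.
      const message = err instanceof Error ? err.message : String(err);
      this.reply({ id: req.id, error: { code: -32603, message } });
    }
  }
}

// Usage: one request is in flight when the daemon dies, one arrives afterwards.
const replies: RpcResponse[] = [];
const forwarded: RpcRequest[] = [];
const proxy = new ResilientProxy(
  { send: (req) => { forwarded.push(req); } },
  (req) => {
    if (req.method !== "tools/call") throw new Error("not served locally");
    return { servedBy: "in-process" };
  },
  (res) => { replies.push(res); },
);
proxy.handleHostRequest({ id: 1, method: "tools/call" }); // forwarded, never answered
proxy.handleDaemonLost();                                 // daemon receives SIGTERM
proxy.handleHostRequest({ id: 2, method: "tools/call" }); // post-drop request
```

In this sketch both requests end up answered by the local engine, which mirrors the "answered (in-process)" and "recovered" outcomes reported from the reproduction. Host-side exit (stdin close, PPID watchdog) is deliberately left out.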

Before → after (from the reproduction)

| | Before | After |
|---|---|---|
| Proxy after daemon SIGTERM | exits | stays alive |
| In-flight request | hangs, no response | answered (in-process) |
| Post-drop request | lost | recovered |

Test plan

  • Reproduction (proxy stays alive, in-flight request answered, post-drop request recovers).
  • Regression test in mcp-daemon.test.ts, "proxy survives the daemon dying mid-session and keeps serving (#662)": kills the daemon under a live proxy via SIGTERM, then asserts the proxy is still alive and still answers a subsequent tools/call.
  • macOS full suite green (1243 passed, 2 skipped); the daemon suite + the new test pass on a Windows 11 VM over named pipes.

Also in this PR

Replaces the over-the-wire liveness-sweep test added in #712 — flaky under heavy parallel load (a raced raw-socket connect) — with a deterministic Daemon.reapDeadClients unit test. The client-hello round-trip is still exercised by every daemon test, since the real proxy now sends it.
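
To illustrate why a unit test of the reaping step can be deterministic where a raced raw-socket connect is not, here is a rough sketch. It does not reproduce the real Daemon class: `ClientRegistry`, `IsAlive`, and the idea that each client is tracked by a pid are assumptions made for the example. The point is only that the liveness check is injected, so the test controls it and involves no sockets or timing.

```typescript
// Hypothetical liveness probe; a real one might ask the OS whether a pid exists.
type IsAlive = (pid: number) => boolean;

// Hypothetical stand-in for the daemon's bookkeeping of connected clients.
class ClientRegistry {
  private clients = new Map<string, number>(); // client id -> pid (assumed)
  private isAlive: IsAlive;

  constructor(isAlive: IsAlive) {
    this.isAlive = isAlive;
  }

  register(id: string, pid: number): void {
    this.clients.set(id, pid);
  }

  // Drops every client whose process is gone and returns the reaped ids.
  reapDeadClients(): string[] {
    const reaped: string[] = [];
    for (const [id, pid] of this.clients) {
      if (!this.isAlive(pid)) {
        this.clients.delete(id); // deleting during Map iteration is safe
        reaped.push(id);
      }
    }
    return reaped;
  }

  get size(): number {
    return this.clients.size;
  }
}

// Usage: the fake probe declares pid 4242 dead, so no real process is involved.
const deadPids = new Set<number>([4242]);
const registry = new ClientRegistry((pid) => !deadPids.has(pid));
registry.register("session-a", 4242);
registry.register("session-b", 4343);
const reaped = registry.reapDeadClients();
```

Here `reaped` contains only "session-a" and one client remains registered, regardless of machine load or test parallelism.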

Fixes #662.

🤖 Generated with Claude Code

…it (#662)

When an MCP host (opencode and others) SIGTERMs the shared daemon as a new
session starts, the existing session's proxy used to exit on the dropped socket
— silently losing CodeGraph for that session, and hanging any request in flight
at the drop. The SIGTERM originates in the host's process-tree teardown, not in
CodeGraph (nothing here signals another process), so the fix is proxy
resilience, not chasing the signal.

The local-handshake proxy now treats a daemon disconnect as recoverable rather
than terminal: it falls back to its in-process engine for the rest of the
session (the same path used when no daemon is reachable at startup, and what
CODEGRAPH_NO_DAEMON does) and re-serves any requests that were in flight to the
dead daemon, so the host never hangs. The proxy still exits when the HOST goes
away (stdin close / PPID watchdog) — only daemon loss is now non-fatal.

Also replaces the over-the-wire liveness-sweep test added in #712 — which was
flaky under heavy parallel load (a raced raw-socket connect) — with a
deterministic Daemon.reapDeadClients unit test. The client-hello round-trip is
still exercised by every daemon test (the real proxy now sends it).

Validated with a reproduction (proxy stays alive, in-flight request answered,
post-drop request recovers) and a regression test in mcp-daemon.test.ts.
Confirmed on macOS (full suite green) and a Windows 11 VM.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@treytracedit-lab

Git config--C++/test

Development

Successfully merging this pull request may close these issues.

Bug: Daemon receives SIGTERM when new MCP session starts, orphaning existing session's proxy

2 participants