fix(mcp): don't block initialize handshake on heavy init (#172) #177

Merged
colbymchenry merged 1 commit into main from fix/mcp-initialize-blocking on May 19, 2026

@colbymchenry
Owner

Summary

Fixes #172 — codegraph MCP tools silently failing to appear in Claude Code on slow filesystems.

Root cause

handleInitialize in src/mcp/index.ts awaited this.tryInitializeDefault(projectPath) before calling sendResult. That await chain runs CodeGraph.open() → await initGrammars() → Parser.init() (the tree-sitter WASM runtime bootstrap). On Docker Desktop VirtioFS (macOS) and WSL2, that bootstrap can take longer than Claude Code's ~30 s handshake timeout. Symptom: the codegraph process is alive (ps shows it) and the initialize message arrived on its stdin, but no JSON-RPC response was sent before the client timed out, so tools never appeared and the UI showed no error.

Fix

  • Send the initialize response immediately.
  • Kick off tryInitializeDefault as a tracked background promise (this.initPromise).
  • retryInitIfNeeded becomes async and awaits the in-flight promise so we never open the SQLite file twice concurrently.
  • Tool-call sites updated to await this.retryInitIfNeeded().
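The four steps above can be sketched roughly as follows. This is a minimal stand-in, not the code in src/mcp/index.ts: the class `SketchServer`, the helper `startInit`, the `sent` array (standing in for stdout), the `openCount` counter, and the timer that simulates the slow open are all assumptions made for illustration. Only the names `handleInitialize`, `tryInitializeDefault`, `retryInitIfNeeded`, `sendResult`, and `initPromise` come from the description above.

```typescript
type JsonRpcResult = { jsonrpc: "2.0"; id: number; result: unknown };

// Hypothetical stand-in for the MCP server; bodies are illustrative only.
class SketchServer {
  sent: JsonRpcResult[] = []; // stands in for writes to stdout
  openCount = 0;              // how many times the heavy open ran
  ready = false;
  private initPromise: Promise<void> | null = null;
  private heavyInitMs: number;

  constructor(heavyInitMs: number) {
    this.heavyInitMs = heavyInitMs;
  }

  private sendResult(id: number, result: unknown): void {
    this.sent.push({ jsonrpc: "2.0", id, result });
  }

  // Simulates the slow open (database + grammar bootstrap) with a timer.
  private async tryInitializeDefault(): Promise<void> {
    this.openCount += 1;
    await new Promise<void>((resolve) => setTimeout(resolve, this.heavyInitMs));
    this.ready = true;
  }

  // Returns the in-flight init if there is one; otherwise starts a new one.
  private startInit(): Promise<void> {
    let p = this.initPromise;
    if (!p) {
      p = this.tryInitializeDefault().then(
        () => { this.initPromise = null; },
        () => { this.initPromise = null; }, // failure: ready stays false, a later call retries
      );
      this.initPromise = p;
    }
    return p;
  }

  handleInitialize(id: number): void {
    // Reply first so the client's handshake timer stops.
    this.sendResult(id, { capabilities: { tools: {} } });
    // Then start the heavy work in the background, keeping a handle on it.
    void this.startInit();
  }

  async retryInitIfNeeded(): Promise<void> {
    if (this.ready) return;
    await this.startInit(); // joins the in-flight promise instead of opening again
  }

  async handleToolsList(id: number): Promise<void> {
    await this.retryInitIfNeeded();
    this.sent.push({ jsonrpc: "2.0", id, result: { tools: this.ready ? ["codegraph_search"] : [] } });
  }
}
```

In this sketch, two `tools/list` requests that arrive while the background init is still running both wait on the same promise, so the simulated open runs once.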

Why the diagnosis is solid

Two independent reports converged on the same wire evidence:

  • @sashanclrp (WSL2) reported "tools don't appear" and proposed a Content-Length framing fix.
  • I verified on macOS and Linux (Docker) that Claude Code uses newline-delimited JSON — framing wasn't the cause.
  • @sgrimm posted a wire capture from a Docker-on-Mac sandbox proving the initialize message was received but no response was ever sent, with the child still running 30 s later. That points squarely at a blocking await in the initialize handler, which is what this PR removes.
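For readers unfamiliar with the two framings discussed above: newline-delimited JSON puts one complete JSON-RPC message on each line, with no Content-Length header. A minimal sketch of a reader for that framing is shown below. `LineFramer` is a hypothetical name for illustration and is not taken from the codegraph source.

```typescript
// Hypothetical helper: buffers raw stdin chunks and returns one parsed
// message per complete line. Incomplete trailing data stays buffered.
class LineFramer {
  private buffer = "";

  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const messages: unknown[] = [];
    let newline = this.buffer.indexOf("\n");
    while (newline !== -1) {
      const line = this.buffer.slice(0, newline).trim();
      this.buffer = this.buffer.slice(newline + 1);
      if (line.length > 0) {
        messages.push(JSON.parse(line)); // throws on malformed JSON
      }
      newline = this.buffer.indexOf("\n");
    }
    return messages;
  }
}

// A message split across two chunks is only emitted once its newline arrives.
const framer = new LineFramer();
framer.push('{"jsonrpc":"2.0","id":1,');
const parsed = framer.push('"method":"initialize"}\n');
console.log(parsed.length);
```

A server that parses input this way already receives the initialize message intact, which is consistent with the wire capture: the request arrived, and only the response was missing.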

Test plan

  • npm test — all 577 tests pass, including the new __tests__/mcp-initialize.test.ts.
  • Regression-catching ordering test: spawns the actual dist/bin/codegraph.js serve --mcp subprocess, asserts the JSON-RPC response on stdout arrives before startWatching's "File watcher active" log on stderr. Verified the test fails when the fix is reverted (expected 1 to be less than 0).
  • End-to-end verification with the official @modelcontextprotocol/sdk client (the SDK Claude Code embeds): handshake in 48 ms, 9 tools listed, codegraph_search callTool succeeds.
  • Reporter verification on the original WSL2 / Docker-on-Mac setups (not blocking: the fix is logically correct, and behavior on fast filesystems is unchanged).
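The ordering assertion in the regression test can be pictured with a small sketch. This is not the contents of __tests__/mcp-initialize.test.ts. It assumes a harness that records stdout and stderr output from the subprocess into one array in arrival order, and the type `StreamEvent` and function `orderingIndices` are hypothetical names.

```typescript
type StreamEvent = { stream: "stdout" | "stderr"; text: string };

// Returns the positions of the first JSON-RPC result on stdout and the first
// watcher log on stderr, or -1 where the corresponding event never appeared.
function orderingIndices(events: StreamEvent[]): { response: number; watcher: number } {
  const response = events.findIndex(
    (e) => e.stream === "stdout" && e.text.includes('"result"'),
  );
  const watcher = events.findIndex(
    (e) => e.stream === "stderr" && e.text.includes("File watcher active"),
  );
  return { response, watcher };
}

// With the fix, the response is recorded before the watcher log.
const fixedRun: StreamEvent[] = [
  { stream: "stdout", text: '{"jsonrpc":"2.0","id":1,"result":{}}' },
  { stream: "stderr", text: "File watcher active" },
];

// With the fix reverted, the order flips.
const revertedRun: StreamEvent[] = [
  { stream: "stderr", text: "File watcher active" },
  { stream: "stdout", text: '{"jsonrpc":"2.0","id":1,"result":{}}' },
];

const fixedIdx = orderingIndices(fixedRun);       // response 0, watcher 1
const revertedIdx = orderingIndices(revertedRun); // response 1, watcher 0
console.log(fixedIdx.response < fixedIdx.watcher, revertedIdx.response < revertedIdx.watcher);
```

The reverted case gives a response index of 1 and a watcher index of 0, which matches the shape of the failure message quoted above (expected 1 to be less than 0). Because the check compares order rather than elapsed time, it does not depend on how slow the filesystem is.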

Credit

  • @sashanclrp — original report with detailed environment, symptoms, and a working bridge as a workaround.
  • @sgrimm — independent reproduction and the wire capture that isolated the real root cause.

🤖 Generated with Claude Code

The MCP `initialize` handler was awaiting `tryInitializeDefault` —
which opens the SQLite DB and runs `await initGrammars()` (tree-sitter
WASM bootstrap) — before sending the JSON-RPC response. On slow
filesystems (Docker Desktop VirtioFS on macOS, WSL2) this could exceed
Claude Code's ~30s handshake timeout, leaving the codegraph child
process alive and unresponsive with no tools visible in the client.

Send the response first; defer the open to a tracked background
promise. The lazy retry path used by `tools/list` and `tools/call`
now awaits that promise instead of racing it with `openSync`, so we
never double-open the SQLite file.

Adds a subprocess-based regression test that asserts the JSON-RPC
response arrives on stdout before `startWatching()` logs to stderr.
This ordering check catches the regression on any filesystem, not
just slow ones where the timing matters in practice.

Reported by @sashanclrp; isolated by @sgrimm's wire capture.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

MCP server fails to connect with Claude Code: Content-Length framing vs newline-delimited JSON transport mismatch