Merged
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -23,6 +23,7 @@ and adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

- On Windows, CodeGraph's background processes no longer pile up without bound and saturate CPU over a long session. When the editor or agent that launched CodeGraph exited, its helper process couldn't tell its parent had gone — Windows reports process lineage differently than macOS and Linux — so the helper kept running, the shared background server never saw the client disconnect, and its idle timer never fired to shut it down. CodeGraph now detects parent-process exit directly on Windows, so helpers and the idle background server wind down promptly, the same as they already did on macOS and Linux. (#692, #576, #680)
- The shared background server has two further safeguards against ever lingering: it now drops a client the moment it detects that client's process is gone (even if the disconnect arrived uncleanly — a force-quit or a dropped connection that never closed the socket), and it won't stay running indefinitely with clients attached but no activity. Together these guarantee it always winds down, on every platform. (#692)
- A session no longer loses CodeGraph when the shared background server is restarted out from under it — for example when your MCP host (opencode and others) stops and restarts the server as you open another session. Previously the affected session's connection died silently and any request in flight at that moment hung; now CodeGraph keeps that session working by serving it locally, so the tools stay available without restarting the session. (#662)
- React Native native→JS events now connect through the common `sendEvent(context, "X", body)` wrapper. Many libraries (react-native-device-info and others) wrap the event emitter behind a helper whose `.emit(eventName, …)` takes a *variable*, so the matcher — which looked for `.emit("literal", …)` — missed it; the literal event name actually lives in the wrapper call. Now a native method that fires `sendEvent(…, "batteryLevelChanged", …)` links to the JS `addListener('batteryLevelChanged', …)` handler, so editing the native emitter surfaces the JS subscriber. (React Native)
- React Native / Expo cross-language bridges are more complete and more precise. An Expo Module method declared with a generic type — Android's `AsyncFunction<Float>("getBatteryLevelAsync")` — is now indexed (the `<Float>` used to defeat the matcher, so every Android Expo method was dropped and a JS call resolved only to the iOS Swift impl). The iOS and Android implementations of the same JS-visible method — both Expo Modules and classic NativeModules (`@ReactMethod` on Android, the matching method on iOS) — are now linked to each other, so a JS call that resolves to one platform still reaches the other and editing either platform's native code surfaces the JS caller. And a `Type.member` static read in native code (e.g. Android's `BatteryManager.EXTRA_LEVEL`) no longer falsely links to a coincidentally same-named class in another language (a web `BatteryManager`) — type references stay within a language family, while genuine cross-language bridges (config→code, JS↔native calls) are unaffected. (React Native, Expo)
- A TypeScript/JavaScript reference or import no longer gets mis-linked to a same-named class in a native language. In a React Native / Expo repo that has both a TypeScript `TestRunner` type and a Kotlin `TestRunner` class, a TS reference to `TestRunner` — or an `import React` sitting next to a Swift `React` — used to resolve onto the native symbol (the component resolver matched any same-named class regardless of language, and import statements weren't language-checked at all). References and imports now stay within their language family, so they land on the right symbol while genuine cross-language bridges (JS↔native calls, config→code) are untouched. A C/C++ `#include "Foo.h"` likewise no longer resolves to a same-named header from another platform (an iOS Objective-C `Foo.h`). (React Native, Expo, TypeScript, C/C++)
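The parent-exit detection described in the first entry above can be pictured as a liveness poll. The following is a minimal sketch under stated assumptions, not CodeGraph's actual implementation: `isProcessAlive` and `checkParent` are hypothetical names introduced here, and the only real API relied on is Node's `process.kill(pid, 0)`, which tests whether a process exists without delivering a signal.

```typescript
type IsAlive = (pid: number) => boolean;

// Hypothetical helper: signal 0 delivers nothing, it only asks whether the pid exists.
function isProcessAlive(pid: number): boolean {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM: the process exists but we may not signal it, so treat it as alive.
    return (err as { code?: string }).code === 'EPERM';
  }
}

// Hypothetical one-shot check that a helper process could run on a timer.
// Returns true (and invokes onParentExit) when the parent is gone.
function checkParent(
  parentPid: number,
  onParentExit: () => void,
  isAlive: IsAlive = isProcessAlive,
): boolean {
  if (isAlive(parentPid)) return false;
  onParentExit();
  return true;
}
```

A helper would presumably call something like `checkParent` with the launching process's pid from an unref'd interval. The `CODEGRAPH_PPID_POLL_MS` variable used in the tests hints that such a poll interval is configurable, but that reading is an inference from the diff rather than something it states.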
43 changes: 42 additions & 1 deletion __tests__/daemon-client-liveness.test.ts
@@ -7,7 +7,7 @@
* the full handshake + sweep is exercised end-to-end in `mcp-daemon.test.ts`.
*/
import { describe, it, expect } from 'vitest';
import { parseClientHelloLine, peerIsDead } from '../src/mcp/daemon';
import { Daemon, parseClientHelloLine, peerIsDead } from '../src/mcp/daemon';

describe('parseClientHelloLine', () => {
it('parses a well-formed client-hello', () => {
@@ -67,3 +67,44 @@ describe('peerIsDead', () => {
expect(peerIsDead({ pid: 100, hostPid: 42 }, aliveAll)).toBe(false);
});
});

describe('Daemon.reapDeadClients', () => {
// Construct with idleTimeoutMs:0 so dropping the last client doesn't arm a real
// idle timer. The constructor opens no sockets/DB, so this stays a fast unit test.
const makeDaemon = () => new Daemon('/tmp/codegraph-reap-unit-test', { idleTimeoutMs: 0 }) as any;
const fakeSession = () => ({ stopped: false, stop() { this.stopped = true; } });

it('drops clients with a dead peer and leaves live ones attached', () => {
const d = makeDaemon();
const dead = fakeSession();
const live = fakeSession();
d.clients.add(dead); d.clientPeers.set(dead, { pid: 111, hostPid: null });
d.clients.add(live); d.clientPeers.set(live, { pid: 222, hostPid: null });

const reaped = d.reapDeadClients((pid: number) => pid !== 111); // 111 dead, 222 alive

expect(reaped).toBe(1);
expect(dead.stopped).toBe(true);
expect(d.clients.has(dead)).toBe(false);
expect(d.clientPeers.has(dead)).toBe(false); // peer record cleaned up too
expect(d.clients.has(live)).toBe(true);
});

it('never reaps a client with an unknown pid (no client-hello)', () => {
const d = makeDaemon();
const s = fakeSession();
d.clients.add(s); d.clientPeers.set(s, { pid: null, hostPid: null });

expect(d.reapDeadClients(() => false)).toBe(0); // everything "dead", but pid unknown
expect(d.clients.has(s)).toBe(true);
});

it('reaps a client whose host pid is gone even if its proxy pid is alive', () => {
const d = makeDaemon();
const s = fakeSession();
d.clients.add(s); d.clientPeers.set(s, { pid: 100, hostPid: 42 });

expect(d.reapDeadClients((pid: number) => pid !== 42)).toBe(1); // proxy 100 alive, host 42 dead
expect(d.clients.has(s)).toBe(false);
});
});
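The three cases above pin down the reaping rule fairly tightly. As a reading aid, here is a minimal sketch of a registry that would satisfy them. It is an assumption-laden stand-in, not the real `Daemon` class: `ClientRegistrySketch` and `peerIsDeadSketch` are hypothetical names, and the treatment of a missing peer record is a guess.

```typescript
interface Peer { pid: number | null; hostPid: number | null; }
interface Session { stop(): void; }
type IsAlive = (pid: number) => boolean;

// Hypothetical stand-in for peerIsDead: an unknown pid is never treated as dead.
function peerIsDeadSketch(peer: Peer, isAlive: IsAlive): boolean {
  if (peer.pid === null) return false;            // no client-hello received
  if (!isAlive(peer.pid)) return true;            // the proxy process itself is gone
  return peer.hostPid !== null && !isAlive(peer.hostPid); // proxy alive, host gone
}

// Hypothetical stand-in for the daemon's client bookkeeping.
class ClientRegistrySketch {
  readonly clients = new Set<Session>();
  readonly clientPeers = new Map<Session, Peer>();

  // Stops and forgets every client whose peer is dead; returns how many were reaped.
  reapDeadClients(isAlive: IsAlive): number {
    let reaped = 0;
    // Copy first so deleting from the Set during the loop is safe.
    for (const session of Array.from(this.clients)) {
      const peer = this.clientPeers.get(session);
      if (!peer || !peerIsDeadSketch(peer, isAlive)) continue; // assumption: no record, no reap
      session.stop();
      this.clients.delete(session);
      this.clientPeers.delete(session);
      reaped++;
    }
    return reaped;
  }
}
```

Injecting `isAlive` is what lets the unit test run without real processes: the test passes a predicate over pids instead of probing the operating system.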
86 changes: 35 additions & 51 deletions __tests__/mcp-daemon.test.ts
@@ -143,16 +143,6 @@ function readLockPid(root: string): number | null {
} catch { return null; }
}

/** The socket path the daemon actually bound, as it recorded in its lockfile —
* robust on Windows where a recomputed pipe path can differ from the daemon's. */
function readLockSocketPath(root: string): string | null {
try {
const raw = fs.readFileSync(path.join(root, '.codegraph', 'daemon.pid'), 'utf8');
const info = JSON.parse(raw);
return typeof info.socketPath === 'string' ? info.socketPath : null;
} catch { return null; }
}

function readDaemonLog(root: string): string {
try { return fs.readFileSync(path.join(root, '.codegraph', 'daemon.log'), 'utf8'); }
catch { return ''; }
@@ -369,47 +359,11 @@ describe('Shared MCP daemon (issue #411)', () => {
}
}, 30000);

it('reaps a client whose process died without the socket closing (liveness sweep, #692)', async () => {
const net = await import('net');
// Bring a daemon up via a real proxy (a live client), sweep fast.
const env = { CODEGRAPH_DAEMON_IDLE_TIMEOUT_MS: '30000', CODEGRAPH_DAEMON_CLIENT_SWEEP_MS: '300' };
const server = spawnServer(tempDir, env);
servers.push(server);
sendInitialize(server.child, `file://${tempDir}`, 1);
await waitFor(() => findResponse(server.stdout, 1), 10000);
await waitFor(() => (readLockPid(realRoot) ?? 0) > 0, 8000);

// Connect a RAW client that announces a dead pid and then never closes its
// socket — the exact phantom-client shape the sweep exists to catch. Use the
// socket path the daemon recorded in its lockfile (robust on Windows, where
// a recomputed named-pipe path can differ from the one the daemon bound).
const sockPath = await waitFor(() => readLockSocketPath(realRoot), 8000);
const raw = net.createConnection(sockPath);
raw.on('error', () => { /* ignore — we destroy it ourselves */ });
try {
// Consume the daemon hello (one line), then send our client-hello.
// Generous timeouts: the unref'd sweep interval can stretch under a busy
// event loop (engine init / a loaded CI box), so don't race it tight.
await new Promise<void>((resolve, reject) => {
let buf = '';
const to = setTimeout(() => reject(new Error('no daemon hello within 15s')), 15000);
raw.on('data', (c: Buffer) => {
buf += c.toString('utf8');
if (buf.includes('\n')) { clearTimeout(to); resolve(); }
});
});
raw.write(JSON.stringify({ codegraph_client: 1, pid: 999_999, hostPid: null }) + '\n');

// The sweep should detect pid 999999 is dead and reap that client.
await waitFor(
() => readDaemonLog(realRoot).includes('Reaping client with dead peer (pid 999999'),
15000,
);
} finally {
raw.destroy();
}
}, 60000);

// The over-the-wire client-hello → record → sweep path is covered by the
// deterministic `Daemon.reapDeadClients` unit test in daemon-client-liveness
// (a raw-socket variant here was flaky under heavy parallel load), plus the
// client-hello round-trip exercised by every test above (the real proxy now
// sends it). What stays here is the lifecycle behavior that needs real procs.
it('exits on the inactivity backstop even while a client stays connected (#692)', async () => {
// Backstop short, idle timeout long: with a client connected the idle timer
// never arms, so only the inactivity backstop can take the daemon down.
@@ -445,4 +399,34 @@ describe('Shared MCP daemon (issue #411)', () => {
expect(await waitProcessExit(daemonPid, 10000)).toBe(true);
expect(fs.existsSync(path.join(realRoot, '.codegraph', 'daemon.pid'))).toBe(false);
}, 30000);

it('proxy survives the daemon dying mid-session and keeps serving (#662)', async () => {
// The #662 scenario: an MCP host SIGTERMs the shared daemon while a session
// is live. The proxy must NOT exit (losing CodeGraph for that session) — it
// falls back to an in-process engine and keeps answering.
const env = { CODEGRAPH_DAEMON_IDLE_TIMEOUT_MS: '30000', CODEGRAPH_PPID_POLL_MS: '5000' };
const server = spawnServer(tempDir, env);
servers.push(server);
sendInitialize(server.child, `file://${tempDir}`, 1);
await waitFor(() => findResponse(server.stdout, 1), 10000);
await waitFor(() => server.stderr.some((l) => l.includes('Attached to shared daemon')), 8000);
await waitFor(() => (readLockPid(realRoot) ?? 0) > 0, 8000);
const daemonPid = readLockPid(realRoot)!;

// A warm call goes through the daemon.
sendMessage(server.child, { jsonrpc: '2.0', id: 2, method: 'tools/call', params: { name: 'codegraph_status', arguments: {} } });
await waitFor(() => findResponse(server.stdout, 2), 10000);

// Kill the daemon out from under the live proxy.
process.kill(daemonPid, 'SIGTERM');
expect(await waitProcessExit(daemonPid, 8000)).toBe(true);

// The proxy must still be alive and still answer — served in-process now.
expect(isAlive(server.child.pid!)).toBe(true);
await waitFor(() => server.stderr.some((l) => l.includes('serving this session in-process')), 8000);
sendMessage(server.child, { jsonrpc: '2.0', id: 3, method: 'tools/call', params: { name: 'codegraph_status', arguments: {} } });
const resp = await waitFor(() => findResponse(server.stdout, 3), 15000);
expect(resp.result !== undefined || resp.error !== undefined).toBe(true);
expect(isAlive(server.child.pid!)).toBe(true);
}, 45000);
});
50 changes: 45 additions & 5 deletions src/mcp/proxy.ts
@@ -191,6 +191,19 @@ export async function runLocalHandshakeProxy(deps: LocalHandshakeDeps): Promise<
let engine: MCPEngine | null = null;
let engineReady: Promise<void> | null = null;
let shuttingDown = false;
// Requests forwarded to the daemon and not yet answered, keyed by JSON-RPC id.
// If the daemon dies mid-session (#662 — e.g. an MCP host SIGTERM's it when a
// new session starts), these would otherwise hang forever; we re-serve them
// in-process so the host always gets a reply.
const inflight = new Map<unknown, string>();
const trackInflight = (line: string): void => {
try {
const m = JSON.parse(line) as JsonRpc;
if (m && m.id !== undefined && typeof m.method === 'string' && m.method !== 'initialize') {
inflight.set(m.id, line);
}
} catch { /* unparseable — nothing we could re-serve anyway */ }
};

const writeClient = (obj: JsonRpc | string): void => {
try { process.stdout.write((typeof obj === 'string' ? obj : JSON.stringify(obj)) + '\n'); } catch { /* host gone */ }
@@ -221,11 +234,16 @@ export async function runLocalHandshakeProxy(deps: LocalHandshakeDeps): Promise<
}
} else if (msg.method === 'ping' && id !== undefined) {
writeClient({ jsonrpc: '2.0', id, result: {} });
} else if (id !== undefined && msg.method !== 'initialize') {
// A request we can't serve in-process (and the daemon is gone) — answer
// with an error rather than let the host hang on a reply that won't come.
writeClient({ jsonrpc: '2.0', id, error: { code: -32603, message: 'CodeGraph daemon unavailable' } });
}
// initialize already answered locally; notifications (initialized) need no reply.
};
const routeToDaemon = (line: string): void => {
if (daemonStatus === 'ready' && daemonSocket) {
trackInflight(line);
try { daemonSocket.write(line.endsWith('\n') ? line : line + '\n'); } catch { /* close path */ }
} else if (daemonStatus === 'failed') {
void handleLocally(line);
@@ -284,15 +302,37 @@ export async function runLocalHandshakeProxy(deps: LocalHandshakeDeps): Promise<
const line = sockBuf.slice(0, idx);
sockBuf = sockBuf.slice(idx + 1);
if (!line.trim()) continue;
if (clientInitId !== undefined) {
try { const m = JSON.parse(line) as JsonRpc; if (m.id === clientInitId && ('result' in m || 'error' in m)) continue; } catch { /* relay */ }
let resp: JsonRpc | null = null;
try { resp = JSON.parse(line) as JsonRpc; } catch { /* not JSON — relay verbatim */ }
if (resp && resp.id !== undefined && ('result' in resp || 'error' in resp)) {
inflight.delete(resp.id); // answered — no longer in flight
// Suppress the daemon's reply to the initialize we forwarded to prime it
// (the client already got the local handshake response).
if (clientInitId !== undefined && resp.id === clientInitId) continue;
}
writeClient(line);
}
});
socket.on('close', shutdown);
socket.on('error', shutdown);
for (const line of pending) { try { socket.write(line + '\n'); } catch { /* ignore */ } }
// The daemon going away does NOT end the session (#662). An MCP host can
// SIGTERM the shared daemon when another session starts; if we exited here,
// this host would silently lose CodeGraph and any in-flight request would
// hang. Instead, fall back to the in-process engine for the rest of the
// session and re-serve whatever the dead daemon never answered.
const onDaemonLost = (): void => {
if (shuttingDown || daemonStatus !== 'ready') return; // host teardown, or already handled
daemonStatus = 'failed';
try { daemonSocket?.destroy(); } catch { /* ignore */ }
daemonSocket = null;
process.stderr.write(
`[CodeGraph MCP] Shared daemon connection lost; serving this session in-process (degraded), re-serving ${inflight.size} in-flight request(s).\n`
);
const orphaned = [...inflight.values()];
inflight.clear();
for (const line of orphaned) void handleLocally(line);
};
socket.on('close', onDaemonLost);
socket.on('error', onDaemonLost);
for (const line of pending) { trackInflight(line); try { socket.write(line + '\n'); } catch { /* ignore */ } }
pending.length = 0;
} else if (!shuttingDown) {
daemonStatus = 'failed';
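The in-flight bookkeeping in this file is spread across three hunks (track on forward, delete on response, drain on daemon loss). The sketch below gathers that lifecycle into one place for readability. It is an illustration only: `InflightTracker` and its method names are hypothetical and do not appear in `proxy.ts`, which uses a bare `Map` and closures instead.

```typescript
interface JsonRpcMessage {
  id?: unknown;
  method?: string;
  result?: unknown;
  error?: unknown;
}

// Hypothetical wrapper around the Map<id, raw line> pattern used by the proxy.
class InflightTracker {
  private readonly pending = new Map<unknown, string>();

  get size(): number { return this.pending.size; }

  // Remember a forwarded request. Notifications (no id) and `initialize`
  // are skipped, mirroring the condition in trackInflight.
  trackRequest(line: string): void {
    try {
      const m = JSON.parse(line) as JsonRpcMessage | null;
      if (m && typeof m === 'object' && m.id !== undefined &&
          typeof m.method === 'string' && m.method !== 'initialize') {
        this.pending.set(m.id, line);
      }
    } catch { /* unparseable input cannot be re-served later */ }
  }

  // Forget a request once a response (result or error) with its id arrives.
  settle(line: string): void {
    try {
      const m = JSON.parse(line) as JsonRpcMessage | null;
      if (m && typeof m === 'object' && m.id !== undefined &&
          ('result' in m || 'error' in m)) {
        this.pending.delete(m.id);
      }
    } catch { /* not JSON: nothing to settle */ }
  }

  // Hand back every unanswered request exactly once, e.g. for local re-serving.
  drain(): string[] {
    const lines = Array.from(this.pending.values());
    this.pending.clear();
    return lines;
  }
}
```

Draining into an array before clearing matches the order used in `onDaemonLost`: the map is emptied first, so a request that is re-served locally cannot be replayed a second time if another `close` or `error` event arrives.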