Problem
The integration-test skill (.claude/skills/integration-test/SKILL.md) runs swift build in step 2 and then validates behavior via mcp__previewsmcp__* tool calls. But those calls hit the stdio MCP server that Claude Code spawned at session start, not the freshly built binary. From the test's perspective the build step is decorative: any code change made during the session is invisible to the tools the skill uses to validate it.
Mechanism:
- .mcp.json declares command: ".build/debug/previewsmcp" with args: ["serve"].
- Claude Code spawns that process once, at session start, and keeps it resident for the whole session.
- swift build overwrites the on-disk binary, but the resident process keeps running the old code.
- All mcp__previewsmcp__* calls go to the resident process.
So a contributor can edit code, run the skill, see green, and ship a regression — the test never exercised the change.
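The mechanism above can be illustrated with a toy model. This is only a sketch: `Disk` and `ResidentServer` are hypothetical stand-ins for the on-disk binary and the stdio server process, not anything in the PreviewsMCP codebase.

```python
class Disk:
    """Stands in for the on-disk binary at .build/debug/previewsmcp."""

    def __init__(self, build_id):
        self.build_id = build_id


class ResidentServer:
    """Stands in for the stdio server: it loads the build once, at spawn."""

    def __init__(self, disk):
        # Copied at spawn time and never refreshed afterwards.
        self.loaded_build = disk.build_id

    def handle_tool_call(self):
        # Every tool call is answered by whatever build was loaded at spawn.
        return self.loaded_build


disk = Disk("old")
server = ResidentServer(disk)    # Claude Code session start
disk.build_id = "new"            # `swift build` overwrites the binary

stale = server.handle_tool_call()                # still the old build
respawned = ResidentServer(disk).handle_tool_call()  # after a relaunch
print(stale, respawned)
```

Only a respawn picks up the new build, which is why the rebuild alone has no effect on what the skill tests.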
Architectural footgun (broader than the skill)
PreviewsMCP has two completely separate server processes with independent session state:
1. Stdio MCP server: previewsmcp serve (no flag). Self-contained: own PreviewHost, IOSSessionManager, ConfigCache (Sources/PreviewsCLI/ServeCommand.swift:57-73). This is what .mcp.json spawns.
2. Unix-socket daemon: previewsmcp serve --daemon at ~/.previewsmcp/serve.sock. Auto-spawned by every CLI subcommand other than serve. Persists across Claude Code restarts.
A session started via the MCP tools lives in process (1) and is invisible to previewsmcp list, previewsmcp snapshot --session-id …, etc. (which talk to process (2)). A contributor debugging an MCP-tool issue with the CLI will see an empty session list and waste time figuring out why.
This split-brain isn't documented in AGENTS.md beyond the bullet that says CLI subcommands talk to a daemon. The implication for testing and debugging is non-obvious.
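A toy model of the split-brain, to make the failure concrete. `ServerProcess` is a hypothetical stand-in for either server; the only point is that each instance owns its own session table.

```python
class ServerProcess:
    """Hypothetical stand-in for one server process with private session state."""

    def __init__(self, name):
        self.name = name
        self.sessions = {}  # session id -> source file, private to this process

    def start_session(self, session_id, source):
        self.sessions[session_id] = source

    def list_sessions(self):
        return sorted(self.sessions)


stdio_server = ServerProcess("stdio")   # spawned by .mcp.json
daemon = ServerProcess("daemon")        # auto-spawned by CLI subcommands

# A session started through the MCP tools lands in the stdio server only.
stdio_server.start_session("s1", "ContentView.swift")

mcp_view = stdio_server.list_sessions()  # what MCP tools see
cli_view = daemon.list_sessions()        # what `previewsmcp list` would see
print(mcp_view, cli_view)
```

The CLI's view is empty even though the session exists, which matches the confusing empty list described above.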
Relationship to #142
#142 covers CLI-vs-daemon version mismatch on upgrade (Homebrew/source). This issue is adjacent but distinct: it's about the stdio MCP server going stale within a single Claude Code session, and about the two-process architecture being a footgun beyond just upgrades. A version-handshake fix shaped like the one proposed in #142 would not catch this case, because the stdio server has no peer to handshake with — the staleness is purely between the resident process and the on-disk binary.
Proposed fixes
1. Skill changes (small, immediate)
Update .claude/skills/integration-test/SKILL.md:
- Step 0: run previewsmcp kill-daemon || true as a hermetic reset of the unix-socket daemon. Strictly speaking the skill currently only uses MCP tools, so this is belt-and-suspenders, but it prevents surprises if any future step shells out to a CLI subcommand.
- After step 2 (swift build): explicit instruction to /exit and relaunch Claude Code so the stdio MCP server respawns from the new binary. Without this, every subsequent step is testing the wrong code.
- Loud failure mode: add a build-version check, see (2).
2. Build-hash MCP tool (recommended)
Expose a tiny MCP tool, e.g. preview_build_info, that returns the binary's build identity (commit SHA + dirty bit, baked in at build time via GenerateVersionTool or equivalent). The skill calls this as the first MCP call after swift build and compares against git rev-parse HEAD. If they differ, the skill aborts with a clear message:
Stdio MCP server is running build abc1234 but repo HEAD is def5678. Restart Claude Code so it respawns the server from the new binary, then re-run.
This makes the staleness impossible to miss and self-diagnosing.
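A minimal sketch of the comparison the skill could perform, assuming the proposed preview_build_info tool returns a hex commit SHA. The helper names `repo_head_sha` and `stale_build_message` are hypothetical, and the prefix comparison is an assumption made so that an abbreviated SHA from the tool can be matched against the full SHA from git.

```python
import subprocess


def repo_head_sha(repo_dir="."):
    """Return the full commit SHA of HEAD, as printed by `git rev-parse HEAD`."""
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def stale_build_message(server_sha, head_sha):
    """Return an abort message if the server's build differs from HEAD, else None.

    Either SHA may be abbreviated, so the shorter one is compared as a
    prefix of the longer one.
    """
    a = server_sha.strip().lower()
    b = head_sha.strip().lower()
    n = min(len(a), len(b))
    if n > 0 and a[:n] == b[:n]:
        return None
    return (
        f"Stdio MCP server is running build {a[:7]} but repo HEAD is {b[:7]}. "
        "Restart Claude Code so it respawns the server from the new binary, "
        "then re-run."
    )
```

The skill would call something like `stale_build_message(reported_sha, repo_head_sha())` and abort on a non-None result. A SHA comparison alone cannot tell apart two builds of the same commit with different uncommitted edits; how the dirty bit should factor into the check is left open here.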
3. Document the two-process model
Add a short section to AGENTS.md covering:
- The stdio server (used by Claude Code / Cursor via .mcp.json) and the unix-socket daemon (used by CLI subcommands) are separate processes with separate session state.
- A session created via MCP tools is not visible to CLI subcommands like list / snapshot --session-id, and vice versa.
- For integration testing or debugging, restart Claude Code to refresh the stdio server; run previewsmcp kill-daemon to refresh the unix-socket daemon. Do both if both paths are in play.
4. (Stretch) Unify the two backends
Longer-term: have the stdio serve mode forward to the unix-socket daemon instead of running its own self-contained engine. Then there's one source of truth for sessions, one binary to keep current, and #142's version-handshake fix covers everything. Bigger lift; out of scope for this issue but worth flagging as the durable answer.
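The shape of that unification, as a toy sketch only. `Daemon` and `ForwardingStdioServer` are hypothetical names, and a direct method call stands in for whatever transport the real forwarding would use over the unix socket.

```python
class Daemon:
    """Hypothetical single owner of session state."""

    def __init__(self):
        self.sessions = {}

    def handle(self, method, **params):
        if method == "start":
            self.sessions[params["session_id"]] = params["source"]
            return params["session_id"]
        if method == "list":
            return sorted(self.sessions)
        raise ValueError(f"unknown method: {method}")


class ForwardingStdioServer:
    """Hypothetical stdio front end that keeps no state of its own."""

    def __init__(self, daemon):
        self._daemon = daemon

    def handle(self, method, **params):
        # Every request is passed through; the daemon is the source of truth.
        return self._daemon.handle(method, **params)


daemon = Daemon()
stdio = ForwardingStdioServer(daemon)

stdio.handle("start", session_id="s1", source="ContentView.swift")
cli_view = daemon.handle("list")   # the CLI path now sees the MCP session
print(cli_view)
```

With one owner of session state, a session started through the MCP path shows up on the CLI path as well.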
Suggested scope
- PR 1: skill edits in (1). Trivial, ships the safety today.
- PR 2: preview_build_info tool + skill check in (2).
- PR 3: doc update in (3).
- (4) tracked separately if/when the team wants to tackle it.
🤖 Filed from a Claude Code session after noticing that the integration-test skill silently exercises whatever binary was current at session start, regardless of intervening swift builds.