RFC: /mcp reconnect <name> — live teardown without breaking the cache prefix

Spun out of #105 (Stage C2b). The slow-toast half landed in #109; this captures what's left for follow-up design before any code is written.

## Why

`/mcp reconnect <name>` should let a user recover from a transiently broken MCP server (handshake stuck, server hot-restarted) without restarting the whole Reasonix session. The design doc (`docs/design/agent-tui-terminal.html` §37) lists `↻ reconnect 2/5` as one of the six lifecycle states.

## The constraint that makes this non-trivial

Reasonix's whole architecture rests on a byte-stable prompt prefix:

> system + tool_specs + few-shots → identical bytes across turns → DeepSeek prefix-cache hits at 90%+

Reconnecting a server tears down its `McpClient`, re-handshakes, and re-bridges. **If the reconnected server's `tools/list` comes back with anything different** (a tool added, a tool dropped, a description changed, even a reordering), the prompt prefix shifts and the next turn cache-misses entirely. On a long session, that 90%+ cache hit rate drops to a full miss over the whole accumulated prefix, and the cost of that turn cannot be recovered.

So the design question isn't "how do we tear down and re-bridge"; that part is a few dozen lines. The question is "what do we do when the reconnected tool surface differs?"
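To make "differs" concrete, here is a minimal sketch of a drift check between the prior and the reconnected tool surface. The `ToolSpec` shape and the `diffToolSurface` name are assumptions for illustration, not existing Reasonix code, and the sketch assumes tool names are unique within a server. It compares `JSON.stringify` output as a stand-in for whatever serialization actually feeds the prompt prefix.

```typescript
// Hypothetical shape of one entry in a server's `tools/list` result.
interface ToolSpec {
  name: string;
  description?: string;
  inputSchema: unknown;
}

type Drift =
  | { kind: "identical" }
  | {
      kind: "changed";
      added: string[];
      removed: string[];
      modified: string[];
      reordered: boolean;
    };

// Stand-in for the real prefix serialization (an assumption): key order
// inside inputSchema is preserved as received, so it counts as a change.
function serialize(tool: ToolSpec): string {
  return JSON.stringify([tool.name, tool.description ?? null, tool.inputSchema]);
}

function diffToolSurface(prior: ToolSpec[], next: ToolSpec[]): Drift {
  const priorByName = new Map<string, string>(
    prior.map((t): [string, string] => [t.name, serialize(t)]),
  );
  const nextNames = new Set<string>(next.map((t) => t.name));

  const added = next.filter((t) => !priorByName.has(t.name)).map((t) => t.name);
  const removed = prior.filter((t) => !nextNames.has(t.name)).map((t) => t.name);
  const modified = next
    .filter((t) => priorByName.has(t.name) && priorByName.get(t.name) !== serialize(t))
    .map((t) => t.name);

  // Same set of names, different sequence: still shifts the prefix bytes.
  const reordered =
    added.length === 0 &&
    removed.length === 0 &&
    prior.some((t, i) => t.name !== next[i]?.name);

  if (added.length === 0 && removed.length === 0 && modified.length === 0 && !reordered) {
    return { kind: "identical" };
  }
  return { kind: "changed", added, removed, modified, reordered };
}
```

Returning the individual differences rather than a single boolean would let a refusal or warning card say what changed, not only that something did.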

## Approaches to weigh

**1. Strict (refuse-on-drift):** before re-bridging, snapshot the new `tools/list` and compare against the prior. If anything changed, refuse with a card explaining "tool surface changed, restart to apply". Preserves cache. Cost: reconnect is useless when the server has actually been updated upstream.

**2. Permissive + announce:** always accept the new surface; if it differs from the prior, emit a prominent warn card ("cache reset — next turn will be a full miss; subsequent turns reseed") and continue. Cost: silent expensive turns when users don't read warnings.

**3. Identity-only:** allow reconnect only when the new `tools/list` is byte-identical to the prior. Same effect as (1) for the user, but the trigger isn't user-visible. Probably worse than (1).

**4. Two-mode flag:** `/mcp reconnect <name>` is strict by default; `/mcp reconnect <name> --force` is permissive + warns. Best ergonomics, most code.
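As a rough illustration of approach 4, the decision could reduce to a small pure function over two inputs: whether the surface changed, and whether `--force` was passed. Everything below (`parseReconnectArgs`, `decideReconnect`, the result shapes) is hypothetical naming; only the command syntax and the card wording come from the options above.

```typescript
// Hypothetical parsed form of `/mcp reconnect <name> [--force]`.
interface ReconnectRequest {
  server: string;
  force: boolean;
}

type ReconnectDecision =
  | { action: "rebridge"; warning?: string }
  | { action: "refuse"; reason: string };

// Parses the text after `/mcp reconnect`. Returns null unless exactly one
// server name is given; `--force` may appear before or after the name.
function parseReconnectArgs(args: string): ReconnectRequest | null {
  const tokens = args.trim().split(/\s+/).filter((t) => t.length > 0);
  const force = tokens.includes("--force");
  const names = tokens.filter((t) => t !== "--force");
  const [server] = names;
  if (server === undefined || names.length !== 1) return null;
  return { server, force };
}

// Strict by default; permissive with a warning only when forced.
function decideReconnect(surfaceChanged: boolean, force: boolean): ReconnectDecision {
  if (!surfaceChanged) {
    // Prefix bytes are unchanged, so re-bridging is cache-safe.
    return { action: "rebridge" };
  }
  if (force) {
    return {
      action: "rebridge",
      warning: "cache reset: next turn will be a full miss; subsequent turns reseed",
    };
  }
  return { action: "refuse", reason: "tool surface changed, restart to apply" };
}
```

Keeping the decision separate from the teardown and re-bridge code means the strict path can be tested on its own before `--force` is layered on.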

## Other open questions

- **Mid-turn vs between-turn:** is reconnect allowed while a turn is in flight? Easiest answer: no, queue until next prompt.
- **Other servers' callers:** if the server being reconnected was mid-tool-call, the in-flight `callTool` promise needs to reject cleanly. Existing AbortSignal threading should cover this.
- **Lifecycle event ordering:** the design's `↻ reconnect 2/5  backoff 4s` implies retry semantics. Is the "2/5" surfaced from the underlying transport's reconnect attempts, or from a Reasonix-level retry wrapper around the manual `/mcp reconnect`? Probably the latter, capped at 5 with exponential backoff, but worth pinning.
- **`r` keybind in `/mcp` browser:** Stage B left this as a stub. Once this RFC lands, the `r` key triggers the same code path.
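If the "2/5" counter does come from a Reasonix-level wrapper, one possible reading is sketched below. The 2-second base delay is an assumption, chosen only because doubling from it gives a 4s backoff on attempt 2, which matches the design's example line. The function names and the injected `sleep` and `emit` callbacks are hypothetical.

```typescript
interface RetryOptions {
  maxAttempts: number; // e.g. 5, per the "2/5" in the design
  baseDelayMs: number; // assumed 2000 so that attempt 2 backs off 4s
  sleep: (ms: number) => Promise<void>;
  emit: (line: string) => void; // sink for lifecycle lines
}

// Exponential backoff: the delay doubles after each failed attempt.
function backoffMs(attempt: number, baseDelayMs: number): number {
  return baseDelayMs * 2 ** (attempt - 1);
}

// Formats a lifecycle line in the style shown in the design doc.
function lifecycleLine(attempt: number, maxAttempts: number, baseDelayMs: number): string {
  const seconds = backoffMs(attempt, baseDelayMs) / 1000;
  return `↻ reconnect ${attempt}/${maxAttempts}  backoff ${seconds}s`;
}

// Calls `connect` until it succeeds or the attempt cap is reached.
// Resolves to true on success, false once every attempt has failed.
async function reconnectWithRetry(
  connect: () => Promise<void>,
  opts: RetryOptions,
): Promise<boolean> {
  for (let attempt = 1; attempt <= opts.maxAttempts; attempt++) {
    try {
      await connect();
      return true;
    } catch {
      if (attempt === opts.maxAttempts) break; // no backoff after the last try
      opts.emit(lifecycleLine(attempt, opts.maxAttempts, opts.baseDelayMs));
      await opts.sleep(backoffMs(attempt, opts.baseDelayMs));
    }
  }
  return false;
}
```

Injecting `sleep` keeps the wrapper testable without real timers. Whether the line should be emitted before or after the failed attempt it numbers is part of what this question still needs to pin down.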

## Out of scope

- Auto-reconnect on transient failures during dispatch. That's a separate failure-mode question (currently a `failed` lifecycle line is emitted and the session continues without it). Tracking issue if anyone wants it.
- Hot-reloading the `--mcp` flag list to add new servers mid-session (different feature; would also need cache-prefix work).

## Suggested resolution shape

I'd commit to **approach #4** (default strict, `--force` for permissive) once the open questions above have at least loose answers. Spike: prototype the strict path first against the bundled demo MCP server in tests, confirm cache prefix stability, then layer `--force`.

Closes part of #105 (specifically: removes the `r` keybind stub from McpBrowser once shipped).
