feat(mcp): tool-list drift classifier — groundwork for #110 C2b#114

Merged: esengine merged 1 commit into main from feat/mcp-drift-classifier on May 2, 2026

@esengine esengine commented May 2, 2026

Refs #110.

What

A pure function, `classifyToolListDrift(before, after)`, returns a `DriftReport`. It encodes the cache-cost taxonomy validated by the live DeepSeek spike (#113):

| kind | what | cache impact (empirical) |
| --- | --- | --- |
| identity | same names, order, content | free |
| append | every before-tool unchanged, new tools tacked at the end | ~95% hit (94.8% observed) |
| edit | same names + positions, content of ≥1 tool changed | bounded loss past divergence (84% hit observed) |
| reorder | same set, different order; OR additions not at the end | catastrophic (effectively full miss) |
| remove | any before-tool missing from after; dominates even if others were added | catastrophic |

Report carries added / removed / edited arrays so the policy layer (next PR) can populate the user-facing warn line precisely without re-walking the lists.
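
For readers who want a concrete picture of the taxonomy, here is a minimal sketch of how such a classifier could be written. It is not the code in `src/mcp/drift.ts`. The `ToolSpec` shape, the JSON-based content comparison, and the decision to report `edit` when tools are both edited and appended are all assumptions made for illustration; tool names are assumed to be unique within a list.

```typescript
// Hypothetical tool shape; the real type used by the PR may differ.
interface ToolSpec {
  name: string;
  description?: string;
  inputSchema?: unknown;
}

type DriftKind = "identity" | "append" | "edit" | "reorder" | "remove";

interface DriftReport {
  kind: DriftKind;
  added: string[];   // names present only in `after`
  removed: string[]; // names present only in `before`
  edited: string[];  // names present in both, with different content
}

// Assumption: content equality via JSON serialization (sensitive to key order).
const sameContent = (a: ToolSpec, b: ToolSpec): boolean =>
  JSON.stringify(a) === JSON.stringify(b);

function classifyToolListDrift(before: ToolSpec[], after: ToolSpec[]): DriftReport {
  const beforeNames = new Set(before.map((t) => t.name));
  const afterByName = new Map(after.map((t) => [t.name, t] as const));

  const added = after.filter((t) => !beforeNames.has(t.name)).map((t) => t.name);
  const removed = before.filter((t) => !afterByName.has(t.name)).map((t) => t.name);
  const edited = before
    .filter((t) => {
      const next = afterByName.get(t.name);
      return next !== undefined && !sameContent(t, next);
    })
    .map((t) => t.name);

  // True when the first before.length slots of `after` keep the same names in order.
  const prefixKept =
    after.length >= before.length && before.every((t, i) => after[i].name === t.name);

  let kind: DriftKind;
  if (removed.length > 0) kind = "remove";     // dominates everything else
  else if (!prefixKept) kind = "reorder";      // includes additions not at the end
  else if (edited.length > 0) kind = "edit";   // assumption: edit outranks append
  else if (added.length > 0) kind = "append";
  else kind = "identity";

  return { kind, added, removed, edited };
}
```

Under these assumptions, `classifyToolListDrift([a, b], [a, c])` reports `remove` with `removed: ["b"]` and `added: ["c"]`, matching the rule that a removal dominates even when other tools were added.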

Why now

C2b (the /mcp reconnect slash) needs to map drift → policy. Splitting the classifier into its own PR gets the ground truth landed and reviewable before the imperative teardown logic is layered on top.

Touch

  • New src/mcp/drift.ts — pure classifier + DriftReport type
  • New tests/mcp-drift.test.ts — 12 cases covering identity, append (single + multiple), edit (description + schema), remove (alone + with adds), reorder, and edge cases (empty before / empty after)

No source code outside this new module is touched.

Next PR (C2b proper)

/mcp reconnect <name> slash + r keybind in McpBrowser modal. The handler will:

  1. Locate live McpClient for <name>, snapshot its current tool list
  2. Tear down: mcp.close(), drop the prefixed tools from ToolRegistry
  3. Re-handshake on a new transport, re-bridge
  4. classifyToolListDrift over old vs new tool lists
  5. Apply policy:
    • identity / append → silent success (✓ connected)
    • edit → success + warn card showing edited array + per-tool divergence point
    • reorder / remove → refuse unless --force is passed (or the user has already opted in via --strict)
  6. Lifecycle vocabulary: ↻ reconnect start, ✓ connected / ✖ failed end (per design §37)
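
The policy in step 5 could be sketched roughly as follows. This is an illustration of the mapping described above, not the planned handler: `ReconnectDecision`, `decideReconnect`, and the message strings are hypothetical names, and the `--strict` path is left out because its semantics are not spelled out here.

```typescript
type DriftKind = "identity" | "append" | "edit" | "reorder" | "remove";

interface DriftReport {
  kind: DriftKind;
  added: string[];
  removed: string[];
  edited: string[];
}

// Hypothetical outcome shape for the reconnect handler.
type ReconnectDecision =
  | { action: "accept"; warn?: string }
  | { action: "refuse"; reason: string };

// Map a drift report to a decision; `force` mirrors the --force flag.
function decideReconnect(
  report: DriftReport,
  opts: { force?: boolean } = {},
): ReconnectDecision {
  switch (report.kind) {
    case "identity":
    case "append":
      // Cheap for the cache: succeed silently.
      return { action: "accept" };
    case "edit":
      // Bounded loss: succeed, but surface which tools changed.
      return { action: "accept", warn: `tools edited: ${report.edited.join(", ")}` };
    case "reorder":
    case "remove":
      // Catastrophic for the cache: refuse unless explicitly forced.
      return opts.force
        ? { action: "accept", warn: `forced past ${report.kind} drift` }
        : { action: "refuse", reason: `${report.kind} drift; pass --force to proceed` };
  }
}
```

Keeping this mapping as a pure function, separate from the teardown and re-handshake steps, would let it be unit-tested the same way as the classifier.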

Test plan

  • npm run verify passes (1772 tests, +12)
  • Pure function, no I/O — easy to extend with new edge cases

Pure function `classifyToolListDrift(before, after) → DriftReport`
encoding the cache-cost taxonomy validated by the live spike (#113):

- `identity`  — same names, same order, same content → free
- `append`    — every before-tool unchanged, new tools at the end → trivially cheap (94.8% hit observed)
- `edit`      — same names + positions, content of ≥1 tool changed → bounded loss past the divergence point
- `reorder`   — same set, different order, OR additions not at the end → catastrophic
- `remove`    — any before-tool missing from after → catastrophic; dominates even when other tools were added

Report carries `added` / `removed` / `edited` arrays so the policy
layer (next PR) can populate the warn line precisely.

Pure function, no I/O. 12 unit tests cover the matrix including
edge cases (empty before/after, remove-with-also-added).

Refs #110.
@esengine esengine merged commit 1920d90 into main May 2, 2026
1 check passed
@esengine esengine deleted the feat/mcp-drift-classifier branch May 2, 2026 09:03