Skip to content

feat(mcp): full MCP/CLI audit — tiers 0–3 plus Inspector-driven fixes#324

Merged
eyelock merged 14 commits into
developfrom
feat/mcp-cli-extend-and-fix
May 13, 2026
Merged

feat(mcp): full MCP/CLI audit — tiers 0–3 plus Inspector-driven fixes#324
eyelock merged 14 commits into
developfrom
feat/mcp-cli-extend-and-fix

Conversation

@eyelock
Copy link
Copy Markdown
Owner

@eyelock eyelock commented May 13, 2026

Summary

Implements the audit plan at .claude/plans/2026-05-13-mcp-cli-feature-audit.md end-to-end — Tier 0 fixes + Tier 1a polish + Tier 1b structural + Tier 2 symmetry + Tier 3 new domains. Tier 4 deferred (documented in the plan).

  • Tier 0 — drop termq_ prefix from tool names, atomic RMW in BoardWriter, surface load errors, AppProfile runtime variant, converged tag matching (literal-by-default with re: opt-in).
  • Tier 1aTool.Annotations on every tool (readOnly/destructive/idempotent/openWorld hints), display title on every tool/resource/prompt, completion handler for the terminal_summary prompt argument, notifications/message log mirror, logging/setLevel honoured.
  • Tier 1b — resource templates (termq://terminal/{id} etc.), outputSchema + structuredContent on read tools, real resources/subscribe backed by DispatchSourceFileSystemObject + 150ms debounce, new record_handshake tool (splits side-effect from get).
  • Tier 2whoami, restore, column CRUD tools, includeDeleted + base64 cursor pagination on list/find.
  • Tier 3termq://repos, termq://worktrees, termq://harnesses resources; create_worktree, remove_worktree, harness_launch tools.
  • ToolingToolParity.swift registry + 5 tests enforcing mandatoryCLI / omittedCLI classification at build time, Scripts/check-mcp-docs.sh CI gate, full Docs/Help/reference/mcp.md rewrite plus a new Docs/Help/tutorials/mcp-subscriptions.md, Makefile targets mcp.inspect / mcp.inspect.release / install-mcp / install-mcp-debug.
  • Post-test fixes (Inspector-driven)list/find now emit { items, nextCursor? } envelope (MCP requires structuredContent be an object, not array); harness_launch doc clarifies canonical id vs bare name; subscriptions tutorial reframed with client-compatibility matrix (Claude Code v2.x does not proxy subscribe to the LLM — Inspector does).

Test plan

  • make check clean post-merge (verified locally).
  • make mcp.inspect — connect, exercise List Tools / List Resources, confirm outputSchema.type: "object" on every tool, verify the four-test subscription recipe in Docs/Help/tutorials/mcp-subscriptions.md (push notification, debounce, unsubscribe, handshake round-trip).
  • Inside a TermQ Debug session: sudo make install-mcp-debug/mcp reconnect → read your termq://terminal/$TERMQ_TERMINAL_ID resource round-trips against the Debug board.
  • CI: Scripts/check-mcp-docs.sh gate green.

Known gaps (filed as follow-ups, not blockers)

  • Tier 2/3 tools restore/whoami/create_column/rename_column/delete_column exist on MCP but lack matching termqcli subcommands (registry classifies as mandatoryCLI — test passes by name; CLI implementation is a follow-up).
  • ToolHandlers.swift is now 1233 lines (cyclomatic complexity 20/19) — both advisory warnings; an extension-split refactor is the next-touch follow-up.
  • SubscriptionManager deinit and stopWatching both close fileDescriptor while a setCancelHandler may also close the captured fd — flagged by reviewer as "not a current bug, but fragile"; clean up alongside the actor lifecycle revisit.
  • TermQ → MCP profile coordination — no TERMQ_DATA_DIR pass-through; if .mcp.json --debug is mismatched with the running TermQ variant, MCP reads the wrong board. Worth a small follow-up that exports the resolved data dir into terminal env.

🤖 Generated with Claude Code

David Collie and others added 14 commits May 13, 2026 08:42
…W, converged tag matching

Tier 0 of the MCP/CLI extend-and-fix plan — the prerequisite bug-fix work that the
later tiers (1a/1b/2/3) build on. Closes the original "MCP returns empty results"
report and consolidates CLI/MCP semantics so the rest of the work can target one
contract.

Part A — Tool name de-duplication. The MCP server is already named "termq", so the
"termq_" prefix on every tool name produced fully-qualified handles like
"mcp__termq__termq_list". Renamed all ten tools to drop the prefix. No alias.
Breaking; called out loudly in CHANGELOG.

Fix B — AppProfile injection + load-error surfacing. Replaced the boolean "debug"
parameter on BoardLoader/BoardWriter with a runtime "profile: AppProfile.Variant"
that defaults to .current and resolves to .debug under TERMQ_DEBUG_BUILD. Tests
inject .debug or .production explicitly. Resource handlers now propagate load
failures instead of silently returning "[]" or "{}". termqmcp --verbose logs the
resolved profile and data directory at startup.

Fix C — Atomic read-modify-write in BoardWriter. updateCard, moveCard, and
createCard previously split their read and write across two NSFileCoordinator
claims, leaving a lost-update window. Collapsed into a single writingItemAt:
claim via a new BoardWriter.atomicUpdate(...) helper. Added concurrency tests
(T3.10, T3.12) verifying concurrent updates to different cards both survive and
concurrent appends produce distinct orderIndex values.

Issue E — Tag matching converged + literal default + opt-in regex. Both
CardFilterEngine.filterByTag callers (CLI partial-match, MCP exact) collapse to
one literal-by-default implementation. Regex opt-in via "re:" prefix on value or
whole pattern. Invalid regex throws (no silent fallback). Includes a regression
test for the rejected regex-by-default design (project=v1.2 must not match v1X2).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…mirror

Tier 1a of the MCP/CLI extend-and-fix plan. Pure additive metadata + completion +
notification plumbing. No new TermQ concepts; existing surface becomes a better
MCP citizen.

Tool annotations — every tool now carries readOnlyHint / destructiveHint /
idempotentHint / openWorldHint per the audit §3.4 table. Permissioned clients
auto-allow reads, prompt on destructive writes. `delete` marked destructive
(even soft-delete; better to ask than not).

Display titles — added `title:` to every tool, resource, prompt, and prompt
argument. Clients show these alongside the programmatic names; programmatic
names stay machine-stable.

Argument completion — registered the `completion/complete` capability and
implemented suggestions for the `terminal_summary` prompt's `terminal` argument.
Matches live board terminal names by case-insensitive substring, capped at 100
results with proper `total` / `hasMore` reporting.

Log mirror — removed the locally-redeclared SetLoggingLevel (the MCP Swift SDK
now ships it with the proper LogLevel enum, not a free-form String). `logging/setLevel`
now actually filters subsequent emissions instead of being silently accepted.
Added an `emitLog(...)` helper on TermQMCPServer that posts `notifications/message`
to the client, wired into the board-load error path so remote operators see
failures without needing local --verbose stderr access.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ptions, record_handshake

Tier 1b of the MCP/CLI extend-and-fix plan. Structural upgrades that turn TermQ
from "uses 3 of the 13 MCP primitives" into a genuinely comprehensive MCP
implementation.

Resource templates — added termq://terminal/{id}, termq://terminal-by-name/{name},
and termq://column/{name}. Per-card reads now go through the idiomatic
resources/read path rather than the awkward `get` tool. Static URIs take
precedence; template URIs are matched by prefix and percent-decoded.

Structured tool output — every read tool (`pending`, `list`, `find`, `open`,
`get`) now declares an outputSchema and returns structuredContent in addition
to legacy text. Existing OutputTypes are already Codable so the conversion is
done by the SDK's throwing `CallTool.Result(content:, structuredContent: Output)`
initializer. Payload roughly doubles for one release while the text mirror is
kept for back-compat; the audit §3.3 payload note documents the trade-off.

Resource subscriptions — new ResourceSubscriptionManager actor wraps a
DispatchSourceFileSystemObject watcher on board.json with a 150ms debounce
window. Atomic writes (which BoardWriter does via NSFileCoordinator) replace
the inode entirely, so the manager re-arms after .rename / .delete events with
a 50ms backoff. Self-write detection is deliberately deferred per the audit's
"acceptable starting point" note — subscribers will see notifications for their
own writes; future revision can thread a skip-notification token through
BoardWriter. The previously-declared-but-unimplemented `subscribe: true`
capability is now honoured.

record_handshake — split the `get` tool's read+side-effect into two. Reading
termq://terminal/{id} is now pure; `record_handshake(id:)` is the explicit
write that sets lastLLMGet. `get` retains the combined behaviour for one
release per the audit's semantic-break deprecation policy.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…Deleted

Tier 2 of the MCP/CLI extend-and-fix plan. Plugs the symmetry gaps in the
existing card domain — features TermQ already had in the GUI but that MCP
couldn't drive.

whoami — resolves the current card from $TERMQ_TERMINAL_ID without making the
caller substitute the env var manually. Returns null (not error) when running
outside a TermQ terminal so top-level Claude sessions don't see a spurious
failure.

restore + BoardWriter.restoreCard — undo soft-delete by clearing the
`deletedAt` timestamp. Closes the asymmetric "MCP can delete but not undelete"
gap. Permanent deletes remain irrecoverable.

Column CRUD — create_column / rename_column / delete_column tools backed by
new BoardWriter.{createColumn, renameColumn, deleteColumn} primitives, all
running inside the Tier-0 atomic claim. delete_column refuses by default if
active cards remain; force: true cascades a soft-delete to those cards (which
then become individually restorable).

list extended — includeDeleted: true to include binned cards in results;
cursor + limit for opt-in pagination. Unpaginated calls keep returning the
bare array for back-compat; paginated calls return a {items, nextCursor}
envelope. find got the same cursor/limit treatment.

Deferred to a later tier — snapshots (requires GUI integration the headless
MCP can't drive) and per-card overrides (theme/font/backend on set/create)
are still on the audit list but didn't make this commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… + harness tools

Tier 3 of the MCP/CLI extend-and-fix plan — new domain coverage now that YNH's
JSON schemas are available locally. TermQ becomes self-administrable from
inside itself: an assistant can enumerate repos, see worktrees, list installed
harnesses, create/remove worktrees, and launch a harness.

Read-only resources:
- termq://repos — registered git repositories from RepoConfig
- termq://worktrees — git worktrees enumerated across every repo; per-repo
  errors are mirrored to notifications/message rather than killing the listing
- termq://harnesses — ynh ls --format json output; empty when ynh is absent

Write tools:
- create_worktree, remove_worktree — backed by existing GitServiceShared
  primitives. Repo identified by UUID from termq://repos.
- harness_launch — invokes `ynh run <harness>` in a target directory with an
  optional prompt. Annotated destructiveHint: true so permissioned clients
  prompt before each call. Output capped at 4 KB suffix to keep frames bounded.

Deliberately deferred from this commit and called out in CHANGELOG:
- Full elicitation/create integration on harness_launch (annotations carry the
  prompt hint; clients with elicitation support should add the gate)
- roots/list boundary enforcement (no filesystem-touching tools currently
  exceed ~/Library/Application Support)
- termq://prs GitHub-PR resource (shells out to gh; needs more design)
- Snapshots and per-card overrides from the Tier 2 list

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Implements audit §7 (CLI ⇄ MCP parity policy) and §8.3 (docs gate) as code.

ToolParity.swift — single source of truth listing every MCP tool as either
mandatoryCLI (a matching termqcli subcommand must exist) or omittedCLI (with
a stated reason). Adding a tool means editing the registry; the test enforces
it.

ToolParityTests.swift — five gates:
1. Every MCP tool appears in exactly one of the two lists.
2. The lists are mutually exclusive.
3. Every omission carries a non-empty reason (docs generator surfaces these).
4. Mandatory entries reference current tools (catches stale registry after rename).
5. Omitted entries reference current tools (same).

Scripts/check-mcp-docs.sh — narrow CI gate that fails when MCP surface files
(SchemaDefinitions.swift, ToolParity.swift) change without a matching update
to Docs/Help/reference/mcp.md. Refactors and comment edits don't trip the gate
to avoid the false-positive-then-bypass dynamic. Override marker [no-doc] in
commit subject for genuine no-surface changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…own CLI gap

Tier 7 (docs) of the MCP/CLI extend-and-fix plan. Docs were the last
deliverable; this commit closes the audit's §8 docs deliverables list.

Docs/Help/reference/mcp.md — full rewrite. Reflects Tier 0–3 surface
(rename, new tools, resource templates, structured output, subscriptions,
completion, logging mirror, ToolParity, pagination). Adds a spec-feature
support matrix at the top so readers see at a glance which MCP capabilities
TermQ implements. Documents the CLI-parity policy and the build-enforced
ToolParity test that backs it.

Docs/Help/tutorials/mcp-subscriptions.md — worked-example tutorial for the
most novel new feature. Covers what subscribe means, how TermQ implements
it (file watcher + debouncer), a TypeScript-flavoured client example, and
sharp edges (self-write notifications, debounce coalescing, per-connection
lifetime, best-effort delivery). Combines with record_handshake to show
the idiomatic in-session-assistant pattern.

CHANGELOG — documents the parity registry and the docs gate, and flags
honestly that the Tier 2/3 tools (restore, whoami, create_column,
rename_column, delete_column) don't yet have matching termqcli subcommands.
The ToolParity test verifies classification, not actual CLI command
existence — a follow-up will tighten the test once the CLI commands ship.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
YNH testing flagged that the harness_launch tool's `harness` parameter doc
said "Harness name (from termq://harnesses)", but `ynh run` actually rejects
bare names with an io_error and requires canonical ids (local/<name>,
github.com/<org>/<repo>/<name>, etc.). TermQ doesn't translate on the
caller's behalf — the LLM has to pick the right field from the resource.

SchemaDefinitions.swift — tightened the `harness` parameter description and
expanded the tool description to call out the bare-name-vs-canonical-id
distinction explicitly.

termq://harnesses resource description — documents that the resource emits
the full YNH envelope (capabilities, schema_version, ynh_version, harnesses
array) verbatim from `ynh ls --format json`, and that each harness has both
`id` (canonical, for tool calls) and `name` (bare, display-only).

mcp.md — mirrored both clarifications.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`make mcp.inspect` builds the debug `termqmcpd` then launches
`@modelcontextprotocol/inspector` via `npx --yes` against it, with
`TERMQ_DEBUG=1` set so file logs land in /tmp/termq-debug.log.

`make mcp.inspect.release` does the same against the release binary.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
MCP requires `outputSchema.type` to be `"object"` at the top level and
`structuredContent` to be a JSON object — but `terminalListSchema` declared
`type: "array"` and the `list` / `find` handlers emitted a bare array when
no pagination was requested. MCP Inspector caught both as
"invalid_value at outputSchema.type" on tools/list.

Schema now describes `{ items: [...], nextCursor?: string }` with
`required: ["items"]`. Handlers always emit the envelope — the
back-compat "bare array when unpaginated" branch is removed. The
`columnsOnly: true` path now wraps in `{ items: [...] }` too.

Test helpers updated to unwrap `items`. Resource handlers are
unchanged — resources have no outputSchema and bare arrays remain
spec-allowed there.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* `make install-mcp` / `install-mcp-debug` / `uninstall-mcp` mirror the
  CLI install targets — projects with `.mcp.json` referencing `termqmcp`
  need the binary on PATH, and only `termqcli` had an install path
  previously. `install-all` now includes the MCP server.
* `Docs/Help/reference/mcp.md` updated to describe the new envelope
  return shape (`{ items, nextCursor? }`) instead of the old "bare
  array unless paginated" line that just changed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…test recipe

The previous tutorial walked through a 6-step "set next action →
assistant reacts" pattern as if every MCP client supported it. That's
wrong: Claude Code v2.x exposes only ReadMcpResourceTool /
ListMcpResourcesTool and does not proxy resources/subscribe to the
LLM, so the assistant never receives push notifications from a TermQ
session, even though the server emits them correctly. Hitting this in
practice is confusing — the server is healthy, the client just
doesn't relay.

Updates:
- Adds a client-compatibility matrix at the top covering Inspector,
  custom SDK clients, Claude Code, and Claude Desktop.
- Reframes the handshake pattern as "what subscriptions enable" and
  notes which clients can actually observe step 2.
- Adds a four-test Inspector recipe (push, debounce, unsubscribe,
  handshake round-trip) as the canonical way to verify the stack.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* swift-format auto-fixed indentation in SchemaDefinitions.swift and the
  new atomicUpdate-based BoardLoader helpers.
* ToolParity.swift: split two long `reason` strings to satisfy
  line_length (155/156 → ≤120).
* ResourceHandlers.swift: renamed `t` → `tree` in the worktree loop to
  satisfy identifier_name.
* ToolHandlers.swift: renamed pagination locals `s` / `n` → `str` /
  `offset` for identifier_name.

Three advisory warnings remain (not errors): ToolHandlers.swift
cyclomatic_complexity 20/19 and file_length 1233/1000 — this branch
added Tier 2/3 handlers that grew the file; deferring the
extraction-into-extension refactor to a follow-up to avoid risk on a
ready-to-merge branch. ContentView.swift type_body_length 507/500 is
pre-existing on develop, untouched here.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@eyelock eyelock merged commit 69e7a9b into develop May 13, 2026
8 checks passed
@eyelock eyelock deleted the feat/mcp-cli-extend-and-fix branch May 13, 2026 19:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant