CLI MCP Parity

Nick edited this page Jun 20, 2026 · 4 revisions

CLI ↔ MCP Parity (v0.1.27)

CW has two front doors: the CLI for human speed and the MCP server for machine context. As of v0.1.27 they are two renderings of one data source, the capability registry. The relationship is declared, derived, and fail-closed on drift, so the same capability cannot diverge between surfaces. Repo doc: docs/cli-mcp-parity.7.md.

The CLI (node scripts/cw.js ..., dist/cli.js) serves human speed: terse, scannable text with meaningful exit codes. The MCP server (cw_* JSON-RPC tools) serves machine context: complete, stable, structured JSON. Before v0.1.27 these were two hand-maintained lists that could drift. Now they are two policies over one mechanism.

This page vs. CLI and MCP Surface: the CLI and MCP Surface page is the catalog of what each surface offers and how to use it. This page covers the guarantee that the two never diverge: the registry mechanism and the fail-closed parity gate behind it.

The design mantra for this layer:

One source of truth.
Two renderings.
Same name, same flags, same order, same defaults.
The surfaces never interfere.
Fail closed on drift.

The Borrowed Idea: Mechanism vs Policy

CW follows a base-system discipline that separates mechanism from policy. The mechanism is the capability registry at src/capability-registry.ts (compiled to dist/capability-registry.js) — the single source of truth. Every capability declares one shared core entry — the mechanism both surfaces route through — plus its CLI command, its MCP tool, the surface it lives on, and whether its payload is identical across surfaces.

No business logic is stranded in cli.ts or mcp-server.ts. Composite capabilities live in src/capability-core.ts (planSummary, appRun, sandboxChoose, commitEnvelope), so both surfaces call the same core entry and differ only in how they render its result. The CLI renders for a human; the MCP tool renders for a machine; neither owns the logic.

A new runtime capability is added once, in the registry, against one core entry. The CLI command and the MCP tool are then two policies over that one mechanism — which is exactly what the parity gate checks.

one core entry -> CLI rendering (human)
              \-> MCP rendering (machine)
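
A minimal TypeScript sketch of that split. Every name here (`PlanSummarySketch`, `planSummaryCore`, `renderCliText`, `renderMcpJson`) and the result shape are assumptions made for illustration, not CW's actual API. The point is only that one core function computes the result and each surface decides how to present it.

```typescript
// Hypothetical result shape; the real core entries live in src/capability-core.ts.
interface PlanSummarySketch {
  planId: string;
  totalSteps: number;
  completedSteps: number;
}

// Mechanism: the single core entry both surfaces call (illustrative stand-in).
function planSummaryCore(planId: string): PlanSummarySketch {
  return { planId: planId, totalSteps: 5, completedSteps: 3 };
}

// Policy 1: the CLI renders terse text by default and the canonical JSON on --json.
function renderCliText(summary: PlanSummarySketch, asJson: boolean): string {
  if (asJson) {
    return JSON.stringify(summary);
  }
  return `${summary.planId}: ${summary.completedSteps}/${summary.totalSteps} steps done`;
}

// Policy 2: the MCP tool always returns the complete structured payload.
function renderMcpJson(summary: PlanSummarySketch): string {
  return JSON.stringify(summary);
}

const sketchSummary = planSummaryCore("plan-1");
const cliHumanView = renderCliText(sketchSummary, false); // "plan-1: 3/5 steps done"
const cliJsonView = renderCliText(sketchSummary, true);
const mcpView = renderMcpJson(sketchSummary);
```

Neither renderer computes anything about the plan; both only format what the core entry returned, which is why the `--json` view and the MCP view cannot disagree in this sketch.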

1. One Source, Two Renderings

The capability registry is the single source of truth, not two lists kept in sync by hand. The parity matrix — one row per capability showing its CLI command, MCP tool, shared core entry, surface, and payload relationship — is derived from the live registry, not authored. The matrix is machine-complete by design: 199 capabilities, 186 MCP tools.
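
Deriving the matrix rather than authoring it could look roughly like the sketch below. The descriptor shape (`SketchCapability`), the helper names, and the `help.render` and `schedule.create` core entry names are assumptions; only `runner.commit`, the command names, and the tool names come from this page. Treating `schedule create` as payload-identical is also an assumption.

```typescript
// Assumed descriptor shape; the real CapabilityDescriptor may differ.
type SketchSurface = "both" | "cli-only" | "mcp-only";

interface SketchCapability {
  coreEntry: string;        // shared core entry both surfaces route through
  cli: string | null;       // CLI command, or null when not on the CLI
  mcp: string | null;       // MCP tool name, or null when not on MCP
  surface: SketchSurface;
  payloadIdentical: boolean;
  reason?: string;          // recorded reason for any declared divergence
}

const sketchRegistry: SketchCapability[] = [
  { coreEntry: "help.render", cli: "help", mcp: null, surface: "cli-only",
    payloadIdentical: false, reason: "human help text; MCP hosts use tools/list" },
  { coreEntry: "runner.commit", cli: "commit", mcp: "cw_commit", surface: "both",
    payloadIdentical: false, reason: "declared projection via commitEnvelope" },
  { coreEntry: "schedule.create", cli: "schedule create", mcp: "cw_schedule_create",
    surface: "both", payloadIdentical: true },
];

// Describe how the two payloads relate for one capability.
function payloadRelation(cap: SketchCapability): string {
  if (cap.surface !== "both") return "n/a";
  return cap.payloadIdentical ? "identical" : "projected";
}

// One matrix row per registry entry: the matrix is computed, never hand-written.
function deriveParityMatrix(registry: SketchCapability[]): string[] {
  return registry.map((cap) =>
    [cap.cli ?? "-", cap.mcp ?? "-", cap.coreEntry, cap.surface, payloadRelation(cap)].join(" | ")
  );
}

const parityMatrix = deriveParityMatrix(sketchRegistry);
// parityMatrix[1] is "commit | cw_commit | runner.commit | both | projected"
```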

2. The Human Contract and the Machine Contract Do Not Interfere

The two surfaces have different contracts and must not leak into each other:

  • CLI = human speed. Default output is terse, scannable text with meaningful exit codes. The canonical payload is available on demand via --json or --format json. Human formatting is never emitted on the machine path.
  • MCP = machine context. The result is always complete, stable, structured JSON. Machine completeness is never forced into the default human view.

A capability marked payloadIdentical returns the same canonical JSON from cw <cmd> --json and from the cw_<tool> MCP result, whitespace and generation-moment ISO timestamps aside. The --json payload is the contract: it carries the same canonical content the MCP tool returns. The human text view is policy layered on top; it never changes the payload.
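
One way to compare two payloads "whitespace and timestamps aside" is to parse both, mask timestamp strings, and compare the re-serialized text. The sketch below is an assumption about how such a comparison could work, not the project's actual check. It masks every ISO 8601 UTC string (a simplification of "generation-moment" timestamps) and treats key order as significant, in line with "same order" in the mantra.

```typescript
// Matches ISO 8601 UTC timestamps such as 2026-06-20T12:34:56.789Z (assumed format).
const ISO_TIMESTAMP = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?Z$/;

// Replace every timestamp-shaped string with a fixed placeholder, recursively.
function maskTimestamps(value: unknown): unknown {
  if (typeof value === "string") {
    return ISO_TIMESTAMP.test(value) ? "<timestamp>" : value;
  }
  if (Array.isArray(value)) {
    return value.map((item) => maskTimestamps(item));
  }
  if (value !== null && typeof value === "object") {
    const source = value as Record<string, unknown>;
    const masked: Record<string, unknown> = {};
    for (const key of Object.keys(source)) {
      masked[key] = maskTimestamps(source[key]);
    }
    return masked;
  }
  return value;
}

// Parsing drops insignificant whitespace; re-serializing yields a comparable form.
function canonicalize(jsonText: string): string {
  return JSON.stringify(maskTimestamps(JSON.parse(jsonText)));
}

function payloadsMatch(cliJson: string, mcpJson: string): boolean {
  return canonicalize(cliJson) === canonicalize(mcpJson);
}

// Same content, different whitespace and generation time: still a match.
const cliOutput = '{\n  "runId": "r1",\n  "generatedAt": "2026-06-20T10:00:00.000Z"\n}';
const mcpOutput = '{"runId":"r1","generatedAt":"2026-06-20T10:00:05.250Z"}';
const sameContract = payloadsMatch(cliOutput, mcpOutput); // true

// A changed field is a real divergence.
const driftedOutput = '{"runId":"r2","generatedAt":"2026-06-20T10:00:00.000Z"}';
const driftDetected = !payloadsMatch(cliOutput, driftedOutput); // true
```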

3. Divergence Is Declared, Never Silent

A capability may live on one surface only, or carry a divergent payload — but never silently. It must carry a recorded reason in the registry. Several capabilities are CLI-only; for example:

  • help — human help text. MCP hosts enumerate capabilities via tools/list, not a help command.
  • loop — a convenience alias of schedule create with kind=loop. MCP hosts use cw_schedule_create with kind=loop.
  • schedule daemon — a long-running desktop daemon process, not a request/response tool. MCP hosts drive ticks via cw_schedule_due and cw_schedule_run_now.

One capability is intentionally payload-divergent (projected): commit. Both surfaces route through the single core entry runner.commit. The CLI emits the raw StateCommitResult for scripting (commit.id, commit.evidence, commit.gate, commit.acceptanceRationale); cw_commit emits the operator commit envelope (commitId, verifierGated, checkpoint, evidenceCount, snapshotPath, nextActions, plus the raw result under commit). This is a declared projection via capability-core.commitEnvelope, not drift.
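
The projection can be pictured as a pure function from the raw result to the envelope. The sketch below is illustrative only: the `SketchCommitResult` shape, the `gate.passed` field, and the way `verifierGated` is derived are assumptions. The envelope fields checkpoint, snapshotPath, and nextActions are left out because this page does not say how they are produced.

```typescript
// Assumed shape of the raw result; the real StateCommitResult may differ.
interface SketchCommitResult {
  id: string;
  evidence: string[];
  gate: { passed: boolean };
  acceptanceRationale: string;
}

// Partial envelope: only fields that can be derived from the assumed raw shape.
interface SketchCommitEnvelope {
  commitId: string;
  verifierGated: boolean;
  evidenceCount: number;
  commit: SketchCommitResult; // the raw result, preserved under "commit"
}

// Stand-in for the single core entry both surfaces route through.
function runnerCommitSketch(): SketchCommitResult {
  return {
    id: "c-001",
    evidence: ["test-log", "diff-summary"],
    gate: { passed: true },
    acceptanceRationale: "all checks passed",
  };
}

// Declared projection: summary fields are derived, nothing in the raw result is lost.
function projectCommitEnvelope(raw: SketchCommitResult): SketchCommitEnvelope {
  return {
    commitId: raw.id,
    verifierGated: raw.gate.passed, // assumption: gate outcome drives this flag
    evidenceCount: raw.evidence.length,
    commit: raw,
  };
}

const rawForCli = runnerCommitSketch();                             // what the CLI would emit
const envelopeForMcp = projectCommitEnvelope(runnerCommitSketch()); // what cw_commit would emit
```

Because the envelope embeds the raw result unchanged, the two payloads differ by a known, reversible wrapper, which is the sense in which the divergence is declared rather than drift.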

4. Fail Closed on Drift

The parity gate fails closed. Any of the following is a release-blocking error:

  • a capability present on one surface but missing from the other, unless it is a declared single-surface capability (section 3)
  • an MCP tool that is live but not declared in the registry
  • a CLI command or token that is live but not declared in the registry
  • a surface-specific or payload-divergent capability with no recorded reason
  • a payload divergence on a capability marked payloadIdentical — that is, cw <cmd> --json and cw_<tool> returning different canonical JSON

There is no "fix it later" path. A surface mismatch blocks the release until the registry, the surfaces, and the recorded reasons agree.
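
The structural rules above can be sketched as a function that collects violations and a wrapper that throws if any exist. This is a hypothetical illustration, not the code in scripts/parity-check.js: the `GateCapability` shape and function names are assumptions, and the payload-comparison rule is omitted to keep the sketch short.

```typescript
// Assumed descriptor shape for this sketch only.
interface GateCapability {
  id: string;
  cli: string | null;
  mcp: string | null;
  payloadIdentical: boolean;
  reason?: string;
}

function findParityViolations(
  registry: GateCapability[],
  liveCli: string[],
  liveMcp: string[]
): string[] {
  const violations: string[] = [];
  const declaredCli: string[] = [];
  const declaredMcp: string[] = [];

  for (const cap of registry) {
    if (cap.cli !== null) declaredCli.push(cap.cli);
    if (cap.mcp !== null) declaredMcp.push(cap.mcp);

    // Single-surface or payload-divergent capabilities need a recorded reason.
    const singleSurface = cap.cli === null || cap.mcp === null;
    if ((singleSurface || !cap.payloadIdentical) && !cap.reason) {
      violations.push(`${cap.id}: divergence declared without a reason`);
    }
    // Every declared binding must actually be live on its surface.
    if (cap.cli !== null && liveCli.indexOf(cap.cli) === -1) {
      violations.push(`${cap.id}: declared CLI command "${cap.cli}" is not live`);
    }
    if (cap.mcp !== null && liveMcp.indexOf(cap.mcp) === -1) {
      violations.push(`${cap.id}: declared MCP tool "${cap.mcp}" is not live`);
    }
  }
  // Nothing live may be undeclared.
  for (const command of liveCli) {
    if (declaredCli.indexOf(command) === -1) violations.push(`undeclared live CLI command "${command}"`);
  }
  for (const tool of liveMcp) {
    if (declaredMcp.indexOf(tool) === -1) violations.push(`undeclared live MCP tool "${tool}"`);
  }
  return violations;
}

// Fail closed: any violation aborts instead of producing a warning.
function assertParity(registry: GateCapability[], liveCli: string[], liveMcp: string[]): void {
  const violations = findParityViolations(registry, liveCli, liveMcp);
  if (violations.length > 0) {
    throw new Error("parity gate failed:\n" + violations.join("\n"));
  }
}

const gateRegistry: GateCapability[] = [
  { id: "commit", cli: "commit", mcp: "cw_commit", payloadIdentical: false, reason: "declared projection" },
  { id: "help", cli: "help", mcp: null, payloadIdentical: false, reason: "MCP hosts use tools/list" },
  { id: "schedule-create", cli: "schedule create", mcp: "cw_schedule_create", payloadIdentical: true },
];
const gateLiveCli = ["commit", "help", "schedule create"];
const gateLiveMcp = ["cw_commit", "cw_schedule_create"];

const cleanViolations = findParityViolations(gateRegistry, gateLiveCli, gateLiveMcp);
const driftViolations = findParityViolations(gateRegistry, gateLiveCli, gateLiveMcp.concat(["cw_rogue"]));
```

Here `cleanViolations` is empty, while the extra live tool `cw_rogue` (a made-up name) produces exactly one violation, so `assertParity` would throw for the drifted state.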

5. Enforced and Smoke-Covered, Not Conventional

Parity is checked by scripts/parity-check.js --check, run by npm run parity:check and wired into npm run release:check. The check loads the registry, enumerates the live CLI commands and MCP tools, and fails closed on any of the rules above.

test/cli-mcp-parity-smoke.js proves the contract end to end. It verifies registry ⇄ CLI ⇄ MCP coverage (every declared capability resolves on its declared surfaces and nothing live is undeclared), confirms --json output equals the MCP payload for every payloadIdentical capability, confirms the declared commit projection, and confirms fail-closed behavior by injecting drift — a removed peer, an undeclared tool, a reasonless exception, a mutated payload — and asserting the gate rejects each one. It is included in npm test and npm run release:check.
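
The drift-injection idea can be sketched as follows: start from a state the gate accepts, apply one mutation at a time, and require the gate to reject each. Everything below is a hypothetical miniature (the `SmokeWorld` shape, `gatePasses`, and the `cw_rogue` tool name are invented for illustration) and is not the contents of test/cli-mcp-parity-smoke.js.

```typescript
interface SmokeCapability {
  cli: string | null;
  mcp: string | null;
  payloadIdentical: boolean;
  reason?: string;
}

interface SmokeWorld {
  registry: SmokeCapability[];
  liveCli: string[];
  liveMcp: string[];
  cliPayloads: Record<string, string>; // canonical JSON keyed by CLI command
  mcpPayloads: Record<string, string>; // canonical JSON keyed by MCP tool
}

// Compact stand-in for the parity gate: true only when every rule holds.
function gatePasses(world: SmokeWorld): boolean {
  for (const cap of world.registry) {
    if (cap.cli !== null && world.liveCli.indexOf(cap.cli) === -1) return false;
    if (cap.mcp !== null && world.liveMcp.indexOf(cap.mcp) === -1) return false;
    const onBoth = cap.cli !== null && cap.mcp !== null;
    if ((!onBoth || !cap.payloadIdentical) && !cap.reason) return false;
    if (cap.cli !== null && cap.mcp !== null && cap.payloadIdentical &&
        world.cliPayloads[cap.cli] !== world.mcpPayloads[cap.mcp]) return false;
  }
  const declaredCli = world.registry.map((cap) => cap.cli);
  const declaredMcp = world.registry.map((cap) => cap.mcp);
  return world.liveCli.every((command) => declaredCli.indexOf(command) !== -1) &&
         world.liveMcp.every((tool) => declaredMcp.indexOf(tool) !== -1);
}

// A fresh, consistent state for every injection so mutations never leak.
function baselineWorld(): SmokeWorld {
  return {
    registry: [
      { cli: "help", mcp: null, payloadIdentical: false, reason: "MCP hosts use tools/list" },
      { cli: "schedule create", mcp: "cw_schedule_create", payloadIdentical: true },
    ],
    liveCli: ["help", "schedule create"],
    liveMcp: ["cw_schedule_create"],
    cliPayloads: { "schedule create": '{"ok":true}' },
    mcpPayloads: { "cw_schedule_create": '{"ok":true}' },
  };
}

const driftInjections: { label: string; inject: (world: SmokeWorld) => void }[] = [
  { label: "removed peer", inject: (w) => { w.liveMcp = w.liveMcp.filter((t) => t !== "cw_schedule_create"); } },
  { label: "undeclared tool", inject: (w) => { w.liveMcp.push("cw_rogue"); } },
  { label: "reasonless exception", inject: (w) => { delete w.registry[0].reason; } },
  { label: "mutated payload", inject: (w) => { w.mcpPayloads["cw_schedule_create"] = '{"ok":false}'; } },
];

const baselineAccepted = gatePasses(baselineWorld());
// Labels of injections the gate wrongly accepted; an empty list means every drift was rejected.
const missedDrift = driftInjections
  .filter((injection) => {
    const world = baselineWorld();
    injection.inject(world);
    return gatePasses(world);
  })
  .map((injection) => injection.label);
```

Checking the baseline first matters: a gate that rejects everything would also "reject" each injection, so the test is only meaningful if the unmodified state passes.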

What v0.1.27 Closed

v0.1.27 closed the historical gaps. It added the MCP peers cw_init, cw_next, cw_state_check, cw_contract_show, cw_node_list, cw_node_show, and cw_node_graph, and the CLI peers app run, operator status, operator report, sandbox choose, sandbox resolve, and report --json. Everything else is on both surfaces, apart from the declared exceptions in section 3.

Why It Matters

Parity is the seam that lets the rest of the system stay surface-agnostic. Because both front doors render one declared registry, every later layer can be added once and reach both. The v0.1.28 Run Registry Control Plane added 13 control-plane capabilities declared once in the capability registry and validated by this same fail-closed gate, so each cw <cmd> --json is schema-identical to its cw_<tool>. The v0.1.29 Execution Backends layer is inspected via backend list|show|probe and selected by --backend (parallel to --sandbox), with a result/evidence envelope that is schema-identical across backends — so the parity surface is unchanged regardless of which backend executed a run.

Later capabilities follow the same rule. cw run --drive --incremental (incremental resume: reuse the cached result of every step whose inputs are unchanged) is declared once in the registry and validated by this same gate, so its cw run --json payload is schema-identical to the cw_run_drive MCP result and incremental resume behaves the same on both front doors. In CW, parity is not a convention; it is a derived, declared, and enforced property of the build. A capability is not done until its parity is documented and tested.

How A Capability Is Added

A new runtime capability is added once, as one row in the declarative BUILTIN_CAPABILITIES: CapabilityDescriptor[] table in src/capability-registry.ts. The row names ONE shared core entry (the single source both surfaces route through) plus its cli and mcp bindings. Both front doors read that one table; the parity gate fails closed on any drift between them. There is no separate per-surface registration step — the table is the mechanism, and the two renderings are policy.
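
As a rough illustration of "one row, two front doors", the sketch below adds a single row to a table and lets both dispatch paths find the same core function through it. The row shape, the `example echo` command, and the `cw_example_echo` tool are invented for this example; the real BUILTIN_CAPABILITIES rows are not shown on this page and may look different.

```typescript
interface EchoResult {
  echoed: string;
}

// Assumed row shape: one core entry plus its CLI and MCP bindings.
interface RowDescriptor {
  core: (input: string) => EchoResult;
  cli: string;
  mcp: string;
}

// Adding a capability means adding one row here and nothing per surface.
const SKETCH_TABLE: RowDescriptor[] = [
  { core: (input) => ({ echoed: input }), cli: "example echo", mcp: "cw_example_echo" },
];

// Both front doors resolve capabilities through the same table.
function lookupRow(match: (row: RowDescriptor) => boolean): RowDescriptor {
  const row = SKETCH_TABLE.filter(match)[0];
  if (row === undefined) {
    throw new Error("capability not declared in the table");
  }
  return row;
}

// CLI front door: look up by command, emit the canonical JSON text.
function dispatchCli(command: string, input: string): string {
  return JSON.stringify(lookupRow((row) => row.cli === command).core(input));
}

// MCP front door: look up by tool name, return the structured result.
function dispatchMcp(tool: string, input: string): EchoResult {
  return lookupRow((row) => row.mcp === tool).core(input);
}

const viaCli = dispatchCli("example echo", "hi"); // '{"echoed":"hi"}'
const viaMcp = dispatchMcp("cw_example_echo", "hi");
```

An undeclared command or tool has no row to resolve, so it fails immediately instead of reaching logic that only one surface knows about.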
