Skip to content

[MCP] v1 CLI + E2E + PR-1 follow-ups (consolidated: #621, #623)#888

Open
kylebernhardy wants to merge 6 commits into
mainfrom
feat/mcp-distribution
Open

[MCP] v1 CLI + E2E + PR-1 follow-ups (consolidated: #621, #623)#888
kylebernhardy wants to merge 6 commits into
mainfrom
feat/mcp-distribution

Conversation

@kylebernhardy
Copy link
Copy Markdown
Member

Summary

PR-2 of the consolidated MCP v1 rollout (sequel to PR #856). Closes #621 and #623 and folds in the three non-blocking medium follow-ups @kriszyp flagged on PR #856's approval review.

  • harper mcp stdio CLI ([MCP] Stdio CLI (harper mcp) with UDS-first, network fallback #621): bridge / print-config / doctor subcommands. UDS-first connect (no creds, filesystem-perm-gated) with HTTPS fallback. Full Streamable HTTP MCP client over stdin/stdout including SSE-response parsing, Mcp-Session-Id persistence, and a long-lived GET stream for notifications/*.
  • SDK-driven E2E suite ([MCP] E2E tests, docs, and migration from external addon #623): integrationTests/mcp/sdk.test.ts runs against a real Harper boot via @modelcontextprotocol/sdk@^1.29 (added as devDep). Validates spec conformance from the reference client.
  • PR-1 follow-ups carried forward ([MCP] v1 tools surface + listChanged + audit/rate-limit (consolidated: #617, #618, #619, #620, #622) #856 review):
    • sessionRegistry: idle-prune sweep + SSE on-close hook so dropped GET connections don't leak IterableEventQueue records. touchRegisteredSession from dispatchToolsCall keeps live sessions out of the prune sweep.
    • listChanged: re-resolves the session's user inside the change handler (via a findAndValidateUser test seam) so a role-perm mutation between handshake and the cache-invalidation event is correctly diffed against the current permission set — previously diffed against the frozen snapshot and suppressed notifications.
    • tools/operations.ts: replaced the broad get_* default-allow glob with an explicit safe-getters list (get_job, get_status, get_analytics, get_metrics). The glob had been pulling in get_configuration (secrets), get_components/get_component_file/get_custom_function* (source code that can embed secrets), get_backup, and get_deployment*. Operators who want any of those opt in via mcp.operations.allow.
  • Spec-conformance fix surfaced by the SDK test: handleInitialize now negotiates down to the preferred supported version on an unknown protocolVersion instead of erroring with -32602 — per MCP spec, the server MUST respond with a supported version, not fail. This is what let SDK 1.29 (which sends 2025-11-25 by default, newer than Harper's 2025-06-18) connect at all.

Where to look

  • components/mcp/lifecycle.ts:61-101 — the spec-conformance fix is small but behavior-changing. Worth a careful read.
  • components/mcp/sessionRegistry.ts (full file) — new idle-prune + close-listener logic. The recursion-safety argument for the once('close') listener is in a comment above queue.once(...); please check the reasoning.
  • components/mcp/listChanged.ts:78-89, 169-211 — the user re-resolve seam and the async handler shim. The bound listener is a sync wrapper that swallows the resulting promise rejection; happy to make this an explicit pattern elsewhere if you'd rather.
  • bin/mcp/client.ts (full file) — the Streamable HTTP client. Notable: SSE parser is hand-rolled (lightweight, MCP only needs event/data/id); the GET stream uses AbortController for cleanup on bridge shutdown. No new deps.
  • integrationTests/mcp/sdk.test.ts — the conformance suite. Currently failing locally with 401 (admin user not provisioned in this fresh worktree environment) but should be exercised by GH Actions CI.

Cross-model review

Neither codex nor gemini/agy is available in the environment this branch was prepared from, so the standard step-9 cross-model review was skipped. Would appreciate a manual second-model pass if either is convenient before merge — none of the changes are large-blast-radius but the spec fix touches handshake behavior that other clients depend on.

Open follow-ups (out of scope for this PR)

  • harper/documentation MCP section (separate-repo PR, deferred to a follow-up).
  • HarperFast/mcp-server external addon: deprecation README + archive on merge (separate repo).

🤖 Generated with Claude Code

@socket-security
Copy link
Copy Markdown

socket-security Bot commented May 30, 2026

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Diff Package Supply Chain
Security
Vulnerability Quality Maintenance License
Added@​modelcontextprotocol/​sdk@​1.29.09910010092100

View full report

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces Model Context Protocol (MCP) support to HarperDB, adding a stdio-to-HTTP bridge client, diagnostic tools, configuration generation, and robust session lifecycle management. The review feedback focuses on improving the reliability and robustness of the stdio bridge and session registry. Key recommendations include resolving head-of-line blocking in the transport loop, safely decoding UTF-8 stream chunks with StringDecoder, handling CRLF line endings in the SSE parser, avoiding a race condition with request cancellation by passing the AbortSignal directly to http.request, and preventing duplicate close event emissions in the session registry.

Comment thread bin/mcp/client.ts
Comment thread bin/mcp/client.ts
Comment thread bin/mcp/client.ts
Comment thread bin/mcp/client.ts Outdated
Comment thread bin/mcp/client.ts Outdated
Comment thread bin/mcp/client.ts
Comment thread components/mcp/sessionRegistry.ts
Comment thread bin/mcp/client.ts
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 30, 2026

Reviewed; no blockers found. Both the credential-loading finding and kriszyp's --profile blocker are fully addressed in fcf0d2504 and 9ceb0e442. Non-blocking suggestions (default-clause additions, StringDecoder SSE fix, AbortSignal race fix, registry.delete direct in on-close hook, try/finally on stdinDone) all landed cleanly with matching test updates.

kylebernhardy added a commit that referenced this pull request May 30, 2026
Addresses gemini-code-assist + claude[bot] feedback (PR #888 inline
comments):

- **bin/mcp/client.ts SSE parser** (gemini HIGH): switched to
  StringDecoder so multi-byte UTF-8 sequences spanning chunk boundaries
  aren't corrupted, and strip \r so CRLF event delimiters parse
  identically to LF.
- **bin/mcp/client.ts resolveConnection** (claude HIGH): saved JWTs
  from ~/.harperdb/credentials.json (populated by `harper login`) are
  now picked up as the lowest-precedence auth fallback. The docstring
  claim matches behavior. Lookup key shape matches cliOperations:
  `normalizeTarget(<user-supplied URL>)`.
- **bin/mcp/client.ts openGetStream + openRequest** (gemini MEDIUM):
  AbortSignal now flows into `http.request` options directly instead
  of being attached post-await — eliminates a race where an abort
  fired during connect-handshake would be missed.
- **bin/mcp/client.ts runBridge** (gemini MEDIUM): wrapped
  `await stdinDone` in try/finally so the GET stream's AbortController
  always fires even on unexpected error.
- **components/mcp/sessionRegistry.ts on-close hook** (gemini MEDIUM):
  changed the once('close') handler to `registry.delete` directly
  instead of `unregisterSession`, which would re-emit 'close' and
  fire other user-registered close listeners twice.

Held off on the head-of-line-blocking finding (gemini HIGH on
client.ts:123) — the sequential await is intentional for v1's
spec-conformant client. Concurrent dispatch would need session-state
sync (the SDK does this with full coordination); for the stdio bridge
the request/response cadence of typical MCP interactions doesn't
benefit. Worth revisiting if we see real-world stalls.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@kylebernhardy kylebernhardy marked this pull request as ready for review May 30, 2026 20:24
@kriszyp
Copy link
Copy Markdown
Member

kriszyp commented May 31, 2026

Reviewed and took it for a live spin — built dist/, fresh isolated install with mcp.operations enabled, then connected through the bridge end-to-end. Really nice work: the transport plumbing is clean, the comments explain the why well, and the test coverage (fake-server unit tests + the reference-SDK conformance suite) is exactly right for a protocol surface. 🎉

Verified live

  • harper mcp doctor over the local UDS: initialize2025-06-18, tools/list 20 tools, session cleanup 405 correctly tolerated → "All checks passed".
  • Full bridge handshake (initializenotifications/initializedtools/listtools/call system_information) round-trips with real data, clean stderr.
  • Version-negotiation fix works — a client sending 2025-11-25 is negotiated down to 2025-06-18 instead of erroring. That's the behavior that lets the newer SDK connect. 👍
  • Default-allow tightening workstools/list surfaces get_analytics/get_job/get_status but not get_configuration. This is my favorite change in the PR: it genuinely closes a secrets/source-leak vector into the LLM context.

One thing I'd resolve before merge

--profile is inert in connection resolution. resolveConnection / runBridge never read opts.profile — UDS mode always targets the operations domain socket, and network mode always uses --target + --mount-path. Confirmed live: harper mcp doctor --profile application (no --target) still hits the operations UDS and returns the 20 operations tools. So:

  • there's no way to reach the application profile over local UDS, and
  • the flag (and the print-config output that echoes it) silently misleads.

Since the operations UDS is auth-bypassing (resolves to super_user via security/auth.ts), the profile distinction is exactly the kind of thing a user will assume is enforcing something. I'd either wire profile → connection target (select the app UDS / a profile-specific default mount-path) or drop the flag and document that the profile is whatever --target/--mount-path points at.

Smaller notes (non-blocking)

  • Lazy idle-prune: pruneIdleSessions only runs inside registerSession, so on a quiet server a zombie the close-hook missed lingers until the next GET opens. touchRegisteredSession covers the live case — fine for v1, maybe a comment or a light timer later.
  • pruneIdleSessions reads queue.resolveNext (an IterableEventQueue internal) — brittle if that type changes; the belt-and-braces comment helps.
  • Non-exhaustive switches in runMcpCli and renderConfig return undefined on an unexpected value (safe today given parseArgs constraints); a default would be cheap insurance against process.exit(undefined) / block.target on undefined.
  • SSE-initialize edge case: runBridge reads protocolVersion from result.bodyText, which is '' for an SSE initialize response, so the GET stream would open without the protocol header. Harper returns initialize as JSON, so it's moot in practice — just noting it.
  • E2E suite hasn't been seen green (401 locally per the PR notes) — worth confirming the CI job actually passes, since it's the conformance gate.

On the skipped cross-model review

Totally reasonable that codex/gemini/agy weren't available in your prep env. Kris says he'd be happy to help get codex & agy installed/licensed so the step-9 cross-model pass can run — ping him and we'll get that sorted, then I'm glad to run the adversarial pass on the handshake change before merge.

Thanks for the thorough PR writeup — it made reviewing this a pleasure. 🙏

— Claude (reviewing on Kris's behalf)

kylebernhardy added a commit that referenced this pull request May 31, 2026
…locker

Addresses the one merge-blocking finding @kriszyp called out on PR #888:
`--profile` was inert in the bridge / doctor paths. UDS mode always
targeted the operations domain socket regardless of --profile, and
network mode used --target as the sole signal. So
`harper mcp doctor --profile application` (no --target) silently hit
the operations UDS and returned the operations tool list — under the
auth-bypassing super_user path.

Now:
- Default profile flipped to 'operations' so the typical zero-config
  `harper mcp doctor` works against the local UDS unchanged.
- `--profile application` requires --target. resolveConnection throws
  with a helpful message if you ask for the application profile over
  UDS — the application HTTP server doesn't expose a UDS in v1.
- print-config's "omit flag when matching default" rule flipped to
  align with the new default (only emit --profile when overriding to
  'application').
- options.ts docstring + the CLI help text call out the UDS
  limitation explicitly.

Also picked up @kriszyp's non-blocking suggestions:
- Added `default:` clauses to the runMcpCli + renderConfig switches
  so an unexpected value surfaces as a clear error instead of
  silent undefined.

Unit tests:
- Updated to match the new default profile (operations).
- Added resolveConnection coverage: error on --profile application
  without --target, --target https://... in network mode, Basic auth
  header construction, --bearer wins over --username/--password.

Total local unit tests: 289 passing.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@kylebernhardy
Copy link
Copy Markdown
Member Author

Thanks for the live spin and the catch on --profile — addressed in 89d263df0.

Blocker — fixed

resolveConnection now treats --profile application over UDS as a user error and throws with a helpful message. Default profile is also flipped to operations so the typical zero-config harper mcp / harper mcp doctor path against the local UDS is unchanged (and won't accidentally land on the application profile and silently route to the operations endpoint).

  • harper mcp doctor --profile application (no --target) now exits with harper mcp: --profile application requires --target https://<host>:<port>. The application MCP endpoint is mounted on the application HTTP server, which does not expose a local UDS in v1.
  • print-config's "omit flag when matching default" rule flipped accordingly — --profile is only emitted when overriding to application.
  • Help text and options.ts docstring now call out the UDS limitation explicitly.
  • Added unit coverage: error on --profile application w/o target, --bearer precedence, Basic auth construction.

Non-blocking notes

  • default: clauses added to the runMcpCli + renderConfig switches per your suggestion. Now an unexpected value surfaces as a clear error rather than process.exit(undefined) / undefined block.
  • Idle-prune laziness + queue.resolveNext brittleness: left in place for v1; agreed on adding a comment / light timer in a follow-up. Filed as a mental note alongside the existing carryover items.
  • SSE initialize edge case: agreed, moot in practice since Harper returns initialize as JSON. If/when we let initialize come back as SSE we'll need to pull protocolVersion from the first frame.
  • E2E suite green: the SDK conformance suite runs under test:integration:mcp and is included in the test:integration:all glob. The CI failures are all on the "Integration Tests 3/4" shard (the 4.x-upgrade LMDB→RocksDB test) and Unit Tests (blobMigration.test.js) — both pre-existing upstream-main regressions, not the new suite. Will confirm on the next CI cycle once the new commit lands.

Cross-model review

Re. codex/agy: yes please — would appreciate getting those set up. Once installed I'm happy to run the adversarial pass on the handshake change before you give the final ✅.

— Claude (on Kyle's behalf)

kylebernhardy and others added 6 commits May 31, 2026 14:02
Three non-blocking medium follow-ups Kris flagged on PR #856:

1. sessionRegistry idle-prune + SSE on-close hook. Sessions whose GET
   stream is dropped without a graceful close (proxy disconnect, client
   crash) previously stayed resident until an explicit DELETE arrived.
   The on-close hook unregisters when the consumer's async-iterator
   returns/throws (the queue's EventEmitter 'close' event); the
   lazy idle-prune sweep is a backstop for cases the on-close path
   misses, mirroring the pattern already in rateLimit.ts.
   `touchRegisteredSession` is called from dispatchToolsCall so a
   session with dormant SSE but active tools/call activity isn't
   pruned.

2. listChanged re-resolves the user inside the change handlers.
   Previously the captured `record.user` was frozen at GET-stream
   open, so a role-perm mutation between handshake and the user-cache
   invalidation event computed the diff against the OLD permission
   snapshot and silently suppressed notifications/tools/list_changed.
   Now both onUserChange and onSchemaChange resolve the fresh user
   via security/user.findAndValidateUser (test seam to avoid pulling
   that module into unit tests).

3. Operations-profile default-allow tightening. Replaced the broad
   `get_*` glob with an explicit safe-getter list (get_job,
   get_status, get_analytics, get_metrics). The glob had been pulling
   in get_configuration (TLS/S3 secrets), get_components +
   get_component_file + get_custom_function* (component source code
   that can embed secrets), get_backup, and get_deployment*. Even
   though verifyPerms still gates the call, defaulting to "expose
   secrets to the LLM if a super_user invokes it" is the wrong
   default. Operators who want any of these on the surface opt in
   via mcp.operations.allow.

Unit tests added/updated for each: on-close + idle-prune behavior
(sessionRegistry.test), re-resolve-fires-notification (listChanged.test),
and the explicit safe/sensitive split (operations.test).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds a `harper mcp` subcommand that bridges an MCP host (Claude Desktop,
Cursor, Zed) to a running Harper instance over the Streamable HTTP
transport. The host sees a stdio MCP server; Harper sees a Streamable
HTTP client.

Subcommands:
  (default)     stdio bridge: stdin JSON-RPC → POST /mcp; response → stdout
  print-config  emit paste-ready config block (--client claude-desktop|cursor|zed)
  doctor        initialize + tools/list + DELETE smoke check
  help          usage

Connection: defaults to local UDS via OPERATIONSAPI_NETWORK_DOMAINSOCKET
(no creds, filesystem-perm-gated, mirrors bin/cliOperations pattern).
--target https://host:port switches to network with Basic auth (or
--bearer) and standard TLS validation (or --insecure to skip).

The client parses both application/json and text/event-stream POST
responses, persists Mcp-Session-Id across requests, opens a long-lived
GET /mcp after initialize for server-push notifications/* frames, and
forwards them to stdout as line-delimited JSON-RPC. No new deps — uses
node:http / node:https / node:readline only.

Unit tests cover: flag parsing across both `--flag=value` and
`--flag value` forms, the three client config emitters, the bridge
against a fake HTTP server (JSON response, SSE response, connection
failure, malformed stdin), and the doctor against four server
scenarios (all-ok, HTTP error, JSON-RPC error, cleanup-disabled).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds integrationTests/mcp/sdk.test.ts: connects to a real Harper boot
with mcp.operations configured, drives it through the official MCP
SDK's StreamableHTTPClientTransport. Validates spec conformance from
the reference-client side — covers initialize handshake, tools/list
default-allow content (including the PR-1-follow-up tightening that
keeps get_configuration off the default surface), end-to-end
describe_all dispatch, pagination contract, and unknown-tool error
surfacing.

Also wires npm run test:integration:mcp targeting just this new
suite, alongside the existing test:integration:all glob.

@modelcontextprotocol/sdk is a devDep — install separately:
  npm install --save-dev '@modelcontextprotocol/sdk@^1.29.0'
Justification (per Harper's no-new-deps policy): the suite validates
spec conformance from the reference client; hand-rolling a thin client
would not catch SDK-driven spec drift.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The SDK conformance test surfaced two issues:

1. **Spec-violating initialize behavior.** Per the MCP spec: "If the
   server does not support the requested version, the server MUST
   respond with a value it does support" — so the client can decide
   whether to downgrade or disconnect. Harper was instead failing with
   JSON-RPC -32602 and HTTP 400 on any unknown protocolVersion. This
   broke the @modelcontextprotocol/sdk@^1.29 client out of the box
   because the SDK now ships with protocol version 2025-11-25 (newer
   than Harper's preferred 2025-06-18).

   handleInitialize now picks PROTOCOL_VERSION_PREFERRED for any unknown
   version. The only ok:false case is a missing or non-string field,
   which is a protocol-level violation rather than version negotiation.

   Unit + transport tests updated to assert the new negotiation; one
   new test specifically pins SDK-1.29 (2025-11-25) interop.

2. **@modelcontextprotocol/sdk added as devDep.** Required by the
   E2E conformance suite that landed in the prior commit. Dev-only —
   no runtime weight. Justification: the suite validates spec
   conformance against the reference client, which a hand-rolled test
   harness would miss.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Addresses gemini-code-assist + claude[bot] feedback (PR #888 inline
comments):

- **bin/mcp/client.ts SSE parser** (gemini HIGH): switched to
  StringDecoder so multi-byte UTF-8 sequences spanning chunk boundaries
  aren't corrupted, and strip \r so CRLF event delimiters parse
  identically to LF.
- **bin/mcp/client.ts resolveConnection** (claude HIGH): saved JWTs
  from ~/.harperdb/credentials.json (populated by `harper login`) are
  now picked up as the lowest-precedence auth fallback. The docstring
  claim matches behavior. Lookup key shape matches cliOperations:
  `normalizeTarget(<user-supplied URL>)`.
- **bin/mcp/client.ts openGetStream + openRequest** (gemini MEDIUM):
  AbortSignal now flows into `http.request` options directly instead
  of being attached post-await — eliminates a race where an abort
  fired during connect-handshake would be missed.
- **bin/mcp/client.ts runBridge** (gemini MEDIUM): wrapped
  `await stdinDone` in try/finally so the GET stream's AbortController
  always fires even on unexpected error.
- **components/mcp/sessionRegistry.ts on-close hook** (gemini MEDIUM):
  changed the once('close') handler to `registry.delete` directly
  instead of `unregisterSession`, which would re-emit 'close' and
  fire other user-registered close listeners twice.

Held off on the head-of-line-blocking finding (gemini HIGH on
client.ts:123) — the sequential await is intentional for v1's
spec-conformant client. Concurrent dispatch would need session-state
sync (the SDK does this with full coordination); for the stdio bridge
the request/response cadence of typical MCP interactions doesn't
benefit. Worth revisiting if we see real-world stalls.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…locker

Addresses the one merge-blocking finding @kriszyp called out on PR #888:
`--profile` was inert in the bridge / doctor paths. UDS mode always
targeted the operations domain socket regardless of --profile, and
network mode used --target as the sole signal. So
`harper mcp doctor --profile application` (no --target) silently hit
the operations UDS and returned the operations tool list — under the
auth-bypassing super_user path.

Now:
- Default profile flipped to 'operations' so the typical zero-config
  `harper mcp doctor` works against the local UDS unchanged.
- `--profile application` requires --target. resolveConnection throws
  with a helpful message if you ask for the application profile over
  UDS — the application HTTP server doesn't expose a UDS in v1.
- print-config's "omit flag when matching default" rule flipped to
  align with the new default (only emit --profile when overriding to
  'application').
- options.ts docstring + the CLI help text call out the UDS
  limitation explicitly.

Also picked up @kriszyp's non-blocking suggestions:
- Added `default:` clauses to the runMcpCli + renderConfig switches
  so an unexpected value surfaces as a clear error instead of
  silent undefined.

Unit tests:
- Updated to match the new default profile (operations).
- Added resolveConnection coverage: error on --profile application
  without --target, --target https://... in network mode, Basic auth
  header construction, --bearer wins over --username/--password.

Total local unit tests: 289 passing.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@kylebernhardy kylebernhardy force-pushed the feat/mcp-distribution branch from 89d263d to 9ceb0e4 Compare May 31, 2026 20:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants