Skip to content

feat(agents)!: Horton requires an explicit MCP allowlist#4363

Closed
msfstef wants to merge 1 commit into
mainfrom
security/pr5-horton-mcp-allowlist
Closed

feat(agents)!: Horton requires an explicit MCP allowlist#4363
msfstef wants to merge 1 commit into
mainfrom
security/pr5-horton-mcp-allowlist

Conversation

@msfstef
Copy link
Copy Markdown
Contributor

@msfstef msfstef commented May 19, 2026

Summary

Closes the "every MCP server's tools auto-attached to Horton" surface (horton.ts:396 — unconditional ...mcp.tools()). registerHorton, createBuiltinAgentHandler, and BuiltinAgentsServer now require an mcpAllowlist: '*' | ReadonlyArray<string>:

  • `'*'` — every currently-registered MCP server (the previous default, now explicit-unsafe).
  • `[]` — disable MCP entirely.
  • `['gmail', 'github']` — restrict to named servers.

Stacked on #4354 (the characterization tests, base branch). When that lands, this auto-rebases to main.

See plans/sandboxing-investigation.md §1.4, §1.6, §3.5.

Why required (not optional with a default)

A silent default — even a safe one like `[]` — would let customers ship with the wrong toolset without realizing it. Forcing the choice catches the failure mode at compile time. One line per callsite to update.

Cross-package edit

The desktop app's `BuiltinAgentsServer` callsite (packages/agents-desktop/src/main.ts:2105) is updated to pass `'*'`. Rationale: the user already curates the MCP server list in the desktop settings UI, so the choice of "expose them all to Horton" is explicit at the configuration layer. A per-server allowlist surfaced in the UI would be a separate change.

The env-driven CLI entrypoint reads `ELECTRIC_AGENTS_MCP_ALLOWLIST` (`*` or comma-separated names) and defaults to `[]` for secure-by-default deployment — operators opt in explicitly.

Breaking

Compile-time error on every direct caller. Migration is one line per callsite:

```ts
registerHorton(registry, { ..., mcpAllowlist: '*' }) // keep old behavior
registerHorton(registry, { ..., mcpAllowlist: [] }) // disable MCP
registerHorton(registry, { ..., mcpAllowlist: ['gmail'] }) // explicit list
```

`createAgentHandler(agentServerUrl, ...)` gains `mcpAllowlist` as its second positional argument.

Test plan

  • Horton composition characterization from PR 0 updated:
    • `mcpAllowlist: '*'` → unrestricted MCP sentinel present
    • `mcpAllowlist: []` → no MCP sentinel at all
    • `mcpAllowlist: ['gmail']` → sentinel with that allowlist
  • Built-in toolset unaffected regardless of allowlist
  • Existing `horton-model-selection` / `bootstrap-mcp` / `builtin-pull-wake-registration` tests updated for new required field
  • `pnpm test` (67/67), `pnpm typecheck`, `pnpm stylecheck` clean in agents
  • desktop `pnpm typecheck` clean (modulo pre-existing `@electric-sql/client` import errors in `cloud-agent-servers.ts`, unrelated)

🤖 Generated with Claude Code

Base automatically changed from msfstef/desktop-golden-test-agents to main May 19, 2026 13:34
@codecov
Copy link
Copy Markdown

codecov Bot commented May 19, 2026

❌ 1 Tests Failed:

Tests completed Failed Passed Skipped
383 1 382 2
View the full list of 1 ❄️ flaky test(s)
test/horton-pull-wake-e2e.test.ts > pull-wake Horton e2e with mocked LLM > dispatches explicit runner-policy wakes and Horton writes mocked responses

Flake rate in main: 100.00% (Passed 0 times, Failed 8 times)

Stack Traces | 20.1s run time
Error: initial send did not reach Horton within 20000ms

Timed out after 20000ms

entity: 200 
{"url":"/horton/pull-wake-horton-1779199168882","type":"horton","status":"idle","streams":{"main":".../horton/pull-wake-horton-1779199168882/main","error":".../horton/pull-wake-horton-1779199168882/error"},"dispatch_policy":{"targets":[{"type":"runner","runnerId":"horton-pull-wake-e2e-test"}]},"tags":{},"spawn_args":{},"created_by":"/principal/user%3Atest-user","created_at":1779199168939,"updated_at":1779199169049}

runner health: 200 
{"runner":{"id":"horton-pull-wake-e2e-test","admin_status":"enabled","liveness_status":"online","lease_expires_at":"2026-05-19T14:01:00.883Z","lease_remaining_ms":71902,"wake_stream":".../runners/horton-pull-wake-e2e-test/wake","wake_stream_offset":"0000000000000000_0000000000000179","last_seen_at":"2026-05-19T13:59:30.883Z","created_at":"2026-05-19T13:59:28.865Z"},"client":{"status":"streaming","stream_connected":true,"last_heartbeat_ok":true,"last_claim_result":"claimed","started_at":"2026-05-19T13:59:28.879Z","stream_connected_since":"2026-05-19T13:59:28.898Z","last_error":null,"last_error_at":null,"last_heartbeat_at":"2026-05-19T13:59:28.915Z","last_claim_at":"2026-05-19T13:59:28.965Z","last_dispatch_at":"2026-05-19T13:59:28.982Z","reconnect_count":0,"events_received":1,"claims_succeeded":1,"claims_skipped":0,"claims_failed":0},"claims":{"active_count":0,"active":[]},"dispatch":{"entities_with_active_claim":0,"entities_with_outstanding_wake":0,"entities_with_pending_work":0},"health":{"status":"healthy","issues":[]}}

subscription runner:horton-pull-wake-e2e-test:e8801d9ecbbd6a68: 200 OK
{"id":"runner:horton-pull-wake-e2e-test:e8801d9ecbbd6a68","subscription_id":"runner:horton-pull-wake-e2e-test:e8801d9ecbbd6a68","type":"pull-wake","streams":[{"path":"horton/pull-wake-horton-1779199168882/main","link_type":"explicit","acked_offset":"0000000000000000_0000000000000684"}],"wake_stream":"runners/horton-pull-wake-e2e-test/wake","lease_ttl_ms":30000,"created_at":"2026-05-19T13:59:28.950Z","status":"active","description":"Electric Agents runner horton-pull-wake-e2e-test"}

runner wake stream: 200 OK
[{"type":"wake","subscription_id":"runner:horton-pull-wake-e2e-test:e8801d9ecbbd6a68","stream":"horton/pull-wake-horton-1779199168882/main","generation":1,"ts":1779199168962}]

entity main stream: 200 OK
[{"type":"entity_created","key":"entity-created","value":{"entity_type":"horton","timestamp":"2026-05-19T13:59:28.939Z","args":{}},"headers":{"operation":"insert","offset":"0000000000000000_0000000000000169"}},{"type":"inbox","key":"msg-in-1779199168960-fv3nr2","value":{"from":"/principal/user%3Atest-user","payload":"Please answer via pull-wake.","timestamp":"2026-05-19T13:59:28.960Z","mode":"immediate","status":"processed","processed_at":"2026-05-19T13:59:28.960Z"},"headers":{"operation":"insert","offset":"0000000000000000_0000000000000469"}},{"type":"error","key":"error-1-69c23822-a5c4-4e96-97b4-fd1afa9d3833","value":{"error_code":"HANDLER_FAILED","message":"Cannot read properties of undefined (reading 'length')"},"headers":{"operation":"insert","offset":"0000000000000000_0000000000000684"}}]
 ❯ waitForMockCallWithDiagnostics test/horton-pull-wake-e2e.test.ts:208:11
 ❯ test/horton-pull-wake-e2e.test.ts:345:5

To view more test analytics, go to the Test Analytics Dashboard
📋 Got 3 mins? Take this short survey to help us improve Test Analytics.

Replaces the unconditional `...mcp.tools()` spread in
`createAssistantHandler` (previously at `horton.ts:396`) with an
allowlist-aware expansion. `mcpAllowlist` is now a required field on
the registration options at every layer customers touch:
`registerHorton`, `createBuiltinAgentHandler`, `BuiltinAgentsServer`.
Acceptable values:

- `'*'` — every registered MCP server (the previous behavior;
  explicit-unsafe).
- `[]` — disable MCP entirely.
- `['gmail', ...]` — restrict to named servers.

Forcing the choice catches the "I accidentally exposed every MCP server
to LLM-driven tool calls" failure mode at compile time. Existing
callers see a TS error and update one line.

Desktop wiring passes `'*'` because the user already curates the MCP
server list in the desktop settings UI. The env-driven entrypoint reads
`ELECTRIC_AGENTS_MCP_ALLOWLIST` (`*` or comma-separated names) and
defaults to `[]` for a secure-by-default deployment.

BREAKING: any code calling `registerHorton`, `createBuiltinAgentHandler`,
`BuiltinAgentsServer`, or `createAgentHandler` must add `mcpAllowlist`
or supply it as a positional arg. Migration is one line per callsite.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@msfstef msfstef force-pushed the security/pr5-horton-mcp-allowlist branch from 75d9c70 to 104c319 Compare May 19, 2026 13:54
@msfstef msfstef added the claude label May 19, 2026
@msfstef
Copy link
Copy Markdown
Contributor Author

msfstef commented May 19, 2026

Closing as MCP servers are already explicitly registered and added so no reason to have a separate allowlist mechanism IMO.

@msfstef msfstef closed this May 19, 2026
@claude
Copy link
Copy Markdown

claude Bot commented May 19, 2026

Claude Code Review

Summary

This PR replaces Horton's unconditional ...mcp.tools() spread with a required mcpAllowlist: '*' | ReadonlyArray<string> field on registerHorton, createBuiltinAgentHandler, and BuiltinAgentsServer. The design — fail-loud at compile time so callers can't silently ship the wrong toolset — is the right shape for this surface. The agents-package tests are well thought through and the env entrypoint defaults to [] (secure-by-default). However, several real callsites outside the agents package were not updated, and one of them is the production electric-ax CLI; another is the e2e test that codecov is now reporting as 100% failing with the exact runtime error the missing field produces.

What's Working Well

  • The three-state allowlist ('*' | [] | string[]) maps cleanly onto the mcp.tools() sentinel (packages/agents-mcp/src/tools.ts:26): undefined allowlist = all servers, string[] = restricted, omit-sentinel = disabled. Composes naturally with what's already there.
  • Making the field required (no ?) instead of providing a default is the correct call for a security surface — a silent default, even a safe one, lets the wrong choice slip through code review. The PR description justifies this well.
  • Env parsing in parseMcpAllowlistEnv (packages/agents/src/entrypoint-lib.ts:6) is tight: trims whitespace, filters empties, defaults to [] when unset. The CLI deployment story is secure-by-default while the desktop path is explicit-unsafe at a single audited callsite.
  • Test coverage in horton-tool-composition.test.ts walks all three modes plus the invariant that the built-in toolset is unaffected by the allowlist — exactly what a follow-up PR would need to regress-check.
  • The changeset narrative is unusually well-written and points to the right migration for each caller archetype.

Issues Found

Critical (Must Fix)

Four callsites not updated — production CLI broken, e2e test crashing

Files:

  • packages/electric-ax/src/start.ts:408 — production electric-ax CLI entrypoint (new BuiltinAgentsServer({...}), no mcpAllowlist)
  • packages/agents-server/test/horton-pull-wake-e2e.test.ts:278
  • packages/agents-server/test/horton-spawn-worker.test.ts:34
  • packages/agents-server/test/horton-title-generation.test.ts:34

All four import BuiltinAgentsServer either from @electric-ax/agents (path-mapped to source in tsconfig.json) or directly from ../../agents/src/server. With mcpAllowlist declared as a required field on BuiltinAgentsServerOptions (packages/agents/src/server.ts:80), strict TypeScript ("strict": true in both electric-ax/tsconfig.json and agents-server/tsconfig.json) will fail to compile these packages.

Smoking gun: the codecov bot on this PR reports horton-pull-wake-e2e.test.ts failing with:

Error: Cannot read properties of undefined (reading 'length')

That is exactly the crash produced by mcpAllowlist.length > 0 in horton.ts:403 when mcpAllowlist is undefined. Vitest runs in transpile-only mode here (vitest.config.ts has no type-checking step), so the missing required field doesn't block the test — it just crashes at first wake. The "100% flake rate in main" framing in codecov's report is misleading: this is a deterministic crash this PR introduced, not a pre-existing flake.

Impact:

  • pnpm typecheck will fail in packages/electric-ax and packages/agents-server once merged.
  • electric-ax ships the CLI most users start the runtime with — without mcpAllowlist, the runtime crashes at the first Horton wake instead of failing fast at startup, and "no MCP for anyone" becomes the implicit behavior at a callsite where MCP previously worked.
  • The PR description's "pnpm test (67/67), pnpm typecheck, pnpm stylecheck clean in agents" line confirms only the agents package was checked. The desktop check is the only other typecheck run.

Suggested fix:

  • packages/electric-ax/src/start.ts:408 — pick a deliberate default for the CLI. Most defensible is mcpAllowlist: [] (matches the env-entrypoint default; CLI users opt in via env), but if electric-ax is meant to be the "everything-on" dev convenience, '*' would be defensible too — that's the security choice the PR is forcing, and electric-ax should make it explicitly.
  • The three agents-server/test/horton-*.test.ts files — add mcpAllowlist: '*' (or []) to each new BuiltinAgentsServer({...}) call. These are integration tests of Horton behavior, not the allowlist itself, so the value just needs to match what the test expects of MCP availability.

Important (Should Fix)

Changeset is marked patch for a breaking API change

File: .changeset/horton-mcp-allowlist.md:1-4

Both packages (@electric-ax/agents, @electric-ax/agents-desktop) are bumped patch, but this is a TS-loud breaking change to three public APIs and a positional-argument shift in a fourth. The commit message header is feat(agents)!: — the ! flags breaking — and the changeset body itself opens with "Breaking, TS-loud." Pre-1.0 (agents is at 0.4.3) some teams allow breaking changes inside patch, but a downstream consumer auto-updating patch versions will hit compile errors with no signal from the version number. minor matches the intent of the change and the rest of the document.

Suggestions (Nice to Have)

createAgentHandler positional-arg shift is brittle

File: packages/agents/src/bootstrap.ts:163

mcpAllowlist is inserted as the second positional argument, pushing workingDirectory, streamFn, createElectricTools, and serveEndpoint down by one. Any caller that passes more than just the URL has to renumber the rest. Out of scope for this PR, but worth noting: this signature is asking to be refactored to an options object (matching createBuiltinAgentHandler) the next time it's touched. If you want to head off the next breaking change, consider doing it now while this is already a breaking PR.

parseMcpAllowlistEnv accepts * only as the whole string

File: packages/agents/src/entrypoint-lib.ts:6-13

ELECTRIC_AGENTS_MCP_ALLOWLIST="*,gmail" parses to ['*', 'gmail']* becomes a literal server name. Not dangerous (still restrictive, not a wildcard escape), but a user writing *,gmail probably meant "all + gmail" or just "all", not "the server named * plus gmail". Either reject the mixed form or treat any * token as the wildcard.

Per-wake allocation of the allowlist copy

File: packages/agents/src/agents/horton.ts:404

mcp.tools([...mcpAllowlist]) clones the readonly array into a fresh string[] on every assistant wake. The clone is needed because mcp.tools types its parameter as mutable string[], but the allowlist is constant for the handler's lifetime — could be hoisted into the closure or computed once in createAssistantHandler. Microscopic; only worth doing if you're touching this code anyway.

Issue Conformance

No linked issue, which the review template flags as a warning. The PR description references plans/sandboxing-investigation.md §1.4, §1.6, §3.5 instead, and the "why required (not optional with a default)" rationale is clearly stated, so the design intent is well-documented even without a tracking issue. The implementation matches the stated design: required field, three-state allowlist, secure-by-default CLI, explicit-unsafe desktop. Scope is appropriately narrow — no per-tool allowlist, no UI surface in the desktop app, both correctly deferred.

Previous Review Status

First review iteration.


Review iteration: 1 | 2026-05-19

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants