Hybrid architecture: Nerve-as-MCP-server + Codex thread sync + external-agents bootstrap#79

Merged: pufit merged 4 commits into main from pufit/external-agents-mcp on May 20, 2026

@pufit commented May 20, 2026

Summary

Three-phase rollout that turns Nerve into the MCP backbone for external chat agents (Codex, Claude Code) while keeping it the always-on backend for tasks, plans, memory, notifications, and sync sources.

Commits

  1. refactor(tools): runtime-agnostic tool registry — extracts Nerve's tool definitions out of the in-process claude_agent_sdk decorator into a ToolRegistry + ToolSpec + ToolContext model. Tools split by domain (handlers/tasks.py, handlers/memory.py, …) so they can be served via the current Claude SDK adapter and the new external MCP server without logic duplication. Drops module-level session-id globals that were unsafe under concurrent sessions.

  2. feat(mcp): external MCP server endpoint over Streamable HTTP — adds nerve/mcp_server/ mounted at /mcp/v1. Bearer-JWT auth reuses the gateway secret. Each external connection materializes as a source=external "satellite session" so tool calls, plans, and notifications attribute to a real Nerve session and show up in the UI. ask_user from an external client is fire-and-forget (Nerve can't inject answers back into a remote agent's thread). HoA tools gated behind mcp_endpoint.include_hoa.

  3. feat(sources): Codex thread sync source. CodexThreadSource tails ~/.codex/sessions/ and ~/.codex/archived_sessions/, filtering at the session_meta.cwd boundary so only threads inside the configured Nerve workspace are ingested. Translator covers every Codex 0.130 item type (user/agent message, encrypted reasoning, MCP and non-MCP tool calls, command exec, file edit, web search) with deterministic external_ids. A v028 migration adds a partial unique index over (session_id, external_id) so the MCP server path and the rollout-sync path dedupe naturally when both see the same call. AppServerOrigin and CloudCodexOrigin are stubbed for follow-up; LocalRolloutOrigin is the production path.

  4. feat(external-agents): bootstrap + sync for Codex / Claude Code. nerve/external_agents/ registers CodexAgent and ClaudeCodeAgent, each declaring which workspace files render to which external paths (~/.codex/AGENTS.md, ~/.claude/CLAUDE.md). The bootstrap wizard adds a multi-select step that writes the agents' config files with allow-all permissions + an MCP server entry pointing at Nerve. An always-on SyncService regenerates the memory bundles every 15 min (hash-gated). New session_context() MCP tool returns recalled memU priors + active skills + session metadata as a single tool call, the equivalent of the dynamic "Recalled Memories" block Nerve-owned sessions get from build_system_prompt. ConfigWriter enforces a hard allowlist (~/.codex/, ~/.claude/, ~/.cursor/) so Nerve can't write outside agent-config paths.

Tests

  • 727 passing (was 582 on main) — +145 across the four phases.
  • New suites: test_tool_registry, test_mcp_*, test_external_ask_user, test_satellite_sessions, test_codex_* (6 files), test_external_agents_* (6 files), test_session_context_tool.

Verified end-to-end on the Pi

  • Full restart cycle: registry refactor lands without changing web/Telegram/cron behavior; tools still discoverable via in-process MCP.
  • External MCP server: Codex Desktop (remote-machine SSH to Pi) authenticates via inline http_headers = { Authorization = "Bearer …" }, calls tools/list cleanly, invokes Nerve tools, gets satellite-session attribution.
  • Codex thread sync: live ~/.codex/sessions/ tailing turns 4 in-workspace threads into satellite sessions (101 messages total, zero dedup collisions); out-of-workspace threads are skipped at the first rollout line.
  • Bootstrap sync: ~/.codex/AGENTS.md rendered with SOUL + IDENTITY + USER + TOOLS + MEMORY concatenated; Codex picks it up via its AGENTS.md discovery walk; session_context() exposed in tools/list.
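
The "skipped at the first rollout line" behavior can be sketched as below. This is a minimal illustration, not the PR's CodexThreadSource code: it assumes each rollout file is JSONL whose first record has type "session_meta" with the working directory nested at payload.cwd (the nesting is an assumption; the PR only names session_meta.cwd), and the helper names rollout_cwd and in_workspace are hypothetical.

```python
import json
from pathlib import PurePosixPath
from typing import Optional


def rollout_cwd(first_line: str) -> Optional[PurePosixPath]:
    """Return the cwd recorded in a rollout's first line, or None if absent."""
    try:
        record = json.loads(first_line)
    except json.JSONDecodeError:
        return None
    if not isinstance(record, dict) or record.get("type") != "session_meta":
        return None
    payload = record.get("payload")  # assumed nesting, see lead-in
    cwd = payload.get("cwd") if isinstance(payload, dict) else None
    return PurePosixPath(cwd) if isinstance(cwd, str) and cwd else None


def in_workspace(first_line: str, workspace: PurePosixPath) -> bool:
    """True when the thread's cwd is the workspace or a directory below it."""
    cwd = rollout_cwd(first_line)
    if cwd is None:
        return False
    # Compare path components, so "/ws-old" does not match "/ws".
    return cwd == workspace or workspace in cwd.parents
```

Deciding on the first line means an out-of-workspace file is never read past its header, which keeps the tailer cheap when most Codex threads live in unrelated directories.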

Test plan

  • Pull the branch, pytest tests/, cd web && npm run build, nerve restart.
  • Enable sources.codex.enabled: true and the external_agents target block in your config.yaml (see docs/codex-sync.md and docs/external-mcp.md).
  • From an MCP-compatible client of your choice: initialize followed by tools/list against /mcp/v1/ should return the full tool catalog; tools/call for session_context(topic="…") should return your recall priors.
  • If you use Codex: open the ~/.codex/config.toml example in docs/external-mcp.md, paste your JWT into http_headers, start a thread inside your Nerve workspace, and confirm the thread shows up as a codex:<thread_id> satellite session in the Nerve UI.
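
For a manual check without a full MCP client, the request sequence in the test plan can be sketched as plain JSON-RPC payloads. The method names and header rules follow the MCP Streamable HTTP transport; the host, port, client name, and the helper functions themselves are assumptions made for illustration, and nothing here sends network traffic.

```python
import json
from typing import Any, Optional

# Hypothetical host and port: the PR only fixes the path (/mcp/v1/).
MCP_URL = "http://localhost:8000/mcp/v1/"
PROTOCOL_VERSION = "2025-03-26"  # an MCP revision that defines Streamable HTTP


def jsonrpc(method: str, params: Optional[dict[str, Any]] = None,
            request_id: Optional[int] = None) -> dict[str, Any]:
    """Build a JSON-RPC 2.0 message; omitting request_id makes a notification."""
    message: dict[str, Any] = {"jsonrpc": "2.0", "method": method}
    if request_id is not None:
        message["id"] = request_id
    if params is not None:
        message["params"] = params
    return message


def build_headers(token: str, session_id: Optional[str] = None) -> dict[str, str]:
    """Headers for each POST; the session id comes from the initialize response."""
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
        # Streamable HTTP clients must accept both response styles.
        "Accept": "application/json, text/event-stream",
    }
    if session_id:
        headers["Mcp-Session-Id"] = session_id
    return headers


def build_smoke_test(topic: str) -> list[dict[str, Any]]:
    """Messages to POST one at a time, in order."""
    return [
        jsonrpc("initialize", {
            "protocolVersion": PROTOCOL_VERSION,
            "capabilities": {},
            "clientInfo": {"name": "smoke-test", "version": "0.0.0"},
        }, request_id=1),
        jsonrpc("notifications/initialized"),
        jsonrpc("tools/list", request_id=2),
        jsonrpc("tools/call", {
            "name": "session_context",
            "arguments": {"topic": topic},
        }, request_id=3),
    ]
```

Each message would be sent as its own POST body (for example with json.dumps(message)) to MCP_URL, echoing back any Mcp-Session-Id header the server returns on initialize.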

Generated by Nerve

pufit added 4 commits May 20, 2026 16:26

refactor(tools): runtime-agnostic tool registry

Split nerve/agent/tools.py (2,379 LOC) into nerve/agent/tools/ with a
runtime-agnostic ToolRegistry, per-call ToolContext, and a Claude SDK
adapter. Same handlers will be served via a future external MCP server
(plan-95ba92e2) and a Codex thread sync source (plan-8a94a7e6).

Key changes:
- ToolContext carries session_id + collaborators; replaces 7 module
  globals (_workspace/_db/_engine/...) that were race-prone under
  concurrent sessions.
- Handlers in tools/handlers/{tasks,plans,memory,skills,notifications,
  sources,mcp_admin,hoa}.py return ToolResult; cross-domain calls are
  direct function imports.
- HTTP routes (tasks.py, plans.py) and plan_service.py invoke tools
  through registry.invoke() instead of legacy SdkMcpTool.handler().
- AgentEngine builds a fresh ToolContext per session in
  _build_mcp_servers(); set_notification_service() setter installs the
  service that previously had to be written into a module global.
- Back-compat shim in tools/__init__.py keeps init_tools, ALL_TOOLS,
  create_session_mcp_server, _*_impl helpers, and lazy SdkMcpTool
  re-exports working so existing tests pass unchanged.
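
The shape of the registry described above can be sketched as follows. This is a hedged illustration rather than the code in nerve/agent/tools/: the class names come from the commit message, but every field, method signature, and the whoami tool are assumptions. The point it demonstrates is that session state travels in a per-call ToolContext instead of a module global, so concurrent calls cannot observe each other's session_id.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Awaitable, Callable


@dataclass(frozen=True)
class ToolContext:
    """Per-call state; stands in for the former module-level globals."""
    session_id: str
    collaborators: dict[str, Any] = field(default_factory=dict)


@dataclass(frozen=True)
class ToolResult:
    text: str
    is_error: bool = False


Handler = Callable[[ToolContext, dict[str, Any]], Awaitable[ToolResult]]


@dataclass(frozen=True)
class ToolSpec:
    name: str
    description: str
    input_schema: dict[str, Any]
    handler: Handler


class ToolRegistry:
    """Runtime-agnostic catalog; adapters (SDK, MCP server) call invoke()."""

    def __init__(self) -> None:
        self._specs: dict[str, ToolSpec] = {}

    def register(self, spec: ToolSpec) -> None:
        if spec.name in self._specs:
            raise ValueError(f"duplicate tool: {spec.name}")
        self._specs[spec.name] = spec

    def specs(self) -> list[ToolSpec]:
        return list(self._specs.values())

    async def invoke(self, name: str, ctx: ToolContext,
                     args: dict[str, Any]) -> ToolResult:
        spec = self._specs.get(name)
        if spec is None:
            return ToolResult(f"unknown tool: {name}", is_error=True)
        return await spec.handler(ctx, args)


async def whoami(ctx: ToolContext, args: dict[str, Any]) -> ToolResult:
    """Hypothetical tool that reports which session invoked it."""
    await asyncio.sleep(0)  # yield so concurrent calls interleave
    return ToolResult(ctx.session_id)


async def demo() -> list[str]:
    registry = ToolRegistry()
    registry.register(ToolSpec("whoami", "Echo the caller's session id",
                               {"type": "object", "properties": {}}, whoami))
    calls = (registry.invoke("whoami", ToolContext(f"s{i}"), {}) for i in range(3))
    return [result.text for result in await asyncio.gather(*calls)]
```

With a module global holding the session id, the three interleaved calls in demo() could all report the last value written; with a context per call, each reports its own.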

Tests: 584 prior tests still pass; 17 new tests in test_tool_registry.py
cover ToolRegistry CRUD, schema invariants, and concurrent session_id
isolation (the bug this refactor fixes).

feat(mcp): external MCP server endpoint over Streamable HTTP

Adds nerve/mcp_server/ — a runtime adapter exposing the tool registry
to external MCP clients (Codex, Claude Code, Cursor) at /mcp/v1.
Each MCP connection is attributed to a Nerve "satellite session"
(source="external") so external tool calls appear in the UI alongside
native ones, and every call is logged to session_events for audit.

Design decisions baked in:
  * Auth reuses the gateway JWT (Authorization: Bearer or ?token=);
    dev mode (no jwt_secret) bypasses, matching gateway.auth.
  * ask_user is fire-and-forget externally — Nerve can't inject the
    answer back into Codex's thread, so NotificationService.handle_answer
    short-circuits for external sessions, broadcasts to the UI, and
    skips engine.run().
  * HTTP only; stdio deferred.
  * No SSE EventStore — nothing exposed blocks long enough to need it.
  * No DB migration — source="external" + metadata JSON is sufficient.
  * HoA tools off by default, gated on config.mcp_endpoint.include_hoa.
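
The auth rule in the first bullet (Bearer header or ?token=, with a bypass when no jwt_secret is configured) can be sketched with the standard library alone. This is an assumption-laden illustration, not gateway.auth: it hand-rolls HS256 instead of using whatever JWT library Nerve depends on, and the function names and the dev_mode marker are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Any, Mapping, Optional
from urllib.parse import parse_qs


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_hs256(claims: dict[str, Any], secret: str) -> str:
    """Issue a compact HS256 JWT (for the demo and tests only)."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps(claims).encode())
    mac = hmac.new(secret.encode(), f"{header}.{payload}".encode(), hashlib.sha256)
    return f"{header}.{payload}.{_b64url(mac.digest())}"


def verify_hs256(token: str, secret: str,
                 now: Optional[float] = None) -> Optional[dict[str, Any]]:
    """Return the claims if signature and expiry check out, else None."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(sig_b64)
        signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    except ValueError:  # bad shape, bad base64, bad JSON, non-ASCII
        return None
    if not isinstance(header, dict) or header.get("alg") != "HS256":
        return None
    expected = hmac.new(secret.encode(), signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, signature) or not isinstance(claims, dict):
        return None
    exp = claims.get("exp")
    current = time.time() if now is None else now
    if isinstance(exp, (int, float)) and exp < current:
        return None
    return claims


def extract_token(headers: Mapping[str, str], query_string: str) -> Optional[str]:
    """Prefer 'Authorization: Bearer <jwt>', fall back to '?token=<jwt>'."""
    scheme, _, value = headers.get("authorization", "").partition(" ")
    if scheme.lower() == "bearer" and value.strip():
        return value.strip()
    values = parse_qs(query_string).get("token")
    return values[0] if values else None


def authenticate(headers: Mapping[str, str], query_string: str,
                 jwt_secret: Optional[str]) -> Optional[dict[str, Any]]:
    if not jwt_secret:
        return {"dev_mode": True}  # no secret configured: bypass
    token = extract_token(headers, query_string)
    return verify_hs256(token, jwt_secret) if token else None
```

The headers mapping is expected to use lower-case keys, as ASGI frameworks typically normalize them; a caller passing raw headers would need to lower-case them first.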

Routing: the FastAPI SPA catch-all /{path:path} is registered last in
create_app(), so /mcp/v1 must be mounted before it. Done via a
deferred ASGI handler installed in create_app() that looks up the
live StreamableHTTPSessionManager from a module global the lifespan
fills in once the engine is up; until then the mount returns 503.
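
The deferred-mount pattern described here can be sketched as a bare ASGI callable. The names deferred_mcp_app, set_live_app, and probe are hypothetical, and the real handler looks up a StreamableHTTPSessionManager rather than a generic ASGI app; the sketch only shows the control flow of "503 until the lifespan fills in the global, then delegate".

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional

Scope = dict[str, Any]
Receive = Callable[[], Awaitable[dict[str, Any]]]
Send = Callable[[dict[str, Any]], Awaitable[None]]
ASGIApp = Callable[[Scope, Receive, Send], Awaitable[None]]

# Filled in by the lifespan once the engine is up (hypothetical name).
_live_app: Optional[ASGIApp] = None


def set_live_app(app: Optional[ASGIApp]) -> None:
    global _live_app
    _live_app = app


async def deferred_mcp_app(scope: Scope, receive: Receive, send: Send) -> None:
    """Mountable at import time; resolves the real handler on every request."""
    app = _live_app
    if app is not None:
        await app(scope, receive, send)
        return
    if scope["type"] != "http":
        return  # nothing sensible to answer for other scope types
    body = b'{"error": "MCP server not ready"}'
    await send({
        "type": "http.response.start",
        "status": 503,
        "headers": [(b"content-type", b"application/json"),
                    (b"content-length", str(len(body)).encode("ascii"))],
    })
    await send({"type": "http.response.body", "body": body})


async def probe(app: ASGIApp) -> int:
    """Drive one fake HTTP request through an ASGI app and return its status."""
    sent: list[dict[str, Any]] = []

    async def receive() -> dict[str, Any]:
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message: dict[str, Any]) -> None:
        sent.append(message)

    await app({"type": "http", "method": "POST", "path": "/mcp/v1"}, receive, send)
    return sent[0]["status"]
```

In a FastAPI application this callable would presumably be registered with app.mount("/mcp/v1", deferred_mcp_app) ahead of the catch-all route, since routes are matched in registration order.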

Tests: 33 new tests (mcp_server protocol, satellite resolver, JWT auth,
audit writer, external ask_user guard, full HTTP integration via
TestClient) on top of 601 prior tests. Full suite: 634 passed.

Off by default — flip mcp_endpoint.enabled=true in config to advertise.

feat(sources): Codex thread sync source

Synchronise Codex rollout transcripts into Nerve as satellite sessions
so memory_recall, the memory sweep, the UI, and the external MCP server's
tool-call audit all see the same first-class conversation.

* LocalRolloutOrigin tails ~/.codex/sessions and archived_sessions,
  filtering at file-open time on session_meta.cwd. AppServerOrigin
  and CloudCodexOrigin are stubbed for follow-up.
* Translator covers every Codex item type observed on 0.130.0 with
  deterministic external_ids; the v028 unique index dedupes whether
  an item arrives via the rollout or the external MCP server.
* MCP server's SatelliteSessionResolver detects Codex thread UUIDs
  in client_session_id and adopts the convergent codex:<thread_id>
  format so both paths collapse onto one session row.
* Gateway lifespan owns service start/stop. Diagnostics endpoint
  surfaces per-origin stats. Disabled by default.
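
How deterministic external_ids and a partial unique index combine to drop duplicates can be sketched with SQLite. The table name, columns, index name, and the choice to hash thread_id plus a call id are all assumptions for illustration; the PR only states that the v028 index covers (session_id, external_id) and is partial.

```python
import hashlib
import sqlite3
from typing import Optional

# Hypothetical schema; only the indexed column pair comes from the PR.
SCHEMA = """
CREATE TABLE messages (
    id INTEGER PRIMARY KEY,
    session_id TEXT NOT NULL,
    external_id TEXT,
    body TEXT NOT NULL
);
CREATE UNIQUE INDEX idx_messages_session_external
    ON messages (session_id, external_id)
    WHERE external_id IS NOT NULL;
"""


def external_id(thread_id: str, call_id: str) -> str:
    """Same inputs always give the same id, whichever path saw the item."""
    raw = f"{thread_id}\x00{call_id}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()[:32]


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def ingest(conn: sqlite3.Connection, session_id: str,
           ext_id: Optional[str], body: str) -> bool:
    """Insert one item; return False when the unique index rejected a duplicate."""
    cursor = conn.execute(
        "INSERT OR IGNORE INTO messages (session_id, external_id, body) "
        "VALUES (?, ?, ?)",
        (session_id, ext_id, body),
    )
    return cursor.rowcount == 1
```

Because the index is partial, rows without an external_id (for example messages created natively by Nerve) are never compared against each other, while any item that both the rollout tailer and the MCP server report collapses onto the first row written.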

Smoke-tested against live Codex 0.130.0 rollouts on the Pi — 4
threads, 101 messages, 20 duplicates correctly dropped.

feat(external-agents): bootstrap + sync for Codex / Claude Code

Self-configures third-party chat agents to consume Nerve as an MCP
server, and keeps their memory bundles (~/.codex/AGENTS.md,
~/.claude/CLAUDE.md) in sync with workspace identity files (SOUL,
USER, MEMORY, ...) on a timer.

New package `nerve/external_agents/`:
- `registry` — `ExternalAgent` ABC + lazy `AGENT_REGISTRY`
- `agents/{codex,claude_code}` — one-shot config + initial bundle
- `renderers/{base,codex_global,claude_code,passthrough}` — bundle styles
- `writer` — atomic, allowlist-bound, backup/skip/merge policies,
  hash sidecar for idempotency
- `sync_service` — background coroutine on a timer, hash-gated writes
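
The writer's three properties (allowlist-bound, atomic, hash-gated) can be sketched as below. This is a simplified illustration under stated assumptions, not the PR's `writer` module: the ".sha256" sidecar naming and the function names are hypothetical, and the backup/skip/merge conflict policies are left out.

```python
import hashlib
import os
import tempfile
from pathlib import Path
from typing import Iterable


def is_allowed(target: Path, allowed_roots: Iterable[Path]) -> bool:
    """True only if the resolved target sits under one of the allowed roots."""
    resolved = target.resolve()  # collapses ".." and symlinks before the check
    return any(resolved.is_relative_to(root.resolve()) for root in allowed_roots)


def _atomic_write(path: Path, text: str) -> None:
    """Write to a temp file in the same directory, then rename over the target."""
    fd, tmp = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp, path)  # atomic when source and target share a filesystem
    except BaseException:
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise


def write_if_changed(target: Path, content: str,
                     allowed_roots: Iterable[Path]) -> bool:
    """Return True if the file was (re)written, False if the hash matched."""
    roots = list(allowed_roots)
    if not is_allowed(target, roots):
        raise PermissionError(f"{target} is outside the allowlist")
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    sidecar = target.with_name(target.name + ".sha256")  # hypothetical naming
    if target.exists() and sidecar.exists():
        if sidecar.read_text(encoding="utf-8").strip() == digest:
            return False  # hash gate: nothing changed since the last write
    target.parent.mkdir(parents=True, exist_ok=True)
    _atomic_write(target, content)
    _atomic_write(sidecar, digest + "\n")
    return True
```

A timer-driven sync loop could call write_if_changed on each rendered bundle every interval; the hash gate makes those calls cheap no-ops whenever the workspace identity files have not changed.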

New `session_context()` MCP tool that bundles recall priors (biased
by topic), active skills, and session metadata in one call —
external agents call it as their first action to match the dynamic
context Nerve-owned sessions get from system-prompt injection.

Bootstrap wizard step `_step_external_agents` + apply step:
multi-select, conflict detection, issues a 10-year `kind=mcp` JWT
(reusing existing gateway JWT mechanism — no DB table needed),
writes each agent's config file + initial bundle, and records the
targets in config.yaml. Non-interactive via
`NERVE_EXTERNAL_AGENTS=codex,claude-code`.

Gateway:
- `ExternalAgentsConfig` in config.py (targets, interval, conflict policy)
- `/api/external-agents` routes (list, sync, enable/disable, remove)
- `SyncService` started/stopped from the gateway lifespan

Frontend:
- `ExternalAgentsSection` on the Diagnostics page — per-agent card
  with CLI version, last sync, per-file hash, pause/resume/remove,
  "Sync now"

Also bundles in-flight Codex-thread sync source tweaks
(ingester/translator/SessionSidebar) sitting in the working tree.

Tests: 42 new (writer, registry, renderers, both agents, sync
service loop semantics, session_context tool). Full suite 727 pass.

Implements plan-3bc42e5f.

@pufit merged commit f3a02c3 into main on May 20, 2026