Local ACP runtime + per-agent local-skill blocklist #23

Merged
hrhrng merged 13 commits into main from worktree-acp-local-runtime on Apr 29, 2026

@hrhrng hrhrng commented Apr 29, 2026

Summary

  • ACP local runtime: register a user laptop running oma bridge daemon as an OMA "runtime", spawn ACP-compatible agents (Claude Code today; codex/opencode wired but not functional yet) per-session, relay events through the new AcpProxyHarness and RuntimeRoom DO. AgentConfig opts in via harness: "acp-proxy" + runtime_binding. SessionDO / events / SSE / recovery / Console all reuse the existing meta-harness path — no changes to consumers of OMA agents.
  • Per-agent local-skill blocklist (C-series, three commits): daemon detects ~/.claude/skills/<id>/ globals, reports manifest in WS hello frame; per-agent setting hides selected skills from a session by building a filtered <cwd>/.claude-config/ tree of symlinks and spawning the child with CLAUDE_CONFIG_DIR pointing at it.
  • IDOR fix on the daemon-facing bundle endpoint: was only checking that the bearer was a valid runtime token; now also enforces session.tenant_id === auth.tenant_id (returns 404 on mismatch so it isn't an existence oracle).
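
For illustration, the opt-in from the first bullet could look roughly like the config below. The field names (`harness`, `runtime_binding`, `runtime_id`, `acp_agent_id`, `local_skill_blocklist`) appear in this PR; the interface names, the `usesLocalRuntime` helper, and the example ids are assumptions made for this sketch, not the actual AgentConfig definition.

```typescript
// Sketch only: field names are from the PR, the surrounding types are hypothetical.
interface RuntimeBinding {
  runtime_id: string;
  acp_agent_id: string;
  local_skill_blocklist?: string[];
}

interface AgentConfigSketch {
  name: string;
  harness?: string;
  runtime_binding?: RuntimeBinding;
}

// An agent takes the local-runtime path only when both pieces are present.
function usesLocalRuntime(cfg: AgentConfigSketch): boolean {
  return cfg.harness === "acp-proxy" && cfg.runtime_binding !== undefined;
}

const localAgent: AgentConfigSketch = {
  name: "laptop-claude", // hypothetical name
  harness: "acp-proxy",
  runtime_binding: {
    runtime_id: "rt_example", // hypothetical id
    acp_agent_id: "claude-code-acp",
    local_skill_blocklist: ["agent-browser", "audit"],
  },
};
```

An agent without `runtime_binding` stays on the default cloud loop in this sketch, which matches the "no behaviour change" note in the Console commit.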

Verified end-to-end on lane mvp-smoke

  • Daemon attached, hello manifest reported 50 local skills, GET /v1/runtimes returns them
  • PUT /v1/agents/<id> round-trips runtime_binding.local_skill_blocklist
  • Console "+ New agent" form: pick runtime → multi-select panel renders; uncheck agent-browser+audit → submit → GET /v1/agents/<new> returns the blocklist
  • Live ACP child has CLAUDE_CONFIG_DIR=…/.claude-config set; on disk: 48/50 skills symlinked, blocked ids absent, settings.json/plugins/ symlinked atomically; full prompt round-trip works ("OK" came back through claude-code-acp)
  • Bundle endpoint: legit sid → 200, missing sid → 404, no auth → 401

Known gaps before GA

  • CLI not yet npm-published (@openma/cli@0.3.0-beta.0); npx @openma/cli bridge setup currently fetches the older registry version
  • codex / opencode listed in KNOWN_ACP_AGENTS but don't actually work (codex doesn't speak ACP natively as of v0.123) — Console picker shouldn't offer them; remove or disable before publish
  • No edit-blocklist UI on the agent detail page; the blocklist can only be set at creation or via the API
  • Plugin-bundled skills are not filterable in v1 (they pass through via the wholesale plugins/ symlink); deferred until a real user hits it
  • macOS-only daemon install: Linux users can run the daemon, but there is no systemd unit or uninstall support

Production deploy notes

  • apps/main/migrations/0011_runtime_local_skills.sql (additive ADD COLUMN with DEFAULT) must apply before main worker rolls — lane CI does wrangler d1 migrations apply first; verify prod pipeline does the same
  • Agent worker needs services: [{ binding: "MAIN", service: "managed-agents" }] in production wrangler config (lane fix in commit 45fc228)
  • New DO class RuntimeRoom requires migration v5: { new_sqlite_classes: ["RuntimeRoom"] } in main wrangler

Test plan

  • Apply migration 0011 to prod D1
  • Verify prod main wrangler has RUNTIME_ROOM DO + new migration tag
  • Verify prod agent wrangler has MAIN service binding
  • Smoke: connect a personal laptop to prod via oma bridge setup, observe runtime online in Console
  • Smoke: bind an agent to it with empty blocklist, send a turn, observe response
  • Smoke: set blocklist on the same agent, recreate session, confirm blocked skill absent in <cwd>/.claude-config/skills

🤖 Generated with Claude Code

hrhrng and others added 13 commits April 28, 2026 22:19
Introduces a new "local runtime" agent path: users register their laptop
via `oma bridge setup`, run a long-running daemon that holds a reverse-WS
to OMA, and OMA delegates the agent loop for any agent with
`harness: "acp-proxy"` + `runtime_binding` to a Claude Code (or other
ACP-compatible) child process spawned on that machine.

Architecture:
  user laptop                              CF cloud (managed-agents)
  ──────────────                            ─────────────────────────
  oma bridge daemon ═WS══> RuntimeRoom DO (idFromName(runtime_id))
       │                          │
       │ stdin/stdout              │ fan-out by harness:<sid> tag
       ▼                          ▼
  claude-code-acp child     SessionDO.AcpProxyHarness.run()
                              (one turn per call)

OMA intervention surface (no ACP protocol field for system prompt):
  - System prompt + appendable_prompts → AGENTS.md in spawn cwd
  - Skills → .claude/skills/<name>/SKILL.md (claude-code-acp)
            or inlined into AGENTS.md (codex/gemini/opencode)
  - MCP servers → ACP `session/new.mcpServers` rewritten to OMA
                  /v1/mcp-proxy/<sid>/<server>; daemon's `oma_*` PAT as
                  authorization_token; real upstream creds never leave
                  Workers runtime
  - LLM API key → user-managed (claude /login or env)
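
The MCP rewrite above can be sketched as follows. Only the `/v1/mcp-proxy/<sid>/<server>` path and the use of the daemon's `oma_*` PAT as `authorization_token` come from this commit; the `McpServerSketch` shape, the `rewriteMcpServers` name, and the example URLs are assumptions.

```typescript
// Hypothetical entry shape; the real ACP mcpServers entries may carry more fields.
interface McpServerSketch {
  name: string;
  url: string;
  authorization_token?: string;
}

// Point every server at the OMA proxy and swap the credential for the daemon's PAT,
// so upstream credentials are never handed to the local child process.
function rewriteMcpServers(
  servers: McpServerSketch[],
  omaBaseUrl: string,
  sid: string,
  daemonPat: string,
): McpServerSketch[] {
  const base = omaBaseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return servers.map((s) => ({
    name: s.name,
    url: `${base}/v1/mcp-proxy/${encodeURIComponent(sid)}/${encodeURIComponent(s.name)}`,
    authorization_token: daemonPat,
  }));
}
```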

Backend (apps/main):
  - migrations/0010_runtimes.sql: runtimes / runtime_tokens /
    connect_runtime_codes (3 tables; no FKs per project convention)
  - RuntimeRoom DO with hibernation API; daemon-tag + harness:<sid>
    fan-out routing; persists session_state for late-attach replay
  - Routes: /v1/runtimes/* (browser auth), /agents/runtime/* (daemon
    auth via sk_machine_*), /v1/mcp-proxy/:sid/:server (single SQL
    JOIN: api_token + sessions + vault credential lookup), /v1/internal/
    runtime-attach-harness (WS upgrade proxy from harness to RuntimeRoom)
  - run_worker_first updated to include /agents/runtime/*

Agent (apps/agent):
  - AcpProxyHarness implements HarnessInterface; per-turn opens WS to
    RuntimeRoom via MAIN service binding, sends session.start /
    session.prompt, drains session.event via AcpTranslator into
    SessionEvent broadcasts. shouldCompact/compact/deriveModelContext/
    onSessionInit are no-ops (ACP child manages own context)
  - AcpTranslator unwraps `event.update.sessionUpdate` correctly
    (real ACP wire shape, not the surface the SDK types suggest)
  - HarnessContext gains session_id + env.MAIN +
    env.INTEGRATIONS_INTERNAL_SECRET; SessionDO populates them
  - MAIN service binding added to all agent wrangler files
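
A minimal sketch of the unwrapping AcpTranslator is described as doing, assuming the event carries its payload at `event.update` with a `sessionUpdate` discriminator (per the bullet above) and that `agent_message_chunk` updates hold a text content block. The type and function names here are hypothetical, and the `content` shape is an assumption rather than something this PR shows.

```typescript
// Assumed wire shape: { update: { sessionUpdate: "...", content: { type, text } } }
interface AcpEventSketch {
  update?: {
    sessionUpdate?: string;
    content?: { type?: string; text?: string };
  };
}

// Returns the streamed text for agent_message_chunk updates, or null for anything else.
function extractChunkText(event: AcpEventSketch): string | null {
  const update = event.update;
  if (!update || update.sessionUpdate !== "agent_message_chunk") return null;
  const content = update.content;
  if (!content || content.type !== "text" || typeof content.text !== "string") return null;
  return content.text;
}
```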

CLI (packages/cli):
  - `oma bridge {setup,daemon,status,uninstall}` — daemon code lifted
    from clash-space/clash, ported to OMA paths (~/.oma/bridge/*),
    fetches AGENTS.md+skills bundle from main, no LLM env injection
  - `oma runtime {list,rm}` — manage registered machines
  - `oma agents create --runtime <rid> --acp-agent <id>` — opt into
    acp-proxy harness with one flag pair
  - esbuild banner adds createRequire shim so ESM bundle handles ws's
    internal `require("stream")`
  - NodeSpawner: `AgentSpec.env` now `Record<string, string|undefined>`
    so callers can EXPLICITLY UNSET inherited keys (CLAUDECODE etc.)
  - SessionManager scrubs CLAUDECODE / CLAUDE_CODE_ENTRYPOINT /
    CLAUDE_CODE_SSE_PORT before spawn — claude-code-acp refuses
    nested-session start otherwise
  - SessionManager.prompt now propagates ACP `promptError` as
    session.error instead of silent session.complete
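
The explicit-unset semantics of `Record<string, string|undefined>` can be sketched like this. The three scrubbed key names are from this commit; `buildChildEnv`, `scrubOverrides`, and `EnvOverrides` are hypothetical names, and the merge logic is an assumption about how such a map could be applied, not the NodeSpawner code itself.

```typescript
type EnvOverrides = Record<string, string | undefined>;

// Keys the commit says must not leak into the child.
const NESTED_SESSION_KEYS = ["CLAUDECODE", "CLAUDE_CODE_ENTRYPOINT", "CLAUDE_CODE_SSE_PORT"];

// Start from the inherited env, then apply overrides: a string sets the key,
// an explicit undefined removes it.
function buildChildEnv(
  inherited: Record<string, string | undefined>,
  overrides: EnvOverrides,
): Record<string, string> {
  const out: Record<string, string> = {};
  Object.entries(inherited).forEach(([key, value]) => {
    if (value !== undefined) out[key] = value;
  });
  Object.entries(overrides).forEach(([key, value]) => {
    if (value === undefined) delete out[key];
    else out[key] = value;
  });
  return out;
}

function scrubOverrides(): EnvOverrides {
  const out: EnvOverrides = {};
  NESTED_SESSION_KEYS.forEach((key) => {
    out[key] = undefined;
  });
  return out;
}
```

A plain `Record<string, string>` could only add or replace keys; allowing `undefined` is what lets a caller say "remove this inherited key".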

Console (apps/console):
  - /runtimes page — list registered machines, status / heartbeat,
    "+ Connect machine" with install instructions
  - /connect-runtime page — OAuth callback for `oma bridge setup`,
    mints exchange code, redirects to localhost CLI listener
  - Sidebar nav + RuntimesIcon

End-to-end verified:
  - Daemon mints token, attaches WS, hello manifest detected
    (claude-code-acp / codex / opencode), runtime online → offline
  - Bundle endpoint returns correct files for each acp_agent_id
  - Real Claude Code prompt round-trip: "What is 2+2?" → "Four."
    streamed through 3 agent_message_chunk events end-to-end
  - mcp-proxy 401/403 paths verified (single SQL JOIN auth)
  - SQL migration applies cleanly on local D1
  - typecheck passes for root + cli + acp-runtime
  - vite console build, esbuild cli bundle both clean

Not yet exercised:
  - Real SessionDO → MAIN service binding → RuntimeRoom path (mocked
    via /v1/internal/runtime-attach-harness; needs lane to verify)
  - Real upstream MCP proxy with vault creds
  - Browser-end OAuth in `oma bridge setup`

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Long header description was pushing the button into a 2-line wrap on
narrow viewports. Two changes:
  - shrink-0 + whitespace-nowrap on the button so it doesn't squeeze
  - shorten the description paragraph (closer to ApiKeysList style)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds two new fields to the agent create modal's basic tab:
  - Local Runtime dropdown (reuses /v1/runtimes; disables offline ones)
  - ACP agent dropdown (filled from the chosen runtime's daemon-detected
    agents — claude-code-acp / codex-cli / opencode / hermes / etc.)

When both are set, the agent is created with harness:'acp-proxy' and
runtime_binding:{runtime_id, acp_agent_id}. Empty runtime = default
cloud loop (no behaviour change).

Auto-picks the first detected ACP agent when a runtime is selected so
the user doesn't need to know the daemon's manifest strings.

Empty-state shows a link to /runtimes when no machines registered yet.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds 'Local Runtime' row when agent has runtime_binding set so users can
verify their pick stuck after creating an acp-proxy agent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three bugs hit during e2e on the lane:
  1. cli flag(): didn't accept --name=value form (only --name value)
  2. setup.ts: opened /connect-daemon (legacy clash path); should be
     /connect-runtime (matches Console route + my recent rename)
  3. agents POST/PUT route + AgentService NewAgentInput/UpdateAgentInput
     dropped runtime_binding silently — Console form sent it but the API
     stripped it, so harness=acp-proxy agents were created without their
     runtime binding and the AgentDetail row I added never rendered
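
For bug 1, a flag helper that accepts both forms might look like the sketch below. The signature and the lookup rules (for example, the `=` form winning when both are present) are assumptions; the actual `flag()` in packages/cli is not shown in this PR.

```typescript
// Returns the value for --name, accepting both "--name value" and "--name=value".
function flag(argv: string[], name: string): string | undefined {
  const long = `--${name}`;

  // "--name=value" form
  const withEquals = argv.find((arg) => arg.startsWith(`${long}=`));
  if (withEquals !== undefined) return withEquals.slice(long.length + 1);

  // "--name value" form; a following "--other" is not treated as a value
  const idx = argv.indexOf(long);
  if (idx === -1) return undefined;
  const next = argv[idx + 1];
  return next !== undefined && !next.startsWith("--") ? next : undefined;
}
```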

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The registerHarness('acp-proxy', ...) line vanished during a stash/pop
through the linear/github merges. Without it resolveHarness('acp-proxy')
falls back to default and SessionDO calls Anthropic directly with the
wrong model — exactly what the lane SSE showed:
  span.model_request_end finish_reason=error
  error_message=login fail: Please carry the API secret key...
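
The failure mode is easy to reproduce with a toy registry. The names `registerHarness` and `resolveHarness` and the fall-back-to-default behaviour are from this commit; the signatures, the `HarnessSketch` type, and the registry implementation below are assumptions.

```typescript
interface HarnessSketch {
  kind: string;
}
type HarnessFactory = () => HarnessSketch;

const harnessRegistry = new Map<string, HarnessFactory>();

function registerHarness(name: string, factory: HarnessFactory): void {
  harnessRegistry.set(name, factory);
}

// Unknown names silently fall back to "default", which is why a missing
// registration shows up as the wrong harness running rather than as an error.
function resolveHarness(name?: string): HarnessSketch {
  const requested = name !== undefined ? harnessRegistry.get(name) : undefined;
  const factory = requested ?? harnessRegistry.get("default");
  if (!factory) throw new Error("no default harness registered");
  return factory();
}

registerHarness("default", () => ({ kind: "default" }));
```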

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… fast partial redeploy

Two changes triggered by ACP e2e on lane:

1. lane-generate.mjs: agent.services rewrite was dropping MAIN. Lane
   AcpProxyHarness sessions errored 'MAIN service binding missing on
   agent worker' — adding the lane-suffixed MAIN binding fixes it.

2. deploy-lane.yml: new `workers` input (default 'all', accepts comma
   list of main/agent/integrations). Each step now gated by contains().
   Agent-only retry drops from ~7min full lane deploy to ~30s when the
   other workers are already live and only one app's code changed.
   Secret-set step also gated to 'all' since secrets only need to be
   pushed at full bring-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
AcpProxyHarness calls main's /v1/internal/runtime-attach-harness which
requires the header secret. Lane was only pushing it to main + integrations
workers; agent didn't have it, so local-runtime sessions errored
'INTEGRATIONS_INTERNAL_SECRET unset — cannot call main internal endpoints'.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Daemon's per-session spawn cwd (~/.oma/bridge/sessions/<short>/) was
auto-GC'd after 7 days of inactivity — surprising for users whose
sessions are platform-owned. Switch to platform-driven cleanup:

  - Drop gcOldSessions from session-cwd.ts + daemon startup
  - Add removeSessionCwd() called from SessionManager.dispose
  - Split SessionManager dispose vs disposeAll: dispose (platform 'session.
    dispose') kills child + rm cwd; disposeAll (daemon shutdown) kills
    children only — sessions are still alive at the platform, daemon
    coming back tomorrow needs the same dirs

Wire the platform → daemon signal: apps/main DELETE /v1/sessions/:id
now forwards session.dispose to the RuntimeRoom DO when the agent has
runtime_binding set. Best-effort — daemon offline doesn't block the
delete (the daemon could also reconcile via runtime_session lookup
returning 404 next attempt).
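
The dispose / disposeAll split can be sketched as below. The method names and `removeSessionCwd` are from this commit; the class shape, the `ChildHandle` interface, and the injected callback are assumptions made so the sketch stays self-contained.

```typescript
interface ChildHandle {
  cwd: string;
  kill(): void;
}

class SessionManagerSketch {
  private readonly sessions = new Map<string, ChildHandle>();
  private readonly removeSessionCwd: (cwd: string) => void;

  constructor(removeSessionCwd: (cwd: string) => void) {
    this.removeSessionCwd = removeSessionCwd;
  }

  add(sid: string, child: ChildHandle): void {
    this.sessions.set(sid, child);
  }

  // Platform sent session.dispose: the session is gone for good, so kill the
  // child and remove its spawn cwd.
  dispose(sid: string): void {
    const child = this.sessions.get(sid);
    if (!child) return;
    child.kill();
    this.removeSessionCwd(child.cwd);
    this.sessions.delete(sid);
  }

  // Daemon shutdown: sessions still exist on the platform, so only the
  // children are stopped and the directories stay on disk.
  disposeAll(): void {
    this.sessions.forEach((child) => child.kill());
    this.sessions.clear();
  }
}
```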

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
C1 of the local-skill blocklist series. Daemon scans
~/.claude/skills/<id>/SKILL.md (claude-code-acp globals) and reports the
manifest in the WS hello frame; main persists it as runtime.local_skills_json
and surfaces it on GET /v1/runtimes so Console can show users what each
attached machine has available.

No filtering yet — that's C2 (per-agent blocklist field) and C3 (daemon
applies the filter via CLAUDE_CONFIG_DIR symlinks).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
C2 of the local-skill blocklist series. Adds optional
runtime_binding.local_skill_blocklist (string[]) to AgentConfig and a
Console multi-select panel under the Local Runtime picker on the agent
form. Options come from runtimes[].local_skills[acpAgentId] (populated
by C1) so the user only ever sees skills that actually exist on the
selected runtime.

Daemon enforcement lands in C3 — until then the field is informational.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…mlinks

C3 of the local-skill blocklist series — closes the loop. Bundle
endpoint now returns the agent's local_skill_blocklist alongside files.
On session.start for claude-code-acp the daemon builds
<cwd>/.claude-config/ by symlinking ~/.claude/* (atomically — settings,
credentials, plugins, agents, etc. preserved) and rebuilding skills/ to
include only non-blocklisted ids; spawns the child with
CLAUDE_CONFIG_DIR pointing at it. The user's real ~/.claude/ is
untouched.
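
As a rough model of the filtering step, the function below computes which symlinks a filtered config tree would contain, without touching the filesystem. The `.claude-config` layout and the "everything except blocked skill ids" rule are from this commit; `planFilteredConfig`, `LinkPlan`, and the pure-function framing are assumptions, and the real daemon code presumably performs the directory reads and symlink calls itself.

```typescript
interface LinkPlan {
  link: string;   // path inside <cwd>/.claude-config
  target: string; // path inside the user's real ~/.claude
}

function planFilteredConfig(
  topLevelEntries: string[], // names found directly under ~/.claude
  skillIds: string[],        // names found under ~/.claude/skills
  blocklist: string[],
  realDir: string,
  configDir: string,
): LinkPlan[] {
  const blocked = new Set(blocklist);
  const plan: LinkPlan[] = [];

  // Everything except skills/ is linked wholesale (settings, plugins, ...).
  topLevelEntries
    .filter((entry) => entry !== "skills")
    .forEach((entry) => {
      plan.push({ link: `${configDir}/${entry}`, target: `${realDir}/${entry}` });
    });

  // skills/ is rebuilt entry by entry so blocked ids are simply absent.
  skillIds
    .filter((id) => !blocked.has(id))
    .forEach((id) => {
      plan.push({ link: `${configDir}/skills/${id}`, target: `${realDir}/skills/${id}` });
    });

  return plan;
}
```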

v1 filters globals (~/.claude/skills/<id>/) only — plugin-bundled skills
come along with the wholesale plugins/ symlink. Their nested layout
(plugins/cache/<marketplace>/<plugin>/<ver>/skills/<id>/) needs a
recursive walk to filter individually; defer until a real user hits it.

Other ACP agents (codex, opencode, gemini) skip this path — their skill
ecosystems don't share Claude Code's filesystem layout.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The daemon-facing GET /agents/runtime/sessions/:sid/bundle previously
only checked that the bearer was a valid runtime token, not that the
sid belonged to the same tenant. A leaked sk_machine_* could be used
to enumerate other tenants' session ids and read their agent system
prompts + appendable_prompts via the bundle.

Extend authenticateRuntimeToken to also return owner_tenant_id, and
gate bundle access on session.tenant_id === auth.tenant_id. Returns
404 (not 403) on mismatch so the endpoint isn't an existence oracle.
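
The resulting decision table is small enough to sketch directly. The 401 / 404 / 200 outcomes and the `session.tenant_id === auth.tenant_id` comparison are from this PR; the `bundleStatus` function and the two row types are hypothetical stand-ins for the real route handler.

```typescript
interface RuntimeAuthSketch {
  tenant_id: string;
}
interface SessionRowSketch {
  id: string;
  tenant_id: string;
}

// A missing session and a cross-tenant session are deliberately indistinguishable
// to the caller: both return 404.
function bundleStatus(
  auth: RuntimeAuthSketch | null,
  session: SessionRowSketch | null,
): 200 | 401 | 404 {
  if (!auth) return 401;
  if (!session || session.tenant_id !== auth.tenant_id) return 404;
  return 200;
}
```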

Practical exploitability was very low (sids are unguessable UUIDs and
no credentials leaked), but this is the IDOR the prior comment in this
file flagged as 'tighten when sensitive prompts ship'. Closing it now
before the local-runtime feature lands in a release.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@hrhrng hrhrng merged commit 631d7b7 into main Apr 29, 2026
1 check passed
hrhrng added a commit that referenced this pull request May 12, 2026
* feat(acp-runtime): local-runtime daemon + AcpProxyHarness end-to-end

Introduces a new "local runtime" agent path: users register their laptop
via `oma bridge setup`, run a long-running daemon that holds a reverse-WS
to OMA, and OMA delegates the agent loop for any agent with
`harness: "acp-proxy"` + `runtime_binding` to a Claude Code (or other
ACP-compatible) child process spawned on that machine.

Architecture:
  user laptop                              CF cloud (managed-agents)
  ──────────────                            ─────────────────────────
  oma bridge daemon ═WS══> RuntimeRoom DO (idFromName(runtime_id))
       │                          │
       │ stdin/stdout              │ fan-out by harness:<sid> tag
       ▼                          ▼
  claude-code-acp child     SessionDO.AcpProxyHarness.run()
                              (one turn per call)

OMA intervention surface (no ACP protocol field for system prompt):
  - System prompt + appendable_prompts → AGENTS.md in spawn cwd
  - Skills → .claude/skills/<name>/SKILL.md (claude-code-acp)
            or inlined into AGENTS.md (codex/gemini/opencode)
  - MCP servers → ACP `session/new.mcpServers` rewritten to OMA
                  /v1/mcp-proxy/<sid>/<server>; daemon's `oma_*` PAT as
                  authorization_token; real upstream creds never leave
                  Workers runtime
  - LLM API key → user-managed (claude /login or env)

Backend (apps/main):
  - migrations/0010_runtimes.sql: runtimes / runtime_tokens /
    connect_runtime_codes (3 tables; no FKs per project convention)
  - RuntimeRoom DO with hibernation API; daemon-tag + harness:<sid>
    fan-out routing; persists session_state for late-attach replay
  - Routes: /v1/runtimes/* (browser auth), /agents/runtime/* (daemon
    auth via sk_machine_*), /v1/mcp-proxy/:sid/:server (single SQL
    JOIN: api_token + sessions + vault credential lookup), /v1/internal/
    runtime-attach-harness (WS upgrade proxy from harness to RuntimeRoom)
  - run_worker_first updated to include /agents/runtime/*

Agent (apps/agent):
  - AcpProxyHarness implements HarnessInterface; per-turn opens WS to
    RuntimeRoom via MAIN service binding, sends session.start /
    session.prompt, drains session.event via AcpTranslator into
    SessionEvent broadcasts. shouldCompact/compact/deriveModelContext/
    onSessionInit are no-ops (ACP child manages own context)
  - AcpTranslator unwraps `event.update.sessionUpdate` correctly
    (real ACP wire shape, not the surface the SDK types suggest)
  - HarnessContext gains session_id + env.MAIN +
    env.INTEGRATIONS_INTERNAL_SECRET; SessionDO populates them
  - MAIN service binding added to all agent wrangler files

CLI (packages/cli):
  - `oma bridge {setup,daemon,status,uninstall}` — daemon code lifted
    from clash-space/clash, ported to OMA paths (~/.oma/bridge/*),
    fetches AGENTS.md+skills bundle from main, no LLM env injection
  - `oma runtime {list,rm}` — manage registered machines
  - `oma agents create --runtime <rid> --acp-agent <id>` — opt into
    acp-proxy harness with one flag pair
  - esbuild banner adds createRequire shim so ESM bundle handles ws's
    internal `require("stream")`
  - NodeSpawner: `AgentSpec.env` now `Record<string, string|undefined>`
    so callers can EXPLICITLY UNSET inherited keys (CLAUDECODE etc.)
  - SessionManager scrubs CLAUDECODE / CLAUDE_CODE_ENTRYPOINT /
    CLAUDE_CODE_SSE_PORT before spawn — claude-code-acp refuses
    nested-session start otherwise
  - SessionManager.prompt now propagates ACP `promptError` as
    session.error instead of silent session.complete

Console (apps/console):
  - /runtimes page — list registered machines, status / heartbeat,
    "+ Connect machine" with install instructions
  - /connect-runtime page — OAuth callback for `oma bridge setup`,
    mints exchange code, redirects to localhost CLI listener
  - Sidebar nav + RuntimesIcon

End-to-end verified:
  - Daemon mints token, attaches WS, hello manifest detected
    (claude-code-acp / codex / opencode), runtime online → offline
  - Bundle endpoint returns correct files for each acp_agent_id
  - Real Claude Code prompt round-trip: "What is 2+2?" → "Four."
    streamed through 3 agent_message_chunk events end-to-end
  - mcp-proxy 401/403 paths verified (single SQL JOIN auth)
  - SQL migration applies cleanly on local D1
  - typecheck passes for root + cli + acp-runtime
  - vite console build, esbuild cli bundle both clean

Not yet exercised:
  - Real SessionDO → MAIN service binding → RuntimeRoom path (mocked
    via /v1/internal/runtime-attach-harness; needs lane to verify)
  - Real upstream MCP proxy with vault creds
  - Browser-end OAuth in `oma bridge setup`

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(console): keep '+ Connect machine' button on one line

Long header description was pushing the button into a 2-line wrap on
narrow viewports. Two changes:
  - shrink-0 + whitespace-nowrap on the button so it doesn't squeeze
  - shorten the description paragraph (closer to ApiKeysList style)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(console): agent create form picks Local Runtime + ACP agent

Adds two new fields to the agent create modal's basic tab:
  - Local Runtime dropdown (reuses /v1/runtimes; disables offline ones)
  - ACP agent dropdown (filled from the chosen runtime's daemon-detected
    agents — claude-code-acp / codex-cli / opencode / hermes / etc.)

When both are set, the agent is created with harness:'acp-proxy' and
runtime_binding:{runtime_id, acp_agent_id}. Empty runtime = default
cloud loop (no behaviour change).

Auto-picks the first detected ACP agent when a runtime is selected so
the user doesn't need to know the daemon's manifest strings.

Empty-state shows a link to /runtimes when no machines registered yet.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(console): show runtime_binding on AgentDetail

Adds 'Local Runtime' row when agent has runtime_binding set so users can
verify their pick stuck after creating an acp-proxy agent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: actually persist runtime_binding through agent CRUD path

Three bugs hit during e2e on the lane:
  1. cli flag(): didn't accept --name=value form (only --name value)
  2. setup.ts: opened /connect-daemon (legacy clash path); should be
     /connect-runtime (matches Console route + my recent rename)
  3. agents POST/PUT route + AgentService NewAgentInput/UpdateAgentInput
     dropped runtime_binding silently — Console form sent it but the API
     stripped it, so harness=acp-proxy agents were created without their
     runtime binding and the AgentDetail row I added never rendered

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(agent): re-register acp-proxy harness lost during rebase

The registerHarness('acp-proxy', ...) line vanished during a stash/pop
through the linear/github merges. Without it resolveHarness('acp-proxy')
falls back to default and SessionDO calls Anthropic directly with the
wrong model — exactly what the lane SSE showed:
  span.model_request_end finish_reason=error
  error_message=login fail: Please carry the API secret key...

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(lane): add MAIN binding to lane agent worker + workers= input for fast partial redeploy

Two changes triggered by ACP e2e on lane:

1. lane-generate.mjs: agent.services rewrite was dropping MAIN. Lane
   AcpProxyHarness sessions errored 'MAIN service binding missing on
   agent worker' — adding the lane-suffixed MAIN binding fixes it.

2. deploy-lane.yml: new `workers` input (default 'all', accepts comma
   list of main/agent/integrations). Each step now gated by contains().
   Agent-only retry drops from ~7min full lane deploy to ~30s when the
   other workers are already live and only one app's code changed.
   Secret-set step also gated to 'all' since secrets only need to be
   pushed at full bring-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(lane): set INTEGRATIONS_INTERNAL_SECRET on lane agent worker

AcpProxyHarness calls main's /v1/internal/runtime-attach-harness which
requires the header secret. Lane was only pushing it to main + integrations
workers; agent didn't have it, so local-runtime sessions errored
'INTEGRATIONS_INTERNAL_SECRET unset — cannot call main internal endpoints'.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: bind spawn-cwd lifetime to OMA session, not 7d age GC

Daemon's per-session spawn cwd (~/.oma/bridge/sessions/<short>/) was
auto-GC'd after 7 days of inactivity — surprising for users whose
sessions are platform-owned. Switch to platform-driven cleanup:

  - Drop gcOldSessions from session-cwd.ts + daemon startup
  - Add removeSessionCwd() called from SessionManager.dispose
  - Split SessionManager dispose vs disposeAll: dispose (platform 'session.
    dispose') kills child + rm cwd; disposeAll (daemon shutdown) kills
    children only — sessions are still alive at the platform, daemon
    coming back tomorrow needs the same dirs

Wire the platform → daemon signal: apps/main DELETE /v1/sessions/:id
now forwards session.dispose to the RuntimeRoom DO when the agent has
runtime_binding set. Best-effort — daemon offline doesn't block the
delete (the daemon could also reconcile via runtime_session lookup
returning 404 next attempt).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(runtime): detect + report local skills installed on daemon machine

C1 of the local-skill blocklist series. Daemon scans
~/.claude/skills/<id>/SKILL.md (claude-code-acp globals) and reports the
manifest in the WS hello frame; main persists it as runtime.local_skills_json
and surfaces it on GET /v1/runtimes so Console can show users what each
attached machine has available.

No filtering yet — that's C2 (per-agent blocklist field) and C3 (daemon
applies the filter via CLAUDE_CONFIG_DIR symlinks).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(console): per-agent local-skill blocklist on agent settings

C2 of the local-skill blocklist series. Adds optional
runtime_binding.local_skill_blocklist (string[]) to AgentConfig and a
Console multi-select panel under the Local Runtime picker on the agent
form. Options come from runtimes[].local_skills[acpAgentId] (populated
by C1) so the user only ever sees skills that actually exist on the
selected runtime.

Daemon enforcement lands in C3 — until then the field is informational.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(runtime): enforce local-skill blocklist via CLAUDE_CONFIG_DIR symlinks

C3 of the local-skill blocklist series — closes the loop. Bundle
endpoint now returns the agent's local_skill_blocklist alongside files.
On session.start for claude-code-acp the daemon builds
<cwd>/.claude-config/ by symlinking ~/.claude/* (atomically — settings,
credentials, plugins, agents, etc. preserved) and rebuilding skills/ to
include only non-blocklisted ids; spawns the child with
CLAUDE_CONFIG_DIR pointing at it. The user's real ~/.claude/ is
untouched.

v1 filters globals (~/.claude/skills/<id>/) only — plugin-bundled skills
come along with the wholesale plugins/ symlink. Their nested layout
(plugins/cache/<marketplace>/<plugin>/<ver>/skills/<id>/) needs a
recursive walk to filter individually; defer until a real user hits it.

Other ACP agents (codex, opencode, gemini) skip this path — their skill
ecosystems don't share Claude Code's filesystem layout.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(runtime): bundle endpoint enforces tenant ownership of requested sid

The daemon-facing GET /agents/runtime/sessions/:sid/bundle previously
only checked that the bearer was a valid runtime token, not that the
sid belonged to the same tenant. A leaked sk_machine_* could be used
to enumerate other tenants' session ids and read their agent system
prompts + appendable_prompts via the bundle.

Extend authenticateRuntimeToken to also return owner_tenant_id, and
gate bundle access on session.tenant_id === auth.tenant_id. Returns
404 (not 403) on mismatch so the endpoint isn't an existence oracle.

Practical exploitability was very low (sids are unguessable UUIDs and
no credentials leaked), but this is the IDOR the prior comment in this
file flagged as 'tighten when sensitive prompts ship'. Closing it now
before the local-runtime feature lands in a release.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
hrhrng added a commit that referenced this pull request May 12, 2026
* feat(acp-runtime): local-runtime daemon + AcpProxyHarness end-to-end

Introduces a new "local runtime" agent path: users register their laptop
via `oma bridge setup`, run a long-running daemon that holds a reverse-WS
to OMA, and OMA delegates the agent loop for any agent with
`harness: "acp-proxy"` + `runtime_binding` to a Claude Code (or other
ACP-compatible) child process spawned on that machine.

Architecture:
  user laptop                              CF cloud (managed-agents)
  ──────────────                            ─────────────────────────
  oma bridge daemon ═WS══> RuntimeRoom DO (idFromName(runtime_id))
       │                          │
       │ stdin/stdout              │ fan-out by harness:<sid> tag
       ▼                          ▼
  claude-code-acp child     SessionDO.AcpProxyHarness.run()
                              (one turn per call)

OMA intervention surface (no ACP protocol field for system prompt):
  - System prompt + appendable_prompts → AGENTS.md in spawn cwd
  - Skills → .claude/skills/<name>/SKILL.md (claude-code-acp)
            or inlined into AGENTS.md (codex/gemini/opencode)
  - MCP servers → ACP `session/new.mcpServers` rewritten to OMA
                  /v1/mcp-proxy/<sid>/<server>; daemon's `oma_*` PAT as
                  authorization_token; real upstream creds never leave
                  Workers runtime
  - LLM API key → user-managed (claude /login or env)

Backend (apps/main):
  - migrations/0010_runtimes.sql: runtimes / runtime_tokens /
    connect_runtime_codes (3 tables; no FKs per project convention)
  - RuntimeRoom DO with hibernation API; daemon-tag + harness:<sid>
    fan-out routing; persists session_state for late-attach replay
  - Routes: /v1/runtimes/* (browser auth), /agents/runtime/* (daemon
    auth via sk_machine_*), /v1/mcp-proxy/:sid/:server (single SQL
    JOIN: api_token + sessions + vault credential lookup), /v1/internal/
    runtime-attach-harness (WS upgrade proxy from harness to RuntimeRoom)
  - run_worker_first updated to include /agents/runtime/*

Agent (apps/agent):
  - AcpProxyHarness implements HarnessInterface; per-turn opens WS to
    RuntimeRoom via MAIN service binding, sends session.start /
    session.prompt, drains session.event via AcpTranslator into
    SessionEvent broadcasts. shouldCompact/compact/deriveModelContext/
    onSessionInit are no-ops (ACP child manages own context)
  - AcpTranslator unwraps `event.update.sessionUpdate` correctly
    (real ACP wire shape, not the surface the SDK types suggest)
  - HarnessContext gains session_id + env.MAIN +
    env.INTEGRATIONS_INTERNAL_SECRET; SessionDO populates them
  - MAIN service binding added to all agent wrangler files

CLI (packages/cli):
  - `oma bridge {setup,daemon,status,uninstall}` — daemon code lifted
    from clash-space/clash, ported to OMA paths (~/.oma/bridge/*),
    fetches AGENTS.md+skills bundle from main, no LLM env injection
  - `oma runtime {list,rm}` — manage registered machines
  - `oma agents create --runtime <rid> --acp-agent <id>` — opt into
    acp-proxy harness with one flag pair
  - esbuild banner adds createRequire shim so ESM bundle handles ws's
    internal `require("stream")`
  - NodeSpawner: `AgentSpec.env` now `Record<string, string|undefined>`
    so callers can EXPLICITLY UNSET inherited keys (CLAUDECODE etc.)
  - SessionManager scrubs CLAUDECODE / CLAUDE_CODE_ENTRYPOINT /
    CLAUDE_CODE_SSE_PORT before spawn — claude-code-acp refuses
    nested-session start otherwise
  - SessionManager.prompt now propagates ACP `promptError` as
    session.error instead of silent session.complete
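
A minimal sketch of the "explicitly unset" idea from the NodeSpawner and SessionManager bullets, assuming a hypothetical `buildChildEnv` helper (the real spawner code may be structured differently). A key mapped to `undefined` is removed from the inherited environment instead of being passed through.

```typescript
type EnvOverrides = Record<string, string | undefined>;

// Hypothetical helper: apply overrides on top of an inherited environment.
// A value of `undefined` means "remove this key from the child's env".
function buildChildEnv(inherited: EnvOverrides, overrides: EnvOverrides): Record<string, string> {
  const env: Record<string, string> = {};
  for (const source of [inherited, overrides]) {
    for (const key of Object.keys(source)) {
      const value = source[key];
      if (value === undefined) delete env[key];
      else env[key] = value;
    }
  }
  return env;
}

// The three keys the commit message says are scrubbed before spawn.
const NESTED_SESSION_SCRUB: EnvOverrides = {
  CLAUDECODE: undefined,
  CLAUDE_CODE_ENTRYPOINT: undefined,
  CLAUDE_CODE_SSE_PORT: undefined,
};
```

The result could then be passed as the `env` option of Node's `child_process.spawn`, for example `buildChildEnv(process.env, NESTED_SESSION_SCRUB)`.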

Console (apps/console):
  - /runtimes page — list registered machines, status / heartbeat,
    "+ Connect machine" with install instructions
  - /connect-runtime page — OAuth callback for `oma bridge setup`,
    mints exchange code, redirects to localhost CLI listener
  - Sidebar nav + RuntimesIcon

End-to-end verified:
  - Daemon mints token, attaches WS, hello manifest detected
    (claude-code-acp / codex / opencode), runtime online → offline
  - Bundle endpoint returns correct files for each acp_agent_id
  - Real Claude Code prompt round-trip: "What is 2+2?" → "Four."
    streamed through 3 agent_message_chunk events end-to-end
  - mcp-proxy 401/403 paths verified (single SQL JOIN auth)
  - SQL migration applies cleanly on local D1
  - typecheck passes for root + cli + acp-runtime
  - vite console build, esbuild cli bundle both clean

Not yet exercised:
  - Real SessionDO → MAIN service binding → RuntimeRoom path (mocked
    via /v1/internal/runtime-attach-harness; needs lane to verify)
  - Real upstream MCP proxy with vault creds
  - Browser-end OAuth in `oma bridge setup`

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(console): keep '+ Connect machine' button on one line

Long header description was pushing the button into a 2-line wrap on
narrow viewports. Two changes:
  - shrink-0 + whitespace-nowrap on the button so it doesn't squeeze
  - shorten the description paragraph (closer to ApiKeysList style)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(console): agent create form picks Local Runtime + ACP agent

Adds two new fields to the agent create modal's basic tab:
  - Local Runtime dropdown (reuses /v1/runtimes; disables offline ones)
  - ACP agent dropdown (filled from the chosen runtime's daemon-detected
    agents — claude-code-acp / codex-cli / opencode / hermes / etc.)

When both are set, the agent is created with harness:'acp-proxy' and
runtime_binding:{runtime_id, acp_agent_id}. Empty runtime = default
cloud loop (no behaviour change).
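
As a rough illustration of the rule above, the form's submit payload could be built like this. `AgentCreateBody` and `buildAgentCreateBody` are hypothetical names for this sketch; only the `harness` and `runtime_binding` field names come from the commit message, and treating a missing ACP agent like an empty runtime is an assumption.

```typescript
interface RuntimeBinding {
  runtime_id: string;
  acp_agent_id: string;
}

// Hypothetical request body; the real API accepts more fields.
interface AgentCreateBody {
  name: string;
  harness?: "acp-proxy";
  runtime_binding?: RuntimeBinding;
}

function buildAgentCreateBody(name: string, runtimeId: string, acpAgentId: string): AgentCreateBody {
  // Empty runtime (or no ACP agent): default cloud loop, nothing extra is sent.
  if (runtimeId === "" || acpAgentId === "") return { name };
  return {
    name,
    harness: "acp-proxy",
    runtime_binding: { runtime_id: runtimeId, acp_agent_id: acpAgentId },
  };
}
```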

Auto-picks the first detected ACP agent when a runtime is selected so
the user doesn't need to know the daemon's manifest strings.

Empty-state shows a link to /runtimes when no machines registered yet.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(console): show runtime_binding on AgentDetail

Adds 'Local Runtime' row when agent has runtime_binding set so users can
verify their pick stuck after creating an acp-proxy agent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: actually persist runtime_binding through agent CRUD path

Three bugs hit during e2e on the lane:
  1. cli flag(): didn't accept --name=value form (only --name value)
  2. setup.ts: opened /connect-daemon (legacy clash path); should be
     /connect-runtime (matches Console route + my recent rename)
  3. agents POST/PUT route + AgentService NewAgentInput/UpdateAgentInput
     dropped runtime_binding silently — Console form sent it but the API
     stripped it, so harness=acp-proxy agents were created without their
     runtime binding and the AgentDetail row I added never rendered
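
For bug 1, a flag lookup that accepts both spellings might look like the sketch below. It is not the actual `flag()` from packages/cli; the signature and the choice to ignore a following token that starts with `--` are assumptions.

```typescript
// Hypothetical flag lookup supporting `--name value` and `--name=value`.
function flag(argv: readonly string[], name: string): string | undefined {
  const long = `--${name}`;
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg === undefined) continue;
    // `--name=value` form: everything after the first "=" is the value.
    if (arg.startsWith(`${long}=`)) return arg.slice(long.length + 1);
    // `--name value` form: the next token is the value, unless it is a flag.
    if (arg === long) {
      const next = argv[i + 1];
      return next !== undefined && !next.startsWith("--") ? next : undefined;
    }
  }
  return undefined;
}
```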

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(agent): re-register acp-proxy harness lost during rebase

The registerHarness('acp-proxy', ...) line vanished during a stash/pop
through the linear/github merges. Without it resolveHarness('acp-proxy')
falls back to default and SessionDO calls Anthropic directly with the
wrong model — exactly what the lane SSE showed:
  span.model_request_end finish_reason=error
  error_message=login fail: Please carry the API secret key...

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(lane): add MAIN binding to lane agent worker + workers= input for fast partial redeploy

Two changes triggered by ACP e2e on lane:

1. lane-generate.mjs: agent.services rewrite was dropping MAIN. Lane
   AcpProxyHarness sessions errored 'MAIN service binding missing on
   agent worker' — adding the lane-suffixed MAIN binding fixes it.

2. deploy-lane.yml: new `workers` input (default 'all', accepts comma
   list of main/agent/integrations). Each step now gated by contains().
   Agent-only retry drops from ~7min full lane deploy to ~30s when the
   other workers are already live and only one app's code changed.
   Secret-set step also gated to 'all' since secrets only need to be
   pushed at full bring-up.
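
The gating in item 2 amounts to the logic sketched here in TypeScript. The workflow itself uses the `contains()` expression in deploy-lane.yml; this sketch splits the comma list instead, and the function names are hypothetical.

```typescript
type LaneWorker = "main" | "agent" | "integrations";

// Deploy a worker when the input is 'all' or names that worker.
function shouldDeploy(workersInput: string, worker: LaneWorker): boolean {
  const requested = workersInput.split(",").map((w) => w.trim());
  return requested.indexOf("all") !== -1 || requested.indexOf(worker) !== -1;
}

// Secrets are only pushed on a full bring-up.
function shouldPushSecrets(workersInput: string): boolean {
  return workersInput.trim() === "all";
}
```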

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(lane): set INTEGRATIONS_INTERNAL_SECRET on lane agent worker

AcpProxyHarness calls main's /v1/internal/runtime-attach-harness which
requires the header secret. Lane was only pushing it to main + integrations
workers; agent didn't have it, so local-runtime sessions errored
'INTEGRATIONS_INTERNAL_SECRET unset — cannot call main internal endpoints'.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: bind spawn-cwd lifetime to OMA session, not 7d age GC

Daemon's per-session spawn cwd (~/.oma/bridge/sessions/<short>/) was
auto-GC'd after 7 days of inactivity — surprising for users whose
sessions are platform-owned. Switch to platform-driven cleanup:

  - Drop gcOldSessions from session-cwd.ts + daemon startup
  - Add removeSessionCwd() called from SessionManager.dispose
  - Split SessionManager dispose vs disposeAll: dispose (platform
    'session.dispose') kills the child + rm cwd; disposeAll (daemon
    shutdown) kills children only, because the sessions are still alive
    at the platform and a daemon coming back tomorrow needs the same dirs

Wire the platform → daemon signal: apps/main DELETE /v1/sessions/:id
now forwards session.dispose to the RuntimeRoom DO when the agent has
runtime_binding set. Best-effort: an offline daemon doesn't block the
delete (the daemon could also reconcile when its next runtime_session
lookup returns 404).
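
The dispose / disposeAll split can be sketched as follows. `SessionManagerSketch` and `LiveSession` are hypothetical stand-ins, and the cwd removal is injected as a callback so the example stays free of real processes and files.

```typescript
interface LiveSession {
  cwd: string;
  kill(): void;
}

class SessionManagerSketch {
  private sessions = new Map<string, LiveSession>();
  private removeSessionCwd: (cwd: string) => void;

  constructor(removeSessionCwd: (cwd: string) => void) {
    this.removeSessionCwd = removeSessionCwd;
  }

  add(sid: string, session: LiveSession): void {
    this.sessions.set(sid, session);
  }

  // Platform sent 'session.dispose': the session is gone, so drop its cwd too.
  dispose(sid: string): void {
    const session = this.sessions.get(sid);
    if (session === undefined) return;
    session.kill();
    this.removeSessionCwd(session.cwd);
    this.sessions.delete(sid);
  }

  // Daemon shutdown: stop children but leave every cwd on disk.
  disposeAll(): void {
    this.sessions.forEach((session) => session.kill());
    this.sessions.clear();
  }
}
```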

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(runtime): detect + report local skills installed on daemon machine

C1 of the local-skill blocklist series. Daemon scans
~/.claude/skills/<id>/SKILL.md (claude-code-acp globals) and reports the
manifest in the WS hello frame; main persists it as runtime.local_skills_json
and surfaces it on GET /v1/runtimes so Console can show users what each
attached machine has available.

No filtering yet — that's C2 (per-agent blocklist field) and C3 (daemon
applies the filter via CLAUDE_CONFIG_DIR symlinks).
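
A minimal sketch of the scan described above, assuming a skill counts as installed when `<root>/<id>/SKILL.md` exists. `listLocalSkills` and `defaultSkillsRoot` are hypothetical names, and the manifest the daemon actually reports may carry more than ids.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Location named in the commit message for claude-code-acp globals.
const defaultSkillsRoot = path.join(os.homedir(), ".claude", "skills");

// Return the ids of directories under `skillsRoot` that contain a SKILL.md.
function listLocalSkills(skillsRoot: string = defaultSkillsRoot): string[] {
  if (!fs.existsSync(skillsRoot)) return [];
  return fs
    .readdirSync(skillsRoot)
    // existsSync follows symlinks, so symlinked skill directories count too.
    .filter((id) => fs.existsSync(path.join(skillsRoot, id, "SKILL.md")))
    .sort();
}
```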

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(console): per-agent local-skill blocklist on agent settings

C2 of the local-skill blocklist series. Adds optional
runtime_binding.local_skill_blocklist (string[]) to AgentConfig and a
Console multi-select panel under the Local Runtime picker on the agent
form. Options come from runtimes[].local_skills[acpAgentId] (populated
by C1) so the user only ever sees skills that actually exist on the
selected runtime.

Daemon enforcement lands in C3 — until then the field is informational.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(runtime): enforce local-skill blocklist via CLAUDE_CONFIG_DIR symlinks

C3 of the local-skill blocklist series — closes the loop. Bundle
endpoint now returns the agent's local_skill_blocklist alongside files.
On session.start for claude-code-acp the daemon builds
<cwd>/.claude-config/ by symlinking ~/.claude/* (atomically — settings,
credentials, plugins, agents, etc. preserved) and rebuilding skills/ to
include only non-blocklisted ids; spawns the child with
CLAUDE_CONFIG_DIR pointing at it. The user's real ~/.claude/ is
untouched.
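
The filtered tree could be built along these lines. This is a sketch under assumptions, not the daemon's code: `buildFilteredConfigDir` is a hypothetical name, paths are expected to be absolute, it targets POSIX symlink semantics, and it makes no attempt at the atomicity the commit message mentions.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Mirror the real config dir into <sessionCwd>/.claude-config via symlinks,
// leaving out blocklisted skill ids. Returns the path of the filtered dir.
function buildFilteredConfigDir(
  sessionCwd: string,
  blocklist: readonly string[],
  realConfigDir: string = path.join(os.homedir(), ".claude"),
): string {
  const blocked = new Set(blocklist);
  const filteredDir = path.join(sessionCwd, ".claude-config");
  fs.rmSync(filteredDir, { recursive: true, force: true });
  fs.mkdirSync(path.join(filteredDir, "skills"), { recursive: true });

  // Everything except skills/ is symlinked wholesale (settings, plugins, ...).
  for (const entry of fs.readdirSync(realConfigDir)) {
    if (entry === "skills") continue;
    fs.symlinkSync(path.join(realConfigDir, entry), path.join(filteredDir, entry));
  }

  // skills/ is rebuilt one id at a time so blocked ids never appear.
  const realSkills = path.join(realConfigDir, "skills");
  if (fs.existsSync(realSkills)) {
    for (const id of fs.readdirSync(realSkills)) {
      if (blocked.has(id)) continue;
      fs.symlinkSync(path.join(realSkills, id), path.join(filteredDir, "skills", id));
    }
  }
  return filteredDir;
}
```

The child would then be spawned with `CLAUDE_CONFIG_DIR` set to the returned path; the real config directory is only read, never written.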

v1 filters globals (~/.claude/skills/<id>/) only — plugin-bundled skills
come along with the wholesale plugins/ symlink. Their nested layout
(plugins/cache/<marketplace>/<plugin>/<ver>/skills/<id>/) needs a
recursive walk to filter individually; defer until a real user hits it.

Other ACP agents (codex, opencode, gemini) skip this path — their skill
ecosystems don't share Claude Code's filesystem layout.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(runtime): bundle endpoint enforces tenant ownership of requested sid

The daemon-facing GET /agents/runtime/sessions/:sid/bundle previously
only checked that the bearer was a valid runtime token, not that the
sid belonged to the same tenant. A leaked sk_machine_* could be used
to enumerate other tenants' session ids and read their agent system
prompts + appendable_prompts via the bundle.

Extend authenticateRuntimeToken to also return owner_tenant_id, and
gate bundle access on session.tenant_id === auth.tenant_id. Returns
404 (not 403) on mismatch so the endpoint isn't an existence oracle.
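
The resulting decision table, sketched as a pure function. `RuntimeAuth`, `SessionRow` and `bundleStatus` are hypothetical names; only the `tenant_id` comparison and the 404-on-mismatch choice come from the description above.

```typescript
interface RuntimeAuth {
  tenant_id: string;
}

interface SessionRow {
  id: string;
  tenant_id: string;
}

// Status the bundle endpoint would answer with for a given lookup result.
function bundleStatus(auth: RuntimeAuth | null, session: SessionRow | null): 200 | 401 | 404 {
  if (auth === null) return 401; // no or invalid runtime token
  if (session === null) return 404; // unknown sid
  // Cross-tenant sid: answer exactly like "unknown sid" so existence is not revealed.
  if (session.tenant_id !== auth.tenant_id) return 404;
  return 200;
}
```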

Practical exploitability was very low (sids are unguessable UUIDs and
no credentials leaked), but this is the IDOR the prior comment in this
file flagged as 'tighten when sensitive prompts ship'. Closing it now
before the local-runtime feature lands in a release.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
hrhrng added a commit that referenced this pull request May 12, 2026
* feat(acp-runtime): local-runtime daemon + AcpProxyHarness end-to-end

Introduces a new "local runtime" agent path: users register their laptop
via `oma bridge setup`, run a long-running daemon that holds a reverse-WS
to OMA, and OMA delegates the agent loop for any agent with
`harness: "acp-proxy"` + `runtime_binding` to a Claude Code (or other
ACP-compatible) child process spawned on that machine.

Architecture:
  user laptop                              CF cloud (managed-agents)
  ──────────────                            ─────────────────────────
  oma bridge daemon ═WS══> RuntimeRoom DO (idFromName(runtime_id))
       │                          │
       │ stdin/stdout              │ fan-out by harness:<sid> tag
       ▼                          ▼
  claude-code-acp child     SessionDO.AcpProxyHarness.run()
                              (one turn per call)

OMA intervention surface (no ACP protocol field for system prompt):
  - System prompt + appendable_prompts → AGENTS.md in spawn cwd
  - Skills → .claude/skills/<name>/SKILL.md (claude-code-acp)
            or inlined into AGENTS.md (codex/gemini/opencode)
  - MCP servers → ACP `session/new.mcpServers` rewritten to OMA
                  /v1/mcp-proxy/<sid>/<server>; daemon's `oma_*` PAT as
                  authorization_token; real upstream creds never leave
                  Workers runtime
  - LLM API key → user-managed (claude /login or env)

Backend (apps/main):
  - migrations/0010_runtimes.sql: runtimes / runtime_tokens /
    connect_runtime_codes (3 tables; no FKs per project convention)
  - RuntimeRoom DO with hibernation API; daemon-tag + harness:<sid>
    fan-out routing; persists session_state for late-attach replay
  - Routes: /v1/runtimes/* (browser auth), /agents/runtime/* (daemon
    auth via sk_machine_*), /v1/mcp-proxy/:sid/:server (single SQL
    JOIN: api_token + sessions + vault credential lookup),
    /v1/internal/runtime-attach-harness (WS upgrade proxy from harness
    to RuntimeRoom)
  - run_worker_first updated to include /agents/runtime/*

Agent (apps/agent):
  - AcpProxyHarness implements HarnessInterface; per-turn opens WS to
    RuntimeRoom via MAIN service binding, sends session.start /
    session.prompt, drains session.event via AcpTranslator into
    SessionEvent broadcasts. shouldCompact/compact/deriveModelContext/
    onSessionInit are no-ops (ACP child manages own context)
  - AcpTranslator unwraps `event.update.sessionUpdate` correctly
    (real ACP wire shape, not the surface the SDK types suggest)
  - HarnessContext gains session_id + env.MAIN +
    env.INTEGRATIONS_INTERNAL_SECRET; SessionDO populates them
  - MAIN service binding added to all agent wrangler files

CLI (packages/cli):
  - `oma bridge {setup,daemon,status,uninstall}` — daemon code lifted
    from clash-space/clash, ported to OMA paths (~/.oma/bridge/*),
    fetches AGENTS.md+skills bundle from main, no LLM env injection
  - `oma runtime {list,rm}` — manage registered machines
  - `oma agents create --runtime <rid> --acp-agent <id>` — opt into
    acp-proxy harness with one flag pair
  - esbuild banner adds createRequire shim so ESM bundle handles ws's
    internal `require("stream")`
  - NodeSpawner: `AgentSpec.env` now `Record<string, string|undefined>`
    so callers can EXPLICITLY UNSET inherited keys (CLAUDECODE etc.)
  - SessionManager scrubs CLAUDECODE / CLAUDE_CODE_ENTRYPOINT /
    CLAUDE_CODE_SSE_PORT before spawn — claude-code-acp refuses
    nested-session start otherwise
  - SessionManager.prompt now propagates ACP `promptError` as
    session.error instead of silent session.complete

Console (apps/console):
  - /runtimes page — list registered machines, status / heartbeat,
    "+ Connect machine" with install instructions
  - /connect-runtime page — OAuth callback for `oma bridge setup`,
    mints exchange code, redirects to localhost CLI listener
  - Sidebar nav + RuntimesIcon

End-to-end verified:
  - Daemon mints token, attaches WS, hello manifest detected
    (claude-code-acp / codex / opencode), runtime online → offline
  - Bundle endpoint returns correct files for each acp_agent_id
  - Real Claude Code prompt round-trip: "What is 2+2?" → "Four."
    streamed through 3 agent_message_chunk events end-to-end
  - mcp-proxy 401/403 paths verified (single SQL JOIN auth)
  - SQL migration applies cleanly on local D1
  - typecheck passes for root + cli + acp-runtime
  - vite console build, esbuild cli bundle both clean

Not yet exercised:
  - Real SessionDO → MAIN service binding → RuntimeRoom path (mocked
    via /v1/internal/runtime-attach-harness; needs lane to verify)
  - Real upstream MCP proxy with vault creds
  - Browser-end OAuth in `oma bridge setup`

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(console): keep '+ Connect machine' button on one line

Long header description was pushing the button into a 2-line wrap on
narrow viewports. Two changes:
  - shrink-0 + whitespace-nowrap on the button so it doesn't squeeze
  - shorten the description paragraph (closer to ApiKeysList style)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(console): agent create form picks Local Runtime + ACP agent

Adds two new fields to the agent create modal's basic tab:
  - Local Runtime dropdown (reuses /v1/runtimes; disables offline ones)
  - ACP agent dropdown (filled from the chosen runtime's daemon-detected
    agents — claude-code-acp / codex-cli / opencode / hermes / etc.)

When both are set, the agent is created with harness:'acp-proxy' and
runtime_binding:{runtime_id, acp_agent_id}. Empty runtime = default
cloud loop (no behaviour change).

Auto-picks the first detected ACP agent when a runtime is selected so
the user doesn't need to know the daemon's manifest strings.

Empty-state shows a link to /runtimes when no machines registered yet.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(console): show runtime_binding on AgentDetail

Adds 'Local Runtime' row when agent has runtime_binding set so users can
verify their pick stuck after creating an acp-proxy agent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: actually persist runtime_binding through agent CRUD path

Three bugs hit during e2e on the lane:
  1. cli flag(): didn't accept --name=value form (only --name value)
  2. setup.ts: opened /connect-daemon (legacy clash path); should be
     /connect-runtime (matches Console route + my recent rename)
  3. agents POST/PUT route + AgentService NewAgentInput/UpdateAgentInput
     dropped runtime_binding silently — Console form sent it but the API
     stripped it, so harness=acp-proxy agents were created without their
     runtime binding and the AgentDetail row I added never rendered

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(agent): re-register acp-proxy harness lost during rebase

The registerHarness('acp-proxy', ...) line vanished during a stash/pop
through the linear/github merges. Without it resolveHarness('acp-proxy')
falls back to default and SessionDO calls Anthropic directly with the
wrong model — exactly what the lane SSE showed:
  span.model_request_end finish_reason=error
  error_message=login fail: Please carry the API secret key...

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(lane): add MAIN binding to lane agent worker + workers= input for fast partial redeploy

Two changes triggered by ACP e2e on lane:

1. lane-generate.mjs: agent.services rewrite was dropping MAIN. Lane
   AcpProxyHarness sessions errored 'MAIN service binding missing on
   agent worker' — adding the lane-suffixed MAIN binding fixes it.

2. deploy-lane.yml: new `workers` input (default 'all', accepts comma
   list of main/agent/integrations). Each step now gated by contains().
   Agent-only retry drops from ~7min full lane deploy to ~30s when the
   other workers are already live and only one app's code changed.
   Secret-set step also gated to 'all' since secrets only need to be
   pushed at full bring-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(lane): set INTEGRATIONS_INTERNAL_SECRET on lane agent worker

AcpProxyHarness calls main's /v1/internal/runtime-attach-harness which
requires the header secret. Lane was only pushing it to main + integrations
workers; agent didn't have it, so local-runtime sessions errored
'INTEGRATIONS_INTERNAL_SECRET unset — cannot call main internal endpoints'.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: bind spawn-cwd lifetime to OMA session, not 7d age GC

Daemon's per-session spawn cwd (~/.oma/bridge/sessions/<short>/) was
auto-GC'd after 7 days of inactivity — surprising for users whose
sessions are platform-owned. Switch to platform-driven cleanup:

  - Drop gcOldSessions from session-cwd.ts + daemon startup
  - Add removeSessionCwd() called from SessionManager.dispose
  - Split SessionManager dispose vs disposeAll: dispose (platform
    'session.dispose') kills the child + rm cwd; disposeAll (daemon
    shutdown) kills children only, because the sessions are still alive
    at the platform and a daemon coming back tomorrow needs the same dirs

Wire the platform → daemon signal: apps/main DELETE /v1/sessions/:id
now forwards session.dispose to the RuntimeRoom DO when the agent has
runtime_binding set. Best-effort: an offline daemon doesn't block the
delete (the daemon could also reconcile when its next runtime_session
lookup returns 404).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(runtime): detect + report local skills installed on daemon machine

C1 of the local-skill blocklist series. Daemon scans
~/.claude/skills/<id>/SKILL.md (claude-code-acp globals) and reports the
manifest in the WS hello frame; main persists it as runtime.local_skills_json
and surfaces it on GET /v1/runtimes so Console can show users what each
attached machine has available.

No filtering yet — that's C2 (per-agent blocklist field) and C3 (daemon
applies the filter via CLAUDE_CONFIG_DIR symlinks).
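
A rough sketch of the detection step, assuming a skill counts only when `<id>/SKILL.md` exists under the skills root. The `LocalSkill` shape and the `detectLocalSkills` / `demoDetect` names are hypothetical; the actual manifest format sent in the hello frame is not shown in this commit.

```typescript
import { existsSync, mkdirSync, mkdtempSync, readdirSync, rmSync, writeFileSync } from "node:fs";
import { homedir, tmpdir } from "node:os";
import { join } from "node:path";

// Assumed manifest entry shape; the PR does not spell it out.
interface LocalSkill {
  id: string;
}

// Scan <skillsRoot>/<id>/SKILL.md and return one entry per skill found.
function detectLocalSkills(
  skillsRoot: string = join(homedir(), ".claude", "skills"),
): LocalSkill[] {
  if (!existsSync(skillsRoot)) return [];
  return readdirSync(skillsRoot)
    // Entries without a SKILL.md (stray files, empty dirs) are skipped.
    .filter((id) => existsSync(join(skillsRoot, id, "SKILL.md")))
    .sort()
    .map((id) => ({ id }));
}

// Usage sketch against a throwaway fixture directory.
function demoDetect(): string[] {
  const root = mkdtempSync(join(tmpdir(), "oma-skills-"));
  mkdirSync(join(root, "audit"));
  writeFileSync(join(root, "audit", "SKILL.md"), "# audit\n");
  mkdirSync(join(root, "not-a-skill")); // no SKILL.md, so it is skipped
  writeFileSync(join(root, "README.md"), "stray file\n");
  const ids = detectLocalSkills(root).map((s) => s.id);
  rmSync(root, { recursive: true, force: true });
  return ids;
}
```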

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(console): per-agent local-skill blocklist on agent settings

C2 of the local-skill blocklist series. Adds optional
runtime_binding.local_skill_blocklist (string[]) to AgentConfig and a
Console multi-select panel under the Local Runtime picker on the agent
form. Options come from runtimes[].local_skills[acpAgentId] (populated
by C1) so the user only ever sees skills that actually exist on the
selected runtime.

Daemon enforcement lands in C3 — until then the field is informational.
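
One way to picture the data flow from the panel to the stored field, as a sketch under assumptions: the `Runtime` and `RuntimeBinding` shapes below are guesses beyond the two names given here (`local_skills[acpAgentId]` and `local_skill_blocklist`), and `skillOptions` / `blocklistFromSelection` are hypothetical helpers. It assumes the panel starts with everything checked and that unchecked ids become the blocklist.

```typescript
// Assumed shapes; only local_skills and local_skill_blocklist are named in the PR.
interface LocalSkill {
  id: string;
}

interface Runtime {
  id: string;
  local_skills: Record<string, LocalSkill[]>; // keyed by ACP agent id
}

interface RuntimeBinding {
  runtime_id: string; // hypothetical field name
  acp_agent_id: string; // hypothetical field name
  local_skill_blocklist?: string[];
}

// Options shown in the multi-select: only skills present on the chosen runtime.
function skillOptions(runtime: Runtime, acpAgentId: string): string[] {
  return (runtime.local_skills[acpAgentId] ?? []).map((s) => s.id);
}

// Unchecked options become the blocklist that is saved on the agent.
function blocklistFromSelection(options: string[], checked: string[]): string[] {
  const keep = new Set(checked);
  return options.filter((id) => !keep.has(id));
}

const runtime: Runtime = {
  id: "rt_example",
  local_skills: {
    "claude-code-acp": [{ id: "agent-browser" }, { id: "audit" }, { id: "pdf" }],
  },
};

const binding: RuntimeBinding = {
  runtime_id: runtime.id,
  acp_agent_id: "claude-code-acp",
  local_skill_blocklist: blocklistFromSelection(
    skillOptions(runtime, "claude-code-acp"),
    ["pdf"], // the user left only "pdf" checked
  ),
};
```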

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(runtime): enforce local-skill blocklist via CLAUDE_CONFIG_DIR symlinks

C3 of the local-skill blocklist series — closes the loop. Bundle
endpoint now returns the agent's local_skill_blocklist alongside files.
On session.start for claude-code-acp the daemon builds
<cwd>/.claude-config/ by symlinking ~/.claude/* (atomically — settings,
credentials, plugins, agents, etc. preserved) and rebuilding skills/ to
include only non-blocklisted ids; spawns the child with
CLAUDE_CONFIG_DIR pointing at it. The user's real ~/.claude/ is
untouched.

v1 filters globals (~/.claude/skills/<id>/) only — plugin-bundled skills
come along with the wholesale plugins/ symlink. Their nested layout
(plugins/cache/<marketplace>/<plugin>/<ver>/skills/<id>/) needs a
recursive walk to filter individually; defer until a real user hits it.

Other ACP agents (codex, opencode, gemini) skip this path — their skill
ecosystems don't share Claude Code's filesystem layout.
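
The filtered tree can be sketched like this. It is an approximation under assumptions, not the daemon's implementation: `buildFilteredConfigDir`, `childEnv`, and `demoFilter` are hypothetical names, paths are assumed absolute so the symlinks resolve, and the sketch targets POSIX symlinks. Only top-level skills are filtered, matching the v1 scope described above.

```typescript
import {
  existsSync, lstatSync, mkdirSync, mkdtempSync,
  readdirSync, rmSync, symlinkSync, writeFileSync,
} from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Build <sessionCwd>/.claude-config/ as symlinks into claudeHome,
// rebuilding skills/ so that blocklisted ids are absent.
function buildFilteredConfigDir(
  claudeHome: string,
  sessionCwd: string,
  blocklist: string[],
): string {
  const configDir = join(sessionCwd, ".claude-config");
  rmSync(configDir, { recursive: true, force: true }); // unlinks old symlinks only
  mkdirSync(join(configDir, "skills"), { recursive: true });
  const blocked = new Set(blocklist);

  // Everything except skills/ is linked wholesale (settings, plugins, ...).
  for (const entry of readdirSync(claudeHome)) {
    if (entry === "skills") continue;
    symlinkSync(join(claudeHome, entry), join(configDir, entry));
  }

  // skills/ is a real directory holding one symlink per allowed skill.
  const skillsSrc = join(claudeHome, "skills");
  if (existsSync(skillsSrc)) {
    for (const id of readdirSync(skillsSrc)) {
      if (blocked.has(id)) continue;
      symlinkSync(join(skillsSrc, id), join(configDir, "skills", id));
    }
  }
  return configDir;
}

// Environment for the spawned child; only CLAUDE_CONFIG_DIR is overridden.
function childEnv(configDir: string): Record<string, string | undefined> {
  return { ...process.env, CLAUDE_CONFIG_DIR: configDir };
}

// Usage sketch against a throwaway fixture.
function demoFilter(): { visible: string[]; settingsLinked: boolean; realSkills: string[] } {
  const root = mkdtempSync(join(tmpdir(), "oma-config-"));
  const home = join(root, "claude-home");
  const cwd = join(root, "session");
  mkdirSync(join(home, "skills", "audit"), { recursive: true });
  mkdirSync(join(home, "skills", "pdf"), { recursive: true });
  writeFileSync(join(home, "settings.json"), "{}\n");
  mkdirSync(cwd);
  const configDir = buildFilteredConfigDir(home, cwd, ["audit"]);
  const result = {
    visible: readdirSync(join(configDir, "skills")),
    settingsLinked: lstatSync(join(configDir, "settings.json")).isSymbolicLink(),
    realSkills: readdirSync(join(home, "skills")).sort(),
  };
  rmSync(root, { recursive: true, force: true });
  return result;
}
```

Because the filtered directory holds only symlinks, removing or rebuilding it never touches the source tree.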

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(runtime): bundle endpoint enforces tenant ownership of requested sid

The daemon-facing GET /agents/runtime/sessions/:sid/bundle previously
only checked that the bearer was a valid runtime token, not that the
sid belonged to the same tenant. A leaked sk_machine_* could be used
to enumerate other tenants' session ids and read their agent system
prompts + appendable_prompts via the bundle.

Extend authenticateRuntimeToken to also return owner_tenant_id, and
gate bundle access on session.tenant_id === auth.tenant_id. Returns
404 (not 403) on mismatch so the endpoint isn't an existence oracle.

Practical exploitability was very low (sids are unguessable UUIDs and
no credentials leaked), but this is the IDOR the prior comment in this
file flagged as 'tighten when sensitive prompts ship'. Closing it now
before the local-runtime feature lands in a release.
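
The ownership gate reduces to a small decision function. This is a sketch with assumed types: `RuntimeAuth`, `SessionRow`, and `authorizeBundle` are hypothetical, and the auth object is assumed to expose the tenant as `tenant_id`, as in the comparison quoted above.

```typescript
// Assumed result of authenticating the runtime bearer token.
interface RuntimeAuth {
  tenant_id: string;
}

// Assumed minimal session record.
interface SessionRow {
  id: string;
  tenant_id: string;
}

// Decide the HTTP status for a bundle request.
// A foreign tenant's session is reported exactly like a missing one (404),
// so the response does not reveal whether the sid exists.
function authorizeBundle(
  auth: RuntimeAuth | null,
  session: SessionRow | undefined,
): 200 | 401 | 404 {
  if (!auth) return 401;
  if (!session || session.tenant_id !== auth.tenant_id) return 404;
  return 200;
}

const auth: RuntimeAuth = { tenant_id: "tenant_a" };
const own: SessionRow = { id: "sid_1", tenant_id: "tenant_a" };
const foreign: SessionRow = { id: "sid_2", tenant_id: "tenant_b" };

const statuses = [
  authorizeBundle(auth, own), // own session
  authorizeBundle(auth, undefined), // missing sid
  authorizeBundle(auth, foreign), // another tenant's sid
  authorizeBundle(null, own), // no valid bearer
];
```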

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>