
Port Registrar: browser agent + Chrome extension migration + origin allowlist#2345

Merged
TalZaccai merged 13 commits into main from talzacc/browser-agent-migration
May 19, 2026

Conversation

@TalZaccai
Contributor

Depends on PR 2 (code agent migration). Branched off talzacc/code-and-localview-migration and must merge after it. Reuses the discoverPort() helper, the registrar SDK, and the async-start/close pattern proven there.

Why

PR 2 migrated the code agent and proved the discovery handshake against an in-repo VS Code extension. The browser agent is the second concrete migration and exercises a meaningfully different consumer shape:

  • The Chrome / Edge extension is out-of-process and distributed separately — discovery has to work from a service worker (MV3) over isomorphic-ws, not Node ws.
  • A single browser-agent server has two simultaneous client populations — the extension(s) and the Electron shell's inline browser — sharing one session id; the lifecycle / refcount accounting differs from code's per-schema model.
  • The Electron shell is the most-used host, so the BROWSER_WEBSOCKET_PORT escape hatch matters more for debugging than CODE_WEBSOCKET_PORT did.

This PR also closes a design gap that PR 2 quietly skipped: per design §4.2, every per-agent listener migrated to the PortRegistrar must Origin-gate WebSocket upgrades to keep an OS-assigned ephemeral port from being dialed by arbitrary web pages on the same host. Both the browser AND the code agent get the gate in this PR.

What this PR changes

browser agent server: async start + close + Origin gate

AgentWebSocketServer is now an async start(port = 0) factory rather than a constructor. It resolves only after the listening event so the OS-assigned port is readable synchronously via .port, and rejects on EADDRINUSE under fixed-port overrides instead of swallowing the error in a permanent error listener. close() now awaits server.close() and proactively closes all tracked client sockets first, so a rapid disable → re-enable cycle under BROWSER_WEBSOCKET_PORT rebinds cleanly.
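A minimal sketch of the start-after-listening shape described above. It uses Node's built-in net module in place of the ws server so that it runs standalone; the class name ListenerSketch and its internals are illustrative assumptions, not the PR's AgentWebSocketServer.

```typescript
import { createServer } from "node:net";
import type { AddressInfo, Server, Socket } from "node:net";

// Hypothetical stand-in for the pattern only; the real server wraps `ws`.
class ListenerSketch {
    private readonly clients = new Set<Socket>();

    private constructor(
        private readonly server: Server,
        public readonly port: number,
    ) {
        server.on("connection", (socket) => {
            this.clients.add(socket);
            socket.once("close", () => this.clients.delete(socket));
        });
    }

    // Resolves only after "listening", so `port` is already known to callers.
    // Rejects (for example with EADDRINUSE) if the bind fails.
    static start(port = 0): Promise<ListenerSketch> {
        return new Promise((resolve, reject) => {
            const server = createServer();
            server.once("error", reject);
            server.listen(port, "127.0.0.1", () => {
                server.off("error", reject);
                const { port: bound } = server.address() as AddressInfo;
                resolve(new ListenerSketch(server, bound));
            });
        });
    }

    // Drops tracked clients first so server.close() is not held open by them.
    close(): Promise<void> {
        for (const socket of this.clients) socket.destroy();
        return new Promise((resolve, reject) => {
            this.server.close((err) => (err ? reject(err) : resolve()));
        });
    }
}
```

Because start() only resolves once the socket is bound, a caller can read .port immediately and a fixed-port collision surfaces as a rejected promise rather than a swallowed event.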

The new verifyClient callback applies isAllowedAgentOrigin() to every upgrade request before the connection event fires. Allowed: chrome-extension://*, moz-extension://*, http(s)://localhost(:port), http(s)://127.0.0.1(:port), and absent / null Origin (Node ws clients don't send one; the bridge binds loopback so this is OS-restricted). Anything else gets HTTP 403.
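The allowlist above can be sketched as a pure predicate. This is an assumption about shape, not the PR's isAllowedAgentOrigin(): the function name, the constants, and the choice to treat the literal header value "null" the same as an absent header are all illustrative.

```typescript
// Hypothetical sketch; the real isAllowedAgentOrigin() may differ.
const EXTENSION_SCHEMES = ["chrome-extension:", "moz-extension:"];
const LOOPBACK_HOSTS = new Set(["localhost", "127.0.0.1"]);

function isAllowedAgentOriginSketch(origin: string | undefined): boolean {
    // Absent Origin: Node ws clients do not send one.
    // Assumption: the literal value "null" is treated the same way.
    if (origin === undefined || origin === "" || origin === "null") {
        return true;
    }
    let url: URL;
    try {
        url = new URL(origin);
    } catch {
        return false; // unparseable Origin is rejected
    }
    if (EXTENSION_SCHEMES.includes(url.protocol)) {
        return true;
    }
    // Compare the parsed hostname, not a string prefix, so that
    // "http://localhost.evil.example.com" does not slip through.
    const isHttp = url.protocol === "http:" || url.protocol === "https:";
    return isHttp && LOOPBACK_HOSTS.has(url.hostname);
}
```

With the ws library, a predicate like this would typically be called from the verifyClient option, which receives the request's Origin as info.origin and can reject the upgrade with a status code such as 403.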

code agent: Origin parity

CodeAgentWebSocketServer gets the same gate with a code-agent-specific allowlist (vscode-webview://, vscode-file://, vscode-resource:// plus loopback / absent Origin). The two originAllowlist files are deliberately duplicated rather than shared — different scheme prefixes per agent, ~30 LOC each, and a shared module would force every agent that opens a port to depend on a new common package.

Handler lifecycle: refcounted shared server, idempotent cleanup

browserActionHandler.mts no longer constructs the server at module load. It now mirrors PR 2's pattern: module-scoped sharedBrowserServer, sharedStartingPromise, sharedClosingPromise, sharedBrowserRefCount, with an ensureSharedBrowserServer() that serializes concurrent enables and waits for any in-flight close before re-binding (matters under fixed-port override; with port = 0 the OS would just hand out a different port).

The browser agent's lifecycle differs from code's in two ways worth flagging for reviewers:

  1. Per-session refcount, not per-schema: updateBrowserContext early-returns for non-browser schemas, so refcount accounting tracks agentContext.browserSchemaEnabled (always 0 or 1 per session) rather than enabled.size.
  2. Two cleanup paths converge on one helper — the prior code only released its session registration in closeBrowserContext, leaking on the disable path. This PR funnels both updateAgentContext(false, ...) and closeBrowserContext through cleanupBrowserSession(), which is idempotent: if browserSchemaEnabled is already false, it bails without touching shared state.
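The refcounted shared server and the idempotent cleanup described above can be sketched roughly as follows. Every name here (ServerLike, SessionContextSketch, the *Sketch functions) is hypothetical; only the overall shape (module-scoped server, starting promise, closing promise, refcount) comes from the description.

```typescript
interface ServerLike {
    close(): Promise<void>;
}
interface SessionContextSketch {
    browserSchemaEnabled: boolean;
}

let sharedServer: ServerLike | undefined;
let sharedStarting: Promise<ServerLike> | undefined;
let sharedClosing: Promise<void> | undefined;
let sharedRefCount = 0;

// Serializes concurrent enables and waits for any in-flight close first.
async function ensureSharedServerSketch(
    start: () => Promise<ServerLike>,
): Promise<ServerLike> {
    while (sharedClosing) await sharedClosing.catch(() => undefined);
    if (sharedServer) return sharedServer;
    if (!sharedStarting) {
        sharedStarting = start()
            .then((server) => {
                sharedServer = server;
                return server;
            })
            .finally(() => {
                sharedStarting = undefined;
            });
    }
    return sharedStarting;
}

async function enableSessionSketch(
    ctx: SessionContextSketch,
    start: () => Promise<ServerLike>,
): Promise<void> {
    if (ctx.browserSchemaEnabled) return; // already counted (0 or 1 per session)
    ctx.browserSchemaEnabled = true;
    sharedRefCount++;
    try {
        await ensureSharedServerSketch(start);
    } catch (err) {
        // Roll back unless a cleanup already released this session.
        if (ctx.browserSchemaEnabled) {
            ctx.browserSchemaEnabled = false;
            sharedRefCount--;
        }
        throw err;
    }
}

// Both the disable path and the close path would call this one helper.
async function cleanupSessionSketch(ctx: SessionContextSketch): Promise<void> {
    if (!ctx.browserSchemaEnabled) return; // idempotent: nothing to release
    ctx.browserSchemaEnabled = false;
    sharedRefCount--;
    if (sharedRefCount > 0) return;
    if (sharedStarting) await sharedStarting.catch(() => undefined);
    if (sharedRefCount > 0 || !sharedServer) return; // re-enabled meanwhile
    const server = sharedServer;
    sharedServer = undefined;
    sharedClosing = server.close().finally(() => {
        sharedClosing = undefined;
    });
    await sharedClosing;
}
```

Keeping the flag check at the top of the cleanup helper is what makes a second call from the other path a no-op rather than a double decrement.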

BROWSER_WEBSOCKET_PORT is the new escape hatch (mirrors CODE_WEBSOCKET_PORT); malformed values fall back to OS-assigned with a debug log rather than crashing.
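One way the fall-back-on-malformed behavior could look, as a sketch: the helper name resolveListenPortSketch, the accepted range, and the log wording are assumptions. The caller would pass in the raw value of the BROWSER_WEBSOCKET_PORT environment variable.

```typescript
// Hypothetical helper: returns a fixed port when the override is valid,
// otherwise 0 so the OS assigns an ephemeral port.
function resolveListenPortSketch(
    raw: string | undefined,
    debugLog: (message: string) => void = () => {},
): number {
    if (raw === undefined || raw.trim() === "") {
        return 0; // no override set
    }
    const trimmed = raw.trim();
    const port = /^\d+$/.test(trimmed) ? Number(trimmed) : NaN;
    if (!Number.isInteger(port) || port < 1 || port > 65535) {
        debugLog(
            `Ignoring malformed BROWSER_WEBSOCKET_PORT=${JSON.stringify(raw)}; ` +
                "falling back to an OS-assigned port",
        );
        return 0;
    }
    return port;
}
```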

Chrome extension: discovery-then-connect

extension/serviceWorker/websocket.ts now does the same dance the Coda extension does in PR 2:

  1. Read agentServerHost from chrome.storage.sync (default ws://localhost:8999/).
  2. Call discoverPort("browser", "default", { url: agentServerHost }) against the agent-server's discovery channel.
  3. Construct the browser-agent URL by reusing the agent-server host and substituting the discovered port.
  4. On unreachable or not-registered, return undefined and let the 5s reconnect loop retry. No hardcoded fallback — same call the user made for PR 2 (this is a research project, no lingering external clients to maintain back-compat for).
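The four steps above can be sketched as a single resolver. The discoverPort signature used here (a promise of a port number, or undefined when nothing is registered) is an assumption about the registrar SDK, and it is passed in as a parameter so the sketch stays self-contained; the parameter names agentName and key are guesses at what the two string arguments mean.

```typescript
// Assumed shape of the SDK call; the real discoverPort() may differ.
type DiscoverPortSketch = (
    agentName: string,
    key: string,
    options: { url: string },
) => Promise<number | undefined>;

async function resolveBrowserEndpointSketch(
    agentServerHost: string,
    discoverPort: DiscoverPortSketch,
): Promise<string | undefined> {
    let port: number | undefined;
    try {
        port = await discoverPort("browser", "default", { url: agentServerHost });
    } catch {
        return undefined; // agent-server unreachable: let the reconnect loop retry
    }
    if (port === undefined) {
        return undefined; // not registered yet, and no hardcoded fallback
    }
    // Reuse the agent-server host and scheme; substitute only the port.
    const endpoint = new URL(agentServerHost);
    endpoint.port = String(port);
    return endpoint.toString();
}
```

Returning undefined for both failure modes keeps the caller simple: it either gets a URL to dial or waits for the next retry.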

Three pre-existing extension bugs are fixed in the same patch since they directly block the new flow:

  • Settings cache never invalidated — the module-level settings cache was loaded once and never refreshed, so a user changing agentServerHost saw the old endpoint until the service worker restarted. Cache removed; settings re-read on every createWebSocket.
  • Reconnect timer leak: reconnectWebSocket() scheduled a fresh setInterval on every call, and the onclose handler called it unconditionally, so flapping connectivity stacked timers indefinitely. Now guarded by a module-level singleton.
  • Options page rejected blank input — the WebSocket-host field rejected empty strings, so users couldn't say "use the default". Now blank means "use AGENT_SERVER_DEFAULT_URL".
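The reconnect-timer fix in particular is a small guard. A sketch of the singleton idea, with hypothetical names (reconnectWebSocketSketch, tryConnect) and the 5s interval as the default:

```typescript
// Module-level handle: at most one retry loop exists at a time.
let reconnectTimer: ReturnType<typeof setInterval> | undefined;

function reconnectWebSocketSketch(
    tryConnect: () => Promise<boolean>,
    intervalMs = 5000,
): void {
    if (reconnectTimer !== undefined) {
        return; // a retry loop is already running; do not stack another
    }
    reconnectTimer = setInterval(async () => {
        let connected = false;
        try {
            connected = await tryConnect();
        } catch {
            // Treat a thrown error as "still disconnected" and retry later.
        }
        if (connected && reconnectTimer !== undefined) {
            clearInterval(reconnectTimer);
            reconnectTimer = undefined;
        }
    }, intervalMs);
}
```

An onclose handler can then call this unconditionally; repeated calls during a flapping connection leave a single timer in place.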

The setting itself is renamed websocketHost → agentServerHost to match the dispatcher path (dispatcherConnection.ts already used this key) and reflect that the user now configures the agent-server URL, not the per-agent port.

Validation

  • pnpm --filter browser-typeagent build — green (full agent + extension + Vite client builds).
  • pnpm --filter agent-sdk --filter agent-rpc --filter agent-dispatcher --filter browser-typeagent build — green (post-fix rebuild across all touched packages).
  • pnpm --filter code-agent build && pnpm --filter code-agent test — green (18/18 still pass).
  • pnpm --filter agent-dispatcher test: 669/669 pass, including agentReadiness.spec.js and sessionContext.spec.js.
  • pnpm --filter browser-typeagent test:local: 253/254 pass. The one failure is a pre-existing flake in contentDownloader.test.ts ("invalid URL gracefully" timing out at 5s); the file isn't touched by this PR (last edited in bbcb8f30).
  • New / updated unit tests:
    • agentWebSocketServer.test.ts: 28 tests. Exercises async start, close closing tracked clients first, and the Origin allowlist (covers chrome-extension://, http://localhost:8081, absent Origin, and rejection of https://evil.example.com).
    • websocket.test.ts: 5 tests. Covers discoverPort-driven endpoint resolution and asserts only one setInterval is scheduled across repeated reconnectWebSocket() calls (the singleton fix).
  • Manual smoke test against the pr-3-manual-tests.md checklist confirmed end-to-end after both fixes: extension auto-discovers the port, @browser open a new tab to bing.com dispatches without manual @config agent refresh browser, and getWebFlowsForDomain round-trips successfully through the SW ↔ agent webAgent proxy.

Reading order for reviewers

This PR is two commits (the follow-up commit is intentionally separate to keep the smoke-test fixes reviewable in isolation). Suggested reading order:

  1. agents/browser/src/agent/{agentWebSocketServer,originAllowlist}.mts + tests — the server-side primitives (async start/close, Origin gate). Identical structure to the PR 2 CodeAgentWebSocketServer.
  2. agents/browser/src/agent/{browserActionHandler,browserActions}.mts — the lifecycle wiring. The interesting bits are ensureSharedBrowserServer / cleanupBrowserSession and the convergence of the disable + close paths. Also contains the localHostPort = 0 reset and the notifyReadinessChanged() calls in the WS lifecycle callbacks.
  3. agents/code/src/{codeAgentWebSocketServer,originAllowlist}.ts — Origin parity for the code agent (small).
  4. agents/browser/src/extension/serviceWorker/{websocket,storage,index}.ts + views/options.{ts,html} — the extension consumer. Demonstrates discoverPort() from MV3 and the three bug fixes (cache, reconnect singleton, blank-input).
  5. agentSdk/src/agentInterface.ts + dispatcher/src/execute/sessionContext.ts + agentRpc/src/{types,client,server}.ts — the new notifyReadinessChanged() API and its in-process + RPC plumbing. Mirrors the existing reloadAgentSchema() pattern.

Follow-up PRs

PR | Scope
4  | Migrate visualStudio host webview
5  | Migrate onboarding-scaffolder + tighten originAllowlist on agentServer

@TalZaccai TalZaccai force-pushed the talzacc/browser-agent-migration branch from e5ac9b9 to 11719be Compare May 14, 2026 20:58
@TalZaccai TalZaccai changed the title Port Registrar — browser Agent + Chrome Extension Migration + Origin Allowlist Port Registrar — browser Agent + Chrome Extension Migration + Origin Allowlist May 14, 2026
@TalZaccai TalZaccai changed the title Port Registrar — browser Agent + Chrome Extension Migration + Origin Allowlist Port Registrar: browser Agent + Chrome Extension Migration + Origin Allowlist May 14, 2026
@TalZaccai TalZaccai changed the title Port Registrar: browser Agent + Chrome Extension Migration + Origin Allowlist Port Registrar: browser agent + Chrome extension migration + origin allowlist May 14, 2026
@TalZaccai TalZaccai force-pushed the talzacc/browser-agent-migration branch from 0794f00 to 2fd8262 Compare May 15, 2026 19:28
@TalZaccai TalZaccai requested a review from Copilot May 15, 2026 19:30
TalZaccai and others added 8 commits May 15, 2026 17:03
…fork-failure port reset

- originAllowlist (browser+code): accept '[::1]' (Node URL parser preserves brackets in hostname) so IPv6 loopback dev clients pass the Origin gate.

- browserActionHandler.createViewServiceHost: reset agentContext.localHostPort to 0 in the synchronous fork catch path so a retried enable can re-register a fresh port (closes the third EADDRINUSE-recurrence path; the disable + close paths were fixed in the prior commit).

- Add agentWebSocketServer.test.ts cases for http://[::1], http://[::1]:8081, https://[::1]:5173.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
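The bracket detail in the commit above is easy to check in isolation. A short sketch, with a hypothetical helper name, showing why the allowlist entry has to be '[::1]' rather than '::1':

```typescript
// The WHATWG URL parser keeps the brackets on IPv6 hostnames, so a
// loopback set has to store "[::1]" with brackets to match.
const LOOPBACK_HOSTNAMES_SKETCH = new Set(["localhost", "127.0.0.1", "[::1]"]);

function isLoopbackOriginSketch(origin: string): boolean {
    try {
        return LOOPBACK_HOSTNAMES_SKETCH.has(new URL(origin).hostname);
    } catch {
        return false; // unparseable origins are never loopback
    }
}

const ipv6Hostname = new URL("http://[::1]:8081").hostname; // "[::1]"
```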
…swallow

Two new tests in sessionContext.spec.ts:

- delegates to agents.refreshReadiness with the agent's own name

- swallows errors from refreshReadiness so an event-driven caller (WS onClientConnected/onClientDisconnected) cannot throw into the emitter path

Covers the contract added in commit e5ac9b9 that wires browser-extension connect/disconnect events to automatic dispatcher readiness refresh, removing the need for users to run @config agent refresh browser manually.

Tests: 671 passed, 671 total (was 669).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PR 3 migrated the browser agent + Chrome extension off hardcoded port 8081 to dynamic ports via the agent-server discovery channel, but missed the shell's inline-browser -> browser-agent WebSocket caller in browserIpc.ts which still hardcoded 8081 via createWebSocket(...).

In connect mode this caused the inlineBrowser client to silently fail to connect, leaving the agent-server with only the Chrome extension as an active client and routing all browser actions there even when the user expected the inline browser to handle them.

Replace the hardcoded createWebSocket(...,8081,...) call with a discoverPort(...) lookup against the agent-server (default ws://localhost:8999/, overridable via WEBSOCKET_HOST), then open the WebSocket directly with the resolved host:port. Mirrors the chrome extension's resolveBrowserEndpoint flow in serviceWorker/websocket.ts.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PR 3 migrated the browser agent off hardcoded port 8081 onto OS-assigned ports discovered through the agent-server's WS-fronted discovery channel. The discovery channel was only hosted by the agentServer process, so when the Electron shell ran in standalone mode (its own in-process dispatcher, no separate agentServer), the Chrome extension had no way to find the in-process browser agent's port and silently failed to connect.

The PortRegistrar already lives in the dispatcher (good), so both host modes have the data; the standalone shell just wasn't exposing it. Fix:

  * Extract the discovery handler factory to @typeagent/agent-server-protocol as createDiscoveryHandlers so both hosts share one definition (server.ts refactored to use it; protocol stays dispatcher-free by accepting a lookup callback rather than IPortRegistrar).

  * Standalone shell pre-builds a PortRegistrar, hands it to createDispatcher, and starts a tiny WebSocket server on AGENT_SERVER_DEFAULT_PORT (8999) hosting the discovery channel. Self-registers under AGENT_SERVER_DISCOVERY_NAME for symmetry with agentServer.

  * Bind is exact (no fallback): EADDRINUSE on 8999 means a real agentServer or another shell instance is already there, in which case silently rebinding to a random port would only confuse users whose extensions point at the default. Logged as a shell-init error so the user can decide.

  * closeDispatcher tears down the discovery server alongside the dispatcher to avoid leaking the listening socket on shutdown.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Address code-review findings on PR 3:

1. (major) instance.ts could leak the standalone discovery WS if a later init step (createDispatcher etc.) threw after the discovery server bound port 8999. The catch-block returned without closing the listening socket, so the next shell launch would EADDRINUSE. Hoist standaloneDiscovery above the try{} so the catch can see it and tear it down.

2. (minor) browserIpc.ts and webSocketUtils.ts treat the WEBSOCKET_HOST env var with subtly different semantics (base URL vs full endpoint). Document the divergence inline so users know to set a base URL without a path.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
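The hoisting fix in finding 1 follows a common shape, sketched here with hypothetical names (initShellSketch, Closable); the real instance.ts is not shown in this conversation, so the details are assumptions.

```typescript
interface Closable {
    close(): Promise<void>;
}

async function initShellSketch<T>(
    startDiscovery: () => Promise<Closable>,
    createDispatcher: () => Promise<T>,
): Promise<T | undefined> {
    // Declared outside the try block so the catch block can reach it.
    let standaloneDiscovery: Closable | undefined;
    try {
        standaloneDiscovery = await startDiscovery();
        return await createDispatcher();
    } catch {
        // A later init step failed: release the listening socket so the
        // next launch does not hit EADDRINUSE on the same port.
        await standaloneDiscovery?.close();
        return undefined;
    }
}
```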
Document the discovery channel host, transport, and shared factory:
- agents/browser/README.md: discovery channel hosted by both agentServer
  and standalone Electron shell
- agentServer/protocol/README.md: DiscoveryChannelName, DiscoveryInvokeFunctions,
  and the createDiscoveryHandlers shared factory pattern
- shell/README.md: in-process discovery WS in local mode and EADDRINUSE
  fail-loud behavior on port 8999 collisions

Also sort ts/packages/shell/package.json deps so npm-package-sort-metadata
repo policy check passes (the websocket-channel-server addition was placed
out of order).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@TalZaccai TalZaccai force-pushed the talzacc/browser-agent-migration branch from 1d28a7b to 4594fc8 Compare May 16, 2026 00:06
Reformats files that drifted from prettier's preferred style:
- agentServer/protocol/README.md (added in the discovery docs commit)
- agents/browser/src/extension/views/options.html
- agents/browser/test/serviceWorker/websocket.test.ts

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@TalZaccai TalZaccai marked this pull request as ready for review May 18, 2026 23:22
@TalZaccai TalZaccai requested a review from robgruen May 18, 2026 23:25
@TalZaccai TalZaccai mentioned this pull request May 19, 2026
Comment thread ts/packages/agents/browser/src/extension/serviceWorker/index.ts
Comment thread ts/packages/agents/code/src/codeAgentWebSocketServer.ts Outdated
…ures

- Rewrote three `Per design §4.2` comments (originAllowlist in code + browser agents, codeAgentWebSocketServer) to be self-contained; they pointed at a doc not present in the repo.

- serviceWorker/index.ts: pass the JSON.parse error to debugWebAgentProxy instead of swallowing it silently, so malformed payloads on the webAgent proxy channel are diagnosable via debug output.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@TalZaccai TalZaccai temporarily deployed to development-fork May 19, 2026 18:17 — with GitHub Actions Inactive
@TalZaccai TalZaccai added this pull request to the merge queue May 19, 2026
Merged via the queue into main with commit 1992593 May 19, 2026
22 of 24 checks passed
2 participants