Port Registrar: browser agent + Chrome extension migration + origin allowlist#2345
Merged
Conversation
e5ac9b9 to
11719be
Compare
browser Agent + Chrome Extension Migration + Origin Allowlist0794f00 to
2fd8262
Compare
…fork-failure port reset - originAllowlist (browser+code): accept '[::1]' (Node URL parser preserves brackets in hostname) so IPv6 loopback dev clients pass the Origin gate. - browserActionHandler.createViewServiceHost: reset agentContext.localHostPort to 0 in the synchronous fork catch path so a retried enable can re-register a fresh port (closes the third EADDRINUSE-recurrence path; the disable + close paths were fixed in the prior commit). - Add agentWebSocketServer.test.ts cases for http://[::1], http://[::1]:8081, https://[::1]:5173. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…swallow Two new tests in sessionContext.spec.ts: - delegates to agents.refreshReadiness with the agent's own name - swallows errors from refreshReadiness so an event-driven caller (WS onClientConnected/onClientDisconnected) cannot throw into the emitter path Covers the contract added in commit e5ac9b9 that wires browser-extension connect/disconnect events to automatic dispatcher readiness refresh, removing the need for users to run \@config agent refresh browser\ manually. Tests: 671 passed, 671 total (was 669). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PR 3 migrated the browser agent + Chrome extension off hardcoded port 8081 to dynamic ports via the agent-server discovery channel, but missed the shell's inline-browser -> browser-agent WebSocket caller in browserIpc.ts which still hardcoded 8081 via createWebSocket(...). In connect mode this caused the inlineBrowser client to silently fail to connect, leaving the agent-server with only the Chrome extension as an active client and routing all browser actions there even when the user expected the inline browser to handle them. Replace the hardcoded createWebSocket(...,8081,...) call with a discoverPort(...) lookup against the agent-server (default ws://localhost:8999/, overridable via WEBSOCKET_HOST), then open the WebSocket directly with the resolved host:port. Mirrors the chrome extension's resolveBrowserEndpoint flow in serviceWorker/websocket.ts. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PR 3 migrated the browser agent off hardcoded port 8081 onto
OS-assigned ports discovered through the agent-server's WS-fronted
discovery channel. The discovery channel was only hosted by the
agentServer process, so when the Electron shell ran in standalone
mode (its own in-process dispatcher, no separate agentServer), the
Chrome extension had no way to find the in-process browser agent's
port and silently failed to connect.
The PortRegistrar already lives in the dispatcher (good), so both
host modes have the data; the standalone shell just wasn't exposing
it. Fix:
* Extract the discovery handler factory to
@typeagent/agent-server-protocol as createDiscoveryHandlers
so both hosts share one definition (server.ts refactored to
use it; protocol stays dispatcher-free by accepting a lookup
callback rather than IPortRegistrar).
* Standalone shell pre-builds a PortRegistrar, hands it to
createDispatcher, and starts a tiny WebSocket server on
AGENT_SERVER_DEFAULT_PORT (8999) hosting the discovery channel.
Self-registers under AGENT_SERVER_DISCOVERY_NAME for symmetry
with agentServer.
* Bind is exact (no fallback): EADDRINUSE on 8999 means a real
agentServer or another shell instance is already there, in
which case silently rebinding to a random port would only
confuse users whose extensions point at the default. Logged as
a shell-init error so the user can decide.
* closeDispatcher tears down the discovery server alongside the
dispatcher to avoid leaking the listening socket on shutdown.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Address code-review findings on PR 3:
1. (major) instance.ts could leak the standalone discovery WS if a
later init step (createDispatcher etc.) threw after the discovery
server bound port 8999. The catch-block returned without closing
the listening socket, so the next shell launch would EADDRINUSE.
Hoist standaloneDiscovery above the try{} so the catch can see it
and tear it down.
2. (minor) browserIpc.ts and webSocketUtils.ts treat the WEBSOCKET_HOST
env var with subtly different semantics (base URL vs full endpoint).
Document the divergence inline so users know to set a base URL
without a path.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Document the discovery channel host, transport, and shared factory: - agents/browser/README.md: discovery channel hosted by both agentServer and standalone Electron shell - agentServer/protocol/README.md: DiscoveryChannelName, DiscoveryInvokeFunctions, and the createDiscoveryHandlers shared factory pattern - shell/README.md: in-process discovery WS in local mode and EADDRINUSE fail-loud behavior on port 8999 collisions Also sort ts/packages/shell/package.json deps so npm-package-sort-metadata repo policy check passes (the websocket-channel-server addition was placed out of order). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1d28a7b to
4594fc8
Compare
Reformats files that drifted from prettier's preferred style: - agentServer/protocol/README.md (added in the discovery docs commit) - agents/browser/src/extension/views/options.html - agents/browser/test/serviceWorker/websocket.test.ts Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Closed
robgruen
approved these changes
May 19, 2026
…ures - Rewrote three `Per design §4.2` comments (originAllowlist in code + browser agents, codeAgentWebSocketServer) to be self-contained; they pointed at a doc not present in the repo. - serviceWorker/index.ts: pass the JSON.parse error to debugWebAgentProxy instead of swallowing it silently, so malformed payloads on the webAgent proxy channel are diagnosable via debug output. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Why
PR 2 migrated the
codeagent and proved the discovery handshake against an in-repo VS Code extension. Thebrowseragent is the second concrete migration and exercises a meaningfully different consumer shape:isomorphic-ws, not Nodews.code's per-schema model.BROWSER_WEBSOCKET_PORTescape hatch matters more for debugging thanCODE_WEBSOCKET_PORTdid.This PR also closes a design gap that PR 2 quietly skipped: per design §4.2, every per-agent listener migrated to the
PortRegistrarmust Origin-gate WebSocket upgrades to keep an OS-assigned ephemeral port from being dialed by arbitrary web pages on the same host. Both the browser AND the code agent get the gate in this PR.What this PR changes
browseragent server: asyncstart+close+ Origin gateAgentWebSocketServeris now an asyncstart(port = 0)factory rather than a constructor. It resolves only after thelisteningevent so the OS-assigned port is readable synchronously via.port, and rejects on EADDRINUSE under fixed-port overrides instead of swallowing the error in a permanenterrorlistener.close()now awaitsserver.close()and proactively closes all tracked client sockets first, so a rapiddisable → re-enablecycle underBROWSER_WEBSOCKET_PORTrebinds cleanly.The new
verifyClientcallback appliesisAllowedAgentOrigin()to every upgrade request before theconnectionevent fires. Allowed:chrome-extension://*,moz-extension://*,http(s)://localhost(:port),http(s)://127.0.0.1(:port), and absent /nullOrigin (Nodewsclients don't send one; the bridge binds loopback so this is OS-restricted). Anything else gets HTTP 403.codeagent: Origin parityCodeAgentWebSocketServergets the same gate with a code-agent-specific allowlist (vscode-webview://,vscode-file://,vscode-resource://plus loopback / absent Origin). The twooriginAllowlistfiles are deliberately duplicated rather than shared — different scheme prefixes per agent, ~30 LOC each, and a shared module would force every agent that opens a port to depend on a new common package.Handler lifecycle: refcounted shared server, idempotent cleanup
browserActionHandler.mtsno longer constructs the server at module load. It now mirrors PR 2's pattern: module-scopedsharedBrowserServer,sharedStartingPromise,sharedClosingPromise,sharedBrowserRefCount, with anensureSharedBrowserServer()that serializes concurrent enables and waits for any in-flight close before re-binding (matters under fixed-port override; withport = 0the OS would just hand out a different port).The browser agent's lifecycle differs from
code's in two ways worth flagging for reviewers:updateBrowserContextearly-returns for non-browserschemas, so refcount accounting tracksagentContext.browserSchemaEnabled(always 0 or 1 per session) rather thanenabled.size.closeBrowserContext, leaking on the disable path. This PR funnels bothupdateAgentContext(false, ...)andcloseBrowserContextthroughcleanupBrowserSession(), which is idempotent: ifbrowserSchemaEnabledis already false, it bails without touching shared state.BROWSER_WEBSOCKET_PORTis the new escape hatch (mirrorsCODE_WEBSOCKET_PORT); malformed values fall back to OS-assigned with a debug log rather than crashing.Chrome extension: discovery-then-connect
extension/serviceWorker/websocket.tsnow does the same dance the Coda extension does in PR 2:agentServerHostfromchrome.storage.sync(defaultws://localhost:8999/).discoverPort("browser", "default", { url: agentServerHost })against the agent-server's discovery channel.unreachableornot-registered, returnundefinedand let the 5s reconnect loop retry. No hardcoded fallback — same call the user made for PR 2 (this is a research project, no lingering external clients to maintain back-compat for).Three pre-existing extension bugs are fixed in the same patch since they directly block the new flow:
settingscache was loaded once and never refreshed, so a user changingagentServerHostsaw the old endpoint until the service worker restarted. Cache removed; settings re-read on everycreateWebSocket.reconnectWebSocket()scheduled a freshsetIntervalon every call, and theonclosehandler called it unconditionally, so flapping connectivity stacked timers indefinitely. Now guarded by a module-level singleton.AGENT_SERVER_DEFAULT_URL".The setting itself is renamed
websocketHost→agentServerHostto match the dispatcher path (dispatcherConnection.tsalready used this key) and reflect that the user now configures the agent-server URL, not the per-agent port.Validation
pnpm --filter browser-typeagent build— green (full agent + extension + Vite client builds).pnpm --filter agent-sdk --filter agent-rpc --filter agent-dispatcher --filter browser-typeagent build— green (post-fix rebuild across all touched packages).pnpm --filter code-agent build && pnpm --filter code-agent test— green (18/18 still pass).pnpm --filter agent-dispatcher test— 669/669 pass, includingagentReadiness.spec.jsandsessionContext.spec.js.pnpm --filter browser-typeagent test:local— 253/254 pass. The one failure is a pre-existing flake incontentDownloader.test.ts("invalid URL gracefully" timing out at 5s); the file isn't touched by this PR (last edited inbbcb8f30).agentWebSocketServer.test.ts— 28 tests. Exercises asyncstart,closeclosing tracked clients first, and the Origin allowlist (coverschrome-extension://,http://localhost:8081, absent Origin, and rejection ofhttps://evil.example.com).websocket.test.ts— 5 tests. CoversdiscoverPort-driven endpoint resolution and asserts only onesetIntervalis scheduled across repeatedreconnectWebSocket()calls (the singleton fix).pr-3-manual-tests.mdchecklist confirmed end-to-end after both fixes: extension auto-discovers the port,@browser open a new tab to bing.comdispatches without manual@config agent refresh browser, andgetWebFlowsForDomainround-trips successfully through the SW ↔ agent webAgent proxy.Reading order for reviewers
This PR is two commits (the follow-up commit is intentionally separate to keep the smoke-test fixes reviewable in isolation). Suggested reading order:
agents/browser/src/agent/{agentWebSocketServer,originAllowlist}.mts+ tests — the server-side primitives (async start/close, Origin gate). Identical structure to the PR 2CodeAgentWebSocketServer.agents/browser/src/agent/{browserActionHandler,browserActions}.mts— the lifecycle wiring. The interesting bits areensureSharedBrowserServer/cleanupBrowserSessionand the convergence of the disable + close paths. Also contains thelocalHostPort = 0reset and thenotifyReadinessChanged()calls in the WS lifecycle callbacks.agents/code/src/{codeAgentWebSocketServer,originAllowlist}.ts— Origin parity for the code agent (small).agents/browser/src/extension/serviceWorker/{websocket,storage,index}.ts+views/options.{ts,html}— the extension consumer. DemonstratesdiscoverPort()from MV3 and the three bug fixes (cache, reconnect singleton, blank-input).agentSdk/src/agentInterface.ts+dispatcher/src/execute/sessionContext.ts+agentRpc/src/{types,client,server}.ts— the newnotifyReadinessChanged()API and its in-process + RPC plumbing. Mirrors the existingreloadAgentSchema()pattern.Follow-up PRs
visualStudiohost webviewonboarding-scaffolder+ tightenoriginAllowlistonagentServer