feat(server): /screencast WebSocket route + ScreencastManager#1060
Conversation
Exposes a per-windowId screencast feed over a Hono WebSocket so external
consumers (agent-company first) can render the agent's live browser as
JPEG frames without a separate CDP attach.
- New `/screencast?windowId=N` route uses hono/bun `upgradeWebSocket`.
- `ScreencastManager` ref-counts subscribers per windowId: first viewer
starts `Page.startScreencast` against the active page in that window,
last viewer triggers `Page.stopScreencast`. Frames broadcast as
`{type:"frame", data:<base64 jpeg>, metadata}`; ack-gated so a slow
consumer can't queue frames unboundedly.
- `Browser.getActivePageForWindow(windowId)` resolves windowId to a
ProtocolApi session via `Browser.getActiveTab({windowId})` — works
for hidden windows where `Browser.getTabs({includeHidden:true})` may
not yet expose the tab.
- Quality / frame-size knobs live in `@browseros/shared/constants/limits`
as `SCREENCAST_LIMITS` (q=80, 1280x800, every nth frame).
- Integration test against live CDP covers subscribe → frame arrives →
unsubscribe cleanup.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
❌ Tests failed — 1/1064 failed
Failed tests
|
Greptile SummaryThis PR adds a
Confidence Score: 3/5Safe to merge for single-subscriber use; the concurrent-subscriber race in ScreencastManager can silently leave a second subscriber connected but receiving no frames. The concurrent-subscribe path shares a single pending startSession Promise across all callers for the same targetId. If the first caller's WebSocket closes before the Promise resolves, its continuation calls stopSession synchronously — removing the frame listener and deleting the session from the map — before the second caller's continuation adds its socket to the now-orphaned session. The second subscriber receives a status:connected message and possibly a cached frame, but live frames never arrive and no error is surfaced. packages/browseros-agent/apps/server/src/api/services/screencast/screencast-manager.ts — specifically the concurrent subscribe path through getOrStartSession when a pending start is shared between two callers and one disconnects mid-flight. Important Files Changed
Sequence DiagramsequenceDiagram
participant Client as WebSocket Client
participant Route as screencast.ts
participant Manager as ScreencastManager
participant Browser as Browser
participant CDP as CDP / Chromium
Client->>Route: "WS connect ?windowId=N&pageId=M"
Route->>Manager: subscribe(windowId, pageId, ws)
Manager->>Browser: getActivePageForWindow / getPageSession
Browser->>CDP: Target.attachToTarget + domain enables
CDP-->>Browser: sessionId
Browser-->>Manager: "{targetId, session, url}"
Manager->>CDP: Page.startScreencast (first subscriber)
CDP-->>Manager: ack
Manager->>Client: "status {connected, url}"
Manager->>Client: lastFrame (cached) OR captureScreenshot (idle page)
loop Each compositor frame
CDP->>Manager: screencastFrame event
Manager->>Client: "{type:frame, data, metadata}"
Manager->>CDP: Page.screencastFrameAck
end
Client->>Route: WS close
Route->>Manager: unsubscribe(handle, ws)
alt last subscriber
Manager->>CDP: Page.stopScreencast
end
Prompt To Fix All With AIFix the following 3 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 3
packages/browseros-agent/apps/server/src/api/services/screencast/screencast-manager.ts:73-107
**Concurrent subscriber orphaned when first subscriber closes during startup**
When two callers concurrently `subscribe` to the same `targetId`, they both await the same `startPromise` (via `pendingStarts`). If subscriber A's WebSocket closes while that Promise is in-flight, A's continuation runs first, passes the `readyState` guard, calls `stopIfIdle` → `stopSession`. `stopSession` synchronously calls `session.unsubscribeFrame()` (removing the CDP frame listener) and `sessions.delete(targetId)`, then yields to await `stopScreencast`. B's continuation then runs, adds `wsB` to the orphaned session, and receives a `status: connected` + `lastFrame`. But because the frame listener was removed by A's `stopSession`, live frames never arrive for B — the stream silently goes dark with no error surfaced to the client.
### Issue 2 of 3
packages/browseros-agent/packages/shared/src/constants/limits.ts:84-90
`SUBSCRIBER_BACKPRESSURE_BYTES` is exported but never imported or used anywhere in the screencast implementation — backpressure logic is not implemented. Per the dead code guideline, unused constants should be removed rather than left as aspirational scaffolding.
```suggestion
export const SCREENCAST_LIMITS = {
DEFAULT_JPEG_QUALITY: 80,
EVERY_NTH_FRAME: 1,
MAX_WIDTH: 1280,
MAX_HEIGHT: 800,
} as const
```
### Issue 3 of 3
packages/browseros-agent/apps/server/src/api/services/screencast/screencast-manager.ts:27-33
The `'detached'` variant of `ScreencastStatusMessage` is defined but never emitted — no call site in `ScreencastManager` ever sends `status: 'detached'`. Per the dead code rule, unused type variants should be removed rather than left as dead scaffolding.
```suggestion
export interface ScreencastStatusMessage {
type: 'status'
status: 'connected'
windowId: number
pageId?: number
url?: string
}
```
Reviews (3): Last reviewed commit: "fix(server): address Greptile P1s on scr..." | Re-trigger Greptile |
Chromium's Page.startScreencast only emits frames on compositor invalidation — a static page produces one frame on attach, then nothing. Late subscribers to an existing session (or any subscriber on an idle page) saw status:"connected" and a blank canvas forever. - Cache the most recent screencastFrame on each ScreencastSession and replay it to every new subscriber on subscribe. - If no frame has been cached yet (first subscriber to a quiet page), fall back to a one-shot Page.captureScreenshot and inject it as a synthetic frame so the canvas paints immediately. Best-effort — errors are logged but don't break the subscription. Verified against a static hidden chrome://newtab/: before the fix a late subscriber saw 0 frames over 4s; after, the primed screenshot lands within 16ms and the canvas paints. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`/screencast?windowId=N` always streamed the window's active tab, which loses the agent's intended target as soon as a new tab opens. Sessions now key off `targetId` and the WS route accepts an optional `pageId` query — the manager resolves it via `getPageSession(pageId)` (with a fallback `listPages()` refresh if the cache is stale) and falls back to the window's active tab when omitted. `attachTab` is shared between the active-tab and pageId paths so both go through the same CDP enable + Page.startScreencast pipeline. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
@greptile-ai review |
Three issues caught by review: 1. Console collection silently broken for screencast-first tabs. `attachTab` cached a Page.enable-only session in `Browser.sessions`. Agent tools later resolving the same `targetId` short-circuited on the cached entry and skipped the DOM/Runtime/Log/Accessibility enables + `consoleCollector.attach()`. `getConsoleLogs` returned nothing for those tabs with no error surfaced. Removed `attachTab` and routed both `getActivePageForWindow` and `getPageSession` through the canonical `attachToPage`. Active-tab path resolves the Browser-internal pageId via a new `ensurePageIdForTarget` helper (with a `listPages` refresh fallback). 2. Zombie subscriber when WS closes during startup. If the client disconnected while `subscribe`'s awaits were in flight, the route's `onClose` saw `handle === null` and skipped unsubscribe. `subscribe` then added the dead ws to `subscribers`; broadcast skipped it but never decremented, so `stopSession` never fired and the CDP screencast leaked. Now `subscribe` checks `ws.readyState` after the awaits and bails (calling `stopIfIdle` to clean up a freshly started session with no real subscribers). 3. `broadcast` dropped a failed subscriber without checking for empty set. If the failed send was the last subscriber, the session leaked. Extracted the empty-check into `stopIfIdle` and call it from `broadcast`, `unsubscribe`, and the close-during-startup path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
@greptile review |
When two subscribers share a pendingStart for the same targetId and the first one's ws closes mid-flight, the first continuation runs first, sees ws closed, and stopIfIdle → stopSession synchronously removes the frame listener and deletes the session from the map. The second continuation then adds its ws to the orphaned session object, sends `status:connected`, and never receives a frame. Mark sessions `disposed: true` inside stopSession and have subscribe loop back to getOrStartSession when it sees a disposed session — which is no longer in `this.sessions` so the next attempt opens a fresh stream. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
/screencastWebSocket route on the agent server. Streams the agent's browser as base64 JPEG frames overPage.startScreencast, so callers can render a live canvas without popping the native window.Page.startScreencast, last triggersstopScreencast. A slow consumer can't queue frames unboundedly.?pageId=Nfollows a specific Browser-internal page; omitted, it falls back to the window's active tab. Sessions are keyed offtargetId, so two viewers of the same tab share one CDP stream regardless of how each requested it.lastFrameimmediately; if none exists (idle page that hasn't repainted),Page.captureScreenshotprimes the canvas synthetically so the pane never sits black.@browseros/shared/constants/limitsasSCREENCAST_LIMITS.How it fits
agent-company mounts this WebSocket via an SSE proxy (paired PR https://github.com/browseros-ai/agent-company/pull/76) and renders frames into a canvas in the chat surface. The agent-side renderer lifts `pageId` off the latest browseros tool's input/output and threads it here, so the canvas follows whichever tab the agent is acting on.
API
Test plan
🤖 Generated with Claude Code