refactor(browser): route get_browser_state_with_recovery screenshot through ScreenshotEvent #43

Draft
caffeinum wants to merge 1 commit into webllm:main from caffeinum:refactor/screenshot-event-dispatch
@caffeinum
Contributor

Summary

BrowserSession.get_browser_state_with_recovery currently calls page.screenshot() directly when include_screenshot=true, bypassing ScreenshotWatchdog. This means:

  1. Highlights leak into screenshots. ScreenshotWatchdog.on_ScreenshotEvent already strips highlights via remove_highlights() before capture, but the direct path skips the watchdog entirely. Action-overlay bboxes/labels can occlude trailing characters of credentials, copy buttons, and other text — we've seen the agent hallucinate truncations (e.g. read e2b_*cf as e2b_*c) when overlays cover the right edge of a value.
  2. Playwright page.screenshot() is used instead of CDP. Other callers (screenshot action, observation refreshes) already route through take_screenshot() which uses Page.captureScreenshot via CDP. The direct path is the odd one out.
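For context on point 2, a CDP capture boils down to sending the Page.captureScreenshot command and reading the base64 `data` field of the reply. The sketch below is a minimal illustration under assumptions: `CdpSender`, `captureViaCdp`, and the fake sender are hypothetical names, not the session API used in this repo, and the real take_screenshot() may pass different parameters.

```typescript
// Hypothetical minimal shape of a CDP session; the real session type may differ.
interface CdpSender {
  send(method: string, params?: Record<string, unknown>): Promise<Record<string, unknown>>;
}

// Send Page.captureScreenshot and return the base64-encoded PNG, or null if absent.
async function captureViaCdp(cdp: CdpSender, beyondViewport: boolean): Promise<string | null> {
  const result = await cdp.send("Page.captureScreenshot", {
    format: "png",
    // Allows content outside the viewport to be captured; a true full-page
    // capture usually also needs a clip region, which is omitted here.
    captureBeyondViewport: beyondViewport,
  });
  const data = result["data"];
  return typeof data === "string" ? data : null;
}

// Fake sender used only to keep the example self-contained.
const cdpCalls: string[] = [];
const fakeCdp: CdpSender = {
  async send(method: string) {
    cdpCalls.push(method);
    return { data: "iVBORw0KGgo=" };
  },
};
```

Keeping the capture behind one function like this is what lets a single watchdog own both the highlight strip and the capture.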

Fix

Replace the inline page.screenshot() call with a ScreenshotEvent dispatch:

const dispatchResult = await this._withAbort(
  this.dispatch_browser_event(new ScreenshotEvent({ full_page: true })),
  signal
);
screenshot = (dispatchResult?.event?.event_result as string | null) ?? null;

ScreenshotWatchdog handles the strip and the CDP capture. Net diff: -26/+11 lines.
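The strip-then-capture ordering can be sketched in isolation. This is a simplified model, not the actual BrowserSession or ScreenshotWatchdog code: `PageLike`, `ScreenshotRequest`, `MiniBus`, and `registerScreenshotHandler` are hypothetical stand-ins, and only the ordering (remove highlights, then capture) comes from the description above.

```typescript
// Hypothetical stand-ins for illustration; not the real session/watchdog types.
interface PageLike {
  removeHighlights(): Promise<void>;
  capture(): Promise<string>;
}

type ScreenshotRequest = { fullPage: boolean };
type ScreenshotHandler = (event: ScreenshotRequest) => Promise<string | null>;

// A tiny single-handler bus: every screenshot caller goes through dispatch().
class MiniBus {
  handler: ScreenshotHandler | null = null;

  on(handler: ScreenshotHandler): void {
    this.handler = handler;
  }

  async dispatch(event: ScreenshotRequest): Promise<string | null> {
    return this.handler ? this.handler(event) : null;
  }
}

// The single strip site: highlights are removed before every capture.
function registerScreenshotHandler(bus: MiniBus, page: PageLike): void {
  bus.on(async () => {
    await page.removeHighlights();
    return page.capture();
  });
}

// Fake page that records call order and whether overlays were still visible.
const callOrder: string[] = [];
let overlaysVisible = true;
const fakePage: PageLike = {
  async removeHighlights() {
    callOrder.push("strip");
    overlaysVisible = false;
  },
  async capture() {
    callOrder.push("capture");
    return overlaysVisible ? "png-with-overlays" : "png-clean";
  },
};

const miniBus = new MiniBus();
registerScreenshotHandler(miniBus, fakePage);
```

Because callers only see `dispatch()`, no caller can skip the strip step, which is the property the direct page.screenshot() path was missing.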

Reference: Python upstream

Python browser-use solves the strip centrally in browser_use/browser/watchdogs/screenshot_watchdog.py:55-62, where remove_highlights() runs inside the screenshot pipeline so every screenshot path is covered. This PR makes the TS port consistent with that design: one screenshot codepath, one strip site.

Test plan

  • npx tsc --noEmit passes
  • Manual smoke: real Chromium against wikipedia.org with highlight_elements: true + include_screenshot: true → full-page PNG returned, no overlays visible, 94 interactive elements indexed.
  • Existing browser-session tests pass on CI (locally blocked on chromium install).

Behavioral diff vs the direct path

take_screenshot() short-circuits on about:blank/chrome://newtab/ to a 4-pixel placeholder string; the old direct path would have captured a real (tiny black) screenshot. This is consistent with how the other callers of take_screenshot() already behave.
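The short-circuit described above might look roughly like this. It is a sketch under assumptions: `PLACEHOLDER_SCREENSHOT`, `isBlankPageUrl`, and `screenshotOrPlaceholder` are hypothetical names, the placeholder value is a dummy rather than the real 4-pixel string, and the actual URL checks in take_screenshot() may differ.

```typescript
// Dummy placeholder value; the real 4-pixel placeholder string is not reproduced here.
const PLACEHOLDER_SCREENSHOT = "<placeholder>";

// URLs the PR description names as having nothing meaningful to capture.
const BLANK_URLS: string[] = ["about:blank", "chrome://newtab/"];

function isBlankPageUrl(url: string): boolean {
  return BLANK_URLS.indexOf(url) !== -1;
}

// Return the placeholder for blank pages; otherwise delegate to the real capture.
function screenshotOrPlaceholder(url: string, capture: () => string): string {
  return isBlankPageUrl(url) ? PLACEHOLDER_SCREENSHOT : capture();
}
```

Callers that previously compared or stored the tiny black screenshot for blank pages would see the placeholder instead, so any such comparison is worth checking during review.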

🤖 Generated with Claude Code

…hrough ScreenshotEvent

Replaces the inline remove_highlights + direct page.screenshot bypass in
get_browser_state_with_recovery with a ScreenshotEvent dispatch.
ScreenshotWatchdog already handles the highlight strip centrally, so this
collapses two strip sites into one and lets the watchdog handle bypass paths
the same way it handles the normal path.

CDP-based screenshot (via take_screenshot) replaces the playwright
page.screenshot call, matching the canonical screenshot path used elsewhere
in the session.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@caffeinum
Contributor Author

Cross-link: this PR is the centralized bbox-strip port mentioned as a follow-up in #42's "Out of scope" section.

It also closes a latent bug where the get_browser_state_with_recovery screenshot bypassed ScreenshotWatchdog entirely, meaning highlights leaked into every step's observation, not just credential-extract. We saw the agent hallucinate truncations (e2b_*cf → e2b_*c) on credential reads where overlays covered the trailing edge.

Smoke-tested locally with real Chromium against wikipedia.org (94 interactive elements, highlight_elements: true, include_screenshot: true). Returned PNG has zero overlays visible.
