Fix local browser startup ECONNREFUSED in CI by monadoid · Pull Request #1973 · browserbase/stagehand

monadoid · 2026-04-07T12:58:54Z

Why

Local Chromium startup in CI intermittently fails with ECONNREFUSED before tests ever reach their real assertions. The failures reproduce on a clean branch off main, so this PR is focused only on hardening local browser/session startup.

What changed

retries the full local V3.init() launch-and-connect path for transient startup failures instead of failing immediately on the first refused connection
adds failure-only launch diagnostics so if the flake returns we capture Chrome stdout/stderr plus devtools endpoint state
limits the retry logic to local browser startup so Browserbase-backed sessions are unaffected

Summary by cubic

Hardens CI by retrying and verifying local Chromium startup to reduce ECONNREFUSED flakes. Enables strict structured outputs across supported AI SDK providers.

Bug Fixes
- Retries local browser init once in CI on ECONNREFUSED/ECONNRESET and CDP-connect errors; waits and cleans up between attempts.
- Installs stable Chrome via browser-actions/setup-chrome and runs a preflight launch check before server-v3 integration and local E2E tests, passing CHROME_PATH to the verifier.
New Features
- Enables strict structured outputs: sets strict: true in OpenAIClient json_schema and adds provider options for openai, azure, google, vertex, anthropic, groq, cerebras, and mistral.
- Adds unit tests for AI SDK provider options; publishes a patch to @browserbasehq/stagehand.

^{Written for commit 134e98d. Summary will update on new commits. Review in cubic}

changeset-bot · 2026-04-07T12:59:07Z

🦋 Changeset detected

Latest commit: 134e98d

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 4 packages

Name	Type
@browserbasehq/stagehand	Patch
@browserbasehq/stagehand-evals	Patch
@browserbasehq/stagehand-server-v3	Patch
@browserbasehq/stagehand-server-v4	Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

cubic-dev-ai

2 issues found across 2 files

Confidence score: 2/5

There is a high-confidence startup failure path in packages/core/lib/v3/launch/local.ts where Chrome is not killed in the catch block, which can leak orphaned browser processes across retries and degrade host stability.
packages/core/lib/v3/launch/local.ts also reports raw /json/version and /json/list payloads in diagnostics, which may expose webSocketDebuggerUrl values and increase sensitive endpoint leakage risk.
Because both findings are concrete and user-impacting (resource leakage and unsanitized diagnostic output), this looks risky to merge without a targeted fix.
Pay close attention to packages/core/lib/v3/launch/local.ts - ensure failed launches always clean up Chrome and sanitize diagnostic data before logging.

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/core/lib/v3/launch/local.ts">

<violation number="1" location="packages/core/lib/v3/launch/local.ts:67">
P1: Chrome process is leaked on startup failure. The catch block emits diagnostics but never calls `chrome.kill()`, and the caller cannot clean it up because the `LaunchedChrome` object is never returned. On each retry this orphans a Chrome process, worsening the CI environment the PR is trying to fix.</violation>

<violation number="2" location="packages/core/lib/v3/launch/local.ts:277">
P1: Custom agent: **Exception and error message sanitization**

Raw `/json/version` and `/json/list` responses contain `webSocketDebuggerUrl` (a CDP connect URL) and are passed unsanitized into diagnostic output. Strip or redact the `webSocketDebuggerUrl` field from the captured devtools snapshots before emitting them through `onDiagnostic`.</violation>
</file>

Architecture diagram

sequenceDiagram
    participant Test as Test/Caller
    participant V3 as V3.init()
    participant LLC as launchLocalChrome
    participant Chrome as Chromium Process
    participant Dev as DevTools (HTTP)
    participant Log as Logger

    Test->>V3: init()
    
    loop NEW: Local Init Retry Loop (up to 2 attempts in CI)
        V3->>Log: "Launching local browser"
        V3->>LLC: launchLocalChrome(opts)
        
        LLC->>Chrome: Spawn process
        LLC->>LLC: NEW: attachChromeProcessDiagnostics()
        
        rect rgb(240, 240, 240)
            Note over LLC,Dev: Connection Phase
            LLC->>Dev: Poll /json/version
            alt Success
                Dev-->>LLC: 200 OK (WS URL)
                LLC->>LLC: waitForWebSocketReady()
                LLC-->>V3: { ws, chrome }
            else Connection Refused (ECONNREFUSED)
                LLC->>Dev: NEW: captureDevtoolsSnapshots()
                Dev-->>LLC: Status/Body or Fetch Error
                LLC->>Log: NEW: onDiagnostic (Logs stdout/stderr/exit/CDP state)
                LLC-->>V3: throw Error
            end
        end

        alt NEW: Error is retryable AND attempts remaining
            V3->>V3: isRetryableLocalInitError(error)
            V3->>V3: NEW: cleanupFailedLocalInitAttempt()
            V3->>Chrome: kill()
            V3->>V3: fs.rmSync(tempProfile)
            Note over V3: Continue loop to next attempt
        else Success or Final Failure
            V3-->>Test: Return Context or Final Error
        end
    end

_{Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review, or fix all with cubic.}

monadoid · 2026-04-07T13:24:14Z

@cubic-dev-ai rerun your review

cubic-dev-ai · 2026-04-07T13:24:23Z

@cubic-dev-ai rerun your review

@monadoid I have started the AI code review. It will take a few minutes to complete.

cubic-dev-ai

2 issues found across 2 files

Confidence score: 3/5

There is some merge risk due to a concrete information-exposure concern: in packages/core/lib/v3/launch/local.ts, raw chromeStderr/chromeStdout can include DevTools listening on ws://..., which may leak a CDP WebSocket endpoint in diagnostics.
A secondary design/robustness issue exists in packages/core/lib/v3/launch/local.ts: emitLaunchFailureDiagnostics disposes diagnostics even though the caller also disposes it, creating lifecycle ambiguity and potential double-dispose side effects.
The score is 3 because the top issue is medium severity (6/10) with high confidence and could affect users via leaked diagnostic output, while the other issue appears lower severity and likely straightforward to fix.
Pay close attention to packages/core/lib/v3/launch/local.ts - sanitize launch diagnostics output and clarify ownership/disposal of diagnostics.

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/core/lib/v3/launch/local.ts">

<violation number="1" location="packages/core/lib/v3/launch/local.ts:282">
P2: Custom agent: **Exception and error message sanitization**

The `chromeStderr` and `chromeStdout` diagnostic fields emit raw Chrome process output without redacting the CDP WebSocket URL (`DevTools listening on ws://…`) that Chrome writes to stderr on startup. The codebase already redacts `webSocketDebuggerUrl` from JSON endpoint responses via `redactWebSocketDebuggerUrl` — apply equivalent sanitization (e.g. a regex replacing `ws://…/devtools/…` patterns) to the stdout/stderr buffers before including them in diagnostics.</violation>

<violation number="2" location="packages/core/lib/v3/launch/local.ts:285">
P2: `emitLaunchFailureDiagnostics` calls `diagnostics?.dispose()` internally, but the caller in the `catch` block also disposes it. The "emit" function should not own the lifecycle of a resource it receives — leave disposal to the caller, which already handles it.

(Based on your team's feedback about splitting functions that mix capture and emit logic.) [FEEDBACK_USED]</violation>
</file>

Architecture diagram

sequenceDiagram
    participant App as V3 Init
    participant Launch as launchLocalChrome
    participant Proc as Chromium Process
    participant DevTools as DevTools API (:port)
    participant Log as Logger

    Note over App,Log: Local Browser Startup Flow (with Retry Logic)

    loop Attempt 1 to N (NEW: N=2 in CI)
        App->>Launch: launchLocalChrome(options)
        
        Launch->>Proc: Spawn Chromium
        Launch->>Launch: NEW: attachChromeProcessDiagnostics()
        Proc-->>Launch: Listeners for stdout/stderr/exit

        rect rgb(200, 220, 240)
            Note right of Launch: Connection Phase
            Launch->>DevTools: Poll /json/version
            
            alt Success (Happy Path)
                DevTools-->>Launch: WebSocket URL
                Launch->>Launch: waitForWebSocketReady()
                Launch-->>App: { ws, chrome }
            else Failure (NEW: Diagnostic Path)
                Note right of Launch: e.g., ECONNREFUSED
                Launch->>DevTools: NEW: captureDevtoolsSnapshots()
                DevTools-->>Launch: HTTP status / Payload
                Launch->>Launch: NEW: Snapshot stdout/stderr buffers
                Launch->>Log: NEW: emitLaunchFailureDiagnostics()
                Launch->>Proc: kill()
                Launch-->>App: Throw Error
            end
        end

        alt NEW: Retryable Error (ECONNREFUSED/RESET)
            App->>App: NEW: cleanupFailedLocalInitAttempt()
            Note over App: Wait for next attempt
        else Terminal Error or Max Retries Reached
            App-->>App: Throw final error
        end
    end

    App->>App: create V3Context(ws)
    App->>App: startShutdownSupervisor()

_{Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review, or fix all with cubic.}

pirate · 2026-04-07T18:38:18Z

were the checks that were failing preceeded by actions/verify-chromium-launch?

https://github.com/browserbase/stagehand/blob/main/.github/actions/verify-chromium-launch/action.yml

This action usually solves the problem in my experience, better to check the CI is stable before we ever reach our code.

cubic-dev-ai

2 issues found across 3 files (changes from recent commits).

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/core/lib/v3/launch/local.ts">

<violation number="1">
P1: Chrome process leak: if `waitForWebSocketDebuggerUrl` or `waitForWebSocketReady` throws after `launch()` succeeds, the spawned Chrome process is orphaned. The old code cleaned this up with `killLaunchedChrome(chrome)` in a catch block. Without it, each failed connection attempt leaves a zombie Chrome process, which compounds in CI retry loops.</violation>
</file>

<file name="packages/core/lib/v3/v3.ts">

<violation number="1">
P1: Chrome process and temp profile leak if `V3Context.create()` fails after `launchLocalChrome()` succeeds. The old code's `cleanupFailedLocalInitAttempt` handled this, but it was removed with the retry loop. If the CDP context creation throws, the launched Chrome process becomes an orphan and the temp `userDataDir` is left on disk, because `this.state` is never set to `LOCAL` and the outer catch only cleans up the instance logger.</violation>
</file>

_{Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review, or fix all with cubic.}

monadoid · 2026-04-07T19:48:31Z

@pirate just switched over to to using verify-chromium-launch but still seeing ECONNREFUSED

## Why This PR is meant to add structured-output support across the provider paths we use, while preserving the existing model-facing act/observe identifier shape. ## What Changed - Kept the existing model-facing `elementId` / encoded ID format for act and observe, including the existing `^\\d+-\\d+$` validation. - Kept the structured-output provider configuration changes across the supported clients and AI SDK paths. - Continued resolving encoded IDs against the existing xpath map at the handler boundary. - Updated the unit regression coverage to assert the encoded-ID structured-output path. ## Test Plan Added passing tests

This reverts commit cc018d6.

cubic-dev-ai

3 issues found across 3 files (changes from recent commits).

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/core/lib/v3/launch/local.ts">

<violation number="1" location="packages/core/lib/v3/launch/local.ts:78">
P1: Custom agent: **Exception and error message sanitization**

The `error` and `processError` diagnostic fields are emitted without sanitization, while `chromeStdout`/`chromeStderr` in the same payload are run through `sanitizeChromeProcessOutput()`. Connection-failure errors (ECONNREFUSED, ECONNRESET) typically embed the full CDP websocket URL in their message. Apply `sanitizeChromeProcessOutput()` (or equivalent) to `error` and `processError` before emitting them.</violation>

<violation number="2" location="packages/core/lib/v3/launch/local.ts:84">
P2: `chromeDiagnostics.dispose()` is never called on the error path. After `emitLaunchFailureDiagnostics` takes its snapshot, the stdout/stderr/exit/error listeners remain attached to the Chrome child process. Add a `finally`-style cleanup or call `dispose()` after the snapshot is taken.</violation>

<violation number="3" location="packages/core/lib/v3/launch/local.ts:292">
P2: `captureDevtoolsSnapshots` uses `fetch` without a timeout. Since this runs on the error path (Chrome likely failed to start or is unresponsive), these requests can hang indefinitely, delaying the re-thrown startup error. Add an `AbortSignal.timeout()` to bound the diagnostic capture.</violation>
</file>

_{Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review, or fix all with cubic.}

This reverts commit bb12c89.

monadoid added 5 commits April 7, 2026 12:32

chore: add temporary ci diagnostics

e0e5d72

test: harden local startup and self-heal coverage

8ea4e73

fix: retry local browser startup in ci

043d0e8

chore: narrow ci flake fix to local init retry

e211147

debug: log local chromium startup failures

f4b949b

cubic-dev-ai bot reviewed Apr 7, 2026

View reviewed changes

Comment thread packages/core/lib/v3/launch/local.ts Outdated

Comment thread packages/core/lib/v3/launch/local.ts Outdated

fix: clean up chrome startup diagnostics

8252a0b

cubic-dev-ai bot reviewed Apr 7, 2026

View reviewed changes

Comment thread packages/core/lib/v3/launch/local.ts Outdated

Comment thread packages/core/lib/v3/launch/local.ts Outdated

fix: sanitize chrome startup diagnostics

c9465d9

fix: verify chromium before server v3 integration tests

b04a5d9

cubic-dev-ai bot reviewed Apr 7, 2026

View reviewed changes

monadoid and others added 4 commits April 8, 2026 17:32

ci: pin chrome 146 for server integration tests

cc018d6

Revert "ci: pin chrome 146 for server integration tests"

f682514

This reverts commit cc018d6.

debug: log local chromium startup failures

711af35

cubic-dev-ai bot reviewed Apr 8, 2026

View reviewed changes

Comment thread packages/core/lib/v3/launch/local.ts Outdated

Comment thread packages/core/lib/v3/launch/local.ts Outdated

Comment thread packages/core/lib/v3/launch/local.ts Outdated

monadoid added 3 commits April 8, 2026 18:30

chore: remove local chromium startup diagnostics

bb12c89

Revert "chore: remove local chromium startup diagnostics"

ec86abc

This reverts commit bb12c89.

ci: use setup-chrome for local browser jobs

061923c

pirate reviewed Apr 8, 2026

View reviewed changes

Comment thread packages/core/lib/v3/launch/local.ts Outdated

pirate reviewed Apr 8, 2026

View reviewed changes

Comment thread packages/core/lib/v3/launch/local.ts Outdated

chore: drop local chromium startup diagnostics

134e98d

pirate approved these changes Apr 8, 2026

View reviewed changes

monadoid merged commit 14b64ec into main Apr 8, 2026
188 checks passed

github-actions bot mentioned this pull request Apr 8, 2026

Version Packages #1979

Open

This was referenced Apr 7, 2026

Version Packages nxtreaming/stagehand#2

Closed

Version Packages nxtreaming/stagehand#3

Open

Version Packages edisplay/stagehand#5

Open

github-actions bot mentioned this pull request Apr 16, 2026

Version Packages Difocd/stagehand#1

Open

Conversation

monadoid commented Apr 7, 2026 • edited by cubic-dev-ai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Why

What changed

Summary by cubic

Uh oh!

changeset-bot bot commented Apr 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🦋 Changeset detected

Uh oh!

cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

monadoid commented Apr 7, 2026

Uh oh!

cubic-dev-ai bot commented Apr 7, 2026

Uh oh!

cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

pirate commented Apr 7, 2026

Uh oh!

cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

monadoid commented Apr 7, 2026

Uh oh!

cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

monadoid commented Apr 7, 2026 •

edited by cubic-dev-ai bot

Loading

changeset-bot bot commented Apr 7, 2026 •

edited

Loading