Skip to content

Fix local browser startup ECONNREFUSED in CI#1973

Merged
monadoid merged 16 commits intomainfrom
econnrefusedfix
Apr 8, 2026
Merged

Fix local browser startup ECONNREFUSED in CI#1973
monadoid merged 16 commits intomainfrom
econnrefusedfix

Conversation

@monadoid
Copy link
Copy Markdown
Contributor

@monadoid monadoid commented Apr 7, 2026

Why

Local Chromium startup in CI intermittently fails with ECONNREFUSED before tests ever reach their real assertions. The failures reproduce on a clean branch off main, so this PR is focused only on hardening local browser/session startup.

What changed

  • retries the full local V3.init() launch-and-connect path for transient startup failures instead of failing immediately on the first refused connection
  • adds failure-only launch diagnostics so if the flake returns we capture Chrome stdout/stderr plus devtools endpoint state
  • limits the retry logic to local browser startup so Browserbase-backed sessions are unaffected

Summary by cubic

Hardens CI by retrying and verifying local Chromium startup to reduce ECONNREFUSED flakes. Enables strict structured outputs across supported AI SDK providers.

  • Bug Fixes

    • Retries local browser init once in CI on ECONNREFUSED/ECONNRESET and CDP-connect errors; waits and cleans up between attempts.
    • Installs stable Chrome via browser-actions/setup-chrome and runs a preflight launch check before server-v3 integration and local E2E tests, passing CHROME_PATH to the verifier.
  • New Features

    • Enables strict structured outputs: sets strict: true in OpenAIClient json_schema and adds provider options for openai, azure, google, vertex, anthropic, groq, cerebras, and mistral.
    • Adds unit tests for AI SDK provider options; publishes a patch to @browserbasehq/stagehand.

Written for commit 134e98d. Summary will update on new commits. Review in cubic

@changeset-bot
Copy link
Copy Markdown

changeset-bot bot commented Apr 7, 2026

🦋 Changeset detected

Latest commit: 134e98d

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 4 packages
Name Type
@browserbasehq/stagehand Patch
@browserbasehq/stagehand-evals Patch
@browserbasehq/stagehand-server-v3 Patch
@browserbasehq/stagehand-server-v4 Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 issues found across 2 files

Confidence score: 2/5

  • There is a high-confidence startup failure path in packages/core/lib/v3/launch/local.ts where Chrome is not killed in the catch block, which can leak orphaned browser processes across retries and degrade host stability.
  • packages/core/lib/v3/launch/local.ts also reports raw /json/version and /json/list payloads in diagnostics, which may expose webSocketDebuggerUrl values and increase sensitive endpoint leakage risk.
  • Because both findings are concrete and user-impacting (resource leakage and unsanitized diagnostic output), this looks risky to merge without a targeted fix.
  • Pay close attention to packages/core/lib/v3/launch/local.ts - ensure failed launches always clean up Chrome and sanitize diagnostic data before logging.
Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/core/lib/v3/launch/local.ts">

<violation number="1" location="packages/core/lib/v3/launch/local.ts:67">
P1: Chrome process is leaked on startup failure. The catch block emits diagnostics but never calls `chrome.kill()`, and the caller cannot clean it up because the `LaunchedChrome` object is never returned. On each retry this orphans a Chrome process, worsening the CI environment the PR is trying to fix.</violation>

<violation number="2" location="packages/core/lib/v3/launch/local.ts:277">
P1: Custom agent: **Exception and error message sanitization**

Raw `/json/version` and `/json/list` responses contain `webSocketDebuggerUrl` (a CDP connect URL) and are passed unsanitized into diagnostic output. Strip or redact the `webSocketDebuggerUrl` field from the captured devtools snapshots before emitting them through `onDiagnostic`.</violation>
</file>
Architecture diagram
sequenceDiagram
    participant Test as Test/Caller
    participant V3 as V3.init()
    participant LLC as launchLocalChrome
    participant Chrome as Chromium Process
    participant Dev as DevTools (HTTP)
    participant Log as Logger

    Test->>V3: init()
    
    loop NEW: Local Init Retry Loop (up to 2 attempts in CI)
        V3->>Log: "Launching local browser"
        V3->>LLC: launchLocalChrome(opts)
        
        LLC->>Chrome: Spawn process
        LLC->>LLC: NEW: attachChromeProcessDiagnostics()
        
        rect rgb(240, 240, 240)
            Note over LLC,Dev: Connection Phase
            LLC->>Dev: Poll /json/version
            alt Success
                Dev-->>LLC: 200 OK (WS URL)
                LLC->>LLC: waitForWebSocketReady()
                LLC-->>V3: { ws, chrome }
            else Connection Refused (ECONNREFUSED)
                LLC->>Dev: NEW: captureDevtoolsSnapshots()
                Dev-->>LLC: Status/Body or Fetch Error
                LLC->>Log: NEW: onDiagnostic (Logs stdout/stderr/exit/CDP state)
                LLC-->>V3: throw Error
            end
        end

        alt NEW: Error is retryable AND attempts remaining
            V3->>V3: isRetryableLocalInitError(error)
            V3->>V3: NEW: cleanupFailedLocalInitAttempt()
            V3->>Chrome: kill()
            V3->>V3: fs.rmSync(tempProfile)
            Note over V3: Continue loop to next attempt
        else Success or Final Failure
            V3-->>Test: Return Context or Final Error
        end
    end
Loading

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review, or fix all with cubic.

Comment thread packages/core/lib/v3/launch/local.ts Outdated
Comment thread packages/core/lib/v3/launch/local.ts Outdated
@monadoid
Copy link
Copy Markdown
Contributor Author

monadoid commented Apr 7, 2026

@cubic-dev-ai rerun your review

@cubic-dev-ai
Copy link
Copy Markdown
Contributor

cubic-dev-ai bot commented Apr 7, 2026

@cubic-dev-ai rerun your review

@monadoid I have started the AI code review. It will take a few minutes to complete.

Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 issues found across 2 files

Confidence score: 3/5

  • There is some merge risk due to a concrete information-exposure concern: in packages/core/lib/v3/launch/local.ts, raw chromeStderr/chromeStdout can include DevTools listening on ws://..., which may leak a CDP WebSocket endpoint in diagnostics.
  • A secondary design/robustness issue exists in packages/core/lib/v3/launch/local.ts: emitLaunchFailureDiagnostics disposes diagnostics even though the caller also disposes it, creating lifecycle ambiguity and potential double-dispose side effects.
  • The score is 3 because the top issue is medium severity (6/10) with high confidence and could affect users via leaked diagnostic output, while the other issue appears lower severity and likely straightforward to fix.
  • Pay close attention to packages/core/lib/v3/launch/local.ts - sanitize launch diagnostics output and clarify ownership/disposal of diagnostics.
Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/core/lib/v3/launch/local.ts">

<violation number="1" location="packages/core/lib/v3/launch/local.ts:282">
P2: Custom agent: **Exception and error message sanitization**

The `chromeStderr` and `chromeStdout` diagnostic fields emit raw Chrome process output without redacting the CDP WebSocket URL (`DevTools listening on ws://…`) that Chrome writes to stderr on startup. The codebase already redacts `webSocketDebuggerUrl` from JSON endpoint responses via `redactWebSocketDebuggerUrl` — apply equivalent sanitization (e.g. a regex replacing `ws://…/devtools/…` patterns) to the stdout/stderr buffers before including them in diagnostics.</violation>

<violation number="2" location="packages/core/lib/v3/launch/local.ts:285">
P2: `emitLaunchFailureDiagnostics` calls `diagnostics?.dispose()` internally, but the caller in the `catch` block also disposes it. The "emit" function should not own the lifecycle of a resource it receives — leave disposal to the caller, which already handles it.

(Based on your team's feedback about splitting functions that mix capture and emit logic.) [FEEDBACK_USED]</violation>
</file>
Architecture diagram
sequenceDiagram
    participant App as V3 Init
    participant Launch as launchLocalChrome
    participant Proc as Chromium Process
    participant DevTools as DevTools API (:port)
    participant Log as Logger

    Note over App,Log: Local Browser Startup Flow (with Retry Logic)

    loop Attempt 1 to N (NEW: N=2 in CI)
        App->>Launch: launchLocalChrome(options)
        
        Launch->>Proc: Spawn Chromium
        Launch->>Launch: NEW: attachChromeProcessDiagnostics()
        Proc-->>Launch: Listeners for stdout/stderr/exit

        rect rgb(200, 220, 240)
            Note right of Launch: Connection Phase
            Launch->>DevTools: Poll /json/version
            
            alt Success (Happy Path)
                DevTools-->>Launch: WebSocket URL
                Launch->>Launch: waitForWebSocketReady()
                Launch-->>App: { ws, chrome }
            else Failure (NEW: Diagnostic Path)
                Note right of Launch: e.g., ECONNREFUSED
                Launch->>DevTools: NEW: captureDevtoolsSnapshots()
                DevTools-->>Launch: HTTP status / Payload
                Launch->>Launch: NEW: Snapshot stdout/stderr buffers
                Launch->>Log: NEW: emitLaunchFailureDiagnostics()
                Launch->>Proc: kill()
                Launch-->>App: Throw Error
            end
        end

        alt NEW: Retryable Error (ECONNREFUSED/RESET)
            App->>App: NEW: cleanupFailedLocalInitAttempt()
            Note over App: Wait for next attempt
        else Terminal Error or Max Retries Reached
            App-->>App: Throw final error
        end
    end

    App->>App: create V3Context(ws)
    App->>App: startShutdownSupervisor()
Loading

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review, or fix all with cubic.

Comment thread packages/core/lib/v3/launch/local.ts Outdated
Comment thread packages/core/lib/v3/launch/local.ts Outdated
@pirate
Copy link
Copy Markdown
Member

pirate commented Apr 7, 2026

were the checks that were failing preceeded by actions/verify-chromium-launch?

https://github.com/browserbase/stagehand/blob/main/.github/actions/verify-chromium-launch/action.yml

This action usually solves the problem in my experience, better to check the CI is stable before we ever reach our code.

Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 issues found across 3 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/core/lib/v3/launch/local.ts">

<violation number="1">
P1: Chrome process leak: if `waitForWebSocketDebuggerUrl` or `waitForWebSocketReady` throws after `launch()` succeeds, the spawned Chrome process is orphaned. The old code cleaned this up with `killLaunchedChrome(chrome)` in a catch block. Without it, each failed connection attempt leaves a zombie Chrome process, which compounds in CI retry loops.</violation>
</file>

<file name="packages/core/lib/v3/v3.ts">

<violation number="1">
P1: Chrome process and temp profile leak if `V3Context.create()` fails after `launchLocalChrome()` succeeds. The old code's `cleanupFailedLocalInitAttempt` handled this, but it was removed with the retry loop. If the CDP context creation throws, the launched Chrome process becomes an orphan and the temp `userDataDir` is left on disk, because `this.state` is never set to `LOCAL` and the outer catch only cleans up the instance logger.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review, or fix all with cubic.

@monadoid
Copy link
Copy Markdown
Contributor Author

monadoid commented Apr 7, 2026

@pirate just switched over to to using verify-chromium-launch but still seeing ECONNREFUSED

monadoid and others added 4 commits April 8, 2026 17:32
## Why
This PR is meant to add structured-output support across the provider
paths we use, while preserving the existing model-facing act/observe
identifier shape.

## What Changed
- Kept the existing model-facing `elementId` / encoded ID format for act
and observe, including the existing `^\\d+-\\d+$` validation.
- Kept the structured-output provider configuration changes across the
supported clients and AI SDK paths.
- Continued resolving encoded IDs against the existing xpath map at the
handler boundary.
- Updated the unit regression coverage to assert the encoded-ID
structured-output path.

## Test Plan
Added passing tests
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 issues found across 3 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/core/lib/v3/launch/local.ts">

<violation number="1" location="packages/core/lib/v3/launch/local.ts:78">
P1: Custom agent: **Exception and error message sanitization**

The `error` and `processError` diagnostic fields are emitted without sanitization, while `chromeStdout`/`chromeStderr` in the same payload are run through `sanitizeChromeProcessOutput()`. Connection-failure errors (ECONNREFUSED, ECONNRESET) typically embed the full CDP websocket URL in their message. Apply `sanitizeChromeProcessOutput()` (or equivalent) to `error` and `processError` before emitting them.</violation>

<violation number="2" location="packages/core/lib/v3/launch/local.ts:84">
P2: `chromeDiagnostics.dispose()` is never called on the error path. After `emitLaunchFailureDiagnostics` takes its snapshot, the stdout/stderr/exit/error listeners remain attached to the Chrome child process. Add a `finally`-style cleanup or call `dispose()` after the snapshot is taken.</violation>

<violation number="3" location="packages/core/lib/v3/launch/local.ts:292">
P2: `captureDevtoolsSnapshots` uses `fetch` without a timeout. Since this runs on the error path (Chrome likely failed to start or is unresponsive), these requests can hang indefinitely, delaying the re-thrown startup error. Add an `AbortSignal.timeout()` to bound the diagnostic capture.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review, or fix all with cubic.

Comment thread packages/core/lib/v3/launch/local.ts Outdated
Comment thread packages/core/lib/v3/launch/local.ts Outdated
Comment thread packages/core/lib/v3/launch/local.ts Outdated
Comment thread packages/core/lib/v3/launch/local.ts Outdated
Comment thread packages/core/lib/v3/launch/local.ts Outdated
@monadoid monadoid merged commit 14b64ec into main Apr 8, 2026
188 checks passed
@github-actions github-actions bot mentioned this pull request Apr 8, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants