[Claimed #1892] Support gpt 5.4 cua upd by github-actions[bot] · Pull Request #2022 · browserbase/stagehand

github-actions · 2026-04-21T22:50:40Z

Mirrored from external contributor PR #1892 after approval by @miguelg719.

Original author: @alexcarv318
Original PR: #1892
Approved source head SHA: 0c6031748e17eebfd6ef570e15c13fd1f4d25022

@alexcarv318, please continue any follow-up discussion on this mirrored PR. When the external PR gets new commits, this same internal PR will be marked stale until the latest external commit is approved and refreshed here.

Original description

why

The original GPT-5.4 native CUA support work from #1792 could not be merged cleanly due to conflicts with current main.
This PR carries that work forward on top of the latest code so the feature can be reviewed and merged.

All implementation credit goes to @Kylejeong2. I only resolved merge conflicts and migrated the changes onto the current branch.

what changed

- Cherry-picked the GPT-5.4 native CUA support commits from #1792 ("feat: add support for gpt 5.4 native computer use") onto current main.
- Resolved merge conflicts in OpenAICUAClient while preserving both:
  - existing current-main behavior, and
  - new GPT-5.4 batched computer_call.actions handling.
- Kept model/provider mappings and public CUA model exports aligned with GPT-5.4 support.
- Included the related example and test updates from the original PR.

test plan

corepack pnpm install
corepack pnpm run test:core -- packages/core/dist/esm/tests/unit/public-api/llm-and-agents.test.js
Manual smoke run of example command:
- corepack pnpm --filter @browserbasehq/stagehand run example -- gpt54-cua-example
- (execution reached model call; requires valid OPENAI_API_KEY for full runtime success)

Summary by cubic

Adds native Computer Use for OpenAI gpt-5.4 using the new computer tool with batched actions, while keeping the legacy computer_use_preview flow for compatibility.

New Features
- Map gpt-5.4 to the OpenAI provider and add openai/gpt-5.4 to AVAILABLE_CUA_MODELS.
- For gpt-5.x, accept action or actions[], execute all in a batch, and reply with computer_screenshot (with detail).
- Preserve preview flow: single action, input_image outputs, and current_url on outputs.
- Add gpt5-4-cua-example.ts and update types/tests for batched actions and computer_screenshot outputs.
Bug Fixes
- Remove per-action screenshots in batched flows; the client now takes one screenshot after each computer_call/batch.
- Add a unit test to ensure no per-action screenshots occur during batched actions.
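
The batched flow summarized above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `ComputerCallItem` shape, `normalizeActions`, `runComputerCall`, and the two callback parameters are hypothetical names chosen for the example. It only shows the idea of accepting either `action` or `actions[]`, running every action in order, and taking one screenshot per call.

```typescript
// Hypothetical shapes for illustration only; not the actual Stagehand or OpenAI types.
type ComputerAction = { type: string; [key: string]: unknown };

interface ComputerCallItem {
  call_id: string;
  action?: ComputerAction; // preview flow: a single action
  actions?: ComputerAction[]; // gpt-5.x flow: a batch of actions
}

// Return the call's actions as a list, whichever field the model populated.
function normalizeActions(item: ComputerCallItem): ComputerAction[] {
  if (Array.isArray(item.actions) && item.actions.length > 0) {
    return item.actions;
  }
  return item.action ? [item.action] : [];
}

// Execute every action in order, then capture exactly one screenshot for the call.
async function runComputerCall(
  item: ComputerCallItem,
  executeAction: (action: ComputerAction) => Promise<void>,
  captureScreenshot: () => Promise<string>,
): Promise<{ callId: string; executed: number; screenshot: string }> {
  const actions = normalizeActions(item);
  for (const action of actions) {
    await executeAction(action); // no screenshot between batched actions
  }
  const screenshot = await captureScreenshot();
  return { callId: item.call_id, executed: actions.length, screenshot };
}
```

Passing the executor and the screenshot function in as callbacks keeps the sketch testable with simple counters, which is the kind of check the "no per-action screenshots" unit test describes.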

Written for commit 610cbc7. Summary will update on new commits.

github-actions · 2026-04-21T22:50:44Z

This mirrored PR has been merged into main. The original external PR #1892 is now completed.

changeset-bot · 2026-04-21T22:50:45Z

⚠️ No Changeset found

Latest commit: 610cbc7

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

cubic-dev-ai

2 issues found across 5 files

Confidence score: 3/5

There is moderate merge risk because both findings are medium severity (5–6/10) with high confidence (8–9/10), indicating a likely policy and maintainability regression rather than a cosmetic issue.
In packages/core/lib/v3/agent/OpenAICUAClient.ts, adding a hardcoded gpt-5 model-name gate can block valid models and create user-facing failures as model catalogs evolve.
In packages/core/lib/v3/types/public/agent.ts, extending a hardcoded CUA allowlist repeats the same anti-pattern, increasing the chance of drift and inconsistent model acceptance behavior.
Pay close attention to packages/core/lib/v3/agent/OpenAICUAClient.ts, packages/core/lib/v3/types/public/agent.ts - hardcoded allowed-model checks may cause avoidable model rejection and future regressions.
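
One possible way to avoid a model-name gate, sketched under assumptions: let the caller state which computer tool a model supports instead of inferring it from the name. The `computerTool` option and `resolveComputerTool` helper below are hypothetical and do not exist in Stagehand; this is not what the PR implements.

```typescript
// Hypothetical option and helper for illustration; not part of Stagehand's API.
type ComputerToolVariant = "computer" | "computer_use_preview";

interface CuaClientOptions {
  modelName: string;
  // Assumption: the caller declares the tool variant explicitly.
  computerTool?: ComputerToolVariant;
}

// Pick the tool variant from configuration rather than from the model name.
function resolveComputerTool(
  opts: CuaClientOptions,
  fallback: ComputerToolVariant = "computer_use_preview",
): ComputerToolVariant {
  return opts.computerTool ?? fallback;
}
```

The trade-off is that a caller who omits the option gets the fallback regardless of model, so a default of this kind would need to be chosen and documented carefully.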

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/core/lib/v3/types/public/agent.ts">

<violation number="1" location="packages/core/lib/v3/types/public/agent.ts:452">
P2: Custom agent: **Ensure we never check against hardcoded lists of allowed LLM model names**

New code adds another hardcoded allowed model name to CUA allowlist, violating the rule to avoid hardcoded allowed-model checks.</violation>
</file>

<file name="packages/core/lib/v3/agent/OpenAICUAClient.ts">

<violation number="1" location="packages/core/lib/v3/agent/OpenAICUAClient.ts:60">
P2: Custom agent: **Ensure we never check against hardcoded lists of allowed LLM model names**

New code hardcodes a `gpt-5` model-name gate, violating the rule against hardcoded allowed-model checks.</violation>
</file>

Architecture diagram

sequenceDiagram
    participant App as User Application
    participant Prov as AgentProvider
    participant Client as OpenAICUAClient
    participant API as OpenAI API
    participant Page as Browser Page

    Note over App,Prov: Initialization
    App->>Prov: Request agent (model: "gpt-5.4")
    Prov->>Prov: NEW: Map gpt-5.4 to "openai" provider
    Prov-->>Client: Create OpenAICUAClient

    Note over Client,API: Execution Loop
    App->>Client: execute(instruction)
    
    loop Agent Steps
        Client->>Client: NEW: check usesNewComputerTool (model ~ gpt-5)
        
        Client->>API: Send history + tools
        Note right of Client: NEW: uses "computer" tool for gpt-5<br/>Legacy: uses "computer_use_preview"

        API-->>Client: Return tool call (ComputerCallItem)

        alt NEW: Batched Actions (GPT-5.4)
            loop for action in actions[]
                Client->>Page: executeAction(action)
                Page-->>Client: success/fail
            end
        else Legacy Single Action
            Client->>Page: executeAction(action)
            Page-->>Client: success/fail
        end

        Client->>Page: captureScreenshot()
        Page-->>Client: base64 string

        alt NEW: GPT-5.4 Response Format
            Client->>Client: Build "computer_screenshot" output
            Note over Client: Sets detail: "original"
        else Legacy Response Format
            Client->>Client: Build "input_image" output
            Note over Client: CHANGED: current_url excluded from new tool format
        end

        Client->>API: POST tool_outputs (screenshot + call_id)
    end

    Client-->>App: Return final result message
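
The two response formats in the diagram could be built along these lines. This is a sketch under assumptions: the `computer_call_output` wrapper, the data-URL prefix, and the placement of `current_url` inside `output` are guesses for illustration, and `buildCallOutput` is a hypothetical helper rather than a function from the PR.

```typescript
// Hypothetical output shapes; field placement is an assumption based on the diagram.
interface ScreenshotOutput {
  type: "computer_screenshot" | "input_image";
  image_url: string;
  detail?: "original";
  current_url?: string;
}

interface ComputerCallOutput {
  type: "computer_call_output";
  call_id: string;
  output: ScreenshotOutput;
}

// Build the screenshot reply for either the new tool format or the legacy one.
function buildCallOutput(
  callId: string,
  base64Png: string,
  usesNewComputerTool: boolean,
  currentUrl?: string,
): ComputerCallOutput {
  const imageUrl = `data:image/png;base64,${base64Png}`;
  const output: ScreenshotOutput = usesNewComputerTool
    ? // New format: computer_screenshot with detail, no current_url.
      { type: "computer_screenshot", image_url: imageUrl, detail: "original" }
    : // Legacy format: input_image, with current_url when one is known.
      {
        type: "input_image",
        image_url: imageUrl,
        ...(currentUrl ? { current_url: currentUrl } : {}),
      };
  return { type: "computer_call_output", call_id: callId, output };
}
```

Keeping the branch in one helper means the rest of the loop does not need to know which tool variant is active.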

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

pirate

this is kinda gnarly, I wish aisdk would handle differences like this for us but 🤷

Kylejeong2 and others added 4 commits April 21, 2026 15:39

feat: add support for gpt 5.4 native computer use

161b76c

update test

3595c37

Just gpt-5.4 instead of gpt-5.4-2026-03-0

e9ca266

formatting and example rename

0c60317

github-actions Bot assigned miguelg719 Apr 21, 2026

github-actions Bot added labels Apr 21, 2026: external-contributor ("Tracks PRs mirrored from external contributor forks.") and external-contributor:mirrored ("An internal mirrored PR currently exists for this external contributor PR.")

github-actions Bot mentioned this pull request Apr 21, 2026

Support gpt 5.4 cua upd #1892

Closed

3 tasks

bump

a1bb83a

cubic-dev-ai Bot reviewed Apr 21, 2026

Comment thread packages/core/lib/v3/types/public/agent.ts

Comment thread packages/core/lib/v3/agent/OpenAICUAClient.ts

miguelg719 added 2 commits April 21, 2026 16:04

remove unnecessary screenshot in between batched actions

b94eb2b

bump for re-running CI

610cbc7

pirate approved these changes Apr 22, 2026

miguelg719 merged commit 0ce782b into main Apr 22, 2026
204 checks passed

github-actions Bot added the external-contributor:completed label ("The mirrored PR has been merged and the external contributor flow is complete.") and removed the external-contributor:mirrored label ("An internal mirrored PR currently exists for this external contributor PR.") Apr 22, 2026

miguelg719 merged 7 commits into main from external-contributor-pr-1892

4 participants
