[Claimed #1892] Support gpt 5.4 cua upd #2022

Merged
miguelg719 merged 7 commits into main from external-contributor-pr-1892
Apr 22, 2026

Conversation

Contributor

@github-actions github-actions Bot commented Apr 21, 2026

Mirrored from external contributor PR #1892 after approval by @miguelg719.

Original author: @alexcarv318
Original PR: #1892
Approved source head SHA: 0c6031748e17eebfd6ef570e15c13fd1f4d25022

@alexcarv318, please continue any follow-up discussion on this mirrored PR. When the external PR gets new commits, this same internal PR will be marked stale until the latest external commit is approved and refreshed here.

Original description

why

The original GPT-5.4 native CUA support work from #1792 could not be merged cleanly due to conflicts with current main.
This PR carries that work forward on top of the latest code so the feature can be reviewed and merged.

All implementation credit goes to @Kylejeong2. I only resolved merge conflicts and migrated the changes onto the current branch.

what changed

  • Cherry-picked the GPT-5.4 native CUA support commits from #1792 (feat: add support for gpt 5.4 native computer use) onto current main.
  • Resolved merge conflicts in OpenAICUAClient while preserving both:
    • existing current-main behavior, and
    • new GPT-5.4 batched computer_call.actions handling.
  • Kept model/provider mappings and public CUA model exports aligned with GPT-5.4 support.
  • Included the related example and test updates from the original PR.

test plan

  • corepack pnpm install
  • corepack pnpm run test:core -- packages/core/dist/esm/tests/unit/public-api/llm-and-agents.test.js
  • Manual smoke run of example command:
    • corepack pnpm --filter @browserbasehq/stagehand run example -- gpt54-cua-example
    • (execution reached model call; requires valid OPENAI_API_KEY for full runtime success)

Summary by cubic

Adds native Computer Use for OpenAI gpt-5.4 using the new computer tool with batched actions, while keeping the legacy computer_use_preview flow for compatibility.

  • New Features

    • Map gpt-5.4 to the OpenAI provider and add openai/gpt-5.4 to AVAILABLE_CUA_MODELS.
    • For gpt-5.x, accept action or actions[], execute all in a batch, and reply with computer_screenshot (with detail).
    • Preserve preview flow: single action, input_image outputs, and current_url on outputs.
    • Add gpt5-4-cua-example.ts and update types/tests for batched actions and computer_screenshot outputs.
  • Bug Fixes

    • Remove per-action screenshots in batched flows; the client now takes one screenshot after each computer_call/batch.
    • Add a unit test to ensure no per-action screenshots occur during batched actions.
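
A minimal sketch of the batching behavior summarized above, for readers who want to see the shape of the change without opening the diff. It is not the code from this PR: `Action`, `ComputerCall`, `normalizeActions`, and `runBatch` are hypothetical names, and the two callbacks stand in for whatever the client actually uses to drive the page. The idea is to fold `action` and `actions[]` into one list, execute every entry, and capture a single screenshot once the batch is done.

```typescript
// Hypothetical shapes for illustration only; the real types are in the PR diff.
type Action = { type: string; [key: string]: unknown };

interface ComputerCall {
  call_id: string;
  action?: Action; // legacy single-action form
  actions?: Action[]; // batched form described for gpt-5.4
}

// Fold both forms into one list so the caller has a single code path.
function normalizeActions(call: ComputerCall): Action[] {
  if (Array.isArray(call.actions) && call.actions.length > 0) {
    return call.actions;
  }
  return call.action ? [call.action] : [];
}

// Execute every action in order, then take exactly one screenshot for the batch.
async function runBatch(
  call: ComputerCall,
  executeAction: (action: Action) => Promise<void>,
  captureScreenshot: () => Promise<string>,
): Promise<{ executed: number; screenshot: string }> {
  const actions = normalizeActions(call);
  for (const action of actions) {
    await executeAction(action); // deliberately no screenshot inside the loop
  }
  const screenshot = await captureScreenshot();
  return { executed: actions.length, screenshot };
}
```

Keeping the screenshot outside the loop is what the "no per-action screenshots" unit test in the summary would be guarding, assuming the test counts capture calls per `computer_call`.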

Written for commit 610cbc7. Summary will update on new commits.

@github-actions github-actions Bot added the external-contributor (Tracks PRs mirrored from external contributor forks.) and external-contributor:mirrored (An internal mirrored PR currently exists for this external contributor PR.) labels Apr 21, 2026
Contributor Author

github-actions Bot commented Apr 21, 2026

This mirrored PR has been merged into main. The original external PR #1892 is now completed.


changeset-bot Bot commented Apr 21, 2026

⚠️ No Changeset found

Latest commit: 610cbc7

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


@github-actions github-actions Bot mentioned this pull request Apr 21, 2026
3 tasks
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

2 issues found across 5 files

Confidence score: 3/5

  • There is moderate merge risk because both findings are medium severity (5–6/10) with high confidence (8–9/10), indicating a likely policy and maintainability regression rather than a cosmetic issue.
  • In packages/core/lib/v3/agent/OpenAICUAClient.ts, adding a hardcoded gpt-5 model-name gate can block valid models and create user-facing failures as model catalogs evolve.
  • In packages/core/lib/v3/types/public/agent.ts, extending a hardcoded CUA allowlist repeats the same anti-pattern, increasing the chance of drift and inconsistent model acceptance behavior.
  • Pay close attention to packages/core/lib/v3/agent/OpenAICUAClient.ts and packages/core/lib/v3/types/public/agent.ts: hardcoded allowed-model checks may cause avoidable model rejection and future regressions.
Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/core/lib/v3/types/public/agent.ts">

<violation number="1" location="packages/core/lib/v3/types/public/agent.ts:452">
P2: Custom agent: **Ensure we never check against hardcoded lists of allowed LLM model names**

New code adds another hardcoded allowed model name to CUA allowlist, violating the rule to avoid hardcoded allowed-model checks.</violation>
</file>

<file name="packages/core/lib/v3/agent/OpenAICUAClient.ts">

<violation number="1" location="packages/core/lib/v3/agent/OpenAICUAClient.ts:60">
P2: Custom agent: **Ensure we never check against hardcoded lists of allowed LLM model names**

New code hardcodes a `gpt-5` model-name gate, violating the rule against hardcoded allowed-model checks.</violation>
</file>
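
One possible way to soften the model-name gate flagged above, shown only as a hedged sketch and not as the fix adopted in this PR. `CuaToolOptions`, `useNewComputerTool`, and `resolveUsesNewComputerTool` are hypothetical names that do not exist in the codebase. The assumption is that an explicit caller-supplied flag takes precedence, with name-based inference kept only as a default, so a model the prefix check does not recognize can still opt in.

```typescript
// Hypothetical options object; `useNewComputerTool` is an illustrative flag,
// not an existing Stagehand option.
interface CuaToolOptions {
  modelName: string;
  useNewComputerTool?: boolean;
}

// Prefer the explicit flag; fall back to inferring from the model name.
function resolveUsesNewComputerTool(options: CuaToolOptions): boolean {
  if (options.useNewComputerTool !== undefined) {
    return options.useNewComputerTool;
  }
  // Strip an optional provider prefix such as "openai/" before inspecting the name.
  const bareName = options.modelName.split("/").pop() ?? "";
  return bareName.startsWith("gpt-5");
}
```

The fallback is still a name check, so this narrows the concern and does not remove it; the explicit flag only gives callers a way around a stale default.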
Architecture diagram
sequenceDiagram
    participant App as User Application
    participant Prov as AgentProvider
    participant Client as OpenAICUAClient
    participant API as OpenAI API
    participant Page as Browser Page

    Note over App,Prov: Initialization
    App->>Prov: Request agent (model: "gpt-5.4")
    Prov->>Prov: NEW: Map gpt-5.4 to "openai" provider
    Prov-->>Client: Create OpenAICUAClient

    Note over Client,API: Execution Loop
    App->>Client: execute(instruction)
    
    loop Agent Steps
        Client->>Client: NEW: check usesNewComputerTool (model ~ gpt-5)
        
        Client->>API: Send history + tools
        Note right of Client: NEW: uses "computer" tool for gpt-5<br/>Legacy: uses "computer_use_preview"

        API-->>Client: Return tool call (ComputerCallItem)

        alt NEW: Batched Actions (GPT-5.4)
            loop for action in actions[]
                Client->>Page: executeAction(action)
                Page-->>Client: success/fail
            end
        else Legacy Single Action
            Client->>Page: executeAction(action)
            Page-->>Client: success/fail
        end

        Client->>Page: captureScreenshot()
        Page-->>Client: base64 string

        alt NEW: GPT-5.4 Response Format
            Client->>Client: Build "computer_screenshot" output
            Note over Client: Sets detail: "original"
        else Legacy Response Format
            Client->>Client: Build "input_image" output
            Note over Client: CHANGED: current_url excluded from new tool format
        end

        Client->>API: POST tool_outputs (screenshot + call_id)
    end

    Client-->>App: Return final result message
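
The two response formats in the diagram can be sketched as follows. This is an illustration under stated assumptions, not the client code from this PR: `buildToolOutput` is a hypothetical helper, and the payload field layout (including where `current_url` sits) is inferred from the names in the summary and the diagram, so it should be checked against the actual OpenAI request types before reuse.

```typescript
// Hypothetical payload shapes, modeled on the names used in this PR's summary.
type ScreenshotOutput =
  | { type: "computer_screenshot"; image_url: string; detail: "original" }
  | { type: "input_image"; image_url: string; current_url?: string };

interface ToolOutput {
  type: "computer_call_output";
  call_id: string;
  output: ScreenshotOutput;
}

// Build the reply for one computer_call from a base64-encoded PNG screenshot.
function buildToolOutput(
  callId: string,
  base64Png: string,
  usesNewComputerTool: boolean,
  currentUrl?: string,
): ToolOutput {
  const imageUrl = `data:image/png;base64,${base64Png}`;
  if (usesNewComputerTool) {
    // New format: computer_screenshot with detail, and no current_url.
    return {
      type: "computer_call_output",
      call_id: callId,
      output: { type: "computer_screenshot", image_url: imageUrl, detail: "original" },
    };
  }
  // Legacy preview format: input_image, with current_url kept when known.
  const legacy: ScreenshotOutput = { type: "input_image", image_url: imageUrl };
  if (currentUrl !== undefined) {
    legacy.current_url = currentUrl;
  }
  return { type: "computer_call_output", call_id: callId, output: legacy };
}
```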

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review, or fix all with cubic.

Comment thread packages/core/lib/v3/types/public/agent.ts
Comment thread packages/core/lib/v3/agent/OpenAICUAClient.ts
Member

@pirate pirate left a comment

this is kinda gnarly, I wish aisdk would handle differences like this for us but 🤷

@miguelg719 miguelg719 merged commit 0ce782b into main Apr 22, 2026
204 checks passed
@github-actions github-actions Bot added the external-contributor:completed (The mirrored PR has been merged and the external contributor flow is complete.) label and removed the external-contributor:mirrored (An internal mirrored PR currently exists for this external contributor PR.) label Apr 22, 2026
Labels

external-contributor:completed: The mirrored PR has been merged and the external contributor flow is complete.
external-contributor: Tracks PRs mirrored from external contributor forks.

4 participants