
core(cua): Anthropic tool_result image handling hardcodes PNG media type #2035

@BABTUNA

Why

AnthropicCUAClient currently hardcodes media_type: "image/png" and strips only a PNG data URL prefix when building tool_result image blocks.

If the screenshot input is ever JPEG (or any other non-PNG image data URL), we send an incorrect media_type and a malformed base64 payload, because the non-PNG prefix is never removed and stays attached to the data.

Current behavior

In packages/core/lib/v3/agent/AnthropicCUAClient.ts:

  • media_type is always "image/png" (around lines 606 and 717)
  • base64 extraction uses screenshot.replace(/^data:image\/png;base64,/, "")
  • captureScreenshot() returns PNG data URLs today, but this path is brittle to upstream format changes or alternate providers.
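The failure mode can be reproduced in isolation. The sketch below applies the same replace expression quoted above to a PNG and a JPEG data URL. The function name stripPngPrefix and the sample payloads are made up for illustration and are not taken from AnthropicCUAClient.ts.

```typescript
// Hypothetical wrapper around the prefix-stripping expression quoted in this issue.
function stripPngPrefix(screenshot: string): string {
  return screenshot.replace(/^data:image\/png;base64,/, "");
}

// Sample data URLs (truncated payloads, for illustration only).
const pngUrl = "data:image/png;base64,iVBORw0KGgo=";
const jpegUrl = "data:image/jpeg;base64,/9j/4AAQSkZJRg==";

// PNG: the prefix matches and only the base64 payload remains.
const pngPayload = stripPngPrefix(pngUrl);

// JPEG: the pattern does not match, so the string comes back unchanged
// and the "data:image/jpeg;base64," prefix would be sent as if it were base64.
const jpegPayload = stripPngPrefix(jpegUrl);

console.log(pngPayload === "iVBORw0KGgo=");     // true
console.log(jpegPayload.startsWith("data:"));   // true
```

In the JPEG case the block would also still be labelled image/png, so both the metadata and the payload are wrong.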

Proposed change

Parse the image MIME type from the screenshot data URL and strip the prefix generically:

  • derive media_type from the data:image/<subtype>;base64, prefix
  • fall back to image/png if parsing fails
  • use the parsed base64 payload in both the success and error tool_result branches
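A minimal sketch of what the proposed parsing could look like. The names parseScreenshotDataUrl, ParsedScreenshot, and buildImageBlock are hypothetical and do not exist in the codebase. The fallback assumes that an input without a recognizable prefix is passed through unchanged as the payload, which matches what the current replace() call does for non-matching strings.

```typescript
// Hypothetical result shape for the parsed screenshot.
interface ParsedScreenshot {
  mediaType: string; // e.g. "image/png" or "image/jpeg"
  data: string;      // base64 payload without the data URL prefix
}

// Matches a leading "data:image/<subtype>;base64," prefix and captures the MIME type.
const IMAGE_DATA_URL_PREFIX = /^data:(image\/[a-zA-Z0-9.+-]+);base64,/;

function parseScreenshotDataUrl(screenshot: string): ParsedScreenshot {
  const match = IMAGE_DATA_URL_PREFIX.exec(screenshot);
  if (!match) {
    // Assumption: no recognizable prefix means we keep today's behavior,
    // i.e. label the image as PNG and pass the input through untouched.
    return { mediaType: "image/png", data: screenshot };
  }
  return {
    mediaType: match[1].toLowerCase(),
    // Drop exactly the matched prefix; the remainder is the base64 payload.
    data: screenshot.slice(match[0].length),
  };
}

// Builds the image block in the shape the issue describes
// (media_type plus base64 data), usable from both tool_result branches.
function buildImageBlock(screenshot: string) {
  const { mediaType, data } = parseScreenshotDataUrl(screenshot);
  return {
    type: "image" as const,
    source: { type: "base64" as const, media_type: mediaType, data },
  };
}

const jpegBlock = buildImageBlock("data:image/jpeg;base64,/9j/4AAQ");
console.log(jpegBlock.source.media_type); // "image/jpeg"
console.log(jpegBlock.source.data);       // "/9j/4AAQ"
```

Routing both the success and error branches through one helper like this would keep the media type and payload in sync. Whether to additionally validate the parsed type against the set of image types the API accepts is left open here.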

Suggested files

  • packages/core/lib/v3/agent/AnthropicCUAClient.ts

Acceptance criteria

  • media_type matches actual screenshot MIME type
  • base64 payload is valid for PNG and JPEG data URLs
  • existing PNG behavior remains unchanged
  • add/update tests for MIME parsing + fallback behavior
