fix(llm): emit structured input_image content for tool-result media in OpenAI Responses #28754

Merged
kitlangton merged 2 commits into anomalyco:dev from kitlangton:fix/openai-responses-tool-image
May 22, 2026

@kitlangton kitlangton commented May 22, 2026

Issue for this PR

Closes #28859

Type of change

  • Bug fix
  • New feature
  • Refactor / code improvement
  • Documentation

What does this PR do?

OpenAI Responses was lowering all tool results through toolResultText. For content-typed tool results, that JSON-stringified the entire content array, including base64 image media, into function_call_output.output.

This PR widens the Responses request schema so function_call_output.output can be either a string or an array of structured content items. Image tool-result media is now emitted as input_image content, while text/json/error tool results keep the existing string behavior. Unsupported non-image tool-result media now returns a clear LLMError.
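
The widened shape can be sketched as TypeScript types with a matching runtime check. This is only an illustration under stated assumptions: the field names (`call_id`, `output`, `input_text`, `input_image`, `image_url`) follow the public OpenAI Responses wire format, while the type names and the `isFunctionCallOutputValue` helper are hypothetical and not taken from this PR.

```typescript
// Hypothetical type names; only the wire field names come from the OpenAI Responses API.
type InputText = { type: "input_text"; text: string };
type InputImage = { type: "input_image"; image_url: string };
type OutputItem = InputText | InputImage;

type FunctionCallOutput = {
  type: "function_call_output";
  call_id: string;
  // Widened: a plain string (legacy path) or an array of structured items.
  output: string | OutputItem[];
};

// Runtime check mirroring the widened schema (illustrative helper, not the PR's code).
function isFunctionCallOutputValue(value: unknown): value is string | OutputItem[] {
  if (typeof value === "string") return true;
  if (!Array.isArray(value)) return false;
  return value.every((item) => {
    if (typeof item !== "object" || item === null) return false;
    const record = item as Record<string, unknown>;
    if (record.type === "input_text") return typeof record.text === "string";
    if (record.type === "input_image") return typeof record.image_url === "string";
    return false;
  });
}

// An image tool result expressed in the structured form.
const exampleItem: FunctionCallOutput = {
  type: "function_call_output",
  call_id: "call_123", // placeholder id
  output: [{ type: "input_image", image_url: "data:image/png;base64,AAAA" }],
};
```

Keeping `string` in the union means existing text, JSON, and error results validate unchanged.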

It also adds protocol-level regression tests and a recorded golden scenario for a real image returned from a tool result.

How did you verify your code works?

  • packages/llm: bun run test passed with 209 pass, 28 skip
  • root bun run typecheck passed with 15 successful tasks
  • recorded replay for openai-responses-gpt-5-5-image-tool-result passed

Screenshots / recordings

Not applicable. This is an LLM protocol wire-shape fix.

Checklist

  • I have tested my changes locally
  • I have not included unrelated changes in this PR

…n OpenAI Responses

The native OpenAI Responses protocol previously lowered every tool-result
into a string via toolResultText, which for content-typed results
(`{ type: 'content', value: [text, media] }`) JSON-stringified the entire
array — including multi-megabyte base64 image data URLs — into a single
`function_call_output.output` string. OpenAI Responses rejects this
shape and emits a contentless stream `error` event, surfacing to the
caller as the bare "OpenAI Responses stream error".
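
To make the failure mode concrete, here is a small sketch of what stringifying a content-typed result does. The part field names (`mediaType`, `data`) and the `legacyOutput` variable are assumptions for illustration; the actual `toolResultText` implementation is not shown in this PR page.

```typescript
// Hypothetical part shapes for a content-typed tool result.
type ContentPart =
  | { type: "text"; text: string }
  | { type: "media"; mediaType: string; data: string };

// Stand-in for a multi-megabyte base64 payload.
const fakeBase64 = "A".repeat(4 * 1024 * 1024);

const contentResult: { type: "content"; value: ContentPart[] } = {
  type: "content",
  value: [
    { type: "text", text: "screenshot captured" },
    { type: "media", mediaType: "image/png", data: fakeBase64 },
  ],
};

// Stringifying the whole array embeds the base64 payload inside one text string,
// so the image is sent as opaque text rather than as image content.
const legacyOutput: string = JSON.stringify(contentResult.value);

console.log(legacyOutput.length > fakeBase64.length);
```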

Widen `function_call_output.output` in the body schema to accept the
real API shape (string or array of input_text/input_image) and add a
media-aware lowering helper that:

- emits structured `input_image` items for image media in tool results
- keeps the legacy string path for text/json/error results so existing
  cassettes and provider expectations are unchanged
- raises a clear LLMError for unsupported tool-result media types (e.g.
  audio) instead of silently encoding them
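
A minimal sketch of such a media-aware lowering, assuming a simplified tool-result model. The names `ToolResult`, `ToolResultPart`, `lowerToolResult`, and `UnsupportedToolMediaError` (a stand-in for the PR's `LLMError`) are hypothetical, and lowering text parts inside a content array to `input_text` is an assumption based on the widened schema rather than something this page confirms.

```typescript
type ToolResultPart =
  | { type: "text"; text: string }
  | { type: "media"; mediaType: string; data: string }; // data is base64

type ToolResult =
  | { type: "text"; value: string }
  | { type: "json"; value: unknown }
  | { type: "error"; value: string }
  | { type: "content"; value: ToolResultPart[] };

type OutputItem =
  | { type: "input_text"; text: string }
  | { type: "input_image"; image_url: string };

// Stand-in for the PR's LLMError; the real class and its fields are not shown here.
class UnsupportedToolMediaError extends Error {}

function lowerToolResult(result: ToolResult): string | OutputItem[] {
  switch (result.type) {
    case "text":
    case "error":
      // Legacy string path, unchanged.
      return result.value;
    case "json":
      return JSON.stringify(result.value);
    case "content":
      return result.value.map((part): OutputItem => {
        if (part.type === "text") {
          return { type: "input_text", text: part.text };
        }
        if (part.mediaType.startsWith("image/")) {
          // Images travel as structured items carrying a data URL.
          return {
            type: "input_image",
            image_url: `data:${part.mediaType};base64,${part.data}`,
          };
        }
        // Fail loudly instead of silently encoding unsupported media.
        throw new UnsupportedToolMediaError(
          `Unsupported tool-result media type: ${part.mediaType}`,
        );
      });
  }
}
```

Throwing before any request is sent turns a contentless stream error into one that names the offending media type.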

Adds three protocol-level reproducer tests for the lowering and a
RECORD-gated golden scenario (`image-tool-result`) that exercises a
real OpenAI Responses tool-image roundtrip end-to-end.
@github-actions github-actions Bot added the needs:issue and needs:compliance labels May 22, 2026
@github-actions
Contributor

Thanks for your contribution!

This PR doesn't have a linked issue. All PRs must reference an existing issue.

Please:

  1. Open an issue describing the bug/feature (if one doesn't exist)
  2. Add Fixes #<number> or Closes #<number> to this PR description

See CONTRIBUTING.md for details.

@github-actions
Contributor

This PR doesn't fully meet our contributing guidelines and PR template.

What needs to be fixed:

  • PR description is missing required template sections. Please use the PR template.

Please edit this PR description to address the above within 2 hours, or it will be automatically closed.

If you believe this was flagged incorrectly, please let a maintainer know.

@github-actions
Contributor

The following comment was made by an LLM, it may be inaccurate:

Based on my search results, I found one related PR:

fix(llm): emit structured image blocks for tool-result media in Anthropic Messages - PR #28755

This is a sibling fix mentioned in the current PR's description. It addresses the same issue but for Anthropic Messages instead of OpenAI Responses. Both PRs are part of the same effort to properly handle tool-result media with structured content types rather than JSON-stringifying them, which prevents token bloat when switching between providers and avoids OpenAI/Anthropic API rejections.

@github-actions
Contributor

This pull request has been automatically closed because it was not updated to meet our contributing guidelines within the 2-hour window.

Feel free to open a new pull request that follows our guidelines.

@github-actions github-actions Bot removed the needs:compliance label May 22, 2026
@github-actions github-actions Bot closed this May 22, 2026
@kitlangton kitlangton reopened this May 22, 2026
@kitlangton kitlangton merged commit 700d012 into anomalyco:dev May 22, 2026
16 checks passed
Development

Successfully merging this pull request may close these issues.

OpenAI Responses JSON-stringifies image media returned from tool results
