fix(llm): emit structured image blocks for tool-result media in Anthropic Messages #28755
The Anthropic Messages protocol lowered every tool-result into a single
string via toolResultText, which for content-typed results
(`{ type: 'content', value: [text, media] }`) JSON-stringified the entire
array — including multi-megabyte base64 image data URLs — into the
`tool_result.content` field. The wire request still parsed, but the
giant string silently consumed the context window: in one recorded
session a single screenshot read pushed a later turn over Claude's 1M
token limit with `prompt is too long: 1033591 tokens > 1000000`.
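As a rough illustration of the failure mode described above, the sketch below stringifies a content array that carries a large base64 data URL. The `ToolResultPart` type and the `naiveLower` function are hypothetical stand-ins, not the package's actual definitions, and the payload is synthetic filler rather than a real screenshot.

```typescript
// Hypothetical part shape for a content-typed tool result (illustrative only).
type ToolResultPart =
  | { type: "text"; text: string }
  | { type: "media"; mediaType: string; data: string };

// Stand-in for the old behavior: the whole array becomes one JSON string.
function naiveLower(parts: ToolResultPart[]): string {
  return JSON.stringify(parts);
}

// Synthetic 2,000,000-character payload standing in for a screenshot.
const fakeBase64 = "A".repeat(2_000_000);

const parts: ToolResultPart[] = [
  { type: "text", text: "screenshot captured" },
  {
    type: "media",
    mediaType: "image/png",
    data: `data:image/png;base64,${fakeBase64}`,
  },
];

const lowered = naiveLower(parts);

// The request is still valid JSON, but the image bytes are now plain text
// inside tool_result.content and are counted as prompt tokens.
console.log(`stringified length: ${lowered.length} characters`);
```

The string is longer than the payload itself because the JSON wrapper and the data URL prefix are added on top of the base64 text.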
Widen `tool_result.content` in the body schema to match the real API
shape (`string | (TextBlock | ImageBlock)[]`) and add a media-aware
lowering helper that:
- emits Anthropic-native `image` blocks for image media in tool results
- keeps the legacy string path for text / json / error results so existing
cassettes and provider expectations are unchanged
- raises a clear LLMError for unsupported tool-result media types (e.g.
audio) instead of silently encoding them
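The three behaviors listed above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `ToolResultOutput`, `ToolResultPart`, `lowerToolResult`, and the local `LLMError` class are hypothetical names, the media `data` field is assumed to hold raw base64 without a data URL prefix, and the legacy path is assumed to stringify JSON results. The `image` block follows the base64 source shape used by the Anthropic Messages API.

```typescript
// Hypothetical input shapes (assumptions for illustration).
type ToolResultPart =
  | { type: "text"; text: string }
  | { type: "media"; mediaType: string; data: string }; // data: raw base64

type ToolResultOutput =
  | { type: "text"; value: string }
  | { type: "json"; value: unknown }
  | { type: "error"; value: string }
  | { type: "content"; value: ToolResultPart[] };

// Block shapes accepted inside tool_result.content.
type TextBlock = { type: "text"; text: string };
type ImageBlock = {
  type: "image";
  source: { type: "base64"; media_type: string; data: string };
};

// Local stand-in for the package's error type.
class LLMError extends Error {}

function lowerToolResult(
  output: ToolResultOutput,
): string | (TextBlock | ImageBlock)[] {
  switch (output.type) {
    // Legacy string path: text, error, and json results stay strings.
    case "text":
    case "error":
      return output.value;
    case "json":
      return JSON.stringify(output.value);
    // Content-typed results become structured blocks.
    case "content":
      return output.value.map((part): TextBlock | ImageBlock => {
        if (part.type === "text") return { type: "text", text: part.text };
        if (part.mediaType.startsWith("image/")) {
          return {
            type: "image",
            source: {
              type: "base64",
              media_type: part.mediaType,
              data: part.data,
            },
          };
        }
        // Fail loudly instead of encoding unsupported media as text.
        throw new LLMError(
          `Unsupported tool-result media type: ${part.mediaType}`,
        );
      });
  }
}
```

Keeping the string path separate from the block path means only content-typed results change wire shape, which is consistent with the goal of leaving existing cassettes untouched.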
Adds three protocol-level reproducer tests for the lowering and a
RECORD-gated golden scenario (`image-tool-result`) shared with the
sibling OpenAI Responses fix so the next end-to-end refresh covers both
providers.
This PR doesn't fully meet our contributing guidelines and PR template. What needs to be fixed:
Please edit this PR description to address the above within 2 hours, or it will be automatically closed. If you believe this was flagged incorrectly, please let a maintainer know.
Thanks for your contribution! This PR doesn't have a linked issue. All PRs must reference an existing issue. Please:
See CONTRIBUTING.md for details.
The following comment was made by an LLM, it may be inaccurate:
Based on my search results, I found one related PR that may be relevant:
Related PR (not a duplicate):
This is a sibling PR that addresses the same underlying issue but for OpenAI Responses instead of Anthropic Messages. Per your PR description, you mentioned "A sibling fix lives on …". These are separate fixes for different providers (not duplicates), but they're part of the same feature initiative to handle image blocks in tool results properly.
No duplicate PRs found.
This pull request has been automatically closed because it was not updated to meet our contributing guidelines within the 2-hour window. Feel free to open a new pull request that follows our guidelines.
Issue for this PR
Closes #28861
Type of change
What does this PR do?
Anthropic Messages was lowering all tool results through `toolResultText`. For content-typed tool results, that JSON-stringified the entire content array, including base64 image media, into `tool_result.content`.
This PR widens the Anthropic request schema so `tool_result.content` can be either a string or an array of structured text/image blocks. Image tool-result media is now emitted as Anthropic-native `image` blocks, while text/json/error tool results keep the existing string behavior. Unsupported non-image tool-result media now returns a clear `LLMError`.
It also adds protocol-level regression tests and a recorded golden scenario for a real image returned from a tool result.
How did you verify your code works?
- packages/llm: `bun run test` passed with 209 pass, 28 skip
- `bun run typecheck` passed with 15 successful tasks
- `anthropic-opus-4-7-image-tool-result` passed
Screenshots / recordings
Not applicable. This is an LLM protocol wire-shape fix.
Checklist