fix(image-handling): close 4 audit follow-ups from PR #155#156
Merged
Conversation
Addresses the four pre-existing gaps the PR #155 audit surfaced: 1. validate_images_for_api ran only on the Anthropic-direct call_model path; all provider.chat_stream_response paths skipped it. Promoted into BaseProvider._prepare_messages so every provider validates client-side. query._call_model_sync catches ImageSizeError and surfaces a media_size assistant error (matches the existing server-side classification). agent_loop._call_provider_for_turn re-raises ImageSizeError instead of falling back to chat() (which would re-trigger the same validation). Validator also now walks images nested inside tool_result.content — post-PR #154 the Read tool returns image blocks there and the old walker missed them. 2. OpenAI tool->user split symmetric correlation. Tool message body now always carries [multimodal content for tool_use_id=X ...] when the synthetic role=user message follows; the user message leads with a [content for tool_use_id=X] text block so the link is visible from both directions. Empty content sentinel ([empty tool result]) guards the "non-empty content required" invariant for content=[] cases. 3. @file.pdf and other binary @-mentions no longer hit the text-mode open() that produced mojibake-in-system-reminder. Known binary extensions (pdf/zip/docx/...) plus a NUL-byte content sniff route to a new "binary" attachment kind with a Read-tool hint (PDF specifically mentions the pages parameter). BOM-aware decoder (_read_text_with_encoding) handles Windows-emitted UTF-16/UTF-32 files via the BOM-aware codecs (utf-16, utf-32, utf-8-sig) so the BOM is consumed rather than preserved as a literal U+FEFF char. Post-decode garbled-text detector (NUL zero-tolerance + 3% U+FFFD threshold) falls back to a binary attachment when an adversarial spoofed-BOM blob would otherwise leak. 4. _anthropic_image_block_to_openai now has a sibling _anthropic_document_block_to_openai that translates Anthropic DocumentBlock (PDF) to OpenAI's {type:"file"} shape. Defensive: no production path currently produces DocumentBlock for OpenAI-compat, but future ones won't silently pass through. Test plan: - 94 new tests across test_at_mention_binary_files.py, test_base_provider_image_validation.py, test_agent_loop_image_size_propagation.py, test_openai_compat_document_translation.py, plus extensions to test_image_validation.py, test_openai_compat_image_translation.py, test_query_loop.py. - Wider suite: 5015 pass / 10 pre-existing failures unchanged. - Critic review loop: APPROVE after three rounds (round-2 caught a UTF-16 BOM regression I introduced; round-3 fixed it and added the post-decode garble detector). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This was referenced May 16, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Closes the four pre-existing gaps the PR #155 audit surfaced. Each was out of scope for the user's original bug but a real correctness hazard.
1.
validate_images_for_apiskipped on every provider pathOnly the Anthropic-direct
services/api/claude.py:call_modelpath validated; allprovider.chat_stream_responsepaths (anthropic, openai, glm, deepseek, openrouter, minimax) bypassed it. A>5 MBbase64 that slipped past the @-mention compression fallback would surface as an opaque API error.BaseProvider._prepare_messages— every provider now validates client-side.OpenAICompatibleProvider._prepare_messagescallssuper()first so validation runs on the Anthropic-shape payload BEFORE translation toimage_url/file.query._call_model_synccatchesImageSizeErrorand emits amedia_sizeassistant API-error message (matches the existing server-side classification used byis_media_size_error).agent_loop._call_provider_for_turnre-raisesImageSizeErrorinstead of falling back toprovider.chat()—chat()runs the same_prepare_messagesand would re-trigger the validation, masking the original error._iter_image_blocksonly walked top-level content. Post-feat(read): port TS image-handling pipeline to Python (Tier C parity) #154 the Read tool ships images insidetool_result.content, which the validator was blind to. New recursion with_MAX_TOOL_RESULT_DEPTH=32depth cap walks nested tool_result content while bounding pathological depth.2. OpenAI tool→user split: asymmetric
tool_use_idcorrelationThe pre-existing
[multimodal content delivered ...]placeholder only fired when the original tool_result was image-only. For text+image tool_results the text body landed on therole=toolmessage with no marker, and the syntheticrole=usercarrying the image_url had no link back to the parent tool_call at all.[multimodal content for tool_use_id=X delivered in the following message]when a synthetic user message follows (appended after the text body for text+image; standalone for image-only).[content for tool_use_id=X]text block, then theimage_url/fileblocks. Correlation visible from both directions.[empty tool result]) honors OpenAI's "tool message content must be non-empty" requirement forcontent: []edge cases.3.
@file.pdfmojibake (Bug A pattern for non-image binaries)The pre-#155 text-mode
open(path, "r", encoding="utf-8", errors="replace")produced utf-8 replacement chars over PDF/zip/docx binary bytes — exact Bug A failure mode, just for a different file class. TS routes PDFs through the Read tool'spagesparameter, not @-mention.pdf/zip/tar/docx/...) skip the text read. PDF gets a Read-tool hint mentioning thepagesparameter; other binaries get a generic hint._looks_like_binary) catches binaries whose extension we haven't enumerated._read_text_with_encoding) handles Windows-emitted UTF-16 LE/BE, UTF-32 LE/BE, and UTF-8 BOM files via the BOM-aware codecs (utf-16,utf-32,utf-8-sig) so the BOM is consumed rather than preserved as a literal U+FEFF char._decoded_text_looks_garbled): NUL char zero-tolerance + 3% U+FFFD threshold. Falls back to the binary branch when an adversarial spoofed-BOM blob would otherwise leak NUL bytes or replacement-char mojibake into the prompt.binaryattachment kind inexpand_at_mentions+ REPL rendering (Skipped binary file <path>).4. DocumentBlock translation missing in OpenAI-compat converter
_anthropic_image_block_to_openaihandledtype=imageonly. If aDocumentBlockever reached an OpenAI-compat provider it would pass through unchanged and the server would reject or drop it._anthropic_document_block_to_openaitranslates{type:"document", source:{type:"base64", media_type:"application/pdf", ...}}→{type:"file", file:{filename, file_data}}. Mirrors the image translator's defensive contract (None on bad shape, empty-data guard)._translate_anthropic_multimodal_blockdispatcher centralises image+document translation for both the user-message and tool_result branches.Test plan
tests/test_at_mention_binary_files.py(18) — PDF/zip/docx routing, UTF-16/UTF-32/UTF-8 BOM decoding with clean rendered output, NUL-heavy and high-FFD adversarial cases, image branch unchanged.tests/test_base_provider_image_validation.py(10) — direct_prepare_messagesvalidation for both BaseProvider and OpenAICompatibleProvider.tests/test_agent_loop_image_size_propagation.py(3) —_call_provider_for_turnre-raises ImageSizeError; NotImplementedError + generic Exception fallbacks still work.tests/test_openai_compat_document_translation.py(13) — document translator + multimodal tool_result split.test_image_validation.py(5: nested-tool_result, deep-nesting depth-cap, depth-cap-within-limit, sentinel paths),test_openai_compat_image_translation.py(1 new + 1 updated for tool_use_id correlation),test_query_loop.py(1: ImageSizeError → media_size mapping).🤖 Generated with Claude Code