fix(kimi-k25): emit images inline in tool-response content#30

Merged
hallerite merged 1 commit into fix/qwen35-tool-response-images from fix/kimi-k25-tool-response-images
May 13, 2026

Conversation

@hallerite
Member

Summary

  • Extends the Qwen3.5 fix from #25 (fix(qwen35): emit images inline in tool-response content) to KimiK25Renderer, which serves both Kimi K2.5 and K2.6.
  • Threads emit_image from render() and bridge_to_next_turn() through _render_tool_body into _emit_content, so images inside tool message content render inline instead of being silently dropped.
  • Extends _supports_tool_message_images so test_tool_response_image_byte_parity exercises Kimi K2.5/K2.6.

Why the existing tests didn't catch this

Two compounding gaps. First, _emit_content silently drops image parts when emit_image is None. That fallback was meant for assistant-body text rewriting, but it doubled as a backdoor for callers that simply forgot to plumb emit_image through (namely _render_tool_body). Second, every existing multimodal test put images on user messages; nothing exercised the tool path. #25 adds the first test that does, and this PR extends that test's gate to flag (and now cover) Kimi.
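
As an illustration of the second gap, here is a hedged sketch of a role-agnostic check that would flag a dropped image on any message role, including `tool`. It is not part of this PR or its test suite. The helper names are hypothetical, and it assumes the rendered text contains exactly one `<|media_pad|>` token per image before any processor-side expansion.

```python
# Token taken from the placeholder quoted in the commit message.
MEDIA_PAD = "<|media_pad|>"


def count_image_parts(messages) -> int:
    """Count image parts across all messages, regardless of role."""
    total = 0
    for message in messages:
        content = message.get("content")
        if isinstance(content, list):
            total += sum(1 for part in content if part.get("type") == "image")
    return total


def dropped_images(messages, rendered: str) -> int:
    """Number of input images with no matching pad token in the rendered text."""
    return count_image_parts(messages) - rendered.count(MEDIA_PAD)
```

A test that only places images on user messages never drives this count through the tool path, which is why the drop went unnoticed.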

Stacking

Based on #25 (whose tests this PR uses to assert byte-parity). GitHub will auto-retarget the base to main once #25 merges.

Test plan

  • pytest tests/test_multimodal.py::test_tool_response_image_byte_parity: both moonshotai/Kimi-K2.5 and moonshotai/Kimi-K2.6 pass against the HF processor across all three cases (single tool image, multi-turn tool images, consecutive tools with mixed media + text-only).
  • Regression sweep: pytest tests/test_multimodal.py -k Kimi → 8/8 pass (3 pre-existing byte-parity / placeholders / bridge tests × 2 Kimi variants + 2 new tool-response tests).
  • Ruff lint + format clean.

🤖 Generated with Claude Code

Thread ``emit_image`` from ``render()`` and ``bridge_to_next_turn()``
through ``_render_tool_body`` so image parts inside ``tool`` message
content (browser-agent screenshots, etc.) render as
``<|media_begin|>image<|media_content|><|media_pad|><|media_end|>``
inline instead of being silently dropped by ``_emit_content``.

Same bug class as Qwen3.5 (PR #25). Affects both Kimi K2.5 and K2.6
since they share ``KimiK25Renderer``.

Extends the ``_supports_tool_message_images`` gate in the multimodal
suite so ``test_tool_response_image_byte_parity`` exercises Kimi
K2.5/K2.6 against the HF processor — 8/8 Kimi multimodal cases pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@hallerite hallerite merged commit 7c70310 into fix/qwen35-tool-response-images May 13, 2026
1 of 2 checks passed
@hallerite hallerite deleted the fix/kimi-k25-tool-response-images branch May 13, 2026 16:11