feat(tui): send /attach images as multimodal content (#2584) by Hmbown · Pull Request #2607 · Hmbown/CodeWhale

Hmbown · 2026-06-03T04:06:19Z

Adds OpenAI-compatible image_url content blocks for multimodal image support. Rebased from @xyuai's PR #2587 onto v0.8.51-era main with cycle-removal and compaction-refactor conflicts resolved.

Changes

models.rs: ImageUrlContent struct, ContentBlock::ImageUrl variant
client/chat.rs: image_parts collection, multimodal wire format, image-aware inspection, stream-event no-op
10 files: exhaustiveness arms for new variant

Test

cargo test -p codewhale-tui: 3,931 passed, 0 failed (includes new request_builder_emits_openai_image_url_parts_for_user_images test)

Closes #2584. Credit: @xyuai for the original implementation in #2587.

@xyuai

Adds OpenAI-compatible image_url content blocks to the chat message model, wiring attached images through build_chat_messages_with_reasoning as multimodal user-content arrays. When images are present, user messages emit a content array of text + image_url parts instead of a plain string, matching the OpenAI vision API shape.

- models.rs: new ImageUrlContent struct, ContentBlock::ImageUrl variant
- client/chat.rs: image_parts collection, multimodal wire format for user messages, image-aware message inspection, stream-event no-op
- Exhaustiveness arms added across 10 files (compaction, seam_manager, capacity_flow, purge, notifications, session_picker, utils, working_set, rlm/session, runtime_api)
- Test: request_builder_emits_openai_image_url_parts_for_user_images

Credit: @xyuai (PR #2587: root cause + initial implementation)
Closes: #2584
Co-authored-by: xyuai <xyuai@users.noreply.github.com>
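The switch from a plain-string `content` to an array of parts can be sketched as follows. This is a minimal, std-only illustration, not the code in client/chat.rs: `user_content_json` and `escape_json` are hypothetical names, the JSON is rendered by hand instead of with serde_json, and omitting an empty text part is an assumption of the sketch.

```rust
/// Minimal JSON string escaping; enough for this sketch, not a full encoder.
fn escape_json(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Renders the `content` field of a user message (hypothetical helper).
/// With no images the content stays a plain JSON string; with at least one
/// image it becomes an array of `text` and `image_url` parts.
pub fn user_content_json(text: &str, image_urls: &[&str]) -> String {
    if image_urls.is_empty() {
        return format!("\"{}\"", escape_json(text));
    }
    let mut parts = Vec::new();
    // Assumption: an empty text part is skipped rather than sent.
    if !text.is_empty() {
        parts.push(format!(
            r#"{{"type":"text","text":"{}"}}"#,
            escape_json(text)
        ));
    }
    for url in image_urls {
        parts.push(format!(
            r#"{{"type":"image_url","image_url":{{"url":"{}"}}}}"#,
            escape_json(url)
        ));
    }
    format!("[{}]", parts.join(","))
}
```

Calling `user_content_json("look", &["data:image/png;base64,AAAA"])` yields a two-element array (one `text` part, one `image_url` part), while a call with an empty image slice yields the quoted string alone.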

gemini-code-assist

Code Review

This pull request adds support for multimodal ImageUrl content blocks across the TUI client, updating message serialization, compaction, token estimation, and UI rendering. The feedback highlights three critical issues where ImageUrl blocks are handled incorrectly: returning serde_json::Value::Null in runtime_api.rs and rlm/session.rs introduces invalid nulls into serialized content arrays, and estimating ImageUrl blocks as 0 tokens in compaction.rs can lead to context window overflows. Suggestions are provided to properly serialize the image blocks and use a default estimate of 85 tokens.

gemini-code-assist · 2026-06-03T04:08:07Z

                    crate::models::ContentBlock::CodeExecutionToolResult { tool_use_id, content } => {
                        json!({ "type": "tool_result", "tool_use_id": tool_use_id, "content": content })
                    }
+                    crate::models::ContentBlock::ImageUrl { .. } => serde_json::Value::Null,


Returning serde_json::Value::Null for ImageUrl blocks introduces null elements into the content array of the API response. This violates the expected schema (where all elements should be valid ContentBlock objects) and can cause deserialization or rendering failures in API clients. Instead, represent the ImageUrl block properly using its type and URL.

crate::models::ContentBlock::ImageUrl { image_url } => {
    json!({ "type": "image_url", "image_url": { "url": image_url.url.clone() } })
}

gemini-code-assist · 2026-06-03T04:08:07Z

            "tool_use_id": tool_use_id,
            "content": content,
        }),
+        ContentBlock::ImageUrl { .. } => serde_json::Value::Null,


Returning serde_json::Value::Null for ImageUrl blocks introduces null elements into the compacted content array. This violates the expected schema of ContentBlock objects and can cause deserialization or parsing errors when the RLM session is processed. Instead, represent the ImageUrl block as a valid JSON object, truncating the URL to prevent context bloat from large base64 strings.

ContentBlock::ImageUrl { image_url } => json!({
    "type": "image_url",
    "image_url": {
        "url": truncate_chars(&image_url.url, 128),
    }
}),
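The suggestion relies on a `truncate_chars` helper whose implementation is not shown in this thread. A character-safe version might look like the sketch below; the signature and behavior are assumptions, not the crate's actual helper.

```rust
/// Hypothetical stand-in for the `truncate_chars` helper named in the review.
/// Keeps at most `max_chars` Unicode scalar values and never splits a
/// multi-byte character, so slicing cannot panic on a char boundary.
pub fn truncate_chars(s: &str, max_chars: usize) -> &str {
    match s.char_indices().nth(max_chars) {
        // `byte_idx` is the start of the first character to drop.
        Some((byte_idx, _)) => &s[..byte_idx],
        // The string already fits within the limit.
        None => s,
    }
}
```

With a 128-character cap, a multi-kilobyte base64 `data:` URL keeps only its prefix, which is usually enough to show the MIME type while keeping the compacted session small.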

gemini-code-assist · 2026-06-03T04:08:07Z

+            | ContentBlock::CodeExecutionToolResult { .. }
+            | ContentBlock::ImageUrl { .. } => 0,


Estimating ImageUrl blocks as 0 tokens will cause the compaction logic to significantly underestimate the total token count of the conversation. This can lead to context window overflow errors (HTTP 400) from the LLM provider because compaction won't be triggered in time when images are attached. Every image in OpenAI's multimodal API costs at least 85 tokens (low detail base cost). We should use a conservative default estimate of 85 tokens for ImageUrl blocks.

Suggested change

-            | ContentBlock::CodeExecutionToolResult { .. }
-            | ContentBlock::ImageUrl { .. } => 0,
+            | ContentBlock::CodeExecutionToolResult { .. } => 0,
+            ContentBlock::ImageUrl { .. } => 85,
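To illustrate the effect of the suggested arm, here is a small self-contained sketch of a per-block estimator. The `ContentBlock` enum below is a simplified stand-in with assumed fields, and the four-characters-per-token text heuristic is an assumption of this sketch rather than what compaction.rs does; only the flat 85-token image estimate comes from the review.

```rust
/// Simplified stand-in for the crate's `ContentBlock`; only the variants
/// needed here are modeled, and their fields are assumptions.
pub enum ContentBlock {
    Text { text: String },
    CodeExecutionToolResult { tool_use_id: String },
    ImageUrl { url: String },
}

/// Flat per-image estimate suggested in the review.
pub const IMAGE_TOKEN_ESTIMATE: usize = 85;

/// Sums a rough token estimate over all blocks.
/// Text uses about four characters per token (an assumption of this sketch).
pub fn estimate_tokens(blocks: &[ContentBlock]) -> usize {
    blocks
        .iter()
        .map(|block| match block {
            ContentBlock::Text { text } => (text.chars().count() + 3) / 4,
            ContentBlock::CodeExecutionToolResult { .. } => 0,
            // Counting images as 0 would hide them from the compaction trigger.
            ContentBlock::ImageUrl { .. } => IMAGE_TOKEN_ESTIMATE,
        })
        .sum()
}
```

Under this sketch, a message with eight characters of text and two attached images is estimated at 2 + 85 + 85 = 172 tokens, where the merged arm would have counted only the 2 text tokens.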

greptile-apps Bot reviewed Jun 3, 2026

gemini-code-assist Bot reviewed Jun 3, 2026

Hmbown merged commit dd26114 into main Jun 3, 2026
14 of 16 checks passed

Hmbown deleted the fix/image-attach-2587 branch June 3, 2026 04:27

Hmbown mentioned this pull request Jun 3, 2026

fix(tui): send /attach images as multimodal content #2587

Closed

This was referenced Jun 3, 2026

Unable to upload local images #2584

Open

fix(release): stabilize v0.8.52 #2626

Merged
