fix(tui): send /attach images as multimodal content by xyuai · Pull Request #2587 · Hmbown/CodeWhale

xyuai · 2026-06-02T12:22:01Z

Summary

convert /attach image placeholders into OpenAI-compatible multimodal image_url content blocks
read local image files and send them as data:image/...;base64,... URLs while keeping the transcript placeholder readable
add regression tests for attachment base64 conversion and chat request image parts
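The data-URL step described above can be sketched with the standard library alone. This is a minimal illustration, not the PR's code: the PR uses the `base64` crate's encoder, while `base64_encode` and `image_data_url` below are hypothetical std-only stand-ins.

```rust
/// Standard base64 alphabet (RFC 4648), used with `=` padding.
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Hypothetical std-only stand-in for the encoder the PR takes from the `base64` crate.
fn base64_encode(bytes: &[u8]) -> String {
    let mut out = String::with_capacity((bytes.len() + 2) / 3 * 4);
    for chunk in bytes.chunks(3) {
        // Pack up to three bytes into the low 24 bits of a u32.
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;
        out.push(ALPHABET[(n >> 18) as usize & 63] as char);
        out.push(ALPHABET[(n >> 12) as usize & 63] as char);
        // Missing input bytes in the final chunk become '=' padding.
        out.push(if chunk.len() > 1 { ALPHABET[(n >> 6) as usize & 63] as char } else { '=' });
        out.push(if chunk.len() > 2 { ALPHABET[n as usize & 63] as char } else { '=' });
    }
    out
}

/// Hypothetical helper: wrap raw image bytes in a `data:` URL.
fn image_data_url(mime_type: &str, bytes: &[u8]) -> String {
    format!("data:{mime_type};base64,{}", base64_encode(bytes))
}
```

For example, `image_data_url("image/png", b"Man")` yields `data:image/png;base64,TWFu`.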

Verification

cargo fmt
CARGO_TARGET_DIR=C:\Users\ky\AppData\Local\Temp\codewhale-target cargo check -p codewhale-tui
CARGO_TARGET_DIR=C:\Users\ky\AppData\Local\Temp\codewhale-target cargo test -p codewhale-tui --bin codewhale-tui media_attachment_content_blocks_embed_image_as_data_url -- --nocapture
CARGO_TARGET_DIR=C:\Users\ky\AppData\Local\Temp\codewhale-target cargo test -p codewhale-tui --bin codewhale-tui request_builder_emits_openai_image_url_parts_for_user_images -- --nocapture

Greptile Summary

This PR wires /attach image placeholders into OpenAI-compatible multimodal requests by adding a ContentBlock::ImageUrl variant and a new media_attachment_content_blocks helper that reads local image files and encodes them as data:image/…;base64,… URLs.

New ContentBlock::ImageUrl in models.rs: adds ImageUrlContent struct and updates every match across compaction.rs, purge.rs, working_set.rs, and client/chat.rs; most non-chat files just emit a compact placeholder or skip the block.
file_mention.rs — media_attachment_content_blocks: parses [Attached image: … at <path>] lines, resolves the extension to a MIME type, and encodes the file as base64; unsupported types and I/O errors produce informative <Text> blocks instead of panicking.
client/chat.rs — build_chat_messages_with_reasoning: when image_parts is non-empty, user message content is serialised as a JSON array with separate text and image_url parts instead of a bare string; the bulk of the diff in several other files is whitespace-only reformatting.
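The branching described for build_chat_messages_with_reasoning (bare string when there are no images, array of parts otherwise) can be modelled roughly as follows. The type and function names here are hypothetical, JSON serialization is left out, and the choices to put the text part first and to skip an empty text part are assumptions rather than confirmed behaviour of the PR.

```rust
/// One element of an OpenAI-style multimodal `content` array.
#[derive(Debug, PartialEq)]
enum ContentPart {
    Text { text: String },
    ImageUrl { url: String },
}

/// The `content` field of a user message: a bare string or an array of parts.
#[derive(Debug, PartialEq)]
enum UserContent {
    Plain(String),
    Parts(Vec<ContentPart>),
}

/// Hypothetical builder: fall back to a plain string unless images are present.
fn build_user_content(text: &str, image_urls: &[String]) -> UserContent {
    if image_urls.is_empty() {
        return UserContent::Plain(text.to_string());
    }
    let mut parts = Vec::with_capacity(image_urls.len() + 1);
    // Assumption: the text part comes first and is omitted when empty.
    if !text.is_empty() {
        parts.push(ContentPart::Text { text: text.to_string() });
    }
    parts.extend(
        image_urls
            .iter()
            .map(|url| ContentPart::ImageUrl { url: url.clone() }),
    );
    UserContent::Parts(parts)
}
```

Keeping the plain-string case means text-only requests are unchanged for endpoints that do not accept array content.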

Confidence Score: 3/5

Safe to review but needs the unbounded file-read fixed before merging — a large attached image will block the event loop and send an oversized payload.

The core multimodal wiring in chat.rs and models.rs is correct and the tests pass. The main concern is in media_attachment_content_blocks: it reads image files with no byte ceiling, while the analogous text-mention code explicitly caps reads at 128 KiB. On a system where a user attaches a large raw image (or accidentally attaches a non-image large file whose extension maps to a known MIME type), the entire file is read synchronously inside the async send-message handler, base64-encoded in memory, and forwarded to the API — with no guard against OOM or a payload the API will reject.

crates/tui/src/tui/file_mention.rs — the new media_attachment_content_blocks function needs a file-size cap matching the existing MAX_MENTION_FILE_BYTES pattern.

Important Files Changed

Filename	Overview
crates/tui/src/tui/file_mention.rs	Adds `media_attachment_content_blocks` which reads image files and emits `ContentBlock::ImageUrl` — no file-size cap, unlike the existing text-mention path that enforces MAX_MENTION_FILE_BYTES.
crates/tui/src/client/chat.rs	Correctly routes `ContentBlock::ImageUrl` blocks into OpenAI multimodal array content for user messages; adds `summarize_image_url_for_inspect` which duplicates `image_url_summary_for_compaction` from compaction.rs.
crates/tui/src/core/engine.rs	Calls `media_attachment_content_blocks` (which performs blocking fs::read) from within the async `handle_send_message`, inconsistent with the codebase's pattern of offloading blocking I/O to the blocking pool.
crates/tui/src/models.rs	Adds `ImageUrlContent` struct and `ContentBlock::ImageUrl` variant; straightforward, well-typed additions with correct serde rename attributes.
crates/tui/src/compaction.rs	Adds `ContentBlock::ImageUrl` arm to all match statements; bulk of diff is whitespace reformatting with no logic changes. The new arms correctly emit a placeholder summary and return an empty path list.

Sequence Diagram

sequenceDiagram
    participant User
    participant Engine as engine.rs
    participant FM as file_mention.rs
    participant FS as Filesystem
    participant Chat as client/chat.rs
    participant API as LLM API

    User->>Engine: handle_send_message(content)
    Engine->>FM: media_attachment_content_blocks(text)
    FM->>FM: extract_media_attachment_references(input)
    FM->>FM: image_mime_type_for_path(path)
    FM->>FS: fs::read(path) blocking, no size cap
    FS-->>FM: bytes
    FM-->>Engine: Vec ContentBlock::ImageUrl
    Engine->>Engine: push ImageUrl blocks into user Message
    Engine->>Chat: build_chat_messages(messages)
    Chat->>Chat: collect image_parts from ContentBlock::ImageUrl
    Chat->>Chat: wrap as multimodal array with text and image_url parts
    Chat-->>API: POST /chat/completions with content array
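The first file_mention.rs step in the diagram, pulling paths out of placeholder lines, might look something like the sketch below. The exact placeholder grammar is not shown on this page, so the form `[Attached image: <label> at <path>]` and the function name `extract_image_paths` are assumptions made for illustration.

```rust
/// Hypothetical extractor for lines of the assumed form
/// `[Attached image: <label> at <path>]`.
fn extract_image_paths(input: &str) -> Vec<String> {
    input
        .lines()
        .filter_map(|line| {
            let inner = line
                .trim()
                .strip_prefix("[Attached image: ")?
                .strip_suffix(']')?;
            // Split on the last " at " so a label containing " at " still parses.
            let (_label, path) = inner.rsplit_once(" at ")?;
            let path = path.trim();
            (!path.is_empty()).then(|| path.to_string())
        })
        .collect()
}
```

Splitting on the last " at " is a trade-off: it tolerates that substring in the label but would mis-parse a path that contains it.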

Reviews (1): Last reviewed commit: "fix(tui): send attached images as multim..."

Greptile also left 3 inline comments on this PR.

github-actions · 2026-06-02T12:22:13Z

Thanks @xyuai for taking the time to contribute.

This repository is currently observing a maintainer-managed contribution gate in dry-run mode, so this pull request is staying open. When enforcement is enabled, pull requests from contributors who are not listed in .github/APPROVED_CONTRIBUTORS will be closed automatically.

Please read CONTRIBUTING.md for the expected contribution shape. A maintainer can grant PR access by commenting /lgtm on a pull request.

gemini-code-assist

Code Review

This pull request introduces support for multimodal user messages by allowing images to be attached and sent as base64-encoded data URLs to OpenAI-compatible endpoints. It updates message inspection, session serialization, context purging, and the seam manager to handle the new ContentBlock::ImageUrl variant. Feedback on the changes highlights two key areas for improvement: restricting the supported image formats in image_mime_type_for_path to only those officially supported by major LLM providers (PNG, JPEG, WEBP, GIF) to prevent API errors, and implementing a file size limit check using std::fs::metadata before reading image files to avoid excessive memory usage and potential API failures.

gemini-code-assist · 2026-06-02T12:23:43Z

+fn image_mime_type_for_path(path: &Path) -> Option<&'static str> {
+    let ext = path.extension()?.to_str()?.to_ascii_lowercase();
+    match ext.as_str() {
+        "png" => Some("image/png"),
+        "jpg" | "jpeg" => Some("image/jpeg"),
+        "gif" => Some("image/gif"),
+        "webp" => Some("image/webp"),
+        "bmp" => Some("image/bmp"),
+        "tif" | "tiff" => Some("image/tiff"),
+        "ppm" => Some("image/x-portable-pixmap"),
+        _ => None,
    }
-    if buffer.contains(&0) {
-        return Err(std::io::Error::new(
-            std::io::ErrorKind::InvalidData,
-            "file appears to be binary",
-        ));
-    }
-    let text = std::str::from_utf8(&buffer)
-        .map_err(|_| std::io::Error::new(std::io::ErrorKind::InvalidData, "file is not UTF-8"))?
-        .to_string();
-    Ok((text, truncated))
 }


The image MIME type resolver currently supports formats like BMP, TIFF, and PPM. However, major LLM providers (such as OpenAI and Anthropic) only officially support PNG, JPEG, WEBP, and non-animated GIF. Sending unsupported image formats will result in a 400 Bad Request API error from the provider, disrupting the chat session.

It is recommended to restrict the supported image extensions to only those officially supported by the LLM APIs.

fn image_mime_type_for_path(path: &Path) -> Option<&'static str> {
    let ext = path.extension()?.to_str()?.to_ascii_lowercase();
    match ext.as_str() {
        "png" => Some("image/png"),
        "jpg" | "jpeg" => Some("image/jpeg"),
        "gif" => Some("image/gif"),
        "webp" => Some("image/webp"),
        _ => None,
    }
}

gemini-code-assist · 2026-06-02T12:23:43Z

+        match std::fs::read(path) {
+            Ok(bytes) => blocks.push(ContentBlock::ImageUrl {
+                image_url: ImageUrlContent {
+                    url: format!("data:{mime_type};base64,{}", BASE64.encode(bytes)),
+                },
+            }),
+            Err(err) => blocks.push(ContentBlock::Text {
+                text: format!(
+                    "<unreadable-attachment kind=\"image\" path=\"{}\">\n{err}\n</unreadable-attachment>",
+                    reference.path
+                ),
+                cache_control: None,
+            }),
        }


Reading arbitrary image files without a size limit can lead to high memory consumption, TUI lag, or API errors (OpenAI limits image uploads to 20MB, and Anthropic limits them to 5MB).

Checking the file size using std::fs::metadata before reading the file prevents loading excessively large files into memory and avoids guaranteed API failures.

match std::fs::metadata(path) {
    Ok(meta) if meta.len() > 10 * 1024 * 1024 => {
        blocks.push(ContentBlock::Text {
            text: format!(
                "<attachment-too-large kind=\"image\" path=\"{}\" limit=\"10MB\" />",
                reference.path
            ),
            cache_control: None,
        });
        continue;
    }
    _ => {}
}
match std::fs::read(path) {
    Ok(bytes) => blocks.push(ContentBlock::ImageUrl {
        image_url: ImageUrlContent {
            url: format!("data:{mime_type};base64,{}", BASE64.encode(bytes)),
        },
    }),
    Err(err) => blocks.push(ContentBlock::Text {
        text: format!(
            "<unreadable-attachment kind=\"image\" path=\"{}\">\n{err}\n</unreadable-attachment>",
            reference.path
        ),
        cache_control: None,
    }),
}

greptile-apps · 2026-06-02T12:45:11Z

+        match std::fs::read(path) {
+            Ok(bytes) => blocks.push(ContentBlock::ImageUrl {
+                image_url: ImageUrlContent {
+                    url: format!("data:{mime_type};base64,{}", BASE64.encode(bytes)),
+                },
+            }),


Unbounded image file read — no size cap

std::fs::read(path) loads the entire file into memory with no ceiling. The text-mention path enforces MAX_MENTION_FILE_BYTES = 128 KiB via take(), but this image path has no equivalent guard. A user who accidentally attaches a multi-hundred-MB raw image (or a file masquerading as one via its path) will fully base64-encode it in memory, blocking the async runtime and potentially sending a payload large enough to be rejected by the API or exhaust process memory.
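A bounded read in the spirit of the take() guard mentioned above could look like this. It is a sketch under assumptions: `read_capped` and `MAX_IMAGE_BYTES` are hypothetical names, and the 5 MiB value is only illustrative, not a limit taken from the codebase.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// Illustrative cap only; the right ceiling for images is a project decision.
const MAX_IMAGE_BYTES: u64 = 5 * 1024 * 1024;

/// Read at most `max_bytes` from `path`.
/// Returns `Ok(None)` when the file is larger than the cap.
fn read_capped(path: &Path, max_bytes: u64) -> io::Result<Option<Vec<u8>>> {
    let file = File::open(path)?;
    let mut buf = Vec::new();
    // Request one byte beyond the cap so an oversized file is detectable
    // without ever buffering more than max_bytes + 1 bytes.
    file.take(max_bytes.saturating_add(1)).read_to_end(&mut buf)?;
    if buf.len() as u64 > max_bytes {
        return Ok(None);
    }
    Ok(Some(buf))
}
```

Unlike a metadata check followed by an unbounded read, this bounds memory even if the file grows between the two calls; the caller can turn `Ok(None)` into a too-large text block.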

greptile-apps · 2026-06-02T12:45:12Z

+        content.extend(crate::tui::file_mention::media_attachment_content_blocks(
+            &text,
+        ));


Blocking file I/O on the async executor thread

media_attachment_content_blocks calls std::fs::read synchronously from a non-async function that is itself called from the async fn handle_send_message. The codebase's own comment at line ~1316 ("Run the git work on the blocking pool so the async runtime stays responsive") documents why this pattern is avoided elsewhere. For small images the impact is negligible, but the same call path is now used unconditionally on every message send that contains any [Attached image: …] placeholder, including messages with large files.
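The idea of keeping the read off the calling thread can be illustrated with the standard library alone. This is a std-only stand-in, not the codebase's mechanism: `read_in_background` is a hypothetical name, and in a Tokio-based runtime (an assumption about this project) the closure would more likely be handed to `tokio::task::spawn_blocking` and awaited.

```rust
use std::io;
use std::path::PathBuf;
use std::sync::mpsc::{self, Receiver};
use std::thread;

/// Hypothetical helper: perform the blocking read on a worker thread and
/// deliver the result over a channel so the caller is not stalled by disk I/O.
fn read_in_background(path: PathBuf) -> Receiver<io::Result<Vec<u8>>> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The send fails only if the receiver was dropped; nothing to do then.
        let _ = tx.send(std::fs::read(&path));
    });
    rx
}
```

The caller can poll the receiver with `try_recv` between other work, or block on `recv` once the result is actually needed.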

greptile-apps · 2026-06-02T12:45:13Z

+fn summarize_image_url_for_inspect(url: &str) -> String {
+    let Some((prefix, encoded)) = url.split_once(";base64,") else {
+        return first_chars(url, 96);
+    };
+    format!("{prefix};base64,<{} chars>", encoded.len())
+}
+


Duplicate summarisation helper

summarize_image_url_for_inspect here and image_url_summary_for_compaction in compaction.rs (line 306) are byte-for-byte identical. If the truncation length (96 chars) or the format string ever needs to change, both copies must be updated. Extracting a shared helper to utils.rs or models.rs would eliminate the divergence risk.
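One possible shape for the shared helper is sketched below. The name `summarize_image_url` is hypothetical, and `first_chars` is re-implemented here as an assumed equivalent of the helper referenced in the diff, whose real definition is not shown on this page.

```rust
/// Assumed equivalent of the `first_chars` helper used in the diff:
/// the first `max` characters of `s`, respecting UTF-8 boundaries.
fn first_chars(s: &str, max: usize) -> String {
    s.chars().take(max).collect()
}

/// Hypothetical shared summary: keep the data-URL prefix and replace the
/// base64 payload with its length; truncate anything that is not a data URL.
fn summarize_image_url(url: &str) -> String {
    match url.split_once(";base64,") {
        Some((prefix, encoded)) => format!("{prefix};base64,<{} chars>", encoded.len()),
        None => first_chars(url, 96),
    }
}
```

Both the inspect view in client/chat.rs and the compaction summary could then call this one function, so a change to the 96-character limit or the format string happens in a single place.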

fix(tui): send attached images as multimodal content

5534c5a

xyuai mentioned this pull request Jun 2, 2026

Unable to upload local images #2584

Open

gemini-code-assist Bot reviewed Jun 2, 2026

greptile-apps Bot reviewed Jun 2, 2026

Hmbown added this to the v0.8.51 milestone Jun 2, 2026

xyuai wants to merge 1 commit into Hmbown:main from xyuai:fix-attach-image-base64-2584

2 participants
