fix(gemini-cli): accept multi-modal content (string | Part | Part[])#143

Merged
mike1858 merged 1 commit into main from fix/137-gemini-cli-content-array
Apr 17, 2026

Conversation

@mike1858 mike1858 commented Apr 17, 2026

Closes #137.

Problem

Gemini CLI 0.35.x changed the content field of chat-session messages from a plain string to a PartListUnion (from @google/genai), which may be:

  • a plain string (legacy form) — "content": "my prompt..."
  • a single Part object — "content": {"text": "my prompt..."}
  • an array of Part objects (current form) — "content": [{"text": "my prompt..."}]

The previous parser typed content: String and therefore failed on any session from the new format with:

Serde("invalid type: sequence, expected a string") at character 0

Fix

Added a tiny untagged enum mirroring the upstream shape:

#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(untagged)]
enum GeminiCliContent {
    Text(String),
    Part(GeminiCliPart),
    Parts(Vec<GeminiCliPart>),
}

GeminiCliPart only consumes text; other part kinds (inlineData, fileData, functionCall, …) are silently ignored by serde's default behaviour so schema growth upstream doesn't break us again. All five GeminiCliMessage variants now carry Option<GeminiCliContent> with #[serde(default)], which also handles null and missing content gracefully.

The only live text-consumer (fallback_session_name in the User arm) now calls GeminiCliContent::as_text() to build a plain-text preview across any of the three shapes, preserving the 50-char truncation.
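
As a rough illustration of the text extraction described above, the following dependency-free sketch shows how an as_text() method might flatten the three shapes. The serde derives are left out so the snippet compiles with the standard library alone. The field layout of GeminiCliPart and the helper text_or_empty are assumptions made for this example, not code taken from the PR.

```rust
/// Stand-in for the PR's `GeminiCliPart`. Assumption: only `text` matters here,
/// so a part such as `inlineData` ends up with `text: None`.
#[derive(Debug, Clone, Default)]
pub struct GeminiCliPart {
    pub text: Option<String>,
}

/// Same three shapes as the enum in the PR, minus the serde attributes.
#[derive(Debug, Clone)]
pub enum GeminiCliContent {
    Text(String),
    Part(GeminiCliPart),
    Parts(Vec<GeminiCliPart>),
}

impl GeminiCliContent {
    /// Flattens any shape into plain text; non-text parts contribute nothing.
    pub fn as_text(&self) -> String {
        match self {
            GeminiCliContent::Text(s) => s.clone(),
            GeminiCliContent::Part(p) => p.text.clone().unwrap_or_default(),
            GeminiCliContent::Parts(ps) => {
                let mut out = String::new();
                for p in ps {
                    if let Some(t) = &p.text {
                        out.push_str(t);
                    }
                }
                out
            }
        }
    }
}

/// Hypothetical helper: missing or `null` content (`None`) becomes "".
pub fn text_or_empty(content: Option<&GeminiCliContent>) -> String {
    content.map(GeminiCliContent::as_text).unwrap_or_default()
}
```

With this shape, a Parts value holding "Hello, ", a part without text, and "world" flattens to "Hello, world", and text_or_empty(None) yields an empty string.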

Tests

Added 6 regression tests in src/analyzers/tests/gemini_cli.rs:

  • test_gemini_cli_content_array_of_parts — array on both user and gemini messages.
  • test_gemini_cli_content_mixed_parts — array with {"text": ...} and {"inlineData": ...} interleaved; text is concatenated, non-text parts ignored.
  • test_gemini_cli_content_single_part_object — single-object PartListUnion form.
  • test_gemini_cli_content_array_session_name_truncated — long array-extracted text still truncates to 50 chars.
  • test_gemini_cli_content_missing_or_null — missing and null content do not crash parsing.
  • test_gemini_cli_issue_137_regression — exact failure schema from the issue report.

The original test_gemini_cli_reasoning_tokens (legacy string format) still passes unchanged.

Verification

cargo build --quiet                             # clean
cargo test --quiet                              # 291 passed (285 existing + 6 new)
cargo clippy --all-targets -- -D warnings       # clean
cargo doc --quiet --no-deps                     # clean
cargo fmt --all --check                         # clean

Summary by CodeRabbit

  • Bug Fixes

    • Improved Gemini CLI session file parsing to robustly handle missing, null, or multimodal content fields without crashing.
  • New Features

    • Added support for flexible content formats in Gemini CLI sessions, including text arrays, single parts, and mixed content types.

Gemini CLI 0.35.x changed the `content` field of chat-session messages
from a plain string to a `PartListUnion` (from `@google/genai`), which
may be a string, a single `Part` object (`{"text": "..."}`), or an
array of `Part` objects (`[{"text": "..."}]`). The previous parser
typed `content: String` and therefore failed on any session from the new
format with:

    Serde("invalid type: sequence, expected a string") at character 0

This commit introduces a small untagged `GeminiCliContent` enum mirroring
the upstream `PartListUnion` shape, plus a `GeminiCliPart` struct that
only consumes `text` (other part kinds — `inlineData`, `fileData`,
`functionCall`, etc. — are silently ignored so schema growth upstream
doesn't break us again). All five `GeminiCliMessage` variants now carry
`Option<GeminiCliContent>` with `#[serde(default)]`, which also handles
`null` and missing content gracefully.

The only live text-consumer (`fallback_session_name` in the User arm)
now calls `GeminiCliContent::as_text()` to build a plain-text preview
across any of the three shapes, preserving the 50-char truncation.

Added 6 regression tests covering:
- array-of-parts on both user and gemini messages
- mixed array with non-text parts (`inlineData`)
- single Part object form
- long text truncation when extracted from an array
- null / missing content
- the exact failure schema from the issue report

Fixes #137

coderabbitai Bot commented Apr 17, 2026

Walkthrough

This PR updates Gemini CLI message parsing to support a schema change where the content field transitioned from a simple string to an array of content objects, enabling multimodal content handling while maintaining backward compatibility.

Changes

  • Gemini CLI Parser Implementation (src/analyzers/gemini_cli.rs): Added GeminiCliContent (untagged union supporting text string, single part, or parts array) and GeminiCliPart types to replace string-only content field. Introduced as_text() method to extract and concatenate text from multimodal content, with fallback to empty string for missing/null content.
  • Gemini CLI Test Suite (src/analyzers/tests/gemini_cli.rs): Added write_session() helper function to reduce test boilerplate. Introduced 6 regression tests covering array-of-parts content, mixed-type parts with non-text data, single-part objects, session-name truncation at 50 characters, missing/null content handling, and issue #137 schema regression validation.

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 Content once a string, now an array of dreams,
With multimodal magic in the Gemini streams,
Our parser adapts, handles parts and text too,
Legacy and new formats—we embrace both for you! ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
  • Description Check: ✅ Passed. Check skipped - CodeRabbit’s high-level summary is enabled.
  • Title check: ✅ Passed. The title 'fix(gemini-cli): accept multi-modal content (string | Part | Part[])' clearly and concisely describes the main change: adding support for multiple content formats in Gemini CLI parsing.
  • Linked Issues check: ✅ Passed. All coding requirements from issue #137 are met: the parser now accepts string, single Part object, and array of Parts formats; provides text extraction via as_text(); and includes comprehensive regression tests covering all three formats.
  • Out of Scope Changes check: ✅ Passed. All changes are directly scoped to issue #137: updates to GeminiCliMessage content handling, addition of GeminiCliContent and GeminiCliPart types, text extraction method, session-name fallback logic, and comprehensive test coverage for the new functionality.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%.


@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (2)
src/analyzers/gemini_cli.rs (1)

99-125: Untagged enum variant ordering is correct; consider one small robustness note.

Serde tries untagged variants in declaration order, so Text (string) → Part (object) → Parts (array) correctly dispatches on the JSON shape. One subtlety worth being aware of: because GeminiCliPart has only optional fields, any JSON object will successfully deserialize into the Part variant (including objects carrying only inlineData/functionCall/etc., which will produce text: None and contribute an empty string via as_text). That matches the stated "tolerate upstream schema growth" goal, but it also means a future scalar variant added to PartListUnion (e.g. a bare number) would be the only shape that actually fails parsing. Nothing to change today — just flagging for future maintainers.

Minor optional polish in as_text for the Parts arm:

♻️ Optional: iterator-based concatenation
-            GeminiCliContent::Parts(ps) => {
-                let mut out = String::new();
-                for p in ps {
-                    if let Some(t) = &p.text {
-                        out.push_str(t);
-                    }
-                }
-                out
-            }
+            GeminiCliContent::Parts(ps) => ps
+                .iter()
+                .filter_map(|p| p.text.as_deref())
+                .collect::<String>(),
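
To check that the suggested change preserves behaviour, here is a small standalone comparison of the two styles. It works on a slice of Option<String> rather than the PR's GeminiCliPart values, and both function names are made up for this illustration.

```rust
/// Loop-based concatenation, mirroring the removed lines of the diff.
pub fn concat_with_loop(texts: &[Option<String>]) -> String {
    let mut out = String::new();
    for t in texts {
        if let Some(s) = t {
            out.push_str(s);
        }
    }
    out
}

/// Iterator-based concatenation, mirroring the added lines of the diff.
/// `as_deref` turns `&Option<String>` into `Option<&str>`, and `filter_map`
/// drops the `None` entries before the pieces are collected into one String.
pub fn concat_with_iter(texts: &[Option<String>]) -> String {
    texts.iter().filter_map(|t| t.as_deref()).collect::<String>()
}
```

Both functions skip None entries and keep the remaining pieces in order, so they return the same String for any input.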
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/analyzers/gemini_cli.rs` around lines 99 - 125, The Parts arm of
GeminiCliContent::as_text manually builds a String by iterating and pushing each
part's text; change it to use iterator-based concatenation for clarity and
brevity: in the method as_text (enum GeminiCliContent) replace the for-loop in
the GeminiCliContent::Parts(ps) branch with an iterator pipeline that filters
Option<&String> values, maps to &str, and collects or folds into a single
String; keep the existing behavior that ignores None text (GeminiCliPart::text)
so non-text parts remain tolerated.
src/analyzers/tests/gemini_cli.rs (1)

71-346: Thorough regression coverage for issue #137.

The six new tests collectively cover: array-of-parts, mixed/unknown parts, single-part object, char-based 50-char truncation from arrays, missing/null content, and the exact failing schema from the issue. The truncation assertion ("This prompt is definitely longer than fifty charac..." at exactly 50 chars + ...) matches the implementation's chars().take(50) semantics — good.
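
The chars().take(50) detail matters because slicing a &str by byte index can panic when the cut falls inside a multi-byte character. A minimal sketch of char-based truncation follows. The function name truncate_preview and its signature are assumptions made for this example, and the real fallback_session_name may be structured differently.

```rust
/// Hypothetical helper: keeps at most `max_chars` Unicode scalar values and
/// appends "..." only when something was actually cut off.
pub fn truncate_preview(text: &str, max_chars: usize) -> String {
    if text.chars().count() > max_chars {
        // Counting by `char` never splits a multi-byte character,
        // unlike a byte slice such as `&text[..max_chars]`.
        let head: String = text.chars().take(max_chars).collect();
        format!("{}...", head)
    } else {
        text.to_string()
    }
}
```

Under these assumptions, a 54-character prompt such as "This prompt is definitely longer than fifty characters" is cut to its first 50 characters followed by "...", while a string of 60 accented characters is truncated without panicking.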

Optional nit: these tests don't actually .await anything, so #[tokio::test] could be replaced with plain #[test] to avoid spinning up a runtime per case. Kept as-is is fine too, for consistency with the pre-existing test_gemini_cli_reasoning_tokens.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/analyzers/tests/gemini_cli.rs` around lines 71 - 346, Several new tests
(test_gemini_cli_content_array_of_parts, test_gemini_cli_content_mixed_parts,
test_gemini_cli_content_single_part_object,
test_gemini_cli_content_array_session_name_truncated,
test_gemini_cli_content_missing_or_null, test_gemini_cli_issue_137_regression)
are marked async and use #[tokio::test] but they never .await; change each to a
synchronous test by replacing #[tokio::test] with #[test] and convert the async
fn signatures to plain fn (remove async) so the runtime isn't unnecessarily
spawned.

ℹ️ Review info

📥 Commits

Reviewing files that changed from the base of the PR and between 8521934 and 0b6d87f.

📒 Files selected for processing (2)
  • src/analyzers/gemini_cli.rs
  • src/analyzers/tests/gemini_cli.rs

@mike1858 mike1858 merged commit 9bf4659 into main Apr 17, 2026
6 checks passed
@mike1858 mike1858 deleted the fix/137-gemini-cli-content-array branch April 17, 2026 01:15
Development

Successfully merging this pull request may close these issues.

Bug: Failed to parse Gemini CLI stats due to Schema change (expected string, found sequence)
