fix(desktop-backend): repair truncated Gemini JSON responses causing 500s#5957
Conversation
Gemini frequently returns truncated JSON when hitting `max_output_tokens`, causing "EOF while parsing a string" errors and 500s on `/v1/conversations` for all users (500+ errors/day in prod).

Changes:
- Add `parse_or_repair_json()` that detects truncated JSON and closes open strings, brackets, and braces to recover partial results (a sketch of this approach follows below)
- Apply repair to all 7 JSON parse sites (structure, brief structure, action items, memories, requires_context, date range, knowledge graph)
- Make action items extraction non-fatal (like memories already was) so a parse failure doesn't block the entire conversation from being saved
- Add retry with doubled `max_tokens` for structure extraction on first parse failure

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
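For orientation, here is a rough, self-contained sketch of the scan-and-close repair described above. It is illustrative rather than the PR's exact code: the signature, error wording, and the trailing-comma trim are assumptions layered on the behavior the description and review excerpts show.

```rust
use serde::de::DeserializeOwned;

/// Illustrative sketch (not the PR's exact implementation): parse `response`,
/// and if that fails, close any open string and unbalanced brackets/braces,
/// then parse the repaired text once more.
fn parse_or_repair_json<T: DeserializeOwned>(response: &str, label: &str) -> Result<T, String> {
    if let Ok(value) = serde_json::from_str::<T>(response) {
        return Ok(value);
    }

    let trimmed = response.trim();
    let mut in_string = false;
    let mut escape_next = false;
    let mut stack: Vec<char> = Vec::new(); // closers we still owe, innermost last

    for ch in trimmed.chars() {
        if escape_next {
            escape_next = false;
            continue;
        }
        if ch == '\\' && in_string {
            escape_next = true;
            continue;
        }
        if ch == '"' {
            in_string = !in_string;
            continue;
        }
        if in_string {
            continue;
        }
        match ch {
            '{' => stack.push('}'),
            '[' => stack.push(']'),
            '}' | ']' => { stack.pop(); }
            _ => {}
        }
    }

    let mut repaired = trimmed.to_string();
    if in_string {
        // Close the string that was cut off mid-way.
        repaired.push('"');
    } else if repaired.ends_with(',') {
        // serde_json rejects trailing commas; drop one left at the cut point.
        repaired.pop();
    }
    // Close any still-open containers, innermost first.
    while let Some(closer) = stack.pop() {
        repaired.push(closer);
    }

    serde_json::from_str::<T>(&repaired)
        .map_err(|e| format!("Failed to parse {} response after repair: {}", label, e))
}
```

A call site then deserializes into its own response type, e.g. `parse_or_repair_json::<StructureResponse>(&response, "structure")`, as the retry snippet in the review below shows.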
Greptile Summary

This PR adds a `parse_or_repair_json()` helper that detects and repairs truncated Gemini JSON responses, applies it across the LLM client's parse sites, makes action-item extraction non-fatal, and adds a doubled-token retry for structure extraction.
Minor issues to address before merge:
Confidence Score: 4/5
Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[LLM Response from Gemini] --> B{parse_or_repair_json}
    B -->|Valid JSON| C[Deserialize T ✓]
    B -->|Parse fails| D[Scan chars: track in_string / stack]
    D --> E{Truncated?}
    E -->|Yes — build suffix| F["Append close-string + close brackets"]
    F --> G{Re-parse repaired JSON}
    G -->|OK| H[log info: Repaired truncated JSON]
    H --> C
    G -->|Still fails| I[Return Err with raw response]
    E -->|Empty / no suffix needed| I
    I --> J{Which call site?}
    J -->|extract_structure| K["retry with 3000 tokens (same prompt)"]
    K --> L{parse_or_repair_json again}
    L -->|OK| C
    L -->|Fails| M[Propagate error → 500]
    J -->|extract_action_items| N[warn + return vec empty — non-fatal]
    J -->|extract_memories| N
    J -->|requires_context / date_range / knowledge_graph| M
```
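The non-fatal branch in the flowchart (warn and return an empty vec) is what keeps one failed extraction from blocking the whole conversation. A minimal sketch of that pattern, using illustrative names (`ActionItem`, `extract_action_items`) and the helper sketched earlier:

```rust
#[derive(serde::Deserialize)]
struct ActionItem {
    description: String,
}

// Parse failures are logged and degraded to an empty list instead of being
// propagated, so conversation processing continues. Memories already worked
// this way; the PR extends the same treatment to action items.
fn extract_action_items(llm_response: &str) -> Vec<ActionItem> {
    match parse_or_repair_json::<Vec<ActionItem>>(llm_response, "action_items") {
        Ok(items) => items,
        Err(e) => {
            tracing::warn!("Action items parse failed, continuing without them: {}", e);
            Vec::new()
        }
    }
}
```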
Reviews (1): Last reviewed commit: "fix(desktop-backend): repair truncated G..."
```rust
let mut last_was_string_content = false;

for ch in trimmed.chars() {
    if escape_next {
        escape_next = false;
        continue;
    }
    if ch == '\\' && in_string {
        escape_next = true;
        continue;
    }
    if ch == '"' {
        in_string = !in_string;
        last_was_string_content = false;
        continue;
    }
    if in_string {
        last_was_string_content = true;
        continue;
    }
    last_was_string_content = false;
    match ch {
        '{' => stack.push('}'),
        '[' => stack.push(']'),
        '}' | ']' => { stack.pop(); }
        _ => {}
    }
}
```
last_was_string_content is unused dead code
`last_was_string_content` is declared and written to in multiple branches, but never read in any conditional check or returned value. The Rust compiler will emit an `unused_assignments`/unused-variable warning for this, and if `#![deny(warnings)]` or `RUSTFLAGS="-D warnings"` is set in CI, that warning becomes a compile error.
Suggested change (drop the variable entirely):

```rust
let mut in_string = false;
let mut escape_next = false;
let mut stack: Vec<char> = Vec::new();
for ch in trimmed.chars() {
    if escape_next {
        escape_next = false;
        continue;
    }
    if ch == '\\' && in_string {
        escape_next = true;
        continue;
    }
    if ch == '"' {
        in_string = !in_string;
        continue;
    }
    if in_string {
        continue;
    }
    match ch {
        '{' => stack.push('}'),
        '[' => stack.push(']'),
        '}' | ']' => { stack.pop(); }
        _ => {}
    }
}
```
```rust
Err(format!(
    "Failed to parse {} response: {} - {}",
    label,
    serde_json::from_str::<serde_json::Value>(response)
        .err()
        .map(|e| e.to_string())
        .unwrap_or_else(|| "type mismatch".to_string()),
    response
))
```
Full LLM response logged in error messages
The error returned by `parse_or_repair_json` embeds the raw LLM response (potentially kilobytes of transcript data). When this is promoted to a `tracing::warn!` log — e.g., at the retry site:

```rust
tracing::warn!("Structure parse failed, retrying with 3000 tokens: {}", first_err);
```

— the complete truncated response lands in structured logs. This increases log volume significantly and may inadvertently surface sensitive transcript content in log aggregators and monitoring systems.
Consider truncating the response in the error message (e.g., limit to first 200 chars) and logging the full response only at tracing::debug! level:
```rust
Err(format!(
    "Failed to parse {} response: {} - {}…",
    label,
    serde_json::from_str::<serde_json::Value>(response)
        .err()
        .map(|e| e.to_string())
        .unwrap_or_else(|| "type mismatch".to_string()),
    // take at most 200 chars (not bytes) so a multi-byte UTF-8 char can't cause a slice panic
    response.chars().take(200).collect::<String>()
))
```
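To keep the full payload available for diagnosis without inflating warn-level logs, the same spot could also emit a debug-level event. A rough sketch, assuming the crate already depends on `tracing` (the field names are illustrative):

```rust
// Full response only at debug verbosity; the Err value above stays small.
tracing::debug!(
    label,
    response_len = response.len(),
    raw_response = %response,
    "LLM response failed to parse; full payload logged at debug level"
);
```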
```rust
// Try parsing, and if it fails (truncated JSON), retry with more tokens
let result: StructureResponse = match parse_or_repair_json(&response, "structure") {
    Ok(r) => r,
    Err(first_err) => {
        tracing::warn!("Structure parse failed, retrying with 3000 tokens: {}", first_err);
        let retry_response = self.call_with_schema(&prompt, Some(0.7), Some(3000), Some(schema)).await?;
        parse_or_repair_json(&retry_response, "structure (retry)")?
    }
};
```
Retry on parse failure may obscure non-truncation errors
The retry is triggered on any parse_or_repair_json failure, not just truncation failures. For example, if Gemini returns a structurally wrong schema (e.g., a field with an unexpected type), the retry will also fire, burning an extra LLM call and doubling latency before propagating the same error.
If you want to limit retries to genuine truncation scenarios, you could expose a separate was_truncated flag from parse_or_repair_json and only retry when that flag is set — or check the finish reason from the Gemini API response before retrying.
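A possible shape for that flag, sketched with illustrative names (`ParseFailure`, `parse_or_repair_json_flagged`); the repair pass is elided here and `serde_json::Error::is_eof()` stands in as the truncation heuristic:

```rust
use serde::de::DeserializeOwned;

// Illustrative sketch, not the PR's API: carry a truncation flag so call sites
// can decide whether a doubled-token retry is worth an extra LLM call.
struct ParseFailure {
    was_truncated: bool,
    message: String,
}

fn parse_or_repair_json_flagged<T: DeserializeOwned>(
    response: &str,
    label: &str,
) -> Result<T, ParseFailure> {
    // (repair pass from the PR elided for brevity)
    serde_json::from_str::<T>(response).map_err(|e| ParseFailure {
        // serde_json reports EOF when the input ends mid-value, which is the
        // usual signature of a max_output_tokens cutoff.
        was_truncated: e.is_eof(),
        message: format!("Failed to parse {} response: {}", label, e),
    })
}
```

The structure-extraction call site could then retry only on `Err(e) if e.was_truncated`, or skip the retry entirely when the Gemini candidate's finish reason already indicates a token-limit cutoff.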
…500s (BasedHardware#5957)

## Summary
- Gemini frequently returns truncated JSON when hitting `max_output_tokens`, causing `EOF while parsing a string` serde errors and **500s on `/v1/conversations`** for all users (~500+ errors/day in prod)
- Added `parse_or_repair_json()` helper that detects truncated JSON and closes open strings/brackets/braces to recover partial results
- Applied to all 7 JSON parse sites in the LLM client
- Made action items extraction non-fatal (like memories already was) — parse failure no longer blocks the conversation from being saved
- Added retry with doubled `max_tokens` for structure extraction on first parse failure

## Test plan
- [ ] `cargo build` passes (verified locally)
- [ ] Deploy to dev and verify 500 rate drops on `/v1/conversations/from-segments`
- [ ] Verify conversations still process correctly with valid Gemini responses
- [ ] Verify truncated responses are repaired (check logs for "Repaired truncated" messages)

🤖 Generated with [Claude Code](https://claude.com/claude-code)