fix: harden OpenAI-compatible parsing for OpenRouter responses#304

Merged
jamiepine merged 2 commits into main from fix/openrouter-empty-response-hardening
Mar 4, 2026
Conversation

@jamiepine
Member

Summary

  • Harden OpenAI-compatible parsing so chat responses no longer get dropped when providers return non-string message.content payloads.
  • Treat reasoning, reasoning_content, and reasoning_details as valid fallback text when content is otherwise empty, and support both tool_calls and legacy function_call outputs.
  • Improve provider-facing diagnostics by using provider-specific labels (instead of hardcoded OpenAI text) and including finish_reason context in empty-response errors.
  • Add targeted unit tests for content arrays, reasoning fallback, legacy function-call parsing, and the richer empty-response error path.
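The reasoning fallback described above can be sketched as follows. This is a minimal, self-contained illustration of the selection order implied by the summary; the function name, parameter shapes, and ordering are assumptions, and the real implementation in src/llm/model.rs operates on serde_json values rather than plain strings.

```rust
// Sketch: when parsed content is empty, fall back to the first non-empty
// reasoning field, checked in the order reasoning, reasoning_content,
// reasoning_details. Names and ordering are assumptions from the PR summary.
fn reasoning_fallback(
    content: &str,
    reasoning: Option<&str>,
    reasoning_content: Option<&str>,
    reasoning_details: Option<&str>,
) -> String {
    if !content.trim().is_empty() {
        return content.to_string();
    }
    [reasoning, reasoning_content, reasoning_details]
        .into_iter()
        .flatten()
        .find(|text| !text.trim().is_empty())
        .map(str::to_string)
        .unwrap_or_default()
}

fn main() {
    // Content is empty, so the first non-empty reasoning field is used.
    let text = reasoning_fallback("", Some("working through the plan"), None, None);
    println!("{text}");
    // Non-empty content always wins over the reasoning fields.
    let direct = reasoning_fallback("hi", Some("plan"), None, None);
    println!("{direct}");
}
```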

Example Response Shape

{
  "choices": [
    {
      "message": {
        "content": [{"type": "output_text", "text": "done"}],
        "reasoning": "working through the plan",
        "function_call": {"name": "reply", "arguments": "{\"content\":\"ok\"}"}
      },
      "finish_reason": "function_call"
    }
  ]
}
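A payload like the one above has content as an array of typed parts rather than a plain string. The recursive text collection that handles this can be sketched as below; to keep the example self-contained it uses a minimal stand-in enum instead of serde_json::Value, and the helper name mirrors (but is not taken verbatim from) collect_openai_text_content.

```rust
// Minimal stand-in for serde_json::Value, just enough to show the recursion.
enum Json {
    Str(String),
    Arr(Vec<Json>),
    Obj(Vec<(String, Json)>),
}

// Recursively collect non-empty text fragments from a message content value,
// which may be a plain string, an array of typed parts, or a nested object.
// Only text-bearing keys are descended into, so a part's "type" tag is not
// mistaken for output text.
fn collect_text_content(value: &Json, parts: &mut Vec<String>) {
    match value {
        Json::Str(text) => {
            if !text.trim().is_empty() {
                parts.push(text.clone());
            }
        }
        Json::Arr(items) => {
            for item in items {
                collect_text_content(item, parts);
            }
        }
        Json::Obj(fields) => {
            for (key, inner) in fields {
                if matches!(key.as_str(), "text" | "summary" | "refusal" | "content") {
                    collect_text_content(inner, parts);
                }
            }
        }
    }
}

fn main() {
    // Mirrors the example payload above: content is an array of typed parts.
    let content = Json::Arr(vec![Json::Obj(vec![
        ("type".into(), Json::Str("output_text".into())),
        ("text".into(), Json::Str("done".into())),
    ])]);
    let mut parts = Vec::new();
    collect_text_content(&content, &mut parts);
    println!("{}", parts.join("\n"));
}
```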

Testing

  • ./scripts/preflight.sh
  • ./scripts/gate-pr.sh

@coderabbitai
Contributor

coderabbitai bot commented Mar 4, 2026

Walkthrough

This change replaces hardcoded "OpenAI" strings with dynamic provider labels in error messages, extends OpenAI response parsing to handle content arrays, legacy tool calls, and reasoning fallbacks, and adds robust text collection and tool argument parsing logic with improved error handling.

Changes

Cohort / File(s): Summary

  • Provider Labeling (src/llm/model.rs): Replaces hardcoded "OpenAI" strings with a dynamic provider_label derived from the provider config throughout error messages, API errors, and response framing in call_openai, call_openai_responses, and related helpers.
  • Response Parsing Enhancement (src/llm/model.rs): Extends parse_openai_response to support content returned as arrays of parts, tool calls via both tool_calls arrays and legacy function_call payloads, and introduces parse_openai_reasoning_fallback for responses where content is empty but reasoning is provided in alternative fields.
  • Parsing Helper Functions (src/llm/model.rs): Adds collect_openai_text_content for recursive text extraction, parse_openai_tool_arguments for robust handling of stringified-JSON and null arguments, and parse_openai_tool_call for deriving a tool call's id, name, and arguments from various payload shapes.
  • Error Handling & Diagnostics (src/llm/model.rs): Improves errors for empty provider responses to include provider_label and finish_reason, adds diagnostic keys for better logging context, and introduces a provider_display_name helper as a fallback for provider labeling.
  • Tests & Validation (src/llm/model.rs): Adds tests covering the new parsing paths (content array parts, reasoning fallbacks, legacy function calls, empty-response error messaging) and model remapping behavior.
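The richer empty-response diagnostics summarized above (provider label plus finish_reason context) can be sketched as a small message-building helper. The function and parameter names here are assumptions for illustration; the real helper lives in src/llm/model.rs and draws the label from the provider config.

```rust
// Sketch: an empty-response error that names the provider and, when the
// provider reported one, includes the finish_reason for debugging context.
fn empty_response_error(provider_label: &str, finish_reason: Option<&str>) -> String {
    match finish_reason {
        Some(reason) => {
            format!("{provider_label} returned an empty response (finish_reason: {reason})")
        }
        None => format!("{provider_label} returned an empty response"),
    }
}

fn main() {
    // With finish_reason context the operator can tell truncation from a
    // genuinely empty reply.
    let msg = empty_response_error("OpenRouter", Some("length"));
    println!("{msg}");
    let bare = empty_response_error("OpenRouter", None);
    println!("{bare}");
}
```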

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 3 passed

  • Title check ✅ Passed — The title 'fix: harden OpenAI-compatible parsing for OpenRouter responses' directly matches the main objective of the PR: improving OpenAI-compatible response parsing to handle non-string content payloads and other edge cases from OpenRouter.
  • Description check ✅ Passed — The description clearly explains the changes: hardening OpenAI parsing for non-string content, supporting reasoning fallback fields, handling both tool_calls and legacy function_call outputs, improving provider labeling in errors, and adding comprehensive tests.
  • Docstring coverage ✅ Passed — No functions found in the changed files to evaluate docstring coverage; check skipped.


@jamiepine jamiepine marked this pull request as ready for review March 4, 2026 00:18
Review comment on parse_openai_tool_arguments:

arguments_field.clone() will pass through non-object JSON (arrays/numbers), which can make downstream tool-arg handling inconsistent. Consider only accepting objects and defaulting to {} otherwise.

Suggested change:

fn parse_openai_tool_arguments(arguments_field: &serde_json::Value) -> serde_json::Value {
    if let Some(raw) = arguments_field.as_str() {
        return serde_json::from_str(raw).unwrap_or_else(|_| serde_json::json!({}));
    }
    if arguments_field.is_object() {
        arguments_field.clone()
    } else {
        serde_json::json!({})
    }
}
@coderabbitai coderabbitai bot left a comment
🧹 Nitpick comments (1)
src/llm/model.rs (1)

1519-1554: Consider adding recursion depth limit for defensive coding.

The recursive collect_openai_text_content function has no depth limit. While real API responses are unlikely to be deeply nested, malformed or adversarial responses could cause stack overflow.

This is a low-risk concern since provider responses are generally well-formed, but a depth guard would add resilience.

🛡️ Optional: Add depth limit for robustness
-fn collect_openai_text_content(value: &serde_json::Value, text_parts: &mut Vec<String>) {
+fn collect_openai_text_content(value: &serde_json::Value, text_parts: &mut Vec<String>) {
+    collect_openai_text_content_inner(value, text_parts, 0);
+}
+
+fn collect_openai_text_content_inner(value: &serde_json::Value, text_parts: &mut Vec<String>, depth: usize) {
+    const MAX_DEPTH: usize = 32;
+    if depth > MAX_DEPTH {
+        return;
+    }
     match value {
         serde_json::Value::String(text) => {
             if !text.trim().is_empty() {
                 text_parts.push(text.to_string());
             }
         }
         serde_json::Value::Array(items) => {
             for item in items {
-                collect_openai_text_content(item, text_parts);
+                collect_openai_text_content_inner(item, text_parts, depth + 1);
             }
         }
         serde_json::Value::Object(map) => {
             // ... existing text/summary/refusal extraction ...

             if let Some(content) = map.get("content") {
-                collect_openai_text_content(content, text_parts);
+                collect_openai_text_content_inner(content, text_parts, depth + 1);
             }
         }
         _ => {}
     }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/llm/model.rs` around lines 1519 - 1554, The recursive function
collect_openai_text_content lacks a depth guard and could overflow on
maliciously deep JSON; modify collect_openai_text_content to accept a
current_depth parameter (or use an internal helper) and a MAX_DEPTH constant
(e.g., 64), immediately return when current_depth >= MAX_DEPTH, and increment
current_depth on every recursive call (for array items and nested "content"
values and object traversal) so nested recursion stops once the depth limit is
reached; keep behavior identical otherwise (still collect strings from "text",
"summary", "refusal", arrays, and content).


📥 Commits

Reviewing files that changed from the base of the PR and between 7b5bb90 and 1bed130.

📒 Files selected for processing (1)
  • src/llm/model.rs

@jamiepine jamiepine merged commit 5d82132 into main Mar 4, 2026
3 checks passed