fix: harden OpenAI-compatible parsing for OpenRouter responses#304

Merged
jamiepine merged 2 commits into main from fix/openrouter-empty-response-hardening
Mar 4, 2026
Conversation

@jamiepine
Member

Summary

  • Harden OpenAI-compatible parsing so chat responses no longer get dropped when providers return non-string message.content payloads.
  • Treat reasoning, reasoning_content, and reasoning_details as valid fallback text when content is otherwise empty, and support both tool_calls and legacy function_call outputs.
  • Improve provider-facing diagnostics by using provider-specific labels (instead of hardcoded OpenAI text) and including finish_reason context in empty-response errors.
  • Add targeted unit tests for content arrays, reasoning fallback, legacy function-call parsing, and the richer empty-response error path.
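The reasoning fallback described above can be sketched as follows. This is a minimal, self-contained illustration of the selection order implied by the summary; the function name, parameter shapes, and ordering are assumptions, and the real implementation in src/llm/model.rs operates on serde_json values rather than plain strings.

```rust
// Sketch: when parsed content is empty, fall back to the first non-empty
// reasoning field, checked in the order reasoning, reasoning_content,
// reasoning_details. Names and ordering are assumptions from the PR summary.
fn reasoning_fallback(
    content: &str,
    reasoning: Option<&str>,
    reasoning_content: Option<&str>,
    reasoning_details: Option<&str>,
) -> String {
    if !content.trim().is_empty() {
        return content.to_string();
    }
    [reasoning, reasoning_content, reasoning_details]
        .into_iter()
        .flatten()
        .find(|text| !text.trim().is_empty())
        .map(str::to_string)
        .unwrap_or_default()
}

fn main() {
    // Content is empty, so the first non-empty reasoning field is used.
    let text = reasoning_fallback("", Some("working through the plan"), None, None);
    println!("{text}");
    // Non-empty content always wins over the reasoning fields.
    let direct = reasoning_fallback("hi", Some("plan"), None, None);
    println!("{direct}");
}
```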

Example Response Shape

{
  "choices": [
    {
      "message": {
        "content": [{"type": "output_text", "text": "done"}],
        "reasoning": "working through the plan",
        "function_call": {"name": "reply", "arguments": "{\"content\":\"ok\"}"}
      },
      "finish_reason": "function_call"
    }
  ]
}
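A payload like the one above has content as an array of typed parts rather than a plain string. The recursive text collection that handles this can be sketched as below; to keep the example self-contained it uses a minimal stand-in enum instead of serde_json::Value, and the helper name mirrors (but is not taken verbatim from) collect_openai_text_content.

```rust
// Minimal stand-in for serde_json::Value, just enough to show the recursion.
enum Json {
    Str(String),
    Arr(Vec<Json>),
    Obj(Vec<(String, Json)>),
}

// Recursively collect non-empty text fragments from a message content value,
// which may be a plain string, an array of typed parts, or a nested object.
// Only text-bearing keys are descended into, so a part's "type" tag is not
// mistaken for output text.
fn collect_text_content(value: &Json, parts: &mut Vec<String>) {
    match value {
        Json::Str(text) => {
            if !text.trim().is_empty() {
                parts.push(text.clone());
            }
        }
        Json::Arr(items) => {
            for item in items {
                collect_text_content(item, parts);
            }
        }
        Json::Obj(fields) => {
            for (key, inner) in fields {
                if matches!(key.as_str(), "text" | "summary" | "refusal" | "content") {
                    collect_text_content(inner, parts);
                }
            }
        }
    }
}

fn main() {
    // Mirrors the example payload above: content is an array of typed parts.
    let content = Json::Arr(vec![Json::Obj(vec![
        ("type".into(), Json::Str("output_text".into())),
        ("text".into(), Json::Str("done".into())),
    ])]);
    let mut parts = Vec::new();
    collect_text_content(&content, &mut parts);
    println!("{}", parts.join("\n"));
}
```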

Testing

  • ./scripts/preflight.sh
  • ./scripts/gate-pr.sh

@coderabbitai
Contributor

coderabbitai bot commented Mar 4, 2026

Walkthrough

This change replaces hardcoded "OpenAI" strings with dynamic provider labels in error messages, extends OpenAI response parsing to handle content arrays, legacy tool calls, and reasoning fallbacks, and adds robust text collection and tool argument parsing logic with improved error handling.

Changes

Cohort / File(s): Summary

  • Provider Labeling (src/llm/model.rs): Replaces hardcoded "OpenAI" strings with a dynamic provider_label derived from the provider config throughout error messages, API errors, and response framing in call_openai, call_openai_responses, and related helpers.
  • Response Parsing Enhancement (src/llm/model.rs): Extends parse_openai_response to support content returned as arrays of parts, tool calls via both tool_calls arrays and legacy function_call payloads, and introduces parse_openai_reasoning_fallback for responses where content is empty but reasoning is provided in alternative fields.
  • Parsing Helper Functions (src/llm/model.rs): Adds collect_openai_text_content for recursive text extraction, parse_openai_tool_arguments for robust handling of stringified-JSON and null arguments, and parse_openai_tool_call for deriving a tool call's id, name, and arguments from various payload shapes.
  • Error Handling & Diagnostics (src/llm/model.rs): Improves errors for empty provider responses to include provider_label and finish_reason, adds diagnostic keys for better logging context, and introduces a provider_display_name helper as a fallback for provider labeling.
  • Tests & Validation (src/llm/model.rs): Adds tests covering the new parsing paths (content array parts, reasoning fallbacks, legacy function calls, empty-response error messaging) and model remapping behavior.
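The richer empty-response diagnostics summarized above (provider label plus finish_reason context) can be sketched as a small message-building helper. The function and parameter names here are assumptions for illustration; the real helper lives in src/llm/model.rs and draws the label from the provider config.

```rust
// Sketch: an empty-response error that names the provider and, when the
// provider reported one, includes the finish_reason for debugging context.
fn empty_response_error(provider_label: &str, finish_reason: Option<&str>) -> String {
    match finish_reason {
        Some(reason) => {
            format!("{provider_label} returned an empty response (finish_reason: {reason})")
        }
        None => format!("{provider_label} returned an empty response"),
    }
}

fn main() {
    // With finish_reason context the operator can tell truncation from a
    // genuinely empty reply.
    let msg = empty_response_error("OpenRouter", Some("length"));
    println!("{msg}");
    let bare = empty_response_error("OpenRouter", None);
    println!("{bare}");
}
```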

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 3 passed

  • Title check ✅ Passed — The title 'fix: harden OpenAI-compatible parsing for OpenRouter responses' directly matches the main objective of the PR: improving OpenAI-compatible response parsing to handle non-string content payloads and other edge cases from OpenRouter.
  • Description check ✅ Passed — The description clearly explains the changes: hardening OpenAI parsing for non-string content, supporting reasoning fallback fields, handling both tool_calls and legacy function_call outputs, improving provider labeling in errors, and adding comprehensive tests.
  • Docstring coverage ✅ Passed — No functions found in the changed files to evaluate docstring coverage; check skipped.


@jamiepine jamiepine marked this pull request as ready for review March 4, 2026 00:18
Review comment on parse_openai_tool_arguments:

arguments_field.clone() will pass through non-object JSON (arrays/numbers), which can make downstream tool-arg handling inconsistent. Consider only accepting objects and defaulting to {} otherwise.

Suggested change:

fn parse_openai_tool_arguments(arguments_field: &serde_json::Value) -> serde_json::Value {
    if let Some(raw) = arguments_field.as_str() {
        return serde_json::from_str(raw).unwrap_or_else(|_| serde_json::json!({}));
    }
    if arguments_field.is_object() {
        arguments_field.clone()
    } else {
        serde_json::json!({})
    }
}
@coderabbitai coderabbitai bot left a comment
🧹 Nitpick comments (1)
src/llm/model.rs (1)

1519-1554: Consider adding recursion depth limit for defensive coding.

The recursive collect_openai_text_content function has no depth limit. While real API responses are unlikely to be deeply nested, malformed or adversarial responses could cause stack overflow.

This is a low-risk concern since provider responses are generally well-formed, but a depth guard would add resilience.

🛡️ Optional: Add depth limit for robustness
-fn collect_openai_text_content(value: &serde_json::Value, text_parts: &mut Vec<String>) {
+fn collect_openai_text_content(value: &serde_json::Value, text_parts: &mut Vec<String>) {
+    collect_openai_text_content_inner(value, text_parts, 0);
+}
+
+fn collect_openai_text_content_inner(value: &serde_json::Value, text_parts: &mut Vec<String>, depth: usize) {
+    const MAX_DEPTH: usize = 32;
+    if depth > MAX_DEPTH {
+        return;
+    }
     match value {
         serde_json::Value::String(text) => {
             if !text.trim().is_empty() {
                 text_parts.push(text.to_string());
             }
         }
         serde_json::Value::Array(items) => {
             for item in items {
-                collect_openai_text_content(item, text_parts);
+                collect_openai_text_content_inner(item, text_parts, depth + 1);
             }
         }
         serde_json::Value::Object(map) => {
             // ... existing text/summary/refusal extraction ...

             if let Some(content) = map.get("content") {
-                collect_openai_text_content(content, text_parts);
+                collect_openai_text_content_inner(content, text_parts, depth + 1);
             }
         }
         _ => {}
     }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/llm/model.rs` around lines 1519 - 1554, The recursive function
collect_openai_text_content lacks a depth guard and could overflow on
maliciously deep JSON; modify collect_openai_text_content to accept a
current_depth parameter (or use an internal helper) and a MAX_DEPTH constant
(e.g., 64), immediately return when current_depth >= MAX_DEPTH, and increment
current_depth on every recursive call (for array items and nested "content"
values and object traversal) so nested recursion stops once the depth limit is
reached; keep behavior identical otherwise (still collect strings from "text",
"summary", "refusal", arrays, and content).


📥 Commits

Reviewing files that changed from the base of the PR and between 7b5bb90 and 1bed130.

📒 Files selected for processing (1)
  • src/llm/model.rs

@jamiepine jamiepine merged commit 5d82132 into main Mar 4, 2026
3 checks passed