fix(memory): add fallback model chain for unavailable GMI models#1704
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughCloudChatProvider::chat_for_json now tries the configured model, and on "not available for your organization" errors iterates an ordered FALLBACK_MODELS list (skipping duplicates), returning on first success or bailing with an explicit message if all models are unavailable. ChangesModel Fallback and Unavailability Detection
Sequence Diagram(s)sequenceDiagram
participant Caller
participant chat_for_json
participant try_model
participant OpenHumanBackendProvider
Caller->>chat_for_json: request summarization (configured model)
chat_for_json->>try_model: try configured model
try_model->>OpenHumanBackendProvider: chat_with_history(model)
OpenHumanBackendProvider-->>try_model: error (model unavailable)
try_model-->>chat_for_json: Err(Unavailable)
chat_for_json->>chat_for_json: is_model_unavailable_error? true
chat_for_json->>try_model: try fallback model A
try_model->>OpenHumanBackendProvider: chat_with_history(fallback A)
OpenHumanBackendProvider-->>try_model: Ok(response)
try_model-->>chat_for_json: Ok(response)
chat_for_json-->>Caller: return response
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
src/openhuman/memory/tree/ingest.rs (1)
556-561: 💤 Low valueConsider tightening the ASCII test assertion.
For pure ASCII input, every byte is already a char boundary, so
floor_char_boundarywill return exactlylen - 2048, producing a result of exactly 2048 bytes. The current range assertionpreview.len() <= 2048 + 3is correct but looser than necessary for this test case.🔍 Suggested refinement for test precision
fn body_preview_long_ascii_truncates_to_trailing_bytes() { let long = "A".repeat(4096); let preview = super::build_body_preview(&long); - assert!(preview.len() >= 2048); - assert!(preview.len() <= 2048 + 3); // at most 3 extra bytes from boundary rounding + // Pure ASCII: every byte is a char boundary, so we get exactly 2048 bytes + assert_eq!(preview.len(), 2048); }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/memory/tree/ingest.rs` around lines 556 - 561, The test body_preview_long_ascii_truncates_to_trailing_bytes is too loose for ASCII input; since ASCII characters are single-byte, build_body_preview(long) should produce exactly 2048 bytes. Update the assertion in that test (function body_preview_long_ascii_truncates_to_trailing_bytes) to assert equality (preview.len() == 2048) instead of the current range check, keeping references to build_body_preview and the ASCII long string setup.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/openhuman/memory/tree/chat/cloud.rs`:
- Around line 129-136: Add a grep-friendly tracing/log line before each terminal
return in the cloud chat error branches: in the Err(e) case inside the cloud
chat request closure (the block that calls Err(e).with_context and references
prompt.kind and self.model) and in the final "all fallbacks unavailable" exit
path; use tracing::debug or tracing::trace to log a concise, searchable message
that includes the failure context (prompt.kind, self.model, and the
error/summary) right before returning so the terminal branches follow the repo
logging standard.
- Around line 33-35: The current is_model_unavailable_error function treats any
message containing "model" and "404" as an unavailable-model case which is too
broad; update is_model_unavailable_error to only return true for the explicit
"not available for your organization" phrase OR for error strings that clearly
indicate the model resource was not found/unavailable (e.g., match a tighter
pattern such as "model.*not found" or provider-specific "model .* not found|does
not exist|is not available" using a regex or explicit substrings) rather than
any generic "404" alongside "model", and add a unit test that verifies a generic
404 error message (containing "404" and "model" but not the precise unavailable
phrasing) returns false to prevent regression.
---
Nitpick comments:
In `@src/openhuman/memory/tree/ingest.rs`:
- Around line 556-561: The test
body_preview_long_ascii_truncates_to_trailing_bytes is too loose for ASCII
input; since ASCII characters are single-byte, build_body_preview(long) should
produce exactly 2048 bytes. Update the assertion in that test (function
body_preview_long_ascii_truncates_to_trailing_bytes) to assert equality
(preview.len() == 2048) instead of the current range check, keeping references
to build_body_preview and the ASCII long string setup.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: ccc73f0a-9ca8-46f0-8f8c-4bbad2d05102
📒 Files selected for processing (2)
src/openhuman/memory/tree/chat/cloud.rssrc/openhuman/memory/tree/ingest.rs
|
@Sathvik-1007 please resolve merge conflicts before review. |
eea811e to
9401748
Compare
9401748 to
9af4136
Compare
When configured cloud_llm_model returns 404 'not available for your organization', try FALLBACK_MODELS list before failing. Prevents summarization pipeline from blocking entirely on model provisioning. - Tighten is_model_unavailable_error to only match explicit phrase - Add log::warn at all terminal error paths per repo logging standard - Add negative test: generic 404 must NOT trigger fallback chain Closes tinyhumansai#1598
9af4136 to
1dbb797
Compare
Summary
Problem
deepseek-ai/DeepSeek-V4-Flash(or any configured model) isn't provisioned for the org.Solution
FALLBACK_MODELSconstant with known working summarization models (summarization-v1,deepseek-ai/DeepSeek-V3-0324,deepseek-ai/DeepSeek-V3).is_model_unavailable_error()helper that detects the 404 "not available" pattern.chat_for_json: try configured model first, on unavailable error iterate fallbacks (skipping configured model), return first success or bail with clear message.Submission Checklist
Impact
Related
AI Authored PR Metadata (required for Codex/Linear PRs)
Summary by CodeRabbit
Bug Fixes
Tests
Documentation