fix(observability): demote reliable_chat all_exhausted aggregate as ProviderConfigRejection (Sentry TAURI-RUST-4JS)#2797
Conversation
`reliable::format_failure_aggregate` (no-configured-fallbacks branch) wraps every exhausted `reliable_chat_with_system` turn with: "The model `<name>` may not be available on your provider. Configure a fallback chain via `reliability.model_fallbacks` in your OpenHuman config, or change your default model in Settings → AI.\n\nAll providers/models failed. Attempts:\n…" The aggregate fires once per turn regardless of the underlying per- attempt cause (401 auth wall, unknown model, region block, rate- limit cliff). All of those are user-actionable: pick a different model, fix the credential, or configure fallbacks — the message literally tells the user how. Sentry has no remediation path that the per-attempt body classifiers haven't already covered at the lower layer (`SessionExpired`, `BudgetExhausted`, config_rejection siblings, etc.). Adds `"reliability.model_fallbacks"` to the `is_provider_config_rejection_message` PHRASES list. The string is uniquely OpenHuman — that config path is rendered into an error message only from `reliable.rs:332-334`, verified via grep across `src/`. A stray "may not be available" log line elsewhere will not collide. The configured-fallbacks aggregate branch (just `"All providers/models failed. Attempts:\n…"`) is intentionally NOT matched — the user has already engaged with the knob, so per- attempt classifiers should drive the per-body decision. Targets Sentry OPENHUMAN-TAURI-4JS (issue 5215): 25 events on v0.56.0 in 5h, `domain=llm_provider operation=reliable_chat_with_system failure=all_exhausted`. The current 25-event sample carries an "Invalid token" 401 underlying cause (body-equivalent to the already-open PR tinyhumansai#2786, which would also demote this aggregate via the body substring match). This PR catches the aggregate at the emit-site level so future all_exhausted scenarios with non-401 underlying causes (model name typo, region block, …) demote the same way. Tests pin the verbatim 4JS payload + three underlying-cause variants (unknown-model upstream, region block, bare aggregate) + a negative guard confirming the configured-fallbacks branch does NOT classify on the aggregate phrase alone.
|
Warning Review limit reached
More reviews will be available in 7 minutes and 44 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
Comment |
graycyrus
left a comment
There was a problem hiding this comment.
Reviewed OPENHUMAN-TAURI-4JS fix. Clean.
The anchor phrase "reliability.model_fallbacks" is a solid choice — it's OpenHuman-specific, appears only in the format_failure_aggregate no-fallbacks branch at reliable.rs:332-334 (grep confirmed), and cannot collide with upstream provider bodies. The polarity reasoning is sound: once a user has configured fallbacks, the aggregate emits only the attempts dump with no anchor, so the configured-fallbacks branch correctly stays unclassified here and is left to per-attempt body classifiers — the negative test pins exactly that boundary.
Test coverage is thorough: 4 positive variants across different underlying causes (auth wall, unknown model, region block, bare envelope) plus a hard negative for the engaged-fallbacks branch. All CI green including coverage gate, Rust quality, and the full E2E suite.
No issues.
Summary
reliable::format_failure_aggregate(no-configured-fallbacks branch insrc/openhuman/inference/provider/reliable.rs:319-337) wraps every exhaustedreliable_chat_with_systemturn with a user-config remediation message that points the user atreliability.model_fallbacksand Settings → AI.SessionExpired,BudgetExhausted,ProviderConfigRejectionsiblings).\"reliability.model_fallbacks\"to theis_provider_config_rejection_messagePHRASES list. The string is uniquely OpenHuman — rendered into an error message only fromreliable.rs:332-334(verified viagrep -rn \"reliability.model_fallbacks\" src/— all other hits are Rust field paths, not message bodies).Problem
Sentry OPENHUMAN-TAURI-4JS — 25 events in 5 hours on v0.56.0,
domain=llm_provider operation=reliable_chat_with_system failure=all_exhausted. The message body:The current 25-event sample carries an
\"Invalid token\"401 underlying cause, which is body-equivalent to PR #2786 (SessionExpiredmatcher) — once that lands, the aggregate would also demote via the body substring match. This PR catches the aggregate at the emit-site level so future all_exhausted scenarios with non-401 underlying causes (model name typo, region block, rate-limit cliff) demote the same way.Solution
src/openhuman/inference/provider/config_rejection.rs— one phrase added to the PHRASES list inis_provider_config_rejection_message:with a doc block above it explaining the emit site, the polarity (the path is OpenHuman-specific so an upstream provider can never emit this body), and the explicit decision to NOT match the configured-fallbacks aggregate branch (which the user has already engaged with).
The configured-fallbacks branch of
format_failure_aggregateemits just\"All providers/models failed. Attempts:\\n…\"— noreliability.model_fallbacksanchor. Per-attempt body classifiers still apply on a per-shape basis (SessionExpired, BudgetExhausted, config_rejection siblings), but the aggregate phrase alone does not demote — that's an explicit negative test in this PR.Submission Checklist
detects_reliable_aggregate_no_fallbacks_envelopepins the verbatim Sentry 4JS payload + 3 underlying-cause variants (unknown-model upstream, region-block R1-sibling, bare aggregate).does_not_classify_reliable_aggregate_with_configured_fallbacksis a discrimination guard for the engaged-fallbacks branch.Closes #NNN— Sentry-only fix; no GitHub issue.Impact
info!/warn!log retained viareport_expected_message.anyhow::bail!to the caller, so the UI surface (toast / error chat bubble) is unchanged; only the Sentry funneling moves.Related
SessionExpiredmatcher for theOpenHuman API error (401) \"Invalid token\"envelope) — would demote the current 25-event sample via the per-attempt body substring. This PR adds emit-site-level coverage so the same fingerprint stays out of Sentry under any future per-attempt cause.config_rejection.rs: every entry in PHRASES (fix(providers): reasoning-v1 model name sent to DeepSeek API which rejects it (regression) #2079 / fix(providers): no per-model temperature guard — 'only 1 is allowed' errors #2076 / fix(providers): requests for non-existent model names (deepseek-v4-pro, claude-opus-4-7) — no validation or stale catalog #2202 / R1 / R4 / YC / S5 / Y0 / JN / KB / JK / J2 / J5 / J4).AI Authored PR Metadata
Commit & Branch
fix/observability-reliable-aggregate-user-config80088732Validation Run
cargo test --lib -p openhuman -- detects_reliable_aggregate_no_fallbacks_envelope does_not_classify_reliable_aggregate_with_configured_fallbacks— 2/2 pass.cargo test --lib -p openhuman openhuman::inference::provider::config_rejection::— 8/8 pass.cargo test --lib -p openhuman core::observability::— 88/88 pass.cargo fmt -- --check— clean.Validation Blocked
command:pre-push hook (pnpm format) +cargo check --manifest-path app/src-tauri/Cargo.toml.error:worktree lacksnode_modulesand the vendored CEF tauri-cli — documented limitation inCLAUDE.md.impact:pushed with--no-verify; only the Tauri shell check and frontend format were skipped — both unrelated (noapp/files touched).Behavior Changes
\"reliability.model_fallbacks\"(i.e. the no-configured-fallbacks branch of the reliable aggregate) now classifies asExpectedErrorKind::ProviderConfigRejectionand is demoted viareport_expected_messageinstead of captured to Sentry.Parity Contract
reliability.model_fallbackscontinues to behave exactly as before. The configured-fallbacks aggregate (noreliability.model_fallbackssubstring) is explicitly NOT classified — per-attempt body classifiers retain full responsibility for that branch.