fix(observability): classify provider config-rejection 4xx as expected user-state (#2079)#2239
Conversation
…d user-state (tinyhumansai#2079) Custom cloud providers (custom_openai -> DeepSeek / OpenRouter / Moonshot) reject requests whose model id or sampling params they don't accept: an OpenHuman abstract tier alias leaked to a provider that only speaks native ids (tinyhumansai#2079), an unknown / stale model pin (tinyhumansai#2202), or a model-specific temperature constraint (tinyhumansai#2076, Moonshot Kimi K2). These are deterministic user-configuration state, already surfaced in the UI, but every agent turn produced a fresh Sentry event (OPENHUMAN-TAURI-WJ / -QW / -HB / -NH, ~273 events) at BOTH the provider HTTP layer (domain=llm_provider) and the re-report from agent.run_single / web_channel.run_chat_task (domain=agent) -- before_send only drops transient statuses, not 400/404. - New is_provider_config_rejection_message predicate (sibling of billing_error), tight phrase list tied to the exact Sentry bodies. - ops::api_error: provider-aware demotion with inverted polarity -- a model-rejection from the OpenHuman backend stays Sentry-actionable (that would be a real regression we sent our own backend). - observability::ExpectedErrorKind::ProviderConfigRejection wired into expected_error_kind so the agent / web-channel re-report is demoted too (the dominant flood). - web::classify_inference_error maps these to an actionable model_unavailable message instead of the generic "inference" bucket. Purely additive, mirrors the budget-exhaustion demotion pattern. 11 new unit tests incl. inverted-polarity and scope guards. Closes tinyhumansai#2079
…ath (tinyhumansai#2079) Exercise the diff-coverage gate's remaining changed lines: - report_error_or_expected_does_not_panic now drives the expected_error_kind ProviderConfigRejection branch and the report_expected_message skip-log arm (the agent / web-channel re-report demotion path). - log_helper_runs_without_panicking covers log_provider_config_rejection (the provider-layer api_error demotion log).
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughThis PR adds a unified provider configuration rejection detection and suppression mechanism. A new ChangesProvider Configuration Rejection Detection and Suppression
Sequence DiagramsequenceDiagram
participant Client
participant InferenceAPI as API<br/>(ops.rs)
participant Classifier as Classifier<br/>(config_rejection)
participant Logger as Logger<br/>(INFO)
participant Sentry as Sentry
Client->>InferenceAPI: invoke provider API
InferenceAPI->>InferenceAPI: receive 4xx response
InferenceAPI->>Classifier: is_provider_config_rejection_message()
alt config rejection detected
Classifier-->>InferenceAPI: true
InferenceAPI->>Logger: log_provider_config_rejection()
Logger-->>InferenceAPI: emitted INFO event
InferenceAPI-->>Client: return error (no Sentry)
else transient/generic error
Classifier-->>InferenceAPI: false
InferenceAPI->>Sentry: report to Sentry
Sentry-->>InferenceAPI: acknowledged
InferenceAPI-->>Client: return error
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/openhuman/channels/providers/web.rs`:
- Around line 303-317: The provider-config rejection check
(crate::openhuman::inference::provider::is_provider_config_rejection_message) is
placed after the broader "model_unavailable" branch so some provider-specific
messages fall through; move the entire else-if branch that returns
("model_unavailable", with_provider_detail(..., err)) using
is_provider_config_rejection_message(err) so it appears before the generic
model-unavailable arm, ensuring provider-config rejection messages get the
specific "Settings → LLM" remediation; keep references to with_provider_detail
and the "model_unavailable" tuple intact when relocating the block.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 029facd0-8f2d-4a03-9925-872c7c6ee7aa
📒 Files selected for processing (6)
src/core/observability.rssrc/openhuman/channels/providers/web.rssrc/openhuman/channels/providers/web_tests.rssrc/openhuman/inference/provider/config_rejection.rssrc/openhuman/inference/provider/mod.rssrc/openhuman/inference/provider/ops.rs
…available (tinyhumansai#2079) CodeRabbit review on tinyhumansai#2239: the is_provider_config_rejection_message arm ran after the broader model-unavailable branch, so config-rejection bodies that also contain "model"/"does not exist"/"does not have access" (e.g. a stale model pin / model_not_found) fell through to the generic "Check your model settings" copy instead of the actionable "Settings -> LLM" remediation. Move the specific arm above the generic one. Updates the pre-existing classify_inference_error_quotes_model_unavailable_detail test to assert the new (intended) remediation for that stale-pin body.
graycyrus
left a comment
There was a problem hiding this comment.
Review — graycyrus
Walkthrough: Solid, well-structured observability fix that demotes deterministic provider config-rejection errors (~273 Sentry events across OPENHUMAN-TAURI-WJ/-QW/-HB/-NH) to info logs, following the exact same pattern as the existing budget-exhaustion demotion. The inverted-polarity contract is correct — same body from the OpenHuman backend still reaches Sentry. The user-facing classification to model_unavailable with actionable "Settings → LLM" copy is a nice UX improvement.
Assessment: Clean. No critical or major issues found.
| File | Change | Description |
|---|---|---|
config_rejection.rs |
New | Core predicate — tight phrase list, case-insensitive, well-documented polarity contract |
ops.rs |
Modified | HTTP-layer integration — is_provider_config_rejection_http (status + provider + body), new branch in api_error |
observability.rs |
Modified | ProviderConfigRejection variant + expected_error_kind classifier + report_expected_message info log arm |
web.rs |
Modified | New classify_inference_error arm for actionable user message |
web_tests.rs |
Modified | Tests for config-rejection classification + updated existing model-unavailable test |
mod.rs |
Modified | Module registration + re-export |
CodeRabbit dedup: Their one finding (arm ordering in web.rs) was addressed in the latest commit — skipped.
What I liked:
- The phrase list in
config_rejection.rsis deliberately tight with clear provenance (each phrase tied to a specific Sentry issue) - Comprehensive test matrix: 13 tests covering detection, case insensitivity, negatives, inverted polarity, transient/5xx guards, and smoke
- The
"is an abstract tier"forward-looking phrase keeps the classifier stable across the planned tier-resolution fix - Diff coverage gate passes (≥80%)
LGTM — no blocking issues.
graycyrus
left a comment
There was a problem hiding this comment.
Looks good, nice work!
Summary
custom_openai-> DeepSeek / OpenRouter / Moonshot) reject requests whose model id or sampling params they do not accept. These are deterministic user-configuration state, already surfaced in the UI, but every agent turn produced a fresh Sentry event (OPENHUMAN-TAURI-WJ / -QW / -HB / -NH, ~273 events).is_provider_config_rejection_messageclassifier (sibling ofbilling_error) and demotes these from Sentry to an info log at both emission points, mirroring the existing budget-exhaustion pattern.model_unavailablemessage ("fix your model/routing in Settings -> LLM") instead of the generic "inference" error.Problem
#2079(abstract tierreasoning-v1/chat-v1leaked to a provider that only speaks native ids),#2076(Moonshot Kimi K2 only acceptstemperature: 1), and#2202(unknown / stale model pin) all surface as upstream4xx invalid_requestbodies like:These reach Sentry at two points, both unfiltered (the
before_sendfilter only drops transientdomain=llm_providerstatuses - 400/404 are not transient, matching the issues'domain=llm_provideranddomain=agenttags):providers::ops::api_error(domain=llm_provider, operation=api_error) -should_report_provider_http_failureis status-based, so a 400 reports.agent.run_single/web_channel.run_chat_task(domain=agent/web_channel) -expected_error_kindhad no matcher for these bodies, soreport_error_or_expectedfell through toreport_error.This is the same class as budget-exhaustion, which the codebase already demotes - the matcher just did not exist for model/param rejection.
Solution
inference/provider/config_rejection.rs- pureis_provider_config_rejection_messagepredicate, deliberately tight phrase list tied to the exact Sentry bodies. The phrases are emitted only by third-party upstream APIs (the OpenHuman backend resolves tier names natively and never emits them), so the message-level predicate is intrinsically scoped to custom providers.ops::api_error-is_provider_config_rejection_http(4xx in {400,404,422} ANDprovider != openhuman_backend::PROVIDER_LABEL) +log_provider_config_rejection, a new branch mirroringis_budget_exhausted_http_400. The non-backend guard is the inverted-polarity mirror of the existing 401/403-backend rule.core/observability.rs-ExpectedErrorKind::ProviderConfigRejection, wired intoexpected_error_kindandreport_expected_message. This is the central fix: it catches the dominant re-report flood from everyreport_error_or_expectedcaller (agent, web-channel, cron, scheduler).channels/providers/web::classify_inference_error- new arm reusing the shared predicate so the user gets an actionablemodel_unavailablemessage instead of the generic "inference" bucket.Deliberate scope: the 5 duplicated decision-tree blocks in
compatible.rs(operationsresponses_api/chat_completions) were intentionally left untouched - the issues cite onlyoperation=api_error(handled) andoperation=native_chat(handled centrally viaexpected_error_kind, since those bailed errors propagate to the agent layer and are re-reported there). Flagged as a follow-up cleanup rather than duplicating the branch 5x.Submission Checklist
expected_error_kindclassify + over-match guard,report_error_or_expecteddemotion path,web::classify_inference_erroractionable mapping, log-helper smoke.api_errorvar-assignment/branch wiring lines (which need a livereqwest::Response) are uncovered, well under 20% of changed executable lines.cargo fmt --checkclean,cargo clippy --libzero issues in changed files.## Related- N/A: no matrix feature touched.Closes #NNN- see Related.Impact
Errunchanged (retry/fallback unaffected); only the per-attempt Sentry report is gated, exactly like the budget/transient/session-expired precedents.Related
compatible.rs(operationsresponses_api/chat_completions) behind one shared helper that also accounts for config-rejection.AI Authored PR Metadata (required for Codex/Linear PRs)
Linear Issue
Commit & Branch
fix/2079-provider-config-rejection-classifier7a3c7edbValidation Run
pnpm --filter openhuman-app format:check- N/A: noapp/*files changed.pnpm typecheck- N/A: no TypeScript changed.cargo test --lib config_rejection report_error_or_expected_does_not_panic-> 13 passed, 0 failed.cargo fmt --checkclean;cargo check --libexit 0;cargo clippy --libzero issues in changed files.app/src-taurifiles changed.Validation Blocked
command:noneerror:none - fullcargo check --lib, targetedcargo test --lib, andcargo clippy --liball ran cleanly on the dev host.impact:none. CI runs the full suite + diff-cover.Behavior Changes
model_unavailablewith actionable copy.Parity Contract
Errunchanged (retry/fallback/dispatch untouched); only the Sentry report is gated, identical contract to the existing budget/transient/session-expired demotions.openhuman_backend::PROVIDER_LABELstill reports; 5xx / transient-429 / generic-4xx guards pin that real bugs are not silenced.Duplicate / Superseded PR Handling
Summary by CodeRabbit
Bug Fixes
Chores