Skip to content

fix(observability): classify provider config-rejection 4xx as expected user-state (#2079)#2239

Merged
graycyrus merged 3 commits into
tinyhumansai:mainfrom
sanil-23:fix/2079-provider-config-rejection-classifier
May 19, 2026
Merged

fix(observability): classify provider config-rejection 4xx as expected user-state (#2079)#2239
graycyrus merged 3 commits into
tinyhumansai:mainfrom
sanil-23:fix/2079-provider-config-rejection-classifier

Conversation

@sanil-23
Copy link
Copy Markdown
Contributor

@sanil-23 sanil-23 commented May 19, 2026

Summary

  • Custom cloud providers (custom_openai -> DeepSeek / OpenRouter / Moonshot) reject requests whose model id or sampling params they do not accept. These are deterministic user-configuration state, already surfaced in the UI, but every agent turn produced a fresh Sentry event (OPENHUMAN-TAURI-WJ / -QW / -HB / -NH, ~273 events).
  • Adds a tight is_provider_config_rejection_message classifier (sibling of billing_error) and demotes these from Sentry to an info log at both emission points, mirroring the existing budget-exhaustion pattern.
  • Provider-aware, inverted polarity: the same body from the OpenHuman backend stays Sentry-actionable (that would be a real regression - we sent our own backend a bad request).
  • The user now sees an actionable model_unavailable message ("fix your model/routing in Settings -> LLM") instead of the generic "inference" error.
  • Purely additive (+~430 incl. tests), no behavior removed.

Problem

#2079 (abstract tier reasoning-v1/chat-v1 leaked to a provider that only speaks native ids), #2076 (Moonshot Kimi K2 only accepts temperature: 1), and #2202 (unknown / stale model pin) all surface as upstream 4xx invalid_request bodies like:

The supported API model names are deepseek-v4-pro or deepseek-v4-flash, but you passed reasoning-v1.
invalid temperature: only 1 is allowed for this model
Model 'claude-opus-4-7' is not available. Use GET /openai/v1/models to list available models.

These reach Sentry at two points, both unfiltered (the before_send filter only drops transient domain=llm_provider statuses - 400/404 are not transient, matching the issues' domain=llm_provider and domain=agent tags):

  1. providers::ops::api_error (domain=llm_provider, operation=api_error) - should_report_provider_http_failure is status-based, so a 400 reports.
  2. The re-report from agent.run_single / web_channel.run_chat_task (domain=agent/web_channel) - expected_error_kind had no matcher for these bodies, so report_error_or_expected fell through to report_error.

This is the same class as budget-exhaustion, which the codebase already demotes - the matcher just did not exist for model/param rejection.

Solution

  • New inference/provider/config_rejection.rs - pure is_provider_config_rejection_message predicate, deliberately tight phrase list tied to the exact Sentry bodies. The phrases are emitted only by third-party upstream APIs (the OpenHuman backend resolves tier names natively and never emits them), so the message-level predicate is intrinsically scoped to custom providers.
  • ops::api_error - is_provider_config_rejection_http (4xx in {400,404,422} AND provider != openhuman_backend::PROVIDER_LABEL) + log_provider_config_rejection, a new branch mirroring is_budget_exhausted_http_400. The non-backend guard is the inverted-polarity mirror of the existing 401/403-backend rule.
  • core/observability.rs - ExpectedErrorKind::ProviderConfigRejection, wired into expected_error_kind and report_expected_message. This is the central fix: it catches the dominant re-report flood from every report_error_or_expected caller (agent, web-channel, cron, scheduler).
  • channels/providers/web::classify_inference_error - new arm reusing the shared predicate so the user gets an actionable model_unavailable message instead of the generic "inference" bucket.

Deliberate scope: the 5 duplicated decision-tree blocks in compatible.rs (operations responses_api/chat_completions) were intentionally left untouched - the issues cite only operation=api_error (handled) and operation=native_chat (handled centrally via expected_error_kind, since those bailed errors propagate to the agent layer and are re-reported there). Flagged as a follow-up cleanup rather than duplicating the branch 5x.

Submission Checklist

  • Tests added or updated (happy path + at least one failure / edge case) - 13 unit tests across 4 files: predicate phrase detection + negatives, provider-aware inverted-polarity (backend same body NOT suppressed), 5xx/transient/unrelated-4xx guards, expected_error_kind classify + over-match guard, report_error_or_expected demotion path, web::classify_inference_error actionable mapping, log-helper smoke.
  • Diff coverage >= 80% - every new predicate/branch/arm is exercised; only ~3 api_error var-assignment/branch wiring lines (which need a live reqwest::Response) are uncovered, well under 20% of changed executable lines. cargo fmt --check clean, cargo clippy --lib zero issues in changed files.
  • Coverage matrix updated - N/A: observability/error-classification change, no added/removed/renamed feature row.
  • All affected feature IDs listed under ## Related - N/A: no matrix feature touched.
  • No new external network dependencies introduced - confirmed (pure in-process classifier, no new crates).
  • Manual smoke checklist updated if this touches release-cut surfaces - N/A: internal error classification, no release-cut surface.
  • Linked issue closed via Closes #NNN - see Related.

Impact

  • Runtime / platform: none on hot paths - same lookup shape that already runs for budget-exhaustion; only adds branches on the error path.
  • Security / migration / compatibility: zero. Errors still propagate via Err unchanged (retry/fallback unaffected); only the per-attempt Sentry report is gated, exactly like the budget/transient/session-expired precedents.
  • User-visible: custom-provider users hitting an unknown model / abstract tier / temperature constraint now see an actionable "fix your model/routing in Settings -> LLM" message instead of a generic error; the ~273 Sentry events/window from these (OPENHUMAN-TAURI-WJ / -QW / -HB / -NH) drop to info-level breadcrumbs.

Related


AI Authored PR Metadata (required for Codex/Linear PRs)

AI-authored PR.

Linear Issue

  • Key: N/A
  • URL: N/A

Commit & Branch

  • Branch: fix/2079-provider-config-rejection-classifier
  • Commit SHA: 7a3c7edb

Validation Run

  • pnpm --filter openhuman-app format:check - N/A: no app/* files changed.
  • pnpm typecheck - N/A: no TypeScript changed.
  • Focused tests: cargo test --lib config_rejection report_error_or_expected_does_not_panic -> 13 passed, 0 failed.
  • Rust fmt/check (if changed): cargo fmt --check clean; cargo check --lib exit 0; cargo clippy --lib zero issues in changed files.
  • Tauri fmt/check (if changed): N/A: no app/src-tauri files changed.

Validation Blocked

  • command: none
  • error: none - full cargo check --lib, targeted cargo test --lib, and cargo clippy --lib all ran cleanly on the dev host.
  • impact: none. CI runs the full suite + diff-cover.

Behavior Changes

  • Intended behavior change: provider 4xx config-rejection from a non-backend provider is demoted from a Sentry error to an info log at both the provider-HTTP layer and the agent/web-channel re-report; the user-facing error is classified model_unavailable with actionable copy.
  • User-visible effect: no more generic "inference" error for unknown-model / abstract-tier / temperature rejections - the message tells the user to fix their model/routing in Settings.

Parity Contract

  • Legacy behavior preserved: Yes - errors still propagate via Err unchanged (retry/fallback/dispatch untouched); only the Sentry report is gated, identical contract to the existing budget/transient/session-expired demotions.
  • Guard/fallback/dispatch parity checks: inverted-polarity test pins that a model-rejection body from openhuman_backend::PROVIDER_LABEL still reports; 5xx / transient-429 / generic-4xx guards pin that real bugs are not silenced.

Duplicate / Superseded PR Handling

Summary by CodeRabbit

  • Bug Fixes

    • Provider configuration rejection errors (unsupported models, invalid parameters) now provide actionable guidance directing users to their LLM settings instead of generic messages.
  • Chores

    • Enhanced error handling to properly classify and suppress expected provider configuration failures, reducing error report noise.

Review Change Stack

sanil-23 added 2 commits May 19, 2026 23:39
…d user-state (tinyhumansai#2079)

Custom cloud providers (custom_openai -> DeepSeek / OpenRouter / Moonshot)
reject requests whose model id or sampling params they don't accept: an
OpenHuman abstract tier alias leaked to a provider that only speaks
native ids (tinyhumansai#2079), an unknown / stale model pin (tinyhumansai#2202), or a
model-specific temperature constraint (tinyhumansai#2076, Moonshot Kimi K2). These
are deterministic user-configuration state, already surfaced in the UI,
but every agent turn produced a fresh Sentry event (OPENHUMAN-TAURI-WJ /
-QW / -HB / -NH, ~273 events) at BOTH the provider HTTP layer
(domain=llm_provider) and the re-report from agent.run_single /
web_channel.run_chat_task (domain=agent) -- before_send only drops
transient statuses, not 400/404.

- New is_provider_config_rejection_message predicate (sibling of
  billing_error), tight phrase list tied to the exact Sentry bodies.
- ops::api_error: provider-aware demotion with inverted polarity -- a
  model-rejection from the OpenHuman backend stays Sentry-actionable
  (that would be a real regression we sent our own backend).
- observability::ExpectedErrorKind::ProviderConfigRejection wired into
  expected_error_kind so the agent / web-channel re-report is demoted
  too (the dominant flood).
- web::classify_inference_error maps these to an actionable
  model_unavailable message instead of the generic "inference" bucket.

Purely additive, mirrors the budget-exhaustion demotion pattern. 11 new
unit tests incl. inverted-polarity and scope guards.

Closes tinyhumansai#2079
…ath (tinyhumansai#2079)

Exercise the diff-coverage gate's remaining changed lines:
- report_error_or_expected_does_not_panic now drives the
  expected_error_kind ProviderConfigRejection branch and the
  report_expected_message skip-log arm (the agent / web-channel
  re-report demotion path).
- log_helper_runs_without_panicking covers log_provider_config_rejection
  (the provider-layer api_error demotion log).
@sanil-23 sanil-23 requested a review from a team May 19, 2026 18:19
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 19, 2026

Warning

Rate limit exceeded

@sanil-23 has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 7 minutes and 34 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 770cf826-2e80-433f-9f05-1a0d995433eb

📥 Commits

Reviewing files that changed from the base of the PR and between 7a3c7ed and 481538b.

📒 Files selected for processing (2)
  • src/openhuman/channels/providers/web.rs
  • src/openhuman/channels/providers/web_tests.rs
📝 Walkthrough

Walkthrough

This PR adds a unified provider configuration rejection detection and suppression mechanism. A new is_provider_config_rejection_message classifier detects when provider APIs reject user model or parameter configuration and routes these deterministic, user-actionable errors through HTTP suppression, user-facing remediation messaging, and expected-error observability pipelines.

Changes

Provider Configuration Rejection Detection and Suppression

Layer / File(s) Summary
Provider config rejection detection foundation
src/openhuman/inference/provider/config_rejection.rs, src/openhuman/inference/provider/mod.rs
New config_rejection module exposes is_provider_config_rejection_message, which detects provider rejection phrases via case-insensitive substring matching. Includes unit tests for positive detection, case-insensitivity, and exclusion of transient/generic errors.
HTTP API error suppression for config rejections
src/openhuman/inference/provider/ops.rs
New helpers is_provider_config_rejection_http and log_provider_config_rejection integrate config rejection detection into api_error control flow. Routes 400/404/422 responses from custom providers with matching body text to INFO logging instead of Sentry, while preserving Sentry reporting for backend, 5xx, transient 429, and unrelated 4xx cases.
User-facing error classification and remediation
src/openhuman/channels/providers/web.rs, src/openhuman/channels/providers/web_tests.rs
classify_inference_error uses the classifier to map provider config rejections to model_unavailable category and returns actionable messages directing users to "Settings → LLM". Tests verify multiple provider error shapes surface consistent remediation guidance.
Observability expected-error classification
src/core/observability.rs
ExpectedErrorKind gains ProviderConfigRejection variant, wired into expected_error_kind classifier and report_expected_message logger. Routes config rejections to INFO-level logging with kind = provider_config_rejection, skipping error event emission. Tests verify correct classification and non-panicking integration with report_error_or_expected.

Sequence Diagram

sequenceDiagram
  participant Client
  participant InferenceAPI as API<br/>(ops.rs)
  participant Classifier as Classifier<br/>(config_rejection)
  participant Logger as Logger<br/>(INFO)
  participant Sentry as Sentry
  
  Client->>InferenceAPI: invoke provider API
  InferenceAPI->>InferenceAPI: receive 4xx response
  InferenceAPI->>Classifier: is_provider_config_rejection_message()
  alt config rejection detected
    Classifier-->>InferenceAPI: true
    InferenceAPI->>Logger: log_provider_config_rejection()
    Logger-->>InferenceAPI: emitted INFO event
    InferenceAPI-->>Client: return error (no Sentry)
  else transient/generic error
    Classifier-->>InferenceAPI: false
    InferenceAPI->>Sentry: report to Sentry
    Sentry-->>InferenceAPI: acknowledged
    InferenceAPI-->>Client: return error
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs

  • tinyhumansai/openhuman#1763: Both PRs extend core::observability::ExpectedErrorKind and the shared expected_error_kind/report_error_or_expectedreport_expected_message classification flow by adding different new "expected" variants.
  • tinyhumansai/openhuman#1798: Both PRs extend src/core/observability.rs's error classification/suppression pipeline to demote non-fatal cases and suppress Sentry noise.
  • tinyhumansai/openhuman#1719: Both PRs modify the shared observability pipeline by extending ExpectedErrorKind and wiring new classification plus Sentry suppression in src/core/observability.rs.

Suggested labels

working

Suggested reviewers

  • graycyrus

Poem

🐰 A bunny hops through provider errors with glee,
Now config rejections are handled with clarity,
No more Sentry noise for settings gone wrong,
Just helpful guidance: "Check LLM, be strong!"
Silent INFO logs, and users align—
Provider happiness in logs so fine! 🌟

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely summarizes the main change: classifying provider config-rejection 4xx errors as expected user-state. It is specific, directly related to the core objective, and uses precise terminology aligned with the changeset.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot added the working A PR that is being worked on by the team. label May 19, 2026
coderabbitai[bot]
coderabbitai Bot previously requested changes May 19, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/channels/providers/web.rs`:
- Around line 303-317: The provider-config rejection check
(crate::openhuman::inference::provider::is_provider_config_rejection_message) is
placed after the broader "model_unavailable" branch so some provider-specific
messages fall through; move the entire else-if branch that returns
("model_unavailable", with_provider_detail(..., err)) using
is_provider_config_rejection_message(err) so it appears before the generic
model-unavailable arm, ensuring provider-config rejection messages get the
specific "Settings → LLM" remediation; keep references to with_provider_detail
and the "model_unavailable" tuple intact when relocating the block.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 029facd0-8f2d-4a03-9925-872c7c6ee7aa

📥 Commits

Reviewing files that changed from the base of the PR and between 049d1ea and 7a3c7ed.

📒 Files selected for processing (6)
  • src/core/observability.rs
  • src/openhuman/channels/providers/web.rs
  • src/openhuman/channels/providers/web_tests.rs
  • src/openhuman/inference/provider/config_rejection.rs
  • src/openhuman/inference/provider/mod.rs
  • src/openhuman/inference/provider/ops.rs

Comment thread src/openhuman/channels/providers/web.rs Outdated
…available (tinyhumansai#2079)

CodeRabbit review on tinyhumansai#2239: the is_provider_config_rejection_message arm
ran after the broader model-unavailable branch, so config-rejection
bodies that also contain "model"/"does not exist"/"does not have access"
(e.g. a stale model pin / model_not_found) fell through to the generic
"Check your model settings" copy instead of the actionable
"Settings -> LLM" remediation. Move the specific arm above the generic
one. Updates the pre-existing classify_inference_error_quotes_model_unavailable_detail
test to assert the new (intended) remediation for that stale-pin body.
Copy link
Copy Markdown
Contributor

@graycyrus graycyrus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review — graycyrus

Walkthrough: Solid, well-structured observability fix that demotes deterministic provider config-rejection errors (~273 Sentry events across OPENHUMAN-TAURI-WJ/-QW/-HB/-NH) to info logs, following the exact same pattern as the existing budget-exhaustion demotion. The inverted-polarity contract is correct — same body from the OpenHuman backend still reaches Sentry. The user-facing classification to model_unavailable with actionable "Settings → LLM" copy is a nice UX improvement.

Assessment: Clean. No critical or major issues found.

File Change Description
config_rejection.rs New Core predicate — tight phrase list, case-insensitive, well-documented polarity contract
ops.rs Modified HTTP-layer integration — is_provider_config_rejection_http (status + provider + body), new branch in api_error
observability.rs Modified ProviderConfigRejection variant + expected_error_kind classifier + report_expected_message info log arm
web.rs Modified New classify_inference_error arm for actionable user message
web_tests.rs Modified Tests for config-rejection classification + updated existing model-unavailable test
mod.rs Modified Module registration + re-export

CodeRabbit dedup: Their one finding (arm ordering in web.rs) was addressed in the latest commit — skipped.

What I liked:

  • The phrase list in config_rejection.rs is deliberately tight with clear provenance (each phrase tied to a specific Sentry issue)
  • Comprehensive test matrix: 13 tests covering detection, case insensitivity, negatives, inverted polarity, transient/5xx guards, and smoke
  • The "is an abstract tier" forward-looking phrase keeps the classifier stable across the planned tier-resolution fix
  • Diff coverage gate passes (≥80%)

LGTM — no blocking issues.

Copy link
Copy Markdown
Contributor

@graycyrus graycyrus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, nice work!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

working A PR that is being worked on by the team.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fix(providers): reasoning-v1 model name sent to DeepSeek API which rejects it (regression)

2 participants