fix(inference,ai-settings): prevent + handle ollama embedding-model-as-chat (Sentry TAURI-RUST-4P6) (#3359) by oxoxDev · Pull Request #3360 · tinyhumansai/openhuman

oxoxDev · 2026-06-04T11:07:13Z

Summary

A user picked an embedding model (bge-m3:latest) as their Ollama chat model → Ollama 400s "does not support chat" every turn → 36.6k Sentry events / 2 users on 0.57.13 (TAURI-RUST-4P6).
Prevention (the real fix): hide embedding-only Ollama models from the chat/LLM model pickers so the misconfig can't be made.
Handling (defense-in-depth): demote the 400 from Sentry (classifier) + replace the opaque JSON with an actionable "assign a chat-capable model" message.
Capability comes from Ollama /api/show capabilities (already fetched for the context-window gate — no extra round-trip). Fail-open on unknown capability.

Problem

bge-m3:latest is OpenHuman's default memory-tree embedding model. Nothing stopped a user selecting it as their Ollama chat model, after which every chat turn failed:

ollama API error (400 Bad Request): {"error":{"message":"\"bge-m3:latest\" does not support chat","type":"invalid_request_error","param":null,"code":null}}

Two failures compounded it:

No prevention — the chat-model picker listed every installed Ollama model, embedding-only ones included.
Noise + opaque UX — the 400 bypassed the 404-only completion_only_404_guard, the classifier had no matching phrase (so it re-reported every retry → 36.6k events), and the user saw raw upstream JSON instead of remediation.

Deterministic user-state (capability mismatch), not a server bug — same family as TAURI-RUST-35 (does not support tools).

Solution

Layer 1 — prevention (root cause):

Parse the Ollama /api/show capabilities list (previously deserialized but unused) and classify each model chat-capable / embedding-only / unknown (ollama_chat_capability, ollama.rs). Surface chat_capable on each diagnostics installed_models entry, fetched in the same /api/show round-trip that already resolves the context window.
Frontend filters chat_capable === false out of the local-model pickers (CustomRoutingDialog + GlobalOwnModelSelector) via isChatSelectableLocalModel. Unknown (null) stays visible — fail-open. The embedding model is configured in a separate panel, so embedding selection is unaffected.
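
The classification rule in Layer 1 can be sketched roughly as follows. This is a minimal illustration, not the code in ollama.rs: the function name and signature follow the diff quoted in the review, but the exact capability tag strings (`"completion"`, `"chat"`, `"embedding"`) and the matching logic are assumptions based on the behaviour described in this PR (a chat or completion tag wins over an embedding tag, matching tolerates case and whitespace, anything ambiguous is unknown).

```rust
/// Hypothetical sketch of classifying an Ollama model from the
/// `capabilities` list returned by `/api/show`.
///
/// Returns `Some(true)` for chat-capable, `Some(false)` for embedding-only,
/// and `None` for unknown. Callers treat `None` as "keep visible" (fail-open).
fn ollama_chat_capability(capabilities: &[String]) -> Option<bool> {
    let mut saw_embedding = false;
    for cap in capabilities {
        // Tolerate case differences and surrounding whitespace in the tags.
        match cap.trim().to_ascii_lowercase().as_str() {
            // A chat/completion tag wins even when an embedding tag is present.
            "completion" | "chat" => return Some(true),
            "embedding" => saw_embedding = true,
            // Other tags (for example "insert") carry no signal either way.
            _ => {}
        }
    }
    // Only report embedding-only when an embedding tag was actually seen;
    // an empty or unrecognised list stays unknown.
    if saw_embedding {
        Some(false)
    } else {
        None
    }
}
```

Under these assumptions, `["embedding"]` yields `Some(false)`, `["completion", "embedding"]` yields `Some(true)`, and both an empty list and `["insert"]` yield `None`, so only models that are confidently embedding-only would be hidden from the pickers.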

Layer 2 — Sentry demotion: add "does not support chat" to is_provider_config_rejection_message (config_rejection.rs). Demotes error→info via both the inline guard sites and api_error. 36.6k → 0.
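
As a rough sketch of the demotion in Layer 2, the classifier can be thought of as a substring check against a list of known config-rejection phrases. Only the function name and the two phrases shown come from this PR; the constant, the `Severity` enum, and `severity_for` are hypothetical names used for illustration, and the real list in config_rejection.rs may contain other entries.

```rust
/// Hypothetical phrase list. Only these two entries are named in this PR;
/// the real classifier may recognise more.
const CONFIG_REJECTION_PHRASES: &[&str] = &[
    "does not support tools", // TAURI-RUST-35
    "does not support chat",  // TAURI-RUST-4P6
];

/// True when the provider error is a deterministic user-config rejection
/// rather than a server bug.
fn is_provider_config_rejection_message(message: &str) -> bool {
    let lower = message.to_lowercase();
    CONFIG_REJECTION_PHRASES
        .iter()
        .any(|phrase| lower.contains(*phrase))
}

/// Assumed severity levels, for illustration only.
#[derive(Debug, PartialEq)]
enum Severity {
    Info,
    Error,
}

/// Config rejections are logged at info level; everything else stays an error.
fn severity_for(message: &str) -> Severity {
    if is_provider_config_rejection_message(message) {
        Severity::Info
    } else {
        Severity::Error
    }
}
```

With this shape, the verbatim Ollama body from the Problem section would classify as `Severity::Info`, while an unrelated failure such as a connection reset would remain `Severity::Error`.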

Layer 3 — actionable error: not_chat_capable_guard (400/422) rewrites the opaque JSON into model '<m>' does not support chat — assign a chat-capable model in Settings → AI, preserving the phrase so it stays demoted. Wired into all chat-completions error paths.
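
A minimal sketch of the Layer 3 guard, under stated assumptions: the three helper names come from this PR, but their signatures here are hypothetical (a plain `u16` status and an `Option<String>` result stand in for whatever HTTP status and error types compatible.rs actually uses). The message wording follows the PR description.

```rust
/// Detect the rejection: only on 400/422, and only when the body carries
/// the anchor phrase.
fn is_not_chat_capable_model(status: u16, body: &str) -> bool {
    if !matches!(status, 400 | 422) {
        return false;
    }
    body.to_lowercase().contains("does not support chat")
}

/// Build the actionable message. It deliberately keeps the phrase
/// "does not support chat" so the config-rejection classifier still
/// demotes it when the error is reported again.
fn not_chat_capable_model_message(model: &str) -> String {
    format!(
        "model '{model}' does not support chat — assign a chat-capable model in Settings → AI"
    )
}

/// Returns the rewritten message when the guard fires, `None` otherwise
/// so the caller falls through to its normal error handling.
fn not_chat_capable_guard(status: u16, body: &str, model: &str) -> Option<String> {
    is_not_chat_capable_model(status, body).then(|| not_chat_capable_model_message(model))
}
```

In this sketch the guard fires for a 400 or 422 whose body contains the phrase and stays silent for a 404 or for an unrelated 400, which matches the parity checks listed under Parity Contract.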

Submission Checklist

If a section does not apply to this change, mark the item as N/A with a one-line reason. Do not delete items.

Tests added or updated (happy path + at least one failure / edge case) — Rust: classifier verbatim body + 6 compatible guard/wire tests + ollama_chat_capability matrix; Frontend: 4 isChatSelectableLocalModel cases (false / true / unknown-fail-open / list-filter)
Diff coverage ≥ 80% — filter+map extracted to pure toSelectableChatModels helper with unit tests so the changed FE lines are covered (Coverage Gate)
Coverage matrix updated — N/A: behaviour-only change (no new feature row)
All affected feature IDs from the matrix are listed — N/A: none touched
No new external network dependencies introduced — wire tests use wiremock; capability uses the existing /api/show call
Manual smoke checklist updated if this touches release-cut surfaces — N/A: error-classification + picker filter, no release-cut surface
Linked issue closed via Closes #NNN in the ## Related section

Impact

Desktop (all platforms). Touches the Ollama chat-model picker (hides embedding-only models) and the chat-completions error path.
Security/perf: none. Capability reuses the existing /api/show round-trip — no extra requests.
Observability: removes the 36.6k/2-user Sentry stream; event becomes an info-level config-rejection log.
Compat / fail-open: older Ollama or an /api/show miss → chat_capable unknown → model stays selectable, so no usable model is ever hidden.

AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

Key: N/A
URL: N/A

Commit & Branch

Branch: fix/4p6-ollama-chat-capability-reject
Commit SHA: f073111 (classifier) · 751fa71 (guard) · 7497730 (capability) · 3baf728 (picker filter)

Validation Run

pnpm --filter openhuman-app format:check — prettier + rust fmt clean (verified locally)
pnpm typecheck — clean
Focused tests: cargo test --lib openhuman::inference::provider::{compatible,config_rejection} (151 + 11 ok), ...::local::ollama (44 ok); vitest run aiRouting.test.ts (10 ok)
Rust fmt/check (if changed): cargo fmt + cargo check clean; cargo clippy --lib clean on changed files
Tauri fmt/check (if changed): pnpm rust:check Finished clean (no shell change; verified post-submodule-init)

Validation Blocked

command: N/A
error: N/A
impact: N/A

Behavior Changes

Intended behavior change: embedding-only Ollama models no longer appear in chat-model pickers; the residual 400 is demoted + made actionable
User-visible effect: can't misconfigure an embedding model as chat; clearer error if one slips through

Parity Contract

Legacy behavior preserved: unknown-capability models stay visible (fail-open); embedding-model selection panel unchanged; all other 4xx/5xx handling unchanged; guard fires only on 400/422 carrying the exact phrase
Guard/fallback/dispatch parity checks: not_chat_capable_ignores_unrelated_400 + not_chat_capable_requires_4xx_status; chat_capability_classifies_* covers completion/chat/embedding/unknown; api_error SessionExpired/classification on chat_with_history preserved (message upgraded post-hoc)

Duplicate / Superseded PR Handling

Duplicate PR(s): none (searched open + merged for does not support chat / bge-m3 / 4P6 / config_rejection)
Canonical PR: this
Resolution: N/A

…er-state (Sentry TAURI-RUST-4P6) (tinyhumansai#3359) An embedding model (bge-m3) picked as the Ollama chat model is rejected with a 400 'does not support chat' on every turn. The 400 bypasses the 404-only completion_only guard and the classifier had no matching phrase, so the raw body re-reported each turn — 36.6k events / 2 users. Add 'does not support chat' to is_provider_config_rejection_message so the event is demoted error->info, same treatment as the sibling 'does not support tools' (TAURI-RUST-35). Tests cover the verbatim Sentry body and the enriched message shape.

…ntry TAURI-RUST-4P6) (tinyhumansai#3359) Ollama 400s an embedding-model-as-chat with 'does not support chat'. Unlike the completion-only base-model case (404), this is a 400/422, so add a not_chat_capable_guard that fires on those statuses and rewrites the opaque upstream JSON into 'model <m> does not support chat — assign a chat-capable model in Settings -> AI'. The message preserves the phrase so it stays demoted by the config-rejection classifier on re-report. Wired into all chat-completions error paths (chat_with_system, chat_with_tools, and the api_error-delegated chat_with_history non-404 branch). 6 unit + wire tests.

coderabbitai · 2026-06-04T11:07:32Z

📝 Walkthrough

Adds fast-fail detection and actionable remediation for OpenAI-compatible backends (Ollama) rejecting embedding models used as chat models. Detects HTTP 400/422 "does not support chat" errors, replaces opaque JSON with actionable configuration guidance, wires guards into three chat request paths, and extends error classification to prevent Sentry re-reporting.

Changes

Embedding Model as Chat Model — Handling

| Layer / File(s) | Summary |
| --- | --- |
| Core error detection and message generation `src/openhuman/inference/provider/compatible.rs`, `src/openhuman/inference/provider/compatible_tests.rs` | Adds `is_not_chat_capable_model`, `not_chat_capable_model_message`, and `not_chat_capable_guard` helpers to detect HTTP 400/422 errors containing "does not support chat", generate model-specific actionable remediation text, and wrap into fast-fail errors. Unit tests validate status-code requirements, signature matching, message enrichment, classifier integration, and guard firing. |
| Chat request path integration `src/openhuman/inference/provider/compatible.rs`, `src/openhuman/inference/provider/compatible_tests.rs` | Wires the guard into `chat_with_system`, `chat_with_history`, and native `chat` entry points to short-circuit on non-chat-capable model rejections. End-to-end test confirms `chat_with_history` fails fast with actionable message and preserves config-rejection classification without fallback. |
| Provider config-rejection classifier extension `src/openhuman/inference/provider/config_rejection.rs` | Extends `is_provider_config_rejection_message` to recognize "does not support chat" as a provider-config-rejection substring. Unit tests validate raw and enriched error bodies for TAURI-RUST-4P6 (embedding model as chat) classify correctly. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

tinyhumansai/openhuman#3211: Adds fast-fail model capability mismatch guard in the same three chat entry points for completion-only model 404 rejection pattern.
tinyhumansai/openhuman#2813: Extends is_provider_config_rejection_message to detect "does not support tools" rejection (same family of capability mismatch errors).
tinyhumansai/openhuman#2346: Adds logging and suppression of provider config-rejection errors in streaming and native chat paths.

Suggested labels

sentry-traced-bug, bug, rust-core

Suggested reviewers

M3gA-Mind
senamakel

Poem

🐰 A rabbit hops through error logs with glee,
"Your embedding's confused—use chat, you'll see!"
Four hundred's now actionable, helpful, and true,
No more Sentry floods from a model askew. 🔧✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Linked Issues check | ✅ Passed | All objectives from `#3359` are met: the PR adds detection for 'does not support chat' to the config-rejection classifier, implements a guard generating actionable remediation messages, wires it into all chat completion paths, and includes comprehensive tests. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to the linked issue: classifier updates, guard implementation, error path integration, and related test coverage; no unrelated modifications detected. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Title check | ✅ Passed | The title accurately summarizes the main changes: fixing Ollama embedding-model-as-chat handling (Sentry TAURI-RUST-4P6) by adding detection and actionable error messages. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |


…nyhumansai#3359) Parse the /api/show 'capabilities' list (already deserialized but unused) and classify each model chat-capable / embedding-only / unknown via ollama_chat_capability. Surface it as 'chat_capable' on each diagnostics installed_models entry, fetched in the same /api/show round-trip that already resolves the context window. Fail-open: unknown (empty/unrecognised capabilities) stays chat-capable. Prereq for hiding embedding-only models from the chat picker (Sentry TAURI-RUST-4P6).

…inyhumansai#3359) Filter models the core flagged chat_capable=false out of the local-model pickers (CustomRoutingDialog + GlobalOwnModelSelector) via the new isChatSelectableLocalModel helper in useInstalledModels. Prevents the root-cause misconfig behind TAURI-RUST-4P6: picking an embedding model (bge-m3) as the chat model 400s every turn. Unknown capability stays visible (fail-open); embedding selection is a separate panel, unaffected. 4 helper tests added.

…per (tinyhumansai#3359) Extract the embedding-only filter + picker-shape map out of the useInstalledModels hook into a pure toSelectableChatModels helper in aiRouting.ts, with unit tests. The hook's map() lines were the only changed frontend lines uncovered by Vitest (AIPanel.tsx 484,486-487), dropping diff coverage to 50% and failing the >=80% Coverage Gate.

M3gA-Mind

Reviewed locally — well-structured, correctly layered fix (prevention + Sentry demotion + actionable error), with consistent fail-open behavior end to end and thorough tests. Approving in spirit; a few inline non-blocking notes below.

M3gA-Mind · 2026-06-04T13:35:35Z

+        ) {
+            return false;
+        }
+        error.to_lowercase().contains("does not support chat")


Non-blocking: this does a raw substring match (does not support chat) over the whole sanitized body. Given the verbatim Ollama wire shape this is near-impossible to false-positive in practice, and the 4xx gate above + the not_chat_capable_ignores_unrelated_400 test guard it well. The only theoretical snag is an unrelated 400 whose body happens to contain that phrase in a different field (e.g. an error about some other feature not supporting "chat history"). No change needed — flagging for completeness.

M3gA-Mind · 2026-06-04T13:35:35Z

+        // in `compatible.rs` rewrites the opaque upstream JSON into an
+        // actionable "assign a chat-capable model" message that still carries
+        // this substring, so it stays demoted.
+        "does not support chat",


Worth making the coupling load-bearing in a comment here too (it already is in compatible.rs): this anchor phrase is what keeps the rewritten actionable message demoted. If anyone ever softens the rewrite in not_chat_capable_model_message and drops the phrase, the 36.6k stream reopens. The TAURI-RUST-4P6-enriched classifier test covers it — good — but a one-liner here pointing at that dependency would help the next editor.

M3gA-Mind · 2026-06-04T13:35:35Z

+/// Callers treat `None` as "keep visible" — fail-open, never hide a model
+/// that might be usable for chat. Mirrors the non-rejecting `Unknown` arm of
+/// [`super::model_requirements::ContextEligibility`]. See Sentry TAURI-RUST-4P6.
+pub(crate) fn ollama_chat_capability(capabilities: &[String]) -> Option<bool> {


Nice — conservative in the right direction: only Some(false) when confident it's embedding-only, None everywhere ambiguous, and completion/chat wins over an embedding tag. The case/whitespace tolerance and the ["insert"]-only → None case are both covered by the matrix test. Mirrors the existing Unknown eligibility arm; consistent with the codebase's fail-open posture.

M3gA-Mind

Approving. Correct root-cause fix with well-reasoned defense-in-depth (prevention + Sentry demotion + actionable error), fail-open behavior that's consistent across Rust/TS/picker layers, exemplary documentation of the cross-module anchor-phrase coupling, and thorough tests (classifier matrix, guard signature, wiremock E2E, classifier-parity for both raw and enriched messages). The inline notes are non-blocking. Nice work.

oxoxDev added 2 commits June 4, 2026 16:36

oxoxDev requested a review from a team June 4, 2026 11:07

coderabbitai Bot added the rust-core (Core Rust runtime in src/: CLI, core_server, shared infrastructure), sentry-traced-bug (Bug identified via Sentry triage), and bug labels Jun 4, 2026

coderabbitai Bot previously approved these changes Jun 4, 2026

oxoxDev added 2 commits June 4, 2026 17:00

oxoxDev dismissed coderabbitai[bot]’s stale review via 3baf728 June 4, 2026 11:57

oxoxDev changed the title ~~fix(observability,inference): classify ollama embedding-model-as-chat 400 + actionable error (Sentry TAURI-RUST-4P6) (#3359)~~ fix(inference,ai-settings): prevent + handle ollama embedding-model-as-chat (Sentry TAURI-RUST-4P6) (#3359) Jun 4, 2026

M3gA-Mind reviewed Jun 4, 2026

M3gA-Mind approved these changes Jun 4, 2026

M3gA-Mind merged commit 122196b into tinyhumansai:main Jun 4, 2026
19 of 22 checks passed

M3gA-Mind merged 5 commits into tinyhumansai:main from oxoxDev:fix/4p6-ollama-chat-capability-reject

2 participants
