fix(inference): actionable error for completion-only model 404 + unmistakable run_code failure (#3193) by oxoxDev · Pull Request #3211 · tinyhumansai/openhuman

oxoxDev · 2026-06-02T12:03:31Z

Summary

run_code (and any chat call) against a completion-only / base model now fails fast with an actionable error that names the model and the fix, instead of an opaque "chat completions unavailable; responses fallback failed" chain.
A subagent-delegation failure is now wrapped in an unmistakable failure envelope so a weak orchestrator can't narrate a fabricated success ("wrote the file") when nothing actually ran.
Completion-only detection is shared across all three chat entrypoints (chat_with_system, chat_with_history, native tool-calling chat).

Problem

Reported in #3193 (openhuman-core 0.56.0, Win11, agentic_provider = "openhuman"): every run_code call returns:

openai API error (404 Not Found): "This is not a chat model and thus not supported in the v1/chat/completions endpoint. Did you mean to use v1/completions?" (chat completions unavailable; responses fallback failed)

Root cause: OpenHuman only speaks the chat-completions API (with an optional /v1/responses fallback) — there is no /v1/completions path. When the model bound to the coding role (run_code → code_executor, model hint coding) is a completion-only model, /v1/chat/completions 404s and the responses fallback can't rescue it. The surfaced error was opaque and gave no remediation. The reporter also observed the orchestrator presenting the failure as success and fabricating output — the bare error text alone didn't stop a weak model from narrating a fake result.

Solution

compatible.rs: add is_completion_only_model_404 (tight match on the OpenAI signature — ordinary "model does not exist" 404s are intentionally not matched) and completion_only_model_message (names the model + "assign a chat-capable model"). A shared completion_only_404_guard is invoked at all three chat 404 handlers and short-circuits before the doomed /v1/responses fallback.
dispatch.rs: route subagent-failure results through format_subagent_failure, which states the task did not run and instructs the model not to report success or fabricate output, while preserving the underlying error.
Deliberately not adding a legacy /v1/completions transport — the real cause is a misconfigured model, so the fix is an actionable diagnostic, not a new (deprecated) endpoint surface.
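
The detection and short-circuit described above can be sketched roughly as follows. This is a minimal, self-contained illustration rather than the code in compatible.rs: the free-function signatures, the plain `u16` status, the `provider` parameter, and the message wording are all assumptions. Only the helper names and the matched substrings come from this PR.

```rust
/// Returns true only for a 404 whose body carries the OpenAI
/// "not a chat model" signature. Ordinary "model does not exist"
/// 404s are intentionally not matched.
/// (Signature is an assumption; the real helper lives on the provider.)
fn is_completion_only_model_404(status: u16, error: &str) -> bool {
    if status != 404 {
        return false;
    }
    let lower = error.to_lowercase();
    lower.contains("not a chat model")
        || (lower.contains("v1/chat/completions") && lower.contains("v1/completions"))
}

/// Builds the actionable message. The wording here is illustrative only.
fn completion_only_model_message(provider: &str, model: &str) -> String {
    format!(
        "{provider}: model '{model}' is a completion-only model and cannot serve \
         chat requests; assign a chat-capable model to this role"
    )
}

/// `Some(message)` means: fail fast and skip the /v1/responses fallback.
/// `None` means: not a completion-only 404, continue with the usual path.
fn completion_only_404_guard(
    provider: &str,
    status: u16,
    sanitized: &str,
    model: &str,
) -> Option<String> {
    is_completion_only_model_404(status, sanitized)
        .then(|| completion_only_model_message(provider, model))
}
```

Returning an `Option` keeps each call site to a single `if let Some(..)` branch, which is the shape the review snippet further down uses.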

Submission Checklist

If a section does not apply to this change, mark the item as N/A with a one-line reason. Do not delete items.

Tests added or updated (happy path + at least one failure / edge case) per Testing Strategy
Diff coverage ≥ 80% — Coverage Gate (diff-cover ≥ 80%) passes in CI; focused Rust tests (completion_only*, subagent_failure_envelope) pass locally.
Coverage matrix updated — N/A: behaviour-only change (error-handling/diagnostic; no feature row added/removed/renamed)
No new external network dependencies introduced (wiremock mock server used for the 404 path)
Manual smoke checklist updated if this touches release-cut surfaces — N/A: no release-cut surface touched
Linked issue closed via Closes #NNN in the ## Related section

Impact

Desktop/CLI (Rust core). No schema, migration, or protocol change.
Behaviour change is limited to error paths: a completion-only model now yields a clear, fixable message and stops before a futile fallback; failed delegations are reported as failures instead of being mistakable for success. Successful calls are unaffected.

Related

Closes: #3193 (Cannot write code on filesystem)
Follow-up PR(s)/TODOs: several sub-symptoms from #3193 are deferred until the reporter's transcript is available: delegate_tools_agent "400 Insufficient budget", spawn_parallel_agents non-determinism, and whether parallel_tools is honored. Whether there is a literal success:true flag bug (as opposed to orchestrator narration) also awaits the transcript.

AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

Key: N/A
URL: N/A

Commit & Branch

Branch: fix/3193-run-code-completion-only-404
Commit SHA: d555ae48472ce694570791ccbe7d99b742f9a3c4

Validation Run

N/A: no app/ changes — pnpm --filter openhuman-app format:check
N/A: no app/ changes — pnpm typecheck
Focused tests: cargo test --lib completion_only (6 passed) · cargo test --lib subagent_failure_envelope (1 passed)
Rust fmt/check (if changed): cargo fmt applied; cargo check --lib clean
N/A: no Tauri-shell changes — Tauri fmt/check

Validation Blocked

command: pnpm test:coverage / pnpm test:rust (full suites)
error: not run locally (heavy; full Rust test matrix OOMs locally)
impact: diff-coverage verified by CI gate (passing)

Behavior Changes

Intended behavior change: completion-only-model 404 → actionable fail-fast error; failed subagent delegation → unmistakable failure result.
User-visible effect: clear "assign a chat-capable model" guidance instead of an opaque fallback chain; agent no longer reports fabricated success after a hard failure.

Parity Contract

Legacy behavior preserved: ordinary 404s keep their existing fallback/enrich path; non-404 errors still flow through api_error (Sentry/classification) unchanged; successful responses unchanged.
Guard/fallback/dispatch parity checks: completion-only guard added uniformly to all three chat entrypoints; /v1/responses fallback still runs for non-completion-only 404s.

Duplicate / Superseded PR Handling

Duplicate PR(s): none found (no open PR touches run_code / completion endpoint)
Canonical PR: this PR
Resolution: N/A

…umansai#3193) A completion-only/base model assigned to a chat role 404s on /v1/chat/completions ('This is not a chat model … did you mean v1/completions?') and the /v1/responses fallback cannot rescue it, leaving an opaque 'responses fallback failed' chain. Detect the signature and fail fast with a message naming the model and the remediation (assign a chat-capable model), skipping the doomed fallback.

…ai#3193) On a hard delegation failure (e.g. run_code's coding model 404ing) the bare error text let a weak orchestrator narrate a plausible success and fabricate output. Wrap failures in an envelope that states the task did not run and forbids reporting success, while preserving the root error.

tinyhumansai#3193) Extract the completion-only detection into completion_only_404_guard and apply it in all three chat entrypoints (chat_with_system, chat_with_history, native chat) so a completion-only model fails fast on every path. Add a wiremock test proving the guard pre-empts the /v1/responses fallback end to end, plus a unit test for the guard's match/no-match branches.

coderabbitai · 2026-06-02T12:03:50Z

Walkthrough

Two independent error-handling improvements: (1) subagent failures now return a standardized envelope stating the task did not complete and discouraging output fabrication, and (2) OpenAI-compatible chat endpoints detect completion-only models on 404 and return actionable guidance instead of silent fallback.

Changes

Subagent Failure Envelope

| Layer / File(s) | Summary |
| --- | --- |
| Standardized failure message and validation `src/openhuman/agent_orchestration/tools/dispatch.rs` | Error-handling path replaces simple error string with `format_subagent_failure()` helper, which builds an anti-fabrication envelope. Unit test validates required wording (no completion, anti-fabrication language) and preservation of tool name and root error. |

Completion-Only Model 404 Detection

| Layer / File(s) | Summary |
| --- | --- |
| Completion-only model detection predicates and guard `src/openhuman/inference/provider/compatible.rs` | New private helpers provide `is_completion_only_model_404` and `completion_only_404_guard` that recognize the specific 404 pattern from OpenAI-compatible providers indicating a completion-only model, and generate actionable error messages with remediation guidance. |
| Guard integration into chat paths `src/openhuman/inference/provider/compatible.rs` | `chat_with_system`, `chat_with_history`, and `chat` paths now apply the completion-only guard after sanitizing 404 error bodies: on match, they early-return the actionable error before any `/v1/responses` fallback. |
| Completion-only detection test coverage `src/openhuman/inference/provider/compatible_tests.rs` | Unit tests validate signature detection (true positives, false negatives, status-code gating) and message construction. Guard behavior test confirms the guard fires only on exact match. End-to-end test verifies the guard short-circuits and prevents responses fallback. |

Sequence Diagram

flowchart TD
    A["404 from chat/completions"] --> B["Sanitize error body"]
    B --> C["Apply completion_only_404_guard"]
    C --> D{Guard matches<br/>completion-only<br/>signature?}
    D -->|Yes| E["Return actionable error<br/>with model name &<br/>remediation hint"]
    D -->|No| F{"Responses fallback<br/>enabled?"}
    F -->|Yes| G["Attempt /v1/responses<br/>fallback"]
    F -->|No| H["Return enriched<br/>generic 404 error"]
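
The same decision flow, written as a small Rust function for readers who prefer code to diagrams. This is a hedged sketch: the `NotFoundAction` enum, the `decide_on_404` name, and the message text are hypothetical and exist only to mirror the branches in the flowchart; the input is assumed to be an already-sanitized 404 body.

```rust
/// Hypothetical outcome type mirroring the three leaves of the flowchart.
#[derive(Debug, PartialEq)]
enum NotFoundAction {
    /// Completion-only signature matched: return the actionable error now.
    FailFast(String),
    /// Ordinary 404 and the fallback is enabled: try /v1/responses.
    TryResponsesFallback,
    /// Ordinary 404 and no fallback: return the enriched generic error.
    ReturnEnriched404,
}

/// Decides what to do with a sanitized 404 body from chat/completions.
fn decide_on_404(
    sanitized: &str,
    model: &str,
    responses_fallback_enabled: bool,
) -> NotFoundAction {
    let lower = sanitized.to_lowercase();
    let completion_only = lower.contains("not a chat model")
        || (lower.contains("v1/chat/completions") && lower.contains("v1/completions"));

    if completion_only {
        // The guard runs first, so the fallback is never attempted here.
        return NotFoundAction::FailFast(format!(
            "model '{model}' is completion-only; assign a chat-capable model"
        ));
    }
    if responses_fallback_enabled {
        NotFoundAction::TryResponsesFallback
    } else {
        NotFoundAction::ReturnEnriched404
    }
}
```

Checking the completion-only signature before the fallback flag is what makes the guard pre-empt `/v1/responses` regardless of configuration.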

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

tinyhumansai/openhuman#2214: Modifies compatible.rs 404 error handling to influence /v1/responses fallback logic and updates compatible_tests.rs with overlapping test scope.
tinyhumansai/openhuman#2814: Modifies compatible.rs non-streaming chat_completions error handling in the same control-flow region (main PR adds 404 completion-only guard, retrieved PR adds SessionExpired handling).

Suggested labels

rust-core, agent, bug, working

Suggested reviewers

graycyrus
senamakel

Poem

🐰 A subagent stumbled, but now truth will ring—
No more silent success when nothing took wing!
And models that whisper "I'm completion, not chat"
Get caught with a guard: "Not today, friend, flat fact!"
Clarity blooms where confusion once grew. ✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly identifies the two main changes: actionable error for completion-only model 404 and unmistakable run_code failure, directly matching the code changes. |
| Linked Issues check | ✅ Passed | The PR addresses core coding requirements from `#3193`: detecting completion-only model 404 errors and preventing silent success fabrication via failure envelope in subagent delegation. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to addressing `#3193`: dispatch.rs adds failure envelope, compatible.rs adds 404 detection, and tests validate both features; no unrelated alterations present. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |

CodeGhost21

Inline findings posted on the changed lines. Overall: the core fix is correct, tightly scoped, and well-tested. Main follow-ups concern adjacent code paths (chat_with_tools, streaming) that share the same 404 surface but don't currently invoke the new guard.

CodeGhost21 · 2026-06-02T21:32:41Z

+/// Format a subagent-delegation failure so the orchestrator cannot mistake it
+/// for success. Kept as a standalone, side-effect-free fn so the exact wording
+/// is unit-testable without standing up a registry + failing model (#3193).
+fn format_subagent_failure(tool_name: &str, message: &str) -> String {


In-band prompt-engineering preamble now applied to every subagent failure.

format_subagent_failure is invoked on the generic run_subagent error path — so timeouts, budget exhaustion, transport errors, and any other transient failure now also carry the ~200-char "do NOT treat this as success or fabricate an output" preamble in transcripts, logs, and observability. That's intentional given #3193's "hallucinated success" symptom, but it's a behavior change broader than the issue title suggests ("completion-only 404"). Worth either:

calling out in the PR body's Behavior Changes section that the envelope is applied to all subagent failures, or

gating the envelope to a narrower class of "hard, non-retryable" failures so retry/budget cases keep their shorter error string.

Not blocking — flagging so it's a deliberate decision.
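
For illustration, the second option (gating the envelope) might look roughly like the sketch below. Everything except the `format_subagent_failure` name and its two-string signature is an assumption: the `FailureClass` enum and `render_failure` are hypothetical, and the envelope wording is a stand-in, not the text that shipped in dispatch.rs.

```rust
/// Hypothetical classification of a subagent failure.
#[derive(Debug, Clone, Copy, PartialEq)]
enum FailureClass {
    /// Non-retryable, e.g. a misconfigured model.
    Hard,
    /// Retryable or budget-related, e.g. a timeout.
    Transient,
}

/// Side-effect-free envelope builder (wording is illustrative).
fn format_subagent_failure(tool_name: &str, message: &str) -> String {
    format!(
        "[SUBAGENT FAILURE] The `{tool_name}` task did NOT run to completion. \
         Do NOT treat this as success or fabricate an output. \
         Underlying error: {message}"
    )
}

/// Applies the full envelope only to hard failures; transient ones keep
/// a short error string so logs and retries stay compact.
fn render_failure(class: FailureClass, tool_name: &str, message: &str) -> String {
    match class {
        FailureClass::Hard => format_subagent_failure(tool_name, message),
        FailureClass::Transient => format!("{tool_name} failed: {message}"),
    }
}
```

The trade-off is that a transient failure rendered without the envelope is again just bare error text, which is the shape that allowed the fabricated-success narration in #3193.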

CodeGhost21 · 2026-06-02T21:32:41Z

+        }
+        let lower = error.to_lowercase();
+        lower.contains("not a chat model")
+            || (lower.contains("v1/chat/completions") && lower.contains("v1/completions"))


Second disjunct is correct today but fragile to OpenAI rewording.

The string "v1/chat/completions" does not contain "v1/completions" as a continuous substring (the /chat/ infix breaks it), so the two contains calls really are independent matches — but that's only obvious if you stop and parse it. The whole clause currently fires only because OpenAI's body happens to mention both endpoint paths ("...the v1/chat/completions endpoint. Did you mean to use v1/completions?"). If OpenAI ever drops the second URL from the wording, this branch silently stops matching and we fall back to the opaque path.

Suggest a one-line comment on line 853 making the "two independent substrings" intent explicit, e.g.:

// Defensive fallback: OpenAI's current phrasing references BOTH endpoint paths
// as two separate substrings (`v1/chat/completions` does not contain `v1/completions`).
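
A short check makes the "two independent substrings" point concrete. The helper name `matches_second_disjunct` is hypothetical, and the reworded body in the usage note is an invented example of how the provider text could change, not something OpenAI has published.

```rust
/// Isolates the second disjunct of the detection predicate so its
/// behaviour can be examined on its own.
fn matches_second_disjunct(body: &str) -> bool {
    let lower = body.to_lowercase();
    // Both paths must appear; neither is a substring of the other.
    lower.contains("v1/chat/completions") && lower.contains("v1/completions")
}

/// True if the chat path alone would satisfy the `v1/completions` check.
/// It does not, because the `/chat/` infix breaks the shorter pattern.
fn chat_path_contains_completions_path() -> bool {
    "v1/chat/completions".contains("v1/completions")
}
```

With the current wording, which names both endpoints, `matches_second_disjunct` returns true. For a hypothetical rewording such as "Unsupported in the v1/chat/completions endpoint." it returns false, so detection would then rest entirely on the "not a chat model" phrase.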

CodeGhost21 · 2026-06-02T21:32:41Z

+                        .map_err(|responses_err| {
+                            let fb = super::format_anyhow_chain(&responses_err);
+                            anyhow::anyhow!(
+                                "{} API error ({status}): {sanitized} (chat completions unavailable; responses fallback failed: {fb})",


Undocumented behavior change on the non-completion-only fallback path.

Before this PR, chat_with_history's responses-fallback failure read:

{name} API error (chat completions unavailable; responses fallback failed: {fb})

After this PR it reads:

{name} API error ({status}): {sanitized} (chat completions unavailable; responses fallback failed: {fb})

That's a strict improvement (now includes the original 404 body and status), but it's a behavior change to a path unrelated to completion-only detection — any ordinary 404 that triggers the responses fallback will now have a different error string. The PR body's Behavior Changes section only lists the completion-only and subagent-envelope changes, not this. Suggest adding a one-liner there so downstream log parsers / Sentry classifiers aren't surprised.

(Cross-reference: the same widening was applied at the native-tool-calling chat 404 path — consistent with this one, which is good.)
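
To show what a downstream log parser would see, here is a small sketch that renders both shapes from the format strings quoted above. The function names and the sample argument values are hypothetical; only the two format strings are taken from this review comment.

```rust
/// Error shape before this PR (per the format string quoted above).
fn fallback_error_before(name: &str, fb: &str) -> String {
    format!("{name} API error (chat completions unavailable; responses fallback failed: {fb})")
}

/// Error shape after this PR: status and sanitized body are now included.
fn fallback_error_after(name: &str, status: &str, sanitized: &str, fb: &str) -> String {
    format!(
        "{name} API error ({status}): {sanitized} (chat completions unavailable; responses fallback failed: {fb})"
    )
}

/// A matcher keyed on the shared phrase keeps working for both shapes,
/// whereas one anchored on the old prefix would miss the new string.
fn is_responses_fallback_failure(msg: &str) -> bool {
    msg.contains("chat completions unavailable; responses fallback failed")
}
```

A classifier that matches on the shared phrase is unaffected by the change, while one anchored on the old `"{name} API error (chat completions"` prefix would need updating.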

CodeGhost21 · 2026-06-02T21:32:41Z


+            // A completion-only model 404s here and the /v1/responses fallback
+            // cannot rescue it — fail fast with actionable guidance (#3193).
+            if let Some(err) = self.completion_only_404_guard(status, &sanitized, model) {


Adjacent code paths share the same 404 surface but skip the new guard.

The PR body claims "completion-only detection is shared across all three chat entrypoints", but two more entrypoints on this same provider can hit the same OpenAI completion-only 404:

chat_with_tools (line ~1697) — if !response.status().is_success() { return Err(super::api_error(&self.name, response).await); }. No guard, no responses fallback. If a run_code-style flow ever passes through the native tool-calling non-streaming path (and tools is non-empty), the user gets the opaque "not a chat model" 404 again. Transport-error path does fall back to chat_with_history (which has the guard), but the 404-status path does not.

Streaming paths — stream_chat_with_system (~line 2138) and stream_chat_with_history (~line 2298). Both already sanitize the error body, so adding a completion_only_404_guard branch is a one-liner that matches the pattern used here.

Proposal: either replicate the guard at each of the additional sites listed above (preferred, since the helper is already shared and each call site is only a few lines), or document in the PR body why streaming + chat_with_tools are intentionally excluded. Otherwise #3193 can re-emerge through any of these paths.

oxoxDev added 3 commits June 2, 2026 17:25

oxoxDev requested a review from a team June 2, 2026 12:03

coderabbitai Bot added the rust-core, agent, working, and bug labels Jun 2, 2026

coderabbitai Bot approved these changes Jun 2, 2026

CodeGhost21 mentioned this pull request Jun 2, 2026

Cascade error on agentic task — Plan, Run Code, and Tools Agent all failing in sequence #3104

Open

CodeGhost21 reviewed Jun 2, 2026

senamakel merged commit 1113965 into tinyhumansai:main Jun 2, 2026
26 of 31 checks passed
