Skip to content

Fix: ChatEngine never actually throws 503 — every routing failure becomes 404#7

Draft
mimeding wants to merge 2 commits into
mainfrom
cursor/chat-engine-noservice-503-2812
Draft

Fix: ChatEngine never actually throws 503 — every routing failure becomes 404#7
mimeding wants to merge 2 commits into
mainfrom
cursor/chat-engine-noservice-503-2812

Conversation

@mimeding

Copy link
Copy Markdown
Owner

Summary

Why this matters (business)

The error model in ChatEngine was already designed with two distinct outcomes — "you asked for a model nobody has" (404) and "we know how to serve this model, but the backend isn't healthy right now" (503). Client SDKs, the WorkView error classifier, and any future status-aware UI rely on those codes to give the right next action ("install the model" vs "retry in a moment" vs "check your network").

In practice, the 503 branch was unreachable code. Every routing failure — even ones where Foundation Models is genuinely available on the user's Mac but isAvailable() is reporting false at this moment — surfaced as 404. Users see "Model 'foundation' is not installed or registered with any provider" and reach for the model picker when they should just be retrying.

What's wrong (technical)

ChatEngine.EngineError declares both cases:

    struct EngineError: Error, LocalizedError {
        enum Kind {
            case modelNotFound(requested: String)
            case noServiceAvailable(requested: String)
        }
        ...
        var httpStatus: Int {
            switch kind {
            case .modelNotFound: return 404
            case .noServiceAvailable: return 503
            }
        }
    }

…but the ModelServiceRouter.resolve it consumes only returns .service or .none. The .none branch in ChatEngine always throws .modelNotFound:

        case .none:
            throw EngineError(kind: .modelNotFound(requested: request.model))

So .noServiceAvailable was dead code: it existed but nothing ever produced it.

Fix

Three small changes, one new test file:

  1. ModelServiceRouter.resolve now returns one of three routes:

    • .service — at least one candidate handles the model and reports isAvailable().
    • .unavailable(requestedModel:) — at least one candidate handles(model:) but every such service answers isAvailable() == false.
    • .none — no candidate handles the model at all.
  2. Both ChatEngine switch sites (streaming and non-streaming) now throw EngineError(.noServiceAvailable) for .unavailable. noServiceAvailable.httpStatus is already wired to 503, and PR Fix: extend ChatEngine HTTP-status mapping to Anthropic /messages and Open Responses /responses #6 propagates that to the wire response on the Anthropic / Open Responses endpoints; /v1/chat/completions already used it.

  3. CoreModelService collapses .unavailable and .none to its existing CoreModelError.modelUnavailable. The public CoreModelError API doesn't currently distinguish them, and the API-facing distinction is already covered via ChatEngine.

  4. New focused unit tests (ModelServiceRouterTests.swift) cover:

    • .service returned when the handler is available.
    • .none returned when no service claims the model.
    • .unavailable returned when a handler exists but reports unavailable.
    • Two candidates, one online and one offline — router picks the online one.
    • The same outcomes for remote-only services and for the default-model path.

Known limitation (called out in the commit message and in resolve's docstring)

This fixes the case where a local service reports !isAvailable(). It does not yet fix the case where a remote provider is configured but disconnected, because callers (e.g. ChatEngine) pass RemoteProviderManager.shared.connectedServices() — disconnected remotes aren't in the array at all, so the router never sees them. Distinguishing that case would need RemoteProviderManager to expose a configured-but-disconnected list, which is a separate, larger change.

Changes

  • Behavior change (additive — new enum case; existing .service/.none consumers behave identically)
  • UI change
  • Refactor / chore
  • Tests (new ModelServiceRouterTests)
  • Docs

Test Plan

  1. cd Packages/OsaurusCore && swift test --filter ModelServiceRouterTests should pass.
  2. Manually: simulate Foundation Models unavailability (e.g. on a Mac that doesn't support it) and call /v1/chat/completions with "model":"foundation". Expected: 503 + noServiceAvailable error body. Previously: 404 + modelNotFound.
  3. Call with a genuinely unknown model id ("model":"zzz"). Expected: 404 + modelNotFound. Unchanged.

Checklist

  • I have read CONTRIBUTING.md
  • I added/updated tests where reasonable
  • I updated docs/README as needed (n/a — internal routing change)
  • I verified build on macOS with Xcode 16.4+ (authored in a Linux sandbox; verified each touched file via swiftc -frontend -parse)
Open in Web Open in Cursor 

ChatEngine.EngineError already declared two cases that map to
different HTTP statuses:

  case modelNotFound       -> 404
  case noServiceAvailable  -> 503

…but the producer (the two switch sites in ChatEngine on
ModelServiceRouter.resolve and the one in CoreModelService) only ever
threw modelNotFound. The 503 case was dead code, and every routing
failure — including ones where a service that handles the model
existed but reported isAvailable() == false — surfaced as 404.

This contradicts the audit's expectation set by PR osaurus-ai#863 / issue osaurus-ai#858:
the WorkView error classifier and external API consumers use the HTTP
status code to give users actionable feedback. 'install the model' vs
'service unavailable, retry' is exactly the distinction we couldn't
make.

Extend ModelRoute with a third case, .unavailable, that the router
returns when at least one candidate service answers handles(model:)
== true but every such service answers isAvailable() == false. The
two ChatEngine switch sites now throw .noServiceAvailable for
.unavailable, so the existing 503 path actually fires. CoreModelService
collapses both to its existing modelUnavailable error since its
public API doesn't currently distinguish them (callers needing the
distinction go through ChatEngine).

Limitation worth calling out: the router only sees the connected
remote-provider list (callers pass connectedServices() at the call
site). A configured-but-disconnected remote provider is therefore not
visible to the router, and its absence still resolves to .none (404).
Fixing that requires RemoteProviderManager to expose the configured-
but-disconnected services list, which is a separate change.

New focused unit tests cover all three outcomes (service / unavailable
/ none) plus the multi-service-some-unavailable case (router prefers
the available service).

Co-authored-by: Michael Meding <mimeding@users.noreply.github.com>
ModelManager.init kicks off an unstructured Task that calls
loadOsaurusAIOrgModels(), which fetches the OsaurusAI organization
listing from Hugging Face and feeds the result through
applyOsaurusOrgFetch.

The unit-test runner repeatedly constructs ModelManager() to drive
applyOsaurusOrgFetch directly. The background launch-time fetch
races with those test calls — whichever finishes last wins, and
the merge result is non-deterministic. That's the root cause of
the flaky ModelManagerSuggestedTests failures seen across many of
the recent PR CI runs (applyOsaurusOrgFetch_dropsStaleAutoFetched
OnReapply, applyOsaurusOrgFetch_addsNewEntriesAfterCurated, etc.).

Gate the launch-time fetch on a small isRunningInTestEnvironment
helper that checks for any of XCTestConfigurationFilePath,
XCTestBundlePath, or XCTestSessionIdentifier in the process
environment. Those variables are only present inside an xctest host
process; production app launches still get the HF fetch exactly as
before.

This is a network call, so removing it under tests also has the
side benefit of making the test suite work offline / on hermetic
CI runners.

Co-authored-by: Michael Meding <mimeding@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants