modelerrors: make overflow errors more specific #2818

Merged
dgageot merged 1 commit into docker:main from trungutt:trungutt/overflow-kind-classification
May 19, 2026

@trungutt trungutt commented May 19, 2026

Motivation

Context-overflow errors from LLM providers come in three distinct shapes, but modelerrors collapses them into one. The conflation has user-visible consequences:

  • A provider rejecting a single oversized request (HTTP 413, e.g. paste over the wire-body cap) is reported as "context window exceeded — try /compact" — but /compact cannot help, because the offending turn still has to be sent on every subsequent call. Sessions get stuck failing in the same way until the user starts over.
  • An image too large for the provider is reported as a generic context error, hiding the actual cause.
  • Token-count overflow and wire-size overflow share one retry path even though they need different recovery.

Without distinguishing them at the classifier, every downstream consumer (runtime retry logic, error-code routing, UI hints) has to guess.

Change

Introduce OverflowKind with three values, classified by a small two-tier function:

| Kind | Meaning | Recovery shape |
| --- | --- | --- |
| tokens | Conversation exceeds the model's context window | Compaction helps |
| wire | Request body exceeds the provider's wire-level limit | Compaction cannot |
| media | Image, PDF, or attachment too large | Strip media |

Classifier:

  1. Structured signals (high confidence): HTTP 413, body.error.type == "request_too_large", body.error.code == "context_length_exceeded".
  2. Provider-prose substrings as a fallback, covering the observed error wording across Anthropic, OpenAI, Bedrock, Gemini, Mistral, Groq, Vertex, OpenRouter, Ollama, Kimi, MiniMax, z.ai, and others. Best-effort — provider wording is not contractual, so this list is expected to drift and is easy to extend (one line per provider phrase).
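The two tiers above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code merged in this PR: the `providerError` struct, the string-backed `OverflowKind`, the `classifyOverflow` signature, and the four prose patterns are all hypothetical stand-ins, and mapping `request_too_large` to the wire kind is an assumption based on the name.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// OverflowKind labels which limit a request ran into.
// Assumption: a string-backed type; the real representation may differ.
type OverflowKind string

const (
	OverflowKindTokens OverflowKind = "tokens" // conversation exceeds the context window
	OverflowKindWire   OverflowKind = "wire"   // request body exceeds the wire-level limit
	OverflowKindMedia  OverflowKind = "media"  // image, PDF, or attachment too large
)

// providerError is a hypothetical, flattened view of a provider response.
type providerError struct {
	StatusCode int    // HTTP status
	Type       string // body.error.type
	Code       string // body.error.code
	Message    string // free-form provider prose
}

// prosePatterns is an illustrative subset, one line per phrase.
// Order matters: the first matching substring wins.
var prosePatterns = []struct {
	substr string
	kind   OverflowKind
}{
	{"image exceeds", OverflowKindMedia},
	{"request entity too large", OverflowKindWire},
	{"maximum context length", OverflowKindTokens},
	{"prompt is too long", OverflowKindTokens},
}

// classifyOverflow returns the kind and whether the error looks like an overflow.
func classifyOverflow(e providerError) (OverflowKind, bool) {
	// Tier 1: structured signals (high confidence).
	switch {
	case e.StatusCode == http.StatusRequestEntityTooLarge, e.Type == "request_too_large":
		return OverflowKindWire, true
	case e.Code == "context_length_exceeded":
		return OverflowKindTokens, true
	}
	// Tier 2: best-effort match on provider prose.
	msg := strings.ToLower(e.Message)
	for _, p := range prosePatterns {
		if strings.Contains(msg, p.substr) {
			return p.kind, true
		}
	}
	return "", false
}

func main() {
	kind, ok := classifyOverflow(providerError{StatusCode: 413})
	fmt.Println(kind, ok)
}
```

Checking structured fields first means a provider that rewords its message still classifies correctly as long as the status code or error code is stable; the prose table only has to cover providers that send neither.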

FormatError now returns three distinct, provider-agnostic messages (no vendor names — the wire cap is a deployment detail of the provider, not something the user can act on by name). Two new ErrorCode constants (request_too_large, media_too_large) reach external consumers via the existing ErrorEvent.Code channel so they can render the right hint per kind.
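As a rough sketch of how a consumer might route on kind: the three code strings below are the ones named in this PR, but the Go constant names, the `errorCodeFor` and `messageFor` helpers, and the message wording are hypothetical and only illustrate the "one hint per kind, no vendor names" idea.

```go
package main

import "fmt"

// OverflowKind mirrors the three kinds from the table above.
// Assumption: a string-backed type.
type OverflowKind string

const (
	OverflowKindTokens OverflowKind = "tokens"
	OverflowKindWire   OverflowKind = "wire"
	OverflowKindMedia  OverflowKind = "media"
)

// ErrorCode is the value carried on the error event.
type ErrorCode string

// The string values come from the PR; the constant names are assumed.
const (
	ErrorCodeContextExceeded ErrorCode = "context_exceeded"
	ErrorCodeRequestTooLarge ErrorCode = "request_too_large"
	ErrorCodeMediaTooLarge   ErrorCode = "media_too_large"
)

// errorCodeFor maps a kind to its code; unknown kinds keep the legacy code.
func errorCodeFor(kind OverflowKind) ErrorCode {
	switch kind {
	case OverflowKindWire:
		return ErrorCodeRequestTooLarge
	case OverflowKindMedia:
		return ErrorCodeMediaTooLarge
	default:
		return ErrorCodeContextExceeded
	}
}

// messageFor returns a provider-agnostic hint (hypothetical wording).
func messageFor(kind OverflowKind) string {
	switch kind {
	case OverflowKindWire:
		return "The last message is too large to send. Shorten it or attach less content."
	case OverflowKindMedia:
		return "An attachment is too large. Remove or shrink it and try again."
	default:
		return "The conversation exceeds the context window. Try /compact."
	}
}

func main() {
	for _, k := range []OverflowKind{OverflowKindTokens, OverflowKindWire, OverflowKindMedia} {
		fmt.Printf("%s -> %s: %s\n", k, errorCodeFor(k), messageFor(k))
	}
}
```

Falling through to `context_exceeded` in the default branch keeps existing consumers working when they see a kind they do not recognise.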

What is preserved

  • IsContextOverflowError matches the same set of errors as before.
  • Retry classification in ClassifyModelError is unchanged (overflow stays non-retryable).
  • Auto-compaction still fires on overflow (this PR labels; it does not yet change control flow).
  • NewContextOverflowError now auto-fills Kind from the underlying error, so existing wrap sites in pkg/runtime/fallback.go get the correct kind without code changes.
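The wrap semantics in the last bullet might look something like the sketch below. The struct fields, the `stubError` helper, and the deliberately tiny `classifyOverflowStub` stand-in are assumptions made for illustration; only the names `NewContextOverflowError`, `OverflowKindOf`, and the default to the tokens kind come from this PR.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

type OverflowKind string

const (
	OverflowKindTokens OverflowKind = "tokens"
	OverflowKindWire   OverflowKind = "wire"
	OverflowKindMedia  OverflowKind = "media"
)

// stubError is a hypothetical provider error used only for this example.
type stubError string

func (s stubError) Error() string { return string(s) }

// ContextOverflowError wraps a provider error; the field names are assumed.
type ContextOverflowError struct {
	Underlying error
	Kind       OverflowKind // empty on legacy wraps built before Kind existed
}

func (e *ContextOverflowError) Error() string {
	if e.Underlying == nil {
		return "context overflow"
	}
	return "context overflow: " + e.Underlying.Error()
}

func (e *ContextOverflowError) Unwrap() error { return e.Underlying }

// classifyOverflowStub is a minimal stand-in for the real two-tier classifier.
func classifyOverflowStub(err error) OverflowKind {
	if err == nil {
		return ""
	}
	msg := strings.ToLower(err.Error())
	switch {
	case strings.Contains(msg, "413"):
		return OverflowKindWire
	case strings.Contains(msg, "image"):
		return OverflowKindMedia
	case strings.Contains(msg, "context length"):
		return OverflowKindTokens
	}
	return ""
}

// NewContextOverflowError fills Kind from the underlying error,
// so call sites do not have to pass a kind.
func NewContextOverflowError(underlying error) *ContextOverflowError {
	return &ContextOverflowError{Underlying: underlying, Kind: classifyOverflowStub(underlying)}
}

// OverflowKindOf returns "" for non-overflow errors. For overflow errors it
// prefers the stored Kind, then re-classifies, then defaults to tokens.
func OverflowKindOf(err error) OverflowKind {
	var coe *ContextOverflowError
	if !errors.As(err, &coe) {
		return ""
	}
	if coe.Kind != "" {
		return coe.Kind
	}
	if k := classifyOverflowStub(coe.Underlying); k != "" {
		return k
	}
	return OverflowKindTokens
}

func main() {
	wrapped := fmt.Errorf("stream failed: %w", NewContextOverflowError(stubError("HTTP 413")))
	fmt.Println(OverflowKindOf(wrapped))
}
```

Defaulting unclassifiable wraps to the tokens kind is what keeps the previous behaviour: anything that used to be treated as a context-window overflow still is.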

@docker-agent docker-agent left a comment

Assessment: 🟢 APPROVE

The overflow classification logic is well-structured and correct. The two-tier classifier (structured signals → prose fallback), the OverflowKindOf unwrapping semantics, FormatError switch, and classifyErrorCode routing all look sound. Backward compatibility is preserved: IsContextOverflowError still matches the same error set, and unclassifiable wraps default to OverflowKindTokens. Test coverage is thorough across 26 classifier cases, wrap semantics, and all three FormatError paths.

@trungutt trungutt changed the title feat(modelerrors): classify overflow errors by kind modelerrors: make overflow errors more specific May 19, 2026
Split the single ContextOverflowError bucket into three OverflowKinds so
the runtime and clients can react differently to each:

  * tokens — conversation exceeds the model's context window.
            Compaction can usually help.
  * wire   — request body exceeds the provider's wire-level limit
            (e.g. HTTP 413). The latest turn alone is over the cap;
            compaction-as-retry cannot help.
  * media  — image, PDF, or similar attachment too large.

Classification runs two tiers, in order:

  1. Structured signals (high confidence): HTTP 413, body.error.type =
     "request_too_large", body.error.code = "context_length_exceeded".
  2. Provider-prose substring patterns as fallback, covering the
     observed wording across Anthropic, OpenAI, Bedrock, Gemini,
     Mistral, Groq, Vertex, OpenRouter, Ollama, Kimi, MiniMax, z.ai,
     and others. Best-effort; provider wording is not contractual.

FormatError now returns three distinct, actionable, provider-agnostic
messages instead of one. The runtime emits new ErrorCodes
"request_too_large" and "media_too_large" alongside the existing
"context_exceeded" so external consumers can render the right hint.

Behaviour is preserved: IsContextOverflowError matches the same set of
errors as before, retry classification is unchanged, and auto-compaction
still fires on overflow. NewContextOverflowError now auto-fills Kind
from the underlying error so existing wrap sites get the correct shape
for free.
@trungutt trungutt force-pushed the trungutt/overflow-kind-classification branch from 5448e5f to a6c7044 Compare May 19, 2026 09:04
@trungutt trungutt marked this pull request as ready for review May 19, 2026 09:21
@trungutt trungutt requested a review from a team as a code owner May 19, 2026 09:21
@docker-agent docker-agent left a comment

Assessment: 🟢 APPROVE

The classifier's two-tier approach (structured signals first, substring patterns as fallback) is sound. Tier 1 structured checks correctly map HTTP 413 → OverflowKindWire and context_length_exceeded → OverflowKindTokens. The OverflowKindOf helper correctly handles legacy zero-Kind wraps by falling back to classification on the underlying error. NewContextOverflowError auto-fills Kind correctly via classifyOverflow. The classifyErrorCode switch in loop_steps.go correctly routes all three kinds to distinct ErrorCode constants.

Intentional design noted: The PR description explicitly states "Auto-compaction still fires on overflow (this PR labels; it does not yet change control flow)" — the handleStreamError check is intentionally unchanged; control-flow differentiation for Wire/Media is deferred to a follow-up.

No bugs found in the changed code.

@dgageot dgageot merged commit c4ff92e into docker:main May 19, 2026
12 of 13 checks passed
3 participants