Skip to content

fix(providers): add per-model attachment capability override (#2741)#3205

Merged
dgageot merged 3 commits into
mainfrom
fix/2741-attachment-caps-override
Jun 26, 2026
Merged

fix(providers): add per-model attachment capability override (#2741)#3205
dgageot merged 3 commits into
mainfrom
fix/2741-attachment-caps-override

Conversation

@Sayt-0

@Sayt-0 Sayt-0 commented Jun 22, 2026

Copy link
Copy Markdown
Member

Problem

Custom and aliased OpenAI-compatible providers (xai, mistral, nebius, ollama, and user-defined proxies) resolve attachment capabilities through a models.dev lookup keyed on the config provider/model string (base.Config.ID()). When that pair is absent from models.dev, modelinfo.LoadCaps swallows the "not found" error and returns conservative text-only capabilities, so attachment.Decide drops every image and PDF part with only a generic warning. Affected models lose attachments silently.

Fixes #2741.

Verified against the live models.dev catalogue:

Provider string In models.dev? Failing case
ollama no (only ollama-cloud, no local tags) ollama/llava dropped
xai yes, but grok-2-vision-1212 is gone (drifted to grok-4.x) legacy vision model dropped
custom proxy no (user-defined) vision-capable gpt-4o dropped

Why not a provider mapping table

The issue listed a provider to models.dev mapping table as one option. Live api.json shows it would fix almost nothing: xai, mistral, nebius, minimax, github-copilot already match models.dev exactly (identity mapping), ollama has no catalogue entry at all, and neither model-absence (grok-2-vision-1212) nor user-defined providers can be fixed by a provider rename. The vertexai precedent (withModelsDevProvider) works there only because those models do exist in models.dev under a different key, which does not hold here.

What this does

Implements two of the issue's three options: the per-model escape hatch and the non-silent diagnostic.

  • New typed override on the model config: capabilities: {image, pdf}. When set it is authoritative and skips the models.dev lookup; when omitted, behaviour is unchanged.
  • A deduplicated diagnostic (once per model id) when a model is absent from models.dev, naming the id and pointing at the override.
Layer Change
pkg/config/latest CapabilitiesConfig plus optional Capabilities field on ModelConfig; isShorthandOnly updated
pkg/modelinfo CapsOverride plus ResolveCaps(store, id, override); once-per-model miss diagnostic in LoadCaps
pkg/model/provider/base Config.CapsOverride() builds the override from config
providers override threaded through the 5 conversion paths: oaistream (openai chat, dmr), openai responses, anthropic, bedrock, gemini
agent-schema.json, examples/capability-overrides.yaml schema plus example

modelinfo.ResolveCaps takes plain booleans, so modelinfo keeps no dependency on the config package (layering preserved). The override is resolved once per client and passed down; a nil override reproduces prior behaviour at every call site.

Example:

models:
  llava-local:
    provider: ollama
    model: llava
    capabilities:
      image: true
      pdf: false

Issue expectations mapping

Issue option Status
Per-model capability override in config done (capabilities)
Warn rather than silently drop done (deduplicated miss diagnostic)
Provider to models.dev mapping table not done; shown ineffective against current models.dev data

Backward compatibility

Capabilities is optional and nil by default; existing configs are unaffected. Only pkg/config/latest is touched (frozen config versions untouched).

Tests

Test Covers
pkg/model/provider/oaistream repro the drop for an uncatalogued provider/model, and the override restoring the image
pkg/modelinfo ResolveCaps plus diagnostic override precedence, nil fallback, once-per-model warning
pkg/config/latest YAML round-trip, shorthand, deep clone of the override
pkg/model/provider end-to-end override survives Registry.New to the built client CapsOverride()
pkg/model/provider/base CapsOverride() mapping

Notes and residual design points

  • Status is needs-design; the chosen direction is the override plus the diagnostic. The field name and the "override fully wins" semantics are open for maintainer review.
  • Anthropic keeps the "any Claude model supports image and PDF" heuristic when no override is set; for a Claude model absent from models.dev the diagnostic may fire once even though the heuristic then restores capabilities. This is harmless and deduplicated.

Custom and aliased OpenAI-compatible providers (xai, mistral, nebius, ollama,
and user-defined proxies) resolve attachment capabilities through a models.dev
lookup keyed on the config provider/model string. When that pair is absent from
models.dev (uncatalogued provider, local Ollama tags, or a dropped model
version), detection silently fell back to text-only and dropped image and PDF
attachments (issue #2741).

Add a typed `capabilities: {image, pdf}` override on the model config. When set
it is authoritative and bypasses the models.dev lookup; when omitted, behaviour
is unchanged. Also emit a deduplicated diagnostic when a model is absent from
models.dev so the degraded behaviour is no longer silent.

The override is resolved once per client (base.Config.CapsOverride) and threaded
through the five attachment-conversion paths: oaistream (openai chat, dmr),
openai responses, anthropic, bedrock, gemini. modelinfo.ResolveCaps takes plain
booleans so modelinfo keeps no dependency on the config package.
@Sayt-0 Sayt-0 requested a review from a team as a code owner June 22, 2026 19:52
@aheritier aheritier added area/providers For features/issues/fixes related to LLM providers (Bedrock, LiteLLM, Qwen, custom, etc.) kind/fix PR fixes a bug (maps to fix:). Use on PRs only. labels Jun 22, 2026
@aheritier aheritier marked this pull request as draft June 25, 2026 12:01
@aheritier aheritier added the status/needs-rebase PR has merge conflicts or is out of date with main label Jun 25, 2026
Reconcile the per-model attachment capability override (#2741) with main's
attachment caps-injection refactor.

Conflicts and resolution:
- pkg/model/provider/oaistream/messages.go, attachments.go: keep main's
  caps-injection internals (convertMessagesWithCaps / convertMultiContentWithCaps
  / convertDocumentWithCaps and the public ConvertMessagesWithCaps); re-add the
  override only at the public ConvertMessages / ConvertMultiContent entry points,
  resolving via modelinfo.ResolveCaps instead of LoadCaps. The now-dead
  convertDocument store wrapper is dropped.
- pkg/model/provider/dmr/client.go: default attachment capabilities to those
  declared in provider_opts (main's attachmentCaps); a config capabilities
  override, when present, takes precedence.
@Sayt-0 Sayt-0 force-pushed the fix/2741-attachment-caps-override branch from 4b2a969 to 8f24cef Compare June 25, 2026 16:58
@Sayt-0 Sayt-0 marked this pull request as ready for review June 25, 2026 17:02
@aheritier aheritier removed the status/needs-rebase PR has merge conflicts or is out of date with main label Jun 26, 2026
@dgageot dgageot merged commit 3fc2fb2 into main Jun 26, 2026
9 checks passed
@dgageot dgageot deleted the fix/2741-attachment-caps-override branch June 26, 2026 09:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/providers For features/issues/fixes related to LLM providers (Bedrock, LiteLLM, Qwen, custom, etc.) kind/fix PR fixes a bug (maps to fix:). Use on PRs only.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Custom/aliased OpenAI-compatible providers: attachment caps may degrade to text-only

3 participants