
feat(modelsdev): embed models.dev catalog snapshot as binary fallback #3277

Merged: dgageot merged 2 commits into docker:main from dgageot:embed-modelsdev-snapshot on Jun 29, 2026

@dgageot (Member) commented Jun 26, 2026

Without a network connection and with no on-disk cache, the agent currently falls back to a small set of hardcoded model defaults. This means a fresh install in an air-gapped or offline environment has no useful model catalog at all. This change embeds a build-time snapshot of the full models.dev catalog into the binary as a last-resort fallback, so a fresh binary always ships with a complete, usable catalog even when nothing else is available.

The snapshot (pkg/modelsdev/snapshot.json, ~1.5 MB of compact JSON) is generated by a small tool in pkg/modelsdev/internal/gen that fetches models.dev/api.json and re-marshals it through the trimmed Database type to keep the file small. It is regenerated via task update-models or go generate ./pkg/modelsdev/.... A weekly GitHub Actions workflow (.github/workflows/update-models.yml) keeps the snapshot current by opening an automated PR, which avoids stale data without tying build reproducibility to a live network fetch.

loadDatabase now returns both a database and an authoritative flag. The Store memoizes only results that are authoritative (cache hit or live fetch), so a transient fetch failure does not pin the embedded fallback for the Store's lifetime: a subsequent lookup retries once the network recovers.

Because the snapshot reliably carries Claude Fable and Opus 4.6–4.8 with their correct 1 M context windows, the hardcoded DefaultClaudeContextLimit special-casing in the Anthropic client has been removed. The client now falls back to a conservative 200 k floor only when no catalog entry exists at all. The snapshot freshness test (TestSnapshotDateIsFresh) is opt-in via the CHECK_MODELS_SNAPSHOT_FRESHNESS env var so it does not become a CI time-bomb on forks or stale branches after 90 days; a task check-models-fresh target runs it explicitly, while a separate always-on test validates that the embedded date is well-formed.

@dgageot requested a review from a team as a code owner on June 26, 2026 16:43

@docker-agent (Contributor) left a comment

Assessment: 🟡 NEEDS ATTENTION

One finding worth addressing before merge. Overall the design is sound — the authoritative flag correctly prevents pinning the embedded fallback in s.db, and the snapshot/embed wiring is clean. The generator tool, memoization logic, and test coverage look correct.

Comment thread pkg/modelsdev/snapshot.go
@aheritier added the labels area/ci (CI/CD workflows and pipeline), area/models (LLM model integrations and model providers), area/providers/anthropic (for features/issues/fixes related to the usage of Anthropic models), and kind/feat (PR adds a new feature; maps to feat:; use on PRs only) on Jun 26, 2026
@dgageot force-pushed the embed-modelsdev-snapshot branch from cf244c8 to d1c7f36 on June 27, 2026 05:50
Adds a build-time snapshot of the models.dev catalog baked into the
binary via go:embed, used as last-resort fallback when the cache and
live fetch are both unavailable. A generator (pkg/modelsdev/internal/gen)
refreshes the snapshot; task update-models and a weekly GitHub Actions
workflow keep it current.

The fallback snapshot is not memoized on transient fetch failure, so the
Store retries the fetch once the network recovers. Context windows for
Fable and Opus 4.6-4.8 now come from the embedded snapshot (always
available, even offline); the Anthropic client falls back to the flat
200k floor only when no catalogue entry exists. A warning is logged when
the embedded snapshot fails to parse so a bad build artifact is
discoverable.

The snapshot freshness check is opt-in: TestSnapshotDateParses always
runs (time-independent) while TestSnapshotDateIsFresh is skipped unless
CHECK_MODELS_SNAPSHOT_FRESHNESS is set. A 'task check-models-fresh'
target provides the CI backstop.

Assisted-By: Claude
@dgageot force-pushed the embed-modelsdev-snapshot branch from d1c7f36 to 2d15724 on June 27, 2026 14:00
@dgageot merged commit 36180b4 into docker:main on Jun 29, 2026
7 checks passed
@dgageot deleted the embed-modelsdev-snapshot branch on June 29, 2026 08:13
4 participants