feat(modelsdev): embed models.dev catalog snapshot as binary fallback #3277
Merged
docker-agent (Contributor) left a comment
Assessment: 🟡 NEEDS ATTENTION
One finding worth addressing before merge. Overall the design is sound — the authoritative flag correctly prevents pinning the embedded fallback in s.db, and the snapshot/embed wiring is clean. The generator tool, memoization logic, and test coverage look correct.
cf244c8 to d1c7f36
Adds a build-time snapshot of the models.dev catalog baked into the binary via go:embed, used as a last-resort fallback when the cache and live fetch are both unavailable. A generator (pkg/modelsdev/internal/gen) refreshes the snapshot; task update-models and a weekly GitHub Actions workflow keep it current.

The fallback snapshot is not memoized on transient fetch failure, so the Store retries the fetch once the network recovers.

Context windows for Fable and Opus 4.6-4.8 now come from the embedded snapshot (always available, even offline); the Anthropic client falls back to the flat 200k floor only when no catalog entry exists.

A warning is logged when the embedded snapshot fails to parse so a bad build artifact is discoverable.

The snapshot freshness check is opt-in: TestSnapshotDateParses always runs (time-independent) while TestSnapshotDateIsFresh is skipped unless CHECK_MODELS_SNAPSHOT_FRESHNESS is set. A 'task check-models-fresh' target provides the CI backstop.

Assisted-By: Claude
d1c7f36 to 2d15724
melmennaoui approved these changes Jun 29, 2026
Without a network connection and with no on-disk cache, the agent currently falls back to a small set of hardcoded model defaults. This means a fresh install in an air-gapped or offline environment has no useful model catalog at all. This change embeds a build-time snapshot of the full models.dev catalog into the binary as a last-resort fallback, so a fresh binary always ships with a complete, usable catalog even when nothing else is available.
The snapshot (pkg/modelsdev/snapshot.json, ~1.5 MB compact JSON) is generated by a small tool in pkg/modelsdev/internal/gen that fetches models.dev/api.json and re-marshals it through the trimmed Database type to keep the file small. It is regenerated via task update-models or go generate ./pkg/modelsdev/.... A weekly GitHub Actions workflow (.github/workflows/update-models.yml) keeps the snapshot current by opening an automated PR; this avoids stale data without tying reproducibility to a live network fetch at build time.

loadDatabase now returns both a database and an authoritative flag; the Store only memoizes results that are authoritative (cache hit or live fetch), so a transient fetch failure does not pin the embedded fallback for the Store's lifetime. A subsequent lookup will retry once the network recovers.

Because the snapshot reliably carries Claude Fable and Opus 4.6–4.8 with their correct 1M context windows, the hardcoded DefaultClaudeContextLimit special-casing in the Anthropic client has been removed. The client now falls back to a conservative 200k floor only when no catalog entry exists at all.

The snapshot freshness test (TestSnapshotDateIsFresh) is opt-in via the CHECK_MODELS_SNAPSHOT_FRESHNESS env var so it does not become a CI time-bomb on forks or stale branches after 90 days; a task check-models-fresh target runs it explicitly, while a separate always-on test validates that the embedded date is well-formed.