
refactor(router): unified model catalog as single source of truth #189

Merged
steventohme merged 1 commit into main from steven/router-model-catalog
May 18, 2026

@steventohme
Collaborator

Summary

Introduces internal/router/catalog/ — one struct literal per model holding tier, ordered provider bindings, and per-binding pricing. pricing and capability become thin facades over catalog, preserving every existing public type and helper.

Motivation

Adding a single new model used to require touching up to 11 places:

  • internal/router/pricing/pricing.go table
  • internal/router/capability/tier.go tiers + tier_test.go
  • internal/router/cluster/artifacts/v0.X/model_registry.json
  • internal/router/cluster/artifacts/v0.X/rankings.json
  • internal/router/cluster/artifacts/v0.X/metadata.yaml
  • install/install.sh, install/cc-statusline.sh
  • the per-provider modelIDMap in cmd/router/main.go (when SOC 2 routing flips a model)

It's all the same data structurally — exactly the smell aider, litellm, Cline, and continue.dev solve with a single typed catalog. This PR collapses the Go-side pieces of that pile to one struct literal in catalog.go plus a go run ./cmd/genprices to regenerate the shell price block.

Multi-provider per model (PR 8.5 of the SOC 2 plan)

Each Model carries an ordered Providers []ProviderBinding list. The cluster scorer picks the first binding whose Provider name is in the deploy's available set, and the planner can ask the catalog for (provider, model)-keyed pricing. Today every entry is a single-element list — the shape is in place so the SOC 2 isolation work (#187) can append an OpenRouter fallback to each OSS row without touching any call site:

{ID: "deepseek/deepseek-v4-pro", Tier: TierHigh, Providers: []ProviderBinding{
    {Provider: ProviderFireworks,  UpstreamID: "...", Price: Pricing{...}},
    {Provider: ProviderOpenRouter,                    Price: Pricing{...}}, // self-hoster fallback
}},

Managed prod (no OPENROUTER_API_KEY) gets the primary binding; self-hosters with only an OpenRouter key get the trailing one.
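The first-match selection described above can be sketched as follows. This is a minimal illustration, not the code from the follow-up PR: `resolveBinding`, the provider string values, the placeholder model ID, and the prices are all assumptions, and only the `Provider` / `UpstreamID` / `Price` shape is taken from the snippet in this description.

```go
package main

import "fmt"

// Provider constant names mirror the PR snippet; the string values are assumed.
type Provider string

const (
	ProviderFireworks  Provider = "fireworks"
	ProviderOpenRouter Provider = "openrouter"
)

// Pricing field names follow the InputUSDPer1M / OutputUSDPer1M fields
// visible in the review thread; the values used below are placeholders.
type Pricing struct {
	InputUSDPer1M  float64
	OutputUSDPer1M float64
}

type ProviderBinding struct {
	Provider   Provider
	UpstreamID string
	Price      Pricing
}

type Model struct {
	ID        string
	Providers []ProviderBinding
}

// resolveBinding returns the first binding whose provider is in the
// deploy's available set. It is a hypothetical stand-in for the
// ResolveBinding plumbing that this PR defers.
func resolveBinding(m Model, available map[Provider]bool) (ProviderBinding, bool) {
	for _, b := range m.Providers {
		if available[b.Provider] {
			return b, true
		}
	}
	return ProviderBinding{}, false
}

func main() {
	m := Model{ID: "example/model", Providers: []ProviderBinding{
		{Provider: ProviderFireworks, UpstreamID: "placeholder-upstream-id",
			Price: Pricing{InputUSDPer1M: 1, OutputUSDPer1M: 2}},
		{Provider: ProviderOpenRouter,
			Price: Pricing{InputUSDPer1M: 1.5, OutputUSDPer1M: 3}},
	}}

	// Managed prod: only the primary provider is configured.
	b, _ := resolveBinding(m, map[Provider]bool{ProviderFireworks: true})
	fmt.Println(b.Provider)

	// Self-hoster: only an OpenRouter key is present.
	b, _ = resolveBinding(m, map[Provider]bool{ProviderOpenRouter: true})
	fmt.Println(b.Provider)
}
```

Because order encodes preference, appending a fallback binding never changes which binding a deploy with the primary provider receives.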

What changed

  • New package internal/router/catalog/ with catalog.go (data) + lookup.go (accessors) + catalog_test.go (schema invariants).
  • internal/router/pricing/pricing.go table deleted; For/All/Pricing/DefaultCacheReadMultiplier now thin pass-throughs into catalog. Math helpers (EffectiveInputCost, EffectiveOutputCost) unchanged.
  • internal/router/capability/tier.go tiers map deleted; Tier/TierLow/TierMid/TierHigh aliased from catalog; TierFor/IsAtOrBelow/AllowedAtOrBelow/Validate delegate.
  • cmd/genprices already reads through otel.AllPricing() → pricing.All() → catalog. Zero-diff regen.
  • CLAUDE.md + AGENTS.md mirrors updated for internal/router/, pricing/, capability/, and the new catalog/.

Behavior- and price-preserving

  • pricing.For / pricing.All / pricing.EffectiveInputCost / EffectiveOutputCost API-stable; values byte-identical to pre-refactor.
  • capability.TierFor / IsAtOrBelow / AllowedAtOrBelow / Validate API-stable.
  • install/install.sh + install/cc-statusline.sh regenerate with zero diff.

What's NOT in this PR

  • The planner call-site switch to catalog.PriceFor(provider, model) and the cluster scorer's ResolveBinding plumbing — both land in the follow-up PR that actually introduces multi-binding rows (SOC 2 provider isolation).
  • internal/router/model.go ModelSpec (wire-format capabilities like CapReasoning, CapExtendedThinking) — separate concept, not folded here.
  • No retraining, no artifact version bump.

Test plan

  • go test ./... green
  • go run ./cmd/genprices produces zero diff vs current install/ files
  • New catalog_test.go invariants: no dup IDs, every model has ≥1 binding, every Provider is a canonical constant, every binding has positive pricing, ByID date-suffix fallback works
  • Staging boot: cluster scorer initializes against catalog-backed pricing/tier; no Validate() regressions
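The schema invariants listed for catalog_test.go could be checked along these lines. This is a sketch under stated assumptions, not the actual test file: the struct shapes are simplified, the canonical provider set is invented for illustration, and the 8-digit `-YYYYMMDD` suffix format in `byID` is a guess, since the PR only says that a date-suffix fallback exists.

```go
package main

import (
	"fmt"
	"strings"
)

type Pricing struct{ InputUSDPer1M, OutputUSDPer1M float64 }

type ProviderBinding struct {
	Provider string
	Price    Pricing
}

type Model struct {
	ID        string
	Providers []ProviderBinding
}

// canonical is an assumed provider set used only for this illustration.
var canonical = map[string]bool{"fireworks": true, "openrouter": true}

// validate checks: no duplicate IDs, at least one binding per model,
// canonical provider names, and strictly positive pricing.
func validate(models []Model) error {
	seen := make(map[string]bool, len(models))
	for _, m := range models {
		if seen[m.ID] {
			return fmt.Errorf("duplicate model ID %q", m.ID)
		}
		seen[m.ID] = true
		if len(m.Providers) == 0 {
			return fmt.Errorf("model %q has no bindings", m.ID)
		}
		for _, b := range m.Providers {
			if !canonical[b.Provider] {
				return fmt.Errorf("model %q: unknown provider %q", m.ID, b.Provider)
			}
			if b.Price.InputUSDPer1M <= 0 || b.Price.OutputUSDPer1M <= 0 {
				return fmt.Errorf("model %q: non-positive pricing", m.ID)
			}
		}
	}
	return nil
}

// isDate reports whether s is exactly eight ASCII digits.
func isDate(s string) bool {
	if len(s) != 8 {
		return false
	}
	for _, r := range s {
		if r < '0' || r > '9' {
			return false
		}
	}
	return true
}

// byID tries an exact match, then retries once with a trailing
// "-YYYYMMDD" removed (assumed suffix format).
func byID(models []Model, id string) (Model, bool) {
	for attempt := 0; attempt < 2; attempt++ {
		for _, m := range models {
			if m.ID == id {
				return m, true
			}
		}
		i := strings.LastIndex(id, "-")
		if i < 0 || !isDate(id[i+1:]) {
			break
		}
		id = id[:i]
	}
	return Model{}, false
}

func main() {
	models := []Model{{ID: "example/model", Providers: []ProviderBinding{
		{Provider: "fireworks", Price: Pricing{InputUSDPer1M: 1, OutputUSDPer1M: 2}},
	}}}
	fmt.Println(validate(models))
	_, ok := byID(models, "example/model-20260101")
	fmt.Println(ok)
}
```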

🤖 Generated with Claude Code

Introduces internal/router/catalog/ — one struct literal per model holding
tier, ordered provider bindings, and per-binding pricing. Deletes the
internal/router/pricing/ and internal/router/capability/ packages; their
public surface was already a single-row table plus a few thin helpers,
and after the catalog absorbed them they had no logic left worth keeping.

Cost math (EffectiveInputCost / EffectiveOutputCost) lives in
catalog/cost.go so there's exactly one place in the codebase that owns
per-model data and the math built on top of it. The OTel emitter,
telemetry write path, billing debit hook, and planner all funnel through
catalog directly — no facade layer in between.

Adding a model used to touch up to 11 places (pricing table, tier table,
ModelSpec registry, model_registry.json, rankings.json, metadata.yaml,
install scripts, capability tests, and the per-provider modelIDMap in
cmd/router/main.go). After this change it's one struct literal in
catalog.go plus the model_registry.json entry for whichever cluster
artifact bundles serve it; `go run ./cmd/genprices` regenerates the
install-script price block.

The multi-binding shape is in place even though every model carries a
single binding today — this lets the SOC 2 provider-isolation work
(#187) append fallback bindings (e.g. OpenRouter behind
Bedrock/DeepInfra/Fireworks primaries) without touching call sites.

Behavior- and price-preserving:
- install/install.sh + install/cc-statusline.sh regenerate with zero diff.
- All existing tests pass (gofmt + go test ./... clean).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@steventohme steventohme force-pushed the steven/router-model-catalog branch from a8c0e19 to a636492 Compare May 18, 2026 02:50
@steventohme steventohme merged commit 9ea6fd0 into main May 18, 2026
7 checks passed
@steventohme steventohme deleted the steven/router-model-catalog branch May 18, 2026 02:52

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit a636492.

Comment thread internal/proxy/service.go
Float64("catalog.requested_input_per_1m", reqPricing.InputUSDPer1M).
Float64("catalog.requested_output_per_1m", reqPricing.OutputUSDPer1M).
Float64("catalog.actual_input_per_1m", actPricing.InputUSDPer1M).
Float64("catalog.actual_output_per_1m", actPricing.OutputUSDPer1M).

OTel span attribute keys silently renamed, breaking downstream consumers

High Severity

The OTel span attribute keys were renamed from pricing.requested_input_per_1m / pricing.actual_input_per_1m etc. to catalog.requested_input_per_1m / catalog.actual_input_per_1m. These are wire-protocol names consumed by external dashboards, alerting rules, and analytics queries. Any downstream consumer querying the old pricing.* keys will silently receive no data. The PR claims to be "behavior- and price-preserving" but these attribute key names are part of the observable contract. The attribute keys are an external interface, not an internal Go package name — they needn't track the refactor.
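One way to decouple the emitted keys from the Go package name is to pin them as constants, sketched below. This is only an illustration of the idea, not the fix applied in the repository: the constant names and the `pricingAttrs` helper are hypothetical, a plain map stands in for the real span builder, and the two `*_output_per_1m` keys are inferred from the "etc." in the comment above rather than quoted from it.

```go
package main

import "fmt"

// Span attribute keys are part of the external contract, so they are
// pinned here and do not follow package renames. Names are hypothetical.
const (
	attrRequestedInputPer1M  = "pricing.requested_input_per_1m"
	attrRequestedOutputPer1M = "pricing.requested_output_per_1m" // inferred key
	attrActualInputPer1M     = "pricing.actual_input_per_1m"
	attrActualOutputPer1M    = "pricing.actual_output_per_1m" // inferred key
)

// Pricing mirrors the two fields visible in the diff context.
type Pricing struct{ InputUSDPer1M, OutputUSDPer1M float64 }

// pricingAttrs builds the attribute set for one request. A map is used
// in place of the real span or logger builder.
func pricingAttrs(requested, actual Pricing) map[string]float64 {
	return map[string]float64{
		attrRequestedInputPer1M:  requested.InputUSDPer1M,
		attrRequestedOutputPer1M: requested.OutputUSDPer1M,
		attrActualInputPer1M:     actual.InputUSDPer1M,
		attrActualOutputPer1M:    actual.OutputUSDPer1M,
	}
}

func main() {
	// Placeholder prices, not real catalog values.
	attrs := pricingAttrs(
		Pricing{InputUSDPer1M: 1, OutputUSDPer1M: 2},
		Pricing{InputUSDPer1M: 0.5, OutputUSDPer1M: 1},
	)
	fmt.Println(len(attrs), attrs[attrActualInputPer1M])
}
```

A small test asserting the literal key strings would turn an accidental rename into a failing check instead of a silent gap in dashboards.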

Additional Locations (2)