refactor(core): source pricing + context limits from tokenlens registry by faraa2m · Pull Request #8 · faraa2m/tokenometer

faraa2m · 2026-05-09T03:34:14Z

Summary

Replace the hand-maintained RATES / MODELS tables in packages/core/src/rates.ts with a registry built at module load from @tokenlens/models (anthropic + openai + google subpaths). 7 curated models → 42 stable models, no extra manual maintenance.
Every ModelDescriptor now carries contextWindow, maxOutputTokens, and pricingSource: 'local' | 'tokenlens' (from models.dev).
Surface the new metadata in CLI (Limits: block under the cost table), web Playground (provider-grouped selector with · 128k context chips), and GitHub Action (Limits: line in the sticky PR comment).

Public API (getRate, getModel, KNOWN_MODELS, RateEntry, ModelDescriptor) is byte-compatible — no consumer code outside the touched files needs to change.

Why a hybrid registry, not a full swap?

claude-haiku-4-5 / claude-opus-4-7 / claude-sonnet-4-6 are not yet in tokenlens upstream (its anthropic catalog tops at claude-opus-4-1-20250805). They stay in a small LOCAL_OVERRIDES table that wins on id collision. A weekly cron (.github/workflows/registry-check.yml) runs scripts/check-overrides.mjs:

Detects when upstream picks up an overridden id → fail with "drop the override".
Diffs tokenlens-sourced pricing/context against packages/core/src/__snapshots__/registry.json → fail on drift, prompt to regenerate the snapshot.

On failure the workflow opens (or comments on) a tracking issue.

Other changes

benchmarks/run.mjs gains --filter / --models (also honors BENCH_MODELS env) so the regenerate sweep can be scoped — the matrix grew from 7×N×M to 42×N×M cells.
benchmarks/results.json regenerated against the full sweep.
Rates tests loosened from exact-dollar assertions → per-model positive-price invariants + canary checks on stable IDs + registry-size floor (>=15) so a broken registry can't silently pass.
packages/core adds two npm scripts: snapshot:registry, check:overrides.

Test plan

npm run build (all packages compile)
npm run lint (biome clean)
npm run typecheck
npx vitest run — 48 tests pass (was 46; added 2 for renderModelLimits)
npm run benchmarks — drift check passes against regenerated results.json
npm run check:overrides -w @tokenometer/core → OK — 42 models, 3 overrides intact, snapshot in sync.
CLI smoke: echo "..." | node packages/cli/dist/index.js - --model claude-opus-4-7,gpt-4o produces the expected table + new Limits: block + summary
First scheduled run of registry-check.yml (Mondays 12:00 UTC) — verify via workflow_dispatch after merge

Open follow-ups (out of scope)

The web bundle grows by the size of the bundled models.dev provider data (~33KB raw across the 3 providers). For an even smaller browser footprint, tokenlens ships an async fetchModels() path that could be opted into for packages/web only.
Adding gpt-3.5-turbo / gpt-4 / gpt-4-turbo (now in KNOWN_MODELS) reuses the existing o200k_base openai dispatch in tokenize.ts. Those older models are technically cl100k_base. Pre-existing bug surfaced by the wider catalog; worth a dedicated fix.

🤖 Generated with Claude Code

Replace the hand-maintained RATES/MODELS tables in packages/core with a registry built at module load from @tokenlens/models (anthropic + openai + google subpaths). The 7-entry curated set grows to 42 stable models with no additional manual maintenance, and every model now carries contextWindow + maxOutputTokens + pricingSource metadata sourced from models.dev. Three Anthropic models (claude-haiku-4-5, claude-opus-4-7, claude-sonnet-4-6) are not yet in tokenlens upstream, so they remain in a small LOCAL_OVERRIDES table that wins on id collision. A weekly .github/workflows/registry-check.yml runs scripts/check-overrides.mjs which detects when upstream catches up (action: drop the override) or when tokenlens-sourced pricing drifts from the checked-in snapshot at packages/core/src/__snapshots__/registry.json. CI opens a tracking issue on findings. Public API (getRate, getModel, KNOWN_MODELS, RateEntry, ModelDescriptor) is byte-compatible. Consumers gain the new metadata: - CLI prints a Limits: block under the cost table showing ctx + max output per unique model - web Playground groups model checkboxes by provider with a context-window chip suffix (gpt-4o · 128k) - GitHub Action appends a Limits: line to the sticky PR comment benchmarks/run.mjs gains a --filter / --models flag (also BENCH_MODELS env) so the regenerate sweep can be scoped — the matrix grew from 7×N×M to 42×N×M cells. Existing results.json is regenerated to match. Tests are loosened from exact-dollar assertions to (a) per-model positive-price invariants, (b) canary checks on stable IDs (gpt-4o, gpt-4o-mini), (c) a registry-size floor (>=15) so a broken/empty registry can't pass silently. 48 tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

vercel · 2026-05-09T03:34:19Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
tokenometer	Ready	Preview, Comment	May 9, 2026 3:34am

- replace `--format md` with real format names (markdown, text) - drop stale `gemini-2.5-pro` example; add stdin example and `--help` pointer - correct empirical-mode description: real provider countTokens, no caching claim - fix CI snippet to match @v0 action inputs (models/formats/budget) - methodology table: empirical column reflects shipped countTokens dispatch - note pricing now comes from tokenlens registry (post #8 refactor) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ry (#8) Replace the hand-maintained RATES/MODELS tables in packages/core with a registry built at module load from @tokenlens/models (anthropic + openai + google subpaths). The 7-entry curated set grows to 42 stable models with no additional manual maintenance, and every model now carries contextWindow + maxOutputTokens + pricingSource metadata sourced from models.dev. Three Anthropic models (claude-haiku-4-5, claude-opus-4-7, claude-sonnet-4-6) are not yet in tokenlens upstream, so they remain in a small LOCAL_OVERRIDES table that wins on id collision. A weekly .github/workflows/registry-check.yml runs scripts/check-overrides.mjs which detects when upstream catches up (action: drop the override) or when tokenlens-sourced pricing drifts from the checked-in snapshot at packages/core/src/__snapshots__/registry.json. CI opens a tracking issue on findings. Public API (getRate, getModel, KNOWN_MODELS, RateEntry, ModelDescriptor) is byte-compatible. Consumers gain the new metadata: - CLI prints a Limits: block under the cost table showing ctx + max output per unique model - web Playground groups model checkboxes by provider with a context-window chip suffix (gpt-4o · 128k) - GitHub Action appends a Limits: line to the sticky PR comment benchmarks/run.mjs gains a --filter / --models flag (also BENCH_MODELS env) so the regenerate sweep can be scoped — the matrix grew from 7×N×M to 42×N×M cells. Existing results.json is regenerated to match. Tests are loosened from exact-dollar assertions to (a) per-model positive-price invariants, (b) canary checks on stable IDs (gpt-4o, gpt-4o-mini), (c) a registry-size floor (>=15) so a broken/empty registry can't pass silently. 48 tests pass. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- replace `--format md` with real format names (markdown, text) - drop stale `gemini-2.5-pro` example; add stdin example and `--help` pointer - correct empirical-mode description: real provider countTokens, no caching claim - fix CI snippet to match @v0 action inputs (models/formats/budget) - methodology table: empirical column reflects shipped countTokens dispatch - note pricing now comes from tokenlens registry (post #8 refactor) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

faraa2m merged commit 009479f into main May 9, 2026
7 checks passed

faraa2m mentioned this pull request May 9, 2026

docs(readme): align usage with shipped CLI + Action #10

Merged

3 tasks

faraa2m deleted the refactor/tokenlens-pricing-registry branch May 9, 2026 04:11

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor(core): source pricing + context limits from tokenlens registry#8

refactor(core): source pricing + context limits from tokenlens registry#8
faraa2m merged 1 commit into
mainfrom
refactor/tokenlens-pricing-registry

faraa2m commented May 9, 2026

Uh oh!

vercel Bot commented May 9, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

faraa2m commented May 9, 2026

Summary

Why a hybrid registry, not a full swap?

Other changes

Test plan

Open follow-ups (out of scope)

Uh oh!

vercel Bot commented May 9, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant