💰 TokenWatch

Compare pay-as-you-go LLM inference pricing across inference providers. Enter your token volumes and find the cheapest option.

Live site: https://payg-inference-calculator.pages.dev

How it works

scripts/fetch-pricing.mjs fetches pricing from 3 tiers: direct providers (DeepInfra, Crof, EmberCloud, Wafer, Synthetic, Lilac), OpenRouter's de-aggregated /endpoints API (per-backend pricing for Fireworks, Together, Novita, SiliconFlow, etc.), and static providers (Hyper, Makora, and Xiaomimimo from a CSV file, plus hardcoded OpenCode Go pricing). It normalizes all pricing to $/M tokens and writes public/pricing.json.
public/ is a zero-dependency static site (HTML/CSS/JS) that loads pricing.json client-side and computes costs in-browser.
GitHub Actions runs the fetch script daily (0 0 * * * UTC), commits updated pricing, and deploys to Cloudflare Pages.
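
The normalization step can be sketched as follows. This is a minimal illustration, not the code in scripts/fetch-pricing.mjs. It assumes an upstream source quotes prices in USD per single token as a number or decimal string. The helper name `toPerMillion` is hypothetical.

```javascript
// Hypothetical helper: convert a USD-per-token price to USD per million tokens.
// Assumption: the upstream value is a number or a decimal string like "0.000003".
function toPerMillion(perTokenUsd) {
  if (perTokenUsd === null || perTokenUsd === undefined || perTokenUsd === "") {
    return null; // missing price: report as unknown rather than free
  }
  const value = Number(perTokenUsd);
  if (!Number.isFinite(value) || value < 0) {
    return null; // malformed or negative price: treat as unknown
  }
  // Scale to $/M, then round to 6 decimal places to hide floating-point noise.
  return Math.round(value * 1e6 * 1e6) / 1e6;
}

console.log(toPerMillion("0.000003"));
console.log(toPerMillion("not-a-price"));
```

Returning `null` for unusable values keeps a missing price from being read as a price of zero, which would otherwise sort that offering to the top of a cheapest-first list.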

Usage

Search by provider: Type a provider name (e.g. "deepinfra", "fireworks", "wafer") to filter results to that inference provider across all models.
Search by model: Type a model name (e.g. "glm", "kimi", "gpt-4o") to filter results to matching models across all providers.
Both together: Use both search fields simultaneously (AND filter).
Token input: Enter total tokens (in millions) and set the percentage breakdown across input, cached input, and output. The calculator computes costs per offering and sorts cheapest-first.
Promo badges: Discounted offerings show a "promo" badge with the discount percentage. Promo prices are temporary; regular (structural) prices have no badge.
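
The dual search described above behaves like an AND filter over two fields. Here is a minimal sketch, assuming case-insensitive substring matching and a row shape of `{ provider, model }`. Both are assumptions, as are the function name `filterOfferings` and the sample rows, which are made up for illustration.

```javascript
// Hypothetical row shape: { provider: string, model: string }.
// Assumption: matching is a case-insensitive substring test on each field.
function filterOfferings(rows, providerQuery, modelQuery) {
  const p = providerQuery.trim().toLowerCase();
  const m = modelQuery.trim().toLowerCase();
  // An empty query matches every row, so either field can be left blank.
  return rows.filter(
    (row) =>
      row.provider.toLowerCase().includes(p) &&
      row.model.toLowerCase().includes(m)
  );
}

// Made-up sample rows, for illustration only.
const sampleRows = [
  { provider: "DeepInfra", model: "glm-example" },
  { provider: "DeepInfra", model: "kimi-example" },
  { provider: "Fireworks", model: "glm-example" },
];

console.log(filterOfferings(sampleRows, "deepinfra", "glm"));
```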

Token calculation

Costs are computed from a total token volume + percentage breakdown:

| Field | Default | Description |
| --- | --- | --- |
| Total tokens | 1000 (M) | Total tokens in millions (1000 = 1B tokens) |
| Input % | 2.5% | Tokens sent to the model |
| Cached input % | 97% | Cached prompt tokens (discounted input) |
| Output % | 0.5% | Tokens generated by the model |

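Example: 1000M tokens × 2.5% = 25M input tokens. Because prices are quoted in $/M, the input cost is 25 × the input price ($/M), which is the same as (25,000,000 tokens × $/M) / 1e6.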

Presets (input / cached input / output, in %): Agentic (2.5/97/0.5), Balanced (30/50/20), Heavy output (10/0/90), No cache (70/0/30).
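
The calculation above can be sketched as a small function. This is an illustration under stated assumptions, not the code in app.js. The names `computeCost`, `inputPct`, `cachedPct`, `outputPct`, and `cachedInput` are hypothetical, the prices are made up, and the fallback to the regular input price when no cached price exists is an assumption.

```javascript
// All prices are in USD per million tokens; totalMillions is in millions of tokens.
function computeCost(totalMillions, split, price) {
  // Token volumes per category, still expressed in millions.
  const inputM = (totalMillions * split.inputPct) / 100;
  const cachedM = (totalMillions * split.cachedPct) / 100;
  const outputM = (totalMillions * split.outputPct) / 100;
  // Assumption: with no cached-input price, bill cached tokens at the input price.
  const cachedPrice = price.cachedInput ?? price.input;
  return inputM * price.input + cachedM * cachedPrice + outputM * price.output;
}

// Agentic preset with made-up prices: 25M input, 970M cached, 5M output.
const agentic = { inputPct: 2.5, cachedPct: 97, outputPct: 0.5 };
const examplePrice = { input: 1.0, cachedInput: 0.1, output: 3.0 };

// 25 × 1.00 + 970 × 0.10 + 5 × 3.00, i.e. about 137 USD
console.log(computeCost(1000, agentic, examplePrice).toFixed(2));
```

Keeping volumes in millions means each term is a plain multiplication by the $/M price, with no separate division by 1e6.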

Data sources

| Source | Tier | Description |
| --- | --- | --- |
| Direct providers | Tier 1 | DeepInfra, Crof, EmberCloud, Wafer, Synthetic, Lilac; fetched via their own `/v1/models` endpoints |
| OpenRouter `/endpoints` | Tier 2 | De-aggregated per-backend pricing: each backend (Fireworks, Together, Novita, SiliconFlow, etc.) becomes its own row |
| CSV-sourced | Tier 3 | Hyper, Makora, Xiaomimimo (from `data/manual-pricing.csv`) |
| Hardcoded | Tier 3 | OpenCode Go (16 models with user-provided pricing) |

3-tier precedence: when the same (model, provider) appears in multiple tiers, the higher-authority tier wins — direct > OpenRouter > CSV/hardcoded. Quantization is not part of the dedup key — same model+provider at different quants collapses to one row.
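
The precedence rule can be sketched as a single pass over all fetched rows. This is a minimal illustration, not the code in scripts/fetch-pricing.mjs. The tier labels (`direct`, `openrouter`, `static`), the row shape, and the function name `dedupe` are hypothetical, and keeping the first row seen within the same tier is an assumption.

```javascript
// Lower rank means higher authority: direct > OpenRouter > CSV/hardcoded.
const TIER_RANK = { direct: 0, openrouter: 1, static: 2 };

function dedupe(rows) {
  const best = new Map();
  for (const row of rows) {
    // Quantization is deliberately left out of the key.
    const key = `${row.model}::${row.provider}`.toLowerCase();
    const current = best.get(key);
    // Replace only when the new row comes from a strictly higher-authority tier,
    // so the first row seen wins within the same tier (an assumption).
    if (!current || TIER_RANK[row.tier] < TIER_RANK[current.tier]) {
      best.set(key, row);
    }
  }
  return [...best.values()];
}

// Made-up rows: the same model and provider seen in two tiers and at two quants.
const fetched = [
  { model: "example-model", provider: "DeepInfra", tier: "openrouter", quant: "fp8" },
  { model: "example-model", provider: "DeepInfra", tier: "direct", quant: "bf16" },
  { model: "example-model", provider: "Novita", tier: "openrouter", quant: "fp8" },
];

console.log(dedupe(fetched));
```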

Total: ~892 text-generation models across ~75 inference providers and 60+ underlying orgs (Anthropic, OpenAI, Google, DeepSeek, Z.ai, Qwen, Meta, Mistral, etc.)

Only text-generation models are included. TTS, image generation, video generation, and embeddings are filtered out. Multimodal input (text+image→text) is allowed.
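
One way to express this filter is to check what a model produces rather than what it accepts. The sketch below assumes each model record carries an `outputModalities` array. That field name, the function name `isTextGeneration`, and the sample records are hypothetical, and the real script may classify models differently.

```javascript
// Assumption: each record lists its output modalities, e.g. ["text"] or ["audio"].
function isTextGeneration(model) {
  const outputs = model.outputModalities ?? [];
  // Keep only models whose sole output is text. Input modalities are not
  // checked, so text+image input remains allowed.
  return outputs.length === 1 && outputs[0] === "text";
}

// Made-up records, for illustration only.
const candidates = [
  { id: "chat-example", inputModalities: ["text"], outputModalities: ["text"] },
  { id: "vision-example", inputModalities: ["text", "image"], outputModalities: ["text"] },
  { id: "tts-example", inputModalities: ["text"], outputModalities: ["audio"] },
  { id: "embed-example", inputModalities: ["text"], outputModalities: ["embeddings"] },
];

console.log(candidates.filter(isTextGeneration).map((m) => m.id));
```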

Development

# Fetch pricing data (~317 API calls, ~15-20s)
npm run fetch

# Serve locally
npm run serve

Requires Node ≥18 (uses native fetch). No dependencies.

Project structure

scripts/
  fetch-pricing.mjs          # 3-tier fetch + OpenRouter de-aggregation + org extraction + dedup
data/
  manual-pricing.csv          # Static pricing for CSV-sourced providers
public/
  index.html                 # UI: dual search, usage inputs, results table (8 columns)
  app.js                     # State, search, cost computation, rendering (promo badges)
  styles.css                 # Dark/light theme, promo-badge, header-row, responsive
  pricing.json               # Generated data (refreshed daily by CI)
.github/workflows/
  refresh-pricing.yml        # Daily cron: fetch → commit → deploy to Cloudflare

CI/CD

The refresh-pricing.yml workflow runs daily at 00:00 UTC:

Fetches pricing from all sources (~317 API calls)
Filters to text-generation models only
Applies 3-tier dedup precedence
Aborts if >20% of API calls fail or model count drops >15% vs previous run
Commits pricing.json if changed
Deploys public/ to Cloudflare Pages
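
The abort step above can be sketched as a pure function that the fetch script might call before writing pricing.json. This is an illustration only. The function name `checkRefreshHealth` and its parameter names are hypothetical. Only the 20% and 15% thresholds come from the workflow description.

```javascript
const MAX_FAILURE_RATE = 0.2; // abort if more than 20% of API calls fail
const MAX_MODEL_DROP = 0.15; // abort if model count drops more than 15%

function checkRefreshHealth({ totalCalls, failedCalls, modelCount, previousModelCount }) {
  // With no calls made, treat the run as fully failed.
  const failureRate = totalCalls > 0 ? failedCalls / totalCalls : 1;
  if (failureRate > MAX_FAILURE_RATE) {
    return { ok: false, reason: `failure rate ${(failureRate * 100).toFixed(1)}%` };
  }
  // Skip the drop check on a first run, when there is no previous count.
  if (previousModelCount > 0) {
    const drop = (previousModelCount - modelCount) / previousModelCount;
    if (drop > MAX_MODEL_DROP) {
      return { ok: false, reason: `model count dropped ${(drop * 100).toFixed(1)}%` };
    }
  }
  return { ok: true, reason: null };
}

// Made-up run statistics, for illustration only.
console.log(
  checkRefreshHealth({ totalCalls: 317, failedCalls: 10, modelCount: 880, previousModelCount: 892 })
);
```

In a CI job, a result with `ok: false` would typically lead to a non-zero exit code so that the commit and deploy steps are skipped.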

GitHub secrets required: CLOUDFLARE_API_TOKEN, CLOUDFLARE_ACCOUNT_ID.

License

MIT
