
feat(playground): show per-message cost using models.dev pricing #2255

Merged
samuv merged 3 commits into main from feat/chat-cost on May 14, 2026
Conversation

@samuv (Collaborator) commented May 14, 2026

Summary

Adds a per-message USD cost next to the existing token breakdown on assistant messages in the playground chat. Pricing comes from models.dev (USD per 1M tokens), fetched once and cached on disk for 24h in the main process.

  • Inline: "100 → 50 = 150 • $0.0012" next to the existing token totals.
  • Tooltip: appends a Cost section with Input, Cached, Output, Total breakdown.
  • Cached input tokens use the model's cache_read rate when models.dev exposes it (e.g. Claude Sonnet 4.5, Haiku 4.5); otherwise treated as regular input.
  • No pricing → no cost rendered. Local providers (ollama, lmstudio) and unknown models render exactly as they did before.
  • Coverage: openai, anthropic, google, xai, and all 188 OpenRouter models with cost entries (slash-prefixed IDs already match).
[Screenshot, 2026-05-14 16:34: assistant message showing the inline cost next to the token totals]

How it works

  • main/src/chat/pricing.ts fetches https://models.dev/api.json, caches { fetchedAt, data } to userData/models-dev-cache.json, refreshes in the background when older than 24h, and falls back to the last cached copy when offline.
  • IPC handler chat:get-model-pricing returns the extracted Record<providerId, Record<modelId, ModelCost>> map.
  • Renderer hook useModelPricing caches the map in React Query (staleTime: 24h).
  • calculateCost(usage, pricing) is a pure function — easy to unit-test, no React, no DOM. Hedged sketches of the cache flow and the cost math follow this list.
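
To make the cache behavior concrete, here is a minimal TypeScript sketch of the main-process flow. The helper names (getPricingData, readDiskCache, fetchAndPersist) and the CacheEntry shape are illustrative assumptions, not the PR's actual exports, but the behavior (persist { fetchedAt, data }, serve a stale copy and refresh in the background after 24h, fail only on a cold start with no cache) follows the description above.

```ts
// Sketch only: illustrative names, not the PR's actual module surface.
import { app } from "electron";
import { promises as fs } from "node:fs";
import path from "node:path";

const TTL_MS = 24 * 60 * 60 * 1000;
const cacheFile = () =>
  path.join(app.getPath("userData"), "models-dev-cache.json");

interface CacheEntry {
  fetchedAt: number; // epoch ms of the last successful fetch
  data: unknown;     // raw models.dev payload, extracted elsewhere
}

async function readDiskCache(): Promise<CacheEntry | null> {
  try {
    return JSON.parse(await fs.readFile(cacheFile(), "utf8"));
  } catch {
    return null; // first run, or an unreadable cache file
  }
}

async function fetchAndPersist(): Promise<CacheEntry> {
  const res = await fetch("https://models.dev/api.json");
  if (!res.ok) throw new Error(`models.dev responded ${res.status}`);
  const entry: CacheEntry = { fetchedAt: Date.now(), data: await res.json() };
  await fs.writeFile(cacheFile(), JSON.stringify(entry));
  return entry;
}

export async function getPricingData(): Promise<unknown> {
  const cached = await readDiskCache();
  // Fresh cache: serve it directly.
  if (cached && Date.now() - cached.fetchedAt < TTL_MS) return cached.data;
  // Stale cache: answer immediately, refresh in the background, and keep
  // the old copy if the refresh fails (the offline fallback).
  if (cached) {
    void fetchAndPersist().catch(() => {});
    return cached.data;
  }
  // Cold start with no cache: fetch, and let a failure propagate.
  return (await fetchAndPersist()).data;
}
```

And a sketch of the pure cost helper. The Usage field names, and the assumption that cachedInputTokens is counted inside inputTokens, are mine rather than confirmed by the diff; the rate handling follows the summary above.

```ts
// Sketch only: field names are assumptions, rates are USD per 1M tokens.
interface ModelCost {
  input: number;      // USD per 1M fresh input tokens
  output: number;     // USD per 1M output tokens
  cacheRead?: number; // USD per 1M cached input tokens, when models.dev lists one
}

interface Usage {
  inputTokens: number;       // total input, cached included (assumption)
  cachedInputTokens: number;
  outputTokens: number;
}

const PER_MILLION = 1_000_000;

export function calculateCost(usage: Usage, cost: ModelCost): number {
  // Cached input is billed at cache_read when available, else as regular input.
  const cacheRate = cost.cacheRead ?? cost.input;
  const freshInput = Math.max(usage.inputTokens - usage.cachedInputTokens, 0);
  return (
    (freshInput * cost.input +
      usage.cachedInputTokens * cacheRate +
      usage.outputTokens * cost.output) /
    PER_MILLION
  );
}
```

At hypothetical Sonnet-class rates of $3 input / $15 output per 1M tokens, the 100 → 50 example above works out to (100·3 + 50·15) / 1,000,000 ≈ $0.0011, the same order of magnitude as the $0.0012 shown inline.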

What deliberately does NOT change

  • No new fields on MessageMetadata — cost is derived on render from existing totalUsage + model + the cached pricing map.
  • No changes to streaming, IPC transport, or persisted thread data.
  • No thread-level totals (per design decision).
  • No special-casing for ollama / lmstudio — they naturally have no pricing entry and fall through to "no cost rendered."

Test plan

  • pnpm run lint
  • pnpm run type-check
  • pnpm run test:nonInteractive (212 files, 2402 tests, all green)
  • Manual: open playground, pick an OpenAI / Anthropic / OpenRouter model with a known price, send a message → confirm inline $x.xxxx and tooltip breakdown appear after the response completes.
  • Manual: switch to Ollama or LM Studio → confirm the message renders exactly as before (no cost line).
  • Manual: airplane-mode reload → cached pricing still works.

🤖 Generated with Claude Code

Adds a per-message USD cost next to the existing token breakdown on
assistant messages. Cost is computed from models.dev pricing
(USD/1M tokens), fetched once and cached on disk for 24h in the main
process. Cached input tokens use the model's cache_read rate when
available. Providers/models with no pricing entry (local providers,
unknown models) render exactly as before.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 14, 2026 14:04
samuv self-assigned this May 14, 2026
Copilot AI (Contributor) left a comment

Pull request overview

Adds per-message USD cost estimation to Playground assistant messages by fetching model token pricing from models.dev in the main process (disk-cached for 24h), exposing it via IPC, caching it in the renderer via React Query, and computing a cost breakdown from existing token usage metrics.

Changes:

  • Added main-process pricing fetch + 24h disk cache and an IPC endpoint (chat:get-model-pricing) to expose a provider/model pricing map.
  • Added renderer React Query hook to retrieve pricing and a pure calculateCost/formatUsd utility with unit tests.
  • Updated TokenUsage UI to render inline cost and a tooltip breakdown when pricing is available (and added component tests).

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.

Summary per file:

  • renderer/src/features/chat/lib/calculate-cost.ts: Pure cost calculation + USD formatting helpers.
  • renderer/src/features/chat/lib/tests/calculate-cost.test.ts: Unit tests for cost calculation and formatting.
  • renderer/src/features/chat/hooks/use-model-pricing.ts: React Query hook to fetch/cache the pricing map from IPC.
  • renderer/src/features/chat/components/chat-message/token-usage.tsx: UI: inline cost display + tooltip cost breakdown using the pricing map.
  • renderer/src/features/chat/components/chat-message/assistant-message.tsx: Passes model into TokenUsage so the pricing lookup can be model-specific.
  • renderer/src/features/chat/components/chat-message/tests/token-usage.test.tsx: Component tests for cost rendering behavior.
  • preload/src/api/chat.ts: Adds getModelPricing() to the preload chat API + types.
  • main/src/ipc-handlers/chat/pricing.ts: Registers the IPC handler for chat:get-model-pricing.
  • main/src/ipc-handlers/chat/index.ts: Wires pricing IPC handler registration into chat IPC registration.
  • main/src/ipc-handlers/chat/tests/index.test.ts: Updates the registration test to include the pricing handler.
  • main/src/chat/pricing.ts: Implements the models.dev fetch, extraction, in-memory cache, and disk cache with TTL + offline fallback.

Review comment threads:

  • renderer/src/features/chat/hooks/use-model-pricing.ts (outdated)
  • renderer/src/features/chat/components/chat-message/token-usage.tsx
  • main/src/chat/pricing.ts
- useModelPricing now only enables its React Query fetch when a remote
  provider+model pair is known (skips ollama/lmstudio and the empty
  state). Avoids a no-op IPC + models.dev fetch for local-only users;
  a sketch of this gating follows the commit message below.
- Inline cost rendering no longer strips the dollar sign from the
  formatter output. The "<$0.01" case now reads correctly inline
  instead of "<0.01".
- Add unit tests for the main-process pricing module: cold-start
  fetch + write, warm disk cache within TTL, stale cache returns
  immediately + background refresh, offline with no disk cache, and
  non-OK upstream response.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
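
For reference, a hedged sketch of the gated hook described in that commit message; the window.api.chat.getModelPricing() preload surface and the LOCAL_PROVIDERS set are assumptions standing in for the PR's actual wiring.

```ts
// Sketch only: the preload surface and helper names are assumed, not verbatim.
import { useQuery } from "@tanstack/react-query";

type PricingMap = Record<string, Record<string, {
  input: number;
  output: number;
  cacheRead?: number;
}>>;

declare global {
  interface Window {
    api: { chat: { getModelPricing(): Promise<PricingMap> } };
  }
}

// Local providers have no models.dev pricing, so fetching is pointless.
const LOCAL_PROVIDERS = new Set(["ollama", "lmstudio"]);

export function useModelPricing(providerId?: string, modelId?: string) {
  const isRemote =
    !!providerId && !!modelId && !LOCAL_PROVIDERS.has(providerId);

  return useQuery<PricingMap>({
    queryKey: ["model-pricing"],
    queryFn: () => window.api.chat.getModelPricing(),
    staleTime: 24 * 60 * 60 * 1000, // mirror the 24h disk cache in main
    // Gate the IPC round-trip (and any models.dev fetch it can trigger)
    // behind a known remote provider+model pair.
    enabled: isRemote,
  });
}
```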
github-actions bot added size/L and removed size/M labels May 14, 2026
samuv merged commit 064ce77 into main May 14, 2026
18 checks passed
samuv deleted the feat/chat-cost branch May 14, 2026 16:04