Releases: Metabuilder-Labs/tokenjam

v0.5.3

@anilmurty anilmurty released this 03 Jul 16:22
b76b012

What's Changed

  • Fix #303: add long-window options to Traces by @sanmaxdev in #304
  • Fix #299: align recent activity trace window by @sanmaxdev in #305
  • Fix #284: backfill --since flag parity by @sanmaxdev in #307
  • [chore] Complete StorageBackend protocol, remove direct db.conn access (#309) by @anilmurty in #314
  • [enhancement] Surface sample size + confidence interval on savings estimates (#308) by @anilmurty in #316
  • fix(lens): gate Analytics Stack control to stacking charts (#295) by @ashwmu in #315
  • Fix #310: use utcnow instead of datetime.now in ApiBackend by @tarun73 in #317
  • fix(lens): show leaderboard total + reconcile partial-dimension gap (#313) by @ashwmu in #319
  • Fix #320: capture prompt + completion content on provider patches by @anilmurty in #321
  • feat(lens): show breakdown subtotal on the active KPI tile (#318) by @ashwmu in #322
  • Fix #298: Clarify drift baseline agent scope and live session rules by @tarun73 in #323
  • Feat/mcp wiring doctor check by @tarun73 in #324
  • feat(ui): dynamic analytics csv export with loading state by @tarun73 in #325
  • docs: cross-link tokenjam-bench as the "prove" step (#336) by @anilmurty in #337
  • docs: add missing CLI reference commands (#280) by @sanmaxdev in #338
  • Add summarize scan surface: tj summarize list + MCP tool by @auspexlabs in #354
  • test: isolate CLI onboard tests so test_cli.py can't hang (#23) by @anshss in #339
  • fix: batch of 7 bug-hunt fixes (optimize, mcp, db, analytics, sdk) by @anshss in #353
  • Add summarize mechanism: prep/check — CLI (manual) + MCP (in-session) by @auspexlabs in #355
  • Add summarize apply/undo + backup (#329) by @auspexlabs in #356
  • Add summarize delivery: prep --via claude-p + --via api (#329) by @auspexlabs in #357
  • feat: Add pre-filled tjb command after model-downgrade finding (#330) by @tarun73 in #358
  • feat: Docker image for tj serve by @vigneshvinfra in #359
  • feat(cli): add read-only tj pricing list command by @Axiya3749 in #361
  • docs: document tj pricing list command (#282 follow-up) by @anilmurty in #362
  • Add Bedrock Nova Micro/Premier, Anthropic-on-Bedrock, and LiteLLM provider pricing entries by @Axiya3749 in #366
  • fix: deflake test_budget_cap_policy timezone-boundary test by @anilmurty in #372
  • fix: Claude Code session status + lifecycle (1/5 from #306) by @anshss in #367
  • feat: tj context cost diagnostics (3/5 from #306) by @anshss in #368
  • feat: zero-install / npm package (2/5 from #306, stacked on #368) by @anshss in #369
  • fix: Claude Code cost & alert correctness (4/5 from #306) by @anshss in #370
  • feat: session Approach / Map / Timeline (5/5 from #306, stacked on #367) by @anshss in #371
  • test: cross-backend StorageBackend parity suite (#51) by @anshss in #379
  • Fix #55: self-heal recorded-but-unlanded migrations (silent ingest drop → blank Status) by @anshss in #381
  • [feature] Close the loop: annotations, expectations, fix-history (#53) by @anshss in #380
  • feat(onboard): zero-token statusline for Claude Code/Codex; reserve MCP for SDK (#59) by @anshss in #374
  • Fix #340: resolve real tj binary for daemon install fallback by @anilmurty in #384
  • Fix #373: Bedrock pricing keys never match live spans — normalize provider + model id in get_rates by @anilmurty in #385
  • chore: add CI-gated GHCR Docker image publish workflow by @vigneshvinfra in #360
  • fix(ci): publish npm-wrapper tokenjam on release, version from tag (#65) by @anshss in #386
  • fix(mcp): route get_optimize_report through serve when DB is locked (#61) by @anshss in #388
  • fix(transcript): preserve TodoWrite payload instead of dropping it (#67) by @anshss in #391
  • Fix: Lens Overview false empty on historical-only telemetry by @anilmurty in #392
  • feat: price 1M-context [1m] variants + current OpenAI GPT-5 lineup by @anilmurty in #393
  • Fix #60: flag delegating sessions with unaccounted subagent turns in weighted quota by @anshss in #387
  • fix(#63): make tj context render while tj serve is running by @anshss in #390
  • fix(#62): ship output-cap hook DEFAULT-OFF opt-in — A/B gate failed (CC pre-truncates Bash at 30KB) by @anshss in #389
  • Feat/312 async hooks by @tarun73 in #363
  • docs: lead README with pipx + scrub ccusage competitor references by @anilmurty in #394
  • feat: add Prove (tokenjam-bench) nudge to onboard + tokenmaxx by @anilmurty in #395
  • fix: guard Lens KPI "% vs prev" deltas against near-empty prior window by @anilmurty in #396
  • Fix #294, #300: purge stale-scheme backfill span duplicates (cross-version) by @anilmurty in #397
  • Bump version to 0.5.3 by @anilmurty in #399

Full Changelog: v0.5.2...v0.5.3

v0.5.2 — first-run accuracy & polish

@anilmurty anilmurty released this 25 Jun 00:12
cce305e

TokenJam 0.5.2 — first-run accuracy & polish

Accuracy fixes (the headline)

  • Backfill no longer over-counts your spend. Claude Code sessions that were resumed or compacted were ingested 2–4× over — the same model calls re-counted under fresh message IDs. Token and cost figures (and everything derived from them) are now correct. If you backfilled before 0.5.2, wipe and re-run tj backfill claude-code to correct your historical numbers. (#294)
  • tj demo now works on installed versions. The incident scenarios were never packaged into the wheel, so tj demo listed nothing for pip/pipx users. Fixed — tj demo retry-loop runs out of the box. (#291)
  • Traces list raised to 200 by default, with "Showing N of M" + a Load more control. (#274)
  • tj budget --json for machine-readable budgets. (#281)

New capability

  • Full-request capture — sampling params and tools/tool_choice are now captured on LLM spans (gated by your [capture] settings). (#209)
  • Experimental enforcement-plane preview: an optional local proxy + policy engine (budget caps, routing/reuse policies) ships in suggest mode only; it shows what it would do and enforces nothing. Interfaces are unstable and off by default. (#219, #223)

Docs & brand

  • Clearer contributing guide, Code of Conduct, refreshed dashboard screenshots, new black-on-white logo. (#289, #275, #273)

Thanks to our contributors 🙌

This release includes community contributions from @sanmaxdev (traces pagination #274, Code of Conduct #275) and @gitcommit90 (tj budget --json #281). Thank you!


Install: pipx install tokenjam
Full changelog: v0.5.1...v0.5.2

v0.5.1 — First-run polish

@anilmurty anilmurty released this 23 Jun 22:49
2c89074

0.5.1 sands down every rough edge a new user hits on day one — building on 0.5.0's Lens Visualizations with a unified dashboard, honest subscription framing, faithful backfill data, and a smoother first run.

Highlights

🏠 A unified Dashboard, now the default — the Analytics pivot explorer and the triage front-door are merged into one screen. Land on it, see your recoverable waste and health at a glance, then click any tile to drill straight into the explorer in place.

💳 Honest subscription framing, everywhere — per-item costs now read in tokens (not a confusing "% of cycle"), and the headline Spend shows a "× plan value" multiplier — your usage expressed as a multiple of what your subscription costs. Subscription users never see raw dollar "spend."

🔍 Faithful backfilled history — Claude Code history now reconstructs into session-level traces (a whole conversation as one trace, with its tool calls nested) instead of thousands of fragments. Cache read vs write tokens are tracked separately, and backfill reports honest session counts.

📊 Clearer charts — the cost-by-component chart no longer collapses into a single block, the cache-savings chart leads with the answer instead of an unreadable dual-axis, and the trace waterfall puts span names in a fixed column with cost-magnitude bars and timestamps.

👋 A warmer first run: tj onboard opens with a branded welcome banner and ends with a next-steps nudge that leads with what works immediately (your last 30 days are already loaded; try tj tokenmaxx, tj optimize, tj serve).

✨ Polish — thousand-separated counts, softened first-day deltas, a Status list view by default, a helpful empty state when a metric doesn't apply to a dimension, and a real contribution funnel in the README.

Honest by construction

Every estimate is "estimated recoverable," never "saved." Every dollar-bearing surface respects plan tier — subscription users see token-share and plan-value, never raw spend.

Install / upgrade

pipx install tokenjam          # new
pipx upgrade tokenjam          # existing — then: tj stop && tj serve &

Full changelog: v0.5.0...v0.5.1

v0.5.0 — Lens Visualizations

@anilmurty anilmurty released this 23 Jun 17:19
3f9bd0d

TokenJam Lens grows from a single spend chart into a full visualization surface. This release ships the Lens Visualizations milestone — new ways to see where your tokens and dollars go, all offline, all plan-tier-honest.

Highlights

📊 Analytics pivot explorer — a new screen that turns one query into any view: pick a metric (spend / tokens / sessions / events), break it down by any dimension (model, agent, tool, day…), stack a second dimension, switch chart type, and export CSV. State lives in the URL, so any view is shareable and bookmarkable.

💸 Cost screen, deeper — stacked cost-by-model/agent breakdown over time, plus a cache-savings time-series showing cumulative captured savings against your cache hit-rate.

🔍 Optimize overview — cost-by-component with a recoverable-waste overlay, so the biggest savings opportunities surface at a glance.

✨ KPI sparklines + deltas — every Analytics headline tile now carries a trend sparkline and a period-over-period change vs. the prior window.

🧭 Trace waterfall, cost-first — the trace detail view annotates each span with cost and tokens alongside a magnitude bar, so you can see where the spend lands across a run.

🎨 Consistent coloring — one shared 12-hue color map gives every series the same color on every chart and screen; grouping by a time dimension no longer produces a confusing rainbow.

↪️ Deep-linked Overview — Overview tiles and the spend headline now click through into the Analytics explorer, pre-filtered to the relevant slice.

Honest by construction

Every dollar-bearing surface respects plan-tier framing: subscription users see token-share, never raw spend. Every estimate is "estimated recoverable," never "saved."

Docs

Install docs now lead with pipx install tokenjam (sidesteps PEP 668 on Homebrew/Debian/Ubuntu Python). The [mcp] extra is no longer needed — the MCP server ships in the base install.

Install

pipx install tokenjam      # recommended
# or: pip install tokenjam  (in a clean venv)

Full changelog: v0.4.2...v0.5.0

v0.4.2 — honesty hardening + cost-accuracy fixes

@anilmurty anilmurty released this 22 Jun 06:43
d6109e1

A maintenance release closing a full pass of plan-tier-framing and cost-accuracy bugs found by checking TokenJam against real usage, plus first-class user pricing.

Highlights

  • Honest plan-tier framing everywhere. Subscription/local users no longer see raw dollar figures across the CLI (tj cost) or the Lens web UI — Cost table, Traces, trace-detail, Status, and Optimize all suppress/reframe correctly; chart framing is consistent across time windows; CLI and Lens agree on the same data; axes read in local time with coherent daily labels. (#175, #176, #177, #178, #187, #188, #191, #197)
  • Cost accuracy. Fixed LiteLLM provider attribution (was recording "litellm" → ~42% cost undercount, NULL billing account, suppressed plan); LiteLLM now also captures prompt/completion content and cache tokens. Imported sessions inherit the configured plan tier. (#183, #194, #195)
  • First-class user pricing. Override per-model rates from your config — keyed by bare model name so it works even when the provider can't be inferred — with docs. (#200, #201)
  • SDK & onboarding. The SDK honors TJ_CONFIG (no more writing to the wrong DB); onboarding / tj doctor surface when Claude Code needs a restart to start sending telemetry. (#179, #196)

What's Changed

  • docs: README cleanup post-v0.4.1 (Reuse + Lens added, competitive table removed) by @anilmurty in #169
  • docs: drop stale CHANGELOG.md + add maintainer contact by @anilmurty in #170
  • Track A part 1: weekly GitHub traffic archive workflow by @anilmurty in #171
  • Track A part 1 follow-up: workflow pushes as PAT owner to satisfy main-branch protection by @anilmurty in #172
  • Track A: move traffic archive to traffic-data branch (close-out) by @anilmurty in #173
  • docs: add traffic-archive action to CLAUDE.md so future agents discover it by @anilmurty in #174
  • docs: add docs/optimize/reuse.md analyzer deep-dive (#168) by @anilmurty in #180
  • Fix #175, #176: tj cost framing + backfill plan_tier propagation (v0.4.2) by @anilmurty in #181
  • Apply declared plan tiers across onboard, serve, and UI by @HoomanDgtl in #184
  • docs: add PR + commit conventions for any agent producing a PR by @anilmurty in #182
  • Fix #179: surface Claude Code connection staleness (onboard banner + tj doctor check) by @anilmurty in #185
  • Fix #177, #178: consistent Lens chart framing + local-timezone axis labels by @anilmurty in #186
  • Fix #188: keep daily chart date labels UTC-aligned by @anilmurty in #189
  • Fix #187: suppress raw dollar figures for subscription/local users on web table & trace surfaces by @anilmurty in #190
  • Fix #191: suppress raw dollar figures for subscription/local users on Status, Optimize & Reuse web surfaces by @anilmurty in #192
  • Fix #196: SDK bootstrap honors TJ_CONFIG for config discovery by @anilmurty in #199
  • Fix #194: resolve LiteLLM provider from model name (no more "litellm" provider) by @anilmurty in #198
  • Fix #197: CLI and Lens render identical plan-tier framing (window-independent mix) by @anilmurty in #202
  • Fix #195: capture prompt/completion content + cache tokens on LiteLLM spans by @anilmurty in #203
  • Make user pricing first-class + attribution-proof (#200, #201) by @anilmurty in #205
  • Bump version to 0.4.2 by @anilmurty in #206

Full Changelog: v0.4.1...v0.4.2

v0.4.1 — concurrency + cycle-aware run-rate + Reuse HTTP fallback

@anilmurty anilmurty released this 20 Jun 00:38
d77000d

A quality + follow-up release on top of v0.4.0's marquee. Six issues closed, each as a focused PR that landed cleanly. Pre-release pass: 10/10 PASS, 0 FAIL, 0 UNCLEAR.

🔧 Daemon DB concurrency — the Overview no longer crashes under fan-out

DuckDBBackend.conn is now a per-thread DuckDB cursor (threading.local) over one shared database. Cursors from connect().cursor() are independent connections safe for concurrent use across threads, which is the documented DuckDB pattern.

Pre-fix symptom: with tj serve running, the Overview's parallel endpoint fan-out + Starlette's sync-route threadpool would race on a single shared connection and SIGABRT the daemon. We worked around it by fetching the Overview's panels sequentially. That workaround is now gone — the Overview fetches all six panels via a single Promise.all with deliberately asymmetric error handling (/cost is load-bearing and rejects on failure; the other five panels each carry a .catch fallback so one failing panel renders empty rather than blanking the whole screen).

Empirically verified: 90 concurrent reads against the Overview endpoint set complete with zero errors. Pre-fix this would crash the process. Issue #124, PR #161.
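
The per-thread cursor pattern described above can be sketched as follows. This is a minimal illustration, not the DuckDBBackend source: the class name PerThreadCursor and the make_cursor callable are hypothetical. With DuckDB the factory would be something like `lambda: shared.cursor()`, where `shared = duckdb.connect(path)`.

```python
import threading


class PerThreadCursor:
    """Hand each thread its own cursor, created lazily on first use.

    Hypothetical sketch: `make_cursor` is any zero-argument callable that
    returns a fresh cursor over the one shared database.
    """

    def __init__(self, make_cursor):
        self._make_cursor = make_cursor
        self._local = threading.local()  # attribute storage private to each thread

    @property
    def conn(self):
        cursor = getattr(self._local, "cursor", None)
        if cursor is None:
            # First access from this thread: create and remember its cursor.
            cursor = self._make_cursor()
            self._local.cursor = cursor
        return cursor
```

Repeated access from one thread returns the same cursor, while a second thread triggers a second call to the factory, so no two threads ever share a cursor object.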

🗓 Cycle-aware run-rate

tj cost and the Lens spend chart's run-rate projection now honor [budget.<provider>] cycle_start_day when configured. Calendar-month users (the default) still see "by end of June"; users with a billing cycle that starts mid-month see "by Jul 15" — a dated form, because "by end of July" would mislead when the cycle ends mid-month.

core/cycle.py is shared between the existing budget_projection analyzer and the cost-API caption, so both surfaces use one piece of cycle math. The API exposes a cycle block on /api/v1/cost; the UI consumes it. Issue #138, PR #158.
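
A rough sketch of the cycle math, under stated assumptions: the function names are hypothetical (not taken from core/cycle.py), and a cycle_start_day past the end of a short month is clamped to that month's last day, which the notes above do not specify.

```python
import calendar
from datetime import date


def next_cycle_start(today: date, cycle_start_day: int = 1) -> date:
    """Return the first cycle boundary strictly after `today`."""

    def boundary(year: int, month: int) -> date:
        # Assumption: clamp e.g. day 31 to the last day of shorter months.
        last_day = calendar.monthrange(year, month)[1]
        return date(year, month, min(cycle_start_day, last_day))

    this_month = boundary(today.year, today.month)
    if this_month > today:
        return this_month
    if today.month == 12:
        return boundary(today.year + 1, 1)
    return boundary(today.year, today.month + 1)


def run_rate_caption(today: date, cycle_start_day: int = 1) -> str:
    """Calendar-month users get a month name; mid-month cycles get a date."""
    if cycle_start_day == 1:
        return f"by end of {today.strftime('%B')}"
    nxt = next_cycle_start(today, cycle_start_day)
    return f"by {nxt.strftime('%b')} {nxt.day}"
```

For example, on 20 June a calendar-month user gets "by end of June", while a user with cycle_start_day = 15 gets "by Jul 15".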

📈 tj cost chart no longer silently empties on long windows

buildCostSeries previously had if (xs.length > 5000) return null as a guard against pathological grids — fine for a 90-day cap today, but if anyone ever requests hourly bucketing over a multi-year window the chart would just go blank with no explanation.

It now coarsens up an hour → day → week ladder until the grid fits, computed closed-form so an oversized array is never allocated. When coarsening fires, a footnote on the chart explains it: "Showing day buckets — this range is too long for hourly detail." Genuinely empty windows still render the existing "No spend in this window" state. Issue #139, PR #163.
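
The ladder can be sketched in a few lines. This is an illustration in Python rather than the UI's own JavaScript, and the names (pick_bucket, LADDER) are hypothetical. The bucket count is computed arithmetically, so no oversized grid is ever built. Falling back to the coarsest rung when nothing fits is an assumption, not documented behavior.

```python
import math

# (bucket name, bucket size in seconds), finest first.
LADDER = [("hour", 3_600), ("day", 86_400), ("week", 604_800)]
MAX_POINTS = 5_000


def pick_bucket(window_seconds: int, requested: str = "hour",
                max_points: int = MAX_POINTS) -> tuple[str, bool]:
    """Return (bucket, coarsened) for the finest rung that fits the grid."""
    start = [name for name, _ in LADDER].index(requested)
    for name, size in LADDER[start:]:
        # Closed-form point count: no array is allocated to measure it.
        if math.ceil(window_seconds / size) <= max_points:
            return name, name != requested
    # Assumption: if even the coarsest rung overflows, use it anyway.
    return LADDER[-1][0], True
```

A 90-day hourly window (2,160 points) stays hourly, while a three-year hourly request (26,280 points) coarsens to day buckets. The second element of the result is what would trigger the explanatory footnote.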

📊 Status: active (compute) time alongside wall-clock

Pre-fix, a resumed Claude Code session that ran across days showed Duration: 3087m — easy to misread as a runaway. v0.4.1 distinguishes:

  • Active — Σ span durations. The work actually done.
  • Elapsed — wall-clock from session start to last activity. Resumed Claude Code sessions span days; this can be far larger than Active.

Visible in tj status (Duration: active 12m 3s · elapsed 2d 3h), in tj status --json (both fields), and in the Lens Status tile (two distinct rows with tooltips). A new fmtDurLong formatter renders multi-day spans as 2d 3h instead of 3087m. Issue #147, PR #164.
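
As a sketch of the distinction, the two durations and the long-form formatting might look like the following. The Python names are hypothetical (the real formatter is fmtDurLong in the UI), and showing only the two largest units is inferred from the "2d 3h" and "12m 3s" examples above.

```python
from datetime import datetime, timedelta

UNITS = (("d", 86_400), ("h", 3_600), ("m", 60), ("s", 1))


def fmt_dur_long(seconds: float) -> str:
    """Render a duration using its two largest units, e.g. '2d 3h'."""
    remaining = int(seconds)
    values = []
    for suffix, size in UNITS:
        count, remaining = divmod(remaining, size)
        values.append((count, suffix))
    # Index of the first non-zero unit (fall back to seconds for zero).
    first = next((i for i, (c, _) in enumerate(values) if c), len(values) - 1)
    shown = values[first:first + 2]
    # Always keep the leading unit; drop a trailing zero unit.
    return " ".join(f"{c}{s}" for i, (c, s) in enumerate(shown) if c or i == 0)


def active_and_elapsed(spans: list[tuple[datetime, datetime]]) -> tuple[float, float]:
    """Active = sum of span durations; elapsed = first start to last end."""
    active = sum((end - start).total_seconds() for start, end in spans)
    first_start = min(start for start, _ in spans)
    last_end = max(end for _, end in spans)
    return active, (last_end - first_start).total_seconds()
```

With this sketch, 3087 minutes renders as "2d 3h", and a session with two short spans two days apart reports a small Active value next to a multi-day Elapsed value.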

🔁 tj report --reuse works while the daemon is running

The Reuse report needed direct DB access to fetch each cluster's planning-call completion text — so when tj serve held the write lock, the command errored out and pointed the user at tj stop. v0.4.1 adds a dedicated GET /api/v1/reuse/clusters endpoint that returns the Reuse finding plus the skeleton-rendering extras (planning_texts + pricing_mode).

tj report --reuse now dispatches like tj optimize: direct DB connection when available, ApiBackend.fetch_reuse_clusters when the daemon owns the lock. Renderer accepts pre-fetched planning texts as an alternative to a connection. The endpoint is dedicated rather than bolted onto /optimize because per-cluster planning text can be many KB and the Overview polls /optimize every 30s — we don't make every poll pay for report-only data.

tj report --trim retains the same direct-DB limitation, now explicitly documented in CLAUDE.md. Issue #154, PR #165.

🎨 Recoverable Waste tile consistency

Three small but visible inconsistencies on the Lens Overview's tile band:

  • reuse rendered lowercase while the other analyzer names were title-cased. Added an explicit ANALYZER_META entry plus a centralized capitalize() helper used in the fallback path — so the next analyzer that ships will auto-capitalize instead of falling through to its raw lowercase registry key.
  • Trim's — not ready had a leading em-dash while the other tile states didn't. Dropped to plain Not ready so the three states share a prefix-free scheme.
  • Cache title bold investigated and locked with a regression guard. Couldn't reproduce in current source, but the guard test now asserts no state-specific bold-title rule can be silently added.

Issue #162, PR #166.

Upgrade

pipx upgrade tokenjam
tj stop && tj serve &
tj --version   # expect 0.4.1

Existing installs keep their config, data, and daemon setup. No breaking changes.

Pre-release verification

A 10-step focused pre-release pass (tests/agent-pre-release-v0.4.1.md) was executed by a sub-agent against the live daemon. Result: 10/10 PASS, 0 FAIL, 0 UNCLEAR.

Coverage included the marquee #124 concurrency reproduction (90 concurrent reads, no crash), /api/v1/cost.cycle block, tj report --reuse with daemon running, the new active/elapsed status fields, the 90d cost chart render, and the tile consistency fixes — plus regression checks on TokenMaxx, all five analyzers, and the API framing block.

Full log committed to tests/results/agent-pre-release-v0.4.1-20260620T002546Z.md as a release-time record.

Honesty discipline

The release continues the v0.4.0 framing rules:

  • Run-rate captions remain "linear, not a forecast" — no smoothing, no seasonality, no anomaly bands; just a date-cycle projection
  • Coarsened chart windows are explicit about the coarsening, never silent
  • The Reuse HTTP fallback is documented as paying-for-what-you-use (dedicated endpoint instead of bolting it onto /optimize); no hidden cost on Overview polls
  • Status Active vs Elapsed labels distinguish work-done from wall-clock; the misleading bare "Duration" label is gone

Full changelog

v0.4.0...v0.4.1

v0.4.0 — TokenJam Lens + Reuse

@anilmurty anilmurty released this 19 Jun 21:21
5f1d8a4

First minor bump since v0.3.0. Significant new product surface — substantially more than the 0.3.x patch cadence — so we bumped the minor.

🔭 TokenJam Lens — the local UI is a product

The local dashboard you get from tj serve has been rebranded and rebuilt as TokenJam Lens. It's the same offline-first single-file SPA, but with a real triage front door instead of a list of tables.

  • Overview screen — the new default landing route. Three bands: spend hero (with a real chart and a "to end of month" run-rate projection that's explicitly not a forecast), recoverable-waste tiles (one per analyzer, registry-driven so future analyzers auto-appear), and health-at-a-glance (alerts, drift, budgets, recent activity).
  • Optimize detail tab — every analyzer's findings rendered in one place, with ?finding=<name> deep-links from the Overview tiles.
  • Real charts — uPlot, vendored offline (zero CDN loads, the dashboard works air-gapped). Hover tooltips, theme-aware (light/dark/system), tick granularity scales with the window (hours/days/weeks).
  • URL state as the source of truth — every filter (window, group-by, agent, finding) lives in the hash, so back/forward/reload work cleanly and cross-screen drill-through carries context.
  • Cost transparency — the tj cost table and the Cost screen now show CACHE R and CACHE W columns alongside input/output tokens, so a cache-heavy $1.44 span no longer looks like it came out of nowhere. (The hidden ~91% cost driver on Claude Code-style workloads finally has a name.)

🔁 Reuse — the 5th analyzer

Agents re-plan the same work constantly. Reuse detects clusters of sessions that share a planning skeleton (the first LLM call before any tool call) and surfaces what that repeated planning costs.

  • tj optimize reuse — clusters sessions by structural similarity (tool-sequence signature, plus prompt-prefix hashing when [capture] prompts = true); produces two honest numbers per cluster: cache-reuse savings (what you'd recover by reusing the skeleton) and script-replacement savings (the upper bound if you converted the planning to a deterministic template).
  • tj report --reuse — renders the clusters as an HTML report plus per-cluster Markdown sidecars with variable slots highlighted ({{slot_N}}), idempotent on cluster_id so re-runs overwrite cleanly. The Markdown is copy-paste-usable as a Claude Code slash command or saved prompt.
  • Honesty by construction: "skeleton match," "recoverable," "review before reusing" — never "saves you."
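
To make the tool-sequence signature idea concrete, here is a deliberately simplified sketch. It groups sessions only when their tool sequences match exactly, which is a stand-in for the analyzer's structural-similarity clustering, and every name in it is hypothetical.

```python
import hashlib
from collections import defaultdict


def tool_signature(tool_calls: list[str]) -> str:
    """Hash the ordered tool names into a short, stable signature."""
    joined = "\x1f".join(tool_calls)  # unit separator keeps names from running together
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()[:12]


def cluster_sessions(sessions: dict[str, list[str]],
                     min_size: int = 2) -> dict[str, list[str]]:
    """Group session ids by signature; keep groups with at least min_size members."""
    groups: dict[str, list[str]] = defaultdict(list)
    for session_id, tools in sessions.items():
        groups[tool_signature(tools)].append(session_id)
    return {sig: ids for sig, ids in groups.items() if len(ids) >= min_size}
```

Two sessions that both run Read, Edit, Bash in that order land in one cluster; a session with a different sequence is left out.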

Known limitation: tj report --reuse needs direct DB access today, so tj stop first if the daemon is running. HTTP fallback is tracked for v0.4.1 (#154).

💰 Cost transparency

  • cache_write_tokens is now surfaced everywhere — in tj cost, on the web Cost screen, in the trace-detail view, and via /api/v1/cost. Previously it was billed but invisible above the DB layer.
  • Plan-tier-aware rendering: core/framing.py is the single source of truth for whether to show dollars (api), token-share (subscription), tokens only (local), or a "may overstate" qualifier (unknown). The CLI and the REST API consume it identically.
  • Analyzer recoverable contract — every savings analyzer now emits estimated_recoverable_usd + estimated_recoverable_tokens + estimate_basis + estimate_confidence on a single time basis (window), so Overview tiles are directly comparable.
  • Honest run-rate — the Lens chart and tj cost projection use a window-average run-rate × days remaining in cycle, captioned "linear run-rate, not a forecast." No EWMA, no seasonality, no anomaly bands.
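
The run-rate arithmetic is simple enough to show directly. This sketch only restates the formula in the last bullet (window-average run-rate × days remaining in cycle); the function name and signature are hypothetical.

```python
def linear_run_rate_projection(window_spend: float, window_days: int,
                               days_remaining: int) -> float:
    """Window-average daily spend multiplied by days left in the cycle.

    Linear by design: no smoothing, no seasonality, no anomaly bands.
    """
    if window_days <= 0:
        raise ValueError("window_days must be positive")
    daily_average = window_spend / window_days
    return daily_average * max(days_remaining, 0)
```

For instance, $300 over a 30-day window with 10 days left in the cycle projects $100 of further spend.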

🔒 Security

  • A committed .tj/config.toml with a live ingest_secret (in repo since v0.2.0) is now untracked. Limited blast radius — local network ingest token only — but a CI test now guards against re-staging it. See PR #145 for the full advisory.

🧹 Quality

  • 17 v0.3.5 post-release findings closed (#141) from an external contributor; 5 rounds of Lens UI bug fixes; 9 individual fix PRs across the release. The full pre-release smoke pass ran twice and is committed under tests/results/ as a record.

Honesty discipline

This release is the most public-facing one yet. The framing language is deliberate everywhere:

  • Recoverable amounts are estimated, never saved.
  • Cache hits at 100% efficacy show "✓ Already optimized," not "no findings."
  • Subscription users see token-share framing, not dollar figures.
  • Forecasting is bounded to a single linear projection captioned "not a forecast."

Upgrade

pipx upgrade tokenjam
tj stop && tj serve &
tj --version   # expect 0.4.0

Existing installs keep their config, data, and daemon setup. Open the dashboard:

open http://127.0.0.1:7391/

…and you'll land on the new Overview.

Full changelog

v0.3.5...v0.4.0

v0.3.5

@anilmurty anilmurty released this 16 Jun 00:54
fbafb6f

First-run polish + bug fixes surfaced during the v0.3.5 pre-release playbook. No breaking changes — pipx upgrade tokenjam, then tj stop && tj serve & to reload the daemon.

Bug fixes

  • #101: tj mcp works out of the box on a fresh install. fastmcp moved from the [mcp] extra into base dependencies, so pipx install tokenjam is enough to wire TokenJam into Claude Code without remembering an extra. The [mcp] extra is kept as a no-op for back-compat. The MCP server now also raises a clean, actionable ImportError pointing at the fix if fastmcp is somehow missing.
  • #98: "No pricing data" warning no longer spams during backfill. Warns once per (provider, model) per process. Verified against a 20,000-span Claude Code backfill: zero warnings where pre-0.3.5 emitted hundreds. Deprecated Anthropic base models (claude-sonnet-4, claude-opus-4, claude-opus-4-1, claude-haiku-3-5) added to pricing/models.toml so dated variants like claude-sonnet-4-20250514 resolve correctly via the YYYYMMDD-stripping fallback instead of falling through to defaults.
  • #106: UI footer no longer shows a 9-release-old version. tokenjam/__init__.py now reads from importlib.metadata.version("tokenjam") (single source of truth = pyproject.toml). New GET /api/v1/version endpoint; the UI footer fetches it on init. Same-origin, so the offline-UI guarantee is preserved.
  • #106: GET /health endpoint added as a conventional uptime probe. Returns {"status":"ok","version":"..."}. /api/v1/status continues to be the agent overview.
  • #106: tj tokenmaxx plan-multiplier renders from project subdirs. When run from a directory whose .tj/config.toml has no [budget] section, _config_declared_plan now falls back to reading ~/.config/tj/config.toml directly. Previously dropped silently to api-pricing framing even when the user had plan = "max_5x" configured globally via tj onboard.
  • #105: tj report --trim not-ready hint renders [capture] literally. Rich's print parser was silently swallowing the [capture] substring as an invalid style tag, hiding the section name the user needs to enable.

DX

  • tj --help epilog now shows the canonical upgrade incantation (pipx upgrade tokenjam, then tj stop && tj serve &, then tj --version).
  • docs/installation.md documents that [mcp] is no longer needed.
  • Pre-release and post-release manual test playbooks updated to cover the new surface (six-tier tokenmaxx ladder, offline-UI DevTools check, cache cost-correctness verification).

Upgrade

pipx upgrade tokenjam
tj stop && tj serve &
tj --version   # expect 0.3.5

Full changelog

v0.3.4...v0.3.5

v0.3.4 — Six-tier ladder + cache cost-accuracy fixes + offline UI

@anilmurty anilmurty released this 15 Jun 18:07
78584e6

A mix of a user-visible product change (the tokenmaxx tier rename + expansion) and three credibility-grade infrastructure fixes — two for cost accuracy on the cache path, one for the local-first promise.

TokenMaxx ladder expanded to 6 tiers

The top tier was previously TokenGigaChad at 20×+, but almost every Claude Code power user lands there, so the headline lost its bite. Three changes:

  • Top tier raised to 50×+ so it reflects genuinely extreme usage, not everyday heavy use
  • New tier in the middle for the 20–50× range
  • Top two tiers renamed to drop the Chad branding

| Multiplier (subscription users) | Absolute /mo (API users) | Tier |
| --- | --- | --- |
| < 1× | < $100 | 💧 TokenSipper |
| 1× – 4× | $100 – $400 | 🥱 TokenModerator |
| 4× – 10× | $400 – $1,000 | 💸 TokenMaxxer |
| 10× – 20× | $1,000 – $2,000 | 🔥 TokenSuperMaxxer (was TokenChad) |
| 20× – 50× | $2,000 – $5,000 | 🔥🔥 TokenMegaMaxxer (new) |
| 50×+ | $5,000+ | 🔥🔥🔥 TokenGigaMaxxer (was TokenGigaChad) |

Breaking note: the JSON output's `tier` field carries the new label string verbatim. Any consumer scripting against `TokenChad` / `TokenGigaChad` in `tj tokenmaxx --json` must update.
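
For consumers updating scripts, the ladder above maps to a small lookup. This is an illustrative sketch, not the `tj tokenmaxx` implementation: the names are hypothetical, and treating each lower bound as inclusive is an assumption, since the ladder does not say which tier owns an exact boundary such as 4×.

```python
from bisect import bisect_right

THRESHOLDS = [1, 4, 10, 20, 50]  # lower bounds of tiers 2..6, as multipliers
TIERS = [
    "TokenSipper",
    "TokenModerator",
    "TokenMaxxer",
    "TokenSuperMaxxer",
    "TokenMegaMaxxer",
    "TokenGigaMaxxer",
]
REFERENCE_PLAN_USD = 100.0  # API dollar thresholds are 100x the multipliers


def tier_for_multiplier(multiplier: float) -> str:
    """Subscription users: spend expressed as a multiple of plan cost."""
    return TIERS[bisect_right(THRESHOLDS, multiplier)]


def tier_for_api_spend(monthly_usd: float) -> str:
    """API users: absolute monthly dollars mapped onto the same ladder."""
    return tier_for_multiplier(monthly_usd / REFERENCE_PLAN_USD)
```

Under these assumptions a 40.6× subscription user lands in TokenMegaMaxxer, the tier introduced for the 20–50× range.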

Cost-accuracy fixes (cache path)

Two related cache-billing fixes from community contributor @sjhddh, plus a follow-up to persist the raw counts.

Cache-only spans no longer costed at $0

A prompt-cache hit (`input_tokens=0`, `output_tokens=0`, but `cache_read_tokens > 0`) bills the cache-read rate, but `calculate_cost()` and `CostEngine.process_span()` were short-circuiting on input/output token counts alone — dropping the span as a no-op. The better your caching, the more cost went missing. The early-return guards now check all four token counts. (PR #90, kudos @sjhddh.)
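
The shape of the corrected guard can be sketched as follows. The dataclasses and field names are hypothetical stand-ins (this is not the actual `calculate_cost()` signature), and the rates in the usage note are illustrative rather than real pricing.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    input_tokens: int = 0
    output_tokens: int = 0
    cache_read_tokens: int = 0
    cache_write_tokens: int = 0


@dataclass
class Rates:
    # USD per million tokens; values supplied by the caller.
    input_per_mtok: float
    output_per_mtok: float
    cache_read_per_mtok: float
    cache_write_per_mtok: float


def calculate_cost(usage: Usage, rates: Rates) -> float:
    """Cost in USD; returns early only when all four counts are zero."""
    counts = (usage.input_tokens, usage.output_tokens,
              usage.cache_read_tokens, usage.cache_write_tokens)
    if not any(counts):
        return 0.0  # a genuine no-op span
    total = (usage.input_tokens * rates.input_per_mtok
             + usage.output_tokens * rates.output_per_mtok
             + usage.cache_read_tokens * rates.cache_read_per_mtok
             + usage.cache_write_tokens * rates.cache_write_per_mtok)
    return total / 1_000_000
```

A span with `cache_read_tokens=2_000_000` and nothing else now costs `2 × cache_read_per_mtok` instead of being dropped, which is the case the old input/output-only guard missed.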

Cache-creation tokens now costed on the live OTLP ingest path

The SDK integrations emit `gen_ai.usage.cache_creation_tokens` and the pricing table carried a `cache_write_per_mtok` rate, but the live parsers (`parse_otlp_span` + `convert_otel_span`) only read cache-read tokens. So every cache-write emitted via SDK was silently dropped, and the higher-rate cost never charged. `NormalizedSpan` now carries `cache_write_tokens`; both parsers populate it; `process_span` charges it. (PR #92, also @sjhddh.)

`cache_write_tokens` now persisted and threaded everywhere

  • New `cache_write_tokens` column on the `spans` table (migration 5)
  • The 3 remaining ingest paths (Langfuse, Helicone, Claude Code log adapter) now thread cache-creation tokens through `NormalizedSpan` for consistency with the live OTLP path
  • Codex's `cached_token_count` deliberately not mapped to `cache_write` — OpenAI's automatic prompt caching only bills cache-reads at a discount and has no separate cache-creation billing

The `cache` analyzer can now compute creation vs read breakdowns from real data. (PR #95, closes #93 + #94.)

Web UI works fully offline

The `tj serve` dashboard at `http://127.0.0.1:7391/` was loading three things from external CDNs at render time, which broke the "local-first, no data egress" promise for any user running TokenJam in an air-gapped environment to verify exactly that claim:

  • Favicon SVG from `tokenjam.dev` → now inlined as a `data:` URL
  • Geist + Geist Mono fonts from `fonts.googleapis.com` → removed; system-font fallbacks already in CSS
  • Preact + hooks + htm from `esm.sh` → vendored under `tokenjam/ui/vendor/`, served via FastAPI `StaticFiles`, wired up via `<script type="importmap">` (JS source unchanged)

New regression tests catch any future external-URL slip. (PR #88, closes #87.)

`pipx install tokenjam` is the recommended install path

`pip install tokenjam` failed on macOS with Homebrew Python (PEP 668) and Debian 12+ / Ubuntu 24+. The unhelpful error broke the 3-command quickstart flow promised on `tokenjam.dev/tokenmaxxing`. All install snippets across the README, `docs/`, blog tutorials, and `examples/` now lead with `pipx install tokenjam`. Cross-platform pipx-install fallback table includes macOS / Debian/Ubuntu / Windows / generic. (PR #88 + PR #89, closes #86.)

Install

```
pipx install tokenjam==0.3.4
```

TypeScript SDK in lockstep as `@tokenjam/sdk@0.3.4`.


Thanks to @sjhddh for two excellent cost-accuracy fixes (PRs #90 + #92) and to everyone who flagged install + offline-UI issues during the 0.3.x rollout.

v0.3.3 — TokenMaxx report polish + Opus 4.5 pricing

@anilmurty anilmurty released this 09 Jun 18:34
d7145ee

A launch-readiness polish release for the tj tokenmaxx social moment, plus one more pricing-accuracy fix from a community contributor.

TokenMaxx Report — visual + structural polish

The tj tokenmaxx output is now a bordered report panel designed to be a clean screenshot artifact:

╭─ TokenJam TokenMaxxing Report ──────────────────────────────────────────────╮
│                                                                              │
│  🔥🔥 You're a TokenGigaChad.                                                │
│                                                                              │
│  Touch grass. Then run tj optimize.                                          │
│                                                                              │
│  $4056.82 in last 30d across 33 sessions.                                    │
│  That's 40.6× your Max 5x plan cost ($100/mo flat).                          │
│                                                                              │
│  💡 No obvious savings flagged yet — run tj optimize for the full report     │
│  once you have more data.                                                    │
│                                                                              │
╰──────────────────────────────────────────────────────────────────────────────╯
  Share your tier: screenshot the above and tag @tokenjamdev
  • Spend now renders at 2 decimals ($4056.82, was $4056.8200)
  • Plan fee strips decimals when whole-dollar ($100, was $100.0000)
  • `tj optimize` rendered bold-green wherever it appears
  • Share line in teal, points at `@tokenjamdev`

TokenMaxx — plan-relative tier ladder

Tiers are now based on the multiplier vs your plan cost, so the tier name means the same thing across Pro / Max-5x / Max-20x users:

| Multiplier (subscription) | Absolute /mo (API) | Tier |
| --- | --- | --- |
| < 1× | < $100 | 💧 TokenSipper |
| 1× – 4× | $100 – $400 | 🥱 TokenModerator |
| 4× – 10× | $400 – $1000 | 💸 TokenMaxxer |
| 10× – 20× | $1000 – $2000 | 🔥 TokenChad |
| 20×+ | $2000+ | 🔥🔥 TokenGigaChad |

API users (no plan to multiply against) fall back to absolute USD thresholds, calibrated against Max-5x = $100/mo so the tier name carries the same meaning in either world. A Pro user at 15× their plan and a Max-5x user at 15× their plan are both TokenChads — the tier reflects "how hard you're maxxing," not raw spend.

Pricing fix: Claude Opus 4.5

tokenjam/pricing/models.toml had Opus 4.5 at the old $15 / $75 tier. Anthropic moved 4.5 to $5 / $25 (same tier as 4.6 / 4.7 / 4.8). Users on 4.5 were seeing ~3× inflated cost figures; fixed in this release.

The repo-root pricing/models.toml (orphaned since v0.1.x) was also removed — the runtime only reads tokenjam/pricing/models.toml, and the duplicate file was confusing contributors. CLAUDE.md, CONTRIBUTING.md, and the fallback warning string all now point at the real path.

Thanks to @kelter-antunes for the catch and the dedupe.

Install

```
pip install tokenjam==0.3.3
```

TypeScript SDK in lockstep as `@tokenjam/sdk@0.3.3`.