Releases: Metabuilder-Labs/tokenjam
Release list
v0.5.3
What's Changed
- Fix #303: add long-window options to Traces by @sanmaxdev in #304
- Fix #299: align recent activity trace window by @sanmaxdev in #305
- Fix #284: backfill --since flag parity by @sanmaxdev in #307
- [chore] Complete StorageBackend protocol, remove direct db.conn access (#309) by @anilmurty in #314
- [enhancement] Surface sample size + confidence interval on savings estimates (#308) by @anilmurty in #316
- fix(lens): gate Analytics Stack control to stacking charts (#295) by @ashwmu in #315
- Fix #310: use utcnow instead of datetime.now in ApiBackend by @tarun73 in #317
- fix(lens): show leaderboard total + reconcile partial-dimension gap (#313) by @ashwmu in #319
- Fix #320: capture prompt + completion content on provider patches by @anilmurty in #321
- feat(lens): show breakdown subtotal on the active KPI tile (#318) by @ashwmu in #322
- Fix #298: Clarify drift baseline agent scope and live session rules by @tarun73 in #323
- Feat/mcp wiring doctor check by @tarun73 in #324
- feat(ui): dynamic analytics csv export with loading state by @tarun73 in #325
- docs: cross-link tokenjam-bench as the "prove" step (#336) by @anilmurty in #337
- docs: add missing CLI reference commands (#280) by @sanmaxdev in #338
- Add summarize scan surface: tj summarize list + MCP tool by @auspexlabs in #354
- test: isolate CLI onboard tests so test_cli.py can't hang (#23) by @anshss in #339
- fix: batch of 7 bug-hunt fixes (optimize, mcp, db, analytics, sdk) by @anshss in #353
- Add summarize mechanism: prep/check — CLI (manual) + MCP (in-session) by @auspexlabs in #355
- Add summarize apply/undo + backup (#329) by @auspexlabs in #356
- Add summarize delivery: prep --via claude-p + --via api (#329) by @auspexlabs in #357
- feat: Add pre-filled tjb command after model-downgrade finding (#330) by @tarun73 in #358
- feat: Docker image for tj serve by @vigneshvinfra in #359
- feat(cli): add read-only tj pricing list command by @Axiya3749 in #361
- docs: document tj pricing list command (#282 follow-up) by @anilmurty in #362
- Add Bedrock Nova Micro/Premier, Anthropic-on-Bedrock, and LiteLLM provider pricing entries by @Axiya3749 in #366
- fix: deflake test_budget_cap_policy timezone-boundary test by @anilmurty in #372
- fix: Claude Code session status + lifecycle (1/5 from #306) by @anshss in #367
- feat: tj context cost diagnostics (3/5 from #306) by @anshss in #368
- feat: zero-install / npm package (2/5 from #306, stacked on #368) by @anshss in #369
- fix: Claude Code cost & alert correctness (4/5 from #306) by @anshss in #370
- feat: session Approach / Map / Timeline (5/5 from #306, stacked on #367) by @anshss in #371
- test: cross-backend StorageBackend parity suite (#51) by @anshss in #379
- Fix #55: self-heal recorded-but-unlanded migrations (silent ingest drop → blank Status) by @anshss in #381
- [feature] Close the loop: annotations, expectations, fix-history (#53) by @anshss in #380
- feat(onboard): zero-token statusline for Claude Code/Codex; reserve MCP for SDK (#59) by @anshss in #374
- Fix #340: resolve real tj binary for daemon install fallback by @anilmurty in #384
- Fix #373: Bedrock pricing keys never match live spans — normalize provider + model id in get_rates by @anilmurty in #385
- chore: add CI-gated GHCR Docker image publish workflow by @vigneshvinfra in #360
- fix(ci): publish npm-wrapper
tokenjamon release, version from tag (#65) by @anshss in #386 - fix(mcp): route get_optimize_report through serve when DB is locked (#61) by @anshss in #388
- fix(transcript): preserve TodoWrite payload instead of dropping it (#67) by @anshss in #391
- Fix: Lens Overview false empty on historical-only telemetry by @anilmurty in #392
- feat: price 1M-context [1m] variants + current OpenAI GPT-5 lineup by @anilmurty in #393
- Fix #60: flag delegating sessions with unaccounted subagent turns in weighted quota by @anshss in #387
- fix(#63): make
tj contextrender whiletj serveis running by @anshss in #390 - fix(#62): ship output-cap hook DEFAULT-OFF opt-in — A/B gate failed (CC pre-truncates Bash at 30KB) by @anshss in #389
- Feat/312 async hooks by @tarun73 in #363
- docs: lead README with pipx + scrub ccusage competitor references by @anilmurty in #394
- feat: add Prove (tokenjam-bench) nudge to onboard + tokenmaxx by @anilmurty in #395
- fix: guard Lens KPI "% vs prev" deltas against near-empty prior window by @anilmurty in #396
- Fix #294, #300: purge stale-scheme backfill span duplicates (cross-version) by @anilmurty in #397
- Bump version to 0.5.3 by @anilmurty in #399
New Contributors
- @tarun73 made their first contribution in #317
- @auspexlabs made their first contribution in #354
- @vigneshvinfra made their first contribution in #359
- @Axiya3749 made their first contribution in #361
Full Changelog: v0.5.2...v0.5.3
v0.5.2 — first-run accuracy & polish
TokenJam 0.5.2 — first-run accuracy & polish
Accuracy fixes (the headline)
- Backfill no longer over-counts your spend. Claude Code sessions that were resumed or compacted were ingested 2–4× over — the same model calls re-counted under fresh message IDs. Token and cost figures (and everything derived from them) are now correct. If you backfilled before 0.5.2, wipe and re-run
tj backfill claude-codeto correct your historical numbers. (#294) tj demonow works on installed versions. The incident scenarios were never packaged into the wheel, sotj demolisted nothing forpip/pipxusers. Fixed —tj demo retry-loopruns out of the box. (#291)- Traces list raised to 200 by default, with "Showing N of M" + a Load more control. (#274)
tj budget --jsonfor machine-readable budgets. (#281)
New capability
- Full-request capture — sampling params and tools/tool_choice are now captured on LLM spans (gated by your
[capture]settings). (#209) - Experimental enforcement-plane preview — an optional local proxy + policy engine (budget caps, routing/reuse policies) ships in suggest mode only: it shows what it would do and enforces nothing. Interfaces are unstable and off by default. (#219–#223)
Docs & brand
- Clearer contributing guide, Code of Conduct, refreshed dashboard screenshots, new black-on-white logo. (#289, #275, #273)
Thanks to our contributors 🙌
This release includes community contributions from @sanmaxdev (traces pagination #274, Code of Conduct #275) and @gitcommit90 (tj budget --json #281). Thank you!
Install: pipx install tokenjam
Full changelog: v0.5.1...v0.5.2
v0.5.1 — First-run polish
0.5.1 sands down every rough edge a new user hits on day one — building on 0.5.0's Lens Visualizations with a unified dashboard, honest subscription framing, faithful backfill data, and a smoother first run.
Highlights
🏠 A unified Dashboard, now the default — the Analytics pivot explorer and the triage front-door are merged into one screen. Land on it, see your recoverable waste and health at a glance, then click any tile to drill straight into the explorer in place.
💳 Honest subscription framing, everywhere — per-item costs now read in tokens (not a confusing "% of cycle"), and the headline Spend shows a "× plan value" multiplier — your usage expressed as a multiple of what your subscription costs. Subscription users never see raw dollar "spend."
🔍 Faithful backfilled history — Claude Code history now reconstructs into session-level traces (a whole conversation as one trace, with its tool calls nested) instead of thousands of fragments. Cache read vs write tokens are tracked separately, and backfill reports honest session counts.
📊 Clearer charts — the cost-by-component chart no longer collapses into a single block, the cache-savings chart leads with the answer instead of an unreadable dual-axis, and the trace waterfall puts span names in a fixed column with cost-magnitude bars and timestamps.
👋 A warmer first run — tj onboard opens with a branded welcome banner and ends with a next-steps nudge that leads with what works immediately (your last 30 days are already loaded — try tj tokenmaxx, tj optimize, tj serve).
✨ Polish — thousand-separated counts, softened first-day deltas, a Status list view by default, a helpful empty state when a metric doesn't apply to a dimension, and a real contribution funnel in the README.
Honest by construction
Every estimate is "estimated recoverable," never "saved." Every dollar-bearing surface respects plan tier — subscription users see token-share and plan-value, never raw spend.
Install / upgrade
pipx install tokenjam # new
pipx upgrade tokenjam # existing — then: tj stop && tj serve &Full changelog: v0.5.0...v0.5.1
v0.5.0 — Lens Visualizations
TokenJam Lens grows from a single spend chart into a full visualization surface. This release ships the Lens Visualizations milestone — new ways to see where your tokens and dollars go, all offline, all plan-tier-honest.
Highlights
📊 Analytics pivot explorer — a new screen that turns one query into any view: pick a metric (spend / tokens / sessions / events), break it down by any dimension (model, agent, tool, day…), stack a second dimension, switch chart type, and export CSV. State lives in the URL, so any view is shareable and bookmarkable.
💸 Cost screen, deeper — stacked cost-by-model/agent breakdown over time, plus a cache-savings time-series showing cumulative captured savings against your cache hit-rate.
🔍 Optimize overview — cost-by-component with a recoverable-waste overlay, so the biggest savings opportunities surface at a glance.
✨ KPI sparklines + deltas — every Analytics headline tile now carries a trend sparkline and a period-over-period change vs. the prior window.
🧭 Trace waterfall, cost-first — the trace detail view annotates each span with cost and tokens alongside a magnitude bar, so you can see where the spend lands across a run.
🎨 Consistent coloring — one shared 12-hue color map gives every series the same color on every chart and screen; grouping by a time dimension no longer produces a confusing rainbow.
↪️ Deep-linked Overview — Overview tiles and the spend headline now click through into the Analytics explorer, pre-filtered to the relevant slice.
Honest by construction
Every dollar-bearing surface respects plan-tier framing: subscription users see token-share, never raw spend. Every estimate is "estimated recoverable," never "saved."
Docs
Install docs now lead with pipx install tokenjam (sidesteps PEP 668 on Homebrew/Debian/Ubuntu Python). The [mcp] extra is no longer needed — the MCP server ships in the base install.
Install
pipx install tokenjam # recommended
# or: pip install tokenjam (in a clean venv)Full changelog: v0.4.2...v0.5.0
v0.4.2 — honesty hardening + cost-accuracy fixes
A maintenance release closing a full pass of plan-tier-framing and cost-accuracy bugs found by checking TokenJam against real usage, plus first-class user pricing.
Highlights
- Honest plan-tier framing everywhere. Subscription/local users no longer see raw dollar figures across the CLI (
tj cost) or the Lens web UI — Cost table, Traces, trace-detail, Status, and Optimize all suppress/reframe correctly; chart framing is consistent across time windows; CLI and Lens agree on the same data; axes read in local time with coherent daily labels. (#175, #176, #177, #178, #187, #188, #191, #197) - Cost accuracy. Fixed LiteLLM provider attribution (was recording "litellm" → ~42% cost undercount, NULL billing account, suppressed plan); LiteLLM now also captures prompt/completion content and cache tokens. Imported sessions inherit the configured plan tier. (#183, #194, #195)
- First-class user pricing. Override per-model rates from your config — keyed by bare model name so it works even when the provider can't be inferred — with docs. (#200, #201)
- SDK & onboarding. The SDK honors
TJ_CONFIG(no more writing to the wrong DB); onboarding /tj doctorsurface when Claude Code needs a restart to start sending telemetry. (#179, #196)
What's Changed
- docs: README cleanup post-v0.4.1 (Reuse + Lens added, competitive table removed) by @anilmurty in #169
- docs: drop stale CHANGELOG.md + add maintainer contact by @anilmurty in #170
- Track A part 1: weekly GitHub traffic archive workflow by @anilmurty in #171
- Track A part 1 follow-up: workflow pushes as PAT owner to satisfy main-branch protection by @anilmurty in #172
- Track A: move traffic archive to traffic-data branch (close-out) by @anilmurty in #173
- docs: add traffic-archive action to CLAUDE.md so future agents discover it by @anilmurty in #174
- docs: add docs/optimize/reuse.md analyzer deep-dive (#168) by @anilmurty in #180
- Fix #175, #176: tj cost framing + backfill plan_tier propagation (v0.4.2) by @anilmurty in #181
- Apply declared plan tiers across onboard, serve, and UI by @HoomanDgtl in #184
- docs: add PR + commit conventions for any agent producing a PR by @anilmurty in #182
- Fix #179: surface Claude Code connection staleness (onboard banner + tj doctor check) by @anilmurty in #185
- Fix #177, #178: consistent Lens chart framing + local-timezone axis labels by @anilmurty in #186
- Fix #188: keep daily chart date labels UTC-aligned by @anilmurty in #189
- Fix #187: suppress raw dollar figures for subscription/local users on web table & trace surfaces by @anilmurty in #190
- Fix #191: suppress raw dollar figures for subscription/local users on Status, Optimize & Reuse web surfaces by @anilmurty in #192
- Fix #196: SDK bootstrap honors TJ_CONFIG for config discovery by @anilmurty in #199
- Fix #194: resolve LiteLLM provider from model name (no more "litellm" provider) by @anilmurty in #198
- Fix #197: CLI and Lens render identical plan-tier framing (window-independent mix) by @anilmurty in #202
- Fix #195: capture prompt/completion content + cache tokens on LiteLLM spans by @anilmurty in #203
- Make user pricing first-class + attribution-proof (#200, #201) by @anilmurty in #205
- Bump version to 0.4.2 by @anilmurty in #206
New Contributors
- @HoomanDgtl made their first contribution in #184
Full Changelog: v0.4.1...v0.4.2
v0.4.1 — concurrency + cycle-aware run-rate + Reuse HTTP fallback
A quality + follow-up release on top of v0.4.0's marquee. Six issues closed, each as a focused PR that landed cleanly. Pre-release pass: 10/10 PASS, 0 FAIL, 0 UNCLEAR.
🔧 Daemon DB concurrency — the Overview no longer crashes under fan-out
DuckDBBackend.conn is now a per-thread DuckDB cursor (threading.local) over one shared database. Cursors from connect().cursor() are independent connections safe for concurrent use across threads, which is the documented DuckDB pattern.
Pre-fix symptom: with tj serve running, the Overview's parallel endpoint fan-out + Starlette's sync-route threadpool would race on a single shared connection and SIGABRT the daemon. We worked around it by fetching the Overview's panels sequentially. That workaround is now gone — the Overview fetches all six panels via a single Promise.all with deliberately asymmetric error handling (/cost is load-bearing and rejects on failure; the other five panels each carry a .catch fallback so one failing panel renders empty rather than blanking the whole screen).
Empirically verified: 90 concurrent reads against the Overview endpoint set complete with zero errors. Pre-fix this would crash the process. Issue #124, PR #161.
🗓 Cycle-aware run-rate
tj cost and the Lens spend chart's run-rate projection now honor [budget.<provider>] cycle_start_day when configured. Calendar-month users (the default) still see "by end of June"; users with a billing cycle that starts mid-month see "by Jul 15" — a dated form, because "by end of July" would mislead when the cycle ends mid-month.
core/cycle.py is shared between the existing budget_projection analyzer and the cost-API caption, so both surfaces use one piece of cycle math. The API exposes a cycle block on /api/v1/cost; the UI consumes it. Issue #138, PR #158.
📈 tj cost chart no longer silently empties on long windows
buildCostSeries previously had if (xs.length > 5000) return null as a guard against pathological grids — fine for a 90-day cap today, but if anyone ever requests hourly bucketing over a multi-year window the chart would just go blank with no explanation.
It now coarsens up an hour → day → week ladder until the grid fits, computed closed-form so an oversized array is never allocated. When coarsening fires, a footnote on the chart explains it: "Showing day buckets — this range is too long for hourly detail." Genuinely empty windows still render the existing "No spend in this window" state. Issue #139, PR #163.
📊 Status: active (compute) time alongside wall-clock
Pre-fix, a resumed Claude Code session that ran across days showed Duration: 3087m — easy to misread as a runaway. v0.4.1 distinguishes:
- Active — Σ span durations. The work actually done.
- Elapsed — wall-clock from session start to last activity. Resumed Claude Code sessions span days; this can be far larger than Active.
Visible in tj status (Duration: active 12m 3s · elapsed 2d 3h), in tj status --json (both fields), and in the Lens Status tile (two distinct rows with tooltips). A new fmtDurLong formatter renders multi-day spans as 2d 3h instead of 3087m. Issue #147, PR #164.
🔁 tj report --reuse works while the daemon is running
The Reuse report needed direct DB access to fetch each cluster's planning-call completion text — so when tj serve held the write lock, the command errored out and pointed the user at tj stop. v0.4.1 adds a dedicated GET /api/v1/reuse/clusters endpoint that returns the Reuse finding plus the skeleton-rendering extras (planning_texts + pricing_mode).
tj report --reuse now dispatches like tj optimize: direct DB connection when available, ApiBackend.fetch_reuse_clusters when the daemon owns the lock. Renderer accepts pre-fetched planning texts as an alternative to a connection. The endpoint is dedicated rather than bolted onto /optimize because per-cluster planning text can be many KB and the Overview polls /optimize every 30s — we don't make every poll pay for report-only data.
tj report --trim retains the same direct-DB limitation, now explicitly documented in CLAUDE.md. Issue #154, PR #165.
🎨 Recoverable Waste tile consistency
Three small but visible inconsistencies on the Lens Overview's tile band:
reuserendered lowercase while the other analyzer names were title-cased. Added an explicitANALYZER_METAentry plus a centralizedcapitalize()helper used in the fallback path — so the next analyzer that ships will auto-capitalize instead of falling through to its raw lowercase registry key.- Trim's
— not readyhad a leading em-dash while the other tile states didn't. Dropped to plainNot readyso the three states share a prefix-free scheme. - Cache title bold investigated and locked with a regression guard. Couldn't reproduce in current source, but the guard test now asserts no state-specific bold-title rule can be silently added.
Upgrade
pipx upgrade tokenjam
tj stop && tj serve &
tj --version # expect 0.4.1Existing installs keep their config, data, and daemon setup. No breaking changes.
Pre-release verification
A 10-step focused pre-release pass (tests/agent-pre-release-v0.4.1.md) was executed by a sub-agent against the live daemon. Result: 10/10 PASS, 0 FAIL, 0 UNCLEAR.
Coverage included the marquee #124 concurrency reproduction (90 concurrent reads, no crash), /api/v1/cost.cycle block, tj report --reuse with daemon running, the new active/elapsed status fields, the 90d cost chart render, and the tile consistency fixes — plus regression checks on TokenMaxx, all five analyzers, and the API framing block.
Full log committed to tests/results/agent-pre-release-v0.4.1-20260620T002546Z.md as a release-time record.
Honesty discipline
The release continues the v0.4.0 framing rules:
- Run-rate captions remain "linear, not a forecast" — no smoothing, no seasonality, no anomaly bands; just a date-cycle projection
- Coarsened chart windows are explicit about the coarsening, never silent
- The Reuse HTTP fallback is documented as paying-for-what-you-use (dedicated endpoint instead of bolting it onto
/optimize); no hidden cost on Overview polls - Status
ActivevsElapsedlabels distinguish work-done from wall-clock; the misleading bare "Duration" label is gone
Full changelog
v0.4.0 — TokenJam Lens + Reuse
First minor bump since v0.3.0. Significant new product surface — substantially more than the 0.3.x patch cadence — so we bumped the minor.
🔭 TokenJam Lens — the local UI is a product
The local dashboard you get from tj serve has been rebranded and rebuilt as TokenJam Lens. It's the same offline-first single-file SPA, but with a real triage front door instead of a list of tables.
- Overview screen — the new default landing route. Three bands: spend hero (with a real chart and a "to end of month" run-rate projection that's explicitly not a forecast), recoverable-waste tiles (one per analyzer, registry-driven so future analyzers auto-appear), and health-at-a-glance (alerts, drift, budgets, recent activity).
- Optimize detail tab — every analyzer's findings rendered in one place, with
?finding=<name>deep-links from the Overview tiles. - Real charts — uPlot, vendored offline (zero CDN loads, the dashboard works air-gapped). Hover tooltips, theme-aware (light/dark/system), tick granularity scales with the window (hours/days/weeks).
- URL state as the source of truth — every filter (window, group-by, agent, finding) lives in the hash, so back/forward/reload work cleanly and cross-screen drill-through carries context.
- Cost transparency — the
tj costtable and the Cost screen now show CACHE R and CACHE W columns alongside input/output tokens, so a cache-heavy$1.44span no longer looks like it came out of nowhere. (The hidden ~91% cost driver on Claude Code-style workloads finally has a name.)
🔁 Reuse — the 5th analyzer
Agents re-plan the same work constantly. Reuse detects clusters of sessions that share a planning skeleton (the first LLM call before any tool call) and surfaces what that repeated planning costs.
tj optimize reuse— clusters sessions by structural similarity (tool-sequence signature, plus prompt-prefix hashing when[capture] prompts = true); produces two honest numbers per cluster: cache-reuse savings (what you'd recover by reusing the skeleton) and script-replacement savings (the upper bound if you converted the planning to a deterministic template).tj report --reuse— renders the clusters as an HTML report plus per-cluster Markdown sidecars with variable slots highlighted ({{slot_N}}), idempotent oncluster_idso re-runs overwrite cleanly. The Markdown is copy-paste-usable as a Claude Code slash command or saved prompt.- Honesty by construction: "skeleton match," "recoverable," "review before reusing" — never "saves you."
Known limitation: tj report --reuse needs direct DB access today, so tj stop first if the daemon is running. HTTP fallback is tracked for v0.4.1 (#154).
💰 Cost transparency
cache_write_tokensis now surfaced everywhere — intj cost, on the web Cost screen, in the trace-detail view, and via/api/v1/cost. Previously it was billed but invisible above the DB layer.- Plan-tier-aware rendering —
core/framing.pyis the single source of truth for whether to show dollars (api), token-share (subscription), tokens only (local), or a "may overstate" qualifier (unknown). The CLI and the REST API consume it identically. - Analyzer recoverable contract — every savings analyzer now emits
estimated_recoverable_usd+estimated_recoverable_tokens+estimate_basis+estimate_confidenceon a single time basis (window), so Overview tiles are directly comparable. - Honest run-rate — the Lens chart and
tj costprojection use a window-average run-rate × days remaining in cycle, captioned "linear run-rate, not a forecast." No EWMA, no seasonality, no anomaly bands.
🔒 Security
- A committed
.tj/config.tomlwith a liveingest_secret(in repo since v0.2.0) is now untracked. Limited blast radius — local network ingest token only — but a CI test now guards against re-staging it. See PR #145 for the full advisory.
🧹 Quality
- 17 v0.3.5 post-release findings closed (#141) from an external contributor; 5 rounds of Lens UI bug fixes; 9 individual fix PRs across the release. The full pre-release smoke pass ran twice and is committed under
tests/results/as a record.
Honesty discipline
This release is the most public-facing one yet. The framing language is deliberate everywhere:
- Recoverable amounts are estimated, never saved.
- Cache hits at 100% efficacy show "✓ Already optimized," not "no findings."
- Subscription users see token-share framing, not dollar figures.
- Forecasting is bounded to a single linear projection captioned "not a forecast."
Upgrade
pipx upgrade tokenjam
tj stop && tj serve &
tj --version # expect 0.4.0Existing installs keep their config, data, and daemon setup. Open the dashboard:
open http://127.0.0.1:7391/…and you'll land on the new Overview.
Full changelog
v0.3.5
First-run polish + bug fixes surfaced during the v0.3.5 pre-release playbook. No breaking changes — pipx upgrade tokenjam, then tj stop && tj serve & to reload the daemon.
Bug fixes
- #101 —
tj mcpworks out of the box on a fresh install.fastmcpmoved from the[mcp]extra into base dependencies, sopipx install tokenjamis enough to wire TokenJam into Claude Code without remembering an extra. The[mcp]extra is kept as a no-op for back-compat. The MCP server now also raises a clean, actionableImportErrorpointing at the fix iffastmcpis somehow missing. - #98 — "No pricing data" warning no longer spams during backfill. Warns once per
(provider, model)per process. Verified against a 20,000-span Claude Code backfill: zero warnings where pre-0.3.5 emitted hundreds. Deprecated Anthropic base models (claude-sonnet-4,claude-opus-4,claude-opus-4-1,claude-haiku-3-5) added topricing/models.tomlso dated variants likeclaude-sonnet-4-20250514resolve correctly via the YYYYMMDD-stripping fallback instead of falling through to defaults. - #106 — UI footer no longer shows a 9-release-old version.
tokenjam/__init__.pynow reads fromimportlib.metadata.version("tokenjam")(single source of truth =pyproject.toml). NewGET /api/v1/versionendpoint; the UI footer fetches it on init. Same-origin — offline-UI guarantee preserved. - #106 —
GET /healthendpoint added as a conventional uptime probe. Returns{"status":"ok","version":"..."}./api/v1/statuscontinues to be the agent overview. - #106 —
tj tokenmaxxplan-multiplier renders from project subdirs. When run from a directory whose.tj/config.tomlhas no[budget]section,_config_declared_plannow falls back to reading~/.config/tj/config.tomldirectly. Previously dropped silently to api-pricing framing even when the user hadplan = "max_5x"configured globally viatj onboard. - #105 —
tj report --trimnot-ready hint renders[capture]literally. Rich's print parser was silently swallowing the[capture]substring as an invalid style tag, hiding the section name the user needs to enable.
DX
tj --helpepilog now shows the canonical upgrade incantation (pipx upgrade tokenjam→tj stop && tj serve &→tj --version).docs/installation.mddocuments that[mcp]is no longer needed.- Pre-release and post-release manual test playbooks updated to cover the new surface (six-tier tokenmaxx ladder, offline-UI DevTools check, cache cost-correctness verification).
Upgrade
pipx upgrade tokenjam
tj stop && tj serve &
tj --version # expect 0.3.5Full changelog
v0.3.4 — Six-tier ladder + cache cost-accuracy fixes + offline UI
A mix of a user-visible product change (the tokenmaxx tier rename + expansion) and three credibility-grade infrastructure fixes — two for cost accuracy on the cache path, one for the local-first promise.
TokenMaxx ladder expanded to 6 tiers
The top tier was previously TokenGigaChad at 20×+ — but almost every Claude Code power user lands there, so the headline lost its bite. Two changes:
- Top tier raised to 50×+ so it reflects genuinely extreme usage, not everyday heavy use
- New tier in the middle for the 20–50× range
- Top two tiers renamed to drop the Chad branding
| Multiplier (subscription users) | Absolute /mo (API users) | Tier |
|---|---|---|
| < 1× | < $100 | 💧 TokenSipper |
| 1× – 4× | $100 – $400 | 🥱 TokenModerator |
| 4× – 10× | $400 – $1,000 | 💸 TokenMaxxer |
| 10× – 20× | $1,000 – $2,000 | 🔥 TokenSuperMaxxer (was TokenChad) |
| 20× – 50× | $2,000 – $5,000 | 🔥🔥 TokenMegaMaxxer (new) |
| 50×+ | $5,000+ | 🔥🔥🔥 TokenGigaMaxxer (was TokenGigaChad) |
Breaking note: the JSON output's `tier` field carries the new label string verbatim. Any consumer scripting against `TokenChad` / `TokenGigaChad` in `tj tokenmaxx --json` must update.
Cost-accuracy fixes (cache path)
Two related cache-billing fixes from community contributor @sjhddh, plus a follow-up to persist the raw counts.
Cache-only spans no longer costed at $0
A prompt-cache hit (`input_tokens=0`, `output_tokens=0`, but `cache_read_tokens > 0`) bills the cache-read rate, but `calculate_cost()` and `CostEngine.process_span()` were short-circuiting on input/output token counts alone — dropping the span as a no-op. The better your caching, the more cost went missing. The early-return guards now check all four token counts. (PR #90, kudos @sjhddh.)
Cache-creation tokens now costed on the live OTLP ingest path
The SDK integrations emit `gen_ai.usage.cache_creation_tokens` and the pricing table carried a `cache_write_per_mtok` rate, but the live parsers (`parse_otlp_span` + `convert_otel_span`) only read cache-read tokens. So every cache-write emitted via SDK was silently dropped, and the higher-rate cost never charged. `NormalizedSpan` now carries `cache_write_tokens`; both parsers populate it; `process_span` charges it. (PR #92, also @sjhddh.)
`cache_write_tokens` now persisted and threaded everywhere
- New `cache_write_tokens` column on the `spans` table (migration 5)
- The 3 remaining ingest paths (Langfuse, Helicone, Claude Code log adapter) now thread cache-creation tokens through `NormalizedSpan` for consistency with the live OTLP path
- Codex's `cached_token_count` deliberately not mapped to `cache_write` — OpenAI's automatic prompt caching only bills cache-reads at a discount and has no separate cache-creation billing
The `cache` analyzer can now compute creation vs read breakdowns from real data. (PR #95, closes #93 + #94.)
Web UI works fully offline
The `tj serve` dashboard at `http://127.0.0.1:7391/\` was loading three things from external CDNs at render time, which broke the "local-first, no data egress" promise for any user running TokenJam in an air-gapped environment to verify exactly that claim:
- Favicon SVG from `tokenjam.dev` → now inlined as a `data:` URL
- Geist + Geist Mono fonts from `fonts.googleapis.com` → removed; system-font fallbacks already in CSS
- Preact + hooks + htm from `esm.sh` → vendored under `tokenjam/ui/vendor/`, served via FastAPI `StaticFiles`, wired up via `<script type="importmap">` (JS source unchanged)
New regression tests catch any future external-URL slip. (PR #88, closes #87.)
`pipx install tokenjam` is the recommended install path
`pip install tokenjam` failed on macOS with Homebrew Python (PEP 668) and Debian 12+ / Ubuntu 24+. The unhelpful error broke the 3-command quickstart flow promised on `tokenjam.dev/tokenmaxxing`. All install snippets across the README, `docs/`, blog tutorials, and `examples/` now lead with `pipx install tokenjam`. Cross-platform pipx-install fallback table includes macOS / Debian/Ubuntu / Windows / generic. (PR #88 + PR #89, closes #86.)
Install
```
pipx install tokenjam==0.3.4
```
TypeScript SDK in lockstep as `@tokenjam/sdk@0.3.4`.
Thanks to @sjhddh for two excellent cost-accuracy fixes (PRs #90 + #92) and to everyone who flagged install + offline-UI issues during the 0.3.x rollout.
v0.3.3 — TokenMaxx report polish + Opus 4.5 pricing
A launch-readiness polish release for the tj tokenmaxx social moment, plus one more pricing-accuracy fix from a community contributor.
TokenMaxx Report — visual + structural polish
The tj tokenmaxx output is now a bordered report panel designed to be a clean screenshot artifact:
╭─ TokenJam TokenMaxxing Report ──────────────────────────────────────────────╮
│ │
│ 🔥🔥 You're a TokenGigaChad. │
│ │
│ Touch grass. Then run tj optimize. │
│ │
│ \$4056.82 in last 30d across 33 sessions. │
│ That's 40.6× your Max 5x plan cost (\$100/mo flat). │
│ │
│ 💡 No obvious savings flagged yet — run tj optimize for the full report │
│ once you have more data. │
│ │
╰──────────────────────────────────────────────────────────────────────────────╯
Share your tier: screenshot the above and tag @tokenjamdev
- Spend now renders at 2 decimals (
\$4056.82, was\$4056.8200) - Plan fee strips decimals when whole-dollar (
\$100, was\$100.0000) - `tj optimize` rendered bold-green wherever it appears
- Share line in teal, points at `@tokenjamdev`
TokenMaxx — plan-relative tier ladder
Tiers are now based on the multiplier vs your plan cost, so the tier name means the same thing across Pro / Max-5x / Max-20x users:
| Multiplier (subscription) | Absolute /mo (API) | Tier |
|---|---|---|
| < 1× | < $100 | 💧 TokenSipper |
| 1× – 4× | $100 – $400 | 🥱 TokenModerator |
| 4× – 10× | $400 – $1000 | 💸 TokenMaxxer |
| 10× – 20× | $1000 – $2000 | 🔥 TokenChad |
| 20×+ | $2000+ | 🔥🔥 TokenGigaChad |
API users (no plan to multiply against) fall back to absolute USD thresholds, calibrated against Max-5x = $100/mo so the tier name carries the same meaning in either world. A Pro user at 15× their plan and a Max-5x user at 15× their plan are both TokenChads — the tier reflects "how hard you're maxxing," not raw spend.
Pricing fix: Claude Opus 4.5
tokenjam/pricing/models.toml had Opus 4.5 at the old \$15 / \$75 tier. Anthropic moved 4.5 to \$5 / \$25 (same tier as 4.6 / 4.7 / 4.8). Users on 4.5 were seeing ~3× inflated cost figures; fixed in this release.
The repo-root pricing/models.toml (orphaned since v0.1.x) was also removed — the runtime only reads tokenjam/pricing/models.toml, and the duplicate file was confusing contributors. CLAUDE.md, CONTRIBUTING.md, and the fallback warning string all now point at the real path.
Thanks to @kelter-antunes for the catch and the dedupe.
Install
```
pip install tokenjam==0.3.3
```
TypeScript SDK in lockstep as `@tokenjam/sdk@0.3.3`.