marqo404/TokenMeter

TokenMeter

7-module LLM token spend dashboard with cross-provider tracking, budget enforcement, and anomaly alerting. Built for AI builders running workloads across MiMo, Claude, GPT, Grok, and Gemini.


┏━━ TokenMeter ── 7d window ── 5 providers ━━━━━━━━━━━━━━━━━━━┓
┃  Provider           Spend       Tokens    Δ                 ┃
┃ ─────────────────────────────────────────────────────────── ┃
┃  MiMo  (flagship)   $187.42     18.7M     +14%              ┃
┃  Claude             $128.10      8.4M      +3%              ┃
┃  GPT-4o             $112.88      9.1M      -2%              ┃
┃  Grok-4              $48.32      2.1M     +22%              ┃
┃  Gemini-2.5-Pro      $34.10      3.8M      -8%              ┃
┃ ─────────────────────────────────────────────────────────── ┃
┃  Total              $510.82     42.1M                       ┃
┃  Burn rate          $73.00 / day                            ┃
┃  EOM projection     $2,194  ⚠︎  budget $2,000                ┃
┃  Trend              ▂▄▃▅▆▆▇█                                ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

Why TokenMeter

Multi-provider AI workloads are a billing nightmare. You're running production traffic across MiMo for tool-use loops, Claude for long-context tasks, GPT for one-shots, Grok for current events, and Gemini for vision, and at month-end finance asks for one number.

Most LLM-ops dashboards either hard-bind to a single vendor or hide behind a SaaS billing seat. TokenMeter is different: it pulls per-provider usage on a sync schedule, normalizes everything into a single SQLite-backed timeseries, evaluates against budget rules, fires alerts on anomalies, and emits clean reports — Markdown, JSON, or rich TUI — from one CLI. No web framework, no external services, no surprise SaaS bill.

Architecture

The MeterDirector runs seven specialised meters in a fixed sequence. Each one implements the Meter base class, which exposes a single synchronous method, run(ctx) -> MeterResult:

  1. ProviderClientMeter — dispatches the five provider adapters (MiMo, Claude, GPT, Grok, Gemini) and returns raw ProviderUsage records.
  2. BillingFetcherMeter — pulls invoice data, dedupes by invoice_id, gracefully drops rate-limited stubs.
  3. NormalizerMeter — unifies provider-specific schemas into BillingPoint records (provider, model, tokens_in, tokens_out, cost_usd, ts, reasoning_tokens, agent_session_id).
  4. BudgetTrackerMeter — evaluates spend against budget rules (hard_cap, daily_cap, percent_of_total, eom_projection) and computes burn rate.
  5. AlertRouterMeter — fires Alert objects, runs a z-score anomaly pass, routes to console / file / webhook stub sinks with severity grading.
  6. TimeseriesStoreMeter — writes points to SQLite under a threading.Lock (a light concurrency safeguard, since TokenMeter itself is fully synchronous) and emits rolling-window aggregates.
  7. ReportGeneratorMeter — produces the final SpendReport: per-provider breakdown, top models, anomalies, daily sparkline, plus all seven meter receipts.
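
The z-score anomaly pass mentioned in step 5 could look roughly like the sketch below. It is an illustration only: the function name zscore_anomalies, the default threshold of 2.0, and the use of the population standard deviation are assumptions, not details taken from the TokenMeter source.

```python
from statistics import mean, pstdev


def zscore_anomalies(daily_spend, threshold=2.0):
    """Flag days whose spend sits far from the window mean.

    Hypothetical helper: the name, the threshold default, and the choice of
    population standard deviation are assumptions for illustration.
    Returns a list of (day_index, spend, z_score) tuples.
    """
    if len(daily_spend) < 2:
        return []  # not enough data to estimate spread

    mu = mean(daily_spend)
    sigma = pstdev(daily_spend)
    if sigma == 0:
        return []  # perfectly flat spend, nothing can be anomalous

    flagged = []
    for index, spend in enumerate(daily_spend):
        z = (spend - mu) / sigma
        if abs(z) >= threshold:
            flagged.append((index, spend, z))
    return flagged


# Synthetic daily totals in USD; the last day is a deliberate spike.
daily = [70, 72, 71, 69, 73, 70, 140]
spikes = zscore_anomalies(daily)
```

With a short window the population z-score is bounded by sqrt(n - 1), so a 7-day window can never exceed about 2.45. A threshold of 3.0 would therefore never fire on weekly data, which is one reason to keep the threshold configurable.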

Every meter is a small, testable unit. The MeterDirector wires them up, owns the SQLite connection, and is the single entry point exercised by the CLI, the demo script, and the Phase B tests.
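
The director-plus-meters arrangement described above can be sketched in a few lines. Only the names Meter, MeterResult, and run(ctx) come from this README. The dict-based context, the MeterResult fields, the two stand-in meters, the Director class, and the stop-on-first-failure behaviour are all assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any


@dataclass
class MeterResult:
    # Field names are assumed; the README only names the type.
    meter: str
    ok: bool
    payload: Any = None


class Meter(ABC):
    name = "meter"

    @abstractmethod
    def run(self, ctx: dict) -> MeterResult:
        """Read from and write to the shared context, then return a receipt."""


class FetchMeter(Meter):
    """Stand-in for a provider-facing meter; emits synthetic usage rows."""
    name = "fetch"

    def run(self, ctx: dict) -> MeterResult:
        ctx["usage"] = [
            {"provider": "mimo", "cost_usd": 1.25},
            {"provider": "openai", "cost_usd": 0.75},
        ]
        return MeterResult(self.name, True, len(ctx["usage"]))


class TotalMeter(Meter):
    """Stand-in for a downstream meter that consumes earlier output."""
    name = "total"

    def run(self, ctx: dict) -> MeterResult:
        ctx["total_usd"] = sum(row["cost_usd"] for row in ctx["usage"])
        return MeterResult(self.name, True, ctx["total_usd"])


class Director:
    """Runs meters in a fixed order and collects one receipt per meter."""

    def __init__(self, meters):
        self.meters = list(meters)

    def run(self) -> list:
        ctx: dict = {}
        receipts = []
        for meter in self.meters:
            result = meter.run(ctx)
            receipts.append(result)
            if not result.ok:
                break  # assumed policy: stop at the first failing meter
        return receipts


receipts = Director([FetchMeter(), TotalMeter()]).run()
```

Passing one mutable context through every meter keeps each run method small and lets a test exercise a single meter by handing it a pre-filled dict.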

Provider Adapters

Provider         Adapter           Models tracked
MiMo (flagship)  MiMoAdapter       mimo-v2.5-pro, mimo-100t, mimo-flash + reasoning_tokens + batch_discount
Anthropic        AnthropicAdapter  claude-sonnet-4, claude-opus-4
OpenAI           OpenAIAdapter     gpt-4o, gpt-4o-mini
xAI              XAIAdapter        grok-4
Google           GoogleAdapter     gemini-2.5-pro

MiMo gets first-class treatment by design: it's the only adapter with three model variants, tiered batch discounts, a dedicated reasoning-token billing line, and per-agent_session_id attribution. TokenMeter was built primarily as a MiMo billing console — the other four adapters exist so the cross-provider story works end-to-end.
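
To make the reasoning-token line and tiered batch discount concrete, here is a minimal cost sketch. Every number in it is invented: the per-million rates, the tier boundaries, and the discount fractions are placeholders, and the Rates, batch_discount, and cost_usd names are hypothetical. It also assumes reasoning tokens are billed as their own line on top of input and output tokens, which this README implies but does not spell out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """USD per one million tokens. Values used below are placeholders."""
    input_per_m: float
    output_per_m: float
    reasoning_per_m: float


# Hypothetical tiers, largest first: (minimum batch size, discount fraction).
BATCH_TIERS = [(1000, 0.20), (100, 0.10), (0, 0.0)]


def batch_discount(batch_size: int) -> float:
    """Return the discount fraction for the first tier the batch qualifies for."""
    for min_size, discount in BATCH_TIERS:
        if batch_size >= min_size:
            return discount
    return 0.0


def cost_usd(rates: Rates, tokens_in: int, tokens_out: int,
             reasoning_tokens: int = 0, batch_size: int = 0) -> float:
    """Sum the three token lines, then apply the batch discount."""
    base = (
        tokens_in * rates.input_per_m
        + tokens_out * rates.output_per_m
        + reasoning_tokens * rates.reasoning_per_m
    ) / 1_000_000
    return round(base * (1 - batch_discount(batch_size)), 6)


# Placeholder rates, not real MiMo pricing.
demo_rates = Rates(input_per_m=1.0, output_per_m=2.0, reasoning_per_m=4.0)
undiscounted = cost_usd(demo_rates, 1_000_000, 500_000, 250_000)
discounted = cost_usd(demo_rates, 1_000_000, 500_000, 250_000, batch_size=100)
```

Keeping the reasoning-token rate separate from the output rate means a report can show it as its own line, even if a provider happens to price the two identically.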

Quick Start

pip install -r requirements.txt

# pull the last 7 days of usage and persist to SQLite
python -m src.cli sync --window 7d

# render a Markdown report to stdout (or pipe to a file)
python -m src.cli report --window 7d --format markdown > spend.md

# inspect alerts that fired during the last sync
python -m src.cli alerts list

# add a hard cap for the MiMo flagship line
python -m src.cli budget add mimo --hard-cap 4000 --daily-cap 200 --period monthly

# enumerate provider adapters and tracked models
python -m src.cli providers list

# sync poll loop — refresh every 60 seconds
python -m src.cli watch --interval 60 --window 1h

Sample Console Output

╭──────────── TokenMeter ────────────╮
│ Window: 2026-05-16 → 2026-05-23     │
│ Total spend: $510.82  Tokens: 42.1M │
│ Burn rate: $73.00/day  EOM: $2,194  │
│ Trend: ▂▄▃▅▆▆▇█                     │
╰─────────────────────────────────────╯

      Provider Breakdown
┃ Provider           Spend     Tokens   Records
┃ mimo (flagship)  $187.42     18.7M    21
┃ anthropic        $128.10      8.4M    14
┃ openai           $112.88      9.1M    14
┃ xai               $48.32      2.1M     7
┃ google            $34.10      3.8M     7

Token Consumption

During design and implementation, this project consumed ~10M tokens/day across Hermes Agent, Claude Code, and Xiaomi MiMo V2.5 Pro — spent on adapter schema design, RRF-style budget rule combination logic, alert sink routing, rich TUI table layout iteration, and continuous test maintenance.

Configuration

config/default.yaml defines:

  • per-provider API key placeholders (env override: TOKENMETER_<PROVIDER>_API_KEY)
  • budget rules (hard_cap, daily_cap, percent_of_total)
  • alert thresholds and sink routing (console / file / webhook stub)
  • timeseries SQLite path
  • reporting defaults (window, format, top-models depth)
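
The environment override for API keys could be resolved along these lines. Only the variable pattern TOKENMETER_<PROVIDER>_API_KEY comes from this README. The resolve_api_key function, the upper-casing of the provider name, and the providers/api_key layout of the parsed YAML are assumptions.

```python
import os


def resolve_api_key(provider, config, environ=None):
    """Prefer the environment variable, fall back to the config value.

    Hypothetical helper. `config` stands for the parsed default.yaml as a
    dict, and its providers -> <name> -> api_key layout is assumed.
    `environ` can be injected so tests do not touch the real environment.
    """
    env = os.environ if environ is None else environ
    env_name = f"TOKENMETER_{provider.upper()}_API_KEY"

    value = env.get(env_name)
    if value:
        return value  # a non-empty environment value wins

    return config.get("providers", {}).get(provider, {}).get("api_key")


sample_config = {"providers": {"mimo": {"api_key": "placeholder"}}}
from_env = resolve_api_key(
    "mimo", sample_config, {"TOKENMETER_MIMO_API_KEY": "secret"}
)
from_file = resolve_api_key("mimo", sample_config, {})
```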

A more aggressive sample lives at examples/sample_config.yaml with MiMo as the highest-priority provider — listed first, biggest budget, daily cap turned on.

Testing

pytest tests/ -v

The suite has 156 tests covering all seven meters, MeterDirector dispatch, the SQLite timeseries store, budget rule evaluation, alert sink routing, the three reporters, and the MiMo adapter's reasoning-token and batch-discount path. The tests use real I/O rather than mocks: an in-memory SQLite database and synthetic adapters keep the suite under two seconds.
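
A test in this style, using an in-memory SQLite database instead of a mock, might look like the sketch below. The TimeseriesStore class, its table schema, and its method names are stand-ins written for this example and are not the project's SqliteDB. The sketch does follow the README in guarding the connection with a threading.Lock.

```python
import sqlite3
import threading


class TimeseriesStore:
    """Illustrative stand-in for a lock-guarded SQLite store."""

    def __init__(self, path=":memory:"):
        # check_same_thread=False lets other threads reuse the connection;
        # the lock below serialises every access to it.
        self._conn = sqlite3.connect(path, check_same_thread=False)
        self._lock = threading.Lock()
        with self._lock:
            self._conn.execute(
                "CREATE TABLE IF NOT EXISTS points ("
                "provider TEXT NOT NULL, ts TEXT NOT NULL, cost_usd REAL NOT NULL)"
            )
            self._conn.commit()

    def write(self, points):
        """Insert (provider, ts, cost_usd) tuples in one locked transaction."""
        with self._lock:
            self._conn.executemany("INSERT INTO points VALUES (?, ?, ?)", points)
            self._conn.commit()

    def spend_by_provider(self):
        """Return total spend per provider as a dict."""
        with self._lock:
            rows = self._conn.execute(
                "SELECT provider, SUM(cost_usd) FROM points "
                "GROUP BY provider ORDER BY provider"
            ).fetchall()
        return dict(rows)


def test_spend_is_aggregated_per_provider():
    store = TimeseriesStore()  # fresh in-memory database for each test
    store.write([
        ("mimo", "2026-05-22", 1.5),
        ("mimo", "2026-05-23", 2.5),
        ("openai", "2026-05-23", 1.0),
    ])
    assert store.spend_by_provider() == {"mimo": 4.0, "openai": 1.0}
```

Because each test builds its own :memory: database, tests stay isolated from one another without any teardown step.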

Project Structure

TokenMeter/
├── src/
│   ├── meters/        7 Meter subclasses + base.py
│   ├── providers/     5 ProviderAdapters (MiMo flagship)
│   ├── storage/       SqliteDB with threading.Lock + schema.sql
│   ├── budgets/       BudgetRule evaluator + EOM forecaster
│   ├── alerts/        ConsoleSink, FileSink, WebhookSink, severity
│   ├── reports/       MarkdownReporter, JsonReporter, ConsoleReporter
│   ├── models/        ProviderUsage, BillingPoint, Budget, Alert, SpendReport
│   ├── utils/         config, logger, timeparse, sparkline, money
│   ├── director.py    MeterDirector facade
│   └── cli.py         click CLI: sync / report / alerts / budget / providers / watch
├── config/default.yaml
├── examples/          demo_run.py + sample_config.yaml
└── tests/             156 tests, all green (Phase B)

License

MIT — see LICENSE.


Built with: Hermes Agent, MiMo + Claude series.

About

7-Module LLM Token Spend Dashboard with cross-provider tracking — ProviderClient, BillingFetcher, Normalizer, BudgetTracker, AlertRouter, TimeseriesStore, ReportGenerator. ~3100 LOC. MiMo + Hermes.
