🪙 TokenLedger

Know exactly what your AI features cost, per user, per endpoint, per day.

CI codecov PyPI version License: ELv2 Python 3.11+

Note: TokenLedger is in active development (v0.x). The API is stabilizing but may have breaking changes before v1.0. Pin your version in requirements.

TokenLedger is a self-hosted LLM cost analytics solution that runs on your existing Postgres database. Zero external dependencies, complete data ownership, works with Supabase out of the box.

✨ Why TokenLedger?

Most teams building AI features lack cost attribution:

  • 📊 "Which users are costing us the most?" → No idea
  • 🎯 "What's our cost per feature?" → Can't tell you
  • 🔍 "Which endpoint is burning through tokens?" → Who knows

Existing solutions (Helicone, LangSmith, Langfuse) are either:

  • SaaS - Your data leaves your infrastructure
  • Heavy - Requires significant setup and infrastructure
  • Expensive - Per-seat pricing adds up fast

TokenLedger is different:

  • ✅ Postgres-native - Works with your existing database (Supabase, Neon, RDS)
  • ✅ Self-hosted - Your data never leaves your infrastructure
  • ✅ Zero overhead - 2-line integration, async batching
  • ✅ Cost-aware - Automatic cost calculation with up-to-date pricing

🚀 Quick Start

Installation

pip install tokenledger

2-Line Integration

import tokenledger
import openai

# Configure once
tokenledger.configure(database_url="postgresql://...")
tokenledger.patch_openai()

# That's it! All calls are now tracked
response = openai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)

Every OpenAI call is now automatically logged to your Postgres database with:

  • Token counts (input, output, cached)
  • Cost in USD
  • Latency
  • Model used
  • User ID (if provided)
  • Full request/response metadata
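
To confirm events are landing, you can query the token_ledger_events table directly. The table name and the timestamp, model, cost_usd, and user_id columns come from the SQL examples further down; the connection code is just a sketch using psycopg as an example driver:

import psycopg

# Peek at the five most recent logged events
with psycopg.connect("postgresql://...") as conn:
    rows = conn.execute(
        """
        SELECT timestamp, model, cost_usd, user_id
        FROM token_ledger_events
        ORDER BY timestamp DESC
        LIMIT 5
        """
    ).fetchall()

for ts, model, cost, user_id in rows:
    print(f"{ts} {model} ${cost:.4f} user={user_id}")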

Streaming Support

Streaming calls are also automatically tracked:

# Streaming works seamlessly
for chunk in openai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
    stream_options={"include_usage": True}  # Optional: get token counts
):
    print(chunk.choices[0].delta.content or "", end="")
# Event is logged after stream completes

Works with Anthropic too

import tokenledger
import anthropic

tokenledger.configure(database_url="postgresql://...")
tokenledger.patch_anthropic()

client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-5-latest",
    max_tokens=1024,  # required by the Anthropic Messages API
    messages=[{"role": "user", "content": "Hello!"}]
)

And Google Gemini

import tokenledger
from google import genai

tokenledger.configure(database_url="postgresql://...")
tokenledger.patch_google()

client = genai.Client(api_key="...")
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Hello!"
)

Cost Attribution

Know exactly who is spending money and which features are driving costs:

from tokenledger import attribution

# Context manager - all calls inside are attributed
with attribution(user_id="user_123", feature="summarize", team="ml"):
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Summarize this..."}]
    )

# Decorator - attribute entire functions
@attribution(feature="chat", cost_center="CC-001")
def handle_chat(user_id: str, message: str):
    with attribution(user_id=user_id):  # Contexts nest and merge
        return client.chat.completions.create(...)

Query your costs by any dimension:

SELECT feature, team, SUM(cost_usd) as cost
FROM token_ledger_events
WHERE timestamp >= NOW() - INTERVAL '7 days'
GROUP BY feature, team
ORDER BY cost DESC;

Streaming with Attribution

When using streaming/lazy responses (common with frameworks like pydantic-ai, LangChain), the LLM API call may happen after the context manager exits. Use persistent=True mode:

from tokenledger import attribution, clear_attribution

# Problem: Context exits before stream is consumed
async with attribution(user_id="user123"):
    response = await framework.stream(...)  # Returns lazy response
# Context exits here!
async for chunk in response:  # API call happens here, context is gone!
    yield chunk

# Solution: Use persistent mode
async with attribution(user_id="user123", feature="chat", persistent=True):
    response = await framework.stream(...)

async for chunk in response:  # Context still active!
    yield chunk

clear_attribution()  # Explicitly clear when done

📊 Dashboard

TokenLedger includes a beautiful React dashboard:

# Start with Docker
docker compose up

# Open http://localhost:3000

Or run the API server standalone:

pip install tokenledger[server]
python -m tokenledger.server

🔧 Configuration Options

import tokenledger

tokenledger.configure(
    # Database connection
    database_url="postgresql://user:pass@localhost/db",
    
    # App identification
    app_name="my-app",
    environment="production",
    
    # Performance tuning
    batch_size=100,           # Events per batch write
    flush_interval_seconds=5,  # How often to flush
    async_mode=True,          # Background logging
    
    # Sampling for high-volume apps
    sample_rate=1.0,          # 1.0 = log everything
)

📈 Querying Your Data

Using the Python API

from tokenledger.queries import TokenLedgerQueries

queries = TokenLedgerQueries()

# Get cost summary
summary = queries.get_cost_summary(days=30)
print(f"Last 30 days: ${summary.total_cost:.2f}")
print(f"Total requests: {summary.total_requests}")

# Cost by model
models = queries.get_costs_by_model(days=30)
for m in models:
    print(f"{m.model}: ${m.total_cost:.2f} ({m.total_requests} requests)")

# Cost by user
users = queries.get_costs_by_user(days=30)
for u in users[:5]:
    print(f"{u.user_id}: ${u.total_cost:.2f}")

# Daily trends
daily = queries.get_daily_costs(days=7)
for d in daily:
    print(f"{d.date}: ${d.total_cost:.2f}")

Direct SQL

-- Daily costs by model
SELECT 
    DATE(timestamp) as date,
    model,
    SUM(cost_usd) as total_cost,
    COUNT(*) as requests
FROM token_ledger_events
WHERE timestamp >= NOW() - INTERVAL '30 days'
GROUP BY DATE(timestamp), model
ORDER BY date DESC, total_cost DESC;

-- Top 10 users by cost
SELECT 
    user_id,
    SUM(cost_usd) as total_cost,
    COUNT(*) as requests
FROM token_ledger_events
WHERE timestamp >= NOW() - INTERVAL '30 days'
GROUP BY user_id
ORDER BY total_cost DESC
LIMIT 10;

-- Projected monthly cost
SELECT 
    (SUM(cost_usd) / 7) * 30 as projected_monthly
FROM token_ledger_events
WHERE timestamp >= NOW() - INTERVAL '7 days';

🔌 Framework Integration

FastAPI

from fastapi import FastAPI
from tokenledger.middleware import FastAPIMiddleware

app = FastAPI()
app.add_middleware(FastAPIMiddleware)

# User ID from X-User-ID header is automatically tracked
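
For example, a client only needs to send that header for its LLM calls to be attributed. The endpoint, port, and payload below are hypothetical; only the X-User-ID header name comes from the note above:

import httpx

# Any request carrying X-User-ID has its LLM usage attributed to that user
# by the middleware. "/chat" and the JSON body are placeholders for your app.
httpx.post(
    "http://localhost:8000/chat",
    headers={"X-User-ID": "user_123"},
    json={"message": "Hello!"},
)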

Flask

from flask import Flask
from tokenledger.middleware import TokenLedger

app = Flask(__name__)
TokenLedger(app)

Manual Tracking

from tokenledger import track_cost

# Track manually if you need to
track_cost(
    input_tokens=150,
    output_tokens=500,
    model="gpt-4o",
    user_id="user_123",
)

🐘 Supabase Setup

TokenLedger works perfectly with Supabase:

  1. Get your connection string from Supabase Dashboard → Settings → Database

  2. Run the migrations:

DATABASE_URL="postgresql://postgres:password@db.xxx.supabase.co:5432/postgres" tokenledger db init

  3. Configure TokenLedger:

tokenledger.configure(
    database_url="postgresql://postgres:password@db.xxx.supabase.co:5432/postgres"
)

πŸ“ Project Structure

tokenledger/
├── tokenledger/           # Python package
│   ├── __init__.py       # Main exports
│   ├── config.py         # Configuration
│   ├── tracker.py        # Core tracking logic
│   ├── pricing.py        # LLM pricing data
│   ├── queries.py        # Analytics queries
│   ├── decorators.py     # @track_llm decorator
│   ├── middleware.py     # FastAPI/Flask middleware
│   ├── server.py         # Dashboard API server
│   └── interceptors/     # SDK patches
│       ├── openai.py
│       ├── anthropic.py
│       └── google.py
├── dashboard/            # React dashboard
├── migrations/           # SQL migrations
└── examples/             # Usage examples

💰 Supported Models & Pricing

TokenLedger includes up-to-date pricing (January 2026) for 74+ models across 3 providers:

OpenAI (38 text models + audio/image)

| Model Family | Input/1M | Output/1M | Notes |
|---|---|---|---|
| GPT-5 (5.2, 5.1, 5, mini, nano) | $0.05-1.75 | $0.40-14.00 | Cached input support |
| GPT-5 Pro | $15.00 | $120.00 | Premium reasoning |
| GPT-4.1 (4.1, mini, nano) | $0.10-2.00 | $0.40-8.00 | 1M context window |
| GPT-4o (4o, 4o-mini) | $0.15-2.50 | $0.60-10.00 | 128K context |
| O-Series (o1, o3, o4-mini) | $1.10-20.00 | $4.40-80.00 | Reasoning models |
| Audio (Whisper, TTS) | $0.003-0.012/min | - | Per-minute billing |
| Images (DALL-E 3, GPT-Image) | $0.04-0.12/image | - | Per-image billing |

Anthropic (23 models)

| Model Family | Input/1M | Output/1M | Notes |
|---|---|---|---|
| Claude 4.5 (Opus, Sonnet, Haiku) | $1.00-5.00 | $5-25 | Latest generation |
| Claude 4 (Opus, Sonnet) | $3.00-15.00 | $15-75 | Prompt caching |
| Claude 3.7 (Sonnet) | $3.00 | $15.00 | Prompt caching |
| Claude 3.5 (Sonnet, Haiku) | $0.80-3.00 | $4-15 | Prompt caching |
| Claude 3 (Opus, Sonnet, Haiku) | $0.25-15.00 | $1.25-75 | Legacy |

Google Gemini (13 models)

| Model Family | Input/1M | Output/1M | Notes |
|---|---|---|---|
| Gemini 3 (Pro, Flash preview) | $0.50-2.00 | $4-12 | Latest preview |
| Gemini 2.5 (Pro, Flash, Lite) | $0.10-1.25 | $0.40-10 | Production ready |
| Gemini 2.0 (Flash, Lite) | $0.075-0.10 | $0.30-0.40 | Fast inference |
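
Costs are derived from these per-million-token rates. As a sanity check, here is the arithmetic for a single gpt-4o call, using the list prices from the table above; this is only an illustration of the formula, not TokenLedger's internal code:

# gpt-4o list prices from the table above (USD per 1M tokens)
INPUT_PRICE_PER_M = 2.50
OUTPUT_PRICE_PER_M = 10.00

input_tokens = 1_200
output_tokens = 350

cost_usd = (
    input_tokens / 1_000_000 * INPUT_PRICE_PER_M
    + output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M
)
print(f"${cost_usd:.4f}")  # $0.0065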

Coming Soon

  • Mistral (pricing data included, interceptor planned)
  • Custom/self-hosted models

🛠 Development

# Clone the repo
git clone https://github.com/ged1182/TokenLedger
cd TokenLedger

# Install dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Start local development
docker compose up postgres
python -m tokenledger.server

🗺 Roadmap

  • Alerts & notifications (budget thresholds)
  • Cost allocation tags (feature, team, project, cost_center)
  • Team/project grouping via attribution context
  • Google Gemini support
  • OpenAI audio/image API tracking
  • pydantic-ai framework compatibility
  • OpenAI streaming support
  • Anthropic streaming support
  • Google streaming support
  • Grafana integration
  • CLI for querying
  • More LLM providers (Mistral, Cohere)
  • TimescaleDB optimization guide

📜 License

TokenLedger is licensed under the Elastic License 2.0 (ELv2).

What this means:

  • ✅ Free to use - Use TokenLedger in your projects, even commercial ones
  • ✅ Modify freely - Fork it, extend it, make it yours
  • ✅ Self-host - Run it on your own infrastructure
  • ❌ No SaaS - You cannot offer TokenLedger as a hosted/managed service

This license protects the project while keeping it free for the community.

πŸ™ Contributing

Contributions are welcome! Please read our Contributing Guide first.


Built with ❤️ for the AI startup community
