llmwatch

A lightweight Python library for LLM cost attribution — track, tag, and report LLM API costs by feature, user, and model.

Why llmwatch?

llmwatch is not an observability platform or proxy server. It's a lightweight Python library that integrates directly into your existing codebase.

Unlike Langfuse, LangSmith, or LiteLLM, llmwatch requires no external infrastructure, no API gateway, and no proxy setup. Run pip install llmwatch and add three lines of code to start tracking LLM costs.

Key differentiators:

No proxy or gateway needed — Unlike LiteLLM and Helicone, which sit between your code and LLM APIs
No external platform — Unlike Langfuse and LangSmith, which require cloud infrastructure
Works with your existing SDK — Patch your OpenAI, Anthropic, or Google clients with instrument(client)
Feature-level cost attribution — Tag LLM calls by feature, user, environment, and any custom dimension
Minimal setup — 3 lines of code to get started
1000+ models — Bundled pricing data covering OpenAI, Anthropic, Google, and more

Quick Start

Async

from openai import AsyncOpenAI
from llmwatch.tracker import LLMWatch

client = AsyncOpenAI()
watcher = LLMWatch(client=client)

@watcher.tracked(feature="summarize", user_id="alice")
async def summarize(text: str) -> str:
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": f"Summarize: {text}"}],
    )
    return response.choices[0].message.content

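# Call this from inside an async function (a bare top-level await only works in a notebook or async REPL)
result = await summarize("Long document text...")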

Sync

from openai import OpenAI
from llmwatch.tracker import LLMWatch

client = OpenAI()
watcher = LLMWatch(client=client)

@watcher.tracked(feature="summarize", user_id="alice")
def summarize(text: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": f"Summarize: {text}"}],
    )
    return response.choices[0].message.content

result = summarize("Long document text...")

Features

Automatic cost tracking — Instrument SDK clients to capture token usage and calculate costs without modifying your LLM calls
Flexible tagging — Attach metadata to tracked calls with @watcher.tracked(feature=..., user_id=..., environment=...)
Multi-provider support — OpenAI, Anthropic, Google Generative AI (sync, async, and streaming)
Bundled pricing — 1000+ models with up-to-date pricing data synced from pydantic/genai-prices
Multiple database backends — SQLite (default), PostgreSQL, MySQL, MongoDB (Beanie ODM), Oracle, MSSQL
Budget alerts — Set thresholds and trigger callbacks when spending exceeds limits
Reporting and export — Generate cost summaries by feature, user, model, or provider (CSV/JSON)
CLI tools — View reports, manage data, sync pricing
Web dashboard — Optional interactive dashboard for cost visualization (llmwatch dashboard)
Streaming support — Track costs for streaming responses (SSE, async streams)

Supported Providers

Provider	Sync	Async	Streaming	Example models
OpenAI	Yes	Yes	Yes	GPT-5.4, o4-mini, o3, o1, GPT-4o, etc.
Anthropic	Yes	Yes	Yes	Claude Opus 4.6, Claude Sonnet 4.6, Claude Haiku 4.5, etc.
Google	Yes	Yes	Yes	Gemini 3.1, Gemini 2.5, Gemini 2.0, etc.

Installation

pip install llmwatch
# or
uv add llmwatch

Optional Database Backends

# Quote the extras so shells such as zsh do not treat the brackets as a glob pattern
pip install "llmwatch[pg]"         # PostgreSQL
pip install "llmwatch[mysql]"      # MySQL
pip install "llmwatch[mongo]"      # MongoDB (Beanie ODM)
pip install "llmwatch[dashboard]"  # Web dashboard (Starlette + Uvicorn)

Usage

Basic Tracking

from openai import AsyncOpenAI
from llmwatch.tracker import LLMWatch

client = AsyncOpenAI()
watcher = LLMWatch(client=client)

@watcher.tracked(feature="chat", user_id="user123", environment="production")
async def chat_response(prompt: str) -> str:
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

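# Costs are tracked automatically.
# Call this from inside an async function (a bare top-level await only works in a notebook or async REPL)
result = await chat_response("Hello, how are you?")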
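
One way a decorator like `tracked` could make tags visible to an instrumented client is through `contextvars`, which works for both sync and async functions. The sketch below is an assumption about the general mechanism, not llmwatch's actual implementation; `tracked_sketch`, `current_tags`, and `_TAGS` are hypothetical names that do not appear in the library.

```python
import asyncio
import contextvars
import functools
import inspect

# Hypothetical: holds the tags of the innermost active tracked call.
_TAGS = contextvars.ContextVar("tags", default={})


def current_tags() -> dict:
    """Return a copy of the tags visible at the current call site."""
    return dict(_TAGS.get())


def tracked_sketch(**tags):
    """Decorator that sets tags for the duration of a sync or async call."""

    def decorator(func):
        if inspect.iscoroutinefunction(func):

            @functools.wraps(func)
            async def async_wrapper(*args, **kwargs):
                token = _TAGS.set({**_TAGS.get(), **tags})
                try:
                    return await func(*args, **kwargs)
                finally:
                    _TAGS.reset(token)  # restore the outer tags, even on error

            return async_wrapper

        @functools.wraps(func)
        def sync_wrapper(*args, **kwargs):
            token = _TAGS.set({**_TAGS.get(), **tags})
            try:
                return func(*args, **kwargs)
            finally:
                _TAGS.reset(token)

        return sync_wrapper

    return decorator


@tracked_sketch(feature="chat", user_id="user123")
def fake_llm_call() -> dict:
    # An instrumented client would read the tags here when saving a record.
    return current_tags()


@tracked_sketch(feature="summarize")
async def fake_async_call() -> dict:
    return current_tags()


if __name__ == "__main__":
    print(fake_llm_call())
    print(asyncio.run(fake_async_call()))
```

Because the tags live in a context variable rather than a global, concurrent async tasks each see their own tags, and the value is reset when the decorated function returns.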

Budget Alerts

watcher = LLMWatch(client=client)

async def on_budget_exceeded(record):
    print(f"Budget exceeded: ${record.cost_usd:.4f} on feature={record.tags.feature}")

watcher.budget.add_rule(
    max_cost_usd=0.50,
    callback=on_budget_exceeded,
    feature="summarize",
)
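
The README does not say whether `max_cost_usd` applies per call or to accumulated spend. As an illustration only, the sketch below assumes a cumulative total per feature and fires the callback once, on the first breach. `BudgetRuleSketch` and its `check` method are hypothetical and are not part of llmwatch.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class BudgetRuleSketch:
    """Hypothetical rule: fire `callback` once cumulative spend passes `max_cost_usd`."""

    max_cost_usd: float
    callback: Callable[[str, float], None]
    feature: Optional[str] = None  # None means the rule applies to every feature
    spent_usd: float = field(default=0.0, init=False)
    fired: bool = field(default=False, init=False)

    def check(self, feature: str, cost_usd: float) -> None:
        """Add one call's cost and trigger the callback on the first breach."""
        if self.feature is not None and feature != self.feature:
            return  # this rule does not cover the feature that made the call
        self.spent_usd += cost_usd
        if not self.fired and self.spent_usd > self.max_cost_usd:
            self.fired = True
            self.callback(feature, self.spent_usd)


alerts: list = []
rule = BudgetRuleSketch(
    max_cost_usd=0.50,
    callback=lambda feature, total: alerts.append((feature, total)),
    feature="summarize",
)

calls = [
    ("summarize", 0.25),
    ("chat", 0.75),       # ignored: different feature
    ("summarize", 0.25),  # total is exactly 0.50, not yet over the limit
    ("summarize", 0.125), # total 0.625, callback fires
    ("summarize", 0.125), # already fired, no second alert
]
for feature, cost in calls:
    rule.check(feature, cost)
```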

Reporting

Programmatic

# The report methods are awaited, so run this inside an async function
summary = await watcher.report.by_feature(period="7d")
print(f"Total cost: ${summary.total_cost_usd:.4f}")
for b in summary.breakdowns:
    print(f"  {b.group_value}: ${b.total_cost_usd:.4f} ({b.total_requests} calls)")

# Also available: by_user_id(), by_model(), by_provider()
await watcher.report.export_csv("costs.csv", group_by="feature", period="30d")
await watcher.report.export_json("costs.json", group_by="model", period="7d")
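
To make the shape of a grouped summary concrete, here is a minimal sketch of the aggregation a report such as `by_feature()` implies. The field names `group_value`, `total_cost_usd`, and `total_requests` come from the example above; the record layout, the `Breakdown` class, and `summarize_by` are assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Breakdown:
    group_value: str
    total_cost_usd: float
    total_requests: int


def summarize_by(records: list[dict], key: str) -> list[Breakdown]:
    """Group usage records by `key`, totalling cost and request count per group."""
    totals: dict[str, list] = defaultdict(lambda: [0.0, 0])
    for record in records:
        bucket = totals[record[key]]
        bucket[0] += record["cost_usd"]
        bucket[1] += 1
    # Most expensive group first.
    return sorted(
        (Breakdown(group, cost, count) for group, (cost, count) in totals.items()),
        key=lambda b: b.total_cost_usd,
        reverse=True,
    )


# Hypothetical records with made-up costs.
records = [
    {"feature": "summarize", "model": "gpt-4o", "cost_usd": 0.5},
    {"feature": "chat", "model": "gpt-4o", "cost_usd": 0.25},
    {"feature": "summarize", "model": "gpt-4o", "cost_usd": 0.25},
]

for b in summarize_by(records, "feature"):
    print(f"  {b.group_value}: ${b.total_cost_usd:.4f} ({b.total_requests} calls)")
```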

CLI

llmwatch report --group-by feature --period 7d
llmwatch export costs.csv --format csv
llmwatch pricing list --provider openai
llmwatch pricing sync

Web Dashboard

pip install "llmwatch[dashboard]"
llmwatch dashboard
# Opens at http://localhost:8000

Multiple Database Backends

By default, llmwatch uses SQLite (~/.llmwatch/usage.db). Switch to other backends by passing a storage instance:

from llmwatch.tracker import LLMWatch
from llmwatch.databases.sqlalchemy import Storage

# PostgreSQL
watcher = LLMWatch(
    client=client,
    storage=Storage("postgresql+asyncpg://user:password@localhost/llmwatch"),
)

# MySQL
watcher = LLMWatch(
    client=client,
    storage=Storage("mysql+aiomysql://user:password@localhost/llmwatch"),
)

MongoDB

from llmwatch.tracker import LLMWatch
from llmwatch.databases.mongo import MongoStorage

watcher = LLMWatch(
    client=client,
    storage=MongoStorage("mongodb://localhost:27017", database="llmwatch"),
)
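
Passing a storage instance suggests that backends share a common interface. The sketch below shows one way such a pluggable design could look, using a `typing.Protocol` with an SQLite and an in-memory implementation. Everything here is hypothetical (`StorageSketch`, `SQLiteSketch`, `MemorySketch`, `record_call`, and the table layout), and it is synchronous for brevity, whereas the connection URLs above point to async drivers.

```python
import sqlite3
from typing import Protocol


class StorageSketch(Protocol):
    """Hypothetical interface: anything with save() and total_cost() can be a backend."""

    def save(self, feature: str, model: str, cost_usd: float) -> None: ...

    def total_cost(self) -> float: ...


class SQLiteSketch:
    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS usage (feature TEXT, model TEXT, cost_usd REAL)"
        )

    def save(self, feature: str, model: str, cost_usd: float) -> None:
        # Parameterized query: values are never interpolated into the SQL text.
        self._conn.execute(
            "INSERT INTO usage VALUES (?, ?, ?)", (feature, model, cost_usd)
        )
        self._conn.commit()

    def total_cost(self) -> float:
        row = self._conn.execute(
            "SELECT COALESCE(SUM(cost_usd), 0) FROM usage"
        ).fetchone()
        return float(row[0])


class MemorySketch:
    def __init__(self) -> None:
        self._costs: list[float] = []

    def save(self, feature: str, model: str, cost_usd: float) -> None:
        self._costs.append(cost_usd)

    def total_cost(self) -> float:
        return sum(self._costs)


def record_call(storage: StorageSketch, cost_usd: float) -> float:
    """Save one record and return the running total; works with any backend."""
    storage.save("chat", "gpt-4o", cost_usd)
    return storage.total_cost()
```

The calling code depends only on the two methods, so swapping SQLite for another store does not change how records are written.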

CLI Reference

Command	Description
`llmwatch report`	Generate cost report (`--group-by`, `--period`)
`llmwatch export`	Export usage records to CSV or JSON
`llmwatch prune`	Delete old records by date
`llmwatch stats`	Show database statistics
`llmwatch pricing list`	List pricing data by provider
`llmwatch pricing sync`	Sync pricing data from upstream
`llmwatch dashboard`	Start interactive web dashboard

How It Works

Instrument — LLMWatch(client=client) patches the SDK client's methods
Extract — On each LLM call, extractors normalize the response (handles OpenAI, Anthropic, Google, streaming)
Calculate — calculate_cost() computes USD cost using bundled pricing data
Store — Storage.save() persists the UsageRecord to your database
Tag — @watcher.tracked() provides tag context (feature, user_id, environment)
Alert — Optional BudgetAlert callbacks trigger when thresholds are exceeded
Report — Reporter generates cost summaries grouped by feature, user, model, or provider

LLM Call
  | (via instrumented SDK client)
Extract Response -> Calculate Cost -> Save Record + Tags
  |
Database
  | (queried by Reporter)
Reports, Dashboards, Exports
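
The extract and calculate steps can be sketched as follows. The token field names are the ones each provider's SDK reports in its usage payload; the prices are placeholders rather than real pricing data, and `extract_usage`, `calculate_cost_sketch`, `UsageSketch`, and `example-model` are hypothetical names, not llmwatch's API.

```python
from dataclasses import dataclass

# Placeholder prices in USD per one million tokens; NOT real pricing data.
PRICES_PER_MTOK = {"example-model": {"input": 2.00, "output": 8.00}}

# Names each provider uses for input and output token counts.
USAGE_FIELDS = {
    "openai": ("prompt_tokens", "completion_tokens"),
    "anthropic": ("input_tokens", "output_tokens"),
    "google": ("prompt_token_count", "candidates_token_count"),
}


@dataclass
class UsageSketch:
    input_tokens: int
    output_tokens: int


def extract_usage(provider: str, usage: dict) -> UsageSketch:
    """Normalize a provider-specific usage payload into one common shape."""
    in_key, out_key = USAGE_FIELDS[provider]
    return UsageSketch(usage[in_key], usage[out_key])


def calculate_cost_sketch(model: str, usage: UsageSketch) -> float:
    """Cost in USD = tokens times the per-million-token price, divided by one million."""
    price = PRICES_PER_MTOK[model]
    total = usage.input_tokens * price["input"] + usage.output_tokens * price["output"]
    return total / 1_000_000


usage = extract_usage("anthropic", {"input_tokens": 1_000, "output_tokens": 500})
cost = calculate_cost_sketch("example-model", usage)
print(f"${cost:.4f}")
```

With the placeholder prices, 1,000 input tokens and 500 output tokens come to (1,000 × 2.00 + 500 × 8.00) / 1,000,000 = 0.006 USD.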

Development

uv sync --group dev
uv run pytest tests/ -v
uv run ruff check src/ tests/
uv run mypy src/llmwatch/

License

MIT

Pricing data sourced from pydantic/genai-prices.
