testerooms/Tokenizer

πŸ›‘οΈ Claude Cost Guardian

Stop the Uber problem before it hits you. An open-source API proxy + FinOps dashboard that sits between your engineers and the Anthropic API, tracking spend, enforcing budget caps, and giving finance teams visibility before your AI bill blows up.


The Problem

In April 2026, Uber exhausted its entire 2026 AI budget in 4 months, driven by Claude Code adoption spreading through 5,000 engineers faster than any budget model could anticipate. Per-engineer costs hit $500–$2,000/month. No guardrails. No visibility. No plan.

This is a solvable problem.


How It Works

Engineer's Claude Code
        │
        ▼
┌───────────────────────┐
│   Claude Cost Guardian│  ← Proxy + Middleware
│   Proxy (port 3001)   │
│                       │
│  1. Auth engineer     │
│  2. Check budget      │  ← Block if over hard limit
│  3. Forward request   │  ← Warn if over soft limit
│  4. Track tokens      │  ← Record usage + cost
│  5. Emit alerts       │
└───────────────────────┘
        │
        ▼
 Anthropic API (real)
        │
        ▼
┌───────────────────────┐
│  Dashboard (port 5173)│
│   React + Recharts    │
│                       │
│  • Per-engineer spend │
│  • Team budgets       │
│  • Daily trend chart  │
│  • Blocked engineers  │
└───────────────────────┘

Features

| Feature | Status |
|---|---|
| 🔌 API proxy (drop-in for Anthropic API) | ✅ |
| 📊 Per-engineer spend tracking | ✅ |
| 🚦 Soft limit warnings (pass-through with header) | ✅ |
| 🔒 Hard limit enforcement (block with 429) | ✅ |
| 👥 Team-level budget caps | ✅ |
| 📈 Daily spend trend dashboard | ✅ |
| 🏷️ Engineer tiers (standard / power / restricted) | ✅ |
| 🔔 Alert system | ✅ |
| 🗃️ SQLite storage (zero-dependency) | ✅ |
| 📋 YAML policy config | ✅ |
| 🔄 OpenAI Codex / other providers | 🚧 Coming |
| 📧 Slack / email alerts | 🚧 Coming |
| 🔑 SSO / LDAP engineer sync | 🚧 Coming |

Quick Start

Prerequisites

  • Node.js 18+
  • npm 9+

1. Clone & install

git clone https://github.com/your-org/claude-cost-guardian
cd claude-cost-guardian
npm install

2. Configure environment

cp .env.example .env
# Edit .env and set your ANTHROPIC_API_KEY

3. Configure budget policy

Edit config/policy.yaml:

teams:
  default:
    monthlyCapUSD: 5000
    perEngineerSoftLimitUSD: 150   # Warning threshold
    perEngineerHardLimitUSD: 500   # Block threshold
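
A quick sanity check on the policy can catch typos before the proxy starts. The sketch below is not Guardian's actual loader: the `TeamPolicy` and `Policy` types and the `validatePolicy` function are hypothetical names that only mirror the three keys shown above, and the rule that the soft limit must sit below the hard limit is an assumption.

```typescript
// Hypothetical types mirroring the keys shown in config/policy.yaml.
interface TeamPolicy {
  monthlyCapUSD: number;
  perEngineerSoftLimitUSD: number;
  perEngineerHardLimitUSD: number;
}

interface Policy {
  teams: Record<string, TeamPolicy>;
}

// Returns a list of human-readable problems; an empty list means no issue was found.
function validatePolicy(policy: Policy): string[] {
  const problems: string[] = [];
  for (const [teamName, team] of Object.entries(policy.teams)) {
    const fields: Array<[string, number]> = [
      ["monthlyCapUSD", team.monthlyCapUSD],
      ["perEngineerSoftLimitUSD", team.perEngineerSoftLimitUSD],
      ["perEngineerHardLimitUSD", team.perEngineerHardLimitUSD],
    ];
    for (const [key, value] of fields) {
      if (!Number.isFinite(value) || value <= 0) {
        problems.push(`${teamName}.${key} must be a positive number`);
      }
    }
    // Assumption: a warning threshold at or above the block threshold is a config mistake.
    if (team.perEngineerSoftLimitUSD >= team.perEngineerHardLimitUSD) {
      problems.push(`${teamName}: soft limit must be below hard limit`);
    }
  }
  return problems;
}

// The policy from the snippet above, written as an object literal.
const examplePolicy: Policy = {
  teams: {
    default: {
      monthlyCapUSD: 5000,
      perEngineerSoftLimitUSD: 150,
      perEngineerHardLimitUSD: 500,
    },
  },
};

console.log(validatePolicy(examplePolicy)); // []
```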

4. Start the proxy + dashboard

npm run dev
  • Proxy → http://localhost:3001
  • Dashboard → http://localhost:5173
  • Metrics API → http://localhost:3001/api/metrics/overview

5. Configure Claude Code to use the proxy

Add to your ~/.claude/settings.json. Claude Code reads extra request headers from the ANTHROPIC_CUSTOM_HEADERS environment variable, in "Name: Value" format:

{
  "env": {
    "ANTHROPIC_BASE_URL": "http://localhost:3001/proxy",
    "ANTHROPIC_API_KEY": "your-key-here",
    "ANTHROPIC_CUSTOM_HEADERS": "x-engineer-id: your-engineer-id"
  }
}

Or set environment variables:

export ANTHROPIC_BASE_URL=http://localhost:3001/proxy
export ANTHROPIC_CUSTOM_HEADERS='x-engineer-id: alice@company.com'

Budget Enforcement

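When an engineer reaches their soft limit, requests still pass through, but the response carries warning headers. X-Budget-Utilization is the current spend as a percentage of the hard limit: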

X-Budget-Warning: Soft limit of $150.00 exceeded. Current spend: $162.40 / $500.00 hard limit.
X-Budget-Utilization: 32.48

When an engineer hits their hard limit, the proxy returns HTTP 429:

{
  "error": "Budget limit reached",
  "reason": "Monthly hard limit of $500.00 reached. Spent: $523.10. Resets 2026-06-01.",
  "currentSpend": 523.10,
  "limit": 500.00,
  "resetDate": "2026-06-01T00:00:00.000Z"
}
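
The soft/hard behaviour above can be sketched as one pure decision function. This is an illustration, not the contents of policyEngine.ts: `checkBudget`, `nextResetDate`, and the `BudgetDecision` type are hypothetical names, and treating "spend equal to the limit" as over the limit is an assumption. The header names and the 429 body fields are the ones shown above.

```typescript
type BudgetDecision =
  | { action: "allow" }
  | { action: "warn"; headers: Record<string, string> }
  | {
      action: "block";
      status: 429;
      body: {
        error: string;
        reason: string;
        currentSpend: number;
        limit: number;
        resetDate: string;
      };
    };

// First instant of the next calendar month in UTC (December rolls over to January).
function nextResetDate(now: Date): Date {
  return new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth() + 1, 1));
}

// Formats a number as dollars with two decimals, e.g. 162.4 becomes "$162.40".
function usd(amount: number): string {
  return `$${amount.toFixed(2)}`;
}

function checkBudget(
  spend: number,
  softLimit: number,
  hardLimit: number,
  now: Date = new Date(),
): BudgetDecision {
  if (spend >= hardLimit) {
    const reset = nextResetDate(now);
    return {
      action: "block",
      status: 429,
      body: {
        error: "Budget limit reached",
        reason: `Monthly hard limit of ${usd(hardLimit)} reached. Spent: ${usd(spend)}. Resets ${reset.toISOString().slice(0, 10)}.`,
        currentSpend: spend,
        limit: hardLimit,
        resetDate: reset.toISOString(),
      },
    };
  }
  if (spend >= softLimit) {
    return {
      action: "warn",
      headers: {
        "X-Budget-Warning": `Soft limit of ${usd(softLimit)} exceeded. Current spend: ${usd(spend)} / ${usd(hardLimit)} hard limit.`,
        // Utilization is measured against the hard limit, as a percentage.
        "X-Budget-Utilization": ((spend / hardLimit) * 100).toFixed(2),
      },
    };
  }
  return { action: "allow" };
}
```

With the numbers from the examples, `checkBudget(162.40, 150, 500)` yields a `warn` decision with a utilization of 32.48, and `checkBudget(523.10, 150, 500)` evaluated during May 2026 yields a `block` decision that resets on 2026-06-01.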

Engineer Tiers

Set per-engineer tiers to multiply their team limits:

| Tier | Multiplier | Use case |
|---|---|---|
| standard | 1x | Most engineers |
| power | 3x | Staff engineers, ML leads |
| restricted | 0.25x | Probation, cost control |
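
To make the multipliers concrete, here is a small sketch of how a tier could scale per-engineer limits. The `effectiveLimits` function and its types are hypothetical, and it assumes the multiplier applies to both the soft and the hard limit.

```typescript
type Tier = "standard" | "power" | "restricted";

// Multipliers from the tier table.
const TIER_MULTIPLIER: Record<Tier, number> = {
  standard: 1,
  power: 3,
  restricted: 0.25,
};

interface EngineerLimits {
  softUSD: number;
  hardUSD: number;
}

// Assumption: the tier scales both per-engineer limits by the same factor.
function effectiveLimits(teamLimits: EngineerLimits, tier: Tier): EngineerLimits {
  const multiplier = TIER_MULTIPLIER[tier];
  return {
    softUSD: teamLimits.softUSD * multiplier,
    hardUSD: teamLimits.hardUSD * multiplier,
  };
}

const teamDefaults: EngineerLimits = { softUSD: 150, hardUSD: 500 };
console.log(effectiveLimits(teamDefaults, "power"));      // { softUSD: 450, hardUSD: 1500 }
console.log(effectiveLimits(teamDefaults, "restricted")); // { softUSD: 37.5, hardUSD: 125 }
```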

Token Pricing

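Guardian prices usage with the per-model rates below (USD per million tokens). The pricing constants live in packages/shared/src/index.ts, so update them there when Anthropic's pricing changes:

| Model | Input | Output | Cache Read | Cache Write |
|---|---|---|---|---|
| claude-opus-4-5 | $15 | $75 | $1.50 | $18.75 |
| claude-sonnet-4-5 | $3 | $15 | $0.30 | $3.75 |
| claude-haiku-4-5 | $0.80 | $4 | $0.08 | $1.00 |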
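
As a rough illustration of how a request's cost follows from the table above, the sketch below multiplies each token count by its per-million rate. The `RATES` map copies the table values; `costUSD` and the `Usage` shape are hypothetical names, not Guardian's own types.

```typescript
// USD per million tokens, copied from the pricing table.
interface ModelRates {
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
}

const RATES: Record<string, ModelRates> = {
  "claude-opus-4-5": { input: 15, output: 75, cacheRead: 1.5, cacheWrite: 18.75 },
  "claude-sonnet-4-5": { input: 3, output: 15, cacheRead: 0.3, cacheWrite: 3.75 },
  "claude-haiku-4-5": { input: 0.8, output: 4, cacheRead: 0.08, cacheWrite: 1.0 },
};

// Hypothetical usage shape; cache counts default to zero when absent.
interface Usage {
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens?: number;
  cacheWriteTokens?: number;
}

function costUSD(model: string, usage: Usage): number {
  const rates = RATES[model];
  if (!rates) {
    throw new Error(`No pricing configured for model: ${model}`);
  }
  const perMillionTotal =
    usage.inputTokens * rates.input +
    usage.outputTokens * rates.output +
    (usage.cacheReadTokens ?? 0) * rates.cacheRead +
    (usage.cacheWriteTokens ?? 0) * rates.cacheWrite;
  return perMillionTotal / 1e6;
}

// 2,000 input tokens and 500 output tokens on Sonnet:
// (2000 * 3 + 500 * 15) / 1,000,000 = 0.0135 USD
console.log(costUSD("claude-sonnet-4-5", { inputTokens: 2000, outputTokens: 500 }));
```

Throwing on an unknown model is a deliberate choice in this sketch: silently pricing an unrecognized model at zero would under-report spend.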

API Reference

GET /api/metrics/overview

Full org overview: total spend, top spenders, trend data.

GET /api/metrics/engineers

Per-engineer spend summaries for the current month.

GET /api/metrics/teams

Team-level budget summaries.

GET /api/metrics/trend?days=30

Daily spend trend (default last 30 days).

GET /api/alerts

Active (unresolved) alerts.
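
A minimal client for these endpoints might look like the sketch below. It relies only on the global `fetch` and `URL` available in Node.js 18+. The helper names `metricsUrl` and `getJson` are hypothetical, and because the response schemas are not documented here, the result is typed as `unknown` rather than guessed.

```typescript
// Builds an endpoint URL against the proxy, with optional query parameters.
function metricsUrl(
  baseUrl: string,
  path: string,
  query: Record<string, string | number> = {},
): string {
  const url = new URL(path, baseUrl);
  for (const [key, value] of Object.entries(query)) {
    url.searchParams.set(key, String(value));
  }
  return url.toString();
}

// Fetches and parses JSON, failing loudly on non-2xx responses.
async function getJson(url: string): Promise<unknown> {
  const response = await fetch(url);
  if (!response.ok) {
    throw new Error(`Request failed: ${response.status} ${response.statusText}`);
  }
  return response.json();
}

// Example (requires a running proxy on port 3001):
//   const trend = await getJson(
//     metricsUrl("http://localhost:3001", "/api/metrics/trend", { days: 7 }),
//   );
```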


Architecture

claude-cost-guardian/
├── packages/
│   ├── proxy/              # Express proxy server
│   │   └── src/
│   │       ├── index.ts        # Main proxy + API server
│   │       ├── db.ts           # SQLite storage layer
│   │       ├── policyEngine.ts # Budget enforcement logic
│   │       ├── logger.ts       # Winston logger
│   │       └── routes/
│   │           ├── metrics.ts  # Dashboard API endpoints
│   │           └── alerts.ts
│   ├── dashboard/          # React dashboard
│   │   └── src/
│   │       └── App.tsx         # Main dashboard UI
│   └── shared/             # Shared TypeScript types
│       └── src/
│           └── index.ts        # Types, pricing constants
├── config/
│   └── policy.yaml         # Budget policy config
└── .env.example

Deployment

Docker (recommended for teams)

docker build -t claude-cost-guardian .
docker run -e ANTHROPIC_API_KEY=sk-ant-... -p 3001:3001 claude-cost-guardian

Self-hosted

Run the proxy on an internal server. Point all engineers' ANTHROPIC_BASE_URL to it. The SQLite DB persists on the server's filesystem.


Contributing

PRs welcome, especially for:

  • Slack / email alert integrations
  • OpenAI Codex / Gemini proxy support
  • LDAP / Okta engineer sync
  • Per-repo or per-project budget attribution

License

MIT
