
llm-tap

A local proxy that intercepts LLM API traffic, tracks costs in real-time, and provides a terminal dashboard. Think of it as tcpdump for your OpenAI and Anthropic API calls.

Why?

  • Cost Visibility: See exactly how much each request costs, broken down by model
  • Token Tracking: Monitor input/output tokens across all your LLM calls
  • Latency Monitoring: Track response times to identify slow requests
  • Debugging: Inspect request/response payloads without modifying your code
  • Zero Code Changes: Just set environment variables to route traffic through the proxy

Installation

npm install -g llm-tap

Or run locally:

git clone https://github.com/dabit3/llm-tap
cd llm-tap
npm install
npm link

Usage

Start the Proxy

llm-tap start

This starts the proxy on port 8787 with a real-time TUI dashboard showing:

  • Total cost, requests, and tokens
  • Cost breakdown by model
  • Recent request log with timing
  • Token and latency sparklines
  • Provider breakdown

Configure Your App

Set these environment variables to route traffic through llm-tap:

# For OpenAI SDK
export OPENAI_BASE_URL=http://localhost:8787/v1

# For Anthropic SDK
export ANTHROPIC_BASE_URL=http://localhost:8787/anthropic

Or print the exports:

llm-tap env
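If you'd rather configure the proxy in code than via environment variables, both official SDKs accept a baseURL option at construction time. A minimal config sketch (the client option names here are from the `openai` and `@anthropic-ai/sdk` packages, not part of llm-tap itself):

```typescript
// Point the official SDKs at the llm-tap proxy explicitly,
// instead of setting OPENAI_BASE_URL / ANTHROPIC_BASE_URL.
import OpenAI from "openai";
import Anthropic from "@anthropic-ai/sdk";

const openai = new OpenAI({ baseURL: "http://localhost:8787/v1" });
const anthropic = new Anthropic({ baseURL: "http://localhost:8787/anthropic" });
```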

Options

# Custom port
llm-tap start --port 9000

# Verbose mode (logs to stdout instead of dashboard)
llm-tap start --verbose --no-dashboard

# Show pricing table
llm-tap pricing

API Endpoints

The proxy exposes these endpoints for programmatic access:

  • GET /stats - Aggregated statistics (costs, tokens, latency)
  • GET /requests?limit=50 - Recent requests with details
  • GET /export - Export all data as JSON
  • GET /health - Health check

Example:

curl http://localhost:8787/stats | jq
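The same endpoint works for programmatic monitoring, e.g. a budget check in a script. The field names below (`totalCost`, `totalRequests`, `totalTokens`) are an assumption for illustration — inspect your own `/stats` output for the real shape:

```typescript
// Sketch: using aggregated stats as a spend guard during development.
// Field names are hypothetical -- check the actual /stats response.
interface Stats {
  totalCost: number;
  totalRequests: number;
  totalTokens: number;
}

function overBudget(stats: Stats, budgetUsd: number): boolean {
  return stats.totalCost > budgetUsd;
}

// In a live script you would fetch the real numbers:
//   const stats: Stats = await (await fetch("http://localhost:8787/stats")).json();
const sample: Stats = { totalCost: 1.25, totalRequests: 42, totalTokens: 91000 };
console.log(overBudget(sample, 1.0)); // spend has crossed the $1 budget
```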

Dashboard Controls

  • q or Esc - Quit
  • r - Force refresh

Supported Providers

  • OpenAI: GPT-4o, GPT-4o-mini, GPT-4-turbo, GPT-3.5, o1, o1-mini, o3-mini
  • Anthropic: Claude 3.5 Sonnet, Claude 3.5 Haiku, Claude 3 Opus, Claude Sonnet 4, Claude Opus 4

Pricing is updated as of February 2026. Unknown models use a conservative default estimate.

How It Works

  1. llm-tap runs a local HTTP proxy server
  2. You point your LLM SDK at the proxy via *_BASE_URL environment variables
  3. The proxy forwards requests to the real API, capturing request/response data
  4. Token usage and costs are calculated based on the response's usage field
  5. Stats are aggregated and displayed in the TUI or exposed via the API

Your App  -->  llm-tap (localhost:8787)  -->  OpenAI/Anthropic API
              [logs, calculates cost]
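Step 4 above can be sketched as a small pricing lookup. The per-million-token rates and the fallback numbers here are illustrative, not llm-tap's actual pricing table (run `llm-tap pricing` for that):

```typescript
// Sketch of cost calculation from a response's usage field.
// Rates are illustrative per-million-token prices, not the real table.
const PRICE_PER_MTOK: Record<string, { input: number; output: number }> = {
  "gpt-4o-mini": { input: 0.15, output: 0.6 },
};

function requestCost(model: string, inputTokens: number, outputTokens: number): number {
  // Unknown models fall back to a conservative default estimate,
  // as the README notes; these fallback rates are made up for the sketch.
  const p = PRICE_PER_MTOK[model] ?? { input: 1.0, output: 3.0 };
  return (inputTokens * p.input + outputTokens * p.output) / 1_000_000;
}

console.log(requestCost("gpt-4o-mini", 1000, 500).toFixed(6)); // "0.000450"
```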

Use Cases

  • Cost Optimization: Identify which parts of your app consume the most tokens
  • Debugging Agent Loops: See every API call your agent makes
  • Comparing Models: A/B test different models and compare costs
  • Budget Monitoring: Track spend during development
  • Latency Analysis: Find slow requests that impact user experience

License

MIT
