# llm-tap

A local proxy that intercepts LLM API traffic, tracks costs in real time, and provides a terminal dashboard. Think of it as tcpdump for your OpenAI and Anthropic API calls.
## Features

- Cost Visibility: See exactly how much each request costs, broken down by model
- Token Tracking: Monitor input/output tokens across all your LLM calls
- Latency Monitoring: Track response times to identify slow requests
- Debugging: Inspect request/response payloads without modifying your code
- Zero Code Changes: Just set environment variables to route traffic through the proxy
## Installation

```bash
npm install -g llm-tap
```

Or run locally:

```bash
git clone https://github.com/dabit3/llm-tap
cd llm-tap
npm install
npm link
```

## Quick Start

```bash
llm-tap start
```

This starts the proxy on port 8787 with a real-time TUI dashboard showing:
- Total cost, requests, and tokens
- Cost breakdown by model
- Recent request log with timing
- Token and latency sparklines
- Provider breakdown
## Usage

Set these environment variables to route traffic through llm-tap:

```bash
# For OpenAI SDK
export OPENAI_BASE_URL=http://localhost:8787/v1

# For Anthropic SDK
export ANTHROPIC_BASE_URL=http://localhost:8787/anthropic
```

Or print the exports:

```bash
llm-tap env
```
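To see what the base-URL override actually does, here is a minimal sketch (not llm-tap code) of how an SDK rebases its endpoint paths onto whatever base URL it is given:

```javascript
// Sketch (not llm-tap code): how a client that honors a base-URL
// override resolves an endpoint path against it. Any SDK that reads
// OPENAI_BASE_URL / ANTHROPIC_BASE_URL does equivalent rebasing.
function resolveEndpoint(path, base) {
  // Ensure a trailing slash so the last path segment (e.g. /v1) is
  // preserved when the relative endpoint path is resolved against it.
  return new URL(path, base.replace(/\/?$/, "/")).href;
}

console.log(resolveEndpoint("chat/completions", "http://localhost:8787/v1"));
// -> http://localhost:8787/v1/chat/completions
```

Because the rebasing happens inside the SDK, setting the environment variable is the only change your application needs.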
## CLI Options

```bash
# Custom port
llm-tap start --port 9000

# Verbose mode (logs to stdout instead of dashboard)
llm-tap start --verbose --no-dashboard

# Show pricing table
llm-tap pricing
```

## HTTP API

The proxy exposes these endpoints for programmatic access:
- `GET /stats` - Aggregated statistics (costs, tokens, latency)
- `GET /requests?limit=50` - Recent requests with details
- `GET /export` - Export all data as JSON
- `GET /health` - Health check
Example:

```bash
curl http://localhost:8787/stats | jq
```

## Keyboard Shortcuts

- `q` or `Esc` - Quit
- `r` - Force refresh
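For quick scripting against the stats endpoint, something like the following works. The field names used here (`requests`, `totalTokens`, `totalCost`) are assumptions for illustration — inspect the actual JSON from `/stats` for the real shape:

```javascript
// Sketch: reduce GET /stats output to a one-line summary.
// The field names below are assumed, not confirmed by llm-tap's docs.
function summarizeStats(stats) {
  return `${stats.requests} requests, ` +
    `${stats.totalTokens} tokens, ` +
    `$${stats.totalCost.toFixed(4)}`;
}

// In a real script:
// fetch("http://localhost:8787/stats").then(r => r.json()).then(summarizeStats)
console.log(summarizeStats({ requests: 12, totalTokens: 48210, totalCost: 0.0731 }));
// -> 12 requests, 48210 tokens, $0.0731
```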
## Supported Models

- OpenAI: GPT-4o, GPT-4o-mini, GPT-4-turbo, GPT-3.5, o1, o1-mini, o3-mini
- Anthropic: Claude 3.5 Sonnet, Claude 3.5 Haiku, Claude 3 Opus, Claude Sonnet 4, Claude Opus 4
Pricing is updated as of February 2026. Unknown models use a conservative default estimate.
## How It Works

- llm-tap runs a local HTTP proxy server
- You point your LLM SDK at the proxy via `*_BASE_URL` environment variables
- The proxy forwards requests to the real API, capturing request/response data
- Token usage and costs are calculated from the response's `usage` field
- Stats are aggregated and displayed in the TUI or exposed via the HTTP API

```
Your App --> llm-tap (localhost:8787) --> OpenAI/Anthropic API
                    [logs, calculates cost]
```
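The cost step above can be sketched as follows. The prices and the shape of the pricing table here are illustrative placeholders, not llm-tap's actual data:

```javascript
// Sketch of the per-request cost math, assuming USD-per-million-token
// prices. The price values and table shape below are illustrative only.
const PRICES = {
  "gpt-4o-mini": { input: 0.15, output: 0.6 }, // USD per 1M tokens (example)
};

function requestCost(model, usage) {
  const p = PRICES[model];
  // Unknown model: llm-tap falls back to a conservative default estimate;
  // this sketch just signals the miss.
  if (!p) return null;
  return (usage.prompt_tokens * p.input + usage.completion_tokens * p.output) / 1e6;
}

console.log(requestCost("gpt-4o-mini", { prompt_tokens: 1000, completion_tokens: 500 }));
// -> 0.00045
```

Because the math uses the `usage` field the provider returns, no client-side tokenization is needed.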
## Use Cases

- Cost Optimization: Identify which parts of your app consume the most tokens
- Debugging Agent Loops: See every API call your agent makes
- Comparing Models: A/B test different models and compare costs
- Budget Monitoring: Track spend during development
- Latency Analysis: Find slow requests that impact user experience
## License

MIT