balyakin/smartproxy

SmartProxy

SmartProxy is a local LLM proxy for people who use OpenAI-compatible clients but do not want provider choice, failover, cost tracking, and caching scattered across shell scripts.

It runs as one Go binary, listens on 127.0.0.1:4100, and keeps its ledger in SQLite. Point Cursor, Continue, the OpenAI SDK, or any compatible client at http://127.0.0.1:4100/v1; SmartProxy decides where each chat request should go.

[Image: SmartProxy terminal dashboard preview]

The checked-in image above is a dashboard preview with sample numbers. Replace it with a real terminal capture when you publish release screenshots.

Why It Exists

Local LLM workflows tend to grow small pieces of infrastructure: one script for DeepSeek, another for OpenAI, a hand-written retry loop, a spreadsheet for token costs, and a half-remembered budget check. SmartProxy puts those boring parts in one place.

The useful bit is not that it hides providers. It is that every request gets a recorded decision: which route matched, which target was tried, whether failover happened, how many tokens came back, what it probably cost, and whether a cache hit saved the call.

What It Does

  • Accepts OpenAI Chat Completions requests at POST /v1/chat/completions.
  • Routes by exact model, model aliases, ordered rules, request shape, tools, vision input, estimated token size, and target capabilities.
  • Talks to OpenAI-compatible providers directly; for Anthropic, converts requests to the Anthropic Messages format and translates the responses back into OpenAI-shaped responses.
  • Retries transient upstream failures, respects Retry-After, and opens circuit breakers around bad targets.
  • Records local SQLite telemetry for usage, cost, routing, cache, latency, status, request fingerprints, and anomalies.
  • Keeps prompts, responses, tool arguments, request bodies, API keys, and full client IP addresses out of telemetry tables.
  • Supports an optional exact-match response cache. When enabled, cache entries can be gzipped and encrypted at rest.
  • Enforces daily, monthly, and session budgets before sending work upstream.
  • Ships CLI tools for setup, config validation, provider checks, route explanation, stats, export, pruning, cache maintenance, and a Bubble Tea terminal dashboard.

Status

SmartProxy is currently at version 0.1.0. The v1 scope is intentionally narrow: OpenAI-compatible Chat Completions in, provider-specific HTTP out, local observability around the trip. The Responses API, embeddings, image generation, audio generation, hosted dashboards, and multi-tenant billing are outside this version.

Quick Start

Build the binary from a checkout:

go build -trimpath -o dist/smartproxy ./cmd/smartproxy

Run it with zero config by setting at least one supported provider key:

export OPENAI_API_KEY=sk-...
./dist/smartproxy

SmartProxy will listen at:

http://127.0.0.1:4100/v1

On loopback, client authentication is optional by default. If your SDK requires an API key field, use any non-empty placeholder value there; provider keys still come from the environment variables configured for SmartProxy.

Try a request:

curl -s http://127.0.0.1:4100/v1/chat/completions \
  -H 'content-type: application/json' \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Reply with one short sentence."}]
  }'

Then inspect what happened:

./dist/smartproxy stats
./dist/smartproxy dash
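
The same request can be sent from Go with only the standard library. The following is a minimal sketch, not code from this repository: the helper name newChatRequest and the stand-in test server are illustrative assumptions, and only the URL path, the JSON body shape, and the Bearer header come from this README.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// chatMessage and chatRequest mirror the minimal Chat Completions body
// used in the curl example above.
type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string        `json:"model"`
	Messages []chatMessage `json:"messages"`
}

// newChatRequest (hypothetical helper) builds a POST to
// <baseURL>/chat/completions. apiKey may be empty on loopback; when set,
// it is sent as a Bearer token.
func newChatRequest(baseURL, apiKey, model, prompt string) (*http.Request, error) {
	body, err := json.Marshal(chatRequest{
		Model:    model,
		Messages: []chatMessage{{Role: "user", Content: prompt}},
	})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/chat/completions", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	if apiKey != "" {
		req.Header.Set("Authorization", "Bearer "+apiKey)
	}
	return req, nil
}

func main() {
	// Stand-in server so the sketch runs without a proxy. Replace
	// srv.URL+"/v1" with "http://127.0.0.1:4100/v1" to reach SmartProxy.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, `{"path":%q}`, r.URL.Path)
	}))
	defer srv.Close()

	req, err := newChatRequest(srv.URL+"/v1", "", "gpt-4o-mini", "Reply with one short sentence.")
	if err != nil {
		panic(err)
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	out, err := io.ReadAll(resp.Body)
	if err != nil {
		panic(err)
	}
	fmt.Println(resp.Status, string(out))
}
```

When the proxy requires a client key, as in the Docker setup, pass that key as apiKey; on loopback an empty string is enough.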

Configuration

For anything beyond the simplest local setup, create a config file:

./dist/smartproxy setup
./dist/smartproxy start --watch

Or start from the checked-in example:

cp smartproxy.yaml.example smartproxy.yaml
$EDITOR smartproxy.yaml
./dist/smartproxy config validate --config smartproxy.yaml
./dist/smartproxy --config smartproxy.yaml start --watch

SmartProxy discovers config files in this order:

./smartproxy.yaml
~/.config/smartproxy/smartproxy.yaml
~/.smartproxy.yaml
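
A first-match lookup over that list could look roughly like the sketch below. It is an illustration of the documented order only; the function names candidatePaths and firstExisting are hypothetical and do not come from the SmartProxy source.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// candidatePaths lists the documented locations in lookup order.
// home is the user's home directory (the "~" in the list above).
func candidatePaths(home string) []string {
	return []string{
		"smartproxy.yaml",
		filepath.Join(home, ".config", "smartproxy", "smartproxy.yaml"),
		filepath.Join(home, ".smartproxy.yaml"),
	}
}

// firstExisting returns the first path for which exists reports true.
// The existence check is injected so the order can be tested without files.
func firstExisting(paths []string, exists func(string) bool) (string, bool) {
	for _, p := range paths {
		if exists(p) {
			return p, true
		}
	}
	return "", false
}

// fileExists reports whether p is present and is not a directory.
func fileExists(p string) bool {
	info, err := os.Stat(p)
	return err == nil && !info.IsDir()
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Println("cannot resolve home directory:", err)
		return
	}
	if p, ok := firstExisting(candidatePaths(home), fileExists); ok {
		fmt.Println("would load", p)
	} else {
		fmt.Println("no config file found")
	}
}
```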

The full example lives in smartproxy.yaml.example. It shows three common provider entries, ordered route rules, failover policy, cache settings, budget limits, pricing overrides, and local auth defaults.

Useful environment overrides:

SMARTPROXY_LISTEN=127.0.0.1:4100
SMARTPROXY_ADMIN_LISTEN=127.0.0.1:4101
SMARTPROXY_DB=~/.smartproxy/smartproxy.db
SMARTPROXY_LOG_LEVEL=info
SMARTPROXY_CACHE_ENABLED=false
SMARTPROXY_BUDGET_DAILY_LIMIT_USD=10
SMARTPROXY_BUDGET_MONTHLY_LIMIT_USD=200
SMARTPROXY_BUDGET_SESSION_LIMIT_USD=0

Provider keys are read from the env vars named in the config, for example OPENAI_API_KEY, DEEPSEEK_API_KEY, and ANTHROPIC_API_KEY.
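
To make the budget overrides concrete, here is a hedged sketch of reading those three variables and checking them before a request goes upstream. The types, function names, and fallback values are assumptions for illustration (the fallbacks simply reuse the example values above), and treating a limit of 0 as "no limit" is a guess based on the session example, not documented behavior.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// Limits and Spend are hypothetical types for this sketch, in US dollars.
type Limits struct{ Daily, Monthly, Session float64 }
type Spend struct{ Daily, Monthly, Session float64 }

// limitFromEnv reads a non-negative float from the named variable and
// returns fallback when the variable is unset or malformed.
func limitFromEnv(name string, fallback float64) float64 {
	raw, ok := os.LookupEnv(name)
	if !ok {
		return fallback
	}
	v, err := strconv.ParseFloat(raw, 64)
	if err != nil || v < 0 {
		return fallback
	}
	return v
}

// exhausted returns the name of the first budget that is used up, or ""
// if the request may proceed.
// Assumption: a limit of 0 disables that budget.
func exhausted(l Limits, s Spend) string {
	checks := []struct {
		name         string
		limit, spent float64
	}{
		{"daily", l.Daily, s.Daily},
		{"monthly", l.Monthly, s.Monthly},
		{"session", l.Session, s.Session},
	}
	for _, c := range checks {
		if c.limit > 0 && c.spent >= c.limit {
			return c.name
		}
	}
	return ""
}

func main() {
	limits := Limits{
		Daily:   limitFromEnv("SMARTPROXY_BUDGET_DAILY_LIMIT_USD", 10),
		Monthly: limitFromEnv("SMARTPROXY_BUDGET_MONTHLY_LIMIT_USD", 200),
		Session: limitFromEnv("SMARTPROXY_BUDGET_SESSION_LIMIT_USD", 0),
	}
	spent := Spend{Daily: 10.25, Monthly: 42, Session: 3} // made-up ledger totals
	if name := exhausted(limits, spent); name != "" {
		fmt.Println("blocked by", name, "budget")
		return
	}
	fmt.Println("request allowed")
}
```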

Routing Model

Routing is deterministic. For each chat request SmartProxy:

  1. Resolves model aliases.
  2. If the requested model exactly matches a configured model, sends the request directly to the first enabled target for it.
  3. Otherwise evaluates routes from top to bottom.
  4. Skips targets that are disabled, circuit-open, missing credentials, too small for the request, or missing required capabilities.
  5. Retries transient failures on the same target.
  6. Fails over to the next capable target when retries are exhausted.
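
Steps 4 to 6 can be sketched as a small loop. This is a simplified illustration under stated assumptions, not the code in internal/routing: the Target and Request types, the errTransient sentinel, and the choice to return immediately on a non-transient error are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Target is a hypothetical, simplified view of an upstream target.
type Target struct {
	Name           string
	Enabled        bool
	CircuitOpen    bool
	HasCredentials bool
	MaxInputTokens int
	Capabilities   map[string]bool
}

// Request carries only the fields this sketch needs.
type Request struct {
	EstimatedTokens int
	Needs           []string // for example "tools" or "vision"
}

// errTransient marks failures that are worth retrying on the same target.
var errTransient = errors.New("transient upstream failure")

// eligible applies the skip conditions from step 4.
func eligible(t Target, r Request) bool {
	if !t.Enabled || t.CircuitOpen || !t.HasCredentials {
		return false
	}
	if r.EstimatedTokens > t.MaxInputTokens {
		return false
	}
	for _, c := range r.Needs {
		if !t.Capabilities[c] {
			return false
		}
	}
	return true
}

// dispatch tries eligible targets in order. Each target gets one attempt
// plus up to maxRetries retries on transient errors (step 5), then the next
// target is tried (step 6). It returns the serving target's name and the
// total number of attempts.
// Assumption: a non-transient error ends the request without failover.
func dispatch(targets []Target, r Request, maxRetries int, send func(Target) error) (string, int, error) {
	attempts := 0
	lastErr := errors.New("no eligible target")
	for _, t := range targets {
		if !eligible(t, r) {
			continue
		}
		for try := 0; try <= maxRetries; try++ {
			attempts++
			err := send(t)
			if err == nil {
				return t.Name, attempts, nil
			}
			lastErr = err
			if !errors.Is(err, errTransient) {
				return "", attempts, err
			}
		}
	}
	return "", attempts, lastErr
}

func main() {
	caps := map[string]bool{"tools": true}
	targets := []Target{
		{Name: "a", Enabled: true, CircuitOpen: true, HasCredentials: true, MaxInputTokens: 8000, Capabilities: caps},
		{Name: "b", Enabled: true, HasCredentials: true, MaxInputTokens: 8000, Capabilities: caps},
		{Name: "c", Enabled: true, HasCredentials: true, MaxInputTokens: 8000, Capabilities: caps},
	}
	req := Request{EstimatedTokens: 1200, Needs: []string{"tools"}}
	// Simulated upstream: "b" always fails transiently, everything else succeeds.
	send := func(t Target) error {
		if t.Name == "b" {
			return errTransient
		}
		return nil
	}
	name, attempts, err := dispatch(targets, req, 1, send)
	fmt.Println(name, attempts, err) // prints: c 3 <nil>
}
```

In the example, target a is skipped because its circuit is open, b uses its first attempt and one retry, and c serves the request on the third attempt overall.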

You can ask SmartProxy to explain a request without sending it upstream:

./dist/smartproxy route explain \
  --config smartproxy.yaml.example \
  --file testdata/openai_chat.json

Responses include routing headers when observability.response_headers is true:

x-smartproxy-request-id
x-smartproxy-route
x-smartproxy-target
x-smartproxy-provider
x-smartproxy-model
x-smartproxy-attempts
x-smartproxy-failover
x-smartproxy-cache
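
A client can collect these headers for its own logging. The sketch below shows one way to do that; the function name routingHeaders is hypothetical, and the header values in main are placeholders, since this README does not document the value formats.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

const headerPrefix = "x-smartproxy-"

// routingHeaders collects the x-smartproxy-* headers into a map keyed by the
// suffix ("route", "target", ...). It accepts the underlying map type so an
// http.Header (for example resp.Header) can be passed directly.
func routingHeaders(h map[string][]string) map[string]string {
	out := make(map[string]string)
	for name, values := range h {
		// net/http stores canonicalized names, so compare case-insensitively.
		lower := strings.ToLower(name)
		if strings.HasPrefix(lower, headerPrefix) && len(values) > 0 {
			out[strings.TrimPrefix(lower, headerPrefix)] = values[0]
		}
	}
	return out
}

func main() {
	// Placeholder values for illustration only.
	h := http.Header{}
	h.Set("x-smartproxy-route", "example-route")
	h.Set("x-smartproxy-target", "example-target")
	h.Set("Content-Type", "application/json")

	info := routingHeaders(h)
	fmt.Println(info["route"], info["target"], len(info)) // prints: example-route example-target 2
}
```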

CLI Reference

The day-to-day commands are small on purpose:

Need                            Command
Create an interactive config    smartproxy setup
Start the proxy                 smartproxy start --watch
Start with the dashboard        smartproxy start --dash
Validate config                 smartproxy config validate --config smartproxy.yaml
Check local setup               smartproxy doctor --config smartproxy.yaml
Check providers                 smartproxy providers check --config smartproxy.yaml
Explain routing                 smartproxy route explain --file request.json
Send one request from a file    smartproxy request --file request.json
Show local usage                smartproxy stats --today
Export telemetry                smartproxy export --format csv
Show cache stats                smartproxy cache stats
Clear cache entries             smartproxy cache clear
Prune old telemetry             smartproxy prune --execute

HTTP Surface

Public listener:

  • POST /v1/chat/completions
  • GET /v1/models
  • GET /healthz
  • GET /readyz
  • GET /stats

Admin listener:

  • POST /admin/reload
  • GET /admin/stats
  • GET /admin/budget
  • GET /metrics
  • GET /debug/routes
  • GET /debug/status

POST /v1/responses is intentionally rejected in v1 with an unsupported_endpoint error.

Data Policy

Telemetry is designed for debugging spend and routing, not for storing user content. Request logs include metadata such as route, target, provider, model, status, token counts, latency, cache status, cost estimate, retry attempts, user agent, salted client address hash, trace ID, and a salted request fingerprint.

Telemetry does not store:

  • prompt text;
  • response text;
  • request bodies;
  • tool arguments;
  • API keys;
  • full client IP addresses.
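
As an illustration of how a salted hash keeps a client address out of the logs while still letting repeated requests be correlated, here is a minimal sketch using HMAC-SHA256. The actual hashing scheme SmartProxy uses is not described in this README, so the construction, the function name saltedHash, and the example salt are assumptions.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// saltedHash returns the hex-encoded HMAC-SHA256 of value, keyed by salt.
// The same salt and value always give the same digest, so rows can be
// grouped without storing the value itself.
func saltedHash(salt []byte, value string) string {
	mac := hmac.New(sha256.New, salt)
	mac.Write([]byte(value))
	return hex.EncodeToString(mac.Sum(nil))
}

func main() {
	// Example salt only; a real deployment would use a random secret.
	salt := []byte("example-salt")

	a := saltedHash(salt, "203.0.113.7")
	b := saltedHash(salt, "203.0.113.7")
	c := saltedHash([]byte("another-salt"), "203.0.113.7")

	fmt.Println("stable for one salt:", a == b)
	fmt.Println("differs across salts:", a != c)
	fmt.Println("digest length:", len(a))
}
```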

The response cache is different: if you enable it, SmartProxy stores response bodies locally so it can replay exact matches. Use cache.encrypt: true and set SMARTPROXY_CACHE_KEY when cached bodies need encryption at rest.
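
For readers curious what "gzipped and encrypted at rest" can mean in practice, the sketch below compresses a body with gzip and then seals it with AES-256-GCM. It is one plausible construction, not the format SmartProxy writes: the cipher choice, the nonce layout, the function names, and deriving the key by hashing a secret string are all assumptions.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"crypto/sha256"
	"errors"
	"fmt"
	"io"
)

// deriveKey turns a secret string into a 32-byte AES-256 key.
// Assumption: the real SMARTPROXY_CACHE_KEY format may differ.
func deriveKey(secret string) []byte {
	sum := sha256.Sum256([]byte(secret))
	return sum[:]
}

func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// seal gzips body and then encrypts it. The random nonce is prepended to
// the ciphertext so unseal can recover it.
func seal(key, body []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(body); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, buf.Bytes(), nil), nil
}

// unseal reverses seal: it authenticates and decrypts, then gunzips.
func unseal(key, sealed []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	n := gcm.NonceSize()
	if len(sealed) < n {
		return nil, errors.New("sealed entry too short")
	}
	compressed, err := gcm.Open(nil, sealed[:n], sealed[n:], nil)
	if err != nil {
		return nil, err
	}
	zr, err := gzip.NewReader(bytes.NewReader(compressed))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

func main() {
	key := deriveKey("example-secret")
	sealed, err := seal(key, []byte(`{"cached":"response body"}`))
	if err != nil {
		panic(err)
	}
	body, err := unseal(key, sealed)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))

	_, err = unseal(deriveKey("wrong-secret"), sealed)
	fmt.Println("wrong key rejected:", err != nil)
}
```

Compressing before encrypting matters here: ciphertext is effectively incompressible, so the reverse order would save no space.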

Docker

The default listener binds to loopback, which is not reachable through a published container port. In Docker, bind the public listener to 0.0.0.0 and set a SmartProxy client key:

docker build -t smartproxy .

docker run --rm \
  -p 127.0.0.1:4100:4100 \
  -e OPENAI_API_KEY \
  -e SMARTPROXY_LISTEN=0.0.0.0:4100 \
  -e SMARTPROXY_API_KEY=local-dev-key \
  smartproxy

Clients should then send Authorization: Bearer local-dev-key to the local proxy. The upstream provider key remains inside the container environment.

Development

Requirements:

  • Go 1.23 or newer.
  • No Redis, Postgres, CGO SQLite driver, or external worker process.

Common checks:

make test
make race
make coverage
make release-check

The release check runs the regular tests, race tests, coverage script, and a trimmed binary build.

Project Layout

cmd/smartproxy/        CLI entrypoint
internal/api/          OpenAI request parsing and request analysis
internal/provider/     OpenAI-compatible and Anthropic upstream adapters
internal/translate/    OpenAI <-> Anthropic conversion
internal/routing/      Rules, retries, failover, circuit state
internal/runtime/      HTTP listeners, auth, headers, reload
internal/store/        SQLite schema, WAL store, async logging
internal/report/       Stats, export, assertions
internal/dashboard/    Terminal dashboard
internal/cache/        Exact-match response cache
internal/budget/       Daily, monthly, and session limits

License

MIT. See LICENSE.
