adarshba/submux
SUBMUX
subscription LLM gateway

One binary. Two upstreams. Zero pasted tokens.

WHAT IS SUBMUX?

Submux is a small Rust HTTP gateway that proxies your Anthropic Claude Max and ChatGPT Codex *subscriptions* to standard SDK clients. You log in once with the official CLIs (`claude login` / `codex login`), run `submux`, and point Claude Code, Cursor, or any OpenAI/Anthropic SDK at it.

It is not an API-key gateway. It does not multiplex many accounts. It does not need a database. It is a single process that brokers one account per provider with the right headers, cloaking, and OAuth refresh handling so session-mode clients keep working through SDKs that only know API keys.

WHY

Standard SDKs assume the unit of capacity is an API key. Subscriptions are different: OAuth tokens with short TTLs, refresh flows, fingerprint headers, the OAuth body cloak Claude Code uses. Stuffing that into "set ANTHROPIC_API_KEY=…" loses too much. Submux carries the session model end-to-end so the client never has to.

FEATURES

- `/v1/messages` byte-for-byte passthrough to api.anthropic.com with the Claude Code OAuth body cloak, Stainless headers, and the `anthropic-beta: oauth-2025-04-20` tag. Refresh-on-401 with singleflight.
- `/v1/chat/completions` translation: OpenAI Chat Completions in, Anthropic Messages SSE out, OpenAI streaming chunks (or buffered `chat.completion`) back to the client.
- `/codex/responses` passthrough to chatgpt.com/backend-api/codex with the Codex CLI fingerprint and JWT bearer.
- `/codex/v1/messages` Anthropic Messages in, Codex Responses out. Lets Claude Code (`ANTHROPIC_BASE_URL=http://submux/codex`) drive a ChatGPT subscription end to end.
- `/v1/models` (+ `/v1/models/:model`) proxy so Claude Code's model-validation probe succeeds against the gateway instead of 404-ing.
- Auto-discovers credentials from where the official CLIs already store them: macOS Keychain item `Claude Code-credentials` or `~/.claude/.credentials.json`, and `~/.codex/auth.json`.
- Optional inbound API key gate (`api_key` in the TOML, or `SUBMUX_API_KEY`) with constant-time verification. Disabled by default; routes are open on `127.0.0.1` only.
- OpenTelemetry SDK metrics: OTLP push (set `SUBMUX_OTLP_ENDPOINT`) plus a Prometheus `/metrics` scrape endpoint, from one instrument set. Per-consumer usage and token counts (Anthropic + Codex) keyed on the client `X-Proxy-User-Id` header, labelled by protocol/model. A one-command Docker Compose stack (collector + Prometheus + Grafana with the dashboard pre-provisioned) and the dashboard JSON live in `examples/`.

SETUP

1. Log in with the official CLIs (one-time)

    claude login    # stores in macOS Keychain or ~/.claude/.credentials.json
    codex login     # stores in ~/.codex/auth.json

   Skip whichever provider you don't intend to proxy.

2. Install and run

    cargo install submux
    submux

   Requires Rust. Get it at https://rustup.rs

   First boot writes a config skeleton to ~/.config/submux/config.toml (mode 0600), discovers your CLI credentials, prints a banner, and serves the proxy on http://127.0.0.1:8080.

3. Point your client at it

    Claude Code (Claude Max):      ANTHROPIC_BASE_URL=http://127.0.0.1:8080
    OpenAI-compatible client:      base_url=http://127.0.0.1:8080
    Claude Code (ChatGPT Codex):   ANTHROPIC_BASE_URL=http://127.0.0.1:8080/codex

   Optional: observability

    Push metrics to a collector:   SUBMUX_OTLP_ENDPOINT=http://localhost:4318
    Tag per-consumer usage:        ANTHROPIC_CUSTOM_HEADERS="X-Proxy-User-Id: $(whoami)_$(uuidgen)"

4. Optional: lock the proxy down

   Edit ~/.config/submux/config.toml and set api_key to any non-empty string. Then point your client's *_API_KEY at the same value. The gate accepts the key in either the Authorization: Bearer header or the x-api-key header and compares it in constant time. Health, readiness, and /metrics stay open.

CLI

    submux                      run the proxy (the only verb)
    submux --config <path>      use a config file outside the XDG default
    submux -q | -v              warn-only or debug logging
    submux --help / --version

To inspect or rotate config values, edit the TOML directly and restart.
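The inbound key gate from SETUP step 4 can be pictured roughly as follows. This is a minimal, std-only sketch and not Submux's actual code: the function names (`constant_time_eq`, `presented_key`, `gate_allows`) and the `/health` and `/ready` paths are hypothetical, and the comparison is a plain XOR fold rather than the sha2-based compare listed in the tech stack. For production use, a vetted constant-time crate is the safer choice, since a compiler is free to optimize hand-rolled loops.

```rust
/// Compare two byte strings without stopping at the first mismatch.
/// The length difference is folded into the accumulator, so inputs of
/// unequal length fail after the same full pass.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    let mut diff = a.len() ^ b.len();
    let n = a.len().max(b.len());
    for i in 0..n {
        let x = a.get(i).copied().unwrap_or(0);
        let y = b.get(i).copied().unwrap_or(0);
        diff |= (x ^ y) as usize;
    }
    diff == 0
}

/// Pull the presented key from `Authorization: Bearer <key>` or `x-api-key`.
/// Header names are matched case-insensitively.
fn presented_key<'a>(headers: &[(&'a str, &'a str)]) -> Option<&'a str> {
    for &(name, value) in headers {
        if name.eq_ignore_ascii_case("authorization") {
            if let Some(key) = value.strip_prefix("Bearer ") {
                return Some(key.trim());
            }
        } else if name.eq_ignore_ascii_case("x-api-key") {
            return Some(value.trim());
        }
    }
    None
}

/// Paths that stay open even when the gate is on.
/// ASSUMPTION: only `/metrics` is named in the README; the health and
/// readiness paths here are placeholders.
const OPEN_PATHS: [&str; 3] = ["/health", "/ready", "/metrics"];

/// Decide whether a request may proceed.
/// `configured` is `None` or empty when the gate is disabled (the default).
fn gate_allows(configured: Option<&str>, path: &str, headers: &[(&str, &str)]) -> bool {
    let expected = match configured {
        Some(k) if !k.is_empty() => k,
        _ => return true, // gate disabled: every route is open
    };
    if OPEN_PATHS.iter().any(|p| *p == path) {
        return true;
    }
    match presented_key(headers) {
        Some(got) => constant_time_eq(got.as_bytes(), expected.as_bytes()),
        None => false,
    }
}
```

With a key configured, `gate_allows(Some("k"), "/v1/messages", &[("x-api-key", "k")])` passes, while the same request without a key header is rejected and `/metrics` stays reachable either way.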
TECH STACK

- Rust 2024 edition, single library + binary crate
- tokio 1.42 + axum 0.7 inbound
- reqwest 0.12 outbound with per-request cookie jars
- arc-swap, dashmap, moka, parking_lot for hot-path state
- chacha20poly1305-shaped api key compare via sha2 (constant-time)
- tracing + tracing-subscriber; OpenTelemetry SDK metrics (OTLP + Prometheus)

PROJECT STRUCTURE

    src/
    ├── lib.rs                 module roots + re-exports
    ├── main.rs                binary entry (parse → resolve → seed → serve)
    ├── cli.rs                 clap flags (--config, -q, -v)
    ├── core/                  canonical types (no I/O)
    ├── config/                TOML file + env resolver → Settings + auto-discovery
    ├── accounts/              Account pool, refresh, cookie jar, cooldown, discovery
    ├── providers/anthropic    AnthropicProxy + cloak + headers + oauth + quota
    ├── providers/codex        CodexProxy + ChatGPT session + cookies
    ├── protocols/             Anthropic ↔ OpenAI translation, collapse, parse
    ├── streaming/             SSE parser/emitter, relay state machine
    ├── telemetry/             OpenTelemetry metrics (OTLP push + Prometheus) + tracer ids
    ├── server/                axum app, middleware (request-id, auth, panic),
    │                          routes (root, messages, models, chat, codex,
    │                          codex_messages, health, metrics), banner, shutdown
    └── constants/             header names, upstream paths, user agents, limits

See docs/architecture.md for the full request path and module map.

DEVELOPMENT (contributors)

    cargo check
    cargo fmt --all -- --check
    cargo clippy --all-targets -- -D warnings
    cargo test --lib
    cargo build --release

The verify chain must pass before merge.

DESIGN PHILOSOPHY

- Subscriptions, not API keys, on the credential side.
- Bytes, not parsed tokens, on the fast path.
- Auto-discover before asking the user to paste anything.
- One concept per file; types in their own files; method-first APIs.
- No dead modules. If it isn't on the request path, it isn't here.

DISCLAIMER

Submux is a personal engineering project. It is not affiliated with Anthropic, OpenAI, or any subscription provider.
Use it only with accounts you own and within each provider's terms of service.

LICENSE

Apache-2.0