A simple, blazing-fast LiteLLM-compatible AI gateway for coding agents (Claude Code, Codex, Hermes, etc.).

    lite claude

The first run prompts for your LiteLLM URL and API key, saves them to
`~/.config/lite/claude.env`, and starts Claude Code with:

    ANTHROPIC_BASE_URL="https://your-litellm-rust-server.com"
    ANTHROPIC_AUTH_TOKEN="$LITELLM_API_KEY"

Arguments after `lite claude` are forwarded to Claude Code:

    lite claude --help
    lite claude --model claude-sonnet-4-5

Run `lite claude --reset` to ignore saved settings and enter them again.

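
As a rough illustration of the first-run flow, the sketch below reads saved settings and maps them onto the two variables Claude Code is started with. The `KEY=VALUE` file layout and the `LITELLM_BASE_URL` key name are assumptions made for the example; only `LITELLM_API_KEY` and the two `ANTHROPIC_*` names come from this README.

```rust
use std::collections::HashMap;

/// Parse simple KEY=VALUE lines, skipping blank lines and `#` comments.
/// Double quotes around a value are stripped.
/// (Assumed file format; the real `claude.env` layout may differ.)
fn parse_env_file(contents: &str) -> HashMap<String, String> {
    let mut vars = HashMap::new();
    for line in contents.lines() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        if let Some((key, value)) = line.split_once('=') {
            let value = value.trim().trim_matches('"');
            vars.insert(key.trim().to_string(), value.to_string());
        }
    }
    vars
}

/// Map saved settings onto the variables Claude Code reads.
/// `LITELLM_BASE_URL` is a hypothetical key name used only in this sketch.
/// Returns None when either setting is missing, which is when a wizard
/// would prompt again.
fn claude_env(saved: &HashMap<String, String>) -> Option<Vec<(String, String)>> {
    let url = saved.get("LITELLM_BASE_URL")?;
    let key = saved.get("LITELLM_API_KEY")?;
    Some(vec![
        ("ANTHROPIC_BASE_URL".to_string(), url.clone()),
        ("ANTHROPIC_AUTH_TOKEN".to_string(), key.clone()),
    ])
}
```

The returned pairs could be handed to `std::process::Command::envs` when spawning the agent, so the parent shell's environment is left untouched.
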

    lite codex

The first run prompts for your LiteLLM URL and API key, saves them to
`~/.config/lite/codex.env`, and starts Codex pointed at the gateway. Codex uses
the OpenAI Responses API (SSE over HTTP, no WebSocket), so requests land on
`POST /v1/responses`. The wizard injects the gateway via `-c` config overrides
and never edits your `~/.codex/config.toml`:

    codex \
      -c model_provider="litellm" \
      -c model_providers.litellm.base_url="https://your-litellm-rust-server.com/v1" \
      -c model_providers.litellm.wire_api="responses" \
      -c model_providers.litellm.env_key="LITELLM_API_KEY"
    # LITELLM_API_KEY is exported from your saved key

Arguments after `lite codex` are forwarded to Codex:

    lite codex exec "fix the failing test"
    lite codex -m gpt-5.5

Run `lite codex --reset` to ignore saved settings and enter them again.

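
To make the override-then-forward behaviour concrete, here is a minimal sketch of how such an argument list might be assembled. The function name `codex_args` is hypothetical, and appending `/v1` to the saved gateway URL is an assumption; the four override keys are the ones shown in the command above. The shell strips the double quotes in that command, so the sketch passes the bare values.

```rust
/// Build the `-c` overrides for Codex, followed by the user's own arguments.
/// Hypothetical helper: the real wizard may assemble its command differently.
fn codex_args(gateway_url: &str, forwarded: &[String]) -> Vec<String> {
    // Avoid a double slash if the saved URL ends with `/`.
    let base = gateway_url.trim_end_matches('/');
    let overrides = [
        "model_provider=litellm".to_string(),
        format!("model_providers.litellm.base_url={base}/v1"),
        "model_providers.litellm.wire_api=responses".to_string(),
        "model_providers.litellm.env_key=LITELLM_API_KEY".to_string(),
    ];

    let mut args = Vec::new();
    for value in overrides {
        args.push("-c".to_string());
        args.push(value);
    }
    // Everything the user typed after `lite codex` goes last, unchanged.
    args.extend(forwarded.iter().cloned());
    args
}
```

Keeping the overrides on the command line rather than in a file is what lets `~/.codex/config.toml` stay untouched.
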
The gateway needs an OpenAI model route in its config:

    model_list:
      - model_name: openai/*
        litellm_params:
          model: openai/*
          api_key: os.environ/OPENAI_API_KEY
          api_base: https://api.openai.com

Codex Mac app: the desktop app reads `~/.codex/config.toml`, so route it by
adding a provider block there (same fields the wizard passes), then select it in
the app:

    model_provider = "litellm"

    [model_providers.litellm]
    name = "LiteLLM"
    base_url = "https://your-litellm-rust-server.com/v1"
    wire_api = "responses"
    env_key = "LITELLM_API_KEY"

Installing/updating the CLI: run `cargo install --path . --force` so the `lite` on your `PATH` includes the `codex` subcommand (a stale install errors with `unrecognized subcommand 'codex'`).

`litellm-rust` is compatible with your existing LiteLLM `config.yaml` and DB.

    model_list:
      - model_name: anthropic/*
        litellm_params:
          model: anthropic/*
          api_key: os.environ/ANTHROPIC_API_KEY

    general_settings:
      master_key: os.environ/MASTER_KEY
      sandbox_choice: "e2b" # can be either "e2b" or "daytona"
      e2b_sandbox_params:
        e2b_api_key: os.environ/E2B_API_KEY
        e2b_template: "litellm-4gb"

    $ litellm-rust --config /app/config.yaml

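
Values such as `os.environ/MASTER_KEY` in the config above follow the LiteLLM convention of naming an environment variable instead of embedding the secret. The sketch below shows one way that lookup could work; `resolve_secret` is a hypothetical name, and the environment is passed in as a map so the example stays deterministic.

```rust
use std::collections::HashMap;

/// Resolve one config value. A value of the form `os.environ/NAME` is looked
/// up in `env`; any other value is treated as a literal and returned as-is.
/// Hypothetical helper, not the gateway's actual loader.
fn resolve_secret(raw: &str, env: &HashMap<String, String>) -> Result<String, String> {
    match raw.strip_prefix("os.environ/") {
        Some(name) => env
            .get(name)
            .cloned()
            .ok_or_else(|| format!("environment variable {name} is not set")),
        None => Ok(raw.to_string()),
    }
}
```

In a real binary the map would typically come from `std::env::vars().collect()`. Failing at startup on a missing variable tends to be easier to debug than a rejected upstream request later.
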
Endpoints:

- `POST /messages`
- `POST /responses`
- `POST /realtime`
- `POST /audio`

Providers:

- OpenAI
- Azure OpenAI
- VertexAI
- Bedrock

Entry points and what runs at startup:

- `src/main.rs`: binary entry point. Parses CLI args, loads `config.yaml`, builds the HTTP client, calls `model_prices::load()`, then wires everything into `AppState` and starts the server.
- `src/model_prices.rs`: fetches the LiteLLM model cost/capability map from upstream at startup; falls back to the embedded `model_prices_backup.json` snapshot if the network is unavailable. Returns a `ModelCostMap`; `main.rs` stores it on `AppState`. Override the URL with `LITELLM_MODEL_COST_MAP_URL`.
- `src/errors.rs`: typed error enum. All error variants map to HTTP status + JSON body in one place.

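
The fetch-then-fall-back behaviour described for `src/model_prices.rs` can be sketched as follows. This is an illustration under stated assumptions, not the module's code: `ModelCostMap` is reduced to a plain map, the network call is replaced by a closure, and `load_with_fallback`, `Source`, and `cost_map_url` are hypothetical names.

```rust
use std::collections::HashMap;

/// Stand-in for the real cost map type: model name -> cost per input token.
type ModelCostMap = HashMap<String, f64>;

/// Where the map came from, e.g. for a startup log line.
#[derive(Debug, PartialEq)]
enum Source {
    Upstream,
    EmbeddedBackup,
}

/// Pick the URL to fetch: a non-empty override wins, otherwise the default.
fn cost_map_url(override_url: Option<String>, default_url: &str) -> String {
    override_url
        .filter(|u| !u.is_empty())
        .unwrap_or_else(|| default_url.to_string())
}

/// Try the upstream fetch once; on any error use the embedded snapshot.
fn load_with_fallback<F>(fetch: F, backup: &ModelCostMap) -> (ModelCostMap, Source)
where
    F: FnOnce() -> Result<ModelCostMap, String>,
{
    match fetch() {
        Ok(map) => (map, Source::Upstream),
        Err(_) => (backup.clone(), Source::EmbeddedBackup),
    }
}
```

A caller might pass `std::env::var("LITELLM_MODEL_COST_MAP_URL").ok()` as the override. Because the fallback never fails, startup does not depend on network access.
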
Subsystems:

- `src/http/`: HTTP layer only. Route registration, auth, body extraction, response shaping. No business logic.
- `src/providers/`: provider registry, per-provider request/response transformation, model router (maps model name → deployment + handler).
- `src/proxy/`: config loading, master-key auth, `AppState`.
- `src/cli/`: the `lite claude` wizard (credential storage, model selector, Claude Code launcher).

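
To illustrate what "maps model name → deployment" can mean for wildcard routes such as `openai/*`, here is a small sketch. The `Route` struct, the `resolve` function, and the exact matching rules (exact match first, then a trailing `*` as a prefix wildcard) are assumptions for the example; the actual router in `src/providers/` also selects a handler and may match differently.

```rust
/// One `model_list` entry reduced to the two fields used for routing.
/// Hypothetical type for this sketch.
struct Route {
    model_name: String, // e.g. "openai/*"
    target: String,     // `litellm_params.model`, e.g. "openai/*"
}

/// Return the upstream model for `requested`, or None if no route matches.
/// Exact names win; otherwise a `model_name` ending in `*` matches any model
/// sharing its prefix, and the `*` in `target` is replaced by the remainder.
fn resolve(routes: &[Route], requested: &str) -> Option<String> {
    if let Some(route) = routes.iter().find(|r| r.model_name == requested) {
        return Some(route.target.clone());
    }
    for route in routes {
        if let Some(prefix) = route.model_name.strip_suffix('*') {
            if let Some(rest) = requested.strip_prefix(prefix) {
                return Some(route.target.replacen('*', rest, 1));
            }
        }
    }
    None
}
```

With the `openai/*` and `anthropic/*` routes from the configs above, a request for `openai/gpt-5.5` would resolve to the same name on the OpenAI route, while a model with no matching route yields `None`.
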
See CODING_STANDARDS.md.
