Route Claude, GPT, Gemini, DeepSeek, Ollama, vLLM and 40+ providers through a single fast endpoint. Bring your own subscription or API key, plug it into Claude Code, Cursor, Codex, Cline and any OpenAI-compatible tool.
A lightweight, self-hosted LiteLLM / OpenRouter alternative — one Go binary, no Docker, no Python, no database server.
Features · Quick Start · Supported Tools · Providers · API · Deploy
Modern AI workflows are fragmented. Every CLI, IDE and agent speaks a slightly different API — OpenAI, Anthropic, or Gemini — and every provider bills differently. Switching models means editing config in a dozen places, and your paid subscriptions sit idle while you burn API credits.
Flow Router fixes that with one local endpoint:
- 🔌 One endpoint, every model. Point any OpenAI-compatible tool at http://127.0.0.1:2402/v1 and reach Claude, GPT, Gemini, DeepSeek, Groq, local models, anything.
- 🔑 Use what you already pay for. Drive Claude Code / Cursor through your existing Claude Pro/Max subscription, with no extra API key required.
- 🔁 Automatic fallback. Define a priority chain (subscription → cheap → free) so requests never fail when one provider is rate-limited.
- 🖥️ Single Go binary. No runtime, no database server. Portable across Linux, macOS and Windows. Built-in web dashboard.
- 🛡️ Private by default. Everything runs on your machine. Your keys are stored locally, and requests leave your machine only to reach the providers you configure or through a tunnel you choose to open.
┌──────────────────────────────────────────────────────────────┐
│  Your tools: Claude Code · Cursor · Codex · Cline · Kilo …   │
└───────────────────────────┬──────────────────────────────────┘
                            │  OpenAI / Anthropic / Gemini API
                            ▼
               ┌─────────────────────────┐
               │       FLOW ROUTER       │  http://127.0.0.1:2402
               │ • format translation    │
               │ • priority fallback     │
               │ • combos & load balance │
               │ • usage + quota tracking│
               └────────────┬────────────┘
      ┌────────────┬────────┼────────┬─────────────┐
      ▼            ▼        ▼        ▼             ▼
Claude (sub)    OpenAI   Gemini  DeepSeek  Local Llama / Qwen
When a request comes in (in any of the three major API dialects), Flow Router finds a provider that serves the requested model, translates the format if needed, forwards it with the right auth, then translates the response back. Usage, cost and latency are logged for every call.
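As a rough illustration of that lifecycle, the sketch below selects the providers that serve the requested model and tries them in priority order, moving on when one fails. It is not Flow Router's actual code: the Provider class, its fields, the send callable and ProviderError are hypothetical names, and format translation and logging are left out.

```python
from dataclasses import dataclass
from typing import Callable


class ProviderError(Exception):
    """Hypothetical error raised on an upstream failure or rate limit."""


@dataclass
class Provider:
    name: str
    priority: int                 # lower number = tried first
    models: frozenset             # model names this provider serves
    send: Callable[[dict], dict]  # forwards a request, returns the response


def route(request: dict, providers: list) -> dict:
    """Try every provider that serves the model, in priority order."""
    candidates = sorted(
        (p for p in providers if request["model"] in p.models),
        key=lambda p: p.priority,
    )
    if not candidates:
        raise LookupError(f"no provider serves {request['model']!r}")

    errors = []
    for provider in candidates:
        try:
            response = provider.send(request)
        except ProviderError as exc:
            # Remember the failure and fall through to the next provider.
            errors.append(f"{provider.name}: {exc}")
            continue
        return {"provider": provider.name, "response": response}
    raise ProviderError("all providers failed: " + "; ".join(errors))
```

With a chain such as subscription (priority 1) followed by a cheap API key (priority 2), a rate-limit error from the first provider simply results in the second one answering the request.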
| Feature | What it does |
|---|---|
| 🔌 Universal endpoint | OpenAI /v1/chat/completions + streaming, Anthropic /v1/messages, OpenAI /v1/responses, Gemini /v1beta/models — all served at once |
| 🔄 Format translation | Transparent OpenAI ⇄ Anthropic ⇄ Gemini conversion, including streaming SSE |
| 🛠️ Tool calling | Full tool_calls ⇄ tool_use conversion so agentic tools work across providers |
| 🔁 Smart fallback | Priority-ordered providers; auto-retry the next one on error or rate limit |
| 🧩 Combos | Group models into one alias with priority / round-robin / random / cost-optimal strategies |
| 🔑 Subscription auth | Use a Claude Pro/Max session (no API key) for Claude Code and friends |
| 🪪 OAuth & key import | Connect Codex, Cursor, GitLab, iFlow, Kiro, Claude — or paste a token directly |
| ⌨️ CLI auto-config | Detect and configure 13 popular AI CLIs/extensions in one click |
| 📊 Usage analytics | Per-day charts, per-provider breakdown, live request stream, cost estimates |
| ⏱️ Quota tracking | Track subscription/API limits per provider, daily/weekly/monthly |
| 🛡️ MITM inspector | Capture, inspect and replay full request/response bodies for debugging |
| 🚇 Tunnels | Expose your router securely via Cloudflare Tunnel or Tailscale |
| 🌐 Edge proxy deploy | Generate ready-to-ship proxy workers for Cloudflare, Deno Deploy or Vercel |
| 🎬 Media providers | Route embeddings, text-to-image, TTS, STT and web search to dedicated backends |
| 🧠 MCP registry | Register Model Context Protocol servers and list their tools live |
| 🏷️ Tags & pricing | Organize providers with tags; maintain per-model rate cards |
| 🔐 Optional login | Password or OIDC (OpenID Connect) with opt-in session enforcement |
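To make the tool-calling row concrete, here is a minimal sketch of converting an assistant message between OpenAI tool_calls and Anthropic tool_use blocks. The two message shapes follow the public OpenAI and Anthropic formats; the function names are invented for this example, tool-call ids are passed through unchanged as a simplification, and Flow Router's own translation layer presumably covers more cases (streaming, tool results, Gemini).

```python
import json


def openai_to_anthropic(message: dict) -> dict:
    """Convert an OpenAI assistant message into Anthropic content blocks."""
    blocks = []
    if message.get("content"):
        blocks.append({"type": "text", "text": message["content"]})
    for call in message.get("tool_calls") or []:
        blocks.append({
            "type": "tool_use",
            "id": call["id"],
            "name": call["function"]["name"],
            # OpenAI carries arguments as a JSON string; Anthropic wants an object.
            "input": json.loads(call["function"]["arguments"] or "{}"),
        })
    return {"role": "assistant", "content": blocks}


def anthropic_to_openai(message: dict) -> dict:
    """Convert Anthropic content blocks back into an OpenAI assistant message."""
    texts, calls = [], []
    for block in message["content"]:
        if block["type"] == "text":
            texts.append(block["text"])
        elif block["type"] == "tool_use":
            calls.append({
                "id": block["id"],
                "type": "function",
                "function": {
                    "name": block["name"],
                    "arguments": json.dumps(block["input"]),
                },
            })
    out = {"role": "assistant", "content": "".join(texts) or None}
    if calls:
        out["tool_calls"] = calls
    return out
```

The main structural difference is where the arguments live: a JSON-encoded string under function.arguments on the OpenAI side, and a plain object under input on the Anthropic side.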
git clone https://github.com/flowork-os/Flow_Router.git
cd Flow_Router
go build -o flow-router ./...
./flow-router   # listens on http://127.0.0.1:2402

Requires Go 1.25+. The result is a single self-contained binary; copy it anywhere.
Set your tool's base URL to:
http://127.0.0.1:2402/v1
Open the dashboard at http://127.0.0.1:2402 to add providers, create combos, and watch usage in real time.
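For a script rather than an off-the-shelf tool, using that base URL amounts to sending an OpenAI-style chat request to it. The sketch below uses only the Python standard library; the helper names are made up for illustration, and it assumes the router is running locally with at least one provider configured and no login enforced.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:2402/v1"  # Flow Router's local endpoint


def build_chat_request(model: str, prompt: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request aimed at the router."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response (needs a running router)."""
    with urllib.request.urlopen(request, timeout=60) as resp:
        return json.load(resp)
```

Calling send(build_chat_request("claude-haiku-4-5", "Hello!")) should return an OpenAI-format response, with the reply text under choices[0].message.content, provided a configured provider serves that model.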
- Open the dashboard → Providers → add a provider (API key or subscription).
- (Optional) Create a Combo to alias several models behind one name.
- Send a request:
curl http://127.0.0.1:2402/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "claude-haiku-4-5",
"messages": [{"role": "user", "content": "Hello!"}]
}'

Auto-detect and one-click configure these AI CLIs and editor extensions:
Claude Code · Codex · Cline · Copilot · Cowork · DeepSeek TUI · Droid · Hermes · JCode · Kilo · OpenClaw · OpenCode · Antigravity
Each integration writes the correct config (JSON / TOML / .env) pointing at Flow Router, and can be reset just as easily.
Flow Router speaks three API dialects, so it works with essentially any modern LLM provider:
- Subscription / OAuth: Claude (Pro/Max), Codex, Cursor, GitLab, iFlow, Kiro
- API-key cloud: OpenAI, Anthropic, Google Gemini, DeepSeek, Groq, Together, Mistral, OpenRouter and any OpenAI-compatible endpoint
- Local: llama.cpp / llama-server, Ollama, LM Studio, vLLM, or any OpenAI-compatible local server
Add models freely — there is no hardcoded allow-list. If a provider exposes a /models endpoint, Flow Router can discover and validate it.
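Discovery of this kind typically means reading the provider's model list and checking a requested name against it. The sketch below assumes the OpenAI-style response shape ({"object": "list", "data": [{"id": ...}]}); the function names are hypothetical, and how Flow Router actually performs its validation is not documented here.

```python
import json


def model_ids(models_response: str) -> list:
    """Extract sorted model ids from an OpenAI-style /models JSON body."""
    body = json.loads(models_response)
    return sorted(item["id"] for item in body.get("data", []))


def is_known_model(model: str, models_response: str) -> bool:
    """Check whether a model name appears in the provider's model list."""
    return model in model_ids(models_response)
```

A provider that does not expose /models cannot be checked this way, which is consistent with models being addable without an allow-list.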
Flow Router exposes a multi-dialect surface so any client just works:
| Endpoint | Dialect |
|---|---|
| POST /v1/chat/completions | OpenAI (streaming supported) |
| POST /v1/messages | Anthropic Messages |
| POST /v1/responses | OpenAI Responses |
| GET /v1/models | OpenAI model list |
| GET /v1beta/models · POST /v1beta/models/{model}:generateContent | Gemini |
| POST /v1/embeddings · /v1/images · /v1/audio · /v1/search | Media routing |
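One way to picture the table is as a lookup from method and path to a dialect. The sketch below is only an illustration: the dialect labels are invented, and treating the media paths as prefixes (so that sub-paths also match) is an assumption rather than documented behavior.

```python
from typing import Optional

# Exact method/path pairs taken from the endpoint table.
EXACT_ROUTES = {
    ("POST", "/v1/chat/completions"): "openai-chat",
    ("POST", "/v1/messages"): "anthropic-messages",
    ("POST", "/v1/responses"): "openai-responses",
    ("GET", "/v1/models"): "openai-models",
    ("GET", "/v1beta/models"): "gemini",
}

# Assumption: media endpoints are matched by prefix.
MEDIA_PREFIXES = ("/v1/embeddings", "/v1/images", "/v1/audio", "/v1/search")


def detect_dialect(method: str, path: str) -> Optional[str]:
    """Return a dialect label for a request, or None if the path is unknown."""
    method = method.upper()
    path = path.split("?", 1)[0]  # ignore any query string
    if (method, path) in EXACT_ROUTES:
        return EXACT_ROUTES[(method, path)]
    if (
        method == "POST"
        and path.startswith("/v1beta/models/")
        and path.endswith(":generateContent")
    ):
        return "gemini"
    if method == "POST" and path.startswith(MEDIA_PREFIXES):
        return "media"
    return None
```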
Streaming example:
curl -N http://127.0.0.1:2402/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model":"claude-haiku-4-5","stream":true,
"messages":[{"role":"user","content":"Count to 3"}]}'

Management APIs live under /api/* (providers, combos, usage, tunnel, oauth, mcp, tags, pricing, cli-tools) and power the dashboard.
Flow Router is a single binary, so deployment is trivial:
# Run on a VPS, bind to all interfaces behind your own TLS/reverse proxy
./flow-router -addr 0.0.0.0:2402

For remote access without opening ports, use the built-in Tunnel panel:
- Cloudflare Tunnel: instant public *.trycloudflare.com URL
- Tailscale: private mesh access over your tailnet
Need an edge proxy? The Proxy Pools panel generates ready-to-deploy worker scripts for Cloudflare Workers, Deno Deploy and Vercel Edge.
- Language: Go 1.25 (single static binary, no CGO required for core)
- Storage: embedded SQLite (~/.flow_router/db/data.sqlite)
- UI: embedded single-page dashboard (no build step required to run)
- Footprint: ~16 MB binary, low memory — runs comfortably on a Raspberry Pi or mini PC
- Multi-dialect endpoint (OpenAI / Anthropic / Gemini) + streaming
- Tool calling translation
- CLI auto-config, OAuth flows, tunnels, MITM, usage analytics
- MCP registry with live tool discovery
- Streaming tool-use rounds
- Per-intent multiplexing (local model for private prompts, cloud for the rest)
- Cross-device sync (pull-based config sync between instances)
Is Flow Router a LiteLLM or OpenRouter alternative? Yes. It's a self-hosted, open-source LLM gateway that gives you one OpenAI-compatible endpoint for every provider — like LiteLLM or OpenRouter, but as a single Go binary (no Python, no Docker, no managed service) with a built-in dashboard.
How do I self-host an OpenAI-compatible API proxy? Build or download the binary, run ./flow-router, add a provider key in the dashboard, and point any tool at http://127.0.0.1:2402/v1. That's the whole setup; see Quick Start.
Which LLM providers does it support? OpenAI (GPT), Anthropic (Claude), Google Gemini, DeepSeek, Groq, OpenRouter, Mistral, Qwen, and any OpenAI-compatible endpoint — plus local models via Ollama, vLLM and llama.cpp.
Does it work with Claude Code, Cursor and Codex? Yes. Auto-config covers Claude Code, Cursor, Codex, Cline, OpenCode and more, and you can drive them through your existing Claude Pro/Max subscription instead of paying per-token API fees.
Does it support streaming, tool calling and MCP? Yes — streaming SSE, OpenAI ⇄ Anthropic ⇄ Gemini tool-call translation (incl. streaming tool-use rounds), and an MCP registry with live tool discovery.
Is my data private? Flow Router runs on localhost and keys are encrypted at rest. Requests leave your machine only to reach the providers you configure, or through a tunnel you opt into.
Contributions are welcome! Open an issue to discuss a feature or bug, or submit a pull request. Please keep changes focused and include a clear description.
Released under the MIT License. Free to use, modify and self-host.
Flow Router — your AI traffic, your rules, your machine.
⭐ Star this repo if it saves you time or money.