API Reference
🌐 Languages: 🇺🇸 English | 🇧🇷 Português (Brasil) | 🇪🇸 Español | 🇫🇷 Français | 🇮🇹 Italiano | 🇷🇺 Русский | 🇨🇳 中文 (简体) | 🇩🇪 Deutsch | 🇮🇳 हिन्दी | 🇹🇭 ไทย | 🇺🇦 Українська | 🇸🇦 العربية | 🇯🇵 日本語 | 🇻🇳 Tiếng Việt | 🇧🇬 Български | 🇩🇰 Dansk | 🇫🇮 Suomi | 🇮🇱 עברית | 🇭🇺 Magyar | 🇮🇩 Bahasa Indonesia | 🇰🇷 한국어 | 🇲🇾 Bahasa Melayu | 🇳🇱 Nederlands | 🇳🇴 Norsk | 🇵🇹 Português (Portugal) | 🇷🇴 Română | 🇵🇱 Polski | 🇸🇰 Slovenčina | 🇸🇪 Svenska | 🇵🇭 Filipino | 🇨🇿 Čeština
Complete reference for all OmniRoute API endpoints.
- Chat Completions
- Embeddings
- Image Generation
- List Models
- Compatibility Endpoints
- Files API
- Batches API
- Search API
- WebSocket Streaming
- Quotas & Issues Reporting
- Semantic Cache
- Dashboard & Management
- Combo Management
- Webhooks
- Registered Keys (Auto-Management)
- Agents Protocol
- Management Proxies
- Resilience (extended)
- Skills
- Memory
- MCP Server
- A2A Server
- Cloud, Evals & Assess
- Request Processing
- Authentication
POST /v1/chat/completions
Authorization: Bearer your-api-key
Content-Type: application/json
{
"model": "cc/claude-opus-4-6",
"messages": [
{"role": "user", "content": "Write a function to..."}
],
"stream": true
}

| Header | Direction | Description |
|---|---|---|
| `X-OmniRoute-No-Cache` | Request | Set to `true` to bypass cache |
| `X-OmniRoute-Progress` | Request | Set to `true` for progress events |
| `X-Session-Id` | Request | Sticky session key for external session affinity |
| `x_session_id` | Request | Underscore variant also accepted (direct HTTP) |
| `Idempotency-Key` | Request | Dedup key (5s window) |
| `X-Request-Id` | Request | Alternative dedup key |
| `X-OmniRoute-Cache` | Response | `HIT` or `MISS` (non-streaming) |
| `X-OmniRoute-Idempotent` | Response | `true` if deduplicated |
| `X-OmniRoute-Progress` | Response | `enabled` if progress tracking on |
| `X-OmniRoute-Session-Id` | Response | Effective session ID used by OmniRoute |

Nginx note: if you rely on underscore headers (for example `x_session_id`), enable `underscores_in_headers on;`.
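The request headers above can be assembled client-side before the call is sent. The following is a minimal sketch, not OmniRoute client code: `build_chat_request` is a hypothetical helper, and generating a UUID for `Idempotency-Key` is an assumption about what makes a reasonable dedup key.

```python
import json
import uuid


def build_chat_request(api_key, model, prompt, *, session_id=None,
                       no_cache=False, idempotency_key=None):
    """Return (headers, body) for POST /v1/chat/completions.

    Hypothetical helper: only the header names come from the table above.
    """
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    if no_cache:
        # Ask OmniRoute to bypass its cache for this request.
        headers["X-OmniRoute-No-Cache"] = "true"
    if session_id is not None:
        # Sticky session key for external session affinity.
        headers["X-Session-Id"] = session_id
    # Assumption: a random UUID is used when the caller supplies no key.
    # Reuse the same key when retrying so the retry can be deduplicated.
    headers["Idempotency-Key"] = idempotency_key or str(uuid.uuid4())
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }).encode("utf-8")
    return headers, body
```

The returned pair can be passed to any HTTP client, for example `urllib.request.Request(url, data=body, headers=headers, method="POST")`.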
POST /v1/embeddings
Authorization: Bearer your-api-key
Content-Type: application/json
{
"model": "nebius/Qwen/Qwen3-Embedding-8B",
"input": "The food was delicious"
}

Available providers: Nebius, OpenAI, Mistral, Together AI, Fireworks, NVIDIA, OpenRouter, GitHub Models.

# List all embedding models
GET /v1/embeddings

POST /v1/images/generations
Authorization: Bearer your-api-key
Content-Type: application/json
{
"model": "openai/gpt-image-2",
"prompt": "A beautiful sunset over mountains",
"size": "1024x1024"
}

Available providers: OpenAI (GPT Image 2), xAI (Grok Image), Together AI (FLUX), Fireworks AI, Nebius (FLUX), Hyperbolic, NanoBanana, OpenRouter, SD WebUI (local), ComfyUI (local).

# List all image models
GET /v1/images/generations

GET /v1/models
Authorization: Bearer your-api-key
→ Returns all chat, embedding, and image models + combos in OpenAI format

| Method | Path | Format |
|---|---|---|
| POST | `/v1/chat/completions` | OpenAI |
| POST | `/v1/messages` | Anthropic |
| POST | `/v1/responses` | OpenAI Responses |
| POST | `/v1/embeddings` | OpenAI |
| POST | `/v1/images/generations` | OpenAI Images |
| POST | `/v1/images/edits` | OpenAI Images (edit/inpaint) |
| POST | `/v1/videos/generations` | OpenAI-style video generation |
| POST | `/v1/music/generations` | OpenAI-style music generation |
| POST | `/v1/audio/transcriptions` | OpenAI Audio (STT) |
| POST | `/v1/audio/speech` | OpenAI TTS (returns audio body) |
| POST | `/v1/rerank` | Cohere/Voyage-style rerank |
| POST | `/v1/moderations` | OpenAI Moderations |
| GET | `/v1/models` | OpenAI |
| POST | `/v1/messages/count_tokens` | Anthropic |
| GET | `/v1beta/models` | Gemini |
| POST | `/v1beta/models/{...path}` | Gemini generateContent |
| POST | `/v1/api/chat` | Ollama |
| GET | `/api/v1/vscode/{token}/` | OpenAI catalog alias |
| GET | `/api/v1/vscode/{token}/models` | OpenAI models alias |
| POST | `/api/v1/vscode/{token}/chat/completions` | OpenAI tokenized alias |
| POST | `/api/v1/vscode/{token}/responses` | OpenAI Responses tokenized alias |
| POST | `/api/v1/vscode/{token}/api/chat` | Ollama tokenized alias |
| GET | `/api/v1/vscode/{token}/api/tags` | Ollama tags tokenized alias |

All POST routes follow the same shape: Bearer your-api-key + Zod-validated JSON body (v1RerankSchema, v1ModerationSchema, v1AudioSpeechSchema, etc., see src/shared/validation/schemas.ts). 4xx is returned on schema failure.
For clients that cannot attach Authorization: Bearer ..., OmniRoute also accepts API keys in the URL via either query-string compatibility (?token=..., ?apiKey=..., ?api_key=..., ?key=...) or the dedicated /api/v1/vscode/{token}/... endpoints documented below.
# Rerank
POST /v1/rerank { "model": "cohere/rerank-3", "query": "...", "documents": ["..."] }
# Moderations
POST /v1/moderations { "model": "omni-moderation-latest", "input": "..." }
# TTS — returns audio/mpeg (or requested format) body
POST /v1/audio/speech { "model": "openai/tts-1", "input": "Hello", "voice": "alloy" }
# Image edit (multipart)
POST /v1/images/edits -F image=@input.png -F prompt="..." -F mask=@mask.png
# Video / music generation (provider-prefixed model id)
POST /v1/videos/generations { "model": "runway/gen-3", "prompt": "..." }
POST /v1/music/generations { "model": "suno/v3.5", "prompt": "..." }

POST /v1/providers/{provider}/chat/completions
POST /v1/providers/{provider}/embeddings
POST /v1/providers/{provider}/images/generations

The provider prefix is auto-added if missing. Mismatched models return 400.

OpenAI-compatible files endpoint for batch input/output and file-purpose uploads.

| Method | Path | Description |
|---|---|---|
| POST | `/v1/files` | Upload a file (multipart: `file`, `purpose`, `expires_after[anchor]`, `expires_after[seconds]`); 512 MiB max |
| GET | `/v1/files` | List files for the authenticated API key |
| GET | `/v1/files/[id]` | Retrieve a file's metadata |
| DELETE | `/v1/files/[id]` | Delete a file |
| GET | `/v1/files/[id]/content` | Stream the raw file body back |

Auth: Bearer API key — files are scoped per-API-key via getApiKeyRequestScope.
OpenAI-compatible batch processing.

| Method | Path | Description |
|---|---|---|
| POST | `/v1/batches` | Create batch; body validated by `v1BatchCreateSchema` (`input_file_id`, `endpoint`, `completion_window`) |
| GET | `/v1/batches` | List batches |
| GET | `/v1/batches/[id]` | Retrieve batch status + `request_counts` |
| DELETE | `/v1/batches/[id]` | Delete a finished/failed batch |
| POST | `/v1/batches/[id]/cancel` | Cancel an in-progress batch |

Auth: Bearer API key. Batches are scoped per-API-key.
Web/search provider abstraction (Tavily, Brave, Exa, Serper, etc.).

| Method | Path | Description |
|---|---|---|
| GET | `/v1/search` | List configured search providers + capabilities |
| POST | `/v1/search` | Run a search query; body validated by `v1SearchSchema`, supports caching/coalescing |
| GET | `/v1/search/analytics` | Per-provider hit/latency/cache stats |

Auth: Bearer API key (extractApiKey + isValidApiKey). Search policy enforced via enforceApiKeyPolicy.
GET /v1/ws?handshake=1

Validates a WebSocket upgrade handshake and returns the wire protocol example messages (request, cancel). Actual WS frames are handled by the bundled WS server outside the Next.js route table.
Auth: Bearer API key during handshake.
# Same host:port as the HTTP API (default 20128); upgrade the connection:
wscat -c "ws://localhost:20128/v1/responses?api_key=<OMNIROUTE_API_KEY>"
# (or: -H "Authorization: Bearer <OMNIROUTE_API_KEY>")
# First frame MUST be response.create:
{ "type": "response.create", "model": "gpt-5.5", "input": [ { "role": "user", "content": "hi" } ] }

A Responses-API-over-WebSocket proxy is wired exclusively to codex (ChatGPT
backend). It listens on the same port as the API/dashboard at paths /v1/responses,
/responses, and /api/v1/responses. On the first response.create frame it
authenticates + prepares via the internal codex-responses-ws bridge, selects a
codex OAuth connection, and tunnels to wss://chatgpt.com/backend-api/codex/responses
via the wreq-js transport. Non-codex models are rejected (codex_ws_provider_required).
For quota-share routing use model: "qtSd/<group>/codex/<model>". Implemented in
app/server-ws.mjs + scripts/dev/responses-ws-proxy.mjs + src/app/api/internal/codex-responses-ws/route.ts.
Auth: Bearer API key during handshake. The bundled HTTP server (server-ws.mjs)
must be the active entrypoint (it is, by default, when app/server-ws.mjs exists).
The OpenAI Codex CLI validates the model name client-side when
supports_websockets = true and rejects provider-prefixed ids like
codex/gpt-5.5 (The 'codex/gpt-5.5' model is not supported when using Codex with a ChatGPT account). Send the bare id (e.g. gpt-5.5). OmniRoute's bridge is
codex-only, so it re-resolves a bare id as a codex model
(resolveCodexWsModelInfo) before tunneling upstream — even though a bare
gpt-5.5 would otherwise route to another provider over HTTP.
Point the Codex CLI at OmniRoute by adding a custom provider with WebSocket
support to ~/.codex/config.toml (use a separate CODEX_HOME to avoid touching
an existing config):
model = "gpt-5.5" # bare id — NOT "codex/gpt-5.5"
model_provider = "omniroute"
[model_providers.omniroute]
name = "OmniRoute (WS)"
base_url = "http://localhost:20128/v1" # no trailing slash; the WS URL is derived (use https/wss in production)
wire_api = "responses" # only supported value since Feb 2026
supports_websockets = true # enables the Responses-over-WS transport
env_key = "OMNIROUTE_API_KEY" # holds the OmniRoute API key (Bearer)

export OMNIROUTE_API_KEY=sk-... # an OmniRoute API key (any key if REQUIRE_API_KEY=false)
codex exec "Reply only: PONG"

The CLI upgrades base_url + /responses to a WebSocket and OmniRoute tunnels it
to the selected codex OAuth connection. Validated end-to-end against the local
server: ChatGPT returns codex.rate_limits + response.created and streams the
completion.
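The URL derivation and first frame described above can be sketched as follows. This is an illustration only: `derive_ws_url` and `first_frame` are hypothetical helpers, and the exact way the Codex CLI derives the WebSocket URL is assumed from the statement that it upgrades `base_url` + `/responses`.

```python
from urllib.parse import urlsplit, urlunsplit


def derive_ws_url(base_url: str) -> str:
    """Map an HTTP(S) base_url to the assumed Responses WebSocket URL."""
    parts = urlsplit(base_url)
    scheme = {"http": "ws", "https": "wss"}.get(parts.scheme)
    if scheme is None:
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    # Tolerate a trailing slash even though the config asks for none.
    path = parts.path.rstrip("/") + "/responses"
    return urlunsplit((scheme, parts.netloc, path, parts.query, ""))


def first_frame(model: str, text: str) -> dict:
    """Build the response.create frame that must be sent first."""
    return {
        "type": "response.create",
        "model": model,  # bare id such as "gpt-5.5" when driven by the Codex CLI
        "input": [{"role": "user", "content": text}],
    }
```

For the config above, `derive_ws_url("http://localhost:20128/v1")` yields `ws://localhost:20128/v1/responses`, which matches the `wscat` example earlier in this section.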

| Method | Path | Description |
|---|---|---|
| GET | `/v1/quotas/check` | Pre-validate quota for a provider + accountId before issuing a registered key |
| POST | `/v1/issues/report` | Report a quota/key issuance failure to GitHub (requires `GITHUB_ISSUES_REPO` + token) |

Auth: Bearer API key (isAuthenticated).
# Get cache stats
GET /api/cache/stats
# Clear all caches
DELETE /api/cache/stats

Response example:
{
"semanticCache": {
"memorySize": 42,
"memoryMaxSize": 500,
"dbSize": 128,
"hitRate": 0.65
},
"idempotency": {
"activeKeys": 3,
"windowMs": 5000
}
}

| Endpoint | Method | Description |
|---|---|---|
| `/api/auth/login` | POST | Login |
| `/api/auth/logout` | POST | Logout |
| `/api/settings/require-login` | GET/PUT | Toggle login required |


| Endpoint | Method | Description |
|---|---|---|
| `/api/providers` | GET/POST | List / create providers |
| `/api/providers/[id]` | GET/PUT/DELETE | Manage a provider |
| `/api/providers/[id]/test` | POST | Test provider connection |
| `/api/providers/[id]/models` | GET | List provider models |
| `/api/providers/validate` | POST | Validate provider config |
| `/api/provider-nodes*` | Various | Provider node management |
| `/api/provider-models` | GET/POST/PATCH/DELETE | Custom models (add, update, hide/show, delete) |


| Endpoint | Method | Description |
|---|---|---|
| `/api/oauth/[provider]/[action]` | Various | Provider-specific OAuth |


| Endpoint | Method | Description |
|---|---|---|
| `/api/models/alias` | GET/POST | Model aliases |
| `/api/models/catalog` | GET | All models by provider + type |
| `/api/combos*` | Various | Combo management |
| `/api/keys*` | Various | API key management |
| `/api/pricing` | GET | Model pricing |


| Endpoint | Method | Description |
|---|---|---|
| `/api/usage/history` | GET | Usage history |
| `/api/usage/logs` | GET | Usage logs |
| `/api/usage/request-logs` | GET | Request-level logs |
| `/api/usage/[connectionId]` | GET | Per-connection usage |
| `/api/usage/token-limits` | GET/POST/DELETE | Per-API-key token-limit budgets |


| Endpoint | Method | Description |
|---|---|---|
| `/api/settings` | GET/PUT/PATCH | General settings |
| `/api/settings/proxy` | GET/PUT | Network proxy config |
| `/api/settings/proxy/test` | POST | Test proxy connection |
| `/api/settings/ip-filter` | GET/PUT | IP allowlist/blocklist |
| `/api/settings/thinking-budget` | GET/PUT | Reasoning token budget |
| `/api/settings/system-prompt` | GET/PUT | Global system prompt |
| `/api/settings/compression` | GET/PUT | Global compression config |
| `/api/settings/purge-request-history` | POST | Clear request log rows and local call-log artifacts |


| Endpoint | Method | Description |
|---|---|---|
| `/api/compression/preview` | POST | Preview off/lite/standard/aggressive/ultra/RTK/stacked compression |
| `/api/compression/language-packs` | GET | List available Caveman language packs |
| `/api/compression/rules` | GET | List Caveman rule metadata |
| `/api/context/caveman/config` | GET/PUT | Caveman-specific settings alias |
| `/api/context/rtk/config` | GET/PUT | RTK-specific settings, including custom filters and raw-output retention |
| `/api/context/rtk/filters` | GET | RTK filter catalog and custom-filter diagnostics |
| `/api/context/rtk/test` | POST | Run RTK preview/test against a text payload |
| `/api/context/rtk/raw-output/[id]` | GET | Read retained redacted raw output by pointer id |
| `/api/context/combos` | GET/POST | Compression combo list/create |
| `/api/context/combos/[id]` | GET/PUT/DELETE | Compression combo detail/update/delete |
| `/api/context/combos/[id]/assignments` | GET/PUT | Assign compression combos to routing combos |
| `/api/context/analytics` | GET | Compression analytics alias |


| Endpoint | Method | Description |
|---|---|---|
| `/api/sessions` | GET | Active session tracking |
| `/api/rate-limits` | GET | Per-account rate limits |
| `/api/monitoring/health` | GET | Health check + provider summary (`catalogCount`, `configuredCount`, `activeCount`, `monitoredCount`) |
| `/api/cache/stats` | GET/DELETE | Cache stats / clear |


| Endpoint | Method | Description |
|---|---|---|
| `/api/db-backups` | GET | List available backups |
| `/api/db-backups` | PUT | Create a manual backup |
| `/api/db-backups` | POST | Restore from a specific backup |
| `/api/db-backups/export` | GET | Download database as .sqlite file |
| `/api/db-backups/import` | POST | Upload .sqlite file to replace database |
| `/api/db-backups/exportAll` | GET | Download full backup as .tar.gz archive |


| Endpoint | Method | Description |
|---|---|---|
| `/api/sync/cloud` | Various | Cloud sync operations |
| `/api/sync/initialize` | POST | Initialize sync |
| `/api/cloud/*` | Various | Cloud management |


| Endpoint | Method | Description |
|---|---|---|
| `/api/tunnels/cloudflared` | GET | Read Cloudflare Quick Tunnel install/runtime status for the dashboard |
| `/api/tunnels/cloudflared` | POST | Enable or disable the Cloudflare Quick Tunnel (`action=enable/disable`) |
| `/api/tunnels/ngrok` | GET | Read ngrok Tunnel runtime status for the dashboard |
| `/api/tunnels/ngrok` | POST | Enable or disable the ngrok Tunnel (`action=enable/disable`) |


| Endpoint | Method | Description |
|---|---|---|
| `/api/cli-tools/claude-settings` | GET | Claude CLI status |
| `/api/cli-tools/codex-settings` | GET | Codex CLI status |
| `/api/cli-tools/droid-settings` | GET | Droid CLI status |
| `/api/cli-tools/openclaw-settings` | GET | OpenClaw CLI status |
| `/api/cli-tools/runtime/[toolId]` | GET | Generic CLI runtime |

CLI responses include: installed, runnable, command, commandPath, runtimeMode, reason.

| Endpoint | Method | Description |
|---|---|---|
| `/api/acp/agents` | GET | List all detected agents (built-in + custom) with status |
| `/api/acp/agents` | POST | Add custom agent or refresh detection cache |
| `/api/acp/agents` | DELETE | Remove a custom agent by `id` query param |

GET response includes agents[] (id, name, binary, version, installed, protocol, isCustom) and summary (total, installed, notFound, builtIn, custom).

| Endpoint | Method | Description |
|---|---|---|
| `/api/resilience` | GET/PATCH | Get/update request queue, connection cooldown, provider breaker, and wait settings |
| `/api/resilience/reset` | POST | Reset provider circuit breakers |
| `/api/resilience/model-cooldowns` | GET | List active per-(provider, connection, model) lockouts, sorted by remaining time |
| `/api/resilience/model-cooldowns` | DELETE | Clear a model lockout; body `{provider, model}`, or `{all: true}` to wipe everything |
| `/api/rate-limits` | GET | Per-account rate limit status |
| `/api/rate-limit` | GET | Global rate limit configuration |

All four `/api/resilience/*` routes require management auth (`requireManagementAuth`). See Resilience (extended) for a full breakdown of provider breaker vs connection cooldown vs model lockout.


| Endpoint | Method | Description |
|---|---|---|
| `/api/evals` | GET/POST | List eval suites / run evaluation |

| Endpoint | Method | Description |
|---|---|---|
| `/api/policies` | GET/POST/DELETE | Manage routing policies |

| Endpoint | Method | Description |
|---|---|---|
| `/api/compliance/audit-log` | GET | Compliance audit log (last N) |

| Endpoint | Method | Description |
|---|---|---|
| `/v1beta/models` | GET | List models in Gemini format |
| `/v1beta/models/{...path}` | POST | Gemini generateContent endpoint |

These endpoints mirror Gemini's API format for clients that expect native Gemini SDK compatibility.

| Endpoint | Method | Description |
|---|---|---|
| `/api/init` | GET | Application initialization check (used on first run) |
| `/api/tags` | GET | Ollama-compatible model tags (for Ollama clients) |
| `/api/restart` | POST | Trigger graceful server restart |
| `/api/shutdown` | POST | Trigger graceful server shutdown |
| `/api/system/env/repair` | POST | Repair OAuth provider environment variables |

Note: These endpoints are used internally by the system or for Ollama client compatibility. They are not typically called by end users.
POST /api/system/env/repair
Content-Type: application/json
{
"provider": "claude-code"
}

Repairs missing or corrupted OAuth environment variables for a specific provider. Returns:
{
"success": true,
"repaired": ["CLAUDE_CODE_OAUTH_CLIENT_ID", "CLAUDE_CODE_OAUTH_CLIENT_SECRET"],
"backupPath": "/home/user/.omniroute/backups/env-repair-2026-04-11.bak"
}

POST /v1/audio/transcriptions
Authorization: Bearer your-api-key
Content-Type: multipart/form-data

Transcribe audio files using Deepgram or AssemblyAI.
Request:
curl -X POST http://localhost:20128/v1/audio/transcriptions \
-H "Authorization: Bearer your-api-key" \
-F "file=@recording.mp3" \
-F "model=deepgram/nova-3"

Response:
{
"text": "Hello, this is the transcribed audio content.",
"task": "transcribe",
"language": "en",
"duration": 12.5
}

Supported providers: deepgram/nova-3, assemblyai/best.
Supported formats: mp3, wav, m4a, flac, ogg, webm.
For clients that use Ollama's API format:
# Chat endpoint (Ollama format)
POST /v1/api/chat
# Model listing (Ollama format)
GET /api/tags

Requests are automatically translated between Ollama and internal formats.
Use these aliases when an integration cannot inject an Authorization header and needs the API key embedded in the base URL.
# OpenAI-style catalog alias
GET /api/v1/vscode/{token}/
GET /api/v1/vscode/{token}/models
# OpenAI-style chat aliases
POST /api/v1/vscode/{token}/chat/completions
POST /api/v1/vscode/{token}/responses
# Ollama-style aliases
POST /api/v1/vscode/{token}/api/chat
GET /api/v1/vscode/{token}/api/tags

Example:
curl https://your-host.example/api/v1/vscode/YOUR_API_KEY/models
curl -X POST https://your-host.example/api/v1/vscode/YOUR_API_KEY/chat/completions \
-H "Content-Type: application/json" \
-d '{"model":"auto","messages":[{"role":"user","content":"hello"}]}'

Notes:
- The tokenized aliases reuse the same handlers as `/v1/*` and `/api/tags`; response shapes stay identical.
- Prefer `Authorization: Bearer ...` whenever the client supports custom headers.
- URL-based tokens may appear in reverse-proxy logs, browser history, and telemetry outside OmniRoute. Treat them as a compatibility option, not the default authentication mode.
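Because URL-based tokens can leak into logs, a client that uses the tokenized aliases may want to redact them before logging. The sketch below is illustrative: `tokenized_url` and `redact_token` are hypothetical helpers, and percent-encoding the key is an assumption for keys that contain URL-unsafe characters.

```python
from urllib.parse import quote

ALIAS_PREFIX = "/api/v1/vscode/"


def tokenized_url(host: str, api_key: str, endpoint: str) -> str:
    """Build a tokenized alias URL such as .../api/v1/vscode/<key>/models."""
    token = quote(api_key, safe="")  # assumption: keys are percent-encoded
    return f"{host.rstrip('/')}{ALIAS_PREFIX}{token}/{endpoint.lstrip('/')}"


def redact_token(url: str) -> str:
    """Replace the token path segment with *** so the URL is safe to log."""
    head, sep, tail = url.partition(ALIAS_PREFIX)
    if not sep:
        return url  # not a tokenized alias URL; nothing to redact
    _token, slash, rest = tail.partition("/")
    return f"{head}{ALIAS_PREFIX}***{slash}{rest}"
```

Logging `redact_token(url)` instead of the raw URL keeps the key out of client-side logs; it does not change what reverse proxies in front of OmniRoute record.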
# Get latency telemetry summary (p50/p95/p99 per provider)
GET /api/telemetry/summary

Response:
{
"providers": {
"claudeCode": { "p50": 245, "p95": 890, "p99": 1200, "count": 150 },
"github": { "p50": 180, "p95": 620, "p99": 950, "count": 320 }
}
}

# Get budget status for all API keys
GET /api/usage/budget
# Set or update a budget
POST /api/usage/budget
Content-Type: application/json
{
"apiKeyId": "key-123",
"dailyLimitUsd": 5.00,
"weeklyLimitUsd": 30.00,
"monthlyLimitUsd": 100.00,
"warningThreshold": 0.8,
"resetInterval": "monthly"
}

Schema notes (`setBudgetSchema`): `apiKeyId` is required; at least one of `dailyLimitUsd`, `weeklyLimitUsd`, or `monthlyLimitUsd` must be greater than zero. Optional fields: `warningThreshold` (0–1), `resetInterval` (`daily`|`weekly`|`monthly`), `resetTime` (`HH:MM`). The legacy `{keyId, limit, period}` shape returns `400 Bad Request`.
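The budget schema rules can be checked client-side before calling `POST /api/usage/budget`. This is a sketch that mirrors the rules as stated in the schema notes, not the actual Zod schema: `validate_budget` is a hypothetical helper, and the error strings are invented for illustration.

```python
import re

RESET_INTERVALS = {"daily", "weekly", "monthly"}
LIMIT_FIELDS = ("dailyLimitUsd", "weeklyLimitUsd", "monthlyLimitUsd")
HH_MM = re.compile(r"([01]\d|2[0-3]):[0-5]\d")


def validate_budget(payload: dict) -> list:
    """Return a list of problems; an empty list means the payload looks valid."""
    errors = []
    if not payload.get("apiKeyId"):
        errors.append("apiKeyId is required")
    # At least one of the three USD limits must be a number greater than zero.
    if not any(
        isinstance(payload.get(field), (int, float)) and payload[field] > 0
        for field in LIMIT_FIELDS
    ):
        errors.append("at least one *LimitUsd field must be greater than zero")
    threshold = payload.get("warningThreshold")
    if threshold is not None and not (
        isinstance(threshold, (int, float)) and 0 <= threshold <= 1
    ):
        errors.append("warningThreshold must be between 0 and 1")
    interval = payload.get("resetInterval")
    if interval is not None and interval not in RESET_INTERVALS:
        errors.append("resetInterval must be daily, weekly, or monthly")
    reset_time = payload.get("resetTime")
    if reset_time is not None and not (
        isinstance(reset_time, str) and HH_MM.fullmatch(reset_time)
    ):
        errors.append("resetTime must be HH:MM")
    return errors
```

The legacy `{keyId, limit, period}` shape fails both the `apiKeyId` check and the limit check here, which is consistent with the server rejecting it.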
Per-API-key token budgets (distinct from the USD-based Budget above). Enforced inline on the request path: when a key's current window usage reaches its limit, requests are rejected with 429 Too Many Requests. Limits can be scoped to a specific model, a provider, or applied globally across the key; when several limits match a request, the most restrictive one wins.
# List a key's token limits (includes live window usage)
GET /api/usage/token-limits?apiKeyId=key-123
# Create or update a token limit
POST /api/usage/token-limits
Content-Type: application/json
{
"apiKeyId": "key-123",
"scopeType": "model",
"scopeValue": "openai/gpt-4o",
"tokenLimit": 1000000,
"resetInterval": "monthly",
"enabled": true
}
# Delete a token limit by id
DELETE /api/usage/token-limits?id=tl-abc

Schema notes (`setTokenLimitSchema`): `apiKeyId` and `scopeType` (`model`|`provider`|`global`) are required. `scopeValue` is required unless `scopeType` is `global` (e.g. a model id for `model` scope, a provider id for `provider` scope). `tokenLimit` must be a positive integer (coerced from string). Optional: `id` (omit to create, supply to update), `resetInterval` (`daily`|`weekly`|`monthly`, default `monthly`), `resetTime` (`HH:MM`), `enabled` (default `true`). `GET` responses enrich each limit with `tokensUsed`, `remaining`, `windowStart`, `periodStartAt`, and `nextResetAt`. This is a management-class endpoint (auth enforced centrally by the authz pipeline).
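The "most restrictive limit wins" rule can be sketched as below. Two points are assumptions rather than documented behavior: "most restrictive" is read as "fewest tokens remaining in the current window", and a `model`-scoped limit is matched against the full provider-prefixed model id, as in the `openai/gpt-4o` example. All function names are hypothetical.

```python
def matching_limits(limits, provider, model):
    """Yield enabled limits whose scope applies to this provider/model."""
    for limit in limits:
        if not limit.get("enabled", True):
            continue
        scope = limit["scopeType"]
        if (
            scope == "global"
            or (scope == "provider" and limit.get("scopeValue") == provider)
            or (scope == "model" and limit.get("scopeValue") == model)
        ):
            yield limit


def effective_limit(limits, provider, model):
    """Pick the matching limit with the fewest remaining tokens (assumed rule)."""
    candidates = list(matching_limits(limits, provider, model))
    if not candidates:
        return None
    return min(candidates, key=lambda l: l["tokenLimit"] - l.get("tokensUsed", 0))


def should_reject(limits, provider, model):
    """True when the effective limit is exhausted, i.e. a 429 would be expected."""
    limit = effective_limit(limits, provider, model)
    return limit is not None and limit.get("tokensUsed", 0) >= limit["tokenLimit"]
```

The `tokensUsed` values would come from the enriched `GET /api/usage/token-limits` response described above.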
- Client sends request to `/v1/*`
- Route handler calls `handleChat`, `handleEmbedding`, `handleAudioTranscription`, or `handleImageGeneration`
- Model is resolved (direct provider/model or alias/combo)
- Credentials selected from local DB with account availability filtering
- For chat: `handleChatCore` checks semantic/signature cache and resolves combo compression settings
- Proactive compression runs before provider translation when enabled (`lite`, Caveman, RTK, or stacked)
- Provider executor sends upstream request
- Response translated back to client format (chat) or returned as-is (embeddings/images/audio)
- Usage, compression analytics, and request logs are recorded
- Fallback applies on errors according to combo rules
Full architecture reference: ARCHITECTURE.md
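The final step, fallback on errors, can be pictured as an ordered walk over a combo's targets. This is a deliberately simplified sketch: real combo rules are defined by OmniRoute and may be richer, and `UpstreamError`, `run_with_fallback`, and the `send` callback are hypothetical names introduced for illustration.

```python
from typing import Callable, Sequence, Tuple


class UpstreamError(Exception):
    """Raised by the (hypothetical) executor when a provider call fails."""


def run_with_fallback(
    targets: Sequence[str], send: Callable[[str], str]
) -> Tuple[str, str]:
    """Try each provider/model target in order; return (target, response).

    Assumption: a combo is a plain ordered list and any UpstreamError
    triggers fallback to the next entry.
    """
    errors = []
    for target in targets:
        try:
            return target, send(target)
        except UpstreamError as exc:
            errors.append(f"{target}: {exc}")
    raise UpstreamError("all targets failed: " + "; ".join(errors))
```

Collecting the per-target errors keeps the final failure message useful when every entry in the combo is down.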
Higher-level routing combos (already summarized under /api/combos*) can also be mapped 1:1 from a model id pattern, allowing transparent redirection of an OpenAI-style model id to a combo.

| Method | Path | Description |
|---|---|---|
| GET | `/api/model-combo-mappings` | List all model→combo mappings |
| POST | `/api/model-combo-mappings` | Create mapping; body: `{pattern, comboId, priority?, enabled?, description?}` |
| GET | `/api/model-combo-mappings/[id]` | Retrieve a single mapping |
| PUT | `/api/model-combo-mappings/[id]` | Update fields of an existing mapping |
| DELETE | `/api/model-combo-mappings/[id]` | Remove a mapping |

Auth: management session/API key (requireManagementAuth).
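Resolving a model id through these mappings might look like the sketch below. The pattern syntax and tie-breaking are not specified in this reference, so two assumptions are made: patterns are shell-style globs, and the mapping with the highest `priority` wins. `resolve_combo` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def resolve_combo(model_id: str, mappings: list):
    """Return the comboId of the best-matching enabled mapping, or None.

    Assumptions: glob patterns, higher priority wins, missing priority is 0.
    """
    active = [
        m for m in mappings
        if m.get("enabled", True) and fnmatchcase(model_id, m["pattern"])
    ]
    if not active:
        return None
    return max(active, key=lambda m: m.get("priority", 0))["comboId"]
```

A request for a model id with no matching mapping returns `None` here, which corresponds to the request following normal model resolution.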
Outbound webhook subscriptions for OmniRoute events (request completion, quota exhaustion, key rotation, etc.).

| Method | Path | Description |
|---|---|---|
| GET | `/api/webhooks` | List webhooks (secrets are masked to `<prefix>...`) |
| POST | `/api/webhooks` | Create webhook; body: `{url, events?: ["*"], secret?, description?}` |
| GET | `/api/webhooks/[id]` | Retrieve a webhook |
| PUT | `/api/webhooks/[id]` | Update url/events/secret/description |
| DELETE | `/api/webhooks/[id]` | Remove a webhook |
| POST | `/api/webhooks/[id]/test` | Send a test payload to the webhook URL and return delivery status |

Auth: management session/API key (requireManagementAuth).
Used by the auto-key management subsystem to issue and rotate API keys against a backing provider/account, with daily/hourly quotas.

| Method | Path | Description |
|---|---|---|
| GET | `/api/v1/registered-keys` | List registered keys (masked prefix only) |
| POST | `/api/v1/registered-keys` | Issue a new registered key; body: `{name, provider?, accountId?, idempotencyKey?, expiresAt?, dailyBudget?, hourlyBudget?}`. Returns the raw key once. Returns 429 on quota refusal. |
| GET | `/api/v1/registered-keys/[id]` | Retrieve a registered key's metadata (no raw material) |
| DELETE | `/api/v1/registered-keys/[id]` | Revoke a registered key |
| POST | `/api/v1/registered-keys/[id]/revoke` | Explicit revoke endpoint (same effect as DELETE) |

Auth: Bearer API key (isAuthenticated). See also /v1/quotas/check and /v1/issues/report.
Cloud agent tasks (Claude Code, Codex Cloud, OpenHands, etc.) executed remotely on behalf of OmniRoute users.

| Method | Path | Description |
|---|---|---|
| GET | `/api/v1/agents/tasks` | List tasks; optional `?provider=`, `?status=`, `?limit=` (1–500, default 50) |
| POST | `/api/v1/agents/tasks` | Create task; body validated by `CreateCloudAgentTaskSchema` (`providerId`, `prompt`, `source`, `options?`). Returns 201 with task envelope |
| DELETE | `/api/v1/agents/tasks?id=...` | Delete a task |
| GET | `/api/v1/agents/tasks/[id]` | Read task; synchronously refreshes status from the upstream cloud agent when an `external_id` is set |
| POST | `/api/v1/agents/tasks/[id]` | Discriminated action: `{action: "approve"}`, `{action: "message", message}`, or `{action: "cancel"}` |
| DELETE | `/api/v1/agents/tasks/[id]` | Delete a specific task by id |

Auth: management auth required on every method (`requireCloudAgentManagementAuth`). Prior to v3.8.0 these were unauthenticated; see commit `588a0333` for the breaking change.
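The discriminated action body accepted by `POST /api/v1/agents/tasks/[id]` can be built with a small guard so that malformed actions fail before the request is sent. `build_task_action` is a hypothetical helper; only the three action shapes come from the table above.

```python
def build_task_action(action: str, message: str = None) -> dict:
    """Build the JSON body for a task action.

    Only "approve", "message", and "cancel" are documented; anything else
    is rejected locally instead of being sent to the server.
    """
    if action in ("approve", "cancel"):
        return {"action": action}
    if action == "message":
        if not message:
            raise ValueError("the message action needs a non-empty message")
        return {"action": "message", "message": message}
    raise ValueError(f"unknown action: {action!r}")
```

The returned dict is what would be serialized as the JSON request body.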
# Create a Claude Code cloud task
curl -X POST http://localhost:20128/api/v1/agents/tasks \
-H "Authorization: Bearer your-management-key" \
-H "Content-Type: application/json" \
-d '{"providerId":"claude-code-cloud","prompt":"Fix the failing test","source":{"repo":"...","branch":"..."}}'

Outbound HTTP(S)/SOCKS proxies that can be assigned to providers, accounts, or globally.

| Method | Path | Description |
|---|---|---|
| GET | `/api/v1/management/proxies` | List proxies (with `?id=` returns one; with `?id=&where_used=1` returns the assignment graph) |
| POST | `/api/v1/management/proxies` | Create proxy; body validated by `createProxyRegistrySchema` |
| PATCH | `/api/v1/management/proxies` | Update proxy; body validated by `updateProxyRegistrySchema` (requires `id`) |
| DELETE | `/api/v1/management/proxies?id=...&force=1` | Delete proxy (use `force=1` to detach assignments) |
| GET | `/api/v1/management/proxies/assignments` | List assignments; filterable by `proxy_id`, `scope`, `scope_id`; pass `resolve_connection_id=<id>` to resolve the active proxy for a connection |
| PUT | `/api/v1/management/proxies/assignments` | Assign; body validated by `proxyAssignmentSchema` (`{scope, scopeId?, proxyId?}`). Clears dispatcher cache |
| PUT | `/api/v1/management/proxies/bulk-assign` | Bulk-assign; body validated by `bulkProxyAssignmentSchema` (`{scope, scopeIds[], proxyId?}`) |
| GET | `/api/v1/management/proxies/health?hours=24` | Aggregate proxy health (success/fail counts, latency) over a window |

Auth: management session/API key on every route (requireManagementAuth).
Note: `POST /api/v1/management/proxies/[id]/assignments` and `POST /api/v1/management/proxies/[id]/health`, as named in the task description, are served by the flat `/assignments` and `/health` routes shown above; there are no per-id subroutes in the codebase.
OmniRoute exposes three independent temporary-failure mechanisms; the management endpoints below let operators read and override them:

| Scope | State storage | Read | Reset / clear |
|---|---|---|---|
| Provider breaker | `domain_circuit_breakers` + in-memory | `/api/monitoring/health` | `POST /api/resilience/reset` |
| Connection cooldown | `rateLimitedUntil` on provider connections | `/api/rate-limits`, `/api/providers/[id]` | (re-enables lazily; clear via provider PUT) |
| Model lockout | In-memory model-availability registry | `GET /api/resilience/model-cooldowns` | `DELETE /api/resilience/model-cooldowns` |

PATCH /api/resilience accepts provider breaker overrides under providerBreaker.oauth and providerBreaker.apikey. Each profile supports degradationThreshold, failureThreshold, and resetTimeoutMs; the same fields are exposed in Dashboard → Settings → Resilience.
# Clear a single model lockout
curl -X DELETE http://localhost:20128/api/resilience/model-cooldowns \
-H "Cookie: auth_token=..." \
-H "Content-Type: application/json" \
-d '{"provider":"openai","model":"gpt-4o-mini"}'
# Wipe every lockout
curl -X DELETE http://localhost:20128/api/resilience/model-cooldowns \
-H "Cookie: auth_token=..." \
-d '{"all":true}'

Full conceptual reference and breaker defaults: see CLAUDE.md → "Resilience Runtime State".
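To make the `failureThreshold` and `resetTimeoutMs` settings concrete, here is a generic circuit-breaker sketch. It illustrates the general pattern only and is not OmniRoute's breaker: the class and method names are hypothetical, the half-open probe behavior is an assumption, and `degradationThreshold` is not modeled.

```python
import time


class Breaker:
    """Minimal circuit breaker: opens after N consecutive failures."""

    def __init__(self, failure_threshold, reset_timeout_ms, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout_s = reset_timeout_ms / 1000
        self.clock = clock  # injectable so the breaker can be tested
        self.failures = 0
        self.opened_at = None  # None means the breaker is closed

    def allow(self) -> bool:
        """Return True if a request may be sent to the provider."""
        if self.opened_at is None:
            return True
        # After the reset timeout, let a probe request through (half-open).
        return self.clock() - self.opened_at >= self.reset_timeout_s

    def record_success(self) -> None:
        """A successful call closes the breaker and clears the counter."""
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        """Count a failure; open (or re-open) once the threshold is reached."""
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()
```

In this model, `POST /api/resilience/reset` would correspond to calling `record_success()` on every provider's breaker.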
Skill framework for extending OmniRoute with custom executable handlers, plus marketplace integrations.

| Method | Path | Description |
|---|---|---|
| GET | `/api/skills` | List installed skills; filterable by `?q=`, `?mode=on\|off\|auto`, `?source=skillsmp\|skillssh\|local`, paginated |
| GET | `/api/skills/[id]` | Retrieve one skill |
| PUT | `/api/skills/[id]` | Update skill (name, description, mode, schema, handler, tags) |
| DELETE | `/api/skills/[id]` | Uninstall a skill |
| POST | `/api/skills/install` | Install a skill from a raw manifest; body: `{name, version, description, schema:{input, output}, handlerCode, apiKeyId?}` |
| GET | `/api/skills/executions` | List recent skill executions (audit trail with inputs/outputs/duration) |
| GET | `/api/skills/marketplace?q=...` | Search/popular list from the SkillsMP marketplace (requires `skillsmpApiKey` setting) |
| POST | `/api/skills/marketplace/install` | Install a skill by id from SkillsMP |
| GET | `/api/skills/skillssh?q=&limit=` | Search the skills.sh registry |
| POST | `/api/skills/skillssh/install` | Install a skill by id from skills.sh |

Auth: management session/API key. Marketplace search routes accept either management auth or a Bearer API key (isAuthenticated).
Persistent conversational/factual memory store, scoped per API key / session.

| Method | Path | Description |
|---|---|---|
| GET | `/api/memory` | List memories; `?apiKeyId=`, `?type=`, `?sessionId=`, `?q=`, with `offset/limit` or `page/limit` pagination |
| POST | `/api/memory` | Create memory; body validated by Zod: `{content, key, type?, sessionId?, apiKeyId?, metadata?, expiresAt?}` |
| GET | `/api/memory/[id]` | Retrieve one memory |
| DELETE | `/api/memory/[id]` | Delete a memory |
| GET | `/api/memory/health` | Memory subsystem health (DB connectivity, embeddings backend, vector index status) |

Auth: management session/API key (requireManagementAuth). type enum: FACTUAL, EPISODIC, SEMANTIC, PROCEDURAL (see MemoryType in src/lib/memory/types.ts).
OmniRoute ships an embedded Model Context Protocol server with 3 transports (stdio, SSE, streamable-http) and scoped tools. The dashboard endpoints below read status/audit data and proxy the HTTP transports.
| Method | Path | Description |
| ------ | ---------------------- | ------------------------------------------------------------------------------------------------ | -------------------- |
| GET | /api/mcp/status | Heartbeat, transport, online state, last call, top tools, 24h success rate |
| GET | /api/mcp/tools | List of MCP tools with name, description, scopes, phase, auditLevel, sourceEndpoints |
| GET | /api/mcp/sse | Open SSE stream for the SSE transport (returns 503 if MCP disabled or transport mismatch) |
| POST | /api/mcp/sse | Send JSON-RPC frame on the SSE transport |
| GET | /api/mcp/stream | Open SSE side of the Streamable HTTP transport (server-initiated messages) |
| POST | /api/mcp/stream | Send JSON-RPC frame on the Streamable HTTP transport |
| DELETE | /api/mcp/stream | End a Streamable HTTP session |
| GET | /api/mcp/audit | Query audit log: ?limit=, ?offset=, ?tool=, ?success=true\|false, ?apiKeyId= |
| GET | /api/mcp/audit/stats | Aggregate audit stats (totals, success rate, avg duration, top tools) |
Auth: the sse/stream transports honor the MCP-specific auth surface (Bearer API key with mcp scope); the status/tools/audit* routes are readable from the dashboard (no extra auth required beyond reaching the dashboard host).
Both HTTP transports are gated by `settings.mcpEnabled` and `settings.mcpTransport`: a transport mismatch returns `400`, and a disabled MCP server returns `503`.
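The gating rule in the note above can be sketched as a small pure function. This is only an illustration: `gateTransport` and the `McpSettings` interface are hypothetical names, and checking the disabled state before the transport mismatch is an assumed ordering. The setting names, transport names, and status codes follow the note.

```typescript
type McpTransport = "stdio" | "sse" | "streamable-http";

// Hypothetical view of the two settings named in the note above.
interface McpSettings {
  mcpEnabled: boolean;
  mcpTransport: McpTransport;
}

// Returns the HTTP status to reject with, or null when the request may proceed.
// Assumption: the disabled check runs first, so a disabled server always yields 503.
function gateTransport(
  settings: McpSettings,
  requested: "sse" | "streamable-http",
): number | null {
  if (!settings.mcpEnabled) {
    return 503; // MCP disabled
  }
  if (settings.mcpTransport !== requested) {
    return 400; // configured transport differs from the one being called
  }
  return null;
}

// Example: server configured for SSE, client calls the Streamable HTTP route.
const exampleGate = gateTransport({ mcpEnabled: true, mcpTransport: "sse" }, "streamable-http");
// exampleGate === 400
```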
OmniRoute exposes an A2A (Agent-to-Agent) JSON-RPC 2.0 endpoint plus a REST wrapper for inspection/dashboard use.
POST /a2a
Authorization: Bearer your-api-key # optional unless OMNIROUTE_API_KEY is set
Content-Type: application/json
{
"jsonrpc": "2.0",
"id": 1,
"method": "message/send",
"params": {
"skill": "smart-routing",
"messages": [{"role": "user", "content": "Route this coding task"}]
}
}
Supported methods (all gated on `settings.a2aEnabled`):
| Method | Description |
|---|---|
| `message/send` | Synchronous skill execution; returns `{task, artifacts, metadata}` |
| `message/stream` | Streaming SSE execution of the same skill set |
| `tasks/get` | Fetch a task by `taskId` |
| `tasks/cancel` | Cancel a task by `taskId` |
Built-in skills: smart-routing, quota-management, provider-discovery, cost-analysis, health-report.
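The four methods above share the JSON-RPC 2.0 envelope shown in the request example. A minimal TypeScript sketch of a typed envelope builder follows; `rpc` and `JsonRpcRequest` are hypothetical names, and passing `taskId` as a top-level `params` field for `tasks/get` and `tasks/cancel` is an assumption based on the table wording.

```typescript
type A2AMethod = "message/send" | "message/stream" | "tasks/get" | "tasks/cancel";

// JSON-RPC 2.0 request envelope.
interface JsonRpcRequest<P> {
  jsonrpc: "2.0";
  id: number;
  method: A2AMethod;
  params: P;
}

// Simple monotonically increasing request id.
let nextRpcId = 1;

function rpc<P>(method: A2AMethod, params: P): JsonRpcRequest<P> {
  return { jsonrpc: "2.0", id: nextRpcId++, method, params };
}

// Same payload as the request example above.
const sendRequest = rpc("message/send", {
  skill: "smart-routing",
  messages: [{ role: "user", content: "Route this coding task" }],
});

// Assumed params shape: { taskId }. "task-123" is a placeholder id.
const cancelRequest = rpc("tasks/cancel", { taskId: "task-123" });
```

Either object can be serialized with `JSON.stringify` and sent as the body of `POST /a2a`.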
GET /.well-known/agent.json
Returns the public A2A agent card (name, description, capabilities, skill catalog, auth scheme), cached publicly for 1h. No auth required.
| Method | Path | Description |
|---|---|---|
| GET | `/api/a2a/status` | A2A enabled + task stats + cached agent card summary |
| GET | `/api/a2a/tasks` | List tasks: `?state=submitted\|working\|completed\|failed\|cancelled`, ?skill=, ?limit= (≤200), ?offset= |
| POST | `/api/a2a/tasks` | (Not implemented as a REST helper; create via JSON-RPC message/send) |
| GET | `/api/a2a/tasks/[id]` | Retrieve one task |
| POST | `/api/a2a/tasks/[id]/cancel` | Cancel a task |
Auth: the REST helpers run without management auth (dashboard-readable); the JSON-RPC /a2a route uses Bearer OMNIROUTE_API_KEY if configured.
| Method | Path | Description |
| ------ | ------------------------------- | ------------------------------------------------------------------------------------------------- |
| POST | /api/cloud/auth | Verify a Bearer key and return masked provider connections + model aliases for cloud sync clients |
| POST | /api/cloud/credentials/update | Update encrypted credentials for a cloud-synced provider |
| POST | /api/cloud/model/resolve | Resolve a logical model id to a concrete provider/model using the local routing table |
| GET | /api/cloud/models/alias | List model aliases as exposed to cloud sync |
| GET | /api/assess | Read latest assessment categorizations (per-provider/model) |
| POST | /api/assess | Run an assessment. Body: `{scope: {type:"all"} \| {type:"provider", providerId} \| {type:"model", modelId}, trigger?}` |
| GET | /api/evals | List built-in eval suites + most recent runs |
| POST | /api/evals | Trigger an eval run |
| POST | /api/evals/suites | Create a custom eval suite — body validated by evalSuiteSaveSchema |
| GET | /api/evals/suites/[id] | Retrieve a custom eval suite |
Auth: /api/cloud/auth validates a Bearer key directly; the other /api/cloud/*, /api/evals/*, and /api/assess routes require management session/API key. /api/assess POST uses validateBody with a discriminated-union scope schema.
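The discriminated-union scope accepted by `POST /api/assess` maps naturally onto a TypeScript union. The sketch below is illustrative only: the type and function names are hypothetical, `providerId` and `modelId` are assumed to be strings, and `trigger` is typed as an optional string because its format is not specified here.

```typescript
// One variant per documented scope type; `type` is the discriminant.
type AssessScope =
  | { type: "all" }
  | { type: "provider"; providerId: string }
  | { type: "model"; modelId: string };

// Assumption: trigger is a free-form string.
interface AssessRequest {
  scope: AssessScope;
  trigger?: string;
}

// Narrowing on `scope.type` gives access to the variant-specific field.
function describeScope(scope: AssessScope): string {
  switch (scope.type) {
    case "all":
      return "all providers and models";
    case "provider":
      return `provider ${scope.providerId}`;
    case "model":
      return `model ${scope.modelId}`;
    default: {
      // Compile-time exhaustiveness check: fails to type-check if a variant is added.
      const unreachable: never = scope;
      return unreachable;
    }
  }
}

// Example request body limited to a single provider.
const exampleAssess: AssessRequest = { scope: { type: "provider", providerId: "openai" } };
// describeScope(exampleAssess.scope) === "provider openai"
```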
The ACP framework lets you spawn CLI agents (Claude Code, Codex, Gemini CLI, etc.) as child processes. These endpoints manage ACP agent detection and custom agent registration.
| Method | Path | Description |
|---|---|---|
| GET | `/api/acp/agents` | List all known CLI agents (built-in + custom) with installation status, version, binary |
| POST | `/api/acp/agents` | Register a custom ACP agent or refresh cache. Body: `{id, name, binary, versionCommand, providerAlias, spawnArgs, protocol}` or `{action: "refresh"}` |
| DELETE | `/api/acp/agents` | Remove a custom ACP agent. Query param: `?id=<agentId>` |
Response example (GET /api/acp/agents):
{
"agents": [
{
"id": "claude",
"name": "Claude Code CLI",
"binary": "claude",
"version": "1.0.45",
"installed": true,
"protocol": "stdio",
"providerAlias": "claude",
"isCustom": false
},
{
"id": "my-custom-cli",
"name": "My Custom CLI",
"installed": false,
"protocol": "stdio",
"providerAlias": "my-provider",
"isCustom": true
}
],
"cacheTtlMs": 60000,
"cacheAge": 1234
}
Auth: Requires management session (dashboard `auth_token` cookie) or a management-scoped API key.
See ACP Framework for full details.
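A client consuming the `GET /api/acp/agents` response might type it roughly as follows. This is a sketch derived from the response example above: the interface and helper names are hypothetical, `binary` and `version` are marked optional because the custom-agent entry omits them, and `cacheAge` is assumed to be in milliseconds like `cacheTtlMs`.

```typescript
interface AcpAgent {
  id: string;
  name: string;
  binary?: string;
  version?: string;
  installed: boolean;
  protocol: string;
  providerAlias: string;
  isCustom: boolean;
}

interface AcpAgentsResponse {
  agents: AcpAgent[];
  cacheTtlMs: number;
  cacheAge: number; // assumed to be milliseconds
}

// Ids of agents whose binary was detected on the host.
function installedAgentIds(response: AcpAgentsResponse): string[] {
  return response.agents.filter((agent) => agent.installed).map((agent) => agent.id);
}

// True when the detection cache has outlived its TTL; a client could then
// POST {action: "refresh"} to /api/acp/agents.
function isCacheStale(response: AcpAgentsResponse): boolean {
  return response.cacheAge >= response.cacheTtlMs;
}

// The response example above, as a typed value.
const sampleAgents: AcpAgentsResponse = {
  agents: [
    { id: "claude", name: "Claude Code CLI", binary: "claude", version: "1.0.45",
      installed: true, protocol: "stdio", providerAlias: "claude", isCustom: false },
    { id: "my-custom-cli", name: "My Custom CLI",
      installed: false, protocol: "stdio", providerAlias: "my-provider", isCustom: true },
  ],
  cacheTtlMs: 60000,
  cacheAge: 1234,
};
```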
Real-time analytics endpoints for monitoring routing, compression, and provider
diversity. These power the /dashboard/analytics/* pages.
| Method | Path | Description |
|---|---|---|
| GET | `/api/analytics/auto-routing` | Aggregate auto-routing stats: total calls, strategy distribution, tier distribution, top providers |
| GET | `/api/analytics/auto-routing?days=7` | Time-windowed stats (default 24h) |
Response example:
{
"window": "24h",
"totalCalls": 1234,
"strategyBreakdown": {
"rules": 800,
"cost": 200,
"latency": 150,
"sla-aware": 50,
"lkgp": 34
},
"tierBreakdown": {
"ultra": 100,
"pro": 500,
"standard": 400,
"free": 234
},
"topProviders": [
{ "provider": "openai", "calls": 500, "avgLatencyMs": 850 },
{ "provider": "anthropic", "calls": 300, "avgLatencyMs": 1200 }
]
}
| Method | Path | Description |
|---|---|---|
| GET | `/api/analytics/compression` | Aggregate compression stats: tokens saved, savings %, mode distribution, engine usage |
Response example:
{
"window": "24h",
"totalOriginalTokens": 5000000,
"totalCompressedTokens": 3500000,
"totalSavings": 1500000,
"savingsPct": 30.0,
"modeBreakdown": {
"lite": 400,
"standard": 600,
"aggressive": 100,
"ultra": 50,
"rtk": 84
},
"engineBreakdown": {
"caveman": 800,
"rtk": 434
}
}
| Method | Path | Description |
|---|---|---|
| GET | `/api/analytics/diversity` | Shannon entropy-based diversity tracking: measures provider spread to help avoid single points of failure |
Response example:
{
"window": "24h",
"shannonEntropy": 2.45,
"maxEntropy": 3.17,
"diversityRatio": 0.77,
"providerUsage": {
"openai": 0.40,
"anthropic": 0.25,
"google": 0.20,
"kiro": 0.15
},
"warnings": [
"OpenAI accounts for 40% of traffic — consider diversifying"
]
}
Auth: Requires management session or management-scoped API key.
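For readers unfamiliar with the diversity metric, Shannon entropy over provider traffic shares can be computed as in the sketch below. It is a textbook formulation rather than OmniRoute's actual implementation: the function names are hypothetical, and the use of base-2 logarithms and of `log2(n)` as the maximum for `n` providers are assumptions. Under those assumptions the four shares listed in the sample `providerUsage` give about 1.90 bits against a maximum of 2.0. The sample's `maxEntropy` of 3.17 is close to `log2(9)`, which would fit a denominator of nine candidate providers, though how the server chooses it is not stated here.

```typescript
// Shannon entropy in bits: H = -sum(p * log2(p)) over non-zero shares.
function shannonEntropy(shares: number[]): number {
  let entropy = 0;
  for (const p of shares) {
    if (p > 0) {
      entropy -= p * (Math.log(p) / Math.LN2);
    }
  }
  return entropy;
}

// Maximum entropy for n equally used providers: log2(n).
function maxEntropy(providerCount: number): number {
  return providerCount > 0 ? Math.log(providerCount) / Math.LN2 : 0;
}

// Ratio in [0, 1]; 1 means traffic is spread evenly, 0 means a single provider.
function diversityRatio(shares: number[], providerCount: number = shares.length): number {
  const max = maxEntropy(providerCount);
  return max > 0 ? shannonEntropy(shares) / max : 0;
}

// Shares from the sample providerUsage object above.
const sampleShares = [0.4, 0.25, 0.2, 0.15];
const sampleEntropy = shannonEntropy(sampleShares); // about 1.90 bits
const sampleRatio = diversityRatio(sampleShares);   // about 0.95 with four providers
```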
Admin-only endpoints for operational management.
| Method | Path | Description |
|---|---|---|
| GET | `/api/admin/concurrency` | Read current concurrency limits (global + per-provider) |
| POST | `/api/admin/concurrency` | Update concurrency limits. Body: `{global?: number, perProvider?: Record<string, number>}` |
Auth: Requires management session with admin scope.
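A client could pre-check an update body before posting it to `/api/admin/concurrency`. The sketch below mirrors the documented body shape; the helper names are hypothetical, and the rule that every limit must be a positive integer is an assumption, since the server-side validation rules are not described here.

```typescript
// Body shape as documented for POST /api/admin/concurrency.
interface ConcurrencyUpdate {
  global?: number;
  perProvider?: Record<string, number>;
}

// Assumed constraint: limits are finite, positive whole numbers.
function isPositiveInteger(value: unknown): boolean {
  return typeof value === "number" && isFinite(value) && value > 0 && Math.floor(value) === value;
}

// Returns a list of human-readable problems; an empty list means the body looks sane.
function validateConcurrencyUpdate(body: ConcurrencyUpdate): string[] {
  const errors: string[] = [];
  if (body.global !== undefined && !isPositiveInteger(body.global)) {
    errors.push("global must be a positive integer");
  }
  const perProvider = body.perProvider || {};
  for (const provider of Object.keys(perProvider)) {
    if (!isPositiveInteger(perProvider[provider])) {
      errors.push(`perProvider.${provider} must be a positive integer`);
    }
  }
  return errors;
}

// Example: raise the global limit and cap one provider.
const exampleConcurrency: ConcurrencyUpdate = { global: 32, perProvider: { openai: 8 } };
const exampleConcurrencyErrors = validateConcurrencyUpdate(exampleConcurrency);
```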
Manage CLI tools that integrate with OmniRoute (antigravity, chipotle, commandCode, devin-cli, etc.). See Provider Reference for the full list.
| Method | Path | Description |
|---|---|---|
| GET | `/api/cli-tools/all-statuses` | Status of all CLI tools (installed, version, last seen) |
| GET | `/api/cli-tools/[id]/status` | Status of a specific CLI tool (id can be: antigravity, chipotle, commandCode, devin-cli, etc.) |
| POST | `/api/cli-tools/apply` | Apply a CLI tool configuration to a provider connection |
| GET | `/api/cli-tools/backups` | List CLI tool configuration backups |
| POST | `/api/cli-tools/backups` | Create a backup of all CLI tool configurations |
| POST | `/api/cli-tools/[id]/restore` | Restore a CLI tool from a backup |
| GET | `/api/cli-tools/antigravity-mitm` | Antigravity MITM proxy status (the "antigravity-mitm" CLI tool) |
| POST | `/api/cli-tools/antigravity-mitm/alias` | Configure antigravity-mitm aliases |
Auth: Requires management session.
Manage AI agent skills (similar to OpenAI's custom GPTs but for agents).
| Method | Path | Description |
|---|---|---|
| GET | `/api/agent-skills` | List all agent skills (built-in + custom) |
| GET | `/api/agent-skills/[id]` | Get a specific agent skill |
| POST | `/api/agent-skills` | Create a custom agent skill. Body: `{name, description, prompt, model?, temperature?}` |
| PUT | `/api/agent-skills/[id]` | Update a custom agent skill |
| DELETE | `/api/agent-skills/[id]` | Delete a custom agent skill |
| GET | `/api/agent-skills/[id]/raw` | Get raw prompt + metadata (no execution) |
| POST | `/api/agent-skills/generate` | AI-generate a new skill from a natural language description |
Auth: Requires management session or management-scoped API key.
Manage the semantic cache and reasoning cache.
| Method | Path | Description |
|---|---|---|
| GET | `/api/cache` | Cache overview: total entries, hit rate, size on disk |
| GET | `/api/cache/entries` | List cached entries (with pagination) |
| DELETE | `/api/cache/entries` | Delete cache entries (filter by query parameters) |
| GET | `/api/cache/stats` | Detailed cache statistics (per-provider, per-model) |
| GET | `/api/cache/reasoning` | Reasoning cache status (for reasoning replay) |
| DELETE | `/api/cache/reasoning` | Clear reasoning cache. Query params: `?toolCallId=<id>` (single) or `?provider=<p>` or no params (all) |
Auth: Requires management session.
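The three ways of clearing the reasoning cache can be modelled as a small union so that callers cannot mix `toolCallId` and `provider` by accident. This is a sketch: `ReasoningClearTarget` and `reasoningClearPath` are hypothetical names, and only the query parameter names come from the table above.

```typescript
// One variant per documented clearing mode.
type ReasoningClearTarget =
  | { kind: "toolCall"; toolCallId: string }
  | { kind: "provider"; provider: string }
  | { kind: "all" };

// Builds the path for DELETE /api/cache/reasoning.
function reasoningClearPath(target: ReasoningClearTarget): string {
  const base = "/api/cache/reasoning";
  switch (target.kind) {
    case "toolCall":
      return `${base}?toolCallId=${encodeURIComponent(target.toolCallId)}`;
    case "provider":
      return `${base}?provider=${encodeURIComponent(target.provider)}`;
    default:
      return base; // no params: clear everything
  }
}

// "call_1" is a placeholder tool call id.
const exampleClearPath = reasoningClearPath({ kind: "toolCall", toolCallId: "call_1" });
// exampleClearPath === "/api/cache/reasoning?toolCallId=call_1"
```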
Manage persistent memory (FTS5 + vector embeddings).
| Method | Path | Description |
|---|---|---|
| GET | `/api/memory` | List memory entries (filter by scope, type, search query) |
| POST | `/api/memory` | Create a new memory entry. Body: `{scope, type, content, metadata?}` |
| GET | `/api/memory/[id]` | Get a specific memory entry |
| PUT | `/api/memory/[id]` | Update a memory entry |
| DELETE | `/api/memory/[id]` | Delete a memory entry |
| GET | `/api/memory/search` | Search memory (FTS5 + vector) |
| POST | `/api/memory/clear` | Clear memory entries (with filters) |
| GET | `/api/memory/stats` | Memory statistics (total entries, embedding coverage, etc.) |
Auth: Requires management session or management-scoped API key.
Manage webhook subscriptions for events.
| Method | Path | Description |
|---|---|---|
| GET | `/api/webhooks` | List all webhook subscriptions |
| POST | `/api/webhooks` | Create a webhook subscription. Body: `{url, events[], secret?, active?}` |
| GET | `/api/webhooks/[id]` | Get a specific webhook subscription |
| PUT | `/api/webhooks/[id]` | Update a webhook subscription |
| DELETE | `/api/webhooks/[id]` | Delete a webhook subscription |
| GET | `/api/webhooks/events` | List all available webhook event types |
| GET | `/api/webhooks/[id]/deliveries` | List delivery history for a webhook (success/failure log) |
| POST | `/api/webhooks/[id]/test` | Send a test event to a webhook |
Auth: Requires management session.
See Webhooks Framework for full event types.
Manage Skills (the agentic extensions framework).
| Method | Path | Description |
|---|---|---|
| GET | `/api/skills` | List all installed skills (built-in + custom) |
| POST | `/api/skills/install` | Install a skill from a local path or URL |
| DELETE | `/api/skills/[id]` | Uninstall a skill |
| PUT | `/api/skills/[id]` | Enable or disable a skill. Body: `{enabled?: boolean, mode?: "on" \| "off" \| "auto"}` |
| POST | `/api/skills/executions` | Execute a skill. Body: `{skillName, apiKeyId, input?, sessionId?}` |
| GET | `/api/skills/executions` | List execution history for all skills (filter by ?apiKeyId=) |
Auth: Requires management session or management-scoped API key.
See Skills Framework for full details.
Manage OmniRoute plugins (third-party extensions).
| Method | Path | Description |
|---|---|---|
| GET | `/api/plugins` | List installed plugins |
| POST | `/api/plugins/install` | Install a plugin from a local path or URL |
| DELETE | `/api/plugins/[name]` | Uninstall a plugin |
| POST | `/api/plugins/[name]/activate` | Activate a plugin |
| POST | `/api/plugins/[name]/deactivate` | Deactivate a plugin |
| GET | `/api/plugins/[name]/config` | Get plugin configuration |
| PUT | `/api/plugins/[name]/config` | Update plugin configuration |
Auth: Requires management session.
See Plugins Framework for full details.
Shadow / A-B comparison of providers is not a standalone REST surface — it is configured through combo routing (see Auto-Combo). Per-combo comparison metrics are served by GET /api/combos/metrics.
Inspect the runtime guardrails (PII detection, prompt injection detection, vision bridging). Guardrails run on every request; per-call opt-out is via the x-omniroute-disabled-guardrails request header — there is no persisted enable/disable surface.
| Method | Path | Description |
|---|---|---|
| GET | `/api/guardrails` | List the registered guardrails and their status (name / enabled / priority) |
| POST | `/api/guardrails/test` | Dry-run the pre-call pipeline over a sample input. Body: `{input, disabledGuardrails?}` |
Auth: Requires management session.
See Security > Guardrails for full details.
- Dashboard routes (`/dashboard/*`) use the `auth_token` cookie
- Login uses the saved password hash; fallback to `INITIAL_PASSWORD`
- `requireLogin` toggleable via `/api/settings/require-login`
- `/v1/*` routes optionally require a Bearer API key when `REQUIRE_API_KEY=true`
Breaking change (v3.8.0): `/api/v1/agents/tasks/*` and the cooldown management endpoints now require management auth (dashboard `auth_token` cookie or a management-scoped API key). Clients that previously called these routes unauthenticated will receive `401 Unauthorized`. See commit `588a0333` (`fix(auth): require management auth for agent and cooldown APIs`).