-
Notifications
You must be signed in to change notification settings - Fork 2
Home
Nexus Gateway is an enterprise AI traffic gateway. It sits between every application or endpoint that calls a large language model and the LLM provider itself, and runs that traffic through one compliance engine, one audit pipeline, and one control plane. A single organization can apply unified policy, observability, cost control, and access management across every way its people and applications consume AI.
Nexus intercepts LLM traffic at three independent layers. Each layer runs the full compliance pipeline on its own traffic and sends it directly to the upstream provider.
| Layer | Where it intercepts |
|---|---|
| AI Gateway | The SDK layer — virtual keys on /v1/chat/completions, /v1/responses, /v1/embeddings, and /v1/messages. |
| Compliance Proxy | The network layer — a transparent TLS bump over HTTPS CONNECT. |
| Desktop Agent | The operating-system layer — packet-level capture on macOS, Linux, and Windows endpoints. |
When network policy routes Desktop Agent traffic out through the Compliance Proxy, the Agent stamps a signed attestation header on each request. The proxy verifies it and passes the connection through without re-inspecting it or recording a duplicate audit event, because the Agent already ran the pipeline.
- Provider abstraction. Applications speak the OpenAI SDK shape. Nexus normalizes each request to a canonical shape and translates it to the wire format of the actual provider. Twenty in-tree adapters ship today — eleven with full bidirectional translation (OpenAI, Anthropic, Gemini, Vertex, Azure, Bedrock, Cohere, MiniMax, GLM, Replicate, Voyage) and nine OpenAI-compatible passthrough providers (DeepSeek, Moonshot, Mistral, Groq, Fireworks, Together, Perplexity, xAI, Hugging Face). Function calls, vision input, structured outputs, and reasoning tokens survive the translation.
- Multi-tier cache. An exact-match response cache, a semantic vector cache, provider-native cached-token accounting, and single-flight folding of concurrent identical prompts into one upstream call.
-
Cost and quota control. Multi-axis quotas (per organization, virtual key, provider, or model), token- or dollar-based budgets, soft limits that fire alerts and hard limits that reject with HTTP
429, and seven routing strategies: single, fallback, load-balance, conditional, A/B split, policy, and smart. - Compliance pipeline. PII detection, data classification, keyword filtering, content safety, rate limiting, IP allowlists, request-size validation, webhook forwarders, per-stage audit, body capture, SIEM forwarding, a three-tier kill switch, and emergency passthrough.
- Enterprise governance. IAM with role- and attribute-based access control, virtual keys scoped per model, OIDC federation with just-in-time user provisioning, an organization and project hierarchy with per-organization quota, a credential vault with AES-256-GCM encryption and key rotation, and Desktop Agent fleet management with config sync and out-of-sync detection.
Five Go services and one React console make up Nexus.
| Component | Purpose |
|---|---|
| Nexus Hub | Node registry, target-config store, config sync, scheduled jobs, agent certificate authority, SIEM bridge. |
| Control Plane | Admin API, IAM, single sign-on, analytics. |
| AI Gateway |
/v1 AI traffic, provider adapters, routing, quota. |
| Compliance Proxy | HTTPS CONNECT, MITM TLS bump, compliance pipeline. |
| Desktop Agent | Endpoint traffic interception on macOS, Linux, and Windows. |
| Control Plane UI | React admin dashboard. |
The four backend services register with the Nexus Hub as nodes and pull their configuration from it on boot and whenever it signals a change; the Hub never pushes full state. Durable state lives in PostgreSQL. Valkey backs sessions, the caches, rate limiting, and quota counters. NATS JetStream carries event streams and Hub coordination.
- Administrator / Compliance Officer — defines policy: providers and credentials, routing rules, virtual keys, hooks and rule packs, quotas, identity providers, IAM policies, and alerts.
- Fleet Operator — runs the Infrastructure surface: nodes, config sync, scheduled jobs, the kill switch, error and crash reporting, proxy rollout, agent setup, observability, and SIEM.
-
Developer / Application Owner — holds a personal virtual key and calls the AI Gateway on
/v1. - Endpoint User — has AI traffic intercepted by the Compliance Proxy or the Desktop Agent, with no direct Control Plane interaction.
- Getting Started — run Nexus locally and make your first request.
- Core Concepts — virtual keys, providers, routing, caching, compliance, and nodes.
- Architecture Overview — the traffic-plane and control-plane picture.
- Installation and Deployment — SaaS, self-hosted, and air-gapped options.
Nexus Gateway · Enterprise AI traffic gateway for compliance, routing, caching, and analytics.
Start here
Concepts
Using the gateway
Operations & internals
Community