A Rust-based API gateway for AI services with protocol conversion between OpenAI and Anthropic formats.
Prerequisites:

- Rust (stable, latest version recommended)
- Cargo

Build the project:

```bash
cargo build
```

For release builds:

```bash
cargo build --release
```

Run the server:

```bash
cargo run
```

The server starts on http://0.0.0.0:8080 by default.
Configuration is managed via TOML files in the config/ directory.
```toml
[server]
host = "0.0.0.0"
port = 8080

[logging]
level = "info"   # debug, info, warn, error
format = "json"  # json or pretty
```

Define AI provider backends:
```toml
[[upstreams]]
name = "openai"
url = "https://api.openai.com"
format = "openai-chat"

[[upstreams]]
name = "anthropic"
url = "https://api.anthropic.com"
format = "anthropic"
```

Map incoming requests to upstreams:
```toml
[[routes]]
path = "/v1/chat/completions"
input_format = "openai-chat"
upstream = "openai"

[[routes]]
path = "/v1/messages"
input_format = "anthropic"
upstream = "anthropic"
```

See config/example.toml for all available options.
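With the routes above in place, an OpenAI-format request can be sent straight through the gateway. A minimal sketch; the model name and bearer token are placeholders, and whether auth is required depends on your access-control configuration:

```bash
curl http://localhost:8080/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer YOUR_GATEWAY_KEY' \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```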
The gateway exposes Prometheus metrics at `GET /metrics`. This endpoint bypasses access control, so scrapers need no credentials.
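A quick scrape (the sample output is illustrative, not exhaustive):

```bash
curl http://localhost:8080/metrics
# Output (excerpt):
# http_requests_total{method="POST",path="/v1/chat/completions",status="200"} 42
# http_requests_in_flight{method="POST",path="/v1/chat/completions"} 1
```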
HTTP request metrics:

| Metric | Type | Labels | Description |
|---|---|---|---|
| `http_requests_total` | Counter | `method`, `path`, `status` | Total HTTP requests processed |
| `http_request_duration_seconds` | Histogram | `method`, `path`, `status` | Request latency including all middleware |
| `http_requests_in_flight` | Gauge | `method`, `path` | Currently processing requests |
Upstream metrics:

| Metric | Type | Labels | Description |
|---|---|---|---|
| `upstream_request_duration_seconds` | Histogram | `upstream`, `provider` | Time to first byte from upstream |
| `upstream_requests_total` | Counter | `upstream`, `provider`, `status` | Total requests to upstreams |
| `upstream_errors_total` | Counter | `upstream`, `error_type` | Upstream errors by type |
Streaming metrics:

| Metric | Type | Labels | Description |
|---|---|---|---|
| `streaming_events_total` | Counter | `provider` | Total SSE events streamed |
| `streaming_bytes_total` | Counter | `provider` | Total bytes streamed |
| `streaming_duration_seconds` | Histogram | `provider` | Full stream duration |
Authentication and rate-limit metrics:

| Metric | Type | Labels | Description |
|---|---|---|---|
| `auth_requests_total` | Counter | `result` | Authentication attempts (allowed/denied) |
| `rate_limit_exceeded_total` | Counter | `key_name` | Rate limit violations by key |
| `quota_exceeded_total` | Counter | `key_name` | Quota violations by key |
Useful PromQL queries:

```promql
# Request rate per second
rate(http_requests_total[5m])

# 95th percentile latency
histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))

# Error rate
sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m]))

# Upstream error rate by provider
sum by (provider) (rate(upstream_errors_total[5m]))

# Streaming throughput in bytes/sec
rate(streaming_bytes_total[5m])

# Authentication failure rate
sum(rate(auth_requests_total{result="denied"}[5m])) / sum(rate(auth_requests_total[5m]))
```
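Any of these expressions can be evaluated against the bundled Prometheus instance (see the monitoring setup below) through its HTTP API; a quick sketch:

```bash
# Evaluate a PromQL expression via the Prometheus HTTP API
curl -s 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=rate(http_requests_total[5m])'
```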
Example Grafana panel queries:

```promql
# Panel: Request Rate
rate(http_requests_total[1m])

# Panel: Latency Heatmap
sum(rate(http_request_duration_seconds_bucket[1m])) by (le)

# Panel: In-Flight Requests
sum(http_requests_in_flight)

# Panel: Upstream Latency by Provider
histogram_quantile(0.50, sum(rate(upstream_request_duration_seconds_bucket[5m])) by (provider, le))

# Panel: Rate Limit Violations
sum by (key_name) (increase(rate_limit_exceeded_total[1h]))
```
Build the Docker image:
```bash
docker build -t ai-gateway .
```

Run the container:

```bash
docker run -d \
  -p 8080:8080 \
  -v $(pwd)/config:/home/appuser/config:ro \
  -e GATEWAY_SERVER_HOST=0.0.0.0 \
  -e GATEWAY_UPSTREAMS_OPENAI_API_KEY=sk-... \
  -e GATEWAY_UPSTREAMS_ANTHROPIC_API_KEY=sk-ant-... \
  ai-gateway
```

The image uses Alpine Linux and is under 30 MB. It runs as a non-root user for security.
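Both claims are easy to check locally; the `--entrypoint` override is a sketch and assumes the BusyBox `id` utility is still present in the final image:

```bash
# Check the image size
docker images ai-gateway

# Confirm the container runs as a non-root user
docker run --rm --entrypoint id ai-gateway
```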
For local deployment:
```bash
docker-compose up -d
```

For deployment with Prometheus and Grafana monitoring:

```bash
docker-compose -f examples/docker-compose.monitoring.yml up -d
```

Access points:
- Gateway: http://localhost:8080
- Prometheus: http://localhost:9090
- Grafana: http://localhost:3000 (admin/admin)
All configuration can be overridden via environment variables with the GATEWAY_ prefix:
| Variable | Description | Example |
|---|---|---|
| `GATEWAY_SERVER_HOST` | Listen address | `0.0.0.0` |
| `GATEWAY_SERVER_PORT` | Listen port | `8080` |
| `GATEWAY_LOGGING_LEVEL` | Log level | `info`, `warn`, `debug` |
| `GATEWAY_LOGGING_FORMAT` | Log format | `json`, `pretty` |
| `GATEWAY_ENV` | Config environment | `production` |
| `GATEWAY_UPSTREAMS_OPENAI_API_KEY` | OpenAI API key | `sk-...` |
| `GATEWAY_UPSTREAMS_ANTHROPIC_API_KEY` | Anthropic API key | `sk-ant-...` |
API keys should always be passed via environment variables, not config files.
Mount the config directory as read-only:
```bash
-v /path/to/config:/home/appuser/config:ro
```

The container expects config files at `/home/appuser/config/`. The gateway loads configuration based on `GATEWAY_ENV`:

- `GATEWAY_ENV=production` loads `config/production.toml`
- By default, `config/default.toml` is loaded
See examples/config.production.toml for a production-ready configuration template.
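For example, to run the container against the production configuration (a sketch; assumes a `config/production.toml` exists in the mounted directory):

```bash
docker run -d \
  -p 8080:8080 \
  -v $(pwd)/config:/home/appuser/config:ro \
  -e GATEWAY_ENV=production \
  ai-gateway
```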
The gateway exposes a health check endpoint at GET /health:
```bash
curl http://localhost:8080/health
```

Response:

```json
{
  "status": "healthy",
  "version": "0.1.0",
  "uptime_seconds": 1234
}
```

This endpoint bypasses access control and is suitable for:
- Docker HEALTHCHECK
- Kubernetes liveness/readiness probes
- Load balancer health checks
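For plain Docker, the same endpoint can back a container health check at run time; a sketch, assuming `wget` is present in the image (BusyBox on Alpine provides it):

```bash
docker run -d \
  -p 8080:8080 \
  --health-cmd 'wget -qO- http://127.0.0.1:8080/health || exit 1' \
  --health-interval 30s \
  --health-timeout 3s \
  ai-gateway
```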
Basic deployment hints:
```yaml
# Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ai-gateway
spec:
  selector:
    matchLabels:
      app: ai-gateway
  template:
    metadata:
      labels:
        app: ai-gateway
    spec:
      containers:
        - name: ai-gateway
          image: ai-gateway:latest
          ports:
            - containerPort: 8080
          env:
            - name: GATEWAY_SERVER_HOST
              value: "0.0.0.0"
            - name: GATEWAY_UPSTREAMS_OPENAI_API_KEY
              valueFrom:
                secretKeyRef:
                  name: ai-gateway-secrets
                  key: openai-api-key
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 30
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
          volumeMounts:
            - name: config
              mountPath: /home/appuser/config
              readOnly: true
      volumes:
        - name: config
          configMap:
            name: ai-gateway-config
```

For Prometheus monitoring, create a ServiceMonitor:
```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: ai-gateway
spec:
  selector:
    matchLabels:
      app: ai-gateway
  endpoints:
    - port: http
      path: /metrics
      interval: 15s
```

The selector must match the labels on the gateway's Service, which should expose port 8080 under the name `http`. Store API keys in Kubernetes Secrets and non-secret configuration in ConfigMaps.
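A minimal way to create both objects, matching the names referenced in the manifests above (`sk-...` stands in for a real key):

```bash
# Secret holding the upstream API key referenced by the Deployment
kubectl create secret generic ai-gateway-secrets \
  --from-literal=openai-api-key=sk-...

# ConfigMap built from the local config directory
kubectl create configmap ai-gateway-config --from-file=config/
```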
```bash
# Run tests
cargo test

# Run lints
cargo clippy

# Format code
cargo fmt
```

Licensed under the MIT license.