This project was developed using an early experimental version of AhaLoop, created entirely by AI through a single prompt. While some minor issues remain and the README may differ slightly from the actual implementation, the result is an interesting experiment.



AI Gateway

A Rust-based API gateway for AI services with protocol conversion between OpenAI and Anthropic formats.

Prerequisites

  • Rust (stable, latest version recommended)
  • Cargo

Build

cargo build

For release builds:

cargo build --release

Run

cargo run

The server starts on http://0.0.0.0:8080 by default.

Configuration

Configuration is managed via TOML files in the config/ directory.

Server

[server]
host = "0.0.0.0"
port = 8080

Logging

[logging]
level = "info"    # debug, info, warn, error
format = "json"   # json or pretty

Upstreams

Define AI provider backends:

[[upstreams]]
name = "openai"
url = "https://api.openai.com"
format = "openai-chat"

[[upstreams]]
name = "anthropic"
url = "https://api.anthropic.com"
format = "anthropic"

Routes

Map incoming requests to upstreams:

[[routes]]
path = "/v1/chat/completions"
input_format = "openai-chat"
upstream = "openai"

[[routes]]
path = "/v1/messages"
input_format = "anthropic"
upstream = "anthropic"

See config/example.toml for all available options.
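A route's input_format and its upstream's format together determine whether the gateway translates the request body. The sketch below illustrates the kind of translation involved when an OpenAI-format request is routed to an Anthropic upstream. It is a minimal illustration only: the field names follow the public OpenAI and Anthropic APIs, but the gateway's actual conversion logic may differ.

```python
def openai_to_anthropic(body: dict, default_max_tokens: int = 1024) -> dict:
    """Convert an OpenAI chat-completions request body into Anthropic's
    Messages API shape. System messages move to the top-level `system`
    field; other messages pass through with role/content preserved."""
    system_parts = [m["content"] for m in body["messages"] if m["role"] == "system"]
    messages = [m for m in body["messages"] if m["role"] != "system"]
    out = {
        "model": body["model"],
        # Anthropic requires max_tokens; fall back to a default if absent.
        "max_tokens": body.get("max_tokens", default_max_tokens),
        "messages": messages,
    }
    if system_parts:
        out["system"] = "\n".join(system_parts)
    return out
```

For example, a request with a system message and a user message becomes an Anthropic body whose `messages` list holds only the user turn, with the system text hoisted into `system`.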

Metrics

The gateway exposes Prometheus metrics at GET /metrics. This endpoint bypasses access control, so no API key is required to scrape it.

HTTP Metrics

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| http_requests_total | Counter | method, path, status | Total HTTP requests processed |
| http_request_duration_seconds | Histogram | method, path, status | Request latency including all middleware |
| http_requests_in_flight | Gauge | method, path | Currently processing requests |

Upstream Metrics

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| upstream_request_duration_seconds | Histogram | upstream, provider | Time to first byte from upstream |
| upstream_requests_total | Counter | upstream, provider, status | Total requests to upstreams |
| upstream_errors_total | Counter | upstream, error_type | Upstream errors by type |

Streaming Metrics

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| streaming_events_total | Counter | provider | Total SSE events streamed |
| streaming_bytes_total | Counter | provider | Total bytes streamed |
| streaming_duration_seconds | Histogram | provider | Full stream duration |

Access Control Metrics

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| auth_requests_total | Counter | result | Authentication attempts (allowed/denied) |
| rate_limit_exceeded_total | Counter | key_name | Rate limit violations by key |
| quota_exceeded_total | Counter | key_name | Quota violations by key |

Example Prometheus Queries

# Request rate per second
rate(http_requests_total[5m])

# 95th percentile latency
histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))

# Error rate
sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m]))

# Upstream error rate by provider
sum by (provider) (rate(upstream_errors_total[5m]))

# Streaming throughput in bytes/sec
rate(streaming_bytes_total[5m])

# Authentication failure rate
sum(rate(auth_requests_total{result="denied"}[5m])) / sum(rate(auth_requests_total[5m]))
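The histogram_quantile query above estimates a percentile by linearly interpolating within cumulative `le` buckets. The following sketch shows the essence of that estimation; it is illustrative only, and Prometheus itself additionally handles edge cases such as the +Inf bucket and counter resets.

```python
def histogram_quantile(q: float, buckets: list[tuple[float, float]]) -> float:
    """Estimate quantile q from cumulative histogram buckets.
    `buckets` is a sorted list of (le_upper_bound, cumulative_count),
    mirroring Prometheus's <metric>_bucket series."""
    total = buckets[-1][1]
    rank = q * total   # position of the target observation
    lower = 0.0        # lower bound of the current bucket
    prev_count = 0.0
    for upper, count in buckets:
        if count >= rank:
            # Linear interpolation inside the bucket containing the rank.
            return lower + (rank - prev_count) / (count - prev_count) * (upper - lower)
        lower, prev_count = upper, count
    return buckets[-1][0]
```

With buckets le=0.1 (50 obs), le=0.5 (90), le=1.0 (100), the 95th percentile falls in the last bucket and interpolates to 0.75 seconds.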

Example Grafana Dashboard Queries

# Panel: Request Rate
rate(http_requests_total[1m])

# Panel: Latency Heatmap
sum(rate(http_request_duration_seconds_bucket[1m])) by (le)

# Panel: In-Flight Requests
sum(http_requests_in_flight)

# Panel: Upstream Latency by Provider
histogram_quantile(0.50, sum(rate(upstream_request_duration_seconds_bucket[5m])) by (provider, le))

# Panel: Rate Limit Violations
sum by (key_name) (increase(rate_limit_exceeded_total[1h]))

Deployment

Docker

Build the Docker image:

docker build -t ai-gateway .

Run the container:

docker run -d \
  -p 8080:8080 \
  -v $(pwd)/config:/home/appuser/config:ro \
  -e GATEWAY_SERVER_HOST=0.0.0.0 \
  -e GATEWAY_UPSTREAMS_OPENAI_API_KEY=sk-... \
  -e GATEWAY_UPSTREAMS_ANTHROPIC_API_KEY=sk-ant-... \
  ai-gateway

The image uses Alpine Linux and is under 30MB. It runs as a non-root user for security.

Docker Compose

For local deployment:

docker-compose up -d

For deployment with Prometheus and Grafana monitoring:

docker-compose -f examples/docker-compose.monitoring.yml up -d

Environment Variables

All configuration can be overridden via environment variables with the GATEWAY_ prefix:

| Variable | Description | Example |
|----------|-------------|---------|
| GATEWAY_SERVER_HOST | Listen address | 0.0.0.0 |
| GATEWAY_SERVER_PORT | Listen port | 8080 |
| GATEWAY_LOGGING_LEVEL | Log level | info, warn, debug |
| GATEWAY_LOGGING_FORMAT | Log format | json, pretty |
| GATEWAY_ENV | Config environment | production |
| GATEWAY_UPSTREAMS_OPENAI_API_KEY | OpenAI API key | sk-... |
| GATEWAY_UPSTREAMS_ANTHROPIC_API_KEY | Anthropic API key | sk-ant-... |

API keys should always be passed via environment variables, not config files.
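The GATEWAY_ prefix convention maps each environment variable onto a nested config key. One plausible mapping is sketched below; this is illustrative only, since the exact split rule for multi-word keys such as api_key depends on the gateway's actual config loader.

```python
def env_overrides(environ: dict[str, str], prefix: str = "GATEWAY_") -> dict:
    """Collect prefixed environment variables into a nested config dict.
    The first underscore-separated segment after the prefix is treated as
    the config section; the remainder (underscores preserved) is the key."""
    config: dict = {}
    for name, value in environ.items():
        # GATEWAY_ENV selects the config file rather than a config key.
        if not name.startswith(prefix) or name == "GATEWAY_ENV":
            continue
        section, _, key = name[len(prefix):].lower().partition("_")
        config.setdefault(section, {})[key] = value
    return config
```

For example, `GATEWAY_SERVER_PORT=8080` would override `server.port`, and unrelated variables such as PATH are ignored.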

Volume Mounting

Mount the config directory as read-only:

-v /path/to/config:/home/appuser/config:ro

The container expects config files at /home/appuser/config/. The gateway loads configuration based on GATEWAY_ENV:

  • GATEWAY_ENV=production loads config/production.toml
  • Default loads config/default.toml

See examples/config.production.toml for a production-ready configuration template.

Health Check

The gateway exposes a health check endpoint at GET /health:

curl http://localhost:8080/health

Response:

{
  "status": "healthy",
  "version": "0.1.0",
  "uptime_seconds": 1234
}

This endpoint bypasses access control and is suitable for:

  • Docker HEALTHCHECK
  • Kubernetes liveness/readiness probes
  • Load balancer health checks
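A monitoring script consuming this endpoint might parse the response as sketched below; this assumes only the response shape shown above and uses nothing beyond the standard library.

```python
import json

def is_healthy(body: str) -> bool:
    """Return True if a /health response body reports a healthy status.
    Any non-JSON body (e.g. an error page from a proxy) counts as unhealthy."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return payload.get("status") == "healthy"
```

Paired with an HTTP client, this gives a simple readiness gate for scripts that wait for the gateway to come up.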

Kubernetes

Basic deployment hints:

# Deployment
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      containers:
        - name: ai-gateway
          image: ai-gateway:latest
          ports:
            - containerPort: 8080
          env:
            - name: GATEWAY_SERVER_HOST
              value: "0.0.0.0"
            - name: GATEWAY_UPSTREAMS_OPENAI_API_KEY
              valueFrom:
                secretKeyRef:
                  name: ai-gateway-secrets
                  key: openai-api-key
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 30
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
          volumeMounts:
            - name: config
              mountPath: /home/appuser/config
              readOnly: true
      volumes:
        - name: config
          configMap:
            name: ai-gateway-config

For Prometheus monitoring, create a ServiceMonitor:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
spec:
  endpoints:
    - port: http
      path: /metrics
      interval: 15s

Store API keys in Kubernetes Secrets and non-secret configuration in ConfigMaps.
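For example, the secret referenced by the Deployment above could be defined as follows. This is a sketch: the secret name and keys must match the secretKeyRef entries in the pod spec, and the placeholder values stand in for real API keys.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: ai-gateway-secrets
type: Opaque
stringData:
  openai-api-key: sk-...
  anthropic-api-key: sk-ant-...
```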

Development

# Run tests
cargo test

# Run lints
cargo clippy

# Format code
cargo fmt

License

MIT
