A high-performance, production-ready proxy server for AI providers like OpenAI, built with Go. Features request logging, guardrails system, and extensible architecture for multiple AI providers.
- Multi-provider Support: Currently supports OpenAI, easily extensible to Anthropic, Google, and others
- Advanced Guardrails System:
- Parallel Execution: Same priority guardrails run concurrently for minimal latency impact
- Priority-based Processing: Sequential execution across different priority levels
- First-fail Mechanism: Immediate halt on any guardrail failure
- Highly Extensible: Add custom guardrails by implementing the `Guardrail` interface
- Request Logging: Asynchronous PostgreSQL logging with comprehensive metrics
- Automated Database Management: Zero-touch schema migrations run automatically on startup
- Configuration-driven: Add new endpoints, providers, and guardrails via YAML
- Ultra-Low Latency: Parallel guardrails execution, async processing, connection pooling, and optimized middleware
- Docker Ready: Complete containerization with docker-compose setup
- Observability: Health checks, metrics, and comprehensive logging with guardrail performance tracking
- Security: Environment variable support, header sanitization, and configurable content filtering
```bash
# Clone and configure
git clone https://github.com/yourusername/flash-gateway.git
cd flash-gateway
cp configs/providers.example.yaml configs/providers.yaml

# Set your OpenAI API key
export OPENAI_API_KEY=your_api_key_here

# Start with Docker Compose
docker-compose up -d
# Database migrations run automatically on first startup

# Verify it's running
curl http://localhost:8080/health
# Response: {"status": "healthy"}
```

```bash
# Build from source
git clone https://github.com/yourusername/flash-gateway.git
cd flash-gateway
go build -o gateway cmd/server/main.go

# Configure and run
cp configs/providers.example.yaml configs/providers.yaml
export OPENAI_API_KEY=your_api_key_here
./gateway -config configs/providers.yaml
```
Flash Gateway acts as an intelligent proxy between your applications and AI providers. Here's how a request flows through the system:
```
Client Request
      ↓
┌─────────────────────────────────────────────────────────────┐
│                        Flash Gateway                        │
│                                                             │
│  1. Recovery Middleware     ← Catches panics                │
│            ↓                                                │
│  2. Logger Middleware       ← Logs request info             │
│            ↓                                                │
│  3. CORS Middleware         ← Handles CORS headers          │
│            ↓                                                │
│  4. ContentType Middleware  ← Sets content types            │
│            ↓                                                │
│  5. Capture Middleware      ← Captures req/resp for logs    │
│            ↓                                                │
│  6. ProxyHandler (Router)   ← Routes to provider            │
│            ↓                                                │
│  7. Input Guardrails        ← Parallel execution by         │
│     ┌──────────────────┐      priority groups               │
│     │ G1 │ G2 │ G3 │...│   ← Same priority = parallel       │
│     │ Priority 0 ──────│   ← Lower number = higher priority │
│     └──────────────────┘                                    │
│     ┌──────────────────┐   ← First-fail mechanism           │
│     │ G4 │ G5 │ ...    │                                    │
│     │ Priority 1 ──────│   ← Next priority group            │
│     └──────────────────┘                                    │
│            ↓                                                │
│  8. Provider (OpenAI)       ← Proxies to AI service         │
│            ↓                                                │
│  9. Output Guardrails       ← Same parallel execution       │
│     ┌──────────────────┐      pattern as input              │
│     │ G6 │ G7 │ G8 │...│   ← Response validation            │
│     │ Priority 0 ──────│   ← Can override unsafe content    │
│     └──────────────────┘                                    │
│            ↓                                                │
│ 10. Async Logging           ← Stores logs & metrics in      │
│                               PostgreSQL (non-blocking)     │
│                                                             │
└─────────────────────────────────────────────────────────────┘
      ↓
Client Response
```
- Recovery: Catches any panics and returns proper HTTP error responses
- Logger: Logs basic request information (method, path, duration)
- CORS: Adds CORS headers for cross-origin requests (configurable)
- ContentType: Ensures proper content-type headers
- Capture: Captures full request/response data for async logging
- ProxyHandler: Routes requests and orchestrates guardrails execution
- Input Guardrails:
- Parallel Execution: Same priority guardrails run concurrently for low latency
- Priority Groups: Different priorities run sequentially (lower number = higher priority)
- First-Fail: Execution stops immediately if any guardrail fails
- Extensible: Just implement the `Guardrail` interface to add new checks
- Provider: Forwards request to the appropriate AI service (OpenAI, etc.)
- Output Guardrails:
- Same Architecture: Parallel execution with priority groups and first-fail
- Response Override: Can replace unsafe AI responses with safe alternatives
- Metrics Tracking: All guardrail executions are tracked with performance data
- Async Logging: Stores comprehensive logs and guardrail metrics without blocking responses
The docker-compose.yml provides a complete development environment:
```yaml
# PostgreSQL database for request logging
postgres:
  image: postgres:15-alpine
  ports: ["5432:5432"]
  environment:
    POSTGRES_DB: gateway
    POSTGRES_USER: gateway
    POSTGRES_PASSWORD: gateway_pass

# Flash Gateway application
gateway:
  build: .
  ports: ["8080:8080"]
  environment:
    - DATABASE_URL=postgres://gateway:gateway_pass@postgres:5432/gateway?sslmode=disable
    - OPENAI_API_KEY=${OPENAI_API_KEY}
  depends_on:
    postgres: { condition: service_healthy }
```

```bash
# Start all services (runs migrations automatically)
docker-compose up -d

# View logs (includes migration output)
docker-compose logs -f gateway

# Stop all services
docker-compose down

# Rebuild after code changes
docker-compose up --build -d gateway
```

Flash Gateway includes a zero-configuration database migration system:
How it works:
- Automatic Execution: Migrations run automatically when the gateway container starts
- Idempotent: Safe to restart containers; migrations only run if needed
- Single Schema File: All database schema is consolidated in `migrations/schema.sql`
- Health Check Integration: The gateway won't start if migrations fail

What gets created:
- `request_logs` table with indexes for request tracking
- `guardrail_metrics` table for performance monitoring
- Views for common queries (`recent_request_logs`, `guardrail_performance_summary`)
- Triggers and functions for automatic timestamp updates
Fresh vs Existing Databases:
```bash
# Fresh database - migrations run automatically
docker-compose up -d

# Existing database - migrations skipped if already applied
docker-compose restart gateway
```

Manual Migration (if needed):

```bash
# Run migrations manually
docker exec gateway-app ./migrations/run-migrations.sh

# Connect to database directly
docker exec -it gateway-postgres psql -U gateway -d gateway
```

For production, create a custom docker-compose override:
```yaml
# docker-compose.prod.yml
version: '3.8'
services:
  gateway:
    restart: always
    environment:
      - LOG_LEVEL=info
      - DATABASE_URL=${DATABASE_URL}
      - OPENAI_API_KEY=${OPENAI_API_KEY}
  postgres:
    restart: always
    volumes:
      - /var/lib/postgresql/data:/var/lib/postgresql/data
    environment:
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
```

- `GET /health` - Health check
- `GET /status` - Server status and provider info
- `GET /metrics` - Logging and performance metrics
All OpenAI API endpoints are supported:
- `POST /v1/chat/completions` - Chat completions
- `POST /v1/completions` - Legacy completions
- `POST /v1/embeddings` - Text embeddings
- `GET /v1/models` - Available models
- `POST /v1/audio/speech` - Text-to-speech
- `POST /v1/audio/transcriptions` - Audio transcription
- `POST /v1/images/generations` - Image generation
- `POST /v1/fine-tuning/jobs` - Fine-tuning
- And many more...
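Because the gateway mirrors the upstream API, clients only need to point their base URL at it. A minimal Go sketch of building such a request — the `newChatRequest` helper is illustrative, and `localhost:8080` matches the defaults used elsewhere in this README:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"os"
)

// newChatRequest builds an OpenAI-style chat completion request
// pointed at the gateway instead of api.openai.com.
func newChatRequest(baseURL, apiKey, prompt string) (*http.Request, error) {
	payload := map[string]any{
		"model": "gpt-3.5-turbo",
		"messages": []map[string]string{
			{"role": "user", "content": prompt},
		},
	}
	body, err := json.Marshal(payload)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/v1/chat/completions", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "Bearer "+apiKey)
	return req, nil
}

func main() {
	req, err := newChatRequest("http://localhost:8080", os.Getenv("OPENAI_API_KEY"), "Hello!")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL) // POST http://localhost:8080/v1/chat/completions
	// To actually send it: resp, err := http.DefaultClient.Do(req)
}
```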
```bash
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${OPENAI_API_KEY}" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

The gateway automatically applies configured guardrails:
```bash
# This request will be checked by OpenAI Moderation API
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${OPENAI_API_KEY}" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Tell me something harmful"}]
  }'
# Response: {"error": {"message": "I cannot service this request", "type": "content_policy_violation"}}
```

```bash
curl http://localhost:8080/status
```

Response:
```json
{
  "status": "running",
  "providers": {
    "openai": {
      "endpoints": ["/v1/chat/completions", "/v1/completions", "..."]
    }
  }
}
```

```bash
# Required
export OPENAI_API_KEY=your_api_key_here

# Optional - Database (defaults to docker-compose values)
export DATABASE_URL=postgres://user:pass@localhost:5432/gateway?sslmode=disable

# Optional - Logging
export LOG_LEVEL=info
```

Copy `configs/providers.example.yaml` to `configs/providers.yaml` and customize:
```yaml
server:
  port: ":8080"
  read_timeout: 30   # seconds
  write_timeout: 30  # seconds
  idle_timeout: 120  # seconds

storage:
  type: "postgres"
  postgres:
    url: "${DATABASE_URL}"  # Uses environment variable
    max_connections: 25
    max_idle_conns: 5
    conn_max_lifetime: 60   # minutes

logging:
  enabled: true
  buffer_size: 1000        # Channel buffer size
  batch_size: 10           # Batch insert size
  flush_interval: "1s"     # Force flush interval
  workers: 3               # Number of worker goroutines
  max_body_size: 65536     # Max body size to capture (64KB)
  skip_health_check: true  # Don't log /health requests
  skip_on_error: true      # Don't block requests if logging fails

guardrails:
  enabled: true
  timeout: "5s"
  input_guardrails:
    - name: "openai_moderation"
      type: "openai_moderation"
      enabled: true
      config:
        api_key: "${OPENAI_API_KEY}"
        block_on_flag: true
        categories: ["hate", "violence", "sexual", "self-harm"]

providers:
  - name: openai
    base_url: https://api.openai.com
    endpoints:
      - path: /v1/chat/completions
        methods: ["POST"]
        timeout: 60
      # ... more endpoints
```

Built-in guardrails include:
- OpenAI Moderation: Uses OpenAI's moderation API to check for harmful content
- Example Guardrails: Demonstration guardrails for testing
Custom guardrails can be added by implementing the `Guardrail` interface.
Deploy to AWS with minimal cost (~$5-7/month)
Flash Gateway can be deployed to AWS using App Runner and RDS PostgreSQL with minimal setup:
```bash
# 1. Create RDS database (free tier eligible in ap-south-1)
./scripts/setup-rds.sh

# 2. Run database migrations
DATABASE_URL="postgres://..." ./scripts/run-migrations-rds.sh

# 3. Deploy to App Runner (builds for linux/amd64, works on M1 Macs)
./scripts/deploy-to-apprunner.sh
```

Features:
- ✅ Fully managed infrastructure (zero server maintenance)
- ✅ Automatic HTTPS and SSL certificates
- ✅ Auto-scaling (1-2 instances)
- ✅ Free tier eligible for 12 months (RDS)
- ✅ ~$5-7/month for low traffic (<100 req/day)
- ✅ Cross-platform build support (M1/M2 Macs)
📖 Complete AWS deployment guide: docs/DEPLOYMENT.md
- CPU: 2+ cores recommended
- Memory: 512MB minimum, 2GB+ recommended
- Storage: 10GB+ for logs (with log rotation)
- Network: Reliable internet connection for AI provider APIs
- Set strong PostgreSQL password
- Configure log rotation
- Set up monitoring and alerting
- Enable HTTPS with reverse proxy (nginx/Cloudflare)
- Configure firewall rules
- Set up backup strategy for database
- Monitor disk space for logs
- Configure resource limits in docker-compose
Monitor these metrics:
- Health: `GET /health` endpoint
- Request logs: PostgreSQL `request_logs` table
- Performance: `GET /metrics` endpoint
- Error rates: Check application logs
- Database: Monitor PostgreSQL performance
1. Create provider implementation:

```go
// internal/providers/anthropic/provider.go
type AnthropicProvider struct{}

func (p *AnthropicProvider) GetName() string { return "anthropic" }

func (p *AnthropicProvider) ProxyRequest(req *http.Request) (*http.Response, error) {
    // Implementation
}

// ... implement other Provider interface methods
```

2. Register in router:

```go
// internal/router/router.go
case "anthropic":
    provider = anthropic.NewAnthropicProvider(providerConfig)
```

3. Add to configuration:

```yaml
providers:
  - name: anthropic
    base_url: https://api.anthropic.com
    endpoints:
      - path: /v1/messages
        methods: ["POST"]
        timeout: 60
```
```bash
# Run unit tests
go test ./...

# Test with Docker
docker-compose up -d
```

Flash Gateway includes a web dashboard for monitoring and testing AI gateway traffic.
- Real-time Request Logs: View all API requests with detailed information
- Guardrail Metrics: Monitor content filtering and safety measures
- API Playground: Test endpoints directly through the web interface
- Response Override Tracking: See when and how guardrails modify responses
After running the Docker setup, the dashboard is automatically available:
```bash
# Start all services including dashboard
docker-compose up -d

# Access points:
# - Gateway API:   http://localhost:8080
# - Dashboard UI:  http://localhost:5173
# - Dashboard API: http://localhost:4000
```

- Paginated view of all API requests
- Filterable by endpoint, method, status code
- Click any row to view full request/response details
- Real-time updates as new requests come in
- Monitor all guardrail executions
- See performance metrics (duration, pass/fail rates)
- Track response overrides when content is blocked
- Filter by guardrail name, layer (input/output), or status
- Interactive API testing interface
- Supports both `/v1/chat/completions` and `/v1/responses` endpoints
- Configurable system prompts
- Real-time conversation interface
- Request/response logging in browser console
The dashboard supports hot-reload during development:
```bash
# Dashboard runs in development mode by default
docker-compose up -d

# View dashboard logs
docker-compose logs -f dashboard

# Rebuild after changes
docker-compose up --build -d dashboard
```

For production, set the environment to optimize builds:

```bash
# Set production environment
NODE_ENV=production docker-compose up --build -d
```

- `GET /api/health` - Health check with database status
- `GET /api/request-logs` - Paginated request logs
- `GET /api/request-logs/:id` - Individual request details
- `GET /api/guardrail-metrics` - Paginated guardrail metrics
The dashboard consists of:
- Frontend: React + TypeScript + Vite dev server (port 5173)
- Backend: Express.js API server (port 4000)
- Database: Shared PostgreSQL with gateway for real-time data
All components run in Docker containers with automatic service discovery and health checks.
We welcome contributions! Please see our Contributing Guide for details.
- Fork the repository
- Create a feature branch: `git checkout -b feature/amazing-feature`
- Make your changes and add tests
- Run tests: `go test ./...`
- Commit: `git commit -m 'feat: add amazing feature'`
- Push: `git push origin feature/amazing-feature`
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Documentation: Check this README and CONTRIBUTING.md
- Issues: GitHub Issues
- Security: See our Security Policy
- Support for Anthropic Claude API
- Support for Google Gemini API
- Enhanced rate limiting and caching
- Webhook support for async processing
- Enhanced metrics and monitoring
- Plugin system for custom middleware
- Load balancing across multiple provider instances
Built by the Flash Gateway team