The fastest, lightest, and easiest-to-integrate AI Gateway on the market.
Built by the team at Helicone, open-sourced for the community.
Quick Start • Docs • Discord • Website
Open-source, lightweight, and built on Rust.
Handle hundreds of models and millions of LLM requests with minimal latency and maximum reliability.
The NGINX of LLMs.
- Set up your `.env` file with your `PROVIDER_API_KEY`s:
```
OPENAI_API_KEY=your_openai_key
ANTHROPIC_API_KEY=your_anthropic_key
```
- Run locally in your terminal:
```bash
npx @helicone/ai-gateway@latest
```
- Make your requests using any OpenAI SDK:
```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/ai",
    api_key="placeholder-api-key",  # the gateway handles provider API keys
)

# Route to any LLM provider through the same interface; the gateway handles the rest.
response = client.chat.completions.create(
    model="anthropic/claude-3-5-sonnet",  # or 100+ other models
    messages=[{"role": "user", "content": "Hello from Helicone AI Gateway!"}],
)
```
That's it. No new SDKs to learn, no integrations to maintain. Fully-featured and open-sourced.
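Because every provider sits behind the same OpenAI-compatible surface, switching providers is just a change to the model string. A minimal sketch reusing the `client` from the quickstart above (model names follow the `provider/model` convention used throughout this README):

```python
# (continues from the quickstart client above)
# Same client, different provider: only the model string changes.
response = client.chat.completions.create(
    model="openai/gpt-4o",  # was "anthropic/claude-3-5-sonnet" above
    messages=[{"role": "user", "content": "Hello from Helicone AI Gateway!"}],
)
```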
For advanced config, check out our configuration guide and the providers we support.
Request any LLM provider using familiar OpenAI syntax. Stop rewriting integrations: use one API for OpenAI, Anthropic, Google, AWS Bedrock, and 20+ more providers.
Load balance to always hit the fastest, cheapest, or most reliable option. Built-in strategies include latency-based P2C + PeakEWMA, weighted distribution, and cost optimization. Always aware of provider uptime and rate limits.
Rate limit to prevent runaway costs and usage abuse. Set limits per user, team, or globally, with support for request counts, token usage, and dollar amounts; a client-side handling sketch follows this list.
Cache responses to reduce costs and latency by up to 95%. Supports Redis and S3 backends with intelligent cache invalidation.
Monitor performance and debug issues with built-in Helicone integration, plus OpenTelemetry support for logs, metrics, and traces.
Deploy in seconds to your own infrastructure using our Docker image or prebuilt binary, following our deployment guides.
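When a configured rate limit is exceeded, over-limit requests are rejected. A minimal client-side handling sketch, assuming the gateway signals this with HTTP 429 (which the OpenAI Python SDK surfaces as `RateLimitError`):

```python
import time

import openai
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/ai", api_key="placeholder-api-key")

for attempt in range(3):
    try:
        response = client.chat.completions.create(
            model="anthropic/claude-3-5-sonnet",
            messages=[{"role": "user", "content": "Hello!"}],
        )
        break
    except openai.RateLimitError:
        # Assumption: the gateway rejects over-limit requests with HTTP 429,
        # which the OpenAI SDK raises as RateLimitError.
        time.sleep(2 ** attempt)  # simple exponential backoff: 1s, 2s, 4s
```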
| Metric | Helicone AI Gateway | Typical Setup |
|---|---|---|
| P95 Latency | <10ms | ~60-100ms |
| Memory Usage | ~64MB | ~512MB |
| Requests/sec | ~2,000 | ~500 |
| Binary Size | ~15MB | ~200MB |
| Cold Start | ~100ms | ~2s |
Note: These are preliminary performance metrics. See benchmarks/README.md for detailed benchmarking methodology and results.
```
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│    Your App     │────▶│   Helicone AI   │────▶│  LLM Providers  │
│                 │     │     Gateway     │     │                 │
│   OpenAI SDK    │     │                 │     │ • OpenAI        │
│ (any language)  │     │ • Load Balance  │     │ • Anthropic     │
│                 │     │ • Rate Limit    │     │ • AWS Bedrock   │
│                 │     │ • Cache         │     │ • Google Vertex │
│                 │     │ • Trace         │     │ • 20+ more      │
│                 │     │ • Fallbacks     │     │                 │
└─────────────────┘     └─────────────────┘     └─────────────────┘
                                 │
                                 ▼
                        ┌─────────────────┐
                        │    Helicone     │
                        │  Observability  │
                        │                 │
                        │ • Dashboard     │
                        │ • Observability │
                        │ • Monitoring    │
                        │ • Debugging     │
                        └─────────────────┘
```
Include your `PROVIDER_API_KEY`s in your `.env` file:
```
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
HELICONE_API_KEY=sk-...
```
Note: This is a sample `config.yaml` file. Please refer to our configuration guide for the full list of options, examples, and defaults.
See our full provider list here.
```yaml
helicone: # Include your HELICONE_API_KEY in your .env file
  observability: true
  authentication: true

cache-store:
  in-memory: {}

global: # Global settings for all routers
  cache:
    directive: "max-age=3600, max-stale=1800"

routers:
  your-router-name: # Single router configuration
    load-balance:
      chat:
        strategy: latency
        targets:
          - openai
          - anthropic
    rate-limit:
      per-api-key:
        capacity: 1000
        refill-frequency: 1m # 1000 requests per minute
```
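The `capacity`/`refill-frequency` pair above reads like token-bucket semantics: 1,000 requests, refilled over one minute. An illustrative sketch of that behavior (not the gateway's actual implementation):

```python
import time

class TokenBucket:
    """Illustrative token bucket: `capacity` requests refilled over `refill_seconds`."""

    def __init__(self, capacity: int, refill_seconds: float):
        self.capacity = capacity
        self.tokens = float(capacity)
        self.refill_rate = capacity / refill_seconds  # tokens per second
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.refill_rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(capacity=1000, refill_seconds=60)  # mirrors the config above
```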
```bash
npx @helicone/ai-gateway@latest --config config.yaml
```
```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/router/your-router-name",
    api_key="placeholder-api-key",  # the gateway handles provider API keys
)

# Route to any LLM provider through the same interface; the gateway handles the rest.
response = client.chat.completions.create(
    model="anthropic/claude-3-5-sonnet",  # or 100+ other models
    messages=[{"role": "user", "content": "Hello from Helicone AI Gateway!"}],
)
```
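Router names map one-to-one onto URL paths, so a single gateway can expose several isolated configurations side by side. A hypothetical sketch (`prod-router` and `dev-router` are placeholder names, not part of the sample config above):

```python
from openai import OpenAI

# Each router gets its own base_url; provider API keys stay on the gateway.
prod = OpenAI(base_url="http://localhost:8080/router/prod-router",
              api_key="placeholder-api-key")
dev = OpenAI(base_url="http://localhost:8080/router/dev-router",
             api_key="placeholder-api-key")
```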
```diff
from openai import OpenAI

client = OpenAI(
-    api_key=os.getenv("OPENAI_API_KEY"),
+    api_key="placeholder-api-key",  # the gateway handles provider API keys
+    base_url="http://localhost:8080/router/your-router-name",
)

# No other changes needed!
response = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
```
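For reference, the fully applied Python client after the diff above (a minimal runnable version):

```python
from openai import OpenAI

client = OpenAI(
    api_key="placeholder-api-key",  # the gateway handles provider API keys
    base_url="http://localhost:8080/router/your-router-name",
)

response = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
```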
```diff
import { OpenAI } from "openai";

const client = new OpenAI({
-  apiKey: process.env.OPENAI_API_KEY,
+  apiKey: "placeholder-api-key", // the gateway handles provider API keys
+  baseURL: "http://localhost:8080/router/your-router-name",
});

const response = await client.chat.completions.create({
  model: "openai/gpt-4o",
  messages: [{ role: "user", content: "Hello from Helicone AI Gateway!" }],
});
```
- Full Documentation - Complete guides and API reference
- Quickstart Guide - Get up and running in 1 minute
- Advanced Configurations - Configuration reference & examples
- Discord Server - Our community of passionate AI engineers
- GitHub Discussions - Q&A and feature requests
- Twitter - Latest updates and announcements
- Newsletter - Tips and tricks for deploying AI applications
- Report bugs: GitHub Issues
- Enterprise Support: Book a discovery call with our team
The Helicone AI Gateway is licensed under the Apache License; see the LICENSE file for details.
Made with ❤️ by Helicone.