70 changes: 70 additions & 0 deletions markdown/api-reference.mdx
@@ -409,6 +409,76 @@ curl -X POST http://localhost:8080/v1/chat/completions \
}'
```

### Vision/Multimodal Support

For vision-capable models, you can include images in your requests using either HTTP URLs or base64-encoded data URLs. Vision support must be enabled with `ENABLE_VISION=true` in your configuration.

#### Using HTTP URL

```bash
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "anthropic/claude-3-5-sonnet-20241022",
"messages": [
{
"role": "user",
"content": [
{
"type": "text",
"text": "What is in this image?"
},
{
"type": "image_url",
"image_url": {
"url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
}
}
]
}
]
}'
```

#### Using Base64 Data URL

```bash
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "anthropic/claude-3-5-sonnet-20241022",
"messages": [
{
"role": "user",
"content": [
{
"type": "text",
"text": "Describe this image"
},
{
"type": "image_url",
"image_url": {
"url": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mP8z8DwHwAFBQIAX8jx0gAAAABJRU5ErkJggg=="
}
}
]
}
]
}'
```
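In practice, you will usually build the data URL from a local file rather than pasting base64 by hand. A minimal sketch using the `base64` CLI — `image.png` is a placeholder path, and the `-w0` flag is GNU coreutils (on macOS, use `base64 -i image.png` instead):

```bash
# Encode a local PNG as base64 (GNU coreutils; on macOS: base64 -i image.png)
IMG_B64=$(base64 -w0 image.png)

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d "{
    \"model\": \"anthropic/claude-3-5-sonnet-20241022\",
    \"messages\": [
      {
        \"role\": \"user\",
        \"content\": [
          { \"type\": \"text\", \"text\": \"Describe this image\" },
          { \"type\": \"image_url\", \"image_url\": { \"url\": \"data:image/png;base64,${IMG_B64}\" } }
        ]
      }
    ]
  }"
```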

**Supported Providers with Vision:**

- OpenAI (GPT-4o, GPT-5, GPT-4.1, GPT-4 Turbo)
- Anthropic (Claude 3, Claude 4, Claude 4.5 Sonnet, Claude 4.5 Haiku)
- Google (Gemini 2.5)
- Cohere (Command A Vision, Aya Vision)
- Ollama (LLaVA, Llama 4, Llama 3.2 Vision)
- Groq (vision models)
- Mistral (Pixtral)

**Note:** When `ENABLE_VISION=false` (default), requests containing image content will be rejected even if the model supports vision. This is disabled by default for performance and security reasons.

### Direct API Proxy

For more advanced use cases, you can proxy requests directly to the provider's API:
8 changes: 8 additions & 0 deletions markdown/configuration.mdx
@@ -21,6 +21,11 @@ Environment variables are the primary method for configuring Inference Gateway.
<ConfigTable
rows={[
{ variable: 'ENVIRONMENT', description: 'Deployment environment', defaultValue: 'production' },
{
variable: 'ENABLE_VISION',
description: 'Enable vision/multimodal support for all providers',
defaultValue: 'false',
},
{
variable: 'ENABLE_TELEMETRY',
description: 'Enable OpenTelemetry metrics and tracing',
@@ -30,6 +35,8 @@ Environment variables are the primary method for configuring Inference Gateway.
]}
/>

When `ENABLE_VISION` is set to `true`, Inference Gateway enables vision/multimodal capabilities, allowing you to send images alongside text in chat completion requests. When disabled (default), requests with image content will be rejected even if the provider and model support vision. This is disabled by default for performance and security reasons.
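As a quick sketch, a containerized deployment might enable the flag like this — the image name below is an assumption for illustration, not a documented artifact; substitute whatever you actually deploy:

```bash
# Hypothetical image name — replace with your actual gateway image or binary
docker run --rm -p 8080:8080 \
  -e ENABLE_VISION=true \
  -e ENABLE_TELEMETRY=true \
  ghcr.io/inference-gateway/inference-gateway:latest
```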

When `ENABLE_TELEMETRY` is set to `true`, Inference Gateway exposes a `/metrics` endpoint for Prometheus scraping and generates distributed traces that can be collected by OpenTelemetry collectors.

### OpenID Connect
@@ -369,6 +376,7 @@ Here's a comprehensive example for configuring Inference Gateway in a production
```bash
# General settings
ENVIRONMENT=production
ENABLE_VISION=true
ENABLE_TELEMETRY=true
ENABLE_AUTH=true

71 changes: 71 additions & 0 deletions markdown/examples.mdx
@@ -98,4 +98,75 @@ curl -X POST http://localhost:8080/v1/chat/completions \
}'
```

### Vision/Multimodal Image Processing

Process images with vision-capable models. First, enable vision support:

```bash
ENABLE_VISION=true
```
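If you run the gateway directly from a shell, exporting the variable before launching is enough — a sketch, with `./inference-gateway` standing in for however you actually start the service:

```bash
export ENABLE_VISION=true
./inference-gateway   # stand-in for your actual launch command
```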

#### Using HTTP Image URL

```bash
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "anthropic/claude-3-5-sonnet-20241022",
"messages": [
{
"role": "user",
"content": [
{
"type": "text",
"text": "What is in this image?"
},
{
"type": "image_url",
"image_url": {
"url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
}
}
]
}
]
}'
```

#### Using Base64 Data URL

```bash
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "anthropic/claude-3-5-sonnet-20241022",
"messages": [
{
"role": "user",
"content": [
{
"type": "text",
"text": "What color is this pixel?"
},
{
"type": "image_url",
"image_url": {
"url": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mP8z8DwHwAFBQIAX8jx0gAAAABJRU5ErkJggg=="
}
}
]
}
]
}'
```

**Supported Vision Models:**

- `anthropic/claude-3-5-sonnet-20241022` (Claude 3.5 Sonnet)
- `anthropic/claude-3-5-haiku-20241022` (Claude 3.5 Haiku)
- `openai/gpt-4o`
- `google/gemini-2.5-flash`
- `ollama/llava`
- And more...

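Since every model is served through the same OpenAI-compatible endpoint, switching vision models is a one-string change. A sketch comparing a few of the models above on the same image — it assumes the standard `choices[0].message.content` response shape and requires `jq`:

```bash
IMAGE_URL="https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"

for MODEL in "openai/gpt-4o" "google/gemini-2.5-flash" "ollama/llava"; do
  echo "== ${MODEL} =="
  curl -s -X POST http://localhost:8080/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d "{
      \"model\": \"${MODEL}\",
      \"messages\": [{
        \"role\": \"user\",
        \"content\": [
          { \"type\": \"text\", \"text\": \"Describe this image in one sentence.\" },
          { \"type\": \"image_url\", \"image_url\": { \"url\": \"${IMAGE_URL}\" } }
        ]
      }]
    }" | jq -r '.choices[0].message.content // .error'
done
```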
For more detailed examples and use cases, check out the full [examples directory](https://github.com/inference-gateway/inference-gateway/tree/main/examples) in the GitHub repository.
174 changes: 133 additions & 41 deletions markdown/supported-providers.mdx
@@ -11,61 +11,153 @@ The following LLM providers are currently supported:
<div className="border p-4 rounded-md">
  <h3>OpenAI</h3>
  <p>Access GPT models including GPT-3.5, GPT-4, and more.</p>
  <div><strong>Authentication:</strong> Bearer Token</div>
  <div><strong>Default URL:</strong> https://api.openai.com/v1</div>
  <div><strong>Vision Support:</strong> ✅ Yes (GPT-4o, GPT-5, GPT-4.1, GPT-4 Turbo)</div>
</div>

<div className="border p-4 rounded-md">
  <h3>DeepSeek</h3>
  <p>Use DeepSeek's models for various natural language tasks.</p>
  <div>
    <strong>Authentication:</strong> Bearer Token
  </div>
  <div>
    <strong>Default URL:</strong> https://api.deepseek.com
  </div>
  <div>
    <strong>Vision Support:</strong> ❌ No
  </div>
</div>

<div className="border p-4 rounded-md">
  <h3>Anthropic</h3>
  <p>Connect to Claude models for high-quality conversational AI.</p>
  <div>
    <strong>Authentication:</strong> X-Header
  </div>
  <div>
    <strong>Default URL:</strong> https://api.anthropic.com/v1
  </div>
  <div>
    <strong>Vision Support:</strong> ✅ Yes (Claude 3, Claude 4, Claude 4.5 Sonnet, Claude 4.5 Haiku)
  </div>
</div>

<div className="border p-4 rounded-md">
  <h3>Cohere</h3>
  <p>Use Cohere's models for various natural language tasks.</p>
  <div>
    <strong>Authentication:</strong> Bearer Token
  </div>
  <div>
    <strong>Default URL:</strong> https://api.cohere.com
  </div>
  <div>
    <strong>Vision Support:</strong> ✅ Yes (Command A Vision, Aya Vision)
  </div>
</div>

<div className="border p-4 rounded-md">
  <h3>Groq</h3>
  <p>Access high-performance inference with Groq's LPU-accelerated models.</p>
  <div>
    <strong>Authentication:</strong> Bearer Token
  </div>
  <div>
    <strong>Default URL:</strong> https://api.groq.com/openai/v1
  </div>
  <div>
    <strong>Vision Support:</strong> ✅ Yes (vision models)
  </div>
</div>

<div className="border p-4 rounded-md">
  <h3>Cloudflare</h3>
  <p>Connect to Cloudflare Workers AI for inference on various models.</p>
  <div>
    <strong>Authentication:</strong> Bearer Token
  </div>
  <div>
    <strong>Default URL:</strong> https://api.cloudflare.com/client/v4/accounts/
  </div>
  <div className="text-sm">{'{ACCOUNT_ID}'}/ai</div>
  <div>
    <strong>Vision Support:</strong> ❌ No
  </div>
</div>

<div className="border p-4 rounded-md">
  <h3>Ollama</h3>
  <p>Run open-source models locally or on a self-hosted server.</p>
  <div>
    <strong>Authentication:</strong> None (optional API key)
  </div>
  <div>
    <strong>Default URL:</strong> http://ollama:8080/v1
  </div>
  <div>
    <strong>Vision Support:</strong> ✅ Yes (LLaVA, Llama 4, Llama 3.2 Vision)
  </div>
</div>

<div className="border p-4 rounded-md">
  <h3>Google</h3>
  <p>Access Google's Gemini models for text generation and understanding.</p>
  <div><strong>Authentication:</strong> Bearer Token</div>
  <div><strong>Default URL:</strong> https://generativelanguage.googleapis.com/v1</div>
  <div><strong>Vision Support:</strong> ✅ Yes (Gemini 2.5)</div>
</div>
</div>

## Vision/Multimodal Support

Several providers support vision/multimodal capabilities, allowing you to process images alongside text. To use vision features, you must enable them in your configuration:

```bash
ENABLE_VISION=true
```

**Note:** Vision support is disabled by default for performance and security reasons. When disabled, requests containing image content will be rejected even if the model supports vision.

### Providers with Vision Support

- **OpenAI**: GPT-4o, GPT-5, GPT-4.1, GPT-4 Turbo
- **Anthropic**: Claude 3, Claude 4, Claude 4.5 Sonnet, Claude 4.5 Haiku
- **Google**: Gemini 2.5
- **Cohere**: Command A Vision, Aya Vision
- **Ollama**: LLaVA, Llama 4, Llama 3.2 Vision
- **Groq**: Vision models
- **Mistral**: Pixtral

### Example Vision Request

```bash
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "anthropic/claude-3-5-sonnet-20241022",
"messages": [
{
"role": "user",
"content": [
{
"type": "text",
"text": "What is in this image?"
},
{
"type": "image_url",
"image_url": {
"url": "https://example.com/image.jpg"
}
}
]
}
]
}'
```
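To extract just the assistant's reply, pipe the response through `jq` — this assumes the OpenAI-compatible response shape used throughout these docs, with the request body above saved to a (hypothetical) `vision-request.json`:

```bash
curl -s -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d @vision-request.json | jq -r '.choices[0].message.content'
```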

## Using Providers

### Provider Configuration
2 changes: 1 addition & 1 deletion public/search-index.json

Large diffs are not rendered by default.