167 changes: 147 additions & 20 deletions docs/getting-started/advanced-topics/scaling.md

Large diffs are not rendered by default.

@@ -6,38 +6,30 @@ title: "OpenAI"

## Overview

Open WebUI makes it easy to connect to **OpenAI** and **Azure OpenAI**. This guide will walk you through adding your API key, setting the correct endpoint, and selecting models — so you can start chatting right away.

For other providers that offer an OpenAI-compatible API (Anthropic, Google Gemini, Mistral, Groq, DeepSeek, and many more), see the **[OpenAI-Compatible Providers](/getting-started/quick-start/connect-a-provider/starting-with-openai-compatible)** guide.

---

## Important: Protocols, Not Providers

Open WebUI is a **protocol-centric** platform. While we provide first-class support for OpenAI models, we do so mainly through the **OpenAI Chat Completions API protocol**.

We focus on universal standards shared across dozens of providers, with experimental support for emerging standards like **[Open Responses](https://www.openresponses.org/)**. For a detailed explanation, see our **[FAQ on protocol support](/faq#q-why-doesnt-open-webui-natively-support-provider-xs-proprietary-api)**.

---

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

## Step 1: Get Your OpenAI API Key

To use OpenAI models (such as GPT-4 or o3-mini), you need an API key:

- **OpenAI**: Get your key at [platform.openai.com/account/api-keys](https://platform.openai.com/account/api-keys)
- **Azure OpenAI**: Get your key from the [Azure Portal](https://portal.azure.com/)
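
If you prefer to pre-configure the connection instead of adding it through the UI in Step 2, the key and endpoint can also be supplied as environment variables when starting the container. A minimal sketch using the documented `OPENAI_API_KEY` and `OPENAI_API_BASE_URL` settings (the key below is a placeholder):

```bash
# Start Open WebUI with the OpenAI connection pre-seeded (placeholder key)
docker run -d -p 3000:8080 --name open-webui \
  -e OPENAI_API_KEY=sk-your-key-here \
  -e OPENAI_API_BASE_URL=https://api.openai.com/v1 \
  ghcr.io/open-webui/open-webui:main
```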

---


## Step 2: Add the API Connection in Open WebUI

Once Open WebUI is running:
@@ -47,13 +39,11 @@
3. Click ➕ **Add New Connection**.

<Tabs>
<TabItem value="standard" label="Standard / Compatible" default>

Use this for **OpenAI**, **DeepSeek**, **MiniMax**, **OpenRouter**, **LocalAI**, **FastChat**, **Helicone**, **LiteLLM**, **Vercel AI Gateway** etc.
<TabItem value="standard" label="OpenAI" default>

* **Connection Type**: External
* **URL**: `https://api.openai.com/v1`
* **API Key**: Your secret key (starts with `sk-...`)
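
Before saving, you can sanity-check the key outside Open WebUI. A quick sketch against OpenAI's standard `/models` endpoint (substitute your real key):

```bash
# Lists the models your key can access; a 401 response means the key is invalid
curl https://api.openai.com/v1/models \
  -H "Authorization: Bearer sk-your-key-here"
```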

</TabItem>
<TabItem value="azure" label="Azure OpenAI">
@@ -74,17 +64,7 @@
* **Model IDs (Filter)**:
* *Default (Empty)*: Auto-detects all available models from the provider.
* *Set*: Acts as an **Allowlist**. Only the specific model IDs you enter here will be visible to users. Use this to hide older or expensive models.

* **Prefix ID**:
* If you connect multiple providers that have models with the same name (e.g., two providers both offering `llama3`), add a prefix here (e.g., `groq/`) to distinguish them. The model will appear as `groq/llama3`.

@@ -113,9 +93,9 @@

Once your connection is saved, you can start using models right inside Open WebUI.

🧠 You don't need to download any models — just select one from the Model Selector and start chatting. If a model is supported by your provider, you'll be able to use it instantly via their API.

Here's what model selection looks like:

![OpenAI Model Selector](/images/getting-started/quick-start/selector-openai.png)

@@ -125,9 +105,9 @@ Simply choose GPT-4, o3-mini, or any compatible model offered by your provider.

## All Set!

That's it! Your OpenAI API connection is ready to use.

If you want to connect other providers (Anthropic, Google Gemini, Mistral, Groq, DeepSeek, etc.), see the **[OpenAI-Compatible Providers](/getting-started/quick-start/connect-a-provider/starting-with-openai-compatible)** guide.

If you run into issues or need additional support, visit our [help section](/troubleshooting).

4 changes: 3 additions & 1 deletion docs/getting-started/quick-start/tab-kubernetes/Helm.md
@@ -34,7 +34,9 @@ Helm helps you manage Kubernetes applications.
If you intend to scale Open WebUI across multiple nodes/pods/workers in a clustered environment, you need to set up a key-value store (Redis).
Several [environment variables](https://docs.openwebui.com/reference/env-configuration/) must be set to the same value on every service instance; otherwise you will run into consistency problems, faulty sessions, and other issues.

**Important:** The default vector database (ChromaDB) uses a local SQLite-backed client that is **not safe for multi-replica or multi-worker deployments**. SQLite connections are not fork-safe, and concurrent writes from multiple processes will crash workers instantly. You **must** switch to an external vector database (PGVector, Milvus, Qdrant) via [`VECTOR_DB`](https://docs.openwebui.com/reference/env-configuration#vector_db), or run ChromaDB as a separate HTTP server via [`CHROMA_HTTP_HOST`](https://docs.openwebui.com/reference/env-configuration#chroma_http_host).

For the complete step-by-step scaling walkthrough, see [Scaling Open WebUI](https://docs.openwebui.com/getting-started/advanced-topics/scaling). For troubleshooting multi-replica issues, see the [Scaling & HA guide](https://docs.openwebui.com/troubleshooting/multi-replica).
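
A minimal sketch of the relevant settings; the connection string and hostname are placeholders, and `PGVECTOR_DB_URL` assumes you chose PGVector:

```bash
# Replace the SQLite-backed Chroma default with an external vector DB
export VECTOR_DB=pgvector
export PGVECTOR_DB_URL=postgresql://user:pass@postgres:5432/openwebui  # placeholder DSN

# ...or keep Chroma, but run it as a separate HTTP server
# export CHROMA_HTTP_HOST=chromadb.example.svc
```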

:::

2 changes: 2 additions & 0 deletions docs/troubleshooting/connection-error.mdx
@@ -90,6 +90,8 @@ WEBSOCKET_MANAGER=redis
WEBSOCKET_REDIS_URL=redis://redis:6379/1
```

For detailed Redis setup instructions, see [Redis WebSocket Support](/tutorials/integrations/redis). For a complete multi-instance scaling walkthrough, see [Scaling Open WebUI](/getting-started/advanced-topics/scaling). If you're seeing WebSocket 403 errors specifically in a multi-replica setup, see [Scaling & HA Troubleshooting](/troubleshooting/multi-replica#2-websocket-403-errors--connection-failures).
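
As a quick smoke test (assuming the `redis://redis:6379/1` URL above and a host with `redis-cli` installed), you can confirm Redis is reachable before starting multiple workers:

```bash
# Ping the Redis instance Open WebUI will use for WebSocket state
redis-cli -u redis://redis:6379/1 ping   # expected reply: PONG
```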

### Testing Your Configuration

To verify your setup is working:
5 changes: 5 additions & 0 deletions docs/troubleshooting/multi-replica.mdx
@@ -7,6 +7,8 @@ title: "Scaling & HA"

This guide addresses common issues encountered when deploying Open WebUI in **multi-replica** environments (e.g., Kubernetes, Docker Swarm) or when using **multiple workers** (`UVICORN_WORKERS > 1`) for increased concurrency.

If you are setting up a scaled deployment for the first time, start with the [Scaling Open WebUI](/getting-started/advanced-topics/scaling) guide for a step-by-step walkthrough.
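
As a rough sketch, even a single host running several workers needs the same Redis-backed settings as a multi-replica cluster. The values below are illustrative placeholders and must be identical for every worker and replica:

```bash
# Illustrative only — every replica/worker must see identical values
export UVICORN_WORKERS=4                    # >1 requires Redis-backed session/WebSocket state
export WEBSOCKET_MANAGER=redis
export WEBSOCKET_REDIS_URL=redis://redis:6379/1
export DATABASE_URL=postgresql://user:pass@postgres:5432/openwebui  # shared DB instead of per-pod SQLite
```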

## Core Requirements Checklist

Before troubleshooting specific errors, ensure your deployment meets these **absolute requirements** for a multi-replica setup. Missing any of these will cause instability, login loops, or data loss.
@@ -253,8 +255,11 @@

## Related Documentation

- [Scaling Open WebUI](/getting-started/advanced-topics/scaling) — Step-by-step guide to scaling from single instance to production
- [Environment Variable Configuration](/reference/env-configuration)
- [Optimization, Performance & RAM Usage](/troubleshooting/performance)
- [Redis WebSocket Support](/tutorials/integrations/redis) — Detailed Redis setup tutorial
- [Troubleshooting Connection Errors](/troubleshooting/connection-error)
- [RAG Troubleshooting](/troubleshooting/rag) — Document upload and embedding issues
- [Logging Configuration](/getting-started/advanced-topics/logging)

2 changes: 2 additions & 0 deletions docs/troubleshooting/performance.md
@@ -149,6 +149,8 @@

If you are deploying for **enterprise scale** (hundreds of users), simple Docker Compose setups may not suffice. You will need to move to a clustered environment.

For a step-by-step walkthrough of the entire scaling journey (PostgreSQL, Redis, vector DB, storage, observability), see the **[Scaling Open WebUI](/getting-started/advanced-topics/scaling)** guide.

* **Kubernetes / Helm**: For deploying on K8s with multiple replicas, see the **[Multi-Replica & High Availability Guide](/troubleshooting/multi-replica)**.
* **Redis (Mandatory)**: When running multiple workers (`UVICORN_WORKERS > 1`) or multiple replicas, **Redis is required** to handle WebSocket connections and session syncing. See **[Redis Integration](/tutorials/integrations/redis)**.
* **Load Balancing**: Ensure your Ingress controller supports **Session Affinity** (Sticky Sessions) for best performance.
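
With the ingress-nginx controller, for example, cookie-based affinity can be enabled through annotations. A sketch only; the ingress name `open-webui` is an assumption about your deployment:

```bash
# Enable cookie-based sticky sessions on an existing ingress (ingress-nginx)
kubectl annotate ingress open-webui \
  nginx.ingress.kubernetes.io/affinity=cookie \
  nginx.ingress.kubernetes.io/session-cookie-name=open-webui-affinity
```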
118 changes: 0 additions & 118 deletions docs/tutorials/integrations/llm-providers/amazon-bedrock.md

This file was deleted.
