open-webui · Classic298 · Apr 24, 2026
diff --git a/docs/features/authentication-access/api-keys.md b/docs/features/authentication-access/api-keys.md
@@ -116,6 +116,22 @@ print(response.json())
 
 For the full endpoint reference - chat completions, Ollama proxy, RAG, file management, and more - see [API Endpoints](/reference/api-endpoints).
 
+### Behind a reverse proxy that consumes `Authorization`?
+
+If Open WebUI sits behind a gateway that uses the `Authorization` header for its own auth (basic auth, SSO sidecar, corporate API gateway, mutual-TLS adapter, etc.), clients can deliver the API key via a dedicated header instead. The middleware checks, in order: `Authorization: Bearer`, the `token` cookie, and a configurable custom header.
+
+The custom header defaults to `x-api-key`, and admins can rename it via the [`CUSTOM_API_KEY_HEADER`](/reference/env-configuration#custom_api_key_header) environment variable to avoid collisions with anything else in the request chain. The example below assumes the header has been renamed to `X-OpenWebUI-Key`.
+
+```bash
+curl -H "X-OpenWebUI-Key: YOUR_API_KEY" \
+  http://openwebui.internal/api/models
+```
+
+```
+# Open WebUI container env
+CUSTOM_API_KEY_HEADER=X-OpenWebUI-Key
+```
+
 ---
 
 ## Best Practices

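Note on the `api-keys.md` change above: the documented lookup order (`Authorization: Bearer`, then the `token` cookie, then the custom header) can be illustrated with a small framework-free sketch. This is not Open WebUI's actual middleware. The function name `resolve_credential` and the decision to skip non-Bearer `Authorization` values are assumptions made purely for illustration.

```python
from typing import Mapping, Optional


def resolve_credential(
    headers: Mapping[str, str],
    cookies: Mapping[str, str],
    custom_header: str = "x-api-key",  # documented default of CUSTOM_API_KEY_HEADER
) -> Optional[str]:
    """Return the first credential found, following the documented order."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}

    # 1. Authorization: Bearer <token>
    #    (assumption: other schemes, e.g. Basic, are ignored by this sketch)
    scheme, _, value = lowered.get("authorization", "").partition(" ")
    if scheme.lower() == "bearer" and value.strip():
        return value.strip()

    # 2. `token` cookie
    if cookies.get("token"):
        return cookies["token"]

    # 3. Custom header (assumption: any non-empty value counts)
    return lowered.get(custom_header.lower()) or None


# A proxy stripped Authorization, so the key arrives in the renamed header.
print(resolve_credential({"X-OpenWebUI-Key": "sk-example"}, {}, "X-OpenWebUI-Key"))
```

In this sketch a request that carries none of the three sources resolves to `None`, which corresponds to the anonymous fall-through described in the `env-configuration.mdx` change.
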
diff --git a/docs/features/extensibility/plugin/functions/filter.mdx b/docs/features/extensibility/plugin/functions/filter.mdx
@@ -73,11 +73,11 @@ class Filter:
 
 ---
 
-### 🆕 🧲 Toggle Filter Example: Adding Interactivity and Icons (New in Open WebUI 0.6.10)
+### 🧲 Toggleable Filters: Making Filters User-Controllable (`self.toggle`)
 
-Filters can do more than simply modify text—they can expose UI toggles and display custom icons. For instance, you might want a filter that can be turned on/off with a user interface button, and displays a special icon in Open WebUI’s message input UI.
+By default a filter that's **active and in scope** (global, or attached to the model) runs on every request — the user has no say in it. That's often what you want (PII scrubbing, logging, mandatory guardrails). Sometimes you want the opposite: let the user decide whether the filter runs for a given conversation.
 
-Here’s how you could create such a toggle filter:
+Set `self.toggle = True` to make the filter **user-controllable**. The filter then shows up in the chat UI as a clickable chip and as an entry in the Integrations menu, and only runs on requests where the user has it selected.
 
 ```python
 from pydantic import BaseModel, Field
@@ -89,36 +89,58 @@ class Filter:
 
     def __init__(self):
         self.valves = self.Valves()
-        self.toggle = True # IMPORTANT: This creates a switch UI in Open WebUI
+        self.toggle = True  # Make this filter user-controllable (see notes below)
         # TIP: Use a hosted URL for your icon instead of base64 to avoid API payload bloat.
         # See the Action Function docs for details on why base64 icons are not recommended.
         self.icon = "https://example.com/icons/lightbulb.svg"
-        pass
 
     async def inlet(
         self, body: dict, __event_emitter__, __user__: Optional[dict] = None
     ) -> dict:
+        # This method ONLY runs when the filter is currently selected by the user.
+        # You do NOT need to branch on self.toggle inside here — see the note below.
         await __event_emitter__(
             {
                 "type": "status",
-                "data": {
-                    "description": "Toggled!",
-                    "done": True,
-                    "hidden": False,
-                },
+                "data": {"description": "Running!", "done": True, "hidden": False},
             }
         )
         return body
 ```
 
-#### 🖼️ What’s happening?
--   **toggle = True** creates a switch UI in Open WebUI—users can manually enable or disable the filter in real time.
--   **icon** will show up as a little image next to the filter’s name. You can use a URL pointing to any image (SVG, PNG, JPEG, etc.). While base64 data URIs are technically supported, **using a hosted URL is strongly recommended** to avoid bloating the `/api/models` payload — see the [Action Function icon_url warning](/features/extensibility/plugin/functions/action#example---specifying-action-frontmatter) for details.
--   **The `inlet` function** uses the `__event_emitter__` special argument to broadcast feedback/status to the UI, such as a little toast/notification that reads "Toggled!"
+#### What `self.toggle = True` actually does
+
+It is a **visibility / gating flag**, read once at request-dispatch time — **not** a runtime state the UI flips on your Python object. Specifically:
+
+- **Visibility:** the filter only appears in the chat UI (inline chip + Integrations menu entry) when `self.toggle = True` **and** it is either a global filter or attached to the selected model. Without `self.toggle`, the filter still runs (if active and in scope) but has no UI surface — users can't turn it off.
+- **Gating:** at request time the backend checks the user's current `filter_ids` selection. If the filter is in that list, `inlet()` / `stream()` / `outlet()` run. If not, the filter is **not invoked at all**.
+- **`self.toggle` is never mutated by the UI.** Inside `inlet()` it is always whatever you set in `__init__` — which will be `True` for every call that actually runs, because if the user had disabled the filter, `inlet()` wouldn't be running. Don't build logic that reads `self.toggle` at runtime; it's not a live on/off signal.
+
+:::warning Upgrading from pre-0.9.0 filters
+Some older filters used a pattern like `if self.toggle: enable_feature() else: disable_feature()` inside `inlet()`, hoping to read the UI state back on every request. **That pattern was never reliable and is effectively dead on 0.9.0+.** `inlet()` simply isn't called when the filter is disabled in the UI, so there is no "else" branch to hit. The correct migration is to stop branching on `self.toggle` entirely and just do the work unconditionally — the user controls whether `inlet()` runs by selecting/deselecting the chip. If you need user-driven config (a numeric threshold, a target language, etc.), expose it through [`UserValves`](#⚙%EF%B8%8F-filter-administration--configuration) instead; clicking the chip opens the user's valves modal automatically.
+:::
+
+#### How users interact with a toggleable filter
+
+When a toggleable filter is in scope for the current chat, two UI surfaces show up:
+
+- **Inline chip in the chat input bar.** Shows the filter's `self.icon` + name. Clicking it:
+  - **opens the user-valves modal** if the filter defines a `UserValves` class (so the user can tune per-chat settings), **otherwise**
+  - **removes the filter from the current selection** for this chat session (the chip disappears from the inline row).
+- **Integrations menu (⚙️ icon).** Lists every toggleable filter in scope, each with a proper on/off **Switch**. This is where users re-enable a filter they removed from the chip row, or switch one off that was selected by default.
+
+If the chip is present, the filter is enabled for the next request. If the chip is absent but the filter is still listed in the Integrations menu, the user has turned it off.
+
+#### Where the selection lives
+
+- Stored in the browser as **sessionStorage draft state**, keyed per chat.
+- Survives page reloads in the same browser session but is **not persisted to the chat record on the server**.
+- Initial state comes from the model's `defaultFilterIds` (Admin Panel → Model Settings → Default Filters) — admins decide which toggleable filters start **on** vs **off** per model.
+- Resets when the user switches to a different model.
 
 ![Toggle Filter](/images/features/plugin/functions/toggle-filter.png)
 
-You can use these mechanisms to make your filters dynamic, interactive, and visually unique within Open WebUI’s plugin ecosystem.
+`self.icon` continues to work as before: pass a URL (strongly preferred) or a base64 data URI, and it renders in both the inline chip and the Integrations menu entry. See the [Action Function icon_url warning](/features/extensibility/plugin/functions/action#example---specifying-action-frontmatter) for why hosted URLs are recommended over base64.
 
 ---
 
@@ -276,18 +298,19 @@ class ContentModerationFilter:
 #### Toggleable Filters (`toggle=True`)
 
 **Characteristics:**
-- Appear as **switches in the chat UI** (in the integrations menu - ⚙️ icon)
-- Users can **enable/disable** them per chat session
-- **Do** appear in the "Default Filters" section
-- `defaultFilterIds` controls their initial state (ON or OFF)
+- Appear in the **chat input bar as a clickable chip** and in the **Integrations menu (⚙️ icon) as a Switch**.
+- Users can add or remove them from the active selection on a per-chat, per-session basis. Selection is stored in browser sessionStorage — not persisted to the chat record on the server.
+- **Do** appear in the model's "Default Filters" configuration.
+- `defaultFilterIds` on the model controls the **initial selection** (which toggleable filters start on when a new chat begins with that model).
+- `self.toggle` itself is **never mutated at runtime** — it's a visibility/gating flag read once at request dispatch. `inlet()` only runs when the filter is currently selected; there is no "else" branch to write inside the filter. See [the detailed note above](#what-selftoggle--true-actually-does).
 
 **Use Cases:**
 - **Web search integration** - User decides when to search the web for context
 - **Citation mode** - User controls when to require sources in responses
 - **Verbose/detailed mode** - User toggles between concise and detailed responses
 - **Translation filters** - User enables translation to/from specific languages
 - **Code formatting** - User chooses when to apply syntax highlighting or linting
-- **Thinking/reasoning toggle** - Show or hide model's chain-of-thought reasoning
+- **Thinking/reasoning toggle** - User turns the underlying model's thinking mode on by selecting the filter and off by deselecting it (do the work unconditionally in `inlet()`; it only runs while the filter is selected)
 - **Markdown rendering** - Toggle between raw text and formatted output
 - **Anonymization mode** - User enables when discussing sensitive topics
 - **Expert mode** - Inject domain-specific context (legal, medical, technical)
@@ -297,11 +320,12 @@ class ContentModerationFilter:
 ```python
 class WebSearchFilter:
     def __init__(self):
-        self.toggle = True  # User can turn on/off
-        self.icon = "https://example.com/icons/web-search.svg"  # Shows in UI
-    
+        self.toggle = True  # Make user-controllable
+        self.icon = "https://example.com/icons/web-search.svg"
+
     async def inlet(self, body: dict, __event_emitter__) -> dict:
-        # Only runs when user has enabled this filter
+        # This only runs when the user has this filter selected.
+        # Do NOT branch on self.toggle here — it's always True when this runs.
         await __event_emitter__({
             "type": "status",
             "data": {"description": "Searching the web...", "done": False}
@@ -312,12 +336,15 @@ class WebSearchFilter:
 
 **Where Toggleable Filters Appear:**
 
-1.  **Model Settings → Default Filters Section**
-    -   Configure which filters start enabled
-2.  **Chat UI → Integrations Menu (⚙️ icon)**
-    -   Users can toggle filters on/off per chat
-    -   Shows custom icons if provided
-    -   Realtime enable/disable
+1.  **Model Settings → Default Filters**
+    - Admin picks which toggleable filters start in the selection on new chats with that model.
+2.  **Chat input bar → Inline chip**
+    - Shown for every toggleable filter currently in the user's selection for this chat.
+    - Clicking the chip **opens the user-valves modal** if the filter defines `UserValves`, otherwise **removes the filter from the selection** (it stays listed in the Integrations menu, where the user can re-enable it).
+    - `self.icon` renders as the chip's image.
+3.  **Chat UI → Integrations Menu (⚙️ icon)**
+    - Lists every toggleable filter in scope for the current model, each with a proper on/off **Switch**.
+    - Used to re-enable a filter removed from the chip row, or to turn off a filter selected by default.
 
 ---
 
@@ -337,16 +364,18 @@ Here's the complete flow from admin configuration to filter execution:
 - DefaultFiltersSelector: Set default enabled state (only for toggleable filters)
 
 **3. CHAT UI (User Interaction - Toggleable Filters Only)**
-- Chat → Integrations Menu (⚙️) → Toggle Filters
-- Users can enable/disable toggleable filters
-- Always-on filters run automatically (no UI control)
+- Chat input bar → Inline chip (click opens the UserValves modal if defined, otherwise removes the filter from the selection)
+- Chat → Integrations Menu (⚙️) → Switch per toggleable filter
+- Frontend tracks `selectedFilterIds` in sessionStorage (per chat, per session)
+- Initial selection seeded from the model's `defaultFilterIds`
+- Always-on filters (no `self.toggle`) run automatically with no UI control
 
 **4. REQUEST PROCESSING (Filter Compilation)**
-- Backend: get_sorted_filter_ids()
-- Fetch global filters (is_global=True, is_active=True)
-- Add model-specific filters from model.meta.filterIds
-- Filter by is_active status
-- For toggleable filters: Check user's enabled state
+- Frontend ships the current `selectedFilterIds` with the request
+- Backend: `get_sorted_filter_ids(request, model, filter_ids)`
+- Fetch global filters (`is_global=True`, `is_active=True`) + model-specific filters from `model.meta.filterIds`
+- Filter by `is_active` status
+- **For toggleable filters:** keep only the ones whose ID is in the request's `filter_ids` — others are dropped entirely (never invoked, `self.toggle` never read)
 - Sort by priority (from valves)
 
 **5. FILTER EXECUTION**

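Note on the `filter.mdx` change above: the gating rule in step 4 (always-on filters run whenever they are active; toggleable filters run only if their ID is in the request's `filter_ids`) can be sketched as a pure function. This is a simplified model rather than the real `get_sorted_filter_ids`. The `FilterInfo` class, the `select_filters` name, and the ascending priority order are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class FilterInfo:  # hypothetical stand-in for a loaded filter
    id: str
    toggle: bool  # True when the filter sets self.toggle = True
    is_active: bool = True
    priority: int = 0


def select_filters(in_scope: Iterable[FilterInfo], filter_ids: Iterable[str]) -> List[str]:
    """Return the IDs of filters that would run, in priority order."""
    selected = set(filter_ids)
    kept = [
        f
        for f in in_scope
        # Inactive filters never run; toggleable ones need to be selected.
        if f.is_active and (not f.toggle or f.id in selected)
    ]
    # Assumption: lower priority values run first.
    return [f.id for f in sorted(kept, key=lambda f: f.priority)]


filters = [
    FilterInfo("web_search", toggle=True, priority=2),
    FilterInfo("pii_scrub", toggle=False, priority=0),
    FilterInfo("translate", toggle=True, priority=1),
]
print(select_filters(filters, ["translate"]))
```

Here `web_search` is dropped before any of its methods could be called, which is why a filter never needs an "else" branch on `self.toggle`.
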
diff --git a/docs/reference/api-endpoints.md b/docs/reference/api-endpoints.md
@@ -9,6 +9,10 @@ This guide provides essential information on how to interact with the API endpoi
 
 To ensure secure access to the API, authentication is required 🛡️. You can authenticate your API requests using the Bearer Token mechanism. Obtain your API key from **Settings > Account** in the Open WebUI, or alternatively, use a JWT (JSON Web Token) for authentication. For full instructions on enabling and generating API keys - including the admin toggle and group permissions required for non-admin users - see [API Keys](/features/authentication-access/api-keys).
 
+:::tip Alternate credential header for proxy-heavy setups
+When Open WebUI is behind a reverse proxy that already uses the `Authorization` header for its own auth, you can deliver the API key via a custom header instead (`x-api-key` by default). Admins can rename the header via the [`CUSTOM_API_KEY_HEADER`](/reference/env-configuration#custom_api_key_header) environment variable to avoid collisions — see [Behind a reverse proxy that consumes `Authorization`?](/features/authentication-access/api-keys#behind-a-reverse-proxy-that-consumes-authorization) for the full pattern.
+:::
+
 ## Swagger Documentation Links
 
 :::important

diff --git a/docs/reference/env-configuration.mdx b/docs/reference/env-configuration.mdx
@@ -1514,6 +1514,37 @@ This variable replaces the deprecated `API_KEY_ALLOWED_ENDPOINTS` environment va
 
 :::
 
+#### `CUSTOM_API_KEY_HEADER`
+
+- Type: `str`
+- Default: `x-api-key`
+- Description: Name of the HTTP header the auth middleware checks for API-key credentials. Useful when Open WebUI sits behind a reverse proxy or API gateway that consumes the `Authorization` header for its own authentication — set this to a distinct header (for example `X-OpenWebUI-Key`) so clients can deliver their Open WebUI API key without colliding with the proxy's own auth.
+- Read at startup from the process environment (not a `PersistentConfig`).
+
+**How the auth middleware picks up a credential**, in order:
+
+1. `Authorization: Bearer <token>` (most common — API key or JWT)
+2. `token` cookie (holds a bearer token; used by the WebUI itself)
+3. The header named by `CUSTOM_API_KEY_HEADER` (default `x-api-key`)
+
+If none of the three is present, the request falls through as anonymous.
+
+:::tip Behind a proxy that eats `Authorization`?
+Many corporate gateways (basic auth, mutual TLS adapters, SSO sidecars) consume the `Authorization` header and never forward it upstream. In that case, point your clients at the custom header instead:
+
+```bash
+curl -H "X-OpenWebUI-Key: sk-..." http://openwebui.internal/api/models
+```
+
+And set on the Open WebUI container:
+
+```
+CUSTOM_API_KEY_HEADER=X-OpenWebUI-Key
+```
+
+The header name is matched case-insensitively by the ASGI layer, so pick whatever fits your naming convention.
+:::
+
 ### Model Caching
 
 #### `ENABLE_BASE_MODELS_CACHE`
@@ -2839,7 +2870,9 @@ Note: this configuration assumes that AWS credentials will be available to your
   - `docling` - Use Docling engine
   - `document_intelligence` - Use Document Intelligence engine
   - `mistral_ocr` - Use Mistral OCR engine
-  - `mineru`
+  - `datalab_marker` - Use Datalab Marker engine
+  - `mineru` - Use MinerU engine
+  - `paddleocr_vl` - Use a PaddleOCR-vl server (requires `PADDLEOCR_VL_TOKEN`; see below)
 - Description: Sets the content extraction engine to use for document ingestion.
 - Persistence: This environment variable is a `PersistentConfig` variable.
 
@@ -2983,6 +3016,24 @@ DOCLING_PARAMS="{\"do_ocr\": true, \"ocr_engine\": \"tesseract\", \"ocr_lang\":
 - Description: Sets the timeout in seconds for MinerU API requests during document processing.
 - Persistence: This environment variable is a `PersistentConfig` variable.
 
+#### `PADDLEOCR_VL_BASE_URL`
+
+- Type: `str`
+- Default: `http://localhost:8080`
+- Description: Base URL of the PaddleOCR-vl server used when `CONTENT_EXTRACTION_ENGINE=paddleocr_vl`. Documents and images are POSTed to `{base_url}/layout-parsing` and the response's `layoutParsingResults[].markdown.text` is ingested page-by-page.
+- Persistence: This environment variable is a `PersistentConfig` variable.
+
+#### `PADDLEOCR_VL_TOKEN`
+
+- Type: `str`
+- Default: `""` (empty)
+- Description: Authentication token for the PaddleOCR-vl server. Sent as `Authorization: token <value>` on every layout-parsing request. **The PaddleOCR-vl engine is skipped at runtime if this value is empty** — the loader falls back to the default PyPDFLoader for the current document even when `CONTENT_EXTRACTION_ENGINE=paddleocr_vl` is set. Set this to activate the engine.
+- Persistence: This environment variable is a `PersistentConfig` variable.
+
+:::info Supported file types
+PaddleOCR-vl handles both documents and images. Extensions treated as images and dispatched with `fileType=1`: `png`, `jpg`, `jpeg`, `bmp`, `tiff`, `webp`. Everything else is dispatched with `fileType=0` (document, e.g. PDFs). Output is per-page Markdown, so downstream chunking behaves the same as other engines.
+:::
+
 ## Retrieval Augmented Generation (RAG)
 
 ### Core Configuration

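Note on the PaddleOCR-vl entries in the `env-configuration.mdx` change above: the `fileType` dispatch and the per-page Markdown extraction can be sketched as follows. The helper names are hypothetical, lowercasing the extension is an assumption, and the sketch assumes `layoutParsingResults` sits at the top level of the dictionary passed in (a caller would unwrap any response envelope first).

```python
import os
from typing import List

# Extensions the docs list as images (dispatched with fileType=1).
IMAGE_EXTENSIONS = {"png", "jpg", "jpeg", "bmp", "tiff", "webp"}


def paddleocr_file_type(filename: str) -> int:
    """Return 1 for image extensions, 0 for everything else (documents)."""
    # Assumption: the comparison is case-insensitive.
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    return 1 if ext in IMAGE_EXTENSIONS else 0


def pages_from_results(payload: dict) -> List[str]:
    """Collect layoutParsingResults[].markdown.text, one string per page."""
    return [page["markdown"]["text"] for page in payload.get("layoutParsingResults", [])]


sample = {
    "layoutParsingResults": [
        {"markdown": {"text": "# Page 1"}},
        {"markdown": {"text": "Page 2 body"}},
    ]
}
print(paddleocr_file_type("scan.png"), paddleocr_file_type("report.pdf"))
print(pages_from_results(sample))
```
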
diff --git a/docs/troubleshooting/performance.md b/docs/troubleshooting/performance.md
@@ -147,6 +147,7 @@ This is the **#1 cause of unexplained memory growth** in production deployments.
 |---|---|---|
 | **Apache Tika** | General-purpose, widely used, handles most document types | `CONTENT_EXTRACTION_ENGINE=tika` + `TIKA_SERVER_URL=http://tika:9998` |
 | **Docling** | High-quality extraction with layout-aware parsing | `CONTENT_EXTRACTION_ENGINE=docling` |
+| **PaddleOCR-vl** | OCR-heavy workloads (scanned PDFs, images, mixed layouts); self-hosted vision-language OCR | `CONTENT_EXTRACTION_ENGINE=paddleocr_vl` + `PADDLEOCR_VL_BASE_URL=http://paddleocr-vl:8080` + `PADDLEOCR_VL_TOKEN=...` |
 | **External Loader** | Recommended for production and custom extraction pipelines | `CONTENT_EXTRACTION_ENGINE=external` + `EXTERNAL_DOCUMENT_LOADER_URL=...` |
 
 Using an external extractor moves the memory-intensive parsing out of the Open WebUI process entirely, eliminating this class of memory leaks.
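
One caveat for the PaddleOCR-vl row: according to the `PADDLEOCR_VL_TOKEN` description in the `env-configuration.mdx` change, the engine is skipped when the token is empty, so extraction falls back to in-process parsing. A deployment check along these lines can catch that misconfiguration early. The function name `paddleocr_vl_is_active` is hypothetical, and the sketch only inspects an environment mapping; it does not call Open WebUI.

```python
from typing import Mapping


def paddleocr_vl_is_active(env: Mapping[str, str]) -> bool:
    """True only if the engine is selected AND a non-empty token is set."""
    engine = env.get("CONTENT_EXTRACTION_ENGINE", "")
    token = env.get("PADDLEOCR_VL_TOKEN", "")
    return engine == "paddleocr_vl" and bool(token)


# Engine selected but token missing: the docs say the loader falls back.
print(paddleocr_vl_is_active({"CONTENT_EXTRACTION_ENGINE": "paddleocr_vl"}))
```

Passing `os.environ` as `env` would run the same check against the real process environment.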