diff --git a/docs/features/code-execution/index.md b/docs/features/code-execution/index.md
index 05cb47e4cc..2bff856f08 100644
--- a/docs/features/code-execution/index.md
+++ b/docs/features/code-execution/index.md
@@ -7,10 +7,10 @@ Open WebUI offers powerful code execution capabilities directly within your chat
 
 ## Key Features
 
-- **Python Code Execution**: Run Python scripts directly in your browser using Pyodide, with support for popular libraries like pandas and matplotlib no setup required.
+- **Python Code Execution**: Run Python scripts directly in your browser using Pyodide, with support for popular libraries like pandas and matplotlib, no setup required.
 - **MermaidJS Rendering**: Create and visualize flowcharts, diagrams, and other visual representations with MermaidJS syntax that automatically renders in your chat.
 - **Interactive Artifacts**: Generate and interact with rich content like HTML websites, SVG graphics, and JavaScript visualizations directly within your conversations.
 
-These execution capabilities bridge the gap between conversation and implementation, allowing you to explore ideas, analyze data, and create visual content seamlessly while chatting with AI models.
\ No newline at end of file
+These execution capabilities bridge the gap between conversation and implementation, allowing you to explore ideas, analyze data, and create visual content seamlessly while chatting with AI models.
diff --git a/docs/features/plugin/functions/action.mdx b/docs/features/plugin/functions/action.mdx
index 030568f17e..f075b8f555 100644
--- a/docs/features/plugin/functions/action.mdx
+++ b/docs/features/plugin/functions/action.mdx
@@ -3,12 +3,11 @@
 sidebar_position: 3
 title: "🎬 Action Function"
 ---
 
-Action functions allow you to write custom buttons to the message toolbar for end users to interact
-with.
This feature enables more interactive messaging, enabling users to grant permission before a -task is performed, generate visualizations of structured data, download an audio snippet of chats, -and many other use cases. +Action functions allow you to write custom buttons that appear in the message toolbar for end users to interact with. This feature enables more interactive messaging, allowing users to grant permission before a task is performed, generate visualizations of structured data, download an audio snippet of chats, and many other use cases. -A scaffold of Action code can be found [in the community section](https://openwebui.com/f/hub/custom_action/). +Actions are admin-managed functions that extend the chat interface with custom interactive capabilities. When a message is generated by a model that has actions configured, these actions appear as clickable buttons beneath the message. + +A scaffold of Action code can be found [in the community section](https://openwebui.com/f/hub/custom_action/). For more Action Function examples built by the community, visit [https://openwebui.com/functions](https://openwebui.com/functions). An example of a graph visualization Action can be seen in the video below. @@ -21,46 +20,195 @@ An example of a graph visualization Action can be seen in the video below.

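At its core, an Action like the one shown in the video is just a Python class with an `action` method. A minimal, runnable sketch for orientation (the `Action` class name and `action` method signature follow the convention Open WebUI loads; the appended note is purely illustrative, and a real Action would usually also define a `Valves` config class):

```python
class Action:
    # A Valves configuration class (pydantic BaseModel) can be added here
    # for admin-tunable settings; it is omitted in this minimal sketch.

    async def action(self, body: dict, __user__=None,
                     __event_emitter__=None, __event_call__=None):
        # `body` carries the message the button was clicked under;
        # returning a dict with "content" updates that message.
        original = body.get("content", "")
        return {"content": f"{original}\n\n_(example action ran)_"}
```

Clicking the button would append the note to the message; richer Actions typically use `__event_emitter__` and `__event_call__` for interactive behavior.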
-### Action +## Action Function Architecture -Actions are used to create a button in the Message UI (the small buttons found directly underneath individual chat messages). +Actions are Python-based functions that integrate directly into the chat message toolbar. They execute server-side and can interact with users through real-time events, modify message content, and access the full Open WebUI context. -Actions have a single main component called an action function. This component takes an object defining the type of action and the data being processed. +### Function Structure -
-Example +Actions follow a specific class structure with an `action` method as the main entry point: ```python -async def action( - self, - body: dict, - __user__=None, - __event_emitter__=None, - __event_call__=None, - ) -> Optional[dict]: - print(f"action:{__name__}") +class Action: + def __init__(self): + self.valves = self.Valves() + + class Valves(BaseModel): + # Configuration parameters + parameter_name: str = "default_value" + + async def action(self, body: dict, __user__=None, __event_emitter__=None, __event_call__=None): + # Action implementation + return {"content": "Modified message content"} +``` + +### Action Method Parameters + +The `action` method receives several parameters that provide access to the execution context: + +- **`body`** - Dictionary containing the message data and context +- **`__user__`** - Current user object with permissions and settings +- **`__event_emitter__`** - Function to send real-time updates to the frontend +- **`__event_call__`** - Function for bidirectional communication (confirmations, inputs) +- **`__model__`** - Model information that triggered the action +- **`__request__`** - FastAPI request object for accessing headers, etc. 
+- **`__id__`** - Action ID (useful for multi-action functions) + +## Event System Integration + +Actions can utilize Open WebUI's real-time event system for interactive experiences: + +### Event Emitter (`__event_emitter__`) + +**For more information about Events and Event emitters, [see here](https://docs.openwebui.com/features/plugin/events/).** + +Send real-time updates to the frontend during action execution: + +```python +async def action(self, body: dict, __event_emitter__=None): + # Send status updates + await __event_emitter__({ + "type": "status", + "data": {"description": "Processing request..."} + }) + + # Send notifications + await __event_emitter__({ + "type": "notification", + "data": {"type": "info", "content": "Action completed successfully"} + }) +``` + +### Event Call (`__event_call__`) +Request user input or confirmation during execution: + +```python +async def action(self, body: dict, __event_call__=None): + # Request user confirmation + response = await __event_call__({ + "type": "confirmation", + "data": { + "title": "Confirm Action", + "message": "Are you sure you want to proceed?" + } + }) + + # Request user input + user_input = await __event_call__({ + "type": "input", + "data": { + "title": "Enter Value", + "message": "Please provide additional information:", + "placeholder": "Type your input here..." + } + }) +``` + +## Action Types and Configurations + +### Single Actions +Standard actions with one `action` method: + +```python +async def action(self, body: dict, **kwargs): + # Single action implementation + return {"content": "Action result"} +``` + +### Multi-Actions +Functions can define multiple sub-actions through an `actions` array: + +```python +actions = [ + { + "id": "summarize", + "name": "Summarize", + "icon_url": "data:image/svg+xml;base64,..." + }, + { + "id": "translate", + "name": "Translate", + "icon_url": "data:image/svg+xml;base64,..." 
+ } +] + +async def action(self, body: dict, __id__=None, **kwargs): + if __id__ == "summarize": + # Summarization logic + return {"content": "Summary: ..."} + elif __id__ == "translate": + # Translation logic + return {"content": "Translation: ..."} +``` + +### Global vs Model-Specific Actions +- **Global Actions** - Turn on the toggle in the Action's settings, to globally enable it for all users and all models. +- **Model-Specific Actions** - Configure enabled actions for specific models in the model settings. + +## Advanced Capabilities - response = await __event_call__( +### Background Task Execution +For long-running operations, actions can integrate with the task system: + +```python +async def action(self, body: dict, __event_emitter__=None): + # Start long-running process + await __event_emitter__({ + "type": "status", + "data": {"description": "Starting background processing..."} + }) + + # Perform time-consuming operation + result = await some_long_running_function() + + return {"content": f"Processing completed: {result}"} +``` + +### File and Media Handling +Actions can work with uploaded files and generate new media: + +```python +async def action(self, body: dict): + message = body + + # Access uploaded files + if message.get("files"): + for file in message["files"]: + # Process file based on type + if file["type"] == "image": + # Image processing logic + pass + + # Return new files + return { + "content": "Analysis complete", + "files": [ { - "type": "input", - "data": { - "title": "write a message", - "message": "here write a message to append", - "placeholder": "enter your message", - }, + "type": "image", + "url": "generated_chart.png", + "name": "Analysis Chart" } - ) - print(response) + ] + } ``` -
+### User Context and Permissions +Actions can access user information and respect permissions: + +```python +async def action(self, body: dict, __user__=None): + if __user__["role"] != "admin": + return {"content": "This action requires admin privileges"} + + user_name = __user__["name"] + return {"content": f"Hello {user_name}, admin action completed"} +``` -### Example - Specifying Action Frontmatter +## Example - Specifying Action Frontmatter Each Action function can include a docstring at the top to define metadata for the button. This helps customize the display and behavior of your Action in Open WebUI. Example of supported frontmatter fields: - - `title`: Display name of the Action. - `author`: Name of the creator. - `version`: Version number of the Action. @@ -69,13 +217,100 @@ Example of supported frontmatter fields: **Base64-Encoded Example:** +
+Example + ```python """ -title: Summarize Text -author: @you -version: 1.0.0 +title: Enhanced Message Processor +author: @admin +version: 1.2.0 required_open_webui_version: 0.5.0 -icon_url: data:image/svg+xml;base64,... +icon_url: data:image/svg+xml;base64,PHN2ZyB3aWR0aD0iMjQiIGhlaWdodD0iMjQiIHZpZXdCb3g9IjAgMCAyNCAyNCIgZmlsbD0ibm9uZSIgeG1sbnM9Imh0dHA6Ly93d3cudzMub3JnLzIwMDAvc3ZnIj4KPHBhdGggZD0iTTEyIDJMMTMuMDkgOC4yNkwyMCA5TDEzLjA5IDE1Ljc0TDEyIDIyTDEwLjkxIDE1Ljc0TDQgOUwxMC45MSA4LjI2TDEyIDJaIiBzdHJva2U9ImN1cnJlbnRDb2xvciIgc3Ryb2tlLXdpZHRoPSIyIiBzdHJva2UtbGluZWNhcD0icm91bmQiIHN0cm9rZS1saW5lam9pbj0icm91bmQiLz4KPHN2Zz4K +requirements: requests,beautifulsoup4 """ +from pydantic import BaseModel + +class Action: + def __init__(self): + self.valves = self.Valves() + + class Valves(BaseModel): + api_key: str = "" + processing_mode: str = "standard" + + async def action( + self, + body: dict, + __user__=None, + __event_emitter__=None, + __event_call__=None, + ): + # Send initial status + await __event_emitter__({ + "type": "status", + "data": {"description": "Processing message..."} + }) + + # Get user confirmation + response = await __event_call__({ + "type": "confirmation", + "data": { + "title": "Process Message", + "message": "Do you want to enhance this message?" + } + }) + + if not response: + return {"content": "Action cancelled by user"} + + # Process the message + original_content = body.get("content", "") + enhanced_content = f"Enhanced: {original_content}" + + return {"content": enhanced_content} ``` + +
+ +## Best Practices + +### Error Handling +Always implement proper error handling in your actions: + +```python +async def action(self, body: dict, __event_emitter__=None): + try: + # Action logic here + result = perform_operation() + return {"content": f"Success: {result}"} + except Exception as e: + await __event_emitter__({ + "type": "notification", + "data": {"type": "error", "content": f"Action failed: {str(e)}"} + }) + return {"content": "Action encountered an error"} +``` + +### Performance Considerations +- Use async/await for I/O operations +- Implement timeouts for external API calls +- Provide progress updates for long-running operations +- Consider using background tasks for heavy processing + +### User Experience +- Always provide clear feedback through event emitters +- Use confirmation dialogs for destructive actions +- Include helpful error messages + +## Integration with Open WebUI Features + +Actions integrate seamlessly with other Open WebUI features: +- **Models** - Actions can be model-specific or global +- **Tools** - Actions can invoke external tools and APIs +- **Files** - Actions can process uploaded files and generate new ones +- **Memory** - Actions can access conversation history and context +- **Permissions** - Actions respect user roles and access controls + +For more examples and community-contributed actions, visit [https://openwebui.com/functions](https://openwebui.com/functions) where you can discover, download, and explore custom functions built by the Open WebUI community. diff --git a/docs/getting-started/advanced-topics/development.md b/docs/getting-started/advanced-topics/development.md index 5760dc2b9a..ecd9c974ea 100644 --- a/docs/getting-started/advanced-topics/development.md +++ b/docs/getting-started/advanced-topics/development.md @@ -14,7 +14,7 @@ Before you begin, ensure your system meets these minimum requirements: - **Operating System:** Linux (or WSL on Windows), Windows 11, or macOS. 
*(Recommended for best compatibility)* - **Python:** Version **3.11 or higher**. *(Required for backend services)* - **Node.js:** Version **22.10 or higher**. *(Required for frontend development)* -- **IDE (Recommended):** We recommend using an IDE like [VSCode](https://code.visualstudio.com/) for code editing, debugging, and integrated terminal access. Feel free to use your favorite IDE if you have one! +- **IDE (Recommended):** We recommend using an IDE like [VS Code](https://code.visualstudio.com/) for code editing, debugging, and integrated terminal access. Feel free to use your favorite IDE if you have one! - **[Optional] GitHub Desktop:** For easier management of the Git repository, especially if you are less familiar with command-line Git, consider installing [GitHub Desktop](https://desktop.github.com/). ## Setting Up Your Local Environment @@ -49,7 +49,7 @@ Let's get the user interface (what you see in your browser) up and running first This command copies the `.env.example` file to a new file named `.env`. The `.env` file is where you'll configure environment variables for the frontend. - - **Customize `.env`**: Open the `.env` file in your code editor (like VSCode). This file contains configuration variables for the frontend, such as API endpoints and other settings. For local development, the default settings in `.env.example` are usually sufficient to start with. However, you can customize them if needed. + - **Customize `.env`**: Open the `.env` file in your code editor (like VS Code). This file contains configuration variables for the frontend, such as API endpoints and other settings. For local development, the default settings in `.env.example` are usually sufficient to start with. However, you can customize them if needed. **Important:** Do not commit sensitive information to `.env` if you are contributing back to the repository. 
@@ -107,17 +107,17 @@ npm run build We **require** you to use separate terminal instances for your frontend and backend processes. This keeps your workflows organized and makes it easier to manage each part of the application independently. -**Using VSCode Integrated Terminals:** +**Using VS Code Integrated Terminals:** -VSCode's integrated terminal feature makes managing multiple terminals incredibly easy. Here's how to leverage it for frontend and backend separation: +VS Code's integrated terminal feature makes managing multiple terminals incredibly easy. Here's how to leverage it for frontend and backend separation: -1. **Frontend Terminal (You likely already have this):** If you followed the Frontend Setup steps, you probably already have a terminal open in VSCode at the project root (`open-webui` directory). This is where you'll run your frontend commands (`npm run dev`, etc.). Ensure you are in the `open-webui` directory for the next steps if you are not already. +1. **Frontend Terminal (You likely already have this):** If you followed the Frontend Setup steps, you probably already have a terminal open in VS Code at the project root (`open-webui` directory). This is where you'll run your frontend commands (`npm run dev`, etc.). Ensure you are in the `open-webui` directory for the next steps if you are not already. 2. **Backend Terminal (Open a New One):** - - In VSCode, go to **Terminal > New Terminal** (or use the shortcut `Ctrl+Shift+` on Windows/Linux or `Cmd+Shift+` on macOS). This will open a new integrated terminal panel. + - In VS Code, go to **Terminal > New Terminal** (or use the shortcut `Ctrl+Shift+` on Windows/Linux or `Cmd+Shift+` on macOS). This will open a new integrated terminal panel. - **Navigate to the `backend` directory:** In this *new* terminal, use the `cd backend` command to change the directory to the `backend` folder within your project. This ensures all backend-related commands are executed in the correct context. 
- Now you have **two separate terminal instances within VSCode**: one for the frontend (likely in the `open-webui` directory) and one specifically for the backend (inside the `backend` directory). You can easily switch between these terminals within VSCode to manage your frontend and backend processes independently. This setup is highly recommended for a cleaner and more efficient development workflow. + Now you have **two separate terminal instances within VS Code**: one for the frontend (likely in the `open-webui` directory) and one specifically for the backend (inside the `backend` directory). You can easily switch between these terminals within VS Code to manage your frontend and backend processes independently. This setup is highly recommended for a cleaner and more efficient development workflow. **Backend Setup Steps (in your *backend* terminal):** diff --git a/docs/getting-started/env-configuration.md b/docs/getting-started/env-configuration.md index bf0c9d8906..b1486b38a7 100644 --- a/docs/getting-started/env-configuration.md +++ b/docs/getting-started/env-configuration.md @@ -1117,6 +1117,17 @@ modeling files for reranking. - Default: `chroma` - Description: Specifies which vector database system to use. This setting determines which vector storage system will be used for managing embeddings. +:::note + +PostgreSQL Dependencies +To use `pgvector`, ensure you have PostgreSQL dependencies installed: + +```bash +pip install open-webui[all] +``` + +::: + ### ChromaDB #### `CHROMA_TENANT` @@ -1308,6 +1319,17 @@ modeling files for reranking. ### PGVector +:::note + +PostgreSQL Dependencies +To use `pgvector`, ensure you have PostgreSQL dependencies installed: + +```bash +pip install open-webui[all] +``` + +::: + #### `PGVECTOR_DB_URL` - Type: `str` @@ -3238,6 +3260,30 @@ If `OAUTH_PICTURE_CLAIM` is set to `''` (empty string), then the OAuth picture c - Description: Enables or disables user permission to edit chats. 
- Persistence: This environment variable is a `PersistentConfig` variable. +#### `USER_PERMISSIONS_CHAT_DELETE_MESSAGE` +- Type: `bool` +- Default: `True` +- Description: Enables or disables user permission to delete individual messages within chats. This provides granular control over message deletion capabilities separate from full chat deletion. +- Persistence: This environment variable is a `PersistentConfig` variable. + +#### `USER_PERMISSIONS_CHAT_CONTINUE_RESPONSE` +- Type: `bool` +- Default: `True` +- Description: Enables or disables user permission to continue AI responses. When disabled, users cannot use the "Continue Response" button, which helps prevent potential system prompt leakage through response continuation manipulation. +- Persistence: This environment variable is a `PersistentConfig` variable. + +#### `USER_PERMISSIONS_CHAT_REGENERATE_RESPONSE` +- Type: `bool` +- Default: `True` +- Description: Enables or disables user permission to regenerate AI responses. Controls access to both the standard regenerate button and the guided regeneration menu. +- Persistence: This environment variable is a `PersistentConfig` variable. + +#### `USER_PERMISSIONS_CHAT_RATE_RESPONSE` +- Type: `bool` +- Default: `True` +- Description: Enables or disables user permission to rate AI responses using the thumbs up/down feedback system. This controls access to the response rating functionality for evaluation and feedback collection. +- Persistence: This environment variable is a `PersistentConfig` variable. + #### `USER_PERMISSIONS_CHAT_STT` - Type: `bool` @@ -3614,7 +3660,10 @@ If the endpoint is an S3-compatible provider like MinIO that uses a TLS certific :::info -Supports SQLite, Postgres, and encrypted SQLite via SQLCipher. Changing the URL does not migrate data between databases. +**For PostgreSQL support, ensure you installed with `pip install open-webui[all]` instead of the basic installation.** +Supports SQLite, Postgres, and encrypted SQLite via SQLCipher. 
+**Changing the URL does not migrate data between databases.** + Documentation on the URL scheme is available [here](https://docs.sqlalchemy.org/en/20/core/engines.html#database-urls). If your database password contains special characters, please ensure they are properly URL-encoded. For example, a password like `p@ssword` should be encoded as `p%40ssword`. @@ -3824,6 +3873,18 @@ If you use UVICORN_WORKERS, you also need to ensure that related environment var ::: +### Cache Settings + +#### `CACHE_CONTROL` + +- Type: `str` +- Default: Not set (no Cache-Control header added) +- Description: Sets the Cache-Control header for all HTTP responses. Supports standard directives like `public`, `private`, `no-cache`, `no-store`, `must-revalidate`, `max-age=seconds`, etc. If an invalid value is provided, defaults to `"no-store, max-age=0"` (no caching). +- Examples: + - `"private, max-age=86400"` - Cache privately for 24 hours + - `"public, max-age=3600, must-revalidate"` - Cache publicly for 1 hour, then revalidate + - `"no-cache, no-store, must-revalidate"` - Never cache + ### Proxy Settings Open WebUI supports using proxies for HTTP and HTTPS retrievals. To specify proxy settings, diff --git a/docs/intro.mdx b/docs/intro.mdx index 19dcd447be..58e905bced 100644 --- a/docs/intro.mdx +++ b/docs/intro.mdx @@ -148,13 +148,21 @@ Once `uv` is installed, running Open WebUI is a breeze. Use the command below, e $env:DATA_DIR="C:\open-webui\data"; uvx --python 3.11 open-webui@latest serve ``` +:::note +**For PostgreSQL Support:** +The default installation now uses a slimmed-down package. If you need **PostgreSQL support**, install with all optional dependencies: + +```bash +pip install open-webui[all] +``` +::: ### Installation with `pip` For users installing Open WebUI with Python's package manager `pip`, **it is strongly recommended to use Python runtime managers like `uv` or `conda`**. These tools help manage Python environments effectively and avoid conflicts. 
-Python 3.11 is the development environment. Python 3.12 seems to work but has not been thoroughly tested. Python 3.13 is entirely untested—**use at your own risk**. +Python 3.11 is the development environment. Python 3.12 seems to work but has not been thoroughly tested. Python 3.13 is entirely untested and some dependencies do not work with Python 3.13 yet—**use at your own risk**. 1. **Install Open WebUI**: @@ -220,4 +228,4 @@ We are deeply grateful for the generous grant support provided by: - \ No newline at end of file + diff --git a/docs/tutorials/integrations/backend-controlled-ui-compatible-flow.md b/docs/tutorials/integrations/backend-controlled-ui-compatible-flow.md index 74a4b13d50..ec2f4e4c89 100644 --- a/docs/tutorials/integrations/backend-controlled-ui-compatible-flow.md +++ b/docs/tutorials/integrations/backend-controlled-ui-compatible-flow.md @@ -26,26 +26,27 @@ Before following this tutorial, ensure you have: ## Overview -This tutorial describes a comprehensive 6-step process that enables server-side orchestration of Open WebUI conversations while ensuring that assistant replies appear properly in the frontend UI. +This tutorial describes a comprehensive 7-step process that enables server-side orchestration of Open WebUI conversations while ensuring that assistant replies appear properly in the frontend UI. ### Process Flow The essential steps are: 1. **Create a new chat with a user message** - Initialize the conversation with the user's input -2. **Manually inject an empty assistant message** - Create a placeholder for the assistant's response -3. **Trigger the assistant completion** - Generate the actual AI response (with optional knowledge integration) -4. **Mark the completion** - Signal that the response generation is complete -5. **Poll for response readiness** - Wait for the assistant response to be fully generated -6. **Fetch and process the final chat** - Retrieve and parse the completed conversation +2. 
**Enrich the chat response with an assistant message** - Add an assistant message to the response object in memory
+3. **Update chat with assistant message** - Send the enriched chat state to the server
+4. **Trigger the assistant completion** - Generate the actual AI response (with optional knowledge integration)
+5. **Wait for response completion** - Monitor the assistant response until fully generated
+6. **Complete the assistant message** - Mark the response as completed
+7. **Fetch and process the final chat** - Retrieve and parse the completed conversation
 
 This enables server-side orchestration while still making replies show up in the frontend UI exactly as if they were generated through normal user interaction.
 
 ## Implementation Guide
 
-### Critical Step: Manually Inject the Assistant Message
+### Critical Step: Enrich Chat Response with Assistant Message
 
-The assistant message needs to be injected manually as a critical prerequisite before triggering the completion. This step is essential because the Open WebUI frontend expects assistant messages to exist in a specific structure.
+The assistant message needs to be added to the chat response object in memory as a critical prerequisite before triggering the completion. This step is essential because the Open WebUI frontend expects assistant messages to exist in a specific structure.
 
 The assistant message must appear in both locations:
 
 - `chat.messages[]` - The main message array
@@ -61,11 +62,11 @@ The assistant message must appear in both locations:
   "parentId": "",
   "modelName": "gpt-4o",
   "modelIdx": 0,
-  "timestamp": 
+  "timestamp": ""
 }
 ```
 
-Without this manual injection, the assistant's response will not appear in the frontend interface, even if the completion is successful.
+Without this enrichment, the assistant's response will not appear in the frontend interface, even if the completion is successful.
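As a concrete sketch of this enrichment step, the Python helper below builds the placeholder and writes it to both required locations. The helper name and the plain-`dict` chat shape are illustrative, not part of the Open WebUI API; only the field names come from the documented message structure:

```python
import time
import uuid


def enrich_chat_with_assistant_message(chat: dict, model: str = "gpt-4o") -> dict:
    """Append an empty assistant placeholder to both required locations."""
    user_msg = chat["messages"][-1]  # the user message created in Step 1
    assistant_msg = {
        "id": str(uuid.uuid4()),
        "role": "assistant",
        "content": "",                         # empty until completion runs
        "parentId": user_msg["id"],
        "modelName": model,
        "modelIdx": 0,
        "timestamp": int(time.time() * 1000),  # epoch milliseconds
    }
    chat["messages"].append(assistant_msg)                            # chat.messages[]
    chat["history"]["messages"][assistant_msg["id"]] = assistant_msg  # chat.history.messages{}
    chat["history"]["current_id"] = assistant_msg["id"]
    return chat
```

The enriched `chat` dict can then be serialized and sent to the server as shown in the update-chat request of the implementation steps.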
## Step-by-Step Implementation @@ -106,26 +107,80 @@ curl -X POST https:///api/v1/chats/new \ }' ``` -### Step 2: Manually Inject Empty Assistant Message +### Step 2: Enrich Chat Response with Assistant Message -Add the assistant message placeholder to the chat structure: +Add the assistant message to the chat response object in memory. Note that this can be combined with Step 1 by including the assistant message in the initial chat creation: + +```java +// Example implementation in Java +public void enrichChatWithAssistantMessage(OWUIChatResponse chatResponse, String model) { + OWUIMessage assistantOWUIMessage = buildAssistantMessage(chatResponse, model, "assistant", ""); + assistantOWUIMessage.setParentId(chatResponse.getChat().getMessages().get(0).getId()); + + chatResponse.getChat().getMessages().add(assistantOWUIMessage); + chatResponse.getChat().getHistory().getMessages().put(assistantOWUIMessage.getId(), assistantOWUIMessage); +} +``` + +**Note:** This step can be performed in memory on the response object, or combined with Step 1 by including both user and empty assistant messages in the initial chat creation. 
+ +### Step 3: Update Chat with Assistant Message + +Send the enriched chat state containing both user and assistant messages to the server: ```bash -curl -X POST https:///api/v1/chats//messages \ +curl -X POST https:///api/v1/chats/ \ -H "Authorization: Bearer " \ -H "Content-Type: application/json" \ -d '{ - "id": "assistant-msg-id", - "role": "assistant", - "content": "", - "parentId": "user-msg-id", - "modelName": "gpt-4o", - "modelIdx": 0, - "timestamp": 1720000001000 + "chat": { + "id": "", + "title": "New Chat", + "models": ["gpt-4o"], + "messages": [ + { + "id": "user-msg-id", + "role": "user", + "content": "Hi, what is the capital of France?", + "timestamp": 1720000000000, + "models": ["gpt-4o"] + }, + { + "id": "assistant-msg-id", + "role": "assistant", + "content": "", + "parentId": "user-msg-id", + "modelName": "gpt-4o", + "modelIdx": 0, + "timestamp": 1720000001000 + } + ], + "history": { + "current_id": "assistant-msg-id", + "messages": { + "user-msg-id": { + "id": "user-msg-id", + "role": "user", + "content": "Hi, what is the capital of France?", + "timestamp": 1720000000000, + "models": ["gpt-4o"] + }, + "assistant-msg-id": { + "id": "assistant-msg-id", + "role": "assistant", + "content": "", + "parentId": "user-msg-id", + "modelName": "gpt-4o", + "modelIdx": 0, + "timestamp": 1720000001000 + } + } + } + } }' ``` -### Step 3: Trigger Assistant Completion +### Step 4: Trigger Assistant Completion Generate the actual AI response using the completion endpoint: @@ -165,7 +220,7 @@ curl -X POST https:///api/chat/completions \ }' ``` -#### Step 3.1: Trigger Assistant Completion with Knowledge Integration (RAG) +#### Step 4.1: Trigger Assistant Completion with Knowledge Integration (RAG) For advanced use cases involving knowledge bases or document collections, include knowledge files in the completion request: @@ -212,25 +267,37 @@ curl -X POST https:///api/chat/completions \ }' ``` -### Step 4: Mark Completion +### Step 5: Wait for Assistant Response 
Completion -Signal that the assistant response is complete: +Assistant responses can be handled in two ways depending on your implementation needs: -```bash -curl -X POST https:///api/chat/completed \ - -H "Authorization: Bearer " \ - -H "Content-Type: application/json" \ - -d '{ - "chat_id": "", - "id": "assistant-msg-id", - "session_id": "session-id", - "model": "gpt-4o" - }' -``` +#### Option A: Stream Processing (Recommended) +If using `stream: true` in the completion request, you can process the streamed response in real-time and wait for the stream to complete. This is the approach used by the OpenWebUI web interface and provides immediate feedback. -### Step 5: Poll for Assistant Response Completion +#### Option B: Polling Approach +For implementations that cannot handle streaming, poll the chat endpoint until the response is ready. Use a retry mechanism with exponential backoff: -Since assistant responses are generated asynchronously, poll the chat endpoint until the response is ready: +```java +// Example implementation in Java +@Retryable( + retryFor = AssistantResponseNotReadyException.class, + maxAttemptsExpression = "#{${webopenui.retries:50}}", + backoff = @Backoff(delayExpression = "#{${webopenui.backoffmilliseconds:2000}}") +) +public String getAssistantResponseWhenReady(String chatId, ChatCompletedRequest chatCompletedRequest) { + OWUIChatResponse response = owuiService.fetchFinalChatResponse(chatId); + Optional assistantMsg = extractAssistantResponse(response); + + if (assistantMsg.isPresent() && !assistantMsg.get().getContent().isBlank()) { + owuiService.completeAssistantMessage(chatCompletedRequest); + return assistantMsg.get().getContent(); + } + + throw new AssistantResponseNotReadyException("Assistant response not ready yet for chatId: " + chatId); +} +``` + +For manual polling, you can use: ```bash # Poll every few seconds until assistant content is populated @@ -249,7 +316,23 @@ while true; do done ``` -### Step 6: Fetch Final Chat +### 
Step 6: Complete Assistant Message + +Once the assistant response is ready, mark it as completed: + +```bash +curl -X POST https:///api/chat/completed \ + -H "Authorization: Bearer " \ + -H "Content-Type: application/json" \ + -d '{ + "chat_id": "", + "id": "assistant-msg-id", + "session_id": "session-id", + "model": "gpt-4o" + }' +``` + +### Step 7: Fetch Final Chat Retrieve the completed conversation: @@ -278,6 +361,42 @@ curl -X GET https:///api/v1/models/model?id= \ -H "Authorization: Bearer " ``` +### Send Additional Messages to Chat + +For multi-turn conversations, you can send additional messages to an existing chat: + +```bash +curl -X POST https:///api/v1/chats/ \ + -H "Authorization: Bearer " \ + -H "Content-Type: application/json" \ + -d '{ + "chat": { + "id": "", + "messages": [ + { + "id": "new-user-msg-id", + "role": "user", + "content": "Can you tell me more about this?", + "timestamp": 1720000002000, + "models": ["gpt-4o"] + } + ], + "history": { + "current_id": "new-user-msg-id", + "messages": { + "new-user-msg-id": { + "id": "new-user-msg-id", + "role": "user", + "content": "Can you tell me more about this?", + "timestamp": 1720000002000, + "models": ["gpt-4o"] + } + } + } + } + }' +``` + ## Response Processing ### Parsing Assistant Responses @@ -735,13 +854,14 @@ This cleaning process handles: ## Important Notes - This workflow is compatible with Open WebUI + backend orchestration scenarios -- **Critical:** Avoid skipping the assistant injection step — otherwise the frontend won't display the message +- **Critical:** The assistant message enrichment must be done in memory on the response object, not via API call +- **Alternative Approach:** You can include both user and assistant messages in the initial chat creation (Step 1) instead of doing Step 2 separately - No frontend code changes are required for this approach - The `stream: true` parameter allows for real-time response streaming if needed +- **Response Monitoring:** Use streaming for 
real-time processing or polling for simpler implementations that cannot handle streams - Background tasks like title generation can be controlled via the `background_tasks` object - Session IDs help maintain conversation context across requests - **Knowledge Integration:** Use the `files` array to include knowledge collections for RAG capabilities -- **Polling Strategy:** Always poll for completion rather than assuming immediate response availability - **Response Parsing:** Handle JSON responses that may be wrapped in markdown code blocks - **Error Handling:** Implement proper retry mechanisms for network timeouts and server errors @@ -750,15 +870,16 @@ This cleaning process handles: Use the Open WebUI backend APIs to: 1. **Start a chat** - Create the initial conversation with user input -2. **Inject an assistant placeholder message** - Prepare the response container -3. **Trigger a reply** - Generate the AI response (with optional knowledge integration) -4. **Poll for completion** - Wait for the assistant response to be ready -5. **Finalize the conversation** - Mark completion and retrieve the final chat -6. **Process the response** - Parse and clean the assistant's output +2. **Enrich with assistant message** - Add assistant placeholder to the response object in memory (can be combined with Step 1) +3. **Update chat state** - Send the enriched chat to the server +4. **Trigger a reply** - Generate the AI response (with optional knowledge integration) +5. **Monitor completion** - Wait for the assistant response using streaming or polling +6. **Complete the message** - Mark the response as completed +7. 
**Fetch the final chat** - Retrieve and parse the completed conversation **Enhanced Capabilities:** - **RAG Integration** - Include knowledge collections for context-aware responses -- **Asynchronous Processing** - Handle long-running AI operations with polling +- **Asynchronous Processing** - Handle long-running AI operations with streaming or polling - **Response Parsing** - Clean and validate JSON responses from the assistant - **Session Management** - Maintain conversation context across requests @@ -777,4 +898,4 @@ You can test your implementation by following the step-by-step CURL examples pro :::tip Start with a simple user message and gradually add complexity like knowledge integration and advanced features once the basic flow is working. -::: \ No newline at end of file +::: diff --git a/docs/tutorials/integrations/continue-dev.md b/docs/tutorials/integrations/continue-dev.md index 2f356d6465..4761f45441 100644 --- a/docs/tutorials/integrations/continue-dev.md +++ b/docs/tutorials/integrations/continue-dev.md @@ -1,20 +1,20 @@ --- sidebar_position: 13 -title: "⚛️ Continue.dev VSCode Extension with Open WebUI" +title: "⚛️ Continue.dev VS Code Extension with Open WebUI" --- :::warning This tutorial is a community contribution and is not supported by the Open WebUI team. It serves only as a demonstration on how to customize Open WebUI for your specific use case. Want to contribute? Check out the [contributing tutorial](/docs/contributing.mdx). ::: -# Integrating Continue.dev VSCode Extension with Open WebUI +# Integrating Continue.dev VS Code Extension with Open WebUI ## Download Extension -You can download the VSCode extension on the [Visual Studio Marketplace](https://marketplace.visualstudio.com/items?itemName=Continue.continue) or directly via the `EXTENSION:MARKETPLACE` within VSCode by searching for `continue`. -Once installed, you can access the application via the `continue` tab in the side bar of VSCode. 
+You can download the VS Code extension from the [Visual Studio Marketplace](https://marketplace.visualstudio.com/items?itemName=Continue.continue) or directly via the `EXTENSION:MARKETPLACE` within VS Code by searching for `continue`.
+Once installed, you can access the application via the `continue` tab in the sidebar of VS Code.
 
-**VSCode side bar icon:**
+**VS Code sidebar icon:**
 
![continue.dev vscode icon](/images/tutorials/continue-dev/continue_dev_vscode_icon.png)
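+
+To give a sense of where this integration is headed, the sketch below shows a minimal Continue.dev `config.json` entry pointing the extension at Open WebUI's OpenAI-compatible API. This is an illustrative assumption, not a verified setup: the exact schema depends on your Continue.dev version, and the host, port, model name, and API key shown here are placeholders you must replace with your own values.
+
+```json
+{
+  "models": [
+    {
+      "title": "Open WebUI",
+      "provider": "openai",
+      "model": "gpt-4o",
+      "apiBase": "http://localhost:3000/api",
+      "apiKey": "sk-your-open-webui-api-key"
+    }
+  ]
+}
+```
+
+An API key can be generated in Open WebUI under **Settings → Account**; the `apiBase` should point at your Open WebUI instance rather than directly at an upstream provider.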