diff --git a/docs.json b/docs.json
index c24f8737..ccf6e549 100644
--- a/docs.json
+++ b/docs.json
@@ -142,6 +142,21 @@
"group": "Storage",
"pages": ["storage/network-volumes", "storage/s3-api"]
},
+ {
+ "group": "Integrations",
+ "pages": [
+ "integrations/overview",
+ {
+ "group": "Guides",
+          "pages": [
+ "integrations/n8n-integration",
+ "integrations/dstack",
+ "integrations/mods",
+ "integrations/skypilot"
+ ]
+ }
+ ]
+ },
{
"group": "Hub",
"pages": [
@@ -191,14 +206,6 @@
"references/troubleshooting/manage-payment-cards"
]
},
- {
- "group": "Integrations",
- "pages": [
- "integrations/dstack",
- "integrations/mods",
- "integrations/skypilot"
- ]
- },
{
"group": "Migrations",
"pages": [
@@ -472,6 +479,10 @@
},
"redirects": [
+ {
+ "source": "/serverless/development/integrations",
+ "destination": "/integrations/overview"
+ },
{
"source": "/references/faq",
"destination": "/references/troubleshooting/zero-gpus"
diff --git a/images/serverless-endpoint-id.png b/images/serverless-endpoint-id.png
new file mode 100644
index 00000000..c3bbc94d
Binary files /dev/null and b/images/serverless-endpoint-id.png differ
diff --git a/integrations/crewai-integration.mdx b/integrations/crewai-integration.mdx
new file mode 100644
index 00000000..6283650e
--- /dev/null
+++ b/integrations/crewai-integration.mdx
@@ -0,0 +1,130 @@
+---
+title: "Integrate Runpod with CrewAI"
+sidebarTitle: CrewAI
+description: "Learn how to deploy a vLLM worker on Runpod and connect it to CrewAI for orchestrating autonomous AI agents."
+tag: "BETA"
+---
+
+Learn how to integrate Runpod Serverless with CrewAI, a framework for orchestrating role-playing autonomous AI agents. By the end of this tutorial, you'll have a vLLM endpoint running on Runpod that you can use to power your CrewAI agents.
+
+## What you'll learn
+
+In this tutorial, you'll learn how to:
+
+* Deploy a vLLM worker on Runpod Serverless.
+* Configure your vLLM endpoint for OpenAI compatibility.
+* Connect CrewAI to your Runpod endpoint.
+* Test your integration with a simple agent.
+
+## Requirements
+
+Before you begin, you'll need:
+
+* A [Runpod account](/get-started/manage-accounts) (with available credits).
+* A [Runpod API key](/get-started/api-keys).
+* A [CrewAI](https://crewai.com/) account.
+* (Optional) A [Hugging Face access token](https://huggingface.co/docs/hub/en/security-tokens) if you're deploying a gated model.
+
+## Step 1: Deploy a vLLM worker on Runpod
+
+First, you'll deploy a vLLM worker to serve your language model.
+
+
+
+ Open the [Runpod console](https://www.console.runpod.io/serverless) and navigate to the Serverless page.
+
+ Click **New Endpoint** and select **vLLM** under **Ready-to-Deploy Repos**.
+
+
+
+
+
+ For more details on vLLM deployment options, see [Deploy a vLLM worker](/serverless/vllm/get-started).
+
+
+ In the deployment modal:
+
+ * Enter the model name or Hugging Face model URL (e.g., `openchat/openchat-3.5-0106`).
+ * Expand the **Advanced** section:
+ * Set **Max Model Length** to `8192` (or an appropriate context length for your model).
+ * You may need to enable tool calling and set an appropriate reasoning parser depending on your model.
+ * Click **Next**.
+ * Click **Create Endpoint**.
+
+ Your endpoint will now begin initializing. This may take several minutes while Runpod provisions resources and downloads your model. Wait until the status shows as **Running**.
+
+
+
+ Once deployed, navigate to your endpoint in the Runpod console. You can find your endpoint ID in the **Overview** tab:
+
+
+
+
+
+ You can also find your endpoint ID in the URL of the endpoint detail page. For example, if the URL for your endpoint is `https://console.runpod.io/serverless/user/endpoint/isapbl1e254mbj`, the endpoint ID is `isapbl1e254mbj`.
+
+ Copy your endpoint ID to the clipboard. You'll need this to connect your endpoint to CrewAI.
+
+
+
+## Step 2: Connect CrewAI to your Runpod endpoint
+
+Now you'll configure CrewAI to use your Runpod endpoint as an OpenAI-compatible API.
+
+
+
+  Open the CrewAI dashboard and go to the **LLM connections** section.
+
+
+
+  Under **Add New Connection**, enter a name for your connection. Then under **Provider**, select **custom-openai-compatible** from the dropdown menu.
+
+
+
+ Configure the connection with your Runpod credentials:
+
+  * For `OPENAI_API_KEY`, use your Runpod API key. You can find or create API keys on the settings page of the [Runpod console](https://console.runpod.io/user/settings).
+
+  * For `OPENAI_API_BASE`, enter the base URL for your vLLM endpoint's OpenAI-compatible API:
+
+ ```
+ https://api.runpod.ai/v2/ENDPOINT_ID/openai/v1
+ ```
+
+ Replace `ENDPOINT_ID` with your actual endpoint ID from Step 1.
+
+
+
+ Click **Fetch Available Models** to test the connection. If successful, CrewAI will retrieve the list of models available on your endpoint.
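+
+  If the model list doesn't populate, you can check the endpoint directly from a terminal before troubleshooting in CrewAI. The sketch below assumes the standard OpenAI-compatible `/models` route that vLLM workers expose; replace `ENDPOINT_ID` with your endpoint ID and set `RUNPOD_API_KEY` in your environment:
+
+  ```sh
+  # List the models served by your vLLM endpoint over the OpenAI-compatible API
+  curl https://api.runpod.ai/v2/ENDPOINT_ID/openai/v1/models \
+    -H "Authorization: Bearer $RUNPOD_API_KEY"
+  ```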
+
+
+
+## Step 3: Test your integration
+
+To verify that your CrewAI agents can use your Runpod endpoint, you can try using it in an automation:
+
+
+
+  Create a blank automation and add an Agent node. Click the edit button to configure the Agent node. Under **Model**, select your Runpod endpoint from the dropdown menu (if you have trouble finding it, try filtering for **Custom OpenAI Compatible** models).
+
+
+
+ Assign a simple task to your agent and run it to verify that it can communicate with your Runpod endpoint.
+
+
+
+ Monitor requests from your CrewAI agents in the endpoint details page of the Runpod console.
+
+
+
+ Confirm that your agent is receiving appropriate responses from your model running on Runpod.
+
+
+
+## Next steps
+
+Now that you've integrated Runpod with CrewAI, you can:
+
+* Build complex multi-agent systems using your Runpod endpoint to serve the necessary models.
+* Explore other [integration options](/integrations/overview).
+* Learn more about [OpenAI compatibility](/serverless/vllm/openai-compatibility) features in vLLM.
diff --git a/integrations/dstack.mdx b/integrations/dstack.mdx
index fc22078b..8e1c5558 100644
--- a/integrations/dstack.mdx
+++ b/integrations/dstack.mdx
@@ -3,46 +3,44 @@ title: "Manage Pods with dstack on Runpod"
sidebarTitle: "dstack"
---
-[dstack](https://dstack.ai/) is an open-source tool that simplifies the orchestration of Pods for AI and ML workloads. By defining your application and resource requirements in YAML configuration files, it automates the provisioning and management of cloud resources on Runpod, allowing you to focus on your application logic rather than the infrastructure.
+[dstack](https://dstack.ai/) is an open-source tool that automates Pod orchestration for AI and ML workloads. It lets you define your application and resource requirements in YAML files, then handles provisioning and managing cloud resources on Runpod so you can focus on your application instead of infrastructure.
-In this guide, we'll walk through setting up [dstack](https://dstack.ai/) with Runpod to deploy [vLLM](https://github.com/vllm-project/vllm). We'll serve the `meta-llama/Llama-3.1-8B-Instruct` model from Hugging Face using a Python environment.
+This guide shows you how to set up dstack with Runpod and deploy [vLLM](https://github.com/vllm-project/vllm) to serve the `meta-llama/Llama-3.1-8B-Instruct` model from Hugging Face.
-## Prerequisites
+## Requirements
-* [A Runpod account with an API key](/get-started/api-keys)
+You'll need:
-* On your local machine:
+* [A Runpod account with an API key](/get-started/api-keys).
+* Python 3.8 or higher installed on your local machine.
+* `pip` (or `pip3` on macOS).
+* Basic utilities like `curl`.
- * Python 3.8 or higher
- * `pip` (or `pip3` on macOS)
- * Basic utilities: `curl`
-
-* These instructions are applicable for macOS, Linux, and Windows systems.
+These instructions work on macOS, Linux, and Windows.
-**Windows Users**
+**Windows users**
-* It's recommended to use [WSL (Windows Subsystem for Linux)](https://docs.microsoft.com/en-us/windows/wsl/install) or tools like [Git Bash](https://gitforwindows.org/) to follow along with the Unix-like commands used in this tutorial
-* Alternatively, Windows users can use PowerShell or Command Prompt and adjust commands accordingly
+Use [WSL (Windows Subsystem for Linux)](https://docs.microsoft.com/en-us/windows/wsl/install) or [Git Bash](https://gitforwindows.org/) to follow along with the Unix-like commands in this guide. Alternatively, use PowerShell or Command Prompt and adjust commands as needed.
-## Installation
-
-### Setting Up the dstack Server
+## Set up dstack
-1. **Prepare Your Workspace**
+### Install and configure the server
- Open a terminal or command prompt and create a new directory for this tutorial:
+
+
+ Open a terminal and create a new directory:
```bash
mkdir runpod-dstack-tutorial
cd runpod-dstack-tutorial
```
+
-2. **Set Up a Python Virtual Environment**
-
+
```bash
@@ -78,10 +76,10 @@ In this guide, we'll walk through setting up [dstack](https://dstack.ai/) with R
+
-3. **Install dstack**
-
- Use `pip` to install dstack:
+
+ Install dstack using `pip`:
@@ -89,8 +87,6 @@ In this guide, we'll walk through setting up [dstack](https://dstack.ai/) with R
pip3 install -U "dstack[all]"
```
- **Note:** If `pip3` is not available, you may need to install it or use `pip`.
-
@@ -108,12 +104,14 @@ In this guide, we'll walk through setting up [dstack](https://dstack.ai/) with R
+
+
-### Configuring dstack for Runpod
+### Configure dstack for Runpod
-1. **Create the Global Configuration File**
-
- The following `config.yml` file is a **global configuration** used by [dstack](https://dstack.ai/) for all deployments on your computer. It's essential to place it in the correct configuration directory.
+
+
+ Create a `config.yml` file in the dstack configuration directory. This file stores your Runpod credentials for all dstack deployments.
* **Create the configuration directory:**
@@ -133,8 +131,6 @@ In this guide, we'll walk through setting up [dstack](https://dstack.ai/) with R
- **Command Prompt or PowerShell:**
-
```bash
mkdir %USERPROFILE%\.dstack\server
```
@@ -169,9 +165,7 @@ In this guide, we'll walk through setting up [dstack](https://dstack.ai/) with R
- * **Create the `config.yml` File**
-
- In the configuration directory, create a file named `config.yml` with the following content:
+ Create a file named `config.yml` with the following content:
```yml
projects:
@@ -183,17 +177,17 @@ In this guide, we'll walk through setting up [dstack](https://dstack.ai/) with R
api_key: YOUR_RUNPOD_API_KEY
```
- Replace `YOUR_RUNPOD_API_KEY` with the API key you obtained from Runpod.
-
-2. **Start the dstack Server**
+ Replace `YOUR_RUNPOD_API_KEY` with your actual Runpod API key.
+
- From the configuration directory, start the dstack server:
+
+ Start the dstack server:
```bash
dstack server
```
- You should see output indicating that the server is running:
+ You'll see output like this:
```bash
[INFO] Applying ~/.dstack/server/config.yml...
@@ -203,35 +197,33 @@ In this guide, we'll walk through setting up [dstack](https://dstack.ai/) with R
-The `ADMIN-TOKEN` displayed is important for accessing the dstack web UI.
+Save the `ADMIN-TOKEN` to access the dstack web UI.
+
-3. **Access the dstack Web UI**
-
-* Open your web browser and navigate to `http://127.0.0.1:3000`.
-* When prompted for an admin token, enter the `ADMIN-TOKEN` from the server output.
-* The web UI allows you to monitor and manage your deployments.
+
+ Open your browser and go to `http://127.0.0.1:3000`. Enter the `ADMIN-TOKEN` from the server output to access the web UI where you can monitor and manage deployments.
+
+
-## Deploying vLLM as a Task
-
-### Step 1: Configure the Deployment Task
-
-1. **Prepare for Deployment**
+## Deploy vLLM
-* Open a new terminal or command prompt window.
+### Configure the deployment
-* Navigate to your tutorial directory:
+
+
+ Open a new terminal and navigate to your tutorial directory:
```bash
cd runpod-dstack-tutorial
```
-* **Activate the Python Virtual Environment**
+ Activate the Python virtual environment:
@@ -264,21 +256,19 @@ The `ADMIN-TOKEN` displayed is important for accessing the dstack web UI.
+
-2. **Create a Directory for the Task**
-
-Create and navigate to a new directory for the deployment task:
+
+Create a new directory for the deployment:
```bash
mkdir task-vllm-llama
cd task-vllm-llama
```
+
-3. **Create the dstack Configuration File**
-
-* **Create the `.dstack.yml` File**
-
- Create a file named `.dstack.yml` (or `dstack.yml` if your system doesn't allow filenames starting with a dot) with the following content:
+
+ Create a file named `.dstack.yml` with the following content:
```yml
type: task
@@ -303,59 +293,51 @@ cd task-vllm-llama
-Replace `YOUR_HUGGING_FACE_HUB_TOKEN` with your actual [Hugging Face access token](https://huggingface.co/settings/tokens) (read-access is enough) or define the token in your environment variables. Without this token, the model cannot be downloaded as it is gated.
+Replace `YOUR_HUGGING_FACE_HUB_TOKEN` with your [Hugging Face access token](https://huggingface.co/settings/tokens). The model is gated and requires authentication to download.
+
+
-### Step 2: Initialize and Deploy the Task
-
-1. **Initialize dstack**
+### Initialize and deploy
-Run the following command **in the directory where your `.dstack.yml` file is located**:
+
+
+In the directory with your `.dstack.yml` file, run:
```bash
dstack init
```
+
-2. **Apply the Configuration**
-
-Deploy the task by applying the configuration:
+
+Deploy the task:
```bash
dstack apply
```
-* You will see an output summarizing the deployment configuration and available instances.
-
-* When prompted:
+ You'll see the deployment configuration and available instances. When prompted:
```bash
Submit the run vllm-llama-3.1-8b-instruct? [y/n]:
```
- Type `y` and press `Enter` to confirm.
+ Type `y` and press Enter.
-* The `ports` configuration provides port forwarding from the deployed pod to `localhost`, allowing you to access the deployed vLLM via `localhost:8000`.
+The `ports` configuration forwards the deployed Pod's port to `localhost:8000` on your machine.
+
-3. **Monitor the Deployment**
+
+dstack will provision the Pod, download the Docker image, install packages, download the model, and start the vLLM server. You'll see progress logs in the terminal.
-* After executing `dstack apply`, you'll see all the steps that dstack performs:
-
- * Provisioning the pod on Runpod.
- * Downloading the Docker image.
- * Installing required packages.
- * Downloading the model from Hugging Face.
- * Starting the vLLM server.
-
-* The logs of vLLM will be displayed in the terminal.
-
-* To monitor the logs at any time, run:
+To view logs at any time, run:
```bash
dstack logs vllm-llama-3.1-8b-instruct
```
-* Wait until you see logs indicating that vLLM is serving the model, such as:
+Wait until you see logs indicating the server is ready:
```
INFO: Started server process [1]
@@ -363,16 +345,14 @@ dstack apply
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
```
+
+
-### Step 3: Test the Model Server
-
-1. **Access the Service**
-
-Since the `ports` configuration forwards port `8000` from the deployed pod to `localhost`, you can access the vLLM server via `http://localhost:8000`.
+### Test the deployment
-2. **Test the Service Using `curl`**
+The vLLM server is now accessible at `http://localhost:8000`.
-Use the following `curl` command to test the deployed model:
+Test it with `curl`:
@@ -428,9 +408,7 @@ curl.exe -Method Post http://localhost:8000/v1/chat/completions `
-3. **Verify the Response**
-
-You should receive a JSON response similar to the following:
+You'll receive a JSON response:
```json
{
@@ -460,45 +438,33 @@ You should receive a JSON response similar to the following:
}
```
-This confirms that the model is running and responding as expected.
-
-### Step 4: Clean Up
-
-To avoid incurring additional costs, it's important to stop the task when you're finished.
-
-1. **Stop the Task**
+### Clean up
-In the terminal where you ran `dstack apply`, you can stop the task by pressing `Ctrl + C`.
+Stop the task when you're done to avoid charges.
-You'll be prompted:
+Press `Ctrl + C` in the terminal where you ran `dstack apply`. When prompted:
```
Stop the run vllm-llama-3.1-8b-instruct before detaching? [y/n]:
```
-Type `y` and press `Enter` to confirm stopping the task.
+Type `y` and press Enter.
-2. **Terminate the Instance**
-
-The instance will terminate automatically after stopping the task.
-
-If you wish to ensure the instance is terminated immediately, you can run:
+The instance will terminate automatically. To ensure immediate termination, run:
```bash
dstack stop vllm-llama-3.1-8b-instruct
```
-3. **Verify Termination**
+Verify termination in your Runpod dashboard or the dstack web UI.
-Check your Runpod dashboard or the [dstack](https://dstack.ai/) web UI to ensure that the instance has been terminated.
+## Use volumes for persistent storage
-## Additional Tips: Using Volumes for Persistent Storage
+Volumes let you store data between runs and cache models to reduce startup times.
-If you need to retain data between runs or cache models to reduce startup times, you can use volumes.
+### Create a volume
-### Creating a Volume
-
-Create a separate [dstack](https://dstack.ai/) file named `volume.dstack.yml` with the following content:
+Create a file named `volume.dstack.yml`:
```yml
type: volume
@@ -513,7 +479,7 @@ size: 100GB
-The `region` ties your volume to a specific region, which then also ties your Pod to that same region.
+The `region` ties your volume to a specific region, which also ties your Pod to that region.
@@ -523,9 +489,7 @@ Apply the volume configuration:
dstack apply -f volume.dstack.yml
```
-This will create the volume named `llama31-volume`.
-
-### Using the Volume in Your Task
+### Use the volume in your task
Modify your `.dstack.yml` file to include the volume:
@@ -535,14 +499,6 @@ volumes:
path: /data
```
-This configuration will mount the volume to the `/data` directory inside your container.
-
-By doing this, you can store models and data persistently, which can be especially useful for large models that take time to download.
-
-For more information on using volumes with Runpod, refer to the [dstack blog on volumes](https://dstack.ai/blog/volumes-on-runpod/).
-
-***
-
-## Conclusion
+This mounts the volume to the `/data` directory inside your container, letting you store models and data persistently. This is useful for large models that take time to download.
-By leveraging [dstack](https://dstack.ai/) on Runpod, you can efficiently deploy and manage Pods, accelerating your development workflow and reducing operational overhead.
+For more information, see the [dstack blog on volumes](https://dstack.ai/blog/volumes-on-runpod/).
diff --git a/integrations/mods.mdx b/integrations/mods.mdx
index f5970c49..1000b1c5 100644
--- a/integrations/mods.mdx
+++ b/integrations/mods.mdx
@@ -3,28 +3,25 @@ title: "Running Runpod on Mods"
sidebarTitle: "Mods"
---
-[Mods](https://github.com/charmbracelet/mods) is an AI-powered tool designed for the command line and built to seamlessly integrate with pipelines. It provides a convenient way to interact with language models directly from your terminal.
+[Mods](https://github.com/charmbracelet/mods) is a command-line tool for interacting with language models. It integrates with Unix pipelines, letting you send command output directly to LLMs from your terminal.
-## How Mods Works
+## How Mods works
-Mods operates by reading standard input and prefacing it with a prompt supplied in the Mods arguments. It sends the input text to a language model (LLM) and prints out the generated result. Optionally, you can ask the LLM to format the response as Markdown. This allows you to "question" the output of a command, making it a powerful tool for interactive exploration and analysis. Additionally, Mods can work with standard input or an individually supplied argument prompt.
+Mods reads standard input, prefixes it with a prompt supplied in its arguments, sends the combined text to a language model, and prints the result. You can also pass a prompt as a standalone argument instead of piping input, and optionally have the response formatted as Markdown. This lets you pipe command output to an LLM for analysis or transformation.
-## Getting Started
+## Get started
-To start using Mods, follow these step-by-step instructions:
+
+
+ Get your API key from the [Runpod Settings](https://www.console.runpod.io/user/settings) page.
+
-1. **Obtain Your API Key**:
+
+ Follow the installation instructions for [Mods](https://github.com/charmbracelet/mods) based on your system.
+
- * Visit the [Runpod Settings](https://www.console.runpod.io/user/settings) page to retrieve your API key.
- * If you haven't created an account yet, you'll need to sign up before obtaining the key.
-
-2. **Install Mods**:
-
- * Refer to the different installation methods for [Mods](https://github.com/charmbracelet/mods) based on your preferred approach.
-
-3. **Configure Runpod**:
-
- * Update the `config_template.yml` file to use your Runpod configuration. Here's an example:
+
+ Update the `config_template.yml` file with your Runpod configuration:
```yml
runpod:
@@ -39,24 +36,21 @@ To start using Mods, follow these step-by-step instructions:
max-input-chars: 8192
```
- * `base-url`: Update your base-url with your specific endpoint.
-
- * `api-key-env`: Add your Runpod API key.
-
- * `openchat/openchat-3.5-1210`: Replace with the name of the model you want to use.
-
- * `aliases: ["openchat"]`: Replace with your preferred model alias.
-
- * `max-input-chars`: Update the maximum input characters allowed for your model.
-
-4. **Verify Your Setup**:
-
- * To ensure everything is set up correctly, pipe any command line output and pass it to `mods`.
+ Replace the following values:
+ * `base-url`: Your specific endpoint URL.
+ * `api-key-env`: Your Runpod API key.
+ * `openchat/openchat-3.5-1210`: The model name you want to use.
+ * `aliases: ["openchat"]`: Your preferred model alias.
+ * `max-input-chars`: The maximum input characters for your model.
+
- * Specify the Runpod API and model you want to use.
+
+ Test your setup by piping command output to Mods:
```sh
ls ~/Downloads | mods --api runpod --model openchat -f "tell my fortune based on these files" | glow
```
- * This command will list the files in your `~/Downloads` directory, pass them to Mods using the Runpod API and the specified model, and format the response as a fortune based on the files. The output will then be piped to `glow` for a visually appealing display.
+ This lists files in your `~/Downloads` directory, sends them to Mods using the Runpod API and specified model, and pipes the output to `glow` for formatted display.
+
+
diff --git a/integrations/n8n-integration.mdx b/integrations/n8n-integration.mdx
new file mode 100644
index 00000000..a3562164
--- /dev/null
+++ b/integrations/n8n-integration.mdx
@@ -0,0 +1,164 @@
+---
+sidebarTitle: n8n
+title: "Integrate Runpod with n8n"
+description: "Deploy a vLLM worker on Runpod and connect it to n8n for AI-powered workflow automation."
+tag: "NEW"
+---
+
+Learn how to integrate Runpod Serverless with n8n, a workflow automation tool. By the end of this tutorial, you'll have a vLLM endpoint running on Runpod that you can use within your n8n workflows.
+
+
+For a faster start, you can point your n8n workflow to an OpenAI-compatible [Public Endpoint](/hub/public-endpoints) instead of deploying a vLLM worker. To do this, skip to [step 2](#step-2%3A-create-an-n8n-workflow) to create your workflow, then in step 3, set the base URL to the Public Endpoint URL for Qwen3 32B AWQ:
+
+```
+https://api.runpod.ai/v2/qwen3-32b-awq/openai/v1
+```
+
+
+## What you'll learn
+
+In this tutorial, you'll learn how to:
+
+* Deploy a vLLM worker serving the `Qwen/qwen3-32b-awq` model.
+* Configure your environment variables for n8n compatibility.
+* Create a simple n8n workflow to test your integration.
+* Connect your workflow to your Runpod endpoint.
+
+## Requirements
+
+Before you begin, you'll need:
+
+* A [Runpod account](/get-started/manage-accounts) (with available credits).
+* A [Runpod API key](/get-started/api-keys).
+* An [n8n](https://n8n.io/) account.
+
+## Step 1: Deploy a vLLM worker on Runpod
+
+First, you'll deploy a vLLM worker to serve the `Qwen/qwen3-32b-awq` model.
+
+
+
+ Open the [Runpod console](https://www.console.runpod.io/serverless) and navigate to the Serverless page.
+
+ Click **New Endpoint** and select **vLLM** under **Ready-to-Deploy Repos**.
+
+
+
+
+
+ For more details on vLLM deployment options, see [Deploy a vLLM worker](/serverless/vllm/get-started).
+
+
+ In the deployment modal:
+
+ * In the **Model** field, enter `Qwen/qwen3-32b-awq`.
+ * Expand the **Advanced** section to configure your vLLM environment variables:
+ * Set **Max Model Length** to `8192`.
+ * Near the bottom of the page, check **Enable Auto Tool Choice**.
+ * Set **Tool Call Parser** to `Hermes`.
+ * Set **Reasoning Parser** to `Qwen3`.
+ * Click **Next**.
+ * Click **Create Endpoint**.
+
+
+ When using a different model, you may need to adjust your vLLM environment variables to ensure your model returns responses in the format that n8n expects.
+
+
+ Your endpoint will now begin initializing. This may take several minutes while Runpod provisions resources and downloads your model. Wait until the status shows as **Running**.
+
+
+
+ Once deployed, you'll be taken to the detail page for your endpoint in the Runpod console. You can find your endpoint ID in the **Overview** tab:
+
+
+
+
+
+ You can also find your endpoint ID in the URL of the endpoint detail page. For example, if the URL for your endpoint is `https://console.runpod.io/serverless/user/endpoint/isapbl1e254mbj`, the endpoint ID is `isapbl1e254mbj`.
+
+ Copy your endpoint ID to your clipboard. You'll need it to configure your n8n workflow.
+
+
+
+## Step 2: Create an n8n workflow
+
+Next, you'll create a simple n8n workflow to test your integration.
+
+
+
+ Open n8n and navigate to your workspace, then click **Create Workflow**.
+
+
+ Click **Add first step** and select **On chat message**. Click **Test chat** to confirm.
+
+
+
+  Click the **+** button, search for **AI Agent**, and select it. Click **Execute step** to confirm.
+
+
+
+ Click the **+** button labeled **Chat Model**. Search for **OpenAI Chat Model** and select it.
+
+
+
+ Click the dropdown under **Credential to connect with** and select **Create new credential**.
+
+
+
+
+## Step 3: Configure the OpenAI Chat Model node
+
+Now you'll configure the n8n OpenAI Chat Model node to use the model running on your Runpod endpoint.
+
+
+
+
+ Under **API Key**, add your Runpod API Key. You can create an API key on the settings page of the [Runpod console](https://console.runpod.io/user/settings).
+
+
+
+ Under **Base URL**, replace the default OpenAI URL with your Runpod endpoint URL:
+
+ ```
+ https://api.runpod.ai/v2/ENDPOINT_ID/openai/v1
+ ```
+
+ Replace `ENDPOINT_ID` with your vLLM endpoint ID from Step 1.
+
+
+
+ Click **Save**, and n8n will automatically test your endpoint connection.
+
+ It may take a few minutes for your endpoint to scale up a worker to process the request. You can monitor the request using the **Workers** and **Requests** tabs for your vLLM endpoint in the Runpod console.
+
+  If you see the message "Connection tested successfully," your endpoint is reachable, but that doesn't guarantee it's fully compatible with n8n. You'll verify compatibility in the next step.
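+
+  If you'd like to confirm the response format yourself before building out the workflow, you can send a test request from a terminal. This is a minimal sketch using the same OpenAI-compatible `/chat/completions` route and the model name from Step 1; replace `ENDPOINT_ID` and set `RUNPOD_API_KEY` in your environment:
+
+  ```sh
+  # Send a test chat completion to your vLLM endpoint
+  curl https://api.runpod.ai/v2/ENDPOINT_ID/openai/v1/chat/completions \
+    -H "Authorization: Bearer $RUNPOD_API_KEY" \
+    -H "Content-Type: application/json" \
+    -d '{"model": "qwen/qwen3-32b-awq", "messages": [{"role": "user", "content": "Hello, how are you?"}]}'
+  ```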
+
+
+
+ Press escape to return to the OpenAI Chat Model configuration modal.
+
+ Under **Model**, select `qwen/qwen3-32b-awq`, then press escape to return to the workflow canvas.
+
+
+
+
+ Type a test message into the chat box like "Hello, how are you?" and press enter.
+
+ If everything is working correctly, you should see each of the nodes in your workflow go green to indicate successful execution, and a response from the model in the chat box.
+
+
+ Make sure to **Save** your workflow before closing it, as n8n may not save changes to your model node configuration automatically.
+
+
+
+
+
+## Next steps
+
+Congratulations! You've successfully used Runpod to power an AI agent on n8n.
+
+Now that you've integrated with n8n, you can:
+
+* Build complex AI-powered workflows using your Runpod endpoints.
+* Explore other [integration options](/integrations/overview) with Runpod.
+* Learn about [OpenAI compatibility](/serverless/vllm/openai-compatibility) features in vLLM.
diff --git a/integrations/overview.mdx b/integrations/overview.mdx
new file mode 100644
index 00000000..a936a791
--- /dev/null
+++ b/integrations/overview.mdx
@@ -0,0 +1,90 @@
+---
+title: "Integrate Runpod with external tools"
+sidebarTitle: "Overview"
+description: "Learn how to integrate Runpod compute resources with external tools and agentic frameworks."
+tag: "NEW"
+---
+
+You can integrate Runpod with any system that supports custom endpoint configuration. Integration is usually straightforward: any library or framework that accepts a custom base URL for API calls will work with Runpod without specialized adapters or connectors. This means you can use Runpod with tools like n8n, CrewAI, LangChain, and many others by simply pointing them to your Runpod endpoint URL.
+
+## Endpoint integration options
+
+Runpod offers four deployment options for endpoint integrations:
+
+### Public Endpoints
+
+[Public Endpoints](/hub/public-endpoints) are pre-deployed AI models that you can use without setting up your own Serverless endpoint. They're vLLM-compatible and return OpenAI-compatible responses, so you can get started quickly or test things out without deploying infrastructure.
+
+The following Public Endpoint URLs are available for OpenAI-compatible models:
+
+```
+# Public Endpoint for Qwen3 32B AWQ
+https://api.runpod.ai/v2/qwen3-32b-awq/openai/v1
+
+# Public Endpoint for IBM Granite 4.0 H Small
+https://api.runpod.ai/v2/granite-4-0-h-small/openai/v1
+```
+
+### vLLM workers
+
+[vLLM workers](/serverless/vllm/overview) provide an inference engine that returns [OpenAI-compatible responses](/serverless/vllm/openai-compatibility), making them ideal for tools that expect OpenAI's API format.
+
+When you deploy a vLLM endpoint, access it using the OpenAI-compatible API at:
+
+```
+https://api.runpod.ai/v2/ENDPOINT_ID/openai/v1
+```
+
+Where `ENDPOINT_ID` is your Serverless endpoint ID.
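+
+For example, once your endpoint is running, you can confirm that it responds in OpenAI format with a quick request from a terminal. This is a minimal sketch assuming a deployed vLLM worker and a `RUNPOD_API_KEY` environment variable; substitute your own endpoint ID and the model you deployed:
+
+```sh
+# Minimal OpenAI-compatible request against a Runpod vLLM endpoint
+curl https://api.runpod.ai/v2/ENDPOINT_ID/openai/v1/chat/completions \
+  -H "Authorization: Bearer $RUNPOD_API_KEY" \
+  -H "Content-Type: application/json" \
+  -d '{"model": "MODEL_NAME", "messages": [{"role": "user", "content": "Hello!"}]}'
+```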
+
+For a full walkthrough of how to integrate a vLLM endpoint with an agentic framework, see the n8n integration guide:
+
+
+
+ Connect Runpod to n8n for AI-powered workflow automation.
+
+
+
+### SGLang workers
+
+[SGLang workers](https://github.com/runpod-workers/worker-sglang) also return OpenAI-compatible responses, offering optimized performance for certain model types and use cases.
+
+### Load balancing endpoints
+
+[Load balancing endpoints](/serverless/load-balancing/overview) let you create custom endpoints where you define your own inputs and outputs. This gives you complete control over the API contract and is ideal when you need custom behavior beyond standard inference patterns.
+
+## Model configuration and compatibility
+
+Some models require specific vLLM environment variables to work with external tools and frameworks. You may need to set a custom chat template or [tool call parser](https://docs.vllm.ai/en/latest/features/tool_calling.html) to ensure your model returns responses in the format your integration expects.
+
+For example, you can configure the `Qwen/qwen3-32b-awq` model for OpenAI compatibility by adding these environment variables in your vLLM endpoint settings:
+
+```txt
+ENABLE_AUTO_TOOL_CHOICE=true
+REASONING_PARSER=qwen3
+TOOL_CALL_PARSER=hermes
+```
+
+These settings enable automatic tool choice selection and set the right parsers for the Qwen3 model to work with tools that expect OpenAI-formatted responses.
+
+For more information about tool calling configuration and available parsers, see the [vLLM tool calling documentation](https://docs.vllm.ai/en/latest/features/tool_calling.html).
+
+## Compatible frameworks
+
+The same integration pattern works with any framework that supports custom OpenAI-compatible endpoints, including:
+
+* **n8n**: A workflow automation tool with AI integration capabilities.
+* **CrewAI**: A framework for orchestrating role-playing autonomous AI agents.
+* **LangChain**: A framework for developing applications powered by language models.
+* **AutoGen**: Microsoft's framework for building multi-agent conversational systems.
+* **Haystack**: An end-to-end framework for building search systems and question answering.
+
+Configure these frameworks to use your Runpod endpoint URL as the base URL, and provide your Runpod API key for authentication.
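+
+The exact setup varies by framework, but many OpenAI-compatible clients read these values from environment variables. The sketch below uses the variable names read by the official OpenAI SDKs; some tools expect `OPENAI_API_BASE` instead of `OPENAI_BASE_URL`, so check your framework's documentation:
+
+```sh
+# Point OpenAI-compatible tooling at your Runpod endpoint
+export OPENAI_BASE_URL="https://api.runpod.ai/v2/ENDPOINT_ID/openai/v1"
+export OPENAI_API_KEY="YOUR_RUNPOD_API_KEY"
+```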
+
+## Third-party integrations
+
+For infrastructure management and orchestration, Runpod also integrates with:
+
+* [**dstack**](/integrations/dstack): Simplified Pod orchestration for AI/ML workloads.
+* [**SkyPilot**](/integrations/skypilot): Multi-cloud execution framework.
+* [**Mods**](/integrations/mods): AI-powered command-line tool.
diff --git a/integrations/skypilot.mdx b/integrations/skypilot.mdx
index c48d63be..853f3304 100644
--- a/integrations/skypilot.mdx
+++ b/integrations/skypilot.mdx
@@ -3,41 +3,56 @@ title: "Running Runpod on SkyPilot"
sidebarTitle: "SkyPilot"
---
-[SkyPilot](https://skypilot.readthedocs.io/en/latest/) is a framework for executing LLMs, AI, and batch jobs on any cloud, offering maximum cost savings, highest GPU availability, and managed execution.
+[SkyPilot](https://skypilot.readthedocs.io/en/latest/) is a framework for running LLMs, AI, and batch jobs on any cloud.
-This integration leverages the Runpod CLI infrastructure, streamlining the process of spinning up on-demand pods and deploying serverless endpoints with SkyPilot.
+This integration uses the Runpod CLI infrastructure to spin up on-demand Pods and deploy Serverless endpoints with SkyPilot.
-## Getting started
+## Get started
-To begin using Runpod with SkyPilot, follow these steps:
+
+
+ Get your API key from the [Runpod Settings](https://www.console.runpod.io/user/settings) page.
+
-1. **Obtain Your API Key**: Visit the [Runpod Settings](https://www.console.runpod.io/user/settings) page to get your API key. If you haven't created an account yet, you'll need to do so before obtaining the key.
-
-2. **Install Runpod**: Use the following command to install the latest version of Runpod:
+
+ Install the latest version of Runpod:
```sh
pip install "runpod>=1.6"
```
+
-3. **Configure Runpod**: Enter `runpod config` in your CLI and paste your API key when prompted.
+
+ Run `runpod config` and paste your API key when prompted.
+
-4. **Install SkyPilot Runpod Cloud**: Execute the following command to install the [SkyPilot Runpod cloud](https://skypilot.readthedocs.io/en/latest/getting-started/installation.html#runpod):
+
+ Install the [SkyPilot Runpod cloud](https://skypilot.readthedocs.io/en/latest/getting-started/installation.html#runpod):
```sh
pip install "skypilot-nightly[runpod]"
```
+
-5. **Verify Your Setup**: Run `sky check` to ensure your credentials are correctly set up and you're ready to proceed.
-
-## Running a Project
+
+ Run `sky check` to verify your credentials are set up correctly.
+
+
-After setting up your environment, you can seamlessly spin up a cluster in minutes:
+## Run a project
-1. **Create a New Project Directory**: Run `mkdir hello-sky` to create a new directory for your project.
+
+
+ Create a new directory for your project:
-2. **Navigate to Your Project Directory**: Change into your project directory with `cd hello-sky`.
+ ```sh
+ mkdir hello-sky
+ cd hello-sky
+ ```
+
-3. **Create a Configuration File**: Enter `cat > hello_sky.yaml` and input the following configuration details:
+
+ Create a file named `hello_sky.yaml` with the following content:
```yml
resources:
@@ -60,9 +75,17 @@ After setting up your environment, you can seamlessly spin up a cluster in minut
echo "Hello, SkyPilot!"
conda env list
```
+
-4. **Launch Your Project**: With your configuration file created, launch your project on the cluster by running `sky launch -c mycluster hello_sky.yaml`.
+
+ Launch your project on the cluster:
-5. **Confirm Your GPU Type**: You should see the available GPU options on Secure Cloud appear in your command line. Once you confirm your GPU type, your cluster will start spinning up.
+ ```sh
+ sky launch -c mycluster hello_sky.yaml
+ ```
+
-With this integration, you can leverage the power of Runpod and SkyPilot to efficiently run your LLMs, AI, and batch jobs on any cloud.
+
+ You'll see the available GPU options. Confirm your GPU type and the cluster will start spinning up.
+
+
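+
+When you're finished experimenting, shut the cluster down so you aren't billed for idle Pods. A typical clean-up with the SkyPilot CLI looks like this, using the cluster name from the launch step:
+
+```sh
+# List your clusters, then tear down the tutorial cluster
+sky status
+sky down mycluster
+```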