28 changes: 28 additions & 0 deletions vllm/README.md
@@ -21,6 +21,7 @@ llm-scaler-vllm is an extended and optimized version of vLLM, specifically adapt
2.5 [Omni Model Support](#25-omni-model-support)
2.6 [Data Parallelism (DP)](#26-data-parallelism-dp)
2.7 [Finding maximum Context Length](#27-finding-maximum-context-length)
2.8 [Multi-Modal Webui](#28-multi-modal-webui)
3. [Supported Models](#3-supported-models)
4. [Troubleshooting](#4-troubleshooting)
5. [Performance tuning](#5-performance-tuning)
@@ -2314,6 +2315,33 @@ In this case, you should adjust the launch command with:
--max-model-len 114432
```

### 2.8 Multi-Modal Webui
The project provides two optimized interfaces for interacting with Qwen2.5-VL models:

#### 📌 Core Components
- **Inference Engine**: vLLM (Intel-optimized)
- **Interfaces**:
- Gradio (for rapid prototyping)
- ComfyUI (for complex workflows)

#### 🚀 Deployment Options

#### Option 1: Gradio Deployment (Recommended for Most Users)
- See `/llm-scaler/vllm/webui/multi-modal-gradio/README.md` for implementation details.

#### Option 2: ComfyUI Deployment (Advanced Workflows)
- See `/llm-scaler/vllm/webui/multi-modal-comfyui/README.md` for implementation details.


#### 🔧 Configuration Guide

| Parameter | Effect | Recommended Value |
|-----------|--------|-------------------|
| `--quantization fp8` | Enables FP8 quantization for XPU acceleration | Required |
| `-tp 2` | Tensor parallelism degree | Match GPU count |
| `--max-model-len` | Maximum context window | 32768 (max) |
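Putting the parameters above together, a launch command might look like the following. This is a hedged sketch: the model path, served model name, host, and port are placeholders for your environment, not values mandated by this project.

```shell
# Hypothetical example combining the recommended parameters above.
# Adjust the model path, name, host, and port to your own setup.
vllm serve /llm/models/Qwen2.5-VL-3B-Instruct \
    --served-model-name Qwen2.5-VL-3B-Instruct \
    --quantization fp8 \
    -tp 2 \
    --max-model-len 32768 \
    --host 0.0.0.0 \
    --port 8000
```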

---

## 3. Supported Models
284 changes: 284 additions & 0 deletions vllm/webui/multi-modal-comfyui/README.md
@@ -0,0 +1,284 @@
# Qwen2.5-VL-3B-Instruct Deployment Guide (ComfyUI + Intel GPU + Linux)

This document provides comprehensive instructions for deploying the `Qwen2.5-VL-3B-Instruct` multimodal LLM on Linux systems with `Intel GPU` acceleration via a `ComfyUI` workflow.

## 🛠️ Installation Procedure
### 1. Environment Setup
```bash
# Install system dependencies
sudo apt update && sudo apt install -y \
git python3-pip python3-venv \
ocl-icd-opencl-dev

# Configure Intel GPU drivers (if not present)
sudo apt install -y \
intel-opencl-icd \
intel-level-zero-gpu \
level-zero
```

### 2. Conda Environment Configuration
```bash
conda create -n comfyqwen python=3.11
conda activate comfyqwen
```

### 3. ComfyUI Installation
```bash
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ./ComfyUI

# Install Intel-optimized PyTorch
pip install torch torchvision torchaudio \
--index-url https://download.pytorch.org/whl/xpu

# For nightly builds with potential performance improvements:
# pip install --pre torch torchvision torchaudio \
# --index-url https://download.pytorch.org/whl/nightly/xpu

pip install -r requirements.txt
```

### 4. Qwen2.5-VL Custom Node Deployment
```bash
# Download the custom node definition (run from the ComfyUI directory)
git clone https://github.com/IuvenisSapiens/ComfyUI_Qwen2_5-VL-Instruct

# Move the node into ComfyUI's custom_nodes/ directory
mv ComfyUI_Qwen2_5-VL-Instruct custom_nodes/

# Create the model directory if it does not already exist, then place the
# downloaded Qwen2.5-VL-3B-Instruct model folder inside it
mkdir -p models/prompt_generator
```
<details><summary>ComfyUI_Qwen2_5-VL-Instruct_workflow.json</summary>
{
"id": "9f2dfc63-3d19-433d-a7c0-49d83464f553",
"revision": 0,
"last_node_id": 59,
"last_link_id": 72,
"nodes": [
{
"id": 56,
"type": "Qwen2_VQA",
"pos": [
199.93017578125,
46.947696685791016
],
"size": [
322.1059265136719,
348
],
"flags": {},
"order": 2,
"mode": 0,
"inputs": [
{
"name": "source_path",
"shape": 7,
"type": "PATH",
"link": 70
},
{
"name": "image",
"shape": 7,
"type": "IMAGE",
"link": null
}
],
"outputs": [
{
"name": "STRING",
"type": "STRING",
"slot_index": 0,
"links": [
72
]
}
],
"properties": {
"Node name for S&R": "Qwen2_VQA",
"widget_ue_connectable": {}
},
"widgets_values": [
"Describe the video in detail",
"Qwen2.5-VL-3B-Instruct",
"none",
false,
0.7,
2048,
200704,
1003520,
1444,
"randomize",
"eager"
]
},
{
"id": 59,
"type": "PreviewAny",
"pos": [
702.7207641601562,
61.4115104675293
],
"size": [
140,
76
],
"flags": {},
"order": 3,
"mode": 0,
"inputs": [
{
"name": "source",
"type": "*",
"link": 72
}
],
"outputs": [],
"properties": {
"Node name for S&R": "PreviewAny"
},
"widgets_values": []
},
{
"id": 58,
"type": "VideoLoader",
"pos": [
-513.0911254882812,
130.9906768798828
],
"size": [
430.6719665527344,
452.4115295410156
],
"flags": {},
"order": 0,
"mode": 0,
"inputs": [],
"outputs": [
{
"name": "VIDEO",
"type": "VIDEO",
"links": null
},
{
"name": "PATH",
"type": "PATH",
"links": [
71
]
}
],
"properties": {
"Node name for S&R": "VideoLoader",
"widget_ue_connectable": {}
},
"widgets_values": [
"19_raw.mp4",
"image"
]
},
{
"id": 57,
"type": "MultiplePathsInput",
"pos": [
-49.730098724365234,
137.55857849121094
],
"size": [
210,
82
],
"flags": {},
"order": 1,
"mode": 0,
"inputs": [
{
"name": "path_1",
"type": "PATH",
"link": 71
}
],
"outputs": [
{
"name": "paths",
"type": "PATH",
"slot_index": 0,
"links": [
70
]
}
],
"properties": {
"Node name for S&R": "MultiplePathsInput",
"widget_ue_connectable": {}
},
"widgets_values": [
1
]
}
],
"links": [
[
70,
57,
0,
56,
0,
"PATH"
],
[
71,
58,
1,
57,
0,
"PATH"
],
[
72,
56,
0,
59,
0,
"*"
]
],
"groups": [],
"config": {},
"extra": {
"ds": {
"scale": 0.9646149645000006,
"offset": [
788.9511067206646,
382.6344411516708
]
},
"frontendVersion": "1.24.4",
"ue_links": [],
"links_added_by_ue": [],
"VHS_latentpreview": false,
"VHS_latentpreviewrate": 0,
"VHS_MetadataImage": true,
"VHS_KeepIntermediate": true
},
"version": 0.4
}
</details>

## 🚀 Launching ComfyUI
```bash
python main.py
```
Access the web interface at: `http://localhost:8188`
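Besides the browser UI, ComfyUI serves an HTTP API on the same port, so a workflow exported in API format ("Save (API format)" in the ComfyUI menu) can be queued programmatically. The sketch below assumes the default address and is not part of this repository:

```python
import json
import urllib.request

COMFYUI_URL = "http://localhost:8188"  # default ComfyUI address

def build_prompt_payload(workflow: dict) -> dict:
    """Wrap an API-format workflow in the body shape /prompt expects."""
    return {"prompt": workflow}

def queue_workflow(workflow: dict) -> dict:
    """POST an API-format workflow to ComfyUI's /prompt endpoint."""
    data = json.dumps(build_prompt_payload(workflow)).encode("utf-8")
    req = urllib.request.Request(
        f"{COMFYUI_URL}/prompt",
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Example of the request body shape; call queue_workflow(...) with a real
# API-format workflow once the server is running.
payload = build_prompt_payload({"1": {"class_type": "VideoLoader"}})
```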

## Post-Installation Configuration
1. Replace the final node in your workflow with `Preview Any` so the generated text is displayed
2. The model is loaded from: `./models/prompt_generator/Qwen2.5-VL-3B-Instruct/`

![Workflow Example](pic/image.png)

## References
- [ComfyUI GitHub](https://github.com/comfyanonymous/ComfyUI)
- [Intel PyTorch XPU](https://intel.github.io/intel-extension-for-pytorch/)
- [Qwen2.5 Model Card](https://huggingface.co/Qwen/Qwen2.5-VL-3B-Instruct)

Binary file added vllm/webui/multi-modal-comfyui/pic/image.png