# Video Generation with ComfyUI on AMD Instinct MI300X GPU

With the rapid development of Artificial Intelligence Generated Content (AIGC) technologies, text-to-video and image-to-video generation are becoming powerful tools for creators, researchers, and media professionals.

[ComfyUI](https://github.com/comfyanonymous/ComfyUI) is a node-based graphical interface designed for diffusion models, enabling users to visually construct AI image/video generation workflows through modular operations. Its flexible node design, efficiency, and compatibility make it an excellent choice for boosting productivity in creative workflows.

This hands-on workshop walks you through setting up and running ComfyUI on **AMD Instinct MI300X GPUs using ROCm software**. You will learn how to configure your environment, install ComfyUI, and generate videos from text or a combination of text and images on AMD's latest data center GPUs.

## Prerequisites  
This workshop was developed and tested with the following setup:  

### Hardware  
- **AMD Instinct MI300X GPUs**:  We're using AMD Dev Cloud with Instinct MI300X hardware for this workshop.  

### Software  
- **ROCm 7.0**: Confirm the ROCm installation by running:  

In [None]:
!rocm-smi

This command will list your available AMD GPUs along with their key details.
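If you would rather run the same check from Python, for instance to stop a script early when ROCm is missing, the following is a minimal sketch. It assumes only that `rocm-smi` is on the `PATH`; the helper name `rocm_smi_output` is illustrative and not part of ROCm.

```python
import shutil
import subprocess


def rocm_smi_output(tool="rocm-smi"):
    """Return the text printed by `tool`, or None if it is missing or fails."""
    path = shutil.which(tool)
    if path is None:
        return None  # the tool is not on PATH
    try:
        result = subprocess.run([path], capture_output=True, text=True, timeout=60)
    except (OSError, subprocess.TimeoutExpired):
        return None
    return result.stdout if result.returncode == 0 else None


if __name__ == "__main__":
    report = rocm_smi_output()
    if report is None:
        print("rocm-smi not found or failed; check the ROCm installation")
    else:
        print(report)
```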

# ComfyUI setup

To set up the ComfyUI inference environment, follow the steps below.

#### Verify the PyTorch installation

Verify that PyTorch is correctly installed and can see the GPU.

**Step 1** Verify that PyTorch is installed and can be imported.

In [None]:
!python3 -c 'import torch' 2> /dev/null && echo 'Success' || echo 'Failure'

The expected result is `Success`.

**Step 2** Confirm the GPU is available.

In [None]:
!python3 -c 'import torch; print(torch.cuda.is_available())'

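The expected result is `True`. ROCm builds of PyTorch expose AMD GPUs through the `torch.cuda` API, so this check applies to the MI300X as well.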

**Step 3** Display the installed GPU device name.

In [None]:
!python3 -c "import torch; print(f'device name [0]:', torch.cuda.get_device_name(0))"

The expected result is similar to: `device name [0]: AMD Instinct MI300X`
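The three steps above can also be collected into a single Python cell. This is a sketch rather than an official check: the function name `pytorch_gpu_report` is made up for this example, and it reads `torch.version.hip` on the assumption that a ROCm build of PyTorch reports a version string there.

```python
def pytorch_gpu_report():
    """Run the import, availability, and device-name checks in one call."""
    report = {
        "torch_installed": False,
        "gpu_available": False,
        "device_name": None,
        "hip_version": None,
    }
    try:
        import torch
    except ImportError:
        return report  # Step 1 failed: PyTorch cannot be imported

    report["torch_installed"] = True
    # Step 2: ROCm builds expose the GPU through the torch.cuda namespace.
    report["gpu_available"] = bool(torch.cuda.is_available())
    # Assumed to be a version string on ROCm builds and None otherwise.
    report["hip_version"] = getattr(torch.version, "hip", None)
    if report["gpu_available"]:
        # Step 3: name of the first visible device.
        report["device_name"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    for key, value in pytorch_gpu_report().items():
        print(f"{key}: {value}")
```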

# ComfyUI installation

Install ComfyUI from source on the system with the AMD GPU.

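Change into the `ComfyUI` repository directory, then comment out the `torch`, `torchaudio`, and `torchvision` entries in `requirements.txt` so that `pip` does not replace the ROCm build of PyTorch with the CUDA build. The `-i.bak` option keeps the original file as `requirements.txt.bak`: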

In [None]:
%cd ComfyUI
!sed -i.bak -E '/^(torch|torchaudio|torchvision)([<>=~!0-9.]*)?$/s/^/# /' requirements.txt
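To see what the `sed` expression does before running it, here is a small Python sketch that applies the same pattern to a list of lines. The sample requirement lines are made up for illustration and are not copied from ComfyUI's `requirements.txt`.

```python
import re

# Same pattern as the sed expression: a bare torch, torchaudio, or torchvision
# requirement, optionally followed by a version specifier, and nothing else.
TORCH_LINE = re.compile(r"^(torch|torchaudio|torchvision)([<>=~!0-9.]*)?$")


def comment_out_torch(lines):
    """Prefix matching requirement lines with '# ' and leave the rest untouched."""
    return ["# " + line if TORCH_LINE.match(line) else line for line in lines]


# Hypothetical sample input.
sample = ["torch", "torchsde", "torchvision>=0.15", "numpy>=1.25.0", "torchaudio"]
for line in comment_out_torch(sample):
    print(line)
```

Note that a package such as `torchsde` is left alone, because the pattern requires the line to end right after the package name or its version specifier.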

Install the dependencies:

In [None]:
!pip3 install -r requirements.txt

# Running video generation

Follow these steps to generate a video from text, or from text and an image.

---

## Model Setup

### 📦 Required Models

We’ve already downloaded the necessary models and placed them in the correct ComfyUI directories:

- **Diffusion Models**:  
  - wan2.2_i2v_high_noise_14B_fp16.safetensors  
  - wan2.2_i2v_low_noise_14B_fp16.safetensors
  - qwen_image_edit_2509_bf16.safetensors

- **VAE Model**:  
  - wan_2.1_vae.safetensors
  - qwen_image_vae.safetensors  

- **Text Encoder Model**:  
  - umt5_xxl_fp16.safetensors
  - qwen_2.5_vl_7b.safetensors

- **LoRAs**:  
  - wan2.2_i2v_lightx2v_4steps_lora_v1_high_noise.safetensors  
  - wan2.2_i2v_lightx2v_4steps_lora_v1_low_noise.safetensors
  - Qwen-Image-Edit-2509-Lightning-4steps-V1.0-bf16.safetensors
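If you want to confirm that these files are in place, the following sketch checks for each one. It assumes you run it from the ComfyUI directory and that the files sit in the `models/` subfolders described in the Directory Structure section; the helper name `missing_models` is illustrative.

```python
from pathlib import Path

# File names copied from the list above, grouped by models/ subfolder.
EXPECTED_MODELS = {
    "diffusion_models": [
        "wan2.2_i2v_high_noise_14B_fp16.safetensors",
        "wan2.2_i2v_low_noise_14B_fp16.safetensors",
        "qwen_image_edit_2509_bf16.safetensors",
    ],
    "vae": ["wan_2.1_vae.safetensors", "qwen_image_vae.safetensors"],
    "text_encoders": ["umt5_xxl_fp16.safetensors", "qwen_2.5_vl_7b.safetensors"],
    "loras": [
        "wan2.2_i2v_lightx2v_4steps_lora_v1_high_noise.safetensors",
        "wan2.2_i2v_lightx2v_4steps_lora_v1_low_noise.safetensors",
        "Qwen-Image-Edit-2509-Lightning-4steps-V1.0-bf16.safetensors",
    ],
}


def missing_models(models_dir, expected=EXPECTED_MODELS):
    """Return 'subfolder/file' entries for expected files that are not present."""
    root = Path(models_dir)
    return [
        f"{subdir}/{name}"
        for subdir, names in expected.items()
        for name in names
        if not (root / subdir / name).is_file()
    ]


if __name__ == "__main__":
    # Assumes the current working directory is the ComfyUI checkout.
    missing = missing_models("models")
    print("All models found" if not missing else "Missing:\n" + "\n".join(missing))
```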

---

### 📂 Directory Structure

The models are organized inside the `ComfyUI/models` folder as follows:

```text
ComfyUI/
└───📂 models/
    ├───📂 diffusion_models/
    │   ├─── wan2.2_i2v_high_noise_14B_fp16.safetensors
    │   ├─── wan2.2_i2v_low_noise_14B_fp16.safetensors
    │   └─── qwen_image_edit_2509_bf16.safetensors
    ├───📂 text_encoders/
    │   ├─── umt5_xxl_fp16.safetensors
    │   └─── qwen_2.5_vl_7b.safetensors
    ├───📂 vae/
    │   ├─── wan_2.1_vae.safetensors
    │   └─── qwen_image_vae.safetensors
    └───📂 loras/
        ├─── put_loras_here
        ├─── wan2.2_i2v_lightx2v_4steps_lora_v1_high_noise.safetensors
        ├─── wan2.2_i2v_lightx2v_4steps_lora_v1_low_noise.safetensors
        └─── Qwen-Image-Edit-2509-Lightning-4steps-V1.0-bf16.safetensors
```



## 🚀 Launch the Server

To start the ComfyUI server, run the following command:

In [None]:
!python3 main.py --listen 0.0.0.0 --port 9000

Once you see a message like `To see the GUI go to: http://0.0.0.0:9000` in the terminal output, it means the **ComfyUI server has launched successfully**.  

⚠️  **Do not** open `http://0.0.0.0:9000` directly. Instead, use **Your App URL**, shown on the **Launch Notebook** page, to access the interface in your browser.
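Because the launch cell keeps running for as long as the server is up, any readiness check has to run elsewhere, for example in a separate terminal on the same machine. The sketch below only tests whether something accepts TCP connections on the port used above; it does not verify that the process is ComfyUI. The function name `server_is_listening` is illustrative.

```python
import socket


def server_is_listening(host="127.0.0.1", port=9000, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("listening" if server_is_listening() else "not reachable yet")
```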

## 🌐 Open the ComfyUI Interface

Once you open **Your App URL** in your browser, the ComfyUI interface loads.  

You’ll see:  
- A **node-based canvas** in the main area  
- A **sidebar on the left**, where you can load workflows and start generating  

---

## 🔄 Load the Workflow  

ComfyUI workflows define the full pipeline and all parameters required to generate an image or video. These workflows are typically saved as a JSON file or embedded within an animated WebP image (`*.webp`). You can create your own workflow from scratch or customize one provided by third parties.  
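Since a workflow is plain JSON, you can inspect it outside the GUI. The sketch below counts the node types in a parsed workflow. It assumes one of two layouts: a top-level `nodes` list whose entries have a `type` field (the format saved from the interface), or a mapping from node id to an object with a `class_type` field (the API export). The inline example is a hand-written stand-in, not the workshop's workflow file.

```python
import json
from collections import Counter


def count_node_types(workflow):
    """Count node types in a ComfyUI workflow given as a parsed JSON object."""
    if isinstance(workflow.get("nodes"), list):
        # Interface export: a "nodes" list whose entries carry a "type" field.
        types = [node.get("type") for node in workflow["nodes"]]
    else:
        # API export: node id mapped to {"class_type": ..., "inputs": ...}.
        types = [
            node.get("class_type")
            for node in workflow.values()
            if isinstance(node, dict)
        ]
    return Counter(t for t in types if t)


# Tiny illustrative workflow; a real file would be read with json.load(open(path)).
example = json.loads(
    '{"nodes": [{"id": 1, "type": "LoadImage"},'
    ' {"id": 2, "type": "KSampler"}, {"id": 3, "type": "KSampler"}]}'
)
for node_type, count in count_node_types(example).most_common():
    print(f"{count:3d}  {node_type}")
```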

In this workshop, we’ll use the **multimodel_workflow.json** template.  

---

### 🧩 How to Load the Workflow  

Once you've launched the ComfyUI interface in your browser:  

1. Click **Workflows** in the sidebar.  
2. Select **multimodel_workflow.json**.  

The full workflow, with all pre-configured nodes, will load onto your canvas and be ready to run.  

With this setup you can immediately start experimenting with **multi-model video generation** using the **Wan 2.2 14B model** and the **Qwen-Image model**.  



## ▶️ Run the Workflow

Before running the video generation, make sure the correct models and settings are loaded.

---

### 🚀 Execute Video Generation

Click the **Run** button, or press **Ctrl (Cmd) + Enter** to start the video generation process.

![ComfyUI Interface](./assets/workflow.png)

## 🧠 Advanced Assignment: Create an AMD-Themed Video in ComfyUI

### 🎯 Objective
Your task is to create a short, cinematic video inspired by **AMD’s technological vision**, using a customized workflow and prompts of your choice.

---

### 🛠️ Instructions

1. **Choose Your Video Model**

   You are free to choose **any video generation model** supported in ComfyUI (e.g., `wan2.2` or `LTX-Video`).

   > 💡 **Tip**: If you want to use another template, for example **Wan 2.2 14B Fun Camera Control**, download the required models and place them in the directories that the template specifies.  
   > For example, in Jupyter Notebook, run:
   > ```bash
   > !wget https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/resolve/main/split_files/diffusion_models/wan2.2_fun_camera_high_noise_14B_fp8_scaled.safetensors \
   >   -O models/diffusion_models/wan2.2_fun_camera_high_noise_14B_fp8_scaled.safetensors
   > ```

2. **Design Your Workflow**

   - You may start from an existing workflow template or build your own from scratch.

3. **Set Your Parameters**
   - Adjust resolution and number of frames (e.g., 720x720, 24 frames)
   - Tweak prompts, model settings, or sampler parameters to shape the style
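The number of frames determines the clip length: frames are roughly the duration in seconds multiplied by the frame rate. The sketch below does that arithmetic. Two things in it are assumptions rather than settings taken from this workshop: the default of 16 frames per second, and the optional snapping to a frame count of the form 4n + 1, which Wan-family workflows commonly expect. Check the values in your own workflow's nodes.

```python
def frames_for_duration(seconds, fps=16, snap_4n_plus_1=True):
    """Approximate frame count for a clip of the given length.

    fps=16 and the 4n + 1 snapping are assumptions; adjust them to your model.
    """
    frames = max(1, round(seconds * fps))
    if snap_4n_plus_1:
        # Round to the nearest value of the form 4n + 1 (ties round up).
        frames = 4 * ((frames + 1) // 4) + 1
    return frames


if __name__ == "__main__":
    for seconds in (5, 10):
        print(f"{seconds} s -> {frames_for_duration(seconds)} frames")
```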

---

### 🎬 Creative Task: *“The AMD Vision”*

Create a **5-10 second video** with one of the following AMD-related themes:

- 🧠 *AI on AMD Instinct* — visualize powerful AI inference or training on MI300X
- ⚙️ *Next-Gen Compute* — represent raw compute performance, silicon, or futuristic chips
- 🔥 *Gaming with Radeon* — imagine energy, motion, or light inspired by GPU gaming

---

### ✅ Requirements
- Export as **`.mp4`** and save your **workflow `.json`**
- Include a **short description** of your concept, model choice, and prompt design

---

### 💡 Bonus Ideas

- Start from an uploaded image (e.g., AMD product photo)

---

### 📤 Submission

Please submit the following:
- 🎞️ Your generated video (`.mp4`)
- 🧩 Your ComfyUI workflow (`.json`)
- 📝 A short write-up on your approach and prompt strategy

---

### 🚀 Get Creative

Feel free to explore and experiment — the goal is to combine **technical skill** and **creative storytelling** using video generation models powered by AMD hardware.
