# Wan 2.2 Video Generation with SGLang (udocker)

Run the Wan 2.2 text-to-video model behind an SGLang server in Google Colab, using udocker to run the Docker image without root privileges.

**Requirements:**
- Colab with GPU (T4, A100, or L4)
- ~40 GB of disk space for the 14B model (~15 GB for the smaller 5B alternative)
- ~20 GB of disk space for the Docker image
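
Before starting the downloads, it can help to confirm that the runtime has enough free disk. The snippet below is a minimal sketch using only the standard library; the helper names `free_disk_gb` and `has_enough_disk` are illustrative and not part of the original notebook, and the 60 GB figure in the usage note is just the sum of the two estimates above.

```python
import shutil


def free_disk_gb(path="/content"):
    """Return free space on the filesystem holding `path`, in GB (10**9 bytes)."""
    return shutil.disk_usage(path).free / 1e9


def has_enough_disk(required_gb, path="/content"):
    """True if at least `required_gb` GB are free on the filesystem holding `path`."""
    return free_disk_gb(path) >= required_gb
```

In a Colab cell, `has_enough_disk(60)` would check for roughly 40 GB of model weights plus 20 GB of image; treat the threshold as an estimate rather than a hard limit.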

## 1. Install udocker

In [None]:
# Install udocker
!pip install udocker
!udocker install

## 2. Load Docker Image

Use one of the two options below. Option A: Load from a tar file (if you uploaded it to Google Drive)

In [None]:
# Mount Google Drive (if using tar from Drive)
from google.colab import drive
drive.mount('/content/drive')

In [None]:
# Load image from tar file
!udocker load -i /content/drive/MyDrive/glm-image-sglang-v0.5.0.tar

Option B: Pull from Docker Hub (Recommended)

In [None]:
# Pull from Docker Hub (recommended - no need to upload tar file)
!udocker pull khalidnass/glm-image-sglang:v0.5.0

## 3. Create Container

In [None]:
# Create container from image
!udocker create --name=wan22 khalidnass/glm-image-sglang:v0.5.0

## 4. Download Wan 2.2 Model

In [None]:
# Install huggingface_hub for downloading
!pip install -q huggingface_hub

In [None]:
# Download Wan 2.2 T2V model (14B parameters, ~40GB)
# This takes 10-30 minutes depending on connection
from huggingface_hub import snapshot_download
import os

os.makedirs('/content/models', exist_ok=True)

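# local_dir_use_symlinks is deprecated in recent huggingface_hub releases,
# which write real files into local_dir by default, so it is omitted here.
snapshot_download(
    repo_id='Wan-AI/Wan2.2-T2V-A14B-Diffusers',
    local_dir='/content/models/Wan2.2-T2V-A14B-Diffusers'
)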
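
A quick way to sanity-check the download is to total the size of the files on disk and compare it with the expected figure (~40 GB for the 14B model). This is a rough sketch; `dir_size_gb` is a hypothetical helper name, and a matching size does not prove that every file is intact.

```python
import os


def dir_size_gb(root):
    """Sum the sizes of all regular files under `root`, in GB (10**9 bytes)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            file_path = os.path.join(dirpath, name)
            # Skip symlinks so linked files are not counted twice.
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total / 1e9
```

For example, `dir_size_gb('/content/models/Wan2.2-T2V-A14B-Diffusers')` should land near the size quoted above once the download has finished.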

In [None]:
# Alternative: Download smaller TI2V model (5B, ~15GB)
# snapshot_download(
#     repo_id='Wan-AI/Wan2.2-TI2V-5B-Diffusers',
#     local_dir='/content/models/Wan2.2-TI2V-5B-Diffusers'
# )

## 5. Run SGLang Server

In [None]:
# Make the host NVIDIA driver libraries available inside the container
!udocker setup --nvidia wan22

In [None]:
%%bash --bg
# Run SGLang server in background
udocker run \
  --volume=/content/models:/app/models \
  --env="MODEL_PATH=/app/models/Wan2.2-T2V-A14B-Diffusers" \
  --env="HF_HUB_OFFLINE=1" \
  wan22 \
  sglang serve --model-path /app/models/Wan2.2-T2V-A14B-Diffusers --port 30000 --host 0.0.0.0 \
  > /content/sglang.log 2>&1
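
If you switch to the smaller model, the model directory appears twice in the command above (`MODEL_PATH` and `--model-path`) and both must change together. One way to keep them in sync is to build the command in Python and launch it with `subprocess`. This is a sketch under the assumption that the same flags shown above are used; `build_server_command` and `start_server` are illustrative names, not part of the original notebook.

```python
import subprocess


def build_server_command(model_name, port=30000, container="wan22"):
    """Assemble the udocker command from the cell above for a given model directory."""
    model_path = f"/app/models/{model_name}"
    return [
        "udocker", "run",
        "--volume=/content/models:/app/models",
        f"--env=MODEL_PATH={model_path}",
        "--env=HF_HUB_OFFLINE=1",
        container,
        "sglang", "serve",
        "--model-path", model_path,
        "--port", str(port),
        "--host", "0.0.0.0",
    ]


def start_server(model_name, log_path="/content/sglang.log"):
    """Start the server in the background, sending stdout and stderr to log_path."""
    log_file = open(log_path, "w")
    return subprocess.Popen(
        build_server_command(model_name),
        stdout=log_file,
        stderr=subprocess.STDOUT,
    )
```

Calling `proc = start_server("Wan2.2-TI2V-5B-Diffusers")` would replace the `%%bash --bg` cell, and `proc.poll()` then reports whether the server process has exited.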

In [None]:
# Wait for server to start (check logs)
import time
print("Waiting for server to start (2-5 minutes)...")
time.sleep(120)
!tail -50 /content/sglang.log

In [None]:
# Check if server is ready
import requests

try:
    r = requests.get('http://localhost:30000/health', timeout=5)
    r.raise_for_status()
    print(f"Server ready (HTTP {r.status_code})")
except requests.exceptions.RequestException as e:
    print(f"Server not ready yet ({e}). Check logs:")
    !tail -20 /content/sglang.log
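
Because startup time varies, polling the health endpoint can be more reliable than a fixed sleep followed by a single check. The following is a minimal sketch using only the standard library; `is_healthy` and `wait_for_server` are hypothetical helpers, and the 600-second limit and 10-second interval are arbitrary defaults to adjust.

```python
import http.client
import time
import urllib.error
import urllib.request


def is_healthy(url, timeout=5):
    """Return True if a GET to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return False


def wait_for_server(url="http://localhost:30000/health", max_wait=600,
                    interval=10, probe=is_healthy, sleep=time.sleep):
    """Poll `url` every `interval` seconds until healthy or `max_wait` is reached."""
    waited = 0
    while True:
        if probe(url):
            return True
        if waited >= max_wait:
            return False
        sleep(interval)
        waited += interval
```

`wait_for_server()` returns `True` as soon as the endpoint responds; if it returns `False`, inspect `/content/sglang.log` for the cause.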

## 6. Generate Video

In [None]:
import requests
import base64
from IPython.display import Video, display

BASE_URL = "http://localhost:30000"

# Generate text-to-video
response = requests.post(
    f"{BASE_URL}/v1/video/generations",
    json={
        "prompt": "A cat walking through a beautiful garden with colorful flowers",
        "size": "832x480",
        "num_frames": 81,
        "num_inference_steps": 50,
        "guidance_scale": 5.0,
        "response_format": "b64_json"
    },
    timeout=1200
)

if response.status_code == 200:
    data = response.json()
    video_bytes = base64.b64decode(data["data"][0]["b64_json"])
    
    with open("/content/output.mp4", "wb") as f:
        f.write(video_bytes)
    
    print("Video generated successfully!")
    display(Video("/content/output.mp4", embed=True))
else:
    print(f"Error: {response.status_code}")
    print(response.text)
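
The decode-and-save step above can be pulled into a small helper so that a malformed response produces a clear error instead of a bare `KeyError`. This sketch assumes the response shape used in the cell above (`data[0].b64_json`); `save_b64_video` is an illustrative name, not part of any library.

```python
import base64


def save_b64_video(payload, path):
    """Decode payload["data"][0]["b64_json"] and write it to `path`.

    Returns the number of bytes written. Raises ValueError if the payload
    does not have the expected shape.
    """
    try:
        encoded = payload["data"][0]["b64_json"]
    except (KeyError, IndexError, TypeError) as exc:
        raise ValueError(f"unexpected response shape: {exc!r}") from exc
    video_bytes = base64.b64decode(encoded)
    with open(path, "wb") as f:
        f.write(video_bytes)
    return len(video_bytes)
```

With this in place, the success branch reduces to `save_b64_video(response.json(), "/content/output.mp4")`.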

## 7. More Examples

In [None]:
# Generate several videos with different prompts
# (reuses the imports and BASE_URL from the previous cell)
prompts = [
    "A rocket launching into space with flames and smoke",
    "Ocean waves crashing on a sandy beach at sunset",
    "A person walking through a snowy forest"
]

for i, prompt in enumerate(prompts):
    print(f"Generating video {i+1}: {prompt[:50]}...")
    
    response = requests.post(
        f"{BASE_URL}/v1/video/generations",
        json={
            "prompt": prompt,
            "size": "832x480",
            "num_frames": 81,
            "num_inference_steps": 50,
            "response_format": "b64_json"
        },
        timeout=1200
    )
    
    if response.status_code == 200:
        data = response.json()
        video_bytes = base64.b64decode(data["data"][0]["b64_json"])
        
        with open(f"/content/video_{i+1}.mp4", "wb") as f:
            f.write(video_bytes)
        
        print(f"  Saved to video_{i+1}.mp4")
        display(Video(f"/content/video_{i+1}.mp4", embed=True))
    else:
        print(f"  Error {response.status_code}: {response.text}")

## 8. Download Generated Videos

In [None]:
from google.colab import files

# Download videos to your local machine
files.download('/content/output.mp4')
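
The cell above downloads only `output.mp4`. To fetch the videos from the batch example as well, you could collect every `.mp4` file first. This is a small sketch; `list_videos` is a hypothetical helper, and it assumes all generated videos were written to `/content` as in the cells above.

```python
import glob
import os


def list_videos(directory="/content"):
    """Return the sorted paths of all .mp4 files directly inside `directory`."""
    return sorted(glob.glob(os.path.join(directory, "*.mp4")))

# In Colab, each file can then be downloaded in turn:
# for path in list_videos():
#     files.download(path)
```

Browsers may ask for permission before allowing several downloads from the same page.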

## Troubleshooting

**Server won't start:**
```python
!cat /content/sglang.log
```

**Out of memory:**
- Use the smaller model `Wan2.2-TI2V-5B-Diffusers`, and point both `MODEL_PATH` and `--model-path` in step 5 at it
- Reduce `num_frames` to 41
- Use a smaller `size`, for example `"480x272"`
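
To get a feel for how much these settings shrink the job, you can compare the number of generated pixels (width × height × frames) against the defaults used above. This is only a rough proxy: actual memory use depends on the model and the server and does not scale exactly linearly. `parse_size` and `relative_workload` are illustrative helper names.

```python
def parse_size(size):
    """Split a "WIDTHxHEIGHT" string into a (width, height) tuple of ints."""
    width, height = size.lower().split("x")
    return int(width), int(height)


def relative_workload(size, num_frames, base_size="832x480", base_frames=81):
    """Ratio of width*height*frames to the baseline settings used in this notebook."""
    width, height = parse_size(size)
    base_width, base_height = parse_size(base_size)
    return (width * height * num_frames) / (base_width * base_height * base_frames)
```

For instance, `relative_workload("480x272", 41)` comes out at roughly 0.17, about one sixth of the pixels generated by the default request.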

**udocker GPU issues:**
```python
!udocker setup --nvidia --force wan22
```