# 🎬 Creative Video Diffusion

With 205GB of free VRAM, you have enough memory to run large video generation models.

**Available Models:**
1. **Stable Video Diffusion** - High-quality image-to-video generation
2. **ModelScope** - Text-to-video generation from prompts
3. **Open-Sora** - Open-source video generation
4. **CogVideoX** - Text-to-video diffusion model

**Note:** Some models need specific setup. Let's try the most compatible ones first!


In [None]:
# Setup: Check for video generation libraries
import importlib

import torch

print("🔍 Checking video generation capabilities...\n")

# Check PyTorch
print(f"✅ PyTorch: {torch.__version__}")
print(f"✅ GPU available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    total_gb = torch.cuda.get_device_properties(0).total_memory / 1e9
    print(f"✅ Total VRAM: {total_gb:.1f} GB\n")

# Check available libraries, keyed by import name.
# The pip package opencv-python is imported as cv2.
video_libs = {
    'diffusers': 'Hugging Face diffusion models',
    'transformers': 'Video transformers',
    'torchvision': 'Video processing',
    'imageio': 'Video I/O',
    'cv2': 'OpenCV video processing (pip install opencv-python)',
}

print("📦 Checking libraries:")
for lib, desc in video_libs.items():
    try:
        mod = importlib.import_module(lib)
        v = getattr(mod, '__version__', '?')
        print(f"   ✅ {lib}: {v}")
    except ImportError:
        print(f"   ❌ {lib}: NOT INSTALLED ({desc})")


In [None]:
# Option 1: Install Video Generation Libraries
# Run this if libraries are missing

# Uncomment to install (imageio-ffmpeg lets imageio write .mp4 files):
# !pip install diffusers transformers accelerate imageio imageio-ffmpeg opencv-python

print("📦 Uncomment the pip line above if any library is missing.")
print("\n📝 Available video models:")
print("   - stabilityai/stable-video-diffusion-img2vid")
print("   - damo-vilab/text-to-video-ms-1.7b (ModelScope text-to-video)")
print("   - THUDM/CogVideoX-2b, THUDM/CogVideoX-5b")
print("   - open-sora (from Open-Sora project)")
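
Rather than installing everything blindly, the check and the install step can be tied together. The sketch below is one possible approach, not part of the original notebook: the `PIP_TO_IMPORT` table and the `missing_packages` helper are hypothetical names, and the mapping reflects the fact that `opencv-python` is imported as `cv2`.

```python
import importlib.util

# pip package name -> import name. Most match; opencv-python is the exception.
PIP_TO_IMPORT = {
    "diffusers": "diffusers",
    "transformers": "transformers",
    "accelerate": "accelerate",
    "imageio": "imageio",
    "opencv-python": "cv2",
}


def missing_packages(pip_to_import):
    """Return the pip names whose import name cannot be found."""
    return [
        pip_name
        for pip_name, import_name in pip_to_import.items()
        if importlib.util.find_spec(import_name) is None
    ]


missing = missing_packages(PIP_TO_IMPORT)
if missing:
    print("Run: pip install " + " ".join(missing))
else:
    print("All libraries are available.")
```

`find_spec` only looks the module up and does not import it, so the check stays fast even for heavy libraries.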


In [None]:
# DEMO 1: Stable Video Diffusion (Image-to-Video)
# Generate video from a single image

try:
    import torch
    from diffusers import StableVideoDiffusionPipeline
    from diffusers.utils import export_to_video, load_image
    from PIL import Image

    print("🎬 Loading Stable Video Diffusion...")

    # Load pipeline
    pipe = StableVideoDiffusionPipeline.from_pretrained(
        "stabilityai/stable-video-diffusion-img2vid",
        torch_dtype=torch.bfloat16,
    )
    pipe = pipe.to("cuda")

    # Load an image (you can provide your own)
    print("📸 Loading image...")
    # image = load_image("https://example.com/your-image.jpg")

    # Or create a simple test image (replace with your image).
    # A flat color only verifies that the pipeline runs; use a real photo
    # for meaningful motion. The pipeline resizes the input to its default
    # 1024x576 output size.
    test_image = Image.new('RGB', (512, 512), color='red')

    # Generate video
    print("🎥 Generating video from image...")
    frames = pipe(
        image=test_image,
        decode_chunk_size=8,  # frames decoded at once; lower values use less memory
        generator=torch.manual_seed(42),
    ).frames[0]

    print(f"✅ Generated {len(frames)} frames!")

    # Save video (frames is a list of PIL images)
    export_to_video(frames, "generated_video.mp4", fps=7)
    print("💾 Saved to generated_video.mp4")

except Exception as e:
    print(f"⚠️ Error: {e}")
    print("💡 Try installing: pip install diffusers imageio")
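
The diffusers pipeline for Stable Video Diffusion defaults to a 1024x576 output, so an input with a different aspect ratio (a square image, for example) gets stretched when it is resized. One way to avoid that is to center-crop to 16:9 first. The helper below is a minimal sketch of the crop arithmetic only; `center_crop_box` is a hypothetical name, not a diffusers function.

```python
def center_crop_box(width, height, target_w=1024, target_h=576):
    """Return a (left, top, right, bottom) box with the target aspect ratio."""
    # Compare aspect ratios by cross-multiplication to stay in integers.
    if width * target_h > height * target_w:
        # Source is wider than the target: trim the sides.
        new_w = height * target_w // target_h
        left = (width - new_w) // 2
        return (left, 0, left + new_w, height)
    # Source is taller than (or equal to) the target: trim top and bottom.
    new_h = width * target_h // target_w
    top = (height - new_h) // 2
    return (0, top, width, top + new_h)


box = center_crop_box(512, 512)
print(box)  # (0, 112, 512, 400)
```

With Pillow, the box plugs in as `image.crop(box).resize((1024, 576))` before the image is passed to the pipeline.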


In [None]:
# DEMO 2: Text-to-Video with CogVideoX
# Generate video from text description

try:
    import torch
    from diffusers import CogVideoXPipeline
    from diffusers.utils import export_to_video

    print("🎬 Loading Video Generation Model...")

    # CogVideoX is a diffusion model and loads through a diffusers pipeline.
    # Published checkpoints include THUDM/CogVideoX-2b and THUDM/CogVideoX-5b.
    model_name = "THUDM/CogVideoX-5b"

    pipe = CogVideoXPipeline.from_pretrained(
        model_name,
        torch_dtype=torch.bfloat16,
    )
    pipe = pipe.to("cuda")

    print("✅ Model loaded!")

    # Generate video from text
    prompt = "A cat walking on the street"
    print(f"📝 Prompt: {prompt}")
    print("🎥 Generating video...")

    frames = pipe(
        prompt=prompt,
        num_inference_steps=50,
        num_frames=49,
        guidance_scale=6.0,
        generator=torch.manual_seed(42),
    ).frames[0]

    export_to_video(frames, "cogvideox_output.mp4", fps=8)
    print("✅ Video generated and saved to cogvideox_output.mp4")

except Exception as e:
    print(f"⚠️ Error: {e}")
    print("💡 This model may not be available or may need a newer diffusers release")
    print("💡 Try Stable Video Diffusion (image-to-video) instead")


In [None]:
# DEMO 3: Video Generation with Open-Sora (If Available)
# Open-source video generation
#
# NOTE: the `OpenSora` class, its arguments, and the model path below are
# placeholders, not a documented API. The Open-Sora repository documents
# inference through its own scripts and config files, so expect this cell
# to fall through to the except branch unless you supply your own wrapper.

try:
    import sys
    sys.path.append('/path/to/open-sora')  # Adjust path if needed

    from opensora import OpenSora  # placeholder import, see note above

    print("🎬 Loading Open-Sora...")

    # Initialize Open-Sora
    opensora = OpenSora(
        model_path="Open-Sora/Open-Sora",
        device="cuda",
    )

    # Generate video
    prompt = "A beautiful sunset over the ocean"
    print(f"📝 Prompt: {prompt}")
    print("🎥 Generating video...")

    video = opensora.generate(
        prompt=prompt,
        num_frames=16,
        height=512,
        width=512,
    )

    print("✅ Video generated!")

except Exception as e:
    print(f"⚠️ Open-Sora not available: {e}")
    print("💡 Install from: https://github.com/hpcaitech/Open-Sora")
    print("💡 Or use Stable Video Diffusion instead")


In [None]:
# DEMO 4: Video Editing/Style Transfer
# Apply styles to video frames

try:
    import torch
    from diffusers import StableDiffusionImg2ImgPipeline
    from PIL import Image

    print("🎨 Loading Style Transfer Model...")

    # If this repo id no longer resolves, try the mirror
    # stable-diffusion-v1-5/stable-diffusion-v1-5 instead.
    pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",
        torch_dtype=torch.bfloat16,
    )
    pipe = pipe.to("cuda")

    # Load or create video frames
    print("📹 Processing video frames...")

    # Example: Process a single frame
    # For full video, loop through frames
    prompt = "anime style, colorful, detailed"
    init_image = Image.new('RGB', (512, 512), color='blue')

    image = pipe(
        prompt=prompt,
        image=init_image,
        strength=0.75,  # higher values move further away from the input frame
    ).images[0]

    print("✅ Frame styled!")
    print("💡 For full video: loop through all frames")

except Exception as e:
    print(f"⚠️ Error: {e}")
    print("💡 Install: pip install diffusers")
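
The "loop through all frames" step can be sketched as follows. This is an illustrative outline, not the notebook's own code: `stylize_frames` and `fake_style` are hypothetical helpers, and the stand-in style function lets the loop run without a GPU. Reusing one seed for every frame is a simple way to reduce flicker, though it does not guarantee temporal consistency.

```python
def stylize_frames(frames, style_frame, seed=42):
    """Apply style_frame(frame, seed) to every frame and collect the results."""
    styled = []
    for index, frame in enumerate(frames):
        styled.append(style_frame(frame, seed))
        print(f"Styled frame {index + 1}/{len(frames)}")
    return styled


# Stand-in for the diffusion call so the sketch runs anywhere.
def fake_style(frame, seed):
    return f"{frame}-styled-{seed}"


result = stylize_frames(["f0", "f1", "f2"], fake_style)
print(result)
```

With the img2img pipeline loaded, `style_frame` could wrap a call such as `pipe(prompt=prompt, image=frame, strength=0.75, generator=torch.manual_seed(seed)).images[0]`.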


## 🎯 Best Video Generation Options for Your Setup

### ✅ Recommended (Most Compatible):

1. **Stable Video Diffusion** (Image-to-Video)
   - Most stable and compatible
   - Works with ROCm
   - High quality output
   - Install: `pip install diffusers imageio`

2. **Video Diffusion Models** (Hugging Face)
   - Various models available
   - Text-to-video options
   - Check compatibility per model

### ⚠️ May Need Setup:

3. **Open-Sora**
   - Requires manual installation
   - May need ROCm-specific builds
   - Very powerful if working

4. **ModelScope**
   - Chinese video models
   - May need VPN/access

### 💡 Quick Start:

**Easiest: Stable Video Diffusion**
```python
import torch
from diffusers import StableVideoDiffusionPipeline

pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid",
    torch_dtype=torch.bfloat16,
).to("cuda")
```

**With your 205GB VRAM, you can:**
- Load the largest video models
- Generate high-resolution videos
- Process multiple videos simultaneously
- Generate long videos (many frames)
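
To put a rough number on these claims, weight memory can be estimated as parameter count times bytes per parameter. The sketch below is back-of-the-envelope arithmetic only: the 5-billion-parameter model is a hypothetical example, and the `overhead` factor of 2 is a guess covering activations and decoding buffers, not a measured value.

```python
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}


def weight_memory_gb(num_params, dtype="bfloat16"):
    """Approximate memory for the model weights alone, in GB (1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


def copies_that_fit(num_params, vram_gb=205, dtype="bfloat16", overhead=2.0):
    """How many model copies fit if each needs `overhead` x its weight size."""
    per_copy = weight_memory_gb(num_params, dtype) * overhead
    return int(vram_gb // per_copy)


print(weight_memory_gb(5e9))  # 10.0
print(copies_that_fit(5e9))   # 10
```

Actual usage depends heavily on resolution and frame count, so treat the result as an upper bound and confirm with `torch.cuda.max_memory_allocated()` on a real run.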

---

## 🚀 Next Steps

1. **Install libraries:** `pip install diffusers imageio opencv-python`
2. **Try Stable Video Diffusion** (most compatible)
3. **Generate from images** you create/upload
4. **Experiment with prompts** for creative videos
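
One way to experiment with prompts systematically is to combine a few subjects with a few style phrases and generate one clip per combination. The sketch below only builds the prompt strings; `prompt_grid` is a hypothetical helper, and the style phrases are arbitrary examples.

```python
from itertools import product


def prompt_grid(subjects, styles):
    """Combine every subject with every style into one prompt string each."""
    return [f"{subject}, {style}" for subject, style in product(subjects, styles)]


prompts = prompt_grid(
    ["A cat walking on the street", "A beautiful sunset over the ocean"],
    ["anime style", "cinematic lighting"],
)
for p in prompts:
    print(p)
```

Keeping the seed fixed while varying only the prompt makes it easier to see what each phrase changes.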

**Your GPU is POWERFUL - perfect for video generation!** 🎬
