# 🎬 YTautoma - YouTube Shorts Automation

Generate 60-second YouTube Shorts using local AI models:
- **Story**: Gemma 3 (via Ollama)
- **Images**: Z-Image-Turbo
- **Video**: Wan 2.2
- **Voice**: VibeVoice

**Works on**: Colab (A100), RunPod, Lambda Labs, etc.

## 1️⃣ Setup

In [None]:
# Clone YTautoma
import os

# Auto-detect workspace
if os.path.exists('/content'):
    WORKSPACE = '/content'
elif os.path.exists('/workspace'):
    WORKSPACE = '/workspace'
else:
    WORKSPACE = os.path.expanduser('~')

os.chdir(WORKSPACE)
print(f'Workspace: {WORKSPACE}')

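# Clone only if the repo is not already present, so the cell can be re-run
if not os.path.isdir('YTautoma'):
    !git clone https://github.com/DragonLord1998/YTautoma.git
os.chdir('YTautoma')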
PROJECT_DIR = os.getcwd()
print(f'Project: {PROJECT_DIR}')

In [None]:
# Install dependencies
!pip install -q -r requirements.txt
!pip install -q git+https://github.com/huggingface/diffusers
!pip install -q flash-attn --no-build-isolation 2>/dev/null || echo 'flash-attn optional'

In [None]:
# Install Ollama
!curl -fsSL https://ollama.com/install.sh | sh

# Start Ollama in background
import subprocess
import time
subprocess.Popen(['ollama', 'serve'], stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
time.sleep(5)

# Pull Gemma 3 (use smaller model for cloud GPUs)
!ollama pull gemma3:4b
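
The fixed `time.sleep(5)` above may be too short on a cold machine. A minimal sketch of a readiness poll, assuming the Ollama server answers plain HTTP GET requests on `http://localhost:11434` (the base URL used in the `.env` cell); the helper name `wait_for_http` is hypothetical and not part of YTautoma or Ollama:

```python
import time
import urllib.request


def wait_for_http(url: str, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Poll `url` until it answers with HTTP 200 or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                if response.status == 200:
                    return True
        except OSError:
            # URLError/HTTPError subclass OSError: server not ready yet
            pass
        time.sleep(interval)
    return False
```

One way to use it would be to call `wait_for_http('http://localhost:11434/api/tags')` after `subprocess.Popen(...)` and only run `ollama pull` once it returns `True`.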

In [None]:
# Clone Wan 2.2
!mkdir -p models
!git clone https://github.com/Wan-Video/Wan2.2.git models/Wan2.2
!pip install -q -r models/Wan2.2/requirements.txt

In [None]:
# Download Wan 2.2 TI2V-5B (smaller model, works on most GPUs)
!pip install -q "huggingface_hub[cli]"
!huggingface-cli download Wan-AI/Wan2.2-TI2V-5B --local-dir models/Wan2.2-TI2V-5B

In [None]:
# Clone VibeVoice
!git clone https://github.com/microsoft/VibeVoice.git models/VibeVoice
!pip install -q -e models/VibeVoice

In [None]:
# Create .env configuration (auto-detect paths)
import os
PROJECT_DIR = os.getcwd()

env_content = f"""OLLAMA_MODEL=gemma3:4b
OLLAMA_BASE_URL=http://localhost:11434

ZIMAGE_MODEL=Tongyi-MAI/Z-Image-Turbo
ZIMAGE_DEVICE=cuda

WAN_REPO_PATH={PROJECT_DIR}/models/Wan2.2
WAN_MODEL_PATH={PROJECT_DIR}/models/Wan2.2-TI2V-5B
WAN_T5_CPU=true
WAN_OFFLOAD_MODEL=true

VIBEVOICE_REPO_PATH={PROJECT_DIR}/models/VibeVoice
VIBEVOICE_MODEL=microsoft/VibeVoice-Realtime-0.5B
VIBEVOICE_SPEAKER=Carter

LOW_VRAM_MODE=true
TORCH_DTYPE=float16
"""

with open('.env', 'w') as f:
    f.write(env_content)

print('✅ Configuration saved!')
print(f'   Project: {PROJECT_DIR}')
print(f'   VibeVoice: {PROJECT_DIR}/models/VibeVoice')
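
Because the `.env` file bakes in absolute paths, it can help to check them before starting a long run. The following is a rough sketch, assuming the file contains only simple `KEY=VALUE` lines like those written above; `parse_env` and `missing_paths` are hypothetical helpers, not part of YTautoma:

```python
import os


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and '#' comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' only, so values such as URLs stay intact
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_paths(config: dict[str, str], suffix: str = "_PATH") -> list[str]:
    """Return keys ending in `suffix` whose value is not an existing path."""
    return [
        key for key, value in config.items()
        if key.endswith(suffix) and not os.path.exists(value)
    ]
```

For example, `missing_paths(parse_env(open('.env').read()))` would list any of `WAN_REPO_PATH`, `WAN_MODEL_PATH`, or `VIBEVOICE_REPO_PATH` that do not exist on disk yet.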

## 2️⃣ Generate YouTube Short

In [None]:
# Generate story only (quick test)
!python main.py --story-only -c mystery

In [None]:
# Generate images only (no video)
!python main.py --images-only -c horror

In [None]:
# Full pipeline
!python main.py -c sci-fi
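
A `!python ...` cell does not raise a Python exception when the command exits with an error, so a "Run all" keeps going after a failed stage. As a sketch of one alternative, the same commands can be launched through `subprocess.run(..., check=True)`. The flags are the ones used in the cells above; the stage labels and the helper names `build_command` and `run_stage` are assumptions made for this example:

```python
import subprocess
import sys

# Flags as used in the cells above; the stage names are labels for this sketch only.
STAGE_FLAGS = {"story": "--story-only", "images": "--images-only", "full": None}


def build_command(category: str, stage: str = "full") -> list[str]:
    """Build the main.py command line for one stage and category."""
    if stage not in STAGE_FLAGS:
        raise ValueError(f"unknown stage: {stage!r}")
    command = [sys.executable, "main.py"]
    flag = STAGE_FLAGS[stage]
    if flag is not None:
        command.append(flag)
    command += ["-c", category]
    return command


def run_stage(category: str, stage: str = "full") -> None:
    """Run one stage; raises subprocess.CalledProcessError on a non-zero exit."""
    subprocess.run(build_command(category, stage), check=True)
```

Calling `run_stage('mystery', 'story')` from the project directory would then stop the notebook at that cell if `main.py` fails.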

## 3️⃣ Download Output

In [None]:
# List generated files
!ls -la output/
!find output -name '*.mp4' -o -name '*.png' | head -20

In [None]:
# Download (Colab only)
import os
import glob

try:
    from google.colab import files
    videos = glob.glob('output/**/*.mp4', recursive=True)
    if videos:
        latest = max(videos, key=os.path.getmtime)  # most recently modified video
        print(f'Downloading: {latest}')
        files.download(latest)
    else:
        print('No video found. Run the pipeline first!')
except ImportError:
    print('Not in Colab. Find your video at:')
    !find output -name '*.mp4'