# MuseTalk on Google Colab

This notebook sets up and runs MuseTalk on Google Colab with GPU acceleration.

**Requirements:**
- Google account
- GPU runtime (free tier works)

**Setup time:** ~10-15 minutes

---

## Step 1: Enable GPU Runtime

**IMPORTANT:** Before running any cells:
1. Click **Runtime** → **Change runtime type**
2. Select **T4 GPU** (or any available GPU)
3. Click **Save**

Then verify that a GPU is available:

In [None]:
# Verify GPU is available
!nvidia-smi

import torch
print(f"\nPyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
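If you would rather check for a GPU without importing PyTorch, the sketch below queries `nvidia-smi` directly. The helper name `gpu_summary` is hypothetical; it assumes `nvidia-smi` is on the `PATH` whenever a GPU runtime is attached and returns an empty list otherwise.

```python
import shutil
import subprocess


def gpu_summary():
    """Return one 'name, memory' line per visible GPU, or [] if none is found."""
    if shutil.which("nvidia-smi") is None:
        return []  # driver tools are absent, so no GPU runtime is attached
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return []  # nvidia-smi exists but could not talk to a GPU
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


gpus = gpu_summary()
if gpus:
    print("GPUs:", "; ".join(gpus))
else:
    print("No GPU detected. Set the runtime type to GPU before continuing.")
```

An empty result this early saves the time of installing dependencies on a CPU-only runtime.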

## Step 2: Clone Repository

In [None]:
# Clone MuseTalk repository
%cd /content
!git clone https://github.com/TMElyralab/MuseTalk.git
%cd MuseTalk

## Step 3: Install Dependencies

In [None]:
# Install the pinned PyTorch build (Colab preinstalls PyTorch, but usually a different version)
!pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118

In [None]:
# Install Python dependencies
!pip install -r requirements.txt

In [None]:
# Install MMLab packages
!pip install --no-cache-dir -U openmim
!mim install mmengine
!mim install "mmcv==2.0.1"
!mim install "mmdet==3.1.0"
!mim install "mmpose==1.1.0"
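After the install cells finish, it can help to confirm that the pinned versions are the ones actually importable, since a later `pip install` may silently upgrade a package. The following is a minimal sketch using only the standard library; `check_versions` is a hypothetical helper, and the pins are copied from the install cells above.

```python
from importlib import metadata

# Pins copied from the install cells above.
EXPECTED = {
    "torch": "2.0.1",
    "torchvision": "0.15.2",
    "torchaudio": "2.0.2",
    "mmcv": "2.0.1",
    "mmdet": "3.1.0",
    "mmpose": "1.1.0",
}


def check_versions(expected):
    """Map each package to (installed_version_or_None, matches_pin)."""
    report = {}
    for name, pinned in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = (None, False)
            continue
        # Wheels from the PyTorch index carry a local tag such as "+cu118".
        report[name] = (installed, installed.split("+")[0] == pinned)
    return report


for name, (installed, ok) in check_versions(EXPECTED).items():
    mark = "✅" if ok else "❌"
    print(f"{mark} {name}: installed={installed}, expected={EXPECTED[name]}")
```

A mismatch here does not always break inference, but it is a good first thing to rule out when an import error appears later.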

## Step 4: Download Model Weights

This downloads ~8.5GB of model weights and takes 5-10 minutes.

In [None]:
# Download all model weights
!bash download_weights.sh
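If the download stops partway, free disk space is one thing worth ruling out. This sketch compares the free space on the current filesystem with the ~8.5GB quoted above; the helper names and the 2GB headroom are assumptions chosen for illustration.

```python
import shutil


def free_gb(path="."):
    """Free space on the filesystem containing `path`, in GB (10**9 bytes)."""
    return shutil.disk_usage(path).free / 1e9


def has_room(required_gb, path="."):
    """True if at least `required_gb` GB are free on the filesystem of `path`."""
    return free_gb(path) >= required_gb


# 8.5 GB comes from the text above; the 2 GB headroom is an arbitrary margin.
needed = 8.5 + 2
print(f"Free: {free_gb():.1f} GB, wanted: {needed:.1f} GB, ok: {has_room(needed)}")
```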

## Step 5: Verify Setup

In [None]:
# Verify all model files exist
import os

required_files = [
    'models/musetalk/musetalk.json',
    'models/musetalk/pytorch_model.bin',
    'models/musetalkV15/musetalk.json',
    'models/musetalkV15/unet.pth',
    'models/sd-vae/config.json',
    'models/sd-vae/diffusion_pytorch_model.bin',
    'models/whisper/config.json',
    'models/whisper/pytorch_model.bin',
    'models/dwpose/dw-ll_ucoco_384.pth',
    'models/syncnet/latentsync_syncnet.pt',
    'models/face-parse-bisent/79999_iter.pth',
    'models/face-parse-bisent/resnet18-5c106cde.pth'
]

all_present = True
for f in required_files:
    exists = os.path.exists(f)
    if not exists:
        all_present = False
    print(f"{'✅' if exists else '❌'} {f}")

print(f"\n{'✅ All models downloaded!' if all_present else '❌ Some models missing'}")
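The existence check above passes even when a download was interrupted and left an empty or truncated file. As a rough complement, the sketch below flags weight files that are missing or implausibly small. `find_suspect_files` is a hypothetical helper, and the 1 MB threshold is an assumption, not a documented size.

```python
import os


def find_suspect_files(paths, min_bytes=1024):
    """Return (path, size) pairs for files that are missing or smaller than min_bytes.

    A missing file is reported with size None.
    """
    suspects = []
    for path in paths:
        if not os.path.isfile(path):
            suspects.append((path, None))
            continue
        size = os.path.getsize(path)
        if size < min_bytes:
            suspects.append((path, size))
    return suspects


# Weight files only; the small JSON configs are left out on purpose.
weights = [
    "models/musetalk/pytorch_model.bin",
    "models/musetalkV15/unet.pth",
    "models/sd-vae/diffusion_pytorch_model.bin",
    "models/whisper/pytorch_model.bin",
]
for path, size in find_suspect_files(weights, min_bytes=1_000_000):
    reason = "missing" if size is None else f"only {size} bytes"
    print(f"Check {path}: {reason}")
```

No output means every listed file exists and is larger than the threshold.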

## Option A: Launch Gradio Interface

This creates a web interface you can use to upload videos and audio files.

In [None]:
# Launch Gradio app (will create a public URL)
!python app.py --use_float16

## Option B: Run Command-Line Inference

### Upload Your Files First

In [None]:
# Upload your video and audio files
from google.colab import files

print("Upload your video file:")
uploaded_video = files.upload()

print("\nUpload your audio file:")
uploaded_audio = files.upload()

# Get filenames
video_file = list(uploaded_video.keys())[0]
audio_file = list(uploaded_audio.keys())[0]

print(f"\nVideo: {video_file}")
print(f"Audio: {audio_file}")

### Run Inference

In [None]:
# Run inference with uploaded files
!python -m scripts.inference \
  --video_path "{video_file}" \
  --audio_path "{audio_file}" \
  --result_dir results/custom \
  --unet_model_path models/musetalkV15/unet.pth \
  --unet_config models/musetalkV15/musetalk.json \
  --version v15
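If the inference script in your checkout does not accept `--video_path` and `--audio_path`, one alternative is to write a small config file and pass it with `--inference_config`, as Option C does. The sketch below assumes the config layout mirrors `configs/inference/test.yaml` (a `task_0` entry with `video_path` and `audio_path` keys); verify this against that file before relying on it. The function name and the output filename `custom.yaml` are hypothetical.

```python
import json
from pathlib import Path


def write_inference_config(video_path, audio_path,
                           out_path="configs/inference/custom.yaml"):
    """Write a one-task config file and return its path.

    Assumed layout (check configs/inference/test.yaml in your checkout):
        task_0:
          video_path: "..."
          audio_path: "..."
    """
    out = Path(out_path)
    out.parent.mkdir(parents=True, exist_ok=True)
    # json.dumps yields a double-quoted, escaped string, which YAML also accepts.
    text = (
        "task_0:\n"
        f"  video_path: {json.dumps(str(video_path))}\n"
        f"  audio_path: {json.dumps(str(audio_path))}\n"
    )
    out.write_text(text, encoding="utf-8")
    return out
```

In the notebook, call `write_inference_config(video_file, audio_file)` after the upload cell, then run the Option C command with `--inference_config configs/inference/custom.yaml` and `--result_dir results/custom`.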

## Option C: Test with Built-in Samples

Use the included sample videos and audio:

In [None]:
# Run inference with built-in test data
!python -m scripts.inference \
  --inference_config configs/inference/test.yaml \
  --result_dir results/test \
  --unet_model_path models/musetalkV15/unet.pth \
  --unet_config models/musetalkV15/musetalk.json \
  --version v15

## Step 6: Download Results

In [None]:
# List output files
!ls -lh results/

In [None]:
# Download result video
from google.colab import files
import os

# Find the output video (adjust path as needed)
result_dir = 'results/test'  # or 'results/custom'
for file in os.listdir(result_dir):
    if file.endswith('.mp4'):
        result_path = os.path.join(result_dir, file)
        print(f"Downloading: {result_path}")
        files.download(result_path)
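The loop above only looks at the top level of `result_dir`. In case the script writes its output into a subfolder (an assumption worth checking against the `ls` output), this sketch searches recursively and lists the newest video first. `find_videos` is a hypothetical helper.

```python
from pathlib import Path


def find_videos(root):
    """Return every .mp4 under root (any depth), newest first."""
    root = Path(root)
    if not root.is_dir():
        return []
    videos = [p for p in root.rglob("*.mp4") if p.is_file()]
    return sorted(videos, key=lambda p: p.stat().st_mtime, reverse=True)


for video in find_videos("results"):
    print(video)
```

Pass the first entry to `files.download(str(path))` to fetch the most recent result.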

## Optional: Display Result in Notebook

In [None]:
# Display the result video in the notebook
from IPython.display import Video
import os

result_dir = 'results/test'  # or 'results/custom'
for file in os.listdir(result_dir):
    if file.endswith('.mp4'):
        result_path = os.path.join(result_dir, file)
        display(Video(result_path, width=640))
        break

---

## Troubleshooting

### Out of Memory Error
```python
# Use float16 precision
!python app.py --use_float16
# Or reduce batch size in configs
```

### GPU Not Available
1. Go to **Runtime** → **Change runtime type**
2. Select **GPU**
3. Click **Save**
4. Changing the runtime type starts a fresh session, so run all cells again from the top

### Models Not Downloading
```python
# Try manual download
!wget https://huggingface.co/TMElyralab/MuseTalk/resolve/main/musetalkV15/unet.pth -P models/musetalkV15/
```
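The same manual download can be done from Python. The sketch below rebuilds the URL pattern of the `wget` command above; it assumes other files in the same Hugging Face repository follow that pattern and that the local path is `models/` plus the path inside the repository. `resolve_url` and `download` are hypothetical helpers.

```python
import os
import urllib.request

REPO_ID = "TMElyralab/MuseTalk"  # taken from the wget URL above


def resolve_url(repo_path, repo_id=REPO_ID, revision="main"):
    """Build a direct-download URL for one file in a Hugging Face model repo."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{repo_path}"


def download(repo_path, models_dir="models"):
    """Download one file to models_dir/repo_path, skipping files already present."""
    target = os.path.join(models_dir, repo_path)
    if os.path.exists(target):
        return target
    os.makedirs(os.path.dirname(target), exist_ok=True)
    urllib.request.urlretrieve(resolve_url(repo_path), target)
    return target


print(resolve_url("musetalkV15/unet.pth"))
```

Call `download("musetalkV15/unet.pth")` to fetch the file. Because existing files are skipped, delete a truncated file first so that it is fetched again.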

---

## Performance Tips

1. **Use a T4 GPU** (free tier), or upgrade to an A100 for faster inference
2. **Use float16** (`--use_float16`) to reduce memory usage
3. **Keep videos short** (< 30 seconds) for faster processing
4. **Use 25fps videos** for best results (matches training data)
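To see whether an input video is already at 25fps, you can read its frame rate with `ffprobe`. This is a minimal sketch that assumes `ffprobe` is installed (it ships with FFmpeg) and returns `None` when it is not; both function names are hypothetical.

```python
import shutil
import subprocess
from fractions import Fraction


def parse_frame_rate(text):
    """Convert ffprobe's 'num/den' frame-rate string (e.g. '30000/1001') to float."""
    return float(Fraction(text.strip()))


def video_fps(path):
    """Return the first video stream's frame rate, or None if it cannot be read."""
    if shutil.which("ffprobe") is None:
        return None
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "v:0",
         "-show_entries", "stream=r_frame_rate",
         "-of", "default=noprint_wrappers=1:nokey=1", path],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0 or not result.stdout.strip():
        return None
    return parse_frame_rate(result.stdout.splitlines()[0])
```

For example, `video_fps(video_file)` on an uploaded clip returns a value such as `25.0`, or `None` if the file cannot be probed.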

---

## Expected Performance

- **T4 GPU (Free):** ~10-15fps
- **V100 GPU:** ~20-30fps
- **A100 GPU:** ~30fps+

Processing a 10-second video typically takes 1-3 minutes on a T4 GPU.
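The throughput figures above can be turned into a rough estimate of generation time. The sketch below does the arithmetic for a 10-second clip at 25fps; `generation_seconds` is a hypothetical helper, and it covers frame generation only.

```python
def generation_seconds(duration_s, video_fps, inference_fps):
    """Time to generate every frame of a clip at a given inference throughput."""
    frames = duration_s * video_fps
    return frames / inference_fps


# 10-second clip at 25 fps, using the T4 range quoted above (10-15 fps).
slow = generation_seconds(10, 25, 10)
fast = generation_seconds(10, 25, 15)
print(f"{fast:.0f}-{slow:.0f} s of pure generation")
```

This works out to roughly 17-25 seconds for 250 frames. The remainder of the 1-3 minute total is presumably spent on steps outside the generation loop, such as model loading, preprocessing, and video encoding; that split is an assumption, not a measurement.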