# Supernan AI Intern Challenge - Video Dubbing Pipeline

This notebook runs the `dub_video.py` pipeline end to end on a free **Google Colab T4 GPU**.

The pipeline will:
1. **Ingest**: Download the default Supernan training video and extract a 15-second segment.
2. **Transcribe**: Use `faster-whisper` (Base) to extract the text.
3. **Translate**: Use a `Helsinki-NLP` model to translate the English transcript into Hindi.
4. **Voice Clone**: Use `Coqui XTTS v2` to synthesize Hindi speech that keeps the original speaker's tone.
5. **Lip-Sync**: Use `Wav2Lip` to sync the speaker's lip movements in the video to the new Hindi audio.
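
As an illustration of the **Ingest** step, the sketch below builds an `ffmpeg` command that cuts a segment between two timestamps. It is not taken from `dub_video.py`, whose internals are not shown here; the function names and the `segment.mp4` output name are hypothetical, and the re-encode settings are one reasonable choice among several.

```python
import shlex
import subprocess


def build_trim_command(src, dst, start, end):
    """Return an ffmpeg argument list that cuts seconds [start, end) from src.

    Hypothetical helper for illustration; dub_video.py may do this differently.
    """
    if end <= start:
        raise ValueError("end must be greater than start")
    return [
        "ffmpeg", "-y",             # overwrite the output file if it exists
        "-ss", str(start),          # seek to the start time in the input
        "-i", src,
        "-t", str(end - start),     # keep this many seconds of output
        "-c:v", "libx264",          # re-encode video so the cut is frame-accurate
        "-c:a", "aac",
        dst,
    ]


def trim(src, dst, start, end):
    """Run the trim command; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_trim_command(src, dst, start, end), check=True)


# Print the command for the same 15-second window used later in this notebook.
cmd = build_trim_command(
    "/content/source_video.mp4", "/content/workdir/segment.mp4", 15, 30
)
print(shlex.join(cmd))
```

Calling `trim(...)` with the same arguments would run the command, provided `ffmpeg` is installed and the source file exists.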

In [None]:
# 1. Set up a clean Python 3.10 environment with all dependencies in one shot
!apt-get install python3.10 python3.10-distutils -y
!wget -q https://bootstrap.pypa.io/get-pip.py
!python3.10 get-pip.py
!python3.10 -m pip install ffmpeg-python typing-extensions deep-translator TTS gdown librosa pydub transformers torch
!python3.10 -m pip install git+https://github.com/openai/whisper.git
!apt-get install -y ffmpeg


In [None]:
# 2. Clone and set up Wav2Lip for high-fidelity lip syncing
!git clone https://github.com/Rudrabha/Wav2Lip.git
!cd Wav2Lip && python3.10 -m pip install -r requirements.txt
# Install a newer OpenCV afterwards so it overrides Wav2Lip's strict old OpenCV pin
!python3.10 -m pip install opencv-python==4.8.0.76 opencv-contrib-python==4.8.0.76

# Download the public Wav2Lip GAN checkpoint (free to fetch with wget, so no budget is needed)
!mkdir -p Wav2Lip/checkpoints
!wget -O Wav2Lip/checkpoints/wav2lip_gan.pth "https://huggingface.co/camenduru/Wav2Lip/resolve/main/checkpoints/wav2lip_gan.pth"

In [None]:
# 3. Run the Supernan Pipeline!
# We set the Wav2Lip root so the script knows where to find the inference tool.
import os
os.environ["WAV2LIP_ROOT"] = os.path.abspath("Wav2Lip")  # Step 2 cloned Wav2Lip into the current directory; adjust if it lives elsewhere

# Download the source video directly with wget to avoid Google Drive download limits
!wget -O /content/source_video.mp4 "https://huggingface.co/datasets/vanshagarwal/supernan/resolve/main/video.mp4" || echo "Download failed"
# Note: If the download fails, upload your video to Colab manually and pass its path to --input_video below:
!python3.10 dub_video.py --input_video /content/source_video.mp4 --work_dir /content/workdir --start 15 --end 30

### Output
Once the pipeline completes, your final video will be at `/content/workdir/dubbed_output.mp4`. You can download it directly from the Colab file explorer on the left.
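
If you prefer to trigger the download from a cell, the sketch below checks that the output exists and then hands it to Colab's `files.download`. The `find_output` helper is a hypothetical name introduced here. The default path is the one stated above, and the `google.colab` import only succeeds inside a Colab runtime.

```python
import os


def find_output(work_dir="/content/workdir", name="dubbed_output.mp4"):
    """Return the output path if the file exists and is non-empty, else None.

    Hypothetical helper for illustration; not part of dub_video.py.
    """
    path = os.path.join(work_dir, name)
    if os.path.isfile(path) and os.path.getsize(path) > 0:
        return path
    return None


path = find_output()
if path is None:
    print("Output not found; check the pipeline logs above.")
else:
    try:
        from google.colab import files  # only available inside Colab
        files.download(path)
    except ImportError:
        print(f"Not running in Colab; output is at {path}")
```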