# VoiceDub - YouTube Dubbing Backend (GPU)

This notebook runs the dubbing backend on Google Colab's free T4 GPU.

**Setup:**
1. Go to **Runtime > Change runtime type > T4 GPU**
2. Run all cells below
3. Copy the public `ngrok` URL printed by step 5 and set it as the backend URL in your frontend

**Features:** Chatterbox TTS (human-like voice) + Whisper (GPU transcription) + Gemini translation

In [None]:
#@title 1. Check GPU
!nvidia-smi
import torch
print(f"\nCUDA available: {torch.cuda.is_available()}")
print(f"GPU: {torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'None'}")

In [None]:
#@title 2. Clone repo & install dependencies
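# Clone only on the first run so this cell can be re-executed without errors
![ -d /content/app/.git ] || git clone https://github.com/sasmalgiri/youtube-dubbing.git /content/app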
%cd /content/app/backend

# Install backend requirements
!pip install -q -r requirements.txt

# Install ngrok for public URL tunnel
!pip install -q pyngrok

In [None]:
#@title 3. Set API Keys
#@markdown Enter your API keys below:

GEMINI_API_KEY = "" #@param {type:"string"}
ELEVENLABS_API_KEY = "" #@param {type:"string"}
OPENAI_API_KEY = "" #@param {type:"string"}
NGROK_AUTH_TOKEN = "" #@param {type:"string"}

#@markdown ---
#@markdown **Required:** `GEMINI_API_KEY` (free at https://aistudio.google.com/apikey)
#@markdown
#@markdown **Required:** `NGROK_AUTH_TOKEN` (free at https://dashboard.ngrok.com/get-started/your-authtoken)
#@markdown
#@markdown **Optional:** `ELEVENLABS_API_KEY`, `OPENAI_API_KEY`

import os

# Export only the keys that were provided, so an empty field never sets a blank value
provided_keys = {
    'GEMINI_API_KEY': GEMINI_API_KEY,
    'ELEVENLABS_API_KEY': ELEVENLABS_API_KEY,
    'OPENAI_API_KEY': OPENAI_API_KEY,
}
provided_keys = {name: value.strip() for name, value in provided_keys.items() if value.strip()}
os.environ.update(provided_keys)

# Write the same keys to the backend's .env file
with open('/content/app/backend/.env', 'w') as f:
    for name, value in provided_keys.items():
        f.write(f'{name}={value}\n')

print("API key status:")
print(f"  Gemini: {'configured' if GEMINI_API_KEY else 'MISSING'}")
print(f"  ElevenLabs: {'configured' if ELEVENLABS_API_KEY else 'not set (optional)'}")
print(f"  OpenAI: {'configured' if OPENAI_API_KEY else 'not set (optional)'}")
print(f"  ngrok: {'configured' if NGROK_AUTH_TOKEN else 'MISSING'}")
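The status lines above report missing keys but do not stop the run, so a missing required key only surfaces later. A minimal sketch of a fail-fast check, assuming the two keys marked **Required** above are the only mandatory ones; the helper names `missing_required` and `mask` are hypothetical and not part of the repository:

```python
def missing_required(keys, required=("GEMINI_API_KEY", "NGROK_AUTH_TOKEN")):
    """Return the names of required keys that are empty or whitespace-only."""
    return [name for name in required if not (keys.get(name) or "").strip()]


def mask(secret, visible=4):
    """Hide all but the last few characters of a secret for log output."""
    secret = secret or ""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]


# Placeholder values for illustration; in the notebook these come from the form fields
keys = {"GEMINI_API_KEY": "example-gemini-key", "NGROK_AUTH_TOKEN": ""}

missing = missing_required(keys)
if missing:
    print("Missing required keys:", ", ".join(missing))
print("Gemini key:", mask(keys["GEMINI_API_KEY"]))
```

In the notebook, raising a `ValueError` when `missing` is non-empty would stop "Run all" at this cell rather than at the tunnel step.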

In [None]:
#@title 4. Pre-download Whisper model (faster first job)
from faster_whisper import WhisperModel
print("Downloading Whisper 'small' model...")
model = WhisperModel("small", device="cuda", compute_type="float16")
del model
print("Whisper model cached!")
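The cell above hard-codes `device="cuda"`, so it fails if the runtime was not switched to a GPU. A small sketch of choosing the constructor arguments from the hardware instead; the function name `whisper_settings` is hypothetical, and the CPU fallback of `int8` is an assumption made here, not a setting taken from the backend:

```python
def whisper_settings(cuda_available):
    """Pick faster-whisper constructor arguments for the available hardware."""
    if cuda_available:
        # Same settings as the pre-download cell
        return {"device": "cuda", "compute_type": "float16"}
    # Assumed fallback: int8 on CPU keeps memory use low at the cost of speed
    return {"device": "cpu", "compute_type": "int8"}


print(whisper_settings(True))
print(whisper_settings(False))
```

It could be used as `WhisperModel("small", **whisper_settings(torch.cuda.is_available()))`. Whether the backend itself falls back to CPU is not stated in this notebook.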

In [None]:
#@title 5. Start Backend Server + ngrok Tunnel
#@markdown This cell starts the FastAPI backend and creates a public URL.
#@markdown
#@markdown **Copy the ngrok URL** and use it as your backend URL.

import subprocess
import time
from pyngrok import ngrok, conf

# Set ngrok auth token
if NGROK_AUTH_TOKEN:
    conf.get_default().auth_token = NGROK_AUTH_TOKEN
else:
    print("ERROR: ngrok auth token is required!")
    print("Get one free at: https://dashboard.ngrok.com/get-started/your-authtoken")
    raise ValueError("Missing NGROK_AUTH_TOKEN")

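# Start uvicorn in the background; its output is read by the log-monitoring cell
proc = subprocess.Popen(
    ['python', '-m', 'uvicorn', 'app:app', '--host', '0.0.0.0', '--port', '8000'],
    cwd='/content/app/backend',
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
)
time.sleep(3)  # give uvicorn a moment to bind the port

# Fail early if the server already exited (for example, on an import error)
if proc.poll() is not None:
    raise RuntimeError(proc.stdout.read().decode('utf-8', errors='replace'))

# Create ngrok tunnel; connect() returns an NgrokTunnel object, so keep only its URL string
public_url = ngrok.connect(8000, "http").public_url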

print("=" * 60)
print("Backend running on GPU!")
print()
print(f"PUBLIC URL: {public_url}")
print()
print("To use with your local frontend:")
print("  1. Open web/src/lib/api.ts")
print(f"  2. Change API_BASE to: {public_url}/api")
print("  3. Or set in next.config.mjs rewrites")
print()
print(f"Test: {public_url}/api/health")
print("=" * 60)
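The fixed `time.sleep(3)` above may be too short if the backend takes longer to start. A minimal sketch of polling instead, assuming the `/api/health` route printed above answers with HTTP 200 once the server is ready; the helper name `wait_for_server` and its default timings are illustrative:

```python
import time
import urllib.error
import urllib.request


def wait_for_server(url, timeout=30.0, interval=0.5):
    """Poll `url` until it answers with HTTP 200 or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=2) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # not accepting connections yet, or a non-2xx response
        time.sleep(interval)
    return False
```

In this notebook, `wait_for_server("http://127.0.0.1:8000/api/health")` could run before `ngrok.connect`, with an error raised if it returns `False`.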

In [None]:
#@title 6. Monitor Server Logs (run this to see live output)
#@markdown Keep this cell running to see backend logs in real-time.

import time

print("Monitoring server... (this cell keeps running)")
print("Submit a dubbing job from your frontend to see progress here.")
print("-" * 60)

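try:
    # readline() blocks until a line arrives and returns b'' once the server closes its output,
    # so every remaining line is printed before the loop ends
    for line in iter(proc.stdout.readline, b''):
        print(line.decode('utf-8', errors='replace').rstrip())
    print(f"Server exited with code {proc.wait()}")
except KeyboardInterrupt:
    print("\nStopped monitoring (server still running)")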
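Because the server is started with `stdout=subprocess.PIPE`, its output collects in an OS pipe buffer whenever this cell is not running, and a full buffer can block the server's writes. One way around this, sketched here under the assumption that keeping only the most recent lines is acceptable, is to drain the pipe on a background thread. The helper name `start_log_drain` is hypothetical, and a stand-in child process replaces uvicorn so the example runs on its own:

```python
import subprocess
import sys
import threading
from collections import deque


def start_log_drain(proc, max_lines=1000):
    """Read proc.stdout on a daemon thread, keeping the most recent lines."""
    lines = deque(maxlen=max_lines)

    def drain():
        # readline() returns b"" at end of stream, which ends the loop
        for raw in iter(proc.stdout.readline, b""):
            lines.append(raw.decode("utf-8", errors="replace").rstrip())

    thread = threading.Thread(target=drain, daemon=True)
    thread.start()
    return lines, thread


# Stand-in child process for illustration; the notebook would pass its uvicorn `proc`
demo = subprocess.Popen(
    [sys.executable, "-c", "print('hello'); print('world')"],
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
)
recent, reader = start_log_drain(demo)
demo.wait()
reader.join(timeout=5)
print(list(recent))
```

With this approach a monitoring cell would print from `recent` rather than reading `proc.stdout` directly, since two readers on the same pipe would split the lines between them.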