# pyVideoTrans Speech-to-Speech on Google Colab

This notebook installs the pyVideoTrans toolkit, prepares runtime dependencies, and
shows how to translate a video into a target language while cloning the original
timbre through **index-tts2**.

## 1. Clone the repository (run once per session)
The clone is skipped automatically when `/content/pyvideotrans` already exists, so the cell is safe to re-run.

In [None]:
import os, sys, pathlib
ROOT = pathlib.Path('/content/pyvideotrans')
if not ROOT.exists():
    !git clone https://github.com/jianchang512/pyvideotrans.git {ROOT}
os.chdir(ROOT)
sys.path.insert(0, str(ROOT))
print('Working directory:', os.getcwd())
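
A quick sanity check after cloning can catch an incomplete checkout early. The sketch below is only an illustration: `missing_entries` is a hypothetical helper, and `colab_s2s.py` is an assumed file name for the `colab_s2s` module imported in step 5 (it could equally be a package directory).

```python
import pathlib

def missing_entries(root, names):
    """Return the entries from `names` that do not exist under `root`."""
    root = pathlib.Path(root)
    return [name for name in names if not (root / name).exists()]

# `requirements-colab.txt` is used in step 2. `colab_s2s.py` is an assumption
# about where the `colab_s2s` module lives; adjust it to match the checkout.
EXPECTED = ['requirements-colab.txt', 'colab_s2s.py']

missing = missing_entries('/content/pyvideotrans', EXPECTED)
if missing:
    print('Missing from checkout:', ', '.join(missing))
else:
    print('Checkout looks complete.')
```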

## 2. Install runtime dependencies
This installs the lightweight Colab dependency set (CUDA 12.1 wheels are used
for Faster-Whisper).

In [None]:
!pip install -U pip
!pip install -r requirements-colab.txt
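
To confirm that the install succeeded before moving on, you can check whether the key modules are importable. This is a minimal sketch: `find_missing_modules` is a hypothetical helper, and the module list is an assumption rather than the repository's actual dependency list (`faster_whisper` is the import name of the Faster-Whisper package).

```python
import importlib.util

def find_missing_modules(module_names):
    """Return the top-level module names that cannot be found on the import path."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]

# Assumed list: extend it with whatever your pipeline actually imports.
# Use top-level names only; dotted names make find_spec import the parent.
missing = find_missing_modules(['faster_whisper'])
print('Missing modules:', missing or 'none')
```

If anything is reported as missing, re-running the install cell or restarting the runtime is usually the first thing to try.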

## 3. (Optional) Mount Google Drive
Uncomment the lines in the cell below if you prefer to read and write media files on Drive.

In [None]:
# from google.colab import drive
# drive.mount('/content/drive')
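
If later cells should work with or without Drive, one option is to pick the media folder at runtime. The sketch below assumes Drive is mounted at `/content/drive` (so files appear under `/content/drive/MyDrive`) and falls back to the local `/content/media` folder otherwise; `pick_media_root` is a hypothetical helper name.

```python
import pathlib

def pick_media_root(drive_root='/content/drive/MyDrive', fallback='/content/media'):
    """Return the Drive folder if it is mounted, otherwise the local fallback."""
    drive = pathlib.Path(drive_root)
    return drive if drive.is_dir() else pathlib.Path(fallback)

media_root = pick_media_root()
print('Media root:', media_root)
```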

## 4. Provide an input clip
You can upload a file through the Colab UI or download one programmatically.
The example below downloads a short sample from GitHub.

In [None]:
import pathlib, urllib.request, urllib.error
sample_dir = pathlib.Path('/content/media')
sample_dir.mkdir(parents=True, exist_ok=True)
sample_path = sample_dir / 'sample.mp4'
if not sample_path.exists():
    url = 'https://github.com/jianchang512/pyvideotrans/releases/download/v0.0/sample.mp4'
    try:
        urllib.request.urlretrieve(url, sample_path)
    except urllib.error.URLError:
        print('Download failed, please upload your own clip instead.')
sample_path
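
Because a failed download only prints a message, `sample_path` may point at a file that does not exist when the next step runs. A small check like the one below can surface that early. It is a sketch under assumptions: `clip_problem` is a hypothetical helper, the 1 KiB threshold is arbitrary, and the check looks only at file size, not at whether the file is valid video.

```python
import pathlib

def clip_problem(path, min_bytes=1024):
    """Return a description of what is wrong with the clip, or None if it looks usable."""
    path = pathlib.Path(path)
    if not path.is_file():
        return f'{path} not found; upload a clip or retry the download.'
    size = path.stat().st_size
    if size < min_bytes:
        return f'{path} is only {size} bytes; the download may have failed.'
    return None
```

Calling `clip_problem(sample_path)` before step 5 and stopping when it returns a message avoids launching the pipeline on a missing input.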

## 5. Launch speech-to-speech translation
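Make sure your `index-tts2` web UI is reachable from Colab (for example via an
ngrok or cloudflared tunnel). Set the URL below along with your desired target
language. Note that `http://127.0.0.1:7860` refers to the Colab VM itself, so it
only works if `index-tts2` runs inside the same runtime.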
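
Before launching, it can help to confirm that something answers at the configured URL. The sketch below uses only the standard library; `endpoint_reachable` is a hypothetical helper, and it checks plain HTTP reachability, not that the server is actually `index-tts2` or that its API behaves as expected.

```python
import http.client
import urllib.error
import urllib.request

def endpoint_reachable(url, timeout=5.0):
    """Return True if an HTTP server answers at `url`, whatever the status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server replied with an error status, so it is reachable.
        return True
    except (urllib.error.URLError, OSError, ValueError, http.client.HTTPException):
        # Connection refused, DNS failure, timeout, malformed URL, or a bad reply.
        return False

INDEX_TTS_URL = 'http://127.0.0.1:7860'  # replace with your public endpoint
if not endpoint_reachable(INDEX_TTS_URL):
    print('Nothing answered at', INDEX_TTS_URL, '- check the tunnel before continuing.')
```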

In [None]:
from colab_s2s import run_speech_to_speech
result = run_speech_to_speech(
    input_path=sample_path,
    target_language='en',
    index_tts_url='http://127.0.0.1:7860',  # replace with your public endpoint
    whisper_model='large-v3',
    translate_backend='google',
    recognition_backend='faster-whisper',
    separate_vocals=True,
    voice_role='clone'
)
result

The `result` dictionary reports where the translated media and subtitles were saved.
You can download them through the Colab file browser or move them into
Google Drive.
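
Moving the outputs into Drive can be scripted. The key names in the returned dictionary are not documented here, so the sketch below assumes nothing about them: it copies every value that happens to be a path to an existing file. `copy_outputs` is a hypothetical helper, and the destination folder is only an example.

```python
import pathlib
import shutil

def copy_outputs(result, dest_dir):
    """Copy every existing file referenced by a value in `result` into `dest_dir`."""
    dest = pathlib.Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for value in result.values():
        if not isinstance(value, (str, pathlib.Path)):
            continue  # skip non-path values such as numbers or nested data
        src = pathlib.Path(value)
        if src.is_file():
            # copy2 returns the path of the newly created file
            copied.append(pathlib.Path(shutil.copy2(src, dest)))
    return copied
```

With Drive mounted, a call such as `copy_outputs(result, '/content/drive/MyDrive/pyvideotrans_out')` returns the list of files that were copied.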