Automates the first stage of a dubbing pipeline:
- Download a YouTube video (`yt-dlp`)
- Extract its audio track as a high-quality MP3
- Obtain subtitles:
  - Fetch existing captions via `youtube-transcript-api`
  - …or generate accurate VTT subtitles locally with OpenAI Whisper
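As an illustration of the caption step, entries from a transcript fetch (start time, duration, text) can be serialized into WebVTT. The `cues_to_vtt` helper below is a hypothetical sketch, not part of this project's code:

```python
def _ts(seconds: float) -> str:
    """Format seconds as a VTT timestamp (HH:MM:SS.mmm)."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d}.{ms:03d}"

def cues_to_vtt(cues):
    """Turn (start, duration, text) cues into a WebVTT document."""
    lines = ["WEBVTT", ""]
    for start, duration, text in cues:
        lines.append(f"{_ts(start)} --> {_ts(start + duration)}")
        lines.append(text)
        lines.append("")
    return "\n".join(lines)
```

A Whisper-based path would produce the same cue structure from the audio track instead of from fetched captions.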
- Python 3.9 or higher
- Virtual environment (recommended)
```bash
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```

Basic usage with a YouTube URL:
```bash
python video_preparer.py "https://www.youtube.com/watch?v=dQw4w9WgXcQ"
```

Generate subtitles with Whisper (if no captions are available):
```bash
python video_preparer.py "https://www.youtube.com/watch?v=dQw4w9WgXcQ" --use-whisper
```

Specify an output directory:
```bash
python video_preparer.py "https://www.youtube.com/watch?v=dQw4w9WgXcQ" --output-dir ./my_dubs
```

Process multiple videos:
```bash
python video_preparer.py \
  "https://www.youtube.com/watch?v=dQw4w9WgXcQ" \
  "https://www.youtube.com/watch?v=jNQXAC9IVRw" \
  --output-dir ./batch_dubs
```

For more options and details, run:
```bash
python video_preparer.py --help
```

Once you have prepared the video data (video, audio, and VTT file), you can use the `dubber` module to generate a dubbed audio track using OpenAI's Text-to-Speech API.
The dubber requires an OpenAI API key. You can provide it in two ways:
- Using a `.env` file (recommended): create a `.env` file in the project root:

  ```
  OPENAI_API_KEY=your-openai-api-key-here
  ```

  (See `.env.sample` for an example.)

- Using an environment variable:

  ```bash
  export OPENAI_API_KEY=your-openai-api-key-here
  ```
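For illustration, the lookup order (environment variable first, then the `.env` file) might look like this minimal stdlib-only sketch; the project itself may rely on a library such as python-dotenv instead, and `read_api_key` is a hypothetical helper:

```python
import os

def read_api_key(env_path=".env"):
    """Return OPENAI_API_KEY from the environment, falling back to a .env file."""
    key = os.environ.get("OPENAI_API_KEY")
    if key:
        return key
    try:
        with open(env_path) as fh:
            for line in fh:
                line = line.strip()
                if line.startswith("OPENAI_API_KEY="):
                    return line.split("=", 1)[1]
    except FileNotFoundError:
        pass
    return None
```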
Then run the dubber:

```bash
python -m dubber --data-dir data --video video_name
```

This will:
- Load the original video from `data/video_name/video_name.mp4`
- Read the VTT subtitles from `data/video_name/video_name.vtt`
- Generate speech for each subtitle using OpenAI's TTS API (model: `gpt-4o-mini-tts`, voice: `coral`)
- Mute the original audio during subtitle timings
- Overlay the generated speech at the correct times
- Create two output files:
  - `data/video_name/video_name_dub.mp4` - dubbed video with synchronized audio
  - `data/video_name/video_name_dub.mp3` - dubbed audio track on its own
The output video retains the original soundtrack, but mutes it during dubbed phrases, creating a seamless viewing experience. The separate audio file contains the complete dubbed soundtrack.
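The mute-and-overlay step comes down to knowing exactly when the original track must fall silent. A hypothetical sketch (not the dubber's actual implementation) that merges subtitle cues into mute intervals, coalescing cues that overlap or touch:

```python
def mute_intervals(cues):
    """Merge (start, end) subtitle cues into non-overlapping mute spans."""
    spans = []
    for start, end in sorted(cues):
        if spans and start <= spans[-1][1]:
            # Cue overlaps or touches the previous span: extend it.
            spans[-1] = (spans[-1][0], max(spans[-1][1], end))
        else:
            spans.append((start, end))
    return spans
```

Merging first keeps the audio filter chain short even when consecutive subtitles run back-to-back.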
To use a custom voice configuration:

```bash
python -m dubber --data-dir data --video video_name --config voice.config
```

We use pytest for testing.
```bash
# inside dub_video/
pip install -r requirements.txt   # installs pytest, yt-dlp, etc.

# 1 · quick, offline checks (ID parsing only)
pytest -m "not slow"

# 2 · full integration – actually downloads a pair of 5-second clips
pytest -m slow
```

The integration tests need to be run in a specific order:
- video_preparer tests first (to download and prepare test data)
- dubber tests second (uses the prepared data)
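The "quick, offline" tier above exercises URL/ID parsing only. A hypothetical helper in the spirit of what those checks cover might look like:

```python
from urllib.parse import urlparse, parse_qs

def extract_video_id(url: str):
    """Pull the video ID out of a YouTube URL, or return None."""
    parsed = urlparse(url)
    if parsed.hostname in ("www.youtube.com", "youtube.com", "m.youtube.com"):
        return parse_qs(parsed.query).get("v", [None])[0]
    if parsed.hostname == "youtu.be":
        return parsed.path.lstrip("/") or None
    return None
```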
You can configure the tests using either:
- A `.env` file (recommended):

  ```
  # .env
  RUN_NETWORK_TESTS=1
  OPENAI_API_KEY=your-api-key
  ```

- Environment variables:

  ```bash
  export RUN_NETWORK_TESTS=1
  export OPENAI_API_KEY=your-api-key
  ```
```bash
# With .env file configured
./run_integration_tests.sh

# Or manually:
pytest tests/test_video_preparer.py -v
pytest tests/test_dubber.py -v
```

Note: the dubber tests require an OpenAI API key and will make real API calls to generate speech.