Automatically dub YouTube videos into different languages using AI-powered transcription, translation, and speech synthesis. This project also includes a web interface for easy use.
- YouTube Download: Download videos directly from YouTube URLs
- Speech Recognition: Transcribe audio with speaker diarization using Deepgram
- Translation: Translate content using OpenAI GPT-4
- Speech Synthesis: Generate natural-sounding speech using ElevenLabs
- Audio Alignment: Synchronize dubbed audio with original video timing
- Multi-Speaker Support: Different voices for different speakers
- Web Interface: A simple web UI for submitting dubbing jobs and tracking progress
- Real-time Progress: Step-by-step updates via polling in the web UI
- Background Processing: Non-blocking job execution for web requests
- Error Handling: Clear error messages in the web UI
- Voice Cloning: Enabled by default for best-quality dubbing (via web UI)
- Background Preservation: Keeps the original background music and effects (via web UI)
- Install system dependencies:
# macOS
brew install ffmpeg
# Ubuntu/Debian
sudo apt-get install ffmpeg
# Windows
# Download from https://ffmpeg.org/download.html
- Install Python dependencies:
pip install -r requirements.txt
- Set up API keys in .env:
ELEVENLABS_API_KEY=your_elevenlabs_key
OPENAI_API_KEY=your_openai_key
DEEPGRAM_API_KEY=your_deepgram_key
# Dub to Spanish (default)
python -m autodub.main https://youtube.com/watch?v=VIDEO_ID
# Dub to French
python -m autodub.main https://youtube.com/watch?v=VIDEO_ID --lang fr
# Custom output name
python -m autodub.main https://youtube.com/watch?v=VIDEO_ID --lang de --output my_video
- Start the server:
python web_server.py
- Open your browser:
  - Web Interface: http://localhost:8000/
  - API Docs: http://localhost:8000/docs
- Use the interface:
  - Paste YouTube URL
  - Select language
  - Choose options (Voice Clone, Background Preservation)
  - Click "Auto-Dub"
  - Watch progress in real-time
  - Download result when complete
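The web UI tracks a job by polling for status updates. A script can follow the same pattern; the sketch below is a minimal illustration, not the server's actual client. The `fetch_status` callable and the `status`/`step` field names are assumptions made for this example, so check http://localhost:8000/docs for the real endpoints and response shape.

```python
import time
from typing import Callable, Dict


def poll_until_done(
    fetch_status: Callable[[], Dict[str, str]],
    interval: float = 2.0,
    timeout: float = 600.0,
) -> Dict[str, str]:
    """Call fetch_status() repeatedly until the job completes or fails.

    fetch_status is a hypothetical callable returning a dict such as
    {"status": "running", "step": "translate"}; the field names are assumed.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        # Stop as soon as the job reaches a terminal state.
        if status.get("status") in ("completed", "failed"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not finish before the timeout")
        time.sleep(interval)
```

Passing the status lookup in as a callable keeps the loop independent of any particular HTTP library, so it can wrap whichever request the API docs describe.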
from autodub.main import autodub_pipeline
output_path = autodub_pipeline(
    youtube_url="https://youtube.com/watch?v=VIDEO_ID",
    target_language="es",
    output_name="my_dubbed_video"
)
- es - Spanish
- fr - French
- de - German
- it - Italian
- pt - Portuguese
- ru - Russian
- ja - Japanese
- ko - Korean
- zh - Chinese
- ar - Arabic
- hi - Hindi
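When calling the pipeline from your own code, it can help to validate the language code before starting a job. The sketch below mirrors the supported codes listed above; `SUPPORTED_LANGUAGES` and `resolve_language` are hypothetical names for this example and are not part of the autodub package.

```python
# Language codes taken from the supported-languages list in this README.
SUPPORTED_LANGUAGES = {
    "es": "Spanish",
    "fr": "French",
    "de": "German",
    "it": "Italian",
    "pt": "Portuguese",
    "ru": "Russian",
    "ja": "Japanese",
    "ko": "Korean",
    "zh": "Chinese",
    "ar": "Arabic",
    "hi": "Hindi",
}


def resolve_language(code: str = "es") -> str:
    """Return the language name for a code, or raise ValueError if unsupported."""
    normalized = code.strip().lower()
    if normalized not in SUPPORTED_LANGUAGES:
        supported = ", ".join(sorted(SUPPORTED_LANGUAGES))
        raise ValueError(
            f"Unsupported language {code!r}; expected one of: {supported}"
        )
    return SUPPORTED_LANGUAGES[normalized]
```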
- Download: Downloads video and extracts audio using yt-dlp
- Transcribe: Transcribes audio with speaker labels using Deepgram
- Translate: Translates each segment using OpenAI
- Synthesize: Generates speech for each segment using ElevenLabs
- Align: Adjusts audio timing to match original video
- Mux: Combines dubbed audio with original video
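The six steps above run in sequence, each consuming the previous step's output. The following is a rough sketch of that structure with a progress callback, similar in spirit to what a progress-tracked pipeline might do. All names here (`run_pipeline`, `placeholder`, `STAGES`) are hypothetical, and the stages are stand-ins that only record that they ran; the real logic lives in the modules under autodub/pipeline/.

```python
from typing import Callable, List, Tuple

# A stage is a (name, function) pair; the function takes and returns a context dict.
Stage = Tuple[str, Callable[[dict], dict]]


def run_pipeline(
    stages: List[Stage],
    context: dict,
    on_progress: Callable[[int, int, str], None] = lambda i, n, name: None,
) -> dict:
    """Run each stage in order, passing the updated context along."""
    total = len(stages)
    for index, (name, stage) in enumerate(stages, start=1):
        on_progress(index, total, name)  # report before the stage starts
        context = stage(context)
    return context


def placeholder(key: str) -> Callable[[dict], dict]:
    """Build a stand-in stage that only records that it ran."""
    def stage(context: dict) -> dict:
        return {**context, "completed": context.get("completed", []) + [key]}
    return stage


STAGE_NAMES = ["download", "transcribe", "translate", "synthesize", "align", "mux"]
STAGES: List[Stage] = [(name, placeholder(name)) for name in STAGE_NAMES]
```

Calling `run_pipeline(STAGES, {"url": "..."}, on_progress=lambda i, n, name: print(f"[{i}/{n}] {name}"))` prints one line per step, which is the kind of step-by-step update a polling client could display.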
autodub/
├── pipeline/
│ ├── download.py # YouTube download & audio extraction
│ ├── transcribe.py # Deepgram ASR integration
│ ├── translate.py # OpenAI translation
│ ├── synthesize.py # ElevenLabs TTS
│ ├── align.py # Audio timing alignment
│ └── mux.py # Video/audio muxing
├── main.py # CLI entry point
├── web_pipeline.py # Progress-tracked pipeline for web
└── config.py # Configuration & API keys
web_server.py # FastAPI server
web_static/
└── index.html # Minimal frontend
outputs/ # Generated videos
Dubbed videos are saved to the outputs/ directory with the format:
{output_name}_dubbed.mp4
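If a script needs to locate the result afterwards, the naming rule can be expressed as a small helper. This is an illustrative sketch; `dubbed_output_path` is a hypothetical function, and it assumes the default outputs/ directory relative to the working directory.

```python
from pathlib import Path


def dubbed_output_path(output_name: str, output_dir: str = "outputs") -> Path:
    """Build the expected path of a dubbed video from its output name."""
    return Path(output_dir) / f"{output_name}_dubbed.mp4"
```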
Run the test script with a sample video:
python test_pipeline.py
- Job history and management for the web interface
- Batch processing for CLI and web
- Advanced UI improvements
- External API integrations
- Performance optimizations
- Quality adjustment settings
- Python 3.8+
- FFmpeg
- API Keys: Deepgram, OpenAI, ElevenLabs
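Since all three API keys are required, a quick pre-flight check can catch a missing one before a job starts. The sketch below assumes the values from .env have already been loaded into the process environment (how the project loads them is not shown here); `missing_api_keys` is a hypothetical helper, not part of the autodub package.

```python
import os
from typing import List, Mapping

# Variable names as they appear in the .env example in this README.
REQUIRED_KEYS = ("ELEVENLABS_API_KEY", "OPENAI_API_KEY", "DEEPGRAM_API_KEY")


def missing_api_keys(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required keys that are unset or blank."""
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]
```

An empty list means every key is present; otherwise the returned names can go straight into an error message.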
MIT