A Python script that automatically adds animated captions to videos using speech recognition, with extensive customization options.
- Automatic Captioning
  - Uses OpenAI's Whisper model for speech-to-text transcription
  - Word-by-word or multi-word animated captions
  - Customizable zoom animation effects
- Styling Options
  - Color cycling for captions
  - Configurable font, size, and positioning
  - Text stroke and background options
  - Highlight individual words
- Advanced Video Editing
  - Optional intro sound
  - Video overlay support
  - Flexible caption placement
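
To make the word-by-word, multi-word, and color-cycling features concrete, here is a minimal sketch of how word timings could be grouped into caption chunks. It is not taken from caption.py: the helper names `group_words` and `assign_colors` are hypothetical, and the input shape (a list of dicts with `word`, `start`, and `end` keys) is an assumption modeled on Whisper's word-level timestamps.

```python
from itertools import cycle


def group_words(words, number_of_words=1):
    """Group word timings into caption chunks of at most number_of_words words.

    `words` is assumed to be a list of dicts with "word", "start" and "end"
    keys (hypothetical shape, modeled on Whisper word timestamps).
    """
    chunks = []
    for i in range(0, len(words), number_of_words):
        group = words[i:i + number_of_words]
        chunks.append({
            # Strip the whitespace that transcribers often leave around words.
            "text": " ".join(w["word"].strip() for w in group),
            "start": group[0]["start"],  # chunk appears with its first word
            "end": group[-1]["end"],     # and disappears after its last word
        })
    return chunks


def assign_colors(chunks, text_colors):
    """Cycle through text_colors, one color per caption chunk."""
    colors = cycle(text_colors)
    return [dict(chunk, color=next(colors)) for chunk in chunks]


words = [
    {"word": " Hello", "start": 0.0, "end": 0.4},
    {"word": " there", "start": 0.4, "end": 0.8},
    {"word": " world", "start": 0.9, "end": 1.3},
]
captions = assign_colors(group_words(words, number_of_words=2), ["white", "yellow"])
for caption in captions:
    print(caption)
```

Each resulting chunk carries its own start and end time, so a renderer can show it for exactly the span in which its words are spoken.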
- Python 3.8 or higher
- FFmpeg installed on your system
- GPU recommended for faster processing (optional)
- Clone the repository:
git clone <repository-url>
cd video-caption-generator
- Install dependencies:
pip install moviepy openai-whisper PyYAML requests fonttools pillow

The fontify.py script provides an easy way to download fonts from Google Fonts:
# Download a specific font
python fontify.py "Open Sans"
# Or
python fontify.py "Roboto Black"
This script will:
- Query the Google Fonts API
- Download the font file
- Save it in a ./fonts directory
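
The three steps above can be sketched roughly as follows. This is an illustration, not the contents of fontify.py: the helper names, the weight-name mapping, and the output filename are hypothetical, and the sketch assumes the Google Fonts Developer API returns a JSON listing whose `items` each carry a `family` name and a `files` mapping from variant to download URL. An API key is needed to call the real endpoint.

```python
import json
import urllib.parse
import urllib.request
from pathlib import Path

API_URL = "https://www.googleapis.com/webfonts/v1/webfonts"

# Hypothetical mapping from weight names to Google Fonts variant keys.
WEIGHTS = {"thin": "100", "light": "300", "regular": "regular",
           "medium": "500", "bold": "700", "black": "900"}


def split_font_name(name):
    """Split e.g. 'Roboto Black' into (family, variant); default to 'regular'."""
    parts = name.split()
    if len(parts) > 1 and parts[-1].lower() in WEIGHTS:
        return " ".join(parts[:-1]), WEIGHTS[parts[-1].lower()]
    return name, "regular"


def pick_file_url(listing, family, variant):
    """Find the download URL for a family/variant in an API listing, or None."""
    for item in listing.get("items", []):
        if item.get("family", "").lower() == family.lower():
            return item.get("files", {}).get(variant)
    return None


def download_font(name, api_key, dest_dir="fonts"):
    """Query the API, download the font file, and save it under dest_dir."""
    family, variant = split_font_name(name)
    query = urllib.parse.urlencode({"key": api_key})
    with urllib.request.urlopen(f"{API_URL}?{query}") as response:
        listing = json.load(response)
    url = pick_file_url(listing, family, variant)
    if url is None:
        raise ValueError(f"No font file found for {name!r}")
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Assumes the API serves TrueType files; the filename scheme is made up.
    target = dest / f"{name.replace(' ', '-')}.ttf"
    urllib.request.urlretrieve(url, target)
    return target
```

Calling `download_font("Roboto Black", api_key="...")` would perform the network requests; the two helper functions can be exercised offline.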
Configuration is managed through a config.toml file with numerous options:
- number_of_words: Number of words to display at once
- font: Font filename
- font_size: Text size in pixels
- position: Screen position (top/center/bottom)
- text_align: Text alignment
- text_colors: Color cycling for captions
- stroke_color and stroke_width: Text outline styling
- bg_color: Background color or transparency
- transition: Zoom animation toggle
- highlight: Highlight current word
- has_intro_sound: Add background audio
- overlay: Overlay another video
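
As an illustration of how these options might fit together, the sketch below parses a sample configuration and runs a few sanity checks. The option names come from the list above, but every value, the value types, and the `validate` helper are assumptions; the real config.toml shipped with the project may differ. It uses the standard-library `tomllib` (Python 3.11+) and falls back to the third-party `tomli` package on older versions.

```python
try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # third-party backport for older versions

# Hypothetical sample; all values and types are illustrative assumptions.
SAMPLE_CONFIG = """
number_of_words = 2
font = "Roboto-Black.ttf"
font_size = 64
position = "bottom"
text_align = "center"
text_colors = ["white", "yellow"]
stroke_color = "black"
stroke_width = 3
bg_color = "transparent"
transition = true
highlight = false
has_intro_sound = false
overlay = ""
"""


def validate(config):
    """Run basic sanity checks and return the config unchanged if they pass."""
    if config["number_of_words"] < 1:
        raise ValueError("number_of_words must be at least 1")
    if config["position"] not in {"top", "center", "bottom"}:
        raise ValueError("position must be top, center, or bottom")
    if not config["text_colors"]:
        raise ValueError("text_colors needs at least one color")
    return config


config = validate(tomllib.loads(SAMPLE_CONFIG))
print(config["font"], config["font_size"], config["position"])
```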
# Basic usage
python caption.py input_video.mp4
# Specify output and configuration
python caption.py input_video.mp4 --output_file captioned_video.mp4 --config custom_config.toml

# Or call it from Python
from caption import add_captions_to_video
add_captions_to_video(
    "input_video.mp4",
    output_path="output_video.mp4",
    config_path="config.toml"
)

The included silence_trimmer.py removes silent segments from videos:
python silence_trimmer.py input_video.mp4
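
How silence_trimmer.py detects silence is not described here, but the trimming step itself can be illustrated: once the silent intervals are known, the video is reduced to the spans between them. The sketch below shows only that interval arithmetic; `segments_to_keep` is a hypothetical helper, and the silent intervals are assumed to be supplied as `(start, end)` pairs in seconds.

```python
def segments_to_keep(silences, duration):
    """Return the (start, end) spans of a clip that are not silent.

    `silences` is a list of (start, end) pairs in seconds; overlapping or
    unsorted intervals are handled. `duration` is the clip length in seconds.
    """
    keep = []
    cursor = 0.0  # end of the last silent interval processed so far
    for start, end in sorted(silences):
        if start > cursor:
            # Everything between the previous silence and this one is kept.
            keep.append((cursor, min(start, duration)))
        cursor = max(cursor, end)
    if cursor < duration:
        keep.append((cursor, duration))  # trailing non-silent part
    return keep


kept = segments_to_keep([(2.0, 3.5), (7.0, 8.0)], duration=10.0)
print(kept)
```

The kept spans could then be cut out of the source video and concatenated in order.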
- Font Issues
  - Use fontify.py to download missing fonts
  - Ensure fonts are in the ./fonts directory
- Performance
  - Use smaller Whisper models for faster processing
  - Reduce video resolution if memory is limited
- Dependencies
  - Ensure all required libraries are installed
  - Check FFmpeg installation
- Experiment with config.toml settings
- Try different fonts using fontify.py
- Adjust Whisper model for accuracy vs. speed
- MoviePy: Video processing
- Whisper: Speech recognition
- PIL (Pillow): Image manipulation
- TOML: Configuration management
- Apache License 2.0 (for Roboto Fonts)
- MIT License for the script
- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Create a pull request
- OpenAI (Whisper)
- Google Fonts
- MoviePy Developers