A fast, Rust video subtitle pipeline that:
- Extracts audio from a video using
ffmpeg
- Transcribes speech into text with precise timestamps using
whisper_rs
- Generates styled karaoke-style ASS subtitles (with word-level highlighting)
- Burns subtitles into videos or overlays on new video, image, or solid color backgrounds with
ffmpeg
Note
More examples are available in the example directory. The following is just a GIF for visual demo purposes
Important
Tested on macOS only.
The instructions below assume you’re on a Mac (using Homebrew).
The Engineering Requirements Document (ERD) is available here :
brew install ffmpeg
You can now build the transcriber using Rust Cargo.
cargo build
cargo run -- --input "example/input_audio.wav"
This runs the pipeline end-to-end:
- extracts or processes audio,
- transcribes with Whisper,
- generates subtitles,
- burns subtitles into a new video.
In the style.json
file, you control:
- Subtitle text styling: font, size, colors, alignment
- Background style: solid color, image, or looping video
You can control your background type to be either solid
, video
, or image
. Here are the 3 way to configure it:
{
"background": {
"type": "video",
"solid_color": null,
"video_path": "assets/bg_example.mp4",
"image_path": null
}
}
{
"background": {
"type": "image",
"solid_color": null,
"video_path": null,
"image_path": "assets/bg_image.jpg"
}
}
{
"background": {
"type": "solid",
"solid_color": "#101020",
"video_path": null,
"image_path": null
}
}