caption

Generate subtitles from any video file. Runs entirely on your computer. No accounts, no uploads, no internet required after setup.

Source: github.com/DukeSaputra/caption | Author: Duke Saputra | License: GPL-3.0

What It Does

caption video.mp4

Produces a subtitle file with precise word-level timing. Import into any video editor (Premiere Pro, DaVinci Resolve, CapCut, etc.) or upload directly to YouTube, Instagram, or TikTok.

Supported formats: MP4, MKV, MP3, FLAC, WAV, Vorbis (built-in), plus Opus, WebM, and more with FFmpeg.

Setup

Step 1: Download

Download the latest release for your platform and save it to your Downloads folder.

Platform	File
macOS (Apple Silicon & Intel)	`caption-macos-universal`
Windows	`caption-windows-x86_64.exe`
Linux	`caption-linux-x86_64`

Step 2: Install

No admin privileges required.

macOS

Make sure the downloaded file is in your Downloads folder (~/Downloads/caption-macos-universal). Safari may rename it (e.g. add .dms). If so, rename it back to caption-macos-universal before continuing.

1. Open Terminal (press Cmd + Space, type Terminal, hit Enter)

2. Copy and paste this entire block into Terminal, then press Enter:

if [ ! -f ~/Downloads/caption-macos-universal ]; then echo "caption-macos-universal not found. Move it to your Downloads folder and try again."; else mkdir -p ~/.local/bin && mv ~/Downloads/caption-macos-universal ~/.local/bin/caption && chmod +x ~/.local/bin/caption && xattr -d com.apple.quarantine ~/.local/bin/caption 2>/dev/null; (grep -q '.local/bin' ~/.zshrc 2>/dev/null || echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.zshrc) && source ~/.zshrc && caption setup; fi

This installs the binary and downloads all required models (~1.1 GB). It may take 5-15 minutes depending on your connection. If it gets interrupted, just run caption setup to resume.

Troubleshooting:

"Killed: 9" or a security dialog means the quarantine flag wasn't fully removed. Run: xattr -d com.apple.quarantine ~/.local/bin/caption

"command not found" after running caption --help means the PATH update didn't take effect. Close Terminal and open a new window, then try again.

macOS Mojave or older (bash): replace ~/.zshrc with ~/.bash_profile in the command above. Not sure? Run echo $SHELL.
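
The shell check can also be scripted. This is a minimal sketch, assuming macOS defaults where zsh reads ~/.zshrc and bash login shells read ~/.bash_profile. The helper name rc_file_for_shell and the ~/.profile fallback are illustrative assumptions, not part of caption.

```shell
# rc_file_for_shell: hypothetical helper, not part of caption.
# Given a shell path (such as the value of $SHELL), print the startup
# file that the PATH line from the install command should be written to.
rc_file_for_shell() {
  case "$1" in
    */zsh)  printf '%s\n' "$HOME/.zshrc" ;;
    */bash) printf '%s\n' "$HOME/.bash_profile" ;;
    *)      printf '%s\n' "$HOME/.profile" ;;   # assumed fallback for other shells
  esac
}
```

Running rc_file_for_shell "$SHELL" prints the file name to substitute for ~/.zshrc in the install command.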

What does the install command do?

mkdir -p ~/.local/bin creates the install directory
mv ~/Downloads/caption-macos-universal ~/.local/bin/caption moves and renames the binary
chmod +x makes it executable
xattr -d com.apple.quarantine removes the macOS download quarantine
The grep || echo line adds ~/.local/bin to your PATH (only if not already there)
source ~/.zshrc reloads the config so caption is available immediately
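
The grep || echo step is what makes the install command safe to paste more than once: the export line is appended only when the config file does not already mention .local/bin. The same idea, written out as a function for readability, might look like the sketch below. The name add_path_once is hypothetical and the function is not shipped with caption.

```shell
# add_path_once: hypothetical helper, not part of caption.
# Appends the PATH export to the given config file unless the file
# already mentions .local/bin. A missing file counts as "not present",
# so the file is created on first use.
add_path_once() {
  rc_file="$1"
  # Single quotes keep $HOME and $PATH unexpanded, so they are
  # evaluated each time the shell starts, as in the install command.
  line='export PATH="$HOME/.local/bin:$PATH"'
  if ! grep -qF '.local/bin' "$rc_file" 2>/dev/null; then
    printf '%s\n' "$line" >> "$rc_file"
  fi
}
```

Calling add_path_once ~/.zshrc twice leaves a single export line in the file.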

Linux

Make sure the downloaded file is in your Downloads folder (~/Downloads/caption-linux-x86_64).

1. Open a terminal

2. Copy and paste this entire block, then press Enter:

if [ ! -f ~/Downloads/caption-linux-x86_64 ]; then echo "caption-linux-x86_64 not found. Move it to your Downloads folder and try again."; else mkdir -p ~/.local/bin && mv ~/Downloads/caption-linux-x86_64 ~/.local/bin/caption && chmod +x ~/.local/bin/caption && (grep -q '.local/bin' ~/.bashrc 2>/dev/null || echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.bashrc) && source ~/.bashrc && caption setup; fi

This installs the binary and downloads all required models (~1.1 GB). If it gets interrupted, run caption setup to resume.

"command not found" after running caption --help means the PATH update didn't take effect. Close your terminal and open a new one, then try again.

Windows

Note: Windows Defender SmartScreen may show a warning when you first run the binary ("Windows protected your PC"). This is normal for unsigned open-source software. Click More info, then Run anyway. You can verify the source code at github.com/DukeSaputra/caption.

Make sure the downloaded file is in your Downloads folder (caption-windows-x86_64.exe).

1. Open PowerShell and paste this entire block, then press Enter:

if (!(Test-Path "$HOME\Downloads\caption-windows-x86_64.exe")) { Write-Host "caption-windows-x86_64.exe not found. Move it to your Downloads folder and try again." } else { New-Item -ItemType Directory -Force -Path "$HOME\caption" | Out-Null; Move-Item "$HOME\Downloads\caption-windows-x86_64.exe" "$HOME\caption\caption.exe" -Force; [Environment]::SetEnvironmentVariable("Path", [Environment]::GetEnvironmentVariable("Path", "User") + ";$HOME\caption", "User"); Write-Host "Installed. Close and reopen PowerShell to continue setup." }

Close and reopen PowerShell, then run:

caption setup

"caption is not recognized" means the PATH update hasn't taken effect. Make sure you closed and reopened PowerShell after the install command.

Step 3: Try It

Type caption (with a space after it), then paste the path to any video or audio file. Press Enter.

Generate subtitles:

caption video.mp4

This creates video.srt in the same folder as the video.

Generate subtitles and burn them into the video:

caption video.mp4 --burn

This creates video.srt and video-captioned.mp4 with the subtitles baked into the video.

Tip: If the burned subtitles have errors, you can fix them without re-transcribing. Generate the subtitle file on its own (caption video.mp4), correct the mistakes in any text editor, then burn the corrected file:

caption path/to/video.mp4 --srt path/to/video.srt

Tip: You don't need to type file paths manually.

macOS: Drag a file from Finder into Terminal to paste its path. Or right-click while holding Option and select Copy as Pathname.

Windows: Hold Shift, right-click the file in Explorer, and select Copy as path.

Linux: Most file managers let you right-click and copy the full path. Or drag the file into the terminal.

Usage

Common commands

caption video.mp4                        # SRT subtitles (default)
caption video.mp4 --format vtt           # WebVTT
caption video.mp4 --format txt           # Plain text transcript
caption video.mp4 --burn                 # Burn subtitles into video
caption video.mp4 -o my-subtitles.srt    # Custom output filename
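
To caption a whole folder, the default command can be wrapped in a loop. The sketch below relies only on the documented behavior that caption video.mp4 writes video.srt next to the video, and skips files that already have one. The function name caption_all and the CAPTION_BIN variable are hypothetical conveniences, and the loop only looks at .mp4 files.

```shell
# caption_all: hypothetical wrapper, not part of caption.
# Runs caption on every .mp4 in a folder that has no .srt beside it yet.
# CAPTION_BIN is an assumed override for the binary location; it
# defaults to "caption" as found on PATH.
caption_all() {
  dir="$1"
  for video in "$dir"/*.mp4; do
    # If nothing matches, the glob stays literal; skip it.
    if [ ! -e "$video" ]; then
      continue
    fi
    # Skip videos that already have subtitles next to them.
    if [ -e "${video%.mp4}.srt" ]; then
      continue
    fi
    "${CAPTION_BIN:-caption}" "$video"
  done
}
```

For example, caption_all ~/Movies/clips processes each uncaptioned .mp4 in that folder in turn.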

Burn subtitles into video

Renders each word in a frosted glass capsule, one at a time. Designed for vertical/short-form video (Reels, Shorts, TikToks).

caption video.mp4 --burn

Produces video.srt + video-captioned.mp4.

All options

--format <srt|vtt|ass|txt>              Output format (default: srt)
--burn                                  Burn subtitles into video
--srt <PATH>                            Burn an existing SRT (skips transcription)
--initial-prompt <TEXT>                 Vocabulary hint for Whisper
--fillers <keep|remove-confident|remove-all>
                                        Filler word handling (default: remove-confident)
--profanity <keep|mask|replace>         Profanity filtering (default: keep)
--hallucination-filter <off|moderate|aggressive>
                                        False phrase detection (default: moderate)
--no-vad                                Disable speech detection
--no-align                              Disable word-level alignment
-o, --output <PATH>                     Custom output path
-v, --verbose                           Show detailed processing info

Troubleshooting

"command not found"

Your system doesn't know where the binary is. Either cd into the folder containing it and run it as ./caption, or follow the PATH setup in Step 2. You can always run it by its full path: ~/.local/bin/caption video.mp4
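
A quick way to tell whether the PATH step took effect is to check for the install directory in the PATH string. This is a minimal sketch for macOS and Linux; path_contains is a hypothetical helper name, not a caption command.

```shell
# path_contains: hypothetical helper, not part of caption.
# Succeeds if directory $1 appears as a complete entry in the
# colon-separated list $2. Wrapping both sides in colons avoids
# partial matches such as /x/bin matching /x/binary.
path_contains() {
  case ":$2:" in
    *":$1:"*) return 0 ;;
    *)        return 1 ;;
  esac
}

if path_contains "$HOME/.local/bin" "$PATH"; then
  echo "$HOME/.local/bin is on PATH"
else
  echo "$HOME/.local/bin is missing from PATH; open a new terminal or redo the PATH step"
fi
```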

"No audio track found"

The input file has no audio. Verify it plays sound in a media player before trying again.

"Could not find a Whisper model file"

The models aren't where the binary expects them. Re-run caption setup to download them.

Subtitles are inaccurate

Add vocabulary hints for unusual words, names, or brands:

caption video.mp4 --initial-prompt "TikTok, iPhone, RSVP"

What Gets Downloaded

caption setup downloads everything the tool needs to run locally. Make sure you have at least 1.5 GB of free disk space.

Component	Size	Source	Purpose
Whisper large-v3-turbo Q8_0	874 MB	HuggingFace	Speech recognition via whisper.cpp
Silero VAD v5	2 MB	GitHub	Voice activity detection, reduces hallucinations
wav2vec2-base-960h	91 MB	HuggingFace	CTC forced alignment (~20ms word timestamps)
FFmpeg	~80 MB	GitHub	Audio extraction from video files
ONNX Runtime	~50 MB	Microsoft GitHub	Inference backend for alignment and VAD
Inter Bold	<1 MB	Google Fonts	Font for burned-in subtitles (`--burn`)

Models are saved to ~/Library/Application Support/caption/models on macOS, %APPDATA%\caption\models on Windows, and ~/.local/share/caption/models on Linux. FFmpeg, ONNX Runtime, and the font are placed next to the caption binary.
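
To see how much space the models take, the macOS and Linux locations listed above can be looked up by platform. The sketch below is illustrative only: models_dir_for is a hypothetical helper, and it deliberately leaves out Windows, where the path depends on %APPDATA%.

```shell
# models_dir_for: hypothetical helper, not part of caption.
# Maps a kernel name (as printed by uname -s) to the models directory
# documented above. Returns non-zero for platforms it does not know.
models_dir_for() {
  case "$1" in
    Darwin) printf '%s\n' "$HOME/Library/Application Support/caption/models" ;;
    Linux)  printf '%s\n' "$HOME/.local/share/caption/models" ;;
    *)      return 1 ;;
  esac
}

# Report disk usage if the directory exists on this machine.
if dir="$(models_dir_for "$(uname -s)")" && [ -d "$dir" ]; then
  du -sh "$dir"
else
  echo "models directory not found (has caption setup been run?)"
fi
```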

SHA256 hashes are verified automatically. Use --skip-hash only if verification fails due to an upstream model update.
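
If you want to check a downloaded file by hand, the same kind of comparison can be done with standard tools. This is a sketch of a manual check, not caption's own verification code. The function names are hypothetical, and the expected hash has to come from the upstream source, since this README does not list one.

```shell
# sha256_of / verify_sha256: hypothetical helpers, not part of caption.
# sha256_of prints the SHA256 hex digest of a file, using sha256sum
# where available (most Linux systems) and shasum -a 256 otherwise
# (macOS).
sha256_of() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | awk '{print $1}'
  else
    shasum -a 256 "$1" | awk '{print $1}'
  fi
}

# verify_sha256 FILE EXPECTED_HEX: succeeds only if the digest matches.
verify_sha256() {
  [ "$(sha256_of "$1")" = "$2" ]
}
```

For example, verify_sha256 path/to/model-file EXPECTED_HASH && echo OK prints OK only when the file matches the hash you supply.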

Build from Source

Requires Rust.

git clone https://github.com/DukeSaputra/caption.git && cd caption && cargo build --release

The binary is written to target/release/caption-cli. For Metal GPU acceleration on Apple Silicon (3-4x faster), build with the metal feature:

cargo build --release --features metal

Uninstall

caption uninstall

Removes all downloaded models, dependencies (FFmpeg, ONNX Runtime, font), and tells you how to remove the binary itself.

Security

Models downloaded from official sources (HuggingFace, GitHub) with SHA256 verification
Fully open source and buildable from source
No root/admin privileges required
No network access after initial setup
GPL-3.0 licensed
