A simple backend project that processes a local video and produces an output video with burned-in subtitles.
Pipeline steps:
- Extract audio from input video with FFmpeg
- Transcribe audio with OpenAI Whisper (Python script)
- Generate SRT subtitles
- Translate SRT lines using LibreTranslate-compatible API
- Burn translated subtitles into the output video with FFmpeg
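The first pipeline step (audio extraction with FFmpeg) can be sketched from Go via `os/exec`. The function name and the specific FFmpeg flags below (16 kHz mono PCM WAV, a common choice for Whisper input) are illustrative assumptions, not the project's exact invocation:

```go
package main

import (
	"fmt"
	"os/exec"
)

// extractAudioArgs builds FFmpeg arguments for pulling a 16 kHz mono WAV
// out of the input video. A sketch; the project's real flags may differ.
func extractAudioArgs(inputVideo, outputWAV string) []string {
	return []string{
		"-y",                   // overwrite output without asking
		"-i", inputVideo,       // input video
		"-vn",                  // drop the video stream
		"-acodec", "pcm_s16le", // 16-bit PCM audio
		"-ar", "16000",         // 16 kHz sample rate
		"-ac", "1",             // mono
		outputWAV,
	}
}

func main() {
	args := extractAudioArgs("input.mp4", "audio.wav")
	cmd := exec.Command("ffmpeg", args...)
	fmt.Println(cmd.String()) // inspect the command; cmd.Run() would execute it
}
```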
The project follows a lightweight clean architecture split:
- `cmd/subtitle`: CLI entrypoint and wiring
- `internal/usecase`: pipeline orchestration
- `internal/ports`: service interfaces (video/transcription/translation/SRT)
- `internal/infrastructure`: concrete implementations
  - `video`: FFmpeg integration
  - `transcription`: Whisper script wrapper
  - `translation`: LibreTranslate script wrapper
  - `srt`: SRT writer
- `scripts/`: Python scripts used by Go services
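The `internal/ports` layer could look roughly like the interfaces below. Interface and method names here are assumptions for illustration; the project's actual signatures may differ:

```go
package main

import "fmt"

// Segment is one timed chunk of recognized speech (hypothetical type).
type Segment struct {
	Start, End float64 // seconds
	Text       string
}

// VideoService wraps the FFmpeg integration.
type VideoService interface {
	ExtractAudio(videoPath, wavPath string) error
	BurnSubtitles(videoPath, srtPath, outPath string) error
}

// TranscriptionService wraps the Whisper script.
type TranscriptionService interface {
	Transcribe(wavPath, model string) ([]Segment, error)
}

// TranslationService wraps the LibreTranslate-compatible API.
type TranslationService interface {
	Translate(text, sourceLang, targetLang string) (string, error)
}

func main() {
	s := Segment{Start: 0, End: 2.5, Text: "hello"}
	fmt.Printf("%.1f-%.1f: %s\n", s.Start, s.End, s.Text)
}
```

Keeping concrete FFmpeg/Whisper/LibreTranslate code behind interfaces like these is what lets the use-case layer orchestrate the pipeline without knowing about subprocesses or HTTP.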
Requirements:
- Go 1.22+
- Python 3.11+
- FFmpeg installed and available in PATH
Python dependencies:

```bash
pip install -r requirements.txt
```

Run the pipeline:

```bash
go run ./cmd/subtitle \
  -input ./samples/input.mp4 \
  -output ./samples/output_es.mp4 \
  -source-lang auto \
  -target-lang es
```

Available flags:

- `-input` (required): input video path
- `-output`: output video path (default `output_subtitled.mp4`)
- `-source-lang`: source language code for translation (default `auto`)
- `-target-lang`: target language code (default `es`)
- `-workdir`: temp working directory (default uses system temp)
- `-keep-artifacts`: keep intermediate WAV and SRT files
- `-whisper-model`: Whisper model name (default `base`)
- `-translate-endpoint`: translation API endpoint (default `https://libretranslate.com/translate`)
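Flag handling in `cmd/subtitle` could be wired with the standard `flag` package along these lines. This is a sketch covering a subset of the flags, not the project's actual code:

```go
package main

import (
	"flag"
	"fmt"
)

// config mirrors a subset of the CLI flags listed above (illustrative).
type config struct {
	input, output          string
	sourceLang, targetLang string
	whisperModel           string
}

// parseFlags parses args (without the program name) into a config,
// applying the documented defaults and requiring -input.
func parseFlags(args []string) (config, error) {
	var c config
	fs := flag.NewFlagSet("subtitle", flag.ContinueOnError)
	fs.StringVar(&c.input, "input", "", "input video path (required)")
	fs.StringVar(&c.output, "output", "output_subtitled.mp4", "output video path")
	fs.StringVar(&c.sourceLang, "source-lang", "auto", "source language code")
	fs.StringVar(&c.targetLang, "target-lang", "es", "target language code")
	fs.StringVar(&c.whisperModel, "whisper-model", "base", "Whisper model name")
	if err := fs.Parse(args); err != nil {
		return config{}, err
	}
	if c.input == "" {
		return config{}, fmt.Errorf("-input is required")
	}
	return c, nil
}

func main() {
	c, err := parseFlags([]string{"-input", "in.mp4", "-target-lang", "fr"})
	if err != nil {
		panic(err)
	}
	fmt.Println(c.input, c.output, c.targetLang) // in.mp4 output_subtitled.mp4 fr
}
```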
Build:

```bash
docker build -t subtitle-pipeline .
```

Run (single command):

```bash
docker run --rm \
  -v "$PWD:/data" \
  subtitle-pipeline \
  -input /data/input.mp4 \
  -output /data/output_es.mp4 \
  -target-lang es
```

Note: Whisper downloads model weights on first run, which can take time.
- Translation is line-based on subtitle text lines to keep the implementation simple.
- You can replace `-translate-endpoint` with your own LibreTranslate-compatible service.
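Per-line translation against a LibreTranslate-compatible endpoint can be sketched as below. The request/response shape follows the public LibreTranslate `/translate` API (`q`, `source`, `target`, `format` → `translatedText`); the function names are illustrative:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// translateRequest matches the LibreTranslate /translate JSON body.
type translateRequest struct {
	Q      string `json:"q"`
	Source string `json:"source"`
	Target string `json:"target"`
	Format string `json:"format"`
}

// newTranslatePayload builds the request body for one subtitle line.
func newTranslatePayload(line, source, target string) ([]byte, error) {
	return json.Marshal(translateRequest{Q: line, Source: source, Target: target, Format: "text"})
}

// translateLine posts one line to a LibreTranslate-compatible endpoint and
// returns the translated text. Not called in main, to keep the sketch offline.
func translateLine(endpoint, line, source, target string) (string, error) {
	body, err := newTranslatePayload(line, source, target)
	if err != nil {
		return "", err
	}
	resp, err := http.Post(endpoint, "application/json", bytes.NewReader(body))
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	var out struct {
		TranslatedText string `json:"translatedText"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return "", err
	}
	return out.TranslatedText, nil
}

func main() {
	body, err := newTranslatePayload("Hello, world", "auto", "es")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

Because each subtitle line is translated independently, context that spans lines is lost; that is the trade-off the line-based note above refers to.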