A Python application that transcribes large audio files using OpenAI's
gpt-4o-mini-transcribe model. Available as both a CLI tool and web interface.
Files above 25MB are automatically chunked and merged into a single plain-text
output, with progress feedback.
- Dual Interface: CLI tool and web application
- Auto-chunking for large files (keeps each chunk under the API limit)
- Plain-text output to stdout or file
- Progress UI with Rich (can be silenced in CLI)
- Retry handling for transient API failures
- Supports common formats (mp3, wav, m4a, aac, flac, ogg, wma)
- Docker support with best practices (multi-stage builds, non-root user, health checks)
- Python 3.9+
- FFmpeg (system dependency)
- Linux: apt-get install ffmpeg
- macOS: brew install ffmpeg
pip install -e .
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
export OPENAI_API_KEY="sk-your-key"
scribify path/to/audio.mp3 -o output.txt
Set your API key in the environment:
export OPENAI_API_KEY="sk-your-key"
You can copy .env.example to .env and set your key there if you prefer.
Run the web application for a browser-based transcription interface:
# Install web dependencies
pip install -r requirements.txt
# Run the web server
python web_app.py
Then open http://localhost:8000 in your browser.
The easiest way to run the web interface is with Docker:
# Copy and configure environment
cp .env.example .env
# Edit .env and add your OPENAI_API_KEY
# Start the service
docker compose up -d
# View logs
docker compose logs -f
# Stop the service
docker compose down
Access the web interface at http://localhost:8000.
For detailed Docker configuration and production deployment, see DOCKER.md.
- Validates the file and checks size.
- If under 25MB, transcribes directly.
- If over 25MB, splits into chunks, transcribes each, and merges results.
- Temporary chunks are cleaned up after completion.
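The size check and split described above can be sketched as follows. This is a minimal illustration, not scribify's actual code: `plan_chunks`, its parameters, and the 20 MB default target are hypothetical, and it assumes a roughly constant bitrate so that file size scales with duration. Because chunks are re-encoded on export, real chunk sizes will differ from this estimate.

```python
import math

MB = 1024 * 1024
API_LIMIT_MB = 25  # size threshold stated in this README


def plan_chunks(file_size_bytes, duration_seconds, chunk_size_mb=20.0):
    """Return (start, end) ranges in seconds for each chunk.

    Hypothetical helper: assumes size is proportional to duration.
    """
    # Files at or under the limit are transcribed directly as one piece.
    if file_size_bytes <= API_LIMIT_MB * MB:
        return [(0.0, duration_seconds)]

    # Otherwise use enough equal-length chunks to stay under the target size.
    n_chunks = math.ceil(file_size_bytes / (chunk_size_mb * MB))
    step = duration_seconds / n_chunks
    return [
        (i * step, min((i + 1) * step, duration_seconds))
        for i in range(n_chunks)
    ]


if __name__ == "__main__":
    for start, end in plan_chunks(100 * MB, 1000.0):
        print(f"{start:7.1f}s -> {end:7.1f}s")
```

Keeping the target below the 25MB threshold leaves headroom for encoding overhead; the right margin depends on the export settings.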
scribify path/to/audio.mp3 -o output.txt
Options:
- -m, --model: override model
- --chunk-size: target chunk size in MB
- -q, --quiet: suppress progress UI
- -v, --verbose: verbose logging
- By default, transcripts are printed to stdout.
- Use -o/--output to write a file.
- Chunked transcripts are concatenated with newlines.
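The newline concatenation of chunked transcripts might look like the sketch below. `merge_transcripts` is a hypothetical name, and trimming whitespace and skipping empty chunks are assumptions rather than documented behavior.

```python
def merge_transcripts(parts):
    """Join per-chunk transcripts in order, one chunk per line."""
    # Assumption: strip stray whitespace and drop chunks with no text.
    cleaned = [part.strip() for part in parts]
    return "\n".join(part for part in cleaned if part)


if __name__ == "__main__":
    print(merge_transcripts(["First chunk text. ", "", " Second chunk text."]))
```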
Small file:
scribify small.mp3 -o transcript.txt
Large file:
scribify large.mp3 --verbose -o transcript.txt
- Large files are chunked; chunks are exported as mp3 for broad FFmpeg compatibility.
- Costs & data handling: API calls incur OpenAI usage fees, and your audio is sent to OpenAI for transcription. Keep your OPENAI_API_KEY private and out of version control.
- FFmpeg missing: apt-get install ffmpeg or brew install ffmpeg
- API key missing: ensure OPENAI_API_KEY is set in your shell or .env
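Both problems can be checked before running a transcription. The sketch below is illustrative only: `preflight_problems` is a hypothetical helper that is not part of scribify, and it inspects the process environment only, so a key stored in .env is seen only if something has already loaded that file.

```python
import os
import shutil


def preflight_problems(environ=None, which=shutil.which):
    """Return a list of setup problems; an empty list means none were found."""
    env = os.environ if environ is None else environ
    problems = []
    # shutil.which returns None when the executable is not on PATH.
    if which("ffmpeg") is None:
        problems.append("FFmpeg missing: install it with apt-get or brew")
    # Treat an unset or empty key as missing.
    if not env.get("OPENAI_API_KEY"):
        problems.append("API key missing: set OPENAI_API_KEY in your shell or .env")
    return problems


if __name__ == "__main__":
    for problem in preflight_problems():
        print(problem)
```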
MIT. See LICENSE.
pip install -r requirements-dev.txt
pytest -v