Convert written posts into narrated audio files using open-source text-to-speech.
Narrator App is built for writers who want to offer audio versions of their posts — particularly Substack writers — without needing a recording setup or a paid service. Drop in a Markdown file, run one command, get an MP3.
- Place your post as a `.md` file in `posts/`
- Optionally add intro/outro audio clips to `audio/intro/` and `audio/outro/`
- Run `python narrator.py generate posts/your-post.md`
- Find the finished audio in `audio/output/`
Narration is generated locally using Kokoro-82M, a lightweight open-source TTS model that runs on CPU with no GPU required.
1. Terminal (CLI)
The primary interface. One command, structured JSON output, scriptable and agent-friendly.

    python narrator.py generate posts/your-post.md

2. Browser UI
Run `python narrator_ui.py` to open a local Gradio interface with voice preview, speed and pause sliders, a format picker, and one-click download. No terminal is required after setup.
3. AI Agent
The CLI is designed for agent invocation: all output is JSON on stdout, progress goes to stderr, exit codes are 0 or 1. See the Agent Use section below.
    pip install -r requirements.txt                   # install dependencies
    python narrator.py setup                          # download Kokoro model (~82 MB)
    python narrator.py check                          # verify setup
    python narrator.py generate posts/your-post.md

See wiki/getting-started.md for a full walkthrough, including ffmpeg installation.
- Paragraph pauses: configurable silence between paragraphs for natural pacing
- Speech speed: adjustable playback speed multiplier (0.5–2.0×)
- Loudness normalization: RMS-matches intro and outro to the body audio so volume is consistent across all three segments
- Intro/outro fades: fades out the end of the intro and fades in the start of the outro for smooth transitions
- Volume control: apply a dB gain adjustment to the final output
- Resume-on-failure: pass `--cache-segments` to write each paragraph to disk as it completes; a re-run skips already-finished paragraphs
- Multilingual narration: optional Kokoro v1.0 model adds support for Spanish, French, Hindi, Italian, Japanese, Brazilian Portuguese, and Mandarin Chinese
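
The loudness, fade, and volume features above reduce to a few lines of arithmetic on audio samples. The sketch below is illustrative only and is not taken from Narrator App's source: the function names are hypothetical, samples are assumed to be floats in the range -1.0 to 1.0, and the fades are assumed to be linear (the actual curve may differ).

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of float samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def match_rms(samples, target_rms):
    """Scale samples so their RMS level equals target_rms."""
    current = rms(samples)
    if current == 0.0:
        return list(samples)  # pure silence cannot be scaled to a level
    gain = target_rms / current
    return [s * gain for s in samples]


def apply_gain_db(samples, gain_db):
    """Apply a gain in decibels (amplitude factor = 10 ** (dB / 20))."""
    factor = 10 ** (gain_db / 20)
    return [s * factor for s in samples]


def fade_out(samples, n):
    """Linearly fade the last n samples down to zero."""
    n = min(n, len(samples))
    out = list(samples)
    start = len(out) - n
    for i in range(n):
        out[start + i] *= 1 - (i + 1) / n
    return out


def fade_in(samples, n):
    """Linearly fade the first n samples up from zero."""
    n = min(n, len(samples))
    out = list(samples)
    for i in range(n):
        out[i] *= i / n
    return out
```

Under these assumptions, an intro would be passed through `match_rms(intro, rms(body))` and then `fade_out`, an outro through `match_rms` and then `fade_in`, with `apply_gain_db` applied once to the joined result.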
All settings are controlled in config.yaml. See wiki/configuration.md for the full reference.
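
The resume-on-failure behaviour listed above (write each paragraph to disk as it completes, skip finished paragraphs on a re-run) can be sketched as follows. This is one plausible approach, not Narrator App's actual implementation: the file-naming scheme, the `synthesize` callback, and both function names are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def segment_path(cache_dir, index, text):
    """Hypothetical cache file name: paragraph position plus a hash of its text."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]
    return Path(cache_dir) / f"{index:04d}-{digest}.wav"


def synthesize_with_cache(paragraphs, cache_dir, synthesize):
    """Synthesize each paragraph once, reusing segments left by earlier runs.

    `synthesize` is a stand-in for the TTS call: it takes a paragraph string
    and returns the encoded audio as bytes.
    """
    cache = Path(cache_dir)
    cache.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, text in enumerate(paragraphs):
        path = segment_path(cache, index, text)
        if not path.exists():
            # Write to a temporary name first so that a crash mid-write
            # never leaves a truncated file that looks like a finished one.
            tmp = path.with_suffix(".tmp")
            tmp.write_bytes(synthesize(text))
            tmp.replace(path)
        paths.append(path)
    return paths
```

Hashing the paragraph text into the file name means an edited paragraph is re-synthesized on the next run, while untouched paragraphs are reused.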
Sample narrations across all four accent and gender combinations:
| Voice | Accent | Gender | Sample |
|---|---|---|---|
| `af_bella` | American | Female | sample-audio-bella.mp3 |
| `af_nicole` | American | Female | sample-audio-nicole.mp3 |
| `af_sarah` | American | Female | sample-audio-sarah.mp3 |
| `af_sky` | American | Female | sample-audio-sky.mp3 |
| `am_adam` | American | Male | sample-audio-adam.mp3 |
| `am_michael` | American | Male | sample-audio-michael.mp3 |
| `bf_emma` | British | Female | sample-audio-emma.mp3 |
| `bf_isabella` | British | Female | sample-audio-isabella.mp3 |
| `bm_george` | British | Male | sample-audio-george.mp3 |
| `bm_lewis` | British | Male | sample-audio-lewis.mp3 |
Check out all samples at https://priankr.github.io/narrator/.
| File | Description |
|---|---|
| wiki/getting-started.md | Step-by-step setup and first run walkthrough |
| wiki/configuration.md | Full config.yaml reference, all CLI flags, intro/outro setup |
| wiki/voices.md | Voice list with accent and gender reference |
| wiki/architechture.md | Technical architecture and pipeline design |
| docs/ | GitHub Pages site — rendered docs and voice sample gallery |
Narrator App is designed to be invoked by AI agents as part of larger automation workflows. The CLI is machine-readable: all commands print a single JSON line to stdout, all progress goes to stderr, and exit codes are strictly 0 (success/skipped) or 1 (error).
Quick-start for agents:
    python narrator.py check                                  # verify environment; parse issues[] if exit code is 1
    python narrator.py voices                                 # discover available voices before generating
    python narrator.py generate posts/my-post.md --dry-run    # validate inputs without synthesizing
    python narrator.py generate posts/my-post.md              # run the full pipeline

| Reference | Purpose |
|---|---|
| AGENTS.md | Generic agent quick-start: commands, flags, key rules |
| CLAUDE.md | Claude Code-specific conventions for developer agents |
| wiki/agent-guidelines.md | Full reference: all JSON schemas, error recovery, architecture, coding conventions |
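
The contract described in this section (one JSON line on stdout, progress on stderr, exit code 0 or 1) could be consumed from Python roughly as below. This is a hedged sketch: the helper names are hypothetical, it assumes the working directory is the repository root, and apart from `issues`, which the quick-start mentions, it does not rely on any specific JSON field. See wiki/agent-guidelines.md for the actual schemas.

```python
import json
import subprocess
import sys


def run_cli(argv):
    """Run a command and return (exit_code, parsed_json_or_None).

    stdout is captured and its last non-empty line is parsed as JSON;
    stderr is left attached to the terminal so progress stays visible.
    """
    proc = subprocess.run(argv, stdout=subprocess.PIPE, text=True)
    lines = [line for line in proc.stdout.splitlines() if line.strip()]
    payload = None
    if lines:
        try:
            payload = json.loads(lines[-1])
        except json.JSONDecodeError:
            payload = None
    return proc.returncode, payload


def narrator(*args):
    """Invoke narrator.py with the current interpreter (assumes cwd is the repo root)."""
    return run_cli([sys.executable, "narrator.py", *args])


def check_then_generate(post_path):
    """Follow the quick-start order: check, dry run, then the full pipeline."""
    code, report = narrator("check")
    if code != 0:
        # The quick-start says a failed check reports an issues[] array.
        issues = (report or {}).get("issues", [])
        raise RuntimeError(f"environment not ready: {issues}")
    code, result = narrator("generate", post_path, "--dry-run")
    if code != 0:
        raise RuntimeError(f"dry run failed: {result}")
    return narrator("generate", post_path)
```

Branching on the exit code first and treating the JSON as detail keeps the caller robust even if a payload is missing or malformed.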
MIT — see LICENSE.