Multi-backend TTS tool for creating voice-over audio assets
Install | Quick Start | Backends | CLI Options | Voice Models | Configuration | Docker | License
uv pip install quickcall-voiceover[piper]

uv pip install quickcall-voiceover[kokoro]
# macOS: Also install espeak-ng
brew install espeak-ng

uv pip install quickcall-voiceover[all]
# macOS: Also install espeak-ng for Kokoro
brew install espeak-ng

# Piper (default)
quickcall-voiceover config.json --combine
# Kokoro
quickcall-voiceover -b kokoro -v af_heart config.json --combine

# Piper
quickcall-voiceover -t script.txt -c -o ./output
# Kokoro with af_heart voice
quickcall-voiceover -b kokoro -v af_heart -t script.txt -c -o ./output

quickcall-voiceover --text
quickcall-voiceover -b kokoro --text

quickcall-voiceover --voices              # Piper voices
quickcall-voiceover -b kokoro --voices    # Kokoro voices

| Backend | Model | Quality | Speed | Install |
|---|---|---|---|---|
| Piper | Various | Medium-High | Fast | Default |
| Kokoro | Kokoro-82M | High | Medium | [kokoro] extra |
quickcall-voiceover [CONFIG] [OPTIONS]
Arguments:
CONFIG Path to JSON configuration file
Options:
-b, --backend TTS backend: piper, kokoro (default: piper)
-t, --text [FILE] Text mode: provide .txt file or use interactively
-v, --voice VOICE Voice model (default depends on backend)
-o, --output DIR Output directory (default: ./output)
-m, --models DIR Models directory (default: ./models)
-c, --combine Create a combined audio file from all segments
--combined-name Filename for combined output (default: combined_voiceover.wav)
--voices Show available voice models and exit
-h, --help Show help message
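
When the CLI is driven from a build script, the options above can be assembled programmatically. The following is a minimal sketch, not part of the package: `build_command` is a hypothetical helper that only maps the documented flags to an argument list, and it assumes the `quickcall-voiceover` executable is on `PATH`.

```python
import shlex
from typing import List, Optional


def build_command(
    config: Optional[str] = None,
    *,
    backend: str = "piper",
    voice: Optional[str] = None,
    text_file: Optional[str] = None,
    output_dir: Optional[str] = None,
    combine: bool = False,
) -> List[str]:
    """Assemble a quickcall-voiceover argument list (hypothetical helper)."""
    if backend not in ("piper", "kokoro"):
        raise ValueError(f"unknown backend: {backend!r}")
    cmd = ["quickcall-voiceover"]
    if backend != "piper":  # piper is the documented default, so -b is omitted
        cmd += ["-b", backend]
    if voice:
        cmd += ["-v", voice]
    if config:
        cmd.append(config)  # positional CONFIG goes before -t, as in the examples
    if text_file:
        cmd += ["-t", text_file]
    if output_dir:
        cmd += ["-o", output_dir]
    if combine:
        cmd.append("-c")
    return cmd


cmd = build_command(
    backend="kokoro",
    voice="af_heart",
    text_file="script.txt",
    output_dir="./output",
    combine=True,
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`; using a list rather than a shell string avoids quoting problems with paths that contain spaces.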
# Piper (default backend)
quickcall-voiceover config.json --combine
quickcall-voiceover -t script.txt -v en_US-ryan-high -c
# Kokoro backend
quickcall-voiceover -b kokoro -v af_heart config.json -c
quickcall-voiceover -b kokoro -v am_michael -t script.txt -c
# Use config for voice settings, text file for content
quickcall-voiceover voice_config.json -t script.txt -c
# Interactive text mode
quickcall-voiceover --text
quickcall-voiceover -b kokoro --text

| Model ID | Name | Description |
|---|---|---|
| `en_US-hfc_male-medium` | Male (US) | Clear male voice (default) |
| `en_US-hfc_female-medium` | Female (US) | Clear female voice |
| `en_US-amy-medium` | Amy (US) | Natural female voice |
| `en_US-joe-medium` | Joe (US) | Natural male voice |
| `en_US-ryan-high` | Ryan (US) | High quality male voice |
| `en_US-lessac-high` | Lessac (US) | High quality female voice |
| `en_GB-alan-medium` | Alan (UK) | British male voice |
| `en_GB-alba-medium` | Alba (UK) | British female voice |
| `en_GB-cori-high` | Cori (UK) | High quality British female |

Browse all Piper voices at Piper samples.
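
The Piper model IDs in the table appear to follow a `<language>_<REGION>-<name>-<quality>` pattern. As an illustration only, the sketch below splits an ID into those parts; `parse_piper_model_id` and `PiperVoice` are hypothetical names, and the pattern is inferred from the IDs listed here rather than taken from a specification.

```python
from typing import NamedTuple


class PiperVoice(NamedTuple):
    language: str
    region: str
    name: str
    quality: str


def parse_piper_model_id(model_id: str) -> PiperVoice:
    """Split an ID such as 'en_US-ryan-high' into its parts (inferred pattern)."""
    locale, dash1, rest = model_id.partition("-")
    # The voice name may itself contain underscores (e.g. 'hfc_male'),
    # so the quality is taken from the last dash-separated field.
    name, dash2, quality = rest.rpartition("-")
    language, underscore, region = locale.partition("_")
    if not (dash1 and dash2 and underscore and language and region and name and quality):
        raise ValueError(f"unexpected Piper model ID: {model_id!r}")
    return PiperVoice(language, region, name, quality)


print(parse_piper_model_id("en_US-hfc_male-medium"))
print(parse_piper_model_id("en_GB-cori-high"))
```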
| Voice ID | Name | Description |
|---|---|---|
| `af_heart` | Heart (US Female) | Warm, expressive (default) |
| `af_bella` | Bella (US Female) | Clear American female |
| `af_nicole` | Nicole (US Female) | Professional American female |
| `af_sarah` | Sarah (US Female) | Friendly American female |
| `af_sky` | Sky (US Female) | Bright American female |
| `am_adam` | Adam (US Male) | Clear American male |
| `am_michael` | Michael (US Male) | Professional American male |
| `bf_emma` | Emma (UK Female) | British female |
| `bf_isabella` | Isabella (UK Female) | Elegant British female |
| `bm_george` | George (UK Male) | British male |
| `bm_lewis` | Lewis (UK Male) | Clear British male |

More info at Kokoro-82M on HuggingFace.
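
In the Kokoro table, the two-letter prefix of each voice ID encodes accent and gender: `a`/`b` for US/UK and `f`/`m` for female/male. The sketch below decodes that prefix. `describe_kokoro_voice` is a hypothetical helper, and it covers only the prefixes that appear in the table above; other Kokoro voices may use prefixes it does not know.

```python
ACCENTS = {"a": "US", "b": "UK"}          # first prefix letter
GENDERS = {"f": "Female", "m": "Male"}    # second prefix letter


def describe_kokoro_voice(voice_id: str) -> str:
    """Turn an ID such as 'af_heart' into 'Heart (US Female)'."""
    prefix, underscore, name = voice_id.partition("_")
    if not underscore or len(prefix) != 2 or not name:
        raise ValueError(f"unexpected Kokoro voice ID: {voice_id!r}")
    accent = ACCENTS.get(prefix[0])
    gender = GENDERS.get(prefix[1])
    if accent is None or gender is None:
        raise ValueError(f"unknown prefix in voice ID: {voice_id!r}")
    return f"{name.capitalize()} ({accent} {gender})"


for voice_id in ("af_heart", "am_michael", "bm_george"):
    print(voice_id, "->", describe_kokoro_voice(voice_id))
```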
{
  "voice": {
    "backend": "piper",
    "model": "en_US-hfc_male-medium",
    "length_scale": 1.0,
    "noise_scale": 0.667,
    "noise_w": 0.8,
    "sentence_silence": 0.5
  },
  "output": {
    "format": "wav"
  },
  "segments": [
    {
      "id": "01_intro",
      "text": "Welcome to the demo."
    },
    {
      "id": "02_main",
      "text": "This is the main content."
    }
  ]
}

{
  "voice": {
    "backend": "kokoro",
    "model": "af_heart",
    "speed": 1.0
  },
  "output": {
    "format": "wav"
  },
  "segments": [
    {
      "id": "01_intro",
      "text": "Welcome to the demo."
    }
  ]
}

| Field | Type | Default | Description |
|---|---|---|---|
| `backend` | string | `piper` | TTS backend |
| `model` | string | `en_US-hfc_male-medium` | Piper voice model |
| `length_scale` | float | `1.0` | Speech speed (lower = faster) |
| `noise_scale` | float | `0.667` | Voice variation |
| `noise_w` | float | `0.8` | Phoneme width noise |
| `sentence_silence` | float | `0.5` | Silence between sentences (seconds) |

| Field | Type | Default | Description |
|---|---|---|---|
| `backend` | string | `kokoro` | TTS backend |
| `model` | string | `af_heart` | Kokoro voice ID |
| `speed` | float | `1.0` | Speech speed multiplier |

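
Configuration files with the fields documented above can also be generated from a script. This is a minimal sketch under stated assumptions: `build_config` is a hypothetical helper (not part of the package), the defaults are copied from the two tables, and the `NN_segment` ID scheme is invented for illustration.

```python
import json
from typing import Any, Dict, List

# Per-backend voice defaults, copied from the field tables above.
DEFAULTS: Dict[str, Dict[str, Any]] = {
    "piper": {
        "model": "en_US-hfc_male-medium",
        "length_scale": 1.0,
        "noise_scale": 0.667,
        "noise_w": 0.8,
        "sentence_silence": 0.5,
    },
    "kokoro": {"model": "af_heart", "speed": 1.0},
}


def build_config(backend: str, texts: List[str], **overrides: Any) -> Dict[str, Any]:
    """Build a config dict, rejecting fields the backend does not document."""
    if backend not in DEFAULTS:
        raise ValueError(f"unknown backend: {backend!r}")
    unknown = set(overrides) - set(DEFAULTS[backend])
    if unknown:
        raise ValueError(f"fields not documented for {backend}: {sorted(unknown)}")
    voice = {"backend": backend, **DEFAULTS[backend], **overrides}
    # Hypothetical ID scheme: zero-padded position plus a fixed suffix.
    segments = [
        {"id": f"{index:02d}_segment", "text": text}
        for index, text in enumerate(texts, start=1)
    ]
    return {"voice": voice, "output": {"format": "wav"}, "segments": segments}


config = build_config("kokoro", ["Welcome to the demo."], speed=1.1)
print(json.dumps(config, indent=2))
```

Rejecting undocumented fields catches mistakes such as passing Piper's `length_scale` to the Kokoro backend, which uses `speed` instead.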
from pathlib import Path

from quickcall_voiceover import generate_voiceover, generate_from_text

# Piper (default)
generate_voiceover(
    config_path=Path("config.json"),
    output_dir=Path("./output"),
    combine=True,
)

# Kokoro
generate_voiceover(
    config_path=Path("config.json"),
    output_dir=Path("./output"),
    combine=True,
    backend="kokoro",
    voice="af_heart",
)

# From text lines with Kokoro
lines = [
    "First line of voice over.",
    "Second line of voice over.",
]
generate_from_text(
    lines=lines,
    voice="af_heart",
    output_dir=Path("./output"),
    combine=True,
    backend="kokoro",
)

Build the image:
docker build -t quickcall-voiceover .

Run with a config file:
docker run -v "$(pwd)/config:/config" -v "$(pwd)/output:/app/output" \
  quickcall-voiceover /config/voiceover.json --combine

This project is licensed under Apache-2.0.
Note: This tool depends on the TTS backends it wraps (Piper and Kokoro). These are installed as separate dependencies and are not bundled with this package.
Built with ❤️ by QuickCall
