A command-line interface for text-to-speech with voice cloning, powered by Qwen3-TTS.
Supports PyTorch (CUDA / CPU) and MLX (Apple Silicon) backends with automatic platform detection.
- Voice cloning — clone any voice from a short audio sample
- Streaming playback — hear audio as it generates, no waiting
- Apple Silicon native — MLX backend for fast local inference
- Two model sizes — 1.7B (quality) and 0.6B (speed)
- JSON output — machine-readable output for scripting and pipelines
- Configurable — TOML config files or CLI flags
Requires Python 3.11+
curl -fsSL https://raw.githubusercontent.com/jiweiyuan/ttscli/main/install.sh | bash

With options:
# Specify backend
curl -fsSL https://raw.githubusercontent.com/jiweiyuan/ttscli/main/install.sh | bash -s -- --backend mlx
curl -fsSL https://raw.githubusercontent.com/jiweiyuan/ttscli/main/install.sh | bash -s -- --backend pytorch
# Force using uv
curl -fsSL https://raw.githubusercontent.com/jiweiyuan/ttscli/main/install.sh | bash -s -- --uv

# Basic install
pip install ttscli
# With PyTorch backend (quotes keep zsh from expanding the brackets)
pip install "ttscli[pytorch]"
# With MLX backend (Apple Silicon)
pip install "ttscli[mlx]"
# Development
pip install "ttscli[dev]"

Or install from source:
git clone https://github.com/your-org/ttscli.git
cd ttscli
pip install -e ".[pytorch]"

Verify:
tts --version

Test basic commands:
tts voice list # List voices
tts config show # Show config
tts --help # View help

Command not found: ensure your Python scripts directory is in PATH.
export PATH="$HOME/.local/bin:$PATH"

Or use the module directly:
python -m ttscli --version

Import errors: reinstall dependencies.
pip install -e ".[pytorch]" --force-reinstall

Permission errors: install in user mode.
pip install --user -e .

Uninstall:
pip uninstall ttscli

Add a voice from a recording and its transcript:
tts voice add recording.wav --text "The transcript of the recording" --voice myvoice

Speak with that voice:
tts say "Hello, how are you today?" --voice myvoice

Save to a file without playing:
tts say "Hello world" --voice myvoice -o hello.wav --no-play

Generate speech from text. Plays aloud with streaming by default.
tts say "Text to speak" [OPTIONS]
Options:
  -v, --voice TEXT      Voice name (default: configured default)
  -l, --language TEXT   Language code (default: en)
  -m, --model TEXT      Model size: 1.7B or 0.6B (default: 1.7B)
  -o, --output PATH     Save to WAV file
  -i, --instruct TEXT   Speaking style instruction
  --no-play             Don't play audio, only save to file
  --no-stream           Disable streaming (generate all, then play)
  --seed INT            Random seed for reproducibility

Examples:
tts say "Hello, how are you?" # play aloud
tts say "Good morning" --voice myvoice # use specific voice
tts say "Hello world" -o hello.wav # play and save
tts say "Hello world" -o hello.wav --no-play # save only
tts say "Breaking news!" -i "Speak urgently" # with style instruction
tts say "Slow and steady" --no-stream # generate all, then play

Manage voices and audio samples.
tts voice add <audio_file> [OPTIONS] # Add sample (creates voice if needed)
tts voice list # List all voices
tts voice info [VOICE] # Show voice details
tts voice delete <VOICE> [-y] # Delete a voice
tts voice default [VOICE] # Set/show default voice
tts voice default --unset # Unset default voice

View and update configuration.
tts config show # Show current config
tts config set <key> <value> # Set a config value

Available config keys: data_dir, default_voice, default_language, default_model, output_format, auto_play
Use --json or --output json for machine-readable output:
tts --json voice list
tts --output json say "Hello" --voice myvoice

Configuration is loaded from (in order of priority):
- CLI flags (--data-dir, --output)
- Config files:
  - ./tts.toml (project-local)
  - ~/.config/tts/config.toml
  - ~/.tts/config.toml
Example config.toml:
default_voice = "myvoice"
default_language = "en"
default_model = "1.7B"
output_format = "rich"
data_dir = "~/tts"

All data is stored in ~/tts/ by default:
~/tts/
├── voices.json # Voice definitions and metadata
├── samples/ # Audio samples for voice cloning
└── generations/ # Generated audio files
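For scripting, the `--json` flag described earlier can be driven from Python instead of reading these files directly. The helper below is a minimal sketch: `tts_json` is a hypothetical name, the JSON schema is not documented here, and the sketch assumes the command prints a single JSON document to stdout.

```python
import json
import subprocess


def tts_json(*args, runner=subprocess.run):
    """Run `tts --json <args>` and return the parsed JSON document."""
    result = runner(
        ["tts", "--json", *args],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    # Assumption: stdout holds exactly one JSON document and nothing else.
    return json.loads(result.stdout)
```

Calling `tts_json("voice", "list")` would then return whatever structure the CLI emits; inspect it once before relying on specific keys. The `runner` parameter exists only so the helper can be exercised without the CLI installed.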
- Python 3.11+
- PyTorch backend: torch, transformers, qwen-tts
- MLX backend (Apple Silicon): mlx, mlx-audio
- Audio: soundfile, sounddevice
- System dependency: SoX (required by qwen-tts)

# macOS
brew install sox
# Ubuntu/Debian
sudo apt install sox
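Before installing, it can help to check which of these requirements are already present. The sketch below is not the CLI's own platform detection, which is not documented here. `guess_backend` and `missing_requirements` are hypothetical helpers that assume MLX suits macOS on arm64 and PyTorch suits everything else, and that SoX matters only for the PyTorch backend because qwen-tts is what requires it.

```python
import importlib.util
import platform
import shutil


def guess_backend(system=None, machine=None):
    """Guess a backend: "mlx" on Apple Silicon, otherwise "pytorch"."""
    system = system or platform.system()
    machine = machine or platform.machine()
    return "mlx" if (system, machine) == ("Darwin", "arm64") else "pytorch"


def missing_requirements(backend):
    """List importable modules and system tools that appear to be missing."""
    modules = {"mlx": ["mlx"], "pytorch": ["torch", "transformers"]}[backend]
    modules = modules + ["soundfile", "sounddevice"]
    # find_spec only looks the module up; it does not import it.
    missing = [m for m in modules if importlib.util.find_spec(m) is None]
    # Assumption: SoX is needed only where qwen-tts (PyTorch backend) is used.
    if backend == "pytorch" and shutil.which("sox") is None:
        missing.append("sox (system package)")
    return missing
```

Running `missing_requirements(guess_backend())` gives a list to install, or an empty list when everything checked is in place. Packages whose import names are not stated in this document (qwen-tts, mlx-audio) are deliberately left out of the check.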
MIT — see LICENSE for details.