VoiceCloner

Clone your voice locally with Qwen3-TTS. No data leaves your machine.

Record a few seconds of your voice, then generate speech in your voice from any text. Includes a full audiobook production mode for long-form content.

Features

Quick Clone -- Record or upload 5-10 seconds of your voice, type text, get speech output
Audiobook Mode -- Load an EPUB or paste text, chunk it into chapters, generate and review audio per-chunk, export a stitched audiobook
Voice Library -- Save and manage multiple cloned voices
VoiceDesign -- Optional style descriptions to control tone and delivery
Export -- WAV, MP3, FLAC, M4A output formats
Runs locally -- All inference happens on your machine (MPS/CUDA/CPU)
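
As a rough illustration of what "chunking" means in Audiobook Mode, here is a minimal sketch that splits long text into sentence-aligned chunks small enough to generate and review one at a time. The function name chunk_text and the max_chars parameter are hypothetical and are not taken from the audiobook/ package, which may chunk differently (for example, by chapter first).

```python
import re


def chunk_text(text: str, max_chars: int = 500) -> list[str]:
    """Group sentences into chunks of roughly max_chars characters.

    Hypothetical helper for illustration only. A single sentence longer
    than max_chars is kept whole rather than cut mid-sentence.
    """
    # Split after ., ! or ? followed by whitespace; punctuation stays attached.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars or not current:
            # The sentence fits, or it is the first sentence of a new chunk.
            current = candidate
        else:
            # Adding the sentence would overflow: close the chunk, start a new one.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Splitting at sentence boundaries keeps each generated clip self-contained, so a chunk can be regenerated on its own without leaving a word cut in half at the seam.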

Quick Start (Python)

Requires Python 3.12+.

git clone https://github.com/thegian7/voicecloner.git
cd voicecloner
./start.sh

start.sh creates a virtual environment, installs dependencies, and launches the Gradio UI at http://localhost:7860.

Or manually:

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python app.py

Desktop App (Electron)

The Electron shell wraps the Gradio UI into a native desktop app with automatic Python/venv setup and GPU detection.

npm install
npm run dist:mac    # or dist:win / dist:linux

This bundles a standalone Python runtime (downloaded via npm run download-python) so end users don't need Python installed.

Build targets

Platform	Format	GPU Support
macOS	DMG (arm64 + x64)	MPS (Apple Silicon)
Windows	NSIS installer (x64)	CUDA, CPU
Linux	AppImage (x64)	CUDA, ROCm, CPU

TTS API Server

tts_server.py exposes an OpenAI-compatible /v1/audio/speech endpoint that speaks in your cloned voice, so tools that already target the OpenAI speech API can use it by pointing at the local server.

source .venv/bin/activate
python tts_server.py --voice "MyVoice" --port 8765
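
With the server running, a client call might look like the sketch below. It uses only the Python standard library and follows the request shape of the OpenAI speech API (model, input, voice, response_format). Which of those fields tts_server.py actually reads is an assumption here, and the "tts-1" model name is only a placeholder; the port and voice name come from the command above.

```python
import json
import urllib.request


def build_speech_request(
    text: str,
    voice: str = "MyVoice",
    base_url: str = "http://localhost:8765",
    response_format: str = "wav",
) -> urllib.request.Request:
    """Build a POST request for an OpenAI-style /v1/audio/speech endpoint."""
    payload = {
        "model": "tts-1",  # placeholder; the local server may ignore this field
        "input": text,
        "voice": voice,  # assumed to match the --voice name given to the server
        "response_format": response_format,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def save_speech(text: str, out_path: str) -> None:
    """Send the request and write the returned audio bytes to out_path."""
    request = build_speech_request(text)
    with urllib.request.urlopen(request) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)
```

Calling save_speech("Hello from my cloned voice.", "hello.wav") would write the response to hello.wav, assuming the server returns raw audio bytes in the response body as the OpenAI endpoint does.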

Project Structure

app.py                  # Main Gradio app (Quick Clone + Audiobook + Voices)
core/                   # Shared TTS engine (model loading, generation, audio processing)
audiobook/              # Audiobook-specific logic (chapters, export, state)
electron/               # Electron desktop shell
python/                 # Python files bundled into Electron builds
tts_server.py           # Standalone TTS API server
start.sh                # One-command launcher

Hardware Requirements

Apple Silicon Mac: Works out of the box via MPS. 16GB RAM recommended.
NVIDIA GPU: CUDA 12.4+ with 8GB+ VRAM recommended.
AMD GPU: ROCm 6.2+ supported.
CPU: Works but slow. 32GB RAM recommended.

Models are downloaded from Hugging Face on first run (~2GB).
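
To check which of the backends above your machine will use, a device probe along these lines can help. This is a sketch, not the project's own detection code: pick_device and detect_device are hypothetical names, and the CUDA-then-MPS-then-CPU preference order is an assumption. ROCm builds of PyTorch report through the torch.cuda interface, so AMD GPUs show up as "cuda" here.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Choose a PyTorch device string, preferring GPU backends over CPU."""
    if cuda_available:
        return "cuda"  # NVIDIA CUDA, or AMD ROCm builds of PyTorch
    if mps_available:
        return "mps"  # Apple Silicon via Metal Performance Shaders
    return "cpu"


def detect_device() -> str:
    """Probe the installed PyTorch build; fall back to CPU if torch is missing."""
    try:
        import torch
    except ImportError:
        return "cpu"
    return pick_device(
        torch.cuda.is_available(),
        torch.backends.mps.is_available(),
    )
```

Running print(detect_device()) inside the project's virtual environment shows whether inference will land on a GPU or fall back to the slower CPU path.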

License

MIT
