Chatterbox TTS - Windows WebUI Fork

🚀 This is a fork of the original Resemble AI Chatterbox TTS with enhanced features for Windows users!

🆕 What's New in This Fork

🎯 One-Click Windows Installation

Automatic environment setup with CUDA 11.8 support
No technical knowledge required - just run one-click-installer.bat or install.bat
Instant launcher - double-click run_tts.bat to start

🚀 Smart Long-Form Processing (New Feature!)

What the original didn't have:

Smart Text Chunking: Automatically splits long documents at sentence boundaries
Parallel Processing: Processes multiple chunks simultaneously for up to 4x faster generation
Seamless Audio Stitching: Combines chunks into one cohesive audio file
Progress Tracking: Real-time progress indicators during generation
Voice Consistency: Maintains the same cloned voice across all chunks
Configurable Batch Size: Adjust parallel processing for your hardware

Perfect for: Articles, books, scripts, documentation, or any text longer than 300 characters!
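
The chunking idea above can be pictured with a minimal sketch. This is not the fork's actual code: the function name split_into_chunks, the regex-based sentence split, and the 300-character default are assumptions chosen for illustration only.

```python
import re
from typing import List


def split_into_chunks(text: str, max_chars: int = 300) -> List[str]:
    """Greedily pack whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept whole rather than cut mid-sentence.
    """
    # Split after ., ! or ? followed by whitespace (a simple heuristic, not a full tokenizer).
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow the chunk, so close it and start a new one.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Because chunks only break between sentences, each piece can be synthesized on its own without cutting a word or phrase in half.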

🚀 Quick Start (Windows)

Method 1 - One-click installation (Recommended)

Download one-click-installer.bat and run it.

Method 2 (try this method if one-click installation fails)

git clone https://github.com/Saganaki22/chatterbox-WebUI
cd chatterbox-WebUI
install.bat

Launch WebUI

run_tts.bat

Then open: http://127.0.0.1:7860/

That's it! 🎉
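
If the page does not load, it can help to check whether anything is listening on the port yet. The helper below is a small troubleshooting sketch using only the standard library; the function name webui_is_listening is made up for this example, and the host and port are taken from the address above.

```python
import socket


def webui_is_listening(host: str = "127.0.0.1", port: int = 7860, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False
```

A result of False right after launching usually just means the server is still loading the model.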

I've provided 11 high-quality English voice samples to get you started; find them in /samples/*

📖 About Chatterbox TTS

Made with ♥️ by Resemble AI

We're excited to introduce Chatterbox, Resemble AI's first production-grade open source TTS model. Licensed under MIT, Chatterbox has been benchmarked against leading closed-source systems like ElevenLabs, and is consistently preferred in side-by-side evaluations.

Whether you're working on memes, videos, games, or AI agents, Chatterbox brings your content to life. It's also the first open source TTS model to support emotion exaggeration control, a powerful feature that makes your voices stand out.

If you like the model but need to scale or tune it for higher accuracy, check out our competitively priced TTS service (link). It delivers reliable performance with ultra-low latency of under 200 ms, which makes it ideal for production use in agents, applications, or interactive media.

🔥 Key Features

SoTA zero-shot TTS - State-of-the-art voice cloning
0.5B Llama backbone - Powerful language model foundation
Unique exaggeration/intensity control - First open-source TTS with emotion control
Ultra-stable with alignment-informed inference - Consistent, high-quality output
Trained on 0.5M hours of cleaned data
Watermarked outputs - Built-in PerTh watermarking for responsible AI
Easy voice conversion - Simple reference audio upload
Outperforms ElevenLabs in side-by-side comparisons

💡 Usage Tips

General Use (TTS and Voice Agents):

The default settings (exaggeration=0.5, cfg_weight=0.5) work well for most prompts.
If the reference speaker has a fast speaking style, lowering cfg_weight to around 0.3 can improve pacing.

Expressive or Dramatic Speech:

Try lower cfg_weight values (e.g. ~0.3) and increase exaggeration to around 0.7 or higher.
Higher exaggeration tends to speed up speech; reducing cfg_weight helps compensate with slower, more deliberate pacing.
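
The tips above can be captured as a small settings helper. This is a sketch only: generation_settings is a hypothetical function, and it assumes model.generate accepts exaggeration and cfg_weight as keyword arguments, as the defaults quoted above suggest.

```python
def generation_settings(style: str = "general", fast_speaker: bool = False) -> dict:
    """Return keyword arguments matching the usage tips (hypothetical helper)."""
    if style == "expressive":
        # Higher exaggeration speeds speech up; a lower cfg_weight slows pacing back down.
        return {"exaggeration": 0.7, "cfg_weight": 0.3}
    if style == "general":
        # Defaults; drop cfg_weight to about 0.3 for a fast-talking reference speaker.
        return {"exaggeration": 0.5, "cfg_weight": 0.3 if fast_speaker else 0.5}
    raise ValueError(f"unknown style: {style!r}")


# Usage, assuming `model` is a loaded ChatterboxTTS instance:
# wav = model.generate(text, **generation_settings("expressive"))
```

Treat the numbers as starting points and adjust by ear for each reference voice.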

Long-Form Content:

Use the Long Form Content tab for documents, articles, or scripts
Adjust Batch Size (1-8) based on your GPU memory
Set Chunk Size (100-500 chars) for optimal sentence splitting
Upload reference audio once - it applies to all chunks automatically
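
To show how Batch Size and stitching fit together, here is a rough sketch. It is not the fork's implementation: synthesize_in_batches and fake_tts are hypothetical names, a thread pool stands in for whatever batching the WebUI really does, and plain lists of floats stand in for audio tensors.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def synthesize_in_batches(
    chunks: Sequence[str],
    synthesize: Callable[[str], List[float]],
    batch_size: int = 4,
    gap_samples: int = 0,
) -> List[float]:
    """Run `synthesize` on up to batch_size chunks at a time and join the audio in order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    with ThreadPoolExecutor(max_workers=batch_size) as pool:
        # map() yields results in input order even when workers finish out of order.
        pieces = list(pool.map(synthesize, chunks))
    stitched: List[float] = []
    for index, piece in enumerate(pieces):
        if index > 0:
            stitched.extend([0.0] * gap_samples)  # optional silence between chunks
        stitched.extend(piece)
    return stitched


def fake_tts(chunk: str) -> List[float]:
    """Stand-in for a real model call: one sample per character."""
    return [float(len(chunk))] * len(chunk)
```

A larger batch size keeps more chunks in flight at once, which is why the setting should track available GPU memory.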

🔧 Manual Installation (Advanced Users)

If you prefer manual installation or need to customize:

# Create virtual environment
python -m venv myenv
myenv\Scripts\activate.bat

# Install PyTorch with CUDA 11.8
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

# Install ChatterboxTTS and Gradio
pip install chatterbox-tts gradio

# Run the WebUI
python gradio_tts_app.py

🐍 Python API Usage

import torchaudio as ta
from chatterbox.tts import ChatterboxTTS

model = ChatterboxTTS.from_pretrained(device="cuda")

text = "Ezreal and Jinx teamed up with Ahri, Yasuo, and Teemo to take down the enemy's Nexus in an epic late-game pentakill."
wav = model.generate(text)
ta.save("test-1.wav", wav, model.sr)

# Voice cloning with reference audio
AUDIO_PROMPT_PATH = "YOUR_FILE.wav"
wav = model.generate(text, audio_prompt_path=AUDIO_PROMPT_PATH)
ta.save("test-2.wav", wav, model.sr)
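
The example above hard-codes device="cuda". On a machine without a working CUDA setup, a small fallback like the one below may help. pick_device is a hypothetical helper, and whether the model loads and runs acceptably on CPU depends on the installed chatterbox-tts version, so treat this as a sketch.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch is missing, so there is nothing to probe
    return "cuda" if torch.cuda.is_available() else "cpu"


# Usage: model = ChatterboxTTS.from_pretrained(device=pick_device())
```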

🛠️ WebUI Features

Single Text Tab

Quick TTS generation for short texts
Voice cloning with reference audio upload
Real-time parameter adjustment
Example prompts included

Long Form Content Tab (Fork Enhancement)

Process documents, articles, books
Smart sentence-boundary chunking
Parallel processing for speed
Progress tracking with real-time updates
Automatic audio stitching
Voice consistency across all chunks

Utilities Tab

File cleanup and management
Tips for best results
Parameter explanations

🏗️ Original Repository

This fork is based on the original Resemble AI Chatterbox TTS. All core TTS functionality and model weights remain unchanged - we've simply added Windows-friendly installation and an enhanced web interface with long-form processing capabilities.

🙏 Acknowledgements

🔒 Built-in PerTh Watermarking for Responsible AI

Every audio file generated by Chatterbox includes Resemble AI's PerTh (Perceptual Threshold) Watermarker - imperceptible neural watermarks that survive MP3 compression, audio editing, and common manipulations while maintaining nearly 100% detection accuracy.
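
To check a file for the watermark, something like the following should work. It is a sketch that assumes the perth and librosa packages are installed alongside chatterbox-tts and that perth exposes PerthImplicitWatermarker with a get_watermark method, as the upstream project documents; extract_watermark itself is a hypothetical wrapper.

```python
def extract_watermark(audio_path: str):
    """Return the detector result for a file; upstream describes it as 0.0 (none) or 1.0 (watermarked)."""
    # Imported lazily so this module still loads where the packages are not installed.
    import librosa
    import perth

    audio, sample_rate = librosa.load(audio_path, sr=None)  # keep the file's native sample rate
    watermarker = perth.PerthImplicitWatermarker()
    return watermarker.get_watermark(audio, sample_rate=sample_rate)


# Usage: print(extract_watermark("YOUR_FILE.wav"))
```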

💬 Official Discord

👋 Join us on Discord and let's build something awesome together!

⚠️ Disclaimer

Don't use this model to do bad things. Prompts are sourced from freely available data on the internet.
