# 🎤 ChatterBox Unlimited - Colab Edition

Welcome to **ChatterBox Unlimited**! This notebook provides a powerful Gradio interface for ResembleAI's state-of-the-art ChatterBox TTS model.

## ✨ Features
### 🎤 Core TTS Features
- 🎯 **Zero-shot TTS**: Generate speech from any text
- 🎭 **Voice Cloning**: Clone voices from audio samples
- 🎨 **Emotion Control**: Adjust expressiveness
- 🚀 **GPU Acceleration**: Fast generation with Colab's GPUs
- 🌐 **Web Interface**: Beautiful Gradio UI

### 🆕 Enhanced Features
- 🎭 **Emotion & Style Presets**: 16 pre-configured presets (happy, sad, news anchor, storyteller, etc.)
- 🎚️ **Audio Post-Processing**: Noise reduction, EQ presets, and audio enhancement
- 📦 **Batch Processing**: Process multiple texts with CSV/JSON import and bulk download
- 📝 **Smart Text Processing**: Sound effects ([laugh], [cough]), pronunciation dictionary
- 🔧 **Quality Improvements**: Fixes for audio distortion and voice quality issues

## 🚀 Instructions
1. **Enable GPU**: Go to Runtime → Change runtime type → GPU
2. **Run all cells** below in order
3. **Access the interface** through the Gradio link
4. **Start generating speech**!

---

## 📦 Step 1: Install Dependencies

This cell installs the required packages, including a CUDA 12.1 build of PyTorch and ChatterBox TTS.

In [None]:
# Install UV package manager for faster installations
!pip install uv

# Fix CUDA version conflicts by installing compatible versions
print("🔧 Installing compatible PyTorch versions...")
!uv pip install torch==2.4.0 torchaudio==2.4.0 torchvision==0.19.0 --index-url https://download.pytorch.org/whl/cu121

# Install compatible versions to avoid import errors
print("📦 Installing compatible dependencies...")
!uv pip install "numpy>=1.24.0,<2.0.0"
!uv pip install "transformers>=4.40.0,<5.0.0"
!uv pip install "diffusers>=0.25.0,<1.0.0"

# Install ChatterBox TTS
print("🎤 Installing ChatterBox TTS...")
!uv pip install chatterbox-tts --no-deps

# Install the remaining dependencies (needed because ChatterBox was installed with --no-deps)
!uv pip install gradio soundfile librosa resampy omegaconf conformer resemble-perth s3tokenizer scipy

# Verify installation
print("\n🧪 Testing installation...")
import torch
print(f"🔥 PyTorch version: {torch.__version__}")
print(f"🎮 CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"   GPU: {torch.cuda.get_device_name(0)}")
    print(f"   Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")
    print(f"   CUDA Version: {torch.version.cuda}")

try:
    from chatterbox.tts import ChatterboxTTS
    print("✅ ChatterBox TTS installed successfully!")
except ImportError as e:
    print(f"❌ ChatterBox TTS installation failed: {e}")
    
    # Check if it's a transformers compatibility issue
    if 'is_quanto_available' in str(e):
        print("🔧 Fixing transformers compatibility issue...")
        !uv pip install --upgrade "transformers>=4.45.0"
        print("🔄 Trying again after transformers upgrade...")
        try:
            from chatterbox.tts import ChatterboxTTS
            print("✅ ChatterBox TTS working after transformers upgrade!")
        except ImportError as e3:
            print(f"❌ Still failed after upgrade: {e3}")
            print("🔄 Trying GitHub installation...")
            !uv pip install git+https://github.com/resemble-ai/chatterbox.git
    else:
        print("🔄 Trying GitHub installation...")
        !uv pip install git+https://github.com/resemble-ai/chatterbox.git
    
    # Final check, whichever fallback ran above
    try:
        from chatterbox.tts import ChatterboxTTS
        print("✅ ChatterBox TTS import verified after fallback!")
    except ImportError as e2:
        print(f"❌ Still failed: {e2}")
        print("💡 Try restarting runtime and running again.")
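
To confirm that the pinned ranges from the install cell actually took effect, a small check like the one below can help. It is a minimal sketch: the helper names (`parse_version`, `in_range`, `check_installed`) are made up for this example, and the version parser is a simplification that ignores pre-release and local tags such as `+cu121`.

```python
import re
from importlib import metadata

# Ranges copied from the install cell: (inclusive minimum, exclusive maximum).
PINNED_RANGES = {
    "numpy": ("1.24.0", "2.0.0"),
    "transformers": ("4.40.0", "5.0.0"),
    "diffusers": ("0.25.0", "1.0.0"),
}


def parse_version(version):
    """Turn '2.4.0+cu121' into (2, 4, 0); pre-release and local tags are ignored."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    parts = tuple(int(part) for part in match.group(0).split("."))
    # Pad short versions such as '2' so tuple comparison behaves as expected.
    return parts + (0,) * max(0, 3 - len(parts))


def in_range(version, minimum, maximum):
    """Return True when minimum <= version < maximum."""
    return parse_version(minimum) <= parse_version(version) < parse_version(maximum)


def check_installed(ranges=PINNED_RANGES):
    """Map each package name to 'ok', 'out of range', or 'missing'."""
    report = {}
    for name, (minimum, maximum) in ranges.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if in_range(installed, minimum, maximum) else "out of range"
    return report
```

Calling `check_installed()` after the install cell returns one status per package; anything other than `"ok"` is a hint to restart the runtime and re-run Step 1.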

## 🔧 Step 1.5: Fix Dependencies (If Needed)

If you see the `is_quanto_available` error or other import issues, run this cell to fix them:

In [None]:
# Quick fix for common Colab dependency issues
import subprocess
import sys

def fix_transformers_issue():
    """Fix the is_quanto_available import error"""
    print("🔧 Fixing transformers compatibility issue...")

    # Upgrade transformers (and diffusers, for compatibility) and record whether pip succeeded
    results = [
        subprocess.run([sys.executable, "-m", "pip", "install", "--upgrade", spec], check=False)
        for spec in ("transformers>=4.45.0", "diffusers>=0.30.0")
    ]

    if all(result.returncode == 0 for result in results):
        print("✅ Dependencies upgraded. Please restart runtime if issues persist.")
    else:
        print("❌ pip reported an error; check the output above.")

# Check if we have the import issue
try:
    from transformers.utils import is_quanto_available
    print("✅ Transformers is compatible - no fix needed")
except ImportError as e:
    if 'is_quanto_available' in str(e):
        print("❌ Found is_quanto_available issue - applying fix...")
        fix_transformers_issue()
    else:
        print(f"❌ Different import issue: {e}")

# Test ChatterBox import
try:
    from chatterbox.tts import ChatterboxTTS
    print("✅ ChatterBox TTS import successful!")
except ImportError as e:
    print(f"❌ ChatterBox import failed: {e}")
    print("💡 You may need to restart runtime and re-run the installation.")

## 🔄 Step 2: Restart Runtime (If Needed)

If you see any CUDA or dependency conflicts above, restart the runtime and re-run from Step 1.

In [None]:
# Check if restart is needed
try:
    import torch
    from chatterbox.tts import ChatterboxTTS
    print("✅ All dependencies working correctly!")
    print(f"🎮 GPU: {torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'CPU'}")
except Exception as e:
    print(f"⚠️ Issue detected: {e}")
    print("🔄 Please restart runtime: Runtime -> Restart Runtime")
    print("Then re-run all cells from the beginning.")

## 📥 Step 3: Download Repository

Clone the ChatterBox Unlimited repository with the Gradio interface.

In [None]:
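# Clone the repository (skipped if it was already cloned in this session)
%cd /content
![ -d Chatterbox-Unlimited-Colab ] || git clone https://github.com/Wamp1re-Ai/Chatterbox-Unlimited-Colab.git
%cd Chatterbox-Unlimited-Colab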

# List files
!ls -la

## 🚀 Step 4: Launch ChatterBox TTS Interface

This starts the Gradio web interface. The model weights (~5GB) download automatically the first time the model is loaded.

In [None]:
# Launch the ChatterBox TTS interface
!python main.py --share --port 7860

# Note: The interface will be available at the Gradio public link
# Look for the line that says "Running on public URL: https://xxxxx.gradio.live"
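
If the launch output is long, the public link can be picked out with a regular expression instead of by eye. This is a minimal sketch; `find_public_url` is a hypothetical helper, and the pattern assumes the link has the `https://xxxxx.gradio.live` shape mentioned above.

```python
import re

# Pattern for share links of the form https://xxxxx.gradio.live
PUBLIC_URL_PATTERN = re.compile(r"https://[A-Za-z0-9-]+\.gradio\.live")


def find_public_url(log_text):
    """Return the first Gradio share URL found in the launch output, or None."""
    match = PUBLIC_URL_PATTERN.search(log_text)
    return match.group(0) if match else None
```

Paste the captured cell output into `find_public_url(...)`; a result of `None` means no share link was printed.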

## 🎯 How to Use

Once the interface is running:

### Basic Text-to-Speech
1. **Load Model**: Click "Load ChatterBox TTS Model" (first time may take a few minutes)
2. **Enter Text**: Type your text in the input box
3. **Adjust Settings**:
   - **Exaggeration**: 0.0-1.0 (emotion intensity)
   - **CFG Weight**: 0.0-1.0 (speech pacing)
4. **Generate**: Click "🎤 Generate Speech"
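
The same settings can also be passed to the model from Python, without the web interface. The sketch below follows the usage shown in the upstream ChatterBox README (`ChatterboxTTS.from_pretrained`, `model.generate`, `model.sr`); the helper names `clamp_unit`, `build_generate_kwargs`, and `synthesize` are made up for this example, and the notebook's own `main.py` may call the model differently. Clamping to 0.0-1.0 simply mirrors the slider ranges listed above.

```python
def clamp_unit(value):
    """Clamp a setting into the 0.0-1.0 range listed for the sliders."""
    return max(0.0, min(1.0, float(value)))


def build_generate_kwargs(exaggeration=0.5, cfg_weight=0.5, audio_prompt_path=None):
    """Collect keyword arguments for ChatterboxTTS.generate."""
    kwargs = {
        "exaggeration": clamp_unit(exaggeration),
        "cfg_weight": clamp_unit(cfg_weight),
    }
    # A reference clip is optional; when given, the output mimics that voice.
    if audio_prompt_path is not None:
        kwargs["audio_prompt_path"] = audio_prompt_path
    return kwargs


def synthesize(text, output_path="output.wav", device=None, **settings):
    """Generate speech for `text` and save it as a WAV file."""
    # Imported lazily so the helpers above work without the model installed.
    import torch
    import torchaudio
    from chatterbox.tts import ChatterboxTTS

    if device is None:
        device = "cuda" if torch.cuda.is_available() else "cpu"
    model = ChatterboxTTS.from_pretrained(device=device)
    wav = model.generate(text, **build_generate_kwargs(**settings))
    torchaudio.save(output_path, wav, model.sr)
    return output_path
```

For example, `synthesize("Hello from Colab!", exaggeration=0.7, cfg_weight=0.3)` writes `output.wav` in the current directory. Each call reloads the model, so for repeated use it is better to load it once and reuse it.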

### Voice Cloning
1. **Upload Reference Audio**: 3-10 seconds of clear speech
2. **Enter Text**: What you want the cloned voice to say
3. **Generate**: The output will mimic the reference voice
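
Before uploading, it can help to confirm that a reference clip falls in the 3-10 second range. Here is a minimal sketch using only the standard library; it assumes an uncompressed PCM WAV file (other formats would need a library such as `soundfile`), and `check_reference_clip` is a hypothetical helper, not part of the interface.

```python
import wave

MIN_SECONDS = 3.0
MAX_SECONDS = 10.0


def wav_duration_seconds(path):
    """Return the length of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def check_reference_clip(path):
    """Return (ok, message) for a candidate voice-cloning reference clip."""
    duration = wav_duration_seconds(path)
    if duration < MIN_SECONDS:
        return False, f"Too short: {duration:.1f}s (need at least {MIN_SECONDS:.0f}s)"
    if duration > MAX_SECONDS:
        return False, f"Too long: {duration:.1f}s (trim to {MAX_SECONDS:.0f}s or less)"
    return True, f"OK: {duration:.1f}s"
```

The check covers length only; clarity and background noise still have to be judged by ear.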

### 🆕 Enhanced Features Usage
#### Emotion & Style Presets
- **Single Generation Tab**: Choose from 16 presets (😊 Happy, 📺 News Anchor, 📚 Storyteller, etc.)
- **Smart Suggestions**: The system suggests presets based on your text content
- **Custom Presets**: Create and save your own emotion/style combinations
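
One way to picture presets and suggestions is a lookup table plus a keyword count. The sketch below is purely illustrative: the preset values, the keyword lists, and the names `Preset`, `PRESETS`, and `suggest_preset` are all hypothetical, and the interface's real presets and suggestion logic may work differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Preset:
    exaggeration: float
    cfg_weight: float


# Hypothetical values for illustration; the interface's real presets may differ.
PRESETS = {
    "happy": Preset(0.7, 0.4),
    "sad": Preset(0.3, 0.6),
    "news_anchor": Preset(0.4, 0.6),
    "storyteller": Preset(0.7, 0.3),
    "neutral": Preset(0.5, 0.5),
}

# Hypothetical keyword hints used to guess a fitting preset.
KEYWORDS = {
    "happy": ("great", "wonderful", "congratulations"),
    "sad": ("sorry", "unfortunately", "miss"),
    "news_anchor": ("breaking", "reported", "officials"),
    "storyteller": ("once upon a time", "long ago"),
}


def suggest_preset(text):
    """Return the preset whose keywords occur most often, or 'neutral'."""
    lowered = text.lower()
    scores = {
        name: sum(lowered.count(word) for word in words)
        for name, words in KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "neutral"
```

A custom preset is then just another entry, for example `PRESETS["calm"] = Preset(0.3, 0.5)`. Note that the matching is by substring, so "miss" also matches "mission"; a real implementation would likely tokenize the text first.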

#### Audio Enhancement
- **Enable Audio Enhancement**: Toggle professional audio processing
- **Noise Reduction**: Remove background noise and artifacts (0.0-1.0)
- **EQ Presets**: Choose Vocal, Warm, Bright, or Balanced sound
- **Text Processing**: Automatic pronunciation fixes and sound effects

#### Batch Processing
- **Batch Processing Tab**: Process multiple texts at once
- **Text Lines**: Enter multiple lines, each becomes a separate audio file
- **CSV Upload**: Upload structured data with custom parameters per line
- **Bulk Download**: Get all results in a single ZIP file with reports
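
A batch file with per-line parameters could be read and bundled roughly as follows. This is a sketch under assumptions: the column names `text`, `exaggeration`, and `cfg_weight` are hypothetical (check the repository for the format the interface actually expects), and `bundle_outputs` only illustrates the single-ZIP idea with the standard library.

```python
import csv
import io
import zipfile

# Fallback values for rows that leave a parameter blank.
DEFAULTS = {"exaggeration": 0.5, "cfg_weight": 0.5}


def parse_batch_csv(csv_text):
    """Parse rows with a required 'text' column and optional numeric overrides."""
    jobs = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        text = (row.get("text") or "").strip()
        if not text:
            continue  # skip rows without any text
        job = {"text": text}
        for key, default in DEFAULTS.items():
            raw = (row.get(key) or "").strip()
            job[key] = float(raw) if raw else default
        jobs.append(job)
    return jobs


def bundle_outputs(files, zip_path):
    """Write a {archive_name: bytes} mapping into one ZIP file and return its path."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    return zip_path
```

Each parsed job is a plain dictionary, so it can be handed to whatever generation function is in use, with the resulting audio bytes collected for `bundle_outputs`.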

#### Smart Text Features
- **Sound Effects**: Use [laugh], [giggle], [cough], [pause], [sigh], etc.
- **Pronunciation**: Automatic fixes for AI, GPU, CPU, HTTP, URLs, etc.
- **Numbers**: Converts 2024 → "twenty twenty-four"
- **URLs/Emails**: Converts links to speakable format
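
To give a feel for this kind of preprocessing, here is a minimal sketch of three of the steps: pulling out sound-effect tags, spelling out acronyms, and reading years as two pairs of digits. All names and rules here are assumptions for illustration; the interface's actual text processor is not shown in this notebook and probably covers more cases.

```python
import re

# Tags taken from the list above; how the interface renders them is not shown here.
SOUND_EFFECT_TAG = re.compile(r"\[(laugh|giggle|cough|pause|sigh)\]", re.IGNORECASE)
ACRONYM = re.compile(r"\b(AI|GPU|CPU|HTTP)\b")
# Heuristic: four-digit numbers from 1010 to 2099 whose last two digits are 10-99.
YEAR = re.compile(r"\b(1[0-9]|20)([1-9][0-9])\b")

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def two_digits_to_words(number):
    """Spell out an integer from 0 to 99."""
    if number < 20:
        return ONES[number]
    tens, ones = divmod(number, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]}-{ONES[ones]}"


def split_sound_effects(text):
    """Return (text without tags, list of effect names in order of appearance)."""
    effects = [name.lower() for name in SOUND_EFFECT_TAG.findall(text)]
    cleaned = re.sub(r"\s{2,}", " ", SOUND_EFFECT_TAG.sub("", text)).strip()
    return cleaned, effects


def normalize_text(text):
    """Spell out known acronyms letter by letter and read years as two pairs."""
    text = ACRONYM.sub(lambda m: " ".join(m.group(0)), text)
    return YEAR.sub(
        lambda m: f"{two_digits_to_words(int(m.group(1)))} "
                  f"{two_digits_to_words(int(m.group(2)))}",
        text,
    )
```

The year rule is a heuristic: it would also rewrite "1950 items", and it deliberately leaves years such as 2005 untouched because they need different wording.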

### Tips for Best Results
- **General Use**: Default settings (exaggeration 0.5, CFG weight 0.5) work well
- **Expressive Speech**: Lower CFG (~0.3) + higher exaggeration (~0.7+)
- **Voice Cloning**: Use clear, high-quality reference audio
- **GPU Acceleration**: Colab's GPU will make generation much faster!

---

## 📝 Notes
- Generated audio includes watermarking for responsible AI use
- First model load downloads ~5GB of weights
- Colab sessions time out after inactivity
- For extended use, consider Colab Pro for longer sessions

## 🔗 Links
- [GitHub Repository](https://github.com/Wamp1re-Ai/Chatterbox-Unlimited-Colab)
- [ResembleAI ChatterBox](https://github.com/resemble-ai/chatterbox)
- [Hugging Face Model](https://huggingface.co/ResembleAI/chatterbox)

**Enjoy creating amazing speech with ChatterBox Unlimited! 🎉**