# 🎤 ChatterBox Unlimited - Colab Edition

Welcome to **ChatterBox Unlimited**! This notebook provides a powerful Gradio interface for ResembleAI's state-of-the-art ChatterBox TTS model.

## ✨ Features
- 🎯 **Zero-shot TTS**: Generate speech from any text
- 🎭 **Voice Cloning**: Clone voices from audio samples
- 🎨 **Emotion Control**: Adjust expressiveness
- 🚀 **GPU Acceleration**: Fast generation with Colab's GPUs
- 🌐 **Web Interface**: Beautiful Gradio UI

## 🚀 Instructions
1. **Enable GPU**: Go to Runtime → Change runtime type → GPU
2. **Run all cells** below in order
3. **Access the interface** through the Gradio link
4. **Start generating speech**!
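
Before running the install cell, it can help to confirm that the runtime actually exposes a GPU. The following is a minimal sketch, assuming the NVIDIA `nvidia-smi` tool is on the `PATH` whenever a GPU runtime is attached; the helper name `gpu_visible` is hypothetical and not part of this notebook.

```python
import shutil
import subprocess


def gpu_visible() -> bool:
    """Return True if `nvidia-smi` exists and lists at least one GPU."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False  # no NVIDIA driver tools on PATH: likely a CPU-only runtime
    # `nvidia-smi -L` prints one line per device, e.g. "GPU 0: ..."
    result = subprocess.run([exe, "-L"], capture_output=True, text=True)
    return result.returncode == 0 and "GPU" in result.stdout


print("GPU runtime detected" if gpu_visible() else "No GPU detected: check the runtime type")
```

If this reports no GPU, switch the runtime type before installing, since changing it later resets the session.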

---

## 📦 Step 1: Install Dependencies

This cell installs all required packages, including PyTorch 2.1.0 built for CUDA 11.8 and ChatterBox TTS.

In [None]:
# CRITICAL FIX for PyTorch/TorchVision circular import issue
import os
import sys

# Force clean Python import cache to avoid circular imports
print("🧹 Clearing Python import cache...")
for module in list(sys.modules.keys()):
    if 'torch' in module or 'vision' in module:
        del sys.modules[module]

# Install UV package manager for faster installations
!pip install uv

# CRITICAL: Complete PyTorch ecosystem cleanup
print("🧹 Thorough PyTorch cleanup...")
!pip uninstall torch torchvision torchaudio -y --quiet
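# uv does not prompt for confirmation (so no -y); --system targets Colab's global environment
!uv pip uninstall --system torch torchvision torchaudio --quiet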
!pip uninstall torch torchvision torchaudio -y --quiet  # Double cleanup

# Clear any cached installations
!rm -rf /usr/local/lib/python3.*/dist-packages/torch*
!rm -rf ~/.cache/pip/wheels/*torch*

# Install PyTorch ecosystem with EXACT compatible versions in correct order
print("🔧 Installing PyTorch ecosystem (Step 1/3)...")
!pip install torch==2.1.0 --index-url https://download.pytorch.org/whl/cu118

print("🔧 Installing TorchVision (Step 2/3)...")
!pip install torchvision==0.16.0 --index-url https://download.pytorch.org/whl/cu118

print("🔧 Installing TorchAudio (Step 3/3)...")
!pip install torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu118

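# Drop cached torch modules so later imports pick up the new installation.
# This does not restart the kernel; if imports still fail, use Runtime → Restart Runtime.
print("🔄 Clearing import cache after PyTorch installation...")
for module in list(sys.modules.keys()):
    if 'torch' in module:
        # pop() with a default never raises, so no try/except is needed
        sys.modules.pop(module, None)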

# Install compatible numpy version
print("📊 Installing NumPy...")
!pip install "numpy>=1.21.0,<1.25.0"

# Install transformers with proper version that includes is_quanto_available
print("🤖 Installing Transformers with proper version...")
!pip install "transformers>=4.46.0" "accelerate>=0.21.0"

# Install other ML dependencies with compatible versions
print("🧠 Installing ML dependencies...")
!pip install "diffusers>=0.25.0" "omegaconf>=2.3.0"

# Install audio processing dependencies
print("🎵 Installing audio processing libraries...")
!pip install soundfile librosa resampy

# Install Gradio for UI
print("🌐 Installing Gradio...")
!pip install "gradio>=4.0.0"

# Install ResembleAI specific dependencies
print("🎤 Installing ResembleAI dependencies...")
!pip install conformer resemble-perth s3tokenizer

# Install ChatterBox TTS: try PyPI first, then fall back to GitHub
print("🎯 Installing ChatterBox TTS...")
import subprocess

def pip_install(spec):
    """Run pip in the notebook's own interpreter; return True on success."""
    result = subprocess.run([sys.executable, '-m', 'pip', 'install', spec],
                            capture_output=True, text=True)
    return result.returncode == 0

if pip_install('chatterbox-tts'):
    print("✅ ChatterBox TTS installed from PyPI")
else:
    print("⚠️ PyPI installation failed, trying GitHub...")
    if pip_install('git+https://github.com/resemble-ai/chatterbox.git'):
        print("✅ ChatterBox TTS installed from GitHub")
    else:
        print("❌ ChatterBox TTS could not be installed from PyPI or GitHub")

print("\n🎉 Installation complete! Now testing compatibility...")

## 🧪 Step 2: Test Installation

This cell imports each dependency, prints its version, and runs a small functional check for PyTorch, TorchVision, Transformers, and ChatterBox TTS.
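
As a complementary check, the installed versions can be compared against the pins from Step 1 without importing the packages at all. This is a rough sketch using the standard library's `importlib.metadata`; the helper names `base_version` and `check_pins` are hypothetical. Wheels from the cu118 index typically report a local build tag such as `2.1.0+cu118`, which the sketch strips before comparing.

```python
from importlib import metadata

# Pins taken from Step 1 of this notebook.
EXPECTED = {"torch": "2.1.0", "torchvision": "0.16.0", "torchaudio": "2.1.0"}


def base_version(version: str) -> str:
    """Strip a local build tag such as '+cu118' from a version string."""
    return version.split("+", 1)[0]


def check_pins(expected: dict) -> dict:
    """Map each package to 'ok', 'missing', or the mismatched installed version."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = base_version(metadata.version(name))
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if installed == wanted else installed
    return report


for name, status in check_pins(EXPECTED).items():
    print(f"{name}: {status}")
```

Reading metadata avoids touching `sys.modules`, so this check cannot itself trigger an import error.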

In [None]:
# Test installation with proper error handling
import sys
import importlib

def safe_import_test(module_name, test_func=None):
    """Safely test module import with optional functionality test"""
    try:
        # Clear any cached imports first
        if module_name in sys.modules:
            del sys.modules[module_name]
        
        module = importlib.import_module(module_name)
        version = getattr(module, '__version__', 'unknown')
        
        if test_func:
            test_func(module)
        
        print(f"✅ {module_name}: {version}")
        return True, module
    except Exception as e:
        print(f"❌ {module_name}: {e}")
        return False, None

print("🔍 Testing all dependencies...")
print("=" * 60)

all_good = True

# Test PyTorch
def test_torch(torch_module):
    # Test basic tensor operations
    x = torch_module.tensor([1.0, 2.0, 3.0])
    y = x * 2
    assert y.sum().item() == 12.0

success, torch = safe_import_test("torch", test_torch)
all_good &= success

if success:
    print(f"   🎮 CUDA available: {torch.cuda.is_available()}")
    if torch.cuda.is_available():
        print(f"   📱 GPU: {torch.cuda.get_device_name(0)}")
        print(f"   💾 GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")
        print(f"   🔧 CUDA Version: {torch.version.cuda}")

# Test TorchVision with specific compatibility check
def test_torchvision(tv_module):
    # Test the problematic NMS operator that was causing circular imports
    import torch
    boxes = torch.tensor([[0, 0, 1, 1], [0.5, 0.5, 1.5, 1.5]], dtype=torch.float32)
    scores = torch.tensor([0.9, 0.8], dtype=torch.float32)
    keep = tv_module.ops.nms(boxes, scores, 0.5)
    print("   🧪 NMS operator test: PASSED")

success, torchvision = safe_import_test("torchvision", test_torchvision)
all_good &= success

# Test Transformers with the specific function that was failing
def test_transformers(tf_module):
    from transformers.utils import is_quanto_available
    print("   🔧 is_quanto_available function: FOUND")

success, transformers = safe_import_test("transformers", test_transformers)
all_good &= success

# Test other critical dependencies
for module in ["numpy", "gradio", "soundfile", "librosa"]:
    success, _ = safe_import_test(module)
    all_good &= success

# Test ChatterBox TTS
def test_chatterbox(cb_module):
    # Just test that we can access the main class
    tts_class = getattr(cb_module.tts, 'ChatterboxTTS')
    print("   🎤 ChatterboxTTS class: ACCESSIBLE")

success, chatterbox = safe_import_test("chatterbox", test_chatterbox)
all_good &= success

print("=" * 60)
if all_good:
    print("🎉 ALL TESTS PASSED! No circular import issues detected.")
    print("🚀 You can proceed to Step 3.")
else:
    print("⚠️ Some tests failed. Recommended actions:")
    print("1. 🔄 Runtime → Restart Runtime")
    print("2. 🔄 Re-run Step 1")
    print("3. 📧 If issues persist, check your Colab runtime type (should be GPU)")

print(f"\n📊 Python version: {sys.version}")
print(f"🔍 Total modules loaded: {len(sys.modules)}")

## 📥 Step 3: Download Repository

Clone the ChatterBox Unlimited repository with the Gradio interface.
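
Re-running this step in the same session makes `git clone` fail because the directory already exists. One way to make the step repeatable is sketched below; `repo_dir_name` and `clone_if_missing` are hypothetical helpers, not part of the repository.

```python
import os
import subprocess

REPO_URL = "https://github.com/Wamp1re-Ai/Chatterbox-Unlimited-Colab.git"


def repo_dir_name(url: str) -> str:
    """Directory name that `git clone` uses by default for this URL."""
    name = url.rstrip("/").rsplit("/", 1)[-1]
    return name[:-4] if name.endswith(".git") else name


def clone_if_missing(url: str, parent: str = ".") -> str:
    """Clone `url` under `parent` unless the target directory already exists."""
    target = os.path.join(parent, repo_dir_name(url))
    if not os.path.isdir(target):
        subprocess.run(["git", "clone", url, target], check=True)
    return target
```

In a notebook cell, `os.chdir(clone_if_missing(REPO_URL))` would then play the role of the `git clone` and `%cd` pair.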

In [None]:
# Clone the repository
!git clone https://github.com/Wamp1re-Ai/Chatterbox-Unlimited-Colab.git
%cd Chatterbox-Unlimited-Colab

# List files
!ls -la

## 🚀 Step 4: Launch ChatterBox TTS Interface

This will start the Gradio web interface. The model will automatically download (~5GB) on first run.
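
When sharing is enabled, Gradio prints a line of the form `Running on public URL: https://xxxxx.gradio.live`. If the launcher's output is captured to a string or log file, the link can be pulled out with a regular expression. This is a small sketch; `find_public_url` is a hypothetical helper, and the sample text only imitates the shape of the startup log.

```python
import re
from typing import Optional

# Matches the announcement line and captures the URL up to the next whitespace.
PUBLIC_URL = re.compile(r"Running on public URL:\s*(https://\S+)")


def find_public_url(log_text: str) -> Optional[str]:
    """Return the first public Gradio URL found in the log text, if any."""
    match = PUBLIC_URL.search(log_text)
    return match.group(1) if match else None


sample = (
    "Running on local URL:  http://127.0.0.1:7860\n"
    "Running on public URL: https://xxxxx.gradio.live\n"
)
print(find_public_url(sample))
```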

In [None]:
# Launch the ChatterBox TTS interface
!python main.py --share --port 7860

# Note: The interface will be available at the Gradio public link
# Look for the line that says "Running on public URL: https://xxxxx.gradio.live"

## 🎯 How to Use

Once the interface is running:

### Basic Text-to-Speech
1. **Load Model**: Click "Load ChatterBox TTS Model" (first time may take a few minutes)
2. **Enter Text**: Type your text in the input box
3. **Adjust Settings**:
   - **Exaggeration**: 0.0-1.0 (emotion intensity)
   - **CFG Weight**: 0.0-1.0 (speech pacing)
4. **Generate**: Click "🎤 Generate Speech"
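
The same steps can be sketched in code for readers who prefer to bypass the UI. The calls below (`ChatterboxTTS.from_pretrained`, `generate` with `exaggeration`, `cfg_weight`, and `audio_prompt_path`, and the `sr` attribute) follow the upstream Chatterbox README as an assumption and should be checked against the installed version. The wrapper `synthesize` is hypothetical, and its 0.0-1.0 range check mirrors this interface's sliders rather than any limit enforced by the library.

```python
def synthesize(text, out_path="output.wav", exaggeration=0.5, cfg_weight=0.5,
               audio_prompt_path=None, device="cuda"):
    """Generate speech with ChatterboxTTS and save it as a WAV file."""
    # Validate against the slider ranges shown in the interface.
    for name, value in (("exaggeration", exaggeration), ("cfg_weight", cfg_weight)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be within 0.0-1.0, got {value}")

    # Imported lazily so the function can be defined before the packages load.
    import torchaudio
    from chatterbox.tts import ChatterboxTTS

    model = ChatterboxTTS.from_pretrained(device=device)  # downloads weights on first use
    kwargs = {"exaggeration": exaggeration, "cfg_weight": cfg_weight}
    if audio_prompt_path is not None:
        kwargs["audio_prompt_path"] = audio_prompt_path  # optional clip for voice cloning
    wav = model.generate(text, **kwargs)
    torchaudio.save(out_path, wav, model.sr)
    return out_path
```

A call such as `synthesize("Hello from Colab!")` would load the model each time; for repeated use, loading it once and reusing the object is the better design.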

### Voice Cloning
1. **Upload Reference Audio**: 3-10 seconds of clear speech
2. **Enter Text**: What you want the cloned voice to say
3. **Generate**: The output will mimic the reference voice
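
To check that a reference clip falls in the suggested 3-10 second range before uploading it, the duration can be read from the file header. The sketch below uses the standard library's `wave` module, so it only handles uncompressed PCM WAV files; the helper names are hypothetical.

```python
import wave

MIN_SECONDS, MAX_SECONDS = 3.0, 10.0  # range suggested in the steps above


def wav_duration_seconds(path: str) -> float:
    """Duration of a PCM WAV file: frame count divided by sample rate."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def is_suitable_reference(path: str) -> bool:
    """True if the clip length is within the suggested range."""
    return MIN_SECONDS <= wav_duration_seconds(path) <= MAX_SECONDS
```

Other formats such as MP3 or FLAC would need an audio library instead, for example `soundfile`, which Step 1 installs.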

### Tips for Best Results
- **General Use**: Default settings (0.5, 0.5) work well
- **Expressive Speech**: Lower CFG (~0.3) + higher exaggeration (~0.7+)
- **Voice Cloning**: Use clear, high-quality reference audio
- **GPU Acceleration**: Colab's GPU will make generation much faster!
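
The tips above can be captured as named presets so that a style is chosen by name rather than by remembering slider values. This is only an illustrative sketch: the preset names and the `settings_for` helper are hypothetical, and 0.7 is used as the low end of the "~0.7+" suggestion.

```python
# Values taken from the tips above: (exaggeration, CFG weight).
PRESETS = {
    "general": {"exaggeration": 0.5, "cfg_weight": 0.5},
    "expressive": {"exaggeration": 0.7, "cfg_weight": 0.3},
}


def settings_for(style: str) -> dict:
    """Return a copy of the named preset, falling back to 'general'."""
    return dict(PRESETS.get(style, PRESETS["general"]))


print(settings_for("expressive"))
```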

---

## 📝 Notes
- Generated audio includes watermarking for responsible AI use
- First model load downloads ~5GB of weights
- The Colab session will time out after a period of inactivity
- For extended use, consider Colab Pro for longer sessions
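
Because the first model load downloads roughly 5GB, it may be worth confirming that the runtime has enough free disk space beforehand. A minimal sketch with the standard library follows; the helper names and the use of the current directory as the probe path are assumptions, since the actual cache location depends on how the weights are fetched.

```python
import shutil

REQUIRED_GB = 5.0  # approximate download size stated in the notes above


def free_gb(path: str = ".") -> float:
    """Free space, in gigabytes, on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free / 1e9


def has_room_for_model(path: str = ".", required_gb: float = REQUIRED_GB) -> bool:
    """True if at least `required_gb` gigabytes are free."""
    return free_gb(path) >= required_gb


print(f"Free disk space: {free_gb():.1f} GB")
```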

## 🔧 Troubleshooting

### Fixed Issues in This Version

**✅ FIXED: `partially initialized module 'torchvision' has no attribute 'extension'`**
- **Root Cause**: Circular import between PyTorch and TorchVision
- **Solution**: Complete cleanup + sequential installation + import cache clearing
- **Prevention**: Uses compatible PyTorch 2.1.0 + TorchVision 0.16.0

**✅ FIXED: `cannot import name 'is_quanto_available' from 'transformers.utils'`**
- **Root Cause**: Outdated transformers version
- **Solution**: Updated to transformers>=4.46.0
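
To confirm that the installed transformers release meets the 4.46.0 minimum, the version string can be compared numerically. This is a simplified sketch: `version_tuple` and `meets_minimum` are hypothetical helpers that read only the leading numeric components, so a pre-release such as `4.46.0rc1` is treated as older than `4.46.0`.

```python
from importlib import metadata


def version_tuple(version: str) -> tuple:
    """Convert '4.46.0' or '2.1.0+cu118' to a tuple of leading integers."""
    parts = []
    for piece in version.split("+", 1)[0].split("."):
        if not piece.isdigit():
            break  # stop at the first non-numeric component, e.g. '0rc1'
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(package: str, minimum: str) -> bool:
    """True if `package` is installed at version `minimum` or newer."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return False
    return version_tuple(installed) >= version_tuple(minimum)


print("transformers OK:", meets_minimum("transformers", "4.46.0"))
```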

### Remaining Troubleshooting

**❌ ChatterBox TTS import fails**
- **Solution**: The notebook tries both PyPI and GitHub installation
- **Manual fix**: `!pip install git+https://github.com/resemble-ai/chatterbox.git`

**⚠️ Slow generation**
- **Check**: Make sure GPU is enabled (Runtime → Change runtime type → GPU)
- **Verify**: Step 2 should show your GPU name

**🔄 General troubleshooting steps:**
1. Restart runtime: Runtime → Restart Runtime
2. Re-run all cells from Step 1
3. If Step 2 shows issues, repeat steps 1-2
4. Check that GPU is enabled in runtime settings

## 🔗 Links
- [GitHub Repository](https://github.com/Wamp1re-Ai/Chatterbox-Unlimited-Colab)
- [ResembleAI ChatterBox](https://github.com/resemble-ai/chatterbox)
- [Hugging Face Model](https://huggingface.co/ResembleAI/chatterbox)

**Enjoy creating amazing speech with ChatterBox Unlimited! 🎉**