
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Wamp1re-Ai/index-tts/blob/feat/english-colab-notebook/IndexTTS_Colab_EN.ipynb)
[![Open In Kaggle](https://kaggle.com/static/images/open-in-kaggle.svg)](https://kaggle.com/kernels/welcome?src=https://github.com/Wamp1re-Ai/index-tts/blob/feat/english-colab-notebook/IndexTTS_Colab_EN.ipynb)

# IndexTTS: Zero-Shot Text-To-Speech on Colab/Kaggle (English UI)

This notebook allows you to run the IndexTTS system in Google Colab or Kaggle. It will clone the repository, install dependencies, download models, and start the Gradio web UI. The UI will be in English.

**Features:**
- ✅ Works on both Google Colab and Kaggle
- ✅ English UI with full internationalization support
- ✅ Automatic environment detection and optimization
- ✅ Fast dependency installation with UV package manager
- ✅ Pre-configured model downloads from Hugging Face
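The environment detection mentioned above can be sketched with environment variables instead of filesystem paths, since a path-based check can report more than one platform when the same directory exists on both. This is a minimal sketch, not the notebook's own logic: the helper name `detect_environment` is hypothetical, and the variable names `KAGGLE_KERNEL_RUN_TYPE` and `COLAB_RELEASE_TAG` are assumptions about what each platform exports.

```python
import os
import sys


def detect_environment(environ=None, modules=None):
    """Return 'kaggle', 'colab', or 'local' based on process-level hints."""
    environ = os.environ if environ is None else environ
    modules = sys.modules if modules is None else modules
    # Assumption: Kaggle kernels export KAGGLE_KERNEL_RUN_TYPE.
    if "KAGGLE_KERNEL_RUN_TYPE" in environ:
        return "kaggle"
    # Assumption: Colab runtimes export COLAB_RELEASE_TAG. The imported
    # google.colab module is the hint the notebook itself relies on.
    if "COLAB_RELEASE_TAG" in environ or "google.colab" in modules:
        return "colab"
    return "local"


print(detect_environment())
```

Returning a single label rather than two independent booleans means the later cells can never see both platforms reported as active at once.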

In [1]:
# Environment Detection and Setup
import os
import sys

# Detect environment
IN_COLAB = 'google.colab' in sys.modules
IN_KAGGLE = 'kaggle_secrets' in sys.modules or os.path.exists('/kaggle')

print("Environment detected:")
print(f"- Google Colab: {IN_COLAB}")
print(f"- Kaggle: {IN_KAGGLE}")

# Clone the IndexTTS repository
!git clone https://github.com/Wamp1re-Ai/index-tts.git
%cd index-tts

# Switch to the English support branch
!git checkout feat/english-colab-notebook

!ls -la

Environment detected:
- Google Colab: True
- Kaggle: True
Cloning into 'index-tts'...
remote: Enumerating objects: 557, done.
remote: Counting objects: 100% (137/137), done.
remote: Compressing objects: 100% (11/11), done.
remote: Total 557 (delta 127), reused 126 (delta 126), pack-reused 420 (from 1)
Receiving objects: 100% (557/557), 1.84 MiB | 14.62 MiB/s, done.
Resolving deltas: 100% (324/324), done.
/content/index-tts
Branch 'feat/english-colab-notebook' set up to track remote branch 'feat/english-colab-notebook' from 'origin'.
Switched to a new branch 'feat/english-colab-notebook'
total 212
drwxr-xr-x 8 root root  4096 Oct  7 15:54 .
drwxr-xr-x 1 root root  4096 Oct  7 15:54 ..
drwxr-xr-x 2 root root  4096 Oct  7 15:54 assets
drwxr-xr-x 2 root root  4096 Oct  7 15:54 checkpoints
-rw-r--r-- 1 root root  2212 Oct  7 15:54 DISCLAIMER
-rw-r--r-- 1 root root  5804 Oct  7 15:54 ENGLISH_SUPPORT_SUMMARY.md
drwxr-xr-x 8 root root  4096 Oct  7 15:54 .git
-rw-r--r-- 1 root root 

## Install Dependencies

This step installs `ffmpeg` (required for audio processing) and all the Python packages listed in `requirements.txt`. The installation is optimized for both Colab and Kaggle environments.
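The "try one installer, fall back to another" idea can be sketched in plain Python by checking each command's exit code. This is an illustrative sketch only: the helper name `run_first_successful` is hypothetical, and the two commands in the demo are stand-ins rather than real installers.

```python
import subprocess
import sys


def run_first_successful(commands):
    """Run each command in order; return the index of the first that exits 0, or None."""
    for index, command in enumerate(commands):
        result = subprocess.run(command)
        if result.returncode == 0:
            return index
        print(f"Command failed with exit code {result.returncode}: {' '.join(command)}")
    return None


# Stand-in commands for illustration: the first fails, the second succeeds.
failing = [sys.executable, "-c", "import sys; sys.exit(3)"]
passing = [sys.executable, "-c", "print('installed')"]
print(run_first_successful([failing, passing]))
```

In this notebook the list would hold `["uv", "pip", "install", "-r", "requirements.txt", "--system"]` followed by `[sys.executable, "-m", "pip", "install", "-r", "requirements.txt"]`. Note that output from `subprocess.run` may not appear in the cell output on every notebook frontend.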

In [None]:
# Install system dependencies
if IN_COLAB or IN_KAGGLE:
    !apt-get update && apt-get install -y ffmpeg
else:
    print("Please ensure ffmpeg is installed on your system")

# Install uv for faster package installation
!pip install uv

# Install Python dependencies using uv
# Note: WeTextProcessing is required for text normalization
# pynini might have installation issues on some platforms
# `!` shell commands never raise Python exceptions, so check IPython's
# `_exit_code` (set after each `!` command) instead of using try/except.
!uv pip install -r requirements.txt --system
if _exit_code == 0:
    print("✅ Successfully installed requirements.txt")
else:
    print(f"❌ Error installing requirements.txt (exit code {_exit_code})")
    print("Trying alternative installation method...")
    !pip install -r requirements.txt

# Install WeTextProcessing separately for better error handling
!uv pip install WeTextProcessing --system
if _exit_code == 0:
    print("✅ Successfully installed WeTextProcessing")
else:
    print(f"❌ Error installing WeTextProcessing (exit code {_exit_code})")
    print("Trying with pip...")
    !pip install WeTextProcessing

Hit:1 https://cli.github.com/packages stable InRelease
Get:2 https://cloud.r-project.org/bin/linux/ubuntu jammy-cran40/ InRelease [3,632 B]
Get:3 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64  InRelease [1,581 B]
Hit:4 http://archive.ubuntu.com/ubuntu jammy InRelease
Get:5 http://security.ubuntu.com/ubuntu jammy-security InRelease [129 kB]
Get:6 https://r2u.stat.illinois.edu/ubu

## Download Models

The following commands will download the necessary model checkpoints from Hugging Face. This works on both Colab and Kaggle environments.

**Note:** The models are approximately 2GB in total. Download time depends on your internet connection.
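Because the download is large, an interrupted transfer can leave a file missing or empty. A minimal sketch of a stricter check, treating zero-byte files as incomplete, might look like this. The helper name `find_incomplete` is hypothetical, and the file names in the demo are taken from the download command in the cell below.

```python
from pathlib import Path


def find_incomplete(directory, required):
    """Return the names in `required` that are missing or zero bytes in `directory`."""
    base = Path(directory)
    incomplete = []
    for name in required:
        path = base / name
        # A file that exists but has no content is treated like a missing one.
        if not path.is_file() or path.stat().st_size == 0:
            incomplete.append(name)
    return incomplete


print(find_incomplete("checkpoints", ["gpt.pth", "dvae.pth", "bpe.model"]))
```

An empty list means every listed file exists and has content. It does not prove the files are uncorrupted, which would need a checksum comparison.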

In [None]:
# Ensure huggingface_hub is installed
!pip install huggingface_hub

# Download models using huggingface-cli
print("📥 Downloading IndexTTS models from Hugging Face...")
print("This may take a few minutes depending on your connection.")

!huggingface-cli download IndexTeam/Index-TTS \
    bigvgan_discriminator.pth \
    bigvgan_generator.pth \
    bpe.model \
    dvae.pth \
    gpt.pth \
    unigram_12000.vocab \
    --repo-type model \
    --local-dir checkpoints \
    --local-dir-use-symlinks False

print("✅ Model download completed!")

# Verify checkpoint files
print("\n📁 Verifying downloaded files:")
!ls -l checkpoints/

# Check if all required files are present
import os
required_files = [
    'bigvgan_discriminator.pth',
    'bigvgan_generator.pth',
    'bpe.model',
    'dvae.pth',
    'gpt.pth',
    'unigram_12000.vocab',
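    'config.yaml'  # not in the download list above; expected to come from the cloned repository's checkpoints/ directory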
]

missing_files = []
for file in required_files:
    if not os.path.exists(f'checkpoints/{file}'):
        missing_files.append(file)

if missing_files:
    print(f"⚠️  Missing files: {missing_files}")
else:
    print("✅ All required model files are present!")

## Run the Gradio Web UI with Public Access

This starts the Gradio web interface with the English UI and tries to expose it through a public tunnel. The cell below attempts an ngrok tunnel first and also starts a Cloudflare tunnel as a backup.
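A tunnel is only useful once the web server is accepting connections. As an alternative to waiting a fixed number of seconds, a small polling helper can wait until the port is open. This is a sketch under assumptions: the helper name `wait_for_port` is hypothetical, and port 7860 is the port used by the launch cell.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=0.5):
    """Return True once a TCP connection to (host, port) succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False


print(wait_for_port("127.0.0.1", 7860, timeout=2.0))
```

A background thread could call this before launching the tunnel process, so the tunnel starts as soon as the UI is reachable rather than after a guessed delay.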

### 🌐 Public URL Options:

**For Colab:**
- 🔗 **Primary**: Colab's built-in public URL (ending with `gradio.live`)
- 🌍 **Backup**: Cloudflare tunnel URL (ending with `trycloudflare.com`)

**For Kaggle:**
- 🌍 **Primary**: Cloudflare tunnel URL (ending with `trycloudflare.com`)
- 📱 **Fallback**: Kaggle's output panel
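Since each public URL is recognized by its domain ending, picking it out of a tunnel's log output can be sketched as a small parsing function. The helper name `extract_tunnel_url` is hypothetical, and the log lines in the demo are made up for illustration. Real tunnel output may be formatted differently.

```python
import re


def extract_tunnel_url(line, domain_suffix):
    """Return the first http(s) URL in `line` whose host ends with `domain_suffix`, else None."""
    for url in re.findall(r"https?://[^\s|]+", line):
        # Compare against the host only, so a matching path or query is ignored.
        host = url.split("://", 1)[1].split("/", 1)[0]
        if host.endswith(domain_suffix):
            return url
    return None


# Made-up log lines for illustration only.
print(extract_tunnel_url("INF |  https://example-words.trycloudflare.com  |", "trycloudflare.com"))
print(extract_tunnel_url("see https://developers.cloudflare.com/ for docs", "trycloudflare.com"))
```

Matching on the host rather than on the whole line avoids picking up documentation links that merely mention the provider's name.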

### ✨ Features:
- ✅ **English UI** with full internationalization support
- ✅ **Public URLs** accessible from anywhere
- ✅ **No registration required** for Cloudflare tunnels
- ✅ **Automatic setup** - just run the cell below

### 🔒 Security Note:
The public URLs are temporary and will expire when the notebook session ends. Do not share sensitive information through these interfaces.

In [None]:
# Setup public tunnel access with ngrok (primary) and Cloudflare (fallback)
import os
import subprocess
import threading
import time

# Set environment variables for optimal performance
os.environ['GRADIO_SERVER_NAME'] = '0.0.0.0'
os.environ['GRADIO_SERVER_PORT'] = '7860'

# Ensure English language is set
os.environ['LANG'] = 'en_US.UTF-8'
os.environ['LC_ALL'] = 'en_US.UTF-8'

print("🚀 Starting IndexTTS Web UI with English interface...")
print("🌐 Setting up reliable public URL access...")
print("🎯 Using ngrok (primary) + Cloudflare (fallback) for maximum reliability")

# Setup ngrok tunnel (more reliable)
def setup_ngrok_tunnel():
    try:
        # Install ngrok
        print("📦 Installing ngrok for reliable public URL access...")
        !wget -q https://bin.equinox.io/c/bNyj1mQVY4c/ngrok-v3-stable-linux-amd64.tgz
        !tar xzf ngrok-v3-stable-linux-amd64.tgz
        !mv ngrok /usr/local/bin/ngrok
        !chmod +x /usr/local/bin/ngrok
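        # `!` commands do not raise on failure, so confirm the binary is actually on PATH
        import shutil
        if shutil.which('ngrok') is None:
            raise RuntimeError("ngrok binary not found after installation")
        print("✅ ngrok installed successfully")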

        # Start ngrok tunnel
        def start_ngrok():
            time.sleep(8)  # Wait for Gradio to start
            try:
                print("\n🚀 Starting ngrok tunnel...")
                print("⏳ This usually takes 10-20 seconds...")

                # Merge stderr into stdout so an unread stderr pipe cannot fill up and stall ngrok
                process = subprocess.Popen([
                    'ngrok', 'http', '7860', '--log=stdout'
                ], stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True, bufsize=1)

                # Monitor output for tunnel URL
                start_time = time.time()
                timeout = 45

                while time.time() - start_time < timeout:
                    line = process.stdout.readline()
                    if line:
                        line = line.strip()
                        print(f"[ngrok] {line}")

                        # Look for ngrok URL
                        if 'url=' in line and 'ngrok' in line:
                            parts = line.split('url=')
                            if len(parts) > 1:
                                url = parts[1].split()[0]
                                if url.startswith('http') and 'ngrok' in url:
                                    print(f"\n🎉 SUCCESS! ngrok tunnel is ready!")
                                    print(f"🔗 ngrok URL: {url}")
                                    print(f"🌍 Share this URL with anyone: {url}")
                                    print(f"📱 Your IndexTTS is now publicly accessible!")
                                    print(f"✨ ngrok is more reliable than Cloudflare tunnels\n")
                                    return

                        # Alternative format
                        if 'Forwarding' in line and 'ngrok' in line:
                            parts = line.split()
                            for part in parts:
                                if part.startswith('http') and 'ngrok' in part:
                                    print(f"\n🎉 SUCCESS! ngrok tunnel is ready!")
                                    print(f"🔗 ngrok URL: {part}")
                                    print(f"🌍 Share this URL with anyone: {part}")
                                    print(f"📱 Your IndexTTS is now publicly accessible!\n")
                                    return

                    if process.poll() is not None:
                        break

                    time.sleep(0.5)

                print("⏰ ngrok tunnel setup timeout")

            except Exception as e:
                print(f"⚠️  ngrok tunnel error: {e}")

        # Start ngrok in background
        ngrok_thread = threading.Thread(target=start_ngrok, daemon=True)
        ngrok_thread.start()

        return True

    except Exception as e:
        print(f"⚠️  ngrok setup failed: {e}")
        return False

# Setup Cloudflare as fallback
def setup_cloudflare_fallback():
    try:
        print("\n🔄 Also setting up Cloudflare tunnel as backup...")
        !wget -q https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64.deb
        !dpkg -i cloudflared-linux-amd64.deb

        def start_cloudflare():
            time.sleep(15)  # Wait a bit longer
            try:
                print("\n🌐 Starting Cloudflare tunnel as backup...")
                process = subprocess.Popen([
                    'cloudflared', 'tunnel', '--url', 'http://localhost:7860'
                ], stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True, bufsize=1)

                for line in iter(process.stdout.readline, ''):
                    line = line.strip()
                    if line and 'trycloudflare.com' in line:
                        words = line.split()
                        for word in words:
                            if word.startswith('http') and 'trycloudflare.com' in word:
                                print(f"\n🔗 Backup Cloudflare URL: {word}")
                                return

            except Exception as e:
                print(f"⚠️  Cloudflare backup failed: {e}")

        cf_thread = threading.Thread(target=start_cloudflare, daemon=True)
        cf_thread.start()

    except Exception as e:
        print(f"⚠️  Cloudflare backup setup failed: {e}")

# Setup tunnels
ngrok_success = setup_ngrok_tunnel()
setup_cloudflare_fallback()  # Always setup as backup

if IN_COLAB:
    print("🔗 Colab will also provide a gradio.live URL")
    if ngrok_success:
        print("🌐 ngrok tunnel will provide the most reliable public URL (see above)")
        print("🔄 Cloudflare tunnel available as backup")
elif IN_KAGGLE:
    if ngrok_success:
        print("🌐 Public access via ngrok tunnel (see above)")
    else:
        print("🔗 Interface will be available in Kaggle's output panel")

print("\n🚀 Launching IndexTTS...")
print("⏳ Please wait for the ngrok URL to appear above...")
print("💡 ngrok URLs are more reliable than Cloudflare tunnels")

# Run the Web UI with public access
!python webui.py --host 0.0.0.0 --port 7860

---
## (Optional) Command-Line Interface (CLI) Usage

You can also use IndexTTS via its command-line interface.
First, you'll need a reference audio clip. You can upload one to your Colab environment or use a sample. The cell below includes a commented-out snippet that creates a placeholder tone if you don't have one.

**Note:** The example below expects a `reference_voice.wav` file in the main `index-tts` directory; otherwise, change the path.
The Web UI cell above blocks the kernel for as long as it runs, so stop it before running this cell.
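If `numpy` and `soundfile` are not available, a placeholder file can also be written with the standard library alone. This is a sketch only: the helper name `write_sine_wav` is hypothetical, and a pure sine tone is not speech, so it is useful for checking that the pipeline runs, not for judging voice quality.

```python
import math
import struct
import wave


def write_sine_wav(path, frequency=440.0, duration=1.0, sample_rate=22050, amplitude=0.5):
    """Write a mono 16-bit PCM sine tone to `path` and return the number of frames."""
    n_frames = int(sample_rate * duration)
    frames = bytearray()
    for i in range(n_frames):
        sample = amplitude * math.sin(2.0 * math.pi * frequency * i / sample_rate)
        # Scale the float sample to a signed 16-bit integer, little-endian.
        frames += struct.pack("<h", int(sample * 32767))
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(bytes(frames))
    return n_frames


print(write_sine_wav("reference_voice.wav"))
```

Replace the generated file with a real recording of the target voice before judging any output.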

In [None]:
# (Example) Create a dummy reference voice file if you don't have one
# This is just a placeholder. Replace with your actual reference audio.
# import numpy as np
# import soundfile as sf
# samplerate = 22050
# duration = 1
# frequency = 440
# t = np.linspace(0., duration, int(samplerate * duration), endpoint=False)
# data = 0.5 * np.sin(2. * np.pi * frequency * t)
# sf.write('reference_voice.wav', data, samplerate)

# Install IndexTTS as a package for CLI usage
!pip install -e .

# Run CLI inference (make sure 'reference_voice.wav' exists or change path)
# !indextts "Hello, this is a test of the IndexTTS command line interface." \
#   --voice reference_voice.wav \
#   --model_dir checkpoints \
#   --config checkpoints/config.yaml \
#   --output output_cli.wav

# print("If successful, output_cli.wav should be generated.")
# You can then listen to it or download it from the file browser on the left.

---
## (Optional) Python Script Usage

You can also use IndexTTS directly in Python.

**Note:** The Web UI cell above blocks the kernel while it runs, so stop it before running this cell.
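When several sentences share one reference voice, the per-sentence call can be wrapped in a small loop. The sketch below is illustrative: `synthesize_batch` and `fake_infer` are hypothetical names, and the argument order `(reference_audio, text, output_path)` is assumed from the commented example in the cell below.

```python
import tempfile
from pathlib import Path


def synthesize_batch(infer, reference_audio, texts, output_dir):
    """Call `infer(reference_audio, text, output_path)` per text; return the output paths."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, text in enumerate(texts):
        path = out / f"utterance_{index:03d}.wav"
        infer(reference_audio, text, str(path))
        paths.append(str(path))
    return paths


# Stand-in for tts.infer so the sketch runs without the model loaded.
def fake_infer(reference_audio, text, output_path):
    Path(output_path).write_bytes(b"")


demo_dir = tempfile.mkdtemp()
print(synthesize_batch(fake_infer, "reference_voice.wav", ["Hello.", "Goodbye."], demo_dir))
```

With the model loaded, passing `tts.infer` in place of `fake_infer` would reuse the same initialized instance for every sentence.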

In [None]:
# from indextts.infer import IndexTTS

# # Ensure you have a reference voice, e.g., 'reference_voice.wav'
# # This assumes 'reference_voice.wav' is in the current directory (index-tts)
# reference_audio_path = "reference_voice.wav"
# text_to_speak = "This is a sample sentence generated using the IndexTTS Python interface."
# output_file_path = "output_script.wav"

# if 'tts' not in locals(): # Avoid re-initializing if already done
#   tts = IndexTTS(model_dir="checkpoints",cfg_path="checkpoints/config.yaml")

# # Check if reference_voice.wav exists, if not, skip inference
# import os
# if os.path.exists(reference_audio_path):
#   tts.infer(reference_audio_path, text_to_speak, output_file_path)
#   print(f"Generated audio saved to {output_file_path}")
#   # You can play/download this file from Colab's file browser
# else:
#   print(f"Reference audio {reference_audio_path} not found. Skipping script inference demo.")