<a href="https://colab.research.google.com/github/Troyanovsky/awesome-TTS-Colab/blob/main/Index_TTS_V2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 🗣️ IndexTTS2 Colab

## 📄 Description

This Colab notebook runs **IndexTTS2**, an autoregressive zero-shot text-to-speech (TTS) model that supports **emotionally expressive** and **duration-controllable** speech synthesis.

**Capabilities**: Emotion-Controlled Speech, Duration-Specific Generation, Zero-Shot Timbre Cloning, Multi-Modal Emotion Guidance, High-Stability Emotional Speech

---

## How to use

* Run all cells
* Wait for Gradio to print a public URL for the Web UI
* Use the Web UI to generate audio

---

## 🔗 Resources

* **GitHub Repository:** [https://github.com/index-tts/index-tts](https://github.com/index-tts/index-tts)
* **Model Availability:** [https://huggingface.co/IndexTeam/IndexTTS-2](https://huggingface.co/IndexTeam/IndexTTS-2)

---

## 🎙️ Explore More TTS Models

Want to try out additional TTS models? Check out the curated collection here:
👉 [awesome-TTS-Colab](https://github.com/Troyanovsky/awesome-TTS-Colab)

## Check GPU

In [1]:
import subprocess

def sh(cmd, check=True, cwd=None, env=None):
    """Echo a shell command, run it, and raise CalledProcessError on failure when check=True."""
    print(f"\n$ {cmd}")
    result = subprocess.run(cmd, shell=True, check=check, cwd=cwd, env=env)
    return result

# Check GPU: nvidia-smi exits non-zero (or is missing) when no NVIDIA driver is available
try:
    sh("nvidia-smi")
    print("✅ GPU detected.")
except subprocess.CalledProcessError:
    print("⚠️ No NVIDIA GPU detected. In Colab, go to: Runtime → Change runtime type → Hardware accelerator: T4 GPU")
    raise


$ nvidia-smi
✅ GPU detected.
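
The check above only reports success or failure. As a minimal sketch, the variant below avoids the shell and returns the GPU names instead. The helper name `detect_gpu` is hypothetical and not part of the notebook; it assumes `nvidia-smi` is on `PATH` when a GPU runtime is attached.

```python
import shutil
import subprocess

def detect_gpu():
    """Return the GPU names reported by nvidia-smi, or [] if none are visible."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return []  # driver tools not installed, so no NVIDIA runtime
    try:
        result = subprocess.run(
            [exe, "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True, text=True,
        )
    except OSError:
        return []
    if result.returncode != 0:
        return []
    # One GPU name per non-empty output line
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]

gpus = detect_gpu()
print("GPUs:", gpus if gpus else "none detected")
```

An empty list means the runtime type should be switched to a GPU before continuing.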


## Dependencies

In [2]:
# Clean any partial clone
import shutil, os
REPO_DIR = "/content/index-tts"
if os.path.exists(REPO_DIR):
    shutil.rmtree(REPO_DIR, ignore_errors=True)

# Clone WITHOUT LFS (prevents any auth prompts)
!git -c filter.lfs.smudge= -c filter.lfs.process= -c filter.lfs.required=false clone https://github.com/index-tts/index-tts.git /content/index-tts

# Install uv (required) and set up env
!python -m pip install -U pip uv
%cd /content/index-tts
!UV_LINK_MODE=copy uv sync --extra webui

# Pull the actual model from Hugging Face (this is what you need)
!uv tool install "huggingface_hub[cli]"
# Optional mirror if HF is slow:  os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"
!hf download IndexTeam/IndexTTS-2 --local-dir=checkpoints


Cloning into '/content/index-tts'...
remote: Enumerating objects: 1106, done.
remote: Counting objects: 100% (14/14), done.
remote: Compressing objects: 100% (11/11), done.
remote: Total 1106 (delta 5), reused 4 (delta 3), pack-reused 1092 (from 2)
Receiving objects: 100% (1106/1106), 33.32 MiB | 37.37 MiB/s, done.
Resolving deltas: 100% (540/540), done.
Collecting pip
  Downloading pip-25.2-py3-none-any.whl.metadata (4.7 kB)
Collecting uv
  Downloading uv-0.8.17-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (11 kB)
Downloading pip-25.2-py3-none-any.whl (1.8 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.8/1.8 MB 32.1 MB/s eta 0:00:00
Downloading uv-0.8.17-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (21.1 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 21.1/21.1 MB 107.1 MB/s eta 0:00:00
Installing collected packages: uv, pip
  Attempting uninstall: 
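
Because the repository is cloned without LFS, the model weights come only from the `hf download` step. A quick existence check before launching the WebUI can catch an interrupted download. This is a minimal sketch: the helper `missing_checkpoints` is hypothetical, and `gpt.pth` is the only file name taken from this notebook (it appears in the WebUI log); any further names would need to be confirmed against the Hugging Face repository.

```python
from pathlib import Path

def missing_checkpoints(ckpt_dir, required):
    """Return the names in `required` that are absent or empty under ckpt_dir."""
    root = Path(ckpt_dir)
    missing = []
    for name in required:
        path = root / name
        # A zero-byte file usually means the download was cut short
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing

# Assumption: only gpt.pth is known from this notebook's log output.
REQUIRED = ["gpt.pth"]
missing = missing_checkpoints("checkpoints", REQUIRED)
print("Missing checkpoint files:", missing if missing else "none")
```

If anything is reported missing, rerunning the `hf download` line is the simplest recovery.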

## WebUI

In [3]:
import os

REPO_DIR = "/content/index-tts"
WEBUI_FILE = os.path.join(REPO_DIR, "webui.py")

# 1) Patch webui.py to force a public Gradio share link
with open(WEBUI_FILE, "r", encoding="utf-8") as f:
    code = f.read()

OLD_LAUNCH = "demo.launch(server_name=cmd_args.host, server_port=cmd_args.port)"
NEW_LAUNCH = "demo.launch(server_name=cmd_args.host, server_port=cmd_args.port, share=True)"

if NEW_LAUNCH in code:
    print("ℹ️ webui.py already patched.")
elif OLD_LAUNCH in code:
    with open(WEBUI_FILE, "w", encoding="utf-8") as f:
        f.write(code.replace(OLD_LAUNCH, NEW_LAUNCH))
    print("✅ Patched webui.py (share=True).")
else:
    # Upstream changed the launch line; do not claim the patch succeeded
    print("⚠️ Expected demo.launch(...) call not found; webui.py was left unchanged.")

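# 2) Kill any previous run (safe if none). The [u] bracket keeps pkill -f
#    from matching, and killing, the shell that runs this very command.
!pkill -f "[u]v run webui.py" || true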

✅ Patched webui.py (share=True).
^C
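
The exact-string replacement above stops working as soon as the upstream `demo.launch(...)` line changes by even one character. A more tolerant approach can be sketched with a regular expression. The helper `add_share_flag` is hypothetical, and the sketch assumes the call sits on a single line with no nested parentheses in its arguments.

```python
import re

# Matches a single-line demo.launch(...) call without nested parentheses.
LAUNCH_RE = re.compile(r"demo\.launch\(([^()]*)\)")

def add_share_flag(source):
    """Return (new_source, changed); adds share=True to demo.launch(...) if absent."""
    def _patch(match):
        # Normalize whitespace and any trailing comma before appending
        args = match.group(1).strip().rstrip(",").rstrip()
        if re.search(r"\bshare\s*=", args):
            return match.group(0)  # leave an existing share= setting alone
        new_args = f"{args}, share=True" if args else "share=True"
        return f"demo.launch({new_args})"

    new_source = LAUNCH_RE.sub(_patch, source)
    return new_source, new_source != source

old = "demo.launch(server_name=cmd_args.host, server_port=cmd_args.port)"
patched, changed = add_share_flag(old)
print(patched, changed)
```

The `changed` flag separates "already patched" from "patched now"; a file with no `demo.launch(` call at all also returns `False` and should be checked separately.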


In [6]:
# 3) Run the WebUI in the foreground so Gradio prints the link here.
#    Keep this cell running while you use the UI.
%cd /content/index-tts
!uv run webui.py

/content/index-tts
>> GPT weights restored from: ./checkpoints/gpt.pth
GPT2InferenceModel has generative capabilities, as `prepare_inputs_for_generation` is explicitly overwritten. However, it doesn't directly inherit from `GenerationMixin`. From 👉v4.50👈 onwards, `PreTrainedModel` will NOT inherit from `GenerationMixin`, and this model will lose the ability to call `generate` and other related functions.
  - If you are the owner of the model architecture code, please modify your model class such that it inherits from `GenerationMixin` (after `PreTrainedModel`, otherwise you'll get an exception).
  - If you are not the owner of the model architecture class, please contact the model code owner to update it.
preprocessor_config.json: 100% 275/275 [00:00<00:00, 1.77MB/s]
config.json: 1.87kB [00:00, 9.10MB/s]
model.safetensors: 100% 2.32G/2.32G [03:06<00:00, 12.4MB/s]
semantic_codec/model.safetensors: 100% 177M/177M [00:05<00:00, 31.6MB/s]
>> semantic_codec weights restored from: /root/.cac