# Yoruba TTS Training on Google Colab

This notebook sets up and trains a VITS-based Text-to-Speech model for the Yoruba language.

**Features:**
- GPU-accelerated training
- Automatic setup and dependency installation
- Training progress monitoring
- Audio synthesis testing

**Note:** Make sure to enable GPU runtime: `Runtime > Change runtime type > GPU`

## 1. Setup Environment

In [None]:
# Check GPU availability
!nvidia-smi
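
The same check can be made from Python so that later cells can branch on the result. This is a minimal sketch using only the standard library; the helper name `gpu_visible` is hypothetical and not part of the repository.

```python
import shutil
import subprocess


def gpu_visible() -> bool:
    """Return True if the nvidia-smi tool exists and lists at least one GPU."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False  # driver tools missing: most likely a CPU-only runtime
    try:
        # "nvidia-smi -L" prints one line per GPU, e.g. "GPU 0: ..."
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


if not gpu_visible():
    print("No GPU detected. Switch the runtime type before training.")
```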

In [None]:
# Clone the repository
!git clone https://github.com/T-ultrafast/Naija_tts.git
%cd Naija_tts

In [None]:
# Install dependencies
!pip install -q TTS trainer coqpit
!pip install -q flask

## 2. Verify Setup

In [None]:
# List files
!ls -la

In [None]:
# Check metadata file
!head -5 metadata_yor.csv
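
Looking at the first five rows does not reveal malformed rows further down the file. The sketch below counts how many fields each row has. The `|` delimiter is an assumption (LJSpeech-style metadata), and `summarize_metadata` is a hypothetical helper; adjust both to whatever `naija_formatter` actually expects.

```python
import csv
from collections import Counter


def summarize_metadata(path: str, delimiter: str = "|") -> Counter:
    """Count how many rows have each number of fields."""
    counts = Counter()
    with open(path, newline="", encoding="utf-8") as f:
        # QUOTE_NONE: treat quote characters in transcripts as ordinary text
        reader = csv.reader(f, delimiter=delimiter, quoting=csv.QUOTE_NONE)
        for row in reader:
            if row:  # skip blank lines
                counts[len(row)] += 1
    return counts
```

Calling `summarize_metadata("metadata_yor.csv")` should return a single key if every row has the same shape; more than one key suggests rows worth inspecting before training.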

## 3. Update Config for GPU Training

In [None]:
# Check whether train_vits.py disables CUDA
import torch

# Read the current train_vits.py
with open('train_vits.py', 'r') as f:
    content = f.read()

# The Trainer should auto-detect the GPU, so only warn if the script turns it off
if 'use_cuda=False' in content:
    print("train_vits.py sets use_cuda=False; change it to use_cuda=True for GPU training")
else:
    print("train_vits.py is ready for training")
print(f"CUDA available: {torch.cuda.is_available()}")

## 4. Start Training

**Note:** Training will take several hours. You can monitor progress in the output below.

In [None]:
# Start training (this will run for a long time)
!python train_vits.py
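
Because the run lasts hours and a Colab session can disconnect, it can help to keep a copy of the console output on disk. The sketch below streams a command's output to the notebook and to a file at the same time. `run_and_log` and the file name `train_console.log` are hypothetical names chosen for illustration.

```python
import subprocess
import sys


def run_and_log(cmd: list[str], log_path: str) -> int:
    """Run cmd, echo its output line by line, and append it to log_path."""
    with open(log_path, "a", encoding="utf-8") as log:
        proc = subprocess.Popen(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr so tracebacks are logged too
            text=True,
            bufsize=1,  # line-buffered
        )
        for line in proc.stdout:
            sys.stdout.write(line)
            log.write(line)
            log.flush()  # keep the file current if the session dies
        return proc.wait()
```

Used in place of the cell above, this would be `run_and_log([sys.executable, "train_vits.py"], "train_console.log")`. The return value is the process exit code.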

## 5. Monitor Training (Optional)

You can check training logs and checkpoints while training is running.

In [None]:
# List output directories
!ls -lh out/naija_xtts_yor/

In [None]:
# View the end of the training log (the * matches every run directory; replace it with one run's name to see only that run)
!tail -50 out/naija_xtts_yor/*/trainer_0_log.txt
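
When several runs exist, the `tail` command above prints the end of every run's log. A Python alternative that reads only the most recently modified log is sketched below; `tail_latest_log` is a hypothetical helper, not part of the repository.

```python
import glob
import os
from collections import deque


def tail_latest_log(pattern: str, n: int = 50) -> list[str]:
    """Return the last n lines of the newest file matching pattern."""
    paths = glob.glob(pattern)
    if not paths:
        return []
    newest = max(paths, key=os.path.getmtime)
    with open(newest, encoding="utf-8", errors="replace") as f:
        # deque with maxlen keeps only the final n lines in memory
        return list(deque(f, maxlen=n))
```

For example, `print("".join(tail_latest_log("out/naija_xtts_yor/*/trainer_0_log.txt")))` shows the last 50 lines of the latest run.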

## 6. Test Inference

After training (or using an existing checkpoint), test the model.

In [None]:
# Find the latest checkpoint
import glob
import os

checkpoint_dirs = glob.glob('out/naija_xtts_yor/*/')
if checkpoint_dirs:
    latest_dir = max(checkpoint_dirs, key=os.path.getmtime)
    checkpoints = glob.glob(os.path.join(latest_dir, 'checkpoint_*.pth'))
    if checkpoints:
        latest_checkpoint = max(checkpoints, key=os.path.getmtime)
        print(f"Latest checkpoint: {latest_checkpoint}")
    else:
        print("No checkpoints found yet")
else:
    print("No training runs found")

In [None]:
# Test synthesis
import os

import torch
from TTS.utils.synthesizer import Synthesizer
from naija_formatter import naija_formatter
import TTS.tts.datasets
from IPython.display import Audio

# Register formatter
TTS.tts.datasets.naija = naija_formatter

# Run the previous cell first: it defines latest_checkpoint
MODEL_PATH = latest_checkpoint
CONFIG_PATH = os.path.join(os.path.dirname(latest_checkpoint), 'config.json')

# Load model
synthesizer = Synthesizer(
    tts_checkpoint=MODEL_PATH,
    tts_config_path=CONFIG_PATH,
    use_cuda=torch.cuda.is_available(),  # Use the GPU for inference when one is attached
)

# Synthesize
text = "Bawo ni, se dada ni?"
print(f"Synthesizing: {text}")
wav = synthesizer.tts(text)

# Play audio
Audio(wav, rate=synthesizer.output_sample_rate)
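
`Audio` only plays the clip inside the notebook. To keep the result, the samples can be written to a WAV file. The sketch below uses only the standard library and assumes `wav` is a sequence of float samples roughly in the range [-1, 1]; values outside that range are clipped. `write_wav` is a hypothetical helper name.

```python
import struct
import wave


def write_wav(samples, sample_rate: int, path: str) -> None:
    """Write mono float samples (expected in [-1.0, 1.0]) as 16-bit PCM."""
    clipped = [max(-1.0, min(1.0, float(s))) for s in samples]
    # Scale to the signed 16-bit range and pack as little-endian shorts
    pcm = struct.pack(f"<{len(clipped)}h", *(int(s * 32767) for s in clipped))
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)  # mono
        wf.setsampwidth(2)  # 2 bytes per sample = 16 bit
        wf.setframerate(sample_rate)
        wf.writeframes(pcm)
```

With the variables from the cell above, the call would be `write_wav(wav, synthesizer.output_sample_rate, "test_yoruba.wav")`, where the output file name is only an example.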

## 7. Save Checkpoint to Google Drive (Optional)

To preserve your trained model, save it to Google Drive.

In [None]:
# Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')

In [None]:
# Copy checkpoints to Drive
!mkdir -p /content/drive/MyDrive/yoruba_tts_checkpoints
!cp -r out/naija_xtts_yor/* /content/drive/MyDrive/yoruba_tts_checkpoints/
print("Checkpoints saved to Google Drive!")
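
Copying every file of every run can take a lot of Drive space. A lighter option is to copy only the newest checkpoint and its config from one run directory. This is a sketch under the assumption that checkpoints follow the `checkpoint_*.pth` pattern used in section 6 and that `config.json` sits beside them; `copy_latest_checkpoint` is a hypothetical helper.

```python
import glob
import os
import shutil


def copy_latest_checkpoint(run_dir: str, dest_dir: str) -> list[str]:
    """Copy the newest checkpoint_*.pth and config.json from run_dir to dest_dir."""
    os.makedirs(dest_dir, exist_ok=True)
    copied = []
    checkpoints = glob.glob(os.path.join(run_dir, "checkpoint_*.pth"))
    if checkpoints:
        newest = max(checkpoints, key=os.path.getmtime)
        copied.append(shutil.copy2(newest, dest_dir))  # copy2 keeps timestamps
    config = os.path.join(run_dir, "config.json")
    if os.path.exists(config):
        copied.append(shutil.copy2(config, dest_dir))
    return copied
```

The function returns the destination paths it wrote, so an empty list means nothing was found to copy.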

## 8. Run Web Interface (Optional)

You can run the Flask web interface in Colab using ngrok for public access.

In [None]:
# Install pyngrok
!pip install -q pyngrok

In [None]:
# Update app.py to use the latest checkpoint,
# then start the Flask app with ngrok
import os
import threading

from pyngrok import ngrok

# Start Flask in background
def run_flask():
    os.system('python app.py')

thread = threading.Thread(target=run_flask)
thread.start()

# Recent ngrok versions require an account authtoken; if connect() fails,
# set yours first with ngrok.set_auth_token("<your token>")
# Create ngrok tunnel (5000 is Flask's default port; adjust if app.py uses another)
public_url = ngrok.connect(5000)
print(f"\n🌐 Web Interface URL: {public_url}")
print("Click the link above to access your Yoruba TTS web interface!")
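
The Flask app may need some time to load the model before it starts listening, so a tunnel opened immediately can return errors to the first visitor. One way to handle this is to poll the port before calling `ngrok.connect`. The sketch below assumes `app.py` listens on port 5000, as the cell above implies; `wait_for_port` is a hypothetical helper.

```python
import socket
import time


def wait_for_port(port: int, host: str = "127.0.0.1", timeout: float = 30.0) -> bool:
    """Poll until a TCP connection to host:port succeeds or timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True  # something is accepting connections
        except OSError:
            time.sleep(0.5)  # not listening yet; retry shortly
    return False
```

A call such as `wait_for_port(5000, timeout=120)` placed between `thread.start()` and `ngrok.connect(5000)` returns `True` once the server accepts connections, or `False` if it never comes up.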