<a href="https://colab.research.google.com/github/MarkTarry/Piper-TTS/blob/main/Training.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

0) Colab runtime + GPU check

In [None]:
# Check GPU
import torch, platform, sys
print("Python:", sys.version.split()[0])
print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
!nvidia-smi

1) System packages (incl. eSpeak dev)

In [None]:
!sudo apt-get update -y
!sudo apt-get install -y build-essential cmake ninja-build espeak-ng espeak-ng-data libespeak-ng-dev pkg-config ffmpeg
!pkg-config --modversion espeak-ng

2) Clone repo fresh

In [None]:
%cd /content
!rm -rf piper1-gpl
!git clone https://github.com/OHF-voice/piper1-gpl.git
%cd piper1-gpl
!pwd

3) Python deps (editable install, no venv in Colab)

In [None]:
!python3 -m pip install --upgrade pip setuptools wheel
!python3 -m pip install -e ".[train]"

4) Build the Cython extension used for alignment

In [None]:
%cd /content/piper1-gpl
!chmod +x ./build_monotonic_align.sh
!./build_monotonic_align.sh

5) Dev build (repo mode)

In [None]:
!python3 -m pip install --upgrade pip setuptools wheel scikit-build cmake ninja

In [None]:
%cd /content/piper1-gpl
!python3 setup.py build_ext --inplace -v

6) (Optional) Mount Google Drive for datasets and outputs

In [None]:
from google.colab import drive
drive.mount('/content/drive')

7) Set paths and training hyperparameters

In [None]:
%env DATA_ROOT_DIR=/content/drive/MyDrive/Piper
%env MODEL_NAME=jarvis
%env MODEL_VOICE=en

In [None]:
from pathlib import Path
import os

# ==== CHANGE THESE ====
VOICE_NAME      = os.getenv('MODEL_NAME', 'model')
ESPEAK_VOICE    = os.getenv('MODEL_VOICE', 'en')
SAMPLE_RATE_HZ  = 22050
BATCH_SIZE      = 32       # drop to 8 or 4 if you run out of GPU memory (the training cell in step 9 sets its own --data.batch_size)

DATA_ROOT       = Path(f"{os.getenv('DATA_ROOT_DIR', '/content/drive/MyDrive/Piper')}/{VOICE_NAME}")
AUDIO_DIR       = DATA_ROOT / "wavs"
CSV_PATH        = DATA_ROOT / "metadata.csv"

CACHE_DIR       = Path("/content/piper_cache")
CONFIG_PATH     = DATA_ROOT / "config.json"

# Optional: start from an existing checkpoint to speed up & stabilize training
# Get a .ckpt from https://huggingface.co/datasets/rhasspy/piper-checkpoints (medium quality recommended)
CKPT_PATH       = ""  # e.g., "/content/drive/MyDrive/piper_ckpts/en_US-lessac-medium.ckpt"

# Make sure dirs exist
CACHE_DIR.mkdir(parents=True, exist_ok=True)
print("CSV exists:", CSV_PATH.exists())
print("Audio dir exists:", AUDIO_DIR.exists())
print("Cache dir:", CACHE_DIR)
print("Config will be written to:", CONFIG_PATH)

8) Quick sanity checks

In [None]:
!espeak-ng --voices | head -n 20

In [None]:
import csv
import os

import pandas as pd

csv_path = str(CSV_PATH)
if os.path.exists(csv_path):
    # Read as pipe-delimited, two columns; QUOTE_NONE keeps quotation marks in the text literal
    try:
        df = pd.read_csv(csv_path, sep="|", header=None, names=["audio", "text"], quoting=csv.QUOTE_NONE)
        print(df.head())
        # Check a few audio files exist
        missing = [a for a in df["audio"].head(5) if not (AUDIO_DIR/str(a)).exists()]
        print("Missing among first 5:", missing)
    except Exception as e:
        print("CSV read error:", e)
else:
    print("CSV not found at:", csv_path)
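
The check above only looks at the first five rows. A fuller pass over the dataset might look like the sketch below, which uses only the standard library. The helper name `validate_dataset` and its return shape are illustrative assumptions, not part of Piper; it assumes the `audio|text` layout read above and PCM WAV files that the `wave` module can open.

```python
import csv
import wave
from pathlib import Path


def validate_dataset(csv_path, audio_dir, expected_rate=22050):
    """Return a list of (line_number, problem) tuples for a pipe-delimited metadata file.

    Hypothetical helper: assumes each row is `audio|text` and that the audio
    files are PCM WAVs readable by the standard `wave` module.
    """
    problems = []
    audio_dir = Path(audio_dir)
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f, delimiter="|", quoting=csv.QUOTE_NONE)
        for line_no, row in enumerate(reader, start=1):
            # A usable row needs an audio name and non-empty text
            if len(row) < 2 or not row[0].strip() or not row[1].strip():
                problems.append((line_no, "missing text"))
                continue
            wav_path = audio_dir / row[0]
            if not wav_path.is_file():
                problems.append((line_no, f"missing audio: {row[0]}"))
                continue
            try:
                with wave.open(str(wav_path), "rb") as w:
                    rate = w.getframerate()
            except (wave.Error, EOFError) as err:
                problems.append((line_no, f"unreadable WAV: {err}"))
                continue
            if rate != expected_rate:
                problems.append((line_no, f"sample rate {rate} != {expected_rate}"))
    return problems
```

In this notebook it could be called as `validate_dataset(CSV_PATH, AUDIO_DIR, SAMPLE_RATE_HZ)`; an empty list means no problems were found.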

9) Kick off training

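For the first run, start from a pretrained checkpoint hosted online:
```
--ckpt_path "https://huggingface.co/datasets/rhasspy/piper-checkpoints/resolve/main/en/en_US/lessac/medium/epoch%3D2164-step%3D1355540.ckpt"
```
When resuming a later run, point to the local copy saved in step 10:
```
--ckpt_path "$DATA_ROOT_DIR/$MODEL_NAME/latest.ckpt"
```
The cell below uses the local path, so `latest.ckpt` must already exist. On a first run, swap in the online URL.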
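
The choice between the two paths can also be made automatically. The sketch below returns the local `latest.ckpt` when it exists and otherwise falls back to the online URL quoted above. The helper name `choose_ckpt_path` is an assumption for illustration and is not part of Piper.

```python
from pathlib import Path

# URL copied from the text above
PRETRAINED_CKPT_URL = (
    "https://huggingface.co/datasets/rhasspy/piper-checkpoints/resolve/main/"
    "en/en_US/lessac/medium/epoch%3D2164-step%3D1355540.ckpt"
)


def choose_ckpt_path(data_root_dir, model_name, fallback=PRETRAINED_CKPT_URL):
    """Return the local latest.ckpt if it exists, otherwise the fallback.

    Hypothetical helper, not part of Piper.
    """
    local = Path(data_root_dir) / model_name / "latest.ckpt"
    return str(local) if local.is_file() else fallback
```

One possible use is to store the result in `os.environ["CKPT_PATH"]` and pass `--ckpt_path "$CKPT_PATH"` in the training cell.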

In [None]:
!timeout 60m python3 -m piper.train fit \
  --data.voice_name "$MODEL_NAME" \
  --data.csv_path "$DATA_ROOT_DIR/$MODEL_NAME/metadata.csv" \
  --data.audio_dir "$DATA_ROOT_DIR/$MODEL_NAME/wavs" \
  --model.sample_rate 22050 \
  --data.espeak_voice "$MODEL_VOICE" \
  --data.cache_dir "/content/piper_cache" \
  --data.config_path "$DATA_ROOT_DIR/$MODEL_NAME/config.json" \
  --data.batch_size 8 \
  --ckpt_path "$DATA_ROOT_DIR/$MODEL_NAME/latest.ckpt"
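
The command above hardcodes the sample rate, cache directory, and batch size, so changes to the Python variables in step 7 do not reach it. One way to keep a single source of truth is to assemble the argument list in Python. The sketch below mirrors only the flags shown in the command above; `build_train_command` is a hypothetical helper, and the values in the example call are placeholders.

```python
import shlex
import sys


def build_train_command(voice_name, data_root, espeak_voice, cache_dir,
                        sample_rate=22050, batch_size=8, ckpt_path=""):
    """Assemble the `piper.train fit` argument list from Python values.

    Hypothetical helper: it only mirrors the flags used in the cell above.
    """
    cmd = [
        sys.executable, "-m", "piper.train", "fit",
        "--data.voice_name", voice_name,
        "--data.csv_path", f"{data_root}/metadata.csv",
        "--data.audio_dir", f"{data_root}/wavs",
        "--model.sample_rate", str(sample_rate),
        "--data.espeak_voice", espeak_voice,
        "--data.cache_dir", str(cache_dir),
        "--data.config_path", f"{data_root}/config.json",
        "--data.batch_size", str(batch_size),
    ]
    # Only pass --ckpt_path when a checkpoint was actually chosen
    if ckpt_path:
        cmd += ["--ckpt_path", ckpt_path]
    return cmd


cmd = build_train_command("jarvis", "/content/drive/MyDrive/Piper/jarvis",
                          "en", "/content/piper_cache")
print(shlex.join(cmd))  # shell-quoted form, handy for logging
```

The resulting list could be passed to `subprocess.run(cmd, check=True, timeout=3600)`, which roughly corresponds to the `timeout 60m` wrapper.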

10) Back up the checkpoint and export to ONNX

In [None]:
import os
import glob
import re

# Find the latest version directory
lightning_logs_dir = "/content/piper1-gpl/lightning_logs"
version_dirs = glob.glob(os.path.join(lightning_logs_dir, "version_*"))
if version_dirs:
    latest_version_dir = max(version_dirs, key=os.path.getmtime)
    print(f"Latest version directory found: {latest_version_dir}")

    # Find the newest .ckpt file in the latest version directory
    ckpt_files = glob.glob(os.path.join(latest_version_dir, "checkpoints", "*.ckpt"))
    if ckpt_files:
        latest_ckpt_file = max(ckpt_files, key=os.path.getmtime)
        print(f"Newest checkpoint file found: {latest_ckpt_file}")

        # Extract epoch and step from the filename
        match = re.search(r"epoch=(\d+)-step=(\d+)", os.path.basename(latest_ckpt_file))
        if match:
            epoch = match.group(1)
            step = match.group(2)

            # Keep a named copy and refresh latest.ckpt, which step 9 resumes from
            !cp -v "{latest_ckpt_file}" "{DATA_ROOT}/"
            !cp -v -f "{latest_ckpt_file}" "{DATA_ROOT}/latest.ckpt"

            !python3 -m piper.train.export_onnx \
                --checkpoint "{latest_ckpt_file}" \
                --output-file "{DATA_ROOT}/model-epoch={epoch}-step={step}.onnx"

            !cp -v "{DATA_ROOT}/config.json" "{DATA_ROOT}/model-epoch={epoch}-step={step}.onnx.json"
        else:
            print("Could not extract epoch and step from checkpoint filename.")
    else:
        print("No checkpoint files found in the latest version directory.")
else:
    print("No version directories found in lightning_logs.")
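
File timestamps can be misleading, for example after checkpoints have been copied between directories. As an alternative, the sketch below picks the checkpoint with the highest step number parsed from the `epoch=N-step=M` filename pattern used above. The helper name `newest_by_step` is an assumption for illustration.

```python
import re
from pathlib import Path

CKPT_PATTERN = re.compile(r"epoch=(\d+)-step=(\d+)")


def newest_by_step(ckpt_paths):
    """Return (path, epoch, step) for the highest step, or None if nothing matches."""
    best = None
    for path in ckpt_paths:
        match = CKPT_PATTERN.search(Path(path).name)
        if not match:
            continue  # skip files that do not follow the epoch/step naming
        epoch, step = int(match.group(1)), int(match.group(2))
        if best is None or step > best[2]:
            best = (str(path), epoch, step)
    return best


print(newest_by_step(["epoch=9-step=990.ckpt", "epoch=10-step=1100.ckpt", "last.ckpt"]))
# -> ('epoch=10-step=1100.ckpt', 10, 1100)
```

Comparing the parsed integers avoids the string-ordering pitfall where `epoch=9` would sort after `epoch=10`.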

### Final step: export to ONNX
```shell
!python3 -m piper.train.export_onnx \
  --checkpoint "/content/piper1-gpl/lightning_logs/version_2/checkpoints/epoch=2174-step=2680.ckpt" \
  --output-file "/content/model.onnx"
```
```shell
!cp /content/drive/MyDrive/tts_data/myvoice/my_colab_voice.json /content/model.onnx.json
```
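
Both export steps place the config beside the model under the model's filename plus `.json`. A small Python sketch of that copy, with a check that the config parses as JSON first, could look like this. The helper name `pair_config_with_model` is hypothetical.

```python
import json
import shutil
from pathlib import Path


def pair_config_with_model(onnx_path, config_path):
    """Copy config_path to '<onnx_path>.json' after checking that it parses as JSON."""
    onnx_path, config_path = Path(onnx_path), Path(config_path)
    with open(config_path, encoding="utf-8") as f:
        json.load(f)  # raises json.JSONDecodeError if the config is malformed
    target = onnx_path.with_name(onnx_path.name + ".json")
    shutil.copyfile(config_path, target)
    return target
```

For example, `pair_config_with_model("/content/model.onnx", CONFIG_PATH)` would write `/content/model.onnx.json`, assuming `CONFIG_PATH` from step 7 points at the config written during training.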