## OptiSpeech Training: HFC-Female (en-US)
This notebook trains [OptiSpeech TTS](https://github.com/mush42/optispeech) on the [Hi-Fi-CAPTAIN en-US female dataset](https://ast-astrec.nict.go.jp/en/release/hi-fi-captain/).


## Plumbing

In [None]:
#@markdown ### Google Colab Anti-Disconnect
#@markdown Reduces automatic disconnections caused by inactivity. The session will still disconnect after **6 to 12 hours**.

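from IPython.display import Javascript, display

# Click Colab's connect button once a minute so the session is not marked idle.
js_code = '''
function ClickConnect() {
  console.log("Working");
  const button = document.querySelector("colab-toolbar-button#connect");
  if (button) button.click();  // skip if the button is not on the page
}
setInterval(ClickConnect, 60000);
'''
display(Javascript(js_code))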


#@markdown ### Check GPU type
#@markdown A more capable GPU can train faster. By default, you will have a **Tesla T4**.
!nvidia-smi
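
If you want the GPU name and memory as Python values rather than the full `nvidia-smi` table, a minimal sketch along these lines may help. The helper names `parse_gpu_csv` and `list_gpus` are made up for this example; the `--query-gpu` and `--format` options are standard `nvidia-smi` flags.

```python
import shutil
import subprocess


def parse_gpu_csv(text):
    """Parse 'name, memory' lines from nvidia-smi CSV output into tuples."""
    gpus = []
    for line in text.strip().splitlines():
        name, _, memory = line.partition(",")
        if name.strip():
            gpus.append((name.strip(), memory.strip()))
    return gpus


def list_gpus():
    """Return a list of (name, total_memory) tuples, or [] if no GPU is visible."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return []
    return parse_gpu_csv(result.stdout)


gpus = list_gpus()
if gpus:
    for name, memory in gpus:
        print(f"GPU: {name} ({memory})")
else:
    print("No GPU detected. In Colab, select a GPU runtime before training.")
```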

## Mount Google Drive

In [None]:
from google.colab import drive
drive.mount('/content/drive', force_remount=True)
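
Before moving on, it can be worth confirming that the dataset archive is actually on the mounted Drive, since the preprocessing cell later unzips it from that location. The sketch below assumes the same archive path used in that cell; `archive_is_present` is a hypothetical helper name.

```python
import os

# Same path that the preprocessing cell of this notebook unzips.
DATASET_ZIP = "/content/drive/MyDrive/hfc_female-en_us-dataset.zip"


def archive_is_present(path):
    """Return True if `path` is an existing, non-empty regular file."""
    return os.path.isfile(path) and os.path.getsize(path) > 0


if archive_is_present(DATASET_ZIP):
    size_mb = os.path.getsize(DATASET_ZIP) / 1e6
    print(f"Found dataset archive ({size_mb:.1f} MB): {DATASET_ZIP}")
else:
    print(f"Dataset archive not found at {DATASET_ZIP}. Upload it to Drive first.")
```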

## Prepare environment

In [None]:
#@markdown ### Clone OptiSpeech repository

import os

if not os.path.isdir(os.path.join(os.getcwd(), "optispeech")):
    print("Cloning optispeech repository...")
    !git clone --branch eden-alighn --depth=1 https://github.com/mush42/optispeech

#@markdown ### Install system dependencies

# Nothing...

#@markdown ### Upgrade packages

!pip3 install --upgrade pip setuptools wheel

#@markdown ### Install OptiSpeech dependencies

%cd /content/optispeech
!pip3 install -r requirements.txt

#@markdown ### Download HiFiGAN checkpoint for use during training

%cd /content/
!wget https://github.com/shivammehta25/Matcha-TTS-checkpoints/releases/download/v1.0/g_02500000

## Preprocess Dataset

In [None]:
%cd /content
!unzip -q /content/drive/MyDrive/hfc_female-en_us-dataset.zip
%cd /content/optispeech
!python3 -m optispeech.tools.preprocess_dataset \
    --format ljspeech \
    hfc_female-en_us \
    /content/hfc_female-en_us-dataset \
    /content/optispeech/data/hfc_female-en_us
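
After preprocessing finishes, a quick sanity check is to count the entries in the filelists that the training command reads (`train.txt` and `val.txt` in the output directory). This is only a sketch: it assumes each filelist holds one entry per non-empty line, which is not confirmed here, and `count_entries` is a hypothetical helper.

```python
from pathlib import Path

# Output directory passed to the preprocessing command in this notebook.
DATA_DIR = Path("/content/optispeech/data/hfc_female-en_us")


def count_entries(filelist_path):
    """Count non-empty lines in a filelist; return None if the file is missing."""
    path = Path(filelist_path)
    if not path.is_file():
        return None
    with path.open(encoding="utf-8") as f:
        return sum(1 for line in f if line.strip())


for name in ("train.txt", "val.txt"):
    n = count_entries(DATA_DIR / name)
    if n is None:
        print(f"{name}: missing (did preprocessing complete?)")
    else:
        print(f"{name}: {n} entries")
```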


## Enable TensorBoard

In [None]:
# Create log directory
!mkdir -p /content/drive/MyDrive/optispeech/logs

%load_ext tensorboard
%tensorboard --logdir /content/drive/MyDrive/optispeech/logs


## Start training

In [None]:
%cd /content/optispeech
!python3 -m optispeech.train \
    experiment="hfc_female-en_us" \
    model.train_args.hifigan_ckpt="/content/g_02500000" \
    data.train_filelist_path="data/hfc_female-en_us/train.txt" \
    data.valid_filelist_path="data/hfc_female-en_us/val.txt" \
    data.batch_size=64 \
    data.num_workers=2 \
    callbacks.model_checkpoint.every_n_epochs=5 \
    paths.log_dir="/content/drive/MyDrive/optispeech/logs"
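
Because Colab sessions end after a few hours, it helps to know which checkpoint was written last. The sketch below searches the log directory for the newest checkpoint file. It assumes checkpoints are saved with a `.ckpt` extension somewhere under the log directory, as is conventional for Lightning-based trainers; the actual layout may differ. `latest_checkpoint` is a hypothetical helper.

```python
from pathlib import Path

# Log directory passed to the training command via paths.log_dir.
LOG_DIR = "/content/drive/MyDrive/optispeech/logs"


def latest_checkpoint(log_dir):
    """Return the most recently modified *.ckpt file under log_dir, or None."""
    root = Path(log_dir)
    if not root.is_dir():
        return None
    checkpoints = list(root.rglob("*.ckpt"))  # assumption: .ckpt extension
    if not checkpoints:
        return None
    return max(checkpoints, key=lambda p: p.stat().st_mtime)


ckpt = latest_checkpoint(LOG_DIR)
print(f"Latest checkpoint: {ckpt}" if ckpt else "No checkpoint found yet.")
```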
  