# `Forward Tacotron` training notebook
This notebook was developed by [rmcpantoja](https://github.com/rmcpantoja).

## Credits

* [as-ideas/ForwardTacotron repository](https://github.com/as-ideas/ForwardTacotron).


### Important!

* This notebook is still under development, so some features, such as vocoder training and alignment, may not be available yet. The notebook will be updated regularly, and I will remove this notice as soon as development and testing are complete.
* For now, this notebook is not well suited to training with small datasets. I'm planning to retrain Tacotron on the LJSpeech dataset. After that, it should be possible to train small datasets.


Last update: 2022/11/13

In [None]:
#@markdown ### check allocated GPU
#@markdown ---
#@markdown You need at least a Tesla T4, since training takes longer on slower GPUs. If you are given a GPU such as a K80, go to the menu bar and select Runtime > Disconnect and delete runtime.
#@markdown * You can also run this notebook without a GPU (not recommended) by disabling hardware acceleration in the notebook settings.
#@markdown * However, keep in mind that training without a GPU takes much longer and could take weeks to complete.
!nvidia-smi -L

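The output of `nvidia-smi -L` can also be checked in code. This is only a minimal sketch and not part of the original notebook: the helper names `gpu_names` and `is_recommended` and the sample string are made up for illustration, and the check only covers the K80 mentioned above.

```python
import re

def gpu_names(nvidia_smi_output):
    """Extract GPU model names from the text printed by `nvidia-smi -L`."""
    # Each line looks like: "GPU 0: Tesla T4 (UUID: GPU-...)"
    pattern = re.compile(r"^GPU \d+: (.+?) \(UUID: ", re.MULTILINE)
    return pattern.findall(nvidia_smi_output)

def is_recommended(name):
    """Return False for the K80, which the note above advises against."""
    return "K80" not in name.upper()

# Illustrative sample output (the UUID is made up).
sample = "GPU 0: Tesla T4 (UUID: GPU-00000000-0000-0000-0000-000000000000)"
for name in gpu_names(sample):
    verdict = "ok" if is_recommended(name) else "consider disconnecting the runtime"
    print(f"{name}: {verdict}")
```

In Colab, the text can be captured with `out = !nvidia-smi -L` and passed in as `"\n".join(out)`.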
In [None]:
#@markdown ### mount google drive
#@markdown ---
#@markdown This is required to store the checkpoints and preprocessed datasets that Forward Tacotron works with. An important note:
#@markdown * Check your available storage space in [Drive](http://drive.google.com/). The larger the dataset, the more free space you need.

from google.colab import drive
drive.mount('/content/drive', force_remount=True)

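The free space can be checked before preprocessing. This is a small sketch using only the standard library. The helper names `free_gib` and `has_room` are made up, and the figure reported for a mounted Drive may not match the quota shown on the Drive website, so treat it as a rough indication.

```python
import shutil

def free_gib(path):
    """Return the free space, in GiB, of the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / 1024**3

def has_room(path, needed_gib):
    """True when at least `needed_gib` GiB are free at `path`."""
    return free_gib(path) >= needed_gib

# "." is used so the sketch also runs outside Colab; after mounting,
# "/content/drive/MyDrive" would be the path to check instead.
print(f"Free space: {free_gib('.'):.1f} GiB")
```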
In [None]:
#@markdown ## install process
#@markdown ---
#@markdown This will install the synthesizer and other important dependencies.
%cd /content
import os
# Clone the repository only if it is not already in the working folder.
if not os.path.exists("/content/ForwardTacotron"):
  print("Cloning repository...")
  !git clone https://github.com/as-ideas/ForwardTacotron
else:
  print("The working repository already exists. Skipping...")
# pip:
!pip install numba librosa pyworld phonemizer==2.2.2 webrtcvad PyYAML dataclasses soundfile scipy tensorboard matplotlib unidecode inflect
#!pip install git+https://github.com/wkentaro/gdown.git
%cd /content/ForwardTacotron
#apt:
!apt install -y espeak-ng
!wget https://github.com/mikefarah/yq/releases/download/v4.29.2/yq_linux_amd64.tar.gz
!tar -xvf yq_linux_amd64.tar.gz
!mv /content/ForwardTacotron/yq_linux_amd64 /content/ForwardTacotron/yq
#!bash install-man-page.sh
print("ready")

In [None]:
#@markdown ### settings
#@markdown These options modify settings related to the data and to training.
#@markdown ---
#@markdown #### desired name for the TTS model
tts_model_id = "myvoice" #@param {type:"string"}
!./yq -i '.tts_model_id = "{tts_model_id}"' "config.yaml"
#@markdown ---
#@markdown #### Desired name for the vocoder (if you are going to train one)
voc_model_id = "myvocoder" #@param {type:"string"}
!./yq -i '.voc_model_id = "{voc_model_id}"' "config.yaml"
#@markdown ---
#@markdown #### the model type to be trained on this dataset
tts_model = "forward_tacotron" #@param ["forward_tacotron", "fast_pitch"]
!./yq -i '.tts_model = "{tts_model}"' "config.yaml"
#@markdown ---
#@markdown #### output directory
#@markdown (saving it in Drive is recommended)
data_path = "/content/drive/MyDrive/ForwardTacotron" #@param {type:"string"}
# Create the output folder and its "data" subfolder; existing folders are left untouched.
os.makedirs(os.path.join(data_path, "data"), exist_ok=True)
!./yq -i '.data_path = "{data_path}/data/"' "config.yaml"
#@markdown ---
#@markdown #### number of validation samples (changing this is not recommended)
n_val = 200 #@param {type:"integer"}
!./yq -i '.preprocessing.n_val = {n_val}' "config.yaml"
#@markdown ---
#@markdown #### Choose the language variant of your dataset
#@markdown Use this table to pick the language code:

#@markdown | Code | Language |
#@markdown | --- | --- |
#@markdown | en-029 | English (Caribbean) |
#@markdown | en-gb | English (Great Britain) |
#@markdown | en-gb-scotland | English (Scotland) |
#@markdown | en-gb-x-gbclan | English (Lancaster) |
#@markdown | en-gb-x-gbcwmd | English (West Midlands) |
#@markdown | en-gb-x-rp | English (Received Pronunciation) |
#@markdown | en-us | English (America) |

language = 'en-us' #@param ["en-029", "en-gb", "en-gb-scotland", "en-gb-x-gbclan", "en-gb-x-gbcwmd", "en-gb-x-rp", "en-us"]
!./yq -i '.preprocessing.language = "{language}"' "config.yaml"
#@markdown ---
#@markdown #### Step interval for generating training plots
#@markdown This sets how many steps pass between each round of figures, images, and audio samples. These show the training progress and can be viewed in TensorBoard (later).
plot_every = 1000 #@param {type:"integer"}
!./yq -i '.tacotron.training.plot_every = {plot_every}' "config.yaml"
!./yq -i '.forward_tacotron.training.plot_every = {plot_every}' "config.yaml"
!./yq -i '.fast_pitch.training.plot_every = {plot_every}' "config.yaml"
#@markdown ---

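Each `yq` call in the settings cell sets one key in `config.yaml`, addressed by a dotted path. The effect can be pictured in plain Python as follows. This is only an illustration: `set_by_path` is a made-up helper, the starting values are invented, and the sketch does not read or write `config.yaml`.

```python
def set_by_path(config, dotted_path, value):
    """Set a value in a nested dict using a yq-style path such as '.a.b'."""
    keys = dotted_path.strip(".").split(".")
    node = config
    for key in keys[:-1]:
        # Create intermediate mappings when they are missing.
        node = node.setdefault(key, {})
    node[keys[-1]] = value
    return config

# Made-up starting values, for illustration only.
config = {"preprocessing": {"n_val": 200, "language": "en"}}
set_by_path(config, ".tts_model_id", "myvoice")
set_by_path(config, ".preprocessing.language", "en-us")
set_by_path(config, ".forward_tacotron.training.plot_every", 1000)
print(config)
```

Keys that are not named in a path, such as `n_val` here, keep their previous values.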
## working with the dataset

You can skip these cells if you have already preprocessed the dataset and want to continue training from the last saved checkpoint. Otherwise, expand this section and read the instructions for each cell.

In [None]:
import zipfile
import os
import os.path
#@markdown ### dataset preprocessing
#@markdown ---
#@markdown This will extract the dataset, make a few adjustments to the transcripts, and then process it. Mel and pitch data will be created, which are used in the training process.
#@markdown * Note: If you are going to preprocess larger datasets, make sure you have more space available on your drive.
#@markdown ---
#@markdown #### wavs path (zip file)
wavs_path = "/content/drive/MyDrive/Wavs_m.zip" #@param {type:"string"}
#@markdown ---
#@markdown #### transcription path (*.txt or *.csv)
list_path = "/content/drive/MyDrive/list_m.txt" #@param {type:"string"}
list_filename = os.path.basename(list_path)
#@markdown ---
%cd /content/ForwardTacotron
!mkdir -p /content/ForwardTacotron/wavs
if zipfile.is_zipfile(wavs_path):
  !unzip -j "$wavs_path" -d /content/ForwardTacotron/wavs
else:
  print("Warning: the wav path is not a compressed file.")

if not os.path.exists(list_path):
  raise FileNotFoundError(f"Transcript file does not exist: {list_path}")
!cp "$list_path" /content/ForwardTacotron
if list_path.endswith('.txt'):
  print("Fixing transcript...")
  !mv "/content/ForwardTacotron/$list_filename" /content/ForwardTacotron/list.csv
  # Drop the ".wav" extension (the dot is escaped so it only matches a literal dot)
  # and the "wavs/" prefix from the file names.
  !sed -i -- 's,\.wav|,|,g' "/content/ForwardTacotron/list.csv"
  !sed -i -- 's,wavs/,,g' "/content/ForwardTacotron/list.csv"
print("Running preprocess...")
!python preprocess.py --path /content/ForwardTacotron
print("Ready")

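The transcript fix applied by the two `sed` commands can be sketched in Python. The helper name `fix_transcript_line` and the sample lines are made up for illustration. Unlike `sed`, which replaces every match on the line, this sketch only touches the file name before the first `|`.

```python
def fix_transcript_line(line):
    """Turn 'wavs/0001.wav|text' into '0001|text'."""
    file_id, sep, text = line.partition("|")
    if file_id.startswith("wavs/"):
        file_id = file_id[len("wavs/"):]
    if file_id.endswith(".wav"):
        file_id = file_id[:-len(".wav")]
    return file_id + sep + text

# Made-up sample lines in the "file|text" layout.
lines = ["wavs/0001.wav|Hello there.", "0002|Already clean."]
fixed = [fix_transcript_line(line) for line in lines]
print(fixed)
```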
### Caution! Run this cell only if you already have a dataset in your ForwardTacotron folder and want to train a different one. Its contents will be deleted.

In [None]:
#@markdown ### remove the current dataset (if it exists)
#@markdown ---
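#@markdown The datasets are stored in the working folder. If you want to train a different dataset, run this cell to remove the current one first.
# dataset
!rm -rf /content/ForwardTacotron/*.csv
!rm -rf /content/ForwardTacotron/*.wav
# extracted audio (the preprocessing cell unzips it into wavs/)
!rm -rf /content/ForwardTacotron/wavs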
!rm -rf /content/ForwardTacotron/*.zip
# preprocessed:
!rm -rf /content/ForwardTacotron/*.npy
!rm -rf /content/ForwardTacotron/data/*

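Because the removal cannot be undone, it may help to list the matching files first. This is a hypothetical dry run, not part of the original notebook: `files_to_remove` is a made-up helper, and it only covers the top-level file patterns used above, not the `data` folder.

```python
import glob
import os

def files_to_remove(root, patterns=("*.csv", "*.wav", "*.zip", "*.npy")):
    """List files directly under `root` that match the cleanup patterns."""
    matches = []
    for pattern in patterns:
        matches.extend(glob.glob(os.path.join(root, pattern)))
    return sorted(matches)

# Prints nothing when the folder does not exist or has no matching files.
for path in files_to_remove("/content/ForwardTacotron"):
    print("would remove:", path)
```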
## Train!
These steps take time. Stable training is reached after hours, and sometimes a few days, before the final results are obtained. Please read the instructions in each cell carefully.

### To do:
* Add an option for users who want to train a vocoder.
* Retrain the LJSpeech pretrained model so that smaller datasets can be trained. This can take up to 2-3 weeks.
  * When this is finished, make the model downloadable in the training section and add a selection box for the pretrained model.

These features will be added soon.

In [None]:
%%writefile /content/ForwardTacotron/utils/paths.py
# The cell magic above must be the first line of the cell; the lines below are written to the file as comments.
#@markdown ### Apply patches
#@markdown ---
#@markdown Before continuing, I recommend running this cell to patch the path functions.
#@markdown * This patch changes where the generated models are saved, so that they go to the ForwardTacotron folder on your [drive](http://drive.google.com/).
#@markdown * If you skip this patch, the generated models will be saved in the root folder of the project instead of the ForwardTacotron folder created in the drive.
import os
from pathlib import Path


class Paths:
    """Manages and configures the paths used by WaveRNN, Tacotron, and the data."""
    def __init__(self, data_path, voc_id, tts_id):
        self.base = Path(__file__).parent.parent.expanduser().resolve()

        # Data Paths
        self.data = Path(data_path).expanduser().resolve()
        self.quant = self.data/'quant'
        self.mel = self.data/'mel'
        self.gta = self.data/'gta'
        self.alg = self.data/'alg'
        self.raw_pitch = self.data/'raw_pitch'
        self.phon_pitch = self.data/'phon_pitch'
        self.phon_energy = self.data/'phon_energy'

        self.model_output = self.base / 'model_output'

        self.voc_checkpoints = self.data/'../checkpoints'/f'{voc_id}.wavernn'
        self.voc_top_k = self.voc_checkpoints/'top_k_models'
        self.voc_log = self.voc_checkpoints/'logs'

        self.taco_checkpoints = self.data/ '../checkpoints' / f'{tts_id}.tacotron'
        self.taco_log = self.taco_checkpoints / 'logs'

        self.forward_checkpoints = self.data/'../checkpoints'/f'{tts_id}.forward'
        self.forward_log = self.forward_checkpoints/'logs'

        self.create_paths()

    def create_paths(self):
        os.makedirs(self.data, exist_ok=True)
        os.makedirs(self.quant, exist_ok=True)
        os.makedirs(self.mel, exist_ok=True)
        os.makedirs(self.gta, exist_ok=True)
        os.makedirs(self.alg, exist_ok=True)
        os.makedirs(self.raw_pitch, exist_ok=True)
        os.makedirs(self.phon_pitch, exist_ok=True)
        os.makedirs(self.phon_energy, exist_ok=True)
        os.makedirs(self.voc_checkpoints, exist_ok=True)
        os.makedirs(self.voc_top_k, exist_ok=True)
        os.makedirs(self.taco_checkpoints, exist_ok=True)
        os.makedirs(self.forward_checkpoints, exist_ok=True)

    def get_tts_named_weights(self, name):
        """Gets the path for the weights in a named tts checkpoint."""
        return self.taco_checkpoints / f'{name}_weights.pyt'

    def get_tts_named_optim(self, name):
        """Gets the path for the optimizer state in a named tts checkpoint."""
        return self.taco_checkpoints / f'{name}_optim.pyt'

    def get_voc_named_weights(self, name):
        """Gets the path for the weights in a named voc checkpoint."""
        return self.voc_checkpoints/f'{name}_weights.pyt'

    def get_voc_named_optim(self, name):
        """Gets the path for the optimizer state in a named voc checkpoint."""
        return self.voc_checkpoints/f'{name}_optim.pyt'




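With the patch applied, the checkpoint folders sit next to the `data` folder. The sketch below shows how the `'../checkpoints'` part resolves. It is an illustration only: `checkpoint_dir` is a made-up helper, the model names are the defaults from the settings cell, and it normalizes the path as text, whereas `Path.resolve()` in the patch also follows symlinks.

```python
import posixpath

def checkpoint_dir(data_path, model_id, suffix):
    """Resolve '<data_path>/../checkpoints/<model_id>.<suffix>' to a plain path."""
    return posixpath.normpath(
        posixpath.join(data_path, "..", "checkpoints", f"{model_id}.{suffix}")
    )

# The value written to config.yaml by the settings cell, with the default output folder.
data_path = "/content/drive/MyDrive/ForwardTacotron/data/"
print(checkpoint_dir(data_path, "myvoice", "tacotron"))
print(checkpoint_dir(data_path, "myvoice", "forward"))
print(checkpoint_dir(data_path, "myvocoder", "wavernn"))
```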
In [None]:
#@markdown ### Run tensorboard
#@markdown ---
#@markdown TensorBoard is used to visualize the model training process. To follow the progress, go to the **audio**, **image** or **scalars** tabs.
%load_ext tensorboard
# With the paths patch applied, checkpoints and logs are saved next to the data folder on Drive.
%tensorboard --logdir "{data_path}/checkpoints"

In [None]:
#@markdown ### Train 1: Tacotron
#@markdown ---
!python train_tacotron.py

In [None]:
#@markdown ### Train 2: Forward Tacotron
#@markdown ---
#@markdown This trains the final Forward Tacotron model. It uses the preprocessed dataset, together with the alignments and pitch conditions, and the Tacotron model trained in the previous cells.
#@markdown * Training is divided between two models, 150k steps each.
#@markdown * Note that this stage depends on the quality of the Tacotron model. If only a few files are used for training because of bad attention (this is shown when training starts), there is a problem with the dataset. In that case, review it, fix what is necessary, add more data, or check TensorBoard carefully.
!python train_forward.py