# `Flowtron` training colab notebook
notebook made by [rmcpantoja](https://github.com/rmcpantoja)
# credits:
* This notebook is based on the [Original Notebook](https://colab.research.google.com/drive/1B0DLBIcqOejPr_syAzOfeK9Ib1WE78Gq) made by `reubenkway`.

* [nvidia/flowtron repository](http://github.com/nvidia/flowtron)


## settings:
Here is everything you need to install the synthesizer and its dependencies. Remember to run the cells in order for it to work properly.

In [None]:
#@markdown ### 1: check GPU
!nvidia-smi -L

In [None]:
#@markdown ### 2: installation and configuration tools
#@markdown ---
#@markdown This cell mounts your Google Drive so the notebook can read your dataset, then installs the necessary tools.
#@markdown * If you are asked to connect to Google Drive, click the corresponding button.
#@markdown * At the end, you will probably be prompted to `restart the environment`. It is highly recommended to do this before running the following cells.
from google.colab import drive
drive.mount('/content/drive')
%cd /content/
!git clone https://github.com/NVIDIA/flowtron.git
%cd /content/flowtron
!git submodule update --init 
%cd /content/flowtron/tacotron2 
!git submodule update --init
%cd /content/flowtron
!pip install -r requirements.txt

In [None]:
#@markdown ### 2.1: installation and configuration tools - part two
#@markdown ---
#@markdown Run this cell to finish the configuration.
!sed -i -- 's,from numba.decorators import jit as optional_jit, ,g' '/usr/local/lib/python3.7/dist-packages/librosa/util/decorators.py'
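
The path in the `sed` command above is hard-coded to Python 3.7, and the Colab runtime may ship a different Python version. The following is a minimal sketch, using only the standard library, of how the file's location could be resolved instead. `find_package_file` is a hypothetical helper name, not something provided by Flowtron or librosa.

```python
import importlib.util
import os


def find_package_file(package, *relative_parts):
    """Return the path of a file inside an installed package.

    Returns None when the package is not installed or is not a package directory.
    """
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None
    package_dir = list(spec.submodule_search_locations)[0]
    return os.path.join(package_dir, *relative_parts)


# Hypothetical usage: locate the file that the sed command patches.
decorators_path = find_package_file("librosa", "util", "decorators.py")
print(decorators_path)
```

If the printed path is not `None`, it can be passed to `sed` with `{decorators_path}`, the same variable interpolation the notebook already uses for `{training_files}`.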

## training:
In this section you set the parameters, paths and settings for the model, then start training.
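
Each line of the filelists used in the model settings cell below has three `|`-separated fields (audio path, transcript, speaker ID), as in `wavs/1.wav|text|0`. Here is a minimal sketch for checking a filelist before training. `check_filelist` and its messages are hypothetical and not part of Flowtron, and the rules it enforces (exactly three fields, a non-negative integer speaker ID) are assumptions drawn from the example lines in this notebook.

```python
import os


def check_filelist(lines, base_dir=".", check_audio=True):
    """Return a list of (line_number, problem) tuples for a path|text|speaker_id filelist."""
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split("|")
        if len(fields) != 3:
            problems.append((number, "expected 3 fields, found %d" % len(fields)))
            continue
        audio_path, text, speaker_id = fields
        if not text.strip():
            problems.append((number, "empty transcript"))
        if not speaker_id.strip().isdigit():
            problems.append((number, "speaker id is not a non-negative integer"))
        if check_audio and not os.path.isfile(os.path.join(base_dir, audio_path)):
            problems.append((number, "audio file not found: " + audio_path))
    return problems


# Example lines; the third one is missing its speaker id on purpose.
example = [
    "wavs/1.wav|text|0",
    "wavs/1b.wav|second voice text|1",
    "wavs/2.wav|missing speaker id",
]
for number, problem in check_filelist(example, check_audio=False):
    print(number, problem)
```

To check a real filelist, pass `open(path, encoding="utf-8")` as `lines` and set `base_dir` to the directory the audio paths are relative to (assumed here to be `/content/flowtron`).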

In [None]:
#@markdown ### 3: Model settings
#@markdown ---
#@markdown Here you can configure the paths to the audio files and transcripts, as well as the training options.
!mkdir -p /content/flowtron/wavs
#@markdown #### give the model a name
model_name = "testmodel" #@param {type:"string"}
#@markdown ---
#@markdown #### output directory
output_directory = "/content/drive/MyDrive/flowtron_checkpoints" #@param {type:"string"}
#@markdown ---
#@markdown #### the wavs zip path
WavsPath = "" #@param {type:"string"}
!unzip "$WavsPath" -d /content/flowtron/wavs
#@markdown ---
#@markdown #### transcription paths
#@markdown * The validation file is the same one that we will insert in the training file.
#@markdown * Unlike the native single-speaker Tacotron 2 filelists, each line needs a third field: the speaker ID. This makes it possible to train a multi-voice model. Here's an example:
#@markdown  * wavs/1.wav|text|0
#@markdown  * wavs/1b.wav|second voice text|1
training_files = "" #@param {type:"string"}
!sed -i -- 's,"training_files": "filelists/ljs_audiopaths_text_sid_train_filelist.txt", "training_files": "{training_files}",g' '/content/flowtron/config.json'
validation_files = "" #@param {type:"string"}
!sed -i -- 's,"validation_files": "filelists/ljs_audiopaths_text_sid_val_filelist.txt", "validation_files": "{validation_files}",g' '/content/flowtron/config.json'
#@markdown ---
#@markdown #### the learning rate
learning_rate = 0.0005 #@param {type:"number"}
!sed -i -- 's,"learning_rate": 0.001, "learning_rate": {learning_rate},g' '/content/flowtron/config.json'
#@markdown ---
#@markdown #### choose batch size
#@markdown ---
#@markdown A larger batch size uses more GPU memory, so please choose with caution.
batch_size = 4 #@param {type:"integer"}
!sed -i -- 's,"batch_size": 6, "batch_size": {batch_size},g' '/content/flowtron/config.json'
#@markdown ---
#@markdown #### epochs to train
epochs = 500 #@param {type:"integer"}
!sed -i -- 's,"epochs": 10000000, "epochs": {epochs},g' '/content/flowtron/config.json'
#@markdown ---
#@markdown #### iterations per checkpoint (not recommended to change)
iters_per_checkpoint = 25 #@param {type:"integer"}
!sed -i -- 's,"iters_per_checkpoint": 1000, "iters_per_checkpoint": {iters_per_checkpoint},g' '/content/flowtron/config.json'
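
The `sed` commands above only change `config.json` when it still contains the exact default value written in each pattern. If a pattern does not match (for example, when the cell is run a second time), `sed` leaves the file unchanged without reporting an error. As an alternative, here is a minimal sketch that edits the file with Python's `json` module. `set_key` and `update_config` are hypothetical helpers, and the sketch assumes each key name should be updated wherever it appears in the nested config.

```python
import json


def set_key(node, key, value):
    """Set every occurrence of `key` in a nested dict/list; return how many were set."""
    count = 0
    if isinstance(node, dict):
        for name in node:
            if name == key:
                node[name] = value
                count += 1
            else:
                count += set_key(node[name], key, value)
    elif isinstance(node, list):
        for item in node:
            count += set_key(item, key, value)
    return count


def update_config(path, overrides):
    """Apply `overrides` to the JSON file at `path`; return the keys that were not found."""
    with open(path) as handle:
        config = json.load(handle)
    missing = [key for key, value in overrides.items()
               if set_key(config, key, value) == 0]
    with open(path, "w") as handle:
        json.dump(config, handle, indent=4)
    return missing


# Hypothetical usage with the variables defined in the cell above:
# missing = update_config("/content/flowtron/config.json", {
#     "training_files": training_files,
#     "validation_files": validation_files,
#     "learning_rate": learning_rate,
#     "batch_size": batch_size,
#     "epochs": epochs,
#     "iters_per_checkpoint": iters_per_checkpoint,
# })
# print("keys not found:", missing)
```

Because the keys are matched by name rather than by their previous value, the update can be repeated with new settings, and any key that was not found is returned instead of being skipped silently.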

In [None]:
#@markdown ### 4: train
#@markdown ---
#@markdown run this cell to start model training.
#@markdown * You can turn tensorboard on or off in the box below.
#@markdown  * The tensorboard is used to visualize the model training process.
#@markdown * Note: if CUDA runs out of memory, go back to the **Model settings** cell, decrease the batch size and run that cell again.
#@markdown ---
#@markdown to do: fix tensorboard training.
run_tensorboard = True #@param {type:"boolean"}
if run_tensorboard:
  %load_ext tensorboard
  %tensorboard --logdir '$output_directory/logs'
%cd /content/flowtron/
print("Downloading pretrained model...")
!gdown https://drive.google.com/uc?id=1sKTImKkU0Cmlhjc_OeUDLrOLIXvUPwnO
!sed -i -- 's,model_{},{model_name},g' '/content/flowtron/train.py'
!python3 train.py -c config.json -p train_config.learning_rate=$learning_rate train_config.warmstart_checkpoint_path="/content/flowtron/flowtron_libritts2p3k.pt" train_config.output_directory="$output_directory" train_config.with_tensorboard=$run_tensorboard
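
The training command above passes its overrides as `key=value` pairs after `-p`. The following sketch shows one way the same argument list could be assembled in Python, so that values chosen in the settings cell are not typed twice. `build_train_command` is a hypothetical helper, the override values shown are only the defaults from this notebook, and the sketch assumes `train.py` accepts the pairs exactly as they are written in the command above.

```python
import shlex


def build_train_command(config_path, overrides):
    """Build an argument list of the form: python3 train.py -c CONFIG -p key=value ..."""
    command = ["python3", "train.py", "-c", config_path, "-p"]
    for key, value in overrides.items():
        command.append("%s=%s" % (key, value))
    return command


# Example values taken from the defaults in this notebook.
overrides = {
    "train_config.learning_rate": 0.0005,
    "train_config.output_directory": "/content/drive/MyDrive/flowtron_checkpoints",
    "train_config.with_tensorboard": True,
}
command = build_train_command("config.json", overrides)

# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(part) for part in command))
```

The resulting list can be run from `/content/flowtron` with `subprocess.run(command, check=True)`. Passing a list avoids shell quoting problems when the output directory contains spaces.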