# License

> Copyright 2020 NVIDIA. All Rights Reserved.
> 
> Licensed under the Apache License, Version 2.0 (the "License");
> you may not use this file except in compliance with the License.
> You may obtain a copy of the License at
> 
>     http://www.apache.org/licenses/LICENSE-2.0
> 
> Unless required by applicable law or agreed to in writing, software
> distributed under the License is distributed on an "AS IS" BASIS,
> WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
> See the License for the specific language governing permissions and
> limitations under the License.

In [None]:
"""
You can either run this notebook locally (if you have all the dependencies and a GPU) or on Google Colab.
Instructions for setting up Colab are as follows:
1. Open a new Python 3 notebook.
2. Import this notebook from GitHub (File -> Upload Notebook -> "GITHUB" tab -> copy/paste GitHub URL)
3. Connect to an instance with a GPU (Runtime -> Change runtime type -> select "GPU" for hardware accelerator)
4. Run this cell to set up dependencies.
"""
# If you're using Colab and not running locally, run this cell.
!apt-get install sox libsndfile1 ffmpeg
!pip install wget unidecode
BRANCH = 'v1.5.0'
!python -m pip install git+https://github.com/NVIDIA/NeMo.git@$BRANCH#egg=nemo_toolkit[tts]

Reading package lists... Done
Building dependency tree       
Reading state information... Done
libsndfile1 is already the newest version (1.0.28-4ubuntu0.18.04.2).
ffmpeg is already the newest version (7:3.4.8-0ubuntu0.2).
The following additional packages will be installed:
  libmagic-mgc libmagic1 libopencore-amrnb0 libopencore-amrwb0 libsox-fmt-alsa
  libsox-fmt-base libsox3
Suggested packages:
  file libsox-fmt-all
The following NEW packages will be installed:
  libmagic-mgc libmagic1 libopencore-amrnb0 libopencore-amrwb0 libsox-fmt-alsa
  libsox-fmt-base libsox3 sox
0 upgraded, 8 newly installed, 0 to remove and 37 not upgraded.
Need to get 760 kB of archives.
After this operation, 6,717 kB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu bionic/universe amd64 libopencore-amrnb0 amd64 0.1.3-2.1 [92.0 kB]
Get:2 http://archive.ubuntu.com/ubuntu bionic/universe amd64 libopencore-amrwb0 amd64 0.1.3-2.1 [45.8 kB]
Get:3 http://archive.ubuntu.com/ubuntu bion
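The synthesis cell at the end of this notebook moves both models to the GPU with `.cuda()`, so it can help to confirm that a GPU is visible before going further. The snippet below is a minimal sketch: `gpu_available` is a hypothetical helper name, and it assumes PyTorch was installed as a NeMo dependency (it returns `False` if PyTorch cannot be imported).

```python
def gpu_available() -> bool:
    """Return True if PyTorch is importable and reports a usable CUDA device."""
    try:
        import torch  # assumed to be installed as a NeMo dependency
    except ImportError:
        return False
    return bool(torch.cuda.is_available())


print("GPU available:", gpu_available())
```

If this prints `False` on Colab, the runtime type is probably not set to "GPU" (step 3 above).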

# Training

In [None]:
# NeMo's training scripts are stored inside the examples/ folder. Let's grab the tacotron2.py file
!wget https://raw.githubusercontent.com/NVIDIA/NeMo/v1.5.0/examples/tts/tacotron2.py

# Download the training data and config file
!gdown --id 1jI5sQY_Vhyk0lhEJXiBM79nFXA44udtV
!7z x data.7z
!rm data.7z

--2021-11-26 16:20:26--  https://raw.githubusercontent.com/NVIDIA/NeMo/v1.5.0/examples/tts/tacotron2.py
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.108.133, 185.199.109.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1874 (1.8K) [text/plain]
Saving to: ‘tacotron2.py’


2021-11-26 16:20:26 (32.0 MB/s) - ‘tacotron2.py’ saved [1874/1874]

Downloading...
From: https://drive.google.com/uc?id=1jI5sQY_Vhyk0lhEJXiBM79nFXA44udtV
To: /content/data.7z
100% 419M/419M [00:05<00:00, 70.5MB/s]

7-Zip [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,64 bits,2 CPUs Intel(R) Xeon(R) CPU @ 2.30GHz (306F0),ASM,AES-NI)

Scanning the drive for archives:
  0M Scan         1 file, 419357931 bytes (400 MiB)

Extracting archive: data.7z
--
Path = data.7z
T
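The training cell that follows passes `TSync2/tsync2_train.json` and `TSync2/tsync2_test.json` as datasets. A quick look at such a manifest before a long training run can catch path or format problems early. The sketch below assumes the files use NeMo's usual JSON-lines manifest layout (one object per line with `audio_filepath`, `duration`, and `text` keys); that layout is an assumption about this archive, and `summarize_manifest` is a hypothetical helper, not part of NeMo.

```python
import json
import os


def summarize_manifest(path):
    """Return (number of entries, total duration in hours) for a JSON-lines manifest."""
    n_entries = 0
    total_seconds = 0.0
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # skip blank lines
            entry = json.loads(line)
            # Assumed keys, following NeMo's usual manifest layout.
            for key in ("audio_filepath", "duration", "text"):
                if key not in entry:
                    raise KeyError(f"line {line_no}: missing key {key!r}")
            n_entries += 1
            total_seconds += float(entry["duration"])
    return n_entries, total_seconds / 3600.0


if __name__ == "__main__":
    manifest = "TSync2/tsync2_train.json"
    if os.path.exists(manifest):
        count, hours = summarize_manifest(manifest)
        print(f"{count} utterances, {hours:.2f} hours of audio")
```

If the function raises a `KeyError`, the manifest uses different field names than assumed here, and the check should be adapted rather than the data.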

In [None]:
# Train Tacotron 2 on the TSync2 manifests (22.05 kHz audio, batch size 32, validation every epoch)
!python tacotron2.py sample_rate=22050 train_dataset=TSync2/tsync2_train.json validation_datasets=TSync2/tsync2_test.json trainer.max_epochs=5000 trainer.check_val_every_n_epoch=1 model.train_ds.dataloader_params.batch_size=32 model.validation_ds.dataloader_params.batch_size=32

[NeMo W 2021-11-26 16:25:51 optimizers:50] Apex was not found. Using the lamb or fused_adam optimizer will error out.
################################################################################
###          (please add 'export KALDI_ROOT=<your_path>' in your $HOME/.profile)
###          (or run as: KALDI_ROOT=<your_path> python <your_script>.py)
################################################################################

      f"Passing `Trainer(accelerator={self.distributed_backend!r})` has been deprecated"
    
      f"Setting `Trainer(checkpoint_callback={checkpoint_callback})` is deprecated in v1.5 and will "
    
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
      f"Setting `Trainer(flush_logs_every_n_steps={flush_logs_every_n_steps})` is deprecated in v1.5 "
    
[NeMo I 2021-11-26 16:25:52 exp_manager:280] Experiments will be logged at /content/nemo_experiments/Tacotron2/2021-11-26_16-25-52
[NeMo I 2021-11-

In [None]:
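# Synthesis

import IPython.display as ipd
import soundfile as sf
from nemo.collections.tts.models import HifiGanModel, Tacotron2Model

# Load the trained Tacotron 2 checkpoint. exp_manager logs each run under a timestamped
# directory (see the training output), so adjust this path to match your run if needed.
model = Tacotron2Model.restore_from("nemo_experiments/Tacotron2/checkpoints/Tacotron2.nemo").eval().cuda()
# Pretrained HiFi-GAN vocoder that converts mel spectrograms to waveforms.
vocoder = HifiGanModel.from_pretrained("tts_hifigan").eval().cuda()

# Thai for "test speech synthesis", written with spaces between syllables.
token_input = model.parse('ทด สอบ สัง เคราะห์ เสียง พูด')
spec_gen = model.generate_spectrogram(tokens=token_input.to('cuda:0'))
audio = vocoder.convert_spectrogram_to_audio(spec=spec_gen)

# Move the waveform to the CPU once and reuse it for saving and playback.
waveform = audio.to('cpu').detach().numpy()[0]
sf.write("test.wav", waveform, 22050)

# Keep the player as the last expression so the notebook displays it.
ipd.Audio(waveform, rate=22050)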
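After synthesis, a quick check of `test.wav` confirms that the file was written at the expected 22050 Hz and has a plausible duration. The sketch below uses only the standard-library `wave` module. It assumes the file is PCM-encoded, which I believe is `soundfile`'s default for WAV output; `wav_info` is a hypothetical helper name.

```python
import os
import wave


def wav_info(path):
    """Return (sample rate in Hz, channel count, duration in seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        frames = wav.getnframes()
    return rate, channels, frames / rate


if __name__ == "__main__":
    if os.path.exists("test.wav"):
        rate, channels, seconds = wav_info("test.wav")
        print(f"{rate} Hz, {channels} channel(s), {seconds:.2f} s")
```

A sample rate other than 22050 Hz, or a duration near zero, would suggest a problem in the synthesis step rather than in playback.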