# Mongolian Text To Speech with Tacotron

This is a Mongolian text-to-speech (TTS) demo built with Tacotron, using data from the Mongolian Bible audiobook:
```
Tacotron: Towards End-to-End Speech Synthesis
https://arxiv.org/abs/1703.10135
```

The original open source Tacotron repo can be found here: [Rayhane-mamah/Tacotron-2](https://github.com/Rayhane-mamah/Tacotron-2). 

The original repo was forked and updated for Mongolian: [tugstugi/Tacotron-2](https://github.com/tugstugi/Tacotron-2)

Another open-source Mongolian TTS, built with PyTorch and a fully convolutional network, is also available: [tugstugi/pytorch-dc-tts](https://github.com/tugstugi/pytorch-dc-tts)

To test this demo, click on "**Runtime->Run All**" (Google account required).

## Setup

### Install dependencies

In [None]:
%tensorflow_version 1.x
import os
from os.path import exists, join, expanduser

import IPython
from IPython.display import Audio, clear_output

# pyaudio needs this system dependency!
!apt-get install -qq portaudio19-dev > /dev/null
# clone Tacotron-2 and install dependencies
if not exists('Tacotron-2'):
  !git clone https://github.com/tugstugi/Tacotron-2.git && cd Tacotron-2 && pip install -q -r requirements.txt

### Download a pretrained model

In [None]:
if not exists('Tacotron-2/logs-Tacotron/taco_pretrained'):
  # download pretrained model from the Google Drive link
  pretrained_file_id = "1fgx0kpf0Oe2Idz3lUZM6-p343nc-24Hq"
  pretrained_file_name = "taco_pretrained.tar.gz"
  !curl -c ./cookie -s -L "https://drive.google.com/uc?export=download&id={pretrained_file_id}" > /dev/null
  confirm_text = !awk '/download/ {print $NF}' ./cookie
  confirm_text = confirm_text[0]
  !curl -Lb ./cookie "https://drive.google.com/uc?export=download&confirm={confirm_text}&id={pretrained_file_id}" -o {pretrained_file_name}
  # extract it
  # -p avoids an error if the directory already exists (e.g. after a failed earlier run)
  !mkdir -p Tacotron-2/logs-Tacotron/
  !tar xvfz {pretrained_file_name} --directory Tacotron-2/logs-Tacotron/
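
If the download step fails, the saved file may not be a real archive, and `tar` will then fail with a confusing error. The sketch below is an optional sanity check that could be run on the downloaded file before extracting it. The helper name `looks_like_tar_archive` is hypothetical and is not part of the Tacotron-2 repo; it relies only on `tarfile.is_tarfile` from the Python standard library.

```python
import tarfile


def looks_like_tar_archive(path):
    """Return True if `path` exists and can be opened as a (possibly gzipped) tar archive."""
    try:
        return tarfile.is_tarfile(path)
    except OSError:
        # Covers a missing or unreadable file.
        return False
```

For example, calling `looks_like_tar_archive("taco_pretrained.tar.gz")` before the extraction step gives an early hint that the download needs to be repeated when it returns `False`.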

## Synthesize

### Allowed characters

абвгдеёжзийклмноөпрстуүфхцчшъыьэюя-.,!?
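
A quick way to check a sentence against this character set is sketched below. It assumes that input is lowercased before comparison and that whitespace is accepted, since the example sentences in this notebook contain capital letters and spaces; neither assumption is confirmed by the list above. The helper name `find_unsupported_chars` is hypothetical and not part of the Tacotron-2 repo.

```python
# Allowed characters copied from the list above.
ALLOWED_CHARS = set("абвгдеёжзийклмноөпрстуүфхцчшъыьэюя-.,!?")


def find_unsupported_chars(sentence):
    """Return a sorted list of characters in `sentence` outside the allowed set.

    Assumptions: the text is lowercased before comparison, and whitespace is
    always accepted.
    """
    return sorted(
        {ch for ch in sentence.lower() if ch not in ALLOWED_CHARS and not ch.isspace()}
    )
```

An empty result means every character of the sentence is covered; digits and Latin letters, for instance, would be reported and may need to be spelled out in Cyrillic first.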

### Sentences to synthesize

In [None]:
SENTENCES = [
    "Хэнтий, Хангай, Соёны өндөр сайхан нуруунууд. Хойд зүгийн чимэг болсон ой хөвч уулнууд.",
    "Мэнэн, Шарга, Номины өргөн их говиуд. Өмнө зүгийн манлай болсон элсэн манхан далайнууд.", 
    "Энэ бол миний төрсөн нутаг. Монголын сайхан орон."
]
text_list = "\n".join(SENTENCES)

### Synthesize on CPU

In [None]:
# synthesize with Tacotron
!cd Tacotron-2/ && python simple-synthesize.py --text_list "{text_list}"

# show the text and WAV files
clear_output()
for i in range(len(SENTENCES)):
  print(SENTENCES[i])
  IPython.display.display(Audio('Tacotron-2/tacotron_output/logs-eval/wavs/wav-batch_%i_sentence_0-linear.wav' %i, rate=22050))

Хэнтий, Хангай, Соёны өндөр сайхан нуруунууд. Хойд зүгийн чимэг болсон ой хөвч уулнууд.


Мэнэн, Шарга, Номины өргөн их говиуд. Өмнө зүгийн манлай болсон элсэн манхан далайнууд.


Энэ бол миний төрсөн нутаг. Монголын сайхан орон.
