A TensorFlow implementation of Google's Tacotron speech synthesis

tacotron-tensorflow

A TensorFlow implementation of Google's Tacotron and related deep neural network TTS architectures described in the literature.

Aimed especially at English and Korean.

highly inspired by here


Requirements

  • Python 3.x (preferred)
  • TensorFlow 1.x
  • matplotlib
  • librosa
  • numpy
  • tqdm
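
Before training, it can help to confirm that the packages listed above are importable. The following is a minimal sketch, not part of this repository: the helper name `missing_packages` is hypothetical, and the import names are assumed to match the package names in the list. It only checks that each package can be found; it does not verify that the installed TensorFlow is a 1.x release.

```python
import importlib.util

# Import names assumed from the Requirements list above.
REQUIRED = ["tensorflow", "matplotlib", "librosa", "numpy", "tqdm"]


def missing_packages(names=REQUIRED):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

Anything reported as missing should be installable with the `requirements.txt` command shown under Usage.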

Usage

0. Download Dataset

1. Install Prerequisites

python -m pip install -r requirements.txt

2. Adjust Configuration

edit config.py

3. Train!

python train.py

DataSet

DataSet        Samples   Size
LJSpeech-1.1   13100     about 30GB is needed
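
For orientation, the LJSpeech-1.1 archive ships a `metadata.csv` file with one pipe-separated row per clip (clip ID, raw transcription, normalized transcription) and a `wavs` directory of audio files. The sketch below shows one way to turn those rows into (wav path, text) pairs. It is an illustration only: `parse_metadata` is a hypothetical helper and is not taken from `datasets/ljspeech.py`, which may load the data differently.

```python
def parse_metadata(lines, wav_dir="wavs"):
    """Parse LJSpeech-style rows of the form: id|raw text|normalized text.

    Returns a list of (wav_path, normalized_text) pairs.
    """
    pairs = []
    for line in lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) < 3:
            continue  # skip blank or malformed rows
        clip_id, normalized_text = fields[0], fields[2]
        pairs.append((f"{wav_dir}/{clip_id}.wav", normalized_text))
    return pairs


# Example usage (the path is an assumption about where the dataset was unpacked):
# with open("LJSpeech-1.1/metadata.csv", encoding="utf-8") as f:
#     pairs = parse_metadata(f, wav_dir="LJSpeech-1.1/wavs")
```

On a complete copy of the dataset, the number of pairs should match the sample count in the table above.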

Source Tree

│
├── assets
│    └── images       (readme images)
├── datasets
│    ├── ljspeech.py  (LJSpeech 1.1 DataSet)
│    └── ...
├── model
│    └── log data     (training log data)
├── config.py         (whole configuration)
├── dataloader.py     (data loading stuff)
├── model.py          (lots of TTS models)
├── modules.py        (modules frequently used by the models)
├── synthesize.py     (inference)
├── train.py          (model training)
├── utils.py          (useful utils)
└── tfutils.py        (useful TF utils)

Model Architecture

Tacotron 1

(figure: Tacotron 1 architecture)

Tacotron 2

(figure: Tacotron 2 architecture)

DeepVoice v2

Coming soon!

DeepVoice v3

(figure: DeepVoice v3 architecture)

Author

HyeongChan Kim / @kozistr