End-to-end Text-to-Speech with Generative Adversarial Networks

vliu15/tts-gan


This repository contains an implementation of, and end-to-end training scripts for, text-to-speech models based on End-to-End Adversarial Text-to-Speech (Donahue et al. 2020).

Usage

To set up the Python environment, run

python -m venv ttsgan
source ttsgan/bin/activate
python -m pip install --upgrade pip
python -m pip install -r requirements.txt

List the audio files from the LJ-Speech dataset, holding out the first 10 for testing and using the rest for training, by running

ls LJSpeech-1.1/wavs/*.wav | tail -n +11 > train_files.txt
ls LJSpeech-1.1/wavs/*.wav | head -n 10 > test_files.txt
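
The same split can be sketched in Python, which makes the boundary explicit: the first 10 sorted files go to the test list and everything after them to the training list, so the two lists never share a file. The helper names `split_file_list` and `write_split` are hypothetical and are not part of this repository.

```python
import glob
import os
from typing import Iterable, List, Tuple


def split_file_list(paths: Iterable[str], n_test: int = 10) -> Tuple[List[str], List[str]]:
    """Sort the paths and return (train, test) with the first n_test held out."""
    ordered = sorted(paths)
    return ordered[n_test:], ordered[:n_test]


def write_split(wav_dir: str, train_out: str = "train_files.txt",
                test_out: str = "test_files.txt", n_test: int = 10) -> None:
    """Write one .wav path per line to the train and test list files."""
    wavs = glob.glob(os.path.join(wav_dir, "*.wav"))
    train, test = split_file_list(wavs, n_test)
    for out_path, items in ((train_out, train), (test_out, test)):
        with open(out_path, "w") as f:
            f.writelines(item + "\n" for item in items)
```

Calling `write_split("LJSpeech-1.1/wavs")` would produce the two list files in the current directory.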

Specify the path to the LJ-Speech metadata.csv via the --metadata_file flag. Download the CMU pronouncing dictionary and specify its path via the --cmudict_file flag.
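
To give a sense of what these two files hold, here is a minimal sketch that parses them and maps text to phonemes. It assumes the publicly documented formats: pipe-delimited `id|transcription|normalized transcription` lines for metadata.csv, and `WORD  PH1 PH2 ...` lines for the CMU dictionary, with `;;;` comment lines and `WORD(1)` alternate pronunciations. The function names and the out-of-vocabulary fallback are hypothetical; the repository's own loaders may work differently.

```python
import re
from typing import Dict, Iterable, List, Tuple


def parse_metadata(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (clip_id, text) pairs, preferring the normalized transcription."""
    entries = []
    for line in lines:
        fields = line.rstrip("\n").split("|", 2)
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        entries.append((fields[0], fields[-1]))
    return entries


def parse_cmudict(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Map lowercase words to their first listed pronunciation."""
    pronunciations: Dict[str, List[str]] = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith(";;;"):
            continue  # skip blanks and comment lines
        word, *phones = line.split()
        word = re.sub(r"\(\d+\)$", "", word).lower()  # drop the "(1)" variant marker
        pronunciations.setdefault(word, phones)
    return pronunciations


def to_phonemes(text: str, pronunciations: Dict[str, List[str]]) -> List[str]:
    """Look up each word; assumption: unknown words are kept as-is."""
    phones: List[str] = []
    for word in re.findall(r"[a-z']+", text.lower()):
        phones.extend(pronunciations.get(word, [word]))
    return phones
```

In practice the line iterables would come from `open(path, encoding="utf-8")` on the two files passed via the flags above.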

To train, simply run

python train.py -c config.yml
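
Before starting a long training run, it can help to confirm that the inputs prepared in the steps above exist and are non-empty. The following is a minimal sketch of such a check; `missing_inputs` is a hypothetical helper and is not part of this repository.

```python
from pathlib import Path
from typing import Iterable, List


def missing_inputs(paths: Iterable[str]) -> List[str]:
    """Return the paths that are not regular files or are empty."""
    missing = []
    for p in paths:
        path = Path(p)
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(str(p))
    return missing
```

For example, `missing_inputs(["train_files.txt", "test_files.txt", "config.yml"])` returns an empty list when all three files are present and non-empty.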
