Rongjiehuang/Multiband-WaveRNN

Multiband-WaveRNN

PyTorch implementation of the MultiBand-WaveRNN model, based on "Efficient Neural Audio Synthesis" and "Duration Informed Attention Network for Multimodal Synthesis".

Issues

RAW mode and unbatched generation are supported. Contributions that implement MOL mode are welcome.
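As a rough illustration of what RAW mode usually means in WaveRNN-style vocoders, the sketch below assumes the network predicts a class index over quantized sample values rather than mixture-of-logistics (MOL) parameters. The function names and the 9-bit depth are hypothetical and chosen for illustration; they are not taken from this repository.

```python
def float_to_label(x, bits):
    """Map a sample in [-1.0, 1.0] to an integer class in [0, 2**bits - 1]."""
    if not -1.0 <= x <= 1.0:
        raise ValueError("sample must lie in [-1, 1]")
    levels = 2 ** bits - 1
    return int(round((x + 1.0) * levels / 2.0))


def label_to_float(label, bits):
    """Map an integer class in [0, 2**bits - 1] back to a sample in [-1.0, 1.0]."""
    levels = 2 ** bits - 1
    return 2.0 * label / levels - 1.0


# Assumed bit depth, for illustration only.
BITS = 9
label = float_to_label(0.3, BITS)
restored = label_to_float(label, BITS)
print(label, restored)
```

The round trip is lossy: the restored value differs from the input by at most one quantization step, which is the trade-off a categorical output makes against a continuous one such as MOL.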

Installation

Ensure you have:

Then install the rest with pip:

pip install -r requirements.txt

How to Use

Training your own Models

Download the LJSpeech Dataset.

Edit hparams.py, point wav_path to your dataset and run:

python preprocess.py

Alternatively, use preprocess.py --path to point directly to the dataset.
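Before preprocessing, it can help to confirm that the directory you set as wav_path (or pass via --path) actually contains audio files. The helper below is a hypothetical sanity check, not part of this repository; it assumes the dataset stores its audio as .wav files somewhere under the given directory.

```python
from pathlib import Path


def find_wavs(wav_path):
    """Return a sorted list of .wav files found anywhere under wav_path."""
    root = Path(wav_path).expanduser()
    if not root.is_dir():
        raise FileNotFoundError(f"wav_path is not a directory: {root}")
    # rglob searches recursively, so nested layouts are also covered.
    return sorted(root.rglob("*.wav"))
```

For example, `len(find_wavs("/path/to/dataset"))` (a placeholder path) should be greater than zero before you run the preprocessing step.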


Here's my recommendation on what order to run things:

1 - Train WaveRNN with:

python train_wavernn.py

2 - Generate sentences with the trained model using:

python gen_wavernn.py
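The steps above can also be chained from a single script. This is a minimal sketch, assuming the three scripts are run from the repository root and that each exits with a non-zero status on failure; build_commands and run_pipeline are hypothetical names, and only the --path flag mentioned above is used.

```python
import subprocess
import sys


def build_commands(dataset_path=None):
    """Return the preprocess, train, and generate commands in run order."""
    preprocess = [sys.executable, "preprocess.py"]
    if dataset_path is not None:
        # --path points preprocess.py directly at the dataset.
        preprocess += ["--path", str(dataset_path)]
    return [
        preprocess,
        [sys.executable, "train_wavernn.py"],
        [sys.executable, "gen_wavernn.py"],
    ]


def run_pipeline(dataset_path=None, repo_dir="."):
    """Run each step in order, stopping at the first failure."""
    for cmd in build_commands(dataset_path):
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, cwd=repo_dir, check=True)
```

Calling `run_pipeline("/path/to/dataset")` (a placeholder path) would run the three steps in sequence. Because training can take a long time, running each step by hand is often more practical.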

Speech

Mandarin

Audio samples for speakers #1 to #5, comparing: Recording, WaveRNN, Parallel WaveGAN, FB MelGAN, SingVocoder.

English

Audio samples for speakers #1 to #5, comparing: Recording, WaveRNN, Parallel WaveGAN, FB MelGAN, SingVocoder.

References

Acknowledgements