Parallel WaveNet vocoder

Note: the code is adapted from r9y9's wavenet vocoder; you can find more information about WaveNet there.

Samples

Some problems still exist:

The wav generated by the teacher has some noise in silent regions (at 1000k steps).
The wav generated by the student still has a little noise, but most of the high-frequency noise has been removed.

Important details

Use ReLU rather than leaky ReLU.
Don't apply the skip connection after the residual connection, the same as in r9y9's implementation.
Set share_upsample_conv=True in hparams.py when you train the student.
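One reading of the second note, sketched on a single scalar channel: the skip output branches off the gated activation, and the residual sum is computed separately afterwards. The function name, the scalar weights, and the sqrt(0.5) scaling are assumptions for illustration, not code taken from this repository.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def residual_layer(x, w_filter, w_gate, w_res, w_skip):
    """Simplified one-channel WaveNet layer (hypothetical helper).

    Real layers use dilated convolutions and conditioning inputs;
    here every weight is a single scalar to keep the ordering visible.
    """
    # Gated activation unit: tanh(filter) * sigmoid(gate).
    z = math.tanh(w_filter * x) * sigmoid(w_gate * x)
    # The skip output is taken from z directly ...
    skip = w_skip * z
    # ... and not from the residual sum computed here.
    out = (w_res * z + x) * math.sqrt(0.5)  # scaling is an assumption
    return out, skip
```

In this arrangement, changing w_res changes the value passed to the next layer but leaves the skip output untouched.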

Quick Start

Prepare Data

# The first argument is the dataset name; ljspeech is used as the default here.
python preprocess.py \
    ljspeech \
    your_data_dir \
    the_dir_to_save_data/ \
    --preset=presets/ljspeech_gaussian.json

Train Autoregressive WaveNet (Teacher)

# In my experiment I used 3 GPUs (1080 Ti).
python train.py \
    --preset=presets/ljspeech_gaussian.json \
    --data-root=your_data_dir \
    --hparams='batch_size=9,' \
    --checkpoint-dir=checkpoint-ljspeech \
    --log-event-path=log-ljspeech
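The batch sizes in this README are tied to the number of GPUs used. A small sketch of that arithmetic, assuming the batch is split evenly across GPUs (the README does not state how train.py distributes it); format_hparams and per_gpu_batch are hypothetical helpers, not part of the repository.

```python
def format_hparams(overrides):
    """Render {"batch_size": 9} as "batch_size=9" (comma-separated pairs)."""
    return ",".join(f"{key}={value}" for key, value in overrides.items())


def per_gpu_batch(batch_size, num_gpus):
    """Samples per GPU per step, assuming an even split across GPUs."""
    if batch_size % num_gpus != 0:
        raise ValueError("batch_size is not divisible by num_gpus")
    return batch_size // num_gpus


teacher_share = per_gpu_batch(9, 3)  # teacher: batch_size=9 on 3 GPUs
student_share = per_gpu_batch(8, 4)  # student: batch_size=8 on 4 GPUs
```

When training on a different number of GPUs, scaling batch_size so that the per-GPU share stays similar is a reasonable starting point.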

Synthesis Using Teacher

python synthesis.py \
    --conditional your_local_condition_path \
    --preset=presets/ljspeech_gaussian.json \
    your_teacher_checkpoint_path \
    your_save_dir
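The --conditional argument takes the path to a local conditioning feature file. A minimal sketch for listing candidates, assuming (this README does not confirm it) that preprocess.py writes these features as .npy files with "mel" in the file name, as r9y9's wavenet vocoder does; find_condition_files is a hypothetical helper.

```python
import glob
import os


def find_condition_files(data_dir, pattern="*mel*.npy"):
    """Return sorted paths in data_dir that match the assumed naming scheme."""
    return sorted(glob.glob(os.path.join(data_dir, pattern)))
```

Any path returned this way could be passed as your_local_condition_path, provided the naming assumption holds for your preprocessed data.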

Train Distillation WaveNet (Student)

# In my experiment I used 4 GPUs (1080 Ti).
python train_student.py \
    --preset=presets/ljspeech_gaussian.json \
    --data-root=your_data_dir \
    --hparams='batch_size=8,' \
    --checkpoint-dir=checkpoint-ljspeech_student \
    --log-event-path=log-ljspeech_student \
    --checkpoint_teacher=your_teacher_checkpoint_path
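For background on what distillation optimizes: with Gaussian outputs, the ClariNet paper listed under References trains the student against a closed-form, regularized KL divergence between the student distribution q and the teacher distribution p. The sketch below writes that formula for a single time step. The function names are hypothetical, and whether train_student.py uses exactly this loss or the paper's weight of 4 is an assumption.

```python
import math


def gaussian_kl(mu_q, log_sigma_q, mu_p, log_sigma_p):
    """Closed-form KL(q || p) between two univariate Gaussians."""
    var_q = math.exp(2.0 * log_sigma_q)
    var_p = math.exp(2.0 * log_sigma_p)
    return (log_sigma_p - log_sigma_q
            + (var_q - var_p + (mu_p - mu_q) ** 2) / (2.0 * var_p))


def regularized_kl(mu_q, log_sigma_q, mu_p, log_sigma_p, lam=4.0):
    """KL plus a penalty on the gap between the two log standard deviations."""
    penalty = lam * (log_sigma_p - log_sigma_q) ** 2
    return penalty + gaussian_kl(mu_q, log_sigma_q, mu_p, log_sigma_p)
```

The penalty term keeps the loss well behaved when both standard deviations are very small, which is the case the paper gives as the reason for adding it.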

Synthesis Using Student

python synthesis_student.py \
    --conditional your_local_condition_path \
    --preset=presets/ljspeech_gaussian.json \
    your_checkpoint_path \
    your_save_dir

References

ClariNet: Parallel Wave Generation in End-to-End Text-to-Speech
