Speech synthesis using recurrent neural networks.

This repository contains the code for our ICLR submission:

Jose Sotelo, Soroush Mehri, Kundan Kumar, João Felipe Santos, Kyle Kastner, Aaron Courville, Yoshua Bengio. Char2Wav: End-to-End Speech Synthesis.

The website is here.

NOTE: The code is currently being refactored, cleaned, and documented. We wanted to make it available as soon as possible, but we are well aware that the current version is not ready for replication. If you're interested, please check back later. Alternatively, you can send me an email and I will let you know when it's ready.

NOTE(2): The code for the neural vocoder is based on sampleRNN.

Updates:

SampleRNN modules added
The end-to-end model can generate more than 4 s of audio per second of wall-clock time on a P6000 GPU. (This is achieved by generating 200 samples of 10 s each in a single batch; the whole batch takes 448 seconds.)
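
The speed figure above follows from simple arithmetic on the numbers quoted in the update. The sketch below only reproduces that calculation; it does not call any code from this repo, and the function name `realtime_factor` is a hypothetical helper introduced for illustration.

```python
# Figures quoted in the update above.
batch_size = 200          # samples generated in a single batch
seconds_per_sample = 10   # length of each generated sample, in seconds
wall_clock_seconds = 448  # total generation time for the whole batch


def realtime_factor(n_samples, sample_seconds, elapsed_seconds):
    """Seconds of audio produced per second of wall-clock time.

    Hypothetical helper for illustration; not part of this repo.
    """
    return n_samples * sample_seconds / elapsed_seconds


total_audio = batch_size * seconds_per_sample
factor = realtime_factor(batch_size, seconds_per_sample, wall_clock_seconds)

# Prints: 2000 s of audio in 448 s -> 4.46x real time
print(f"{total_audio} s of audio in {wall_clock_seconds} s -> {factor:.2f}x real time")
```

Note that this is batched throughput: each individual 10 s sample still takes the full 448 seconds to finish, so the figure should not be read as the latency of a single utterance.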
