New LJSpeech model #108

Closed
erogol opened this issue Feb 18, 2019 · 0 comments
Labels
model-release explanation for new model releases

erogol commented Feb 18, 2019

The main improvements in this model are:

  • A history queue is used as the auto-regressive connection. It lets you change the prediction frame size (r) by fine-tuning only the output layer of the network, without any architectural change. It also addresses the prenet dropout problem discussed in #50.

  • Embedding layers to initialize decoder hidden states.

  • Phoneme-based training for faster convergence and robust results.

  • The prediction frame size is 2 (r=2), which gives better voice synthesis and spectrogram reconstruction. It also makes the model a better candidate for pairing with a neural vocoder.

  • No explicit weighting for the lower-frequency part of the spectrograms.

  • The model was trained with r=5 for 120K iterations, and training then continued until 185K iterations.
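
The history queue idea from the first item can be sketched roughly as follows. This is only an illustration, not the code used in this repository: the `HistoryQueue` class, its method names, and the sizes (`queue_len=5`, `frame_dim=4`) are all hypothetical. The point it tries to show is that the auto-regressive input has a fixed size however many frames are predicted per step, so changing r does not change the input side of the decoder.

```python
from collections import deque


class HistoryQueue:
    """Fixed-length queue of the most recent output frames (hypothetical helper)."""

    def __init__(self, queue_len, frame_dim):
        self.frame_dim = frame_dim
        # Start from all-zero frames, standing in for the initial "go" input.
        self.frames = deque(
            [[0.0] * frame_dim for _ in range(queue_len)], maxlen=queue_len
        )

    def push(self, new_frames):
        # Append the r frames predicted at this step.
        # deque(maxlen=...) silently drops the oldest frames.
        for frame in new_frames:
            assert len(frame) == self.frame_dim
            self.frames.append(list(frame))

    def as_input(self):
        # Flatten the queue into one vector of length queue_len * frame_dim.
        return [x for frame in self.frames for x in frame]


def input_size_after_decoding(r, steps=3, queue_len=5, frame_dim=4):
    """Run a few dummy decoder steps and return the auto-regressive input size."""
    queue = HistoryQueue(queue_len, frame_dim)
    for step in range(steps):
        # Stand-in for the decoder output: r dummy frames per step.
        predicted = [[float(step)] * frame_dim for _ in range(r)]
        queue.push(predicted)
    return len(queue.as_input())


print(input_size_after_decoding(r=5))  # 20
print(input_size_after_decoding(r=2))  # 20
```

Both calls print the same size because the queue length, not r, determines what is fed back to the decoder.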

This model is the checkpoint from iteration 185K. Longer training might give better results.
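
To make the effect of the prediction frame size concrete, here is a small arithmetic sketch. It assumes the usual Tacotron-style setup in which the output layer predicts r frames at once, and it uses an assumed spectrogram size of 80 bins and an assumed utterance length of 800 frames; neither number comes from this issue, and both function names are made up for illustration.

```python
import math


def decoder_steps(n_frames, r):
    # Each decoder step emits r frames, so fewer steps are needed for larger r.
    return math.ceil(n_frames / r)


def output_layer_size(frame_dim, r):
    # Assumption: the output layer projects to r frames of frame_dim values each.
    return frame_dim * r


N_FRAMES = 800   # assumed utterance length in frames
FRAME_DIM = 80   # assumed number of spectrogram bins per frame

for r in (5, 2):
    print(r, decoder_steps(N_FRAMES, r), output_layer_size(FRAME_DIM, r))
# 5 160 400
# 2 400 160
```

Under these assumptions, going from r=5 to r=2 changes only the width of the output projection, while the decoder runs more steps per utterance.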

@erogol erogol added the model-release explanation for new model releases label Feb 18, 2019
@erogol erogol closed this as completed Feb 18, 2019
@erogol erogol mentioned this issue Feb 18, 2019