Would love to go through the workshop #1
The notebook is missing the pre-trained models: https://github.com/respeecher/vae_workshop/blob/master/latent_codes.ipynb
Do you plan to upload them? I would love to go through the workshop.
Thanks!
Comments
@JohannesTK unfortunately, we can't publish pretrained models because we've used proprietary data to train them. However, it is really easy to train one yourself: just grab one of the LibriSpeech books (with good quality) and follow the instructions in README.md. I've tried to make sure the process is as painless as possible. Let me know how it goes!
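For anyone following along, here is a minimal sketch of grabbing a LibriSpeech subset with torchaudio. The workshop's actual preprocessing steps live in its README.md, so the root path below is an assumption, not the workshop's pipeline:

```python
# Sketch: download a LibriSpeech subset and inspect one utterance.
# The "./data" root is hypothetical; the workshop's README describes
# its own expected layout.
import torchaudio

dataset = torchaudio.datasets.LIBRISPEECH("./data", url="dev-clean", download=True)

# Each item is (waveform, sample_rate, transcript, speaker_id, chapter_id, utterance_id).
waveform, sample_rate, transcript, speaker_id, *_ = dataset[0]
print(waveform.shape, sample_rate, speaker_id)
```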
Thanks for the fast answer! Tried it out with LibriSpeech dev-clean, which has a total of 323 minutes of data, split 90% train / 10% test (a sketch of such a split is below). Didn't get good results. The VAE training loss exploded:
The mel spectrogram inverter looks better (test_mel_pred: https://instaud.io/3OsU). It seems like the VAE model needs more training steps, but the loss explodes. What are your thoughts?
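A rough sketch of the 90/10 utterance-level split mentioned above, assuming the audio has already been unpacked locally (the glob path is hypothetical):

```python
# Sketch: reproducible 90/10 train/test split over utterance files.
import random
from pathlib import Path

files = sorted(Path("data/LibriSpeech/dev-clean").rglob("*.flac"))
random.seed(42)  # fixed seed so the split is reproducible
random.shuffle(files)

cut = int(0.9 * len(files))
train_files, test_files = files[:cut], files[cut:]
print(f"{len(train_files)} train / {len(test_files)} test utterances")
```

Note that a random utterance-level split like this puts the same speakers in both train and test; for a single-speaker setup, as suggested below, that distinction stops mattering.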
Hm, yeah, the loss looks really unstable, and it should definitely sound better. The reason might be that we trained the model on a single speaker, whereas the dataset you've used contains a whole bunch of them. Let me take a look at it a little bit later. In the meantime, I'd suggest trying a single clean speaker (e.g. chopping up one good book from LibriVox, as in the sketch below) and looking for anomalies in the TensorBoard images.
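One way to do the chopping is to split a long single-speaker recording on silence. A sketch with librosa, where the input file name and the top_db threshold are assumptions to tune per recording:

```python
# Sketch: chop one long LibriVox recording into short clips on silence.
import os
import librosa
import soundfile as sf

y, sr = librosa.load("librivox_book.mp3", sr=22050, mono=True)

# Non-silent intervals as (start, end) sample indices; top_db controls
# how quiet a stretch must be to count as silence.
intervals = librosa.effects.split(y, top_db=30)

os.makedirs("clips", exist_ok=True)
for i, (start, end) in enumerate(intervals):
    clip = y[start:end]
    if len(clip) < sr:  # drop fragments shorter than one second
        continue
    sf.write(f"clips/clip_{i:04d}.wav", clip, sr)
```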