VCTK datasets #1
Comments
Hi, thank you for replying to the question about the VCTK dataset, but I have run into a new problem with LibriTTS. I loaded the pretrained models 100000.pth.tar and 600000.pth.tar from this repo, but the synthesis quality is very poor. Have you ever gotten better results?
Sorry for the late response. I think it is because the meta-learning part is not perfect yet. I need to optimize it further and will share the results if I succeed. Stay tuned!
I fixed the optimizer and can now confirm that the meta-learner is working. I'll share the pre-trained model soon.
@XXXHUA Also, try downsampling the training data from 22050 Hz to 16 kHz. It will speed up training and also improve the generated speech quality (compared to the current pre-trained models at the same step).
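For reference, the downsampling suggested above could be sketched roughly as follows. This is only a minimal illustration, not the preprocessing code of this repo: it assumes NumPy and SciPy are available, the function name `downsample` is hypothetical, and the test signal is a synthetic sine wave rather than real training audio. Frame-based settings such as the hop length would likely need to be adjusted for the new rate as well.

```python
from math import gcd

import numpy as np
from scipy.signal import resample_poly


def downsample(wave: np.ndarray, orig_sr: int = 22050, target_sr: int = 16000) -> np.ndarray:
    """Resample a 1-D waveform with a polyphase filter (hypothetical helper)."""
    # Reduce the rate ratio to the smallest integers: 16000/22050 -> 320/441.
    g = gcd(orig_sr, target_sr)
    up, down = target_sr // g, orig_sr // g
    # resample_poly upsamples by `up`, low-pass filters, then downsamples by `down`.
    return resample_poly(wave, up, down)


# One second of a synthetic 440 Hz tone at 22050 Hz, used only as a stand-in for real audio.
one_second = np.sin(2 * np.pi * 440.0 * np.arange(22050) / 22050)
resampled = downsample(one_second)
print(len(one_second), len(resampled))
```

The polyphase resampler applies an anti-aliasing filter as part of the rate change, which is why it is used here instead of simply dropping samples.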
Thanks for the series of detailed replies. I have tried using 16 kHz data with this model, but I suspect the poor performance I saw results from the 24 kHz vocoder. Is the latest model trained on 16 kHz data?
Ah, I didn't know that you used a 24 kHz vocoder. Yes, the mismatch between the synthesizer and the vocoder is definitely the bottleneck behind the poor quality. You may need to match them first. The latest model is also trained on 22050 Hz data.
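A quick way to catch this kind of mismatch before synthesis is to compare the audio settings of the two models. The sketch below is an assumption-laden illustration: the key names (`sampling_rate`, `hop_length`, and so on) and all values other than the 22050 Hz and 24 kHz rates mentioned above are hypothetical and would need to be replaced with whatever the actual synthesizer and vocoder configs use.

```python
def find_mismatches(synth_cfg: dict, vocoder_cfg: dict,
                    keys=("sampling_rate", "hop_length", "win_length", "n_fft", "n_mels")) -> dict:
    """Return {key: (synth_value, vocoder_value)} for every setting that differs."""
    return {
        k: (synth_cfg.get(k), vocoder_cfg.get(k))
        for k in keys
        if synth_cfg.get(k) != vocoder_cfg.get(k)
    }


# Hypothetical settings: only the two sampling rates come from the thread above.
synth = {"sampling_rate": 22050, "hop_length": 256, "n_mels": 80}
vocoder = {"sampling_rate": 24000, "hop_length": 300, "n_mels": 80}

mismatches = find_mismatches(synth, vocoder)
for key, (s, v) in mismatches.items():
    print(f"{key}: synthesizer={s}, vocoder={v}")
```

If the returned dictionary is non-empty, the mel-spectrograms produced by the synthesizer are probably not in the format the vocoder was trained on.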
Hi, I noticed that your paper evaluates the models' performance on the VCTK dataset, but I don't see the preprocessing files for VCTK. Could you share them? Thank you very much.