Adjusting for other languages #1
Comments
Hi @AnnCod, I think it can work for non-English languages. We tested this solution on Chinese speech before and got good results, although the audio quality was not good. I think it can work because:
If you want decent audio quality, you could try a pretrained model trained on a multilingual corpus, such as XLS-R, and then train a vocoder on your target language.
Thanks for the reply. Is this demo working correctly? I get some errors when trying to run it on Colab.
Sorry, I accidentally misspelled a variable name. Fixed by 8060405.
But there's still an error: "RuntimeError: The size of tensor a (23866) must match the size of tensor b (214) at non-singleton dimension 2"
I can't reproduce this error. Would you consider sharing your Colab notebook with this output?
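For anyone hitting the same message: it is a broadcasting failure, meaning two tensors disagree in a dimension where neither has size 1. The sketch below is not part of this repository. It is a small pure-Python helper (the name `broadcast_mismatches` and the example shapes are hypothetical) that reports which dimensions of two shapes cannot broadcast, which may help narrow down where the lengths 23866 and 214 diverge, for example by printing shapes before the failing operation.

```python
def broadcast_mismatches(shape_a, shape_b):
    """Return (dim, size_a, size_b) for every dimension that cannot broadcast.

    Shapes are aligned from the trailing dimension and missing leading
    dimensions are treated as size 1, following the usual broadcasting rule.
    """
    ndim = max(len(shape_a), len(shape_b))
    padded_a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    padded_b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    return [
        (dim, a, b)
        for dim, (a, b) in enumerate(zip(padded_a, padded_b))
        if a != b and a != 1 and b != 1
    ]


# Hypothetical shapes: only the sizes 23866 and 214 come from the error above;
# the leading dimensions (1, 80) are assumed for illustration.
print(broadcast_mismatches((1, 80, 23866), (1, 80, 214)))
```

If the mismatch is in the last dimension, as here, one possible cause is that the two tensors count time at different rates (for instance samples versus frames), but that is only a guess without the notebook output.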
Hi,
Do you think this solution can be easily adapted to work on languages other than English?