Request: Alternative Pretrained models for UK accent #388

Closed
munceyboyjoe opened this issue Jun 29, 2020 · 4 comments

Comments

@munceyboyjoe

Whilst this tool is fascinating and fun to play with, the pretrained models made available for download impart a US accent on UK English speakers. I don't have the compute power to train models. Does anyone know of available models for download that would fare better with UK accents?

@ghost ghost changed the title Alternative Pretrained models for UK accent Request: Alternative Pretrained models for UK accent Jun 29, 2020
@deeepgandhi

Yeah, I face a similar problem. If you can find datasets of speakers with UK accents, you can finetune the speaker encoder model on them. I think that would improve the results.

@ghost

ghost commented Aug 7, 2020

As @deepgandhideep mentioned, you can finetune a single-speaker model. I have provided directions in #437 (comment). The training can be completed in less than a day without a GPU.

@ghost ghost mentioned this issue Aug 21, 2020
@ghost

ghost commented Sep 30, 2020

I tried training on VCTK, but there is not enough data to make a good voice cloning model. Many of the speakers in that dataset have accents that are unsuitable for this purpose (American, Canadian, Australian, Indian). Furthermore, the vocabulary in that corpus lacks the variety that we have in LibriSpeech/TTS, and that variety is what makes for a better synthesizer.

If anyone wants this, please provide the dataset and transcripts in a usable format (preferably the one in #437 (comment)). Common Voice is a potential dataset, as you can use the metadata to filter by accent. You could also explore the speakers in the Spoken Wikipedia Corpora (https://nats.gitlab.io/swc/).
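For anyone who wants to try the Common Voice route, here is a minimal sketch of filtering by accent. It assumes the release ships a tab-separated metadata file with `path`, `sentence` and `accent` columns, and that UK speakers carry labels such as `england`, `scotland` or `wales`. Those column names and labels are assumptions on my part, so check the header and values in your own download. The helper name `filter_by_accent` is made up for this example.

```python
import csv
import io


def filter_by_accent(tsv_file, wanted_accents, accent_column="accent"):
    """Yield (path, sentence) for rows whose accent label is in wanted_accents.

    tsv_file is any open text file object containing tab-separated metadata.
    The column names used here are assumptions; adjust them to your release.
    """
    reader = csv.DictReader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Many clips have no accent label at all; treat those as no match.
        accent = (row.get(accent_column) or "").strip().lower()
        if accent in wanted_accents:
            yield row["path"], row["sentence"]


# Tiny in-memory stand-in for a metadata file (made-up rows, not real data).
sample = io.StringIO(
    "client_id\tpath\tsentence\taccent\n"
    "a1\tclip_001.mp3\tGood morning.\tengland\n"
    "b2\tclip_002.mp3\tHow are you?\tus\n"
    "c3\tclip_003.mp3\tMind the gap.\t\n"
    "d4\tclip_004.mp3\tLovely weather.\tscotland\n"
)

uk_clips = list(filter_by_accent(sample, {"england", "scotland", "wales"}))
for path, sentence in uk_clips:
    print(path, sentence)
```

With a real download you would pass a file opened with `open(tsv_path, newline="", encoding="utf-8")` in place of the in-memory sample. The resulting clips and transcripts would still need converting to the format described in #437 (comment).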

@ghost

ghost commented Oct 6, 2020

Closing this issue due to inactivity. Anyone is welcome to reopen it if taking this project on, or if a dataset is located that isn't VCTK. It is possible to train a voice cloning model on VCTK, but the result doesn't meet my standards.

@ghost ghost closed this as completed Oct 6, 2020
@ghost ghost mentioned this issue Oct 8, 2021