Data requirements for fine tuning lj speech to learn my voice in english #264

JRMeyer · 2021-03-07T08:54:02Z

>>> Hassan_Jalil
[August 28, 2020, 8:05pm]

So I have been going through the issues on GitHub and the questions here on
Discourse, and my understanding is that if I want to train Mozilla TTS on my
own voice in English, the best approach is to fine-tune the pretrained
model with a new dataset.
Now I have a few questions regarding this

1. How much data is needed for fine-tuning, considering the new data is
also in English but will have a slightly different accent (I am from
Pakistan) and the voice is male? Is 4-5 hours of clean, good data
enough?

2. So: clean the dataset, give it a structure similar to LJ Speech, update
the config, and start training?
Can someone provide a basic how-to on getting started with fine-tuning
a pretrained model with my own dataset?

3. For the dataset we require only audio and transcripts, right? We don't
need alignments?
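
For reference, the LJ Speech layout mentioned in question 2 is a `wavs/` folder of clips plus a `metadata.csv` whose lines are pipe-separated: `id|transcription|normalized transcription`. No alignment file is part of that layout. Below is a minimal sketch of writing such a file for your own recordings. The function name `write_ljspeech_metadata` and the clip ids are hypothetical, and the sketch assumes you have no separately normalized text, so it repeats the raw transcript in the third column.

```python
from pathlib import Path


def write_ljspeech_metadata(dataset_dir, transcripts):
    """Write metadata.csv in the LJ Speech layout.

    transcripts maps a clip id (file name without ".wav") to its text.
    Hypothetical helper for illustration, not part of Mozilla TTS.
    """
    dataset_dir = Path(dataset_dir)
    wav_dir = dataset_dir / "wavs"
    rows = []
    for clip_id, text in sorted(transcripts.items()):
        # Every transcript line should point at an existing audio clip.
        if not (wav_dir / f"{clip_id}.wav").is_file():
            raise FileNotFoundError(f"missing audio for {clip_id}")
        # Collapse whitespace; a literal "|" would break the column layout.
        text = " ".join(text.split())
        if "|" in text:
            raise ValueError(f"transcript for {clip_id} contains '|'")
        # Assumption: no separate normalized text, so reuse the raw text.
        rows.append(f"{clip_id}|{text}|{text}")
    out_path = dataset_dir / "metadata.csv"
    out_path.write_text("\n".join(rows) + "\n", encoding="utf-8")
    return out_path
```

Called as `write_ljspeech_metadata("my_voice", {"clip_0001": "Hello there."})`, it expects `my_voice/wavs/clip_0001.wav` to exist and writes `my_voice/metadata.csv` next to the `wavs/` folder.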

Thank you. I know these are noobish questions, but I am starting out and
I couldn't find answers to these questions.

[This is an archived TTS discussion thread from discourse.mozilla.org/t/data-requirements-for-fine-tuning-lj-speech-to-learn-my-voice-in-english]

JRMeyer · 2021-03-07T08:54:05Z

>>> georroussos
[September 1, 2020, 12:08pm]

Hi! Yes, this sounds about right. 4-5 hours should be enough if the data
is clean, and you should have good results after 30k-40k additional
steps.
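
To check whether a recording set actually reaches the 4-5 hour range, a small sketch like the one below can total the clip lengths. It uses only Python's standard `wave` module, so it assumes uncompressed PCM `.wav` files; the function names `clip_seconds` and `total_hours` and the `my_voice/wavs` path are hypothetical.

```python
import wave
from pathlib import Path


def clip_seconds(path):
    """Duration of one PCM .wav file in seconds (frames / sample rate)."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def total_hours(wav_dir):
    """Sum the durations of all .wav files directly inside wav_dir."""
    seconds = sum(clip_seconds(p) for p in Path(wav_dir).glob("*.wav"))
    return seconds / 3600.0


if __name__ == "__main__":
    # Hypothetical dataset location; adjust to your own folder.
    print(f"{total_hours('my_voice/wavs'):.2f} hours of audio")
```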

[Archived Post]
