Advice on prepping datasets other than LJspeech? #6

scripples · 2020-06-07T21:03:10Z

Hi, I'm trying to prep my own dataset to train on the ForwardTacotron model--could you give any insight as to what train_tacotron.py or train_forward.py is expecting in terms of training data organization? Like, the old NVIDIA TT2 repo expects two text files formatted in a certain way and a path to the WAV files in the arguments. Is there something similar for this repo?

cschaefer26 · 2020-06-08T08:39:40Z

Hey, just prepare your dataset in the LJSpeech format:

|- dataset_folder/
|   |- metadata.csv
|   |- wav/
|       |- file1.wav
|       |- ...

If the language differs from English, make sure you set the correct language in the hparams.py file:

language = 'fr'
tts_cleaner_name = 'basic_cleaners'

Then just follow the preprocessing steps from the README, pointing them at this folder. Everything else should be done automatically, including splitting the dataset into train/val.

I updated the README to be clearer on this. Best of luck!

scripples · 2020-06-11T01:40:30Z

Thanks for your help! I'm looking forward to giving it a try.

ghost · 2020-10-04T05:01:51Z

Where can I find the list of supported languages?

prajwaljpj · 2020-10-20T08:13:49Z

@paklau99988 you can find the list of languages from here

scripples closed this as completed Jun 11, 2020
