
Help making Italian Vocoder/Synthesizer #697

Closed
xzVice opened this issue Mar 8, 2021 · 11 comments

Comments

@xzVice

xzVice commented Mar 8, 2021

Let's suppose I got the Italian dataset from here (the ASR one, in FLAC): http://www.openslr.org/94/
How am I supposed to create all the pretrained models from it (the .pt files for the vocoder, synthesizer, and encoder)?

@ghost

ghost commented Mar 8, 2021

Please start by reading my advice on training. This contains the link to training documentation: #431 (comment)

If I were doing this, I would reuse the encoder and vocoder models. For the synthesizer, you have the option of training from scratch or finetuning the English model. Training from scratch should give better pronunciation and prosody. Finetuning will reduce training time and possibly have better voice similarity. If you finetune, modify the text cleaner to remove diacritics from vowels (change à to a, è and é to e, etc.). This is necessary since the English synthesizer does not include these characters in symbols.py.
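The diacritic removal suggested above could be sketched like this (a hypothetical helper — the function name and its placement in the repo's cleaners are assumptions, not part of the project):

```python
import unicodedata

def remove_diacritics(text):
    """Map accented vowels to their base letters (à -> a, è/é -> e, ...)."""
    # NFD decomposition splits 'è' into 'e' plus a combining grave accent;
    # dropping the combining marks (Unicode category Mn) keeps the base letter.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
```

Applied inside the text cleaner, this keeps every transcript character within the English symbol set defined in symbols.py.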

@xzVice
Author

xzVice commented Mar 8, 2021

> Please start by reading my advice on training. This contains the link to training documentation: #431 (comment)
>
> If I were doing this, I would reuse the encoder and vocoder models. For the synthesizer, you have the option of training from scratch or finetuning the English model. Training from scratch should give better pronunciation and prosody. Finetuning will reduce training time and possibly have better voice similarity. If you finetune, modify the text cleaner to remove diacritics from vowels (change à to a, è and é to e, etc.). This is necessary since the English synthesizer does not include these characters in symbols.py.

So, I tried doing what you told me to do and everything went well until the synthesizer_train.py command...
Here is the execution of all the commands listed at https://github.com/CorentinJ/Real-Time-Voice-Cloning/wiki/Training (up to the train one, of course, which threw the error).
Any idea? 🤔

I also noticed some weird symbols inside the SV2TTS/synthesizer/train.txt file...
[screenshot: train.txt showing garbled characters]
Is that normal? I tried editing the symbols.py/cleaners.py files, but that didn't fix it... anyway, this is probably not what's causing the train command to crash...

C:\Users\Workspace\Desktop\Real-Time-Voice-Cloning>py -3.6 synthesizer_preprocess_audio.py datasets_root --datasets_name LibriTTS --subfolders testing --no_alignments
Arguments:
    datasets_root:   datasets_root
    out_dir:         datasets_root\SV2TTS\synthesizer
    n_processes:     None
    skip_existing:   False
    hparams:
    no_alignments:   True
    datasets_name:   LibriTTS
    subfolders:      testing

Using data from:
    datasets_root\LibriTTS\testing
LibriTTS: 100%|████████████████████████████████████████████████████████████████████| 1/1 [00:09<00:00,  9.52s/speakers]
The dataset consists of 9 utterances, 7450 mel frames, 1488960 audio timesteps (0.03 hours).
Max input length (text chars): 140
Max mel frames length: 889
Max audio timesteps length: 177600





C:\Users\Workspace\Desktop\Real-Time-Voice-Cloning>python synthesizer_preprocess_embeds.py datasets_root/SV2TTS/synthesizer
Arguments:
    synthesizer_root:      datasets_root\SV2TTS\synthesizer
    encoder_model_fpath:   encoder\saved_models\pretrained.pt
    n_processes:           4

Embedding:   0%|                                                                         | 0/9 [00:00<?, ?utterances/s]Loaded encoder "pretrained.pt" trained to step 1564501
Loaded encoder "pretrained.pt" trained to step 1564501
Loaded encoder "pretrained.pt" trained to step 1564501
Loaded encoder "pretrained.pt" trained to step 1564501
Embedding: 100%|█████████████████████████████████████████████████████████████████| 9/9 [00:05<00:00,  1.73utterances/s]





C:\Users\Workspace\Desktop\Real-Time-Voice-Cloning>python synthesizer_train.py testing datasets_root/SV2TTS/synthesizer
Arguments:
    run_id:          testing
    syn_dir:         datasets_root/SV2TTS/synthesizer
    models_dir:      synthesizer/saved_models/
    save_every:      1000
    backup_every:    25000
    force_restart:   False
    hparams:

Checkpoint path: synthesizer\saved_models\testing\testing.pt
Loading training data from: datasets_root\SV2TTS\synthesizer\train.txt
Using model: Tacotron
Using device: cpu

Initialising Tacotron Model...

Trainable Parameters: 30.876M

Starting the training of Tacotron from scratch

Using inputs from:
        datasets_root\SV2TTS\synthesizer\train.txt
        datasets_root\SV2TTS\synthesizer\mels
        datasets_root\SV2TTS\synthesizer\embeds
Found 9 samples
+----------------+------------+---------------+------------------+
| Steps with r=2 | Batch Size | Learning Rate | Outputs/Step (r) |
+----------------+------------+---------------+------------------+
|   20k Steps    |     12     |     0.001     |        2         |
+----------------+------------+---------------+------------------+

Traceback (most recent call last):
  File "synthesizer_train.py", line 35, in <module>
    train(**vars(args))
  File "C:\Users\Workspace\Desktop\Real-Time-Voice-Cloning\synthesizer\train.py", line 158, in train
    for i, (texts, mels, embeds, idx) in enumerate(data_loader, 1):
  File "C:\Users\Workspace\AppData\Local\Programs\Python\Python36\lib\site-packages\torch\utils\data\dataloader.py", line 355, in __iter__
    return self._get_iterator()
  File "C:\Users\Workspace\AppData\Local\Programs\Python\Python36\lib\site-packages\torch\utils\data\dataloader.py", line 301, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "C:\Users\Workspace\AppData\Local\Programs\Python\Python36\lib\site-packages\torch\utils\data\dataloader.py", line 914, in __init__
    w.start()
  File "C:\Users\Workspace\AppData\Local\Programs\Python\Python36\lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)
  File "C:\Users\Workspace\AppData\Local\Programs\Python\Python36\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Users\Workspace\AppData\Local\Programs\Python\Python36\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\Users\Workspace\AppData\Local\Programs\Python\Python36\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Users\Workspace\AppData\Local\Programs\Python\Python36\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'train.<locals>.<lambda>'

C:\Users\Workspace\Desktop\Real-Time-Voice-Cloning>Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\Workspace\AppData\Local\Programs\Python\Python36\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "C:\Users\Workspace\AppData\Local\Programs\Python\Python36\lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input

@ghost

ghost commented Mar 8, 2021

I don't have time to fully troubleshoot issues, but this may help. If not, you'll need to figure it out yourself.

Weird characters in train.txt

The problem may be coming from this line, which reads the transcripts:

with text_fpath.open("r") as text_file:

Try adding utf-8 file encoding.

with text_fpath.open("r", encoding="utf-8") as text_file:

Error running synthesizer_train.py

For a solution to:

AttributeError: Can't pickle local object 'train.<locals>.<lambda>'
EOFError: Ran out of input

Please see #669 (comment) for a workaround. We set num_workers=0 on Windows.
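For background on why the workaround helps: on Windows, DataLoader workers are started with the `spawn` method, which pickles the `collate_fn`; a lambda defined inside `train()` cannot be pickled. With `num_workers=0`, data loading runs in the main process and nothing needs to be pickled. A torch-free sketch of the underlying pickling behavior (illustrative names, not the repo's actual code):

```python
import pickle

def collate_synthesizer(batch):
    # A module-level function pickles by reference, so a spawned DataLoader
    # worker could receive it. (Placeholder body; the real collate_fn pads
    # the texts, mels, and embeds in a batch.)
    return batch

def make_local_collate():
    # Mimics the `collate_fn=lambda ...` defined inside train(): a local
    # object that pickle rejects, producing the AttributeError above.
    return lambda batch: batch

def is_picklable(obj):
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        return False
```

Here `is_picklable(collate_synthesizer)` is True while `is_picklable(make_local_collate())` is False — which is why the workaround either disables workers (`num_workers=0`) or moves the collate function to module level.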

@xzVice
Author

xzVice commented Mar 9, 2021

Thanks! Both errors are solved now... but training is really slow (the 20,000-step train command)... also, I don't know why it says Using device: cpu even though I installed the latest CUDA toolkit and I have a GTX 1050 Ti...

@xzVice
Author

xzVice commented Mar 9, 2021

Never mind, I had the CPU version of PyTorch installed...

@AVTV64

AVTV64 commented Mar 14, 2021

> Let's suppose I got the Italian dataset from here (ASR one, flac) http://www.openslr.org/94/
> How am I supposed to create all the pretrained models from it (the .pt files, for vocoder, synthesizer and encoder)?

Hi, can you release the Italian models you trained? How do I set them up? I want to clone voices in this language.

@frossi65

@ArianaGlande
@ArianaGlande Hello, I am looking for Italian models. Let me know if I can help train the model. I have an RTX 2070 GPU.

@FedericoFedeFede

@ArianaGlande I'm also looking for it. If you managed to do that, it would be very helpful to share it with us. Thanks

@ghost ghost closed this as completed Apr 1, 2021
@TalissaDreossi

I'm trying to do the same, and as @blue-fish said (if I understood correctly) I just need to train the synthesizer, so I can skip the first steps in https://github.com/CorentinJ/Real-Time-Voice-Cloning/wiki/Training#datasets until I reach:
"Begin with the audios and the mel spectrograms:
python synthesizer_preprocess_audio.py <datasets_root>".
Is that right? If so, how should I structure my dataset? I have downloaded the Italian one from http://www.openslr.org/94/ but I don't know whether I have to preprocess it before running the instruction above (in other words, I don't know what is expected in <datasets_root>).
Thanks in advance

@ghost ghost mentioned this issue Oct 8, 2021
@alessandrolamberti

@ArianaGlande Hi, how did you manage to preprocess the Italian dataset into the format the scripts accept?

@Alex2610

Can someone please upload the pretrained models?
