
Training problem #14

Open

JokeCorleone opened this issue Jun 7, 2020 · 6 comments

@JokeCorleone

First of all, thank you for open-sourcing Multi-Tacotron-Voice-Cloning. I have only just started learning about natural language processing and Python programming.
- I put the software in the directory D:\SV2TTS
- I put the datasets in the directory D:\Datasets, so I have D:\Datasets\book and D:\Datasets\LibriSpeech

When using the code you provided, I had some training issues:

  1. I have finished the steps
     • Run python encoder_preprocess.py D:\Datasets
       and the result is
       Arguments:
       datasets_root: D:\Datasets
       out_dir: D:\Datasets\SV2TTS\encoder
       datasets: ['preprocess_voxforge']
       skip_existing: False
       Done preprocessing book.
  2. Run visdom
  3. But I could not continue
     • Run python encoder_train.py my_run D:\Datasets
       because this notice appeared:
       C:\Users\Admin\anaconda3\envs\[Test_Voice]\lib\site-packages\umap\spectral.py:4: NumbaDeprecationWarning: No direct replacement for 'numba.targets' available. Visit https://gitter.im/numba/numba-dev to request help. Thanks!
       import numba.targets
       usage: encoder_train.py [-h] [--clean_data_root CLEAN_DATA_ROOT]
                               [-m MODELS_DIR] [-v VIS_EVERY] [-u UMAP_EVERY]
                               [-s SAVE_EVERY] [-b BACKUP_EVERY] [-f]
                               [--visdom_server VISDOM_SERVER] [--no_visdom]
                               run_id
       encoder_train.py: error: unrecognized arguments: D:\Datasets

My question: How can I fix this problem?

Thanks again for sharing!

@vlomme
Owner

vlomme commented Jun 7, 2020

Hello. Use:
python encoder_train.py my_run --clean_data_root D:\Datasets\SV2TTS\encoder
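
For context on why the positional form failed: the usage string in the error above shows that run_id is the only positional argument and that the data directory is passed via a flag. A minimal sketch of such a parser, reconstructed from the help text rather than taken from the repo's actual code:

    from argparse import ArgumentParser
    from pathlib import Path

    parser = ArgumentParser()
    parser.add_argument("run_id")                        # the only positional argument
    parser.add_argument("--clean_data_root", type=Path)  # preprocessed data dir is a flag
    # ... plus -m/--models_dir, -v/--vis_every, -u/--umap_every, and so on
    args = parser.parse_args()

    # "python encoder_train.py my_run D:\Datasets" fails because D:\Datasets is
    # parsed as a second positional argument, which this parser does not define.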

@JokeCorleone
Author

Hello @vlomme
Thanks for your support.
When I used python encoder_train.py my_run --clean_data_root D:\Datasets\SV2TTS\encoder, the result was:

File "encoder_train.py", line 46, in
train(**vars(args))
File "D:\SV2TTS\encoder\train.py", line 87, in train
model.do_gradient_ops()
File "D:\SV2TTS\encoder\model.py", line 39, in do_gradient_ops
clip_grad_norm_(self.parameters(), 3, norm_type=2)
File "C:\Users\Admin\anaconda3\envs[Test_Voice]\lib\site-packages\torch\nn\utils\clip_grad.py", line 30, in clip_grad_norm_
total_norm = torch.norm(torch.stack([torch.norm(p.grad.detach(), norm_type) for p in parameters]), norm_type)
RuntimeError: All input tensors must be on the same device. Received cpu and cuda:0

@ramanova

ramanova commented Jun 8, 2020

Hello, I'm getting the same error with torch==1.5.0.
I see that we have

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    # FIXME: currently, the gradient is None if loss_device is cuda
    loss_device = torch.device("cpu")

After that, when clip_grad_norm_ from torch is called, it performs the operation on all of the parameters at once, two of which are on cpu while the rest are on cuda:0:

    total_norm = torch.norm(torch.stack([torch.norm(p.grad.detach(), norm_type) for p in parameters]), norm_type)

which throws the error.
Could it be that the torch version is incorrect? I'm using 1.5.0.

[UPDATE]
Reinstalled torch and it started training!

    pip uninstall torch   # you might need to run it twice; check with
    pip list | grep torch # that you don't have any torch left
    pip install torch     # or pip install torch==1.5.0 to pin the version
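
For anyone who can't reinstall or upgrade torch: one possible workaround (a sketch, not code from this repo) is to clip the gradients per device, so torch.stack never mixes cpu and cuda tensors. Note that this clips each device group to max_norm independently rather than enforcing one global norm:

    import torch
    from torch.nn.utils import clip_grad_norm_

    def clip_grad_norm_per_device(parameters, max_norm, norm_type=2):
        # Group parameters by the device their gradient lives on, then clip
        # each group separately so norms are never stacked across cpu and cuda:0.
        groups = {}
        for p in parameters:
            if p.grad is not None:
                groups.setdefault(p.grad.device, []).append(p)
        for params in groups.values():
            clip_grad_norm_(params, max_norm, norm_type=norm_type)

    # In model.py's do_gradient_ops, the call would then become:
    # clip_grad_norm_per_device(self.parameters(), 3, norm_type=2)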

@JokeCorleone
Author

Hello,
When I trained the vocoder (run python vocoder_train.py my_run D:\Datasets), I encountered an error:

+------------+--------+--------------+
| Batch size |   LR   | Sequence Len |
+------------+--------+--------------+
|     60     | 0.0001 |     1000     |
+------------+--------+--------------+

RuntimeError: CUDA out of memory. Tried to allocate 118.00 MiB (GPU 0; 4.00 GiB total capacity; 2.87 GiB already allocated; 10.61 MiB free; 32.29 MiB cached)

How can I solve this error?

@vlomme
Owner

vlomme commented Jun 16, 2020

Not enough video memory. Reduce the batch size.
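
The batch size here comes from the hyperparameter file rather than the command line. A minimal sketch of the change, assuming a WaveRNN-style vocoder/hparams.py with a voc_batch_size field (the file path and variable name may differ in this fork):

    # vocoder/hparams.py -- location and name assumed, adjust to this fork
    voc_batch_size = 30  # was 60; halving the batch roughly halves activation memory

With 4 GiB of GPU memory, halving the batch size is often enough; if training still runs out of memory, reduce it further.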

@JokeCorleone
Author

Thanks @vlomme
