List index out of range #221

Polarbear2121 · 2019-06-16T15:39:42Z

Hello,

I'm getting an IndexError: list index out of range.

I'm using:
RTX Titan
My own dataset -- 22050 sampling rate
PyTorch:1.1.0
Cuda 10.0, V10.0.130
Also, the only parameter I modified was FP16 Run, which I set to True. If I change it back to False, I get the same error at the same location.

Thanks for any help.

python train.py --output_directory=outdir --log_directory=logdir

WARNING: The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:

https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
https://github.com/tensorflow/addons
If you depend on functionality not listed there, please file an issue.

FP16 Run: True
Dynamic Loss Scaling: True
Distributed Run: False
cuDNN Enabled: True
cuDNN Benchmark: False
Selected optimization level O2: FP16 training with FP32 batchnorm and FP32 master weights.

Defaults for this optimization level are:
enabled : True
opt_level : O2
cast_model_type : torch.float16
patch_torch_functions : False
keep_batchnorm_fp32 : True
master_weights : True
loss_scale : dynamic
Processing user overrides (additional kwargs that are not None)...
After processing overrides, optimization options are:
enabled : True
opt_level : O2
cast_model_type : torch.float16
patch_torch_functions : False
keep_batchnorm_fp32 : True
master_weights : True
loss_scale : dynamic
Epoch: 0
Train loss 0 37.254456 Grad Norm 10.247010 2.76s/it
Validation loss 0: 28.915405
Saving model and optimizer state at iteration 0 to outdir/checkpoint_0
Train loss 1 29.291567 Grad Norm 18.072716 2.50s/it
Train loss 2 11.869477 Grad Norm 14.188349 2.43s/it
Train loss 3 12.002620 Grad Norm 13.625305 2.38s/it
Train loss 4 6.071761 Grad Norm 9.110525 2.38s/it
Train loss 5 6.419538 Grad Norm 10.619104 2.34s/it
Train loss 6 6.234971 Grad Norm 6.840116 2.33s/it
Train loss 7 4.918466 Grad Norm 5.713524 2.35s/it
Train loss 8 6.090720 Grad Norm 5.344365 2.37s/it
Train loss 9 3.785891 Grad Norm 3.572156 2.32s/it
Train loss 10 5.615504 Grad Norm 4.542571 2.32s/it
Train loss 11 7.237193 Grad Norm 5.978166 2.45s/it
Train loss 12 3.910069 Grad Norm 3.138545 2.45s/it
Train loss 13 5.718569 Grad Norm 3.422372 2.30s/it
Train loss 14 5.154222 Grad Norm 2.684073 2.45s/it
Train loss 15 4.743378 Grad Norm 2.216073 2.32s/it
Train loss 16 4.106128 Grad Norm 2.238141 2.30s/it
Train loss 17 3.872780 Grad Norm 2.017791 2.44s/it
Train loss 18 4.498167 Grad Norm 2.209799 2.45s/it
Traceback (most recent call last):
File "train.py", line 290, in <module>
args.warm_start, args.n_gpus, args.rank, args.group_name, hparams)
File "train.py", line 208, in train
for i, batch in enumerate(train_loader):
File "/home/julio/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 582, in __next__
return self._process_next_batch(batch)
File "/home/julio/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 608, in _process_next_batch
raise batch.exc_type(batch.exc_msg)
IndexError: Traceback (most recent call last):
File "/home/julio/anaconda3/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 99, in _worker_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/julio/anaconda3/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 99, in <listcomp>
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/julio/Downloads/tacotron2/data_utils.py", line 61, in __getitem__
return self.get_mel_text_pair(self.audiopaths_and_text[index])
File "/home/julio/Downloads/tacotron2/data_utils.py", line 32, in get_mel_text_pair
audiopath, text = audiopath_and_text[0], audiopath_and_text[1]
IndexError: list index out of range

akshitmittal1 · 2019-06-16T19:02:17Z

Hi @jferrer21, this is probably caused by a malformed line in your dataset's transcript file. Check whether the file contains any empty lines, especially at the end. Another possibility is a line that contains only the audio file path or only the transcript, so that one of the two fields is missing.

Also, your training loss is very high. What batch size are you training with?
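
The transcript check described above can be scripted so you don't have to scan the file by eye. This is only a minimal sketch: it assumes each line has the form `audio_path|transcript` with `|` as the separator, and `find_bad_lines` is a hypothetical helper name, not something from the repository. Adjust the delimiter if your filelist uses a different one.

```python
import sys


def find_bad_lines(lines, delimiter="|", expected_fields=2):
    """Return (line_number, reason) for every line that would break path|text parsing.

    Assumption: each valid line looks like "audio_path|transcript".
    """
    problems = []
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            # An empty line yields a single empty field, so indexing field 1 fails.
            problems.append((number, "empty line"))
            continue
        fields = line.split(delimiter)
        if len(fields) < expected_fields:
            problems.append((number, "missing delimiter"))
        elif not fields[0].strip():
            problems.append((number, "missing audio path"))
        elif not fields[1].strip():
            problems.append((number, "missing transcript"))
    return problems


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_filelist.py path/to/filelist.txt
    with open(sys.argv[1], encoding="utf-8") as f:
        for number, reason in find_bad_lines(f):
            print(f"line {number}: {reason}")
```

Running it on both the training and the validation filelist before starting training should point at the offending line numbers directly.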

Polarbear2121 · 2019-06-17T14:33:38Z

Thank you so much @akshitmittal1 for your fast response. You were right. I had a small portion of the transcript with a bunch of commas and one empty line. My batch size is 64. Now, it is working.

Also, do you have an idea when I should stop the training? So far:
Train loss 11843 0.311154 Grad Norm 0.245443 2.32s/it

akshitmittal1 · 2019-06-17T17:41:14Z

@jferrer21 Check the training and validation loss curves. If both have saturated (stopped decreasing), you can stop training. You can also run inference with the model and see how it performs; if the output is fine, you can stop.
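
If you want a rough numeric version of "the curve has saturated", something like the following could work. It is a sketch under stated assumptions, not a rule from the repository: `has_plateaued` is a hypothetical helper, and the `patience` and `min_delta` values are arbitrary defaults you would need to tune. It expects a list of validation losses that you have collected yourself, for example from the log output.

```python
def has_plateaued(val_losses, patience=5, min_delta=1e-3):
    """Return True if the last `patience` evaluations did not improve on the
    earlier best validation loss by at least `min_delta`.

    Both thresholds are illustrative assumptions, not recommended values.
    """
    if len(val_losses) <= patience:
        # Not enough history to compare a recent window against earlier values.
        return False
    best_before = min(val_losses[:-patience])
    best_recent = min(val_losses[-patience:])
    return best_recent > best_before - min_delta


# Example with made-up numbers: still improving vs. flat.
print(has_plateaued([1.0, 0.8, 0.6, 0.5, 0.4, 0.3, 0.2], patience=3))
print(has_plateaued([1.0, 0.5, 0.31, 0.31, 0.312, 0.3105], patience=3))
```

A flat loss is only a hint, so it is still worth checking the synthesized output before deciding to stop.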

rafaelvalle · 2019-07-30T23:29:52Z

Closing due to inactivity.

rafaelvalle closed this as completed Jul 30, 2019
