List index out of range #221

Closed
Polarbear2121 opened this issue Jun 16, 2019 · 4 comments
Comments

@Polarbear2121

Hello,

I'm getting an IndexError: list index out of range.

I'm using:
RTX Titan
My own dataset -- 22050 sampling rate
PyTorch:1.1.0
Cuda 10.0, V10.0.130
Also, the only parameter I modified was FP16 Run, which I set to True. If I change it back to False, I get the same error at the same location.

Thanks for any help.

python train.py --output_directory=outdir --log_directory=logdir

WARNING: The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:

FP16 Run: True
Dynamic Loss Scaling: True
Distributed Run: False
cuDNN Enabled: True
cuDNN Benchmark: False
Selected optimization level O2: FP16 training with FP32 batchnorm and FP32 master weights.

Defaults for this optimization level are:
enabled : True
opt_level : O2
cast_model_type : torch.float16
patch_torch_functions : False
keep_batchnorm_fp32 : True
master_weights : True
loss_scale : dynamic
Processing user overrides (additional kwargs that are not None)...
After processing overrides, optimization options are:
enabled : True
opt_level : O2
cast_model_type : torch.float16
patch_torch_functions : False
keep_batchnorm_fp32 : True
master_weights : True
loss_scale : dynamic
Epoch: 0
Train loss 0 37.254456 Grad Norm 10.247010 2.76s/it
Validation loss 0: 28.915405
Saving model and optimizer state at iteration 0 to outdir/checkpoint_0
Train loss 1 29.291567 Grad Norm 18.072716 2.50s/it
Train loss 2 11.869477 Grad Norm 14.188349 2.43s/it
Train loss 3 12.002620 Grad Norm 13.625305 2.38s/it
Train loss 4 6.071761 Grad Norm 9.110525 2.38s/it
Train loss 5 6.419538 Grad Norm 10.619104 2.34s/it
Train loss 6 6.234971 Grad Norm 6.840116 2.33s/it
Train loss 7 4.918466 Grad Norm 5.713524 2.35s/it
Train loss 8 6.090720 Grad Norm 5.344365 2.37s/it
Train loss 9 3.785891 Grad Norm 3.572156 2.32s/it
Train loss 10 5.615504 Grad Norm 4.542571 2.32s/it
Train loss 11 7.237193 Grad Norm 5.978166 2.45s/it
Train loss 12 3.910069 Grad Norm 3.138545 2.45s/it
Train loss 13 5.718569 Grad Norm 3.422372 2.30s/it
Train loss 14 5.154222 Grad Norm 2.684073 2.45s/it
Train loss 15 4.743378 Grad Norm 2.216073 2.32s/it
Train loss 16 4.106128 Grad Norm 2.238141 2.30s/it
Train loss 17 3.872780 Grad Norm 2.017791 2.44s/it
Train loss 18 4.498167 Grad Norm 2.209799 2.45s/it
Traceback (most recent call last):
File "train.py", line 290, in <module>
args.warm_start, args.n_gpus, args.rank, args.group_name, hparams)
File "train.py", line 208, in train
for i, batch in enumerate(train_loader):
File "/home/julio/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 582, in __next__
return self._process_next_batch(batch)
File "/home/julio/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 608, in _process_next_batch
raise batch.exc_type(batch.exc_msg)
IndexError: Traceback (most recent call last):
File "/home/julio/anaconda3/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 99, in _worker_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/julio/anaconda3/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 99, in <listcomp>
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/julio/Downloads/tacotron2/data_utils.py", line 61, in __getitem__
return self.get_mel_text_pair(self.audiopaths_and_text[index])
File "/home/julio/Downloads/tacotron2/data_utils.py", line 32, in get_mel_text_pair
audiopath, text = audiopath_and_text[0], audiopath_and_text[1]
IndexError: list index out of range

@akshitmittal1

Hi @jferrer21, this is most likely caused by an error in your dataset's transcript text file. Check whether the file contains any empty lines (often at the very end). Another possible cause is a line that contains only the path to the audio file or only the transcript, rather than both.
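The check described above could be scripted roughly as follows. This is only a sketch: it assumes each line of the filelist has the form `audio_path|transcript` with `|` as the separator, and the function name `find_bad_filelist_lines` and the example path are made up for illustration, not part of the repository.

```python
def find_bad_filelist_lines(path, sep="|", expected_fields=2):
    """Return (line_number, line, reason) for every malformed line.

    Assumes one `audio_path<sep>transcript` entry per line; the
    separator and field count are assumptions about the dataset.
    """
    problems = []
    with open(path, encoding="utf-8") as f:
        for lineno, raw in enumerate(f, start=1):
            line = raw.strip()
            if not line:
                # An empty line splits into a single empty field, so
                # indexing field [1] raises IndexError in the loader.
                problems.append((lineno, line, "empty line"))
                continue
            fields = line.split(sep)
            if len(fields) < expected_fields:
                problems.append((lineno, line, "missing separator or field"))
            elif any(not field.strip() for field in fields[:expected_fields]):
                problems.append((lineno, line, "empty path or transcript"))
    return problems


# Example usage (hypothetical path):
# for lineno, line, reason in find_bad_filelist_lines("filelists/train.txt"):
#     print(f"line {lineno}: {reason}: {line!r}")
```

Running it on both the training and the validation filelist before starting `train.py` should surface the kind of line that produces the `IndexError` in `get_mel_text_pair`.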

Also, your training loss is very high. What batch size are you training with?

@Polarbear2121
Author

Thank you so much @akshitmittal1 for your fast response. You were right. I had a small portion of the transcript with a bunch of commas and one empty line. My batch size is 64. Now, it is working.

Also, do you have an idea when I should stop the training? So far:
Train loss 11843 0.311154 Grad Norm 0.245443 2.32s/it

@akshitmittal1

@jferrer21 check the training and validation loss curves. If both have saturated (stopped decreasing), you can stop training. You can also run inference with the model and see how it performs; if the output sounds fine, you can stop.
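One rough way to turn "the curve has saturated" into a number is to compare the mean of the most recent losses with the mean of the window before it. The sketch below does that; the helper name `has_plateaued`, the window size, and the 1% threshold are arbitrary assumptions for illustration, not values recommended by the repository. It does not replace listening to inference output.

```python
def has_plateaued(losses, window=5, min_rel_improvement=0.01):
    """Return True if the mean of the last `window` losses improved by
    no more than `min_rel_improvement` (relative) over the previous window.

    `losses` is a list of validation (or training) loss values in
    chronological order. Returns False if there is too little history.
    """
    if len(losses) < 2 * window:
        return False
    previous = sum(losses[-2 * window:-window]) / window
    recent = sum(losses[-window:]) / window
    # Positive improvement means the loss is still going down.
    improvement = previous - recent
    return improvement <= min_rel_improvement * abs(previous)
```

Applying it to both the training and the validation history, and stopping only when both return True, matches the "both are saturating" condition above.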

@rafaelvalle
Contributor

Closing due to inactivity.


3 participants