
ctc_loss problem when running on multiple GPUs #38

Closed

gmh8000 opened this issue Jul 8, 2021 · 3 comments

Comments


gmh8000 commented Jul 8, 2021

Hi, I didn't hit any bugs when training the AV model on a single GPU, but when I try to train AV on multiple GPUs, ctc_loss raises the following error:

Traceback (most recent call last):
  File "/home/cca01/work2020/guominghao/Deep_AV_ASR/audio_visual/train.py", line 162, in <module>
    main()
  File "/home/cca01/work2020/guominghao/Deep_AV_ASR/audio_visual/train.py", line 111, in main
    trainingLoss, trainingCER, trainingWER = train(model, trainLoader, optimizer, loss_function, device, trainParams)
  File "/home/cca01/work2020/guominghao/Deep_AV_ASR/audio_visual/utils/general.py", line 74, in train
    loss = loss_function(outputBatch, targetBatch, inputLenBatch, targetLenBatch)
  File "/home/cca01/work2020/guominghao/.conda/envs/Deep_AV_PYTORCH/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/cca01/work2020/guominghao/.conda/envs/Deep_AV_PYTORCH/lib/python3.6/site-packages/torch/nn/modules/loss.py", line 1295, in forward
    self.zero_infinity)
  File "/home/cca01/work2020/guominghao/.conda/envs/Deep_AV_PYTORCH/lib/python3.6/site-packages/torch/nn/functional.py", line 1767, in ctc_loss
    zero_infinity)
RuntimeError: Expected tensor to have size at least 660 at dimension 1, but got size 1474 for argument #2 'targets' (while checking arguments for ctc_loss_gpu)

Have you ever encountered this problem?
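For context on what the error is checking: torch.nn.CTCLoss expects log-probabilities shaped (T, N, C), i.e. time-major with the batch on dimension 1, while targets is batch-major. A minimal sketch of the expected shapes (the sizes T, N, C, S here are made-up illustration values, not from the repo):

```python
import torch
import torch.nn as nn

# Hypothetical sizes for illustration: T time steps, N batch items,
# C classes (including the blank), S target length per item.
T, N, C, S = 50, 4, 40, 10

ctc = nn.CTCLoss(blank=0, zero_infinity=True)

# log_probs must be (T, N, C): the batch lives on dimension 1.
log_probs = torch.randn(T, N, C).log_softmax(2)

# targets is (N, S) with labels in [1, C) since 0 is the blank.
targets = torch.randint(1, C, (N, S), dtype=torch.long)
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.full((N,), S, dtype=torch.long)

loss = ctc(log_probs, targets, input_lengths, target_lengths)
```

If the batch ends up split along the wrong axis (e.g. the time axis of a (T, N, C) tensor), the per-GPU chunks no longer line up with targets, which matches the size mismatch in the traceback above.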

@gmh8000 gmh8000 closed this as completed Jul 8, 2021
smeetrs (Owner) commented Jul 8, 2021

Kindly share the solution as it may help others.

@gmh8000 gmh8000 reopened this Jul 20, 2021
gmh8000 (Author) commented Jul 20, 2021

I fixed it by making DataParallel split the batch along the second dimension, which is where the batch sits in the time-major tensors, e.g. model = torch.nn.DataParallel(model, device_ids=[5,7], dim=1)
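A minimal sketch of that fix, assuming a model whose inputs and outputs are time-major (T, N, features) as the ctc_loss usage implies (the TimeMajorModel class and its sizes are hypothetical, not from the repo). DataParallel scatters inputs along dim=0 by default, which here would slice the time axis across GPUs; passing dim=1 makes it scatter along the batch axis instead:

```python
import torch

class TimeMajorModel(torch.nn.Module):
    """Hypothetical model taking and returning time-major tensors (T, N, *)."""

    def __init__(self, features=64, classes=40):
        super().__init__()
        self.proj = torch.nn.Linear(features, classes)

    def forward(self, x):           # x: (T, N, features)
        return self.proj(x)         # -> (T, N, classes)

model = TimeMajorModel()
if torch.cuda.device_count() > 1:
    # The issue used device_ids=[5, 7]; the key part is dim=1 so the
    # scatter/gather happens on the batch dimension, not the time axis.
    model = torch.nn.DataParallel(model, dim=1)

out = model(torch.randn(10, 2, 64))  # out: (10, 2, 40)
```

With dim=0 (the default), each replica would receive a chunk of time steps for the full batch, so the gathered output lengths no longer match the lengths passed to ctc_loss.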

@gmh8000 gmh8000 closed this as completed Jul 20, 2021
smeetrs (Owner) commented Jul 20, 2021

Thanks for sharing!
