You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Hi, I didn't have any bugs when training AV on a single GPU;But when I try to use multi-GPU training AV, CTC_LOSS gets the following error:
Traceback (most recent call last):
File "/home/cca01/work2020/guominghao/Deep_AV_ASR/audio_visual/train.py", line 162, in
main()
File "/home/cca01/work2020/guominghao/Deep_AV_ASR/audio_visual/train.py", line 111, in main
trainingLoss, trainingCER, trainingWER = train(model, trainLoader, optimizer, loss_function, device, trainParams)
File "/home/cca01/work2020/guominghao/Deep_AV_ASR/audio_visual/utils/general.py", line 74, in train
loss = loss_function(outputBatch, targetBatch, inputLenBatch, targetLenBatch)
File "/home/cca01/work2020/guominghao/.conda/envs/Deep_AV_PYTORCH/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in call
result = self.forward(*input, **kwargs)
File "/home/cca01/work2020/guominghao/.conda/envs/Deep_AV_PYTORCH/lib/python3.6/site-packages/torch/nn/modules/loss.py", line 1295, in forward
self.zero_infinity)
File "/home/cca01/work2020/guominghao/.conda/envs/Deep_AV_PYTORCH/lib/python3.6/site-packages/torch/nn/functional.py", line 1767, in ctc_loss
zero_infinity)
RuntimeError: Expected tensor to have size at least 660 at dimension 1, but got size 1474 for argument #2 'targets' (while checking arguments for ctc_loss_gpu)
Have you ever encountered this problem?
The text was updated successfully, but these errors were encountered:
I just changed the batch input to be equally divided according to the second dimension, for example model = torch.nn.DataParallel(model, device_ids=[5,7], dim=1)
Hi, I didn't have any bugs when training AV on a single GPU;But when I try to use multi-GPU training AV, CTC_LOSS gets the following error:
Traceback (most recent call last):
File "/home/cca01/work2020/guominghao/Deep_AV_ASR/audio_visual/train.py", line 162, in
main()
File "/home/cca01/work2020/guominghao/Deep_AV_ASR/audio_visual/train.py", line 111, in main
trainingLoss, trainingCER, trainingWER = train(model, trainLoader, optimizer, loss_function, device, trainParams)
File "/home/cca01/work2020/guominghao/Deep_AV_ASR/audio_visual/utils/general.py", line 74, in train
loss = loss_function(outputBatch, targetBatch, inputLenBatch, targetLenBatch)
File "/home/cca01/work2020/guominghao/.conda/envs/Deep_AV_PYTORCH/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in call
result = self.forward(*input, **kwargs)
File "/home/cca01/work2020/guominghao/.conda/envs/Deep_AV_PYTORCH/lib/python3.6/site-packages/torch/nn/modules/loss.py", line 1295, in forward
self.zero_infinity)
File "/home/cca01/work2020/guominghao/.conda/envs/Deep_AV_PYTORCH/lib/python3.6/site-packages/torch/nn/functional.py", line 1767, in ctc_loss
zero_infinity)
RuntimeError: Expected tensor to have size at least 660 at dimension 1, but got size 1474 for argument #2 'targets' (while checking arguments for ctc_loss_gpu)
Have you ever encountered this problem?
The text was updated successfully, but these errors were encountered: