input and target size don't match for loss function #66

Closed
heinzermch opened this issue May 16, 2018 · 14 comments

@heinzermch

It looks like every combination but the default resnet50_dilated8/ppm_bilinear_deepsup leads to a mismatch in size between the input and the target of the loss function. I'm a bit mystified, as I did not change any of the models; the only thing I adapted was the number of labels (to 8, as one can see below).

Encoder: resnet50_dilated8. Decoder: upernet
RuntimeError: input and target batch or spatial sizes don't match: target [1 x 85 x 106], input [1 x 8 x 170 x 212] at /opt/conda/conda-bld/pytorch_1524582441669/work/aten/src/THCUNN/generic/SpatialClassNLLCriterion.cu:24
Encoder: Resnet101. Decoder: ppm_bilinear_deepsup
return torch._C._nn.nll_loss2d(input, target, weight, size_average, ignore_index, reduce) RuntimeError: input and target batch or spatial sizes don't match: target [1 x 75 x 94], input [1 x 8 x 19 x 24] at /opt/conda/conda-bld/pytorch_1524582441669/work/aten/src/THCUNN/generic/SpatialClassNLLCriterion.cu:24

Encoder: Resnet101. Decoder: Upernet
RuntimeError: input and target batch or spatial sizes don't match: target [1 x 85 x 106], input [1 x 8 x 170 x 212] at /opt/conda/conda-bld/pytorch_1524582441669/work/aten/src/THCUNN/generic/SpatialClassNLLCriterion.cu:24

In cases where the program runs, the last two dimensions seem to be consistent:
torch.Size([1, 8, 75, 94]) torch.Size([1, 75, 94])
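A small sanity check that makes the requirement explicit (my own sketch, not part of the repo; check_sizes is a made-up helper one could call right before self.crit in models.py): the class logits [N, C, H, W] must share N, H and W with the labels [N, H, W].

def check_sizes(pred, target):
    # pred: class logits [N, C, H, W]; target: integer labels [N, H, W]
    assert pred.dim() == 4 and target.dim() == 3, (pred.shape, target.shape)
    assert pred.shape[0] == target.shape[0], "batch size mismatch"
    assert pred.shape[2:] == target.shape[1:], \
        "spatial mismatch: pred %s vs target %s" % (tuple(pred.shape[2:]), tuple(target.shape[1:]))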

@hangzhaomit
Collaborator

Note that resnet50_dilated8 and Upernet should not be combined.
The design of Upernet avoids dilated convolutions and instead follows the FPN (feature pyramid network) paradigm, so please simply use resnet50.
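A minimal sketch of why the sizes disagree (my own illustration, not repo code; shapes assume a 224x224 crop): with the dilated-model defaults the labels are downsampled by 8, while the FPN-style Upernet decoder emits logits at 1/4 resolution, so the logits come out twice as large as the labels, which is exactly the 2x factor in the errors above.

import torch
import torch.nn.functional as F

labels = torch.randint(0, 8, (1, 28, 28))   # labels at 1/8 resolution (dilated-model defaults)
logits = torch.randn(1, 8, 56, 56)          # Upernet-style logits at 1/4 resolution

try:
    F.nll_loss(F.log_softmax(logits, dim=1), labels)
except (RuntimeError, ValueError) as e:
    print(e)  # input and target sizes don't match, as reported above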

@heinzermch
Author

But shouldn't Resnet101 and Upernet work in that case? After all, it is an example given in your documentation. This still leads to an error:

RuntimeError: input and target batch or spatial sizes don't match: target [1 x 57 x 71], input [1 x 8 x 114 x 142] at /opt/conda/conda-bld/pytorch_1524582441669/work/aten/src/THCUNN/generic/SpatialClassNLLCriterion.cu:24

@hangzhaomit
Collaborator

Resnet101 and Upernet work on my side. Are you missing any of the arguments we provided?

@heinzermch
Author

heinzermch commented May 17, 2018

Thanks, yes that was the issue. I was not adapting the padding constant and the down-sampling rate when changing models.
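For anyone hitting this later, a rough sketch of how the two flags interact (my own reading of the training pipeline, not repo code, and the helper name is made up): the loader pads each crop to a multiple of --padding_constant and shrinks the label map by --segm_downsampling_rate, so the resulting label size has to line up with the resolution the chosen decoder actually emits.

import math

def expected_label_size(h, w, padding_constant, segm_downsampling_rate):
    # Spatial size the segmentation label is resized to before the loss is computed.
    ph = int(math.ceil(h / float(padding_constant))) * padding_constant
    pw = int(math.ceil(w / float(padding_constant))) * padding_constant
    return ph // segm_downsampling_rate, pw // segm_downsampling_rate

print(expected_label_size(224, 224, 32, 4))  # (56, 56) -> matches Upernet's 1/4-resolution logits
print(expected_label_size(224, 224, 8, 8))   # (28, 28) -> the mismatch seen with the dilated defaults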

@xmengli

xmengli commented May 22, 2018

@heinzermch I encountered the same problem. How did you adapt the padding constant and down-sampling rate?

@xmengli

xmengli commented May 23, 2018

I run with resnet50_dilated8-ppm_bilinear_deepsup. The "pred" after self.decoder has the shape 20x2x28x28 (2-class problem). However, the input of the network is 20x2x224x224, and the error is a size mismatch in the loss calculation. Is it due to the padding constant and down-sampling rate issue, and how can I adjust them?
I appreciate the answers. Thanks!

torch.Size([20, 2, 28, 28])
Traceback (most recent call last):
  File "train.py", line 317, in <module>
    main(args)
  File "train.py", line 205, in main
    train(segmentation_module, iterator_train, optimizers, history, epoch, args)
  File "train.py", line 39, in train
    loss, acc = segmentation_module(batch_data)
  File "/home/xmli/anaconda2/envs/py35/lib/python3.5/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/data/xmli/pheng4/semantic-segmentation-pytorch-master/models/models.py", line 37, in forward
    loss = self.crit(pred, feed_dict['seg_label'].cuda())
  File "/home/xmli/anaconda2/envs/py35/lib/python3.5/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/xmli/anaconda2/envs/py35/lib/python3.5/site-packages/torch/nn/modules/loss.py", line 193, in forward
    self.ignore_index, self.reduce)
  File "/home/xmli/anaconda2/envs/py35/lib/python3.5/site-packages/torch/nn/functional.py", line 1334, in nll_loss
    return torch._C._nn.nll_loss2d(input, target, weight, size_average, ignore_index, reduce)
RuntimeError: input and target batch or spatial sizes don't match: target [20 x 224 x 224], input [20 x 2 x 28 x 28] at /opt/conda/conda-bld/pytorch_1524580978845/work/aten/src/THCUNN/generic/SpatialClassNLLCriterion.cu:24

@heinzermch
Author

I just used the parameters as described in the Readme:

python3 train.py --num_gpus NUM_GPUS --arch_encoder resnet101 --arch_decoder upernet --segm_downsampling_rate 4 --padding_constant 32

@xmengli

xmengli commented May 23, 2018

Thanks for your answer.
The problem still exists when I run with this setting:

resnet101+upernet --segm_downsampling_rate 4 --padding_constant 32

and the error is the same as above.

I evaluated the output of line 32 in the models.py file:
(pred, pred_deepsup) = self.decoder(self.encoder(feed_dict['img_data'].cuda(), return_feature_maps=True))
and I can see that the input of self.encoder is

input torch.Size([20, 3, 224, 224])

The output of self.encoder is the list including 4 tensors.

torch.Size([20, 256, 56, 56])
torch.Size([20, 512, 28, 28])
torch.Size([20, 1024, 14, 14])
torch.Size([20, 2048, 7, 7])

The output of self.decoder is (I am doing 2 class segmentation).

torch.Size([20, 2, 56, 56])

It seems that the error exists in self.encoder, and the expected output should be a list where the first element is a tensor of shape (20, 256, 224, 224),

so that the decoder output will have the shape (20, 256, 224, 224)?

Is there anything I missed? Thanks again for the solutions!
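For what it's worth, a quick check of the shapes listed above (a sketch assuming Upernet is meant to emit logits at 1/4 of the input resolution): the 20x2x56x56 decoder output looks expected, and it is the label that has to be shrunk to 56x56 via --segm_downsampling_rate 4 rather than the decoder having to reach 224x224.

import torch
import torch.nn.functional as F

logits = torch.randn(20, 2, 56, 56)                 # the decoder output listed above
labels_full = torch.randint(0, 2, (20, 224, 224))   # labels at image resolution

# Roughly what the training loader should produce with --segm_downsampling_rate 4:
labels_ds = F.interpolate(labels_full[:, None].float(), scale_factor=0.25,
                          mode='nearest').squeeze(1).long()
print(labels_ds.shape)                            # torch.Size([20, 56, 56])
print(F.cross_entropy(logits, labels_ds).item())  # sizes now agree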

@xmengli

xmengli commented May 23, 2018

Any solution for this problem?

@ghost

ghost commented Nov 28, 2018

I have the exact same problem:

return torch._C._nn.nll_loss2d(input, target, weight, size_average, ignore_index, reduce)
RuntimeError: input and target batch or spatial sizes don't match: target [4 x 47 x 75], input [4 x 150 x 94 x 150] at /pytorch/aten/src/THCUNN/generic/SpatialClassNLLCriterion.cu:24

@Macfa

Macfa commented Nov 28, 2018

RuntimeError: input and target shapes do not match: input [3242340 x 1], target [1 x 1] at

I set batch_size=256 in torch.utils.data.DataLoader(), modified the input and output sizes in the model, and tried to set them to the same value using padding and stride.

result :
data : torch.Size([256, 5, 8, 8])
model : torch.Size([256, 1])

@shahaniket

Getting the same error:

RuntimeError: input and target shapes do not match: input [128 x 1], target [128] at /opt/conda/conda-bld/pytorch-cpu_1532576596369/work/aten/src/THNN/generic/MSECriterion.c:12
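The last two reports come from a different loss (MSECriterion) but share the root cause: the prediction and target tensors do not have identical shapes, typically a model emitting [N, 1] against targets of shape [N]. A minimal sketch of the mismatch and the usual fix, independent of this repo:

import torch
import torch.nn as nn

pred = torch.randn(128, 1)    # model output, shape [128, 1]
target = torch.randn(128)     # targets, shape [128]

crit = nn.MSELoss()
# crit(pred, target)          # shapes differ -> error on old PyTorch, broadcasting warning on newer
loss = crit(pred.squeeze(1), target)   # or: crit(pred, target.unsqueeze(1))
print(loss.item())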

@hangzhaomit hangzhaomit reopened this Dec 2, 2018
@hangzhaomit
Collaborator

I cannot reproduce this error; I can run the following command successfully:
python3 train.py --num_gpus NUM_GPUS --arch_encoder resnet50 --arch_decoder upernet --segm_downsampling_rate 4 --padding_constant 32

@deeponcology @shahaniket @Macfa @xmengli999 Can you show the commands you are running?

@chccgiven

Hi! Does the combination of resnet50 and ppm work? When I run it I get an error (sizes don't match). Thank you.
