tensor size not match #31

Closed

JingweiZhang12 opened this issue Sep 3, 2019 · 4 comments

@JingweiZhang12

Before describing the problems, here is my environment:
PyTorch version: 1.0.1.post2
CUDA version: 8.0.61
cuDNN version: 7102
Python version: 3.7

I ran the latest code, and two problems came up:

1. Traceback (most recent call last):
File "train_autodeeplab.py", line 20, in
import apex
File "/home/zhangjw/anaconda3/lib/python3.7/site-packages/apex/init.py", line 18, in
from apex.interfaces import (ApexImplementation,
File "/home/zhangjw/anaconda3/lib/python3.7/site-packages/apex/interfaces.py", line 10, in
class ApexImplementation(object):
File "/home/zhangjw/anaconda3/lib/python3.7/site-packages/apex/interfaces.py", line 14, in ApexImplementation
implements(IApex)
File "/home/zhangjw/anaconda3/lib/python3.7/site-packages/zope/interface/declarations.py", line 483, in implements
raise TypeError(_ADVICE_ERROR % 'implementer')
TypeError: Class advice impossible in Python3. Use the @Implementer class decorator instead.
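
The workaround I ended up with is roughly the guarded import below (paraphrased, not the repository's exact code; APEX_AVAILABLE is the flag the training script reads, and I assume it gates the mixed-precision path):

    # Rough sketch of the workaround, not the repo's exact code: guard the apex
    # import so training falls back to plain FP32 when apex cannot be imported.
    try:
        import apex
        from apex import amp  # mixed-precision utilities shipped with NVIDIA apex
        APEX_AVAILABLE = True
    except (ImportError, TypeError):
        # The zope.interface "class advice" TypeError above is raised while
        # importing the broken apex package on Python 3, so it is caught here too.
        APEX_AVAILABLE = False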

With the apex-related code commented out and APEX_AVAILABLE set to False as above, the code continues to run, but then problem 2 occurs:

2. Traceback (most recent call last):
     File "train_autodeeplab.py", line 421, in <module>
       main()
     File "train_autodeeplab.py", line 414, in main
       trainer.training(epoch)
     File "train_autodeeplab.py", line 176, in training
       output = self.model(image)
     File "/home/zhangjw/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
       result = self.forward(*input, **kwargs)
     File "/home/zhangjw/AutoML/auto_deeplab.py", line 282, in forward
       normalized_alphas)
     File "/home/zhangjw/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
       result = self.forward(*input, **kwargs)
     File "/home/zhangjw/AutoML/cell_level_search.py", line 138, in forward
       s = sum(new_states)
   RuntimeError: The size of tensor a (13) must match the size of tensor b (14) at non-singleton dimension 3
It's so weird, so I printed new_states[0].shape, with the following output:

    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 4, 56, 56])
    torch.Size([2, 4, 56, 56])
    torch.Size([2, 4, 56, 56])
    torch.Size([2, 4, 56, 56])
    torch.Size([2, 4, 56, 56])
    torch.Size([2, 4, 56, 56])
    torch.Size([2, 4, 56, 56])
    torch.Size([2, 4, 56, 56])
    torch.Size([2, 4, 56, 56])
    torch.Size([2, 4, 56, 56])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 16, 14, 14])
    torch.Size([2, 16, 14, 14])
    torch.Size([2, 16, 14, 14])
    torch.Size([2, 16, 14, 14])
    torch.Size([2, 16, 14, 14])
    torch.Size([2, 4, 56, 56])
    torch.Size([2, 4, 56, 56])
    torch.Size([2, 4, 56, 56])
    torch.Size([2, 4, 56, 56])
    torch.Size([2, 4, 56, 56])
    torch.Size([2, 4, 56, 56])
    torch.Size([2, 4, 56, 56])
    torch.Size([2, 4, 56, 56])
    torch.Size([2, 4, 56, 56])
    torch.Size([2, 4, 56, 56])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 16, 14, 14])
    torch.Size([2, 16, 14, 14])
    torch.Size([2, 16, 14, 14])
    torch.Size([2, 16, 14, 14])
    torch.Size([2, 16, 14, 14])
    torch.Size([2, 16, 14, 14])
    torch.Size([2, 16, 14, 14])
    torch.Size([2, 16, 14, 14])
    torch.Size([2, 16, 14, 14])
    torch.Size([2, 16, 14, 14])
    torch.Size([2, 32, 7, 7])
    torch.Size([2, 32, 7, 7])
    torch.Size([2, 32, 7, 7])
    torch.Size([2, 32, 7, 7])
    torch.Size([2, 32, 7, 7])
    torch.Size([2, 4, 56, 56])
    torch.Size([2, 4, 56, 56])
    torch.Size([2, 4, 56, 56])
    torch.Size([2, 4, 56, 56])
    torch.Size([2, 4, 56, 56])
    torch.Size([2, 4, 56, 56])
    torch.Size([2, 4, 56, 56])
    torch.Size([2, 4, 56, 56])
    torch.Size([2, 4, 56, 56])
    torch.Size([2, 4, 56, 56])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 8, 28, 28])
    torch.Size([2, 16, 13, 13])

The RuntimeError occurs as soon as the last line, torch.Size([2, 16, 13, 13]), appears. It's very strange.
Have you encountered this situation? I would appreciate it if you could tell me how to solve it.
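
For reference, the dump above comes from printing new_states[0].shape on each call to the cell's forward. A slightly more thorough variant (paraphrased, not the file's exact code) checks every entry right before the failing sum:

    # Debug sketch (paraphrased, not the repo's exact code): dump the shape of
    # every tensor being summed in cell_level_search.py, right before the line
    # that raises the RuntimeError. All entries must share one shape for the
    # element-wise sum to work.
    for t in new_states:
        print(t.shape)
    s = sum(new_states)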

@NoamRosenberg
Owner

@JingweiZhang12 Hi, we're running this code daily and haven't seen this error yet. Can you give us the details of your run (hyperparameter settings, etc.)? Sorry about this; we'll try to get it working again quickly. And do let us know if you find anything.

@albert-ba

I encountered something similar when I used --crop_size 112.
I don't know if it helps, but once I set it back to 224, everything was OK.

@iariav
Collaborator

iariav commented Sep 5, 2019

@JingweiZhang12
Hi, the first issue you described with apex looks like an apex installation problem and is not specific to this repository. Please see NVIDIA/apex#116. Could you try uninstalling and then reinstalling apex to see if that solves the issue?

Regarding the tensor size mismatch, I think @albert-ba might be right. The network can't accept an arbitrary input size, since some sizes cause a mismatch during one of the down-sampling / up-sampling operations. Could you please share the hyperparameters you used in your run?
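
As a first-pass sanity check (only a sketch: the strides 4/8/16/32 below are inferred from the feature-map sizes 56/28/14/7 printed above, not read from this repository's code), you can verify that a candidate crop size divides evenly by every downsampling factor. Divisibility seems necessary but apparently not sufficient, since a 224 crop passes the check yet still hit the mismatch above:

    # Sanity-check sketch: does a crop size divide cleanly by all assumed strides?
    # The stride values are an inference from the printed feature-map sizes,
    # not constants taken from this repository.
    def crop_divides_cleanly(crop_size, strides=(4, 8, 16, 32)):
        return all(crop_size % s == 0 for s in strides)

    print(crop_divides_cleanly(112))  # False: 112 % 32 == 16
    print(crop_divides_cleanly(224))  # True, yet the 224 run above still failed
    print(crop_divides_cleanly(256))  # True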

@JingweiZhang12
Author

@iariav
Thanks for the helpful information. I followed NVIDIA/apex#116 and reinstalled apex; the apex module can be used now.
As for the tensor size mismatch, I changed --crop-size from 224 to 256, and it works now.
