
Change ResNet AvgPool2d to AdaptiveAvgPool2d (global average pooling) to make the input size changeable #155

Closed · wants to merge 1 commit

Conversation

@XavierLinNow

This PR changes ResNet's final pooling layer (before the classifier) from the fixed AvgPool2d(7) to the adaptive AdaptiveAvgPool2d(1).
With AvgPool2d(7), the input size is forced to be 224x224. For other sizes the output is silently incomplete, or the forward pass raises: RuntimeError: size mismatch, m1: [1 x 2048], m2: [512 x 1000] at /b/wheel/pytorch-src/torch/lib/TH/generic/THTensorMath.c:1229
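
To see why (a sketch, not code from the PR): with the fixed 7x7 kernel the flattened feature count depends on the input resolution, while adaptive pooling always produces a 1x1 map.

import torch
from torch import nn

feat224 = torch.randn(1, 2048, 7, 7)    # ResNet-50 features for a 224x224 input
feat448 = torch.randn(1, 2048, 14, 14)  # ...and for a 448x448 input

fixed, adaptive = nn.AvgPool2d(7), nn.AdaptiveAvgPool2d(1)
print(fixed(feat224).shape)     # torch.Size([1, 2048, 1, 1]) -> 2048 features, matches fc
print(fixed(feat448).shape)     # torch.Size([1, 2048, 2, 2]) -> 8192 features, size mismatch in fc
print(adaptive(feat448).shape)  # torch.Size([1, 2048, 1, 1]) -> always 2048 features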

@fmassa
Member

fmassa commented Apr 21, 2017

I'm unsure about merging this in.
This has already been asked before (#140), so it's a recurring question, but is this the expected behavior for torchvision models? If yes, then we should modify all the models, not only the resnets.
On the other hand, it's usually very easy for the end user to adapt the network to handle varying sizes. Something like

resnet = torchvision.models.resnet18()
resnet.avgpool = nn.AdaptiveAvgPool2d(1)

would be enough, and explicit to the user.
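
For illustration, a self-contained version of that surgery (a sketch; shapes assume resnet18):

import torch
import torchvision
from torch import nn

resnet = torchvision.models.resnet18()
resnet.avgpool = nn.AdaptiveAvgPool2d(1)
resnet.eval()

# the unchanged fc layer now works for a range of input sizes
for size in (224, 288, 320):
    print(size, resnet(torch.randn(1, 3, size, size)).shape)  # always torch.Size([1, 1000])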

At the same time, it hides a potential problem: the pre-trained models might not work well when fed images that are too big, as they were trained to identify patterns at a specific resolution.

@Maratyszcza
Contributor

I'm for using adaptive pooling in all models. If you are willing to accept this change, I will create a PR for all models.

@fmassa
Member

fmassa commented Apr 30, 2017

Well, I suppose the most common use case is just feeding images for classification without any changes to the architecture. Other scenarios (like segmentation) will require some modification to the structure of the network anyway.
Let's add adaptive pooling to the models then. @soumith @colesbury are you ok with that?

@soumith
Member

soumith commented Apr 30, 2017

there are two common use cases:

  • fully dense outputs
  • adaptively pooled outputs

fully dense is the case where we give bigger inputs and get outputs bigger than 1000x1x1.

adaptive is when we give bigger inputs and always get a fixed-size feature vector out (e.g. 4096 features) to be fed into the fc layers.

they are both common, and they pull in conflicting directions.

we should plan for both via constructor arguments
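
A hypothetical sketch of what such constructor arguments could look like (the pooling flag and make_classifier_head helper are illustrative only, not torchvision API; assumes a PyTorch recent enough to have nn.Flatten):

from torch import nn

def make_classifier_head(pooling, in_channels=512, num_classes=1000):
    # hypothetical helper covering the two use cases above
    if pooling == "adaptive":
        # fixed-size feature vector for the fc head, whatever the input resolution
        return nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(in_channels, num_classes),
        )
    if pooling == "dense":
        # fully dense: a 1x1 conv keeps the spatial grid, so bigger inputs give bigger outputs
        return nn.Conv2d(in_channels, num_classes, kernel_size=1)
    raise ValueError("unknown pooling mode: %r" % pooling)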

@alykhantejani
Contributor

As there have been a few issues and PRs around this topic (#166, #184, #155, #190), we should probably make a decision on what to do.

my 2¢ on these two cases:

Adaptive Pooling
Users might want to use adaptive pooling when passing images of a different size to the pretrained classifier. However, if we enable this by default, the models will accept variable image sizes without the user being explicit, and can degrade silently (as the model was trained with a different image size). For this reason, I think it should be explicit in the user's code that they want this behavior, either via a constructor arg or by doing some model surgery.

I think the model surgery is quite straightforward and can be well documented in a tutorial/example. IMO the cost of cluttering the constructor and the docs to support this case is not worth it, as it reads just as clearly (if not more so) when the user is explicit:

resnet = torchvision.models.resnet18()
resnet.avgpool = nn.AdaptiveAvgPool2d(1)

The one problem with this is that in some models the final pooling is done via the functional interface (e.g. DenseNet), but this is easily remedied by making the feature pooling a member module on the model (which is backwards compatible, since pooling layers carry no parameters). A sketch follows:
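
import torch
from torch import nn

class Head(nn.Module):
    # illustrative classifier head (names are not DenseNet's actual code):
    # pooling is exposed as a member so users can swap it, and existing
    # checkpoints still load because the pooling layer holds no parameters
    def __init__(self, num_features=1024, num_classes=1000):
        super(Head, self).__init__()
        self.avgpool = nn.AvgPool2d(7)
        self.classifier = nn.Linear(num_features, num_classes)

    def forward(self, features):
        out = self.avgpool(features)
        return self.classifier(out.view(out.size(0), -1))

head = Head()
head.avgpool = nn.AdaptiveAvgPool2d(1)  # the same surgery now works for functional-style models too
print(head(torch.randn(2, 1024, 10, 10)).shape)  # torch.Size([2, 1000])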

Fully Convolutional Networks
In this case the network surgery required is a little more complex, but again it is something we can clearly document in an example for reference (a sketch follows below). Additionally, most people who want an FCN will be modifying the network structure for their new task and will likely have to do some surgery anyway.
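
For reference, one common form of that surgery (an illustrative sketch, not an official recipe): fold the pretrained fc weights into a 1x1 convolution so the classifier runs densely over larger inputs.

import torch
import torchvision
from torch import nn

resnet = torchvision.models.resnet18(pretrained=True)
body = nn.Sequential(*list(resnet.children())[:-2])  # keep everything up to layer4, drop avgpool and fc
head = nn.Conv2d(512, 1000, kernel_size=1)
head.weight.data.copy_(resnet.fc.weight.data.view(1000, 512, 1, 1))
head.bias.data.copy_(resnet.fc.bias.data)
fcn = nn.Sequential(body, head).eval()

with torch.no_grad():
    scores = fcn(torch.randn(1, 3, 448, 448))
print(scores.shape)  # torch.Size([1, 1000, 14, 14]) -- a dense grid of class scores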

What are your thoughts @fmassa @soumith @colesbury?

@vfdev-5
Collaborator

vfdev-5 commented Oct 22, 2017

For the DenseNet model, it would be good to at least state explicitly that the expected input size is 224x224; otherwise F.avg_pool2d(x, kernel_size=7) silently uses only the first 7x7 block of the feature map.
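
A quick sketch of that silent truncation (the 8x8 feature map stands in for a slightly-larger-than-224 input):

import torch
import torch.nn.functional as F

x = torch.randn(1, 1024, 8, 8)
y = F.avg_pool2d(x, kernel_size=7)  # stride defaults to kernel_size
print(y.shape)  # torch.Size([1, 1024, 1, 1]) -- computed from the top-left 7x7 window only;
                # the eighth row and column are silently ignored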

@duygusar

duygusar commented Jun 21, 2018

I have a similar problem, but the model I use is a custom RecurrentAttention one, and it has no "AdaptivePooling" attribute --> https://github.com/kevinzakka/recurrent-visual-attention/blob/master/model.py. I modified the nn parts that are imported from the torch models (in modules.py and model.py), but it didn't work.

@mamunir

mamunir commented Jul 31, 2018

I have added adaptive average pooling but the error remains the same. Please help?
RuntimeError: size mismatch, m1: [512 x 1], m2: [512 x 2] at /pytorch/aten/src/THC/generic/THCTensorMathBlas.cu:249

Detailed Error

Exception NameError: "global name 'FileNotFoundError' is not defined" in <bound method _DataLoaderIter.del of <torch.utils.data.dataloader._DataLoaderIter object at 0x7fcf3cf65990>> ignored


RuntimeError Traceback (most recent call last)
in ()
1 model_conv = train_model(model_conv, criterion, optimizer_ft, exp_lr_scheduler,
----> 2 num_epochs=25)

in train_model(model, criterion, optimizer, scheduler, num_epochs)
34 # print (inputs.shape)
35 # print (model)
---> 36 outputs = model(inputs)
37 _, preds = torch.max(outputs, 1)
38 # print (outputs)

/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.pyc in call(self, *input, **kwargs)
489 result = self._slow_forward(*input, **kwargs)
490 else:
--> 491 result = self.forward(*input, **kwargs)
492 for hook in self._forward_hooks.values():
493 hook_result = hook(self, input, result)

in forward(self, x)
62 # x = F.relu(F.max_pool2d(self.conv1(x), 2))
63 # x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
---> 64 x = self.model_f(x)
65 # x = x.view(-1, 14336)
66 # x = F.relu(self.fc1(x))

/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.pyc in call(self, *input, **kwargs)
489 result = self._slow_forward(*input, **kwargs)
490 else:
--> 491 result = self.forward(*input, **kwargs)
492 for hook in self._forward_hooks.values():
493 hook_result = hook(self, input, result)

/usr/local/lib/python2.7/dist-packages/torch/nn/modules/container.pyc in forward(self, input)
89 def forward(self, input):
90 for module in self._modules.values():
---> 91 input = module(input)
92 return input
93

/usr/local/lib/python2.7/dist-packages/torch/nn/modules/module.pyc in call(self, *input, **kwargs)
489 result = self._slow_forward(*input, **kwargs)
490 else:
--> 491 result = self.forward(*input, **kwargs)
492 for hook in self._forward_hooks.values():
493 hook_result = hook(self, input, result)

/usr/local/lib/python2.7/dist-packages/torch/nn/modules/linear.pyc in forward(self, input)
53
54 def forward(self, input):
---> 55 return F.linear(input, self.weight, self.bias)
56
57 def extra_repr(self):

/usr/local/lib/python2.7/dist-packages/torch/nn/functional.pyc in linear(input, weight, bias)
992 return torch.addmm(bias, input, weight.t())
993
--> 994 output = input.matmul(weight.t())
995 if bias is not None:
996 output += bias

RuntimeError: size mismatch, m1: [512 x 1], m2: [512 x 2] at /pytorch/aten/src/THC/generic/THCTensorMathBlas.cu:249

@fmassa
Member

fmassa commented Aug 2, 2018

@mamunir the error you are facing happens because you set the wrong output size for the AdaptiveAvgPool layer. For resnets, it should be 1x1.
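
For illustration, a minimal sketch of the fix (the 2-class head is an assumption read off the error message, which shows an fc weight of shape 512x2):

import torch
import torchvision
from torch import nn

model = torchvision.models.resnet18()
model.fc = nn.Linear(512, 2)             # 2-class head, matching m2: [512 x 2]
model.avgpool = nn.AdaptiveAvgPool2d(1)  # 1x1 output, so flattening yields exactly 512 features
print(model(torch.randn(1, 3, 300, 300)).shape)  # torch.Size([1, 2])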

@mamunir

mamunir commented Aug 3, 2018

@fmassa thanks for the reply. I had wired the input incorrectly, so it didn't flow properly through the network. It's fine now :)

@fmassa
Member

fmassa commented Nov 6, 2018

This has been implemented in #643

Thanks for the PR!

fmassa closed this Nov 6, 2018