This repository has been archived by the owner on Nov 2, 2018. It is now read-only.

SpatialConvolutionUpsample behaviour #7

Closed
yujiali opened this issue Dec 16, 2015 · 10 comments

Comments

yujiali commented Dec 16, 2015

I was expecting the SpatialConvolutionUpsample class to do standard "upsampling", but it seems like this class is doing something else. Here is an example:

conv = nn.SpatialConvolutionUpsample(1,1,1,1,3)
w, dw = conv:parameters()
w[1]:fill(1)
w[2]:zero()

This creates an upsampling module that upsamples the input image by a factor of 3. The convolution is 1x1 with weight 1 and bias 0, so it just copies the input.

I tried this on a 1x1x2x2 input tensor:

x = torch.range(1,4):resize(1,1,2,2)
y = conv:forward(x)

and here is the result:

th> x
(1,1,.,.) = 
  1  2
  3  4
[torch.DoubleTensor of size 1x1x2x2]

th> y
(1,1,.,.) = 
  1  2  3  4  1  2
  3  4  1  2  3  4
  1  2  3  4  1  2
  3  4  1  2  3  4
  1  2  3  4  1  2
  3  4  1  2  3  4
[torch.DoubleTensor of size 1x1x6x6]

However, I was actually expecting y to be like this (which I think is the more standard "upsampling"):

1 1 1 2 2 2
1 1 1 2 2 2
1 1 1 2 2 2
3 3 3 4 4 4
3 3 3 4 4 4
3 3 3 4 4 4

The problem is that, in the current SpatialConvolutionUpsample class, the views created after computing the result scramble the spatial element ordering. I wonder if this is the intended behaviour?

yujiali changed the title from "SpatialConvolutionUpsample behavior" to "SpatialConvolutionUpsample behaviour" on Dec 16, 2015

soumith commented Dec 16, 2015

@yujiali yes, this is the intended behavior; sorry, the module is poorly named. What you are looking for is https://github.com/torch/nn/blob/master/doc/convolution.md#nn.SpatialUpSamplingNearest
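
For anyone landing here, a minimal sketch of that suggestion (assuming only the stock nn package); it reproduces the "expected" output from the first post, with each pixel repeated in a 3x3 block:

require 'nn'

-- nearest-neighbour upsampling: repeats each pixel scale x scale times
local up = nn.SpatialUpSamplingNearest(3)
local x  = torch.range(1, 4):resize(1, 1, 2, 2)
print(up:forward(x))  -- 1x1x6x6, each value expanded into a 3x3 block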

soumith closed this as completed on Dec 16, 2015

yujiali commented Dec 16, 2015

Unfortunately the SpatialUpSamplingNearest module does not support upsampling with learned parameters, which is what I actually need. There is a discussion at torch/nn#405 about a similar topic, but the full convolution module does not do the same thing. I'm looking for the behaviour described in sec. 3.3 of http://www.cs.berkeley.edu/~jonlong/long_shelhamer_fcn.pdf, which seems to be supported in Caffe: http://caffe.berkeleyvision.org/doxygen/classcaffe_1_1DeconvolutionLayer.html

Given this, I'm quite surprised that the eyescream model actually works, since the spatial pixel ordering is completely broken after one SpatialConvolutionUpsample forward pass. Any insights on why this is not a problem?

soumith commented Dec 16, 2015

If you are looking for Shelhamer's deconvolution layer, SpatialFullConvolution is exactly that.
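
For reference, a minimal sketch of SpatialFullConvolution used as a learned 2x upsampling layer; the channel counts, kernel size, stride and padding below are illustrative choices, not settings taken from the FCN paper or from this repo:

require 'nn'

-- (nInputPlane, nOutputPlane, kW, kH, dW, dH, padW, padH)
-- output height = (H - 1) * dH - 2 * padH + kH  =>  2H for these settings
local deconv = nn.SpatialFullConvolution(16, 16, 4, 4, 2, 2, 1, 1)

local x = torch.randn(1, 16, 8, 8)
print(#deconv:forward(x))  -- 1x16x16x16: spatial size doubled, filters are learned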

soumith commented Dec 16, 2015

If you look at how we use this module in eyescream, we only use it to calculate "SAME" padding; we only use it at a scale of 1.0, so there is no upsampling.
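
In other words (an illustration with made-up layer sizes, not the actual eyescream settings), at scale 1 the module is just a convolution whose output has the same spatial size as its input:

-- assumes the eyescream repo's SpatialConvolutionUpsample is on the Lua path
require 'nn'

local conv = nn.SpatialConvolutionUpsample(3, 64, 7, 7, 1)  -- scale factor 1
local x = torch.randn(1, 3, 32, 32)
print(#conv:forward(x))  -- 1x64x32x32: "SAME" padding, no upsampling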

yujiali commented Dec 16, 2015

Thank you @soumith for the clarification. Now I realize that when preparing the data you scaled the images down and then back up, so the resolution is reduced but the image size is kept the same; therefore the images on all layers are actually the same size.

And yes, you are right about the deconvolution layer and SpatialFullConvolution being equivalent; I was wrong on that.
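
As an aside, for anyone trying to reproduce sec. 3.3 of the FCN paper on top of SpatialFullConvolution, here is a hedged sketch of starting a single-channel learned upsampler from a bilinear interpolation kernel; the factor, kernel size and indexing are illustrative choices, not code from the paper:

require 'nn'

local f = 2                       -- upsampling factor
local k = 2 * f - f % 2           -- kernel size (4 for f = 2)
local c = (2 * f - 1 - f % 2) / (2 * f)

local deconv = nn.SpatialFullConvolution(1, 1, k, k, f, f, f / 2, f / 2)
deconv.bias:zero()

-- fill the 1x1xkxk weight with a separable bilinear kernel
for i = 0, k - 1 do
  for j = 0, k - 1 do
    deconv.weight[1][1][i + 1][j + 1] =
      (1 - math.abs(i / f - c)) * (1 - math.abs(j / f - c))
  end
end

local x = torch.range(1, 4):resize(1, 1, 2, 2)
print(deconv:forward(x))  -- roughly bilinear 1x1x4x4 upsampling (borders see the zero padding)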

gcr added a commit to gcr/nn that referenced this issue Dec 24, 2015
This commit simply adds a greppable note to the docs for `SpatialFullConvolution` to let people find it.

I agree that the terminology used by other frameworks is incorrect (particularly the use of the word "deconvolution"), but this layer is impossible to understand without it. After reading so many papers that refer to "deconvolution" or "upconvolution" layers, I strongly feel that a note like this in the docs is necessary. When I first read the docs for `SpatialFullConvolution`, I thought that it was just convolution with a full connection table since it comes right after the connection-table convolution functions. I wouldn't have ever guessed that this was really deconvolution with a different name.

Other discussions:
torch#405
facebookarchive/eyescream#7
@anuragranj

@yujiali, @soumith: could you point to some code (in Torch) where SpatialFullConvolution is used as the upconvolution layer in Shelhamer's FCN? Also, SpatialFullConvolution does not support fractional strides; is there a way to do that with it somehow? Thanks.
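
For readers with the same question, one way to read this (not an authoritative answer from the maintainers): the "fractional stride" 1/s in the FCN paper is what a transposed convolution with integer stride s computes, so there is no separate fractional-stride parameter to set; SpatialFullConvolution with dW = dH = s already gives the s-times upsampling. A sketch with made-up sizes:

require 'nn'

-- a "stride 1/2" (fractional-stride) convolution, realized as a transposed
-- convolution with integer stride 2; channel counts here are illustrative
local up2 = nn.SpatialFullConvolution(64, 64, 4, 4, 2, 2, 1, 1)
print(#up2:forward(torch.randn(1, 64, 16, 16)))  -- 1x64x32x32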

soumith commented Feb 4, 2016

@anuragranj Google for "Deconvolution caffe".

@anuragranj

@soumith I was looking for an implementation in Torch. The Caffe framework is quite clear.

soumith commented Feb 4, 2016

@anuragranj

Great. Thanks!
