Spatial Pyramid Pooling Layer #2177
Conversation
Thanks very much! Could you tell me if this implementation and Caffe allow multi-size training? I mean training with both 227x227 and 300x300 images using only one database?
This implementation produces fixed-length pooling outputs from variable-sized inputs. I'm not sure how Caffe handles variable-sized inputs: most of the nets I've seen use center-cropping and/or resizing to produce fixed-size inputs.
@pgao, nice PR!
@ducha-aiki I haven't done any benchmarking, but my intuition was that calculating everything from the input (rather than from the previous pooling outputs) was:
pooling_top_vecs_.push_back(new vector<Blob<Dtype>*>);
pooling_top_vecs_[i]->push_back(pooling_outputs_[i]);

// pooling layer setup
The kernel size and stride logic need to be in Reshape(). The number of spatial pyramid pooling bins should stay constant, but their dimensions will need to change for each input. Inputs can change shape with (1) reshaping data layers #1313 or (2) calls to net or blob reshape(). When this happens, the kernel size and stride need re-configuring.
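(To make this concrete, here is a minimal sketch of what such a Reshape() could look like. It reuses the GetPoolingParam helper and member names that appear elsewhere in this PR, but the exact signatures and members are assumptions, not the merged code:)

// Re-derive kernel size, stride, and padding whenever the bottom blob
// changes shape, so each pyramid level keeps a constant number of bins
// over a variable-sized input.
template <typename Dtype>
void SPPLayer<Dtype>::Reshape(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  bottom_h_ = bottom[0]->height();
  bottom_w_ = bottom[0]->width();
  SPPParameter spp_param = this->layer_param_.spp_param();
  for (int i = 0; i < pyramid_height_; ++i) {
    // Recompute the pooling parameters for the new input size.
    LayerParameter pooling_param = GetPoolingParam(
        i, bottom_h_, bottom_w_, spp_param);
    // Rebuild the i-th pooling layer with the updated parameters.
    pooling_layers_[i].reset(new PoolingLayer<Dtype>(pooling_param));
    pooling_layers_[i]->SetUp(*pooling_bottom_vecs_[i], *pooling_top_vecs_[i]);
  }
}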
Is there a way to change the parameters of a layer without having to set it up again? The only way I could figure out to re-configure the kernel size and stride is by constructing a new LayerParameter, resetting the PoolingLayer with that LayerParameter, and calling the PoolingLayer's SetUp.
I don't think there is a way to change parameters without deleting and reinitializing the layer -- you could add a setter to Layer, but I don't think it would really save anything, since the constructor itself is probably basically free (SetUp is probably a little more expensive, but you'd have to call that regardless). Do you know if it's an issue in practice?
Nice job @pgao! This is well on the way. Once you've done another pass and taken care of the inline comments, we can look at merging this.
@melgor one could include variable-size inputs via the reshaping data layer #1313 or by calling net or blob reshape().
@shelhamer I think it's ready for review now.
Anyone have any comments on this?
So how do I use the layer?
Hello, I am a student in Taiwan. My training log shows:

I0410 16:29:24.782274 20130 solver.cpp:204] Train net output #0: loss = 0.482958 (* 1 = 0.482958 loss)

The modified train_val.prototxt follows. When pyramid_height is 2, an out-of-memory problem happens around iteration 10060. The other question is: is pyramid_height the pyramid level height? Sorry for asking these questions, and for my poor English.
@lsy1993311 You use it the same way you would use a regular pooling layer, except you can only specify the pyramid height as a parameter. @kyodaisuki Sounds like a memory leak. I'll take a look at my code and see what's wrong. The pyramid_height is the pyramid level height, that's right. What's the dimensionality of your data? I might have miscalculated the dimensions.
@kyodaisuki I removed what I think was causing the memory leak, but I can't seem to repro your dimension problem. Could you try it now and see if it works?
@pgao I am training the network with pyramid_height = 2 now.
Good to hear. Would you like me to optimise that further? Also, what about the issue with the dimensions?
[1] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition. The 13th European Conference on Computer Vision (ECCV), 2014. In this paper, they can decide the pyramid level and the total number of bins.
Well, here I'm choosing a kernel height and width so that the output of the layer is the same size regardless of the input size. I checked that this works using the unit tests, so I'm not sure what is going wrong with the dimensionality. Check lines 50-82 of the test_spp_layer.cpp file for the tests that enforce this. The actual kernel height/width calculation is done in GetPoolingParam, line 17 of spp_layer.cpp.
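(For readers following along, a rough sketch of how that calculation can work; the authoritative version is GetPoolingParam in spp_layer.cpp, and the variable names below are assumptions based on this thread:)

int num_bins = pow(2, pyramid_level);
// Choose a kernel that covers the input height in exactly num_bins steps.
int kernel_h = ceil(bottom_h / static_cast<double>(num_bins));
// Padding needed so the kernels tile the entire height.
int remainder_h = kernel_h * num_bins - bottom_h;
int pad_h = (remainder_h + 1) / 2;
// Stride equals the kernel, so the bins do not overlap and each level
// always outputs num_bins x num_bins values per channel, regardless of
// the input size. Width is handled the same way.
int stride_h = kernel_h;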
Maybe I have something wrong.
@shelhamer @jeffdonahue @longjon Any chance of having this pull request reviewed?
i, bottom_h_, bottom_w_, spp_param);

delete pooling_layers_[i];
pooling_layers_[i] = new PoolingLayer<Dtype>(pooling_param);
Can you make pooling_layers_ a vector<shared_ptr<PoolingLayer<Dtype> > > and change the above two lines to just pooling_layers_[i].reset(new PoolingLayer...)? I don't think the last set of PoolingLayers you create here will be deleted with your current implementation.
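(Sketched out, the suggested change would look roughly like this; the member declaration shown is an assumption:)

// shared_ptr owns each pooling layer: reset() frees the previous one,
// and the final set is freed automatically when the SPP layer is destroyed.
vector<shared_ptr<PoolingLayer<Dtype> > > pooling_layers_;
// ...
pooling_layers_[i].reset(new PoolingLayer<Dtype>(pooling_param));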
@pgao sorry for the silence -- I added a couple minor comments above. In general, looks great!
@jeffdonahue No problem! I made the changes you requested.
Hi @pgao, when pyramid_height = 3, num_bins = 2^0, 2^1, 2^2, so finally it will produce 1x1 kernel pooling at num_bins = 2^2. Could you help me check this problem?
@kyodaisuki Sorry about that. I fixed what I think is the issue (line 27 of spp_layer.cpp). I added your example as a test case, and I'm waiting for Travis to finish running.
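(As a concrete check, using the kernel formula sketched above with a hypothetical 10x10 bottom: at level 2, num_bins = 4, so kernel_h = ceil(10/4) = 3, remainder_h = 3*4 - 10 = 2, pad_h = 1, and stride_h = 3, which pools into ceil((10 + 2*1 - 3) / 3) + 1 = 4 bins per dimension, not a 1x1 kernel.)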
@jeffdonahue Does it look good to merge?
Thanks, pgao.
Hey @pgao -- looks good! Please squash and I'll merge.
@jeffdonahue Squashed!
Thanks!
How do I use this? Can I configure it through the prototxt file? Say I am interested in configuring 3 pools in a layer, each with a different kernel size and stride (e.g., kernel sizes (2,3,5) and strides (1,3,5)); is there a way to do this through the prototxt file in Caffe?
@siddharthm83 Have you figured out how to use the SPP layer? Could you let me know how to set the parameters? Thanks.
@hermitman Have you figured out how to use the SPP layer? Can you write an example of it?
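(For anyone else looking: a minimal sketch of what an SPP layer definition might look like in prototxt, based on the parameters described in this PR -- only the pyramid height is required, and the pooling method is optional. The layer and blob names here are placeholders:)

layer {
  name: "spp"
  type: "SPP"
  bottom: "conv5"
  top: "spp"
  spp_param {
    pyramid_height: 3
    pool: MAX
  }
}

Note that, per the discussion above, per-pool kernel sizes and strides are not set by hand; they are derived from the input size and the pyramid height.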
@pgao Hello, I've tried the SPP layer in Caffe; my input data is of type ImageData, where the images are of different sizes. But when I train the network, I ran into an error saying that a check on the image height failed. It seems Caffe can't accept image data of different sizes without resizing or cropping to a fixed size.
@mollyStark I am having the same problem. Currently you can either use a batch size of 1 or ensure that each batch contains images of the same size; see the comments by Zakum here: http://stackoverflow.com/questions/34697519/caffe-variable-input-image-size.
@davidstutz I don't know if the result will be good with a batch size of 1, but I'll give it a try. Thank you for sharing!
@davidstutz @mollyStark I am having the same problem. But it failed and shows "Check failed: pad_w_ < kernel_w_ (1 vs. 1)" even when using a batch size of 1. Can you help me with it?
@pgao Hello, I've tried the SPP layer in Caffe. I want to fine-tune bvlc_reference_caffenet.caffemodel using my own data; my input images are of different sizes. I only changed pool5 to an SPP layer in the prototxt, like this:
How could I generate a deploy file for a net with an SPP layer, since the input dimensions vary with the size of the test data? Thanks!
Implements Spatial Pyramid Pooling as described in this paper: http://arxiv.org/pdf/1406.4729.pdf
Implemented using a composition of Pooling, Flatten, and Concat layers. It takes one required parameter, the desired height of the pyramid; optional parameters include the padding amount and the pooling method.
The flow is: (pyramid_height) PoolingLayers -> (pyramid_height) FlattenLayers -> ConcatLayer. The end result is a one-dimensional vector containing all the pooling results from the different heights of the pyramid.
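(To make the output size concrete -- assuming, per the discussion above, that level l pools into 2^l x 2^l bins -- an input with C channels and pyramid_height = 3 produces a vector of length C x (1 + 4 + 16); for example, 256 x 21 = 5376 for a 256-channel conv5.)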