
Allow images of different sizes as inputs #557

Closed
sguada opened this issue Jun 28, 2014 · 11 comments

Comments

@sguada
Contributor

sguada commented Jun 28, 2014

Based on recent experiments, cropping from images whose smallest side is 256 performs better.
http://arxiv.org/pdf/1405.3531v2.pdf

The idea is to allow images to have different sizes before cropping, so that they become the same size after cropping. This would require removing the mean_file and replacing it with a mean_value.

LevelDB, LMDB and the ImageDataLayer should not assume that the images are all the same size.
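The proposal can be sketched in a few lines: take a fixed-size random crop from a variable-size image and subtract a scalar per-channel mean_value, which (unlike a fixed-size mean_file) broadcasts over any crop. This is only an illustrative sketch, not Caffe's implementation; the 227x227 crop size and the BGR mean values are assumptions.

```python
import numpy as np

def random_crop_minus_mean(img, crop_size, mean_value, rng=None):
    """img: H x W x K array with H, W >= crop_size; mean_value: length-K values."""
    rng = rng or np.random.default_rng(0)
    h, w, _ = img.shape
    top = rng.integers(0, h - crop_size + 1)
    left = rng.integers(0, w - crop_size + 1)
    crop = img[top:top + crop_size, left:left + crop_size, :].astype(np.float32)
    # A scalar per channel works for any crop size, unlike a full mean image.
    return crop - np.asarray(mean_value, dtype=np.float32)

# Two differently sized inputs yield identically shaped network inputs.
a = random_crop_minus_mean(np.zeros((256, 340, 3)), 227, [104, 117, 123])
b = random_crop_minus_mean(np.zeros((300, 256, 3)), 227, [104, 117, 123])
assert a.shape == b.shape == (227, 227, 3)
```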

@sguada
Contributor Author

sguada commented Jun 28, 2014

@Yangqing when you do the data_layers re-design in #407 #244 keep this in mind.

@jamt9000
Contributor

Regarding that paper, I believe they will be releasing their source code (and models) soon

http://www.robots.ox.ac.uk/~vgg/research/deep_eval/

@kloudkl
Contributor

kloudkl commented Jun 30, 2014

The paper that #548 wants to implement [1] proposes a very natural and general way to extract convolutional features from images of any size and then pool the feature maps into fixed-length vectors with spatial pyramids. The spatial pyramid pooling (SPP) idea is not new, but until now most people have only done pooling with sliding windows in CNNs. On the other hand, SPP-net only experimented with max pooling in each spatial bin, while sliding-window pooling has also used other aggregation methods.

[1] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition. The 13th European Conference on Computer Vision (ECCV), 2014
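The key property of SPP is that the output length depends only on the number of channels and pyramid levels, not on the feature map's spatial size. A minimal sketch of the idea follows; the bin partitioning via rounded `linspace` edges and the (1, 2, 4) pyramid are assumptions for illustration, not the paper's exact scheme.

```python
import numpy as np

def spp_max_pool(fmap, levels=(1, 2, 4)):
    """Pool a K x H x W feature map into K * sum(n*n for n in levels) values."""
    k, h, w = fmap.shape
    out = []
    for n in levels:
        ys = np.linspace(0, h, n + 1).round().astype(int)
        xs = np.linspace(0, w, n + 1).round().astype(int)
        for i in range(n):
            for j in range(n):
                bin_ = fmap[:, ys[i]:ys[i + 1], xs[j]:xs[j + 1]]
                out.append(bin_.max(axis=(1, 2)))  # max over each spatial bin
    return np.concatenate(out)  # fixed length: K * (1 + 4 + 16)

# Feature maps of different spatial sizes pool to the same vector length.
v1 = spp_max_pool(np.random.rand(256, 13, 13))
v2 = spp_max_pool(np.random.rand(256, 10, 17))
assert v1.shape == v2.shape == (256 * 21,)
```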

@kloudkl
Contributor

kloudkl commented Jun 30, 2014

This is complementary to #505.

@kloudkl
Contributor

kloudkl commented Jul 1, 2014

@sguada, do you think #355 is a prerequisite for this issue?

@sguada
Contributor Author

sguada commented Jul 1, 2014

The idea is to allow images of different sizes as inputs, but keep them at a fixed size after cropping, so the rest of the network works as usual.

Therefore #355 is not needed for now, although it could be combined with this later on.

@kloudkl
Contributor

kloudkl commented Jul 2, 2014

Got it. The ImageDataLayer resizes the images before cropping and mirroring them, and the convert_imageset tool ensures that the images stored in the LevelDB are all the same size, so there is basically no requirement on the original image sizes. Only LMDB needs to be enhanced.

The mean_value is just a simplification of the mean_file and doesn't have to replace the latter.

@qingqing01

I use the ImageDataLayer and I don't understand what you mean by "replace mean_file with a mean_value". How is the mean_value computed? Does Caffe have a tool to compute the mean_file from the input images? I want to confirm before I write such a tool myself. Thanks!

@shelhamer
Member

@Dcocoa it turns out the spatial mean, i.e. the mean over images with dimensions K x H x W, is almost everywhere the same across height and width, so averaging over the spatial dimensions into a channel mean with dimensions K x 1 x 1 achieves virtually the same network performance while making preprocessing simpler and more flexible.

compute_image_mean computes the mean.
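Concretely, a mean image (the K x H x W blob a tool like compute_image_mean produces) collapses to a per-channel mean by averaging over the spatial axes; the K x 1 x 1 result then broadcasts over crops of any size. A small sketch with synthetic values (the shapes here are assumptions):

```python
import numpy as np

mean_file = np.random.rand(3, 256, 256)  # stand-in for a mean_file blob
# Average over H and W, keeping a K x 1 x 1 shape for broadcasting.
mean_value = mean_file.mean(axis=(1, 2), keepdims=True)

# Subtraction broadcasts over any crop size, so crops need not match H x W.
crop = np.random.rand(3, 227, 227)
centered = crop - mean_value
assert mean_value.shape == (3, 1, 1) and centered.shape == crop.shape
```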

@hayderm

hayderm commented Apr 13, 2015

Please, I want to use SPP-net and it works well. However, when I change the number of layers it gives an error. Do I need to recompile Caffe, or can I just use the provided caffe.mex?

@longjon
Contributor

longjon commented May 9, 2015

Closing as we now have per-channel mean so this should work, and should be doable with gradient accumulation. (If it's broken for batch size > 1, you're welcome to open a new issue for that.)

@longjon longjon closed this as completed May 9, 2015
frankier pushed a commit to frankier/caffe that referenced this issue May 19, 2020