How to prune a residual block without the 1x1 conv projection shortcut? #5

ganji15 · 2018-05-18T03:16:24Z

I ran into some trouble when trying to prune filters in ResNet.

In the paper "Pruning Filters for Efficient ConvNets", the authors show how to prune a residual block with a 1x1 conv projection shortcut, which is logical and reasonable. However, I cannot figure out how to efficiently prune a residual block without that 1x1 conv projection shortcut. Note that most residual blocks in ResNet have no 1x1 conv projection shortcut, i.e., their shortcuts contain no parameters. If we prune filters of both conv layers in such a residual block, the residual branch and the shortcut might end up with different numbers of channels, and then F(x) and the identity x cannot be added directly. I wonder how you solve this problem in your implementation. Any suggestions or advice would be greatly appreciated!
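To make the mismatch concrete, here is a minimal NumPy sketch. The block size (16 channels, 8x8 feature maps), the number of pruned filters, and the helper `residual_add` are all made up for illustration; none of them come from the paper or from this repository.

```python
import numpy as np

def residual_add(fx, x):
    """Element-wise F(x) + x, as done at the end of a residual block."""
    if fx.shape != x.shape:
        raise ValueError(f"shape mismatch: F(x) {fx.shape} vs x {x.shape}")
    return fx + x

# Hypothetical block with an identity (parameter-free) shortcut.
x = np.ones((1, 16, 8, 8))             # input, layout (N, C, H, W)
fx_full = np.ones((1, 16, 8, 8))       # F(x) before pruning: 16 channels
fx_pruned = np.ones((1, 12, 8, 8))     # F(x) after removing 4 filters from the last conv

out = residual_add(fx_full, x)         # fine: both sides have 16 channels

try:
    residual_add(fx_pruned, x)         # 12 channels vs 16 channels
    add_failed = False
except ValueError as err:
    add_failed = True
    print(err)
```

With a 1x1 projection shortcut, the shortcut's own filters could be pruned to match; with an identity shortcut there is nothing to prune on that side.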

Thanks.

slothkong · 2018-05-21T05:25:22Z

To the best of my understanding, it is simply not possible to prune those residual blocks. The reason is exactly what you mention:

"The channels of the residual branch and the main branch might be different, and thus F(x) and the residual x cannot be added directly"

I think this is a limitation that residual models have.

ganji15 · 2018-05-21T06:22:20Z

@slothkong
It is a little disappointing that we cannot apply this excellent idea to prune ResNet.
Anyway, thanks very much for your valuable opinion and discussion.

ZhuweiQin · 2018-05-26T04:16:06Z

Hi, @ganji15

I have the same problem when trying to prune ResNet. Did you manage to prune ResNet successfully?

Thanks.

ganji15 · 2018-05-26T06:39:55Z

Hi, @ZhuweiQin

In the paper "Pruning Filters for Efficient ConvNets", the authors mention that

"Since there is no projection mapping for choosing the identity feature maps, we only consider pruning the first layer of the residual block."

However, this compromise limits the pruning effectiveness for ResNet.
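A rough sketch of what pruning only the first layer could look like, assuming filters are ranked by the sum of their absolute weights (the L1-norm criterion from the paper). The function name `prune_first_conv`, the weight shapes, and the number of pruned filters are hypothetical and not taken from this repository.

```python
import numpy as np

def prune_first_conv(w1, w2, n_prune):
    """Drop the n_prune filters of w1 with the smallest L1 norm.

    w1: first conv weights, shape (C_mid, C_in, k, k)
    w2: second conv weights, shape (C_out, C_mid, k, k)
    """
    l1 = np.abs(w1).sum(axis=(1, 2, 3))        # one L1 norm per filter
    keep = np.sort(np.argsort(l1)[n_prune:])   # indices of surviving filters
    # Removing a filter from w1 removes the matching input channel of w2.
    # Any batch-norm parameters after the first conv need the same indexing.
    return w1[keep], w2[:, keep], keep

rng = np.random.default_rng(0)
w1 = rng.standard_normal((16, 16, 3, 3))
w2 = rng.standard_normal((16, 16, 3, 3))
w1[:4] = 0.0                                   # make filters 0..3 the weakest

w1_p, w2_p, keep = prune_first_conv(w1, w2, n_prune=4)
print(w1_p.shape, w2_p.shape)
```

The second conv still has 16 output filters, so F(x) keeps the same number of channels as the identity x and the addition stays valid. The cost is that only the intermediate channels of each block shrink.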

ZhuweiQin · 2018-05-27T05:11:37Z

@ganji15 Yeah, I saw that. So, did you try to prune the first layer of the residual block?

slothkong · 2018-05-28T05:21:03Z

@ZhuweiQin, here is a summary of how I pruned resnet56. To the left is the pruning ratio defined in Mr. Li's paper and to the right, the one that I used.

ZhuweiQin · 2018-05-28T05:52:51Z

@slothkong Thanks for sharing your experiment results. That's amazing. Your approach prunes more filters than Mr. Li's paper.
How large is the accuracy drop under this pruning ratio? Does this figure come from a paper you have published? Would you mind sharing it with us?
I will try to implement the ResNet pruning, and I really hope you keep updating this repository. It's really helpful.
Thanks~

ganji15 · 2018-05-28T06:04:26Z

@ZhuweiQin, I have pruned ResNet18 but the results are not satisfactory compared with VGG16.

@slothkong, I still think it is not a good idea to prune filters of ResNet due to its complex structure. On the other hand, it would be very helpful if you could share your implementation.

slothkong · 2018-05-29T05:57:34Z

As you may know, Mr. Li mentions in his paper that he determines the pruning ratio (the number of filters to prune from each layer) by simple experimentation. However, the ratio he defines might not be optimal for new models or slightly different versions of the same model. So I defined a simple scheme to estimate how many filters you can prune from each layer without breaking the model. This is the summary of the models I have managed to prune.

The paper is not out yet T.T
I will try to upload an updated version of the code.

ganji15 · 2018-05-29T07:12:29Z

@slothkong, really impressive results! I am waiting for your good news! ;-)

ZhuweiQin · 2018-08-13T19:03:31Z

Hi, @slothkong. How is your paper coming along? I was wondering if you could share your implementation.

slothkong · 2018-10-14T18:21:06Z

Sorry for the late reply. I'm afraid the paper will not be coming any time soon (maybe never). But I plan to open source the full code with which I obtained the results in the table. Unfortunately I no longer have the .caffemodel files, but I think I can provide the .prototxt files and the training configurations so you can train the models on your own. Please give me at least one week. I need to write the documentation.

ganji15 closed this as completed May 21, 2018
