How to prune a residual block without the 1x1 conv projection shortcut? #5

Closed
ganji15 opened this issue May 18, 2018 · 12 comments

Comments

@ganji15

ganji15 commented May 18, 2018

Hi, @slothkong

I ran into some trouble when trying to prune filters for ResNet.

In the paper "Pruning Filters for Efficient ConvNets", the authors show how to prune a residual block with a 1x1 conv projection shortcut, which is logical and reasonable. However, I cannot figure out how to efficiently prune a residual block without that 1x1 conv projection shortcut. It should be noted that most residual blocks in ResNet have no 1x1 conv projection shortcut, i.e., their shortcuts contain no parameters. If we prune filters of both conv layers in such a residual block, the channels of the residual branch and the main branch might be different, and thus F(x) and the residual x cannot be added directly. I wonder how you solve this problem in your implementation; any suggestions or advice would be greatly appreciated!

Thanks.

@slothkong
Owner

To the best of my understanding, it is simply not possible to prune those residual blocks. The reason is exactly what you mention:

"The channels of the residual branch and the main branch might be different, and thus F(x) and the residual x cannot be added directly"

I think this is a limitation that residual models have.
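
To make the mismatch concrete, here is a minimal NumPy sketch. It is not taken from this repository: the tensor shapes, the channel counts, and the helper name `residual_add` are assumptions chosen for illustration. It only shows that an identity shortcut requires the block's output to have the same number of channels as its input.

```python
import numpy as np

def residual_add(x, fx):
    """Identity shortcut: element-wise add, so both shapes must match exactly."""
    if x.shape != fx.shape:
        raise ValueError(f"cannot add shortcut {x.shape} to branch output {fx.shape}")
    return x + fx

# Hypothetical feature maps in (N, C, H, W) layout.
x = np.zeros((1, 16, 32, 32))          # block input with 16 channels
fx_full = np.zeros((1, 16, 32, 32))    # F(x) when the last conv keeps all 16 filters
fx_pruned = np.zeros((1, 12, 32, 32))  # F(x) after 4 filters are dropped from the last conv

out = residual_add(x, fx_full)         # shapes agree, the add works

try:
    residual_add(x, fx_pruned)         # 16 vs 12 channels
    mismatch = None
except ValueError as err:
    mismatch = str(err)                # the add is rejected
```

Only the last conv of the block decides the channel count of F(x), which is why filters of the earlier conv in the block can still be removed without touching the shortcut.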

@ganji15
Author

ganji15 commented May 21, 2018

@slothkong
It would be a little disappointing that we cannot apply this excellent idea to prune ResNet.
Anyway, thanks very much for your valuable opinion and discussion.

@ganji15 ganji15 closed this as completed May 21, 2018
@ZhuweiQin

Hi, @ganji15

I have the same problem when I try to prune ResNet. I was wondering whether you managed to prune ResNet successfully?

Thanks.

@ganji15
Author

ganji15 commented May 26, 2018

Hi, @ZhuweiQin

In the paper "Pruning Filters for Efficient ConvNets", the authors mention that

"Since there is no projection mapping for choosing the identity feature maps, we only consider pruning the first layer of the residual block."

However, this compromise limits the pruning effectiveness for ResNet.
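
The compromise quoted above can be sketched roughly as follows. This is a minimal NumPy illustration, not the paper's code and not this repository's implementation: the function name `prune_first_conv`, the random weights, and the choice of pruning 4 of 16 filters are all assumptions. Filters are ranked by the sum of their absolute weights (L1 norm), the criterion used in the paper, and weights are laid out as (out_channels, in_channels, kh, kw).

```python
import numpy as np

def prune_first_conv(w1, b1, w2, n_prune):
    """Drop the n_prune filters of the block's first conv with the smallest L1 norm.

    w1: (out1, in1, kh, kw)  first conv weights
    b1: (out1,)              first conv bias
    w2: (out2, out1, kh, kw) second conv weights; its input channels
                             correspond one-to-one to the first conv's filters
    """
    l1 = np.abs(w1).sum(axis=(1, 2, 3))        # one L1 norm per first-conv filter
    keep = np.sort(np.argsort(l1)[n_prune:])   # surviving filter indices, original order
    # Remove filters from conv1 and the matching input channels from conv2.
    # conv2 keeps all of its own filters, so F(x) still matches the shortcut.
    return w1[keep], b1[keep], w2[:, keep]

# Hypothetical block: both convs are 16 -> 16 channels with 3x3 kernels.
rng = np.random.default_rng(0)
w1 = rng.normal(size=(16, 16, 3, 3))
b1 = np.zeros(16)
w2 = rng.normal(size=(16, 16, 3, 3))

w1_p, b1_p, w2_p = prune_first_conv(w1, b1, w2, n_prune=4)
```

If a batch-norm or scale layer sits between the two convs, its per-channel parameters would need to be sliced with the same `keep` indices. The second conv's output channels are left untouched, which is exactly why this approach removes fewer filters than pruning both layers would.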

@ZhuweiQin

@ganji15 Yeah, I saw that. So, did you try to prune the first layer of the residual block?

@slothkong
Owner

@ZhuweiQin, here is a summary of how I pruned ResNet-56. On the left is the pruning ratio defined in Mr. Li's paper, and on the right is the one that I used.

[Image "capture": ResNet-56 pruning ratio comparison; image not preserved]

@ZhuweiQin

@slothkong Thanks for sharing your experimental results. That's amazing. Your work prunes more filters than Mr. Li's paper.
How large is the accuracy drop at this pruning ratio? Does this figure come from a paper you have published? Would you mind sharing it with us?
I will try to implement the ResNet pruning, and I really hope you can keep updating this repository. It's really helpful.
Thanks~

@ganji15
Author

ganji15 commented May 28, 2018

@ZhuweiQin, I have pruned ResNet18 but the results are not satisfactory compared with VGG16.

@slothkong, I still think it is not a good idea to prune filters of ResNet due to its complex structure. On the other hand, it would be greatly helpful if you could share your implementation.

@slothkong
Owner

slothkong commented May 29, 2018

As you may know, Mr. Li mentions in his paper that he determines the pruning ratio (the number of filters to prune from each layer) by simple experimentation. However, the ratio he defines might not be optimal for new models or slightly different versions of the same model. So I defined a simple scheme to estimate how many filters you can prune from each layer without breaking the model. This is the summary of the models I have managed to prune.

[Image "capture": summary table of pruned models; image not preserved]

The paper is not out yet T.T
I will try to upload an updated version of the code.

@ganji15
Author

ganji15 commented May 29, 2018

@slothkong, really impressive results! I am waiting for your good news! ;-)

@ZhuweiQin

Hi, @slothkong. How is your paper going? I was wondering if you could share your implementation.

@slothkong
Owner

Sorry for the late reply. I'm afraid the paper won't be coming any time soon (maybe never). But I plan to open-source the full code with which I obtained the results in the table. Unfortunately, I no longer have the .caffemodel files, but I think I can provide the .prototxt files and the training configurations for you guys to train the models on your own. Please give me at least one week; I need to write the documentation.
