How to Finetune Individual Layers and Change Model Structure #186

Hi,
I would like to use the pretrained model as a reference, plug in my own cost function at the end of the sixth layer, and update layer six alone (using gradient descent).
Right now I am at a point where I can successfully extract features from any layer.
I dug into the C++ code and found that a Forward_cpu and a Backward_cpu are written for each layer. But to the best of my knowledge, these compute the gradients alone; the ComputeUpdateValue and Update functions update all the layers.
To do the update in the sixth layer alone, or generally to update any single layer, is it possible to reuse these functions, or does new code have to be written?
Any suggestions/help much appreciated. Thanks

Comments
I assume that setting the learning rate to zero for all layers you do not want to update will suffice in this case.
Right, to expand on Yangqing's comment: you can zero the per-layer learning rates for the layers below the sixth layer that you do not want to train. See the imagenet example prototxt for setting the layer learning rates.
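To make that concrete, here is a minimal sketch of what zeroed per-layer learning rates look like. The field names are from the old-style prototxt format of this era (newer Caffe uses param { lr_mult: ... } instead of blobs_lr), and the layer names and shapes are illustrative, not copied from the actual imagenet definition:

```
# Freeze a layer by zeroing its weight and bias learning-rate
# multipliers; the solver's update then leaves its blobs untouched.
layers {
  name: "conv1"
  type: CONVOLUTION
  bottom: "data"
  top: "conv1"
  blobs_lr: 0   # weights: frozen
  blobs_lr: 0   # biases: frozen
  convolution_param {
    num_output: 96
    kernel_size: 11
    stride: 4
  }
}
# The one layer you do want to train keeps nonzero multipliers:
layers {
  name: "fc6"
  type: INNER_PRODUCT
  bottom: "pool5"
  top: "fc6"
  blobs_lr: 1   # weights: trained
  blobs_lr: 2   # biases: trained (conventionally at twice the weight rate)
  inner_product_param {
    num_output: 4096
  }
}
```

With zero multipliers the existing ComputeUpdateValue / Update path still runs over every layer, but the computed updates for the frozen layers are zero, so no per-layer update code needs to be written.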
Hi Yangqing and Shelhamer, thank you so much for the suggestions. I took a look at the Imagenet.prototxt and Imagenet_solver.prototxt files. But my issue is that
Thanks once again
What you're describing is finetuning, which is a key task in Caffe. It's as simple as
once you have edited the model definitions as described in the slide. You can absolutely start from the learned model; that's the point.

For 2: No, a nice point about Caffe is that you can define and adjust models without having to code. Just edit the prototxt definition to remove the layers you do not want; they will be ignored when the new model is created by finetuning. As a simple example, try editing imagenet.prototxt to remove the softmax layer; you'll see that the output is then the raw scores from the fully-connected innerproduct layer.

For 3: To use all the Caffe training machinery, you should write your own loss layer in C++ and define your model with it. Then you can simply call

Good luck!
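As a concrete sketch of points 2 and 3: the layer names follow the imagenet example, the fields are from the old-style prototxt format of this era (check them against your Caffe version), and EUCLIDEAN_LOSS is only a stand-in for whatever custom loss layer you write:

```
# Point 2: with the softmax layer deleted from the definition, the
# network's output is the raw scores from the last inner product layer.
layers {
  name: "fc8"
  type: INNER_PRODUCT
  bottom: "fc7"
  top: "fc8"
  inner_product_param {
    num_output: 1000
  }
}
# (the "prob" SOFTMAX layer that followed fc8 is simply removed)

# Point 3: for training, end the definition with a loss layer instead.
# EUCLIDEAN_LOSS here is a placeholder for your own C++ loss layer.
layers {
  name: "loss"
  type: EUCLIDEAN_LOSS
  bottom: "fc8"
  bottom: "label"
}
```

Because finetuning matches layers by name, any layer present in the pretrained snapshot but absent from the edited definition is simply skipped when the new model is initialized.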
Hi Shelhamer, thanks a lot for all the info and support. This is amazing. Thanks once again.
A general question about finetuning:
I am asking since I did a small experiment where I visualized the filters of the first convolutional layer of my modified version of imagenet after finetuning, and they looked exactly like the picture in the tutorial (http://nbviewer.ipython.org/github/BVLC/caffe/blob/master/examples/filter_visualization.ipynb).
Hi,
Judging from the header files:
thank you very much.
@shelhamer To ensure that nothing changes in the model, would using