Different learning rate of different layers #3

Closed · bzhong2 opened this issue May 15, 2017 · 5 comments

bzhong2 commented May 15, 2017

Question 1: In the fine-tuning process, it seems that the excluded layers (the newly added, customized layers) should have a faster learning rate than the layers whose parameters are restored from the checkpoint. How can we choose a different learning rate for different layers?
Question 2: I am trying to modify this tutorial to use the Inception model. However, the Inception model downloaded from the Google Research blog does not have a function similar to "inception_resnet_v2_arg_scope". That scope seems to handle normalization and regularization, but there is no such part in Inception. So the following code needs to be changed, but I am not sure how:
with slim.arg_scope(inception_resnet_v2_arg_scope()):
logits, end_points = inception_resnet_v2(images, num_classes = dataset.num_classes, is_training = True)

Thanks a lot!

kwotsin commented May 15, 2017

  1. Do you mean the 'excluded' layers that are not restored from the checkpoint would have a faster learning rate? I don't think these layers are given a faster learning rate; restoring from a checkpoint simply means starting from a set of weights that performs better than random initialization. Otherwise, variables are either trainable or non-trainable, and the trainable ones are all updated at the same rate. I'm not sure whether it is possible to set a different learning rate for different layers, or whether there is a good rationale for doing so.

  2. The arg_scope is useful when you want to set certain parameters consistently throughout the model without having to repeat the same code. The arg_scope is defined here: https://github.com/tensorflow/models/blob/master/slim/nets/inception_resnet_v2.py
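
For reference, a minimal sketch of how that arg_scope is typically used (the import path follows the slim models repo; the placeholder shape and num_classes below are just assumptions):

```python
import tensorflow as tf
from nets.inception_resnet_v2 import (inception_resnet_v2,
                                      inception_resnet_v2_arg_scope)

slim = tf.contrib.slim

# Placeholder shape and num_classes are arbitrary here.
images = tf.placeholder(tf.float32, [None, 299, 299, 3])

# The arg_scope injects the same weight-decay and batch-norm settings into
# every slim.conv2d / slim.fully_connected call made inside it, so you
# don't have to repeat those arguments for each layer.
with slim.arg_scope(inception_resnet_v2_arg_scope()):
    logits, end_points = inception_resnet_v2(
        images, num_classes=10, is_training=True)
```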

kwotsin added this to the awaiting response milestone on May 15, 2017
bzhong2 commented May 15, 2017

Hello,

Thank you very much for your response. I agree with you. The reason I am wondering whether we can set different learning rates is that the layers restored from the pre-trained model already have a good set of parameters, while the excluded layers (not restored from the checkpoint) do not. So I suspect the excluded layers need more training than the restored layers.

kwotsin commented May 16, 2017

Yes indeed. The excluded layers will require some training before you can customize the model to your own use. However, I'm not sure if the excluded layers themselves each require a different learning rate.
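
For anyone who does want the new layers to learn faster than the restored ones, one option is to use two optimizers, each applied to its own subset of variables. A minimal TF 1.x sketch (the scope name 'InceptionResnetV2/Logits' and the learning rates are assumptions for illustration, not part of the tutorial):

```python
import tensorflow as tf

def build_train_op(total_loss):
    """Sketch: apply a larger learning rate to the new (non-restored) layers."""
    all_vars = tf.trainable_variables()
    # Assumed scope name for the new logits layer; adjust to your model.
    new_vars = [v for v in all_vars
                if v.op.name.startswith('InceptionResnetV2/Logits')]
    restored_vars = [v for v in all_vars if v not in new_vars]

    slow_opt = tf.train.AdamOptimizer(learning_rate=1e-4)  # restored layers
    fast_opt = tf.train.AdamOptimizer(learning_rate=1e-3)  # new layers

    # Compute all gradients once, then let each optimizer apply its subset.
    grads = tf.gradients(total_loss, restored_vars + new_vars)
    restored_grads = grads[:len(restored_vars)]
    new_grads = grads[len(restored_vars):]

    return tf.group(
        slow_opt.apply_gradients(list(zip(restored_grads, restored_vars))),
        fast_opt.apply_gradients(list(zip(new_grads, new_vars))))
```

If I remember correctly, slim.learning.create_train_op also accepts a gradient_multipliers argument that scales gradients per variable, which achieves a similar effect with less code.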

bzhong2 commented May 16, 2017

Thanks a lot!

kwotsin commented May 17, 2017

No problem :D

kwotsin closed this as completed on May 17, 2017