Different learning rate of different layers #3

Closed · bzhong2 opened this issue May 15, 2017 · 5 comments

bzhong2 commented May 15, 2017

Question 1: In the fine-tuning process, it seems that the excluded layers (the newly added, customized layers) should have a faster learning rate than the layers whose parameters are restored from the checkpoint. How can we choose a different learning rate for different layers?
Question 2: I am trying to modify this tutorial to use the Inception model. However, the Inception model downloaded from the Google Research blog does not have a function similar to "inception_resnet_v2_arg_scope". That scope seems to handle normalization and regularization, but there is no such part in Inception. So the following code needs to be changed, but I am not sure how:
with slim.arg_scope(inception_resnet_v2_arg_scope()):
logits, end_points = inception_resnet_v2(images, num_classes = dataset.num_classes, is_training = True)

Thanks a lot!

kwotsin commented May 15, 2017

  1. Do you mean the 'excluded' layers that are not restored from the checkpoint would have a faster learning rate? I don't think these layers are given a faster learning rate; restoring from a checkpoint simply means starting from a set of weights that performs better than random initialization. Otherwise, variables are either trainable or non-trainable, and the trainable ones are all updated at the same rate. I'm not sure whether it is possible to set a different learning rate for different layers, or whether there is a good rationale for doing so.

  2. The arg_scope is useful when you want to set certain parameters consistently throughout the model without having to repeat the same code. The arg_scope is defined here: https://github.com/tensorflow/models/blob/master/slim/nets/inception_resnet_v2.py
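
For reference, a minimal sketch of how that arg_scope is typically used (the import path follows the slim models repo; the placeholder shape and num_classes below are just assumptions):

```python
import tensorflow as tf
from nets.inception_resnet_v2 import (inception_resnet_v2,
                                      inception_resnet_v2_arg_scope)

slim = tf.contrib.slim

# Placeholder shape and num_classes are arbitrary here.
images = tf.placeholder(tf.float32, [None, 299, 299, 3])

# The arg_scope injects the same weight-decay and batch-norm settings into
# every slim.conv2d / slim.fully_connected call made inside it, so you
# don't have to repeat those arguments for each layer.
with slim.arg_scope(inception_resnet_v2_arg_scope()):
    logits, end_points = inception_resnet_v2(
        images, num_classes=10, is_training=True)
```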

kwotsin added this to the awaiting response milestone on May 15, 2017
bzhong2 commented May 15, 2017

Hello,

Thank you very much for your response. I agree with you. The reason I am wondering whether we can set different learning rates is that the layers restored from the pre-trained model already have a good set of parameters, while the excluded layers (not restored from the checkpoint) do not. So I suspect the excluded layers need more training than the restored layers.

kwotsin commented May 16, 2017

Yes indeed. The excluded layers will require some training before you can customize the model to your own use. However, I'm not sure if the excluded layers themselves each require a different learning rate.
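
For anyone who does want the new layers to learn faster than the restored ones, one option is to use two optimizers, each applied to its own subset of variables. A minimal TF 1.x sketch (the scope name 'InceptionResnetV2/Logits' and the learning rates are assumptions for illustration, not part of the tutorial):

```python
import tensorflow as tf

def build_train_op(total_loss):
    """Sketch: apply a larger learning rate to the new (non-restored) layers."""
    all_vars = tf.trainable_variables()
    # Assumed scope name for the new logits layer; adjust to your model.
    new_vars = [v for v in all_vars
                if v.op.name.startswith('InceptionResnetV2/Logits')]
    restored_vars = [v for v in all_vars if v not in new_vars]

    slow_opt = tf.train.AdamOptimizer(learning_rate=1e-4)  # restored layers
    fast_opt = tf.train.AdamOptimizer(learning_rate=1e-3)  # new layers

    # Compute all gradients once, then let each optimizer apply its subset.
    grads = tf.gradients(total_loss, restored_vars + new_vars)
    restored_grads = grads[:len(restored_vars)]
    new_grads = grads[len(restored_vars):]

    return tf.group(
        slow_opt.apply_gradients(list(zip(restored_grads, restored_vars))),
        fast_opt.apply_gradients(list(zip(new_grads, new_vars))))
```

If I remember correctly, slim.learning.create_train_op also accepts a gradient_multipliers argument that scales gradients per variable, which achieves a similar effect with less code.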

bzhong2 commented May 16, 2017

Thanks a lot!

kwotsin commented May 17, 2017

No problem :D

kwotsin closed this as completed on May 17, 2017