
Setting up Discriminative Fine Tuning

Sandip Bhattacharjee edited this page Jan 29, 2019 · 1 revision

Here we set up the multiplicative factors, one per layer, that are applied to the base learning rate. In other words, if lr_base denotes the base learning rate for the network, then the learning rate of layer k is lr_k = lr_base * factor_k.
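As a minimal sketch of that formula (the function name and factor values here are illustrative, not from a specific library):

```python
def layer_learning_rates(base_lr, factors):
    """Return lr_k = base_lr * factor_k for each layer k."""
    return [base_lr * f for f in factors]

# Early layers get small factors (their pretrained weights barely move);
# later layers get factors near 1.0 so they adapt to the new task.
factors = [0.01, 0.1, 0.5, 1.0]
per_layer_lrs = layer_learning_rates(1e-3, factors)
```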

The core idea is that the early layers of a network detect low-level features such as edges, lines, and curves. We do not want to change the pretrained weights in these layers too much (most such networks are pretrained on ImageNet, one of the largest collections of labelled image data).

The last few layers combine these lower-level features (lines, curves, edges) into higher-level ones, such as a shaded circle, a textured ellipse, or, for human images, a nose, an ear, or a face. These are the weights we want to move away from the original pretraining task and specialise for our own classification problem.
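One common way to realise this scheme is to hand the optimizer a list of parameter groups, each with its own scaled learning rate. A hedged sketch, assuming the usual dict-of-options shape that optimizers such as PyTorch's accept (the layer names below are placeholders, not real modules):

```python
def make_param_groups(layers, base_lr, factors):
    """Pair each layer's parameters with its scaled learning rate,
    in the list-of-dicts shape most optimizers accept."""
    return [{"params": params, "lr": base_lr * f}
            for params, f in zip(layers, factors)]

# Placeholder parameter lists standing in for a pretrained network's layers:
# tiny factor for the early feature extractors, full base rate at the head.
layers = [["conv1.weight"], ["conv2.weight"], ["fc.weight"]]
groups = make_param_groups(layers, 0.01, [0.05, 0.2, 1.0])
```

With a real network, `layers` would be the actual parameter iterables of each module, and `groups` would be passed straight to the optimizer constructor.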
