## Finetuning

In this notebook, we show how to finetune an existing network for a new task. As a recap, we start with the original model and then create a new model, which we finetune on the new data.

In [1]:
import tensorflow as tf
import tensorflow.contrib.slim as slim
import tensorflow.contrib.slim.nets as nets
from scipy.misc import imread, imresize
import matplotlib.pyplot as plt
%matplotlib inline
import numpy as np
from imagenet_classes import class_names

## Definition of the model 

### Original model

We start with the original model. There seems to be a competition over how to define a network with the least amount of code. The slim library includes an **arg_scope**, which lets us define defaults for arguments, and a **repeat** function, which lets us repeat building blocks. With this library we can define the VGG16 network as follows:

In [2]:
tf.reset_default_graph()
images = tf.placeholder(tf.float32, [None, None, None, 3], name='images')
inputs = tf.image.resize_images(images, (224,224))
with tf.variable_scope('vgg_16', [inputs]) as sc:
    with slim.arg_scope([slim.conv2d, slim.fully_connected],
                          activation_fn=tf.nn.relu,
                          weights_initializer=tf.truncated_normal_initializer(0.0, 0.01),
                          weights_regularizer=slim.l2_regularizer(0.0005)):
        net = slim.repeat(inputs, 2, slim.conv2d, 64, [3, 3], scope='conv1')
        net = slim.max_pool2d(net, [2, 2], scope='pool1')
        net = slim.repeat(net, 2, slim.conv2d, 128, [3, 3], scope='conv2')
        net = slim.max_pool2d(net, [2, 2], scope='pool2')
        net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3')
        net = slim.max_pool2d(net, [2, 2], scope='pool3')
        net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv4')
        net = slim.max_pool2d(net, [2, 2], scope='pool4')
        net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv5')
        net = slim.max_pool2d(net, [2, 2], scope='pool5')
        net = slim.conv2d(net, 4096, [7, 7], padding='VALID', scope='fc6')
        net = slim.conv2d(net, 4096, [1, 1], scope='fc7')
        net = slim.conv2d(net, 1000, [1, 1], activation_fn=None, normalizer_fn=None,scope='fc8')

tf.train.SummaryWriter('/tmp/dumm/fine_tuning', tf.get_default_graph()).close()
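
The two conveniences used in the cell above, shared default arguments and repeated building blocks, can be illustrated without TensorFlow. The following is only a plain-Python analogy, not slim's implementation: `conv2d` and `repeat` are hypothetical stand-ins that record layer settings instead of building a graph, and `functools.partial` plays the role of the `arg_scope` block.

```python
import functools

def conv2d(inputs, num_outputs, kernel_size, activation_fn=None, scope=None):
    """Hypothetical stand-in for a layer: records its settings instead of computing."""
    return inputs + [(scope, num_outputs, tuple(kernel_size), activation_fn)]

def repeat(inputs, repetitions, layer, *args, **kwargs):
    """Apply `layer` several times, naming the copies scope_1, scope_2, ..."""
    scope = kwargs.pop('scope')
    net = inputs
    for i in range(repetitions):
        net = layer(net, *args, scope='{}_{}'.format(scope, i + 1), **kwargs)
    return net

# Fixing a default once plays the role of the arg_scope block.
relu_conv = functools.partial(conv2d, activation_fn='relu')

net = repeat([], 2, relu_conv, 64, [3, 3], scope='conv1')
net = repeat(net, 3, relu_conv, 256, [3, 3], scope='conv3')
for layer in net:
    print(layer)
```

Each recorded entry carries the shared activation without it being spelled out per layer, which is the saving the slim definition relies on.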

In [3]:
variables_to_restore = slim.get_variables_to_restore()
#init_assign_op, init_feed_dict = \
#   slim.assign_from_checkpoint('/Users/oli/Dropbox/server_sync/tf_slim_models/vgg_16.ckpt', variables_to_restore)
init_assign_op, init_feed_dict = \
   slim.assign_from_checkpoint('/home/dueo/Dropbox/Server_Sync//tf_slim_models/vgg_16.ckpt', variables_to_restore)
sess = tf.Session()
sess.run(init_assign_op, init_feed_dict)

In [4]:
from imagenet_classes import class_names
img1 = imread('poodle.jpg')
feed_vals = [img1]
np.shape(feed_vals)
d = sess.run(net, feed_dict={images:feed_vals})[0,0,0,]
prob = np.exp(d) / np.sum(np.exp(d))
preds = (np.argsort(prob)[::-1])[0:5]
for p in preds:
    print p, class_names[p], prob[p]
sess.close()

267 standard poodle 0.991464
265 toy poodle 0.00368469
266 miniature poodle 0.0030329
160 Afghan hound, Afghan 0.000330886
221 Irish water spaniel 0.000312831
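
The softmax in the cell above exponentiates the raw logits directly, which works here but can overflow for large values. A minimal sketch of a more robust variant, assuming only NumPy, subtracts the maximum logit first; this leaves the probabilities unchanged. The function name `top_k_probabilities` and the example logits are made up for illustration.

```python
import numpy as np

def top_k_probabilities(logits, k=5):
    """Return (indices, probabilities) of the k most likely classes."""
    logits = np.asarray(logits, dtype=np.float64)
    shifted = logits - np.max(logits)      # largest entry becomes 0, so exp cannot overflow
    prob = np.exp(shifted) / np.sum(np.exp(shifted))
    order = np.argsort(prob)[::-1][:k]     # indices sorted by decreasing probability
    return order, prob[order]

# Made-up logits that would overflow with a naive np.exp
example_logits = np.array([1000.0, 999.0, 0.0, -5.0])
top_idx, top_prob = top_k_probabilities(example_logits, k=2)
print(top_idx)
print(top_prob)
```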


## Finetuning

We now start from scratch and add a fully connected network with 10 classes after the convolutional part of the network. We also have to define a loss function, which we need later for training the network.

In [5]:
tf.reset_default_graph()
images = tf.placeholder(tf.float32, [None, None, None, 3], name='images')
inputs = tf.image.resize_images(images, (224,224))

with tf.variable_scope('vgg_16', [inputs]) as sc:
    with slim.arg_scope([slim.conv2d, slim.fully_connected],
                          activation_fn=tf.nn.relu,
                          weights_initializer=tf.truncated_normal_initializer(0.0, 0.01),
                          weights_regularizer=slim.l2_regularizer(0.0005)):
        net = slim.repeat(inputs, 2, slim.conv2d, 64, [3, 3], scope='conv1')
        net = slim.max_pool2d(net, [2, 2], scope='pool1')
        net = slim.repeat(net, 2, slim.conv2d, 128, [3, 3], scope='conv2')
        net = slim.max_pool2d(net, [2, 2], scope='pool2')
        net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3')
        net = slim.max_pool2d(net, [2, 2], scope='pool3')
        net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv4')
        net = slim.max_pool2d(net, [2, 2], scope='pool4')
        net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv5')
        net = slim.max_pool2d(net, [2, 2], scope='pool5')
        # Here we start our own definitions
        shape = int(np.prod(net.get_shape()[1:]))
        net = tf.reshape(net, [-1, shape])
        net = slim.fully_connected(net, 1000, scope='fc6') #Only 1000 
        net = slim.dropout(net, 0.5, scope='dropout6')
        net = slim.fully_connected(net, 10, scope='fc7', activation_fn=None, normalizer_fn=None) #Only 10
        # This would be the original VGG16. We replace these layers with FC layers and fewer classes
        #net = slim.conv2d(net, 4096, [7, 7], padding='VALID', scope='fc6')
        #net = slim.conv2d(net, 4096, [1, 1], scope='fc7')
        #net = slim.conv2d(net, 1000, [1, 1], activation_fn=None, normalizer_fn=None,scope='fc8')
        
        # The graph is stored w/o loss, adding loss to graph

Y = tf.placeholder(tf.int64, shape=None, name='Labels')
cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(net, Y, name='xentropy')
loss = tf.reduce_mean(cross_entropy, name='xentropy_mean')

sess = tf.Session()       
tf.train.SummaryWriter('/tmp/dumm/fine_tuned', tf.get_default_graph()).close()
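
To get a feeling for the size of the new head, the shapes and parameter counts can be worked out by hand. The sketch below assumes that the convolutions keep the spatial size and that each 2x2 max-pool halves it, which is how the layers above are understood to behave with their default settings. The helper names are hypothetical.

```python
def conv_params(kernel, c_in, c_out):
    """Weights plus biases of a kernel x kernel convolution."""
    return kernel * kernel * c_in * c_out + c_out

def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

# Five pooling layers halve the 224x224 input five times.
side = 224
for _ in range(5):
    side //= 2

flat = side * side * 512                      # length of the flattened pool5 output
new_head = dense_params(flat, 1000) + dense_params(1000, 10)
original_head = (conv_params(7, 512, 4096)
                 + conv_params(1, 4096, 4096)
                 + conv_params(1, 4096, 1000))

print("pool5 side: {}, flattened: {}".format(side, flat))
print("new head parameters:      {}".format(new_head))
print("original head parameters: {}".format(original_head))
```

Under these assumptions the replacement head has roughly a fifth of the parameters of the original fc6 to fc8 layers.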

We want to restore all of the convolutional layers. We can get them as follows:

In [6]:
vars_to_restore = slim.get_variables(scope='vgg_16/con')
[var.name for var in vars_to_restore]

[u'vgg_16/conv1/conv1_1/weights:0',
 u'vgg_16/conv1/conv1_1/biases:0',
 u'vgg_16/conv1/conv1_2/weights:0',
 u'vgg_16/conv1/conv1_2/biases:0',
 u'vgg_16/conv2/conv2_1/weights:0',
 u'vgg_16/conv2/conv2_1/biases:0',
 u'vgg_16/conv2/conv2_2/weights:0',
 u'vgg_16/conv2/conv2_2/biases:0',
 u'vgg_16/conv3/conv3_1/weights:0',
 u'vgg_16/conv3/conv3_1/biases:0',
 u'vgg_16/conv3/conv3_2/weights:0',
 u'vgg_16/conv3/conv3_2/biases:0',
 u'vgg_16/conv3/conv3_3/weights:0',
 u'vgg_16/conv3/conv3_3/biases:0',
 u'vgg_16/conv4/conv4_1/weights:0',
 u'vgg_16/conv4/conv4_1/biases:0',
 u'vgg_16/conv4/conv4_2/weights:0',
 u'vgg_16/conv4/conv4_2/biases:0',
 u'vgg_16/conv4/conv4_3/weights:0',
 u'vgg_16/conv4/conv4_3/biases:0',
 u'vgg_16/conv5/conv5_1/weights:0',
 u'vgg_16/conv5/conv5_1/biases:0',
 u'vgg_16/conv5/conv5_2/weights:0',
 u'vgg_16/conv5/conv5_2/biases:0',
 u'vgg_16/conv5/conv5_3/weights:0',
 u'vgg_16/conv5/conv5_3/biases:0']
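
The scope string `'vgg_16/con'` works as a prefix filter on the variable names. The following sketch reproduces the selection with plain regular-expression matching on a shortened, hand-written name list. That the library matches the scope against the start of each name is an assumption about its behaviour, not something shown in this notebook.

```python
import re

# Shortened, hand-written list of variable names for illustration
all_names = [
    'vgg_16/conv1/conv1_1/weights:0',
    'vgg_16/conv1/conv1_1/biases:0',
    'vgg_16/conv5/conv5_3/weights:0',
    'vgg_16/fc6/weights:0',
    'vgg_16/fc7/biases:0',
]

def filter_by_scope(names, scope):
    """Keep the names that start with the given scope pattern."""
    return [n for n in names if re.match(scope, n)]

conv_names = filter_by_scope(all_names, 'vgg_16/con')
fc_names = filter_by_scope(all_names, 'vgg_16/fc')
print(conv_names)
print(fc_names)
```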

In [7]:
sess.run(tf.initialize_all_variables())
restorer = tf.train.Saver(vars_to_restore)
#restorer.restore(sess, '/Users/oli/Dropbox/server_sync/tf_slim_models/vgg_16.ckpt')
restorer.restore(sess, '/home/dueo/Dropbox/Server_Sync/tf_slim_models/vgg_16.ckpt')
print("Model restored.")

Model restored.


To check that everything works in principle, we apply the model to the image from above. The new fully connected layers are still untrained, so the outputs carry no meaning yet.

In [8]:
sess.run(net, feed_dict={images:feed_vals})

array([[  7.53580141, -15.98202515,   4.61846018, -18.85750008,
        -13.94858742,  14.51944923,   6.84897995,   0.39248943,
        -15.06960678,  16.3237114 ]], dtype=float32)

### Learning 

We now adapt the network trained on ImageNet to make predictions on CIFAR-10.

### The CIFAR 10 data set

The images can be obtained from http://www.cs.toronto.edu/~kriz/cifar.html. The 32x32 images belong to the following classes:

In [9]:
names = ['plane','auto','bird','cat','deer','dog','frog','horse','ship','truck']

In [10]:
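import cPickle
def unpickle(path):
    # Load one CIFAR-10 batch file and return its labels and images
    with open(path, 'rb') as fo:
        batch = cPickle.load(fo)
    data = batch['data']
    # Each row holds the red, green and blue planes of one 32x32 image;
    # the result is ordered (batch, row, column, color)
    imgs = np.transpose(np.reshape(data,(-1,32,32,3), order='F'),axes=(0,2,1,3))
    y = np.asarray(batch['labels'], dtype='uint8')
    return y, imgs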
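
The Fortran-order reshape followed by a transpose is compact but not obvious. The sketch below uses made-up random data in the CIFAR row layout (3072 bytes per image: the red, green and blue planes one after another) and checks that the expression gives the same result as a more explicit reshape to (batch, color, row, column) with the color axis moved to the end.

```python
import numpy as np

# Made-up pixel data: 4 images, 3072 bytes each
rng = np.random.RandomState(0)
data = rng.randint(0, 256, size=(4, 3072)).astype(np.uint8)

# Expression used in the loading function
imgs_a = np.transpose(np.reshape(data, (-1, 32, 32, 3), order='F'), axes=(0, 2, 1, 3))

# More explicit variant: split into color planes, then move color to the last axis
imgs_b = data.reshape(-1, 3, 32, 32).transpose(0, 2, 3, 1)

same = np.array_equal(imgs_a, imgs_b)
print(imgs_a.shape)
print(same)
```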

In [11]:
#y, imgs = unpickle('/Users/oli/Dropbox/data/CIFAR-10/cifar-10-batches-py/test_batch')
y, imgs = unpickle('/home/dueo/Dropbox/data/CIFAR-10/cifar-10-batches-py/test_batch')

In [12]:
y[0:10], imgs.shape

(array([3, 8, 8, 0, 6, 6, 1, 6, 3, 1], dtype=uint8), (10000, 32, 32, 3))
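
The integer labels index into the list of class names. As a small illustration, the first ten labels printed above translate as follows.

```python
names = ['plane', 'auto', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
first_labels = [3, 8, 8, 0, 6, 6, 1, 6, 3, 1]   # copied from the output above

label_names = [names[i] for i in first_labels]
print(label_names)
# ['cat', 'ship', 'ship', 'plane', 'frog', 'frog', 'auto', 'frog', 'cat', 'auto']
```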

### Finetuning

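For training, the TFLearn library (http://tflearn.org/) is quite convenient. This library introduces some variables that need to be initialized. We find them with the following trick: we record the set of all variables before creating the trainer and take the difference afterwards.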

In [13]:
import tflearn

In [14]:
train_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, "vgg_16/fc") #The new layers are named fc6 and fc7
print("Number of trainable variables sets {} detail: {}".format(len(train_vars), [u.name for u in train_vars]))

Number of trainable variables sets 4 detail: [u'vgg_16/fc6/weights:0', u'vgg_16/fc6/biases:0', u'vgg_16/fc7/weights:0', u'vgg_16/fc7/biases:0']


In [15]:
temp = set(tf.all_variables()) #All we have 
optimizer = tf.train.AdamOptimizer()
trainop = tflearn.TrainOp(loss=loss, metric=None, optimizer=optimizer, trainable_vars=train_vars,batch_size=32)
trainer = tflearn.Trainer(train_ops=trainop, tensorboard_verbose=0, session=sess, tensorboard_dir='/tmp/dumm/fine_tuned')
uninitialized = set(tf.all_variables()) - temp
print('The Following Variables have been uninitialized {}'.format([u.name for u in uninitialized]))
sess.run(tf.initialize_variables(uninitialized))

The Following Variables have been uninitialized [u'Training_step:0', u'Global_Step:0', u'Optimizer/vgg_16/fc6/biases/Adam_1:0', u'val_loss:0', u'Optimizer/vgg_16/fc7/biases/Adam:0', u'Optimizer/vgg_16/fc7/weights/Regularizer/l2_regularizer/moving_avg:0', u'Optimizer/vgg_16/conv4/conv4_1/weights/Regularizer/l2_regularizer/moving_avg:0', u'Optimizer/vgg_16/conv5/conv5_2/weights/Regularizer/l2_regularizer/moving_avg:0', u'Optimizer/vgg_16/fc6/weights/Adam_1:0', u'Optimizer/vgg_16/fc7/biases/Adam_1:0', u'Optimizer/vgg_16/conv4/conv4_2/weights/Regularizer/l2_regularizer/moving_avg:0', u'Optimizer/vgg_16/conv3/conv3_2/weights/Regularizer/l2_regularizer/moving_avg:0', u'Optimizer/beta2_power:0', u'Optimizer/vgg_16/fc7/weights/Adam:0', u'Optimizer/vgg_16/fc6/weights/Adam:0', u'Optimizer/xentropy_mean/moving_avg:0', u'Optimizer/vgg_16/conv3/conv3_1/weights/Regularizer/l2_regularizer/moving_avg:0', u'Optimizer/vgg_16/conv2/conv2_1/weights/Regularizer/l2_regularizer/moving_avg:0', u'Optimizer/Opt

In [16]:
trainer.fit({images: imgs, Y: y}, n_epoch=5, show_metric=False, val_feed_dicts=0.1, run_id='5Epochs')

Training Step: 1410  | total loss: 0.75346
| Optimizer | epoch: 005 | loss: 0.75346 | val_loss: 1.33991 -- iter: 9000/9000
--
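
The step counter in the log can be reproduced by hand. The sketch below takes the values from the cells above (10000 images, 10% held out for validation, batch size 32, 5 epochs) and assumes that the final, smaller batch of each epoch also counts as one step.

```python
import math

n_total = 10000        # images in the CIFAR-10 test batch
val_fraction = 0.1     # val_feed_dicts=0.1
batch_size = 32
n_epoch = 5

n_train = n_total - int(n_total * val_fraction)
steps_per_epoch = int(math.ceil(n_train / float(batch_size)))
total_steps = steps_per_epoch * n_epoch

print("training images: {}".format(n_train))
print("steps per epoch: {}".format(steps_per_epoch))
print("total steps:     {}".format(total_steps))
```

Under this assumption the result agrees with the 9000 training images and the final training step of 1410 shown in the log.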


In [17]:
sess.close()

![](Fine_Tuning_TB.png)