# Fine-Tuning and Transfer Learning in TF-Slim
*by Marvin Bertin*
<img src="../images/tensorflow.png" width="400">

In [1]:
import sys  
sys.path.append("../") 

import tensorflow as tf
slim = tf.contrib.slim

%load_ext autoreload
%autoreload 2

## Fine-Tuning Existing Models

## Restoring Variables from a Checkpoint

After a model has been trained, its variables can be restored from a checkpoint with tf.train.Saver(). A Saver can restore every variable in the graph or only a chosen subset.

In [None]:
# Create some variables.
v1 = tf.Variable(..., name="v1")
v2 = tf.Variable(..., name="v2")

# Create a Saver that restores all the variables.
restorer = tf.train.Saver()

# Alternatively, create a Saver that restores only the listed variables.
restorer = tf.train.Saver([v1, v2])

with tf.Session() as sess:
    # Restore variables from disk.
    restorer.restore(sess, "/tmp/model.ckpt")
    print("Variables restored.")

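As a rough mental model of "restore all" versus "restore a few", the sketch below uses plain dictionaries in place of a checkpoint file and graph variables. The `restore` helper is hypothetical and is not how tf.train.Saver is implemented; it only illustrates that values are looked up by variable name and that unselected variables keep their initial values.

```python
# Conceptual sketch only: plain dicts stand in for a checkpoint file and for
# the variables of a graph. This is not the tf.train.Saver implementation.

def restore(checkpoint, variables, names=None):
    """Copy values from `checkpoint` into `variables`, matching by name.

    checkpoint: dict mapping variable name -> saved value
    variables:  dict mapping variable name -> current value (updated in place)
    names:      names to restore, or None to restore every variable
    """
    selected = list(variables) if names is None else list(names)
    for name in selected:
        if name not in checkpoint:
            # Asking for a variable that the checkpoint lacks is an error.
            raise KeyError("variable %r not found in checkpoint" % name)
        variables[name] = checkpoint[name]
    return selected


checkpoint = {"v1": 1.0, "v2": 2.0}

# Restore everything.
model = {"v1": 0.0, "v2": 0.0}
restore(checkpoint, model)
print(model)

# Restore only v1; v2 keeps its freshly initialized value.
model = {"v1": 0.0, "v2": 0.0}
restore(checkpoint, model, names=["v1"])
print(model)
```
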
## Initialize a Model From a Checkpoint

It is common to want to 'warm-start' a model from a pre-trained checkpoint, for example to resume training after it was paused or to recover the model after a crash.

TF-Slim provides a convenient mechanism for doing so.

In [None]:
# Create the train_op
# (total_loss, optimizer and my_log_dir are assumed to be defined elsewhere)
train_op = slim.learning.create_train_op(total_loss, optimizer)

# Path to the checkpoint to warm-start from
checkpoint_path = '/path/to/old_model_checkpoint'

# Get all the variables of the model
variables_to_restore = slim.get_model_variables()

# Build the op and feed dict that assign the checkpoint values to the variables
init_assign_op, init_feed_dict = slim.assign_from_checkpoint(
    checkpoint_path, variables_to_restore)

# Create an initial assignment function; it receives the session as argument.
def InitAssignFn(sess):
    sess.run(init_assign_op, init_feed_dict)

# Run the training loop; init_fn is called before training starts.
slim.learning.train(train_op, my_log_dir, init_fn=InitAssignFn)

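The role of the init function can be illustrated with a toy loop, sketched here without TensorFlow. The names `toy_train`, `step_fn`, `init_assign_fn` and `old_checkpoint` are hypothetical; the only point is that the init function runs once, after default initialization and before the first training step, so training continues from the checkpoint values instead of from scratch.

```python
# Toy stand-in for a training loop that accepts an init function.
# None of these names are TF-Slim APIs; they are illustrative assumptions.

def toy_train(step_fn, number_of_steps, init_fn=None):
    # Default initialization: every run starts from the same fresh state.
    state = {"weight": 0.0}
    # The init function runs once, before the first training step,
    # and may overwrite the freshly initialized values.
    if init_fn is not None:
        init_fn(state)
    for _ in range(number_of_steps):
        step_fn(state)
    return state


def step_fn(state):
    # Pretend that each training step moves the weight by a fixed amount.
    state["weight"] += 0.5


# Value saved by an earlier, interrupted run (hypothetical).
old_checkpoint = {"weight": 5.0}


def init_assign_fn(state):
    # Warm start: overwrite the fresh values with the checkpoint values.
    state.update(old_checkpoint)


cold = toy_train(step_fn, number_of_steps=2)
warm = toy_train(step_fn, number_of_steps=2, init_fn=init_assign_fn)
print(cold)
print(warm)
```
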
## Fine-Tuning Part of a Model from a Checkpoint

Rather than initializing all of the weights of a given model, we sometimes only want to restore some of the weights from a checkpoint.

**Partially Restoring Models**

For example, it is common to fine-tune a pre-trained model on an entirely new dataset. In these situations, one can use TF-Slim's helper functions to select a subset of variables to restore.

In [None]:
# Select the variables to restore with lists of scopes to include or exclude.
# Keep all the convolutional layers and leave out the fully connected ones.
variables_to_restore = slim.get_variables_to_restore(
    include=["conv"], exclude=["fc8", "fc9"])

# Equivalent here, provided no fully connected layer lives under a "conv" scope
variables_to_restore = slim.get_variables_to_restore(include=["conv"])

# or by variable name
variables_to_restore = slim.get_variables_by_name("conv2")

# or by variable name suffix
variables_to_restore = slim.get_variables_by_suffix("2")

# or by variable scope
variables_to_restore = slim.get_variables(scope="conv-scope")

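The include/exclude selection can be approximated in plain Python, which makes it easy to check which names a given pair of lists would keep. This is a sketch under the assumption that a scope selects every variable whose name matches the scope pattern from the start of the name; `select_variables` and the variable names are hypothetical, not part of TF-Slim.

```python
import re

# Rough approximation of scope-based variable selection.
# Assumption: a scope matches a variable when the scope pattern matches
# the beginning of the variable name.

def select_variables(names, include=None, exclude=None):
    def in_scope(name, scopes):
        return any(re.match(scope, name) for scope in scopes)

    # With no include list, start from every variable.
    if include is None:
        selected = list(names)
    else:
        selected = [n for n in names if in_scope(n, include)]
    # Then drop everything that falls under an excluded scope.
    if exclude is not None:
        selected = [n for n in selected if not in_scope(n, exclude)]
    return selected


# Hypothetical variable names for a small network.
names = [
    "conv1/weights", "conv1/biases",
    "conv2/weights", "conv2/biases",
    "fc8/weights", "fc8/biases",
    "fc9/weights", "fc9/biases",
]

a = select_variables(names, include=["conv"], exclude=["fc8", "fc9"])
b = select_variables(names, include=["conv"])
c = select_variables(names, exclude=["fc8", "fc9"])
print(a)
print(a == b == c)
```

For this particular set of names the three calls select the same variables, which is why the include-only form can stand in for the longer one.
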
## Transfer Learning on a Different Task


<img src="../images/transfer_learning.jpg" width="400">

In practice, Convolutional Networks are rarely trained from scratch: training is very time consuming, and it is relatively rare to have a dataset of sufficient size.

Instead, people start from models pre-trained on a large dataset, for example ImageNet, which contains 1.2 million images in 1000 categories.

Your new task most likely won't have exactly the same number of categories as the pre-trained model. In practice, only the convolutional layers are kept and the fully connected layers are re-initialized with random weights.

There are two main Transfer Learning schemes:

1. Convolutional layers as a **fixed feature extractor**

    This scheme treats the Convolutional layers as a fixed feature extractor for the new dataset. Their weights are frozen and therefore not trained. They are used to extract features and construct a rich vector embedding for every image. Once these embeddings have been computed for all images, they become the new inputs and can be used to train a linear classifier (e.g. Linear SVM or Softmax classifier) for the new dataset.

2. **Fine-tuning** the Convolutional layers

    This scheme treats the Convolutional layers as part of the model and applies backpropagation through the whole model. This fine-tunes the weights of the pre-trained network to the new task. It is also possible to keep some of the earlier layers fixed (due to overfitting concerns) and only fine-tune some higher-level portion of the network. Earlier layers of a CNN contain more generic features (e.g. edge detectors or color blob detectors) that should be useful to many tasks. However, later layers become progressively more specific to the original task and therefore may not transfer directly to the new task.
 
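The two schemes differ mainly in which variables receive gradient updates. The sketch below makes that explicit with a hypothetical helper and hypothetical variable names; in TF-Slim, a list like this could plausibly be handed to the variables_to_train argument of slim.learning.create_train_op, but treat that wiring as an assumption to verify against your version.

```python
# Sketch of which variables are trained under each transfer learning scheme.
# The helper and the variable names are illustrative assumptions.

def variables_to_train(names, scheme, frozen_scopes=()):
    if scheme == "feature_extractor":
        # Only the new classifier is trained; convolutional layers stay fixed.
        return [n for n in names if not n.startswith("conv")]
    if scheme == "fine_tune":
        # Everything is trained except the scopes that are explicitly frozen.
        return [n for n in names if not n.startswith(tuple(frozen_scopes))]
    raise ValueError("unknown scheme: %r" % scheme)


names = ["conv1/weights", "conv2/weights", "conv3/weights", "fc_new/weights"]

# Scheme 1: fixed feature extractor.
print(variables_to_train(names, "feature_extractor"))

# Scheme 2: fine-tune the whole model.
print(variables_to_train(names, "fine_tune"))

# Scheme 2 with the earliest layer kept fixed.
print(variables_to_train(names, "fine_tune", frozen_scopes=["conv1"]))
```
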

In [None]:
# Load the new dataset
images, labels = DataLoader(...)

# Create the model
predictions = ModelNewTask(images)

# total_loss and optimizer are assumed to be defined elsewhere
train_op = slim.learning.create_train_op(total_loss, optimizer)

# Specify where the model, trained on ImageNet, was saved.
model_path = '/path/to/pre_trained_on_imagenet.checkpoint'

# Specify where the new model will be stored:
log_dir = '/path/to/new_model_dir/'

# Restore only the convolutional layers:
variables_to_restore = slim.get_variables_to_restore(exclude=['fc6', 'fc7', 'fc8'])

# Create a function that restores the selected variables from the checkpoint
init_fn = slim.assign_from_checkpoint_fn(model_path, variables_to_restore)

# Runs a training loop using a TensorFlow supervisor.
slim.learning.train(train_op, log_dir, init_fn=init_fn)

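One reason the final layer has to be left out of the restore list is that its shape depends on the number of categories, so the saved tensor no longer fits the new model. The following sketch compares hypothetical shapes to show which variables could be restored as they are; the names, the shapes and the 10-category task are assumptions made for illustration.

```python
# Sketch: find the variables whose saved shape still fits the new model.
# All names and shapes below are hypothetical.

def restorable(checkpoint_shapes, model_shapes):
    ok, skipped = [], []
    for name, shape in model_shapes.items():
        if checkpoint_shapes.get(name) == shape:
            ok.append(name)
        else:
            # Missing from the checkpoint, or saved with a different shape.
            skipped.append(name)
    return ok, skipped


checkpoint_shapes = {
    "conv1/weights": (3, 3, 3, 64),
    "fc8/weights": (4096, 1000),  # 1000 ImageNet categories
}
model_shapes = {
    "conv1/weights": (3, 3, 3, 64),
    "fc8/weights": (4096, 10),    # hypothetical 10-category task
}

ok, skipped = restorable(checkpoint_shapes, model_shapes)
print("restore:", ok)
print("re-initialize:", skipped)
```

Layers such as fc6 and fc7 would pass a shape check like this one; excluding them as well, as the cell above does, is a modelling choice to retrain the whole classifier part.
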
## Next Lesson
### TensorBoard - Visualize Neural Networks and Inspect Model Learning
-  Use TensorBoard to visualize the network's computational graph and to inspect and understand the model's training progress.

<img src="../images/divider.png" width="100">