# Transfer Learning

## Requirements

This notebook requires the following package to be installed:

1. tensorflow

This can be installed via Anaconda (**`conda install <package>`**) or via pip (**`pip install <package>`**).

However, it is already installed on Google Colab.

## Introduction

>In this lab, we will be using transfer learning, which means we start with a model that has already been trained on another problem and then retrain it on a similar problem. **Deep learning from scratch can take days, but transfer learning can be done in short order.**

We're going to use a model trained on ImageNet's (http://image-net.org/) Large Visual Recognition Challenge dataset (http://www.image-net.org/challenges/LSVRC/2012/).

These models can differentiate between 1,000 different classes, like Dalmatian or dishwasher. You will have a choice of model architectures, so you can determine the right tradeoff between speed, size and accuracy for your problem.

We will use this same model, but retrain it to tell apart a small number of classes based on our own examples.

## Overview

![transfer](https://github.com/jojker/PML_Workshops/blob/master/Summer%202019/Day%202%20-%20Goal%201%20-%20Turning%20Images%20into%20Data/Ex%202%20-%20Multi-label%20classification%20(hopfield%20and%20CNNs)/TransferLearning%20(very%20large%20choice%20set)/imgs/transferlearningworkflow.png?raw=1)

In summary, by starting from a network trained on a similar problem, we can dramatically reduce the training time for a new problem and still obtain decent results.

## Objectives

We will learn:

1. How to use Python and TensorFlow to train an image classifier
2. How to classify images with this classifier

## Training Data

We'll be using an archive of Creative Commons licensed flower photos (~278 MB in size), which contains the following:

```
    daisy/
    dandelion/
    roses/
    sunflowers/
    tulips/
    LICENSE.txt
```

In [0]:
# download the flower tar file from the tensorflow website
!wget http://download.tensorflow.org/example_images/flower_photos.tgz

# extract the images
!tar -xvzf flower_photos.tgz

The images have now been extracted. Refresh the file browser, and the following class folders appear under:

Files/flower_photos/

daisy, dandelion, roses, sunflowers, tulips
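
As a quick sanity check before training, the number of images in each class folder can be counted. The sketch below assumes the archive was extracted to `flower_photos/` in the working directory, as in the cell above. The helper name `count_images_per_class` and the list of image extensions are illustrative choices, not part of TensorFlow.

```python
from pathlib import Path

# File extensions treated as images (an assumption for this sketch).
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def count_images_per_class(image_dir):
    """Return {class_name: image_count} for each sub-folder of image_dir."""
    counts = {}
    for class_dir in sorted(Path(image_dir).iterdir()):
        if not class_dir.is_dir():
            continue  # skip plain files such as LICENSE.txt
        counts[class_dir.name] = sum(
            1 for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts
```

Calling `count_images_per_class("flower_photos")` should return one entry per flower folder. A class with very few images is worth noticing before retraining.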

## Retraining the network

Google's retraining script can retrain either the Inception V3 model (https://github.com/tensorflow/models/tree/master/research/slim#pre-trained-models) or a MobileNet (https://research.googleblog.com/2017/06/mobilenets-open-source-models-for.html).

The principal difference is that Inception V3 is optimized for accuracy, while the MobileNets are optimized to be small and efficient, at the cost of some accuracy.

**We'll be using the MobileNet.**

Inception V3 has a first-choice accuracy of 78% on ImageNet, but the model is 85MB, and requires much more processing than even the largest MobileNet configuration, which achieves 70.5% accuracy, with just a 19MB download.

## Configuration Options

Pick the following configuration options:

1. Input image resolution: 128, 160, 192, or 224 px. Unsurprisingly, feeding in a higher resolution image takes more processing time, but results in better classification accuracy. Google recommends 224 as an initial setting.

2. The relative size of the model as a fraction of the largest MobileNet: 1.0, 0.75, 0.50, or 0.25. Google recommends 0.5 as an initial setting. The smaller models run significantly faster, at a cost of accuracy.

With the recommended settings, it typically takes only a couple of minutes to retrain on a laptop.
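
The two options above combine into a single architecture name of the form `mobilenet_<size>_<resolution>`, as used in the retraining cell later on. A minimal sketch of building and validating that name follows. The helper `mobilenet_architecture` is hypothetical and only mirrors the values listed above.

```python
# Values listed in the configuration options above.
VALID_IMAGE_SIZES = (128, 160, 192, 224)
VALID_WIDTH_MULTIPLIERS = ("1.0", "0.75", "0.50", "0.25")


def mobilenet_architecture(width_multiplier="0.50", image_size=224):
    """Build an architecture name such as 'mobilenet_0.50_224'."""
    if image_size not in VALID_IMAGE_SIZES:
        raise ValueError(f"image_size must be one of {VALID_IMAGE_SIZES}")
    if width_multiplier not in VALID_WIDTH_MULTIPLIERS:
        raise ValueError(
            f"width_multiplier must be one of {VALID_WIDTH_MULTIPLIERS}"
        )
    return f"mobilenet_{width_multiplier}_{image_size}"


print(mobilenet_architecture())            # mobilenet_0.50_224
print(mobilenet_architecture("1.0", 128))  # mobilenet_1.0_128
```

The width multiplier is kept as a string so that `0.50` is not shortened to `0.5` when the name is formatted.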

In [0]:
# download the retraining script
!wget https://github.com/tensorflow/hub/raw/master/examples/image_retraining/retrain.py

# download the image labeler python file
!wget https://github.com/tensorflow/tensorflow/raw/master/tensorflow/examples/label_image/label_image.py

## Accuracy

The graph below shows the first-choice-accuracies of these configurations (y-axis), vs the number of calculations required (x-axis), and the size of the model (circle area).

16 points are shown for MobileNet. For each of the 4 model sizes (circle area in the figure) there is one point for each image resolution setting. The 128px image size models are represented by the lower-left point in each set, while the 224px models are in the upper right.

Other notable architectures are also included for reference. "GoogleNet" in this figure is "Inception V1" (https://github.com/tensorflow/models/tree/master/slim#pre-trained-models). An extended version of this figure is available in slides 84-89 from here: (http://cs231n.stanford.edu/slides/2017/cs231n_2017_lecture9.pdf)

## Tensorboard

Before starting the training, launch TensorBoard in the background. TensorBoard is a monitoring and inspection tool included with TensorFlow. You will use it to monitor the training progress.

In [0]:
# download and unzip ngrok, which exposes TensorBoard running on Colab through a public URL
!wget https://bin.equinox.io/c/4VmDzA7iaHb/ngrok-stable-linux-amd64.zip
!unzip ngrok-stable-linux-amd64.zip

In [0]:
# initialize tensorboard

LOG_DIR = 'tf_files/training_summaries'
get_ipython().system_raw(
    'tensorboard --logdir {} --host 0.0.0.0 --port 6006 &'
    .format(LOG_DIR)
)

get_ipython().system_raw('./ngrok http 6006 &')


# produces the URL to visit to see the TensorBoard webpage
! curl -s http://localhost:4040/api/tunnels | python3 -c \
    "import sys, json; print(json.load(sys.stdin)['tunnels'][0]['public_url'])"

## Running the retraining script

As noted in the introduction, ImageNet models are networks with millions of parameters that can differentiate a large number of classes. We're only training the final layer of that network, so training will end in a reasonable amount of time.

Start your retraining with one big command (note the `--summaries_dir` option, which sends training progress reports to the directory that TensorBoard is monitoring):

In [0]:
# go to Runtime and change runtime type and select GPU

import tensorflow as tf
device_name = tf.test.gpu_device_name()
if device_name != '/device:GPU:0':
  raise SystemError('GPU device not found')
print('Found GPU at: {}'.format(device_name))

In [0]:
# retrain

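IMAGE_SIZE = 224
# Build the name in Python: "${IMAGE_SIZE}" is shell syntax and is not
# substituted inside a Python string.
ARCHITECTURE = f"mobilenet_0.50_{IMAGE_SIZE}"

# retrain.py runs as a separate process, so wrapping the call in tf.device()
# has no effect on it. IPython substitutes {ARCHITECTURE} in the command below.
!python retrain.py \
    --bottleneck_dir=tf_files/bottlenecks \
    --how_many_training_steps=500 \
    --model_dir=tf_files/models/ \
    --summaries_dir=tf_files/training_summaries/"{ARCHITECTURE}" \
    --output_graph=tf_files/retrained_graph.pb \
    --output_labels=tf_files/retrained_labels.txt \
    --architecture="{ARCHITECTURE}" \
    --image_dir=flower_photos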

## In the background

This script downloads the pre-trained model, adds a new final layer, and trains that layer on the flower photos you've downloaded. 

ImageNet does not include any of these flower species we're training on here. However, the kinds of information that make it possible for ImageNet to differentiate among 1,000 classes are also useful for distinguishing other objects. By using this pre-trained network, we are using that information as input to the final classification layer that distinguishes our flower classes.

*If you want to get higher accuracy, you can train for longer - e.g. instead of 500 iterations, increase this to a few thousand. Naturally, this will take longer, though.*

## How retraining works

The first phase analyzes all the images on disk and calculates the bottleneck values for each of them. What's a bottleneck?

![bottle](https://github.com/jojker/PML_Workshops/blob/master/Summer%202019/Day%202%20-%20Goal%201%20-%20Turning%20Images%20into%20Data/Ex%202%20-%20Multi-label%20classification%20(hopfield%20and%20CNNs)/TransferLearning%20(very%20large%20choice%20set)/imgs/stack.png?raw=1)

These ImageNet models are made up of many layers stacked on top of each other. A simplified picture of Inception V3 from TensorBoard is shown above (all the details are available in this paper: http://www.cs.unc.edu/~wliu/papers/GoogLeNet.pdf, with a complete picture on page 6). These layers are pre-trained and are already very good at finding and summarizing information that will help classify most images. For this codelab, you are training only the last layer (`final_training_ops` in the figure below), while all the previous layers retain their already-trained state.

In the figure below, the node labeled "softmax", on the left side, is the output layer of the original model, while all the nodes to the right of the "softmax" were added by the retraining script.

![network](https://github.com/jojker/PML_Workshops/blob/master/Summer%202019/Day%202%20-%20Goal%201%20-%20Turning%20Images%20into%20Data/Ex%202%20-%20Multi-label%20classification%20(hopfield%20and%20CNNs)/TransferLearning%20(very%20large%20choice%20set)/imgs/network.png?raw=1)

## Bottlenecks

A bottleneck is an informal term we often use for the layer just before the final output layer that actually does the classification. "Bottleneck" is not used to imply that the layer is slowing down the network. We use the term because, near the output, the representation is much more compact than in the main body of the network.

Every image is reused multiple times during training. Calculating the layers behind the bottleneck for each image takes a significant amount of time. Since these lower layers of the network are not being modified, their outputs can be cached and reused.

So the script is running the constant part of the network, everything below the node labeled Bottlene... above, and caching the results.

The command you ran saves these files to the `tf_files/bottlenecks/` directory (the `--bottleneck_dir` option). If you rerun the script, they'll be reused, so you don't have to wait for this part again.
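
The caching idea can be sketched in a few lines. This is not the retraining script's own code: `get_or_create_bottleneck` is a hypothetical helper, the JSON file layout is an assumption, and `compute_bottleneck` stands in for running the frozen layers of the network on one image.

```python
import json
from pathlib import Path


def get_or_create_bottleneck(image_path, cache_dir, compute_bottleneck):
    """Return the cached bottleneck for image_path, computing it only once.

    compute_bottleneck is a stand-in for the frozen part of the network:
    it takes an image path and returns a list of floats.
    """
    image_path = Path(image_path)
    # Keep the class folder in the cache path so equal file names in
    # different classes do not collide.
    cache_file = Path(cache_dir) / image_path.parent.name / (image_path.name + ".json")
    cache_file.parent.mkdir(parents=True, exist_ok=True)

    if cache_file.exists():
        return json.loads(cache_file.read_text())

    values = compute_bottleneck(str(image_path))
    cache_file.write_text(json.dumps(values))
    return values
```

The expensive function runs on the first call for an image. Every later call, including calls in a later run, reads the stored values instead.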

## More about training

Once the script finishes generating all the bottleneck files, the actual training of the final layer of the network begins.

The training operates efficiently by feeding the cached value for each image into the Bottleneck layer. The true label for each image is also fed into the node labeled `GroundTruth`. Just these two inputs are enough to calculate the classification probabilities, training updates, and the various performance metrics.

As it trains you'll see a series of step outputs, each one showing training accuracy, validation accuracy, and the cross entropy:

1. **Training accuracy** is the percentage of the images used in the current training batch that were labeled with the correct class.
2. **Validation accuracy** is the percentage of correctly labeled images in a randomly selected group of images from a different set.
3. **Cross entropy** is a loss function that gives a glimpse into how well the learning process is progressing (lower numbers are better here).
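
To make these metrics concrete, here is a minimal pure-Python sketch of accuracy and cross entropy computed from predicted class probabilities. The probabilities and labels are made-up values for illustration, and the function names are not taken from the retraining script.

```python
import math


def accuracy(predicted_probs, true_labels):
    """Fraction of examples whose highest-probability class is the true one."""
    correct = sum(
        1 for probs, label in zip(predicted_probs, true_labels)
        if max(range(len(probs)), key=probs.__getitem__) == label
    )
    return correct / len(true_labels)


def cross_entropy(predicted_probs, true_labels, eps=1e-12):
    """Mean negative log-probability assigned to the true class."""
    total = sum(
        -math.log(max(probs[label], eps))
        for probs, label in zip(predicted_probs, true_labels)
    )
    return total / len(true_labels)


# Made-up predictions for three images over two classes.
probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
labels = [0, 1, 1]
print(f"accuracy:      {accuracy(probs, labels):.3f}")
print(f"cross entropy: {cross_entropy(probs, labels):.3f}")
```

The same computation gives training accuracy when applied to a training batch and validation accuracy when applied to held-out images. Only the input set changes.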

## More on training

The training's objective is to make the cross entropy as small as possible, so you can tell if the learning is working by keeping an eye on whether the loss keeps trending downwards, ignoring the short-term noise.

Once the script has finished generating the bottleneck files, you can check your model's progress by opening TensorBoard (the URL produced above as an output of the TensorBoard initialization cell) and clicking a figure's name to display it. TensorBoard may print out warnings to your command line. These can safely be ignored.

As the program runs, at each step it chooses 10 images at random from the training set, looks up their bottlenecks in the cache, and feeds them into the final layer to get predictions. Those predictions are then compared against the actual labels to update the final layer's weights through a backpropagation process.
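
The loop described above can be illustrated with a toy softmax layer trained on cached feature vectors. This is a sketch under simplifying assumptions: plain gradient descent, a tiny hand-written "cache" of two-dimensional features, and a batch of 2 instead of 10. It is not the retraining script's implementation.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_step(weights, biases, batch, learning_rate=0.1):
    """One gradient-descent update of a softmax layer; returns the batch loss.

    weights[c][j] connects bottleneck feature j to class c.
    batch is a list of (bottleneck_values, true_class_index) pairs.
    """
    n_classes, n_features = len(weights), len(weights[0])
    grad_w = [[0.0] * n_features for _ in range(n_classes)]
    grad_b = [0.0] * n_classes
    loss = 0.0
    for features, label in batch:
        logits = [sum(w * x for w, x in zip(weights[c], features)) + biases[c]
                  for c in range(n_classes)]
        probs = softmax(logits)
        loss -= math.log(max(probs[label], 1e-12))
        for c in range(n_classes):
            # Gradient of cross entropy w.r.t. logit c: p_c - 1[c == label].
            delta = probs[c] - (1.0 if c == label else 0.0)
            grad_b[c] += delta
            for j in range(n_features):
                grad_w[c][j] += delta * features[j]
    for c in range(n_classes):
        biases[c] -= learning_rate * grad_b[c] / len(batch)
        for j in range(n_features):
            weights[c][j] -= learning_rate * grad_w[c][j] / len(batch)
    return loss / len(batch)


# Toy "bottleneck cache": two classes with clearly separated features.
random.seed(0)
cache = [([1.0, 0.0], 0), ([0.9, 0.1], 0), ([0.0, 1.0], 1), ([0.1, 0.9], 1)]
weights = [[0.0, 0.0], [0.0, 0.0]]
biases = [0.0, 0.0]
for step in range(200):
    batch = random.sample(cache, 2)  # random batch drawn from the cache
    loss = train_step(weights, biases, batch)
print(f"final batch loss: {loss:.3f}")
```

Because the features never change, only `weights` and `biases` are updated. This is why training just the final layer is fast.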

As the process continues, you should see the reported accuracy improve. After all the training steps are complete, the script runs a final test accuracy evaluation on a set of images that are kept separate from the training and validation pictures. This test evaluation provides the best estimate of how the trained model will perform on the classification task.
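
One way to keep training, validation, and test images separate across runs is to derive the split from a hash of the file name, so the same file always lands in the same set. The sketch below shows that idea only. The function name and the 10%/10% split are assumptions, and the retraining script's exact procedure may differ.

```python
import hashlib


def assign_split(file_name, validation_pct=10, testing_pct=10):
    """Deterministically place a file in 'training', 'validation' or 'testing'."""
    digest = hashlib.sha1(file_name.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable value in 0..99
    if bucket < validation_pct:
        return "validation"
    if bucket < validation_pct + testing_pct:
        return "testing"
    return "training"
```

Since the assignment depends only on the file name, adding new photos later does not move existing images between sets, which keeps the test accuracy an honest estimate.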

You should see an accuracy value of between 85% and 99%, though the exact value will vary from run to run since there's randomness in the training process. (If you are only training on two classes, you should expect higher accuracy.) This number indicates the percentage of the images in the test set that are given the correct label after the model is fully trained.

## Using the retrained model

The retraining script writes data to the following two files:

1. `tf_files/retrained_graph.pb`, which contains a version of the selected network with a final layer retrained on your categories.
2. `tf_files/retrained_labels.txt`, which is a text file containing labels.
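
The labels file can be read back with a few lines of Python. This sketch assumes the file holds one label per line. `load_labels` here is an illustrative helper written for this notebook, not an import from TensorFlow.

```python
def load_labels(label_file):
    """Read one label per line, skipping blank lines."""
    with open(label_file, "r", encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]
```

After retraining, `load_labels("tf_files/retrained_labels.txt")` should return the flower class names. The position of each name corresponds to one output of the retrained final layer.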

## Classifying an image

Earlier we downloaded a copy of TensorFlow's `label_image.py` example, which you can use to test your network.

As you can see, this Python program takes quite a few arguments. The defaults are all set for this project, but if you used a MobileNet architecture with a different image size you will need to set the `--input_size` argument using the variable you created earlier: `--input_size=${IMAGE_SIZE}`

Now, let's run the script on the image of a daisy (`flower_photos/daisy/21652746_cc379e0eea_m.jpg`)

![daisy](https://github.com/jojker/PML_Workshops/blob/master/Summer%202019/Day%202%20-%20Goal%201%20-%20Turning%20Images%20into%20Data/Ex%202%20-%20Multi-label%20classification%20(hopfield%20and%20CNNs)/TransferLearning%20(very%20large%20choice%20set)/tf_files/flower_photos/daisy/21652746_cc379e0eea_m.jpg?raw=1)


## Testing the network

Each execution will print a list of flower labels (these match the folder names in `flower_photos`), in most cases with the correct flower on top (though each retrained model may be slightly different).

You might get results like this for a daisy photo:

```
    daisy (score = 0.99071)
    sunflowers (score = 0.00595)
    dandelion (score = 0.00252)
    roses (score = 0.00049)
    tulips (score = 0.00032)
```

This indicates a high confidence (~99%) that the image is a daisy, and low confidence for any other label.
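
If you want to work with these scores programmatically, the printed lines can be parsed into `(label, score)` pairs. The sketch below assumes the output format shown above. The script you run may print scores in a different layout, in which case the pattern needs adjusting.

```python
import re

# Matches lines such as "daisy (score = 0.99071)".
SCORE_LINE = re.compile(
    r"^\s*(?P<label>.+?)\s+\(score\s*=\s*(?P<score>[0-9.]+)\)\s*$"
)


def parse_scores(output_text):
    """Return [(label, score), ...] sorted from most to least confident."""
    results = []
    for line in output_text.splitlines():
        match = SCORE_LINE.match(line)
        if match:
            results.append((match.group("label"), float(match.group("score"))))
    return sorted(results, key=lambda pair: pair[1], reverse=True)


sample_output = """
    daisy (score = 0.99071)
    sunflowers (score = 0.00595)
    dandelion (score = 0.00252)
    roses (score = 0.00049)
    tulips (score = 0.00032)
"""
top_label, top_score = parse_scores(sample_output)[0]
print(top_label, top_score)  # daisy 0.99071
```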

In [0]:
# identify the above daisy

!python label_image.py \
    --graph=tf_files/retrained_graph.pb  \
    --labels=tf_files/retrained_labels.txt \
    --output_layer=final_result \
    --image=flower_photos/daisy/21652746_cc379e0eea_m.jpg \
    --input_layer=Placeholder

## Further Testing

You can use `label_image.py` to classify any image file you choose, either from your downloaded collection, or new ones. You just have to change the `--image` file name argument to the script.

For example (`flower_photos/roses/2414954629_3708a1a04d.jpg`):

![rose](https://github.com/jojker/PML_Workshops/blob/master/Summer%202019/Day%202%20-%20Goal%201%20-%20Turning%20Images%20into%20Data/Ex%202%20-%20Multi-label%20classification%20(hopfield%20and%20CNNs)/TransferLearning%20(very%20large%20choice%20set)/tf_files/flower_photos/roses/2414954629_3708a1a04d.jpg?raw=1)


In [0]:
# identify the above rose

!python label_image.py \
    --graph=tf_files/retrained_graph.pb  \
    --labels=tf_files/retrained_labels.txt \
    --output_layer=final_result \
    --image=flower_photos/roses/2414954629_3708a1a04d.jpg \
    --input_layer=Placeholder
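
To classify several images without editing the cell each time, the same command can be built in Python and run in a loop. This is a sketch that reuses only the flags from the cells above. The helper names `build_label_command` and `classify_images` are hypothetical.

```python
import subprocess
import sys


def build_label_command(image_path,
                        graph="tf_files/retrained_graph.pb",
                        labels="tf_files/retrained_labels.txt"):
    """Build the argument list for one label_image.py run."""
    return [
        sys.executable, "label_image.py",
        f"--graph={graph}",
        f"--labels={labels}",
        "--output_layer=final_result",
        f"--image={image_path}",
        "--input_layer=Placeholder",
    ]


def classify_images(image_paths):
    """Run label_image.py once per image and return {path: stdout}."""
    outputs = {}
    for path in image_paths:
        result = subprocess.run(build_label_command(path),
                                capture_output=True, text=True, check=False)
        outputs[path] = result.stdout
    return outputs
```

For example, `classify_images(["flower_photos/daisy/21652746_cc379e0eea_m.jpg", "flower_photos/roses/2414954629_3708a1a04d.jpg"])` runs both of the examples above and collects their printed output.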

# Activity: What happens when you increase/decrease the learning rate?
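
Before experimenting with the retraining script (run `python retrain.py --help` to see which learning-rate option it exposes), it can help to watch the effect on a toy problem. The sketch below minimises f(w) = w² with plain gradient descent. It is an illustration of the general behaviour, not of the flower classifier itself.

```python
def gradient_descent(learning_rate, steps=20, start=1.0):
    """Minimise f(w) = w**2 and return the final value of |w|."""
    w = start
    for _ in range(steps):
        gradient = 2 * w  # derivative of w**2
        w -= learning_rate * gradient
    return abs(w)


for lr in (0.001, 0.1, 1.1):
    print(f"learning_rate={lr}: |w| after 20 steps = {gradient_descent(lr):.4g}")
```

A very small rate barely moves toward the minimum at 0, a moderate rate converges quickly, and a rate that is too large overshoots further on every step and diverges.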

# Extra: Try your own data set! 

# References:

1. Heavily adapted from: https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/#7