# Pretrained Keras Models


In this notebook we will go through loading a pretrained Keras model. While the API can sometimes be a little buggy, it puts a number of the most powerful pretrained models at your disposal in a few lines of code. The notebook covers the ins and outs of using pretrained Keras image models; by the end you should be able to use them in a number of different tasks and integrate them into more complex models.

For more information on the Keras pretrained models visit: https://keras.io/applications/

In [None]:
%load_ext autoreload
%autoreload 2
%matplotlib inline
## Imports
# Note: this notebook uses the TensorFlow 1.x graph API (placeholders and sessions).
import tensorflow as tf
# Import from the public tensorflow.keras namespace rather than the private tensorflow.python one.
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input, decode_predictions
from tensorflow.keras import backend as K
import numpy as np
import matplotlib.pyplot as plt
import cv2

For this example we will use an image of a cat for reference.

In [None]:
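H, W = 224, 224
img_path = 'cat.jpeg'
img = cv2.imread(img_path)                  # OpenCV loads images in BGR channel order
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # matplotlib and preprocess_input expect RGB
img = cv2.resize(img, (W, H))               # cv2.resize takes (width, height)

plt.imshow(img)
plt.show()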
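
OpenCV's `imread` returns channels in BGR order, whereas `plt.imshow` interprets the last axis as RGB, so colours can look swapped when a cv2 image is plotted directly. A minimal sketch of the channel swap on a synthetic array (the tiny 2x2 image is made up for illustration):

```python
import numpy as np

# Synthetic 2x2 image in OpenCV's BGR order: pure blue everywhere.
bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[..., 0] = 255  # channel 0 is blue in BGR

# Reversing the last axis converts BGR to RGB (and vice versa).
rgb = bgr[..., ::-1]

print(rgb[0, 0])  # blue now sits in the last (B) position of RGB
```

The same reversal can be applied to a real image loaded with `cv2.imread`.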

The image must be put into a batch shape to be ingested by the model:
shape = __[BATCH SIZE, HEIGHT, WIDTH, COLOR CHANNELS]__.

In [None]:
img = img.reshape([1,H,W,3])
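
As an alternative to `reshape`, a batch axis can be added without spelling out the full shape. The sketch below uses zero-filled placeholder arrays instead of a real image, and shows how several images would be stacked into one batch:

```python
import numpy as np

H, W = 224, 224
single = np.zeros((H, W, 3), dtype=np.uint8)  # stand-in for one RGB image

# Add a leading batch axis: (H, W, 3) -> (1, H, W, 3).
batch_of_one = np.expand_dims(single, axis=0)

# Stack several same-sized images into one batch: (4, H, W, 3).
batch_of_four = np.stack([single] * 4, axis=0)

print(batch_of_one.shape, batch_of_four.shape)
```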

First we build the computational graph of the model. Take note of the argument `include_top`: it controls whether the final fully connected prediction layers are included. When using the pretrained model on a different task, these layers can be discarded.

In [None]:
# Build the computational graph of the model and load the ImageNet weights.
model = VGG16(include_top=True, weights='imagenet')

When inputting images into the model, they must be preprocessed in the same manner as the images the model was trained on. The details can differ slightly between models, so each Keras model comes with a `preprocess_input` function that takes care of this for you. Feed an appropriate placeholder into this function to make sure the preprocessing matches the model implementation.

In [None]:
inputs = tf.placeholder(shape=img.shape, dtype=tf.float32)
processed_inputs = preprocess_input(inputs)
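
For intuition, here is a rough NumPy sketch of what `preprocess_input` does for VGG16, assuming the "caffe" style preprocessing: the RGB channels are flipped to BGR and the per-channel ImageNet mean is subtracted, with no scaling. The helper name `vgg_preprocess_sketch` is hypothetical and not part of Keras.

```python
import numpy as np

# Per-channel ImageNet means in BGR order (assumed 'caffe' style preprocessing).
IMAGENET_MEAN_BGR = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def vgg_preprocess_sketch(rgb_batch):
    """Hypothetical stand-in for preprocess_input on an RGB batch in [0, 255]."""
    x = rgb_batch.astype(np.float32)
    x = x[..., ::-1]              # RGB -> BGR
    return x - IMAGENET_MEAN_BGR  # zero-centre each channel, no scaling

# Made-up batch: one 2x2 image that is pure red.
red_batch = np.zeros((1, 2, 2, 3), dtype=np.uint8)
red_batch[..., 0] = 255
processed = vgg_preprocess_sketch(red_batch)
print(processed[0, 0, 0])
```

This is also why the channel order of the loaded image matters: the function expects RGB input.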

Pass the preprocessed tensor into the model to complete the graph. The output tensor in this case will be a prediction. If `include_top=False` is set in the model declaration, the output will instead be the last feature map before the prediction layers.

In [None]:
pred = model(processed_inputs)
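
To get a feel for the output shape when `include_top=False`, the arithmetic can be sketched in plain Python. This assumes the standard VGG16 layout: five 2x2 max-pooling stages that each halve the spatial size, "same"-padded convolutions that leave it unchanged, and 512 channels in the last block. The function name is hypothetical.

```python
def vgg16_feature_shape(height, width, n_pools=5, channels=512):
    """Shape of the last feature map after n_pools halvings (floor division)."""
    for _ in range(n_pools):
        height //= 2
        width //= 2
    return (height, width, channels)

feature_shape = vgg16_feature_shape(224, 224)
print(feature_shape)  # (7, 7, 512)
```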

You can access any intermediate tensor in the model through `model.layers`. Once you know the architecture of the particular model (see details at https://keras.io/applications/), these tensors are easier to reach through a dictionary indexed by layer name, and they can be used in conjunction with any additional operations added to the computational graph.

In [None]:
model_layers = {l.name: l.output for l in model.layers}

for k,v in model_layers.items():
    print(k,'has shape', v.shape)

To run the model with the correct weights, the Keras backend session must be used. Keras loads the pretrained weights into its own backend session, so a separately created session would not have them. This can be a bit buggy, so be careful to make sure the correct weights are loaded.

In [None]:
# Reuse the session that Keras loaded the weights into. It is not wrapped in a
# `with` block, because that would close the session on exit.
sess = K.get_session()
output = sess.run(pred, feed_dict={inputs: img})


The model outputs a prediction for each possible class; this model is trained on ImageNet, which has 1000 different labels. As in most multi-class classification models, we just take the class with the maximum output as our result.

In [None]:
print('The output has a shape',output.shape)
print('First 5 predictions',output[0,:5])
print()
idx = np.argmax(output[0])
print('The largest prediction label %d has output %.3f.'%(idx,output[0,idx]))
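
`np.argmax` returns only the single best class. A minimal sketch of ranking the top few classes with `np.argsort` follows, using a made-up four-class score vector in place of the real 1000-class output:

```python
import numpy as np

# Made-up scores for one image and four classes (stand-in for the model output).
scores = np.array([[0.05, 0.60, 0.10, 0.25]])
top_k = 3

# argsort is ascending, so reverse it and keep the first top_k indices.
top_idx = np.argsort(scores[0])[::-1][:top_k]

for rank, i in enumerate(top_idx, start=1):
    print('Rank %d: label %d with output %.3f' % (rank, i, scores[0, i]))
```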

We can use the Keras helper `decode_predictions` to interpret this output correctly.

In [None]:
# convert the probabilities to class labels
label = decode_predictions(output)
# retrieve the most likely result, i.e. the one with the highest probability
label = label[0][0]
# print the classification
print('The prediction is %s with %.2f%% certainty.' % (label[1], label[2]*100))

## Exercise

Use the VGG model and fine-tune it on the MNIST classification task.

Considerations:
* The VGG model takes images of size 224,224 and MNIST data is of size 28,28.
* A fully connected layer will need to be added onto the vgg model with the correct number of labels for our classification problem.
* Which parameters do we want to train, and what should the learning rate be?
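
As a starting point for the first consideration, one possible approach (an assumption, not the only option) is nearest-neighbour upsampling by a factor of 8, since 28 × 8 = 224, followed by copying the grey channel three times. `mnist_to_vgg_input` is a hypothetical helper name, and random data stands in for real MNIST digits.

```python
import numpy as np

def mnist_to_vgg_input(images, factor=8):
    """Upsample (N, 28, 28) grey images to (N, 224, 224, 3) by pixel repetition."""
    x = np.repeat(images, factor, axis=1)  # repeat rows
    x = np.repeat(x, factor, axis=2)       # repeat columns
    return np.stack([x, x, x], axis=-1)    # copy grey into three channels

# Random stand-in for a small batch of MNIST digits.
fake_digits = np.random.randint(0, 256, size=(2, 28, 28), dtype=np.uint8)
vgg_ready = mnist_to_vgg_input(fake_digits)
print(vgg_ready.shape)
```

Resizing the whole dataset up front costs a lot of memory, so converting one batch at a time may be preferable.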
