# Fetching and playing with a pre-trained VGG Net (16) in Keras

Let's start by importing the packages we need. To simplify things, we will no longer use TensorFlow directly from this point on; instead we move to a higher-level library called Keras.

In [4]:
import numpy as np
from PIL import Image

from keras.applications.vgg16 import VGG16
from keras.applications.vgg16 import preprocess_input, decode_predictions


## Getting a feel for the model and data it is trained on

In this notebook, we're going to fetch a network that is pre-trained on the [ImageNet](http://www.image-net.org) data set. In particular, we're going to fetch the [VGG Net](https://arxiv.org/abs/1409.1556) model with 16 layers (that we're going to refer to as `VGG16`).

The ImageNet project is a large visual database designed for use in visual object recognition research. As of 2016, over ten million image URLs had been hand-annotated by ImageNet to indicate which objects are pictured. ImageNet crowdsources its annotation process.

![ImageNet Data Sample](images/imagenet-sample.jpg "ImageNet Data Sample")

The ImageNet Large Scale Visual Recognition Challenge (ILSVRC), held annually since 2010, is a competition in which research teams submit programs that classify and detect objects and scenes. (Think of it as the Olympics of computer vision.)

VGG Net was introduced as one of the contenders in the 2014 ImageNet Challenge, where it secured first place in the localisation track and second place in the classification track. It is described in great detail in [a paper](https://arxiv.org/abs/1409.1556) that was first posted to arXiv in 2014 and published at ICLR the following year. The paper describes how a family of models built essentially from simple 3x3 convolutional filters, with depth increasing from 11 to 19 weight layers, managed to perform so well at a range of computer vision tasks. The configurations are shown in the figure below (ReLU activations are not shown for brevity).

![VGG Network Architectures](images/vgg-architecture.png "VGG Network Architectures")

We're first going to reproduce the 16-layer network (marked in green) for classification, and then see how it can be repurposed for the style transfer problem.
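One argument the VGG paper makes for stacking small filters is that three stacked 3x3 convolutions cover the same 7x7 receptive field as a single 7x7 convolution while using fewer weights. The plain-Python sketch below illustrates that arithmetic. The helper names and the channel count `C = 512` are chosen here for illustration only, and biases are ignored.

```python
def receptive_field(num_layers, kernel=3):
    """Receptive field of `num_layers` stacked stride-1 convolutions."""
    return 1 + num_layers * (kernel - 1)


def conv_weights(kernel, channels):
    """Weights in one kernel x kernel convolution with `channels` inputs and outputs (no bias)."""
    return kernel * kernel * channels * channels


C = 512  # illustrative channel count (an assumption for this example)

stacked = 3 * conv_weights(3, C)  # three stacked 3x3 layers
single = conv_weights(7, C)       # one 7x7 layer with the same receptive field

print(receptive_field(3))         # 7
print(stacked, single)
print(stacked / single)           # ratio 27/49, a little over half
```

The stack also places a non-linearity after each 3x3 layer, which a single large filter does not.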

## Fetching a pre-trained model in Keras

This is trivial to do in Keras, and can be done in a single line. [There is a selection](https://github.com/fchollet/keras/tree/master/keras/applications) of such models one can import.

In [2]:
model = VGG16(weights='imagenet', include_top=True)

Let's take a look at the model and convince ourselves that it matches the architecture in the paper.

In [3]:
# Map each layer name to its output tensor
layers = {layer.name: layer.output for layer in model.layers}
layers

{'block1_conv1': <tf.Tensor 'Relu:0' shape=(?, 224, 224, 64) dtype=float32>,
 'block1_conv2': <tf.Tensor 'Relu_1:0' shape=(?, 224, 224, 64) dtype=float32>,
 'block1_pool': <tf.Tensor 'MaxPool:0' shape=(?, 112, 112, 64) dtype=float32>,
 'block2_conv1': <tf.Tensor 'Relu_2:0' shape=(?, 112, 112, 128) dtype=float32>,
 'block2_conv2': <tf.Tensor 'Relu_3:0' shape=(?, 112, 112, 128) dtype=float32>,
 'block2_pool': <tf.Tensor 'MaxPool_1:0' shape=(?, 56, 56, 128) dtype=float32>,
 'block3_conv1': <tf.Tensor 'Relu_4:0' shape=(?, 56, 56, 256) dtype=float32>,
 'block3_conv2': <tf.Tensor 'Relu_5:0' shape=(?, 56, 56, 256) dtype=float32>,
 'block3_conv3': <tf.Tensor 'Relu_6:0' shape=(?, 56, 56, 256) dtype=float32>,
 'block3_pool': <tf.Tensor 'MaxPool_2:0' shape=(?, 28, 28, 256) dtype=float32>,
 'block4_conv1': <tf.Tensor 'Relu_7:0' shape=(?, 28, 28, 512) dtype=float32>,
 'block4_conv2': <tf.Tensor 'Relu_8:0' shape=(?, 28, 28, 512) dtype=float32>,
 'block4_conv3': <tf.Tensor 'Relu_9:0' shape=(?, 28, 28

We can also get a sense of how many parameters there are in this model.

In [4]:
model.count_params() # A lot!

138357544
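As a sanity check, the parameter count can be reproduced by hand from the architecture. The sketch below assumes the standard VGG16 configuration (thirteen 3x3 convolutions followed by three fully connected layers, each with biases); the variable names are ours and it does not touch Keras at all.

```python
# (in_channels, out_channels) for the 13 convolutional layers of VGG16
conv_layers = [
    (3, 64), (64, 64),
    (64, 128), (128, 128),
    (128, 256), (256, 256), (256, 256),
    (256, 512), (512, 512), (512, 512),
    (512, 512), (512, 512), (512, 512),
]

# (inputs, outputs) for the 3 fully connected layers;
# 7 * 7 * 512 is the flattened output of the last pooling layer
dense_layers = [(7 * 7 * 512, 4096), (4096, 4096), (4096, 1000)]

# Each 3x3 convolution has 3*3*c_in*c_out weights plus one bias per output channel
conv_params = sum(3 * 3 * c_in * c_out + c_out for c_in, c_out in conv_layers)
# Each dense layer has n_in*n_out weights plus one bias per output unit
dense_params = sum(n_in * n_out + n_out for n_in, n_out in dense_layers)

total_params = conv_params + dense_params
print(conv_params, dense_params, total_params)
```

The total agrees with `model.count_params()`, and the split shows that roughly 89% of the parameters sit in the three fully connected layers rather than in the convolutions.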

## Using it for classification

Now that we have our pre-trained model loaded, we can use it for classification.

### Load a test image and preprocess it

In [1]:
image_path = 'images/Sadie-dancing.png'
image = Image.open(image_path).convert('RGB')  # drop any alpha channel so the array has 3 channels
image = image.resize((224, 224))
image


In [6]:
# Convert it into an array
x = np.asarray(image, dtype='float32')
# Add a batch dimension: (224, 224, 3) -> (1, 224, 224, 3)
x = np.expand_dims(x, axis=0)
# Pre-process the input to match the training data
x = preprocess_input(x)
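To make the preprocessing step less of a black box: for VGG16, `preprocess_input` is generally understood to reorder the channels from RGB to BGR and subtract the per-channel ImageNet mean, without any further scaling. The NumPy sketch below mimics that behaviour. The function name `vgg_preprocess_sketch` is hypothetical and the mean values are the commonly quoted ones, so treat it as an illustration rather than the library's implementation.

```python
import numpy as np

# Commonly quoted ImageNet channel means, in BGR order (an assumption for this sketch)
IMAGENET_MEAN_BGR = np.array([103.939, 116.779, 123.68], dtype='float32')


def vgg_preprocess_sketch(batch_rgb):
    """Approximate VGG-style preprocessing for an (N, H, W, 3) RGB float array."""
    batch_bgr = batch_rgb[..., ::-1]      # reverse the channel axis: RGB -> BGR
    return batch_bgr - IMAGENET_MEAN_BGR  # zero-centre each channel; returns a new array


dummy = np.full((1, 224, 224, 3), 128.0, dtype='float32')  # a mid-grey "image"
out = vgg_preprocess_sketch(dummy)
print(out.shape)
print(out[0, 0, 0])  # the three channel values of the first pixel, now in BGR order
```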

### Classify the test image

The following code classifies the test image and decodes the results into a list of tuples (class ID, description, probability). There is one such list for each sample in the batch, but since we're sending in only one test image, we get only one list back.

In [7]:
preds = model.predict(x)
print('Predicted:', decode_predictions(preds, top=3)[0])

('Predicted:', [(u'n02504458', u'African_elephant', 0.84805286), (u'n01871265', u'tusker', 0.10270286), (u'n02504013', u'Indian_elephant', 0.049056333)])
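Conceptually, the decoding step ranks the 1000 class probabilities and keeps the highest few. The sketch below shows that ranking on a hypothetical five-class example; the function name, labels, and probabilities are made up for illustration, and the real `decode_predictions` additionally looks up the ImageNet class IDs and names for you.

```python
def top_k_sketch(probabilities, labels, k=3):
    """Return the k (label, probability) pairs with the highest probability."""
    ranked = sorted(zip(labels, probabilities), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical 5-class example; the real model outputs 1000 probabilities
labels = ['tabby', 'African_elephant', 'tusker', 'Indian_elephant', 'zebra']
probs = [0.01, 0.80, 0.10, 0.06, 0.03]

top3 = top_k_sketch(probs, labels)
print(top3)
```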


## Extension 1: The pre-trained model can be fine-tuned for your own classes

This is classic *transfer learning*. We will not go into it today, but a good worked-out example can be found in [the Keras documentation](https://keras.io/applications/).

## Extension 2: Extracting features from a specific layer

We can *extract features* from a specific layer (using the names above).
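Before extracting features, it helps to work out what shape to expect. Each VGG16 block ends in a 2x2 max-pooling layer that halves the spatial size, so the output of `block4_pool` for a 224x224 input can be predicted with a few lines of arithmetic. The helper and dictionary names below are ours, introduced only for this sketch.

```python
def pooled_size(size, num_pools):
    """Spatial size after `num_pools` 2x2 max-pooling layers with stride 2."""
    for _ in range(num_pools):
        size //= 2
    return size


# Output channels of the convolutions in each VGG16 block
block_channels = {1: 64, 2: 128, 3: 256, 4: 512, 5: 512}

block = 4
side = pooled_size(224, block)  # 224 -> 112 -> 56 -> 28 -> 14
expected_shape = (1, side, side, block_channels[block])
print(expected_shape)  # (1, 14, 14, 512)
```

This is consistent with the layer listing earlier, where `block3_pool` has a spatial size of 28x28 and the `block4` convolutions have 512 channels.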

In [8]:
from keras.models import Model
base_model = VGG16(weights='imagenet')
# Build a model that stops at block4_pool (Keras 2 names these arguments `inputs` and `outputs`)
model = Model(inputs=base_model.input, outputs=base_model.get_layer('block4_pool').output)

block4_pool_features = model.predict(x)
print(block4_pool_features)
print(block4_pool_features.shape)

[[[[   0.            0.            0.         ...,    0.           82.01461029
       0.        ]
   [   0.            0.            0.         ...,    0.           85.68616486
       0.        ]
   [   0.            0.            0.         ...,    0.            0.
       0.        ]
   ..., 
   [   0.            0.            0.         ...,    0.          241.86521912
       0.        ]
   [   0.            0.            0.         ...,    0.          324.13943481
       0.        ]
   [   0.            0.            0.         ...,    0.          352.39898682
       0.        ]]

  [[   0.            0.            0.         ...,    0.          223.67085266
       0.        ]
   [   0.            0.            0.         ...,    0.          235.8238678
       0.        ]
   [   0.            0.            0.         ...,    0.           58.34262848
       0.        ]
   ..., 
   [   0.          177.63851929  143.94613647 ...,    0.          210.84683228
      33.22182846]
   [   0.