# How to extract bottleneck features

Modern CNNs can take weeks to train on ImageNet, even on multiple GPUs, but fortunately many researchers share their final weights. Keras, for example, includes pre-trained models for several of the reference architectures discussed above, namely VGG16 and VGG19, ResNet50, InceptionV3, InceptionResNetV2, MobileNet, MobileNetV2, DenseNet, and NASNet.

This notebook illustrates how to download a pre-trained VGG19 model, either with the final layers to generate predictions, or without the final layers, as illustrated in the figure below, to extract the bottleneck features.

## Imports

In [1]:
from keras.applications.vgg19 import VGG19, preprocess_input
from keras.applications.vgg16 import VGG16
from keras.applications.inception_v3 import InceptionV3
from keras.applications.resnet50 import ResNet50
from keras.preprocessing import image
import keras.backend as K
import numpy as np
from pathlib import Path

Using TensorFlow backend.


## Load and Preprocess Sample Images

Before supplying an image to a pre-trained network in Keras, there are some required preprocessing steps.

We have imported a very small dataset of 8 images and stored the  preprocessed image input as `img_input`.  Note that the dimensionality of this array is `(8, 224, 224, 3)`.  In this case, each of the 8 images is a 3D tensor, with shape `(224, 224, 3)`.

In [2]:
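# materialize the paths as a sorted list so the cell can be re-run and the image order is stable
img_paths = sorted(Path('images').glob('*.jpg'))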

In [3]:
def path_to_tensor(img_path):
    # loads RGB image as PIL.Image.Image type
    img = image.load_img(img_path, target_size=(224, 224))
    # convert PIL.Image.Image type to 3D tensor with shape (224, 224, 3)
    x = image.img_to_array(img)
    # convert 3D tensor to 4D tensor with shape (1, 224, 224, 3) and return 4D tensor
    return np.expand_dims(x, axis=0)

In [4]:
def paths_to_tensor(img_paths):
    list_of_tensors = [path_to_tensor(img_path) for img_path in img_paths]
    return np.vstack(list_of_tensors)

In [None]:
# calculate the image input
img_input = preprocess_input(paths_to_tensor(img_paths))

img_input.shape
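
To make the preprocessing step less of a black box, here is a minimal NumPy sketch of what `preprocess_input` is understood to do for the VGG models: reorder the channels from RGB to BGR and subtract the per-channel ImageNet means, without any scaling. The function name `vgg_preprocess_sketch` and the random stand-in batch are hypothetical and only serve as illustration; in practice, use the Keras function itself.

```python
import numpy as np

# ImageNet per-channel means in BGR order, as used by the 'caffe'-style
# preprocessing that Keras applies for VGG16/VGG19.
IMAGENET_BGR_MEANS = np.array([103.939, 116.779, 123.68], dtype=np.float32)


def vgg_preprocess_sketch(batch_rgb):
    """Illustrative re-implementation; expects shape (n, h, w, 3) with values in 0-255."""
    x = batch_rgb.astype(np.float32)
    x = x[..., ::-1]               # reorder channels RGB -> BGR
    return x - IMAGENET_BGR_MEANS  # zero-center each channel, no scaling


# Stand-in for real images: random pixel values (not actual data)
fake_batch = np.random.RandomState(0).randint(0, 256, size=(8, 224, 224, 3))
processed = vgg_preprocess_sketch(fake_batch)
print(processed.shape, processed.dtype)
```

The shape is unchanged by this step; only the channel order and the value range differ from the raw pixel array.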

## Import Pre-Trained VGG-19

Import the VGG-19 network (including the final classification layers) that has been pre-trained on ImageNet.

![VGG-19 model](images/vgg19.png)

Keras makes it very straightforward to download and use pre-trained models:

In [None]:
vgg19 = VGG19()
vgg19.summary()

For this network, `vgg19.predict` returns a 1000-dimensional probability vector containing the predicted probability that an image belongs to each of the 1000 ImageNet categories. The dimensionality of the output obtained from passing `img_input` through the model is `(8, 1000)`. The first value of `8` merely denotes that 8 images were passed through the network.

In [None]:
y_pred = vgg19.predict(img_input)
y_pred.shape

In [None]:
np.argmax(y_pred, axis=1)
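
`np.argmax` only yields the single most likely class index per image. As a small sketch of how to look at the top few candidates instead, the helper below (a hypothetical name, `top_k_classes`) ranks the columns of a probability matrix. It runs on a random stand-in for `y_pred`, not on real predictions. To map indices to human-readable labels, Keras also offers `decode_predictions` in `keras.applications.vgg19`.

```python
import numpy as np


def top_k_classes(probs, k=3):
    """Return (indices, probabilities) of the k most likely classes per row."""
    idx = np.argsort(probs, axis=1)[:, ::-1][:, :k]  # sort descending, keep k
    return idx, np.take_along_axis(probs, idx, axis=1)


# Stand-in for y_pred: random scores turned into probabilities with a softmax
rng = np.random.RandomState(42)
logits = rng.randn(8, 1000)
fake_pred = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

top_idx, top_prob = top_k_classes(fake_pred, k=3)
print(top_idx.shape, top_prob.shape)
```

The first column of `top_idx` matches what `np.argmax(..., axis=1)` would return.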

## Import the VGG-19 Model, with the Final Fully-Connected Layers Removed

When performing transfer learning, we need to remove the final layers of the network, as they are too specific to the ImageNet database.  This is accomplished in the code cell below.

![VGG-19 model for transfer learning](images/vgg19_transfer.png)

You can use this model like any other Keras model for predictions. To exclude the fully-connected layers, just add the keyword `include_top=False` to obtain the output of the final convolutional layer when passing an image to the CNN.

In [None]:
vgg19 = VGG19(include_top=False)
vgg19.summary()

By omitting the fully-connected layers, we are no longer forced to use a fixed input size for the model (224x224, the original ImageNet format). By only keeping the convolutional modules, our model can be adapted to arbitrary input sizes.
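
To see how the output size follows from the input size, the sketch below assumes the standard VGG layout: convolutions that preserve the spatial size and five 2x2 max-pooling layers with stride 2, each of which halves height and width (rounding down). The function name `vgg_bottleneck_shape` is hypothetical and not part of Keras.

```python
def vgg_bottleneck_shape(height, width, n_pool=5, channels=512):
    """Spatial shape after the convolutional base, assuming n_pool halvings."""
    for _ in range(n_pool):
        height, width = height // 2, width // 2  # 2x2 pooling, stride 2
    return height, width, channels


for size in [(224, 224), (299, 299), (150, 200)]:
    print(size, '->', vgg_bottleneck_shape(*size))
```

For the default 224x224 input this gives `(7, 7, 512)`; other input sizes simply produce a different spatial grid with the same 512 channels.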

### Extract Output of Final Max Pooling Layer

Now, the network stored in `vgg19` is a truncated version of the VGG-19 network, where the final three fully-connected layers have been removed. In this case, `vgg19.predict` returns, for each image, a 3D array (with dimensions $7\times 7\times 512$) corresponding to the final max pooling layer of VGG-19. The dimensionality of the output obtained from passing `img_input` through the model is `(8, 7, 7, 512)`. The first value of `8` merely denotes that 8 images were passed through the network.

In [None]:
vgg19.predict(img_input).shape

This is exactly how we calculate the bottleneck features for your project!
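
Before bottleneck features are fed to a new classifier, they are commonly reduced to one vector per image. The sketch below shows two usual options, flattening and global average pooling, on a random stand-in array with the `(8, 7, 7, 512)` shape described above. The variable names are hypothetical, and which option works better depends on the downstream task.

```python
import numpy as np

# Stand-in for the extracted bottleneck features: random values, same shape
fake_features = np.random.RandomState(0).rand(8, 7, 7, 512).astype(np.float32)

# Option 1: flatten each 7x7x512 block into one long vector per image
flat = fake_features.reshape(len(fake_features), -1)

# Option 2: global average pooling, averaging over the 7x7 spatial grid
pooled = fake_features.mean(axis=(1, 2))

print(flat.shape, pooled.shape)
```

Flattening keeps the spatial layout at the cost of a 25,088-dimensional vector per image, while pooling yields a compact 512-dimensional vector.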

## Import ResNet50

### With final layer

In [None]:
model = ResNet50()
model.summary()

### Without final layer

In [None]:
model = ResNet50(include_top=False)
model.summary()

## Import Inception V3

### With final layer

In [None]:
model = InceptionV3()
model.summary()

### Without final layer

In [None]:
model = InceptionV3(include_top=False)
model.summary()