# Bottleneck Features

Bottleneck features are the representations of images obtained by passing them through a network with its classification layer removed (typically a fully connected, softmax-activated layer with as many nodes as there are classes in the dataset).

Precalculating bottleneck features with a pre-trained model lets us train other models on the resulting tensors, such as traditional multilayer perceptrons, SVCs, or even decision trees.

In this notebook we'll calculate a couple of bottleneck features using VGG16.

## Load and Preprocess a Sample of Images

Calculating bottleneck features can be quite expensive, both in terms of computing power and time. 

For this reason, we'll only take a subset of the dog images we used in the previous notebook.

In [1]:
from keras.applications.vgg16 import preprocess_input
from keras.preprocessing import image
import numpy as np
from glob import glob
import random

SAMPLE_SIZE = 50

image_paths = glob('dog_images/*/*/*')
image_paths = random.sample(image_paths, SAMPLE_SIZE)

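def path_to_tensor(image_path):
    # Load the image resized to 224x224, matching the target size used for VGG16 here
    img = image.load_img(image_path, target_size=(224, 224))
    # Convert the PIL image to a (224, 224, 3) array
    x = image.img_to_array(img)

    # Add a leading batch dimension: (1, 224, 224, 3)
    return np.expand_dims(x, axis=0)

def paths_to_tensor(image_paths):
    # Stack the per-image tensors into a single (N, 224, 224, 3) batch
    tensors = [path_to_tensor(image_path) for image_path in image_paths]
    return np.vstack(tensors)

# Build the batch and apply the VGG16 preprocessing
image_input = preprocess_input(paths_to_tensor(image_paths))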

print(image_input.shape)

Using TensorFlow backend.


(50, 224, 224, 3)


## Importing VGG16

Now we can import VGG16 without its fully connected layers:

In [2]:
from keras.applications.vgg16 import VGG16
model = VGG16(include_top=False)
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         (None, None, None, 3)     0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, None, None, 64)    1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, None, None, 64)    36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, None, None, 64)    0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, None, None, 128)   73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, None, None, 128)   147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, None, None, 128)   0         
__________

## Create Bottlenecks

The final layer of this network is a max pooling layer, so the output for each image is a 3D tensor (height, width, channels).
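To get a feel for the shape to expect, here is a rough sketch of the arithmetic. It assumes the standard VGG16 layout: five 2x2 max-pooling layers with stride 2, convolutions that preserve spatial size, and 512 filters in the last block. The helper name `vgg16_bottleneck_shape` is hypothetical and not part of Keras.

```python
def vgg16_bottleneck_shape(height, width, num_pools=5, channels=512):
    """Estimate the per-image output shape of VGG16 without its top layers.

    Assumes each of the `num_pools` max-pooling layers halves the spatial
    dimensions (rounding down) and that the last block has `channels` filters.
    """
    for _ in range(num_pools):
        height //= 2
        width //= 2
    return (height, width, channels)


# 224 -> 112 -> 56 -> 28 -> 14 -> 7
print(vgg16_bottleneck_shape(224, 224))  # (7, 7, 512)
```

With a batch of 50 images, the full output would then carry a leading batch dimension of 50.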

Let's create the bottlenecks:

In [3]:
bottlenecked_batch = model.predict(image_input)
print(f'Bottlenecked batch shape: {bottlenecked_batch.shape}')

Bottlenecked batch shape: (50, 7, 7, 512)
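With the bottlenecks computed, they can feed a traditional model such as an SVC, as mentioned at the start. The following is only a minimal sketch and assumes scikit-learn is installed. Random values stand in for `bottlenecked_batch`, and the two-class `labels` array is made up purely for illustration, since this notebook does not load labels for the sample.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Stand-in for bottlenecked_batch, with the same shape as above (assumption:
# in the notebook you would use the real model.predict output instead)
features = rng.normal(size=(50, 7, 7, 512)).astype(np.float32)

# Hypothetical binary labels, one per image, for illustration only
labels = np.arange(50) % 2

# Classic estimators expect 2D input, so flatten each 7x7x512 tensor
flat_features = features.reshape(len(features), -1)
print(flat_features.shape)  # (50, 25088)

# Fit a linear SVC on the flattened bottleneck features
clf = SVC(kernel="linear")
clf.fit(flat_features, labels)

# Predict on a few samples to confirm the pipeline runs end to end
predictions = clf.predict(flat_features[:5])
print(predictions.shape)  # (5,)
```

Flattening discards the spatial layout of the feature maps. That is acceptable for a quick baseline, though averaging over the two spatial axes is a common alternative that yields a much smaller 512-dimensional vector per image.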
