# Use pre-trained network to compute bottlenecks

We will use a CNN pre-trained on the ImageNet challenge to compute the bottlenecks of our images.

A bottleneck is simply the output of the last convolutional layer of a convolutional neural network.

By precomputing these outputs, we can try different models on top of the bottlenecks at very small computational cost.
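
As a rough illustration of what "a model on top of the bottlenecks" can look like, the sketch below averages each feature map over its spatial grid and fits a logistic regression in plain NumPy. The random array only stands in for real bottlenecks, and the helper names (`pool_bottlenecks`, `fit_logistic`) and the small shapes are assumptions made for this example, not part of the notebook.

```python
import numpy as np

def pool_bottlenecks(bottlenecks):
    """Average each feature map over its spatial grid: (N, H, W, C) -> (N, C)."""
    return bottlenecks.mean(axis=(1, 2))

def fit_logistic(x, y, lr=0.2, steps=500):
    """Fit a binary logistic regression with plain gradient descent."""
    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(x @ w + b)))  # predicted probability of class 1
        grad = p - y                            # gradient of the log loss w.r.t. the logit
        w -= lr * (x.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

# Synthetic stand-in for precomputed bottlenecks (hypothetical data;
# the real arrays in this notebook have shape (N, 10, 10, 2048)).
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=200)
fake_bottlenecks = rng.normal(size=(200, 4, 4, 16))
fake_bottlenecks[labels == 1] += 1.0  # shift one class so the two are separable

features = pool_bottlenecks(fake_bottlenecks)
w, b = fit_logistic(features, labels)
accuracy = ((features @ w + b > 0) == (labels == 1)).mean()
print(features.shape, accuracy)
```

Because the expensive convolutional pass is already done, each such experiment only touches the small pooled feature matrix, which is why swapping in a different classifier is cheap.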

## Create resnet50 body

In [1]:
import sys

In [2]:
sys.path.append("D:\\GitHub\\models")

See the resnet50.py module for the model definition.

In [3]:
from resnet50 import ResNet50

Using TensorFlow backend.


Download the weights from [this link](https://github.com/fchollet/deep-learning-models/releases/download/v0.2/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5)

You can increase input_shape according to your GPU capacity; a larger input will probably lead to better accuracy.

I have a 4 GB GPU.

In [4]:
%%time
model = ResNet50(input_shape=(300,300,3), weights_path="D:/GitHub/models/resnet50_body.h5")

Wall time: 17.5 s


## Create Train / Val / Test batch generators

In [5]:
model.input_shape[1:3], model.output_shape

((300, 300), (None, 10, 10, 2048))
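
The 10 x 10 spatial grid follows from how the network downsamples its input. The sketch below reproduces that arithmetic under an assumption: that the resnet50.py module used here follows the usual Keras ResNet50 layout (a padded 7x7 stride-2 convolution, a 3x3 stride-2 max pool, then three stride-2 stages). The function name `bottleneck_side` is hypothetical.

```python
def bottleneck_side(input_side):
    """Spatial side of the last conv feature map for a square input.

    Assumes the usual Keras ResNet50 layout: a 7x7 stride-2 conv with 3 pixels
    of zero padding, a 3x3 stride-2 max pool, then three stride-2 stages.
    """
    side = (input_side + 2 * 3 - 7) // 2 + 1   # conv1
    side = (side - 3) // 2 + 1                 # max pool
    for _ in range(3):                         # stages 3, 4 and 5
        side = (side - 1) // 2 + 1
    return side

for side in (224, 300):
    print(side, "->", bottleneck_side(side))
```

Under that assumption a 224-pixel input gives a 7 x 7 grid and a 300-pixel input gives the 10 x 10 grid shown above, so the bottleneck size grows roughly with the square of the input side.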

In [6]:
from keras.preprocessing.image import ImageDataGenerator

In [7]:
gen = ImageDataGenerator()

In [8]:
train_batches = gen.flow_from_directory("train", model.input_shape[1:3], shuffle=False)
valid_batches = gen.flow_from_directory("valid", model.input_shape[1:3], shuffle=False)
test_batches = gen.flow_from_directory("test", model.input_shape[1:3], shuffle=False, class_mode=None)

Found 23000 images belonging to 2 classes.
Found 2000 images belonging to 2 classes.
Found 12500 images belonging to 1 classes.
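
Before generating the bottlenecks it is worth estimating how large they will be. The sketch below multiplies the image counts reported above by the (10, 10, 2048) output shape, assuming the values are stored as 4-byte float32 (the Keras default float type); `bottleneck_bytes` is a hypothetical helper.

```python
def bottleneck_bytes(n_images, feature_shape=(10, 10, 2048), bytes_per_value=4):
    """Size in bytes of an (n_images, *feature_shape) array of float32 values."""
    values_per_image = 1
    for dim in feature_shape:
        values_per_image *= dim
    return n_images * values_per_image * bytes_per_value

for name, n_images in [("train", 23000), ("valid", 2000), ("test", 12500)]:
    print(f"{name}: {bottleneck_bytes(n_images) / 2**30:.1f} GiB")
```

With these numbers the train bottlenecks alone come to roughly 17.5 GiB, so check that the machine has enough RAM and disk space, or use a smaller input_shape.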


## Generate the bottlenecks

In [9]:
%%time
train_bottleneck = model.predict_generator(train_batches, train_batches.nb_sample)

KeyboardInterrupt: 

In [10]:
%%time
valid_bottleneck = model.predict_generator(valid_batches, valid_batches.nb_sample)

Wall time: 1min 17s


In [10]:
%%time
test_bottleneck = model.predict_generator(test_batches, test_batches.nb_sample)

Wall time: 7min 35s


## Save bottlenecks

In [11]:
import h5py

In [12]:
with h5py.File("300_bottlenecks.h5", "w") as hf:
    hf.create_dataset("train", data=train_bottleneck)
    hf.create_dataset("valid", data=valid_bottleneck)
    hf.create_dataset("test", data=test_bottleneck)

## Save labels

In [14]:
from keras.utils.np_utils import to_categorical

In [16]:
with h5py.File("labels.h5", "w") as hf:
    hf.create_dataset("train", data=to_categorical(train_batches.classes))
    hf.create_dataset("valid", data=to_categorical(valid_batches.classes))
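
Only train and valid labels are saved, since the test generator was created with class_mode=None. Because all generators use shuffle=False, row i of each label matrix corresponds to row i of the matching bottleneck array. For readers unfamiliar with to_categorical, the sketch below is an illustrative NumPy equivalent of the one-hot encoding it produces; `one_hot` is a hypothetical name, not a Keras function.

```python
import numpy as np

def one_hot(class_indices, num_classes=None):
    """Turn integer class indices into a one-hot float matrix."""
    class_indices = np.asarray(class_indices, dtype=int)
    if num_classes is None:
        num_classes = class_indices.max() + 1
    # Row k of the identity matrix is the one-hot vector for class k.
    return np.eye(num_classes, dtype="float32")[class_indices]

print(one_hot([0, 0, 1, 1]))
```

With two classes, each label becomes a row of length 2 with a single 1 in the column of its class.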