# Use pre-trained network to compute bottlenecks

We will use a CNN pre-trained on the ImageNet challenge to compute the bottlenecks of our images.

A bottleneck is simply the name given to the output of the last convolutional layer of a convolutional neural network.
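To make the shapes concrete: the module name used later (`resnet50_GAP`) suggests that the convolutional output is reduced with global average pooling (GAP), although that is an assumption about the module rather than something shown here. Under that assumption, a feature map of shape (batch, height, width, channels) collapses to one vector per image. The NumPy sketch below uses random data and made-up sizes, not real ResNet50 activations.

```python
import numpy as np

def global_average_pool(feature_maps):
    """Average each channel over the spatial axes: (N, H, W, C) -> (N, C)."""
    return feature_maps.mean(axis=(1, 2))

# Stand-in for the last convolutional output of a CNN (hypothetical sizes).
rng = np.random.default_rng(0)
feature_maps = rng.random((4, 10, 10, 2048))

bottlenecks = global_average_pool(feature_maps)
print(bottlenecks.shape)  # (4, 2048)
```

Each row of `bottlenecks` is a fixed-length description of one image, which is what gets cached to disk later in this notebook.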

By precomputing these outputs, we can try different models on top of the bottlenecks at very small computational cost.
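As a rough illustration of why this is cheap, the sketch below fits a linear softmax classifier on fixed feature vectors using only NumPy. The data is synthetic (two well-separated clusters standing in for cached bottlenecks), and `train_softmax` is a hypothetical helper, not part of this notebook or of Keras.

```python
import numpy as np

def train_softmax(features, labels, num_classes, lr=0.5, epochs=200):
    """Fit a linear softmax classifier on fixed features with gradient descent."""
    n, d = features.shape
    weights = np.zeros((d, num_classes))
    bias = np.zeros(num_classes)
    one_hot = np.eye(num_classes)[labels]
    for _ in range(epochs):
        logits = features @ weights + bias
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - one_hot) / n  # gradient of mean cross-entropy w.r.t. logits
        weights -= lr * (features.T @ grad)
        bias -= lr * grad.sum(axis=0)
    return weights, bias

# Synthetic stand-in for cached bottlenecks: one cluster per class.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 50)
centers = np.where(labels == 1, 3.0, -3.0)
features = rng.normal(size=(100, 16)) + centers[:, None]

weights, bias = train_softmax(features, labels, num_classes=2)
predictions = (features @ weights + bias).argmax(axis=1)
accuracy = (predictions == labels).mean()
print(accuracy)
```

No image ever passes through the convolutional network during this training loop, so each experiment only costs a few small matrix products.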

## Create resnet50 body

See the `resnet50_GAP.py` module for the model definition.

In [None]:
from resnet50_GAP import ResNet50_GAP

Download the weights from [this link](https://github.com/fchollet/deep-learning-models/releases/download/v0.2/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5)

You can increase `input_shape` as far as your GPU memory allows; a larger input will probably lead to better accuracy.

I used a GPU with 4 GB of memory.

In [None]:
# edit this to the path where you saved the downloaded weights
weights_path = "D:/GitHub/models/resnet50_body.h5"

In [None]:
%%time
model = ResNet50_GAP(input_shape=(300, 300, 3), weights_path=weights_path)

In [None]:
model.summary()

## Set up Train / Val / Test generators

In [None]:
model.input_shape[1:3], model.output_shape

In [None]:
from keras.preprocessing.image import ImageDataGenerator

In [None]:
gen = ImageDataGenerator()

In [None]:
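# NOTE: all three paths point to the same directory here.
# Edit VALID_PATH and TEST_PATH so they point to your own validation and test folders.
TRAIN_PATH = "D:/GitHub/Kaggle/redux/train"
VALID_PATH = "D:/GitHub/Kaggle/redux/train"
TEST_PATH = "D:/GitHub/Kaggle/redux/train"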

Adapt the batch size according to your GPU capacity.

In [None]:
train_batches = gen.flow_from_directory(TRAIN_PATH, 
                                        model.input_shape[1:3], shuffle=False, batch_size=8)
valid_batches = gen.flow_from_directory(VALID_PATH, 
                                        model.input_shape[1:3], shuffle=False, batch_size=8)
test_batches = gen.flow_from_directory(TEST_PATH, 
                                       model.input_shape[1:3], shuffle=False, batch_size=8, class_mode=None)

## Generate the bottlenecks

In [None]:
%%time
# Round the step count up so the final partial batch is not dropped
train_steps = -(-train_batches.samples // train_batches.batch_size)
train_bottleneck = model.predict_generator(train_batches, train_steps)

In [None]:
%%time
# Round the step count up so the final partial batch is not dropped
valid_steps = -(-valid_batches.samples // valid_batches.batch_size)
valid_bottleneck = model.predict_generator(valid_batches, valid_steps)

In [None]:
%%time
# Round the step count up so the final partial batch is not dropped
test_steps = -(-test_batches.samples // test_batches.batch_size)
test_bottleneck = model.predict_generator(test_batches, test_steps)
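The step count passed to `predict_generator` decides how many images are actually seen. Because the generators use `shuffle=False`, row i of a bottleneck array lines up with entry i of the generator's `classes`, but only if every sample is visited: a step count obtained by floor division skips the final partial batch whenever the sample count is not a multiple of the batch size. The helper below (`steps_for` is a hypothetical name, and the sample counts are made up) shows the arithmetic.

```python
import math

def steps_for(num_samples, batch_size):
    """Number of batches needed to visit every sample once (last batch may be partial)."""
    return math.ceil(num_samples / batch_size)

# Hypothetical sample counts with a batch size of 8.
print(steps_for(25000, 8))  # 3125: divides evenly
print(steps_for(25003, 8))  # 3126: floor division would give 3125 and skip 3 images
```

A quick sanity check after prediction is to compare `train_bottleneck.shape[0]` with `train_batches.samples`; the two should match before the labels are saved alongside the bottlenecks.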

## Save bottlenecks

In [None]:
import h5py

In [None]:
with h5py.File("300_300_bottlenecks.h5", "w") as hf:
    hf.create_dataset("train", data=train_bottleneck)
    hf.create_dataset("valid", data=valid_bottleneck)
    hf.create_dataset("test", data=test_bottleneck)
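For completeness, here is a minimal round-trip sketch showing how a saved dataset can be read back into memory. It writes a small random array to a temporary file rather than the real bottlenecks; the file name `demo_bottlenecks.h5` and the array sizes are placeholders chosen for illustration.

```python
import os
import tempfile

import h5py
import numpy as np

# Stand-in array; in the notebook this would be train_bottleneck.
fake_bottleneck = np.random.rand(16, 2048).astype("float32")

path = os.path.join(tempfile.mkdtemp(), "demo_bottlenecks.h5")

# "w" creates the file (or truncates an existing one).
with h5py.File(path, "w") as hf:
    hf.create_dataset("train", data=fake_bottleneck)

# "r" opens the file read-only; [...] copies the whole dataset into a NumPy array.
with h5py.File(path, "r") as hf:
    restored = hf["train"][...]

print(restored.shape, np.array_equal(restored, fake_bottleneck))
```

Passing the mode explicitly avoids depending on the default of `h5py.File`, which has changed between h5py releases.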

## Save labels

In [None]:
from keras.utils import to_categorical

In [None]:
with h5py.File("labels.h5", "w") as hf:
    hf.create_dataset("train", data=to_categorical(train_batches.classes))
    hf.create_dataset("valid", data=to_categorical(valid_batches.classes))
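The test generator was created with `class_mode=None`, so it has no labels to save. For the other two splits, `to_categorical` turns the integer class indices into one-hot rows. The NumPy sketch below reproduces that encoding on a few made-up indices; `one_hot` is a hypothetical helper written for illustration, not the Keras implementation.

```python
import numpy as np

def one_hot(class_indices, num_classes):
    """Turn integer class indices into one-hot rows, e.g. 1 -> [0., 1.] for two classes."""
    return np.eye(num_classes, dtype="float32")[class_indices]

# Hypothetical class indices, as a directory generator might assign to two folders.
classes = np.array([0, 0, 1, 1, 0])
labels = one_hot(classes, num_classes=2)
print(labels)
# [[1. 0.]
#  [1. 0.]
#  [0. 1.]
#  [0. 1.]
#  [1. 0.]]
```

The integer indices can be recovered from the saved labels with `labels.argmax(axis=1)`.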