### Training the bottleneck features of VGG16 
##### Keras with TensorFlow backend (channels_last configuration)
Based on:
- https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html 
- https://gist.github.com/fchollet/f35fbc80e066a49d65f1688a7e99f069 

We first train the `top_model`, a small classifier that takes as input the outputs of the VGG16 convolutional base (the bottleneck features). Once the `top_model` is trained, we combine it with VGG16 into a single model and retrain that model with the VGG16 layers frozen.

### 0. Download the dataset and prepare the data
A dataset of cats and dogs from an old Kaggle competition:
- https://www.kaggle.com/c/dogs-vs-cats/data

Copy 2000 training and 800 validation images into the following folder structure:
- data
  - train
    - cats
      - 1000 cats
    - dogs
      - 1000 dogs
  - validation
    - cats
      - 400 cats
    - dogs
      - 400 dogs
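
The folder structure above could be built with a short script. This is a minimal sketch, not the procedure used for this notebook: it assumes the Kaggle images are named `cat.<n>.jpg` and `dog.<n>.jpg` in a single source folder, and the helper `build_subset` and the choice of which images go to which split are hypothetical.

```python
import shutil
from pathlib import Path


def build_subset(src_dir, dst_dir, n_train=1000, n_val=400):
    """Copy n_train + n_val images per class into dst_dir/<split>/<class>s."""
    src, dst = Path(src_dir), Path(dst_dir)
    for animal in ("cat", "dog"):
        # Sort by the numeric index in names such as "cat.12.jpg".
        files = sorted(src.glob(f"{animal}.*.jpg"),
                       key=lambda p: int(p.name.split(".")[1]))
        splits = {
            "train": files[:n_train],
            "validation": files[n_train:n_train + n_val],
        }
        for split, chosen in splits.items():
            target = dst / split / f"{animal}s"
            target.mkdir(parents=True, exist_ok=True)
            for f in chosen:
                shutil.copy(f, target / f.name)
```

For example, `build_subset("kaggle/train", "data")` would fill `data/train/cats`, `data/train/dogs`, `data/validation/cats` and `data/validation/dogs`, where `kaggle/train` is a placeholder for wherever the archive was extracted.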

In [1]:
import numpy as np
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dropout, Flatten, Dense
from keras import applications
import time

Using TensorFlow backend.


In [2]:
# dimensions of our images.
img_width, img_height = 150, 150

top_model_weights_path = 'bottleneck_fc_model.h5'
train_data_dir = 'data/train'
validation_data_dir = 'data/validation'
nb_train_samples = 2000
nb_validation_samples = 800
epochs = 50
batch_size = 16

### 1. Calculate the inputs for the `top_model` (_bottleneck features_)
The inputs for the `top_model`, the _bottleneck features_, are the outputs of the VGG16 convolutional base (the network without its fully connected top layers). So we first run VGG16 over the training and validation sets and save its predictions.
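
To get a feel for what is saved, the shape of one bottleneck feature map can be worked out by hand. The sketch below is plain arithmetic, not a Keras call, and the helper name `vgg16_bottleneck_shape` is made up for illustration. It relies on two facts about VGG16: its 3x3 convolutions use `same` padding, and each of its five 2x2 max-pooling layers halves the spatial size, rounding down.

```python
def vgg16_bottleneck_shape(height, width):
    """Shape (h, w, channels) of the VGG16 output when include_top=False."""
    for _ in range(5):  # five pooling layers: block1_pool ... block5_pool
        height, width = height // 2, width // 2
    return height, width, 512  # the last conv block has 512 filters


print(vgg16_bottleneck_shape(150, 150))  # (4, 4, 512)
```

With 150x150 images each sample becomes a 4x4x512 array, so the saved training features should have shape `(2000, 4, 4, 512)`.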

In [3]:
datagen = ImageDataGenerator(rescale=1. / 255)
# build the VGG16 network
model = applications.VGG16(include_top=False, weights='imagenet')

generator = datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode=None,
    shuffle=False)

start = time.time()
bottleneck_features_train = model.predict_generator(
    generator, nb_train_samples // batch_size, verbose=1)
np.save(open('bottleneck_features_train.npy', 'wb'),
        bottleneck_features_train)
print("ellapsed time in seconds:", (time.time()-start))


generator = datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode=None,
    shuffle=False)

start = time.time()
bottleneck_features_validation = model.predict_generator(
    generator, nb_validation_samples // batch_size, verbose=1)
np.save(open('bottleneck_features_validation.npy', 'wb'),
        bottleneck_features_validation)
print("ellapsed time in seconds:", (time.time()-start))

Found 2000 images belonging to 2 classes.
ellapsed time in seconds: 351.3123679161072
Found 800 images belonging to 2 classes.
ellapsed time in seconds: 140.320405960083


### 2. Build and train the `top_model`
The inputs are the bottleneck features computed in the previous step (the outputs of the VGG16 convolutional base).

In [4]:
# load the training bottleneck features; the generator was not shuffled, so the
# first half of the samples are cats (label 0) and the second half dogs (label 1)
train_data = np.load(open('bottleneck_features_train.npy', 'rb'))
train_labels = np.array([0] * (nb_train_samples // 2) + [1] * (nb_train_samples // 2))

# the validation features follow the same layout: cats first, then dogs
validation_data = np.load(open('bottleneck_features_validation.npy', 'rb'))
validation_labels = np.array([0] * (nb_validation_samples // 2) + [1] * (nb_validation_samples // 2))

model = Sequential()
model.add(Flatten(input_shape=train_data.shape[1:]))
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))

model.compile(optimizer='rmsprop',
              loss='binary_crossentropy', metrics=['accuracy'])

start = time.time()
history = model.fit(train_data, train_labels,
          epochs=epochs,
          batch_size=batch_size,
          validation_data=(validation_data, validation_labels))
model.save_weights(top_model_weights_path)
print("ellapsed time in seconds:", (time.time()-start))

Train on 2000 samples, validate on 800 samples
Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50
Epoch 15/50
Epoch 16/50
Epoch 17/50
Epoch 18/50
Epoch 19/50
Epoch 20/50
Epoch 21/50
Epoch 22/50
Epoch 23/50
Epoch 24/50
Epoch 25/50
Epoch 26/50
Epoch 27/50
Epoch 28/50
Epoch 29/50
Epoch 30/50
Epoch 31/50
Epoch 32/50
Epoch 33/50
Epoch 34/50
Epoch 35/50
Epoch 36/50
Epoch 37/50
Epoch 38/50
Epoch 39/50
Epoch 40/50
Epoch 41/50
Epoch 42/50
Epoch 43/50
Epoch 44/50
Epoch 45/50
Epoch 46/50
Epoch 47/50
Epoch 48/50
Epoch 49/50
Epoch 50/50
ellapsed time in seconds: 140.85941338539124
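
The hand-built labels only line up with the saved features under two conditions: the generator must not shuffle (so the class folders are read in alphabetical order, cats before dogs), and the number of predicted samples must equal the sample count. A small sketch of those checks, using hypothetical helper names and NumPy only:

```python
import numpy as np


def samples_seen(n_samples, batch_size):
    """Samples actually predicted when running n_samples // batch_size steps."""
    return (n_samples // batch_size) * batch_size


def make_binary_labels(n_samples):
    """First half 0 (cats), second half 1 (dogs), matching the unshuffled order."""
    half = n_samples // 2
    return np.array([0] * half + [1] * (n_samples - half))


# With the values used in this notebook nothing is dropped:
print(samples_seen(2000, 16), samples_seen(800, 16))  # 2000 800
```

If the sample count were not a multiple of `batch_size`, the last partial batch would be skipped and the feature array would be shorter than the label array.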


### 3. Build the `model` as a combination of the VGG16 `base_model` and the `top_model`
Remember to freeze the layers of VGG16 (set them as non-trainable).

In [8]:
from keras import backend as K
from keras.models import Model 

if K.image_data_format() == 'channels_first':
    input_shape = (3, img_width, img_height)
else:
    input_shape = (img_width, img_height, 3)
    
# build the VGG16 network
# with include_top=False the 3 fully connected top layers, which fix the input size to 224x224, are not loaded
base_model = applications.VGG16(weights='imagenet', include_top=False, input_shape=input_shape)

# build a classifier model to put on top of the convolutional model
top_model = Sequential()
top_model.add(Flatten(input_shape=base_model.output_shape[1:]))
top_model.add(Dense(256, activation='relu'))
top_model.add(Dropout(0.5))
top_model.add(Dense(1, activation='sigmoid'))

# note that it is necessary to start with a fully-trained
# classifier, including the top classifier,
# in order to successfully do fine-tuning
top_model.load_weights(top_model_weights_path)

# add the model on top of the convolutional base
# model.add(top_model) fails here: 'Model' object has no attribute 'add'
# keras.applications.vgg16 uses the Functional API; the "add" method only exists in the Sequential API
model = Model(inputs=base_model.input, outputs=top_model(base_model.output))
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_4 (InputLayer)         (None, 150, 150, 3)       0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 150, 150, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 150, 150, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 75, 75, 64)        0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 75, 75, 128)       73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 75, 75, 128)       147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 37, 37, 128)       0         
__________

In [9]:
# set the first 18 layers (the input layer and every VGG16 layer
# up to block5_conv3) to non-trainable (weights will not be updated)
for layer in model.layers[:18]:
    print(layer.get_config()['name'])
    layer.trainable = False

input_4
block1_conv1
block1_conv2
block1_pool
block2_conv1
block2_conv2
block2_pool
block3_conv1
block3_conv2
block3_conv3
block3_pool
block4_conv1
block4_conv2
block4_conv3
block4_pool
block5_conv1
block5_conv2
block5_conv3


In [11]:
from keras import optimizers
# compile the model with an SGD/momentum optimizer
# and a very low learning rate.
# using hyperparameters used by others, see http://www.cs.toronto.edu/~fritz/absps/imagenet.pdf
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
              metrics=['accuracy'])
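
For reference, the update that SGD with momentum (without Nesterov) applies to each weight can be written in a few lines of NumPy. This is an illustrative sketch of the update rule, not Keras code, and the function name `sgd_momentum_step` is an assumption.

```python
import numpy as np


def sgd_momentum_step(w, grad, velocity, lr=1e-4, momentum=0.9):
    """One update: the velocity accumulates past gradients, then moves the weights."""
    velocity = momentum * velocity - lr * grad
    return w + velocity, velocity


# Toy usage: minimise f(w) = 0.5 * w**2, whose gradient is w itself.
w, v = np.array(1.0), np.array(0.0)
for _ in range(200):
    w, v = sgd_momentum_step(w, grad=w, velocity=v, lr=0.1)
```

A small learning rate such as `1e-4` keeps each step tiny, which limits how far the pretrained weights can drift during fine-tuning.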

### 4. Train the `model`

In [12]:
# prepare data augmentation configuration
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

test_datagen = ImageDataGenerator(rescale=1. / 255)

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary')

Found 2000 images belonging to 2 classes.
Found 800 images belonging to 2 classes.
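
Two of the transformations configured above, `rescale` and `horizontal_flip`, are simple enough to sketch in NumPy. The function below is an illustration of what they do to one image, not the implementation used by `ImageDataGenerator`; its name and the `flip_prob` parameter are assumptions.

```python
import numpy as np


def rescale_and_flip(image, rng, flip_prob=0.5):
    """image: uint8 array of shape (height, width, channels)."""
    x = image.astype("float32") / 255.0  # same effect as rescale=1. / 255
    if rng.random() < flip_prob:
        x = x[:, ::-1, :]  # mirror left-right by reversing the width axis
    return x


rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(150, 150, 3), dtype=np.uint8)
augmented = rescale_and_flip(image, rng)
```

Only the training generator applies random transformations; the validation generator just rescales, so validation accuracy is measured on unmodified images.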


In [13]:
start = time.time()
# fine-tune the model
# a full 50-epoch run should reach 90-95% accuracy (7.5 hours of training on CPU!);
# only 5 epochs are run here
history = model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=5,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size)
print("ellapsed time in seconds:", (time.time()-start))

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
ellapsed time in seconds: 2507.8110370635986


In [15]:
model.save_weights("vgg16+topmodel-weights-only.h5")
model.save("vgg16+topmodel.h5")

### 5. Use the `model` to make predictions

In [16]:
from keras.models import load_model

predictor = load_model("vgg16+topmodel.h5")

In [14]:
import numpy as np
from keras.preprocessing.image import ImageDataGenerator, img_to_array, load_img
from keras.models import load_model
# from https://gist.github.com/ragvri/6a28b08b9ad844bc66b90db7d7cebb17
def predict_image_class(predictor, file, w, h):
    # load_img expects target_size as (height, width)
    x = load_img(file, target_size=(h, w))
    x = img_to_array(x)
    # add the batch dimension: (h, w, 3) -> (1, h, w, 3)
    x = np.expand_dims(x, axis=0)
    # note: pixels stay in the 0-255 range here, while the generators
    # used for training rescale them by 1/255
    array = predictor.predict(x)
    print(array)
    # the sigmoid output is the probability of class 1 (dogs)
    if array[0][0] >= 0.5:
        print("dog")
    else:
        print("cat")

In [18]:
predict_image_class(predictor, "data/validation/dogs/dog.12100.jpg", img_width, img_height)
predict_image_class(predictor, "data/validation/cats/cat.12100.jpg", img_width, img_height)

[[ 1.]]
dog
[[ 0.]]
cat
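
The generators in step 4 rescale pixel values by 1/255, while `predict_image_class` passes raw 0-255 values to the model, which may explain the fully saturated outputs. A possible variant of the pre- and post-processing is sketched below with NumPy only. The helper names are hypothetical, and the 0.5 threshold is the usual convention for a sigmoid output rather than something tuned for this model.

```python
import numpy as np


def preprocess(image_array):
    """Scale pixels to [0, 1] as the training generators do, then add a batch axis."""
    x = np.asarray(image_array, dtype="float32") / 255.0
    return np.expand_dims(x, axis=0)


def label_from_probability(prob, threshold=0.5):
    """The sigmoid output is the probability of class 1 (dogs)."""
    return "dog" if prob >= threshold else "cat"


batch = preprocess(np.full((150, 150, 3), 255, dtype=np.uint8))
print(batch.shape)  # (1, 150, 150, 3)
```

In use, something like `label_from_probability(predictor.predict(preprocess(x))[0][0])` would replace the exact comparison with `1`, where `x` is the array returned by `img_to_array`.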
