# Recognizing dogs and cats with a Convolutional Neural Network

The purpose of this laboratory is to build a first end-to-end reflex-based AI model to teach computers to [**understand images**](https://www.ted.com/talks/fei_fei_li_how_we_re_teaching_computers_to_understand_pictures).

In particular, the objective of this lab is to write an AI application able to recognize cats and dogs in images. Your application will take an image as input and say whether the image contains a dog or a cat. You will work with the data of the [**Dogs vs Cats**](http://www.kaggle.com/c/dogs-vs-cats) competition from Kaggle. This competition was launched in 2013, and first place went to [Pierre Sermanet](https://research.google.com/pubs/PierreSermanet.html), currently a Research Scientist at Google Brain, who used the [Overfeat](http://cilvr.nyu.edu/doku.php?id=software:overfeat:start#overfeatobject_recognizer_feature_extractor) deep learning library he wrote during his PhD at New York University under the supervision of [Yann LeCun](http://yann.lecun.com/), Director of AI Research at Facebook. He obtained a classification error of 1.09%. Try your best to approach this score!

To download the data, install the kaggle API:
```
pip3 install kaggle
```

Then run:
```
kaggle competitions download -c dogs-vs-cats
```

    
**The final objective of this laboratory is to be aware of both the potential and the limitations of reflex-based AI approaches to visual recognition tasks.**
    

# Data preparation and loading

As with any ML algorithm, data preparation is required when working with convolutional neural networks and deep learning models. You will use the [*ImageDataGenerator class*](https://keras.io/preprocessing/image/), which defines the configuration for image data preparation and also for data augmentation, a step often necessary for deep learning. The code below shows how to create and configure an ImageDataGenerator and how to use it to read batches of images from your data directories. In this example, we will use the sample dataset of the Dogs and Cats challenge. We assume that you have a training directory and a validation directory set up as follows:

```
  data/
    train/
        dogs/
        cats/
    validation/
        dogs/
        cats/
```

In [None]:
# Make sure your data is organized as described above.
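
One possible way to produce this layout is sketched below. It assumes that the extracted Kaggle training images sit in a single folder and are named `cat.<n>.jpg` and `dog.<n>.jpg`. The function name `split_dogs_vs_cats`, the 20% validation fraction and the paths in the usage comment are illustrative choices, not part of the lab.

```python
import random
import shutil
from pathlib import Path


def split_dogs_vs_cats(source_dir, target_dir, validation_fraction=0.2, seed=0):
    """Copy files named 'cat.<n>.jpg' / 'dog.<n>.jpg' from source_dir into
    target_dir/{train,validation}/{cats,dogs}/ and return the file counts."""
    source_dir, target_dir = Path(source_dir), Path(target_dir)
    rng = random.Random(seed)  # fixed seed so that the split is reproducible
    counts = {}
    for prefix, class_dir in (("cat", "cats"), ("dog", "dogs")):
        files = sorted(source_dir.glob(prefix + ".*.jpg"))
        rng.shuffle(files)
        n_validation = int(len(files) * validation_fraction)
        subsets = {"validation": files[:n_validation],
                   "train": files[n_validation:]}
        for subset, subset_files in subsets.items():
            destination = target_dir / subset / class_dir
            destination.mkdir(parents=True, exist_ok=True)
            for path in subset_files:
                shutil.copy(path, destination / path.name)
            counts[(subset, class_dir)] = len(subset_files)
    return counts


# Hypothetical usage (both paths are placeholders):
# split_dogs_vs_cats("kaggle/train", "data")
```

Copying rather than moving keeps the original archive contents intact, at the cost of extra disk space.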

# Data Augmentation 

Only a few training examples are available in the sampleDeep dataset. To make the most of these training examples, a common approach is to **augment** them via a number of random transformations, so that our model never sees the exact same picture twice. This augmentation step helps prevent overfitting and helps the model generalize better.
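
To make the idea of a random transformation concrete, here is a deliberately simplified sketch of one of them: a random horizontal shift where the uncovered pixels are filled by repeating the nearest edge pixel. It works on a tiny image stored as nested lists and is not how Keras implements it; the function names are hypothetical and the shift direction convention is an arbitrary choice.

```python
import random


def shift_with_nearest_fill(image, shift):
    """Shift every row of a 2-D image by `shift` pixels (positive = right),
    filling the uncovered pixels with the nearest edge pixel."""
    width = len(image[0])
    return [[row[min(max(x - shift, 0), width - 1)] for x in range(width)]
            for row in image]


def random_shift(image, max_fraction=0.2, rng=random):
    """Apply a random horizontal shift of at most max_fraction of the width."""
    max_shift = int(len(image[0]) * max_fraction)
    return shift_with_nearest_fill(image, rng.randint(-max_shift, max_shift))


row_image = [[1, 2, 3, 4, 5]]
print(shift_with_nearest_fill(row_image, 2))   # [[1, 1, 1, 2, 3]]
print(shift_with_nearest_fill(row_image, -1))  # [[2, 3, 4, 5, 5]]
```

Because a new shift is drawn each time the image is requested, the model receives a slightly different picture at every epoch.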


In [46]:
from keras.preprocessing.image import ImageDataGenerator

# definition of the number of samples propagated through the network at each step
batch_size = 16

# dimensions of our images.
img_width, img_height = 150, 150

train_data_dir = '/home/olivier/dev/data/catsdogs/train/'
validation_data_dir = '/home/olivier/dev/data/catsdogs/validation/'

# Augmentation of the training data using rotation, horizontal and vertical shifts, shearing and zooming
train_datagen = ImageDataGenerator(
        rotation_range=40,
        width_shift_range=0.2,
        height_shift_range=0.2,
        rescale=1./255,
        shear_range=0.2,
        zoom_range=0.2,
        fill_mode='nearest')

# Only rescaling for validation data
test_datagen = ImageDataGenerator(rescale=1./255)

# generator that will read pictures found in the train dataset directory and that will indefinitely generate
# batches of augmented image data
train_generator = train_datagen.flow_from_directory(
        train_data_dir,  
        target_size=(150, 150),  
        batch_size=batch_size,
        class_mode='binary')  

# Similar generator for validation data
validation_generator = test_datagen.flow_from_directory(
        validation_data_dir,
        target_size=(150, 150),
        batch_size=batch_size,
        class_mode='binary')


Found 19904 images belonging to 2 classes.
Found 5096 images belonging to 2 classes.


# Build a Sequential model, from scratch

Models can be built easily with the Keras API. Here we will use the Sequential model API:
+ [https://keras.io/getting-started/sequential-model-guide/](https://keras.io/getting-started/sequential-model-guide/)


Here, we will build a convolutional neural network, which is, by design, one of the best models available for most "perceptual" problems (such as image classification), even with very little data to learn from.

In the code below, a model composed of 3 convolution layers, each with a ReLU activation and followed by a max-pooling layer, is built.
Your main task here is to study the code in order to understand how deep neural architectures can be built. In particular, have a look at the [documentation on the different kinds of layers available in Keras](https://keras.io/layers/about-keras-layers/).
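
When stacking convolution and pooling layers, it helps to follow how the size of the feature maps shrinks. The sketch below does this arithmetic for a 150x150 input under assumptions that the lab does not impose: 3x3 convolutions with stride 1 and no padding, 2x2 max-pooling, and 32, 32 and 64 filters in the three blocks.

```python
def conv_output_size(size, kernel_size=3):
    """Spatial size after a stride-1 convolution without padding."""
    return size - kernel_size + 1


def pool_output_size(size, pool_size=2):
    """Spatial size after non-overlapping max-pooling."""
    return size // pool_size


def feature_map_shapes(input_size=150, filters=(32, 32, 64)):
    """Return the (height, width, channels) shape after each conv + pool block."""
    shapes = []
    size = input_size
    for n_filters in filters:
        size = pool_output_size(conv_output_size(size))
        shapes.append((size, size, n_filters))
    return shapes


shapes = feature_map_shapes()
height, width, channels = shapes[-1]
print(shapes)                     # [(74, 74, 32), (36, 36, 32), (17, 17, 64)]
print(height * width * channels)  # 18496 values after flattening
```

The last number is the input size of the first fully-connected layer, which is why that layer usually holds most of the parameters of such a small network.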




In [47]:
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense

model = Sequential()

# !!! Implement the model here !!!
# First convolutional layer
# Second  convolutional layer 
# Third convolutional layer    
# Adding of two fully-connected layers
# single unit and sigmoid activation, which is perfect for a binary classification. 

# Use of the binary_crossentropy loss to train our model.
model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])



# Training the model

We can now use the generators defined above to train the model we have built.

In [None]:
nb_train_samples = 2000
nb_validation_samples = 800
epochs = 50

# model training
model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size)

# saving the learned model
model.save_weights('first_try.h5')
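
The number of steps determines how many images are drawn from a generator in each epoch. The sketch below redoes the arithmetic for the values used above (2000 and 800 samples, batches of 16) and, for comparison, for the 19904 and 5096 images reported by the generators. The helper name `steps_for` and its `keep_partial_batch` option are hypothetical.

```python
import math


def steps_for(n_samples, batch_size, keep_partial_batch=False):
    """Number of batches needed to go through n_samples images once."""
    if keep_partial_batch:
        return math.ceil(n_samples / batch_size)
    return n_samples // batch_size


batch_size = 16
print(steps_for(2000, batch_size))    # 125
print(steps_for(800, batch_size))     # 50
print(steps_for(19904, batch_size))   # 1244
print(steps_for(5096, batch_size))    # 318
print(steps_for(5096, batch_size, keep_partial_batch=True))  # 319
```

With 125 steps of 16 images, each epoch uses 2000 images, whatever the number of files found in the training directory.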

# Evaluation 

- Look carefully at the results, the different metrics and the values obtained. What is your interpretation of the results?
- Use the network to predict cats and dogs.

You can now apply this model to any new image. For instance, the function below applies the model to an image of the test dataset. Try it on a few images to test your model.

In [None]:
%matplotlib inline

from skimage import data, io
from matplotlib.pyplot import imshow
from keras.preprocessing import image
import numpy as np

def showPrediction(img_path, model):
    ima=io.imread(img_path)
    imshow(ima)
    img = image.load_img(img_path, target_size=(150, 150))
    # apply the same 1/255 rescaling as the training generators
    x = image.img_to_array(img) / 255.
    x = np.expand_dims(x, axis=0)

    print("Prediction")
    preds = model.predict_classes(x)
    print(preds)
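
The model ends with a single sigmoid unit, so its raw output is a number between 0 and 1. `flow_from_directory` numbers the class folders in alphanumeric order, which gives `cats` the index 0 and `dogs` the index 1 for the layout used here (this can be checked with `train_generator.class_indices`). The sketch below turns such an output into a readable label; the helper name and the 0.5 threshold are illustrative choices.

```python
# Class indices as assigned for the folders 'cats' and 'dogs'
CLASS_NAMES = {0: "cat", 1: "dog"}


def label_from_probability(probability, threshold=0.5):
    """Map the sigmoid output of the network to a class name."""
    class_index = 1 if probability > threshold else 0
    return CLASS_NAMES[class_index]


print(label_from_probability(0.93))  # dog
print(label_from_probability(0.08))  # cat
```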

## Use a pretrained Convnet model

In practice, very few people train an entire Convolutional Network from scratch (with random initialization), because it is relatively rare to have a dataset of sufficient size. Instead, in image classification, it is common to take a network pre-trained on a large dataset (such as ImageNet) and to use it either as an initialization or as a fixed feature extractor for the task of interest (**transfer learning**). Indeed, these networks have already learned features that are useful for most computer vision problems, and leveraging such features typically allows us to reach a better accuracy than a method that relies only on the available data.

Different strategies can be used in transfer learning scenarios:

1. The ConvNet, trained on a large image dataset such as ImageNet, is used as a fixed feature extractor. In this case, the pipeline consists in taking the pre-trained ConvNet, removing the last fully connected layer, and treating the rest of the ConvNet architecture as a fixed feature extractor for the new dataset.
2. Fine-tuning of the ConvNet. In this case, the weights of a part of the pre-trained network are fine-tuned by continuing the backpropagation. It has been observed that the first layers of a ConvNet contain more generic features (e.g. edge detectors or color blob detectors) that should be useful for many tasks, while later layers become progressively more specific to the details of the classes contained in the original dataset. For this reason, only the higher layers of the network are fine-tuned.


### ConvNet as a fixed feature extractor   

In our case, the ImageNet dataset contains several "cat" classes (Persian cat, Siamese cat, ...) and many "dog" classes among its total of 1000 classes. As a consequence, any model pre-trained on ImageNet will already have learned features that are relevant to our classification problem.

In particular, we will use the VGG16 architecture, which won the localization task and placed second in the classification task of the 2014 ImageNet challenge (ILSVRC), and is a very simple model to create and understand. The VGG team created both a larger, slower, slightly more accurate model (VGG19) and a smaller, faster model (VGG16). We will use VGG16, since the slower speed of VGG19 is generally not worth its very minor improvement in accuracy.

![VGG16](https://heuritech.files.wordpress.com/2016/02/vgg16.png?w=470)

Source : [https://blog.heuritech.com/2016/02/29/a-brief-report-of-the-heuritech-deep-learning-meetup-5/](https://blog.heuritech.com/2016/02/29/a-brief-report-of-the-heuritech-deep-learning-meetup-5/)

In the code below, the strategy consists in instantiating only the convolutional part of the model (using the *include_top* argument; see the [Keras documentation on VGG16](https://keras.io/applications/#vgg16)) and in running this model once on our training and validation data, recording the output in numpy arrays. Then, we will train a small fully-connected model on top of the stored features.
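
To get a feeling for the size of the stored features, the sketch below computes the output shape of the convolutional part of VGG16 for a 150x150 input. It relies on the structure of VGG16 (five blocks, each ending with a 2x2 max-pooling, and 512 channels in the last block) and on the small classifier used in this lab (256 hidden units, 1 output unit). The helper names are hypothetical.

```python
def vgg16_feature_shape(height, width, n_blocks=5, channels=512):
    """Shape of the VGG16 convolutional output: the convolutions keep the
    spatial size, and each of the five poolings halves it (rounding down)."""
    for _ in range(n_blocks):
        height, width = height // 2, width // 2
    return (height, width, channels)


def dense_parameter_count(n_inputs, n_units):
    """Weights plus biases of a fully-connected layer."""
    return n_inputs * n_units + n_units


shape = vgg16_feature_shape(150, 150)
n_features = shape[0] * shape[1] * shape[2]
print(shape)                                   # (4, 4, 512)
print(n_features)                              # 8192
print(dense_parameter_count(n_features, 256))  # 2097408
print(dense_parameter_count(256, 1))           # 257
```

Each image is therefore summarized by 8192 numbers, and almost all the parameters of the small classifier sit in its first fully-connected layer.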

Some references :
 + VGG models : [http://www.robots.ox.ac.uk/~vgg/research/very_deep/](http://www.robots.ox.ac.uk/~vgg/research/very_deep/)
 



In [None]:
import numpy as np
from keras import applications

top_model_weights_path = 'bottleneck_fc_model.h5'

def save_bottleneck_features():
    """Run the convolutional part of VGG16 (pre-trained on ImageNet) once on
    the training and validation data, and save the features and the labels."""
    datagen = ImageDataGenerator(rescale=1. / 255)

    # build and load the VGG16 network without the fully connected layers
    model = applications.VGG16(include_top=False, weights='imagenet')

    for name, data_dir in [('train', train_data_dir),
                           ('validation', validation_data_dir)]:
        # shuffle=False keeps the images in directory order, so that
        # generator.classes lines up with the predicted features
        generator = datagen.flow_from_directory(
            data_dir,
            target_size=(img_width, img_height),
            batch_size=batch_size,
            class_mode=None,
            shuffle=False)

        # one pass over the whole directory (the last batch may be smaller);
        # the predictions are returned as a numpy array that we can save
        steps = int(np.ceil(generator.samples / batch_size))
        features = model.predict_generator(generator, steps)
        np.save('bottleneck_features_%s.npy' % name, features)
        np.save('bottleneck_labels_%s.npy' % name, generator.classes)


def train_top_model():
    """Train a small fully-connected model on top of the stored features."""
    train_data = np.load('bottleneck_features_train.npy')
    train_labels = np.load('bottleneck_labels_train.npy')

    validation_data = np.load('bottleneck_features_validation.npy')
    validation_labels = np.load('bottleneck_labels_validation.npy')

    # Building of the small fully-connected model
    model = Sequential()
    model.add(Flatten(input_shape=train_data.shape[1:]))
    model.add(Dense(256, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(1, activation='sigmoid'))

    # Configuration of the learning process
    model.compile(optimizer='rmsprop',
                  loss='binary_crossentropy', metrics=['accuracy'])

    # Training of the model
    model.fit(train_data, train_labels,
              epochs=epochs,
              batch_size=batch_size,
              validation_data=(validation_data, validation_labels))
    model.save_weights(top_model_weights_path)

save_bottleneck_features()
train_top_model()

## Fine-tuning the top layers of a pre-trained network

We will now try to "fine-tune" the last convolutional block of the VGG16 model. Fine-tuning consists in starting from a trained network (here, VGG16), then re-training it on a new dataset using very small weight updates. In our case, this can be done in 3 steps:
+ Instantiate the convolutional base of VGG16 and load its weights.
+ Add our previously defined fully-connected model on top, and load its weights.
+ Freeze the layers of the VGG16 model up to the last convolutional block
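
The third step can be reasoned about with layer names alone. The sketch below rebuilds the names of the convolutional layers of VGG16 following the `blockN_convM` / `blockN_pool` naming pattern used by the Keras VGG16 application, and splits them into the layers to freeze and the layers left trainable. It is plain Python that does not touch any model, and the helper names are hypothetical.

```python
# (block name, number of convolution layers in the block)
VGG16_BLOCKS = [("block1", 2), ("block2", 2), ("block3", 3),
                ("block4", 3), ("block5", 3)]


def vgg16_layer_names():
    """Names of the convolution and pooling layers, in network order."""
    names = []
    for block, n_convs in VGG16_BLOCKS:
        names += ["%s_conv%d" % (block, i + 1) for i in range(n_convs)]
        names.append(block + "_pool")
    return names


def split_frozen_trainable(layer_names, trainable_prefix="block5"):
    """Freeze every layer whose name does not start with trainable_prefix."""
    frozen = [n for n in layer_names if not n.startswith(trainable_prefix)]
    trainable = [n for n in layer_names if n.startswith(trainable_prefix)]
    return frozen, trainable


frozen, trainable = split_frozen_trainable(vgg16_layer_names())
print(len(frozen))  # 14
print(trainable)    # ['block5_conv1', 'block5_conv2', 'block5_conv3', 'block5_pool']
```

Selecting layers by name rather than by position avoids off-by-one mistakes when the number of layers differs from what a hard-coded index assumes.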


In [None]:
from keras import backend
from keras import optimizers
import keras

# path to the model weights files.
weights_path = '../keras/examples/vgg16_weights.h5'
top_model_weights_path = 'bottleneck_fc_model.h5'
# use the TensorFlow convention: images are (height, width, channels)
keras.backend.set_image_data_format('channels_last')

# creation of the base VGG pre-trained model
model = applications.VGG16(
    weights='imagenet', include_top=False, input_shape=(150,150,3))
print('Model loaded.')

# build a classifier model to put on top of the convolutional model
top_model = Sequential()
top_model.add(Flatten(input_shape=model.output_shape[1:]))
top_model.add(Dense(256, activation='relu'))
top_model.add(Dropout(0.5))
top_model.add(Dense(1, activation='sigmoid'))

# note that it is necessary to start with a fully-trained
# classifier, including the top classifier,
# in order to successfully do fine-tuning
top_model.load_weights(top_model_weights_path)

# creation of a real model from vgg
new_model = Sequential()
for l in model.layers:
    new_model.add(l)

# concatenation of the base model with the top model
new_model.add(top_model)


# set every VGG16 layer before the last convolutional block (block5)
# to non-trainable (weights will not be updated); block5 and the
# top classifier remain trainable
for layer in model.layers:
    if not layer.name.startswith('block5'):
        layer.trainable = False

# compile the model with an SGD/momentum optimizer
# and a very low learning rate.
new_model.compile(loss='binary_crossentropy',
              optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
              metrics=['accuracy'])

# fine-tune the model
new_model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples//batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples// batch_size)
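
To see why this optimizer setting produces "very small weight updates", the sketch below applies the classical (non-Nesterov) momentum update to a single scalar weight, with the learning rate and momentum used above. The gradient value is made up for the illustration and the function name is hypothetical.

```python
def sgd_momentum_step(weight, gradient, velocity,
                      learning_rate=1e-4, momentum=0.9):
    """One SGD update with classical momentum on a scalar weight."""
    velocity = momentum * velocity - learning_rate * gradient
    return weight + velocity, velocity


weight, velocity = 1.0, 0.0
for step in range(3):
    # constant gradient of 2.0, chosen only for the example
    weight, velocity = sgd_momentum_step(weight, 2.0, velocity)
    print(step, weight, velocity)
```

Even with a sizeable gradient, each step changes the weight by a few ten-thousandths, which limits the risk of destroying the features learned on ImageNet.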


# Sources and references

+ This case study is inspired by Lesson 1 of fast.ai's online course, Practical Deep Learning For Coders: [http://course.fast.ai/](http://course.fast.ai/)
+ Other sources:
    + Stanford CS231n course on Convolutional Neural Networks for Visual Recognition : [http://cs231n.stanford.edu/](http://cs231n.stanford.edu/)
    + Keras blog post on building image classification models [here](https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html)