## Problem Scenario
In this project we have a dataset of cat and dog images: 5000 images of cats and 5000 images of dogs. We use these 10000 images to train a neural network model that predicts whether an image shows a cat or a dog.

To achieve this goal, we will train a **Convolutional Neural Network (CNN)**. We will build a CNN using one of the deep learning libraries, **Keras**.

We first divide the available dataset into a training set and a validation set. We use 80% of the dataset (4000 images of cats + 4000 images of dogs = 8000 images) as the training set and 20% (1000 images of cats + 1000 images of dogs = 2000 images) as the validation set.
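
The 80/20 split described above can be sketched as follows. This is only an illustration: the file names and the helper `split_train_validation` are hypothetical, and the actual dataset used here is assumed to be split into folders already.

```python
import random

def split_train_validation(filenames, train_fraction=0.8, seed=0):
    """Shuffle the file names and split them into (training, validation) lists."""
    shuffled = list(filenames)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical file names standing in for the 5000 cat images
cat_files = ["cat.{}.jpg".format(i) for i in range(1, 5001)]
train_cats, val_cats = split_train_validation(cat_files)

print(len(train_cats), len(val_cats))  # 4000 1000
```

Applying the same function to the dog images would give the remaining 4000 training and 1000 validation images.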

## Import Libraries
    

In [1]:
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense

Using TensorFlow backend.


All the layers in our CNN are arranged sequentially, so we import the Sequential model from the Keras package. The CNN is composed of the following layers:
* **Convolutional layer:** 
Conv2D adds this layer to our CNN model. Since we are dealing with images, which are two-dimensional, we use Conv2D.
* **Pooling layer:** 
Pooling can be of different types, such as max, sum, or average. In our project we use max pooling, added with MaxPooling2D.
* **Flatten layer:** 
This layer converts the stacked 2-D feature maps into a 1-D array that acts as input to the next layer.
* **Fully Connected layer:** 
Dense adds this layer.
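
To make the pooling and flatten steps concrete, here is a minimal pure-Python sketch on a single 4 x 4 feature map. The helper names `max_pool_2x2` and `flatten` are illustrative and are not Keras functions; the Keras layers do the same thing on whole batches of multi-channel feature maps.

```python
def max_pool_2x2(feature_map):
    """Apply 2x2 max pooling with stride 2 to a 2-D list of numbers."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            max(feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1])
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]

def flatten(feature_map):
    """Concatenate the rows of a 2-D list into one 1-D list."""
    return [value for row in feature_map for value in row]

feature_map = [
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 9, 5],
    [1, 1, 3, 7],
]
pooled = max_pool_2x2(feature_map)
print(pooled)           # [[6, 2], [2, 9]]
print(flatten(pooled))  # [6, 2, 2, 9]
```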


In [2]:
import os

training_datapath = os.getcwd() + '/dataset/training_set'
validation_datapath = os.getcwd() + '/dataset/validation_set'

In [3]:
# ----------------------------- #
# PART 1 - Building a CNN model #
# ----------------------------- #

# Initialising the CNN 
classifier = Sequential()

# Adding Convolution layer 
classifier.add(Conv2D(32,(3,3),input_shape = (64, 64, 3),activation = 'relu'))
classifier.add(MaxPooling2D(pool_size = (2,2)))

# Adding second Convolution layer 
classifier.add(Conv2D(32,(3,3),activation = 'relu' ))
classifier.add(MaxPooling2D(pool_size = (2,2)))

# Adding Flatten layer
classifier.add(Flatten())

# Adding fully connected layer
classifier.add(Dense(units = 128, activation = 'relu'))
classifier.add(Dense(units = 1, activation = 'sigmoid'))

# Compiling the CNN model
classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])

The first convolutional layer of our CNN model consists of 32 filters (also called kernels or feature detectors) of size 3 × 3. 
The dataset contains images of different sizes, but the network needs every input to have the same fixed shape. The input_shape parameter declares that shape: each input must be 64 × 64 pixels with 3 channels. It does not resize anything itself; the images are resized by the target_size parameter when they are loaded from disk. The 3 in input_shape indicates that we are dealing with colour (RGB) images. 1 is used in case of greyscale (black and white) images.

The second convolutional layer is added to improve the accuracy of the model. Here we do not need the input_shape parameter, because Keras infers the input shape of every layer after the first from the output shape of the layer before it.

Each convolutional layer is followed by a pooling layer, which downsamples the feature maps produced by the convolution operation. In our CNN model, we use a pooling window of size 2 × 2.
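
The convolution operation itself can be sketched in a few lines of pure Python. This is a simplified illustration, assuming a single-channel image, one hand-written 3 x 3 filter, and no bias term; in Keras, Conv2D learns the filter values during training and sums over all input channels. The function name `conv2d_valid_relu` is hypothetical.

```python
def conv2d_valid_relu(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and apply relu."""
    k = len(kernel)
    out_rows = len(image) - k + 1
    out_cols = len(image[0]) - k + 1
    feature_map = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            total = sum(image[r + i][c + j] * kernel[i][j]
                        for i in range(k) for j in range(k))
            row.append(max(0, total))  # relu keeps positive responses only
        feature_map.append(row)
    return feature_map

# A 4 x 5 image that is dark on the left and bright on the right
image = [
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]
# A filter that responds to dark-to-bright vertical edges
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]
print(conv2d_valid_relu(image, kernel))  # [[0, 3, 3], [0, 3, 3]]
```

The feature map is smaller than the image (2 x 3 instead of 4 x 5) because a 3 x 3 filter without padding cannot be centred on the border pixels.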

As the name suggests, the Flatten layer flattens its input: it converts the stack of 2D feature maps resulting from the previous layer into a single 1D array, which acts as the input layer of the following fully connected neural network. In our CNN model, we use a single hidden layer with 128 neurons and relu as its activation function. Since there are only two categories (cats and dogs) in our classification problem, one neuron in the output layer is sufficient, and the sigmoid activation function suits it best because its output lies between 0 and 1 and can be read as the probability of the class with index 1. For multi-class classification, an output layer with one neuron per class and the softmax function would be the usual choice.
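
The following sketch traces the shape of the data through the model above and counts its trainable parameters. It assumes the Keras defaults that the model relies on: Conv2D with 'valid' padding and stride 1, and MaxPooling2D with a stride equal to the pool size. The helper functions are illustrative only.

```python
def conv_output(height, width, kernel=3):
    """Output size of a convolution with 'valid' padding and stride 1."""
    return height - kernel + 1, width - kernel + 1

def pool_output(height, width, pool=2):
    """Output size of pooling with stride equal to the pool size."""
    return height // pool, width // pool

height, width, channels = 64, 64, 3
filters = 32

height, width = conv_output(height, width)        # 62 x 62 x 32
conv1_params = (3 * 3 * channels + 1) * filters   # weights + one bias per filter
height, width = pool_output(height, width)        # 31 x 31 x 32
height, width = conv_output(height, width)        # 29 x 29 x 32
conv2_params = (3 * 3 * filters + 1) * filters
height, width = pool_output(height, width)        # 14 x 14 x 32

flat_size = height * width * filters              # length of the flattened array
dense1_params = (flat_size + 1) * 128             # hidden layer with 128 neurons
dense2_params = (128 + 1) * 1                     # single sigmoid output neuron
total_params = conv1_params + conv2_params + dense1_params + dense2_params

print(flat_size)     # 6272
print(total_params)  # 813217
```

Almost all of the parameters sit in the first Dense layer, which is why a larger input size or weaker pooling quickly makes the model more expensive to train.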

The loss function is binary crossentropy because the model performs binary classification: the output is a single probability between 0 and 1.
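
As a rough illustration of what the sigmoid activation and the binary crossentropy loss compute for a single example, consider the sketch below. It is a plain-Python approximation for intuition, not the Keras implementation; the clipping constant `eps` is an assumption added to avoid taking the logarithm of 0.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Loss for one example: y_true is 0 or 1, y_pred is a probability."""
    y_pred = min(max(y_pred, eps), 1.0 - eps)  # keep log() finite
    return -(y_true * math.log(y_pred) + (1 - y_true) * math.log(1.0 - y_pred))

print(sigmoid(0.0))  # 0.5

# For a true label of 1 (dog), a confident correct prediction is penalised
# far less than a confident wrong one.
print(binary_crossentropy(1, 0.9) < binary_crossentropy(1, 0.1))  # True
```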

In [4]:
# ---------------------------------- #
# PART 2 - Data(Image) Augmentation  #
# ---------------------------------- #

from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
        rescale=1./255,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True)

validation_datagen = ImageDataGenerator(rescale=1./255)

training_set = train_datagen.flow_from_directory(
        training_datapath,
        target_size=(64, 64),
        batch_size=32,
        class_mode='binary')

validation_set = validation_datagen.flow_from_directory(
        validation_datapath,
        target_size=(64, 64),
        batch_size=32,
        class_mode='binary')

training_set.class_indices   #gives the indices of the two classes of outputs

Found 8000 images belonging to 2 classes.
Found 2000 images belonging to 2 classes.


{'cats': 0, 'dogs': 1}

Data augmentation is useful when we have a small sample of training data. In our project we have only 8000 images for training the model, and a small training set often leads to overfitting. Hence, we augment the images as they are fed to the model. Image augmentation applies random transformations to the images in every batch (here shearing, zooming, and horizontal flipping), so the model rarely sees exactly the same image twice. This effectively enlarges our training data and helps reduce overfitting. The rescale=1./255 setting is not an augmentation: it scales the pixel values from the range 0 to 255 into the range 0 to 1, and it is applied to the validation images as well.

Here, target_size resizes every loaded image to 64 × 64, matching the input_shape of our model, and the class mode is binary because there are two output classes.
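
Two of the operations configured above, rescaling and the random horizontal flip, are simple enough to sketch on a tiny greyscale "image" stored as a list of rows. The helper names are hypothetical; ImageDataGenerator performs these steps (plus shearing and zooming) internally on real image arrays.

```python
import random

def rescale(image, factor=1.0 / 255):
    """Multiply every pixel by factor, as rescale=1./255 does."""
    return [[pixel * factor for pixel in row] for row in image]

def horizontal_flip(image):
    """Mirror the image left to right by reversing every row."""
    return [row[::-1] for row in image]

def augment(image, rng):
    """Flip with probability 0.5, then rescale."""
    if rng.random() < 0.5:
        image = horizontal_flip(image)
    return rescale(image)

image = [[0, 51, 255],
         [102, 204, 153]]
print(horizontal_flip(image))  # [[255, 51, 0], [153, 204, 102]]

# Each call may return a different variant of the same source image
rng = random.Random(42)
batch = [augment(image, rng) for _ in range(4)]
```

Because the flip is drawn at random on every call, the same source image can appear in different forms across epochs.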

In [5]:
# ---------------------------------------- #
# PART 3 - Fitting/ Training the CNN model #
# ---------------------------------------- #

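# steps_per_epoch and validation_steps count batches, not images:
# 8000 training images / batch size 32 = 250 batches,
# 2000 validation images / batch size 32 = 62.5, rounded up to 63.
classifier.fit_generator(
        training_set,
        steps_per_epoch=250,
        epochs=25,
        validation_data=validation_set,
        validation_steps=63)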

Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25


<keras.callbacks.History at 0x7f1d11dbd9e8>
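
In Keras 2, the steps_per_epoch and validation_steps arguments count batches rather than images, so the values needed for one full pass over the data follow from the dataset size and the batch size of 32 used by the generators. A small sketch of that arithmetic, with a hypothetical helper `steps_for`:

```python
import math

def steps_for(num_images, batch_size):
    """Number of batches needed to see every image once."""
    return math.ceil(num_images / batch_size)

train_steps = steps_for(8000, 32)
validation_steps = steps_for(2000, 32)

print(train_steps)       # 250
print(validation_steps)  # 63
```

Passing the number of images instead of the number of batches would make every epoch run through the data 32 times.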

The accuracy of the model may be further increased by:
*  adding more convolutional layers
*  adding more hidden layers in the fully connected neural network
*  tweaking the number of neurons in the hidden layers
*  increasing the input size, because a larger input carries more pixel information and can give better results

We need to remember that each of these changes also makes training computationally more expensive.


In [13]:
# -------------------------------- #
# PART 4 - Making New Prediction   #
# -------------------------------- #

import numpy as np
from keras.preprocessing import image

test_image_1_path = os.getcwd() + '/dataset/test_set/cat_or_dog_1.jpg'
test_image_2_path = os.getcwd() + '/dataset/test_set/cat_or_dog_2.jpg'

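test_image_1 = image.load_img(test_image_1_path, target_size = (64,64))
test_image_1 = image.img_to_array(test_image_1)
# Rescale pixel values to [0, 1], as was done for the training images
test_image_1 = test_image_1 / 255.0
# Add a batch dimension: (64, 64, 3) -> (1, 64, 64, 3)
test_image_1 = np.expand_dims(test_image_1, axis = 0)

# The sigmoid output is the predicted probability of class 1 ('dogs')
result = classifier.predict(test_image_1)

if result[0][0] > 0.5:
    print("prediction = dog")
else:
    print("prediction = cat")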

prediction = dog
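
The model outputs a probability for the class with index 1, so a label can be derived from the class_indices mapping shown earlier instead of being hard-coded. A minimal sketch, assuming a threshold of 0.5; the helper `label_for` and the probabilities 0.93 and 0.08 are made up for illustration and are not outputs of the trained model.

```python
def label_for(probability, class_indices, threshold=0.5):
    """Map a sigmoid output to the class name it most likely represents."""
    index_to_label = {index: label for label, index in class_indices.items()}
    predicted_index = 1 if probability > threshold else 0
    return index_to_label[predicted_index]

class_indices = {'cats': 0, 'dogs': 1}  # as reported by training_set.class_indices

print(label_for(0.93, class_indices))  # dogs
print(label_for(0.08, class_indices))  # cats
```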
