<a href="https://colab.research.google.com/github/rmschulman/mldata/blob/master/CatsDogs_simple.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### First, let us import the Keras packages we will use to build our CNN. Make sure that every package is installed.

In [3]:
# Importing the Keras libraries and packages
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense
from keras.preprocessing.image import ImageDataGenerator

Using TensorFlow backend.


In line 1, we import Sequential from keras.models to initialise our neural network as a sequential model. Keras offers two basic ways to define a network: as a linear sequence of layers (Sequential) or as a graph of layers (the functional API).

In line 2, we import Conv2D from keras.layers. It performs the convolution operation, the first step of a CNN, on the training images. Because an image has two spatial dimensions (height and width, plus a channel axis for colour), we use a 2-D convolution. For video you may need a 3-D convolution, where the third dimension is time.

In line 3, we import MaxPooling2D from keras.layers for the pooling operation, step 2 in building a CNN. This network uses max pooling, which keeps the maximum pixel value from each region of interest. Other pooling operations exist, such as min pooling and mean pooling.

In line 4, we import Flatten from keras.layers, which is used for flattening, step 3. Flattening converts all the resulting 2-dimensional arrays into a single long, continuous vector.

Finally, in line 5, we import Dense from keras.layers. It performs the full connection of the neural network, step 4 in building a CNN. Line 6 imports ImageDataGenerator, which we use later to load and augment the images.
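
To make the convolution step less abstract, here is a minimal pure-Python sketch of what a single filter does to a single-channel image. The function name `conv2d_valid`, the toy image, and the edge filter are made up for illustration; Keras's Conv2D does the same sliding-window sum, but across all colour channels and with learned filter values.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# Hypothetical 4x4 image: dark on the left, bright on the right
toy_image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hypothetical 3x3 filter that responds to dark-to-bright transitions
edge_filter = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = conv2d_valid(toy_image, edge_filter)
print(feature_map)  # → [[3, 3], [3, 3]]
```

The 4x4 input shrinks to a 2x2 feature map because a 3x3 window only fits in two positions along each axis.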

Now, we will create an object of the Sequential class:

In [0]:
classifier = Sequential()

Let us now code the convolution step. Thanks to Keras, you may be surprised how easy it is to implement this operation in a single line of Python.

In [5]:
classifier.add(Conv2D(32, (3, 3), input_shape = (64, 64, 3), activation = 'relu'))

Instructions for updating:
Colocations handled automatically by placer.


Let’s break down the code above. We take the classifier object, which already knows the overall form of our network (Sequential), and add a convolution layer using Conv2D. Conv2D takes four arguments here. The first is the number of filters, 32. The second is the shape of each filter, 3x3. The third is the input shape: each input image is 64x64 pixels, and the “3” is the number of colour channels (RGB); a black-and-white image would have 1. The fourth is the activation function; ‘relu’ stands for the rectifier function.
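
As a rough check on those four arguments, the sketch below works out the output shape and the number of trainable parameters for this layer. The helper names are hypothetical, and the formulas assume the Keras defaults of no padding and a stride of 1.

```python
def conv2d_output_shape(height, width, filters, kernel_size):
    # With no padding and stride 1, each spatial side shrinks by kernel_size - 1
    return (height - kernel_size + 1, width - kernel_size + 1, filters)


def conv2d_param_count(in_channels, filters, kernel_size):
    # One kernel_size x kernel_size x in_channels weight block per filter, plus one bias each
    weights = kernel_size * kernel_size * in_channels * filters
    biases = filters
    return weights + biases


conv_shape = conv2d_output_shape(64, 64, filters=32, kernel_size=3)
conv_params = conv2d_param_count(in_channels=3, filters=32, kernel_size=3)
print(conv_shape)   # → (62, 62, 32)
print(conv_params)  # → 896
```

You can compare these numbers against the first row of `classifier.summary()`.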

---

Next, we perform a pooling operation on the feature maps produced by the convolution. The primary aim of pooling is to reduce the size of the feature maps while keeping their strongest responses. To understand these steps in more detail you will need to read a few external resources, but the key point is that we are reducing the total number of nodes for the upcoming layers.

In [0]:
classifier.add(MaxPooling2D(pool_size = (2, 2)))

We take our classifier object and add the pooling layer. With a 2x2 pooling window we lose few pixels while still keeping a precise region where the features are located. Again, to understand the math behind pooling, I suggest learning from an external source; this tutorial concentrates on the implementation. We have just reduced the complexity of the model, ideally without hurting its performance.
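
If you want a feel for the operation without the math, this small sketch applies 2x2 max pooling to a toy feature map in plain Python. The function `max_pool_2x2` and the numbers are illustrative only; it assumes non-overlapping windows and drops a trailing odd row or column, which matches the Keras default behaviour.

```python
def max_pool_2x2(feature_map):
    """Keep the largest value from each non-overlapping 2x2 window."""
    pooled = []
    for i in range(0, len(feature_map) - 1, 2):
        row = []
        for j in range(0, len(feature_map[0]) - 1, 2):
            window = [
                feature_map[i][j], feature_map[i][j + 1],
                feature_map[i + 1][j], feature_map[i + 1][j + 1],
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


# Hypothetical 4x4 feature map
toy_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 3],
]

pooled_map = max_pool_2x2(toy_map)
print(pooled_map)  # → [[4, 2], [2, 7]]
```

Sixteen values become four, so each later layer has a quarter as many inputs to deal with.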

---

Now we convert all the pooled feature maps into one continuous vector through flattening. Flattening is an important step to understand: we take the 2-D arrays, i.e. the pooled image pixels, and convert them into a single one-dimensional vector.

In [0]:
classifier.add(Flatten())
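
A tiny sketch of flattening, using nested Python lists in place of real feature maps. The `flatten` helper and the toy data are hypothetical; the final length is derived from the layer settings used above (64x64 input, 3x3 filters, 2x2 pooling, 32 filters), assuming Keras defaults for padding and stride.

```python
def flatten(feature_maps):
    """Turn nested [row][column][channel] lists into one flat list."""
    return [value for row in feature_maps for pixel in row for value in pixel]


# Hypothetical 2x2 map with 2 channels per position
toy_maps = [
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]],
]
flat = flatten(toy_maps)
print(flat)  # → [1, 2, 3, 4, 5, 6, 7, 8]

# Length of the vector Flatten produces in this network
pooled_side = (64 - 3 + 1) // 2          # 62 after convolution, 31 after pooling
flattened_length = pooled_side * pooled_side * 32
print(flattened_length)  # → 30752
```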

In this step we create a fully connected layer. The nodes produced by the flattening step act as the input to this layer. Because this layer sits between the input layer and the output layer, we can refer to it as a hidden layer.

In [0]:
classifier.add(Dense(units = 128, activation = 'relu'))

As you can see, Dense is the function that adds a fully connected layer. ‘units’ defines the number of nodes in this hidden layer. A common rule of thumb is to pick a value between the number of input nodes and the number of output nodes, but the best number can only be found by experiment. It is also common practice to use a power of 2. The activation function is again the rectifier function.
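
To see what "fully connected" means in practice, here is a minimal sketch of a dense layer with a rectifier activation, plus the parameter count it implies. The helper names and toy weights are made up; the 30752 inputs assume the flattened size that follows from the layers above.

```python
def dense_relu(inputs, weights, biases):
    """Each unit takes a weighted sum of ALL inputs, adds a bias, then applies ReLU."""
    outputs = []
    for unit_weights, bias in zip(weights, biases):
        z = sum(x * w for x, w in zip(inputs, unit_weights)) + bias
        outputs.append(max(0.0, z))
    return outputs


def dense_param_count(n_inputs, n_units):
    # One weight per (input, unit) pair, plus one bias per unit
    return n_inputs * n_units + n_units


# Hypothetical layer with 2 inputs and 2 units
toy_output = dense_relu([1.0, 2.0], [[0.5, -1.0], [1.0, 1.0]], [0.0, -1.0])
print(toy_output)  # → [0.0, 2.0]

hidden_params = dense_param_count(30752, 128)
print(hidden_params)  # → 3936384
```

Almost all of this model's parameters live in this one layer, which is why the number of units is worth experimenting with.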

---

Now it’s time to add our output layer, which contains only one node because this is binary classification. This single node outputs a value between 0 and 1, which we interpret as either cat or dog.

In [0]:
classifier.add(Dense(units = 1, activation = 'sigmoid'))

Observe that the final layer contains only one node and uses a sigmoid activation function. Why just one node? Because the model only has to choose between two classes, dog and cat, so a single probability is enough.
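
The sketch below shows how a sigmoid squashes any number into the range 0 to 1, and how one such value can stand for two classes. The `predict_label` helper and the 0.5 threshold are illustrative assumptions; it treats values near 1 as dog, following the prediction cell at the end of this notebook.

```python
import math


def sigmoid(z):
    """Map any real number to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_label(probability, threshold=0.5):
    # Assumption: class 1 is 'dog' and class 0 is 'cat'
    return 'dog' if probability > threshold else 'cat'


for z in (-4.0, 0.0, 4.0):
    p = sigmoid(z)
    print(z, round(p, 3), predict_label(p))
```

A strongly negative input gives a value near 0 (cat), a strongly positive one a value near 1 (dog), and 0 sits exactly on the fence at 0.5.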

---

Now that we have completed building our CNN model, it’s time to compile it.

In [0]:
classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])

*   The optimizer parameter chooses the optimisation algorithm; ‘adam’ is a variant of stochastic gradient descent.
*   The loss parameter chooses the loss function; binary cross-entropy suits a two-class problem.
*   Finally, the metrics parameter chooses the performance metric.
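
To show how the loss and the metric differ, here is a plain-Python sketch of binary cross-entropy and accuracy. The function names, the clipping constant `eps`, and the sample predictions are assumptions for illustration, not the Keras implementation.

```python
import math


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Average penalty, larger the further each probability is from its label."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)  # avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)


def accuracy(y_true, y_pred, threshold=0.5):
    """Fraction of predictions that land on the right side of the threshold."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if (p > threshold) == bool(t))
    return correct / len(y_true)


labels = [1, 0, 1, 0]
confident = [0.9, 0.1, 0.8, 0.2]    # hypothetical model outputs
unsure = [0.6, 0.4, 0.55, 0.45]     # same decisions, less certain

print(accuracy(labels, confident), binary_crossentropy(labels, confident))
print(accuracy(labels, unsure), binary_crossentropy(labels, unsure))
```

Both sets of predictions score the same accuracy, but the loss is lower for the confident one. That is why the optimizer works on the loss rather than on accuracy.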

---

It’s time to fit our CNN to the image dataset. But before we do that, we are going to pre-process the images to reduce overfitting. Overfitting is when you get great training accuracy but very poor test accuracy, because the model has memorised details of the training images instead of learning features that generalise.
So before we fit our images to the neural network, we perform some image augmentations on them, which synthesises extra training data by applying random transformations. We use the keras.preprocessing library both for the augmentation and to prepare the training set and the test set from images stored in properly structured directories, where each directory’s name is taken as the label of all the images inside it. For example, all the images inside the ‘cats’ folder will be treated as cats by Keras.
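
As a rough picture of what augmentation does, this sketch rescales a toy image to the 0 to 1 range and then randomly mirrors it. The helpers `rescale` and `random_horizontal_flip` are hypothetical stand-ins written in plain Python; ImageDataGenerator applies its own versions of these (and the shear and zoom transforms) for you.

```python
import random


def rescale(image, factor=1.0 / 255):
    """Scale 0-255 pixel values down to roughly 0-1."""
    return [[pixel * factor for pixel in row] for row in image]


def random_horizontal_flip(image, rng):
    """Mirror the image left-to-right about half of the time."""
    if rng.random() < 0.5:
        return [list(reversed(row)) for row in image]
    return [list(row) for row in image]


# Hypothetical 2x3 greyscale image
toy_image = [
    [0, 128, 255],
    [255, 128, 0],
]

rng = random.Random(0)  # seeded only so the sketch is repeatable
augmented = random_horizontal_flip(rescale(toy_image), rng)
print(augmented)
```

Each epoch the network sees slightly different versions of the same pictures, which makes it harder to memorise them.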

Load the training and test data from GitHub. If you've already set up the dataset directory, don't run this again. You can use **Runtime > Run before** and **Runtime > Run after** to control whether this step runs.

In [16]:
!git clone https://github.com/rmschulman/mldata.git
!mv mldata/dataset .
!rm -r mldata

Cloning into 'mldata'...
remote: Enumerating objects: 1976, done.
remote: Counting objects: 100% (1976/1976), done.
remote: Compressing objects: 100% (1971/1971), done.
remote: Total 1976 (delta 4), reused 1973 (delta 4), pack-reused 0
Receiving objects: 100% (1976/1976), 18.81 MiB | 23.15 MiB/s, done.
Resolving deltas: 100% (4/4), done.


In [17]:
train_datagen = ImageDataGenerator(rescale = 1./255,
   shear_range = 0.2,
   zoom_range = 0.2,
   horizontal_flip = True)
test_datagen = ImageDataGenerator(rescale = 1./255)
training_set = train_datagen.flow_from_directory('dataset/training_set',
   target_size = (64, 64),
   batch_size = 32,
   class_mode = 'binary')
test_set = test_datagen.flow_from_directory('dataset/test_set',
   target_size = (64, 64),
   batch_size = 32,
   class_mode = 'binary')

Found 1589 images belonging to 2 classes.
Found 372 images belonging to 2 classes.
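
The "2 classes" in that output come from the sub-folder names. Keras sorts the folder names alphanumerically and numbers them from 0, which the sketch below imitates. The helper name is hypothetical, and the folder names ‘cats’ and ‘dogs’ are an assumption about how this dataset is laid out.

```python
def class_indices_from_folders(folder_names):
    """Assign 0, 1, 2, ... to folder names in sorted order."""
    return {name: index for index, name in enumerate(sorted(folder_names))}


# Assumed sub-folders of dataset/training_set
mapping = class_indices_from_folders(['dogs', 'cats'])
print(mapping)  # → {'cats': 0, 'dogs': 1}
```

In the notebook itself, `training_set.class_indices` holds the real mapping, so print it rather than relying on this assumption.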


Now let's fit the data to our model. In other words, train the model using the training data.

‘steps_per_epoch’ is the number of batches drawn from the training generator in each epoch. With 1589 training images and a batch size of 32, about 50 steps cover the training set once; a larger value, such as the 500 used below, cycles through the randomly augmented images several times per epoch.

An epoch is one round of training. Conventionally, one epoch is finished when the network has been trained on every training sample once; with a generator, Keras ends an epoch after steps_per_epoch batches. Training should consist of more than one epoch. The code below runs 5 epochs (reduced from 25).
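
If you would rather have each epoch pass over the data exactly once, the arithmetic is a one-liner. This sketch assumes that steps count batches (not images) and uses the image counts reported by flow_from_directory above; the helper name is made up.

```python
import math


def steps_for_one_pass(num_images, batch_size):
    """Number of batches needed to see every image once (last batch may be smaller)."""
    return math.ceil(num_images / batch_size)


train_steps = steps_for_one_pass(1589, 32)
validation_steps = steps_for_one_pass(372, 32)
print(train_steps, validation_steps)  # → 50 12
```

These values could be passed as `steps_per_epoch` and `validation_steps` in place of the hand-picked numbers.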

In [18]:
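# Note: fit_generator is deprecated in newer Keras/TensorFlow releases, where fit() accepts generators directly
classifier.fit_generator(training_set,
   steps_per_epoch = 500, # batches per epoch (was 8000)
   epochs = 5, # was 25
   validation_data = test_set,
   validation_steps = 200) # validation batches per epoch (was 2000)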

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x7efec665c668>

# Make predictions using our trained model

Find a test image and try it out!

Upload it into the directory **single_prediction**

In [0]:
!mkdir -p single_prediction

In [20]:
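import numpy as np
from keras.preprocessing import image
# Load the image at the same size the network was trained on
test_image = image.load_img('single_prediction/cat.2.jpg', target_size = (64, 64))
test_image = image.img_to_array(test_image)
# Apply the same 1/255 rescaling that the training generator used
test_image = test_image / 255.
# Add a batch dimension: (64, 64, 3) -> (1, 64, 64, 3)
test_image = np.expand_dims(test_image, axis = 0)
result = classifier.predict(test_image)
# Show which class name maps to 0 and which to 1
print(training_set.class_indices)
# The sigmoid output is a probability, so threshold it instead of testing for exactly 1
if result[0][0] > 0.5:
  prediction = 'dog'
else:
  prediction = 'cat'
print('result = ', prediction)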

FileNotFoundError: ignored