# Image Classifier Using Convolutional Neural Network

The goal of this image classifier is to identify class to which an image belongs to. The way I am going to achieve it is by training an artificial neural network on few thousand images of cats and dogs and make the NN(Neural Network) learn to predict which class the image belongs to, next time it sees an image having a cat or dog in it.

This model can be trained on any type of classes like for example a doctor can train Neural Network that can take a brain scan as an input and predict if the scan contains tumor or not.

So coming to the coding part, I am going to use Keras deep learning library in python to build CNN(Convolutional Neural Network).

The process of building a Convolutional Neural Network always involves four major steps:
1. Convolution.
2. Pooling.
3. Flattening.
4. Full connection.

In [1]:
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense

Using TensorFlow backend.


1. I have imported Sequential from keras.models, to initialise my neural network model as a sequential network. There are two basic ways of initialising a neural network, either by a sequence of layers or as a graph.

2. I have imported Conv2D from keras.layers, this is to perform the convolution operation i.e the first step of a CNN, on the training images. Since we are working on images here, which a basically 2 Dimensional arrays, we’re using Convolution 2-D. For videos we can use convolution 3D.

3.  I have imported MaxPooling2D from keras.layers, which is used for pooling operation, that is the step — 2 in the process of building a cnn. For building this particular neural network, we are using a Maxpooling function, there exist different types of pooling operations like Min Pooling, Mean Pooling, etc. Here in MaxPooling we need the maximum value pixel from the respective region of interest.

4. Flatten is used for flattening, it is the process of converting resultant 2D array into a single long continuos linear vector.

5. Dense is used to perform full connection of the neural network.

In [17]:
classifier = Sequential()    #Creating an object of the sequential class below.

In [18]:
classifier.add(Conv2D(32, (3, 3), input_shape = (64, 64, 3), activation = 'relu'))    

Let’s break down the above code function by function. I took the sequential object , then I added a convolution layer by using the “Conv2D” function. The Conv2D function is taking 4 arguments, the first is the number of filters i.e 32 here, the second argument is the shape each filter is going to be i.e 3x3 here, the third is the input shape and the type of image(RGB or Black and White)of each image i.e the input image our CNN is going to be taking is of a 64x64 resolution and “3” stands for RGB, which is a colour img, the fourth argument is the activation function we want to use, here ‘relu’ stands for a rectifier function.

## Pooling

To understand pooling I recommend to use the following link: http://ufldl.stanford.edu/tutorial/supervised/Pooling/

In [19]:
classifier.add(MaxPooling2D(pool_size = (2, 2)))

## Flattening

In [20]:
classifier.add(Flatten())

What I am basically doing here is taking the 2-D array, i.e pooled image pixels and converting them to a one dimensional single vector. 

In [21]:
classifier.add(Dense(units = 128, activation = 'relu'))

In this step we need to create a fully connected layer, and to this layer we are going to connect the set of nodes we got after the flattening step, these nodes will act as an input layer to these fully-connected layers.

‘units’ is where we define the number of nodes that should be present in this hidden layer, these units value will be always between the number of input nodes and the output nodes but the art of choosing the most optimal number of nodes can be achieved only through experimental tries. Though it’s a common practice to use a power of 2. And the activation function will be a rectifier function.

In [22]:
classifier.add(Dense(units = 1, activation = 'sigmoid'))

Now it’s time to initialize our output layer, which should contain only one node, as it is binary classification. This single node will give us a binary output of either a Cat or Dog.

In [23]:
classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])

Optimizer parameter is to choose the stochastic gradient descent algorithm.
Loss parameter is to choose the loss function.
The metrics parameter is to choose the performance metric.

So before we fit our images to the neural network, we need to perform some image augmentations on them, which is basically synthesising the training data. This step is done to prevent overfitting of the model. I am going to do this using keras.preprocessing library for doing the synthesising part as well as to prepare the training set and the test set of images that are present in a properly structured directories, where the directory’s name is take as the label of all the images present in it.

In [24]:
from keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(rescale = 1./255,
shear_range = 0.2,
zoom_range = 0.2,
horizontal_flip = True)
test_datagen = ImageDataGenerator(rescale = 1./255)
training_set = train_datagen.flow_from_directory('training_set',
target_size = (64, 64),
batch_size = 32,
class_mode = 'binary')
test_set = test_datagen.flow_from_directory('test_set',
target_size = (64, 64),
batch_size = 32,
class_mode = 'binary')

Found 8005 images belonging to 2 classes.
Found 2023 images belonging to 2 classes.


## Fiting the model

A single epoch is a single step in training a neural network; in other words when a neural network is trained on every training samples only in one pass we say that one epoch is finished. So training process should consist more than one epochs.In this case we have defined 25 epochs.

In [25]:
classifier.fit_generator(training_set,
steps_per_epoch = 8000,
epochs = 1,
validation_data = test_set,
validation_steps = 2000)

Epoch 1/1


<keras.callbacks.History at 0x237f16cde80>

## Making new predictions from our trained model

The test_image holds the image that needs to be tested on the CNN. Preparing the image to be sent into the model by converting its resolution to 64x64 as the model only excepts that resolution. Then I am using predict() method on classifier object to get the prediction. As the prediction will be in a binary form, we will be receiving either a 1 or 0, which will represent a dog or a cat respectively.

In [51]:
import numpy as np
from keras.preprocessing import image

imageID = 4001
dogCount=0
for i in range(100):
    path = "C:/Image Classifier/test_set-20180213T232259Z-001/test_set/dogs/dog." + str(imageID) + ".jpg"
    imageID+=1
    test_image = image.load_img(path, target_size = (64, 64))
    test_image = image.img_to_array(test_image)
    test_image = np.expand_dims(test_image, axis = 0)
    result = classifier.predict(test_image)
    training_set.class_indices
    if result[0][0] == 1:
        prediction = 'dog'
        dogCount+=1
    else:
        prediction = 'cat'

print(dogCount)    

94


In [54]:
classifier.save("dogs_cats.h5")