Binary image classification - cat or dog from RGB images

- imported Sequential from keras.models, to initialise our neural network model as a sequential network. There are two basic ways of initialising a neural network: as a sequence of layers or as a graph.
- imported Conv2D from keras.layers, because we are working on images, which are 2-dimensional arrays (per channel). 3-D convolution is for videos, where the third dimension is time.
- imported MaxPooling2D from keras.layers. There are different types of pooling operations, such as min pooling and mean pooling. In max pooling we keep the maximum pixel value from each region of interest.
- imported Flatten from keras.layers. Flattening is the process of converting all the resulting 2-dimensional arrays into a single long continuous linear vector.
- imported Dense from keras.layers, which is used to perform the full connection of the neural network.
- imported ImageDataGenerator from keras.preprocessing.image, which is used for image preprocessing and augmentation such as flipping, rotating, zooming and rescaling.

In [1]:
# Importing the Keras libraries and packages
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense
from keras.preprocessing.image import ImageDataGenerator

Using TensorFlow backend.


Create an object of the Sequential class.

In [2]:
classifier = Sequential()

Convolution layer added to the sequential model:
- 32 : number of filters
- (3, 3) : shape of each filter
- (64, 64, 3) : input shape, i.e. 64x64 resolution with 3 channels (RGB)
- activation function : relu, the rectifier function, which lets only positive values pass through. Negative values are mapped to zero.

In [3]:
classifier.add(Conv2D(32, (3, 3), input_shape = (64, 64, 3), activation = 'relu'))
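
To make the parameters above concrete, here is a minimal plain-Python sketch of the rectifier and of the shape arithmetic for this layer. The helper names are hypothetical and not part of Keras; the shape formula assumes Keras' default stride of 1 and 'valid' (no) padding, which is what the Conv2D call above uses since it sets neither.

```python
def relu(x):
    """Rectifier: pass positive values through, map negatives to zero."""
    return max(0.0, x)


def conv2d_output_shape(height, width, kernel, filters):
    """Output shape of a stride-1, unpadded ('valid') convolution."""
    return (height - kernel + 1, width - kernel + 1, filters)


def conv2d_param_count(kernel, in_channels, filters):
    """Each filter has kernel*kernel*in_channels weights plus one bias."""
    return (kernel * kernel * in_channels + 1) * filters


print([relu(v) for v in (-2.0, 0.0, 3.5)])                      # [0.0, 0.0, 3.5]
print(conv2d_output_shape(64, 64, kernel=3, filters=32))        # (62, 62, 32)
print(conv2d_param_count(kernel=3, in_channels=3, filters=32))  # 896
```

So under these assumptions the 64x64x3 input becomes 32 feature maps of 62x62, and the layer has 896 trainable parameters.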

Perform a pooling operation on the feature maps we get after the convolution operation is done on an image.
Here we are trying to reduce the total number of nodes for the upcoming layers.
With a 2x2 pooling window we lose relatively little pixel information and still keep a fairly precise idea of where each feature is located.

In [4]:
classifier.add(MaxPooling2D(pool_size = (2, 2)))
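
As an illustration of what the 2x2 max pooling does, the following sketch pools a small hand-made 4x4 feature map in plain Python. The function name and the sample values are made up for this example; it assumes a stride of 2, which is the Keras default when only pool_size = (2, 2) is given.

```python
def max_pool_2x2(feature_map):
    """Apply 2x2 max pooling with stride 2 to a 2-D list of numbers."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - 1, 2):
        pooled_row = []
        for c in range(0, cols - 1, 2):
            # Collect the four values in the current 2x2 window
            window = (feature_map[r][c], feature_map[r][c + 1],
                      feature_map[r + 1][c], feature_map[r + 1][c + 1])
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


feature_map = [
    [1, 3, 2, 0],
    [5, 6, 1, 2],
    [0, 2, 9, 4],
    [1, 1, 3, 7],
]
print(max_pool_2x2(feature_map))  # [[6, 2], [2, 9]]
```

Each 2x2 window is replaced by its largest value, so height and width are halved. Applied to 62x62 feature maps this would give 31x31 maps.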

Flattening takes the pooled feature maps (a stack of 2-D arrays of pooled image pixels) and converts them into a single one-dimensional vector.

In [5]:
classifier.add(Flatten())
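
A minimal sketch of flattening, using a tiny 2x2 image with 2 channels stored as nested lists (height x width x channels). The helper and the sample data are hypothetical. The final length assumes the pooled output is 31x31x32, which follows from a 64x64 input, an unpadded 3x3 convolution with 32 filters, and 2x2 pooling.

```python
def flatten(feature_maps):
    """Concatenate nested lists (height x width x channels) into one flat list."""
    return [value for row in feature_maps for pixel in row for value in pixel]


tiny = [[[1, 2], [3, 4]],
        [[5, 6], [7, 8]]]
print(flatten(tiny))  # [1, 2, 3, 4, 5, 6, 7, 8]

# Length of the flattened vector for 32 pooled feature maps of 31x31
flattened_length = 31 * 31 * 32
print(flattened_length)  # 30752
```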

Fully connected layer:
As this layer sits between the input layer and the output layer, we can refer to it as a hidden layer.
- Dense : function to add a fully connected layer
- units : the number of nodes in this hidden layer. A common rule of thumb is to choose a value between the number of input nodes and the number of output nodes; the optimal number can only be found by experiment. Powers of 2 are a common choice.
- activation function : rectifier function (relu).

In [6]:
classifier.add(Dense(units = 128, activation = 'relu'))
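
What a fully connected layer computes can be sketched in a few lines of plain Python: every output node takes a weighted sum of all inputs, adds a bias, and applies the rectifier. The function name, weights and inputs below are invented for illustration. The parameter count assumes a flattened input of length 30752 (31 x 31 x 32).

```python
def dense_relu(inputs, weights, biases):
    """One fully connected layer: relu(W.x + b), one output per weight row."""
    outputs = []
    for row, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(row, inputs)) + bias
        outputs.append(max(0.0, total))
    return outputs


inputs = [1.0, 2.0, 3.0]
weights = [[0.5, -1.0, 0.0],   # weights of output node 1
           [1.0, 1.0, 1.0]]    # weights of output node 2
biases = [0.0, -1.0]
print(dense_relu(inputs, weights, biases))  # [0.0, 5.0]

# Weights plus biases for 128 units fed by 30752 inputs
dense_params = 30752 * 128 + 128
print(dense_params)  # 3936384
```

Under that assumption, almost all of the trainable parameters of this small network sit in this one layer.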

Output layer:
It should contain only one node, as this is binary classification. The node uses a sigmoid activation function, so it outputs a value between 0 and 1 that we interpret as either a cat or a dog.

In [7]:
classifier.add(Dense(units = 1, activation = 'sigmoid'))
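
The sigmoid squashes any real number into the range 0 to 1, which is why a single node is enough for two classes. The sketch below shows the function and one common way to turn its output into a class index. The 0.5 threshold and the helper names are assumptions made for this example, not something the notebook sets.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def to_label(probability, threshold=0.5):
    """Turn a probability into a class index using a fixed threshold."""
    return 1 if probability >= threshold else 0


for z in (-2.0, 0.0, 2.0):
    p = sigmoid(z)
    print(z, round(p, 3), to_label(p))
```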

Having completed building our CNN model, it's time to compile it.
- optimizer parameter : to choose the stochastic gradient descent algorithm (here adam).
- loss parameter : to choose the loss function (here binary_crossentropy).
- metrics parameter : to choose the performance metric (here accuracy).

In [8]:
classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
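
To give a feel for the loss and metric chosen above, here is a rough plain-Python version of binary cross-entropy and accuracy, evaluated on four made-up predictions. The clipping constant eps and the 0.5 threshold are assumptions of this sketch; Keras' own implementation may differ in such details.

```python
import math


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean of -[t*log(p) + (1-t)*log(1-p)] over all samples."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)


def accuracy(y_true, y_pred, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if (p >= threshold) == bool(t))
    return correct / len(y_true)


y_true = [1, 0, 1, 0]           # hypothetical labels (1 = dog, 0 = cat)
y_pred = [0.9, 0.2, 0.4, 0.6]   # hypothetical sigmoid outputs
print(round(binary_crossentropy(y_true, y_pred), 4))
print(accuracy(y_true, y_pred))  # 0.5
```

The loss penalises confident wrong answers more heavily than hesitant ones, while accuracy only counts which side of the threshold each prediction falls on.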

Image preprocessing and augmentation, so that the model doesn't overfit.
- rescaling, shear, zoom, horizontal flip: https://keras.io/preprocessing/image/

In [None]:
# Training images are rescaled to [0, 1] and randomly sheared, zoomed and flipped
train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)
# Test images are only rescaled, never augmented
test_datagen = ImageDataGenerator(rescale = 1./255)
training_set = train_datagen.flow_from_directory('dataset/training_set_cd/',
                                                 target_size = (64, 64),
                                                 batch_size = 32,
                                                 class_mode = 'binary')
test_set = test_datagen.flow_from_directory('dataset/test_set_cd/',
                                            target_size = (64, 64),
                                            batch_size = 32,
                                            class_mode = 'binary')

Found 8005 images belonging to 2 classes.
Found 2023 images belonging to 2 classes.
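
Two of the transformations configured above are simple enough to sketch by hand. The snippet below rescales and horizontally flips a tiny made-up greyscale image stored as nested lists; the function names and pixel values are hypothetical, and ImageDataGenerator applies the flip randomly rather than always.

```python
def rescale(image, factor=1.0 / 255):
    """Multiply every pixel by a constant, e.g. to map 0..255 into 0..1."""
    return [[pixel * factor for pixel in row] for row in image]


def horizontal_flip(image):
    """Mirror the image left to right by reversing each row."""
    return [row[::-1] for row in image]


image = [[0, 51, 255],
         [102, 204, 153]]

print(horizontal_flip(image))  # [[255, 51, 0], [153, 204, 102]]
print(rescale(image))
```

A flipped cat is still a cat, so each augmented copy acts as an extra training example at no labelling cost.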


Let's fit the data to our model!
- steps_per_epoch : the number of batches (not images) drawn from the generator in one epoch. With 8005 training images and batch_size = 32, one pass over the training set takes about 251 steps (8005 / 32, rounded up). The value 8000 used here therefore makes each epoch cover the training set roughly 32 times, which makes an epoch correspondingly slow.
- epochs : one epoch is one complete pass of the network over the training samples. Training normally consists of more than one epoch; in this case we have defined 2 epochs.

In [None]:
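# steps_per_epoch and validation_steps count batches of 32, not images;
# 8000 steps is roughly 32 passes over the 8005 training images per epoch.
classifier.fit_generator(training_set,
                         steps_per_epoch = 8000,
                         epochs = 2,
                         validation_data = test_set,
                         validation_steps = 2000)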

Epoch 1/2
 726/8000 [=>............................] - ETA: 56:23 - loss: 0.6093 - acc: 0.6713
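
Because steps count batches rather than images, the number of steps needed for exactly one pass over the data can be worked out from the image counts reported by flow_from_directory and the batch size of 32. The helper name below is hypothetical. In Keras, len(training_set) should give the same batch count for a directory iterator.

```python
import math


def steps_for_one_pass(num_images, batch_size):
    """Number of batches needed to see every image once (last batch may be partial)."""
    return math.ceil(num_images / batch_size)


train_steps = steps_for_one_pass(8005, 32)
val_steps = steps_for_one_pass(2023, 32)
print(train_steps, val_steps)  # 251 64

# How many passes over the training set 8000 steps of 32 images amount to
passes = 8000 * 32 / 8005
print(round(passes, 1))  # 32.0
```

Using 251 and 64 instead of 8000 and 2000 would make one Keras epoch correspond to one pass over the data, at a fraction of the training time per epoch.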

Making new predictions from our trained model:
 - We have the test image. We prepare it for the model by converting its resolution to 64x64, as the model only accepts that resolution.
 - We use the predict() method on our classifier object to get the prediction. The prediction is the sigmoid output, a value between 0 and 1: values near 1 represent a dog and values near 0 a cat (training_set.class_indices shows the mapping).

In [None]:
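import numpy as np
from keras.preprocessing import image

# Load the image at the 64x64 resolution the network was trained on
test_image = image.load_img('dataset/single_prediction/cat_or_dog_1.jpg', target_size = (64, 64))
test_image = image.img_to_array(test_image)
# Apply the same 1/255 rescaling used by the training generator
test_image = test_image / 255.
# Add a batch dimension: (64, 64, 3) -> (1, 64, 64, 3)
test_image = np.expand_dims(test_image, axis = 0)
result = classifier.predict(test_image)
# Class-name-to-index mapping inferred from the training folder names
print(training_set.class_indices)
# predict() returns a sigmoid probability, so threshold it rather than testing == 1
if result[0][0] >= 0.5:
    prediction = 'dog'
else:
    prediction = 'cat'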
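
Rather than hard-coding 'dog' and 'cat', the class name can be looked up from the class_indices dictionary. The sketch below assumes the mapping is {'cats': 0, 'dogs': 1}; the actual keys depend on the subfolder names in the training directory, so both the dictionary and the helper name here are hypothetical.

```python
def label_from_probability(probability, class_indices, threshold=0.5):
    """Map a sigmoid output to a class name via an index lookup."""
    # Invert the name -> index mapping into index -> name
    index_to_class = {index: name for name, index in class_indices.items()}
    predicted_index = 1 if probability >= threshold else 0
    return index_to_class[predicted_index]


class_indices = {'cats': 0, 'dogs': 1}  # assumed; print training_set.class_indices to check

print(label_from_probability(0.83, class_indices))  # dogs
print(label_from_probability(0.12, class_indices))  # cats
```

In the notebook this would be called with result[0][0] and training_set.class_indices.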