In [1]:
#importing libraries

A convolutional neural network (CNN) is a specialized type of neural network model designed for working with two-dimensional image data, although it can also be used with one-dimensional and three-dimensional data.

A convolution is a linear operation that involves the multiplication of a set of weights with the input, much like a traditional neural network. Given that the technique was designed for two-dimensional input, the multiplication is performed between an array of input data and a two-dimensional array of weights, called a filter or a kernel.

The filter is smaller than the input data and the type of multiplication applied between a filter-sized patch of the input and the filter is a dot product.
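As a minimal sketch of that dot product, the snippet below multiplies one filter-sized patch by a filter element by element and sums the products. The pixel values, the filter weights, and the helper name `patch_dot` are made up for illustration.

```python
# Hypothetical 3x3 patch of pixel values and a 3x3 filter (illustrative weights).
patch = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
kernel = [
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
]


def patch_dot(patch, kernel):
    """Multiply a patch by a same-sized filter element-wise and sum the products."""
    return sum(
        p * k
        for patch_row, kernel_row in zip(patch, kernel)
        for p, k in zip(patch_row, kernel_row)
    )


response = patch_dot(patch, kernel)
print(response)  # a single number for the whole patch
```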

Using a filter smaller than the input is intentional as it allows the same filter (set of weights) to be multiplied by the input array multiple times at different points on the input. Specifically, the filter is applied systematically to each overlapping part or filter-sized patch of the input data, left to right, top to bottom.

This systematic application of the same filter across an image is a powerful idea. If the filter is designed to detect a specific type of feature in the input, then applying that filter systematically across the entire input image gives it an opportunity to discover that feature anywhere in the image. This capability is commonly referred to as translation invariance, i.e. the general interest is in whether the feature is present rather than where it is present.
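The idea can be illustrated in one dimension, which keeps the numbers easy to follow. In this sketch the same filter is slid over two signals that contain the same pattern at different positions. The signals, the filter, and the helper name `slide_1d` are assumptions chosen for illustration.

```python
def slide_1d(signal, kernel):
    """Apply the same kernel at every position of a 1-D signal (stride 1, no padding)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


kernel = [1, 2, 1]                  # hypothetical "bump" detector
early = [0, 1, 2, 1, 0, 0, 0, 0]    # bump near the start
late = [0, 0, 0, 0, 1, 2, 1, 0]     # the same bump, shifted right by 3

early_map = slide_1d(early, kernel)
late_map = slide_1d(late, kernel)

# Strongest response and where it occurs, for each signal.
print(max(early_map), early_map.index(max(early_map)))
print(max(late_map), late_map.index(max(late_map)))
```

The peak response is the same for both signals and only its position moves with the pattern, so taking the maximum over the output answers "is the feature present?" independently of where it is.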

The output from multiplying the filter with the input array one time is a single value. As the filter is applied multiple times to the input array, the result is a two-dimensional array of output values that represent a filtering of the input. As such, the two-dimensional output array from this operation is called a “feature map”.
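A rough sketch of how a feature map is built follows: the filter is moved left to right, top to bottom, and each position contributes one output value. It assumes stride 1 and no padding. The image, the vertical-edge filter, and the function name `feature_map` are illustrative, and, as in most deep learning libraries, the filter is not flipped before it is applied.

```python
def feature_map(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and collect the dot products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Hypothetical 4x5 image: dark (0) on the left, bright (1) on the right.
image = [
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
]
# Hypothetical filter that responds to dark-to-bright vertical edges.
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

fmap = feature_map(image, kernel)
for row in fmap:
    print(row)
```

A 4x5 input and a 3x3 filter give a 2x3 feature map, with large values where the patch straddles the edge and zero where the patch is uniform.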

Convolutional neural networks do not learn a single filter; they learn multiple filters, and therefore multiple features, in parallel for a given input.

For example, it is common for a convolutional layer to learn from 32 to 512 filters in parallel for a given input.

A filter must always have the same number of channels as the input, often referred to as its “depth”. If an input image has 3 channels (i.e. a depth of 3), then a filter applied to that image must also have 3 channels. In this case, a 3×3 filter would in fact be 3×3×3, or [3, 3, 3] for rows, columns, and depth. Regardless of the depth of the input and of the filter, the filter is applied to the input using a dot product operation, which results in a single value.
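To make the depth rule concrete, the sketch below takes the dot product of a 3×3×3 patch with a 3×3×3 filter and then counts the trainable parameters of a layer with 32 such filters. The helper names and constant values are hypothetical, and the count assumes one bias per filter.

```python
def conv_patch_response(patch, kernel):
    """Dot product of a (rows, cols, depth) patch with a filter of the same shape."""
    return sum(
        patch[i][j][d] * kernel[i][j][d]
        for i in range(len(kernel))
        for j in range(len(kernel[0]))
        for d in range(len(kernel[0][0]))
    )


def conv_layer_params(kernel_size, in_channels, n_filters):
    """Weights plus one bias per filter for a square-kernel convolutional layer."""
    return n_filters * (kernel_size * kernel_size * in_channels + 1)


# Hypothetical 3x3x3 patch and filter filled with constants for easy checking.
patch = [[[1, 2, 3] for _ in range(3)] for _ in range(3)]
kernel = [[[1, 1, 1] for _ in range(3)] for _ in range(3)]

single_value = conv_patch_response(patch, kernel)
n_params = conv_layer_params(kernel_size=3, in_channels=3, n_filters=32)
print(single_value)  # one number, even though 27 weights were involved
print(n_params)
```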

In [1]:
import tensorflow as tf
# Import from tensorflow.keras so that preprocessing and model code use the same Keras implementation
from tensorflow.keras.preprocessing.image import ImageDataGenerator


# Part-1 Data preprocessing


Preprocessing the training set

In [4]:
train_datagen = ImageDataGenerator(
        rescale=1./255,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True)

training_set = train_datagen.flow_from_directory(
        'C:\\Users\\BizAct-110\\OneDrive\\MachineLearning\\Section 40 - Convolutional Neural Networks (CNN)\\dataset\\training_set',
        target_size=(64, 64),
        batch_size=32,
        class_mode='binary')
# shear_range applies a random shear: one part of the image is shifted in one direction and the other part in the opposite direction

Found 8000 images belonging to 2 classes.
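Two of the generator settings are easy to picture on a tiny example. The sketch below applies a `1/255` rescale and a horizontal flip to a made-up 2×3 grayscale image. The helper names are hypothetical, and this is not how `ImageDataGenerator` is implemented internally; the real generator applies the flip at random per image.

```python
def rescale(pixels, factor=1.0 / 255):
    """Multiply every pixel by `factor`, mapping the 0..255 range to roughly 0..1."""
    return [[value * factor for value in row] for row in pixels]


def horizontal_flip(pixels):
    """Mirror each row left to right."""
    return [list(reversed(row)) for row in pixels]


image = [[0, 51, 255], [102, 204, 153]]  # hypothetical 2x3 grayscale image

scaled = rescale(image)
flipped = horizontal_flip(image)
print(scaled)
print(flipped)
```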


Preprocessing the test set

In [5]:
test_datagen = ImageDataGenerator(rescale=1./255)
test_set = test_datagen.flow_from_directory(
        'C:\\Users\\BizAct-110\\OneDrive\\MachineLearning\\Section 40 - Convolutional Neural Networks (CNN)\\dataset\\test_set',
        target_size=(64, 64),
        batch_size=32,
        class_mode='binary')

Found 2000 images belonging to 2 classes.
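With a batch size of 32, the number of batches per pass over each set follows from a ceiling division. The short sketch below (the function name `steps_per_epoch` is an illustrative choice) reproduces the step counts that the training log reports later in this notebook.

```python
import math


def steps_per_epoch(n_images, batch_size):
    """Number of batches needed to see every image once; the last batch may be smaller."""
    return math.ceil(n_images / batch_size)


train_steps = steps_per_epoch(8000, 32)
validation_steps = steps_per_epoch(2000, 32)
print(train_steps, validation_steps)
```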


# Part-2 Building the CNN

Initializing the CNN

In [7]:
cnn=tf.keras.models.Sequential()

Step-1 Convolution

In [8]:
cnn.add(tf.keras.layers.Conv2D(filters=32,kernel_size=3,activation='relu',input_shape=[64,64,3]))
#activation='relu' applies the rectified linear unit (ReLU)
#TF and Keras expect image dimensions as (height, width, channels) by default, with 3 channels for RGB images and 1 for greyscale images

Step-2 Pooling

In [9]:
cnn.add(tf.keras.layers.MaxPool2D(pool_size=2,strides=2))
#The pooling layer reduces the resolution of the feature map while retaining the features needed for
#classification, which adds some tolerance to small translations and rotations of the input
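As a minimal sketch of what 2×2 max pooling with stride 2 does, the function below keeps only the largest value in each non-overlapping 2×2 block. The feature map values and the name `max_pool_2x2` are made up for illustration; a trailing odd row or column is dropped, which is assumed to mirror the default behaviour without padding.

```python
def max_pool_2x2(fmap):
    """2x2 max pooling with stride 2; a trailing odd row or column is dropped."""
    return [
        [
            max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
            for c in range(0, len(fmap[0]) - 1, 2)
        ]
        for r in range(0, len(fmap) - 1, 2)
    ]


# Hypothetical 4x4 feature map.
fmap = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
]

pooled = max_pool_2x2(fmap)
print(pooled)
```

Each spatial dimension is halved, and the strongest response in each block survives even if it moves by one pixel inside that block.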

Adding a 2nd convolution layer

In [10]:
cnn.add(tf.keras.layers.Conv2D(filters=32,kernel_size=3,activation='relu'))
cnn.add(tf.keras.layers.MaxPool2D(pool_size=2,strides=2))

Step-3 Flattening

In [12]:
cnn.add(tf.keras.layers.Flatten())
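It can help to trace the tensor size through the layers added so far. The arithmetic below assumes the Keras defaults for anything the cells do not set explicitly (no padding and a convolution stride of 1); the helper names are illustrative.

```python
def conv_out(size, kernel_size):
    """Output size of a convolution with stride 1 and no padding."""
    return size - kernel_size + 1


def pool_out(size, pool_size=2, stride=2):
    """Output size of a pooling layer with no padding."""
    return (size - pool_size) // stride + 1


size = 64                    # input images are 64x64
size = conv_out(size, 3)     # first Conv2D, 3x3 kernel
after_conv1 = size
size = pool_out(size)        # first MaxPool2D
after_pool1 = size
size = conv_out(size, 3)     # second Conv2D
after_conv2 = size
size = pool_out(size)        # second MaxPool2D
after_pool2 = size

flattened = after_pool2 * after_pool2 * 32   # 32 feature maps from the last Conv2D
print(after_conv1, after_pool1, after_conv2, after_pool2, flattened)
```

Under these assumptions the flattened vector passed to the fully connected layer has 14 × 14 × 32 values.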

Step-4 Full connection

In [13]:
cnn.add(tf.keras.layers.Dense(units=128,activation='relu'))
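A fully connected layer computes, for each unit, a weighted sum of all its inputs plus a bias, followed by the activation. The sketch below shows this on a toy layer with 3 inputs and 2 units; the weights, biases, and function names are hypothetical and much smaller than the 128-unit layer in the cell.

```python
def relu(x):
    """Rectified linear unit: negative values become 0."""
    return max(0.0, x)


def dense(inputs, weights, biases, activation):
    """One fully connected layer: each unit outputs activation(w . x + b)."""
    return [
        activation(sum(w * x for w, x in zip(unit_weights, inputs)) + b)
        for unit_weights, b in zip(weights, biases)
    ]


# Hypothetical flattened input of length 3 feeding 2 units.
inputs = [1.0, 2.0, 3.0]
weights = [[0.5, -1.0, 0.5], [1.0, 1.0, -2.0]]
biases = [0.5, 0.0]

outputs = dense(inputs, weights, biases, relu)
print(outputs)
```

The second unit has a negative weighted sum, so ReLU sets its output to zero.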

Step-5 Output layer

In [14]:
cnn.add(tf.keras.layers.Dense(units=1,activation='sigmoid'))
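The single sigmoid unit squashes its weighted sum into the range 0 to 1, so the output can be read as the probability of the class labelled 1. A small sketch, with illustrative input values:

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


for z in (-4.0, 0.0, 4.0):   # hypothetical pre-activation values
    print(z, round(sigmoid(z), 3))
```

Large negative inputs approach 0, large positive inputs approach 1, and an input of 0 gives exactly 0.5, which is the usual decision threshold for a binary classifier.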

# Part-3 Training the CNN

Compiling the CNN

In [15]:
cnn.compile(optimizer='adam',loss='binary_crossentropy',metrics=['accuracy'])
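For intuition about the loss and metric named in `compile`, the sketch below computes binary cross-entropy and accuracy by hand on four made-up predictions. The clipping constant `eps` and the 0.5 threshold are illustrative choices, not values read from the library.

```python
import math


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean of -[t*log(p) + (1-t)*log(1-p)] over all samples."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)   # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)


def accuracy(y_true, y_pred, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if (p > threshold) == (t == 1))
    return hits / len(y_true)


y_true = [1, 0, 1, 0]            # hypothetical labels
y_pred = [0.9, 0.2, 0.6, 0.7]    # hypothetical sigmoid outputs

loss = binary_crossentropy(y_true, y_pred)
acc = accuracy(y_true, y_pred)
print(round(loss, 4), acc)
```

The confident wrong prediction (0.7 for a label of 0) contributes most of the loss, which is what pushes the optimizer to correct it.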

Training the CNN on the training set and evaluating it on the test set

In [16]:
cnn.fit(x=training_set,validation_data=test_set,epochs=25)

Train for 250 steps, validate for 63 steps
Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25


<tensorflow.python.keras.callbacks.History at 0x2e0ac192208>

# Part-4 Making a single prediction

In [17]:
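import numpy as np
from tensorflow.keras.preprocessing import image
#Load the image at the same 64x64 size used for training
test_image=image.load_img('C:\\Users\\BizAct-110\\OneDrive\\MachineLearning\\Section 40 - Convolutional Neural Networks (CNN)\\dataset\\single_prediction\\cat_or_dog_1.jpg',target_size=(64,64))
#Convert to an array and apply the same 1/255 rescaling as the training generator
test_image=image.img_to_array(test_image)/255.0
#Add a batch dimension: (64, 64, 3) -> (1, 64, 64, 3)
test_image=np.expand_dims(test_image,axis=0)
result=cnn.predict(test_image)
#class_indices maps class names to labels; the code below assumes 'dog' is class 1
training_set.class_indices
#The sigmoid output is a probability, so threshold it at 0.5 instead of testing for exactly 1
if result[0][0]>0.5:
  prediction='dog'
else:
  prediction='cat'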

In [18]:
print(prediction)

dog
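The hard-coded 'dog' and 'cat' strings can also be derived from the generator's label mapping. The sketch below inverts a `class_indices`-style dictionary and thresholds a probability; the mapping `{"cats": 0, "dogs": 1}`, the example probabilities, and the function name are assumptions for illustration, so the real `training_set.class_indices` should be checked.

```python
def label_from_probability(probability, class_indices, threshold=0.5):
    """Map a sigmoid output to a class name using a name -> index dictionary."""
    index_to_label = {index: label for label, index in class_indices.items()}
    predicted_index = 1 if probability > threshold else 0
    return index_to_label[predicted_index]


# Assumed mapping; the actual one depends on the subfolder names in the dataset.
class_indices = {"cats": 0, "dogs": 1}

print(label_from_probability(0.93, class_indices))
print(label_from_probability(0.12, class_indices))
```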
