# Simple Image Classification using CNN - Deep Learning
In this article, I will solve an image classification problem: given an input image, tell which class it belongs to. I will do this by training an artificial neural network (NN) on thousands of images of cats and dogs, so that it learns to predict which of the two is in an image it has not seen before.

The dataset: https://www.microsoft.com/en-US/download/details.aspx?id=54765

Please note that the dataset has to contain two folders, one for training and one for testing, and each of them has to contain two sub-folders, one for cats and one for dogs.
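
Before training, it can help to confirm that the folders really have this layout. The snippet below is only a minimal sketch: `count_images_per_class` is a helper name I made up for illustration, and the path in the usage comment assumes the `dataset_img/training_set` folder used later in this article.

```python
import os

def count_images_per_class(split_dir, extensions=(".jpg", ".jpeg", ".png")):
    """Return {class_folder_name: number_of_image_files} for one split folder."""
    counts = {}
    for entry in sorted(os.listdir(split_dir)):
        class_dir = os.path.join(split_dir, entry)
        if not os.path.isdir(class_dir):
            continue  # skip stray files sitting next to the class folders
        counts[entry] = sum(
            1 for name in os.listdir(class_dir)
            if name.lower().endswith(extensions)
        )
    return counts

# Example usage (assumed path, adjust to your own dataset):
# print(count_images_per_class("dataset_img/training_set"))
```

If the result does not show exactly two class folders with a sensible number of images in each, the folder structure needs fixing before going any further.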

The process of building a Convolutional Neural Network typically involves four major steps:
- Convolution
- Pooling
- Flattening
- Full connection

I will be going through each of the above operations while coding our neural network.

### Importing the Keras libraries and packages

In [96]:
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense

import warnings
warnings.filterwarnings('ignore')

Let us now see what each of the above packages is imported for:

- We’ve imported Sequential from keras.models to initialise our neural network model as a sequential network.

- We’ve imported Conv2D from keras.layers to perform the convolution operation, i.e. the first step of a CNN, on the training images. Since we are working with images, which are basically 2-dimensional arrays, we use Convolution 2-D.

- We’ve imported MaxPooling2D from keras.layers for the pooling operation, which is step 2 in the process of building a CNN. Max pooling keeps the maximum pixel value from each region of interest.

- We’ve imported Flatten from keras.layers for flattening, the process of converting all the resultant 2-dimensional arrays into a single long continuous linear vector.

- We’ve imported Dense from keras.layers to perform the full connection of the neural network, which is step 4 in the process of building a CNN.

### Building the layers

In [97]:
classifier = Sequential() # creating an object of Sequential model

In [98]:
# Coding the Convolution step:
classifier.add(Conv2D(32, (3, 3), input_shape=(64, 64, 3), activation='relu'))

The Conv2D function takes four arguments here. The first is the number of filters, i.e. 32. The second is the shape of each filter, i.e. 3x3. The third is the input shape: each input image is 64x64 pixels, and the “3” is the number of channels, which means an RGB colour image (a black and white image would have 1). The fourth is the activation function we want to use; here ‘relu’ stands for the rectifier function.
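
To get a feel for what this layer produces, the arithmetic can be sketched in plain Python. This assumes the Keras defaults for the arguments we did not set (no padding and a stride of 1); the two helper names are mine, not part of Keras.

```python
def conv2d_output_shape(height, width, kernel, filters, stride=1):
    """Output shape of a 2-D convolution without padding."""
    out_h = (height - kernel) // stride + 1
    out_w = (width - kernel) // stride + 1
    return out_h, out_w, filters

def conv2d_param_count(kernel, in_channels, filters):
    """Each filter has kernel*kernel*in_channels weights plus one bias."""
    return (kernel * kernel * in_channels + 1) * filters

print(conv2d_output_shape(64, 64, kernel=3, filters=32))        # (62, 62, 32)
print(conv2d_param_count(kernel=3, in_channels=3, filters=32))  # 896
```

So a 64x64x3 image comes out of the convolution as 32 feature maps of 62x62 each, and the layer has 896 trainable parameters.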

In [99]:
# perform pooling operation on the resultant feature after the convolution operation is done.
classifier.add(MaxPooling2D(pool_size=(2, 2)))

The primary aim of a pooling operation is to reduce the size of the feature maps, and with it the total number of nodes in the upcoming layers. With a 2x2 window we lose very little information and still keep a precise idea of where the features are located. This reduces the complexity of the model with little cost to its performance.
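
Here is a small NumPy sketch of what 2x2 max pooling does to a single feature map. It is an illustration only, not how Keras implements the layer, and `max_pool_2x2` is a name I chose for this example. It assumes the height and width are even.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 on a 2-D array with even height and width."""
    h, w = feature_map.shape
    # Group the pixels into non-overlapping 2x2 blocks, then keep each block's maximum.
    blocks = feature_map.reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

feature_map = np.array([[1, 3, 2, 0],
                        [4, 2, 1, 5],
                        [0, 1, 7, 2],
                        [3, 2, 4, 6]])
pooled = max_pool_2x2(feature_map)
print(pooled)
# [[4 5]
#  [3 7]]
```

Each 2x2 block is replaced by its largest value, so the 4x4 map becomes 2x2. In the same way, a 62x62 feature map becomes 31x31.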

In [100]:
# converting all the pooled images into a continuous vector
classifier.add(Flatten())

What we are doing here is taking the pooled feature maps, which are 2-D arrays, and converting them into a single one-dimensional vector. We do not need to pass any parameters: Keras knows that the “classifier” object already holds the pooled feature maps and that they need to be flattened.
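
As a rough illustration, flattening is just a reshape. The sketch below uses a NumPy array of zeros as a stand-in for the pooled output of one image, assuming it has the shape 31x31x32 (which is what a 64x64 input gives after one 3x3 convolution and one 2x2 pooling).

```python
import numpy as np

# Tiny example first: the values are simply laid out one after another.
small = np.arange(8).reshape(2, 2, 2)
print(small.reshape(-1))  # [0 1 2 3 4 5 6 7]

# Stand-in for the pooled output of one image: 31 x 31 positions, 32 feature maps.
pooled = np.zeros((31, 31, 32))
flat = pooled.reshape(-1)
print(flat.shape)  # (30752,)
```

So the fully connected part of the network receives 30752 input values per image.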

In [101]:
# creating a fully connected layer
classifier.add(Dense(units=128, activation='relu'))

We now connect the set of nodes we got after the flattening step to a fully connected layer; the flattened nodes act as the input to this layer. As this layer sits between the input layer and the output layer, we can refer to it as a hidden layer.

Dense is the function that adds a fully connected layer. ‘units’ is where we define the number of nodes in this hidden layer. As a rule of thumb, this value lies between the number of input nodes and the number of output nodes, but the best number can only be found by experimenting, and it is common practice to use a power of 2. The activation function is again the rectifier function.
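
To make the idea concrete, here is a toy version of a fully connected layer in NumPy. The weights and inputs are made-up numbers chosen only for illustration, and the helper names are mine. The 30752 in the last line assumes the flattened size that a 64x64 input produces with the layers above.

```python
import numpy as np

def dense_relu(inputs, weights, bias):
    """Forward pass of one fully connected layer followed by the rectifier."""
    return np.maximum(0.0, inputs @ weights + bias)

def dense_param_count(n_inputs, units):
    """One weight per (input, unit) pair plus one bias per unit."""
    return n_inputs * units + units

# Toy layer: 3 inputs, 2 units.
inputs = np.array([1.0, 2.0, 0.5])
weights = np.array([[0.2, -0.5],
                    [0.1,  0.3],
                    [-0.4, 0.8]])
bias = np.array([0.1, -1.0])
hidden = dense_relu(inputs, weights, bias)
print(hidden)  # the second unit is negative before the rectifier, so it becomes 0

# Size of the hidden layer in this article: 30752 flattened inputs, 128 units.
print(dense_param_count(30752, 128))  # 3936384
```

Almost all of the parameters of this small CNN sit in this one layer, which is why the choice of ‘units’ matters so much for model size.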

In [102]:
# initialise our output layer
classifier.add(Dense(units=1, activation='sigmoid'))

The output layer should contain only one node, as this is binary classification. With the sigmoid activation, this single node outputs a value between 0 and 1, which we interpret as either a Cat or a Dog.
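
The sigmoid function itself is simple enough to write by hand. This is a minimal sketch in plain Python, just to show how it squashes any number into the range between 0 and 1.

```python
import math

def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

for z in (-4.0, 0.0, 4.0):
    print(z, round(sigmoid(z), 3))
# -4.0 0.018
# 0.0 0.5
# 4.0 0.982
```

Large negative inputs end up close to 0, large positive inputs close to 1, and an input of 0 lands exactly on 0.5.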

In [103]:
# compiling our CNN model
classifier.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

- The optimizer parameter chooses the optimisation algorithm; ‘adam’ is a variant of stochastic gradient descent.
- The loss parameter chooses the loss function; binary cross-entropy is the usual choice for a two-class problem.
- Finally, the metrics parameter chooses the performance metric.
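
To show what the loss and the metric measure, here is a plain Python sketch of binary cross-entropy and accuracy. It follows the textbook formulas and is not the Keras implementation; the small `eps` clipping is my own guard against taking the logarithm of 0.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean of -(t*log(p) + (1-t)*log(1-p)) over all samples."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # keep log() away from 0
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)

def accuracy(y_true, y_pred, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if (p >= threshold) == bool(t))
    return hits / len(y_true)

labels = [1, 0, 1, 0]
predictions = [0.9, 0.2, 0.4, 0.1]
print(round(binary_crossentropy(labels, predictions), 4))
print(accuracy(labels, predictions))  # 0.75
```

The loss punishes confident wrong answers much harder than hesitant ones, while accuracy only counts how many predictions fall on the right side of the threshold.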

### Images pre-processing
We are going to pre-process the images to prevent overfitting. Overfitting is when you get great training accuracy and very poor test accuracy, because the model has memorised the training images instead of learning features that generalise to new ones.

So before we fit our images to the neural network, we perform some image augmentations on them, which basically means synthesising additional training data. The directory’s name is taken as the label of all the images present in it. For example, all the images inside the folder named ‘cats’ will be considered as cats by Keras.

Also, we create synthetic data out of the same images by applying operations such as shearing, zooming and horizontal flipping.

In [104]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)

test_datagen = ImageDataGenerator(rescale = 1./255)

# Forward slashes in the paths work on Windows, Linux and macOS
training_set = train_datagen.flow_from_directory('dataset_img/training_set',
                                                 target_size = (64, 64),
                                                 batch_size = 32,
                                                 class_mode = 'binary')

test_set = test_datagen.flow_from_directory('dataset_img/test_set',
                                            target_size = (64, 64),
                                            batch_size = 32,
                                            class_mode = 'binary')


Found 19999 images belonging to 2 classes.
Found 4999 images belonging to 2 classes.
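
To see what two of these options do to the pixels, here is a small NumPy sketch of rescaling and horizontal flipping. It only imitates the effect of `rescale` and `horizontal_flip`; the function names are mine, the tiny array is a stand-in for a real image, and shear and zoom are left out to keep it short.

```python
import numpy as np

def rescale(image, factor=1.0 / 255):
    """Map pixel values from 0..255 to 0..1, like rescale = 1./255."""
    return image.astype("float32") * factor

def horizontal_flip(image):
    """Mirror a (height, width, channels) image left to right."""
    return image[:, ::-1, :]

# A fake 2x3 RGB image with pixel values 0..17.
image = np.arange(2 * 3 * 3, dtype="uint8").reshape(2, 3, 3)
flipped = horizontal_flip(image)
scaled = rescale(image)

print(image[0, 0], flipped[0, -1])  # the first pixel of a row ends up last
print(scaled.min(), scaled.max())
```

Note that only the training generator augments the images; the test generator only rescales them, so the model is evaluated on unmodified pictures.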


### Fitting the data to our model and making new predictions

In [105]:
import os
from tensorflow.keras.preprocessing.image import load_img

# Define the directory containing your dataset
dataset_directory = 'dataset_img/training_set'  # Change to your dataset directory

# Create a list to store the filenames of problematic images
problematic_images = []

# Iterate through all image files in the directory
for root, dirs, files in os.walk(dataset_directory):
    for file in files:
        image_path = os.path.join(root, file)
        try:
            # Attempt to load the image using Keras
            load_img(image_path)
        except Exception as e:
            # If an error occurs, the image is problematic
            print(f"Problematic Image: {image_path}")
            problematic_images.append(image_path)

# If problematic images were found, you can choose to delete them
if problematic_images:
    print(f"Found {len(problematic_images)} problematic image(s).")
    delete_images = input("Do you want to delete these images? (y/n): ").strip().lower()
    
    if delete_images == 'y':
        for image_path in problematic_images:
            os.remove(image_path)
            print(f"Deleted: {image_path}")
        print("Problematic images deleted.")
    else:
        print("No images were deleted.")
else:
    print("No problematic images found in the dataset.")


No problematic images found in the dataset.


**I had some corrupted images in the dataset so I used the above code to delete these images.**

In [123]:
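# len() of a Keras directory iterator is already the number of batches per pass,
# so it must not be divided by the batch size again
steps_per_epoch = len(training_set)
validation_steps = len(test_set)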

classifier.fit(training_set, 
               steps_per_epoch = steps_per_epoch, 
               epochs=25, 
               validation_data = test_set,
               validation_steps = validation_steps)

Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25


<keras.src.callbacks.History at 0x1f4dc561c50>

steps_per_epoch: This parameter specifies the number of batches of data that the model should process from the training set during each epoch. In other words, it determines how many times the model will update its weights based on the training data within a single epoch. The value of steps_per_epoch is often set based on the total number of training samples and the batch size.

validation_steps: This parameter is similar to steps_per_epoch but is used during the validation or testing phase. It specifies the number of batches of data from the validation set that the model should process during each validation epoch. Like steps_per_epoch, the value of validation_steps is often set based on the total number of validation samples and the batch size.


And finally ‘epochs’: a single epoch is one full pass over the training data. In other words, when the neural network has been trained on every training sample once, we say that one epoch is finished. The training process usually consists of more than one epoch. In this case we have defined 25 epochs.
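
The relationship between samples, batch size and steps can be worked out by hand. This is a minimal sketch using the image counts reported above and the batch size of 32; `steps_for` is a helper name I made up.

```python
import math

def steps_for(num_samples, batch_size):
    """Number of batches needed to see every sample once (last batch may be smaller)."""
    return math.ceil(num_samples / batch_size)

print(steps_for(19999, 32))  # 625
print(steps_for(4999, 32))   # 157
```

So one full pass over the training images takes 625 steps, and one full pass over the test images takes 157 steps.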

In [124]:
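import numpy as np
from keras.preprocessing import image

test_image = image.load_img(r'dataset_img/American_Eskimo_Dog.jpg', target_size = (64, 64))
test_image = image.img_to_array(test_image)
test_image = test_image / 255.0  # apply the same rescaling the generators used
test_image = np.expand_dims(test_image, axis = 0)  # add the batch dimension

result = classifier.predict(test_image)

# Mapping from folder name to label index
training_set.class_indices

# The sigmoid output is a probability, so threshold it instead of testing for exactly 1
if result[0][0] >= 0.5:
    prediction = 'Dog'
else:
    prediction = 'Cat'

print(f'It\'s a {prediction}!')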

It's a Dog!


The test_image variable holds the image that needs to be tested on the CNN. Once we have the test image, we prepare it for the model by resizing it to 64x64, as the model only accepts that resolution. Then we use the predict() method on our classifier object to get the prediction. Because the output layer uses a sigmoid, the prediction is a value between 0 and 1, where values close to 1 represent a dog and values close to 0 represent a cat.
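
Instead of hard-coding which number means which animal, the label can be looked up from the class indices. The sketch below is one possible way to do it; `decode_prediction` is a hypothetical helper, and the dictionary is what I would expect `training_set.class_indices` to contain if the folders are named `cats` and `dogs` (Keras numbers the folders in alphabetical order).

```python
def decode_prediction(probability, class_indices, threshold=0.5):
    """Turn a sigmoid output into the matching folder name."""
    index_to_name = {index: name for name, index in class_indices.items()}
    predicted_index = 1 if probability >= threshold else 0
    return index_to_name[predicted_index]

# Assumed content of training_set.class_indices for folders 'cats' and 'dogs'.
class_indices = {"cats": 0, "dogs": 1}

print(decode_prediction(0.83, class_indices))  # dogs
print(decode_prediction(0.12, class_indices))  # cats
```

With a trained model, the first argument would be `result[0][0]` and the second `training_set.class_indices`.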