
#Structure of an Image Classification Task
* **Image Preprocessing** - This step improves the image data (features) by *suppressing unwanted distortions* and *enhancing important image features*, so that our Computer Vision models have better data to work with.

* **Detection of an object** - Detection refers to the localization of an object, which *means segmenting the image and identifying the position of the object of interest*.
* **Feature extraction and Training** - This is a crucial step in which statistical or deep learning methods identify the most *interesting patterns of the image: features that might be unique to a particular class and that will later help the model differentiate between classes.* The process in which the model learns these features from the dataset is called model training.
* **Classification of the object** - This step categorizes detected objects into predefined classes by using a suitable classification technique that compares the image patterns with the target patterns.

#Need for Image-Preprocessing
Computers can perform computations on numbers but *are unable to interpret images in the way that we do*. We have to **convert the images to numbers for the computer to understand them**.
The aim of pre-processing is to improve the image data by **suppressing unwanted distortions or enhancing image features that are important for further processing**.
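
To make "converting images to numbers" concrete, here is a minimal sketch that treats a tiny grayscale image as a grid of intensities. The pixel values and the helper name `normalize` are made up for illustration, and plain Python lists are used only to keep the sketch dependency-free; in the cells that follow, OpenCV loads real images as NumPy arrays.

```python
# A hypothetical 2x3 grayscale image: each number is one pixel's intensity,
# from 0 (black) to 255 (white).
tiny_image = [
    [0, 128, 255],
    [64, 192, 32],
]

def normalize(image):
    """Scale 0-255 intensities to the 0.0-1.0 range often used for training."""
    return [[pixel / 255 for pixel in row] for row in image]

height, width = len(tiny_image), len(tiny_image[0])
scaled = normalize(tiny_image)
print(height, width)  # image shape as (rows, columns)
print(scaled[0])      # first row after scaling
```

Once an image is in this form, every pre-processing step below is just arithmetic on the grid.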

#Steps for image pre-processing:

* Read image
* Resize image
* Data Augmentation
* Gray scaling of image
* Reflection
* Gaussian Blurring
* Histogram Equalization
* Rotation
* Translation


#Reading Image
In this step, we store the path to our image dataset in a variable and collect each image path together with its label, so that later steps can load the images into arrays the computer can work with.

In [3]:
# importing libraries
from pathlib import Path  # Import the Path class from pathlib for working with file paths
import glob  # Import the glob module for file pattern matching
import pandas as pd  # Import the pandas library for data manipulation

# reading images from path
images_dir = Path('img')  # Set the 'images_dir' variable to the 'img' directory using Path

images = images_dir.glob("*.tif")  # Lazily match all '.tif' files in 'img' (a generator that can be iterated only once)

train_data = []  # Initialize an empty list 'train_data' to store image filenames and labels

counter = 0  # Initialize a counter variable to keep track of the number of processed images

# Iterate through the images in the 'images' generator
for img in images:
    counter += 1  # Increment the counter by 1 for each image processed

    # Check if the counter is less than or equal to 130
    if counter <= 130:
        train_data.append((img, 1))  # Append a tuple (img, 1) to 'train_data' with label 1
    else:
        train_data.append((img, 0))  # Append a tuple (img, 0) to 'train_data' with label 0

# converting data into pandas dataframe for easy visualization
train_data = pd.DataFrame(train_data, columns=['image', 'label'], index=None)  # Convert 'train_data' list to a DataFrame


#Resize image
Images captured by a camera and fed to our AI algorithm vary in size, so we establish a base size for all of them by resizing.

* Sample code for resizing images to 229x229 pixels:

In [7]:
import cv2
import pandas as pd
from pathlib import Path

# Start from a fresh generator and an empty list: the generator from the
# previous cell is exhausted and 'train_data' was turned into a DataFrame
images = Path('img').glob("*.tif")
train_data = []
counter = 0

# Iterate through the images in the 'images' generator
for img_path in images:
    counter += 1  # Increment the counter by 1 for each image processed

    # Read the image using OpenCV (channels are in BGR order)
    img = cv2.imread(str(img_path))

    # Resize every image to the same base size
    img = cv2.resize(img, (229, 229))

    # The first 130 images get label 1, the rest label 0
    if counter <= 130:
        train_data.append((img, 1))
    else:
        train_data.append((img, 0))

# converting data into pandas dataframe for easy visualization
train_data = pd.DataFrame(train_data, columns=['image', 'label'], index=None)  # Convert 'train_data' list to a DataFrame


#Data Augmentation
Data augmentation is a way of creating new 'data' by transforming existing images, for example by changing their orientation. The benefits are two-fold: it generates 'more data' from limited data, and it helps reduce overfitting.

In [8]:
from IPython.display import Image
# URL of the image
image_url = "https://iq.opengenus.org/content/images/2019/07/cat_aug.png"
# Display the image
Image(url=image_url)


#Data Augmentation Techniques:

**Gray Scaling**

The image is *converted to gray scale* *(a range of gray shades from white to black)*, and the *computer assigns each pixel a value based on how dark it is*. All the *numbers are put into an array*, and the computer does its computations on that array.

Sample code to convert a 3-channel color image (OpenCV loads it in BGR order) into a gray scale image:

In [10]:
import cv2
import pandas as pd
from pathlib import Path

# Start from a fresh generator and an empty list so this cell runs on its own
images = Path('img').glob("*.tif")
train_data = []
counter = 0

# Iterate through the images in the 'images' generator
for img_path in images:
    counter += 1  # Increment the counter by 1 for each image processed

    # Read the image using OpenCV (channels are in BGR order)
    img = cv2.imread(str(img_path))
    img = cv2.resize(img, (229, 229))  # Resize every image to the base size

    # Convert the 3-channel BGR image to a single-channel grayscale image
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # The first 130 images get label 1, the rest label 0
    label = 1 if counter <= 130 else 0
    train_data.append((img_gray, label))

# converting data into pandas dataframe for easy visualization
train_data = pd.DataFrame(train_data, columns=['image', 'label'], index=None)  # Convert 'train_data' list to a DataFrame
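
As a rough illustration of how one gray value can be derived from three color channels, the sketch below applies the standard luminance weights (0.299 for red, 0.587 for green, 0.114 for blue), which is the weighting OpenCV documents for its color-to-gray conversion. The helper `to_gray` is hypothetical and handles a single pixel in plain Python; it is not how the library is implemented internally.

```python
def to_gray(pixel_bgr):
    """Weighted sum of one (blue, green, red) pixel, rounded to an integer."""
    blue, green, red = pixel_bgr
    return round(0.299 * red + 0.587 * green + 0.114 * blue)

print(to_gray((0, 0, 0)))        # black
print(to_gray((255, 255, 255)))  # white
print(to_gray((0, 255, 0)))      # pure green
print(to_gray((255, 0, 0)))      # pure blue (first channel in BGR order)
```

Green carries the largest weight because the eye is most sensitive to it, so a pure green pixel comes out brighter in gray scale than a pure red or pure blue one.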


In [11]:
image_url = "https://iq.opengenus.org/content/images/2019/07/rgb.jpg"
# Display the image
Image(url=image_url)


In [12]:
image_url = "https://iq.opengenus.org/content/images/2019/07/grayscale.jpg"
# Display the image
Image(url=image_url)


#Reflection/Flip

You can **flip images horizontally and vertically**. Some frameworks do not provide a function for vertical flips, but a vertical flip is equivalent to rotating an image by 180 degrees and then performing a horizontal flip.
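
The equivalence between a vertical flip and a 180-degree rotation followed by a horizontal flip can be checked on a tiny grid. This is a minimal sketch in plain Python; the helper names are made up for illustration and are not part of any framework.

```python
def flip_horizontal(image):
    """Mirror left-right: reverse the pixels in every row."""
    return [row[::-1] for row in image]

def flip_vertical(image):
    """Mirror top-bottom: reverse the order of the rows."""
    return [row[:] for row in image[::-1]]

def rotate_180(image):
    """Rotate by 180 degrees: reverse the rows and the pixels in each row."""
    return [row[::-1] for row in image[::-1]]

image = [
    [1, 2, 3],
    [4, 5, 6],
]

# Rotating by 180 degrees and then flipping horizontally gives a vertical flip
print(flip_vertical(image) == flip_horizontal(rotate_180(image)))  # True
```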

In [13]:
import cv2
import pandas as pd
from pathlib import Path
import glob

# Set the 'images_dir' variable to the 'img' directory using Path
images_dir = Path('img')

# Use glob to get a list of all files with '.tif' extension in 'img' directory
images = images_dir.glob("*.tif")

train_data = []  # Initialize an empty list 'train_data' to store image filenames and labels
counter = 0  # Initialize a counter variable to keep track of the number of processed images

# Iterate through the images in the 'images' generator
for img_path in images:
    counter += 1  # Increment the counter by 1 for each image processed

    # Read the image using OpenCV
    img = cv2.imread(str(img_path))

    # Resize every image to the same base size
    img = cv2.resize(img, (229, 229))

    # The first 130 images get label 1, the rest label 0
    if counter <= 130:
        train_data.append((img, 1))
    else:
        train_data.append((img, 0))

    # Convert the current image to grayscale
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Horizontal flip (flipCode=1 mirrors the image around the vertical axis)
    img_horizontal = cv2.flip(img_gray, 1)

    # Vertical flip (flipCode=0 mirrors the image around the horizontal axis)
    img_vertical = cv2.flip(img_gray, 0)

    # Append horizontally flipped image with label 1 (you can change label as needed)
    train_data.append((img_horizontal, 1))

    # Append vertically flipped image with label 1 (you can change label as needed)
    train_data.append((img_vertical, 1))

# Convert 'train_data' list to a DataFrame
train_data = pd.DataFrame(train_data, columns=['image', 'label'], index=None)


#Gaussian Blurring

Gaussian blur (also known as Gaussian smoothing) is the **result of blurring an image by a Gaussian function**. It is a widely used effect in graphics software, typically to **reduce image noise**.
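
To show what "blurring by a Gaussian function" means numerically, here is a minimal one-dimensional sketch in plain Python. The helpers `gaussian_kernel_1d` and `blur_1d`, the chosen `sigma` and `radius`, and the sample signal are all assumptions made for illustration; library filters such as the one used below work on whole images and handle borders and kernel size in their own way.

```python
import math

def gaussian_kernel_1d(sigma, radius):
    """Return normalized Gaussian weights for offsets -radius..radius."""
    weights = [math.exp(-(x * x) / (2 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_1d(signal, kernel):
    """Weighted average around each sample; edges repeat the nearest sample."""
    radius = len(kernel) // 2
    last = len(signal) - 1
    blurred = []
    for i in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = min(max(i + k - radius, 0), last)  # clamp index at the borders
            acc += weight * signal[j]
        blurred.append(acc)
    return blurred

kernel = gaussian_kernel_1d(sigma=1.0, radius=2)
noisy = [10, 10, 50, 10, 10]   # a flat signal with a single noisy spike
print(blur_1d(noisy, kernel))  # the spike is lowered and spread to neighbors
```

Because the Gaussian is separable, a two-dimensional blur can apply the same weights along the rows and then along the columns. A larger `sigma` spreads the weights further and blurs more strongly.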

In [15]:
import cv2
import pandas as pd
from pathlib import Path
from scipy import ndimage  # Importing the ndimage module from SciPy
import glob

# Set the 'images_dir' variable to the 'img' directory using Path
images_dir = Path('img')

# Use glob to get a list of all files with '.tif' extension in 'img' directory
images = images_dir.glob("*.tif")

train_data = []  # Initialize an empty list 'train_data' to store image filenames and labels
counter = 0  # Initialize a counter variable to keep track of the number of processed images

# Iterate through the images in the 'images' generator
for img_path in images:
    counter += 1  # Increment the counter by 1 for each image processed

    # Read the image using OpenCV
    img = cv2.imread(str(img_path))

    # Resize every image to the same base size
    img = cv2.resize(img, (229, 229))

    # The first 130 images get label 1, the rest label 0
    if counter <= 130:
        train_data.append((img, 1))
    else:
        train_data.append((img, 0))

    # Convert the current image to grayscale
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Apply Gaussian filter to the grayscale image
    img_filtered = ndimage.gaussian_filter(img_gray, sigma=5.11)  # Gaussian blur with sigma = 5.11

    # Horizontal flip (flipCode=1 mirrors the image around the vertical axis)
    img_horizontal = cv2.flip(img_filtered, 1)

    # Vertical flip (flipCode=0 mirrors the image around the horizontal axis)
    img_vertical = cv2.flip(img_filtered, 0)

    # Append horizontally flipped and filtered image with label 1 (you can change label as needed)
    train_data.append((img_horizontal, 1))

    # Append vertically flipped and filtered image with label 1 (you can change label as needed)
    train_data.append((img_vertical, 1))

# Convert 'train_data' list to a DataFrame
train_data = pd.DataFrame(train_data, columns=['image', 'label'], index=None)


#Histogram Equalization

Histogram equalization is another image processing technique that **increases the global contrast of an image using the image intensity histogram**. This method needs no parameters, but it sometimes results in an unnatural-looking image.
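
The idea can be sketched on a handful of gray values. The code below uses the common formulation based on the cumulative distribution of intensities; the helper `equalize` and the sample pixels are hypothetical, it assumes a non-empty list of integers, and library implementations may differ in rounding and other details.

```python
def equalize(pixels, levels=256):
    """Equalize a flat list of integer gray values in the range 0..levels-1."""
    histogram = [0] * levels
    for value in pixels:
        histogram[value] += 1

    # Cumulative distribution: how many pixels are <= each gray level
    cdf, running = [], 0
    for count in histogram:
        running += count
        cdf.append(running)

    total = len(pixels)
    cdf_min = min(c for c in cdf if c > 0)
    if total == cdf_min:  # constant image: there is nothing to spread out
        return list(pixels)

    # Stretch the cumulative counts over the full intensity range
    scale = (levels - 1) / (total - cdf_min)
    return [round((cdf[v] - cdf_min) * scale) for v in pixels]

low_contrast = [100, 100, 101, 102, 103, 103, 103, 104]
print(equalize(low_contrast))  # values now span the whole 0..255 range
```

The input values sit in the narrow band 100 to 104, while the output runs from 0 to 255 with the original ordering preserved, which is the "global contrast" gain described above.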

In [16]:
image_url = "https://iq.opengenus.org/content/images/2019/07/histogram.jpeg"
# Display the image
Image(url=image_url)


In [18]:
import cv2
import pandas as pd
from pathlib import Path
from scipy import ndimage
import glob

# Define the histogram equalization function
def hist(img):
    img_to_yuv = cv2.cvtColor(img, cv2.COLOR_BGR2YUV)
    img_to_yuv[:, :, 0] = cv2.equalizeHist(img_to_yuv[:, :, 0])
    hist_equalization_result = cv2.cvtColor(img_to_yuv, cv2.COLOR_YUV2BGR)
    return hist_equalization_result

# Set the 'images_dir' variable to the 'img' directory using Path
images_dir = Path('img')

# Use glob to get a list of all files with '.tif' extension in 'img' directory
images = images_dir.glob("*.tif")

train_data = []  # Initialize an empty list 'train_data' to store image filenames and labels
counter = 0  # Initialize a counter variable to keep track of the number of processed images

# Iterate through the images in the 'images' generator
for img_path in images:
    counter += 1  # Increment the counter by 1 for each image processed

    # Read the image using OpenCV
    img = cv2.imread(str(img_path))

    # Resize every image to the same base size
    img = cv2.resize(img, (229, 229))

    # The first 130 images get label 1, the rest label 0
    if counter <= 130:
        train_data.append((img, 1))
    else:
        train_data.append((img, 0))

    # Apply histogram equalization to the current image
    img_equalized = hist(img)

    # Convert the current image to grayscale
    img_gray = cv2.cvtColor(img_equalized, cv2.COLOR_BGR2GRAY)

    # Apply Gaussian filter to the grayscale image
    img_filtered = ndimage.gaussian_filter(img_gray, sigma=5.11)

    # Horizontal flip (flipCode=1 mirrors the image around the vertical axis)
    img_horizontal = cv2.flip(img_filtered, 1)

    # Vertical flip (flipCode=0 mirrors the image around the horizontal axis)
    img_vertical = cv2.flip(img_filtered, 0)

    # Append horizontally flipped and filtered image with label 1 (you can change label as needed)
    train_data.append((img_horizontal, 1))

    # Append vertically flipped and filtered image with label 1 (you can change label as needed)
    train_data.append((img_vertical, 1))

# Convert 'train_data' list to a DataFrame
train_data = pd.DataFrame(train_data, columns=['image', 'label'], index=None)


#Rotation

This is yet another **image augmentation technique**. Rotating an *image might not preserve its original dimensions* (depending on the angle you rotate it by).
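
Why the dimensions can change is easiest to see from the bounding box of the rotated image. The sketch below is a small geometric calculation in plain Python; the helper `rotated_bounds` is made up for illustration and is separate from the `rotation` function in the cell that follows, which keeps the output canvas at the original size.

```python
import math

def rotated_bounds(width, height, degrees):
    """Size of the smallest axis-aligned box containing the rotated image."""
    theta = math.radians(degrees)
    cos_t, sin_t = abs(math.cos(theta)), abs(math.sin(theta))
    new_width = round(width * cos_t + height * sin_t)
    new_height = round(width * sin_t + height * cos_t)
    return new_width, new_height

for angle in (0, 45, 90):
    print(angle, rotated_bounds(200, 100, angle))
```

Only multiples of 180 degrees keep a non-square image at its original size. For other angles you either enlarge the canvas or, as the cell below does, keep the canvas fixed and accept that the corners may be cut off (shrinking the image, as the 0.70 scale factor does, reduces how much is lost).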

In [19]:
import cv2
import pandas as pd
from pathlib import Path
from scipy import ndimage
import glob
import random  # Import the random module for random rotation

# Define the histogram equalization function
def hist(img):
    img_to_yuv = cv2.cvtColor(img, cv2.COLOR_BGR2YUV)
    img_to_yuv[:, :, 0] = cv2.equalizeHist(img_to_yuv[:, :, 0])
    hist_equalization_result = cv2.cvtColor(img_to_yuv, cv2.COLOR_YUV2BGR)
    return hist_equalization_result

# Define the rotation function
def rotation(img):
    rows, cols = img.shape[0], img.shape[1]
    randDeg = random.randint(-180, 180)
    matrix = cv2.getRotationMatrix2D((cols / 2, rows / 2), randDeg, 0.70)
    rotated = cv2.warpAffine(img, matrix, (cols, rows), borderMode=cv2.BORDER_CONSTANT, borderValue=(144, 159, 162))
    return rotated

# Set the 'images_dir' variable to the 'img' directory using Path
images_dir = Path('img')

# Use glob to get a list of all files with '.tif' extension in 'img' directory
images = images_dir.glob("*.tif")

train_data = []  # Initialize an empty list 'train_data' to store image filenames and labels
counter = 0  # Initialize a counter variable to keep track of the number of processed images

# Iterate through the images in the 'images' generator
for img_path in images:
    counter += 1  # Increment the counter by 1 for each image processed

    # Read the image using OpenCV
    img = cv2.imread(str(img_path))

    # Resize every image to the same base size
    img = cv2.resize(img, (229, 229))

    # The first 130 images get label 1, the rest label 0
    if counter <= 130:
        train_data.append((img, 1))
    else:
        train_data.append((img, 0))

    # Apply histogram equalization to the current image
    img_equalized = hist(img)

    # Convert the current image to grayscale
    img_gray = cv2.cvtColor(img_equalized, cv2.COLOR_BGR2GRAY)

    # Apply Gaussian filter to the grayscale image
    img_filtered = ndimage.gaussian_filter(img_gray, sigma=5.11)

    # Horizontal flip (flipCode=1 mirrors the image around the vertical axis)
    img_horizontal = cv2.flip(img_filtered, 1)

    # Vertical flip (flipCode=0 mirrors the image around the horizontal axis)
    img_vertical = cv2.flip(img_filtered, 0)

    # Append horizontally flipped and filtered image with label 1 (you can change label as needed)
    train_data.append((img_horizontal, 1))

    # Append vertically flipped and filtered image with label 1 (you can change label as needed)
    train_data.append((img_vertical, 1))

    # Apply random rotation to the image
    img_rotated = rotation(img_filtered)

    # Append rotated and filtered image with label 1 (you can change label as needed)
    train_data.append((img_rotated, 1))

# Convert 'train_data' list to a DataFrame
train_data = pd.DataFrame(train_data, columns=['image', 'label'], index=None)


In [20]:
image_url = "https://iq.opengenus.org/content/images/2019/07/rot.jpeg"
# Display the image
Image(url=image_url)


#Translation

Translation just involves **moving the image along the X or Y direction (or both)**.
This method of augmentation is very **useful because most objects can be located almost anywhere in the image**. This forces **our feature extractor to look everywhere**.
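
A translation can be sketched on a tiny grid without any library. The helper `translate` and the `fill` value below are assumptions for illustration; the cell that follows does the same job on real images with an affine warp whose matrix holds the X and Y shifts in its last column.

```python
def translate(image, shift_x, shift_y, fill=0):
    """Shift a 2-D grid right by shift_x and down by shift_y pixels."""
    height, width = len(image), len(image[0])
    shifted = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Each output pixel takes its value from the shifted source position
            src_x, src_y = x - shift_x, y - shift_y
            if 0 <= src_x < width and 0 <= src_y < height:
                shifted[y][x] = image[src_y][src_x]
    return shifted

image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
for row in translate(image, shift_x=1, shift_y=1):
    print(row)
```

Pixels that move past the border are lost, and the uncovered area is filled with a constant, which mirrors the constant border value used in the cell below.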

In [21]:
import cv2
import pandas as pd
from pathlib import Path
from scipy import ndimage
import glob
import random  # Import the random module for random rotation
import numpy as np  # Import NumPy for array operations

# Define the histogram equalization function
def hist(img):
    img_to_yuv = cv2.cvtColor(img, cv2.COLOR_BGR2YUV)
    img_to_yuv[:, :, 0] = cv2.equalizeHist(img_to_yuv[:, :, 0])
    hist_equalization_result = cv2.cvtColor(img_to_yuv, cv2.COLOR_YUV2BGR)
    return hist_equalization_result

# Define the rotation function
def rotation(img):
    rows, cols = img.shape[0], img.shape[1]
    randDeg = random.randint(-180, 180)
    matrix = cv2.getRotationMatrix2D((cols / 2, rows / 2), randDeg, 0.70)
    rotated = cv2.warpAffine(img, matrix, (cols, rows), borderMode=cv2.BORDER_CONSTANT, borderValue=(144, 159, 162))
    return rotated

# Set the 'images_dir' variable to the 'img' directory using Path
images_dir = Path('img')

# Use glob to get a list of all files with '.tif' extension in 'img' directory
images = images_dir.glob("*.tif")

train_data = []  # Initialize an empty list 'train_data' to store image filenames and labels
counter = 0  # Initialize a counter variable to keep track of the number of processed images

# Iterate through the images in the 'images' generator
for img_path in images:
    counter += 1  # Increment the counter by 1 for each image processed

    # Read the image using OpenCV
    img = cv2.imread(str(img_path))

    # Resize every image to the same base size
    img = cv2.resize(img, (229, 229))

    # The first 130 images get label 1, the rest label 0
    if counter <= 130:
        train_data.append((img, 1))
    else:
        train_data.append((img, 0))

    # Apply histogram equalization to the current image
    img_equalized = hist(img)

    # Convert the current image to grayscale
    img_gray = cv2.cvtColor(img_equalized, cv2.COLOR_BGR2GRAY)

    # Apply Gaussian filter to the grayscale image
    img_filtered = ndimage.gaussian_filter(img_gray, sigma=5.11)

    # Horizontal flip (flipCode=1 mirrors the image around the vertical axis)
    img_horizontal = cv2.flip(img_filtered, 1)

    # Vertical flip (flipCode=0 mirrors the image around the horizontal axis)
    img_vertical = cv2.flip(img_filtered, 0)

    # Append horizontally flipped and filtered image with label 1 (you can change label as needed)
    train_data.append((img_horizontal, 1))

    # Append vertically flipped and filtered image with label 1 (you can change label as needed)
    train_data.append((img_vertical, 1))

    # Apply random rotation to the image
    img_rotated = rotation(img_filtered)

    # Append rotated and filtered image with label 1 (you can change label as needed)
    train_data.append((img_rotated, 1))

    # Apply translation to the image
    img_translated = cv2.warpAffine(img_filtered, np.float32([[1, 0, 84], [0, 1, 56]]), (img.shape[1], img.shape[0]),
                                     borderMode=cv2.BORDER_CONSTANT, borderValue=(144, 159, 162))

    # Append translated and filtered image with label 1 (you can change label as needed)
    train_data.append((img_translated, 1))

# Convert 'train_data' list to a DataFrame
train_data = pd.DataFrame(train_data, columns=['image', 'label'], index=None)


In [22]:
image_url = "https://iq.opengenus.org/content/images/2019/07/trans.png"
# Display the image
Image(url=image_url)
