<a href="https://colab.research.google.com/github/Sannya-Wasim/Dice_AI_Course/blob/main/Assignment_02_Task_03.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Assignment 03**

## Task 03

In this task, the goal is to develop a Python program that enables users to select a query image from the "query_images" folder and retrieve the top N (N=4 in this case) similar images from the local directory named "images_database". To achieve this, the program will utilize the powerful TensorFlow library and its pre-trained **MobileNet Convolutional Neural Network (CNN) model** for feature extraction. Additionally, the program will employ the **Euclidean distance metric** for measuring the similarity between images. The program will also be designed to handle images in various formats such as JPG, PNG, and JPEG, thanks to the **Pillow library**.

By implementing this solution, users will be able to find closely similar images in the database based on their query image, facilitating efficient image retrieval and analysis.

### **Downloading Dataset**

In [2]:
!pip install opendatasets --upgrade

Collecting opendatasets
  Downloading opendatasets-0.1.22-py3-none-any.whl (15 kB)
Installing collected packages: opendatasets
Successfully installed opendatasets-0.1.22


In [3]:
import opendatasets as od
dataset_path = 'https://www.kaggle.com/datasets/swaroopkml/cifar10-pngs-in-folders'
od.download(dataset_path)

Please provide your Kaggle credentials to download this dataset. Learn more: http://bit.ly/kaggle-creds
Your Kaggle username: sannyawasim
Your Kaggle Key: ··········
Downloading cifar10-pngs-in-folders.zip to ./cifar10-pngs-in-folders


100%|██████████| 140M/140M [00:01<00:00, 98.1MB/s]





### **Data Pre-processing**

In [4]:
# importing libraries
import os
import numpy as np
import matplotlib.pyplot as plt

import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.applications.resnet50 import preprocess_input
from tensorflow.keras.applications import MobileNet
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D, Flatten
from tensorflow.keras.models import Model

In [5]:
# defining dimensions
batch_size = 32
img_width = 32
img_height = 32

In [6]:
# defining path for train data
train_data_path = '/content/cifar10-pngs-in-folders/cifar10/cifar10/images_database'
test_data_path = '/content/cifar10-pngs-in-folders/cifar10/cifar10/query_images'

In [7]:
# Validation splitting
datagen = ImageDataGenerator(
    preprocessing_function = preprocess_input,
    validation_split = 0.2
)

train_generator = datagen.flow_from_directory(
    train_data_path,
    target_size = (img_width, img_height),
    batch_size = batch_size,
    class_mode = 'categorical',
    subset = 'training'
)

validation_generator = datagen.flow_from_directory(
    train_data_path,
    target_size = (img_width, img_height),
    batch_size = batch_size,
    class_mode = 'categorical',
    subset = 'validation'
)

Found 40000 images belonging to 10 classes.
Found 10000 images belonging to 10 classes.


In [8]:
# display the classes
class_labels = train_generator.class_indices
class_labels

{'airplane': 0,
 'automobile': 1,
 'bird': 2,
 'cat': 3,
 'deer': 4,
 'dog': 5,
 'frog': 6,
 'horse': 7,
 'ship': 8,
 'truck': 9}

## **Fine Tuning VGG16 for Image Classification**

In [9]:
# https://medium.com/@roshankg96/transfer-learning-and-fine-tuning-model-using-vgg-16-90b5401e1ebd
num_classes = len(class_labels)

# load the pre-trained model and removing the top layer --> https://www.tensorflow.org/api_docs/python/tf/keras/applications/vgg16/VGG16
base_model = MobileNet(weights='imagenet', include_top = False, input_shape=(32, 32, 3))

# The VGG16 model by default has an output shape of (None, 1000) before we add our custom classification layer.
# Therefore add a Flatten layer before adding the GlobalAveragePooling2D layer to convert the 2D feature maps from the VGG16 model into a 1D tensor.

# Freeze the layers
for layer in base_model.layers:
  layer.trainable = False

x = base_model.output

x = GlobalAveragePooling2D()(x)  # Global average pooling layer
predictions = Dense(num_classes, activation='softmax')(x)  # Dense classification layer with softmax activation

# Create the fine tuned model
model = Model(inputs=base_model.input, outputs=predictions)




Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/mobilenet/mobilenet_1_0_224_tf_no_top.h5


In [10]:
model.summary()

Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_1 (InputLayer)        [(None, 32, 32, 3)]       0         
                                                                 
 conv1 (Conv2D)              (None, 16, 16, 32)        864       
                                                                 
 conv1_bn (BatchNormalizati  (None, 16, 16, 32)        128       
 on)                                                             
                                                                 
 conv1_relu (ReLU)           (None, 16, 16, 32)        0         
                                                                 
 conv_dw_1 (DepthwiseConv2D  (None, 16, 16, 32)        288       
 )                                                               
                                                                 
 conv_dw_1_bn (BatchNormali  (None, 16, 16, 32)        128   

In [11]:
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics = ['accuracy'])

In [12]:
# Train the model
model.fit(
    train_generator,
    validation_data = validation_generator,
    epochs=10

)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.src.callbacks.History at 0x7cc2e3f1a260>

In [54]:
model.save('cifar_MobileNet.h5', save_format='h5')

  saving_api.save_model(


## **Implementing Similarity Index**

To find the similarity between the two images we are going to use '**Euclidean Distance**' with the following approach :

1. Read the image files as an array.
2. Since the image files are colored there are 3 channels for RGB values.  We are going to flatten them such that each image is a single 1-D array.
3. Once we have our image files as an array we are going to generate a histogram for each image where for each index 0 – 255 we are going the count the occurrence of that pixel value in the image.
4. Once we have our histograms we are going to use the L2-Norm or Euclidean Distance to find the difference the two histograms.
5. Based on the distance between the histogram of our test image and the reference images we can find the image our test image is most similar to.





In [13]:
# importing neceassary libraries
from PIL import Image   # Image to read the image in terms of numerical values
from collections import Counter     # Counter to count the number of times each pixel value (0-255) occurs in the images.
import numpy as np      # Numpy for storing the image as Numpy array

In [14]:
ref_img_1 = Image.open('/content/cifar10-pngs-in-folders/cifar10/cifar10/images_database/deer/0090.png')
ref_img_2 = Image.open('/content/cifar10-pngs-in-folders/cifar10/cifar10/images_database/horse/0034.png')

array_1 = np.asarray(ref_img_1)
array_2 = np.asarray(ref_img_2)

print(np.shape(ref_img_1))
print(np.shape(ref_img_2))

(32, 32, 3)
(32, 32, 3)


In [15]:
# Flatten this 3D array into a 1D array
flat_array_1 = array_1.flatten()
flat_array_2 = array_2.flatten()

In [16]:
# Generating the count-histogram Vector
RH1 = Counter(flat_array_1)
RH2 = Counter(flat_array_2)
# returns a dictionary where the key corresponds to the pixel value and the value of the key is the number of times that pixel is present in the image.

One limitation of Euclidean distance is that it requires all the vectors to be normalized i.e both the vectors need to be of the same dimensions. To ensure that our histogram vector is normalized we are going to use a for loop from 0-255 and generate our histogram with the value of the key if the key is present in the image else we append a 0.

In [17]:
# defining Histogram
H1 = []
for i in range(256):
  if i in RH1.keys():
    H1.append(RH1[i])
  else:
    H1.append(0)

# generates a vector of size (256, ) where each index corresponds to the pixel value and the value corresponds to the count of the pixel in that image.

In [18]:
# defining Histogram
H2 = []
for i in range(256):
  if i in RH2.keys():
    H2.append(RH2[i])
  else:
    H2.append(0)

# generates a vector of size (256, ) where each index corresponds to the pixel value and the value corresponds to the count of the pixel in that image.

### **Euclidean Distance Function**

In [19]:
# Function takes in two histograms and returns the euclidean distance between them.

def L2Norm(H1, H2):
  distance = 0
  for i in range(len(H1)):
    distance += np.square(H1[i]-H2[i])
  return np.sqrt(distance)

### **Defining our Test Image**
Our test image is going to be that of a cat from the query_images. We will find the distance between reference image 1 and test image and the same for reference image 2. The smallest distance means the greater the similarity between the two pictures

In [20]:
test_image = Image.open('/content/cifar10-pngs-in-folders/cifar10/cifar10/query_images/horse/0048.png')
test_array = np.asarray(test_image)
flatten_array_test = test_array.flatten()

# Generating count histogram counter
TH1 = Counter(flatten_array_test)

# Defining histogram
HT = []
for i in range(256):
  if i in RH1.keys():
    HT.append(TH1[i])
  else:
    HT.append(0)

In [21]:
dist_test_ref_1 = L2Norm(H1, HT)
print("The distance between Reference Image 1 (Deer) and Test Image (Horse) is {}".format(dist_test_ref_1))

The distance between Reference Image 1 (Deer) and Test Image (Horse) is 193.9690696992693


In [22]:
dist_test_ref_2 = L2Norm(H2, HT)
print("The distance between Reference Image 2 (Horse) and Test Image (Horse) is {}".format(dist_test_ref_2))

The distance between Reference Image 2 (Horse) and Test Image (Horse) is 178.48249213858483


## **Implementing the Image Retrieval System**

https://towardsdatascience.com/build-an-image-search-engine-using-python-ad181e76441b

**Finding Top 'N' similar images**

In [24]:
from PIL import Image
from collections import Counter, defaultdict
import numpy as np
import os

def calculate_euclidean_distance(query_array, database_folder):
    query_histogram = Counter(query_array)
    similar_images = []

    for category in os.listdir(database_folder):
        category_folder = os.path.join(database_folder, category)
        for image_name in os.listdir(category_folder):
            image_path = os.path.join(category_folder, image_name)
            database_image = Image.open(image_path)
            database_array = np.asarray(database_image).flatten()
            database_histogram = Counter(database_array)

            # Initialize default dictionaries for query and database histograms
            query_histogram = defaultdict(int)
            database_histogram = defaultdict(int)

            # Generate histograms for query and database images
            for pixel_value in query_array:
                query_histogram[pixel_value] += 1

            for pixel_value in database_array:
                database_histogram[pixel_value] += 1

            # Make sure histograms have the same set of keys (pixel values)
            all_keys = set(query_histogram.keys()).union(set(database_histogram.keys()))

            # Fill in missing keys with 0 counts
            for key in all_keys:
                query_histogram[key] = query_histogram.get(key, 0)
                database_histogram[key] = database_histogram.get(key, 0)

            # Calculate Euclidean distance
            distance = np.linalg.norm(np.array(list(query_histogram.values())) - np.array(list(database_histogram.values())))
            similar_images.append((image_path, distance))

    return similar_images

# Read and prepare the query image
query_image_path = '/content/cifar10-pngs-in-folders/cifar10/cifar10/query_images/cat/0075.png'
query_image = Image.open(query_image_path)
query_array = np.asarray(query_image).flatten()

# Prepare the database images
database_folder = '/content/cifar10-pngs-in-folders/cifar10/cifar10/images_database'

# Find the top N similar images
top_n = 4
similar_images = calculate_euclidean_distance(query_array, database_folder)
similar_images.sort(key=lambda x: x[1])
top_similar_images = similar_images[:top_n]

# Print the paths of top similar images and their distances
for image_path, distance in top_similar_images:
    print(f"Image Path: {image_path}, Distance: {distance}")



Image Path: /content/cifar10-pngs-in-folders/cifar10/cifar10/images_database/horse/1783.png, Distance: 142.59032225224823
Image Path: /content/cifar10-pngs-in-folders/cifar10/cifar10/images_database/horse/0027.png, Distance: 143.52003344481216
Image Path: /content/cifar10-pngs-in-folders/cifar10/cifar10/images_database/cat/0725.png, Distance: 144.3606594609487
Image Path: /content/cifar10-pngs-in-folders/cifar10/cifar10/images_database/frog/1714.png, Distance: 146.0
