<a href="https://colab.research.google.com/github/vivek09thakur/ML_Image_Classification/blob/main/Colab%20Notebook/Image_Classification_Machine_Learning.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## **Image Classification Using a Machine Learning Model**

- **Ignore Warnings**

In [1]:
import warnings
warnings.filterwarnings('ignore')

- **Mounting Drive**

In [2]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


### **Importing Requirements**

In [3]:
import os
import cv2
import numpy as np
from keras.preprocessing import image
from keras.applications.vgg16 import VGG16, preprocess_input
from sklearn.metrics.pairwise import cosine_similarity
from shutil import copyfile

- **Loading and Processing The Images**

In [4]:
def load_and_preprocess_image(img_path, target_size=(224, 224)):
    img = image.load_img(img_path, target_size=target_size)
    img_array = image.img_to_array(img)
    img_array = np.expand_dims(img_array, axis=0)
    img_array = preprocess_input(img_array)
    return img_array
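
To make the preprocessing step less opaque, here is a minimal NumPy-only sketch of what `np.expand_dims` and the VGG16 `preprocess_input` are generally understood to do: add a batch axis, reorder RGB to BGR, and subtract per-channel ImageNet means. The function name `preprocess_like_vgg16`, the constant `IMAGENET_BGR_MEAN`, and the grey dummy image are illustrative assumptions, not part of the notebook; the real Keras function should still be used for actual inference.

```python
import numpy as np

# Assumed per-channel ImageNet means in BGR order (Caffe-style preprocessing).
IMAGENET_BGR_MEAN = np.array([103.939, 116.779, 123.68], dtype=np.float32)


def preprocess_like_vgg16(rgb_image):
    """Approximate VGG16 preprocessing for one H x W x 3 RGB image."""
    # Add the batch axis the model expects: (H, W, 3) -> (1, H, W, 3).
    batch = np.expand_dims(rgb_image.astype(np.float32), axis=0)
    # Reverse the channel axis: RGB -> BGR.
    batch = batch[..., ::-1]
    # Zero-center each channel; no rescaling to [0, 1] is applied.
    return batch - IMAGENET_BGR_MEAN


# Flat grey stand-in for an image loaded at target_size=(224, 224).
dummy = np.full((224, 224, 3), 128, dtype=np.uint8)
out = preprocess_like_vgg16(dummy)
print(out.shape)
```

The batch axis matters because `model.predict` expects a stack of images, even when that stack holds a single image.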

- **Extracting Image Features**

In [5]:
def extract_features(img_path, model):
    img_array = load_and_preprocess_image(img_path)
    features = model.predict(img_array)
    return features.flatten()
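
The size of the vector returned by `extract_features` follows from the model's output shape. As a rough sketch, assuming VGG16 without its top layers yields a `(1, 7, 7, 512)` feature map for a 224 x 224 input, flattening gives 25,088 values per image. The random array below is only a stand-in for `model.predict(...)`, and the averaged variant is a hypothetical alternative, not what the notebook does.

```python
import numpy as np

# Stand-in for model.predict(img_array) on one 224 x 224 image (assumed shape).
fake_feature_map = np.random.rand(1, 7, 7, 512).astype(np.float32)

# What extract_features returns: every spatial position kept, 7 * 7 * 512 values.
flat = fake_feature_map.flatten()

# Hypothetical alternative: average over the 7 x 7 grid, leaving 512 values.
pooled = fake_feature_map.mean(axis=(1, 2)).ravel()

print(flat.shape, pooled.shape)
```

The flattened vector keeps spatial layout, so two images with the same object in different positions may score lower than expected. Keras exposes a `pooling='avg'` argument on `VGG16` that produces the averaged form directly, should a more position-tolerant descriptor be wanted.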

- **Calculating Cosine Similarity**

In [6]:
def get_cosine_similarity(feature1, feature2):
    similarity = cosine_similarity([feature1], [feature2])
    return similarity[0][0]
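
For reference, cosine similarity between two vectors is their dot product divided by the product of their lengths. The sketch below computes it with plain NumPy for two 1-D vectors; `cosine_similarity_1d` is an illustrative name, and returning `0.0` for an all-zero vector is an assumption made here to avoid dividing by zero.

```python
import numpy as np


def cosine_similarity_1d(a, b):
    """Cosine of the angle between two 1-D vectors."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        # Assumption: treat an all-zero vector as having no similarity.
        return 0.0
    return float(np.dot(a, b) / denom)


same_direction = cosine_similarity_1d([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity_1d([1.0, 0.0], [0.0, 1.0])
print(same_direction, orthogonal)
```

Because the score depends only on direction, scaling a feature vector does not change it. Features taken after ReLU and max-pooling layers are non-negative, so in this setting scores should fall between 0 and 1.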

- **Finding All The Similar Images To The Given Sample**

In [7]:
def find_and_group_similar_images(sample_image_folder,
                                  dataset_folder,
                                  threshold=0.8,
                                  output_folder="similar_images_output"):

    os.makedirs(output_folder, exist_ok=True)

    # Load pre-trained VGG16 model
    base_model = VGG16(weights='imagenet', include_top=False)

    for sample_image_filename in os.listdir(sample_image_folder):
        if sample_image_filename.endswith(('.jpg', '.jpeg', '.png')):
            sample_image_path = os.path.join(sample_image_folder,
                                             sample_image_filename)

            # Create a folder for each sample image
            sample_output_folder = os.path.join(
                output_folder,
                os.path.splitext(sample_image_filename)[0])

            os.makedirs(sample_output_folder, exist_ok=True)

            # Extract features for the sample image
            sample_features = extract_features(sample_image_path,
                                               base_model)

            for dataset_image_filename in os.listdir(dataset_folder):
                if dataset_image_filename.endswith(('.jpg',
                                                    '.jpeg',
                                                    '.png')):

                    dataset_image_path = os.path.join(dataset_folder,
                                            dataset_image_filename)

                    current_features = extract_features(
                        dataset_image_path,
                        base_model)

                    # Calculate cosine similarity
                    similarity = get_cosine_similarity(sample_features,
                                                       current_features)

                    if similarity > threshold:
                        output_path = os.path.join(sample_output_folder,
                                                   dataset_image_filename)
                        copyfile(dataset_image_path, output_path)
                        print(f'''
                        Similar Img {sample_image_filename}:
                        {dataset_image_filename}''')

                        print(f"Similarity: {similarity:.2f})")
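
The function above re-runs the model on every dataset image once per sample image. One possible variation, sketched below under stated assumptions, embeds each dataset image a single time and reuses the vectors for every sample. The names `group_similar_cached`, `list_images`, and the `embed` callable are hypothetical; `embed` is any function that maps an image path to a 1-D feature vector. Unlike the original, this sketch matches file extensions case-insensitively and returns the matches instead of printing them.

```python
import os
from shutil import copyfile

import numpy as np

IMAGE_EXTENSIONS = ('.jpg', '.jpeg', '.png')


def list_images(folder):
    """Sorted image filenames in folder (case-insensitive extension match)."""
    return sorted(name for name in os.listdir(folder)
                  if name.lower().endswith(IMAGE_EXTENSIONS))


def group_similar_cached(sample_folder, dataset_folder, embed,
                         threshold=0.8,
                         output_folder="similar_images_output"):
    """Copy dataset images similar to each sample; embed each file once."""
    # Embed every dataset image a single time, up front.
    dataset_features = {
        name: np.asarray(embed(os.path.join(dataset_folder, name)),
                         dtype=np.float64)
        for name in list_images(dataset_folder)
    }

    matches = {}
    for sample_name in list_images(sample_folder):
        sample_vec = np.asarray(
            embed(os.path.join(sample_folder, sample_name)),
            dtype=np.float64)

        # One output sub-folder per sample image, named after the file stem.
        target = os.path.join(output_folder,
                              os.path.splitext(sample_name)[0])
        os.makedirs(target, exist_ok=True)

        matches[sample_name] = []
        for name, vec in dataset_features.items():
            denom = np.linalg.norm(sample_vec) * np.linalg.norm(vec)
            score = float(sample_vec @ vec / denom) if denom else 0.0
            if score > threshold:
                copyfile(os.path.join(dataset_folder, name),
                         os.path.join(target, name))
                matches[sample_name].append((name, score))
    return matches
```

With the notebook's helpers, `embed` could be something like `lambda path: extract_features(path, base_model)`, where `base_model` is a VGG16 instance created beforehand. For N samples and M dataset images this makes N + M model calls rather than N + N x M.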

- **Main Loop**



In [8]:
sample_image_folder_path = "/content/drive/MyDrive/Colab Notebooks/Dataset/ImgDataset/SampleImg"
dataset_image_folder_path = "/content/drive/MyDrive/Colab Notebooks/Dataset/ImgDataset"
similarity_threshold = 0.5
output_folder_name = "similar_images_output"

In [9]:
find_and_group_similar_images(sample_image_folder_path, dataset_image_folder_path, similarity_threshold, output_folder_name)

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg16/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5

                        Similar Img sample_img1.jpg:
                        sample_img1.jpg
Similarity: 1.00)

                        Similar Img sample_img1.jpg:
                        sample_img3.jpg
Similarity: 0.56)

                        Similar Img sample_img1.jpg:
                        sample_img2.jpg
Similarity: 0.61)
