##### Here we are not building a CNN model from scratch; instead we use a pretrained model (already trained on a huge dataset).
##### So I have imported the model and configured it according to the requirements.
##### Here I am using the ResNet50 CNN model (pretrained on ImageNet, a dataset of more than a million images).
- ResNet50 can classify images into 1,000 object categories; to do that, it first extracts features from each image.
##### Here the ResNet50 model will extract 2048 features from each image in the fashion product dataset. Once the features are extracted, we identify similar images by comparing them with the features of the input image.
##### Next, the features are fed to the nearest neighbors algorithm.
- Distances are calculated between the input image's features and the features of the roughly 44,000 dataset images. The images whose features are nearest to the input by Euclidean distance are treated as the similar products.

## Import Libraries

In [None]:
import numpy as np
import pickle as pkl
import tensorflow as tf
from tensorflow.keras.applications.resnet50 import ResNet50,preprocess_input
from tensorflow.keras.preprocessing import image
from tensorflow.keras.layers import GlobalMaxPool2D

from sklearn.neighbors import NearestNeighbors
import os  # to list the image file names in the image folder
from numpy.linalg import norm


## Extract filenames from Folder

In [None]:
path='E:/FTE/Projects/DL/Fashion/images'
filenames = []
for file in os.listdir(path):
    filenames.append(os.path.join(path,file))

In [None]:
len(filenames)

## Importing ResNet50 Model and Configuration
- The ResNet50 model is pretrained on the ImageNet dataset
- include_top=False -- this drops the classification head at the top of ResNet50 (the final pooling and fully connected layers), so the model outputs feature maps instead of class scores
- Each input image is passed to the model as a 3D RGB array of shape (224, 224, 3)
- Since the top layers are not included, we add a GlobalMaxPool2D layer in their place

In [None]:
model= ResNet50(weights='imagenet',include_top=False, input_shape=(224,224,3))
model.trainable =False  #we are not training the model as we are using pretrained model

model=tf.keras.models.Sequential([model,GlobalMaxPool2D()])
model.summary()

### Note
- The summary shows that the GlobalMaxPool2D layer converts the 4D output of ResNet50, (batch, 7, 7, 2048), into a 2D array of shape (batch, 2048)
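
To make the 4D-to-2D step concrete, here is a minimal NumPy sketch of what global max pooling does. The random `feature_maps` array is only a stand-in for the real ResNet50 output; it assumes the (1, 7, 7, 2048) shape that a 224 x 224 input produces.

```python
import numpy as np

# Dummy stand-in for the ResNet50 output with include_top=False on a
# 224x224 input: (batch, height, width, channels) = (1, 7, 7, 2048).
rng = np.random.default_rng(0)
feature_maps = rng.random((1, 7, 7, 2048)).astype(np.float32)

# Global max pooling keeps the largest activation of each channel,
# collapsing the two spatial axes.
pooled = feature_maps.max(axis=(1, 2))

print(feature_maps.shape, "->", pooled.shape)
```

Each of the 2048 channels contributes exactly one number, which is where the 2048 features per image come from.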

## Extract 2048 features from the images
- We use the image preprocessing utilities from tensorflow.keras to load the image
- target_size resizes the image to 224 x 224 pixels; the image is then converted into an array

In [None]:
img=image.load_img('54656.jpg',target_size=(224,224))
img_array= image.img_to_array(img)
img_array

In [None]:
img_array.shape

- ResNet50 expects a batch of images as input, i.e. a 4D array, but here we have only a single image
- So we add a batch dimension to the image array using NumPy's expand_dims function

In [None]:
img_expand_dim=np.expand_dims(img_array,axis=0)
img_expand_dim.shape

- The array is now converted from 3D to 4D and has the shape the model expects
- But we cannot feed it to the model directly; first we have to apply preprocess_input, the preprocessing that ResNet50 was trained with
- preprocess_input converts the image data from RGB to BGR and subtracts the ImageNet mean from each color channel
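
The sketch below approximates that preprocessing in plain NumPy, to show what happens to the pixel values. `caffe_style_preprocess` is a hypothetical helper written for this illustration, not the Keras function itself; the notebook should keep using `preprocess_input`.

```python
import numpy as np

# Per-channel ImageNet means in BGR order, as used by the "caffe"-style
# preprocessing that ResNet50's preprocess_input applies.
IMAGENET_MEAN_BGR = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def caffe_style_preprocess(batch_rgb):
    """Hypothetical helper: flip RGB to BGR, then subtract the channel means."""
    batch_bgr = batch_rgb[..., ::-1].astype(np.float32)
    return batch_bgr - IMAGENET_MEAN_BGR

# One 2x2 pure-red "image" in a batch of size 1.
red = np.zeros((1, 2, 2, 3), dtype=np.float32)
red[..., 0] = 255.0

out = caffe_style_preprocess(red)
print(out[0, 0, 0])  # channel order is now B, G, R
```

Note that the values are only re-centered, not rescaled to the 0 to 1 range.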


In [None]:
img_preprocess= preprocess_input(img_expand_dim)
# now pass it to the model to extract the features
result=model.predict(img_preprocess).flatten()   # flatten() turns the (1, 2048) output into a 1D array
result

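- Here the model has extracted the 2048 features from the image
- We need to normalize the array that we got, using the norm function from numpy.linalg
- Dividing by the L2 norm rescales the vector to unit length; since the pooled features are non-negative, every value then lies between 0 and 1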
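
As a small worked example of this normalization, the sketch below uses made-up 3-dimensional vectors in place of the 2048 real features. It also checks a property that is useful later: for unit-length vectors, the squared Euclidean distance equals 2 - 2 * cosine similarity.

```python
import numpy as np
from numpy.linalg import norm

# Toy non-negative "feature vectors" (made-up values).
features = np.array([3.0, 0.0, 4.0])
other = np.array([4.0, 0.0, 3.0])

# Divide by the L2 norm so each vector has length 1.
unit = features / norm(features)
other_unit = other / norm(other)

# Compare squared Euclidean distance with 2 - 2 * cosine similarity.
sq_dist = np.sum((unit - other_unit) ** 2)
cosine = np.dot(unit, other_unit)

print(unit, norm(unit))
print(sq_dist, 2 - 2 * cosine)
```

Because of this relationship, ranking normalized features by Euclidean distance gives the same order as ranking them by cosine similarity.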

In [None]:
norm_result=result/norm(result)
norm_result

- We have extracted the features for only one image; now we have to do the same for the entire dataset

In [None]:
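def extract_features_from_images(image_path, model):
    # load the image and resize it to the input size expected by ResNet50
    img=image.load_img(image_path,target_size=(224,224))
    img_array= image.img_to_array(img)
    # add the batch dimension: (224, 224, 3) -> (1, 224, 224, 3)
    img_expand_dim=np.expand_dims(img_array,axis=0)
    img_preprocess= preprocess_input(img_expand_dim)
    # extract the 2048 features and flatten them into a 1D array
    result=model.predict(img_preprocess).flatten()
    # scale the feature vector to unit length
    norm_result=result/norm(result)
    return norm_result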

- The function takes an image path as input, along with the model
- This lets us pass any image path instead of hard-coding the path of a single image
- Now we try it on an image from the original dataset

In [None]:
extract_features_from_images(filenames[0],model)

- We have extracted the features for a single image; now we extract the features for all images
- The features are collected in image_features by calling the function with each image path (file) and the model, and appending the result
- We can use the tqdm module to track the progress of the loop

In [None]:
from tqdm import tqdm

In [None]:
image_features=[]
for file in tqdm(filenames[0:22000]):
    image_features.append(extract_features_from_images(file,model))
image_features

### Save the features
- Using pickle we will save the features
- wb -- write binary, rb -- read binary
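
A minimal round-trip sketch of the two file modes follows. The toy `demo_features` list and the file name `features_demo.pkl` are made up for the example, and a temporary directory is used so nothing is left on disk.

```python
import os
import pickle as pkl
import tempfile

import numpy as np

# Toy feature list standing in for image_features (made-up values).
demo_features = [np.array([0.6, 0.8]), np.array([1.0, 0.0])]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "features_demo.pkl")  # hypothetical file name

    # 'wb' opens the file for writing bytes; the with-block closes it.
    with open(path, "wb") as f:
        pkl.dump(demo_features, f)

    # 'rb' opens the same file for reading bytes.
    with open(path, "rb") as f:
        restored = pkl.load(f)

print(len(restored), restored[0])
```

Opening the file in a `with` block makes sure it is flushed and closed before it is read back.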

In [None]:
with open('Image_features.pkl','wb') as f:
    pkl.dump(image_features, f)

In [None]:
with open('filenames.pkl','wb') as f:
    pkl.dump(filenames, f)

- After dumping, we load the saved pickle files back

In [None]:
with open('Image_features.pkl','rb') as f:
    Image_features=pkl.load(f)

In [None]:
with open('filenames.pkl','rb') as f:
    filenames=pkl.load(f)

In [None]:
Image_features = np.array(Image_features, dtype=np.float16)
print(Image_features.shape)

- Image_features holds one row of 2048 features per image, i.e. its shape is (44441, 2048) when all 44441 images have been processed
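
For a sense of scale, the arithmetic below estimates the memory footprint of such a feature matrix as float16 (the dtype used in the cell above) and as float32. The image count is the one quoted in the text; the actual number of rows depends on how many files were processed.

```python
import numpy as np

n_images = 44441   # image count quoted in the text
n_features = 2048  # length of each feature vector

# Bytes needed to hold the full matrix for each dtype.
sizes = {
    np.dtype(d).name: n_images * n_features * np.dtype(d).itemsize
    for d in (np.float16, np.float32)
}

for name, n_bytes in sizes.items():
    print(name, round(n_bytes / 1e6, 1), "MB")
```

float16 halves the storage compared with float32, at the cost of lower numeric precision.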

### Find the similar images 
- For this we will use the NearestNeighbors algorithm from scikit-learn

In [None]:
neighbors= NearestNeighbors(n_neighbors=6, algorithm='brute', metric='euclidean')

In [None]:
neighbors.fit(Image_features)

- Here n_neighbors=6 instead of 5 because, for an image that is already in the dataset, the nearest match is the image itself; the next 5 are the similar images
- Now we need an input image for which to find the 5 most similar images
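
The self-match can be seen in a small brute-force search written with NumPy only. This is a sketch of the idea on random toy vectors, not the scikit-learn implementation; `features` and `query_index` are made up for the example.

```python
import numpy as np

# Toy unit-length feature vectors standing in for Image_features.
rng = np.random.default_rng(42)
features = rng.random((10, 8))
features /= np.linalg.norm(features, axis=1, keepdims=True)

query_index = 3                # pretend the query image is item 3 of the dataset
query = features[query_index]

# Brute-force search: distance to every row, then sort ascending.
distances = np.linalg.norm(features - query, axis=1)
nearest = np.argsort(distances)[:6]   # 6 = the query itself + 5 neighbours

recommendations = nearest[1:]          # drop the self-match at position 0
print(nearest, recommendations)
```

The query sits at distance 0 from itself, so it always comes first; asking for 6 neighbours leaves 5 genuine recommendations after it is dropped.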

In [None]:
input_image=extract_features_from_images('54656.jpg', model)

- Now we find the images whose features are nearest to the input features

In [None]:
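input_image = np.array(input_image, dtype=np.float32)
# make sure the fitted data is float32 like the query (Image_features was cast to float16 above);
# note that _fit_X is a private scikit-learn attribute and may change between versions
neighbors._fit_X = neighbors._fit_X.astype(np.float32)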

distance, indices = neighbors.kneighbors([input_image])

In [None]:
indices[0]

- These are the indices of the 6 nearest images: the first is expected to be the input image itself and the remaining 5 are the related images
- Display the input image

In [None]:
from IPython.display import Image

In [None]:
Image('54656.jpg')

- Now look up the filenames for the indices above and display those images
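
The lookup itself is a plain list index. In the sketch below, `demo_filenames` and `demo_indices` are made-up stand-ins for `filenames` and the `indices` array returned by `kneighbors`, which has shape (1, 6).

```python
import numpy as np

# Toy stand-ins: a short file list and a kneighbors-style index array
# of shape (1, 6). All values are made up for illustration.
demo_filenames = [f"images/{i}.jpg" for i in range(10)]
demo_indices = np.array([[3, 7, 1, 9, 0, 4]])

# Position 0 is the query itself when the query belongs to the dataset,
# so the recommendations start at position 1.
recommended_files = [demo_filenames[i] for i in demo_indices[0][1:]]
print(recommended_files)
```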

In [None]:
import matplotlib.pyplot as plt
import matplotlib.image as mpimg

In [None]:
def display_images(filenames, indices):
    plt.figure(figsize=(20, 5))  # Create a figure with a specific size

    # Loop through the first 5 recommended images
    for i in range(5):
        img_path = filenames[indices[0][i + 1]]  # Get the path of the recommended image
        img = mpimg.imread(img_path)  # Read the image
        
        # Add a subplot for each image
        plt.subplot(1, 5, i + 1)  # 1 row, 5 columns, position i+1
        plt.imshow(img)  # Display the image
        plt.axis('off')  # Turn off the axes
        plt.title(f"Recommendation {i + 1}")  # Add a title for each image

    plt.tight_layout()
    plt.show()  # Show all images

# Example Usage
display_images(filenames, indices)

- These are the recommended images for the input image