<a href="https://colab.research.google.com/github/nazeli-terpetrosyan/Image_Similarity/blob/main/Image_Similarity.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#Determining relevant supplier websites based on the product images

For this task, I took 2 approaches: measuring image similarity with cosine distance and building a Siamese Neural Network (SNN).

#API Research

Before writing the code, I did some research regarding other projects done for image similarity. Below, I'll present some APIs that could be useful for solving the task.

**Deep AI Image Similarity API**

Deep AI's API performed well. However, using it in production would have required discussing a partnership, and possibly further contracts, with Deep AI, so it would not have been the best solution.

The [link](https://deepai.org/machine-learning-model/image-similarity) to the API webpage.

**Image similarity API published on Rapid API**

This API also performed well, but its free tier was limited to 20 calls per day. The Pro plan allowed 6000 calls/month for $1.99, which would not have been enough in the long run.

The [link](https://rapidapi.com/dyapi-dyapi-default/api/image-similarity1) to the API webpage.

**Therefore,** after finalizing the initial research and not finding any appropriate APIs, I started working on the code.

#Importing the Libraries

In [None]:
import numpy as np
import pandas as pd
import cv2
from google.colab.patches import cv2_imshow
from PIL import Image
import os

import tensorflow as tf
import tensorflow_hub as hub

#Approach 1: Cosine Distance

In [None]:
from scipy.spatial import distance
metric = 'cosine' #the metric passed to cdist for calculating the distance (other options include 'euclidean' and 'cityblock')

In [None]:
model_url = "https://tfhub.dev/tensorflow/efficientnet/lite0/feature-vector/2"

IMAGE_SHAPE = (224, 224)

layer = hub.KerasLayer(model_url)
emb_model = tf.keras.Sequential([layer])

In [None]:
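def extract(file):
  # Load the image as grayscale and resize it to the input size of the embedding model
  img = Image.open(file).convert('L').resize(IMAGE_SHAPE)

  # Repeat the single gray channel 3 times, since the model expects a 3-channel input
  img = np.stack((img,)*3, axis=-1)

  # Scale the pixel values to [0, 1]
  img = np.array(img)/255.0

  # Add a batch dimension and compute the embedding
  embedding = emb_model.predict(img[np.newaxis, ...])

  # Flatten the embedding into a 1-D feature vector
  return np.array(embedding).flatten()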

Sample code for cosine distance



```
img1 = extract('{IMG1 PATH HERE}')
img2 = extract('{IMG2 PATH HERE}')
distance.cdist([img1], [img2], metric)[0]
```



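The output is a cosine distance. In general, cosine distance ranges from 0 to 2; for non-negative feature vectors it stays between 0 and 1. A value of 0 means the embeddings of the 2 images point in exactly the same direction. Therefore, the larger the output, the more different the images are.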
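To make the scale of the metric concrete, here is a minimal sketch on toy vectors. The helper `cosine_distance` is a hypothetical name introduced for illustration; it follows the usual definition (1 minus the cosine similarity) and is not part of the notebook's code.

```python
import numpy as np

def cosine_distance(u, v):
    """Cosine distance: 1 minus the cosine of the angle between u and v."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

a = np.array([1.0, 2.0, 3.0])
print(cosine_distance(a, 2 * a))         # approximately 0: same direction, different length
print(cosine_distance([1, 0], [0, 1]))   # 1.0: orthogonal vectors
print(cosine_distance([1, 0], [-1, 0]))  # 2.0: opposite directions
```

Because the distance ignores vector length, two images whose embeddings differ only in magnitude are treated as identical.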

**The disadvantage** of cosine distance is that, to the algorithm, almonds against a white background and hazelnuts against a white background are 2 very similar images. This can be a disadvantage when checking for product suppliers.

A solution might be to fine-tune the embedding model for better feature extraction. Another approach is to experiment with different distance metrics.

#Approach 2: Siamese Neural Network

##Getting the Dataset

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [None]:
import zipfile

PATH = '/content/drive/MyDrive/google_images.zip' #Path to the images zip

with zipfile.ZipFile(PATH, 'r') as zip_ref:
  zip_ref.extractall('/tmp/train')

In [None]:
datadir='/tmp/train'
Categories=os.listdir(datadir)

img_data=[]
label=[]
SIZE = 128 #The size of the images

for i in Categories:
  print(f'loading... category : {i}')
  path=os.path.join(datadir, i)
  for img in os.listdir(path):
    img_array = cv2.imread(os.path.join(path,img), cv2.IMREAD_COLOR)
    if img_array is None: #cv2.imread returns None for files it cannot read
      continue
    img_array = cv2.resize(img_array, (SIZE, SIZE))
    img_data.append(img_array)
    label.append(Categories.index(i))
  print(f'loaded category: {i} successfully')

loading... category : peanuts
loaded category: peanuts successfully
loading... category : coconut
loaded category: coconut successfully
loading... category : sunflower oil
loaded category: sunflower oil successfully
loading... category : oats
loaded category: oats successfully
loading... category : hazelnuts
loaded category: hazelnuts successfully
loading... category : random
loaded category: random successfully
loading... category : maple syrup
loaded category: maple syrup successfully
loading... category : cocoa
loaded category: cocoa successfully
loading... category : dates
loaded category: dates successfully
loading... category : almonds
loaded category: almonds successfully


##Generating Image Pairs

This function generates image pairs only for images of the target (in our case, almonds).

In [None]:
def make_pairs_target(images, labels, target):
  # target is a category name (e.g. 'almonds'); labels holds integer indices into Categories
  pairImages = []
  pairLabels = []

  targetLabel = Categories.index(target)
  posIdx = np.where(labels == targetLabel)[0]
  negIdx = np.where(labels != targetLabel)[0]

  for img in images[posIdx]:
    # randomly pick an image that belongs to the target class (creating a positive pair)
    posImage = images[np.random.choice(posIdx)]
    pairImages.append([img, posImage])
    pairLabels.append([1])

    # randomly pick an image that belongs to a different class (creating a negative pair)
    negImage = images[np.random.choice(negIdx)]
    pairImages.append([img, negImage])
    pairLabels.append([0])

  return (np.array(pairImages), np.array(pairLabels))

This function generates image pairs with combinations of all categories.

In [None]:
def make_pairs(images, labels):
  pairImages = []
  pairLabels = []

  # for every class label, collect the indices of the images that belong to it
  numClasses = len(np.unique(labels))
  idx = [np.where(labels == i)[0] for i in range(0, numClasses)]

  for idxA in range(len(images)):
    currentImage = images[idxA]
    currentLabel = labels[idxA]

    # randomly pick an image that belongs to the same class (creating a positive pair)
    idxB = np.random.choice(idx[currentLabel])
    posImage = images[idxB]
    pairImages.append([currentImage, posImage])
    pairLabels.append([1])

    # randomly pick an image that belongs to a different class (creating a negative pair)
    negIdx = np.where(labels != currentLabel)[0]
    negImage = images[np.random.choice(negIdx)]
    pairImages.append([currentImage, negImage])
    pairLabels.append([0])

  return (np.array(pairImages), np.array(pairLabels))
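As a quick sanity check of the pairing idea, here is a self-contained sketch on toy data. `make_pairs_sketch` is a hypothetical helper, not the function used for training; it assumes integer labels, uses a seeded generator for reproducibility, and additionally avoids pairing an image with itself.

```python
import numpy as np

def make_pairs_sketch(images, labels, seed=0):
    """Build one positive and one negative pair per image (illustrative only)."""
    rng = np.random.default_rng(seed)
    pair_images, pair_labels = [], []
    for i, (image, lab) in enumerate(zip(images, labels)):
        same = np.where(labels == lab)[0]
        same = same[same != i]              # avoid pairing an image with itself
        other = np.where(labels != lab)[0]
        if len(same) == 0 or len(other) == 0:
            continue                        # class too small to form both pairs
        pair_images.append([image, images[rng.choice(same)]])
        pair_labels.append([1])
        pair_images.append([image, images[rng.choice(other)]])
        pair_labels.append([0])
    return np.array(pair_images), np.array(pair_labels)

# Toy data: 6 "images" of size 4x4, three classes with two images each
toy_images = np.arange(6 * 4 * 4, dtype=np.float32).reshape(6, 4, 4)
toy_labels = np.array([0, 0, 1, 1, 2, 2])
pairs, pair_labels = make_pairs_sketch(toy_images, toy_labels)
print(pairs.shape)         # (12, 2, 4, 4)
print(pair_labels.mean())  # 0.5
```

Each image contributes exactly one positive and one negative pair, so the resulting dataset is balanced by construction.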

##Preprocessing

In [None]:
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(img_data, label, test_size = 0.2, random_state = 1)

x_train, x_test = np.array(x_train) / 255.0, np.array(x_test) / 255.0

x_train = np.expand_dims(x_train, axis=-1)
x_test = np.expand_dims(x_test, axis=-1)

(pairTrain, labelTrain) = make_pairs(np.array(x_train), np.array(y_train))
(pairTest, labelTest) = make_pairs(np.array(x_test), np.array(y_test))



At first, the data was generated with the *make_pairs_target()* function. However, because it produced too little data, the model was later trained with the data generated by the *make_pairs()* function.

##SNN Architecture

In [None]:
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Dropout
from tensorflow.keras.layers import GlobalAveragePooling2D
from tensorflow.keras.layers import MaxPooling2D
from tensorflow.keras.layers import Lambda
from tensorflow.keras.layers import BatchNormalization
from tensorflow.keras.layers import AveragePooling2D
from tensorflow.keras.layers import Flatten

import tensorflow.keras.backend as K

###Architecture #1

In [None]:
IMG_SHAPE = (128, 128, 1)

BATCH_SIZE = 64
EPOCHS = 100

In [None]:
def build_siamese_model(inputShape, embeddingDim=48):
	inputs = Input(inputShape)

	x = Conv2D(64, (2, 2), padding="same", activation="relu")(inputs)
	x = MaxPooling2D(pool_size=(2, 2))(x)
	x = Dropout(0.3)(x)
	x = Conv2D(64, (2, 2), padding="same", activation="relu")(x)
	x = MaxPooling2D(pool_size=2)(x)
	x = Dropout(0.3)(x)
 
	pooledOutput = GlobalAveragePooling2D()(x)
	outputs = Dense(embeddingDim)(pooledOutput)

	model = Model(inputs, outputs)

	return model

In [None]:
def euclidean_distance(vectors):
	(featsA, featsB) = vectors

	sumSquared = K.sum(K.square(featsA - featsB), axis=1,
		keepdims=True)

	return K.sqrt(K.maximum(sumSquared, K.epsilon()))
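For reference, the same computation can be sketched in plain NumPy to show the shapes involved. `euclidean_distance_np` is a hypothetical name, and the default `eps=1e-7` is an assumption that mirrors the Keras backend's default epsilon.

```python
import numpy as np

def euclidean_distance_np(feats_a, feats_b, eps=1e-7):
    """Row-wise Euclidean distance between two batches of embeddings."""
    sum_squared = np.sum(np.square(feats_a - feats_b), axis=1, keepdims=True)
    # Clamp before the square root so identical embeddings do not yield exactly 0
    return np.sqrt(np.maximum(sum_squared, eps))

feats_a = np.array([[0.0, 0.0], [1.0, 1.0]])
feats_b = np.array([[3.0, 4.0], [1.0, 1.0]])
d = euclidean_distance_np(feats_a, feats_b)
print(d.shape)  # (2, 1): one distance per pair in the batch
print(d[0, 0])  # 5.0
```

The clamp means that two identical embeddings give a small positive distance rather than 0, which keeps the gradient of the square root finite during training.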

In [None]:
imgA = Input(shape=IMG_SHAPE)
imgB = Input(shape=IMG_SHAPE)
featureExtractor = build_siamese_model(IMG_SHAPE)
featsA = featureExtractor(imgA)
featsB = featureExtractor(imgB)

In [None]:
dist = Lambda(euclidean_distance)([featsA, featsB])
outputs = Dense(1, activation="sigmoid")(dist)
model = Model(inputs=[imgA, imgB], outputs=outputs)

In [None]:
# The network ends in a single sigmoid unit and the pair labels are 0/1,
# so binary cross-entropy is the matching loss
model.compile(loss="binary_crossentropy", optimizer="adam",
	metrics=["accuracy"])

history = model.fit(
	[pairTrain[:, 0][:, :, :, 0], pairTrain[:, 1][:, :, :, 0]], labelTrain[:],
	validation_data=([pairTest[:, 0][:, :, :, 0], pairTest[:, 1][:, :, :, 0]], labelTest[:]),
	batch_size=BATCH_SIZE,
	epochs=EPOCHS)

###Architecture #2

In [None]:
input = Input((128, 128, 1))

x = BatchNormalization()(input)
x = Conv2D(16, (2, 2), activation="tanh")(x)
x = AveragePooling2D(pool_size=(2, 2))(x)
x = Conv2D(32, (2, 2), activation="tanh")(x)
x = AveragePooling2D(pool_size=(2, 2))(x)
x = Conv2D(64, (2, 2), activation="tanh")(x)
x = AveragePooling2D(pool_size=(2, 2))(x)
x = Conv2D(128, (2, 2), activation="tanh")(x)
x = AveragePooling2D(pool_size=(2, 2))(x)
x = Conv2D(256, (2, 2), activation="tanh")(x)
x = AveragePooling2D(pool_size=(2, 2))(x)
x = Flatten()(x)

x = BatchNormalization()(x)
x = Dense(10, activation="tanh", kernel_regularizer="l2")(x)

embedding_network = Model(input, x)

input_1 = Input((128, 128, 1))
input_2 = Input((128, 128, 1))

tower_1 = embedding_network(input_1)
tower_2 = embedding_network(input_2)

merge_layer = Lambda(euclidean_distance)([tower_1, tower_2])
normal_layer = BatchNormalization()(merge_layer)
output_layer = Dense(1, activation="sigmoid")(normal_layer)

siamese = Model(inputs=[input_1, input_2], outputs=output_layer)

In [None]:
siamese.compile(loss="binary_crossentropy", optimizer="adam",
	metrics=["accuracy"])

history = siamese.fit(
	[pairTrain[:, 0][:, :, :, 0], pairTrain[:, 1][:, :, :, 0]], labelTrain[:],
	validation_data=([pairTest[:, 0][:, :, :, 0], pairTest[:, 1][:, :, :, 0]], labelTest[:]),
	batch_size=BATCH_SIZE, 
  epochs=EPOCHS)

Training both models with different losses and activation functions led to one of two results: either the model was not learning, or it was overfitting. Both results were a consequence of insufficient data.

This showed that for measuring similarity for a specific product (almonds in this case), SNNs are not the best solution. However, without the restriction to a specific product and with more data available, I believe it is quite possible that SNNs will perform better than other methods (in this case, cosine distance).

#TensorFlow Image Similarity

TensorFlow recently introduced the TensorFlow Similarity package to make training image similarity models easier. Below is the demo version on the MNIST dataset.

The model below returns the 5 closest images to the given image. This is not exactly what this task requires, but I believe the model can be changed to solve this task.

Due to limited time and resources regarding the package, I was not able to review it thoroughly. However, I believe TensorFlow Similarity can be quite useful for solving the task in the future.

In [None]:
!pip install --upgrade-strategy=only-if-needed tensorflow_similarity[tensorflow] 

In [None]:
from tensorflow.keras import layers

# Embedding output layer with L2 norm
from tensorflow_similarity.layers import MetricEmbedding 
# Specialized metric loss
from tensorflow_similarity.losses import MultiSimilarityLoss 
# Sub classed keras Model with support for indexing
from tensorflow_similarity.models import SimilarityModel
# Data sampler that pulls datasets directly from tf dataset catalog
from tensorflow_similarity.samplers import TFDatasetMultiShotMemorySampler
# Nearest neighbor visualizer
from tensorflow_similarity.visualization import viz_neigbors_imgs


# Data sampler that generates balanced batches from MNIST dataset
sampler = TFDatasetMultiShotMemorySampler(dataset_name='mnist', classes_per_batch=10)

# Build a Similarity model using standard Keras layers
inputs = layers.Input(shape=(28, 28, 1))
x = layers.Rescaling(1/255)(inputs)
x = layers.Conv2D(64, 3, activation='relu')(x)
x = layers.Flatten()(x)
x = layers.Dense(64, activation='relu')(x)
outputs = MetricEmbedding(64)(x)

# Build a specialized Similarity model
model = SimilarityModel(inputs, outputs)

# Train Similarity model using contrastive loss
model.compile('adam', loss=MultiSimilarityLoss())
model.summary()
model.fit(sampler, epochs=5)

# Index 100 embedded MNIST examples to make them searchable
sx, sy = sampler.get_slice(0,100)
model.index(x=sx, y=sy, data=sx)

# Find the top 5 most similar indexed MNIST examples for a given example
qx, qy = sampler.get_slice(3713, 1)
nns = model.single_lookup(qx[0])

# Visualize the query example and its top 5 neighbors
#viz_neigbors_imgs(qx[0], qy[0], nns)

#Getting website relevance

**A description of the approach:** The code goes over the images scraped from the website and compares them with the images of the target (in this case, almonds). Every image is compared n times (VAL_NUM) with images randomly chosen from the target images folder, and the mean of the resulting distances is taken. An image counts as similar if this mean is at most SIM_THRESHOLD. The fraction of similar images on the website then determines the website's relevance: the website is relevant if that fraction is at least RELEVANT_THRESHOLD.
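The decision rule described above can be sketched in isolation from the embedding model. The function `website_relevance` and the example distances are hypothetical and made up for illustration; only the two threshold values come from this notebook.

```python
import numpy as np

SIM_THRESHOLD = 0.65       # an image is similar if its mean distance is at most this
RELEVANT_THRESHOLD = 0.25  # a website is relevant if at least this fraction is similar

def website_relevance(mean_distances,
                      sim_threshold=SIM_THRESHOLD,
                      relevant_threshold=RELEVANT_THRESHOLD):
    """Return (fraction of similar images, relevance flag) for one website."""
    mean_distances = np.asarray(mean_distances, dtype=float)
    if mean_distances.size == 0:
        return 0.0, False                  # no images, nothing to judge
    similar = mean_distances <= sim_threshold
    fraction = similar.mean()
    return fraction, fraction >= relevant_threshold

# Made-up mean distances for four website images
fraction, relevant = website_relevance([0.40, 0.70, 0.90, 0.80])
print(fraction, relevant)  # 0.25 True
```

Only one of the four images falls under the similarity threshold, which is exactly enough to reach the 0.25 relevance threshold.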

In [None]:
datadir='/content/drive/MyDrive/Data/Irrelevant/Example 1' #the path to the scraped website images

TARGET = 'almonds' #the target can be easily modified to check for other products
validation_datadir='/tmp/train/' + TARGET #the path to the folder where target's images are located(for comparison)
VAL_NUM = 15 #the number of images that the website image will be compared with

#The threshold on the mean cosine distance at or below which an image is considered similar
#In this case, it's 0.65 since the validation data might contain irrelevant images that throw off the mean
#When you're sure that the validation data is clean and reliable, you can lower the threshold
SIM_THRESHOLD = 0.65

#The minimum fraction of website images that must be similar to the target for the website to count as relevant
#Currently, it's at 0.25 since I believe that if 1/4 of a website's images are relevant,
#then the website has a large chance of being a potential supplier
RELEVANT_THRESHOLD = 0.25

imgs = os.listdir(validation_datadir)
label = np.array(label)
target_idx = np.where(label == Categories.index(TARGET))[0]

sim_arr = []

for img in os.listdir(datadir):
  img1 = extract(os.path.join(datadir,img))
  img_mean = 0
  for i in range(VAL_NUM):
    #The 2nd image is chosen randomly from the validation folder, however specific images can be used too
    img2 = extract(os.path.join(validation_datadir, np.random.choice(imgs)))
    #Here cosine distance is used, however the comparison method can easily be changed
    img_mean += distance.cdist([img1], [img2], metric)[0, 0] #cdist returns a 1x1 matrix here
  img_mean /= VAL_NUM
  sim_arr.append(1 if img_mean <= SIM_THRESHOLD else 0)

sim_arr = np.array(sim_arr)
prct = len(sim_arr[sim_arr == 1])/len(sim_arr)
print("Relevant" if prct >= RELEVANT_THRESHOLD else "Irrelevant")
print("Percentage: " + str(prct))

Irrelevant
Percentage: 0.0


**Some Results**

*Relevant Example 1:* 0.47 (Relevant)

*Relevant Example 2:* 0.45454545454545453 (Relevant)

*Relevant Example 3:* 0.25 (Relevant)

*Irrelevant Example 1:* 0.0 (Irrelevant)

*Irrelevant Example 2:* 0.02857142857142857 (Irrelevant)

*Irrelevant Example 3:* 0.06172839506172839 (Irrelevant)

**Remarks:** Currently, the algorithm of the website relevance is quite slow as it needs to do a significant number of validations for better accuracy. However, as the image similarity model's accuracy increases, there will be less need for numerous validations, reducing the execution time.
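One possible way to cut the execution time, sketched under assumptions: in the loop above, every website image triggers VAL_NUM fresh embedding extractions of target images. If the target embeddings were computed once and cached, all comparisons could be done with a single matrix product. The function `mean_cosine_distances` is hypothetical, and the random arrays stand in for real embeddings. Note that this compares every website image against the same fixed set of target images instead of a new random sample per image.

```python
import numpy as np

def mean_cosine_distances(site_embs, target_embs):
    """Mean cosine distance from each website embedding to all target embeddings."""
    site = site_embs / np.linalg.norm(site_embs, axis=1, keepdims=True)
    target = target_embs / np.linalg.norm(target_embs, axis=1, keepdims=True)
    dist = 1.0 - site @ target.T   # shape: (n_site_images, n_target_images)
    return dist.mean(axis=1)

rng = np.random.default_rng(0)
target_embs = rng.random((15, 8))  # stand-in for target embeddings extracted once
site_embs = rng.random((4, 8))     # stand-in for the website image embeddings
means = mean_cosine_distances(site_embs, target_embs)
print(means.shape)  # (4,)
```

The resulting per-image means could then be compared against SIM_THRESHOLD in the same way as before.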