<font color="orange"> Image Similarity Approaches </font>

Histogram-based approach
-   Histograms capture the distribution of pixel values in an image.
-   By comparing the histograms of two images, you can measure their similarity.
-   Two metrics commonly used to compare histograms are:
   - Histogram Intersection
   - Histogram Correlation
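
To make the two metrics concrete, here is a minimal pure-Python sketch of how they can be computed on toy histograms. The function names and the 4-bin example values are illustrative assumptions; in practice OpenCV's cv2.compareHist provides both metrics.

```python
import math

def histogram_intersection(h1, h2):
    """Sum of bin-wise minima. For histograms that each sum to 1,
    the result lies in [0, 1] and reaches 1 when they are identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def histogram_correlation(h1, h2):
    """Pearson correlation of the bin values, in the range [-1, 1]."""
    mean1 = sum(h1) / len(h1)
    mean2 = sum(h2) / len(h2)
    cov = sum((a - mean1) * (b - mean2) for a, b in zip(h1, h2))
    var1 = sum((a - mean1) ** 2 for a in h1)
    var2 = sum((b - mean2) ** 2 for b in h2)
    return cov / math.sqrt(var1 * var2)

# Toy 4-bin histograms (hypothetical values), each normalised to sum to 1
hist_a = [0.1, 0.4, 0.4, 0.1]
hist_b = [0.4, 0.1, 0.1, 0.4]

print("intersection (a, a):", histogram_intersection(hist_a, hist_a))
print("intersection (a, b):", histogram_intersection(hist_a, hist_b))
print("correlation  (a, a):", histogram_correlation(hist_a, hist_a))
print("correlation  (a, b):", histogram_correlation(hist_a, hist_b))
```

Comparing hist_a with itself gives roughly 1 for both metrics. Comparing it with the mirrored hist_b gives an intersection of about 0.4 and a correlation of about -1.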


-  We use grayscale histograms for thresholding. 
-  We use histograms for white balancing. 
-  We use color histograms for object tracking in images, such as with the CamShift algorithm


<font color="orange">cv2.calcHist(images, channels, mask, histSize, ranges) </font>



-   images: The source image (or images), passed as a list, for example [image].

-   channels: A list of indexes, where we specify the index of the channel we want to compute a histogram for.

-   To compute a histogram of a grayscale image, the list would be [0].

-   To compute a histogram for all three channels of a color image, the channels list would be [0, 1, 2]. Note that cv2.imread loads color images in BGR order, so index 0 is blue, 1 is green, and 2 is red.

-   mask: If a mask is provided, the histogram is computed only for the pixels where the mask is non-zero. If we do not have a mask or do not want to apply one, we can just provide a value of None.

-   histSize: This is the number of bins we want to use when computing a histogram. Again, this is a list, with one entry for each channel we are computing a histogram for.

-   The number of bins does not have to be the same for every channel. Here is an example of 32 bins for each channel: [32, 32, 32].

-   ranges: The range of possible pixel values. Normally, this is [0, 256] for each channel. That is not a typo: the upper bound of the range in cv2.calcHist is non-inclusive, so you'll want to provide a value of 256 rather than 255. If you are using a color space other than RGB (such as HSV), the ranges might be different. For example, the hue channel of an 8-bit HSV image in OpenCV uses [0, 180].


In [5]:
import cv2
# Load images
image1 = cv2.imread("test1.png")
image2 = cv2.imread("test2.png")

# image2 = cv2.imread("test3.png")
# new_width = 768
# new_height = 1366
# image2=cv2.resize(image2,(new_width,new_height))



hist_img1 = cv2.calcHist([image1], [0, 1, 2], None, [256, 256, 256], [0, 256, 0, 256, 0, 256])
hist_img1[255, 255, 255] = 0 #ignore all white pixels
cv2.normalize(hist_img1, hist_img1, alpha=0, beta=1, norm_type=cv2.NORM_MINMAX)


hist_img2 = cv2.calcHist([image2], [0, 1, 2], None, [256, 256, 256], [0, 256, 0, 256, 0, 256])
hist_img2[255, 255, 255] = 0  #ignore all white pixels
cv2.normalize(hist_img2, hist_img2, alpha=0, beta=1, norm_type=cv2.NORM_MINMAX)

# Compare the two histograms using the correlation metric
metric_val = cv2.compareHist(hist_img1, hist_img2, cv2.HISTCMP_CORREL)
print("Similarity Score: ", round(metric_val, 2))

Similarity Score:  1.0


<font color ="orange"> Structural Similarity Index (SSIM) </font>



-   SSIM is a widely used metric that assesses the structural similarity between two images.

-   It considers luminance, contrast, and structure, giving a score between -1 (dissimilar) and 1 (identical).
-    The scikit-image library in Python offers an SSIM implementation.
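
To show how luminance, contrast, and structure enter the score, here is a simplified sketch that evaluates the SSIM formula once over a whole patch. This is an assumption made for illustration: scikit-image computes the same expression over local sliding windows and averages the results, so its scores will generally differ. The constants k1 = 0.01 and k2 = 0.03 are the commonly used defaults.

```python
def global_ssim(x, y, data_range=255.0, k1=0.01, k2=0.03):
    """SSIM evaluated once over two equally sized lists of pixel values."""
    n = len(x)
    mu_x = sum(x) / n                        # mean intensity (luminance)
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n   # variance (contrast)
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n  # structure
    c1 = (k1 * data_range) ** 2              # stabilising constants
    c2 = (k2 * data_range) ** 2
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    )

# Hypothetical 4-pixel patch and its inverted version
patch = [10, 50, 200, 255]
inverted = [255 - p for p in patch]

print("identical:", round(global_ssim(patch, patch), 3))
print("inverted: ", round(global_ssim(patch, inverted), 3))
```

An identical patch scores 1, while the inverted patch has negative covariance with the original and therefore scores below 0.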



In [2]:
import cv2
from skimage import metrics

# Load images
image1 = cv2.imread("test1.png")
image2 = cv2.imread("test2.png")

image2 = cv2.resize(image2, (image1.shape[1], image1.shape[0]), interpolation = cv2.INTER_AREA)
print(image1.shape, image2.shape)

# Convert images to grayscale
image1_gray = cv2.cvtColor(image1, cv2.COLOR_BGR2GRAY)
image2_gray = cv2.cvtColor(image2, cv2.COLOR_BGR2GRAY)

# Calculate SSIM (with full=True the function also returns the per-pixel SSIM image)
ssim_score = metrics.structural_similarity(image1_gray, image2_gray, full=True)
print("SSIM Score: ", round(ssim_score[0], 2))

(768, 1366, 3) (768, 1366, 3)
SSIM Score:  1.0


<font color ="orange"> Drawbacks of SSIM </font>


-   The main drawback of the SSIM approach compared to the histogram approach is that both images must have the same dimensions, so one of them usually has to be resized first.

-   The SSIM score can also be very low even for images that look similar overall. Background subtraction and transparency removal can help improve the similarity score.

<font color="orange"> Feature-based approach </font>


-   These methods extract salient features from images, such as edges, corners, or key points. 

-   Techniques like Scale-Invariant Feature Transform (SIFT) and Speeded-Up Robust Features (SURF) identify distinctive points in images, which can then be compared across images.

-   Scale-Invariant Feature Transform (SIFT)
-   https://medium.com/@deepanshut041/introduction-to-sift-scale-invariant-feature-transform-65d7f3a72d40

-   Speeded-Up Robust Features (SURF)
-   https://medium.com/@deepanshut041/introduction-to-surf-speeded-up-robust-features-c7396d6e7c4e

SURF Implementation (Speeded-Up Robust Features)

SURF is composed of two steps:
-   Feature Extraction
-   Feature Description

<font color="orange">Deep learning based approach </font>



Deep learning has revolutionized image similarity tasks. 

Using pre-trained convolutional neural networks (CNNs) like ResNet, VGG, and Inception, you can extract deep features from images.

CLIP (Contrastive Language-Image Pre-Training) from OpenAI is a multimodal zero-shot image classifier that achieves impressive results in a wide range of domains with no fine-tuning. It applies recent advancements in large-scale transformers like GPT-3 to the vision arena.

We can fine-tune these models on our own image and text data with the regular SentenceTransformers training code. ScrapeHero helps in preparing your own image dataset to train these models. Its web crawling service can crawl complex websites and provides high-quality data.

https://medium.com/@jeremy-k/unlocking-openai-clip-part-2-image-similarity-bf0224ab5bb0

In [None]:
#!pip install git+https://github.com/openai/CLIP.git
#pip install open_clip_torch
#pip install sentence_transformers

In [8]:


import torch
import open_clip
import cv2
from sentence_transformers import util
from PIL import Image

# Image processing model
device = "cuda" if torch.cuda.is_available() else "cpu"
model, _, preprocess = open_clip.create_model_and_transforms('ViT-B-16-plus-240', pretrained="laion400m_e32")
model.to(device)
model.eval()

def imageEncoder(img):
    # OpenCV loads images in BGR order; convert to RGB before handing them to PIL
    img1 = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
    img1 = preprocess(img1).unsqueeze(0).to(device)
    # No gradients are needed for inference
    with torch.no_grad():
        img1 = model.encode_image(img1)
    return img1

def generateScore(image1_path, image2_path):
    # Read the images from the paths passed in, and fail early if either is missing
    image1 = cv2.imread(image1_path)
    image2 = cv2.imread(image2_path)
    if image1 is None or image2 is None:
        raise FileNotFoundError("Could not read one of the input images; check the file paths")
    img1 = imageEncoder(image1)
    img2 = imageEncoder(image2)
    # Cosine similarity between the two embeddings, scaled to a percentage
    cos_scores = util.pytorch_cos_sim(img1, img2)
    score = round(float(cos_scores[0][0])*100, 2)
    return score

print("Similarity Score: ", generateScore("blank.jpg", "add_text.jpg"))

https://medium.com/@developerRegmi/image-similarity-comparison-using-vgg16-deep-learning-model-a663a411cd24


The similarity between images can then be computed from the cosine similarity or the Euclidean distance of these feature vectors. To improve accuracy, we can preprocess the images first (for example, by resizing and normalizing them).
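
The two measures behave differently, which a small sketch can show. The three-element feature vectors below are made-up stand-ins for real CNN embeddings, and the function names are illustrative.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between u and v; ignores vector length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def euclidean_distance(u, v):
    """Straight-line distance between u and v; sensitive to vector length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical feature vectors
feat_a = [1.0, 2.0, 3.0]
feat_b = [2.0, 4.0, 6.0]    # same direction as feat_a, twice the length
feat_c = [-3.0, 0.0, 1.0]   # orthogonal to feat_a

print("cosine(a, b):   ", cosine_similarity(feat_a, feat_b))
print("euclidean(a, b):", euclidean_distance(feat_a, feat_b))
print("cosine(a, c):   ", cosine_similarity(feat_a, feat_c))
```

Here feat_a and feat_b have a cosine similarity of about 1 even though their Euclidean distance is well above 0. This is one reason embeddings are often normalized to unit length before Euclidean distances are compared.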

Applications


The main applications of the image similarity technique include e-commerce product matching, image retrieval, object recognition, and face recognition.

Image similarity, for example, is used in image retrieval to find images similar to a query image.
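
A retrieval step of this kind might look like the following sketch, which ranks a small database of feature vectors by cosine similarity to a query. The function name top_k_similar and the 2-D vectors are assumptions for illustration. A real system would use embeddings from a model such as CLIP and, for large databases, an approximate nearest-neighbour index.

```python
import numpy as np

def top_k_similar(query, database, k=2):
    """Return the indices and cosine scores of the k rows closest to the query."""
    # Normalise to unit length so the dot product equals cosine similarity
    db = database / np.linalg.norm(database, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    scores = db @ q
    order = np.argsort(-scores)[:k]   # highest score first
    return order.tolist(), scores[order].tolist()

# Hypothetical feature vectors, one row per database image
database = np.array([[1.0, 0.0],
                     [0.0, 1.0],
                     [0.9, 0.1],
                     [-1.0, 0.0]])
query = np.array([1.0, 0.0])

indices, scores = top_k_similar(query, database, k=2)
print("Most similar rows:", indices)
```

For this toy data the query matches row 0 exactly, and row 2 is the next closest.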

Image similarity can be used in object recognition to match a given object against a database of known objects. In face recognition, an image similarity algorithm identifies people by comparing their faces to a database.

Conclusion

The ability to measure image similarity is a vital component of numerous applications in today’s visually driven world. 
You can also explore Siamese networks, a special class of neural networks designed for one-shot learning and image similarity tasks.