<a href="https://colab.research.google.com/github/ziababar/demos/blob/master/derivative_security/notebooks/image_similarity.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Background

The objective of this notebook is to measure how similar two images are. This can be used to identify images that were derived from a source image.


# References

The code in this notebook has been adapted from the following:

1.   [Check if two images are equal with Opencv and Python](https://pysource.com/2018/07/19/check-if-two-images-are-equal-with-opencv-and-python/)
2.   [Find similarities between two images with Opencv and Python](https://pysource.com/2018/07/20/find-similarities-between-two-images-with-opencv-and-python/)
3.   [Detect how similar two images are with Opencv and Python](https://pysource.com/2018/07/20/detect-how-similar-two-images-are-with-opencv-and-python/)
4.   [Check if a set of images match the original one with Opencv and Python](https://pysource.com/2018/07/27/check-if-a-set-of-images-match-the-original-one-with-opencv-and-python/)

# Libraries

*   OpenCV
*   NumPy
*   PyPlot
*   UrlLib
*   CV2_IMSHOW



In [None]:
# We need to downgrade OpenCV because some non-free features (such as SIFT, used below) are not available in the latest version
# First uninstall OpenCV and then install the older version

!pip uninstall opencv-python -y
!pip install opencv-contrib-python==3.4.2.17 --force-reinstall


In [None]:
import cv2
import numpy as np

from urllib.request import urlopen
from google.colab.patches import cv2_imshow


# Data Sources

Load several image files from the GitHub repository. These images are:

*   The original image
*   A copied image
*   A rotated image
*   A mixed color image
*   A sunburst image
*   A textured image

We now download all the images.

In [None]:
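# Download the image, convert the raw bytes to a NumPy array, and decode it into an OpenCV (BGR) image
def url_to_image(url):
    resp = urlopen(url)
    image = np.asarray(bytearray(resp.read()), dtype="uint8")
    image = cv2.imdecode(image, cv2.IMREAD_COLOR)
    # cv2.imdecode returns None when the bytes cannot be decoded as an image
    if image is None:
        raise ValueError("Could not decode an image from " + url)
    return image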

In [None]:
# Download the original image
original = url_to_image("https://raw.githubusercontent.com/ziababar/demos/master/derivative_security/data/original_golden_bridge-300x169.jpg")
cv2_imshow(original)


In [None]:
# Download the duplicate image
duplicate = url_to_image("https://raw.githubusercontent.com/ziababar/demos/master/derivative_security/data/original_golden_bridge-300x169.jpg")
cv2_imshow(duplicate)


In [None]:
# Download the rotated image
rotated = url_to_image("https://raw.githubusercontent.com/ziababar/demos/master/derivative_security/data/original_golden_bridge-169x300.jpg")
cv2_imshow(rotated)

In [None]:
# Download the mixed color image
mixed_colors = url_to_image("https://raw.githubusercontent.com/ziababar/demos/master/derivative_security/data/mixed_colors-1024x575.jpg")
cv2_imshow(mixed_colors)


In [None]:
# Download the sunburst image
sunburst = url_to_image("https://raw.githubusercontent.com/ziababar/demos/master/derivative_security/data/sunburst-1024x575.jpg")
cv2_imshow(sunburst)


In [None]:
# Download the textured image
textured = url_to_image("https://raw.githubusercontent.com/ziababar/demos/master/derivative_security/data/textured-1024x575.jpg")
cv2_imshow(textured)


# Image Processing
---

There are multiple cases to consider when processing and comparing images.


## Case 1 - Images are identical

Images are identical if they meet all of the following criteria:

1.   The image size is the same AND
2.   The number of channels is the same AND
3.   The difference between the two images is a completely black image
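
The three criteria above can be folded into a single helper. The following is a minimal sketch that uses only NumPy and small synthetic arrays in place of the downloaded images; the function name `images_identical` is a hypothetical name chosen for illustration, not part of OpenCV.

```python
import numpy as np

def images_identical(img_a, img_b):
    """Return True when two images have the same shape and identical pixel values."""
    # Criteria 1 and 2: same height, width and number of channels.
    if img_a.shape != img_b.shape:
        return False
    # Criterion 3: every value matches, i.e. the difference image is all zeros.
    return bool(np.array_equal(img_a, img_b))

# Synthetic 4x6 BGR images stand in for the downloaded files.
base = np.full((4, 6, 3), 120, dtype=np.uint8)
copy = base.copy()
edited = base.copy()
edited[0, 0, 2] = 121  # change one channel of one pixel
resized = np.full((6, 4, 3), 120, dtype=np.uint8)

print(images_identical(base, copy))     # True
print(images_identical(base, edited))   # False
print(images_identical(base, resized))  # False
```

Checking the shape first matters because an element-wise difference is only defined for images of the same size.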





In [None]:
# Check if both images have the same size and number of channels
if original.shape == duplicate.shape:
    print("The images have the same size and channels")

In [None]:
# cv2.absdiff(image1, image2) computes, for each pixel, the absolute difference between the two images.
# It is used here instead of cv2.subtract(), which clips negative results to 0 and would therefore
# miss pixels where the second image is brighter than the first.
difference = cv2.absdiff(original, duplicate)
b, g, r = cv2.split(difference)

In [None]:
# A color image has 3 channels (blue, green and red),
# so the difference is computed for each channel and we need to check that all three channels are black.
# If they are, we can say that the images are equal.

if cv2.countNonZero(b) == 0 and cv2.countNonZero(g) == 0 and cv2.countNonZero(r) == 0:
    print("The images are completely Equal")

If the images are equal, the result will be a black image (which means each pixel will have value 0).
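
One caveat applies to 8-bit images: a one-directional saturating subtraction such as `cv2.subtract` clips negative results to 0, so a black result only shows that the first image is nowhere brighter than the second. An absolute difference such as `cv2.absdiff` is zero only where the images agree. The sketch below emulates both operations with NumPy on synthetic arrays; the helper names are hypothetical and only illustrate the arithmetic.

```python
import numpy as np

def saturating_subtract(a, b):
    """Emulate 8-bit saturating subtraction: negative results are clipped to 0."""
    diff = a.astype(np.int16) - b.astype(np.int16)
    return np.clip(diff, 0, 255).astype(np.uint8)

def absolute_difference(a, b):
    """Per-value |a - b|, which is zero only where the two images agree."""
    diff = a.astype(np.int16) - b.astype(np.int16)
    return np.abs(diff).astype(np.uint8)

# Two clearly different 2x2 BGR images.
darker = np.full((2, 2, 3), 50, dtype=np.uint8)
brighter = np.full((2, 2, 3), 200, dtype=np.uint8)

one_way = saturating_subtract(darker, brighter)
both_ways = absolute_difference(darker, brighter)

print(np.count_nonzero(one_way))    # 0: looks black although the images differ
print(np.count_nonzero(both_ways))  # 12: every value differs
```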


In [None]:
cv2_imshow(difference)

## Case 2 - Images are similar but not identical

In some cases, a derived image may not be identical to the source image.

Here we take multiple derived images, each with a different filter applied to it (sunburst, color changes, texture, etc.).

Through feature detection and feature matching, we can find derived images which are similar to the source image.

Here an approach called Scale Invariant Feature Transform (SIFT) is used to extract keypoints and compute their descriptors.

 - Keypoints are locations in the image that are determined based on measures of their stability.
 - Descriptors are local image gradients at selected scale and rotation that describe each keypoint region.
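
To make the idea of a descriptor concrete: in OpenCV, each SIFT descriptor is a 128-dimensional vector, and two descriptors are compared by their Euclidean distance. The sketch below uses random vectors as stand-ins (they are not real SIFT output) to show that a slightly perturbed copy of a descriptor lies much closer to it than an unrelated vector does.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for one SIFT descriptor: a 128-dimensional float32 vector.
descriptor = rng.random(128, dtype=np.float32)

# The same region seen under slightly different conditions: a small perturbation.
similar = descriptor + rng.normal(0.0, 0.01, size=128).astype(np.float32)

# An unrelated region: an independent random vector.
unrelated = rng.random(128, dtype=np.float32)

def l2_distance(u, v):
    """Euclidean distance between two descriptor vectors."""
    return float(np.linalg.norm(u - v))

d_similar = l2_distance(descriptor, similar)
d_unrelated = l2_distance(descriptor, unrelated)

print(d_similar < d_unrelated)  # True
```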

SIFT was introduced in a 2004 paper by D. Lowe of the University of British Columbia. A tutorial on SIFT is available at https://docs.opencv.org/master/da/df5/tutorial_py_sift_intro.html

In [None]:
# Construct a SIFT object
sift = cv2.xfeatures2d.SIFT_create()

There are two ways to obtain keypoints and descriptors:

 - sift.detect() finds the keypoints in an image, and sift.compute() then computes the descriptors for the keypoints that were found, OR
 - sift.detectAndCompute() finds both keypoints and descriptors in a single step.

In [None]:
# Detect keypoints and compute descriptors for the original, mixed color and rotated images
kp_1, desc_1 = sift.detectAndCompute(original, None)
kp_2, desc_2 = sift.detectAndCompute(mixed_colors, None)
kp_3, desc_3 = sift.detectAndCompute(rotated, None)


OpenCV also provides the cv2.drawKeypoints() function, which draws small circles at the locations of the keypoints.

In [None]:
# Passing None as the output image lets OpenCV allocate it
img = cv2.drawKeypoints(original, kp_1, None)
cv2_imshow(img)


In [None]:
# Draw the keypoints detected in the rotated image (kp_3), not those of the original
img = cv2.drawKeypoints(rotated, kp_3, None)
cv2_imshow(img)

In [None]:
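# Create a FlannBasedMatcher, which is used to find the matches between the descriptors of the two images.
# algorithm=1 selects the KD-tree index, which is what the trees parameter applies to;
# algorithm=0 would select a linear (brute-force) search instead.
FLANN_INDEX_KDTREE = 1
index_params = dict(algorithm=FLANN_INDEX_KDTREE, trees=5)
search_params = dict()
flann = cv2.FlannBasedMatcher(index_params, search_params)

# Find the matches between the 2 images, which are stored in the list 'matches'.
# For each descriptor of the first image, knnMatch returns its k=2 nearest descriptors in the second image,
# so the list still contains many false matches.
matches = flann.knnMatch(desc_1, desc_2, k=2)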


Apply the ratio test to select only the good matches. The quality of a match is defined by its distance: the lower the distance, the more similar the two features are.

For each keypoint, the ratio test compares the distance of the best match (m) with the distance of the second-best match (n), and keeps the best match only if its distance is less than the ratio times the second-best distance.
 - Decreasing the ratio value gives fewer matches, but of higher quality.
 - Increasing the ratio value gives more matches, but also more false positives.
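
The effect of the ratio can be tried without OpenCV. The following is a minimal NumPy sketch that replaces the FLANN matcher with an exhaustive nearest-neighbour search over synthetic descriptors; the function name `ratio_test_matches` and the random data are assumptions made for illustration.

```python
import numpy as np

def ratio_test_matches(desc_a, desc_b, ratio=0.8):
    """Return (index_in_a, index_in_b) pairs whose best match passes the ratio test."""
    # Pairwise Euclidean distances, shape (len(desc_a), len(desc_b)).
    dists = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    # Indices of the two nearest descriptors in desc_b for each row of desc_a.
    nearest = np.argsort(dists, axis=1)[:, :2]
    good = []
    for i, (best, second) in enumerate(nearest):
        if dists[i, best] < ratio * dists[i, second]:
            good.append((i, int(best)))
    return good

rng = np.random.default_rng(42)
desc_a = rng.random((20, 128))
# The first 10 rows of desc_b are noisy copies of rows of desc_a; the rest are unrelated.
noisy_copies = desc_a[:10] + rng.normal(0.0, 0.01, (10, 128))
desc_b = np.vstack([noisy_copies, rng.random((10, 128))])

strict = ratio_test_matches(desc_a, desc_b, ratio=0.5)
loose = ratio_test_matches(desc_a, desc_b, ratio=0.95)

# A stricter ratio never keeps more matches than a looser one.
print(len(strict), len(loose))
```

With the strict ratio, only the ten descriptors that have a genuine counterpart survive; any extra matches admitted by the looser ratio link unrelated descriptors.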

In [None]:
good_points = []
ratio = 0.8

for m, n in matches:
    if m.distance < ratio*n.distance:
        good_points.append(m)

# Find the number of good matches found
print(len(good_points))


cv2.drawMatches() draws the matched keypoints between the two images. Here the parameters are,
 - img1 – First source image.
 - keypoints1 – Keypoints from the first source image.
 - img2 – Second source image.
 - keypoints2 – Keypoints from the second source image.
 - matches1to2 – Matches from the first image to the second one, which means that each match m links keypoints1[m.queryIdx] to keypoints2[m.trainIdx].
 - outImg – Output image; passing None lets OpenCV allocate it.
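
How a match refers back to the two keypoint lists can be sketched in plain Python. Each OpenCV match object carries a `queryIdx` (index into the first keypoint list), a `trainIdx` (index into the second) and a `distance`. Below, a hypothetical `Match` tuple mirrors those fields, and made-up (x, y) tuples stand in for keypoints (real `cv2.KeyPoint` objects expose their coordinates as `.pt`).

```python
from collections import namedtuple

# Hypothetical stand-in mirroring the queryIdx / trainIdx / distance fields of cv2.DMatch.
Match = namedtuple("Match", ["queryIdx", "trainIdx", "distance"])

# Made-up (x, y) keypoint coordinates for the two images.
keypoints1 = [(10.0, 12.0), (40.5, 33.0), (70.0, 8.5)]
keypoints2 = [(68.0, 9.0), (11.0, 13.5), (90.0, 90.0), (41.0, 30.0)]

matches1to2 = [Match(0, 1, 0.12), Match(1, 3, 0.20), Match(2, 0, 0.15)]

def matched_pairs(kps_a, kps_b, matches):
    """Resolve each match into the pair of coordinates it connects."""
    return [(kps_a[m.queryIdx], kps_b[m.trainIdx]) for m in matches]

pairs = matched_pairs(keypoints1, keypoints2, matches1to2)
for point_a, point_b in pairs:
    print(point_a, "->", point_b)
```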

In [None]:
result = cv2.drawMatches(original, kp_1, mixed_colors, kp_2, good_points, None)
cv2_imshow(result)
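
The number of good matches can also be turned into a single similarity figure. One simple option, sketched here as an assumption rather than as the notebook's own method, is to express the good matches as a percentage of the smaller keypoint count. The function name and the counts below are made up for illustration; in the notebook they would correspond to len(good_points), len(kp_1) and len(kp_2).

```python
def similarity_percentage(num_good_matches, num_keypoints_a, num_keypoints_b):
    """Good matches as a percentage of the smaller of the two keypoint counts."""
    smaller = min(num_keypoints_a, num_keypoints_b)
    if smaller == 0:
        # No keypoints in one of the images: nothing to compare.
        return 0.0
    return 100.0 * num_good_matches / smaller

# Made-up counts for illustration only.
score = similarity_percentage(150, 300, 420)
print(score)  # 50.0
```

Because at most one good match is kept per keypoint of the first image, this figure can exceed 100 when the second image has fewer keypoints than the first.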