# Activity 3: Signal Identification
### Module 2 - Computer Vision - Intelligent Robotics Implementation

**Adrián Lozano González - A01661437** 

**Israel Macías Santana - A01027029**

Import the libraries. *deque* (from *collections*) provides a fixed-size buffer of recent per-frame results, which is used later as a filter to select the most stable match. This is explained below.

In [6]:
import cv2
import numpy as np
import os
from collections import deque

The following helper functions extract keypoints and descriptors, compute Hu moments, and find the most common element in a collection:

*obtain_sift* uses a SIFT object to get the keypoints and descriptors of the provided image. SIFT operates on a grayscale image, so the image is converted first. An ORB object could extract the same kind of features instead, but with some loss of precision.

*obtain_hu* computes the image moments and derives the Hu moments from them. Before this, the image is converted to grayscale and a global threshold is applied. Because the raw moments M00, M10 and M01 are sums weighted by pixel intensity, the result depends directly on how the image is binarized, so the threshold must be chosen carefully. A fixed global threshold of 128 is used here.

*most_common_element* builds a dictionary that maps each element to the number of times it appears and returns the most frequent one. This is needed later to decide which sign was detected.

In [7]:
def obtain_sift(image):
    # SIFT works on single-channel images, so convert from BGR first
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    sift = cv2.SIFT_create()
    # descriptors is None when no keypoints are found
    keypoints, descriptors = sift.detectAndCompute(gray, None)
    return keypoints, descriptors

def obtain_hu(image):
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Global binarization: pixels above 128 become 255, the rest 0
    _, thresh = cv2.threshold(gray, 128, 255, cv2.THRESH_BINARY)
    moments = cv2.moments(thresh)
    # HuMoments returns a 7x1 array; flatten it to a vector of 7 values
    hu_moments = cv2.HuMoments(moments).flatten()
    return hu_moments

def most_common_element(buffer):
    # Count how many times each item appears
    count_dict = {}
    for item in buffer:
        count_dict[item] = count_dict.get(item, 0) + 1
    # Return the key with the highest count
    most_common_item = max(count_dict, key=count_dict.get)
    return most_common_item
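
The same "most frequent element" logic can also be sketched with `collections.Counter` from the standard library. The helper name `most_common_element_counter` and the sample buffer are hypothetical and not part of the notebook; this is only an equivalent formulation, not a replacement for the function above.

```python
from collections import Counter, deque

def most_common_element_counter(buffer):
    """Return the most frequent item in `buffer` (hypothetical helper)."""
    if not buffer:
        raise ValueError("buffer is empty")
    # most_common(1) returns a list holding one (item, count) pair
    item, _count = Counter(buffer).most_common(1)[0]
    return item

# Made-up buffer of sign indices, as the main loop would produce
recent = deque([2, 2, 0, 2, 1, 2, 2, 0, 2, 2], maxlen=10)
winner = most_common_element_counter(recent)
print(winner)  # 2
```

Raising an error on an empty buffer is a design choice of this sketch; in the notebook the function is only called once the buffer is full, so that case does not occur.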

In the following section, all the reference images are read. For each image, the keypoints, descriptors, and Hu moments are obtained using the functions described above. Two lists are created: the first stores each image's descriptors for later comparison, and the second stores each image's Hu moments, which characterize the shape of each sign.

In [8]:
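folder_path = 'signs'
# Sorted so that indices are stable between runs; extension check is case-insensitive
candidate_paths = sorted(
    os.path.join(folder_path, f)
    for f in os.listdir(folder_path)
    if f.lower().endswith(('.png', '.jpg', '.jpeg'))
)

image_paths = []  # kept aligned with the two lists below
descriptors_list = []
hu_moments_list = []
for image_path in candidate_paths:
    img = cv2.imread(image_path)
    if img is None:
        continue  # unreadable file
    _, descriptors = obtain_sift(img)
    if descriptors is None:
        continue  # no SIFT features found
    hu_moments = obtain_hu(img)
    image_paths.append(image_path)
    descriptors_list.append(descriptors)
    hu_moments_list.append(hu_moments)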
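
The Hu moments stored above are derived from the image moments, and the raw moments M00, M10 and M01 depend on pixel intensity. To make that concrete, the sketch below computes them by hand in pure Python on a tiny made-up mask. The `raw_moments` and `centroid` helpers and the pixel data are illustrative assumptions, not part of the notebook. It shows that scaling a binary mask from 0/1 to 0/255 scales the raw moments by 255 while the centroid stays the same.

```python
def raw_moments(image):
    """Return (m00, m10, m01) for a 2D list of pixel intensities."""
    m00 = m10 = m01 = 0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            m00 += value      # total intensity
            m10 += x * value  # intensity weighted by column index
            m01 += y * value  # intensity weighted by row index
    return m00, m10, m01

def centroid(image):
    """Return the intensity centroid (x, y) of the image."""
    m00, m10, m01 = raw_moments(image)
    return m10 / m00, m01 / m00

# Made-up 4x4 mask with a 2x2 square of foreground pixels
mask_01 = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
mask_255 = [[255 * v for v in row] for row in mask_01]

print(raw_moments(mask_01))   # (4, 6, 6)
print(raw_moments(mask_255))  # (1020, 1530, 1530)
print(centroid(mask_01), centroid(mask_255))  # (1.5, 1.5) (1.5, 1.5)
```

Pixels that fall on the wrong side of the threshold change all three sums, which is why the binarization step matters for the comparison.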

The video capture, SIFT object, and brute-force matcher are initialized. In addition, a buffer of 10 elements is created. Once the buffer is full, it is evaluated on every new entry to obtain its most frequent element. This acts as a "mode" filter that smooths the selection: the most repeated index in the buffer is taken as the final value. This is discussed in the main loop.

In [9]:
cap = cv2.VideoCapture(0)
sift = cv2.SIFT_create()
bf = cv2.BFMatcher()


best_match_buffer = deque(maxlen=10)
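
The effect of such a "mode" filter can be sketched without a camera. The example below is a minimal illustration that assumes a shorter window of 5 entries and a made-up sequence of per-frame indices; the names `window`, `per_frame_indices` and `filtered` are hypothetical and do not appear in the notebook.

```python
from collections import Counter, deque

window = deque(maxlen=5)  # shorter than the notebook's 10, for readability
per_frame_indices = [3, 3, 1, 3, 3, 3, 0, 3, 3, 3]  # made-up results with two outliers

filtered = []
for index in per_frame_indices:
    window.append(index)  # once full, the oldest entry is dropped automatically
    if len(window) == window.maxlen:
        mode, _count = Counter(window).most_common(1)[0]
        filtered.append(mode)

print(filtered)  # [3, 3, 3, 3, 3, 3]
```

The isolated outliers (1 and 0) never reach the output because they are outvoted inside every window that contains them.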

Main video loop. For the current video frame, the keypoints, descriptors, and Hu moments are calculated. The descriptors of each sign are compared against those of the frame using a brute-force kNN match with k=2, which returns the two nearest matches for every descriptor. A match is kept only if the distance of the best match is less than 70% of the distance of the second-best one (a ratio test). This threshold discards ambiguous matches; the lower the ratio, the more restrictive the test, which yields fewer but more reliable matches.
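
The ratio test described above (commonly known as Lowe's ratio test) can be sketched in pure Python to show the arithmetic. This is only an illustration under stated assumptions: the helpers `two_nearest` and `ratio_test` are hypothetical, and the 2D "descriptors" are made up (real SIFT descriptors have 128 dimensions and the notebook relies on the brute-force matcher instead).

```python
import math

def two_nearest(query, train):
    """Return the two smallest (distance, train_index) pairs for one descriptor."""
    distances = sorted((math.dist(query, t), j) for j, t in enumerate(train))
    return distances[:2]

def ratio_test(queries, train, ratio=0.7):
    """Return (query_index, train_index) pairs that pass the ratio test."""
    good = []
    for i, query in enumerate(queries):
        nearest = two_nearest(query, train)
        if len(nearest) < 2:
            continue  # not enough candidates to compare
        (d1, j1), (d2, _j2) = nearest
        if d1 < ratio * d2:  # best match must be clearly closer than the runner-up
            good.append((i, j1))
    return good

# Made-up 2D descriptors
sign_descriptors = [(0.0, 0.0), (5.0, 5.0)]
frame_descriptors = [(0.1, 0.0), (4.0, 5.0), (6.0, 5.0)]

print(ratio_test(sign_descriptors, frame_descriptors))  # [(0, 0)]
```

The first sign descriptor has one clearly closest neighbour and is kept. The second is equally close to two frame descriptors, so the match is ambiguous and is discarded.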

Then, the Euclidean norm of the difference between the Hu moments of the current frame and those of each sign is calculated. For each sign, we check whether:

1) More than 20 good matches were obtained.
2) The distance between Hu moments is the **smallest** found so far.

If both conditions are met, this sign is the most likely candidate so far, and *best_match_index* is set to the current index.

Finally, if the index is valid (not equal to -1, which means that no sign had enough good matches), it is appended to the aforementioned buffer. Once the buffer is full, its most repeated index is obtained (the filter), and the **detected sign** is looked up in the list of paths.
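
The selection rule described above can be isolated as a small function. The sketch below assumes that the number of good matches and the Hu distance have already been computed for every sign; the name `select_best_sign` and the sample values are hypothetical.

```python
def select_best_sign(match_counts, hu_distances, min_matches=20):
    """Return the index of the best candidate, or -1 if none qualifies."""
    best_index = -1
    best_distance = float("inf")
    for i, (count, distance) in enumerate(zip(match_counts, hu_distances)):
        # Condition 1: enough good matches. Condition 2: smallest Hu distance so far.
        if count > min_matches and distance < best_distance:
            best_index = i
            best_distance = distance
    return best_index

# Sign 1 has the smallest Hu distance but too few matches, so sign 2 wins
print(select_best_sign([35, 12, 28], [0.40, 0.05, 0.10]))  # 2
# No sign has more than 20 good matches
print(select_best_sign([5, 12, 3], [0.40, 0.05, 0.10]))    # -1
```

The match count acts as a gate and the Hu distance as the ranking criterion, so a sign with a very similar shape is still rejected when it lacks enough good matches.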

In [10]:
while True:
    ret, frame = cap.read()
    if not ret:
        break

    gray_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    keypoints_frame, descriptors_frame = sift.detectAndCompute(gray_frame, None)

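    # knnMatch with k=2 needs at least two frame descriptors to return pairs
    if descriptors_frame is not None and len(descriptors_frame) >= 2:
        # Hu moments of the binarized frame (same threshold as the reference images)
        _, thresh_frame = cv2.threshold(gray_frame, 128, 255, cv2.THRESH_BINARY)
        moments_frame = cv2.moments(thresh_frame)
        hu_moments_frame = cv2.HuMoments(moments_frame).flatten()

        best_match_index = -1
        best_match_score = float('inf')
        best_match_good_matches = None

        for i, descriptors in enumerate(descriptors_list):
            if descriptors is None:
                continue  # reference image without SIFT features
            matches = bf.knnMatch(descriptors, descriptors_frame, k=2)
            # Ratio test: keep a match only if it is clearly better than the runner-up
            good_matches = []
            for m, n in matches:
                if m.distance < 0.7 * n.distance:
                    good_matches.append(m)

            # Euclidean distance between the Hu moment vectors
            hu_distance = np.linalg.norm(hu_moments_list[i] - hu_moments_frame)

            # Among signs with enough good matches, keep the smallest Hu distance
            if len(good_matches) > 20 and hu_distance < best_match_score:
                best_match_index = i
                best_match_score = hu_distance
                best_match_good_matches = good_matches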

        if best_match_index != -1:
            best_match_buffer.append(best_match_index)

            if len(best_match_buffer) == best_match_buffer.maxlen:
                most_common_match = most_common_element(best_match_buffer)
                print(f"Most common match: {image_paths[most_common_match]}")

    cv2.imshow('Frame', frame)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()