Task 1 [50 pts]: Implement an algorithmically efficient version of object detection using the chamfer distance. In particular, implement a function 

(scores, result_image) = chamfer_search(edge_image, template, scale, number_of_results),

where:

edge_image is the image (2D matrix) you want to search.
template is the pattern (2D matrix) you are searching for.
scale is the scale of the template: a scale s means that the template size must be multiplied by s in order to match the occurrence of the object in the image.
number_of_results specifies the number of results that will be displayed on result_image (note: result_image is an output argument).
scores is a matrix of size equal to the size of the image, and scores(i,j) is the directed chamfer distance from the template to a window centered at (i, j).
result_image is a copy of edge_image, with white bounding rectangles drawn around the best matches found during the search. The number of bounding rectangles to be drawn is specified by the input argument number_of_results. The centers of these bounding rectangles are the pixel locations with the lowest values in scores (the first output argument). Note: you may use the given draw_rectangle function to draw the bounding rectangles.
Tip: To avoid many bounding rectangles appearing on top of each other, do not allow more than 50% overlap between two rectangles, i.e., the center of one rectangle should not be allowed within the region defined by a previous rectangle. Note: this should not change the scores matrix that is returned by the function, so make sure to make a copy of it inside your function.
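The tip above can be sketched as a greedy selection loop. This is a minimal NumPy-only sketch, not the required implementation: the helper name pick_matches and its box_h and box_w parameters are hypothetical, and suppressing a box-sized region around each chosen center is one assumed way of keeping later centers out of earlier rectangles.

```python
import numpy as np

def pick_matches(scores, box_h, box_w, number_of_results):
    """Greedily pick the lowest-scoring centers, skipping suppressed regions.

    Hypothetical helper: returns a list of (row, col) centers.
    """
    work = scores.astype(float).copy()  # work on a copy so the caller's scores stay intact
    rows, cols = work.shape
    matches = []
    for _ in range(number_of_results):
        r, c = np.unravel_index(np.argmin(work), work.shape)
        if not np.isfinite(work[r, c]):
            break  # everything left has been suppressed
        matches.append((int(r), int(c)))
        # Block out the rectangle around this match, clamped to the image bounds
        r0, r1 = max(0, r - box_h // 2), min(rows, r + box_h // 2 + 1)
        c0, c1 = max(0, c - box_w // 2), min(cols, c + box_w // 2 + 1)
        work[r0:r1, c0:c1] = np.inf
    return matches

# Toy example: (2, 3) scores well but lies inside the box around (2, 2)
demo_scores = np.full((10, 10), 5.0)
demo_scores[2, 2] = 0.1
demo_scores[2, 3] = 0.2
demo_scores[7, 7] = 0.3
matches = pick_matches(demo_scores, box_h=3, box_w=3, number_of_results=2)
print(matches)
```

Using infinity as the suppression value avoids relying on the scores being normalized to a known maximum.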
Your function will be graded based on the correctness and efficiency of the implementation. As usual, feel free to use any functions that OpenCV already defines (such as cv.distanceTransform()), or code posted on the course repository.
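Since correctness is graded, a slow reference version can be useful for checking an efficient implementation on small inputs. The sketch below is an assumption-laden illustration, not the expected solution: the name brute_force_chamfer is hypothetical, the nearest edge pixel is searched over the whole image (as a distance-transform approach would do), the score is averaged over the template's edge pixels, and centers where the template does not fit inside the image are left at infinity.

```python
import numpy as np

def brute_force_chamfer(edge_image, template):
    """Directed chamfer distance (template -> image) for every valid center.

    Hypothetical reference helper, far too slow for real images.
    """
    edge_pts = np.argwhere(edge_image > 0)   # (row, col) of image edge pixels
    tmpl_pts = np.argwhere(template > 0)     # (row, col) of template edge pixels
    th, tw = template.shape
    rows, cols = edge_image.shape
    scores = np.full((rows, cols), np.inf)
    if len(edge_pts) == 0 or len(tmpl_pts) == 0:
        return scores
    # Template edge coordinates relative to the template center
    offsets = tmpl_pts - np.array([th // 2, tw // 2])
    for i in range(th // 2, rows - th + th // 2 + 1):
        for j in range(tw // 2, cols - tw + tw // 2 + 1):
            pts = offsets + np.array([i, j])
            # Distance from each placed template pixel to every image edge pixel
            d = np.linalg.norm(pts[:, None, :] - edge_pts[None, :, :], axis=2)
            scores[i, j] = d.min(axis=1).mean()
    return scores

# Toy example: a plus-shaped template placed in a 7x7 image, centered at (3, 3)
toy_template = np.array([[0, 1, 0],
                         [1, 1, 1],
                         [0, 1, 0]], dtype=np.uint8)
toy_image = np.zeros((7, 7), dtype=np.uint8)
toy_image[2:5, 2:5] = toy_template
toy_scores = brute_force_chamfer(toy_image, toy_template)
best = np.unravel_index(np.argmin(toy_scores), toy_scores.shape)
print(best, toy_scores[best])
```

Results near the image border may differ from a filter-based implementation, because this sketch skips border centers instead of padding the image.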

In [13]:
#VERSION1
import cv2
import numpy as np
from draw_rectangle import draw_rectangle
import matplotlib.pyplot as plt

edge_image = cv2.imread('data/clutter1_edges.bmp', cv2.IMREAD_GRAYSCALE)
template = cv2.imread('data/template.bmp', cv2.IMREAD_GRAYSCALE)
scale = 1
number_of_results = 5

# Apply distance transform to the edge image
edge_image_inv = cv2.bitwise_not(edge_image)
distance_transform = cv2.distanceTransform(edge_image_inv, cv2.DIST_L2, 5)

# Resize the template and convert to uint8
resized_template = cv2.resize(template, None, fx=scale, fy=scale).astype(np.uint8)

# Compute Chamfer distance scores
convolution_scores = cv2.filter2D(distance_transform, -1, resized_template)

# Normalize the result
scores = convolution_scores / convolution_scores.max()
scores_copy = scores.copy()


min_score = np.min(scores)
row, col = np.unravel_index(np.argmin(scores, axis=None), scores.shape)
print(min_score)
print(row)
print(col)

#print("Scores Copy:")
#for row in scores_copy:
#    for value in row:
#        print(f"{value:.4f}", end=' ')
#    print()

# Find top matches without overlap
top_matches = []
for _ in range(number_of_results):
    # Find the minimum location in the current result
    min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(scores_copy)
    top_matches.append(min_loc)

    # Suppress the region of the current match by setting it to 1 (the maximum normalized score).
    # Clamp the start indices at 0: a negative start would wrap around and suppress the wrong region.
    half_h = resized_template.shape[0] // 2
    half_w = resized_template.shape[1] // 2
    y1 = max(min_loc[1] - half_h, 0)
    y2 = min_loc[1] + half_h
    x1 = max(min_loc[0] - half_w, 0)
    x2 = min_loc[0] + half_w
    scores_copy[y1:y2, x1:x2] = 1

#print("Scores:")
#for row in scores:
#    for value in row:
#        print(f"{value:.4f}", end=' ')  # Print each value separated by a space
#    print()  # Move to the next line for the next row

# Create result image and draw bounding rectangles
result_image = cv2.imread('data/clutter1_edges.bmp')

for (x, y) in top_matches:
    # Rows grow downward in image coordinates, so top is the smaller row index
    # (assuming draw_rectangle expects top < bottom)
    top = y - int(resized_template.shape[0] / 2)
    bottom = y + int(resized_template.shape[0] / 2)
    left = x - int(resized_template.shape[1] / 2)
    right = x + int(resized_template.shape[1] / 2)
    result_image = draw_rectangle(result_image, top, bottom, left, right)

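# Sanity check: scores must be unchanged by the search (same minimum as before the loop)
min_score = np.min(scores)
row, col = np.unravel_index(np.argmin(scores, axis=None), scores.shape)
print(min_score)
print(row)
print(col)

# The working copy has the selected regions suppressed, so its minimum is the next candidate
min_score = np.min(scores_copy)
row, col = np.unravel_index(np.argmin(scores_copy, axis=None), scores.shape)
print(min_score)
print(row)
print(col)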
    
#plt.imshow(result_image)
#print(top_matches)

0.05707732
171
222
0.05707732
171
222
0.099219725
112
24


In [None]:
#VERSION2
import cv2
import numpy as np
from draw_rectangle import draw_rectangle
import matplotlib.pyplot as plt

edge_image = cv2.imread('data/clutter1_edges.bmp', cv2.IMREAD_GRAYSCALE)
template = cv2.imread('data/template.bmp', cv2.IMREAD_GRAYSCALE)
scale = 1
number_of_results = 5

# Apply distance transform to the edge image
edge_image_inv = cv2.bitwise_not(edge_image)
distance_transform = cv2.distanceTransform(edge_image_inv, cv2.DIST_L2, 5)

# Resize the template and convert to uint8
resized_template = cv2.resize(template, None, fx=scale, fy=scale).astype(np.uint8)

# Compute Chamfer distance scores
convolution_scores = cv2.filter2D(distance_transform, -1, resized_template)

# Normalize the result
scores = convolution_scores / convolution_scores.max()


min_score = np.min(scores)
row, col = np.unravel_index(np.argmin(scores, axis=None), scores.shape)
print(min_score)
print(row)
print(col)


# Find top matches without overlap
top_matches = []
for _ in range(number_of_results):
    # Find the minimum location in the current result
    min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(scores)
    top_matches.append(min_loc)

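    # Suppress the region of the current match by setting it to 1 (the maximum normalized score).
    # Note: this overwrites scores in place, so scores no longer holds the raw distances afterwards.
    # Clamp the start indices at 0: a negative start would wrap around and suppress the wrong region.
    half_h = resized_template.shape[0] // 2
    half_w = resized_template.shape[1] // 2
    y1 = max(min_loc[1] - half_h, 0)
    y2 = min_loc[1] + half_h
    x1 = max(min_loc[0] - half_w, 0)
    x2 = min_loc[0] + half_w
    scores[y1:y2, x1:x2] = 1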

# Create result image and draw bounding rectangles
result_image = cv2.imread('data/clutter1_edges.bmp')

for (x, y) in top_matches:
    # Rows grow downward in image coordinates, so top is the smaller row index
    # (assuming draw_rectangle expects top < bottom)
    top = y - int(resized_template.shape[0] / 2)
    bottom = y + int(resized_template.shape[0] / 2)
    left = x - int(resized_template.shape[1] / 2)
    right = x + int(resized_template.shape[1] / 2)
    result_image = draw_rectangle(result_image, top, bottom, left, right)

plt.imshow(result_image)
print(top_matches)

Task 2 [50 pts]: Implement a function (scores, result_image) = skin_chamfer_search(color_image, edge_image, template, scale, number_of_results) that, in addition to your solution to Task 1, also uses histogram-based skin detection to improve results in cases where we are interested in detecting hands.
Your goal for this task is to improve detection accuracy by combining skin detection and chamfer distance. When the two are combined, you should expect the hand to be detected as the top result.
The skin histograms are available in the data folder and can be loaded inside your function using np.load(). The function detect_skin() is also provided.
Detection speed can also be improved using this technique. However, the way to achieve that is more advanced and is not required for this assignment. You may try it if you wish.
An example of a test image, the corresponding edge image, and a template that can be passed to this function is shown in Figure 1.
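One simple way to combine the two cues is to discard edge pixels that do not look like skin before running the chamfer search. The sketch below illustrates only that masking step on toy data. It assumes the skin detector returns a per-pixel skin probability map with the same height and width as the edge image; the helper name mask_edges_by_skin and the default threshold are hypothetical choices, not part of the provided code.

```python
import numpy as np

def mask_edges_by_skin(edge_image, skin_prob, threshold=0.05):
    """Zero out edge pixels whose skin probability is below the threshold.

    Hypothetical helper; skin_prob is assumed to be a 2D array of probabilities.
    """
    masked = edge_image.copy()          # do not modify the caller's edge image
    masked[skin_prob < threshold] = 0   # remove edges in non-skin regions
    return masked

# Toy example: only the upper-left 2x2 block is skin-colored
toy_edges = np.array([[255,   0, 255],
                      [  0, 255,   0],
                      [255,   0, 255]], dtype=np.uint8)
toy_skin = np.array([[0.9, 0.9, 0.0],
                     [0.9, 0.9, 0.0],
                     [0.0, 0.0, 0.0]])
toy_masked = mask_edges_by_skin(toy_edges, toy_skin)
print(toy_masked)
```

The masked edge image can then be passed to the Task 1 search in place of the original edge image, so clutter edges outside skin regions no longer attract the template.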

In [24]:
import cv2
import numpy as np
import matplotlib.pyplot as plt
from draw_rectangle import draw_rectangle
from detect_skin import detect_skin
from chamfer_search import chamfer_search

color_image = cv2.imread('data/clutter1.bmp')
color_image = cv2.cvtColor(color_image, cv2.COLOR_BGR2RGB)
edge_image = cv2.imread('data/clutter1_edges.bmp', cv2.IMREAD_GRAYSCALE)
template = cv2.imread('data/template.bmp', cv2.IMREAD_GRAYSCALE)
scale = 1
number_of_results = 1
threshold = .05
pos_hist = np.load('data/positive_histogram.npy')
neg_hist = np.load('data/negative_histogram.npy')

# Step 1: Compute the skin mask from the histograms loaded above
skin_mask = detect_skin(color_image, pos_hist, neg_hist)

# Step 2: Apply skin mask to the edge image
edge_image_skin = edge_image.copy()
edge_image_skin[skin_mask < threshold] = 0


# Step 3: Use chamfer search in the skin-filtered edge image
scores, result_image = chamfer_search(edge_image_skin, template, scale, number_of_results)

min_score = np.min(scores)
row, col = np.unravel_index(np.argmin(scores, axis=None), scores.shape)
print(min_score)
print(row)
print(col)

# Step 4: Keep the top match's score and the resulting image
# (scores[0] would be the entire first row of the matrix, not the best score)
top_score = scores[row, col]
top_result_image = result_image

0.11750685
112
91
