## To evaluate the model's performance, we need a definitive way to measure accuracy.

One possible evaluation metric is "intersection over union" (IoU). It divides the area where two bounding boxes overlap by the area of their union, that is, the total area covered by both boxes with the overlap counted only once. The resulting score measures how closely the two bounding boxes match: 1.0 is a perfect match, while a score close to 0 means the boxes barely overlap and the match is likely incorrect.
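Written as a formula, with $A$ and $B$ denoting the two boxes and $|\cdot|$ denoting area (symbols introduced here only for illustration):

$$
\mathrm{IoU}(A, B) = \frac{|A \cap B|}{|A \cup B|} = \frac{|A \cap B|}{|A| + |B| - |A \cap B|}
$$

For example, two 2×2 boxes offset from each other by one unit horizontally overlap in a 1×2 region, so the intersection is 2, the union is 4 + 4 - 2 = 6, and the IoU is 2/6 ≈ 0.33.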

However, the number of guessed bounding boxes may also differ from the number of real boxes, and this can happen in two ways. If there are fewer guesses than real boxes, each guess should be matched to its closest real box to determine its error, and every real box left without a guess is automatically counted as an error (a false negative). If there are more guesses than real boxes, each real box should be matched to its closest guess, provided that guess is not closer to some other real box. The leftover guesses are counted as errors (false positives).
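One way the matching described above could be sketched is with a greedy pass over all real/guess pairs, closest first. This is only an illustration under stated assumptions, not necessarily the procedure used later: the function name `match_boxes` is hypothetical, distance is measured between the `(x, y)` origins of the boxes, and greedy pairing is a simplification that does not guarantee a globally optimal assignment.

```python
from math import hypot

def match_boxes(real_boxes, guessed_boxes):
    """Greedily pair real and guessed boxes, closest pairs first.

    Boxes are [x, y, width, height]. Returns (pairs, false_negatives,
    false_positives), where pairs holds (real_index, guess_index) tuples.
    Assumption: distance is taken between the (x, y) origins of the boxes.
    """
    # Every candidate pairing, sorted by distance (closest first).
    candidates = sorted(
        (hypot(r[0] - g[0], r[1] - g[1]), ri, gi)
        for ri, r in enumerate(real_boxes)
        for gi, g in enumerate(guessed_boxes)
    )

    pairs, used_real, used_guess = [], set(), set()
    for _, ri, gi in candidates:
        # Accept a pair only if neither box has already been matched.
        if ri not in used_real and gi not in used_guess:
            pairs.append((ri, gi))
            used_real.add(ri)
            used_guess.add(gi)

    # Real boxes without a guess are false negatives;
    # guesses without a real box are false positives.
    false_negatives = len(real_boxes) - len(pairs)
    false_positives = len(guessed_boxes) - len(pairs)
    return pairs, false_negatives, false_positives

# Two real boxes, three guesses: the far-away third guess is left over.
real = [[0, 0, 10, 10], [50, 50, 10, 10]]
guesses = [[1, 1, 10, 10], [48, 52, 10, 10], [200, 200, 5, 5]]
pairs, fn, fp = match_boxes(real, guesses)
```

Each matched pair could then be scored with IoU, while the false negative and false positive counts are added to the error total separately.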



In [2]:
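# Boxes are given as [x, y, width, height]
def iou(box1, box2):
    # Corners of the intersection rectangle
    xa = max(box1[0], box2[0])
    ya = max(box1[1], box2[1])
    xb = min(box1[0] + box1[2], box2[0] + box2[2])
    yb = min(box1[1] + box1[3], box2[1] + box2[3])

    # Clamp to zero so that non-overlapping boxes get no intersection area
    i_area = max(0, xb - xa) * max(0, yb - ya)

    a_area = box1[2] * box1[3]
    b_area = box2[2] * box2[3]

    # Union = sum of both areas minus the doubly counted intersection
    return i_area / float(a_area + b_area - i_area)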

from math import sqrt

# Euclidean distance between the (x, y) origins of two boxes
def box_distance(box1, box2):
    return sqrt((box1[0] - box2[0])**2 + (box1[1] - box2[1])**2)

In [3]:

def evaluate_performance(ytrue, yhat):
    # Pair each sample's real boxes with its guessed boxes
    for real_boxes, guessed_boxes in zip(ytrue, yhat):
        print(real_boxes, guessed_boxes)