<a href="https://colab.research.google.com/github/rhanschristiansen/association_algo_performance/blob/master/src/association_test.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>




## The Association Process Testing

This file documents the process of performing association between lidar reading and detected bounding boxes of objects in the feild of view of both a video camera and a LeddarTech M16 Lidar Detector.

#### Background

__LIDAR Detector__

The LeddarTech M16 provides a non-scanning lidar measurement which returns high sampling speed distance measurements from 16 lidar zones that are arrayed from left to right in a 45 degree field of view horizontally and 7.5 degrees vertically. The individual segments cover equal angles of about 2.8 degrees horizontally and 7.5 degrees vertically. Each LeddarTech reading consists of a segment number (0-15) and a distance reading. There may be zero, one or more distance measurements per segment depending on objects that are in the field of view. The range of the M16 is about 140 feet.

__Video Object Detector__

The setup uses the Tensorflow Deep Learning software package https://www.tensorflow.org/ and the YOLOv3 model https://github.com/maiminh1996/YOLOv3-tensorflow to detect objects in the field of view of the camera.

__Association__

having both distance and object detection information available is extremely valuable but it is highly important that we know which lidar distance reading is associated with which object returned from the video object detector. This is __The Association Problem__ that is developed and tested in this notebook.


#### Description of the Problem

The drawing below shows a 3d Projection of the LeddarTech M16 field of view onto the field of view of the video camera.

<html><img src=https://github.com/rhanschristiansen/association_algo_performance/blob/master/src/images/Fields_of_View.jpg?raw=1 width=560></html>

When the setup is deployed in the field, the information that is retrieved for a single frame looks like the image below. Here you can see the lidar zones of the M16 shown with thin black lines across the lower middle of the image. 

__Displaying Lidar Readings__

If a zone of the lidar detector returns a distance reading the zone is highlighted in yellow and the distance reading (in feet) is displayed above the lidar zone. If more than one value is returned for a given zone, multiple red distance readings are stacked vertically over the lidar zone. 

__Displaying Video Detections__

The object detections that are returned from the TensorFlow Yolo detector are shown as green bounding boxes. The class of the object are also shown in green text in the lower left corner of the bounding box.

In the frame below, there were 5 lidar distance readings returned in zones 4, 5, 6, 7 & 8. And there were 3 objects detected. In this scenario, a fairly simple algorithm could be developed to map lidar values to objects.

These are the video detection objects:

|Object # | Bounding Box (x1, y1, x2, y2) | Object Class | Confidence |
| :-----: | :---------------------------: | :----------: | :--------: |
|    1    |   (495, 354, 561, 409)        |    Car       |  0.9839655 |
|    2    |   (663, 366, 697, 392)        |    Car       |  0.980555  |
|    3    |   (598, 368, 667, 420)        |    Car       |  0.9420464 |

And these are the lidar detections:

|Detection # | Segment |   Distance   |
| :--------: | :-----: | :----------: | 
|    1       |    4    | 105.799030   |
|    2       |    5    | 105.018769   |
|    3       |    6    | 104.506889   |
|    4       |    7    |  88.595796   |
|    5       |    8    |  88.714592   |

<html><img src=https://github.com/rhanschristiansen/association_algo_performance/blob/master/src/images/lidar_to_image_rendering_14.png?raw=1 width=1280></html>

However, in other situations, the situation is much more complex. Observe the complexity of the situation when more objects and lidar readings are detected. In this frame shown below, there are a total of 8 objects and 23 lidar detections. This presents a significant challenge to any association algorithm.

In this frame, these are the video detection objects:

|Object # | Bounding Box (x1, y1, x2, y2) | Object Class | Confidence  |
| :-----: | :---------------------------: | :----------: | :---------: |
|    1    |   ( -7, 317, 427, 621)        |    Car       |  0.9997973  |
|    2    |   (786, 354, 1027, 504)       |    Car       |  0.9997645  |
|    3    |   (380, 360, 512, 454)        |    Car       |  0.99277544 |
|    4    |   (1129, 418, 1270, 713)      |    Car       |  0.9813789  |
|    5    |   (589, 371, 669, 420)        |    Car       |  0.97814137 |
|    6    |   (741, 363, 825, 440)        |    Car       |  0.9496468  |
|    7    |   (707, 372, 738, 407)        |    Car       |  0.8582622  |
|    8    |   (517, 355, 578, 410)        |    Car       |  0.8155548  |

And these are the lidar detections:

|Detection # | Segment |   Distance   |
| :--------: | :-----: | :----------: | 
|    1       |    0    |   20.352862  |
|    2       |    3    |   51.171712  |
|    3       |    4    |   50.816675  |
|    4       |    5    |   50.729818  |
|    5       |    5    |   91.829979  |
|    6       |    6    |   91.648105  |
|    7       |    7    |   85.183094  |
|    8       |    7    |  112.802604  |
|    9       |    8    |   84.882674  |
|   10       |    8    |  111.584855  |
|   11       |    9    |   84.740849  |
|   12       |    9    |  112.994140  |
|   13       |   10    |   59.250310  |
|   14       |   10    |   84.906053  |
|   15       |   11    |   59.656511  |
|   16       |   12    |   34.480209  |
|   17       |   13    |   35.184426  |
|   18       |   13    |  108.944754  |
|   19       |   14    |   35.332408  |
|   20       |   14    |  108.647988  |
|   21       |   15    |   35.796530  |
|   22       |   14    |   37.425736  |
|   23       |   14    |  110.845094  |

<html><img src=https://github.com/rhanschristiansen/association_algo_performance/blob/master/src/images/lidar_to_image_rendering_19.png?raw=1 width=1280></html>


### Solving the Association Problem

The image below shows a close up of the simplest association problem so that we can outine an algorithm to address the solution.

<html><img src=https://github.com/rhanschristiansen/association_algo_performance/blob/master/src/images/closeup_association.png?raw=1 width=560></html>

These are the video detection objects (in green):

|Object # | Bounding Box (x1, y1, x2, y2) | Object Class | Confidence |
| :-----: | :---------------------------: | :----------: | :--------: |
|    1    |   (495, 354, 561, 409)        |    Car       |  0.9839655 |
|    2    |   (663, 366, 697, 392)        |    Car       |  0.980555  |
|    3    |   (598, 368, 667, 420)        |    Car       |  0.9420464 |

From left to right the three objects are 1, 3 & 2 

And these are the lidar detections (in red):

|Detection # | Segment |   Distance   |
| :--------: | :-----: | :----------: | 
|    1       |    4    | 105.799030   |
|    2       |    5    | 105.018769   |
|    3       |    6    | 104.506889   |
|    4       |    7    |  88.595796   |
|    5       |    8    |  88.714592   |

From left to right the lidar detections are in numerical order 1, 2, 3, 4 & 5 

From the data above, using our intuition we could make the following proposals for rules:

>1) A lidar reading can only be associated with the object if the object bounding box intersects the segment of the lidar region

If we represent an associations matrix with objects in rows and lidar detections in columns and with ones in position (i,j) representing an association between object i and lidar detection j. 

Using rule #1 we would have the following associations matrix:

| Obj# \ Det# |  1  |  2  |  3  |  4  |  5  |
| ----------: | :-: | :-: | :-: | :-: | :-: | 
|      1      |  0  |  1  |  1  |  0  |  0  |
|      2      |  0  |  0  |  0  |  0  |  1  |
|      3      |  0  |  0  |  0  |  1  |  1  |

Unfortunately, objects 1 and 3 both have two different lidar readings associated with them and lidar reading 5 is associated with two different objects.

For the second case, shown above, the association matrix would be significantly more complex:

| Obj# \ Det# |  1  |  2  |  3  |  4  |  5  |  6  |  7  |  8  |  9  | 10  | 11  | 12  | 13  | 14  | 15  | 16  | 17  | 18  | 19  | 20  | 21  | 22  | 23  |
| ----------: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | 
|      1      |  1  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |
|      2      |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  1  |  1  |  1  |  1  |  1  |  1  |  1  |  1  |  1  |  1  |  1  |
|      3      |  0  |  1  |  1  |  1  |  1  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |
|      4      |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |
|      5      |  0  |  0  |  0  |  0  |  0  |  0  |  1  |  1  |  1  |  1  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |
|      6      |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  1  |  1  |  1  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |
|      7      |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  1  |  1  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |
|      8      |  0  |  0  |  0  |  1  |  1  |  1  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |


In this case we have most objects with more than one lidar reading associated with them and several lidar readings that are associated with more than one object.

One approach that could be attempted is to develop a rules based approach for choosing the best lidar reading for each individual object. Once possible rule might be:

>2) When more than one lidar reading is associated with an object based on rule #1 above, choose the lidar reading with the closest distance.

Unfortunately, using this rule leads to an conflict with objects 3 and 8 that are both associated with lidar reading #4 using rule 2 which has the lowest distance. By observation, it is clear that lidar reading #4 should be associated with object #8 rather than object #2 because object #8 is the closer object that is closer than object #2.

By trying to create a rules based approach to solve the dilemma, we can run into an endless set of rules that are complex and self-contradictory.

### The Hungarian Algorithm

One of the methods to solve this assignment problem is the __Hungarian Algorithm__ http://hungarianalgorithm.com/index.php or one of its variants called the __Munkres Algorithm__ http://csclab.murraystate.edu/~bob.pilgrim/445/munkres.html. The Munk-Res algorithm is similar to the Hungarian algorithm but works with non-square matrices.


#### The cost function

The input to the Munkres Algorithm is a cost function matrix which defines a __cost__ value for each object/lidar detection assignment pair. The Munkres Algotithm will return an assignment matrix or assignment pairs where the overall sum of the costs for all the assignments is minimized. 

To make the algorithm work it is necessary to create a cost function that decreases as the quality of the assignment increases.

#### The Ideal Bounding Box

__Possible Cost Functions__

Several possible cost functions can be developed for the input to the Munkres Algorithm

>1) The L2 Norm - 

# Colab Environment Setup
Download the supporting source files and data to the Colab environment


In [3]:
def clone_source_code():
  """
  Clone the github repo and move to this working directory
  """
  print("Downloading source code...")
  !_=$(git clone --quiet https://github.com/rhanschristiansen/association_algo_performance.git)
  !mv association_algo_performance/* .
  !rm -rf association_algo_performance/

def download_extract_data():
  """
  Download data.zip from Google Drive and extract to this working directory
  """
  print("Downloading data...")
  !curl -c ./cookie -s -L "https://drive.google.com/uc?export=download&id=1ornFmw59u_It0Cpi8yDHRdpW4G_6YLj5" > /dev/null
  !curl -s -Lb ./cookie "https://drive.google.com/uc?export=download&confirm=`awk '/download/ {print $NF}' ./cookie`&id=1ornFmw59u_It0Cpi8yDHRdpW4G_6YLj5" -o "data.zip"
  print("Extracting data...")
  !unzip -q data.zip
def download_yolov3_weights():
  print("Downloading CNN weights...")
  !curl -c ./cookie -s -L "https://drive.google.com/uc?export=download&id=1pdFmnxLEbezktEbA7tfnVoU50wowXI2r" > /dev/null
  !curl -s -Lb ./cookie "https://drive.google.com/uc?export=download&confirm=`awk '/download/ {print $NF}' ./cookie`&id=1pdFmnxLEbezktEbA7tfnVoU50wowXI2r" -o "src/detection/yolov3/yolov3.weights"
def setup():
  import sys
  sys.path.append('./') #add the parent directory to the path
  !rm -rf *
  clone_source_code()
  download_extract_data()
  download_yolov3_weights()
  print("Setup Complete.")

setup()

Downloading source code...
Downloading data...
Extracting data...
Setup Complete.


## The Code

Build a class called Transform() that contains the function to move from the image plane to the XZ plane

In [50]:
# This contains the class Transform() from transform.py in the utils directory

import math
import numpy as np
import src.util.calibration_kitti as cal

class Transform():
    def __init__(self):
        self.y_horizon = cal.cal['Y_HORIZON']
        self.y_resolution = cal.cal['Y_RESOLUTION']
        self.x_center = cal.cal['X_CENTER']
        self.x_resolution = cal.cal['X_RESOLUTION']
        self.vfov = cal.cal['VFOV']
        self.hfov = cal.cal['HFOV']
        self.ht_camera = cal.cal['HT_CAMERA']
        self.width_car = cal.cal['WIDTH_CAR']
        self.length_car = cal.cal['LENGTH_CAR']
        self.wl_ratio = self.width_car / self.length_car
        self.edge_margin = 0

        self.seg_to_pixel_top = cal.SEG_TO_PIXEL_TOP
        self.seg_to_pixel_center = cal.SEG_TO_PIXEL_CENTER
        self.seg_to_pixel_left = cal.SEG_TO_PIXEL_LEFT
        self.seg_to_pixel_right = cal.SEG_TO_PIXEL_RIGHT
        self.seg_to_pixel_bottom = cal.SEG_TO_PIXEL_BOTTOM
        self.alpha, self.zprime = self._y2_to_alpha_and_zprime()


    '''
    Function name:  _y2_to_Zprime
    Inputs:         calibration variables
                    y_resolution, y_horizon, vfov, ht_camera
    Output:         a dictionary of Z' values for each value of y2 from 
                    y_horizon-1 to y_resolution
    '''
    def _y2_to_alpha_and_zprime(self):
        zprime = {}
        alpha = {}
        for y in range(self.y_resolution, self.y_horizon, -1):
            val = (y-self.y_horizon) * self.vfov / self.y_resolution
            alpha[y] = val
            zprime[y] = self.ht_camera / math.tan(val)
        return alpha, zprime

    '''
    Function name:  _point_to_distance
    Inputs:         y2 - the y2 pixel value of the bounding box 
                    xc - the x pixel value of the point of interest
                    yc - the y pixel value of the point of interest
                    
    Output:         distance to the point of interest 
    
    equation:       dist = zprime / cos(beta)
                    beta = sqrt((xc-x_center)^2 + (yc-ycenter)^2) / Y2    
    '''
    def _point_to_distance(self, y2, cent):
        if y2 > self.y_horizon and y2 <= self.y_resolution:
            dist_pixels = math.sqrt((cent[0] - self.x_center)**2 + (cent[1] - self.y_horizon)**2)
            beta = dist_pixels / (y2-self.y_horizon) * self.alpha[y2]
            dist = self.zprime[y2] / math.cos(beta)
            return dist
        else:
            return 100000 # if y2 is out of bounds return an abitrarily high number

    '''
    Function:   _x_to_seg_est
    Inputs:     x (in pixels)
    Output:     a floating point number whose integer value represents the number of the segment.
    '''
    def _x_to_seg_est(self,x):
        return (x - self.seg_to_pixel_left[0])/((self.seg_to_pixel_right[15]-self.seg_to_pixel_left[0])/16)

    '''
    Function name:  _find_seg_intersections
    Inputs:         bb [x1,y1,x2,y2]
    Output:         list of lists intersecting segment numbers
    '''
    def find_seg_intersections(self,bbs):
        segs_list = []
        for bb in bbs:
            toohigh = bb[3] < self.seg_to_pixel_top
            toolow = bb[1] > self.seg_to_pixel_bottom
            if not (toohigh or toolow):
                seg_left = int(self._x_to_seg_est(bb[0]))
                seg_right = self._x_to_seg_est(bb[2])
                segs = [x for i,x in enumerate(range(0,16)) if x >= seg_left and x < seg_right]
                segs_list.append(segs)
            else:
                segs_list.append([])
        return segs_list

    '''
    Function name:  _find_seg_centroids
    Inputs:         bb, seg
    Outputs:        a list of (xc, yc) tuples corresponding to the seg list
    Method:         the function finds the intersecting area of the bounding box and the 
                    bounding box and returns the centroid of the intersecting area
    '''
    def _find_seg_centroids(self,bb,segs):
        cents = []
        for seg in segs:
            xvals = [bb[0],bb[2],self.seg_to_pixel_left[int(seg)],self.seg_to_pixel_right[int(seg)]]
            xvals.sort()
            yvals = [bb[1],bb[3],self.seg_to_pixel_top,self.seg_to_pixel_bottom]
            yvals.sort()
            cents.append((int((xvals[1] + xvals[2])/2),int((yvals[1] + yvals[2])/2)))
        return cents

    '''
    Function name:  bb_to_dist_seg
    Inputs:         a list containing [x1, y1, x2, y2]
                    which is the bounding box coordinates in the pixel space
    Output:         a list containing a list of intersecting segments numbers 0-15 and a 
                    list of the corresponding distance estimates
    '''

    def bb_to_dist_seg_list(self,bb):
        dists_list = []
        segs_list = self.find_seg_intersections(bb)
        for i, segs in enumerate(segs_list):
            cents = self._find_seg_centroids(bb[i],segs_list[i])
            dists = []
            for cent in cents:
                dist = self._point_to_distance(bb[i][3],cent)
                dists.append(dist)
            dists_list.append(dists)
        return dists_list, segs_list

    '''
    Function name:  bb_dist_to_XZ
    Inputs:         a list of lists containing [x1, y1, x2, y2, dist]
                    which is the bounding box coordinates in the pixel space and
                    the dist in ft in the XZ space
    Output:         a list of lists of [X, Z] tuples for each list in the input
    '''

    def bb_dist_to_XZ(self,bb_dist_lists):
        XZ_lists = []
        for i, bb_dist_list in enumerate(bb_dist_lists):

            # x_c and y_c are the coordinates of the centroid of the bounding box in the pixel plane xy
            x_c = (bb_dist_list[0] + bb_dist_list[2]) / 2
            y_c = (bb_dist_list[1] + bb_dist_list[3]) / 2

            # hypotenus_xy is the distance in pixels from the center of optical flow
            # and the centroid of the bounding box in the pixel coordinate system
            hypotenus_xy = math.sqrt((x_c - self.x_center)**2 + (y_c - self.y_horizon)**2)

            # beta is the angle between the line extending from the camera to the center of optical flow
            # and the line extending from the camera to the center of the object in the world coordinate system
            beta = hypotenus_xy * self.hfov / self.x_resolution

            # Z is the Z coordinate of the object in the world coordinate system
            Z = bb_dist_list[4] * math.cos(beta)

            # hypotenus_XY is the distance in feet between the point (0, 0, Z) and (Xc, Yc, Z)
            # in the world coordinate system on the plane parallel to XY
            # that intersects the back of the detected object
            hypotenus_XY = bb_dist_list[4] * math.sin(beta)

            # X is the X coordinate of the object in the world coordinate system
            X = x_c / hypotenus_xy * hypotenus_XY
            XZ_lists.append([X,Z])

        return XZ_lists

    # this function calculates the ideal bounding box for the lidar detection
    def lidar_dist_seg_to_bb(self, dist, seg):

        y2 = int(cal.cal['FOCAL_LENGTH'] * cal.cal['HT_CAMERA'] / dist + cal.cal['Y_HORIZON'])

        x_width =  cal.cal['WIDTH_CAR'] / dist * cal.cal['FOCAL_LENGTH']

        beta = math.radians(cal.SEG_TO_ANGLE[seg])
        x_mid = cal.cal['X_CENTER'] + beta / cal.cal['HFOV'] * cal.cal['X_RESOLUTION']

        x1 = int(x_mid - x_width / 2)
        x2 = int(x_mid + x_width / 2)
        y1 = int(y2 - x_width)

        return [x1, y1, x2, y2]


Next, we bring in a class to contain instances of the Lidar Detections 

In [51]:

# This contains the class LIDAR_detection() from lidar_detection.py in robert@verix-6440:~$ cd ~/PycharmProjects/
from src.util.transform import Transform

class LIDAR_detection():
    def __init__(self, frame, seg, dist, ampl):
        tr = Transform()
        self.frame = frame
        self.dist = dist
        self.seg = seg
        self.bb = tr.lidar_dist_seg_to_bb(dist, seg)
        self.ampl = ampl


The next class is the car detector that uses the yolo v3 Tensorflow based Convolutional Nueral Network to detect cars, trucks and vans.

In [53]:
# This contains the class CarDetectorTFV2() from car_detector_tf_v2.py in the detection directory

"""
Car detector using tensorflow models
TODO: move into a class
"""
import os
import tarfile
import numpy as np
import cv2
import tensorflow as tf
from src.detection.yolov3 import yolov3

class CarDetectorTFV2(object):
    def __init__(self):
        tf.reset_default_graph()
        self.session = tf.Session()
        self.batch_size = 1
        self.max_output_size = 10
        self.iou_threshold = 0.5
        self.confidence_threshold = 0.5
        self.model = yolov3.Yolo_v3(max_output_size=self.max_output_size,
                                    iou_threshold=self.iou_threshold,
                                    confidence_threshold=self.confidence_threshold)
        self.class_names = self.model.class_names
        self.model_size = self.model.model_size
        self.inputs = tf.placeholder(tf.float32, [self.batch_size, self.model_size[0], self.model_size[1], 3])
        self.run_inference = self.model(self.inputs, training=False)
        self.model_vars = tf.global_variables(scope='yolo_v3_model')
        self.assign_ops = yolov3.load_weights(self.model_vars, self.model.weights_path)
        self.session.run(self.assign_ops)

    def detect(self, img, return_class_scores=False):
        """
        Given input image, return detections
        """
        detection_boxes, detection_classes, detection_scores = [], [], []
        img_net = cv2.resize(img, (self.model_size[0], self.model_size[1]))
        batch = np.array([img_net])
        detection_result = self.session.run(self.run_inference, feed_dict={self.inputs: batch})
        det = detection_result[0]
        resize_factor = (img.shape[1] / self.model_size[0], img.shape[0] / self.model_size[1])
        for cls in range(len(self.class_names)):
            boxes = det[cls]
            if np.size(boxes) != 0:
                for box in boxes:
                    xy, confidence = box[:4], box[4]
                    xy = [int(xy[i] * resize_factor[i % 2]) for i in range(4)]
                    x1, y1, x2, y2 = xy[0], xy[1], xy[2], xy[3]
                    bbox = [x1, y1, x2, y2]
                    detection_boxes.append(bbox)
                    detection_classes.append(self.class_names[cls])
                    detection_scores.append(confidence)
        if return_class_scores:
            return detection_boxes, detection_classes, detection_scores
        else:
            return detection_boxes



The next class is the Costs() class that defines the various cost functions that are used in the association process and finally passed to the Munkres algorithm for assignment of Lidar Detections to Video Detections  

In [None]:
# This contains the class Costs() from costs.py in the associations directory

import math
import src.util.calibration_kitti as cal
import numpy as np
from src.util.transform import Transform

# the class Costs contains methods that individually calculate the cost_arrays for use by the Munkres algorithm
# Each of the methods receives the lidar_detections and video_detections lists as values in the **kwargs dictionary
# Each of the methods returns a cost array with values designed to be between 0 and 1
# the cost array should have the number of rows equal to the number of video detection objects in the video_detections list
# and the number of columns equal to the number of lidar detection objects in the lidar_detections list.

class Costs():
    def __init__(self):
        self.tr = Transform()

    # this method calculates the euclidian distance between the centroid of the video_detection bounding box and the
    # centroid of the lidar detection ideal bounding box. The distance in pixels is divided by the diagonal of the
    # video image in pixels to give a value between 0 and 1
    def dist_between_centroids(self, **kwargs):
#        max_dist = math.sqrt(cal.cal['X_RESOLUTION']**2 + cal.cal['Y_RESOLUTION']**2) # max dist between centroids used
        LIDAR_X_RES = cal.SEG_TO_PIXEL_RIGHT[8] - cal.SEG_TO_PIXEL_LEFT[0]
        LIDAR_Y_RES = cal.SEG_TO_PIXEL_BOTTOM - cal.SEG_TO_PIXEL_TOP

        max_dist = math.sqrt(LIDAR_X_RES**2 + LIDAR_Y_RES**2) # max dist between centroids used to normalize between 0 and 1
        video_detections = kwargs['video_detections']
        lidar_detections = kwargs['lidar_detections']
        dist_array = np.ones((len(video_detections), len(lidar_detections)), np.float64) * 1

        for i, video_detection in enumerate(video_detections):
            cx_v = (video_detection.bbox[0] + video_detection.bbox[2]) / 2
            cy_v = (video_detection.bbox[1] + video_detection.bbox[3]) / 2
            for j, lidar_detection in enumerate(lidar_detections):
                cx_l = (lidar_detection.bb[0] + lidar_detection.bb[2]) / 2
                cy_l = (lidar_detection.bb[1] + lidar_detection.bb[3]) / 2
                dist_array[i,j] = math.sqrt( (cx_v-cx_l)**2 + (cy_v-cy_l)**2 ) / max_dist

        return dist_array

    # this method calculates the difference in feet between the lidar distance and the
    # estimated distance derived from the y2 value of the video bounding box. this value is
    # divided by the max_lidar distance of 140 feet to give a value between 0 and 1
    # if there is no overlap between the video_detection bounding box and the lidar segment
    # the cost is penalized with a value of 1e6
    def dist_lidar_to_y2estimate(self, **kwargs):
        max_dist = 140/2 # maximum detection distance for lidar - used to normalize the output between 0 and 1
        m_to_ft = cal.cal['M_TO_FT']
        video_detections = kwargs['video_detections']
        lidar_detections = kwargs['lidar_detections']
        dist_array = np.ones((len(video_detections), len(lidar_detections)), np.float64) * 1
        bbs = []
        for i in range(len(video_detections)):
            bbs.append(list(video_detections[i].bbox))

        dists_list, segs_list = self.tr.bb_to_dist_seg_list(bbs)

        for i in range(len(dists_list)):
            dist_est_array = np.zeros((len(lidar_detections), 1), np.float64)
            for j in range(len(lidar_detections)):
                for segs, dists in zip(segs_list,dists_list):
                    for k, seg in enumerate(segs):
                        if seg == lidar_detections[j].seg: # only use values from bounding boxes that overlap the lidar segment
                            dist_est = dists[k]
                            # add dist_est to fvec in video_detection for later use
                            dist_est_array[j,0] = dist_est
                            dist_array[i,j] = abs(dist_est - lidar_detections[j].dist) / max_dist
            video_detections[i].dist_est_y2 = dist_est_array

        return dist_array

    # this is a helper function to calculate the intersection / union ratio of the bounding box rectangles.
    # the value 0 means there is no intersection
    # the value 1 means they bounding boxes are in exactly the same place 100% overlap
    def _iou(self, boxA, boxB):

        # determine the (x, y)-coordinates of the intersection rectangle
        xA = max(boxA[0], boxB[0])
        yA = max(boxA[1], boxB[1])
        xB = min(boxA[2], boxB[2])
        yB = min(boxA[3], boxB[3])

        if xA < xB and yA < yB: # calculate area only for overlapping bounding boxes
            # compute the area of intersection rectangle
            interArea = (xB - xA) * (yB - yA)
        else:
            interArea = 0

        # compute the area of both the prediction and ground-truth
        # rectangles
        boxAArea = (boxA[2] - boxA[0]) * (boxA[3] - boxA[1])
        boxBArea = (boxB[2] - boxB[0]) * (boxB[3] - boxB[1])

        union = float(boxAArea + boxBArea - interArea)

        # compute the intersection over union by taking the intersection
        # area and dividing it by the sum of prediction + ground-truth
        # areas - the interesection area
        if union > 0:
            iou = interArea / float(boxAArea + boxBArea - interArea)
        else:
            iou = 0

        return iou

    # this method calculates the overlap of the video_detection bounding box and the
    # lidar detection ideal bounding box
    # The returned values in the array are 1 minus the intersection / union ratio
    # a 100% overlap would have a cost of 0 and 1% overlap would have a cost of 0.99
    # no overlap is penalized with a value of 1e6
    def inverse_intersection_over_union(self, **kwargs):

        video_detections = kwargs['video_detections']
        lidar_detections = kwargs['lidar_detections']

        cost_array = np.ones((len(video_detections), len(lidar_detections)), np.float64) * 1

        for i in range(len(video_detections)):
            for j in range(len(lidar_detections)):
                cost = 1 - self._iou(video_detections[i].bbox, lidar_detections[j].bb) # one minus the iou ratio
                if cost < 1: # only use values that are less that 1 (have some overlap between bounding boxes)
                    cost_array[i,j] = cost

        return cost_array



Here the Munkres class is defined

In [None]:
# This contains the class Munkres() from munkres.py in the associations directory

#!/usr/bin/env python
# -*- coding: iso-8859-1 -*-

# Documentation is intended to be processed by Epydoc.

"""
Introduction
============

The Munkres module provides an implementation of the Munkres algorithm
(also called the Hungarian algorithm or the Kuhn-Munkres algorithm),
useful for solving the Assignment Problem.

Assignment Problem
==================

Let *C* be an *n*\ x\ *n* matrix representing the costs of each of *n* workers
to perform any of *n* jobs. The assignment problem is to assign jobs to
workers in a way that minimizes the total cost. Since each worker can perform
only one job and each job can be assigned to only one worker the assignments
represent an independent set of the matrix *C*.

One way to generate the optimal set is to create all permutations of
the indexes necessary to traverse the matrix so that no row and column
are used more than once. For instance, given this matrix (expressed in
Python)::

    matrix = [[5, 9, 1],
              [10, 3, 2],
              [8, 7, 4]]

You could use this code to generate the traversal indexes::

    def permute(a, results):
        if len(a) == 1:
            results.insert(len(results), a)

        else:
            for i in range(0, len(a)):
                element = a[i]
                a_copy = [a[j] for j in range(0, len(a)) if j != i]
                subresults = []
                permute(a_copy, subresults)
                for subresult in subresults:
                    result = [element] + subresult
                    results.insert(len(results), result)

    results = []
    permute(range(len(matrix)), results) # [0, 1, 2] for a 3x3 matrix

After the call to permute(), the results matrix would look like this::

    [[0, 1, 2],
     [0, 2, 1],
     [1, 0, 2],
     [1, 2, 0],
     [2, 0, 1],
     [2, 1, 0]]

You could then use that index matrix to loop over the original cost matrix
and calculate the smallest cost of the combinations::

    n = len(matrix)
    minval = sys.maxint
    for row in range(n):
        cost = 0
        for col in range(n):
            cost += matrix[row][col]
        minval = min(cost, minval)

    print minval

While this approach works fine for small matrices, it does not scale. It
executes in O(*n*!) time: Calculating the permutations for an *n*\ x\ *n*
matrix requires *n*! operations. For a 12x12 matrix, that's 479,001,600
traversals. Even if you could manage to perform each traversal in just one
millisecond, it would still take more than 133 hours to perform the entire
traversal. A 20x20 matrix would take 2,432,902,008,176,640,000 operations. At
an optimistic millisecond per operation, that's more than 77 million years.

The Munkres algorithm runs in O(*n*\ ^3) time, rather than O(*n*!). This
package provides an implementation of that algorithm.

This version is based on
http://www.public.iastate.edu/~ddoty/HungarianAlgorithm.html.

This version was written for Python by Brian Clapper from the (Ada) algorithm
at the above web site. (The ``Algorithm::Munkres`` Perl version, in CPAN, was
clearly adapted from the same web site.)

Usage
=====

Construct a Munkres object::

    from munkres import Munkres

    m = Munkres()

Then use it to compute the lowest cost assignment from a cost matrix. Here's
a sample program::

    from munkres import Munkres, print_matrix

    matrix = [[5, 9, 1],
              [10, 3, 2],
              [8, 7, 4]]
    m = Munkres()
    indexes = m.compute(matrix)
    print_matrix(matrix, msg='Lowest cost through this matrix:')
    total = 0
    for row, column in indexes:
        value = matrix[row][column]
        total += value
        print '(%d, %d) -> %d' % (row, column, value)
    print 'total cost: %d' % total

Running that program produces::

    Lowest cost through this matrix:
    [5, 9, 1]
    [10, 3, 2]
    [8, 7, 4]
    (0, 0) -> 5
    (1, 1) -> 3
    (2, 2) -> 4
    total cost=12

The instantiated Munkres object can be used multiple times on different
matrices.

Non-square Cost Matrices
========================

The Munkres algorithm assumes that the cost matrix is square. However, it's
possible to use a rectangular matrix if you first pad it with 0 values to make
it square. This module automatically pads rectangular cost matrices to make
them square.

Notes:

- The module operates on a *copy* of the caller's matrix, so any padding will
  not be seen by the caller.
- The cost matrix must be rectangular or square. An irregular matrix will
  *not* work.

Calculating Profit, Rather than Cost
====================================

The cost matrix is just that: A cost matrix. The Munkres algorithm finds
the combination of elements (one from each row and column) that results in
the smallest cost. It's also possible to use the algorithm to maximize
profit. To do that, however, you have to convert your profit matrix to a
cost matrix. The simplest way to do that is to subtract all elements from a
large value. For example::

    from munkres import Munkres, print_matrix

    matrix = [[5, 9, 1],
              [10, 3, 2],
              [8, 7, 4]]
    cost_matrix = []
    for row in matrix:
        cost_row = []
        for col in row:
            cost_row += [sys.maxint - col]
        cost_matrix += [cost_row]

    m = Munkres()
    indexes = m.compute(cost_matrix)
    print_matrix(matrix, msg='Highest profit through this matrix:')
    total = 0
    for row, column in indexes:
        value = matrix[row][column]
        total += value
        print '(%d, %d) -> %d' % (row, column, value)

    print 'total profit=%d' % total

Running that program produces::

    Highest profit through this matrix:
    [5, 9, 1]
    [10, 3, 2]
    [8, 7, 4]
    (0, 1) -> 9
    (1, 0) -> 10
    (2, 2) -> 4
    total profit=23

The ``munkres`` module provides a convenience method for creating a cost
matrix from a profit matrix. Since it doesn't know whether the matrix contains
floating point numbers, decimals, or integers, you have to provide the
conversion function; but the convenience method takes care of the actual
creation of the cost matrix::

    import munkres

    cost_matrix = munkres.make_cost_matrix(matrix,
                                           lambda cost: sys.maxint - cost)

So, the above profit-calculation program can be recast as::

    from munkres import Munkres, print_matrix, make_cost_matrix

    matrix = [[5, 9, 1],
              [10, 3, 2],
              [8, 7, 4]]
    cost_matrix = make_cost_matrix(matrix, lambda cost: sys.maxint - cost)
    m = Munkres()
    indexes = m.compute(cost_matrix)
    print_matrix(matrix, msg='Lowest cost through this matrix:')
    total = 0
    for row, column in indexes:
        value = matrix[row][column]
        total += value
        print '(%d, %d) -> %d' % (row, column, value)
    print 'total profit=%d' % total

References
==========

1. http://www.public.iastate.edu/~ddoty/HungarianAlgorithm.html

2. Harold W. Kuhn. The Hungarian Method for the assignment problem.
   *Naval Research Logistics Quarterly*, 2:83-97, 1955.

3. Harold W. Kuhn. Variants of the Hungarian method for assignment
   problems. *Naval Research Logistics Quarterly*, 3: 253-258, 1956.

4. Munkres, J. Algorithms for the Assignment and Transportation Problems.
   *Journal of the Society of Industrial and Applied Mathematics*,
   5(1):32-38, March, 1957.

5. http://en.wikipedia.org/wiki/Hungarian_algorithm

Copyright and License
=====================

This software is released under a BSD license, adapted from
<http://opensource.org/licenses/bsd-license.php>

Copyright (c) 2008 Brian M. Clapper
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice,
  this list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
  this list of conditions and the following disclaimer in the documentation
  and/or other materials provided with the distribution.

* Neither the name "clapper.org" nor the names of its contributors may be
  used to endorse or promote products derived from this software without
  specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.
"""

__docformat__ = 'restructuredtext'

# ---------------------------------------------------------------------------
# Imports
# ---------------------------------------------------------------------------

import sys
import copy

# ---------------------------------------------------------------------------
# Exports
# ---------------------------------------------------------------------------

__all__     = ['Munkres', 'make_cost_matrix']

# ---------------------------------------------------------------------------
# Globals
# ---------------------------------------------------------------------------

# Info about the module
__version__   = "1.0.5.5"
__author__    = "Brian Clapper, bmc@clapper.org"
__url__       = "http://github.com/datapublica/munkres"
__copyright__ = "(c) 2008 Brian M. Clapper"
__license__   = "BSD-style license"

# ---------------------------------------------------------------------------
# Classes
# ---------------------------------------------------------------------------

class Munkres:
    """
    Calculate the Munkres solution to the classical assignment problem.
    See the module documentation for usage.
    """

    def __init__(self):
        """Create a new instance"""
        self.C = None
        self.row_covered = []
        self.col_covered = []
        self.n = 0
        self.Z0_r = 0
        self.Z0_c = 0
        self.marked = None
        self.path = None

    def make_cost_matrix(profit_matrix, inversion_function):
        """
        **DEPRECATED**

        Please use the module function ``make_cost_matrix()``.
        """
        import munkres
        return munkres.make_cost_matrix(profit_matrix, inversion_function)

    make_cost_matrix = staticmethod(make_cost_matrix)

    def pad_matrix(self, matrix, pad_value=0):
        """
        Pad a possibly non-square matrix to make it square.

        :Parameters:
            matrix : list of lists
                matrix to pad

            pad_value : int
                value to use to pad the matrix

        :rtype: list of lists
        :return: a new, possibly padded, matrix
        """
        max_columns = 0
        total_rows = len(matrix)

        for row in matrix:
            max_columns = max(max_columns, len(row))

        total_rows = max(max_columns, total_rows)

        new_matrix = []
        for row in matrix:
            row_len = len(row)
            new_row = row[:]
            if total_rows > row_len:
                # Row too short. Pad it.
                new_row += [pad_value] * (total_rows - row_len)
            new_matrix += [new_row]

        while len(new_matrix) < total_rows:
            new_matrix += [[pad_value] * total_rows]

        return new_matrix

    def compute(self, cost_matrix):
        """
        Compute the indexes for the lowest-cost pairings between rows and
        columns in the database. Returns a list of (row, column) tuples
        that can be used to traverse the matrix.

        :Parameters:
            cost_matrix : list of lists
                The cost matrix. If this cost matrix is not square, it
                will be padded with zeros, via a call to ``pad_matrix()``.
                (This method does *not* modify the caller's matrix. It
                operates on a copy of the matrix.)

                **WARNING**: This code handles square and rectangular
                matrices. It does *not* handle irregular matrices.

        :rtype: list
        :return: A list of ``(row, column)`` tuples that describe the lowest
                 cost path through the matrix

        """
        self.C = self.pad_matrix(cost_matrix)
        self.n = len(self.C)
        self.original_length = len(cost_matrix)
        self.original_width = len(cost_matrix[0])
        self.row_covered = [False for i in range(self.n)]
        self.col_covered = [False for i in range(self.n)]
        self.Z0_r = 0
        self.Z0_c = 0
        self.path = self.__make_matrix(self.n * 2, 0)
        self.marked = self.__make_matrix(self.n, 0)

        done = False
        step = 1

        steps = { 1 : self.__step1,
                  2 : self.__step2,
                  3 : self.__step3,
                  4 : self.__step4,
                  5 : self.__step5,
                  6 : self.__step6 }

        while not done:
            try:
                func = steps[step]
                step = func()
            except KeyError:
                done = True

        # Look for the starred columns
        results = []
        for i in range(self.original_length):
            for j in range(self.original_width):
                if self.marked[i][j] == 1:
                    results += [(i, j)]

        return results

    def __copy_matrix(self, matrix):
        """Return an exact copy of the supplied matrix"""
        return copy.deepcopy(matrix)

    def __make_matrix(self, n, val):
        """Create an *n*x*n* matrix, populating it with the specific value."""
        matrix = []
        for i in range(n):
            matrix += [[val for j in range(n)]]
        return matrix

    def __step1(self):
        """
        For each row of the matrix, find the smallest element and
        subtract it from every element in its row. Go to Step 2.
        """
        n = self.n
        for i in range(n):
            minval = min(self.C[i])
            # Find the minimum value for this row and subtract that minimum
            # from every element in the row.
            for j in range(n):
                self.C[i][j] -= minval

        return 2

    def __step2(self):
        """
        Find a zero (Z) in the resulting matrix. If there is no starred
        zero in its row or column, star Z. Repeat for each element in the
        matrix. Go to Step 3.
        """
        n = self.n
        for i in range(n):
            for j in range(n):
                if (self.C[i][j] == 0) and \
                   (not self.col_covered[j]) and \
                   (not self.row_covered[i]):
                    self.marked[i][j] = 1
                    self.col_covered[j] = True
                    self.row_covered[i] = True

        self.__clear_covers()
        return 3

    def __step3(self):
        """
        Cover each column containing a starred zero. If K columns are
        covered, the starred zeros describe a complete set of unique
        assignments. In this case, Go to DONE, otherwise, Go to Step 4.
        """
        n = self.n
        count = 0
        for i in range(n):
            for j in range(n):
                if self.marked[i][j] == 1:
                    self.col_covered[j] = True
                    count += 1

        if count >= n:
            step = 7 # done
        else:
            step = 4

        return step

    def __step4(self):
        """
        Find a noncovered zero and prime it. If there is no starred zero
        in the row containing this primed zero, Go to Step 5. Otherwise,
        cover this row and uncover the column containing the starred
        zero. Continue in this manner until there are no uncovered zeros
        left. Save the smallest uncovered value and Go to Step 6.
        """
        step = 0
        done = False
        row = -1
        col = -1
        star_col = -1
        while not done:
            (row, col) = self.__find_a_zero()
            if row < 0:
                done = True
                step = 6
            else:
                self.marked[row][col] = 2
                star_col = self.__find_star_in_row(row)
                if star_col >= 0:
                    col = star_col
                    self.row_covered[row] = True
                    self.col_covered[col] = False
                else:
                    done = True
                    self.Z0_r = row
                    self.Z0_c = col
                    step = 5

        return step

    def __step5(self):
        """
        Construct a series of alternating primed and starred zeros as
        follows. Let Z0 represent the uncovered primed zero found in Step 4.
        Let Z1 denote the starred zero in the column of Z0 (if any).
        Let Z2 denote the primed zero in the row of Z1 (there will always
        be one). Continue until the series terminates at a primed zero
        that has no starred zero in its column. Unstar each starred zero
        of the series, star each primed zero of the series, erase all
        primes and uncover every line in the matrix. Return to Step 3
        """
        count = 0
        path = self.path
        path[count][0] = self.Z0_r
        path[count][1] = self.Z0_c
        done = False
        while not done:
            row = self.__find_star_in_col(path[count][1])
            if row >= 0:
                count += 1
                path[count][0] = row
                path[count][1] = path[count-1][1]
            else:
                done = True

            if not done:
                col = self.__find_prime_in_row(path[count][0])
                count += 1
                path[count][0] = path[count-1][0]
                path[count][1] = col

        self.__convert_path(path, count)
        self.__clear_covers()
        self.__erase_primes()
        return 3

    def __step6(self):
        """
        Add the value found in Step 4 to every element of each covered
        row, and subtract it from every element of each uncovered column.
        Return to Step 4 without altering any stars, primes, or covered
        lines.
        """
        minval = self.__find_smallest()
        for i in range(self.n):
            for j in range(self.n):
                if self.row_covered[i]:
                    self.C[i][j] += minval
                if not self.col_covered[j]:
                    self.C[i][j] -= minval
        return 4

    def __find_smallest(self):
        """Find the smallest uncovered value in the matrix."""
        minval = sys.maxsize
        for i in range(self.n):
            for j in range(self.n):
                if (not self.row_covered[i]) and (not self.col_covered[j]):
                    if minval > self.C[i][j]:
                        minval = self.C[i][j]
        return minval

    def __find_a_zero(self):
        """Find the first uncovered element with value 0"""
        row = -1
        col = -1
        i = 0
        n = self.n
        done = False

        while not done:
            j = 0
            while True:
                if (self.C[i][j] == 0) and \
                   (not self.row_covered[i]) and \
                   (not self.col_covered[j]):
                    row = i
                    col = j
                    done = True
                j += 1
                if j >= n:
                    break
            i += 1
            if i >= n:
                done = True

        return (row, col)

    def __find_star_in_row(self, row):
        """
        Find the first starred element in the specified row. Returns
        the column index, or -1 if no starred element was found.
        """
        col = -1
        for j in range(self.n):
            if self.marked[row][j] == 1:
                col = j
                break

        return col

    def __find_star_in_col(self, col):
        """
        Find the first starred element in the specified row. Returns
        the row index, or -1 if no starred element was found.
        """
        row = -1
        for i in range(self.n):
            if self.marked[i][col] == 1:
                row = i
                break

        return row

    def __find_prime_in_row(self, row):
        """
        Find the first prime element in the specified row. Returns
        the column index, or -1 if no starred element was found.
        """
        col = -1
        for j in range(self.n):
            if self.marked[row][j] == 2:
                col = j
                break

        return col

    def __convert_path(self, path, count):
        for i in range(count+1):
            if self.marked[path[i][0]][path[i][1]] == 1:
                self.marked[path[i][0]][path[i][1]] = 0
            else:
                self.marked[path[i][0]][path[i][1]] = 1

    def __clear_covers(self):
        """Clear all covered matrix cells"""
        for i in range(self.n):
            self.row_covered[i] = False
            self.col_covered[i] = False

    def __erase_primes(self):
        """Erase all prime markings"""
        for i in range(self.n):
            for j in range(self.n):
                if self.marked[i][j] == 2:
                    self.marked[i][j] = 0

# ---------------------------------------------------------------------------
# Functions
# ---------------------------------------------------------------------------

def make_cost_matrix(profit_matrix, inversion_function):
    """
    Create a cost matrix from a profit matrix by calling
    'inversion_function' to invert each value. The inversion
    function must take one numeric argument (of any type) and return
    another numeric argument which is presumed to be the cost inverse
    of the original profit.

    This is a static method. Call it like this:

    .. python::

        cost_matrix = Munkres.make_cost_matrix(matrix, inversion_func)

    For example:

    .. python::

        cost_matrix = Munkres.make_cost_matrix(matrix, lambda x : sys.maxint - x)

    :Parameters:
        profit_matrix : list of lists
            The matrix to convert from a profit to a cost matrix

        inversion_function : function
            The function to use to invert each entry in the profit matrix

    :rtype: list of lists
    :return: The converted matrix
    """
    cost_matrix = []
    for row in profit_matrix:
        cost_matrix.append([inversion_function(value) for value in row])
    return cost_matrix

def print_matrix(matrix, msg=None):
    """
    Convenience function: Displays the contents of a matrix of integers.

    :Parameters:
        matrix : list of lists
            Matrix to print

        msg : str
            Optional message to print before displaying the matrix
    """
    import math

    if msg is not None:
        print(msg)

    # Calculate the appropriate format width.
    width = 0
    for row in matrix:
        for val in row:
            val_width = int(math.log10(abs(val))) + 1 if val != 0 else 0
            width = max(width, val_width)

    # Make the format string
    format = '%%%dd' % width

    # Print the matrix
    for row in matrix:
        sep = '['
        for val in row:
            sys.stdout.write(sep + format % val)
            sep = ', '
        sys.stdout.write(']\n')

# ---------------------------------------------------------------------------
# Main
# ---------------------------------------------------------------------------

if __name__ == '__main__':
    pass

    '''
    matrices = [
                # Square
                ([[400, 150, 400],
                  [400, 450, 600],
                  [300, 225, 300]],
                 850 # expected cost
                ),

                # Rectangular variant
                ([[400, 150, 400, 1],
                  [400, 450, 600, 2],
                  [300, 225, 300, 3]],
                 452 # expected cost
                ),

                # Square
                ([[10, 10,  8],
                  [ 9,  8,  1],
                  [ 9,  7,  4]],
                 18
                ),

                # Rectangular variant
                ([[10, 10,  8, 11],
                  [ 9,  8,  1, 1],
                  [ 9,  7,  4, 10]],
                 15
                ),
               ]

    m = Munkres()
    for cost_matrix, expected_total in matrices:
        print_matrix(cost_matrix, msg='cost matrix')
        indexes = m.compute(cost_matrix)
        total_cost = 0
        for r, c in indexes:
            x = cost_matrix[r][c]
            total_cost += x
            print('(%d, %d) -> %d' % (r, c, x))
        print('lowest cost=%d' % total_cost)
        assert expected_total == total_cost

    '''

This next cell contains the class Association() which controls the association process using the calculation of cost functions and the assignment of Lidar Detections to Video Detections using the Munkres method.

In [None]:
# # This contains the class Association() from association.py in the associations directory

# import numpy as np
# import munkres

# class Association:
#     def __init__(self):
#         pass

#     # the evaluate cost function receives to arguments:
#     # 1 - a dictionary called cost functions with the function method name as the key and the weight as the value
#     #
#     #     cost_functions = { costs.dist_between_centroids : 0.334,
#     #                        costs.dist_lidar_to_y2estimate : 0.333,
#     #                        costs.inverse_intersection_over_union : 0.333 }
#     #
#     # 2 - a dictionary contained in **kwargs with the keys containing the names of the two lists of objects
#     #     to be associated (video_detections and lidar_detections) and the values are the lists of objects to be
#     #     associated.
#     #
#     #    kwargs = {'video_detections' : video_detections, 'lidar_detections' : lidar_detections}
#     #
#     #
#     # The evaluate cost function evaluates the costs using methods that are contained in the Costs Class in the
#     # costs.py file.
#     # The function is called like this:
#     #
#     #     a = Association()
#     #     costs = a.evaluate_cost(cost_functions, **kwargs)
#     #
#     # each of the elements of the costs array represent the cost value between the
#     # i-th video_detection and the j-th lidar_detection where
#     # cost (i,j) = weight[0] * cost_function[0](i, j) + weight[1] * cost_function[1](i,j) + ... + weight[n] * cost_function[n](i,j)
#     # for n cost functions and weights in the cost_functions dictionary

#     def evaluate_cost(self, cost_functions, **kwargs):

#         array_size = []
#         for k,v in kwargs.items():
#             array_size.append(len(v))

#         cost = np.zeros((array_size[0], array_size[1]), np.float64 )
#         cost_components = []

#         for function_name, weight in cost_functions.items():
#             cost_component = function_name(**kwargs)
#             cost += weight * cost_component
#             cost_components.append(cost_component)

#         return cost, cost_components

#     def compute_munkres(self, cost):
#         m = munkres.Munkres()
#         assignments = m.compute(cost)
#         return assignments


# # this is a test of the Association class using the cost methods contained in the Costs class
# if __name__ == '__main__':
#     pass
#     '''
#     import src.detection.detection as video_det
#     import src.lidar.lidar_detection as lidar_det
#     import src.association.costs as costs

#     costs = costs.Costs()

#     vdet0 = video_det.Detection()
#     vdet0.bbox = [412, 375, 486, 421]
#     vdet1 = video_det.Detection()
#     vdet1.bbox = [762, 374, 799, 408]
#     vdet2 = video_det.Detection()
#     vdet2.bbox = [913, 338, 1020, 375]
#     vdet3 = video_det.Detection()
#     vdet3.bbox = [708, 374, 739, 400]
#     vdet4 = video_det.Detection()
#     vdet4.bbox = [613, 361, 650, 384]
#     vdet5 = video_det.Detection()
#     vdet5.bbox = [562, 369, 600, 396]
#     vdet6 = video_det.Detection()
#     vdet6.bbox = [774, 378, 990, 502]
#     vdet7 = video_det.Detection()
#     vdet7.bbox = [893, 350, 954, 377]
#     vdet8 = video_det.Detection()
#     vdet8.bbox = [171, 360, 301, 416]

#     video_detections = [vdet0, vdet1, vdet2, vdet3, vdet4, vdet5, vdet6, vdet7, vdet8]

#     ldet0 = lidar_det.LIDAR_detection(840, 2, 38.3435516357, 0)
#     ldet1 = lidar_det.LIDAR_detection(840, 11, 12.4829711914, 0)
#     ldet2 = lidar_det.LIDAR_detection(840, 12, 12.714263915999998, 0)
#     ldet3 = lidar_det.LIDAR_detection(840, 12, 36.3725891113, 0)
#     ldet4 = lidar_det.LIDAR_detection(840, 13, 12.671356201199998, 0)
#     ldet5 = lidar_det.LIDAR_detection(840, 15, 12.3006744385, 0)
#     lidar_detections = [ldet0, ldet1, ldet2, ldet3, ldet4, ldet5]

#     # enter the cost method names as keys in the dictionary and weights as their values
#     # the returned costs array will have a number of rows equal to the number of video_detection objects
#     # and a number of columns equal to the nubmer of lidar_detection objects.
#     #
#     # each of the elements of the array represent the cost value between the i-th video_detection and the j-th lidar_detection
#     # cost (i,j) = weight[0] * cost_function[0](i, j) + weight[1] * cost_function[1](i,j) + ... + weight[n] * cost_function[n](i,j)
#     # for n cost functions and weights in the cost_functions dictionary

#     cost_functions = { costs.dist_between_centroids : 0.334,
#                        costs.dist_lidar_to_y2estimate : 0.333,
#                        costs.inverse_intersection_over_union : 0.333 }

#     a = Association()

#     # enter the video_detections and lidar_detections lists into the kwargs dictionary
#     kwargs = {'video_detections' : video_detections, 'lidar_detections' : lidar_detections}

#     # evaluate the costs array by passing the cost_functions dictionary and the kwargs dictionary to the evaluate_costs method
#     costs = a.evaluate_cost(cost_functions, **kwargs)

#     b = 1
#     '''

In [None]:
# This contains code to display the current mouse location on the video screen

global x_pixel, y_pixel, no_mouse_click_count
x_pixel = -1
y_pixel = -1
max_no_mouse_click_count = 100
no_mouse_click_count = max_no_mouse_click_count

def mouse_click(event, x, y, flags, param):
    global x_pixel, y_pixel, no_mouse_click_count
    no_mouse_click_count = max_no_mouse_click_count

    if event == cv2.EVENT_MOUSEMOVE:
        x_pixel = x
        y_pixel = y


In [None]:
# This contains code to draw bounding boxes on an imagerobert@verix-6440:~$ cd ~/PycharmProjects/

def draw_bboxes(bboxes, img):
    """
    Draw bounding boxes to frame
    :param bboxes: list of bboxes in [x1,y1,x2,y2] format
    :param img: np.arr
ay
    :return: image with bboxes drawn
    """
    img = img.copy()
    if bboxes is not None and len(bboxes) > 0:
        for i, bb in enumerate(bboxes):
            cv2.rectangle(img, (bb[0], bb[1]), (bb[2], bb[3]), (0, 255, 0), 2)

    return img


In [None]:
# This is the main loop for processing and testing associations

def run_association_test(manual_test, read_file = False):

    # This contains all of the imports needed for running the association tests

    import os
    import cv2
    import numpy as np
    import pandas as pd
    from src.detection.car_detector_tf_v2 import CarDetectorTFV2
    from src.detection.detection import Detection
    from src.lidar.lidar_detection import LIDAR_detection
    from src.association.association import Association
    from src.association.costs import Costs
    import src.util.calibration_kitti as cal
    import math
    import pykitti
    import datetime
    #from tqdm import tqdm
    from tqdm import tqdm_notebook
    
    
    # This contains all of the defines and hyperparameters neededfor the associations tests
    WRITE_VIDEO_FILE = False
    WRITE_DATA_FILE = False

    # set USE_DETECTOR to True to use the Yolo CNN or False to use the ground truth for the Video Detections
    USE_DETECTOR = True
    # the minimum detection confidence level for the Yolo CNN
    CONFIDENCE_THRESHOLD = 0.8


    # the run control variables
    PAUSE = False
    DISP_LIDAR = False
    DISP_DET = False
    DISP_ASSOC = True
    DISP_ZONES = True
    DISP_TRUTH = True
    DISP_RESULTS = True
    SLOW = False
  
    global x_pixel, y_pixel, no_mouse_click_count
    x_pixel = -1
    y_pixel = -1
    max_no_mouse_click_count = 100
    no_mouse_click_count = max_no_mouse_click_count

    def mouse_click(event, x, y, flags, param):
        global x_pixel, y_pixel, no_mouse_click_count
        no_mouse_click_count = max_no_mouse_click_count

        if event == cv2.EVENT_MOUSEMOVE:
            x_pixel = x
            y_pixel = y
    
    # the locations and dates of the video and lidat data files
    PWD = './'
    DATA_DATE = '2011_09_26'
    RUN_NUMBER = '0015'
    DATA_DIR = '../data'

    video_frame_lag = 0
    dist_thresh = 0.2 # set higher to allow less accurate lidar readings to be labeled as correct

    # This contains the setup for the lidar detections

    lidar_left = cal.SEG_TO_PIXEL_LEFT[0]
    lidar_right = cal.SEG_TO_PIXEL_RIGHT[15]
    lidar_top = cal.SEG_TO_PIXEL_TOP
    lidar_bottom = cal.SEG_TO_PIXEL_BOTTOM

    # read the lidar data
    m16 = pd.read_csv('{}/{}/{}_filtered.csv'.format(DATA_DIR, DATA_DATE, RUN_NUMBER ), skiprows=2)

    # read the ground truth from the tracklets data
    gt_df = pd.read_csv('{}/{}'.format(DATA_DIR, '2011_09_26_drive_0015_sync_converted-tracklets.csv'))
    gt_df['dist'] = gt_df['dist'] * cal.cal['M_TO_FT']
    # remove all objects that are not vehicles
    gt_df = gt_df[(gt_df['label']=='Car') | (gt_df['label']=='Truck') | (gt_df['label']=='Van')]
    # remove all objects outside the range of the lidar detector
    gt_df = gt_df[(gt_df['dist']>=30) & (gt_df['dist'] <= 140)]
    #remove all objects outside of the lidar fov
    gt_df = gt_df[gt_df['x1'] <= lidar_right]
    gt_df = gt_df[gt_df['x2'] >= lidar_left]
    gt_df = gt_df[gt_df['y1'] <= lidar_bottom]
    gt_df = gt_df[gt_df['y2'] >= lidar_top]

    
    column_names_2 = ['run_num','use_detector', 'max_cost', 'w0', 'w1', 'w2', 'total_associations', 'accuracy', 'precision', 'recall', 'total_possible_associations', 'true_pos', 'false_pos', 'false_neg']
    #test_results = pd.read_csv('test_runs.csv')

    #manual_test = [0, False, 1, 0.95, 0.05, 0, 0, 0, 0, 0, 0, 0, 0, 0]
    test_results = pd.DataFrame([manual_test], columns=column_names_2)


    # create a class to access the kitti dataset
    kitti_dataset = pykitti.raw(base_path='../data/', date=DATA_DATE, drive=RUN_NUMBER)

    if USE_DETECTOR:
        detector = CarDetectorTFV2()

    first_frame = True

    run_num = 0

    for run_num in tqdm_notebook(range(len(test_results))):

        total_possible_associations = 0
        true_pos = 0
        false_pos = 0
        false_neg = 0

        USE_DETECTOR = test_results.loc[(test_results['run_num'] == run_num)].use_detector.bool()



        max_cost = np.float(test_results.loc[test_results['run_num'] == run_num].max_cost)
        w0 = np.float(test_results.loc[test_results['run_num'] == run_num].w0)
        w1 = np.float(test_results.loc[test_results['run_num'] == run_num].w1)
        w2 = np.float(test_results.loc[test_results['run_num'] == run_num].w2)

        weights = [w0, w1, w2]

        column_names = ['frame', 'video_det_index', 'lidar_det_index', 'gt_index', 'lidar_dist', 'gt_dist', 'cost',
                        'correct', 'max_cost', 'dist_thresh', 'w0', 'c0', 'w1', 'c1', 'w2', 'c2', ]
        associations_record = pd.DataFrame([], columns=column_names)

        for frame_num, frame_filename in enumerate(kitti_dataset.cam2_files):
            new_frame = True
            frame = cv2.imread(frame_filename)
            success = frame.any()
            frame_draw = frame.copy()
            if not success:
                print('no frame')
                break

            if first_frame:
                cv2.namedWindow('draw_frame')
                cv2.setMouseCallback('draw_frame', mouse_click)
                first_frame = False


            # get the ground truth values
            gt_current_frame = gt_df.loc[gt_df['frame_number'] == frame_num]


            # fill in the list of detections
            bboxes = []
            c = Costs()

            if frame_num == 7:
                a = 1


            if USE_DETECTOR:
                # get the video detections
                
                bbs, class_names, confidences = detector.detect(img=frame, return_class_scores=True)
                for i, bb in enumerate(bbs):
                    # ensure bb is inside the window
                    bbs[i][0] = max(bb[0],0)
                    bbs[i][1] = max(bb[1],0)
                    bbs[i][2] = min(bb[2],cal.cal['X_RESOLUTION'])
                    bbs[i][3] = min(bb[3],cal.cal['Y_RESOLUTION'])
                n = len(class_names)
                eliminate_flag = np.zeros(n,np.int)
                for i, (bbox, class_name, confidence) in enumerate(zip(bbs, class_names, confidences)):
                    # filter out bounding boxes that do not intersect with the lidar zone and are not vehicles
                    if class_name not in ['car', 'truck', 'bus'] or confidence <= CONFIDENCE_THRESHOLD:
                        eliminate_flag[i] = 1
                    if (bbox[1] > lidar_bottom or bbox[3] < lidar_top or bbox[2] < lidar_left or bbox[0] > lidar_right):
                        eliminate_flag[i] = 1
                #remove the items from bbs, class_names and confidences
                new_bbs = []; new_class_names = []; new_confidences = [];
                for ii in range(n):
                    if eliminate_flag[ii] == 0:
                        new_bbs.append(bbs[ii])
                        new_class_names.append(class_names[ii])
                        new_confidences.append(confidences[ii])
                bbs = new_bbs
                class_names = new_class_names
                confidences = new_confidences

                n = len(class_names)
                eliminate_flag = np.zeros(n,np.int)

                #eliminate redundant detections - iou greater that 0.7 and different class_name
                for ii in range(n):
                    for jj in range(n):
                        if ii < jj and c._iou(bbs[ii],bbs[jj]) >= 0.7 and class_names[ii] != class_names[jj]:
                            eliminate_flag[jj] = 1

                #remove the redundant items from bbs, class_names and confidences
                new_bbs = []; new_class_names = []; new_confidences = [];
                for ii in range(n):
                    if eliminate_flag[ii] == 0:
                        new_bbs.append(bbs[ii])
                        new_class_names.append(class_names[ii])
                        new_confidences.append(confidences[ii])
                bbs = new_bbs
                class_names = new_class_names
                confidences = new_confidences
                # only append what's left
                for i, (bbox, class_name, confidence) in enumerate(zip(bbs, class_names, confidences)):
                    bboxes.append(bbox)

            else:
                # get the bounding boxes from the ground truth data
                gt_dist = []
                for i, gt in enumerate(gt_current_frame.values):
                    bboxes.append([gt[1], gt[2], gt[3], gt[4]])
                    gt_dist.append(gt[6])
                    total_possible_associations += 1

            video_detections = []
            if bboxes is not None and len(bboxes) > 0:
                for i, bb in enumerate(bboxes):
                    det = Detection()
                    det.bbox = np.array([bb[0], bb[1], bb[2], bb[3]])
                    det.frame_id = frame_num
                    video_detections.append(det)


            # get the lidar values
            lidar_vals = m16.loc[m16['frame'] == frame_num-video_frame_lag]

            lidar_detections = []
            for ii in range(len(lidar_vals)):
                if lidar_vals.iloc[ii,5] >= 30 and lidar_vals.iloc[ii,5] <= 140:
                    lidar_detection = LIDAR_detection(frame_num,int(lidar_vals.iloc[ii,4]),lidar_vals.iloc[ii,5],lidar_vals.iloc[ii,6])

                    lidar_detections.append(lidar_detection)



            # perform the associations task
            if len(lidar_detections) > 0 and len(video_detections) > 0:
                associations = []
                costs = Costs()

                # total_cost(i,j) = w_0 * cost_function_0(i,j) + w_1 * cost_function_1(i,j) + .. + w_n * cost_function_n(i,j)
                cost_functions = {costs.dist_between_centroids: weights[0],
                                  costs.dist_lidar_to_y2estimate: weights[1],
                                  costs.inverse_intersection_over_union: weights[2]}

                a = Association()

                # enter the video_detections and lidar_detections lists into the kwargs dictionary
                kwargs = {'video_detections': video_detections, 'lidar_detections': lidar_detections}

                # evaluate the costs array by passing the cost_functions dictionary and the kwargs dictionary to the evaluate_costs method
                costs, cost_components = a.evaluate_cost(cost_functions, **kwargs)

                original_costs = costs.copy()

                c_shape = np.shape(costs)
                rows = c_shape[0]
                cols = c_shape[1]
                if rows <= cols:
                    assignments = a.compute_munkres(costs)
#                     costs_T = np.transpose(costs)
#                     import pdb; pdb.set_trace()
#                     assignments_T = a.compute_munkres(costs_T)
#                     assignments = []

#                     for i, assignment in enumerate(assignments_T):
#                         assignments.append((assignment[1],assignment[0]))

                if len(assignments) != min(len(video_detections), len(lidar_detections)):
                    a = 1
                #determine if the associations are correct


                if USE_DETECTOR:
                    for i, gt in enumerate(gt_current_frame.values):
                        total_possible_associations += 1
                        bb_gt = [gt[1], gt[2], gt[3], gt[4]]
                        for j, assignment in enumerate(assignments):
                            if original_costs[assignment[0], assignment[1]] < max_cost:
                                bb_v = video_detections[assignment[0]].bbox
                                iou = c._iou(bb_gt, bb_v)
                                if iou > 0:
                                    dist_diff = abs(lidar_detections[assignment[1]].dist - gt[6])
                                    if dist_diff < dist_thresh * gt[6]:
                                        new_row = [frame_num, assignment[0], assignment[1], i,
                                                   lidar_detections[assignment[1]].dist, gt[6],
                                                   original_costs[assignment[0], assignment[1]], 'True', max_cost,
                                                   dist_thresh, weights[0],
                                                   cost_components[0][assignment[0], assignment[1]], weights[1],
                                                   cost_components[1][assignment[0], assignment[1]], weights[2],
                                                   cost_components[2][assignment[0], assignment[1]]]
                                        true_pos += 1
                                    else:
                                        new_row = [frame_num, assignment[0], assignment[1], i,
                                                   lidar_detections[assignment[1]].dist, gt[6],
                                                   original_costs[assignment[0], assignment[1]], 'False', max_cost,
                                                   dist_thresh, weights[0],
                                                   cost_components[0][assignment[0], assignment[1]], weights[1],
                                                   cost_components[1][assignment[0], assignment[1]], weights[2],
                                                   cost_components[2][assignment[0], assignment[1]]]
                                        false_pos += 1
                                    row_num = len(associations_record)
                                else:
                                    new_row = [frame_num, assignment[0], assignment[1], i,
                                               lidar_detections[assignment[1]].dist, gt[6],
                                               original_costs[assignment[0], assignment[1]], 'false_neg_iouzero', max_cost,
                                               dist_thresh, weights[0],
                                               cost_components[0][assignment[0], assignment[1]], weights[1],
                                               cost_components[1][assignment[0], assignment[1]], weights[2],
                                               cost_components[2][assignment[0], assignment[1]]]
                                    false_neg += 1
                            else:
                                new_row = [frame_num, assignment[0], assignment[1], i,
                                           lidar_detections[assignment[1]].dist, gt[6],
                                           original_costs[assignment[0], assignment[1]], 'false_neg_max_cost', max_cost,
                                           dist_thresh, weights[0],
                                           cost_components[0][assignment[0], assignment[1]], weights[1],
                                           cost_components[1][assignment[0], assignment[1]], weights[2],
                                           cost_components[2][assignment[0], assignment[1]]]
                                false_neg += 1
                            associations_record.loc[row_num] = new_row

                else:
                    cols = ['frame', 'video_det_index', 'lidar_det_index', 'gt_index', 'lidar_dist', 'gt_dist', 'cost', 'correct']
                    for assignment in assignments:
                        if original_costs[assignment[0],assignment[1]] < max_cost:
                            dist_diff = abs(lidar_detections[assignment[1]].dist - gt_dist[assignment[0]])
                            if dist_diff < dist_thresh * gt_dist[assignment[0]]:
                                new_row = [frame_num, assignment[0], assignment[1], i, lidar_detections[assignment[1]].dist,
                                           gt_dist[assignment[0]], original_costs[assignment[0], assignment[1]], 'True', max_cost, dist_thresh,
                                           weights[0], cost_components[0][assignment[0], assignment[1]], weights[1],
                                           cost_components[1][assignment[0], assignment[1]], weights[2],
                                           cost_components[2][assignment[0], assignment[1]]]
                                true_pos += 1
                            else:
                                new_row = [frame_num, assignment[0], assignment[1], i, lidar_detections[assignment[1]].dist,
                                           gt_dist[assignment[0]], original_costs[assignment[0], assignment[1]], 'False', max_cost, dist_thresh,
                                           weights[0], cost_components[0][assignment[0], assignment[1]], weights[1],
                                           cost_components[1][assignment[0], assignment[1]], weights[2],
                                           cost_components[2][assignment[0], assignment[1]]]
                                false_pos += 1
                            row_num = len(associations_record)       
                        else:
                            new_row = [frame_num, assignment[0], assignment[1], i, lidar_detections[assignment[1]].dist,
                                       gt_dist[assignment[0]], original_costs[assignment[0], assignment[1]], 'false_neg_maxcost',
                                       max_cost, dist_thresh,
                                       weights[0], cost_components[0][assignment[0], assignment[1]], weights[1],
                                       cost_components[1][assignment[0], assignment[1]], weights[2],
                                       cost_components[2][assignment[0], assignment[1]]]
                            false_neg += 1
                        associations_record.loc[row_num] = new_row

            #display the frame once if not PAUSE; continuously if PAUSE
            while new_frame or PAUSE:
                new_frame = False # only go through once unless PAUSE

                # draw vertical line in center of image
                cv2.line(frame_draw, (int(cal.cal['X_CENTER']), 0), (int(cal.cal['X_CENTER']), int(frame_draw.shape[0])),
                         (255, 0, 255), 1)
                cv2.line(frame_draw, (0, int(cal.cal['Y_HORIZON'])), (int(frame_draw.shape[1]), int(cal.cal['Y_HORIZON'])),
                         (255, 0, 255), 1)
                cv2.putText(frame_draw, 'frame: {0:0.0f}'.format(frame_num), (0, 25), 1, 2, (0, 0, 255), 2)

                if DISP_DET: # show video detections in green
                    for video_detection in video_detections:
                        cv2.rectangle(frame_draw, (int(bb[0]), int(bb[1])), (int(bb[2]), int(bb[3])), (0, 255, 0), 2)

                if DISP_LIDAR: # show lidar ideal bounding boxes in yellow
                    for lidar_detection in lidar_detections:
                        lidar_dist = lidar_detection.dist
                        bb = lidar_detection.bb
                        cv2.rectangle(img=frame_draw, pt1=(int(bb[0]), int(bb[1])), pt2=(int(bb[2]), int(bb[3])),
                                      color=(0, 255, 255), thickness=2)
                        cv2.putText(frame_draw, '{0:0.2f}'.format(lidar_dist), (int(bb[0]), int(bb[1])), 1, 1, (0, 0, 255), 2)

                if DISP_ASSOC: # show associations in blue and red connected by a yellow line
                    if len(lidar_detections) > 0 and len(video_detections) > 0:
                        for assignment in assignments:
                            if original_costs[assignment[0],assignment[1]] <= max_cost:
                                bb_v = video_detections[assignment[0]].bbox
                                dist_est = float(video_detections[assignment[0]].dist_est_y2[assignment[1]])
                                lidar_dist = lidar_detections[assignment[1]].dist
                                cv2.rectangle(img=frame_draw, pt1=(int(bb_v[0]),int(bb_v[1])), pt2=(int(bb_v[2]),int(bb_v[3])), color=(255,0,0), thickness=2)
        #                        cv2.putText(frame_draw, '{0:0.0f}'.format(assignment[0]), (int(bb_v[0]),int(bb_v[1])), 1, 1, (255, 0, 0), 2)
        #                        cv2.putText(frame_draw, '{0:0.2f}'.format(dist_est), (int(bb_v[0]-30), int(bb_v[3]+25)), 1, 1, (255, 0, 0), 2)
                                bb_l = lidar_detections[assignment[1]].bb
                                cv2.rectangle(img=frame_draw, pt1=(int(bb_l[0]),int(bb_l[1])), pt2=(int(bb_l[2]),int(bb_l[3])), color=(0,0,255), thickness=2)
        #                        cv2.putText(frame_draw, '{0:0.0f}'.format(assignment[1]), (int(bb_l[0]),int(bb_l[1])), 1, 1, (0, 0, 255), 2)
                                cv2.putText(frame_draw, '{0:0.2f}'.format(lidar_dist), (int(bb_l[0]),int(bb_l[3])+25), 1, 1, (0, 0, 255), 2)
                                cv2.line(img=frame_draw, pt1=(int(bb_v[0]),int(bb_v[1])), pt2=(int(bb_l[0]),int(bb_l[1])), color=(0,255,255), thickness=2)

                if DISP_ZONES: # show the lidar zone boundaries in black
                    y1 = int(cal.SEG_TO_PIXEL_TOP)
                    y2 = int(cal.SEG_TO_PIXEL_BOTTOM)
                    for i in range(16):
                        x = int(cal.SEG_TO_PIXEL_LEFT[i])
                        cv2.line(frame_draw, (x, y1), (x, y2), (0, 0, 0), thickness=1)
                        cv2.line(frame_draw, (x - 5, y1), (x + 5, y1), (0, 0, 0), thickness=1)
                        cv2.line(frame_draw, (x - 5, y2), (x + 5, y2), (0, 0, 0), thickness=1)

                    x = int(cal.SEG_TO_PIXEL_RIGHT[i])
                    cv2.line(frame_draw, (x, y1), (x, y2), (0, 0, 0), thickness=1)
                    cv2.line(frame_draw, (x - 5, y1), (x + 5, y1), (0, 0, 0), thickness=1)
                    cv2.line(frame_draw, (x - 5, y2), (x + 5, y2), (0, 0, 0), thickness=1)

                if DISP_TRUTH: # show the ground truth bboxes and dist
                    for i in range(len(gt_current_frame)):
                        x1, y1, x2, y2, label, dist = gt_current_frame.iloc[i,1:]
                        if not (x2 < lidar_left or x1 > lidar_right or y1 > lidar_bottom or y2 < lidar_top):
                            cv2.rectangle(img=frame_draw, pt1=(int(x1), int(y1)), pt2=(int(x2), int(y2)),color=(0, 255, 0), thickness=2)
                            cv2.putText(frame_draw, '{0:0.1f}'.format(dist), (int(x1), int(y1)), 1, 1,(0, 255, 0), 2)

                # show the nouse coordinates
                if x_pixel >= 0 and no_mouse_click_count > 0:
                    cv2.putText(frame_draw, '({0:0.0f}, {1:0.0f})'.format(x_pixel, y_pixel),
                                (cal.cal['X_RESOLUTION'] - 120, 15), 1, 1, (0, 0, 255), 2)
                    no_mouse_click_count -= 1

                if DISP_RESULTS:
                    pass

                cv2.putText(frame_draw, 'frame: {0:0.0f}'.format(frame_num), (0, 25), 1, 2, (0, 0, 255), 2)
                cv2.imshow('draw_frame', frame_draw)

                if SLOW:
                    key = cv2.waitKey(1000) & 0xFF
                else:
                    key = cv2.waitKey(30) & 0xFF

                if frame_num == 212:
                    a = 1

                if key == ord('q') or key == 27:
                    exit(0)
                if key == ord('p') or key == ord('P'):
                    PAUSE = not PAUSE
                if key == ord('l') or key == ord('L'):
                    DISP_LIDAR = not DISP_LIDAR
                if key == ord('d') or key == ord('D'):
                    DISP_DET = not DISP_DET
                if key == ord('a') or key == ord('A'):
                    DISP_ASSOC = not DISP_ASSOC
                if key == ord('s') or key == ord('S'):
                    SLOW = not SLOW
                if key == ord('z') or key == ord('Z'):
                    DISP_ZONES = not DISP_ZONES
                if key == ord('t') or key == ord('T'):
                    DISP_TRUTH = not DISP_TRUTH

        false_neg = total_possible_associations - (true_pos + false_pos)
        accuracy = true_pos / total_possible_associations
        if true_pos + false_pos > 0:
            precision = true_pos / (true_pos + false_pos)
        else:
            precision = np.nan

        if true_pos + false_neg > 0:
            recall = true_pos / (true_pos + false_neg)
        else:
            recall = np.nan

        now = datetime.datetime.now()
        filename = 'results_{0:04d}.csv'.format(run_num)

        associations_record.to_csv(filename, index=False)

        print('run: {0:0.0f}, accy: {1:0.3f}, prec: {2:0.3f}, recall: {3:0.3f}, total_assoc: {4:0.0f}, total_poss_assoc: {5:0.0f}, true_pos: {6:0.0f}, '
              'false_pos: {7:0.0f}, false_neg: {8:0.0f}, use_detector:{9:}, max_cost: {10:0.3f}, w0: {11:0.3f}, w1: {12:0.3f}, '
              'w2: {13:0.3f}'.format(run_num, accuracy, precision, recall, len(associations_record), total_possible_associations, true_pos, false_pos, false_neg, str(USE_DETECTOR), max_cost, weights[0], weights[1], weights[2]))

        #column_names_2 = ['run_num', 'max_cost', 'w0', 'w1', 'w2', 'total_associations', 'accuracy',
        #                  'total_possible_associations', 'true_pos', 'false_pos', 'false_neg']

        test_results.iloc[run_num,6] = len(associations_record)
        test_results.iloc[run_num,7] = accuracy
        test_results.iloc[run_num,8] = precision
        test_results.iloc[run_num,9] = recall
        test_results.iloc[run_num,10] = total_possible_associations
        test_results.iloc[run_num,11] = true_pos
        test_results.iloc[run_num,12] = false_pos
        test_results.iloc[run_num,13] = false_neg

    filename = 'test_results_' + str(now) + '.csv'
    test_results.to_csv(filename)
    cv2.waitKey(1)
    cv2.destroyAllWindows()
    cv2.waitKey(1)

    print('Run Complete!')
    
    return test_results




### Testing Definitions

The following definitions are used to evaluate the accuracy of the association process:


__True Positive:__

If: an association has a video detection bounding box that intersects with a Ground Truth bounding box __AND__ has a cost value less than the hyper-parameter max_cost __AND__ has a lidar distance value __within__ the hyper-parameter __dist_thresh__ (percentage) of the Ground Truth distance 

Then: it is labeled a __True Positive__ association (true_pos)

>These __True Positive__ associations are the correct associations with a video detection and lidar reading that is related with high confidence to a labeled Ground Truth object

__False Positive - Video Detector:__

If: an association is made with a video detection bounding box that does not intersect a ground truth bounding box 

Then: the association is labeled a __False Positive - Video Detector__
>These __False Positive - Video Detector__ associations are errors caused by the video detector rather than the association process so they do not effect the association accuracy

__False Positive:__

If: an association is made that has a cost value greater than the hyper-parameter max_cost __OR__ has a lidar distance value outside of the hyper-parameter __dist_thresh__ (percentage) of the Ground Truth distance 

Then: the association is labeled a __False Positive:__ (false_pos)

>These __False Positive:__ associations are the incorrect associations with a video detection and lidar reading that intersects a Ground Truth object but fails either the max_cost of dist_thresh tests. Note that the Munkres Algorithm always returns the association pairs that have a global minimum cost. Some of these associations may distances that are too far away from the ground truth or may just barely intersect the bounding box and need to be rejected using hyper-parameters.

__False Negative:__

If: all the associations have been processed and a ground truth object has not been labeled as either a True Positive or a False Positive using the rules above

Then: the ground truth object is labeled as a __False Negative__ (false_neg)

>These __False Negative__ associations are missing associations that failed to be made either due to the lack of a video detection that intersects with a Ground Truth object or the lack of a lidar detection that intersects with the video detection. The most common of these two faults is the missing video detection. To evaluate the magnitude of the missing video detections, the number of false negatives can be compared using a USE_DETECTION = False hyper-parameter on the association algorithm.


__True Negative:__

True Negatives are not evaluated in this algorithm because they are not applicable.

### Calculating Accuracy, Precision and Recall

In addition, the following equations are used to calculate the accuracy, precision and recall of a run.

\begin{equation*}
Accuracy = \frac{True Positives}{(True Positives + False Positives + False Negatives)}
\end{equation*}

\begin{equation*}
Precision = \frac{True Positives}{(True Positives + False Positives)}
\end{equation*}

\begin{equation*}
Recall = \frac{True Positives}{(True Positives + False Negatives)}
\end{equation*}


In [None]:
#column_names_2 = ['run_num','use_detector', 'max_cost', 'w0', 'w1', 'w2', 'total_associations', 'accuracy', 'precision', 'recall', 'total_possible_associations', 'true_pos', 'false_pos', 'false_neg']
manual_test1 = [0, False, 1, 0.95, 0.05, 0, 0, 0, 0, 0, 0, 0, 0, 0]
# import pdb; pdb.set_trace()
test_results1 = run_association_test(manual_test1)






Instructions for updating:
Use `tf.keras.layers.Conv2D` instead.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
Instructions for updating:
Use keras.layers.BatchNormalization instead.  In particular, `tf.control_dependencies(tf.GraphKeys.UPDATE_OPS)` should not be used (consult the `tf.keras.layers.batch_normalization` documentation).
Instructions for updating:
Use `tf.cast` instead.

Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where



Please use `tqdm.notebook.tqdm` instead of `tqdm.tqdm_notebook`


HBox(children=(FloatProgress(value=0.0, max=1.0), HTML(value='')))

run: 0, accy: 0.998, prec: 0.998, recall: 1.000, total_assoc: 452, total_poss_assoc: 452, true_pos: 451, false_pos: 1, false_neg: 0, use_detector:False, max_cost: 1.000, w0: 0.950, w1: 0.050, w2: 0.000

Run Complete!


In [None]:
test_results1

Unnamed: 0,run_num,use_detector,max_cost,w0,w1,w2,total_associations,accuracy,precision,recall,total_possible_associations,true_pos,false_pos,false_neg
0,0,False,1,0.95,0.05,0,452,0.997788,0.997788,1.0,452,451,1,0


In [None]:
manual_test2 = [0, True, 1, 0.95, 0.05, 0, 0, 0, 0, 0, 0, 0, 0, 0]

test_results2 = run_association_test(manual_test2)


Please use `tqdm.notebook.tqdm` instead of `tqdm.tqdm_notebook`


HBox(children=(FloatProgress(value=0.0, max=1.0), HTML(value='')))

run: 0, accy: 0.834, prec: 0.973, recall: 0.854, total_assoc: 368, total_poss_assoc: 429, true_pos: 358, false_pos: 10, false_neg: 61, use_detector:True, max_cost: 1.000, w0: 0.950, w1: 0.050, w2: 0.000

Run Complete!


In [None]:
test_results2

Unnamed: 0,run_num,use_detector,max_cost,w0,w1,w2,total_associations,accuracy,precision,recall,total_possible_associations,true_pos,false_pos,false_neg
0,0,True,1,0.95,0.05,0,368,0.834499,0.972826,0.854415,429,358,10,61


dragon15
ddd   +##### 

In [None]:
import tensorflow as tf
tf.__version__

'1.14.0'