<a href="https://colab.research.google.com/github/kylehounslow/association_algo_performance/blob/master/src/association_test.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>




## The Association Process Testing

This file documents the process of associating lidar readings with the detected bounding boxes of objects in the field of view of both a video camera and a LeddarTech M16 Lidar Detector.

#### Background

__LIDAR Detector__

The LeddarTech M16 provides a non-scanning lidar measurement. It returns distance measurements at a high sampling rate from 16 lidar zones that are arrayed from left to right across a field of view of 45 degrees horizontally and 7.5 degrees vertically. The individual segments cover equal angles of about 2.8 degrees horizontally and 7.5 degrees vertically. Each LeddarTech reading consists of a segment number (0-15) and a distance reading. There may be zero, one or more distance measurements per segment, depending on the objects that are in the field of view. The range of the M16 is about 140 feet.
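
The segment geometry described above can be sketched in a few lines. This assumes the 16 segments are equal in width and centered on the optical axis, so the edges are an approximation rather than calibration data. `segment_angles` is a hypothetical helper and is not part of the notebook's source.

```python
# Hypothetical helper: approximate horizontal angular span of each M16 segment.
# Assumes 16 equal segments centered on the optical axis (not calibration data).
NUM_SEGMENTS = 16
HORIZONTAL_FOV_DEG = 45.0
SEGMENT_WIDTH_DEG = HORIZONTAL_FOV_DEG / NUM_SEGMENTS  # 2.8125, i.e. "about 2.8"


def segment_angles(segment):
    """Return (left_deg, right_deg) for a segment number 0-15, left to right."""
    if not 0 <= segment < NUM_SEGMENTS:
        raise ValueError("segment must be in the range 0-15")
    left = -HORIZONTAL_FOV_DEG / 2 + segment * SEGMENT_WIDTH_DEG
    return left, left + SEGMENT_WIDTH_DEG


for seg in (0, 7, 15):
    print(seg, segment_angles(seg))
```

Negative angles are to the left of the optical axis in this sketch, so segment 0 starts at -22.5 degrees and segment 15 ends at +22.5 degrees.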

__Video Object Detector__

The setup uses the TensorFlow deep learning software package https://www.tensorflow.org/ and the YOLOv3 model https://github.com/maiminh1996/YOLOv3-tensorflow to detect objects in the field of view of the camera.

__Association__

Having both distance and object detection information available is extremely valuable, but only if we know which lidar distance reading belongs to which object returned from the video object detector. This is __The Association Problem__ that is developed and tested in this notebook.


#### Description of the Problem

The drawing below shows a 3D projection of the LeddarTech M16 field of view onto the field of view of the video camera.

<html><img src=https://github.com/rhanschristiansen/association_algo_performance/blob/master/src/images/Fields_of_View.jpg?raw=1 width=560></html>

When the setup is deployed in the field, the information that is retrieved for a single frame looks like the image below. Here you can see the lidar zones of the M16 shown with thin black lines across the lower middle of the image. 

__Displaying Lidar Readings__

If a zone of the lidar detector returns a distance reading the zone is highlighted in yellow and the distance reading (in feet) is displayed above the lidar zone. If more than one value is returned for a given zone, multiple red distance readings are stacked vertically over the lidar zone. 

__Displaying Video Detections__

The object detections that are returned from the TensorFlow YOLO detector are shown as green bounding boxes. The class of the object is also shown in green text in the lower left corner of the bounding box.

In the frame below, 5 lidar distance readings were returned in zones 4, 5, 6, 7 & 8, and 3 objects were detected. In this scenario, a fairly simple algorithm could be developed to map lidar values to objects.

These are the video detection objects:

|Object # | Bounding Box (x1, y1, x2, y2) | Object Class | Confidence |
| :-----: | :---------------------------: | :----------: | :--------: |
|    1    |   (495, 354, 561, 409)        |    Car       |  0.9839655 |
|    2    |   (663, 366, 697, 392)        |    Car       |  0.980555  |
|    3    |   (598, 368, 667, 420)        |    Car       |  0.9420464 |

And these are the lidar detections:

|Detection # | Segment |   Distance   |
| :--------: | :-----: | :----------: | 
|    1       |    4    | 105.799030   |
|    2       |    5    | 105.018769   |
|    3       |    6    | 104.506889   |
|    4       |    7    |  88.595796   |
|    5       |    8    |  88.714592   |

<html><img src=https://github.com/rhanschristiansen/association_algo_performance/blob/master/src/images/lidar_to_image_rendering_14.png?raw=1 width=1280></html>

However, other frames are much more complex, with more objects and more lidar readings. In the frame shown below, there are a total of 8 objects and 23 lidar detections. This presents a significant challenge to any association algorithm.

In this frame, these are the video detection objects:

|Object # | Bounding Box (x1, y1, x2, y2) | Object Class | Confidence  |
| :-----: | :---------------------------: | :----------: | :---------: |
|    1    |   ( -7, 317, 427, 621)        |    Car       |  0.9997973  |
|    2    |   (786, 354, 1027, 504)       |    Car       |  0.9997645  |
|    3    |   (380, 360, 512, 454)        |    Car       |  0.99277544 |
|    4    |   (1129, 418, 1270, 713)      |    Car       |  0.9813789  |
|    5    |   (589, 371, 669, 420)        |    Car       |  0.97814137 |
|    6    |   (741, 363, 825, 440)        |    Car       |  0.9496468  |
|    7    |   (707, 372, 738, 407)        |    Car       |  0.8582622  |
|    8    |   (517, 355, 578, 410)        |    Car       |  0.8155548  |

And these are the lidar detections:

|Detection # | Segment |   Distance   |
| :--------: | :-----: | :----------: | 
|    1       |    0    |   20.352862  |
|    2       |    3    |   51.171712  |
|    3       |    4    |   50.816675  |
|    4       |    5    |   50.729818  |
|    5       |    5    |   91.829979  |
|    6       |    6    |   91.648105  |
|    7       |    7    |   85.183094  |
|    8       |    7    |  112.802604  |
|    9       |    8    |   84.882674  |
|   10       |    8    |  111.584855  |
|   11       |    9    |   84.740849  |
|   12       |    9    |  112.994140  |
|   13       |   10    |   59.250310  |
|   14       |   10    |   84.906053  |
|   15       |   11    |   59.656511  |
|   16       |   12    |   34.480209  |
|   17       |   13    |   35.184426  |
|   18       |   13    |  108.944754  |
|   19       |   14    |   35.332408  |
|   20       |   14    |  108.647988  |
|   21       |   15    |   35.796530  |
|   22       |   14    |   37.425736  |
|   23       |   14    |  110.845094  |

<html><img src=https://github.com/rhanschristiansen/association_algo_performance/blob/master/src/images/lidar_to_image_rendering_19.png?raw=1 width=1280></html>


### Solving the Association Problem

The image below shows a close-up of the simplest association problem so that we can outline an algorithm to address it.

<html><img src=https://github.com/rhanschristiansen/association_algo_performance/blob/master/src/images/closeup_association.png?raw=1 width=560></html>

These are the video detection objects (in green):

|Object # | Bounding Box (x1, y1, x2, y2) | Object Class | Confidence |
| :-----: | :---------------------------: | :----------: | :--------: |
|    1    |   (495, 354, 561, 409)        |    Car       |  0.9839655 |
|    2    |   (663, 366, 697, 392)        |    Car       |  0.980555  |
|    3    |   (598, 368, 667, 420)        |    Car       |  0.9420464 |

From left to right the three objects are 1, 3 & 2 

And these are the lidar detections (in red):

|Detection # | Segment |   Distance   |
| :--------: | :-----: | :----------: | 
|    1       |    4    | 105.799030   |
|    2       |    5    | 105.018769   |
|    3       |    6    | 104.506889   |
|    4       |    7    |  88.595796   |
|    5       |    8    |  88.714592   |

From left to right the lidar detections are in numerical order 1, 2, 3, 4 & 5 

From the data above, using our intuition we could make the following proposals for rules:

>1) A lidar reading can only be associated with the object if the object bounding box intersects the segment of the lidar region

We can represent the associations as a matrix with objects in rows and lidar detections in columns, where a one in position (i,j) represents an association between object i and lidar detection j.

Using rule #1 we would have the following associations matrix:

| Obj# \ Det# |  1  |  2  |  3  |  4  |  5  |
| ----------: | :-: | :-: | :-: | :-: | :-: | 
|      1      |  0  |  1  |  1  |  0  |  0  |
|      2      |  0  |  0  |  0  |  0  |  1  |
|      3      |  0  |  0  |  0  |  1  |  1  |

Unfortunately, objects 1 and 3 both have two different lidar readings associated with them and lidar reading 5 is associated with two different objects.
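
Rule #1 for the simple frame can be sketched as a horizontal overlap test between each bounding box and each lidar segment. The segment edges below (`X0`, `WIDTH`) are hypothetical values chosen only so that the example reproduces the matrix above; they are not the notebook's calibration data, and vertical overlap is simply assumed.

```python
# Hypothetical segment edges in pixels: segment s spans
# [X0 + s * WIDTH, X0 + (s + 1) * WIDTH]. These are NOT the real calibration
# values; they are chosen only so the example reproduces the matrix above.
X0, WIDTH = 230, 52

# (x1, y1, x2, y2) bounding boxes for objects 1-3 from the table above
bboxes = [(495, 354, 561, 409), (663, 366, 697, 392), (598, 368, 667, 420)]
# lidar segment number for detections 1-5
segments = [4, 5, 6, 7, 8]


def rule1_matrix(bboxes, segments):
    """Entry (i, j) is 1 when bbox i overlaps the segment of detection j."""
    matrix = []
    for x1, _, x2, _ in bboxes:
        row = []
        for seg in segments:
            seg_left = X0 + seg * WIDTH
            seg_right = seg_left + WIDTH
            # horizontal overlap only; vertical overlap is assumed here
            row.append(int(x1 <= seg_right and x2 >= seg_left))
        matrix.append(row)
    return matrix


for row in rule1_matrix(bboxes, segments):
    print(row)
```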

For the second case, shown above, the association matrix would be significantly more complex:

| Obj# \ Det# |  1  |  2  |  3  |  4  |  5  |  6  |  7  |  8  |  9  | 10  | 11  | 12  | 13  | 14  | 15  | 16  | 17  | 18  | 19  | 20  | 21  | 22  | 23  |
| ----------: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | 
|      1      |  1  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |
|      2      |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  1  |  1  |  1  |  1  |  1  |  1  |  1  |  1  |  1  |  1  |  1  |
|      3      |  0  |  1  |  1  |  1  |  1  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |
|      4      |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |
|      5      |  0  |  0  |  0  |  0  |  0  |  0  |  1  |  1  |  1  |  1  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |
|      6      |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  1  |  1  |  1  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |
|      7      |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  1  |  1  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |
|      8      |  0  |  0  |  0  |  1  |  1  |  1  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |  0  |


In this case we have most objects with more than one lidar reading associated with them and several lidar readings that are associated with more than one object.

One approach that could be attempted is to develop a rule-based method for choosing the best lidar reading for each individual object. One possible rule might be:

>2) When more than one lidar reading is associated with an object based on rule #1 above, choose the lidar reading with the closest distance.

Unfortunately, this rule leads to a conflict: objects 3 and 8 are both associated with lidar reading #4, which has the lowest distance among the candidates for each of them. By observation, it is clear that lidar reading #4 should be associated with object #8 rather than object #3, because object #8 is the closer of the two.

By trying to resolve the dilemma with a rule-based approach, we can run into an endless set of rules that are complex and self-contradictory.
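
The conflict produced by rule #2 can be reproduced with a short sketch. The candidate lists are taken from rows 3 and 8 of the larger association matrix and the distances from the lidar table; `apply_rule2` and `find_conflicts` are hypothetical helpers written for this illustration.

```python
# Candidate lidar detections per object under rule #1 (rows 3 and 8 of the
# larger association matrix) and the distances from the lidar table, in feet.
candidates = {3: [2, 3, 4, 5], 8: [4, 5, 6]}
distance = {2: 51.171712, 3: 50.816675, 4: 50.729818, 5: 91.829979, 6: 91.648105}


def apply_rule2(candidates, distance):
    """For each object, pick the candidate detection with the smallest distance."""
    return {obj: min(dets, key=distance.__getitem__)
            for obj, dets in candidates.items()}


def find_conflicts(choice):
    """Return {detection: [objects]} for detections claimed by several objects."""
    claimed = {}
    for obj, det in choice.items():
        claimed.setdefault(det, []).append(obj)
    return {det: objs for det, objs in claimed.items() if len(objs) > 1}


choice = apply_rule2(candidates, distance)
print(choice)
print(find_conflicts(choice))
```

Both objects select detection #4, so any further rule has to break the tie, which is where the rule set starts to grow.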

### The Hungarian Algorithm

One of the methods to solve this assignment problem is the __Hungarian Algorithm__ http://hungarianalgorithm.com/index.php or one of its variants called the __Munkres Algorithm__ http://csclab.murraystate.edu/~bob.pilgrim/445/munkres.html. The Munkres algorithm is similar to the Hungarian algorithm but works with non-square matrices.


#### The cost function

The input to the Munkres Algorithm is a cost matrix which defines a __cost__ value for each object/lidar detection assignment pair. The Munkres Algorithm returns an assignment matrix, or a list of assignment pairs, for which the overall sum of the costs of all the assignments is minimized.
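
To make the objective concrete, the sketch below finds the minimum-cost assignment for a tiny rectangular cost matrix by exhaustive search. This is a brute-force stand-in, not the Munkres algorithm itself, and the cost values are made up for illustration. For realistic sizes, a polynomial-time solver such as SciPy's `scipy.optimize.linear_sum_assignment` solves the same problem, including for non-square matrices.

```python
from itertools import permutations

# Hypothetical cost matrix: 3 objects (rows) x 5 lidar detections (columns).
# The values are made up for illustration; 9 marks a poor pairing.
costs = [
    [9, 2, 1, 9, 9],
    [9, 9, 9, 9, 3],
    [9, 9, 9, 1, 2],
]


def brute_force_assignment(costs):
    """Return (pairs, total) minimizing the summed cost, one column per row.

    Exhaustive search, only practical for tiny matrices; assumes rows <= columns.
    """
    n_rows, n_cols = len(costs), len(costs[0])
    best_pairs, best_total = None, float("inf")
    for cols in permutations(range(n_cols), n_rows):
        total = sum(costs[r][c] for r, c in enumerate(cols))
        if total < best_total:
            best_pairs, best_total = list(enumerate(cols)), total
    return best_pairs, best_total


pairs, total = brute_force_assignment(costs)
print(pairs, total)
```

Each pair is a zero-based (object index, detection index). Every object gets a distinct detection, and the two unused detections are simply left unassigned.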

To make the algorithm work it is necessary to create a cost function that decreases as the quality of the assignment increases.
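
As an example of a cost with this property, the sketch below uses one minus the intersection over union (IoU) of two boxes: a perfect overlap costs 0 and no overlap costs 1. This is an assumption made for illustration; the notebook's own `Costs` class may define its IoU-based cost differently.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iou_cost(box_a, box_b):
    """Hypothetical cost in [0, 1]: 0 for identical boxes, 1 for disjoint ones."""
    return 1.0 - iou(box_a, box_b)


print(iou_cost((0, 0, 10, 10), (0, 0, 10, 10)))
print(iou_cost((0, 0, 10, 10), (20, 20, 30, 30)))
```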

#### The Ideal Bounding Box

__Possible Cost Functions__

Several possible cost functions can be developed for the input to the Munkres Algorithm

>1) The L2 Norm - 

# Colab Environment Setup
Download the supporting source files and data to the Colab environment


In [None]:
def clone_source_code():
  """
  Clone the github repo and move to this working directory
  """
  print("Downloading source code...")
#   !_=$(git clone --quiet https://github.com/rhanschristiansen/association_algo_performance.git)
  !_=$(git clone --quiet https://github.com/kylehounslow/association_algo_performance.git)
  !mv association_algo_performance/* .
  !rm -rf association_algo_performance/

def download_extract_data():
  """
  Download data.zip from Google Drive and extract to this working directory
  """
  print("Downloading data...")
  !curl -c ./cookie -s -L "https://drive.google.com/uc?export=download&id=1ornFmw59u_It0Cpi8yDHRdpW4G_6YLj5" > /dev/null
  !curl -s -Lb ./cookie "https://drive.google.com/uc?export=download&confirm=`awk '/download/ {print $NF}' ./cookie`&id=1ornFmw59u_It0Cpi8yDHRdpW4G_6YLj5" -o "data.zip"
  print("Extracting data...")
  !unzip -q data.zip

def download_yolov3_weights():
  """
  Download the YOLOv3 weights from Google Drive to src/detection/yolov3/
  """
  print("Downloading CNN weights...")
  !curl -c ./cookie -s -L "https://drive.google.com/uc?export=download&id=1pdFmnxLEbezktEbA7tfnVoU50wowXI2r" > /dev/null
  !curl -s -Lb ./cookie "https://drive.google.com/uc?export=download&confirm=`awk '/download/ {print $NF}' ./cookie`&id=1pdFmnxLEbezktEbA7tfnVoU50wowXI2r" -o "src/detection/yolov3/yolov3.weights"

def install_requirements():
  print("Installing pip packages...")
  !pip install --quiet -r src/requirements.txt

def setup():
  import sys
  sys.path.append('./') #add the parent directory to the path
  !rm -rf *
  clone_source_code()
  install_requirements()
  download_extract_data()
  download_yolov3_weights()
  print("Setup Complete.")

setup()

# The Code

## Helper Functions

In [None]:
def draw_bboxes(bboxes, img):
    """
    Draw bounding boxes to frame
    :param bboxes: list of bboxes in [x1,y1,x2,y2] format
    :param img: np.array
    :return: image with bboxes drawn
    """
    import cv2  # imported here because cv2 is otherwise only imported inside run_association_test
    img = img.copy()
    if bboxes is not None and len(bboxes) > 0:
        for bb in bboxes:
            # cv2.rectangle requires integer pixel coordinates
            pt1 = (int(bb[0]), int(bb[1]))
            pt2 = (int(bb[2]), int(bb[3]))
            cv2.rectangle(img, pt1, pt2, (0, 255, 0), 2)

    return img

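def play_video_html(video_filepath: str, width: int = 720):
    """
    Embed an mp4 file in the notebook output as a base64 data URL
    """
    from IPython.display import HTML
    from base64 import b64encode
    # use a context manager so the file handle is closed after reading
    with open(video_filepath, 'rb') as f:
        mp4 = f.read()
    data_url = "data:video/mp4;base64," + b64encode(mp4).decode()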
    return HTML("""
    <video width={width} controls autoplay>
        <source src="{data_url}" type="video/mp4">
    </video>
    """ .format(width=width, data_url=data_url))

def download_file(filepath: str):
    from google.colab import files
    files.download(filepath)

## Association Test Runner

In [None]:
# This is the main loop for processing and testing associations

def run_association_test(manual_test, output_video_file:str = "", read_file:bool = False):

    # This contains all of the imports needed for running the association tests

    import os
    import cv2
    import numpy as np
    import pandas as pd
    from src.detection.car_detector_tf_v2 import CarDetectorTFV2
    from src.detection.detection import Detection
    from src.lidar.lidar_detection import LIDAR_detection
    from src.association.association import Association
    from src.association.costs import Costs
    import src.util.calibration_kitti as cal
    import math
    import pykitti
    import datetime
    from tqdm.notebook import tqdm
    
    
    # This contains all of the defines and hyperparameters needed for the association tests
    video_writer = None
    WRITE_DATA_FILE = False

    # set USE_DETECTOR to True to use the Yolo CNN or False to use the ground truth for the Video Detections
    USE_DETECTOR = True
    # the minimum detection confidence level for the Yolo CNN
    CONFIDENCE_THRESHOLD = 0.8


    # the run control variables
    PAUSE = False
    DISP_LIDAR = False
    DISP_DET = False
    DISP_ASSOC = True
    DISP_ZONES = True
    DISP_TRUTH = True
    DISP_RESULTS = True
    SLOW = False
  
    global x_pixel, y_pixel, no_mouse_click_count
    x_pixel = -1
    y_pixel = -1
    max_no_mouse_click_count = 100
    no_mouse_click_count = max_no_mouse_click_count

    def mouse_click(event, x, y, flags, param):
        global x_pixel, y_pixel, no_mouse_click_count
        no_mouse_click_count = max_no_mouse_click_count

        if event == cv2.EVENT_MOUSEMOVE:
            x_pixel = x
            y_pixel = y
    
    # the locations and dates of the video and lidar data files
    PWD = './'
    DATA_DATE = '2011_09_26'
    RUN_NUMBER = '0015'
    DATA_DIR = './data'

    video_frame_lag = 0
    dist_thresh = 0.2 # set higher to allow less accurate lidar readings to be labeled as correct

    # This contains the setup for the lidar detections

    lidar_left = cal.SEG_TO_PIXEL_LEFT[0]
    lidar_right = cal.SEG_TO_PIXEL_RIGHT[15]
    lidar_top = cal.SEG_TO_PIXEL_TOP
    lidar_bottom = cal.SEG_TO_PIXEL_BOTTOM

    # read the lidar data
    m16 = pd.read_csv('{}/{}/{}_filtered.csv'.format(DATA_DIR, DATA_DATE, RUN_NUMBER ), skiprows=2)

    # read the ground truth from the tracklets data
    gt_df = pd.read_csv('{}/{}'.format(DATA_DIR, '2011_09_26_drive_0015_sync_converted-tracklets.csv'))
    gt_df['dist'] = gt_df['dist'] * cal.cal['M_TO_FT']
    # remove all objects that are not vehicles
    gt_df = gt_df[(gt_df['label']=='Car') | (gt_df['label']=='Truck') | (gt_df['label']=='Van')]
    # remove all objects outside the range of the lidar detector
    gt_df = gt_df[(gt_df['dist']>=30) & (gt_df['dist'] <= 140)]
    #remove all objects outside of the lidar fov
    gt_df = gt_df[gt_df['x1'] <= lidar_right]
    gt_df = gt_df[gt_df['x2'] >= lidar_left]
    gt_df = gt_df[gt_df['y1'] <= lidar_bottom]
    gt_df = gt_df[gt_df['y2'] >= lidar_top]

    
    column_names_2 = ['run_num','use_detector', 'max_cost', 'w0', 'w1', 'w2', 'total_associations', 'accuracy', 'precision', 'recall', 'total_possible_associations', 'true_pos', 'false_pos', 'false_neg']
    #test_results = pd.read_csv('test_runs.csv')

    #manual_test = [0, False, 1, 0.95, 0.05, 0, 0, 0, 0, 0, 0, 0, 0, 0]
    test_results = pd.DataFrame([manual_test], columns=column_names_2)


    # create a class to access the kitti dataset
    kitti_dataset = pykitti.raw(base_path='./data/', date=DATA_DATE, drive=RUN_NUMBER)

    if USE_DETECTOR:
        detector = CarDetectorTFV2()

    first_frame = True

    run_num = 0

    for run_num in tqdm(range(len(test_results))):

        total_possible_associations = 0
        true_pos = 0
        false_pos = 0
        false_neg = 0

        # select the parameter row for this run
        run_params = test_results.loc[test_results['run_num'] == run_num].iloc[0]
        USE_DETECTOR = bool(run_params['use_detector'])

        # np.float was removed in NumPy 1.24; the built-in float is equivalent
        max_cost = float(run_params['max_cost'])
        w0 = float(run_params['w0'])
        w1 = float(run_params['w1'])
        w2 = float(run_params['w2'])

        weights = [w0, w1, w2]

        column_names = ['frame', 'video_det_index', 'lidar_det_index', 'gt_index', 'lidar_dist', 'gt_dist', 'cost',
                        'correct', 'max_cost', 'dist_thresh', 'w0', 'c0', 'w1', 'c1', 'w2', 'c2', ]
        associations_record = pd.DataFrame([], columns=column_names)

        for frame_num, frame_filename in tqdm(enumerate(kitti_dataset.cam2_files)):
            new_frame = True
            frame = cv2.imread(frame_filename)
            # cv2.imread returns None (rather than raising) when a file cannot be read
            success = frame is not None
            if not success:
                print('no frame')
                break
            frame_draw = frame.copy()

            if first_frame:
                # cv2.namedWindow('draw_frame')
                # cv2.setMouseCallback('draw_frame', mouse_click)
                if output_video_file:
                    height, width, channels = frame.shape
                    fourcc = cv2.VideoWriter_fourcc(*'mp4v')
                    video_writer = cv2.VideoWriter(output_video_file, 
                                                   fourcc, 
                                                   20, 
                                                   (width, height))
                first_frame = False


            # get the ground truth values
            gt_current_frame = gt_df.loc[gt_df['frame_number'] == frame_num]


            # fill in the list of detections
            bboxes = []
            c = Costs()

            if frame_num == 7:
                a = 1


            if USE_DETECTOR:
                # get the video detections
                
                bbs, class_names, confidences = detector.detect(img=frame, return_class_scores=True)
                for i, bb in enumerate(bbs):
                    # ensure bb is inside the window
                    bbs[i][0] = max(bb[0],0)
                    bbs[i][1] = max(bb[1],0)
                    bbs[i][2] = min(bb[2],cal.cal['X_RESOLUTION'])
                    bbs[i][3] = min(bb[3],cal.cal['Y_RESOLUTION'])
                n = len(class_names)
                eliminate_flag = np.zeros(n, int)  # np.int was removed in NumPy 1.24
                for i, (bbox, class_name, confidence) in enumerate(zip(bbs, class_names, confidences)):
                    # filter out detections that are not vehicles, are below the confidence
                    # threshold, or do not intersect the lidar zone
                    if class_name not in ['car', 'truck', 'bus'] or confidence <= CONFIDENCE_THRESHOLD:
                        eliminate_flag[i] = 1
                    if (bbox[1] > lidar_bottom or bbox[3] < lidar_top or bbox[2] < lidar_left or bbox[0] > lidar_right):
                        eliminate_flag[i] = 1
                #remove the items from bbs, class_names and confidences
                new_bbs = []; new_class_names = []; new_confidences = [];
                for ii in range(n):
                    if eliminate_flag[ii] == 0:
                        new_bbs.append(bbs[ii])
                        new_class_names.append(class_names[ii])
                        new_confidences.append(confidences[ii])
                bbs = new_bbs
                class_names = new_class_names
                confidences = new_confidences

                n = len(class_names)
                eliminate_flag = np.zeros(n, int)  # np.int was removed in NumPy 1.24

                # eliminate redundant detections: IoU of at least 0.7 and a different class_name
                for ii in range(n):
                    for jj in range(n):
                        if ii < jj and c._iou(bbs[ii],bbs[jj]) >= 0.7 and class_names[ii] != class_names[jj]:
                            eliminate_flag[jj] = 1

                #remove the redundant items from bbs, class_names and confidences
                new_bbs = []; new_class_names = []; new_confidences = [];
                for ii in range(n):
                    if eliminate_flag[ii] == 0:
                        new_bbs.append(bbs[ii])
                        new_class_names.append(class_names[ii])
                        new_confidences.append(confidences[ii])
                bbs = new_bbs
                class_names = new_class_names
                confidences = new_confidences
                # only append what's left
                for i, (bbox, class_name, confidence) in enumerate(zip(bbs, class_names, confidences)):
                    bboxes.append(bbox)

            else:
                # get the bounding boxes from the ground truth data
                gt_dist = []
                for i, gt in enumerate(gt_current_frame.values):
                    bboxes.append([gt[1], gt[2], gt[3], gt[4]])
                    gt_dist.append(gt[6])
                    total_possible_associations += 1

            video_detections = []
            if bboxes is not None and len(bboxes) > 0:
                for i, bb in enumerate(bboxes):
                    det = Detection()
                    det.bbox = np.array([bb[0], bb[1], bb[2], bb[3]])
                    det.frame_id = frame_num
                    video_detections.append(det)


            # get the lidar values
            lidar_vals = m16.loc[m16['frame'] == frame_num-video_frame_lag]

            lidar_detections = []
            for ii in range(len(lidar_vals)):
                if lidar_vals.iloc[ii,5] >= 30 and lidar_vals.iloc[ii,5] <= 140:
                    lidar_detection = LIDAR_detection(frame_num,int(lidar_vals.iloc[ii,4]),lidar_vals.iloc[ii,5],lidar_vals.iloc[ii,6])

                    lidar_detections.append(lidar_detection)



            # perform the associations task
            if len(lidar_detections) > 0 and len(video_detections) > 0:
                associations = []
                costs = Costs()

                # total_cost(i,j) = w_0 * cost_function_0(i,j) + w_1 * cost_function_1(i,j) + .. + w_n * cost_function_n(i,j)
                cost_functions = {costs.dist_between_centroids: weights[0],
                                  costs.dist_lidar_to_y2estimate: weights[1],
                                  costs.inverse_intersection_over_union: weights[2]}

                a = Association()

                # enter the video_detections and lidar_detections lists into the kwargs dictionary
                kwargs = {'video_detections': video_detections, 'lidar_detections': lidar_detections}

                # evaluate the costs array by passing the cost_functions dictionary and the kwargs dictionary to the evaluate_costs method
                costs, cost_components = a.evaluate_cost(cost_functions, **kwargs)

                original_costs = costs.copy()

                c_shape = np.shape(costs)
                rows = c_shape[0]
                cols = c_shape[1]
                if rows <= cols:
                    assignments = a.compute_munkres(costs)
                else:
                    # more video detections than lidar detections: solve the transposed
                    # problem and swap each pair back to (video index, lidar index),
                    # otherwise assignments would be undefined or stale from an earlier frame
                    assignments_T = a.compute_munkres(np.transpose(costs))
                    assignments = [(assignment[1], assignment[0]) for assignment in assignments_T]

                if len(assignments) != min(len(video_detections), len(lidar_detections)):
                    pass  # placeholder for a debugger breakpoint
                #determine if the associations are correct


                if USE_DETECTOR:
                    for i, gt in enumerate(gt_current_frame.values):
                        total_possible_associations += 1
                        bb_gt = [gt[1], gt[2], gt[3], gt[4]]
                        for j, assignment in enumerate(assignments):
                            if original_costs[assignment[0], assignment[1]] < max_cost:
                                bb_v = video_detections[assignment[0]].bbox
                                iou = c._iou(bb_gt, bb_v)
                                if iou > 0:
                                    dist_diff = abs(lidar_detections[assignment[1]].dist - gt[6])
                                    if dist_diff < dist_thresh * gt[6]:
                                        new_row = [frame_num, assignment[0], assignment[1], i,
                                                   lidar_detections[assignment[1]].dist, gt[6],
                                                   original_costs[assignment[0], assignment[1]], 'True', max_cost,
                                                   dist_thresh, weights[0],
                                                   cost_components[0][assignment[0], assignment[1]], weights[1],
                                                   cost_components[1][assignment[0], assignment[1]], weights[2],
                                                   cost_components[2][assignment[0], assignment[1]]]
                                        true_pos += 1
                                    else:
                                        new_row = [frame_num, assignment[0], assignment[1], i,
                                                   lidar_detections[assignment[1]].dist, gt[6],
                                                   original_costs[assignment[0], assignment[1]], 'False', max_cost,
                                                   dist_thresh, weights[0],
                                                   cost_components[0][assignment[0], assignment[1]], weights[1],
                                                   cost_components[1][assignment[0], assignment[1]], weights[2],
                                                   cost_components[2][assignment[0], assignment[1]]]
                                        false_pos += 1
                                    row_num = len(associations_record)
                                else:
                                    new_row = [frame_num, assignment[0], assignment[1], i,
                                               lidar_detections[assignment[1]].dist, gt[6],
                                               original_costs[assignment[0], assignment[1]], 'false_neg_iouzero', max_cost,
                                               dist_thresh, weights[0],
                                               cost_components[0][assignment[0], assignment[1]], weights[1],
                                               cost_components[1][assignment[0], assignment[1]], weights[2],
                                               cost_components[2][assignment[0], assignment[1]]]
                                    false_neg += 1
                            else:
                                new_row = [frame_num, assignment[0], assignment[1], i,
                                           lidar_detections[assignment[1]].dist, gt[6],
                                           original_costs[assignment[0], assignment[1]], 'false_neg_max_cost', max_cost,
                                           dist_thresh, weights[0],
                                           cost_components[0][assignment[0], assignment[1]], weights[1],
                                           cost_components[1][assignment[0], assignment[1]], weights[2],
                                           cost_components[2][assignment[0], assignment[1]]]
                                false_neg += 1
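                            # always append a new row; row_num was previously only updated in the iou > 0 branch
                            row_num = len(associations_record)
                            associations_record.loc[row_num] = new_row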

                else:
                    cols = ['frame', 'video_det_index', 'lidar_det_index', 'gt_index', 'lidar_dist', 'gt_dist', 'cost', 'correct']
                    for assignment in assignments:
                        if original_costs[assignment[0],assignment[1]] < max_cost:
                            dist_diff = abs(lidar_detections[assignment[1]].dist - gt_dist[assignment[0]])
                            if dist_diff < dist_thresh * gt_dist[assignment[0]]:
                                new_row = [frame_num, assignment[0], assignment[1], i, lidar_detections[assignment[1]].dist,
                                           gt_dist[assignment[0]], original_costs[assignment[0], assignment[1]], 'True', max_cost, dist_thresh,
                                           weights[0], cost_components[0][assignment[0], assignment[1]], weights[1],
                                           cost_components[1][assignment[0], assignment[1]], weights[2],
                                           cost_components[2][assignment[0], assignment[1]]]
                                true_pos += 1
                            else:
                                new_row = [frame_num, assignment[0], assignment[1], i, lidar_detections[assignment[1]].dist,
                                           gt_dist[assignment[0]], original_costs[assignment[0], assignment[1]], 'False', max_cost, dist_thresh,
                                           weights[0], cost_components[0][assignment[0], assignment[1]], weights[1],
                                           cost_components[1][assignment[0], assignment[1]], weights[2],
                                           cost_components[2][assignment[0], assignment[1]]]
                                false_pos += 1
                            row_num = len(associations_record)       
                        else:
                            new_row = [frame_num, assignment[0], assignment[1], i, lidar_detections[assignment[1]].dist,
                                       gt_dist[assignment[0]], original_costs[assignment[0], assignment[1]], 'false_neg_maxcost',
                                       max_cost, dist_thresh,
                                       weights[0], cost_components[0][assignment[0], assignment[1]], weights[1],
                                       cost_components[1][assignment[0], assignment[1]], weights[2],
                                       cost_components[2][assignment[0], assignment[1]]]
                            false_neg += 1
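                        # compute the append position here so every branch (including max_cost rejections) adds a new row
                        row_num = len(associations_record)
                        associations_record.loc[row_num] = new_row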

            #display the frame once if not PAUSE; continuously if PAUSE
            while new_frame or PAUSE:
                new_frame = False # only go through once unless PAUSE

                # draw a vertical line at the image center and a horizontal line at the horizon
                cv2.line(frame_draw, (int(cal.cal['X_CENTER']), 0), (int(cal.cal['X_CENTER']), int(frame_draw.shape[0])),
                         (255, 0, 255), 1)
                cv2.line(frame_draw, (0, int(cal.cal['Y_HORIZON'])), (int(frame_draw.shape[1]), int(cal.cal['Y_HORIZON'])),
                         (255, 0, 255), 1)
                cv2.putText(frame_draw, 'frame: {0:0.0f}'.format(frame_num), (0, 25), 1, 2, (0, 0, 255), 2)

                if DISP_DET: # show video detections in green
                    for video_detection in video_detections:
                        bb = video_detection.bbox  # use this detection's own box rather than a stale bb
                        cv2.rectangle(frame_draw, (int(bb[0]), int(bb[1])), (int(bb[2]), int(bb[3])), (0, 255, 0), 2)

                if DISP_LIDAR: # show lidar ideal bounding boxes in yellow
                    for lidar_detection in lidar_detections:
                        lidar_dist = lidar_detection.dist
                        bb = lidar_detection.bb
                        cv2.rectangle(img=frame_draw, pt1=(int(bb[0]), int(bb[1])), pt2=(int(bb[2]), int(bb[3])),
                                      color=(0, 255, 255), thickness=2)
                        cv2.putText(frame_draw, '{0:0.2f}'.format(lidar_dist), (int(bb[0]), int(bb[1])), 1, 1, (0, 0, 255), 2)

                if DISP_ASSOC: # show associations in blue and red connected by a yellow line
                    if len(lidar_detections) > 0 and len(video_detections) > 0:
                        for assignment in assignments:
                            # same strict comparison as the one used when recording associations
                            if original_costs[assignment[0],assignment[1]] < max_cost:
                                bb_v = video_detections[assignment[0]].bbox
                                dist_est = float(video_detections[assignment[0]].dist_est_y2[assignment[1]])
                                lidar_dist = lidar_detections[assignment[1]].dist
                                cv2.rectangle(img=frame_draw, pt1=(int(bb_v[0]),int(bb_v[1])), pt2=(int(bb_v[2]),int(bb_v[3])), color=(255,0,0), thickness=2)
        #                        cv2.putText(frame_draw, '{0:0.0f}'.format(assignment[0]), (int(bb_v[0]),int(bb_v[1])), 1, 1, (255, 0, 0), 2)
        #                        cv2.putText(frame_draw, '{0:0.2f}'.format(dist_est), (int(bb_v[0]-30), int(bb_v[3]+25)), 1, 1, (255, 0, 0), 2)
                                bb_l = lidar_detections[assignment[1]].bb
                                cv2.rectangle(img=frame_draw, pt1=(int(bb_l[0]),int(bb_l[1])), pt2=(int(bb_l[2]),int(bb_l[3])), color=(0,0,255), thickness=2)
        #                        cv2.putText(frame_draw, '{0:0.0f}'.format(assignment[1]), (int(bb_l[0]),int(bb_l[1])), 1, 1, (0, 0, 255), 2)
                                cv2.putText(frame_draw, '{0:0.2f}'.format(lidar_dist), (int(bb_l[0]),int(bb_l[3])+25), 1, 1, (0, 0, 255), 2)
                                cv2.line(img=frame_draw, pt1=(int(bb_v[0]),int(bb_v[1])), pt2=(int(bb_l[0]),int(bb_l[1])), color=(0,255,255), thickness=2)

                if DISP_ZONES: # show the lidar zone boundaries in black
                    y1 = int(cal.SEG_TO_PIXEL_TOP)
                    y2 = int(cal.SEG_TO_PIXEL_BOTTOM)
                    for i in range(16):
                        x = int(cal.SEG_TO_PIXEL_LEFT[i])
                        cv2.line(frame_draw, (x, y1), (x, y2), (0, 0, 0), thickness=1)
                        cv2.line(frame_draw, (x - 5, y1), (x + 5, y1), (0, 0, 0), thickness=1)
                        cv2.line(frame_draw, (x - 5, y2), (x + 5, y2), (0, 0, 0), thickness=1)

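                    # close the last zone with the right edge of segment 15 (no reliance on the leftover loop variable)
                    x = int(cal.SEG_TO_PIXEL_RIGHT[15])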
                    cv2.line(frame_draw, (x, y1), (x, y2), (0, 0, 0), thickness=1)
                    cv2.line(frame_draw, (x - 5, y1), (x + 5, y1), (0, 0, 0), thickness=1)
                    cv2.line(frame_draw, (x - 5, y2), (x + 5, y2), (0, 0, 0), thickness=1)

                if DISP_TRUTH: # show the ground truth bboxes and dist
                    for i in range(len(gt_current_frame)):
                        x1, y1, x2, y2, label, dist = gt_current_frame.iloc[i,1:]
                        if not (x2 < lidar_left or x1 > lidar_right or y1 > lidar_bottom or y2 < lidar_top):
                            cv2.rectangle(img=frame_draw, pt1=(int(x1), int(y1)), pt2=(int(x2), int(y2)),color=(0, 255, 0), thickness=2)
                            cv2.putText(frame_draw, '{0:0.1f}'.format(dist), (int(x1), int(y1)), 1, 1,(0, 255, 0), 2)

                # show the mouse coordinates
                if x_pixel >= 0 and no_mouse_click_count > 0:
                    cv2.putText(frame_draw, '({0:0.0f}, {1:0.0f})'.format(x_pixel, y_pixel),
                                (cal.cal['X_RESOLUTION'] - 120, 15), 1, 1, (0, 0, 255), 2)
                    no_mouse_click_count -= 1

                if DISP_RESULTS:
                    pass

                cv2.putText(frame_draw, 'frame: {0:0.0f}'.format(frame_num), (0, 25), 1, 2, (0, 0, 255), 2)
                # cv2.imshow('draw_frame', frame_draw)
                if video_writer:
                    video_writer.write(frame_draw)

                if SLOW:
                    key = cv2.waitKey(1000) & 0xFF
                else:
                    key = cv2.waitKey(30) & 0xFF

                if key == ord('q') or key == 27:
                    exit(0)
                if key == ord('p') or key == ord('P'):
                    PAUSE = not PAUSE
                if key == ord('l') or key == ord('L'):
                    DISP_LIDAR = not DISP_LIDAR
                if key == ord('d') or key == ord('D'):
                    DISP_DET = not DISP_DET
                if key == ord('a') or key == ord('A'):
                    DISP_ASSOC = not DISP_ASSOC
                if key == ord('s') or key == ord('S'):
                    SLOW = not SLOW
                if key == ord('z') or key == ord('Z'):
                    DISP_ZONES = not DISP_ZONES
                if key == ord('t') or key == ord('T'):
                    DISP_TRUTH = not DISP_TRUTH

        false_neg = total_possible_associations - (true_pos + false_pos)
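        # guard against a run with no possible associations, as is done for precision and recall
        if total_possible_associations > 0:
            accuracy = true_pos / total_possible_associations
        else:
            accuracy = np.nan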
        if true_pos + false_pos > 0:
            precision = true_pos / (true_pos + false_pos)
        else:
            precision = np.nan

        if true_pos + false_neg > 0:
            recall = true_pos / (true_pos + false_neg)
        else:
            recall = np.nan

        now = datetime.datetime.now()
        filename = 'results_{0:04d}.csv'.format(run_num)

        associations_record.to_csv(filename, index=False)

        print('run: {0:0.0f}, accy: {1:0.3f}, prec: {2:0.3f}, recall: {3:0.3f}, total_assoc: {4:0.0f}, total_poss_assoc: {5:0.0f}, true_pos: {6:0.0f}, '
              'false_pos: {7:0.0f}, false_neg: {8:0.0f}, use_detector:{9:}, max_cost: {10:0.3f}, w0: {11:0.3f}, w1: {12:0.3f}, '
              'w2: {13:0.3f}'.format(run_num, accuracy, precision, recall, len(associations_record), total_possible_associations, true_pos, false_pos, false_neg, str(USE_DETECTOR), max_cost, weights[0], weights[1], weights[2]))

        #column_names_2 = ['run_num', 'max_cost', 'w0', 'w1', 'w2', 'total_associations', 'accuracy',
        #                  'total_possible_associations', 'true_pos', 'false_pos', 'false_neg']

        test_results.iloc[run_num,6] = len(associations_record)
        test_results.iloc[run_num,7] = accuracy
        test_results.iloc[run_num,8] = precision
        test_results.iloc[run_num,9] = recall
        test_results.iloc[run_num,10] = total_possible_associations
        test_results.iloc[run_num,11] = true_pos
        test_results.iloc[run_num,12] = false_pos
        test_results.iloc[run_num,13] = false_neg

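    # timestamp without spaces or colons so the file name is valid on every filesystem
    filename = 'test_results_{0}.csv'.format(datetime.datetime.now().strftime('%Y%m%d_%H%M%S'))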
    test_results.to_csv(filename)
    if video_writer:
        video_writer.release()
    # cv2.waitKey(1)
    # cv2.destroyAllWindows()
    # cv2.waitKey(1)

    print('Run Complete!')
    
    return test_results




## Run Tests

### Testing Definitions

The following definitions are used to evaluate the accuracy of the association process:


__True Positive:__

If: an association has a video detection bounding box that intersects with a Ground Truth bounding box __AND__ has a cost value less than the hyper-parameter max_cost __AND__ has a lidar distance value __within__ the hyper-parameter __dist_thresh__ (percentage) of the Ground Truth distance 

Then: it is labeled a __True Positive__ association (true_pos)

>These __True Positive__ associations are the correct associations: a video detection and a lidar reading that both relate, with high confidence, to a labeled Ground Truth object.

__False Positive - Video Detector:__

If: an association is made with a video detection bounding box that does not intersect a ground truth bounding box 

Then: the association is labeled a __False Positive - Video Detector__
>These __False Positive - Video Detector__ associations are errors caused by the video detector rather than by the association process, so they do not affect the association accuracy.

__False Positive:__

If: an association is made that has a cost value greater than the hyper-parameter max_cost __OR__ has a lidar distance value outside of the hyper-parameter __dist_thresh__ (percentage) of the Ground Truth distance 

Then: the association is labeled a __False Positive__ (false_pos)

>These __False Positive__ associations are incorrect associations: the video detection intersects a Ground Truth object, but the pair fails either the max_cost or the dist_thresh test. Note that the Munkres Algorithm always returns the association pairs that have a global minimum cost. Some of these associations may have distances that are too far from the ground truth, or may just barely intersect the bounding box, and need to be rejected using the hyper-parameters.
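
The effect described in the note above can be illustrated with a minimal sketch. It is not the notebook's implementation: a brute-force search over permutations stands in for the Munkres algorithm (both return an assignment with minimum total cost), and the helper `min_cost_assignment`, the cost values and `max_cost = 1.0` are all hypothetical. It suggests why the gate is needed: the minimum-cost assignment still pairs every video detection with some lidar detection, even when one of the pairs is poor.

```python
from itertools import permutations


def min_cost_assignment(costs):
    """Return the (row, col) pairs that minimise the total cost.

    Brute-force stand-in for the Munkres algorithm (hypothetical helper).
    Assumes there are at least as many columns as rows.
    """
    n_rows, n_cols = len(costs), len(costs[0])
    best_pairs, best_total = None, float('inf')
    for cols in permutations(range(n_cols), n_rows):
        total = sum(costs[r][c] for r, c in enumerate(cols))
        if total < best_total:
            best_total, best_pairs = total, list(enumerate(cols))
    return best_pairs


# Hypothetical costs: rows are video detections, columns are lidar detections.
costs = [[0.10, 0.90],
         [0.80, 1.50]]
max_cost = 1.0

# The global minimum pairs (0, 0) and (1, 1), although (1, 1) is a poor match.
assignments = min_cost_assignment(costs)

# Gate each pair on max_cost, as the test function does with original_costs.
accepted = [(r, c) for r, c in assignments if costs[r][c] < max_cost]
print(assignments, accepted)
```

Brute force is only practical for a handful of detections; it is used here so the sketch needs nothing beyond the standard library.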

__False Negative:__

If: all the associations have been processed and a ground truth object has not been labeled as either a True Positive or a False Positive using the rules above

Then: the ground truth object is labeled as a __False Negative__ (false_neg)

>These __False Negative__ associations are associations that were never made, either because no video detection intersects the Ground Truth object or because no lidar detection intersects the video detection. Of the two faults, the missing video detection is the more common. To evaluate the magnitude of the missing video detections, the number of false negatives can be compared against a run of the association algorithm with the hyper-parameter USE_DETECTOR = False.


__True Negative:__

True Negatives are not evaluated in this algorithm because they are not applicable.
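
As a rough illustration, the per-association rules above could be written as a single function. This is a sketch of the written definitions only, not the code in `run_association_test`; the function name, the argument names, the label `'false_pos_video_detector'` and the example values are assumptions. A cost exactly equal to `max_cost` is treated as failing, which matches the strict `<` used when associations are recorded.

```python
def classify_association(intersects_gt, cost, lidar_dist, gt_dist, max_cost, dist_thresh):
    """Label one association according to the definitions above (hypothetical helper)."""
    if not intersects_gt:
        # The video detection does not overlap any Ground Truth box.
        return 'false_pos_video_detector'
    within_cost = cost < max_cost
    # dist_thresh is a fraction of the Ground Truth distance.
    within_dist = abs(lidar_dist - gt_dist) < dist_thresh * gt_dist
    if within_cost and within_dist:
        return 'true_pos'
    return 'false_pos'


# Hypothetical values: max_cost = 1.0 and dist_thresh = 0.1 (10 percent).
label_good = classify_association(True, 0.3, 20.5, 20.0, max_cost=1.0, dist_thresh=0.1)
label_far = classify_association(True, 0.3, 30.0, 20.0, max_cost=1.0, dist_thresh=0.1)
label_no_gt = classify_association(False, 0.3, 20.5, 20.0, max_cost=1.0, dist_thresh=0.1)
print(label_good, label_far, label_no_gt)
```

False Negatives are not returned here because they are counted per Ground Truth object after all associations have been processed, not per association.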

### Calculating Accuracy, Precision and Recall

In addition, the following equations are used to calculate the accuracy, precision and recall of a run.

\begin{equation*}
\text{Accuracy} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Positives} + \text{False Negatives}}
\end{equation*}

\begin{equation*}
\text{Precision} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Positives}}
\end{equation*}

\begin{equation*}
\text{Recall} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Negatives}}
\end{equation*}
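
A small worked example may make the three ratios concrete. The counts below are made up for illustration and are not results from any run; the helper `association_metrics` is hypothetical and returns NaN for an empty denominator, in the same spirit as the guards in the test function.

```python
import math


def association_metrics(true_pos, false_pos, false_neg):
    """Compute accuracy, precision and recall from the three counts (hypothetical helper)."""
    def ratio(numerator, denominator):
        # Return NaN instead of raising when there is nothing to divide by.
        return numerator / denominator if denominator > 0 else math.nan

    accuracy = ratio(true_pos, true_pos + false_pos + false_neg)
    precision = ratio(true_pos, true_pos + false_pos)
    recall = ratio(true_pos, true_pos + false_neg)
    return accuracy, precision, recall


# Made-up counts: 8 correct associations, 1 incorrect, 1 missed.
accuracy, precision, recall = association_metrics(true_pos=8, false_pos=1, false_neg=1)
print('accy: {0:0.3f}, prec: {1:0.3f}, recall: {2:0.3f}'.format(accuracy, precision, recall))
```

With these counts the accuracy is 8/10 = 0.8, while precision and recall are both 8/9.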


### Run Test #1

In [None]:
#column_names_2 = ['run_num','use_detector', 'max_cost', 'w0', 'w1', 'w2', 'total_associations', 'accuracy', 'precision', 'recall', 'total_possible_associations', 'true_pos', 'false_pos', 'false_neg']
manual_test1 = [0, False, 1, 0.95, 0.05, 0, 0, 0, 0, 0, 0, 0, 0, 0]
# import pdb; pdb.set_trace()
output_video_file="test_results1.mp4"
test_results1 = run_association_test(manual_test1, output_video_file=output_video_file)
test_results1

### Download video of test results
Execute the following cell to download the video from the test run 

In [None]:
download_file(output_video_file)

### Run Test #2

In [None]:
manual_test2 = [0, True, 1, 0.95, 0.05, 0, 0, 0, 0, 0, 0, 0, 0, 0]
output_video_file2="test_results2.mp4"
test_results2 = run_association_test(manual_test2, output_video_file=output_video_file2)
test_results2

In [None]:
download_file(output_video_file2)