# Welcome to the Final Project!

Course: https://thinkautonomous.ai/obstacle-tracking

In this project, you will learn to associate bounding boxes on multiple frames using the Hungarian Algorithm!


You will work on 3 aspects of the **multi-object tracker**:

*   Use YOLO and launch an object detection algorithm
*   Use The Hungarian Algorithm and associate the boxes
*   Improve the algorithm to avoid false positives and false negatives


The next cell links your Google Colab notebook (.ipynb) to your Google Drive folder.

If you don't work on Colab, you won't need it.
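If the same notebook may also run outside Colab, the mount step can be guarded. This is a minimal sketch, assuming that looking for `google.colab` in `sys.modules` is an acceptable way to detect a Colab runtime; `running_in_colab` and `setup_workdir` are hypothetical helper names, not part of the course files.

```python
import os
import sys


def running_in_colab():
    """Return True when the google.colab package is already loaded."""
    return "google.colab" in sys.modules


def setup_workdir(project_dir):
    """Mount Google Drive and move into project_dir, but only on Colab."""
    if not running_in_colab():
        return os.getcwd()  # local run: stay in the current directory
    from google.colab import drive  # only importable inside Colab
    drive.mount("/content/drive", force_remount=False)
    os.chdir(project_dir)
    return os.getcwd()
```

On a local machine, `setup_workdir(...)` simply returns the current directory and touches nothing.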

In [1]:
import os
from google.colab import drive
drive.mount('/content/drive', force_remount=False)

os.chdir("/content/drive/My Drive/Think Autonomous/SDC Course/Tracking")
!ls

Mounted at /content/drive
association_hungarian.ipynb	     Images		     yolo.ipynb
association_hungarian_Starter.ipynb  Output		     yolo_nms.py
final_project.ipynb		     __pycache__	     yolo_Starter.ipynb
final_project_Kalman.ipynb	     Tracking.gslides	     Yolov3
final_project_Kalman_Starter.ipynb   yolo_for_tracking_2.py
final_project_Starter.ipynb	     yolo_for_tracking.py


# 1 - Detection

To make multi-object tracking work, we first need a detection step. The tracking relies heavily on the detector, so the detector had better be good.

We will choose the [YOLO algorithm](https://pjreddie.com/darknet/yolo/), which is both accurate and fast.

<img src="https://miro.medium.com/max/1446/1*YpNE9OQeshABhBgjyEXlLA.png" width="500">


Eventually, we want bounding box detections like these:

<img src="https://pjreddie.com/media/image/Screen_Shot_2018-03-24_at_10.48.42_PM.png" width="500">


## Import Libraries and Test Images

Let's import the libraries and test images.<p>
I took a video and wrote a short script that keeps one frame out of every 7. Instead of working at **60 FPS** (the recording frame rate), consider that you have an algorithm working at 60/7, or about **9 frames per second**.<p>

**Why the cut?**<p>
YOLO is very fast and can run at 60 FPS.
To make tracking a bit more challenging, let's not have 99% IOU between consecutive frames.
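The subsampling described above can be sketched in a few lines. This is only an illustration: the list of integers stands in for decoded video frames, and `subsample` is a hypothetical helper, not the script that produced the pickle file.

```python
def subsample(frames, step=7):
    """Keep one frame out of every `step` frames."""
    return frames[::step]


recording_fps = 60
step = 7
frames = list(range(63))            # stand-in for 63 decoded video frames
kept = subsample(frames, step)      # frames 0, 7, 14, ...
effective_fps = recording_fps / step

print(len(kept), "frames kept, effective rate:", round(effective_fps, 2), "FPS")
```

A larger `step` means more motion between consecutive frames, hence lower IOU between the boxes of the same object.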

In [2]:
### Imports
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import numpy as np  # used later for the cost matrices and the Kalman Filter
import copy
import pickle
import cv2

In [3]:
### Load the Images
with open('Images/images_tracking.p', "rb") as f:
    dataset_images = pickle.load(f)

In [4]:
def visualize_images(input_images):
    fig=plt.figure(figsize=(100,100))

    for i in range(len(input_images)):
        fig.add_subplot(1, len(input_images), i+1)
        plt.imshow(input_images[i])
    plt.show()

In [5]:
visualize_images(dataset_images)


## Run Initial Obstacle Detection - Modify the YOLO file

We will need to modify the original yolo.py file. Go to this file, **duplicate it**, and **modify the duplicate online** using Google Drive's text editor. <p>

**What modifications should you make?**<p>
The current *inference()* function outputs an image.
To work with the Hungarian algorithm, we will need the bounding boxes.<p>
* **Modify the postprocess()** function as well as the **inference() function** to **return the bounding boxes**.
* From then on, we will only work with the original image and the bounding boxes.
* Call the new file **yolo_for_tracking.py** and import it.
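As a rough idea of what "returning the bounding boxes" involves, here is a minimal sketch. It assumes the usual YOLOv3 layout when run through OpenCV DNN, where each output row is `[cx, cy, w, h, objectness, class scores...]` with coordinates normalized to the image size. The function name `extract_boxes` and its parameters are hypothetical; the actual `postprocess()` in yolo.py may be organized differently, and Non-Maxima Suppression would still have to follow.

```python
import numpy as np


def extract_boxes(outputs, img_w, img_h, conf_threshold=0.5):
    """Convert raw YOLO rows into (left, top, width, height) pixel boxes."""
    boxes, confidences, class_ids = [], [], []
    for out in outputs:                      # one array per YOLO output layer
        for detection in out:
            scores = detection[5:]           # per-class scores
            class_id = int(np.argmax(scores))
            confidence = float(scores[class_id])
            if confidence < conf_threshold:
                continue                     # drop weak detections
            cx, cy = detection[0] * img_w, detection[1] * img_h
            w, h = detection[2] * img_w, detection[3] * img_h
            left, top = int(cx - w / 2), int(cy - h / 2)
            boxes.append([left, top, int(w), int(h)])
            confidences.append(confidence)
            class_ids.append(class_id)
    return boxes, confidences, class_ids


# Fake output layer with one confident detection and one weak one
fake_layer = np.array([[0.5, 0.5, 0.25, 0.5, 0.9, 0.1, 0.8],
                       [0.1, 0.1, 0.1, 0.1, 0.3, 0.2, 0.1]])
print(extract_boxes([fake_layer], img_w=200, img_h=100))
```

Returning `boxes` alongside the drawn image is what lets the cells below call `result, boxes = detector.inference(img)`.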

In [6]:
### Run obstacle detection for the images
from yolo_for_tracking import *

result_images = [] # Empty list for output images
result_boxes = [] # Empty list for output boxes

# Initialize an object detector
detector = YOLO()
images = copy.deepcopy(dataset_images)

# For every image, run a detector using the inference() function
for img in images:
    result, boxes = detector.inference(img)
    result_images.append(result)
    result_boxes.append(boxes)



In [7]:
# Print the results and the detected boxes
visualize_images(result_images)


## One Obstacle - One Color
**Last Step!** <p>
We now want one color per bounding box! Having all cars in blue is useless! <p>
We will create an Obstacle class that we will modify later.
Each detected obstacle should have:
* an id
* a current bounding box
* a previous bounding box<p>

**In the end, we will draw a bounding box based on the id.** <p>
If the id changes, the color will change too.

In [8]:
class Obstacle():
    def __init__(self, idx, box):
        """
        Init function. The obstacle must have an id and a box.
        """
        self.idx = idx
        self.box = box

In [9]:
def id_to_color(idx):
    """
    Arbitrary (but deterministic) function to convert an id to a color
    Do what you want here but keep each channel between 0 and 255
    """
    blue = idx*5 % 256
    green = idx*36 %256
    red = idx*23 %256
    return (red, green, blue)

In [10]:
def main():
    """
    Main function.
    You already ran the detector on all 9 images. The variable is result_boxes.
    Use this to assign an id and draw a rectangle based on the id.
    """
    idx = 0
    obstacles = []
    result_images_2 = copy.deepcopy(dataset_images) # Copy the image without modifying the dataset
    for j, boxes in enumerate(result_boxes): # loop through all images
        for i, box in enumerate(boxes): # loop through all boxes
            obs = Obstacle(idx, box)
            left, top, width, height = box
            right = left+width
            bottom = top+height
            cv2.rectangle(result_images_2[j], (left, top), (right, bottom), id_to_color(obs.idx), thickness=10) # Color depends on the obstacle id
            idx +=1
            obstacles.append(obs)
    return result_images_2

result_images_2 = main()

In [11]:
## Print the results
visualize_images(result_images_2)


We did it!<p>...<p>

But as you can see, an obstacle does not keep the same color from one image to the next. **We don't have active tracking yet**!

# 2 - Association

We now have a detection algorithm working! Congratulations!
The next step is to **match the detections** from one frame to another and **keep the same color across the 9 images**.<p>
It should be dynamic and **work no matter the number of images**. In the end, we'll apply **this algorithm on a video**.

Eventually, we'll want a good association system:

![Alt text…](https://miro.medium.com/proxy/0*yN9MllhmuglJORss.png)

## Metrics

The first thing we'll need is to define a metric!

* If you **don't want a big challenge**, the **IOU cost** will do just fine!
* If you want a **medium challenge**, you can try to **implement [this paper](https://arxiv.org/pdf/1709.03572.pdf)**. Read pages 19-20 carefully and try to combine these costs with IOU. It will filter out incoherent boxes.
* If you want the **biggest challenge**, try to **code [Deep SORT](https://arxiv.org/pdf/1703.07402.pdf)** and associate deep convolutional features with it. <p>

In the end, each cell of the cost matrix should hold a **single number**. And it should be representative of the cost, like IOU is.

### IOU COST

In [12]:
def convert_data(box):
    """
    Convert data from (x1,y1, w, h) to (x1,y1,x2,y2)
    """
    x1 = box[0]
    x2 = box[0] + box[2]
    y1 = box[1]
    y2 = box[1]+box[3]
    return x1,y1,x2,y2

def box_iou(box1, box2):
    """
    Compute Intersection Over Union cost
    """
    box1 = convert_data(box1)
    box2 = convert_data(box2)
    xA = max(box1[0], box2[0])
    yA = max(box1[1], box2[1])
    xB = min(box1[2], box2[2])
    yB = min(box1[3], box2[3])

    inter_area = max(0, xB - xA + 1) * max(0, yB - yA + 1) #abs((xi2 - xi1)*(yi2 - yi1))
    # Calculate the Union area by using Formula: Union(A,B) = A + B - Inter(A,B)
    box1_area = (box1[2] - box1[0] + 1) * (box1[3] - box1[1] + 1) #abs((box1[3] - box1[1])*(box1[2]- box1[0]))
    box2_area = (box2[2] - box2[0] + 1) * (box2[3] - box2[1] + 1) #abs((box2[3] - box2[1])*(box2[2]- box2[0]))
    union_area = (box1_area + box2_area) - inter_area
    # compute the IoU
    iou = inter_area/float(union_area)
    return iou
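A quick sanity check on toy boxes helps build intuition for the values IOU takes. The sketch below is self-contained and uses plain continuous areas (no `+ 1` pixel offset), so its numbers can differ slightly from `box_iou` above; `iou_xywh` is a hypothetical name used only here.

```python
def iou_xywh(box_a, box_b):
    """IOU of two (x, y, w, h) boxes, using plain continuous areas."""
    ax1, ay1 = box_a[0], box_a[1]
    ax2, ay2 = box_a[0] + box_a[2], box_a[1] + box_a[3]
    bx1, by1 = box_b[0], box_b[1]
    bx2, by2 = box_b[0] + box_b[2], box_b[1] + box_b[3]

    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0


same = iou_xywh((0, 0, 10, 10), (0, 0, 10, 10))        # identical boxes
shifted = iou_xywh((0, 0, 10, 10), (5, 0, 10, 10))     # half-width shift
apart = iou_xywh((0, 0, 10, 10), (50, 50, 10, 10))     # no overlap
print(same, shifted, apart)
```

A box shifted by half its width already drops to an IOU of one third, which is why a threshold around 0.3 is a common choice for deciding whether two boxes can be the same object.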

### Exponential, Linear, And IOU Costs

Use this paper to code the solution: https://arxiv.org/pdf/1709.03572.pdf

In [13]:
from math import sqrt, exp

def check_division_by_0(value, epsilon=0.01):
    if value < epsilon:
        value = epsilon
    return value

def hungarian_cost(old_box, new_box, iou_thresh = 0.3, linear_thresh = 10000, exp_thresh = 0.5):
    """
    Return the IOU of the two boxes if the IOU, linear, and exponential
    costs all reach their thresholds; return 0 otherwise.
    """
    w1 = 0.5
    w2 = 1.5
    (_, h, w, _) = np.array(dataset_images).shape # Image height and width
    # IOU COST
    iou_cost = box_iou(old_box, new_box)

    ### Sanchez-Matilla et al COST
    Q_dist = sqrt(pow(w,2)+pow(h,2)) # Image diagonal: first real-life use of Pythagoras in your life
    Q_shape = w*h # Image area
    distance_term = Q_dist/check_division_by_0(sqrt(pow(old_box[0] - new_box[0], 2)+pow(old_box[1] -new_box[1],2)))
    shape_term = Q_shape/check_division_by_0(sqrt(pow(old_box[2] - new_box[2], 2)+pow(old_box[3] - new_box[3],2)))
    linear_cost = distance_term*shape_term

    ## YUL et al COST
    a = (old_box[0] - new_box[0])/check_division_by_0(old_box[2])
    a_2 = pow(a,2)
    b = (old_box[1] - new_box[1])/check_division_by_0(old_box[3])
    b_2 = pow(b,2)
    ab = (a_2+b_2)*w1*(-1)
    c = abs(old_box[3] - new_box[3])/(old_box[3]+new_box[3])
    d = abs(old_box[2]-new_box[2])/(old_box[2]+new_box[2])
    cd = (c+d)*w2*(-1)
    exponential_cost = exp(ab)*exp(cd)

    # Use IOU as the final score, the other two costs only act as filters
    if (iou_cost >= iou_thresh and linear_cost >= linear_thresh and exponential_cost >= exp_thresh):
        return iou_cost
    else:
        return 0
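To get a feel for the exponential term on its own, the sketch below isolates it and evaluates it on toy boxes. It mirrors the expression used in `hungarian_cost` with the same weights `w1 = 0.5` and `w2 = 1.5`; the function name `exponential_cost` and the toy boxes are illustrative assumptions.

```python
from math import exp


def exponential_cost(old_box, new_box, w1=0.5, w2=1.5):
    """Exponential affinity between two (x, y, w, h) boxes; 1.0 means identical."""
    # Position change, normalized by the old box size
    dx = (old_box[0] - new_box[0]) / max(old_box[2], 0.01)
    dy = (old_box[1] - new_box[1]) / max(old_box[3], 0.01)
    position_term = exp(-w1 * (dx ** 2 + dy ** 2))
    # Relative change in height and width
    dh = abs(old_box[3] - new_box[3]) / (old_box[3] + new_box[3])
    dw = abs(old_box[2] - new_box[2]) / (old_box[2] + new_box[2])
    shape_term = exp(-w2 * (dh + dw))
    return position_term * shape_term


ref = (0, 0, 10, 10)
identical = exponential_cost(ref, (0, 0, 10, 10))
one_width_away = exponential_cost(ref, (10, 0, 10, 10))
two_widths_away = exponential_cost(ref, (20, 0, 10, 10))
print(identical, one_width_away, two_widths_away)
```

The value decays smoothly from 1 as the box moves or changes shape, so a threshold such as `exp_thresh = 0.5` rejects pairs that moved too far relative to their own size.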

## The Hungarian Algorithm
We can now use the previous code from the workshop to track bounding boxes!

* Create an **associate()** function that takes **two lists of boxes** (time t-1 and time t) and that outputs **the matches, the new detections, and the unmatched tracks**.

In [14]:
from scipy.optimize import linear_sum_assignment

def associate(old_boxes, new_boxes):
    """
    old_boxes will represent the former bounding boxes (at time 0)
    new_boxes will represent the new bounding boxes (at time 1)
    Function goal: Define a Hungarian Matrix with IOU as a metric and return, for each box, an id
    """
    # Define a new IOU Matrix nxm with old and new boxes
    iou_matrix = np.zeros((len(old_boxes),len(new_boxes)),dtype=np.float32)

    # Go through boxes and store the IOU value for each box 
    # You can also use the more challenging cost but still use IOU as a reference for convenience (use as a filter only)
    for i,old_box in enumerate(old_boxes):
        for j,new_box in enumerate(new_boxes):
            iou_matrix[i][j] = box_iou(old_box, new_box)
            #iou_matrix[i][j] = hungarian_cost(old_box, new_box)

    # Call for the Hungarian Algorithm
    hungarian_row, hungarian_col = linear_sum_assignment(-iou_matrix)
    hungarian_matrix = np.array(list(zip(hungarian_row, hungarian_col)))

    # Create new unmatched lists for old and new boxes
    matches, unmatched_detections, unmatched_trackers = [], [], []

    # Go through the Hungarian Matrix, if matched element has IOU < threshold (0.3), add it to the unmatched 
    # Else: add the match    
    for h in hungarian_matrix:
        if(iou_matrix[h[0],h[1]]<0.3):
            unmatched_trackers.append(old_boxes[h[0]])
            unmatched_detections.append(new_boxes[h[1]])
        else:
            matches.append(h.reshape(1,2))
    
    if(len(matches)==0):
        matches = np.empty((0,2),dtype=int)
    else:
        matches = np.concatenate(matches,axis=0)
    
    # Go through old boxes, if no matched detection, add it to the unmatched_old_boxes
    for t, trk in enumerate(old_boxes):
        if t not in hungarian_row:
            unmatched_trackers.append(trk)

    # Go through new boxes, if no matched tracking, add it to the unmatched_new_boxes
    for d, det in enumerate(new_boxes):
        if d not in hungarian_col:
            unmatched_detections.append(det)
    
    return matches, unmatched_detections,unmatched_trackers
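To see what the assignment step does on concrete numbers, here is a toy run on a hand-made IOU matrix (the values are made up for illustration). It negates the matrix so that `linear_sum_assignment`, which minimizes cost, ends up maximizing total IOU, then filters weak pairs with the same 0.3 threshold.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

IOU_THRESHOLD = 0.3  # same value as the filter used in associate()

# Rows are tracks (time t-1), columns are detections (time t)
iou_matrix = np.array([
    [0.80, 0.10, 0.00],
    [0.20, 0.70, 0.00],
    [0.00, 0.05, 0.10],
])

rows, cols = linear_sum_assignment(-iou_matrix)  # maximize the total IOU

# Keep only the assigned pairs whose IOU is high enough
matches = [(int(r), int(c)) for r, c in zip(rows, cols)
           if iou_matrix[r, c] >= IOU_THRESHOLD]
matched_tracks = {r for r, _ in matches}
matched_detections = {c for _, c in matches}
unmatched_tracks = [t for t in range(iou_matrix.shape[0]) if t not in matched_tracks]
unmatched_detections = [d for d in range(iou_matrix.shape[1]) if d not in matched_detections]

print(matches, unmatched_tracks, unmatched_detections)
```

The Hungarian algorithm always pairs as many rows and columns as it can, so track 2 and detection 2 are assigned to each other even though they barely overlap. The threshold is what turns that pair back into one lost track and one new detection.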

## Main Loop

In [15]:
def main(input_image):
    """
    Receives an image
    Outputs the result image, and a list of obstacle objects 
    """
    global stored_obstacles # Will be used to keep track of obstacles information
    global idx # Will be used to keep track of id information
    # Run obstacle detection
    image = copy.deepcopy(input_image)
    _, out_boxes = yolo.inference(input_image)
    # First iteration: Simply create obstacles and draw them
    if (idx == 0):
        stored_obstacles = []
        for i, box in enumerate(out_boxes): # For all detected boxes
            obs = Obstacle(idx, box) # Create an obstacle
            left, top, right, bottom = convert_data(box) # Move to x1,y1,x2,y2
            cv2.rectangle(image, (left, top), (right, bottom), id_to_color(obs.idx), thickness=10) # Draw the box on the image with ids
            image = cv2.putText(image, str(obs.idx),(left - 10,top -10),cv2.FONT_HERSHEY_SIMPLEX, 1,id_to_color(obs.idx),thickness=4)
            stored_obstacles.append(obs) # Put every created obstacle in the final list                
            idx +=1 # Increase the id for every box
        return image, stored_obstacles

    elif (idx != 0): # In case we already have obstacles from previous frame, work on association
        ## Before calling associate, we must create a list of old obstacles
        old_obstacles = [obs.box for obs in stored_obstacles] # Simply get the boxes
        matches, unmatched_detections, unmatched_tracks = associate(old_obstacles, out_boxes) # Associate the obstacles
        new_obstacles = []

        # For every match, change the obstacle value
        # Assign the id to the matched id
        # Assign the box to the new box
        for match in matches:
            obs = Obstacle(stored_obstacles[match[0]].idx, out_boxes[match[1]])
            new_obstacles.append(obs)
        
        # Loop through all unmatched detections and add these as obstacles
        for new_obs in unmatched_detections:
            idx+=1
            obs = Obstacle(idx, new_obs)
            new_obstacles.append(obs)
        
        # For every obstacle, draw on the image and return it
        for i, obs in enumerate(new_obstacles):
            left, top, right, bottom = convert_data(obs.box)
            cv2.rectangle(image, (left, top), (right, bottom), id_to_color(obs.idx), thickness=10)
            image = cv2.putText(image, str(obs.idx),(left - 10,top - 10),cv2.FONT_HERSHEY_SIMPLEX, 1,id_to_color(obs.idx),thickness=4)                
        stored_obstacles = copy.deepcopy(new_obstacles)
        return image, stored_obstacles

In [16]:
### Call the main loop

yolo = YOLO()
idx = 0

fig=plt.figure(figsize=(100,100))

result_images_3 = copy.deepcopy(dataset_images)

out_imgs = []

for i in range(len(result_images_3)):
    out_img, stored_obstacles = main(result_images_3[i])
    out_imgs.append(out_img)
    fig.add_subplot(1, len(result_images_3), i+1)
    plt.imshow(out_imgs[i])

plt.show()


# 3 - Improvements


### Changing the Non Maxima Suppression formula

In OpenCV DNN, NMS (Non Maxima Suppression) is computed per class, instead of over the whole list of boxes. For that reason, we may get unwanted results, such as overlapping boxes of different classes on the same object:
![](https://user-images.githubusercontent.com/25801568/79720833-01a88180-82ea-11ea-993b-8accd6b7fcc1.png)

I have found a new Non-Maxima Suppression formula we can use [on PyImageSearch's page](https://www.pyimagesearch.com/2015/02/16/faster-non-maximum-suppression-python/).

We have to adapt it:

*   We have (x1, y1, w, h), whereas the function takes (x1, y1, x2, y2)
*   That's it!
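The idea of suppressing over the whole list, regardless of class, can be sketched as follows. This is a simplified variant written for illustration, not the PyImageSearch function: it sorts by confidence score and uses IOU as the overlap measure, and it does the (x1, y1, w, h) to (x1, y1, x2, y2) conversion internally. The name `nms_xywh` and the 0.4 threshold are assumptions, and boxes are assumed to have a positive area.

```python
import numpy as np


def nms_xywh(boxes, scores, overlap_thresh=0.4):
    """Class-agnostic NMS on (x, y, w, h) boxes; returns the kept indices."""
    if len(boxes) == 0:
        return []
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    x1, y1 = boxes[:, 0], boxes[:, 1]
    x2, y2 = x1 + boxes[:, 2], y1 + boxes[:, 3]   # convert to corner format
    areas = boxes[:, 2] * boxes[:, 3]

    order = np.argsort(scores)[::-1]               # best score first
    keep = []
    while order.size > 0:
        best, rest = order[0], order[1:]
        keep.append(int(best))
        # Overlap of the best box with every remaining box
        inter_w = np.maximum(0.0, np.minimum(x2[best], x2[rest]) - np.maximum(x1[best], x1[rest]))
        inter_h = np.maximum(0.0, np.minimum(y2[best], y2[rest]) - np.maximum(y1[best], y1[rest]))
        inter = inter_w * inter_h
        iou = inter / (areas[best] + areas[rest] - inter)
        order = rest[iou <= overlap_thresh]        # drop boxes that overlap too much
    return keep


toy_boxes = [[0, 0, 10, 10], [1, 1, 10, 10], [50, 50, 10, 10]]
toy_scores = [0.9, 0.8, 0.7]
print(nms_xywh(toy_boxes, toy_scores))
```

Because class labels are never consulted, a "car" box and a "truck" box on the same vehicle compete with each other and only the more confident one survives.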




In [17]:
from yolo_nms import *

yolo = YOLO()
idx = 0

fig=plt.figure(figsize=(100,100))

result_images_3 = copy.deepcopy(dataset_images)

out_imgs = []

for i in range(len(result_images_3)):
    out_img, stored_obstacles = main(result_images_3[i])
    out_imgs.append(out_img)
    fig.add_subplot(1, len(result_images_3), i+1)
    plt.imshow(out_imgs[i])



### Using Age


We now have a pretty good tracker! <p>
One weakness is that it relies solely on the detector.
If we miss a detection, we lose the obstacle. <p>

In this part, we'll handle two kinds of errors:
* False Positive
* False Negative<p>

A **false positive** means that you detected an obstacle that shouldn't have been detected.<p>
We'll solve it by introducing a **MIN_HIT_STREAK** variable. If the detector detects something once, it is not displayed. If it **detects it twice in a row**, or 3 times in a row (thanks to matching), it is displayed.

A **false negative** means that you didn't detect an obstacle that should have been detected.<p>
We'll solve it by introducing a **MAX_UNMATCHED_AGE** variable. If an obstacle is suddenly unmatched, we **keep displaying** it. If it stays unmatched for more than MAX_UNMATCHED_AGE frames, we remove it.
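The two counters can be illustrated in isolation, without any detector. The sketch below is an assumption-laden toy: `TrackState` and `step` are hypothetical names, the thresholds 2 and 3 are example values, and it resets the unmatched counter whenever a match happens, which is one possible policy among several.

```python
from dataclasses import dataclass

MIN_HIT_STREAK = 2       # example value: matches needed before displaying
MAX_UNMATCHED_AGE = 3    # example value: misses tolerated before removal


@dataclass
class TrackState:
    age: int = 1             # number of frames in which the track was seen
    unmatched_age: int = 0   # number of consecutive frames without a match


def step(track, matched):
    """Update the counters for one frame and return (display, keep)."""
    if matched:
        track.age += 1
        track.unmatched_age = 0   # assumption: a match resets the miss counter
    else:
        track.unmatched_age += 1
    keep = track.unmatched_age <= MAX_UNMATCHED_AGE
    display = keep and track.age >= MIN_HIT_STREAK
    return display, keep


track = TrackState()                       # first detection: not displayed yet
history = [step(track, m) for m in (True, False, False, False, False)]
print(history)
```

In this run the track is displayed after its second sighting, survives three missed frames, and is dropped on the fourth miss.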

In [18]:
MIN_HIT_STREAK = 2
MAX_UNMATCHED_AGE = 3

**Obstacle Class** <p>
Let's redefine the Obstacle class to include these values.
Every obstacle should have:
* an id
* a box
* an age (number of times matched)
* an unmatched frame number (number of times unmatched)

In [19]:
class Obstacle():
    def __init__(self, idx, box, age=1, unmatched_age=0):
        self.idx = idx
        self.box = box
        self.age = age
        self.unmatched_age = unmatched_age

**Main Loop**<p>
Now we can redefine the main loop.
We simply add conditions that decide whether or not to display an obstacle.

In [20]:
def main(input_image):
    """
    Receives an image
    Outputs the result image
    """
    global stored_obstacles # Will be used to keep track of obstacles information
    global idx # Will be used to keep track of id information
    # Run obstacle detection
    image = copy.deepcopy(input_image)
    _, out_boxes = yolo.inference(input_image)
    
    # What we will do will be very similar but we have a second list of obstacles that answer to the conditions
    # On first iteration, we only create obstacles with age=1
    if (idx == 0):
        stored_obstacles = []
        for i, box in enumerate(out_boxes):
            obs = Obstacle(idx, box) # Create an obstacle with age=1
            stored_obstacles.append(obs)                
            idx +=1
        return image
    
    # In this case, if the obstacle has already been matched, we display it depending on the MIN_HIT_STREAK variable
    elif (idx != 0): # In case we already have obstacles from previous frame, work on association
        ## Before calling associate, we must create a list of old obstacles
        old_obstacles = [obs.box for obs in stored_obstacles] # Simply get the boxes
        matches, unmatched_detections, unmatched_tracks = associate(old_obstacles, out_boxes) # Associate the obstacles
        
        selected_obstacles = []
        # Loop through all matches and add these as obstacles
        new_obstacles = []
        for match in matches:
            obs = Obstacle(stored_obstacles[match[0]].idx, out_boxes[match[1]], stored_obstacles[match[0]].age +1) # Increase the age by 1
            new_obstacles.append(obs)
            if obs.age >= MIN_HIT_STREAK:
                selected_obstacles.append(obs)
        
        # Loop through all unmatched detections and add these as obstacles
        for new_obs in unmatched_detections:
            idx +=1
            obs = Obstacle(idx, new_obs)
            new_obstacles.append(obs)
            if obs.age >= MIN_HIT_STREAK:
                selected_obstacles.append(obs)

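        # Loop through all unmatched tracks and keep them alive for a few frames
        for old_box in unmatched_tracks:
            for obs in stored_obstacles:
                if obs.box is old_box: # associate() returns the very box objects it received
                    obs.unmatched_age +=1
                    if obs.unmatched_age <= MAX_UNMATCHED_AGE:
                        new_obstacles.append(obs) # Keep the track for the next association
                        selected_obstacles.append(obs)
                    break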

        # Draw on selected obstacles only
        for i, obs in enumerate(selected_obstacles):
            left, top, right, bottom = convert_data(obs.box)
            cv2.rectangle(image, (left, top), (right, bottom), id_to_color(obs.idx), thickness=10)
            image = cv2.putText(image, str(obs.idx),(left,top),cv2.FONT_HERSHEY_SIMPLEX, 1,id_to_color(obs.idx),thickness=4)                

        stored_obstacles = copy.deepcopy(new_obstacles)
        return image

In [21]:
yolo = YOLO()
idx = 0

fig=plt.figure(figsize=(100,100))

result_images_3 = copy.deepcopy(dataset_images)

out_imgs = []

for i in range(len(result_images_3)):
    out_img = main(result_images_3[i])
    out_imgs.append(out_img)
    fig.add_subplot(1, len(result_images_3), i+1)
    plt.imshow(out_imgs[i])

plt.show()


# 4 - Kalman Filters

It is now time to introduce Kalman Filters to this project.
Kalman Filters will help predict the future position of a bounding box, so that the association is more likely to find the right match in the next frame.

We'll use a Kalman Filter that tracks 4 measured variables: x, y, w, h, each paired with its rate of change (an 8-dimensional state). <p>
These 4 variables are the values returned by the YOLO algorithm.


The process will be the following:
*   Detect bounding boxes
*   Associate them with the previous predictions
*   Predict the next position of each box
<p>Repeat. <p>
The association is made with the prediction from t-1.
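Before using the filterpy version, it can help to see the predict and update steps on a single coordinate. The sketch below is a plain NumPy constant-velocity filter on one variable (position and velocity), with made-up noise values. It runs predict then update on each step, whereas the notebook updates first and predicts afterwards; the equations are the same.

```python
import numpy as np


def kf_predict(x, P, F, Q):
    """Project the state and its covariance one step ahead."""
    x = F @ x
    P = F @ P @ F.T + Q
    return x, P


def kf_update(x, P, z, H, R):
    """Correct the prediction with a measurement z."""
    y = z - H @ x                          # innovation
    S = H @ P @ H.T + R                    # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)         # Kalman gain
    x = x + K @ y
    P = (np.eye(len(x)) - K @ H) @ P
    return x, P


dt = 1.0
F = np.array([[1.0, dt], [0.0, 1.0]])      # position += velocity * dt
H = np.array([[1.0, 0.0]])                 # only the position is measured
Q = 0.01 * np.eye(2)                       # illustrative process noise
R = np.array([[1.0]])                      # illustrative measurement noise
x = np.array([0.0, 0.0])                   # initial position and velocity
P = 1000.0 * np.eye(2)                     # large initial uncertainty

for z in range(10):                        # the object moves 1 unit per frame
    x, P = kf_predict(x, P, F, Q)
    x, P = kf_update(x, P, np.array([float(z)]), H, R)

print("position, velocity:", x)
```

Although only positions are measured, the filter recovers a velocity close to 1 unit per frame, and that velocity is what lets it predict where the box will be in the next frame.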

Alright!

In [33]:
!pip install filterpy



In [23]:
from filterpy.kalman import KalmanFilter
from scipy.linalg import block_diag
from filterpy.common import Q_discrete_white_noise
import time

def FourDimensionsKF(R_std=10, Q_std=0.01):
    """ Create first order Kalman filter. 
    Specify R and Q as floats."""
    kf = KalmanFilter(dim_x=8, dim_z=4)
    kf.F = np.array([[1, 1, 0, 0, 0, 0, 0, 0],
                     [0, 1, 0, 0, 0, 0, 0, 0],
                     [0, 0, 1, 1, 0, 0, 0, 0],
                     [0, 0, 0, 1, 0, 0, 0, 0],
                     [0, 0, 0, 0, 1, 1, 0, 0],
                     [0, 0, 0, 0, 0, 1, 0, 0],
                     [0, 0, 0, 0, 0, 0, 1, 1],
                     [0, 0, 0, 0, 0, 0, 0, 1]])

    kf.P *= 1000
    kf.R[2:, 2:] *= R_std
    kf.Q[-1, -1] *= Q_std
    kf.Q[4:, 4:] *= Q_std
    return kf

In [24]:
# The Obstacle Class will now have Kalman Filter values

class Obstacle():
    def __init__(self, idx, box, time, age=1, unmatched_age=0):
        self.idx = idx
        self.box = box
        self.age = age
        self.unmatched_age = unmatched_age
        self.time = time
        self.kf = FourDimensionsKF()
        self.kf.x = np.array([box[0], 0, box[1], 0, box[2], 0, box[3], 0])
        self.kf.H = np.array([[1, 0, 0, 0, 0, 0, 0, 0],
                             [0, 0, 1, 0, 0, 0, 0, 0],
                             [0, 0, 0, 0, 1, 0, 0, 0],
                             [0, 0, 0, 0, 0, 0, 1, 0]])

In [25]:
def get_obs_from_mean(mean):
    return [mean[0], mean[2], mean[4], mean[6]]

In [30]:
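def return_F_with_dt(dt):
    # NOTE: the dt argument is overridden by the hard-coded frame period below.
    # Comment out all three assignments to use the measured time between frames instead.
    #dt = 7./60. #IMAGE MODE WITH 1/7 IMAGE
    dt = 1./25. #VIDEO MODE
    #dt = 1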
    return np.array([
                [1, dt, 0,  0, 0, 0, 0, 0],
                [0,  1, 0,  0, 0, 0, 0, 0],
                [0,  0, 1, dt, 0, 0, 0, 0],
                [0,  0, 0,  1, 0, 0, 0, 0],
                [0, 0, 0, 0, 1, dt, 0, 0],
                [0, 0, 0, 0, 0, 1, 0, 0],
                [0, 0, 0, 0, 0, 0, 1, dt],
                [0, 0, 0, 0, 0, 0, 0, 1]])
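The transition matrix above is block-diagonal: the same 2x2 constant-velocity block repeated for x, y, w and h. As an optional alternative, the sketch below builds it with a Kronecker product and applies it to a toy state; `make_F` and the state values are illustrative assumptions.

```python
import numpy as np


def make_F(dt, n_vars=4):
    """Block-diagonal transition matrix: one [[1, dt], [0, 1]] block per variable."""
    block = np.array([[1.0, dt], [0.0, 1.0]])
    return np.kron(np.eye(n_vars), block)


dt = 1.0 / 25.0                     # frame period of a 25 FPS video
F = make_F(dt)

# State layout: [x, vx, y, vy, w, vw, h, vh]
state = np.array([100.0, 50.0, 200.0, 0.0, 40.0, 0.0, 80.0, 0.0])
next_state = F @ state              # x advances by vx * dt, the rest is unchanged
print(next_state)
```

With a velocity of 50 pixels per second and a frame period of 1/25 s, x moves by 2 pixels per frame, while y, w and h stay put because their velocities are zero.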

In [31]:
def main(input_image):
    """
    Receives an image
    Outputs the result image
    """
    global stored_obstacles
    global idx
    global yolo
    
    #print("Starting new loop ...")
    image = copy.deepcopy(input_image)
    _, out_boxes = yolo.inference(input_image)
    current_time = time.time()
    #print("Detected Boxes")
    #print(out_boxes)
    
    # First Detection, Initialize a Kalman Filter per Bounding Box
    if (idx == 0):
        stored_obstacles = []
        for i, box in enumerate(out_boxes):
            obs = Obstacle(idx, box, current_time)
            stored_obstacles.append(obs)
            idx +=1
            #print("Initialized Kalman Filter")
            #print(obs.kf.x)
        return input_image
    
    # Not First Detection, Match and KF
    elif (idx != 0):                
        # Match between old obstacles and new using Hungarian Algorithm
        old_boxes = [obs.box for obs in stored_obstacles]
        matches, unmatched_detections, unmatched_tracks = associate(old_boxes, out_boxes)

        selected_obstacles = []
        new_obstacles = []

        # For Matched Obstacles, Update & Predict the next position; store in Box for future match
        for match in matches:
            obs = stored_obstacles[match[0]] # Take the former obstacle and its ID
            obs.age +=1 # Increment the age by 1
            
            # Update
            measurement = out_boxes[match[1]]
            #print("Value before update")
            #print(obs.kf.x)
            
            obs.kf.update(np.array(measurement))
            #print("Updated Value: ")
            #print(get_obs_from_mean(obs.kf.x_post))

            # Prediction
            F = return_F_with_dt(current_time - obs.time)
            obs.kf.F = F
            obs.kf.predict()
            obs.time = current_time
            #print("New Prediction: ")
            #print(get_obs_from_mean(obs.kf.x_prior))
            # Update for future Match
            obs.box = get_obs_from_mean(obs.kf.x)
            
            #print("Matched Prediction: ", obs.box, " and: ",out_boxes[match[1]])

            new_obstacles.append(obs)
            if obs.age >= MIN_HIT_STREAK:
                selected_obstacles.append(obs)

        # For Unmatched Detections, Do the same as for idx = 0
        for new_obs in unmatched_detections:
            idx +=1
            obs = Obstacle(idx, new_obs, current_time)
            new_obstacles.append(obs)
            if obs.age >= MIN_HIT_STREAK:
                selected_obstacles.append(obs)

        # For Unmatched Tracks, Just Predict using dt
        for old_box in unmatched_tracks:
            for obs in stored_obstacles:
                if obs.box is old_box: # associate() returns the very box objects it received
                    F = return_F_with_dt(current_time - obs.time)
                    obs.time = current_time
                    obs.kf.F = F
                    obs.kf.predict()
                    obs.box = get_obs_from_mean(obs.kf.x)
                    #print("Unmatched Track; Prediction:")
                    #print(obs.box)
                    obs.unmatched_age +=1
                    if obs.unmatched_age <= MAX_UNMATCHED_AGE:
                        new_obstacles.append(obs) # Keep the track for the next association
                        selected_obstacles.append(obs)
                    break

        # Draw on selected obstacles only
        for i, obs in enumerate(selected_obstacles):
            left, top, right, bottom = convert_data(obs.box)
            cv2.rectangle(image, (int(left), int(top)), (int(right), int(bottom)), id_to_color(obs.idx), thickness=10)
            image = cv2.putText(image, str(obs.idx),(int(left),int(top)),cv2.FONT_HERSHEY_SIMPLEX, 1,id_to_color(obs.idx),thickness=4)                
        stored_obstacles = copy.deepcopy(new_obstacles)
        return image


In [28]:
yolo = YOLO()

idx = 0

fig=plt.figure(figsize=(100,100))

result_images_3 = copy.deepcopy(dataset_images)

out_imgs = []

for i in range(len(result_images_3)):
    out_img = main(result_images_3[i])
    out_imgs.append(out_img)
    fig.add_subplot(1, len(result_images_3), i+1)
    plt.imshow(out_imgs[i])

plt.show()


# Video


Now is the time to run on a video. Import the video_0 file and run it in Paris!
If you have a GPU, it's even better.
Otherwise, use the subclip function to run it only on the first few seconds.

In [32]:
from moviepy.editor import VideoFileClip
idx = 0
yolo = YOLO() # main() reads the detector from the global variable `yolo`
#video_file = "/content/drive/My Drive/Think Autonomous/SDC Course/Tracking/Images/video_0.MOV"
#video_file = "/content/drive/My Drive/Think Autonomous/SDC Course/Tracking/Images/MOT16-13-raw.mp4" #25 FPS
video_file = "/content/drive/My Drive/Think Autonomous/SDC Course/Tracking/Images/MOT16-14-raw.mp4" #25 FPS
clip = VideoFileClip(video_file).subclip(0,5)
white_clip = clip.fl_image(main)
%time white_clip.write_videofile("/content/drive/My Drive/Think Autonomous/SDC Course/Tracking/Output/movie_track_kf_out.mp4",audio=False)

[MoviePy] >>>> Building video /content/drive/My Drive/Think Autonomous/SDC Course/Tracking/Output/movie_track_kf_out.mp4
[MoviePy] Writing video /content/drive/My Drive/Think Autonomous/SDC Course/Tracking/Output/movie_track_kf_out.mp4




[MoviePy] Done.
[MoviePy] >>>> Video ready: /content/drive/My Drive/Think Autonomous/SDC Course/Tracking/Output/movie_track_kf_out.mp4 

CPU times: user 5min 5s, sys: 6.93 s, total: 5min 12s
Wall time: 2min 50s


In [None]:
import io
import base64
from IPython.display import HTML

video = io.open('/content/drive/My Drive/Think Autonomous/SDC Course/Tracking/Output/movie_track_kf.mp4', 'r+b').read()
encoded = base64.b64encode(video)
HTML(data='''<video alt="test" controls width="320" height="240">
                <source src="data:video/mp4;base64,{0}" type="video/mp4" />
             </video>'''.format(encoded.decode('ascii'))) 