# Introduction to Object Tracking

**Instructional Video on Tracking**

https://youtu.be/Ky8J56lbm3c 

**Simple Approach to Object Tracking**

Each time we detect objects in a frame of video, we start fresh, forgetting all the objects in the previous frame. But the objects in the new frame are typically the same objects that were in the previous frame, possibly with a small change in position due to movement. Tracking gives each object an identification number (objID) and maintains that ID as the object moves through successive frames.

Here is the terminology I like to use. We match an **old object** in the **old frame** (previous frame) to the closest **new object** in the **new frame** (current frame). You will also notice print statements in the code that are commented out. I left them in so you can see how I debug my code: by printing out significant variables while the program is running.

**Simple Object Tracker Implementation**

We will implement a very simple tracking algorithm, called the centroid tracker because it tracks the center of each object. The assumption is that an object hasn't moved much between the old frame and the new frame. Therefore, a new object is taken to be the same object as the closest one in the old frame. To do this, for every new object in the new frame, we calculate the center of the object (xc,yc) and find the object in the old frame that is closest to (xc,yc). This works as long as the objects aren't too close to each other (avoiding merging and splitting), don't move too fast, and don't go in and out of frame. Once we do all the matching, any leftover old objects are called **orphans**. They either left the frame or we experienced detector **dropout**. Any leftover new objects are called **newborns**. They either just came into the frame or were produced by detector **drop-in**. In the next section we will go into more detail on tracking failure modes.
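To make the "closest center" idea concrete, here is a minimal sketch of matching one old object to its nearest new object. The function name `closest_new_object` and the sample coordinates are made up for illustration; they are not part of the class code.

```python
import math

def closest_new_object(old_center, new_centers):
    """Return (index, distance) of the new center nearest to old_center.

    Hypothetical helper for illustration only. Returns index -1 if
    new_centers is empty.
    """
    best_index, best_distance = -1, math.inf
    for i, (xc, yc) in enumerate(new_centers):
        # straight-line (Euclidean) distance between the two centers
        distance = math.hypot(xc - old_center[0], yc - old_center[1])
        if distance < best_distance:
            best_index, best_distance = i, distance
    return best_index, best_distance

old_center = (100, 100)                            # center of one old object
new_centers = [(300, 40), (103, 104), (20, 200)]   # centers found in the new frame
index, distance = closest_new_object(old_center, new_centers)
print(index, distance)  # 1 5.0
```

The old object at (100,100) is matched to the new object at (103,104), which is only 5 pixels away. The other two candidates are far enough away that they are almost certainly different objects.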

**Tracking Failure Modes**

The simple tracker works well when there are just a few objects and they don't cross or touch each other. However, several conditions will cause tracking errors. We will consider three cases: objects moving in and out of frame, objects in frame overlapping each other, and the detector failing. In the first case, an object that moves out of frame disappears from the new frame, creating an **orphan** (an old object that doesn't have a corresponding new object). Conversely, an object that moves into the frame appears in the new frame without having been in the previous frame (a **newborn**).

In the second case, two objects that were separate in the previous frame overlap in the current frame and are detected as one object, a condition we call **merging**. Conversely, when two objects that were merged in the previous frame separate into two distinct objects in the current frame, we call this **splitting**.

In the third case, the detection is faulty. For the simple detection method we are using, this happens when the object brightness varies from frame to frame and is near the quantization threshold. When the brightness dips below the threshold, the object is not detected and disappears. We call this **dropout**. When the brightness rises above the threshold, the object reappears (let's call it **drop-in**). This often gives the appearance of **blinking**, alternating between appearing and disappearing.
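Blinking is easy to reproduce with a few numbers. The sketch below assumes a made-up list of brightness values for a single object, hovering around the threshold of 40 used in the detection code. It ignores the area filter, so it is a simplification of what the real detector does.

```python
thresh = 40  # same threshold value as the detection code

# hypothetical brightness of one object in six successive frames
brightness = [43, 41, 39, 42, 38, 44]

# cv2.THRESH_BINARY keeps a pixel only when its value is greater than thresh
detected = [b > thresh for b in brightness]

# compare each frame to the previous one to label dropouts and drop-ins
events = []
for frame in range(1, len(detected)):
    if detected[frame - 1] and not detected[frame]:
        events.append((frame, 'dropout'))
    elif not detected[frame - 1] and detected[frame]:
        events.append((frame, 'drop-in'))

print(detected)  # [True, True, False, True, False, True]
print(events)    # [(2, 'dropout'), (3, 'drop-in'), (4, 'dropout'), (5, 'drop-in')]
```

A brightness change of only a few gray levels is enough to make the object blink in and out four times in six frames.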

In conclusion, if the object count in the current frame is less than the object count in the previous frame, then an object left the frame, merged with another object, or detector dropout has occurred. If the object count in the current frame is greater than the object count in the previous frame, then an object entered the frame, an object split, or detector drop-in has occurred.
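The rule above can be written as a small lookup. This is only a sketch: the function name `possible_causes` is hypothetical, and it can only list candidate causes, not decide which one actually happened.

```python
def possible_causes(old_count, new_count):
    """List the failure modes consistent with a change in object count.

    Hypothetical helper for illustration only.
    """
    if new_count < old_count:
        return ['object left the frame', 'merging', 'detector dropout']
    if new_count > old_count:
        return ['object entered the frame', 'splitting', 'detector drop-in']
    return []  # same count: nothing to report from the count alone

print(possible_causes(3, 2))  # ['object left the frame', 'merging', 'detector dropout']
print(possible_causes(2, 3))  # ['object entered the frame', 'splitting', 'detector drop-in']
```

Note that an unchanged count does not prove that nothing happened. For example, one object could leave the frame while another enters in the same frame.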

**Going Further**

If you would like to learn about the eight object tracking implementations provided in the OpenCV library, check out this link: https://pyimagesearch.com/2018/07/30/opencv-object-tracking/

**Loading the Video**

Mount Google Drive using drive.mount('/content/drive'). Then create a shortcut in your Google Drive to the location where the class data is:
1. Open the class Google Drive folder https://drive.google.com/drive/folders/17dFdrIbTp8RjivAuOiLNyb8pTqZ8QqgL ("SCIP_IMAGE_PYTHON_2022").
2. Right-click on "SCIP_DATA".
3. Click "Add shortcut to Drive".

Run the code below to mount the class code data. You should be able to find image and video files in the folder (on the left) under drive/MyDrive/SCIP_DATA.

In [None]:
#import glob
from google.colab import drive
import matplotlib.pyplot as plt
from IPython import display
from time import sleep
import cv2
drive.mount('/content/drive')

# Review: Detect Objects in Video

Run the detection code from last week, which does the following image processing steps: reads a video frame by frame, resizes each frame and converts it to grayscale, binary quantizes each grayscale frame using a fixed threshold, uses "cv2.findContours" to detect the objects, selects the objects (plankton) we want by size, and draws a rectangle around the selected objects. Notice I added a few lines of code to calculate the center of the object. This will be useful when we try to track the objects.
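The center calculation is simple arithmetic on the bounding box. Here is a small sketch of the same formula used in the code below, wrapped in a hypothetical function `box_center` with made-up box values. Adding 0.5 before calling `int()` rounds to the nearest pixel instead of always rounding down.

```python
def box_center(x0, y0, w, h):
    """Center of a bounding box with top-left corner (x0,y0), width w, height h.

    Hypothetical helper for illustration only.
    """
    xc = int(x0 + w/2 + 0.5)   # +0.5 then int() rounds to the nearest pixel
    yc = int(y0 + h/2 + 0.5)
    return xc, yc

print(box_center(10, 20, 5, 8))  # (13, 24)
```

With a width of 5, the exact center is x = 12.5, which rounds to 13. Without the 0.5, `int(12.5)` would give 12.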

In [8]:
from google.colab.patches import cv2_imshow

vid='/content/drive/MyDrive/SCIP_DATA/Video/threeStentor.mp4'
framesToDisplay=300
thresh=40 # used to determine if a pixel is assigned 0 or 255
frameNumber=0
minArea=300; maxArea=2000; # in pixels
CROP_SIZE=4 # number of pixels to remove on each side of image
thick=3   # thickness of rectangle lines around detected objects
cap = cv2.VideoCapture(vid)
xRez=640; yRez=480;

while(cap.isOpened() and frameNumber<framesToDisplay):
    # get image
    ret, frameIM = cap.read()
    if not ret: # check to make sure there was a frame to read
      print('Done with video')
      break
    frameIM = cv2.resize(frameIM, (xRez, yRez))
    grayIM = cv2.cvtColor(frameIM, cv2.COLOR_BGR2GRAY)    # convert color to grayscale image
    grayIM=grayIM[CROP_SIZE:yRez-CROP_SIZE,CROP_SIZE:xRez-CROP_SIZE]
    ret,binaryIM = cv2.threshold(grayIM,thresh,255,cv2.THRESH_BINARY) # threshold image to make pixels 0 or 255
    
    # detect objects in binaryIM
    contourList, hierarchy = cv2.findContours(binaryIM, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE) # outer contours only; CHAIN_APPROX_SIMPLE compresses straight segments to save memory
    objectCount=0 # counts number of objects detected
    for objContour in contourList:
      area = cv2.contourArea(objContour)
      #print(len(contourList),area)
      if area>minArea and area<maxArea:
        #print('frame',frameNumber,'object',objectCount,'area',area)
        PO = cv2.boundingRect(objContour)
        x0=PO[0]; y0=PO[1]; w=PO[2]; h=PO[3]
        
        ############## NEW CODE: calculate center of object ###############
        xc=int(x0+w/2+0.5)
        yc=int(y0+h/2+0.5)
        ###################################################################
        
        cv2.rectangle(grayIM, (x0,y0), (x0+w,y0+h),255, thick) # place a rectangle around each object (grayIM is grayscale, so 255 is white)
        objectCount+=1
    cv2_imshow(grayIM)
    sleep(0.1)
    display.display(plt.gcf())
    display.clear_output(wait=True)
    frameNumber+=1
cap.release()

# Saving Object Locations using Pickle

Now that we have the centers (xc,yc) of all the objects we want to track, we will save them in a list that we will later use to perform the tracking, without having to re-run the detect program. We could do it all in one program, but it would be rather large. To speed up processing, we are not going to display the detection results, since we know it works.
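If you haven't used pickle before, here is a minimal round-trip sketch: save a list of lists to a file, load it back, and check that nothing changed. The list `objListDemo` and the temporary file name are made up for illustration. The rows follow the same layout the code below uses (frameNumber, xc, yc, objID, ASSIGNED_FLAG).

```python
import os
import pickle
import tempfile

# hypothetical object list: [frameNumber, xc, yc, objID, ASSIGNED_FLAG]
objListDemo = [[0, 120, 85, -1, 0],
               [0, 300, 210, -1, 0],
               [1, 123, 88, -1, 0]]

# write to a temporary file so we don't clutter Google Drive
path = os.path.join(tempfile.gettempdir(), 'objListDemo.pickle')

with open(path, 'wb') as f:      # 'wb' = write binary
    pickle.dump(objListDemo, f)

with open(path, 'rb') as f:      # 'rb' = read binary
    restored = pickle.load(f)

print(restored == objListDemo)   # True
```

Using `with` closes the file automatically, even if an error occurs while saving or loading.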

In [2]:
import pickle
import cv2   # imported here as well so this cell does not fail with a NameError if run on its own
from google.colab import files

vid='/content/drive/MyDrive/SCIP_DATA/Video/threeStentor.mp4'
framesToDisplay=300
thresh=40 # used to determine if a pixel is assigned 0 or 255
frameNumber=0
minArea=300; maxArea=2000; # in pixels
CROP_SIZE=4 # number of pixels to remove on each side of image
thick=3   # thickness of rectangle lines around detected objects
cap = cv2.VideoCapture(vid)
xRez=640; yRez=480;

############## NEW CODE: Defining a list to store object location ###############
objList=[] # we will need this list to store new object location
###################################################################

print('Creating objList') # I like to give a status of the program when it's running so the user knows what it's doing
while(cap.isOpened() and frameNumber<framesToDisplay):
    # get image
    ret, frameIM = cap.read()
    if not ret: # check to make sure there was a frame to read
      print('Done reading entire video')
      break
    frameIM = cv2.resize(frameIM, (xRez, yRez))
    grayIM = cv2.cvtColor(frameIM, cv2.COLOR_BGR2GRAY)    # convert color to grayscale image
    grayIM=grayIM[CROP_SIZE:yRez-CROP_SIZE,CROP_SIZE:xRez-CROP_SIZE]
    ret,binaryIM = cv2.threshold(grayIM,thresh,255,cv2.THRESH_BINARY) # threshold image to make pixels 0 or 255
    
    # detect objects in binaryIM
    contourList, hierarchy = cv2.findContours(binaryIM, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE) # outer contours only; CHAIN_APPROX_SIMPLE compresses straight segments to save memory
    objectCount=0 # counts number of objects detected
    for objContour in contourList:
      area = cv2.contourArea(objContour)
      #print(len(contourList),area)
      if area>minArea and area<maxArea:
        #print('frame',frameNumber,'object',objectCount,'area',area)   # used for debugging
        PO = cv2.boundingRect(objContour)
        x0=PO[0]; y0=PO[1]; w=PO[2]; h=PO[3]
        
        ############## NEW CODE: calculate center of object ###############
        xc=int(x0+w/2+0.5)
        yc=int(y0+h/2+0.5)
        ASSIGNED_FLAG=0
        objID=-1          # -1 will indicate that no id has been assigned to the object (valid id's are positive numbers, starting with 0)
        v=[frameNumber,xc,yc,objID,ASSIGNED_FLAG] # create a list of object location and id. You'll see!
        objList.append(v)  # append the object location and id to objList (we just made a list of lists!)
        
        ###################################################################
        
        cv2.rectangle(grayIM, (x0,y0), (x0+w,y0+h),255, thick) # place a rectangle around each object (grayIM is grayscale, so 255 is white)
        objectCount+=1

    # no need to display objects, just takes up time and we know it works by now
    #plt.imshow(grayIM)
    #plt.title('frame ' + str(frameNumber))
    #sleep(0.3)
    #display.display(plt.gcf())
    #display.clear_output(wait=True)
    frameNumber+=1
cap.release()

# save the objList using pickle 
with open('/content/drive/MyDrive/objList.pickle','wb') as pick_insert:   # full path under the mount point; 'with' closes the file for us
    pickle.dump(objList, pick_insert)
print('Saved objList.pickle')



Now let's play back the objList to make sure it's good.

In [1]:
import numpy as np

lastFrame=objList[-1][0]
print('lastFrame',lastFrame)
radius=10; thick=0;
color=(255,255,255)
im=np.zeros((yRez,xRez,3),dtype='uint8') # create a black image
lastFrame=0
for i in range (len(objList)):
    frame,xc,yc,objID,ASSIGNED_FLAG=objList[i] 
    cv2.circle(im, (xc,yc), radius,color, thick) # draw a white circle at the center of each object (pass the whole BGR tuple, not just color[0])
    if frame!=lastFrame:
        lastFrame=frame
        cv2_imshow(im)
        #sleep(0.3) # no need to make sure every frame is visible since we keep writing circles on the original image (like a multiple exposure in photography), leaving a trail of object locations
        display.display(plt.gcf())
        display.clear_output(wait=True)


# Tracking Code

Now we are ready to do tracking using the objList.
Here is an outline of what the code will be doing:
1. Find out where each frame starts in the objList. 
2. Initialize the objects in the first frame by assigning them IDs, starting with 0.
3. Make a nested loop (a loop in a loop). For each of the old objects (outer loop), find the closest new object (inner loop), and assign the old object ID to the new object. Set the ASSIGNED_FLAG for the new object so it isn't matched multiple times to old objects.
4. Give any newborns (new objects that were not assigned to old objects) a new unique ID, and keep track of the next ID to give out with the variable newID.
5. Show the tracking results by displaying each frame with a circle at the center of each object, color coded by the object ID. If tracking is perfect, the color of an object will stay constant as the object moves. If the color changes, try to predict what failure mode is occurring.
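Steps 1 through 4 of the outline can be tried on a tiny hand-made list before running them on real data. The sketch below is a simplified stand-in, not the class code itself: the list `objs`, its coordinates, and the variable names are made up for illustration. Two objects drift slowly over three frames, and a third object appears in the last frame.

```python
import math

# toy rows: [frameNumber, xc, yc, objID, ASSIGNED_FLAG]
objs = [
    [0, 10, 10, -1, 0], [0, 50, 50, -1, 0],
    [1, 52, 51, -1, 0], [1, 12, 11, -1, 0],
    [2, 14, 12, -1, 0], [2, 54, 52, -1, 0], [2, 90, 90, -1, 0],
]

# step 1: index where each frame starts, plus an end marker for the last frame
starts = [i for i in range(len(objs)) if i == 0 or objs[i][0] != objs[i - 1][0]]
starts.append(len(objs))

# step 2: give the objects in the first frame ids 0, 1, ...
next_id = 0
for i in range(starts[0], starts[1]):
    objs[i][3] = next_id
    next_id += 1

# steps 3 and 4: match old objects to new objects, frame by frame
for f in range(1, len(starts) - 1):
    new_range = range(starts[f], starts[f + 1])
    for i_old in range(starts[f - 1], starts[f]):
        best, best_dist = None, math.inf
        for i_new in new_range:
            if objs[i_new][4] == 1:      # already matched to another old object
                continue
            d = math.hypot(objs[i_new][1] - objs[i_old][1],
                           objs[i_new][2] - objs[i_old][2])
            if d < best_dist:
                best, best_dist = i_new, d
        if best is not None:
            objs[best][3] = objs[i_old][3]   # new object inherits the old id
            objs[best][4] = 1                # mark the new object as assigned
    for i_new in new_range:                  # leftover new objects are newborns
        if objs[i_new][3] == -1:
            objs[i_new][3] = next_id
            next_id += 1

ids = [row[3] for row in objs]
print(starts)  # [0, 2, 4, 7]
print(ids)     # [0, 1, 1, 0, 0, 1, 2]
```

Even though the objects are listed in a different order in each frame, each one keeps its id, and the object that shows up at (90,90) in frame 2 is a newborn with id 2.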


In [1]:
import math # we are going to need square root for calculating distance

radius=10; thick=0;
# To see if the tracker is working, let's color code the circle drawn on each object by its id.
# If tracking is working perfectly, an object will retain its color. 
# Seven colors are defined in a list. Can you decode what color they would produce? Hint: BGR
color=[(0,0,255),(0,255,0),(0,255,255),(255,0,0),(255,0,255),(255,255,0),(255,255,255)] 

# The objList is a list of lists. Each list contains these variables in this order: frameNumber,xc,yc,objID,ASSIGNED_FLAG
# Instead of retrieving their values (indexing) with numbers 0 to 4 (remember, computers count starting with 0), I like to use labels to give the code better readability
frameIndex=0
idIndex=3
assignIndex=4

# create a black image for us to draw on
im=np.zeros((yRez,xRez,3),dtype='uint8')

# make a list of where the frame boundaries are in objList
startFrame=[]
lastFrame=-1                      # -1 is not a valid frame number, so the very first object always starts a new frame
for i in range (len(objList)):
    frame,xc,yc,objID,ASSIGNED_FLAG=objList[i] 
    if frame!=lastFrame:
        startFrame.append(i)
        lastFrame=frame
startFrame.append(len(objList))   # end marker, so the last frame also has a stop index

# give objects in the first frame initial id's so they can be propagated to successive frames
startingID=0
for i in range(startFrame[0],startFrame[1]):
    objList[i][idIndex]=startingID
    startingID+=1
newID=startingID          # if a new id is needed for a new object, start with this since it's the next one that hasn't been used (every object must have a unique id!)

# now let's match each object in the old (last) frame with objects in the current (new) frame
orphanCount=0                         # counts how many old objects didn't find matches to new objects (ran out of new objects)
for i in range(1,len(startFrame)-1):  # we start with frame 1 (new frame, new objects) since we will be looking back at the previous old frame (frame 0) to match old objects with new objects
    oldStart=startFrame[i-1]          # beginning of old frame
    oldStop=startFrame[i]             # end of old frame
    newStart=startFrame[i]            # beginning of new frame
    newStop=startFrame[i+1]           # end of new frame
    #print(oldStart,oldStop,newStart,newStop)
    
    BIG_NUMBER=999999               # you'll see how we use this very soon to find the minimum distance
    for iOld in range(oldStart,oldStop):  # process every old object trying to match it to the closest new object
        frame,xcOld,ycOld,idOld,_=objList[iOld]   # let's get the location and id of the old object
        bestDistance=BIG_NUMBER # make this a really big number, so when we do our initial compare, the distance is guaranteed to be smaller than this number
        bestMatchIndex=-1       # we'll update this when we find a close object in the new frame
        for iNew in range(newStart,newStop):      # check distance of every new object to the old object we are processing
            frame,xcNew,ycNew,_,newAssigned=objList[iNew]   # the underscore ('_') is a way to receive a variable that we don't care about. Sometimes I use "dummy", but most people seem to use '_'
            if newAssigned==1:                    # skip new objects already matched to another old object (one per customer!)
                continue
            dx=xcNew-xcOld
            dy=ycNew-ycOld
            distance=math.sqrt(dx*dx+dy*dy)
            if distance<bestDistance:             # if this new object is closer than the closest new object we've seen so far, save its information
                bestDistance=distance             # keep the value and see if any other new object is closer
                bestMatchIndex=iNew               # keep track of the best new object in case it turns out to be the closest to the old object
                #print(iNew,distance)
        if bestDistance!=BIG_NUMBER:              # if bestDistance has been updated, that means a match has been found 
            objList[bestMatchIndex][idIndex]=idOld    # the new object gets the id of the old object. The fourth element of the list (index 3 because we count from zero!) is the ID
            objList[bestMatchIndex][assignIndex]=1    # ASSIGNED_FLAG=1 indicates the new object has been assigned, so it can't be matched to any other old object
        else:   
            orphanCount+=1                        # after testing all the new objects, bestDistance was never updated (still has the initial value of BIG_NUMBER), so we know there was no new object to match to this old object :( 

    # now that all old objects have been matched to new objects (except for old orphans), let's see if there are any new objects that haven't been assigned ids (newborn objects)
    # we can tell because their id still has the value it was given when we created the objList, which is -1
    # if they are newborns, give them an id using the newID variable
    for iNew in range(newStart,newStop):
        if objList[iNew][idIndex]==-1:      # id==-1 means the new object was not matched to an old object, so give it a newID
            objList[iNew][idIndex]=newID
            newID+=1                        # make sure we advance newID so the next "new born" gets a unique id
            print('new born id', objList[iNew][idIndex])

# now show that it works, by color coding the id of the objects
im=np.zeros((yRez,xRez,3),dtype='uint8')        # notice we make a color image by having three dimensions; x,y and 3 color channels (BGR)
lastFrame=0     # last frame is the starting frame, to start the loop ("bootstrapping")
for i in range (len(objList)):
    frame,xc,yc,objID,ASSIGNED_FLAG=objList[i] 
    #print('frame',frame,'objID',objID)
    cv2.circle(im, (xc,yc), radius,color[objID%len(color)], thick) # draw a circle at the center of each object, colored by its id (BGR)
    if frame!=lastFrame:
        lastFrame=frame
        cv2_imshow(im)
        sleep(0.3)
        display.display(plt.gcf())
        display.clear_output(wait=True)

