# 7 Object Tracking

**MeanShift** algorithm: a clustering algorithm similar to k-means that finds the local maxima (modes) of the sample density in the feature space, for any number of clusters present. Main steps of the algorithm:
- for each data point/sample, we open a window and compute a weighted mean of all the samples within it
- we move the point to the computed mean, and we repeat
- eventually, all points will converge to cluster centroids!
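The steps above can be sketched in a few lines of NumPy. This is a minimal illustration that assumes a flat (uniform) window, so the weighted mean reduces to a plain mean; the function name `mean_shift`, the `bandwidth` parameter and the toy data are assumptions made for this example, not part of any library.

```python
import numpy as np

def mean_shift(points, bandwidth, n_iter=50, tol=1e-4):
    """Shift every point to the mean of its window until it stops moving."""
    shifted = points.astype(float).copy()
    for _ in range(n_iter):
        max_move = 0.0
        for i, p in enumerate(shifted):
            # Window: all original samples within `bandwidth` of the current position
            dist = np.linalg.norm(points - p, axis=1)
            neighbours = points[dist <= bandwidth]
            new_p = neighbours.mean(axis=0)  # flat window: plain mean
            max_move = max(max_move, np.linalg.norm(new_p - p))
            shifted[i] = new_p
        if max_move < tol:
            break
    return shifted

# Toy data (assumption): two well-separated 2D blobs
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=(0.0, 0.0), scale=0.3, size=(50, 2))
blob_b = rng.normal(loc=(5.0, 5.0), scale=0.3, size=(50, 2))
points = np.vstack([blob_a, blob_b])

modes = mean_shift(points, bandwidth=2.0)

# Points that ended up (almost) at the same place belong to the same cluster
centers = []
for m in modes:
    if not any(np.linalg.norm(m - c) < 0.5 for c in centers):
        centers.append(m)
print(len(centers), "clusters found")
```

Note that the number of clusters is never passed in: it falls out of the data and the chosen `bandwidth`.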

While with k-means the number of clusters is chosen beforehand, with mean-shift the number of clusters is computed automatically; that might result in a non-intuitive number of clusters, too.

How do we apply it to tracking? We select a target area and compute its hue histogram. Then, on each new frame, we start from the window of the previous frame and keep sliding it towards the closest match (the cluster center): the window is shifted in the direction of the centroid of the intensities inside it, and this is repeated until it converges.
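To make the window movement concrete, here is a minimal NumPy sketch of that sliding step on a single probability image. It is an illustration only: `shift_window` is a hypothetical helper, the pixel-center convention and the rounding are assumptions, and `cv2.meanShift()` may differ in such details.

```python
import numpy as np

def shift_window(prob, window, max_iter=10, eps=1.0):
    """Move an (x, y, w, h) window towards the centroid of `prob` inside it."""
    x, y, w, h = window
    rows, cols = prob.shape
    for _ in range(max_iter):
        patch = prob[y:y + h, x:x + w]
        mass = patch.sum()
        if mass == 0:  # nothing to follow inside the window
            break
        ys, xs = np.mgrid[0:patch.shape[0], 0:patch.shape[1]]
        # Centroid of the weights, relative to the window's top-left corner
        cx = (xs * patch).sum() / mass
        cy = (ys * patch).sum() / mass
        # Shift so that the window center lands on the centroid
        dx = int(round(cx - (w - 1) / 2))
        dy = int(round(cy - (h - 1) / 2))
        # Keep the window inside the image
        new_x = min(max(x + dx, 0), cols - w)
        new_y = min(max(y + dy, 0), rows - h)
        moved = (new_x - x) ** 2 + (new_y - y) ** 2
        x, y = new_x, new_y
        if moved < eps:  # converged: the window barely moved
            break
    return x, y, w, h

# Toy probability image (assumption): one bright 10x10 blob
prob = np.zeros((100, 100))
prob[60:70, 50:60] = 1.0

# The start window only partially overlaps the blob
window = shift_window(prob, (40, 45, 20, 20))
print(window)
```

In tracking, the result for one frame becomes the start window for the next frame.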

**CamShift** = Continuously Adaptive (window) Mean Shift. It is a mean-shift variant in which the window changes its size according to the object size, which varies as the object moves away from or towards the camera. The size is updated after mean-shift converges: we update the window size and the orientation of the ellipse fitted to it.
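One common way to obtain a size and an orientation from a probability image is through its second-order image moments. The sketch below uses that textbook formulation; it is an assumption for illustration and not necessarily what `cv2.CamShift()` computes internally. The name `blob_shape_from_moments` is hypothetical.

```python
import numpy as np

def blob_shape_from_moments(prob):
    """Estimate center, axis lengths and angle of the blob in `prob`."""
    ys, xs = np.mgrid[0:prob.shape[0], 0:prob.shape[1]]
    m00 = prob.sum()
    cx = (xs * prob).sum() / m00
    cy = (ys * prob).sum() / m00
    # Central second-order moments, normalized by the total mass
    mu20 = ((xs - cx) ** 2 * prob).sum() / m00
    mu02 = ((ys - cy) ** 2 * prob).sum() / m00
    mu11 = ((xs - cx) * (ys - cy) * prob).sum() / m00
    # Orientation of the major axis in radians (image coordinates: y points down)
    angle = 0.5 * np.arctan2(2 * mu11, mu20 - mu02)
    # Eigenvalues of the 2x2 covariance matrix: variance along each principal axis
    spread = np.sqrt(4 * mu11 ** 2 + (mu20 - mu02) ** 2)
    lam_major = (mu20 + mu02 + spread) / 2
    lam_minor = max((mu20 + mu02 - spread) / 2, 0.0)
    # Full axis lengths of a uniform ellipse with the same second moments
    axes = (4 * np.sqrt(lam_major), 4 * np.sqrt(lam_minor))
    return (cx, cy), axes, angle

# Toy probability image (assumption): a horizontal bar, 30 px wide and 10 px tall
bar = np.zeros((50, 50))
bar[20:30, 10:40] = 1.0
center, axes, angle = blob_shape_from_moments(bar)
print(center, axes, np.degrees(angle))
```

A larger or closer object spreads its probability mass over more pixels, so the axis lengths grow, which is the information the adaptive window needs.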

## 7.2 Mean-Shift Tracking

In [2]:
import numpy as np
import cv2
import matplotlib.pyplot as plt
%matplotlib inline

In [1]:
import numpy as np
import cv2 

cap = cv2.VideoCapture(0) # cam input
#cap = cv2.VideoCapture('../../data/hand_move.mp4') # recorded video

# Take first frame of the video
ret,frame = cap.read()

# Set Up the Initial Tracking Window: the first face detected on first cam frame
# We will first detect the face and set that as our starting box.
face_cascade = cv2.CascadeClassifier('../../data/haarcascades/haarcascade_frontalface_default.xml')
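face_rects = face_cascade.detectMultiScale(frame)
# detectMultiScale returns an empty result when no face is found
if len(face_rects) == 0:
    cap.release()
    raise RuntimeError('No face detected in the first frame')
# Convert the first detection to a tuple of plain ints (x,y,w,h)
(face_x,face_y,w,h) = tuple(int(v) for v in face_rects[0]) # we take the first face, which is presumably the best guess
track_window = (face_x,face_y,w,h)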
# Set up the ROI for tracking: rows & cols = y & x
roi = frame[face_y:face_y+h, face_x:face_x+w]

# Use the HSV Color Mapping
hsv_roi =  cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)

# Find histogram to backproject the target on each frame for calculation of meanshift
roi_hist = cv2.calcHist([hsv_roi],[0],None,[180],[0,180])

# Normalize the histogram array values given a min of 0 and max of 255
cv2.normalize(roi_hist,roi_hist,0,255,cv2.NORM_MINMAX)

# Set up the termination criteria: stop after 10 iterations or when the window moves by less than 1 pt
term_crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)

ret = True
while ret:
    ret ,frame = cap.read()
    if ret == True:
        
        # Grab the Frame in HSV
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        
        # Calculate the Back Projection based off the roi_hist created earlier
        # Backprojection is a kind of matching approach:
        # - The histogram of a template image is computed, e.g., of a hand (here: the 1D hue histogram of the face ROI)
        # - We take another (hand) image, and define a BackProjection image, still empty
        # - The pixel values are the probability that the pixel in the test image belongs to a skin area
        # - These are computed as follows:
        #     - The pixel hue is read and its bin in the template histogram is visited
        #     - BackProjection pixel value = normalized histogram value
        dst = cv2.calcBackProject([hsv],[0],roi_hist,[0,180],1)
        
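        # Apply meanshift to get the new coordinates of the rectangle
        # We pass the backprojection probability image to find the window to track with mean-shift
        # meanShift returns the iteration count (possibly 0), so we keep it out of `ret`, which controls the loop
        n_iter, track_window = cv2.meanShift(dst, track_window, term_crit)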
        
        # Draw the new rectangle on the image
        x,y,w,h = track_window
        img2 = cv2.rectangle(frame, (x,y), (x+w,y+h), (0,0,255),5)
        
        # Display image
        cv2.imshow('img2',img2)
        
        # Wait for ESC
        k = cv2.waitKey(1) & 0xff
        if k == 27:
            break        
    else:
        break
# Clear
cv2.destroyAllWindows()
cap.release()

**Observations**:
- The tracked window keeps a fixed size: it does not resize as we move towards or away from the camera
- We can trick the tracker by covering our face; when we show it again within the window, it continues to track
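The back-projection step described in the code comments can be reproduced in plain NumPy for the 1D hue case: each pixel is replaced by the histogram value of its hue bin. This is a sketch for illustration, with a hypothetical helper `back_project_hue` and toy data; it assumes 180 hue bins of width 1 and a histogram already scaled to 0..255.

```python
import numpy as np

def back_project_hue(hue, hist):
    """Replace every hue value by the (scaled) histogram value of its bin."""
    # hue: 2D uint8 array with values in [0, 180); hist: 180 values in [0, 255]
    lut = np.rint(np.clip(np.asarray(hist).ravel(), 0, 255)).astype(np.uint8)
    return lut[hue]  # table lookup, one histogram value per pixel

# Toy target (assumption): mostly hue 10, with a little hue 12
target_hue = np.array([[10, 10], [10, 12]], dtype=np.uint8)
hist = np.bincount(target_hue.ravel(), minlength=180).astype(np.float32)
# Scale so the tallest bin becomes 255; empty bins stay at 0
hist = hist / hist.max() * 255

# Toy frame (assumption): hue 90 never occurs in the target
frame_hue = np.array([[10, 90], [12, 10]], dtype=np.uint8)
backproj = back_project_hue(frame_hue, hist)
print(backproj)
```

Pixels whose hue is frequent in the target get high values, and pixels with hues absent from the target get 0; mean-shift then follows the bright region.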

## 7.3 CamShift Tracking

Basically, we use the same code; only a few lines change:
- `cv2.CamShift()` is used instead of `cv2.meanShift()`
- drawing functions are changed, since now we get an oriented bounding box (OBB) that varies in size and orientation
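The first value returned by `cv2.CamShift()` is a rotated rectangle of the form `((cx, cy), (w, h), angle)`, with the angle in degrees. The sketch below shows how such a triple maps to four corner points. It is a NumPy illustration with a hypothetical helper `rotated_box_corners`; in the actual cell `cv2.boxPoints()` does this job, and its corner order may differ.

```python
import numpy as np

def rotated_box_corners(rot_rect):
    """Return the 4 corners of a ((cx, cy), (w, h), angle_deg) rectangle."""
    (cx, cy), (w, h), angle_deg = rot_rect
    theta = np.deg2rad(angle_deg)
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
    # Corners of the unrotated box, relative to its center
    half = np.array([[-w / 2, -h / 2],
                     [ w / 2, -h / 2],
                     [ w / 2,  h / 2],
                     [-w / 2,  h / 2]])
    # Rotate around the center, then translate to the center position
    return half @ rot.T + np.array([cx, cy])

# Hypothetical rectangle: center (10, 10), 4 wide, 2 tall, not rotated
corners = rotated_box_corners(((10.0, 10.0), (4.0, 2.0), 0.0))
print(corners)
```

Because the corners are floats, they have to be converted to integers before being passed to a drawing function such as `cv2.polylines()`.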

In [1]:
import numpy as np
import cv2
import matplotlib.pyplot as plt
%matplotlib inline

In [None]:
import numpy as np
import cv2 

cap = cv2.VideoCapture(0) # cam input
#cap = cv2.VideoCapture('../../data/hand_move.mp4') # recorded video

# Take first frame of the video
ret,frame = cap.read()

# Set Up the Initial Tracking Window: the first face detected on first cam frame
# We will first detect the face and set that as our starting box.
face_cascade = cv2.CascadeClassifier('../../data/haarcascades/haarcascade_frontalface_default.xml')
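face_rects = face_cascade.detectMultiScale(frame)
# detectMultiScale returns an empty result when no face is found
if len(face_rects) == 0:
    cap.release()
    raise RuntimeError('No face detected in the first frame')
# Convert the first detection to a tuple of plain ints (x,y,w,h)
(face_x,face_y,w,h) = tuple(int(v) for v in face_rects[0]) # we take the first face, which is presumably the best guess
track_window = (face_x,face_y,w,h)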
# Set up the ROI for tracking: rows & cols = y & x
roi = frame[face_y:face_y+h, face_x:face_x+w]

# Use the HSV Color Mapping
hsv_roi =  cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)

# Find histogram to backproject the target on each frame for calculation of meanshift
roi_hist = cv2.calcHist([hsv_roi],[0],None,[180],[0,180])

# Normalize the histogram array values given a min of 0 and max of 255
cv2.normalize(roi_hist,roi_hist,0,255,cv2.NORM_MINMAX)

# Set up the termination criteria: stop after 10 iterations or when the window moves by less than 1 pt
term_crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)

ret = True
while ret:
    ret ,frame = cap.read()
    if ret == True:
        
        # Grab the Frame in HSV
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        
        # Calculate the Back Projection based off the roi_hist created earlier
        dst = cv2.calcBackProject([hsv],[0],roi_hist,[0,180],1)
        
        ##########
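        # Apply camshift to get the new rotated rectangle and the updated search window
        # The rotated rectangle gets its own name so that `ret` (the loop flag) is not overwritten
        rot_rect, track_window = cv2.CamShift(dst, track_window, term_crit)
        
        # Draw the OBB on the image
        # Watch out: it varies in orientation and size
        pts = cv2.boxPoints(rot_rect)
        # Integer corner points for drawing (np.int0 was removed in NumPy 2.0)
        pts = pts.astype(np.int32)
        img2 = cv2.polylines(frame, [pts], True, (0,0,255), 5)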
        ##########
                
        # Display image
        cv2.imshow('img2',img2)
        
        # Wait for ESC
        k = cv2.waitKey(1) & 0xff
        if k == 27:
            break        
    else:
        break
# Clear
cv2.destroyAllWindows()
cap.release()