# Tracking
Tracking is a process of following objects or features across a sequence of images or frames in a video. It locates and identifies objects of interest over time.



![](https://nanonets.com/blog/content/images/2019/04/sparse-vs-dense.gif)

![](https://s3-us-west-2.amazonaws.com/static.pyimagesearch.com/opencv-multi-object-tracking/opencv_multi_object_tracking_01.gif) ![](https://s3-us-west-2.amazonaws.com/static.pyimagesearch.com/multi-object-tracking-dlib/multi_object_tracking_dlib_result.gif)


In [1]:
# We need to import a few packages.
import cv2
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as patches

# Optical Flow

Optical Flow is a technique used to estimate the motion of objects in a sequence of images or video frames. It analyzes the apparent motion of pixels between consecutive frames. It assumes that nearby pixels in an image generally move together.

Optical flow algorithms attempt to find the correspondence between pixels in one frame and pixels in the other frame. This correspondence is then represented as a vector field, where each vector indicates the direction and magnitude of motion for a particular pixel.

# Sparse Optical Flow

Sparse optical flow computes these motion vectors only for a subset of pixels in the image.

How do you think we'll select this subset of pixels?

![](https://docs.opencv.org/3.4/optical_flow_basic1.jpg)


Consider a pixel with intensity $I(x,y,t)$ in the first frame. It moves by a distance $(dx, dy)$ in the next frame, taken after time $dt$. Since it is the same pixel and we assume its intensity does not change, we can say,

$I(x,y,t) = I(x+dx, y+dy, t+dt)$

Then take the Taylor series approximation of the right-hand side, remove the common terms, and divide by $dt$ to get the following equation:

$f_x u + f_y v + f_t = 0 \;$

where $f$ denotes the image intensity $I$ and:

$f_x = \frac{\partial f}{\partial x} \; ; \; f_y = \frac{\partial f}{\partial y} \; ; \; f_t = \frac{\partial f}{\partial t}$

$u = \frac{dx}{dt} \; ; \; v = \frac{dy}{dt}$
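
For reference, the Taylor step can be written out as follows. This is a sketch that keeps only first-order terms, which assumes the displacement $(dx, dy)$ and the time step $dt$ are small:

$I(x+dx, y+dy, t+dt) \approx I(x,y,t) + \frac{\partial I}{\partial x} dx + \frac{\partial I}{\partial y} dy + \frac{\partial I}{\partial t} dt$

Substituting this into the brightness-constancy equation cancels $I(x,y,t)$ on both sides:

$\frac{\partial I}{\partial x} dx + \frac{\partial I}{\partial y} dy + \frac{\partial I}{\partial t} dt = 0$

Dividing by $dt$, and writing $f_x$, $f_y$, $f_t$ for the three partial derivatives, gives $f_x u + f_y v + f_t = 0$.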

The equation above is called the optical flow equation. In it, $f_x$ and $f_y$ are the spatial image gradients and $f_t$ is the gradient along time, all of which can be computed from the frames. But $(u, v)$ is unknown, and one equation cannot be solved for two unknowns. Several methods address this problem; one of them is Lucas-Kanade.

The Lucas-Kanade method takes a 3x3 patch around the point and assumes that all 9 pixels in it have the same motion. We can find $(f_x, f_y, f_t)$ for each of these 9 pixels, so the problem becomes solving 9 equations in two unknowns, which is over-determined. A least-squares fit reduces it to the two-equation, two-unknown system below:

$\begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} \sum_{i}{f_{x_i}}^2 & \sum_{i}{f_{x_i} f_{y_i} } \\ \sum_{i}{f_{x_i} f_{y_i}} & \sum_{i}{f_{y_i}}^2 \end{bmatrix}^{-1} \begin{bmatrix} - \sum_{i}{f_{x_i} f_{t_i}} \\ - \sum_{i}{f_{y_i} f_{t_i}} \end{bmatrix}$
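
As a minimal NumPy sketch of the closed-form solution above, the snippet below builds the 2x2 system from per-pixel gradients and solves it. The helper name `lucas_kanade_patch` and the synthetic gradient values are illustrative assumptions, not part of OpenCV; the gradients are constructed to satisfy the optical flow equation for a known motion so that the result can be checked.

```python
import numpy as np

def lucas_kanade_patch(fx, fy, ft):
    """Least-squares flow (u, v) for one patch, given its gradients."""
    fx, fy, ft = (np.asarray(a, dtype=np.float64).ravel() for a in (fx, fy, ft))
    # 2x2 matrix of summed gradient products (left-hand side of the system).
    A = np.array([[np.sum(fx * fx), np.sum(fx * fy)],
                  [np.sum(fx * fy), np.sum(fy * fy)]])
    # Right-hand side: negated sums of spatial-temporal gradient products.
    b = -np.array([np.sum(fx * ft), np.sum(fy * ft)])
    # np.linalg.solve raises LinAlgError if A is singular (e.g. a flat patch).
    return np.linalg.solve(A, b)

# Synthetic 3x3 patch (9 pixels) with a known motion of (u, v) = (1.0, -0.5).
rng = np.random.default_rng(0)
fx = rng.normal(size=9)
fy = rng.normal(size=9)
true_u, true_v = 1.0, -0.5
ft = -(fx * true_u + fy * true_v)  # satisfies f_x u + f_y v + f_t = 0 exactly

u, v = lucas_kanade_patch(fx, fy, ft)
print(u, v)  # close to the known motion, up to floating-point error
```

If the patch has no texture, or all its gradients point the same way, the 2x2 matrix is singular and cannot be inverted. This is one reason corner-like points are preferred for tracking.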


# 2. Kanade-Lucas-Tomasi (KLT) Tracking
We use OpenCV to track some **feature points** in a video. To choose the points, we use *cv2.goodFeaturesToTrack()*. We take the first frame, detect some Shi-Tomasi corner points in it, then iteratively track those points using Lucas-Kanade optical flow. To the function *cv2.calcOpticalFlowPyrLK()* we pass the previous frame, the previous points, and the next frame. It returns the next points along with status flags, each of which is 1 if the corresponding next point was found and 0 otherwise. In each step, we pass these next points as the previous points of the following step.
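
The status-based filtering can be illustrated without a video. The sketch below uses made-up arrays in the `(N, 1, 2)` point layout and `(N, 1)` status layout that *cv2.calcOpticalFlowPyrLK()* works with in Python; the helper name `keep_tracked` is a hypothetical name chosen for illustration.

```python
import numpy as np

def keep_tracked(prev_pts, next_pts, status):
    """Return (new, old) point pairs whose status flag is 1, as (M, 2) arrays."""
    ok = status.reshape(-1) == 1
    return next_pts[ok].reshape(-1, 2), prev_pts[ok].reshape(-1, 2)

# Made-up data: three points, the second one was lost (status 0).
p0 = np.array([[[10., 20.]], [[30., 40.]], [[50., 60.]]], dtype=np.float32)
p1 = p0 + np.float32(1.5)
st = np.array([[1], [0], [1]], dtype=np.uint8)

good_new, good_old = keep_tracked(p0, p1, st)

# Reshape back to (M, 1, 2) before using the points as prevPts in the next step.
p0_next = good_new.reshape(-1, 1, 2)
print(good_new.shape, p0_next.shape)
```

Only the surviving points are carried forward, so the number of tracked points can shrink from frame to frame.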

## 2.1 goodFeaturesToTrack Features Tracking

In [2]:
### read 1.mp4 using opencv
cap = cv2.VideoCapture('1.mp4')

frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
size = (frame_width, frame_height)
result = cv2.VideoWriter('tracking1.avi',
                         cv2.VideoWriter_fourcc(*'MJPG'),
                         30, size)


# params for ShiTomasi corner detection
feature_params = dict( maxCorners = 100,
                       qualityLevel = 0.3,
                       minDistance = 7,
                       blockSize = 7 )
# maxCorners	Maximum number of corners to return. If more corners are found than this, only the strongest ones are returned.
# qualityLevel	Parameter characterizing the minimal accepted quality of image corners. The parameter value is multiplied by the best corner quality measure.
#               The corners with the quality measure less than the product are rejected.
#               For example, if the best corner has the quality measure = 1500, and the qualityLevel=0.01 ,
#               then all the corners with the quality measure less than 15 are rejected.
# minDistance	Minimum possible Euclidean distance between the returned corners.
# blockSize	    Size of an average block for computing a derivative covariation matrix over each pixel neighborhood.


# Parameters for lucas kanade optical flow
lk_params = dict( winSize  = (15, 15),
                  maxLevel = 2,
                  criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 0.03))
# winSize	    Size of the search window at each pyramid level.
# maxLevel	    0-based maximal pyramid level number; if set to 0, pyramids are not used (single level), if set to 1, two levels are used
# criteria	    Termination criteria of the iterative search: stop after the maximum number of iterations (10 here)
#               or when the search window moves by less than epsilon (0.03 here).


# Create some random colors
color = np.random.randint(0, 255, (100, 3))

### read a frame from video using opencv
ret, old_frame = cap.read()

old_gray = cv2.cvtColor(old_frame, cv2.COLOR_BGR2GRAY)
p0 = cv2.goodFeaturesToTrack(old_gray, mask = None, **feature_params)
# Create a mask image for drawing purposes
mask = np.zeros_like(old_frame)
while(1):
    ret, frame = cap.read()
    if not ret:
        print('No frames grabbed!')
        break
    frame_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # calculate optical flow
    p1, st, err = cv2.calcOpticalFlowPyrLK(prevImg=old_gray, nextImg=frame_gray, prevPts=p0, nextPts=None, **lk_params)
    # Stop if no points could be tracked; otherwise keep only points with status 1
    if p1 is None:
        print('No points left to track!')
        break
    good_new = p1[st==1]
    good_old = p0[st==1]
    # draw the tracks
    for i, (new, old) in enumerate(zip(good_new, good_old)):
        a, b = new.ravel()
        c, d = old.ravel()
        mask = cv2.line(mask, (int(a), int(b)), (int(c), int(d)), color[i].tolist(), 2)
        frame = cv2.circle(frame, (int(a), int(b)), 5, color[i].tolist(), -1)
    img = cv2.add(frame, mask)

    result.write(img)

    # Now update the previous frame and previous points
    old_gray = frame_gray.copy()
    p0 = good_new.reshape(-1, 1, 2)

cap.release()
result.release()

No frames grabbed!


## 2.2 SIFT Features Tracking

In [3]:
cap = cv2.VideoCapture('1.mp4')

frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
size = (frame_width, frame_height)
result = cv2.VideoWriter('tracking2.avi',
                         cv2.VideoWriter_fourcc(*'MJPG'),
                         30, size)


# Parameters for lucas kanade optical flow
lk_params = dict( winSize  = (15, 15),
                  maxLevel = 2,
                  criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 0.03))
# winSize	    Size of the search window at each pyramid level.
# maxLevel	    0-based maximal pyramid level number; if set to 0, pyramids are not used (single level), if set to 1, two levels are used
# criteria	    Termination criteria of the iterative search: stop after the maximum number of iterations (10 here)
#               or when the search window moves by less than epsilon (0.03 here).


# Create some random colors
color = np.random.randint(0, 255, (100, 3))
# Take the first frame and convert it to grayscale; SIFT keypoints are detected in it below
ret, old_frame = cap.read()
old_gray = cv2.cvtColor(old_frame, cv2.COLOR_BGR2GRAY)

### Create SIFT feature detector using opencv
feature_detector = cv2.SIFT_create()

### Detect SIFT features in old_frame
keypoints, descriptors = feature_detector.detectAndCompute(old_gray, None)

p0 = np.zeros((len(keypoints), 1, 2), dtype=np.float32)
for i in range(len(keypoints)):
    p0[i, 0, 0] = keypoints[i].pt[0]
    p0[i, 0, 1] = keypoints[i].pt[1]

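# Keep at most 100 distinct keypoints, chosen at random (one per color)
random_index = np.random.choice(len(keypoints), size=min(100, len(keypoints)), replace=False)
p0 = p0[random_index]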
# Create a mask image for drawing purposes
mask = np.zeros_like(old_frame)
while(1):
    ret, frame = cap.read()
    if not ret:
        print('No frames grabbed!')
        break
    frame_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # calculate optical flow
    p1, st, err = cv2.calcOpticalFlowPyrLK(old_gray, frame_gray, p0, None, **lk_params)
    # Stop if no points could be tracked; otherwise keep only points with status 1
    if p1 is None:
        print('No points left to track!')
        break
    good_new = p1[st==1]
    good_old = p0[st==1]
    # draw the tracks
    for i, (new, old) in enumerate(zip(good_new, good_old)):
        a, b = new.ravel()
        c, d = old.ravel()
        mask = cv2.line(mask, (int(a), int(b)), (int(c), int(d)), color[i].tolist(), 2)
        frame = cv2.circle(frame, (int(a), int(b)), 5, color[i].tolist(), -1)
    img = cv2.add(frame, mask)

    result.write(img)

    # Now update the previous frame and previous points
    old_gray = frame_gray.copy()
    p0 = good_new.reshape(-1, 1, 2)

cap.release()
result.release()

No frames grabbed!


# Dense Optical Flow
The Lucas-Kanade method computes optical flow for a sparse feature set. OpenCV provides another algorithm to find the **dense optical flow**, which computes the optical flow for all the points in the frame. It is based on Gunnar Farnebäck’s algorithm, which is explained in “Two-Frame Motion Estimation Based on Polynomial Expansion” by Gunnar Farnebäck (2003).

The sample below shows how to find the dense optical flow using this algorithm. We get a 2-channel array of optical flow vectors (u,v), and find their magnitude and direction. We color-code the result for better visualization: direction corresponds to the Hue value of the image, and magnitude corresponds to the Value plane.

![](https://upload.wikimedia.org/wikipedia/commons/4/4e/HSV_color_solid_cylinder.png)
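
To make the color coding concrete, here is a NumPy-only sketch of the same direction-to-hue and magnitude-to-value mapping on a tiny hand-made flow field. The function name `flow_to_hue_value` is a hypothetical helper, and the min-max scaling is written by hand as an approximation of what the cell below does with *cv2.normalize()*.

```python
import numpy as np

def flow_to_hue_value(flow):
    """Map an (H, W, 2) flow field to hue (0-179) and value (0-255) planes."""
    flow = np.asarray(flow, dtype=np.float64)
    u, v = flow[..., 0], flow[..., 1]
    mag = np.hypot(u, v)
    ang = np.arctan2(v, u) % (2 * np.pi)  # direction in [0, 2*pi)
    # 8-bit OpenCV hue spans 0..179, so the angle in degrees is halved.
    hue = (np.rint(np.degrees(ang) / 2) % 180).astype(np.uint8)
    # Min-max scale the magnitude to 0..255; a constant field maps to 0.
    span = mag.max() - mag.min()
    val = np.zeros_like(mag) if span == 0 else (mag - mag.min()) / span * 255
    return hue, val.astype(np.uint8)

# Three pixels with known motion (image y axis points down).
flow = np.zeros((1, 3, 2))
flow[0, 0] = (1.0, 0.0)   # moving right
flow[0, 1] = (0.0, 2.0)   # moving down
flow[0, 2] = (-3.0, 0.0)  # moving left

hue, val = flow_to_hue_value(flow)
print(hue, val)
```

Because the magnitude is rescaled per frame, the brightest pixel always marks the fastest motion in that frame, not a fixed speed.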

In [4]:
cap = cv2.VideoCapture(cv2.samples.findFile("1.mp4"))

frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
size = (frame_width, frame_height)
result = cv2.VideoWriter('tracking3.avi',
                         cv2.VideoWriter_fourcc(*'MJPG'),
                         30, size)

ret, frame1 = cap.read()
prvs = cv2.cvtColor(frame1, cv2.COLOR_BGR2GRAY)
hsv = np.zeros_like(frame1)
hsv[..., 1] = 255
while(1):
    ret, frame2 = cap.read()
    if not ret:
        print('No frames grabbed!')
        break
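    # Named next_gray to avoid shadowing the built-in next()
    next_gray = cv2.cvtColor(frame2, cv2.COLOR_BGR2GRAY)
    flow = cv2.calcOpticalFlowFarneback(prvs, next_gray, flow=None, pyr_scale=0.5, levels=3, winsize=15, iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    # Hue encodes direction: radians -> degrees, halved to fit OpenCV's 0-179 hue range
    hsv[..., 0] = ang*180/np.pi/2
    # Value encodes magnitude, scaled to 0-255 for each frame
    hsv[..., 2] = cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX)
    bgr = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)

    result.write(bgr)

    prvs = next_gray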

cap.release()
result.release()

No frames grabbed!


# Homework 10

In this homework, you'll do object tracking on a custom video of your choice. The video should contain moving objects and should be at least 10 seconds long.
You need to perform the following:


1.   Sparse Optical Flow - You can use either KLT Tracking or SIFT Feature Tracking.
2.   Dense Optical Flow

Generate videos for both optical flow methods, similar to what we did above, and upload them to the Google Form.

You might need to tune the parameters to achieve reasonable tracking.

Upload your videos to [HW10 Google Form](https://forms.gle/q7gsTYEDKBvvcE2t7)