# <center> 👉 class_10_5 » _Object Tracking - Meanshift and Camshift_ </center>

OpenCV is a large open-source library for computer vision, machine learning, and image processing. It plays a major role in real-time applications, which matter a great deal in today's systems.   
With it, one can process images and videos to identify objects, faces, or even human handwriting. 


* **Dense Optical Flow:** These algorithms help estimate the motion vector of every pixel in a video frame.
* **Sparse Optical Flow:** These algorithms, such as the Kanade-Lucas-Tomasi (KLT) feature tracker, track the location of a few feature points in an image.
* **Kalman Filtering:** A very popular signal processing algorithm used to predict the location of a moving object based on prior motion information.   
>- One of the early applications of this algorithm was missile guidance!   
>- Also "the on-board computer that guided the descent of the Apollo 11 lunar module to the moon had a Kalman filter".  
* **Meanshift and Camshift:** These are algorithms for locating the maxima of a density function. They are also used for tracking.
* **Single Object Trackers:** In this class of trackers, the first frame is marked using a rectangle to indicate the location of the object we want to track.   
The object is then tracked in subsequent frames using the tracking algorithm.   
In most real-life applications, these trackers are used in conjunction with an object detector.
* **Multiple Object Track finding algorithms:** In cases when we have a fast object detector, it makes sense to detect multiple objects in each frame and then run a track finding algorithm that identifies which rectangle in one frame corresponds to a rectangle in the next frame.
>- Multiple Object Tracking has come a long way.   
>- It uses object detection and novel motion prediction algorithms to get accurate tracking information.   
>- For example, DeepSORT, which is based on SORT, is often paired with a fast detector such as YOLO to keep inference fast.

## ▣ Meanshift and Camshift

## ▶ Meanshift 

The basic idea behind Meanshift is to treat each video frame as a distribution of pixel values and to follow the densest part of that distribution.   
We define an initial window, generally a square or a circle, at a position we specify ourselves, and the algorithm shifts it toward the area of maximum pixel density.   
As the video runs, the tracking window keeps moving toward that area, so it follows the object.   
The direction of each shift is given by the difference between the center of the tracking window and the centroid of the pixels inside that window.
Meanshift is __a very useful method for keeping track of a particular object in a video.__   
Because the window follows the target's pixel distribution, Meanshift can __distinguish a moving foreground object from the static background of a video.__
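
The shift toward the centroid can be sketched in plain NumPy. This is only an illustration of the idea, not OpenCV's implementation: the function name `mean_shift_window`, the synthetic probability image `prob`, and the stopping rule are assumptions made for this example.

```python
import numpy as np

def mean_shift_window(prob, window, max_iter=10, eps=1.0):
    """Shift an (x, y, w, h) window toward the local density peak of `prob`."""
    x, y, w, h = window
    rows, cols = prob.shape
    for _ in range(max_iter):
        patch = prob[y:y + h, x:x + w].astype(np.float64)
        mass = patch.sum()
        if mass == 0:                        # nothing to follow inside the window
            break
        ys, xs = np.mgrid[0:patch.shape[0], 0:patch.shape[1]]
        cx = (xs * patch).sum() / mass       # centroid, relative to the window
        cy = (ys * patch).sum() / mass
        # Shift = centroid minus window center
        dx = int(round(cx - (patch.shape[1] - 1) / 2))
        dy = int(round(cy - (patch.shape[0] - 1) / 2))
        x = min(max(x + dx, 0), cols - w)    # keep the window inside the image
        y = min(max(y + dy, 0), rows - h)
        if dx * dx + dy * dy < eps * eps:    # converged: the shift is below eps
            break
    return x, y, w, h

# Hypothetical probability image with one 40x40 "object" blob
prob = np.zeros((200, 200), dtype=np.uint8)
prob[100:140, 100:140] = 255

print(mean_shift_window(prob, (80, 80, 40, 40)))   # window that overlaps the blob
print(mean_shift_window(prob, (0, 0, 40, 40)))     # window with no object inside
```

The first window drifts onto the blob within a few iterations, while the second one never moves because there is no mass inside it to pull it anywhere.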


        
To use Meanshift in OpenCV, we first set up the target and compute its histogram, so that the target can be back-projected onto each frame for the Meanshift calculation.   
We also need to provide the initial location of the window.   
Only the Hue channel is used for the histogram here.   
To avoid false values caused by low light, low-light pixels can be discarded with the `cv2.inRange()` function. Note that the code cells in this notebook do not apply this mask.         

        
Examples: 

1. The tracking window follows the football and the football player. 

<img src='./images/practice_img/obj0.png' align='left' width=400 height=400><img src='./images/practice_img/obj1.png'  width=400 height=400> 

### ■ Disadvantages of using meanshift  

There are two main disadvantages of using Meanshift for object tracking.  

- The size of the tracking window stays the same regardless of the object's distance from the camera.  
- The window tracks the object only when it already overlaps the region of that object, so the initial position of the window must be chosen (or hardcoded) carefully.

```python
import cv2
import numpy as np

col, width, row, height = -1, -1, -1, -1     # ROI corner and size (not yet selected)
frame = None                                 # frame shown in the window
frame2 = None                                # clean copy of the paused frame
inputmode = False                            # True while an ROI is being selected
rectangle = False                            # True while the left button is held down
trackWindow = None                           # (x, y, w, h) search window
roi_hist = None                              # hue histogram of the selected ROI

def onMouse(event, x, y, flags, param):
    global col, width, row, height, frame, frame2, inputmode
    global rectangle, roi_hist, trackWindow

    if inputmode:                          # enabled by pressing the space bar
        if event == cv2.EVENT_LBUTTONDOWN:
            rectangle = True
            col, row = x, y
        elif event == cv2.EVENT_MOUSEMOVE:
            if rectangle:
                frame = frame2.copy()      # redraw on a clean copy of the frame
                cv2.rectangle(frame, (col, row), (x, y), (0, 255, 0), 2)
                cv2.imshow('frame', frame)
        elif event == cv2.EVENT_LBUTTONUP:
            inputmode = False
            rectangle = False
            cv2.rectangle(frame, (col, row), (x, y), (0, 255, 0), 2)
            height, width = abs(row - y), abs(col - x)
            col, row = min(col, x), min(row, y)   # top-left corner for any drag direction
            if width == 0 or height == 0:         # a plain click selects nothing
                return
            trackWindow = (col, row, width, height)
            roi = frame2[row:row+height, col:col+width]   # crop the clean frame, not the one with the green box
            roi = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
            roi_hist = cv2.calcHist([roi], [0], None, [180], [0, 180])   # 1-D hue histogram, hue values 0~179
            cv2.normalize(roi_hist, roi_hist, 0, 255, cv2.NORM_MINMAX)   # scale bin counts to 0~255
    return

def meanShift(file):
    global frame, frame2, inputmode, trackWindow, roi_hist
    cap = cv2.VideoCapture(file)
    if not cap.isOpened():                 # VideoCapture does not raise on a bad path
        print('Cannot open video:', file)
        return
    cv2.namedWindow('frame')
    cv2.setMouseCallback('frame', onMouse)
    # stop after 10 iterations or once the window moves by less than 1 pixel
    termination = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)
    while True:
        ret, frame = cap.read()
        if not ret:
            break
        if trackWindow is not None:
            hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
            dst = cv2.calcBackProject([hsv], [0], roi_hist, [0, 180], 1)   # likelihood of the ROI hue per pixel
            cv2.imshow('dst', dst)
            ret, trackWindow = cv2.meanShift(dst, trackWindow, termination)
            x, y, w, h = trackWindow
            cv2.rectangle(frame, (x, y), (x+w, y+h), (0, 255, 0), 2)

        cv2.imshow('frame', frame)
        k = cv2.waitKey(55)
        if k == 27:                        # Esc quits
            break

        if k == ord(' '):                  # space bar pauses for ROI selection
            print("select Area for meanShift and Enter a Key")
            inputmode = True
            frame2 = frame.copy()

            while inputmode:
                cv2.imshow('frame', frame)
                cv2.waitKey(0)

    cap.release()
    cv2.destroyAllWindows()

meanShift('./Videos/boy-walking.mp4')
```

select Area for meanShift and Enter a Key


## ▶ Camshift  

With Meanshift, the window always has the same size whether the target is very far from or very close to the camera.   
That is not good. We need to adapt the window size to the size and rotation of the target.   
The solution came from "OpenCV Labs" and is called CAMshift (Continuously Adaptive Meanshift), published by Gary Bradski in his 1998 paper "Computer Vision Face Tracking for Use in a Perceptual User Interface" [33].
CAMshift tackles the scale problem by varying the window size while applying Meanshift.   
- The first two steps (building the target histogram and setting the initial window) are the same as for Meanshift.   
- In the third step, we compute the back-projection image and then use the OpenCV function `cv2.CamShift()` to track the position of the object in the new frame.   
- This function finds the object center using Meanshift and then adjusts the window size.   
- This function returns a rotated rectangle that describes the object's position, size, and orientation, along with the updated search window.  

It applies Meanshift first. Once Meanshift converges, it updates the size of the window as $s = 2 \times \sqrt{\frac{M_{00}}{256}}$, where $M_{00}$ is the zeroth moment (the sum of the back-projection values) inside the window. It also calculates the orientation of the best-fitting ellipse.   
It then applies Meanshift again with the newly scaled search window and the previous window location.   
The process continues until the required accuracy is met.
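
The window-size rule can be checked numerically with a few lines of NumPy. The helper name `camshift_window_side` and the synthetic patch are assumptions for this example, and OpenCV's own `cv2.CamShift()` may derive the window differently in detail.

```python
import numpy as np

def camshift_window_side(prob_patch):
    """Window side length s = 2 * sqrt(M00 / 256) for an 8-bit back-projection patch."""
    m00 = float(prob_patch.sum())            # zeroth moment: sum of all values
    return 2.0 * np.sqrt(m00 / 256.0)

# Hypothetical patch: a fully saturated 40x40 blob
patch = np.zeros((100, 100), dtype=np.uint8)
patch[30:70, 30:70] = 255

side = camshift_window_side(patch)
print(round(side, 2))
```

Dividing by 256 turns the sum of 8-bit values into an approximate pixel count, and the square root turns that area into a length, so a saturated 40x40 blob gives a side of roughly 80 pixels, about twice the blob's own side.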

```python
import cv2
import numpy as np

col, width, row, height = -1, -1, -1, -1     # ROI corner and size (not yet selected)
frame = None                                 # frame shown in the window
frame2 = None                                # clean copy of the paused frame
inputmode = False                            # True while an ROI is being selected
rectangle = False                            # True while the left button is held down
trackWindow = None                           # (x, y, w, h) search window
roi_hist = None                              # hue histogram of the selected ROI

def onMouse(event, x, y, flags, param):
    global col, width, row, height, frame, frame2, inputmode
    global rectangle, roi_hist, trackWindow
    if inputmode:                          # enabled by pressing the space bar
        if event == cv2.EVENT_LBUTTONDOWN:
            rectangle = True
            col, row = x, y
        elif event == cv2.EVENT_MOUSEMOVE:
            if rectangle:
                frame = frame2.copy()      # redraw on a clean copy of the frame
                cv2.rectangle(frame, (col, row), (x, y), (0, 255, 0), 2)
                cv2.imshow('frame', frame)
        elif event == cv2.EVENT_LBUTTONUP:
            inputmode = False
            rectangle = False
            cv2.rectangle(frame, (col, row), (x, y), (0, 255, 0), 2)
            height, width = abs(row - y), abs(col - x)
            col, row = min(col, x), min(row, y)   # top-left corner for any drag direction
            if width == 0 or height == 0:         # a plain click selects nothing
                return
            trackWindow = (col, row, width, height)
            roi = frame2[row:row+height, col:col+width]   # crop the clean frame, not the one with the green box
            roi = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
            roi_hist = cv2.calcHist([roi], [0], None, [180], [0, 180])   # 1-D hue histogram
            cv2.normalize(roi_hist, roi_hist, 0, 255, cv2.NORM_MINMAX)   # scale bin counts to 0~255
    return

def CamShift(file):
    global frame, frame2, inputmode, trackWindow, roi_hist
    cap = cv2.VideoCapture(file)
    if not cap.isOpened():                 # VideoCapture does not raise on a bad path
        print('Cannot open video:', file)
        return
    cv2.namedWindow('frame')
    cv2.setMouseCallback('frame', onMouse)
    # stop after 10 iterations or once the window moves by less than 1 pixel
    termination = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)
    while True:
        ret, frame = cap.read()
        if not ret:
            break
        if trackWindow is not None:
            hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
            dst = cv2.calcBackProject([hsv], [0], roi_hist, [0, 180], 1)
            cv2.imshow('dst', dst)
            ret, trackWindow = cv2.CamShift(dst, trackWindow, termination)   # ret is a rotated rectangle
            pts = cv2.boxPoints(ret)                  # four corners of the rotated rectangle
            pts = pts.astype(np.int32)                # np.int0 was removed in NumPy 2.0
            cv2.polylines(frame, [pts], True, (0, 255, 0), 2)

        cv2.imshow('frame', frame)
        k = cv2.waitKey(55)
        if k == 27:                        # Esc quits
            break

        if k == ord(' '):                  # space bar pauses for ROI selection
            print("select Area for CamShift and Enter a Key")
            inputmode = True
            frame2 = frame.copy()

            while inputmode:
                cv2.imshow('frame', frame)
                cv2.waitKey(0)

    cap.release()
    cv2.destroyAllWindows()

CamShift('./Videos/boy-walking.mp4')
```

select Area for CamShift and Enter a Key


## ▶ MeanShift Face Track

```python
#  MeanShift Face Track
# https://github.com/prabormukherjee/Object_tracking-opencv/blob/master/03_MeanShift_Tracking.ipynb
import cv2
import numpy as np

cap = cv2.VideoCapture('./Videos/face_track.mp4') # face_track.mp4   chaplin.mp4
ret, frame = cap.read()
if not ret:
    raise RuntimeError('Could not read the first frame of the video')
face_casc = cv2.CascadeClassifier('./cv_data/haarcascade_frontalface_default.xml')
face_rects = face_casc.detectMultiScale(frame)   # detect faces in the first frame
if len(face_rects) == 0:
    raise RuntimeError('No face detected in the first frame')
face_x, face_y, w, h = map(int, face_rects[0])   # first detection, as plain Python ints
track_window = (face_x, face_y, w, h)
roi = frame[face_y: face_y+h, face_x: face_x+w]
hsv_roi = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
roi_hist = cv2.calcHist([hsv_roi], [0], None, [180], [0, 180])   # hue histogram of the face
cv2.normalize(roi_hist, roi_hist, 0, 255, cv2.NORM_MINMAX)
term_crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1) # Set the termination criteria 10 iterations or move 1 pt

while True:
    ret, frame = cap.read()
    if not ret:                                  # end of video
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    dest = cv2.calcBackProject([hsv], [0], roi_hist, [0, 180], 1)
    ret, track_window = cv2.meanShift(dest, track_window, term_crit)
    x, y, w, h = track_window
    img2 = cv2.rectangle(frame, (x, y), (x+w, y+h), (255, 255, 0))
    cv2.imshow('Face Tracker', img2)

    if cv2.waitKey(300) & 0xFF == 27:            # Esc quits
        break

cap.release()
cv2.destroyAllWindows()
```

## ▶ CamShift Face Track

```python
# Import Libraries
import cv2
import numpy as np

cap = cv2.VideoCapture('Videos/face_track.mp4')
ret, frame = cap.read()
if not ret:
    raise RuntimeError('Could not read the first frame of the video')
face_casc = cv2.CascadeClassifier('./cv_data/haarcascade_frontalface_default.xml')
face_rects = face_casc.detectMultiScale(frame)   # detect faces in the first frame
if len(face_rects) == 0:
    raise RuntimeError('No face detected in the first frame')
face_x, face_y, w, h = map(int, face_rects[0])   # first detection, as plain Python ints
track_window = (face_x, face_y, w, h)
roi = frame[face_y: face_y+h, face_x: face_x+w]
hsv_roi = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
roi_hist = cv2.calcHist([hsv_roi], [0], None, [180], [0, 180]) # Histogram to target on each frame for the meanshift calculation
cv2.normalize(roi_hist, roi_hist, 0, 255, cv2.NORM_MINMAX)  # Normalize the histogram
term_crit = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1) # Set the termination criteria 10 iterations or move 1 pt

while True:
    ret, frame = cap.read()
    if not ret:                                  # end of video
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    dest = cv2.calcBackProject([hsv], [0], roi_hist, [0, 180], 1)
    ret, track_window = cv2.CamShift(dest, track_window, term_crit) # Camshift to get the new coordinates of rectangle
    pts = cv2.boxPoints(ret)                     # four corners of the rotated rectangle
    pts = pts.astype(np.int32)                   # np.int0 was removed in NumPy 2.0
    img2 = cv2.polylines(frame, [pts], True, (0, 255, 0), 5)
    cv2.imshow('Cam Shift', img2)

    if cv2.waitKey(300) & 0xFF == 27:            # Esc quits
        break

cap.release()
cv2.destroyAllWindows()
```