## Tracking People

This is an initial, learning-oriented experiment in tracking people in videos with OpenCV.

The OpenCV API gives us several ways to detect objects in images, but what if we want to count how many people walk along a road monitored by a security camera?

Since we probably don't want to identify each person, both because of privacy regulations and because it would be a waste of resources, we need a way to keep track of each person as an unknown, or a simple blob.

To achieve that, several tracking algorithms can be explored. The main components of a tracking algorithm are a model of the target's features, a model of its movement, and a search methodology.

To give an idea, a naive implementation could simply search the entire image for the nearest correlation peak of the extracted features. Another could model the object's movement, for example by computing its speed and generating a probability map that predicts the next position, and then search for the best correlation there. A third could update the extracted features at each iteration, and a fourth could use a deep-learning approach, which is becoming more feasible on common hardware every day.
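As a rough illustration of the movement-model idea, the sketch below predicts the next position from the last two centroids under a constant-velocity assumption and builds a clipped search window around the prediction. The function names, the margin, and the frame size are assumptions made for this example; they are not part of OpenCV or of the code in this notebook.

```python
def predict_next_center(prev_center, curr_center):
    """Constant-velocity guess: next = current + (current - previous)."""
    vx = curr_center[0] - prev_center[0]
    vy = curr_center[1] - prev_center[1]
    return (curr_center[0] + vx, curr_center[1] + vy)


def search_window(center, box_size, margin, frame_size):
    """Return (x, y, w, h) of a window around `center`, clipped to the frame."""
    cx, cy = center
    bw, bh = box_size
    fw, fh = frame_size
    x1 = max(0, int(cx - bw / 2 - margin))
    y1 = max(0, int(cy - bh / 2 - margin))
    x2 = min(fw, int(cx + bw / 2 + margin))
    y2 = min(fh, int(cy + bh / 2 + margin))
    return (x1, y1, x2 - x1, y2 - y1)


# Hypothetical centroids from two consecutive frames of a 640x360 video.
predicted = predict_next_center((100, 200), (104, 198))
window = search_window(predicted, box_size=(20, 40), margin=10, frame_size=(640, 360))
print(predicted, window)
```

A tracker built on this idea would look for the best feature correlation only inside `window` instead of scanning the whole frame, which is where the speed-up over the naive search would come from.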

To be robust, an algorithm has to perform well under several conditions, such as object occlusion, image blur, changes in luminosity, and noise.

To learn about tracking algorithms, my suggestion is to read [this article](https://faculty.ucmerced.edu/mhyang/papers/spie11a.pdf) and [this one](https://arxiv.org/pdf/1812.07368.pdf).

A comparison of the tracking algorithms available in OpenCV is also shown in [this article](https://www.researchgate.net/publication/344247798_Single_Object_Trackers_in_OpenCV_A_Benchmark).


### Automating the Detection

With OpenCV it is fairly simple to load and use one of the several tracking algorithms available in the `opencv-contrib-python` package, like the very fast MOSSE (exposed as `cv2.TrackerMOSSE_create` in version 3.4 and under the `legacy_` names in the release listed further down). But how do we identify the person to be tracked in the first place?
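Because the tracker factories live in different places depending on the OpenCV release, a small lookup helper can hide the difference. The sketch below is only an assumption about how such a helper might look: `resolve_tracker_factory` is a hypothetical name, and the demo uses a stand-in object instead of the real `cv2` module so that it runs without OpenCV installed.

```python
from types import SimpleNamespace


def resolve_tracker_factory(cv_module, name):
    """Find `Tracker<NAME>_create` on the module or on its `legacy` namespace."""
    attr = f"Tracker{name}_create"
    for namespace in (cv_module, getattr(cv_module, "legacy", None)):
        factory = getattr(namespace, attr, None)
        if factory is not None:
            return factory
    raise ValueError(f"Tracker {name!r} is not available in this OpenCV build")


# Stand-in for the real cv2 module (assumption: only these two trackers exist).
fake_cv2 = SimpleNamespace(
    TrackerKCF_create=lambda: "kcf tracker",
    legacy=SimpleNamespace(TrackerMOSSE_create=lambda: "mosse tracker"),
)
print(resolve_tracker_factory(fake_cv2, "KCF")())
print(resolve_tracker_factory(fake_cv2, "MOSSE")())
```

With a real installation, the call would presumably be `resolve_tracker_factory(cv2, "MOSSE")()`. Note that the legacy trackers may not behave identically to the newer ones, so this is a convenience for experimenting rather than a drop-in guarantee.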

Several tutorials about tracking focus only on the tracking itself, leaving it to the user to draw a rectangle with a built-in tool. That rectangle, the Region Of Interest (ROI), is then used as input to the tracker algorithm.

But what if we want it to be automated? Well, I think that is where the fun begins.

An easy strategy could be to narrow the detection by cropping the image to a small zone, like the strip of a crosswalk through which the flow passes, and to run the detection that feeds the people tracker only there. Unlike cars in a lane, where I think this strategy would be very successful, the flow of people can be very chaotic.
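One detail of the cropping strategy is that boxes detected inside the strip are expressed in strip coordinates and have to be shifted back before they are handed to a tracker that works on the full frame. The sketch below shows that bookkeeping only; the strip position and the sample detections are made-up values, and `strip_to_frame` is a hypothetical helper.

```python
def strip_to_frame(bbox, strip):
    """Shift a box detected inside the strip back to full-frame coordinates."""
    x, y, w, h = bbox
    strip_x, strip_y, _, _ = strip
    return (x + strip_x, y + strip_y, w, h)


# Hypothetical crosswalk strip as (x, y, width, height) in a 640x360 frame.
STRIP = (0, 250, 640, 60)

# With a NumPy frame, the crop itself would be frame[y:y + h, x:x + w].
# These boxes stand for what a detector might return on the cropped image.
detections_in_strip = [(120, 5, 30, 50), (400, 8, 28, 48)]
detections_in_frame = [strip_to_frame(b, STRIP) for b in detections_in_strip]
print(detections_in_frame)
```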

That being said, if we use the entire image to detect people, we need to write an integration step that distinguishes a person who was already detected and is being tracked from a person who has just been detected and is not yet tracked.

The simplest way I can think of is to check whether the centroid of the new detection lies outside the rectangle of every person tracked at that moment.
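The centroid test can be sketched in a few lines of plain Python. Boxes are assumed to be `(x, y, w, h)` tuples, as OpenCV detectors and trackers return them; the helper names and the sample boxes are illustrative only.

```python
def centroid(bbox):
    """Centre point of an (x, y, w, h) box."""
    x, y, w, h = bbox
    return (x + w / 2, y + h / 2)


def contains(bbox, point):
    """True if `point` lies strictly inside the (x, y, w, h) box."""
    x, y, w, h = bbox
    px, py = point
    return x < px < x + w and y < py < y + h


def is_new_person(detection, tracked_bboxes):
    """A detection is new when its centroid is outside every tracked box."""
    c = centroid(detection)
    return not any(contains(tracked, c) for tracked in tracked_bboxes)


tracked = [(100, 100, 40, 80)]
print(is_new_person((105, 110, 38, 75), tracked))  # overlaps the tracked box
print(is_new_person((300, 120, 40, 80), tracked))  # far from any tracked box
```

This check is cheap, but it will merge two people whenever one walks in front of the other, so it is a starting point rather than a solution for occlusion.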

### The first experiment

The approach here is very simple:

We are trying to combine the output of an object detection algorithm (which is expensive to run) with an object tracking algorithm (which is cheaper to run), with the objective of keeping track of each person from the first to the last frame in which that person appears. The steps are described below:

- Open the input video and get its shape and FPS.
- Set up the output video with the same shape and FPS as the input video.
- From the input video, get the image of each frame.
- Run a pre-trained full-body or upper-body Haar Cascade Classifier on that image; it should return a list of detected bodies, each in the form of a rectangle.
- From that list, try to identify which detections are new and which were already being tracked.
- Each person has an _ID_, a _failure counter_, and its own _tracker algorithm object_, which is used in the following frames to maintain the right track.
- We set a threshold on the person's failure counter so that we can stop tracking that person.
- For each tracked person, draw its rectangle with a text containing its ID/failure counter on top of it, and also draw its centroid.
- Display the image.
- Save the processed frames into the output video.
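The bookkeeping behind the failure counter and the detection schedule can be sketched without OpenCV. In the example below, `Track`, `step`, and `is_keyframe` are hypothetical names, and the tracker results are hard-coded stand-ins for what a real tracker's `update` call would report.

```python
from dataclasses import dataclass


@dataclass
class Track:
    id: int
    bbox: tuple
    fails: int = 0
    active: bool = True


def step(track, tracker_ok, new_bbox, fails_limit):
    """Apply one tracker result; retire the track after too many failures."""
    if not track.active:
        return
    if tracker_ok:
        track.bbox = new_bbox
        track.fails = 0
    else:
        track.fails += 1
        if track.fails >= fails_limit:
            track.active = False


def is_keyframe(frame_index, keyframe_interval):
    """Run the expensive detector only every `keyframe_interval` frames."""
    return frame_index % keyframe_interval == 0


# One success followed by three consecutive failures, with a limit of 3.
track = Track(id=1, bbox=(10, 10, 20, 40))
results = [(True, (12, 10, 20, 40)), (False, None), (False, None), (False, None)]
for ok, bbox in results:
    step(track, ok, bbox, fails_limit=3)
print(track)
print([i for i in range(25) if is_keyframe(i, 10)])
```

A successful update resets the counter, so only an uninterrupted run of failures retires a person, which gives the tracker some tolerance to short occlusions.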


## References:  
- https://faculty.ucmerced.edu/mhyang/papers/spie11a.pdf
- https://arxiv.org/pdf/1812.07368.pdf
- https://learnopencv.com/object-tracking-using-opencv-cpp-python/  
- https://github.com/adipandas/multi-object-tracker  
- https://github.com/stephanj/basketballVideoAnalysis  
- https://gist.github.com/harshilpatel312/ff08b49fd71a3eeaeb209c91de3dfde1  
- https://www.robots.ox.ac.uk/~joao/circulant/  

In [7]:
import cv2

print('Available Trackers:')
for d in dir(cv2):
    if 'Tracker' in d:
        print('\t -',d)

Available Trackers:
	 - Tracker
	 - TrackerCSRT
	 - TrackerCSRT_Params
	 - TrackerCSRT_create
	 - TrackerGOTURN
	 - TrackerGOTURN_Params
	 - TrackerGOTURN_create
	 - TrackerKCF
	 - TrackerKCF_CN
	 - TrackerKCF_CUSTOM
	 - TrackerKCF_GRAY
	 - TrackerKCF_Params
	 - TrackerKCF_create
	 - TrackerMIL
	 - TrackerMIL_Params
	 - TrackerMIL_create
	 - legacy_MultiTracker
	 - legacy_Tracker
	 - legacy_TrackerBoosting
	 - legacy_TrackerCSRT
	 - legacy_TrackerKCF
	 - legacy_TrackerMIL
	 - legacy_TrackerMOSSE
	 - legacy_TrackerMedianFlow
	 - legacy_TrackerTLD
	 - rapid_OLSTracker
	 - rapid_Tracker


In [18]:
import numpy as np
import cv2  # provided by the opencv-contrib-python package
import time


class PersonTracker:
    def __init__(self, id, frame, bbox, tracking_algorithm, fails_limit, color=(0,255,0), debug=False):
        self.debug = debug
        self.fails_limit = fails_limit
        # Set every attribute up front so the object is usable even when
        # the tracker fails to initialise on this frame
        self.tracking_algorithm = tracking_algorithm
        self.id = int(id)
        self.fails = 0
        self.color = color

        # Pass the bounding box to OpenCV as a tuple of plain Python ints
        bbox = tuple(int(v) for v in bbox)

        # Map each algorithm name to its OpenCV factory function
        OPENCV_OBJECT_TRACKERS = {
#             "boosting": cv2.TrackerBoosting_create,  # opencv 3.4
            "mil": cv2.TrackerMIL_create,
            "kcf": cv2.TrackerKCF_create,
#             "tld": cv2.TrackerTLD_create,  # opencv 3.4
#             "medianflow": cv2.TrackerMedianFlow_create,  # opencv 3.4
            "goturn": cv2.TrackerGOTURN_create,
#             "mosse": cv2.TrackerMOSSE_create,  # opencv 3.4
            "csrt": cv2.TrackerCSRT_create,
        }
        self.tracker = OPENCV_OBJECT_TRACKERS[tracking_algorithm]()
        self.tracker.init(frame, bbox)
        self.active, bbox = self.tracker.update(frame)
        self.bbox = tuple(int(v) for v in bbox)
        self.centroid = self.get_centroid(self.bbox)

        if self.active:
            print(f"New person tracker added with id {id}.")

    def __str__(self):
        return f"id: {self.id}, fails: {self.fails}, active: {self.active}, bbox: {self.bbox}"

    def update(self, frame):
        if not self.active:
            return False

        retval, bbox = self.tracker.update(frame)
        bbox = tuple(int(v) for v in bbox)

        # A box whose top-left corner did not move is treated as stuck
        stuck = (self.bbox[0] == bbox[0]) and (self.bbox[1] == bbox[1])
        if not retval or stuck:
            self.fails += 1
        else:
            self.fails = 0

        # Keep the old box so the log shows the actual transition
        previous_bbox = self.bbox
        self.bbox = bbox

        print(f"Updated id {self.id}, fails: {self.fails}, bbox: {previous_bbox} -> {bbox}")
        if self.fails >= self.fails_limit:
            self.remove()

        return self.active

    def remove(self):
        print(f"Person tracker with id {self.id} removed.")
        self.active = False

    def get_centroid(self, bbox):
        (x, y, w, h) = bbox
        xc = int(x + (w * 0.5))
        yc = int(y + (h * 0.5))
        return xc, yc
        
    def draw(self, frame):
        (x, y, w, h) = np.array(self.bbox, dtype=int)
        cv2.rectangle(frame,
            pt1=(x, y),
            pt2=(x + w, y + h),
            color=self.color,
            thickness=1
        )
        centroid = self.get_centroid(self.bbox)
        cv2.circle(frame,
            center=centroid,
            radius=2,
            color=self.color,
            thickness=1
        )
        cv2.putText(frame,
            text=f"{(int(self.id))}/{int(self.fails)}",
            org=(x,y),
            fontFace=cv2.FONT_HERSHEY_SIMPLEX,
            fontScale=0.4,
            color=self.color,
            thickness=1
        )


class PeopleTracker:
    def __init__(self, debug=False):
        self.debug = debug
        self.trackers = list()

    def __str__(self):
        return "\n".join(str(tracker) for tracker in self.trackers)

    def update(self, frame):
        # Advance every person tracker by one frame
        if len(self.trackers) == 0:
            return False
        for tracker in self.trackers:
            tracker.update(frame)
        return True

    def isPointInsideRect(self, point, rect) -> bool:
        (x, y) = point
        (x1, y1, w, h) = rect
        (x2, y2) = (x1 + w, y1 + h)
        return (x1 < x < x2) and (y1 < y < y2)

    def add(self, frame, bbox, tracking_algorithm="kcf", fails_limit=25):
        id = len(self.trackers) + 1

        tracker = PersonTracker(
            id, frame, bbox, tracking_algorithm, fails_limit
        )
        # Ignore detections on which the tracker could not be initialised
        if not tracker.active:
            return False, None

        # Here is the integration between Detection and Tracking:
        # We are only adding a new person if its centroid resides outside
        # every other tracked person's bbox, otherwise we use that
        # detection to update the already tracked person. Note that this
        # is not the best idea to deal with occlusion.
        isNew = True
        for i in range(len(self.trackers)):
            if self.isPointInsideRect(
                tracker.centroid, self.trackers[i].bbox
            ):
                if self.trackers[i].active:
                    # Reset their fails
                    self.trackers[i].fails = 0
                else:
                    # Reactivate the inactive person with this new tracker,
                    # keeping the original id
                    tracker.id = self.trackers[i].id
                    self.trackers[i] = tracker
                isNew = False
                break

        if isNew:
            self.trackers.append(tracker)

        return isNew, tracker.id

    def remove(self, id):
        # Ids start at 1, list indices at 0
        self.trackers[id - 1].remove()

    def draw(self, frame):
        if len(self.trackers) == 0:
            return False
        for i in range(len(self.trackers)):
            if self.trackers[i].active == True:
                self.trackers[i].draw(frame)



In [43]:
# Open the input video capture
# minSize, maxSize, input_filename = ((100,100), (120,120), './1080p_TownCentreXVID.mp4')
# minSize, maxSize, input_filename = (None, (120,120), './720p_TownCentreXVID.mp4')
# minSize, maxSize, input_filename = ((20,20), (80,80), './480p_TownCentreXVID.mp4')
minSize, maxSize, input_filename = ((5,10), (60,40), './360p_TownCentreXVID.mp4')
vcap = cv2.VideoCapture(input_filename)

# Get video properties
frame_width = int(vcap.get(cv2.CAP_PROP_FRAME_WIDTH))
frame_height = int(vcap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = vcap.get(cv2.CAP_PROP_FPS)
n_frames = int(vcap.get(cv2.CAP_PROP_FRAME_COUNT))

print("Frame width:", frame_width)
print("Frame height:", frame_height)
print("Video fps:", fps)

# Setup the output video file
output_filename = './output.mp4'
apiPreference = cv2.CAP_FFMPEG
fourcc = cv2.VideoWriter_fourcc(*'mp4v')
vout = cv2.VideoWriter(
    filename=output_filename,
    apiPreference=apiPreference,
    fourcc=fourcc,
    fps=fps,
    frameSize=(frame_width, frame_height),
)

print(f"Processing \"{input_filename}\" ({int(n_frames)} frames)...")

# Create our body classifier
detector = cv2.CascadeClassifier(
#     cv2.data.haarcascades + 'haarcascade_fullbody.xml'
    cv2.data.haarcascades + 'haarcascade_upperbody.xml'
)
# Create our People Tracker
people = PeopleTracker()

# Start app
window_name = "People Tracking"
cv2.startWindowThread()
cv2.namedWindow(window_name)

# Loop each frame
frame_count = 0
frames_to_process = 1000
processed_frames = np.zeros(frames_to_process, dtype=object)

# OpenCV colors are in BGR order
green = (0, 255, 0)
red = (0, 0, 255)

# start timer
start = time.time()
keyframe_interval = 10
fps_timer = [0, cv2.getTickCount()]
while vcap.isOpened():
    # Read a frame
    retval, frame = vcap.read()
    if not retval or frame_count == frames_to_process:
        break
    
    # Use the classifier to detect new people
    if frame_count % keyframe_interval == 0:
        bboxes = detector.detectMultiScale(frame,
            scaleFactor=1.05,
            minNeighbors=2,
            minSize=minSize,
            maxSize=maxSize
        )
        
        for bbox in bboxes:
            people.add(frame, bbox, fails_limit=50,
                tracking_algorithm="kcf",
#                 tracking_algorithm="csrt",
#                 tracking_algorithm="mil",
#                 tracking_algorithm="goturn",
            )

    people.update(frame)

    people.draw(frame)

    # Compute the instantaneous FPS and put it on the frame
    # (kept separate from `fps`, which holds the input video's frame rate)
    current_fps = cv2.getTickFrequency() / (fps_timer[1] - fps_timer[0])
    fps_timer[0] = fps_timer[1]
    fps_timer[1] = cv2.getTickCount()
    cv2.putText(frame,
        text=f"FPS: {int(current_fps)}",
        org=(frame_width -60, frame_height -5),
        fontFace=cv2.FONT_HERSHEY_SIMPLEX,
        fontScale=0.3,
        color=green,
        thickness=1
    )

    # Save frame
    processed_frames[frame_count] = frame
    frame_count += 1

    # Show in app
    cv2.imshow(window_name, frame)
    cv2.waitKey(1)

# end timer
end = time.time()
overall_elapsed_time = end - start
elapsed_time_per_frame = overall_elapsed_time / frame_count

print("Done!")
print(f"{frame_count} frames processed in {overall_elapsed_time} seconds.")
print(f"({elapsed_time_per_frame}) seconds per frame.")
print(f"({1/elapsed_time_per_frame}) frames per second.")

# Write only the frames that were actually processed
for frame in processed_frames[:frame_count]:
    vout.write(frame)

print(f"Output saved to \"{output_filename}\".")

vcap.release()
vout.release()
cv2.destroyAllWindows()