# Pedestrian tracking with FairMOT object tracking model


## Overview
This notebook runs FairMOT, a state-of-the-art (SOTA) pedestrian tracking model that integrates object detection and re-identification into a single deep neural network. Tasks include:

1. Instantiate a PyTorch version of the DLA-34 baseline model, the backbone neural network of FairMOT. Load the pre-trained model weights, which were trained on the CrowdHuman and MIX datasets (see https://github.com/ifzhang/FairMOT for details).
2. Load input video. For each frame:
 
 2a. use the DLA-34 model to predict both objects and object embeddings (i.e. features for identification).
 
 2b. associate each detected object with an already-tracked ID or a new ID, by examining embeddings and the distance moved across frames.
 
 2c. generate output frame images with boxes and IDs.

3. Combine processed frames to create output video.
4. (to be implemented) Export the trajectory of the bottom-center point of each bounding box as the movement trajectory of each person.
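
The association in step 2b can be illustrated with a minimal sketch. It is a simplified stand-in rather than the tracker shipped in the FairMOT repository: it matches greedily on cosine distance between embeddings only, ignores the distance moved across frames, and the names `cosine_distance`, `associate` and `max_distance` are hypothetical.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # treat a zero vector as maximally uninformative
    return 1.0 - dot / (norm_a * norm_b)


def associate(tracks, detections, max_distance=0.4):
    """Greedily match detections to tracked IDs by embedding distance.

    tracks:      dict mapping track ID -> embedding (assumed layout)
    detections:  list of embeddings for the current frame
    Returns (matches, unmatched): matches maps detection index -> track ID,
    unmatched lists detection indices that would receive new IDs.
    """
    candidates = []
    for det_idx, det_emb in enumerate(detections):
        for track_id, track_emb in tracks.items():
            dist = cosine_distance(det_emb, track_emb)
            if dist <= max_distance:
                candidates.append((dist, det_idx, track_id))

    # Closest pairs first; each detection and each track is used at most once.
    candidates.sort()
    matches, used_tracks = {}, set()
    for _, det_idx, track_id in candidates:
        if det_idx in matches or track_id in used_tracks:
            continue
        matches[det_idx] = track_id
        used_tracks.add(track_id)

    unmatched = [i for i in range(len(detections)) if i not in matches]
    return matches, unmatched


# Toy example: two tracked people, three detections in the new frame.
tracks = {1: [1.0, 0.0], 2: [0.0, 1.0]}
detections = [[0.1, 0.9], [0.9, 0.1], [-1.0, -1.0]]
matches, unmatched = associate(tracks, detections)
print(matches)    # {0: 2, 1: 1}
print(unmatched)  # [2]
```

Greedy matching is chosen here only to keep the sketch dependency-free; a production tracker would normally solve the assignment globally and gate candidate pairs by predicted motion as well.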

Notes:
- Tasks are carried out using the open-source tool created by Yifu Zhang, which also contains scripts for model training and testing.
- In this notebook, the pre-trained model is used for demonstration purposes.
- In production, the FairMOT model will be trained on proper training datasets. Model checkpoints will be version-controlled, with approved ones saved to S3 for potential use. Model scripts will be containerized and stored in AWS ECR for use in AWS ECS or Fargate.

## 1) Install packages and set up environment


In [None]:
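# install tools and packages
# FairMOT is cloned into ./external-repo so that it matches the paths used in the cells below
!git clone https://github.com/ifzhang/FairMOT ./external-repo/FairMOT  # the FairMOT scripts
!git clone https://github.com/CharlesShang/DCNv2  # package for using the DLA-34 model, the backbone neural network of FairMOT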

In [None]:
# build DCNv2 (run setup.py from inside the repository so build outputs land there)
!cd ./DCNv2 && python setup.py build develop

In [None]:
# install requirements of FairMOT
!pip install -r ./external-repo/FairMOT/requirements.txt
!conda install pytorch==1.2.0 torchvision==0.4.0 cudatoolkit=10.0 -c pytorch -y
!conda install ffmpeg -y

In [None]:
# download the pre-trained model weights of the DLA-34 backbone model
#  - url: https://drive.google.com/open?id=1udpOPum8fJdoEQm6n0jsIgMMViOMFinu
#  - saved to S3 in advance
import os

import boto3
import cv2
import sagemaker
from PIL import Image

sess = sagemaker.Session()
s3 = boto3.client('s3')

# S3 bucket name and key prefix for the pre-trained DLA-34 model
bucket = 'pedestrian-tracker'
s3key_model = 'raw-pretrained-model'
fname = 'fairmot_dla34.pth'

# save into the models/ directory that demo.py loads from in section 2
model_dir = 'external-repo/FairMOT/models'
os.makedirs(model_dir, exist_ok=True)
s3.download_file(bucket, s3key_model + '/' + fname, os.path.join(model_dir, fname))

## 2) Run FairMOT model

In [1]:
# Execute the FairMOT demo script with the following parameters
#  - load_model: baseline PyTorch model checkpoint to load
#  - input-video: input video to analyze
#  - output-root: directory where the outputs are written
#  - conf_thres: confidence threshold for tracking object identity; the higher it is, the more sceptical the tracker is about assigning an object to an ID
#  - det_thres: confidence threshold for detecting an object
#  - nms_thres: Intersection over Union (IoU) threshold for the non-max suppression operation that removes overlapping bounding box proposals. Important when people walk in close groups.
#  - track_buffer: maximum number of video frames for which an object is allowed to be missing before it is considered 'lost'

!python ./external-repo/FairMOT/src/demo.py mot --load_model ./external-repo/FairMOT/models/fairmot_dla34.pth \
        --conf_thres 0.3 --det_thres 0.3 --nms_thres 0.4 --track_buffer 30 \
        --input-video ./external-repo/FairMOT/videos/shopping-mall2.mp4 \
        --output-root ./external-repo/FairMOT/outputs
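
Task 4 from the overview (exporting bottom-center trajectories) could be prototyped along the following lines. This is a sketch under stated assumptions, not part of the FairMOT scripts: it assumes the tracking results are available as MOT Challenge style text lines of the form `frame,id,x,y,w,h,...`, with `(x, y)` the top-left corner of the box, and the function name `bottom_center_trajectories` is hypothetical.

```python
from collections import defaultdict


def bottom_center_trajectories(lines):
    """Group bottom-center points of bounding boxes by track ID.

    Assumes each line reads 'frame,id,x,y,w,h,...' where (x, y) is the
    top-left corner of the box in pixels. Extra fields are ignored.
    Returns a dict: track ID -> list of (frame, x_center, y_bottom).
    """
    trajectories = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = line.split(',')
        frame, track_id = int(fields[0]), int(fields[1])
        x, y, w, h = (float(v) for v in fields[2:6])
        # Bottom center: horizontal midpoint of the box, lower edge of the box.
        trajectories[track_id].append((frame, x + w / 2.0, y + h))

    # Order each trajectory by frame number.
    for points in trajectories.values():
        points.sort()
    return dict(trajectories)


# Hand-written sample lines in the assumed format (not real model output).
sample = [
    "1,1,100.0,50.0,40.0,120.0,1,-1,-1,-1",
    "2,1,104.0,52.0,40.0,120.0,1,-1,-1,-1",
    "1,2,300.0,60.0,30.0,90.0,1,-1,-1,-1",
]
trajectories = bottom_center_trajectories(sample)
print(trajectories[1])  # [(1, 120.0, 170.0), (2, 124.0, 172.0)]
print(trajectories[2])  # [(1, 315.0, 150.0)]
```

The bottom-center point approximates where a person's feet touch the ground, which is why it is a reasonable proxy for position when mapping movement.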
