# Deep Sort Implementation

This notebook implements the deep sort algorithm, based on [the original repository](https://github.com/nwojke/deep_sort). The implementation will undergo several changes during this project:

- Adding 3 additional detectors
- Adding 3 additional REID algorithms
- Adding 1 additional segmentation algorithm

The algorithm will be evaluated on the following sequences from the MOT challenge:

- TUD-Campus
- TUD-Stadtmitte
- KITTI-17
- PETS09-S2L1 from MOT15
- MOT16-09, MOT16-11 from MOT16


Since one of the requirements of the project is to implement it in a Google Colab notebook, I decided to move the main function from the repository into Google Colab so that any changes to the original algorithm are visible.

## Project Structure

This project has the following file structure:

- The `application_util`, `deep_sort` and `tools` folders contain the original and modified code for deep sort.
- The `deep_sort_app.py` file contains functions for running the whole model.
- The `data` folder contains data from the MOT challenges.
- The `resources` folder has `detections` and `networks` subfolders, which contain object detections, features for REID and models for extracting features.
- The `Byzov A - Final Project.ipynb` notebook contains the detailed report on the creation and development of the project.

## Step 1. Preparing a Google Colab folder for deep sort algorithm

In this step I switch to the folder on Google Drive (mounted in Google Colab) that holds the necessary files. For this commit the files are already there, but for the final commit I will create a script that copies them from GitHub.

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [9]:
DATA_DIR = '/content/drive/MyDrive/CV_DL_Project'
%cd $DATA_DIR

/content/drive/MyDrive/CV_DL_Project


## Step 2. Preparing an original deep sort implementation for Google Colab

The original deep sort required minimal changes to make it operable on Google Colab. These changes are:

- Adding lines to `linear_assignment.py` and `generate_detection.py` to make them work with tensorflow2.
- Changing `image_viewer.py` to make it work on Google Colab, i.e., using `cv2_imshow()` instead of `cv2.imshow()`.
- Renaming the `run` function in `deep_sort_app.py` to `deep_sort_run` for ease of understanding.
- Adding a new argument `custom_detection`, which switches our future custom detection and reidentification models on or off. For now, it is set to `False`.

Now, let's check if it works. 

In [2]:
from deep_sort_app import *

  'Cython evaluation (very fast so highly recommended) is '


In [None]:
deep_sort_run(
    sequence_dir="./data/MOT16/test/MOT16-06",
    detection_file="./resources/detections/MOT16_POI_test/MOT16-06.npy",
    output_file="./tmp/hypotheses.txt",
    min_confidence=0.8,
    nms_max_overlap=1.0,
    min_detection_height=0,
    max_cosine_distance=0.2,
    nn_budget=None,
    display=False,
    custom_detection=False
)

## Step 3. Adding new object detection and reidentification algorithms

The original deep sort does not run in real time: it precomputes the detection boxes and the REID features and stores them in a `.npy` file. Since we want a more or less real-time deep sort implementation, we need to change `deep_sort_app.py` so that it uses our own detections.

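Detections are stored as an `N x 138` matrix. The first ten columns follow the MOT challenge format and the remaining 128 columns hold the features:

- Frame index (1 column)
- Object ID (1 column)
- Bounding box: bb_left, bb_top, width, height (4 columns)
- Confidence score (1 column)
- X, Y, Z coordinates (3 columns; irrelevant for 2D, need to be kept at -1)
- Features for reidentification (128 columns)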
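
As a rough illustration of this layout, the sketch below builds a small synthetic matrix and slices it by column. The helper name `split_detection_rows` and the random data are assumptions made for the example; only the column order is taken from the list above.

```python
import numpy as np


def split_detection_rows(detection_mat, frame_idx):
    """Return boxes, confidences and features for one frame (hypothetical helper)."""
    detection_mat = np.asarray(detection_mat)
    # Column 0 holds the frame index; keep only the rows of the requested frame.
    rows = detection_mat[detection_mat[:, 0].astype(int) == frame_idx]
    boxes = rows[:, 2:6]        # bb_left, bb_top, width, height
    confidences = rows[:, 6]    # detection confidence
    features = rows[:, 10:]     # appearance features for reidentification
    return boxes, confidences, features


# Synthetic data: 5 detections spread over 2 frames, 128 features each.
rng = np.random.default_rng(0)
demo = np.full((5, 138), -1.0)
demo[:, 0] = [1, 1, 1, 2, 2]                   # frame index
demo[:, 2:6] = rng.uniform(0, 100, (5, 4))     # boxes
demo[:, 6] = rng.uniform(0, 1, 5)              # confidence
demo[:, 10:] = rng.normal(size=(5, 128))       # features

boxes, confidences, features = split_detection_rows(demo, frame_idx=1)
print(boxes.shape, confidences.shape, features.shape)  # (3, 4) (3,) (3, 128)
```

Columns 1 and 7 to 9 (object ID and X, Y, Z) stay at -1 in the synthetic data and are not used by the helper.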

We are going to change the algorithms that produce the bounding boxes and the features for reidentification.

To find bounding boxes in images we use the `EfficientDet Lite` family of models. These models are quite fast and return bounding boxes as corner coordinates (treated as x1, y1, x2, y2 in the code below), which is convenient for cropping the image patches passed to the reidentification models. The current implementation supports all models of the `EfficientDet Lite` family: `lite0`, `lite1`, `lite2`, `lite3`, `lite4`. They differ in the number of parameters and in accuracy (a bigger model is slower but more precise). While the corner format is convenient for cropping, it has to be converted to bb_left, bb_top, width, height for the detection matrix. Credits to [Google](https://tfhub.dev/tensorflow/efficientdet/lite1/detection/1).
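
The conversion between the two box formats can be sketched as follows. This is a minimal example that assumes the columns are ordered x1, y1, x2, y2 with x horizontal and y vertical; the function name `corners_to_tlwh` is hypothetical, and the actual column order returned by a given detector should be checked against its documentation.

```python
import numpy as np


def corners_to_tlwh(boxes):
    """Convert (x1, y1, x2, y2) boxes to (bb_left, bb_top, width, height)."""
    # Assumption: x is horizontal, y is vertical, one box per row.
    boxes = np.asarray(boxes, dtype=float).reshape(-1, 4)
    tlwh = boxes.copy()
    tlwh[:, 2] = boxes[:, 2] - boxes[:, 0]  # width  = x2 - x1
    tlwh[:, 3] = boxes[:, 3] - boxes[:, 1]  # height = y2 - y1
    return tlwh


print(corners_to_tlwh([[10, 20, 50, 100]]))  # [[10. 20. 40. 80.]]
```

The top-left corner is kept as is, so only the last two columns change.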

To extract features for reidentification we use torchreid, a more precise framework. The number of features it extracts from each image patch depends on the model (for example, resnet18 extracts 512 features and mobilenetv2_x1_0 extracts 1280). The framework supports a large number of models ([read more here](https://kaiyangzhou.github.io/deep-person-reid/pkg/models.html)). I recommend `resnet18` or one of the mobile models (e.g., `nasnetamobile`, `mobilenetv2_x1_0`, `mobilenetv2_x1_4`, `shufflenet_v2_x0_5`). Credits to [Dr. Kaiyang Zhou](https://kaiyangzhou.github.io/).
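
Whatever the feature length, deep sort compares appearance vectors by cosine distance (this is what the `max_cosine_distance` argument of `deep_sort_run` refers to), so the length only has to be the same for all detections within one run. The following is a minimal NumPy sketch of that comparison on synthetic vectors. It is not the repository's implementation, and the function name `cosine_distance_matrix` is an assumption made for the example.

```python
import numpy as np


def cosine_distance_matrix(a, b, eps=1e-12):
    """Pairwise cosine distance between the rows of a and the rows of b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Normalise each row to unit length (eps guards against zero vectors).
    a = a / np.maximum(np.linalg.norm(a, axis=1, keepdims=True), eps)
    b = b / np.maximum(np.linalg.norm(b, axis=1, keepdims=True), eps)
    return 1.0 - a @ b.T


# Synthetic data: three stored track features and one new detection that is
# a slightly perturbed copy of the first track.
rng = np.random.default_rng(0)
track_features = rng.normal(size=(3, 512))      # e.g. resnet18-sized vectors
detection_features = track_features[:1] + 0.01 * rng.normal(size=(1, 512))

distances = cosine_distance_matrix(track_features, detection_features)
print(distances.round(3))
print(distances[:, 0] <= 0.2)  # gate with max_cosine_distance=0.2
```

In this toy setup only the first track should pass the gate, because the unrelated random vectors are nearly orthogonal to the detection.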

Let's see how it works on one image before we add it to `deep_sort_app.py`.


In [2]:
import tensorflow_hub as hub
import tensorflow as tf
import cv2
import numpy as np
from google.colab.patches import cv2_imshow
from torchreid.utils.feature_extractor import *

  'Cython evaluation (very fast so highly recommended) is '


Before we can use the whole setup, we also need to copy the `torchreid` package into the project folder. The following commands do this, but you will probably need to adjust the absolute paths.

In [10]:
!git clone https://github.com/KaiyangZhou/deep-person-reid.git

Cloning into 'deep-person-reid'...
remote: Enumerating objects: 9854, done.
remote: Counting objects: 100% (4/4), done.
remote: Compressing objects: 100% (4/4), done.
remote: Total 9854 (delta 0), reused 0 (delta 0), pack-reused 9850
Receiving objects: 100% (9854/9854), 9.57 MiB | 7.73 MiB/s, done.
Resolving deltas: 100% (7285/7285), done.
Checking out files: 100% (155/155), done.


In [11]:
!mv /content/drive/MyDrive/CV_DL_Project/deep-person-reid/torchreid /content/drive/MyDrive/CV_DL_Project/

In [15]:
!rm -r /content/drive/MyDrive/CV_DL_Project/deep-person-reid/

rm: cannot remove '/content/drive/MyDrive/CV_DL_Project/deep-person-reid': No such file or directory


In [16]:
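def extract_patches(image, boxes_scores):
    """
    extract patches from an image with box scores
    """
    # Patches generally differ in size, so return a list rather than one array
    # (NumPy cannot stack differently shaped crops into a regular array).
    # Note: x1:x2 slices the first image axis (rows), y1:y2 the second (columns).
    patches = [image[x1:x2, y1:y2] for x1, y1, x2, y2, _ in np.int32(boxes_scores)]
    return patches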

def create_object_detector(model_det):
    """
    downloads object detector and loads it into a program, creates a specific transformation for images for object detection
    """
    object_detector_model = f"https://tfhub.dev/tensorflow/efficientdet/{model_det}/detection/1"
    object_detector = hub.load(object_detector_model)

    def detection_img_transformer(image):
        return tf.image.convert_image_dtype(image, tf.uint8)[tf.newaxis, ...]

    return object_detector, detection_img_transformer

def create_reid_extractor(model_reid):
    """
    creates feature extractor from image with different models trained for REID task
    """
    reid_feature_extractor = FeatureExtractor(model_reid)
    
    return reid_feature_extractor

def create_custom_detections(image, frame_idx, object_detector, detection_img_transformer, reid_feature_extractor, conf_level=0.5):
    """
    creates a detection_mat np array in a format similar to the original: the first ten columns follow the MOT challenge format and the rest hold features
    """
    det_image = detection_img_transformer(image)
    boxes, scores, _, _ = object_detector(det_image)

    boxes_scores = np.dstack((boxes, scores))
    boxes_scores = boxes_scores[0][boxes_scores[0][:, 4] >= conf_level]

    box_rows, _ = boxes_scores.shape

    patches = extract_patches(image, boxes_scores)

    boxes_scores[:, 2] = boxes_scores[:, 2] - boxes_scores[:, 0]
    boxes_scores[:, 3] = boxes_scores[:, 3] - boxes_scores[:, 1]

    features = np.array([reid_feature_extractor(img).cpu().numpy() for img in patches])
    features = features.reshape(box_rows, features.shape[2])

    detection_mat = np.concatenate(
        (
            np.repeat(frame_idx, box_rows).reshape(box_rows, 1),
            np.repeat(-1, box_rows).reshape(box_rows, 1),
            boxes_scores,
            np.repeat(np.array([-1, -1, -1]), box_rows).reshape(box_rows, 3),
            features
        ),
        axis=1
    )

    return detection_mat


In [None]:
image = cv2.imread('./img.jpeg', cv2.IMREAD_COLOR)
object_detector, detection_img_transformer = create_object_detector("lite1")
reid_feature_extractor = create_reid_extractor("resnet18")

In [35]:
detection_mat = create_custom_detections(image, 1, object_detector, detection_img_transformer, reid_feature_extractor)
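
Before passing such a matrix to the tracker, it can help to check that it follows the layout described in Step 3. The sketch below is a hypothetical helper (`check_detection_mat` is not part of the repository) and is run here on synthetic data rather than on the real `detection_mat`. It assumes ten leading MOT-style columns followed by `feature_dim` feature columns, e.g. 512 for `resnet18`.

```python
import numpy as np


def check_detection_mat(detection_mat, frame_idx, feature_dim):
    """Return a list of problems found in a detection matrix (hypothetical helper)."""
    mat = np.asarray(detection_mat)
    problems = []
    if mat.ndim != 2 or mat.shape[1] != 10 + feature_dim:
        problems.append(f"expected shape (N, {10 + feature_dim}), got {mat.shape}")
        return problems
    if not np.all(mat[:, 0] == frame_idx):
        problems.append("column 0 should hold the frame index")
    if not np.all(mat[:, 1] == -1):
        problems.append("column 1 (object id) should be -1 before tracking")
    if np.any(mat[:, 4] <= 0) or np.any(mat[:, 5] <= 0):
        problems.append("width and height must be positive")
    if not np.all(mat[:, 7:10] == -1):
        problems.append("columns 7-9 (x, y, z) should be -1 for 2D data")
    return problems


# Synthetic stand-in for a matrix with two detections and 512 features.
demo = np.full((2, 10 + 512), -1.0)
demo[:, 0] = 1                                        # frame index
demo[:, 2:6] = [[10, 20, 40, 80], [5, 5, 30, 60]]     # bb_left, bb_top, w, h
demo[:, 6] = [0.9, 0.7]                               # confidence
demo[:, 10:] = 0.0                                    # placeholder features

print(check_detection_mat(demo, frame_idx=1, feature_dim=512))  # []
```

With the real data the call would be `check_detection_mat(detection_mat, 1, 512)` for the `resnet18` extractor used above.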

  


## Step 4. Updating deep_sort_run function

Now we need to move this functionality into `deep_sort_app.py` so that any user can easily run `deep_sort_run` with custom real-time object detection and reidentification algorithms.

To do that I added several arguments to `deep_sort_run`:

- `custom_detection` - a flag that switches custom detection on or off.
- `model_det` - a string parameter that specifies a model from the EfficientDet Lite family. Uses _lite0_ by default.
- `model_reid` - a string parameter that specifies one of the many models available in torchreid. Uses _resnet18_ by default.

For this code to work, you need to use a GPU runtime in Google Colab.
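
The role of the `custom_detection` flag can be pictured with the sketch below. It is an illustrative assumption about how such a switch might be structured, not the code in `deep_sort_app.py`. The names `detections_for_frame`, `precomputed` and `detect_fn` are hypothetical; in practice `detect_fn` would wrap a call such as `create_custom_detections`.

```python
import numpy as np


def detections_for_frame(frame_idx, image, custom_detection,
                         precomputed=None, detect_fn=None):
    """Pick the detection source for one frame (hypothetical sketch)."""
    if custom_detection:
        if detect_fn is None:
            raise ValueError("custom_detection=True needs a detect_fn")
        # On-the-fly path: run the detector and REID extractor on this frame.
        return detect_fn(image, frame_idx)
    if precomputed is None:
        raise ValueError("custom_detection=False needs precomputed detections")
    # Precomputed path: select this frame's rows from the loaded .npy matrix.
    return precomputed[precomputed[:, 0].astype(int) == frame_idx]


# Toy precomputed matrix: frame index in column 0, remaining columns shortened.
precomputed = np.array([[1, -1, 0, 0, 5, 5, 0.9],
                        [2, -1, 1, 1, 5, 5, 0.8]])
print(detections_for_frame(2, None, False, precomputed=precomputed))


def fake_detect(image, frame_idx):
    # Stand-in for a real detector; returns a single dummy row.
    return np.array([[frame_idx, -1.0]])


print(detections_for_frame(3, None, True, detect_fn=fake_detect))
```

Passing the detector as a callable keeps the switch itself independent of TensorFlow and torchreid, which makes it easy to test without a GPU.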

In [None]:

deep_sort_run(
    sequence_dir="./data/MOT16/test/MOT16-06",
    detection_file="./resources/detections/MOT16_POI_test/MOT16-06.npy",
    output_file="./tmp/hypotheses.txt",
    min_confidence=0.8,
    nms_max_overlap=1.0,
    min_detection_height=0,
    max_cosine_distance=0.2,
    nn_budget=None,
    display=False,
    custom_detection=True,
    model_det="lite0",
    model_reid="resnet18"
)