# IOU Tracker

IOUTracker implements a tracking algorithm that follows objects across consecutive frames using only their Intersection-over-Union (IOU). The core idea comes from the article (http://elvera.nue.tu-berlin.de/files/1517Bochinski2017.pdf). It assumes a strong existing detector and a high frame rate, so that the same object overlaps heavily between consecutive frames. Under this assumption, you can track objects with only their localization and IOU information. The algorithm runs at a very high frame rate and provides a foundation for more complex processing built on top of it.
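
Since everything in the tracker rests on the IOU of two boxes, here is a minimal sketch of that computation. It assumes boxes are given as `[x, y, w, h]` with `(x, y)` the top-left corner, which is the layout used by the labels later in this notebook. The function name `iou` is illustrative and is not taken from the ioutracker package.

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two [x, y, w, h] boxes (top-left corner, width, height)."""
    ax1, ay1, aw, ah = box_a[:4]
    bx1, by1, bw, bh = box_b[:4]
    ax2, ay2 = ax1 + aw, ay1 + ah
    bx2, by2 = bx1 + bw, by1 + bh

    # Width and height of the overlapping rectangle (zero when the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0


same = iou([0, 0, 10, 10], [0, 0, 10, 10])     # identical boxes
shifted = iou([0, 0, 10, 10], [5, 0, 10, 10])  # half of each box overlaps
apart = iou([0, 0, 10, 10], [20, 20, 5, 5])    # no overlap
print(same, shifted, apart)
```

With a high frame rate, an object moves only slightly between frames, so its boxes in neighboring frames behave like the `shifted` case rather than the `apart` case.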

Such an algorithm also needs to be evaluated. The evaluation in this implementation follows two articles: the MOT16 benchmark (https://arxiv.org/abs/1603.00831) and Multi-Target Tracker (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.309.8335&rep=rep1&type=pdf).

This implementation uses the MOT17Det dataset (https://motchallenge.net/data/MOT17Det/) as an example.

* For more information, please refer to https://github.com/jiankaiwang/ioutracker.
* For more example videos, please refer to .

In [1]:
import os

try:
  # you must install ioutracker first
  from ioutracker import loadLabel, outputAsFramesToVideo, IOUTracker
  from ioutracker import EvaluateOnMOTDatasets, ExampleEvaluateMOTDatasets
except ImportError:
  # fall back to the local source tree when the package is not installed
  import sys
  sys.path.append(os.path.join(".", "ioutracker"))
  from ioutracker.dataloaders.MOTDataLoader import loadLabel
  from ioutracker.inference.MOTDet17Main import outputAsFramesToVideo
  from ioutracker.src.IOUTracker import IOUTracker
  from ioutracker.metrics.MOTMetrics import EvaluateOnMOTDatasets, ExampleEvaluateMOTDatasets

## Data Preprocessing

You can download the dataset with the shell script `./ioutracker/dataloaders/MOTDataDownloader.sh` in the git repository.

In [2]:
SUB_DATASET = "train"
VERSION = "MOT17Det"
LOCAL_PATH = os.path.join("/", "tmp", "MOT")

# you can change the path pointing to the dataset
LABEL_PATH = os.path.join(LOCAL_PATH, "{}".format(VERSION + "Labels"), SUB_DATASET)
assert os.path.exists(LABEL_PATH), "{} is not found.".format(LABEL_PATH)

# you can change the path pointing to the dataset
FRAME_PATH = os.path.join(LOCAL_PATH, "{}".format(VERSION), SUB_DATASET)
assert os.path.exists(FRAME_PATH), "{} is not found.".format(FRAME_PATH)

## Sample Selection

In [3]:
totalSamples = next(os.walk(os.path.join(LABEL_PATH)))[1]
print(totalSamples)

['MOT17-13', 'MOT17-09', 'MOT17-11', 'MOT17-10', 'MOT17-04', 'MOT17-05', 'MOT17-02']


In [4]:
SAMPLE = "MOT17-02"
LABEL_FILE_PATH = os.path.join(LABEL_PATH, SAMPLE, "gt", "gt.txt")
assert os.path.exists(LABEL_FILE_PATH), "{} is not found.".format(LABEL_FILE_PATH)

FRAME_FILE_PATH = os.path.join(FRAME_PATH, SAMPLE, "img1")
assert os.path.exists(FRAME_FILE_PATH), "{} is not found.".format(FRAME_FILE_PATH)

## Output the tracking result on the video

In [5]:
FRAME_FPS = {"MOT17-13": 25, "MOT17-11": 30, "MOT17-10": 30, "MOT17-09": 30,
             "MOT17-05": 14, "MOT17-04": 30, "MOT17-02": 30}
assert SAMPLE in list(FRAME_FPS.keys()), "{} was not found.".format(SAMPLE)
fps = FRAME_FPS[SAMPLE]
print("Sample {} with FPS: {}".format(SAMPLE, fps))

Sample MOT17-02 with FPS: 30


Check whether the output folder exists, and create it if it doesn't.

In [6]:
tracking_output = os.path.join(LOCAL_PATH, "tracking_output")
if not os.path.exists(tracking_output):
  try:
    os.mkdir(tracking_output)
    print("Created the output path: {}".format(tracking_output))
  except Exception as e:
    raise Exception("Can't create the folder. ({})".format(e))

This example shows how to draw the tracking results on consecutive frames and write them to a video file. Note that the `plotting` flag must be set to `True` if you want to output the video.

In [7]:
outputAsFramesToVideo(detection_conf=0.2,
                      iou_threshold=0.2,
                      min_t=fps,
                      track_min_conf=0.5,
                      labelFilePath=LABEL_FILE_PATH,
                      frameFilePath=FRAME_FILE_PATH,
                      trackingOutput=tracking_output,
                      fps=fps,
                      outputFileName=SAMPLE,
                      plotting=True)

100%|██████████| 524/524 [00:35<00:00, 14.95it/s]


Total time cost: 76.77822375297546


You can then move the video to any path you like, for example:

```sh
mv /tmp/MOT/tracking_output/tracking_MOT17-02.mp4 ~/Desktop/
```

## A simple example from scratch

First, load the MOT data via the `loadLabel` API. You can also use `help` to inspect its parameters.

In [7]:
LABELS, DFPERSONS = loadLabel(src=LABEL_FILE_PATH, format_style="metrics_dict")

In [8]:
help(loadLabel)

Help on function loadLabel in module ioutracker.dataloaders.MOTDataLoader:

loadLabel(src, is_path=True, load_Pedestrian=True, load_Static_Person=True, visible_thresholde=0, format_style='onlybbox')
    LoadLabel: Load a label file in the csv format.
    
    Args:
      src: the MOT label file path (available when is_path is True)
      is_path: True or False for whether the src is the file path or not
      load_Pedestrian: whether to load the pedestrian data or not
      load_Static_Person: whether to load the statuc person data or not
      visible_thresholde: the threshold for filtering the invisible person data
      format_style: provides different styles in the lists,
                    "onlybbox" (func: formatBBoxAndVis), "onlybbox_dict" (func: formatBBoxAndVis),
                    "metrics" (func: formatForMetrics), "metrics_dict" (func: formatForMetrics)
    
    Returns:
      objects_in_frames: a list contains the person detection information per frames



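LABELS is a dictionary whose keys are frame IDs and whose values are the lists of object detections in that frame. Each detection is a list that starts with the localization and visibility, `[x, y, w, h, visibility]`; in the `metrics_dict` output below, a sixth value follows, which appears to be the object ID.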

In [9]:
LABELS[1]

[[912.0, 484.0, 97.0, 109.0, 1.0, 1.0],
 [1338.0, 418.0, 167.0, 379.0, 1.0, 2.0],
 [586.0, 447.0, 85.0, 263.0, 1.0, 3.0],
 [1416.0, 431.0, 184.0, 336.0, 0.51351, 8.0],
 [1056.0, 484.0, 36.0, 110.0, 0.94595, 9.0],
 [1091.0, 484.0, 31.0, 115.0, 1.0, 10.0],
 [1255.0, 447.0, 33.0, 100.0, 1.0, 14.0],
 [1016.0, 430.0, 40.0, 116.0, 0.98687, 15.0],
 [1101.0, 441.0, 38.0, 108.0, 0.6584300000000001, 17.0],
 [935.0, 436.0, 42.0, 114.0, 0.41739, 18.0],
 [442.0, 446.0, 105.0, 283.0, 1.0, 19.0],
 [636.0, 458.0, 61.0, 187.0, 0.41935, 20.0],
 [1364.0, 434.0, 51.0, 124.0, 0.0, 21.0],
 [1478.0, 434.0, 63.0, 124.0, 0.0, 22.0],
 [473.0, 460.0, 89.0, 249.0, 0.16667, 23.0],
 [835.0, 473.0, 52.0, 75.0, 1.0, 24.0],
 [796.0, 476.0, 55.0, 60.0, 0.69643, 25.0],
 [548.0, 465.0, 35.0, 93.0, 0.52778, 26.0],
 [376.0, 446.0, 41.0, 104.0, 1.0, 30.0],
 [418.0, 459.0, 40.0, 84.0, 0.58537, 31.0],
 [582.0, 456.0, 35.0, 133.0, 0.11110999999999999, 36.0],
 [972.0, 456.0, 32.0, 77.0, 0.29370999999999997, 39.0],
 [693.0, 462.

DFPERSONS is a pandas DataFrame that has been processed to filter out unnecessary columns.

In [10]:
DFPERSONS.head(5)

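| Unnamed: 0 | fid | uid | bX | bY | bW | bH | conf | class | visible |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 1 | 1 | 912 | 484 | 97 | 109 | 0 | 7 | 1.0 |
| 1 | 2 | 1 | 912 | 484 | 97 | 109 | 0 | 7 | 1.0 |
| 2 | 3 | 1 | 912 | 484 | 97 | 109 | 0 | 7 | 1.0 |
| 3 | 4 | 1 | 912 | 484 | 97 | 109 | 0 | 7 | 1.0 |
| 4 | 5 | 1 | 912 | 484 | 97 | 109 | 0 | 7 | 1.0 |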
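
The columns `fid, uid, bX, bY, bW, bH, conf, class, visible` follow the comma-separated layout of a MOT `gt.txt` file. As a rough illustration of what loading such a file involves, here is a standard-library sketch that groups rows by frame and keeps only pedestrians and static persons. This is not the package's `loadLabel`; the function name `parse_mot_gt`, the `>=` comparison against the visibility threshold, and the class IDs (1 for pedestrian, 7 for static person, as listed in the MOT16 benchmark paper) are assumptions for this example. In the sample text, only the first row is taken from the table above; the other rows are made up for illustration.

```python
import csv
import io
from collections import defaultdict

PEDESTRIAN, STATIC_PERSON = 1, 7  # assumed class IDs, per the MOT16 benchmark paper


def parse_mot_gt(text, visible_threshold=0.0, classes=(PEDESTRIAN, STATIC_PERSON)):
    """Group MOT ground-truth rows by frame as [x, y, w, h, visibility, uid] lists."""
    frames = defaultdict(list)
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        fid, uid = int(row[0]), int(row[1])
        x, y, w, h = (float(v) for v in row[2:6])
        cls, visible = int(row[7]), float(row[8])
        # Keep only the requested classes that are visible enough.
        if cls in classes and visible >= visible_threshold:
            frames[fid].append([x, y, w, h, visible, float(uid)])
    return dict(frames)


# First row: from the table above. Remaining rows: illustrative only.
sample = """1,1,912,484,97,109,0,7,1
1,2,1338,418,167,379,1,1,1
1,21,1364,434,51,124,0,11,0
2,1,912,484,97,109,0,7,1
"""

labels = parse_mot_gt(sample)
print(sorted(labels), len(labels[1]))
```

The third sample row uses a class outside the requested set, so it is dropped and frame 1 ends up with two detections.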


Create an `IOUTracker` instance, which implements the IOU tracker algorithm.

In [11]:
iouTracks = IOUTracker(detection_conf=0.2,
                       iou_threshold=0.2,
                       min_t=fps,
                       track_min_conf=0.5)

In [12]:
help(IOUTracker)

Help on class IOUTracker in module ioutracker.src.IOUTracker:

class IOUTracker(builtins.object)
 |  IOUTracker(detection_conf=0.5, iou_threshold=0.5, min_t=1, track_min_conf=0.5)
 |  
 |  IOUTracker implements the IOU tracker algorithm details.
 |  
 |  Methods defined here:
 |  
 |  __init__(self, detection_conf=0.5, iou_threshold=0.5, min_t=1, track_min_conf=0.5)
 |      Constructor.
 |      
 |      Args:
 |        detection_conf (sigma_l): the detection was removed when its confident score
 |                                  is lower than detection_conf
 |        iou_threshold (sigma_IOU): the min IOU threshold between a detection and
 |                                   active tracks
 |        min_t: the track is filtered out when its length is shorter than min_t
 |        track_min_conf (sigma_h): the track is filtered out when all of its detections'
 |                                  confident scores are less than the track_min_conf
 |  
 |  clear_finished_tracks(self)
 |     

The variable `tid_count` is used to assign the unique ID to each track. 

In this implementation, the IOUTracker is designed to take object detection results frame by frame rather than a whole video at once. It keeps the state of each track between calls. You can access the active tracks via the `get_active_tracks()` method and retrieve the finished tracks via the `get_finished_tracks()` method.

In addition, you can read or set the `tid` attribute of each track to get or assign its track ID.
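
To make the frame-by-frame behavior concrete, the following is a simplified sketch of one update step of an IOU tracker as described in the referenced article. It is not the code inside `IOUTracker`: the names `SketchTrack` and `update_tracks`, the `(box, score)` detection tuples, and the greedy best-match loop are assumptions chosen for illustration. The four thresholds mirror the constructor arguments documented above.

```python
def _iou(a, b):
    """IOU of two [x, y, w, h] boxes."""
    iw = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


class SketchTrack:
    """Hypothetical track record: the boxes seen so far and the best score."""

    def __init__(self, box, score):
        self.boxes = [box]
        self.max_score = score
        self.tid = None


def update_tracks(active, finished, detections,
                  sigma_l=0.2, sigma_iou=0.2, sigma_h=0.5, min_t=2):
    """Process one frame of (box, score) detections; return (active, finished)."""
    # Drop low-confidence detections first (detection_conf / sigma_l).
    remaining = [d for d in detections if d[1] >= sigma_l]
    still_active = []
    for track in active:
        # Find the detection that overlaps the track's last box the most.
        best_idx, best_iou = None, 0.0
        for idx, (box, _score) in enumerate(remaining):
            overlap = _iou(track.boxes[-1], box)
            if overlap > best_iou:
                best_idx, best_iou = idx, overlap
        if best_idx is not None and best_iou >= sigma_iou:
            # Extend the track and consume the matched detection.
            box, score = remaining.pop(best_idx)
            track.boxes.append(box)
            track.max_score = max(track.max_score, score)
            still_active.append(track)
        elif track.max_score >= sigma_h and len(track.boxes) >= min_t:
            # The track ended; keep it only if it is long and confident enough.
            finished.append(track)
    # Every unmatched detection starts a new track.
    still_active.extend(SketchTrack(box, score) for box, score in remaining)
    return still_active, finished


frames = [
    [([0, 0, 10, 10], 0.9)],  # object appears
    [([2, 0, 10, 10], 0.8)],  # object moves slightly to the right
    [],                       # object disappears
]
active, finished = [], []
for detections in frames:
    active, finished = update_tracks(active, finished, detections)
print(len(active), len(finished), len(finished[0].boxes))
```

In this toy sequence the single object is matched across the first two frames, then its track is closed on the empty third frame and kept because it satisfies both `min_t` and `sigma_h`.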

In [13]:
tid_count = 1

# iterate over every frame ID in order, including the last one
for label in sorted(LABELS):
  # feed the detections of one frame to the IOU tracker
  iouTracks.read_detections_per_frame(detections=LABELS[label])

  active_tracks = iouTracks.get_active_tracks()
  finished_tracks = iouTracks.get_finished_tracks()

  if label % 50 == 0:
    print("Frame {} tracker Info: active {}, finished {}".format(label, len(active_tracks), len(finished_tracks)))

  # simple way to assign the tracker ID
  for act_track in active_tracks:
    if not act_track.tid:
      # assign a unique track ID (used, for example, to pick the drawing color)
      act_track.tid = tid_count
      tid_count += 1

Frame 50 tracker Info: active 39, finished 2
Frame 100 tracker Info: active 37, finished 7
Frame 150 tracker Info: active 40, finished 7
Frame 200 tracker Info: active 41, finished 7
Frame 250 tracker Info: active 43, finished 9
Frame 300 tracker Info: active 45, finished 11
Frame 350 tracker Info: active 47, finished 11
Frame 400 tracker Info: active 46, finished 13
Frame 450 tracker Info: active 45, finished 17
Frame 500 tracker Info: active 44, finished 17
Frame 550 tracker Info: active 44, finished 19
Frame 600 tracker Info: active 44, finished 21
Frame 650 tracker Info: active 43, finished 27
Frame 700 tracker Info: active 40, finished 32
Frame 750 tracker Info: active 36, finished 34
Frame 800 tracker Info: active 43, finished 34
Frame 850 tracker Info: active 44, finished 35
Frame 900 tracker Info: active 46, finished 35
Frame 950 tracker Info: active 49, finished 36
Frame 1000 tracker Info: active 48, finished 37


## Metrics

`ExampleEvaluateMOTDatasets` evaluates a single dataset or video. Because no detector is available here, we use the ground truth data itself as the predictions to test the functionality.

Note that this step takes a long time to run.

In [17]:
Predictions, _ = loadLabel(src=LABEL_FILE_PATH, is_path=True, load_Pedestrian=True, load_Static_Person=True,
    visible_thresholde=0.2, format_style="metrics_dict")

In [18]:
evalMOT = ExampleEvaluateMOTDatasets(LABEL_FILE_PATH, predictions=Predictions, printOnScreen=True)

100%|██████████| 599/599 [05:35<00:00,  1.79it/s]

TP: 11950
FP:     0
FN:     0
GT: 11950
Fragment Number:    159
SwitchID Number:    263
Recall: 100.000%
Precision: 100.000%
Accuracy: 100.000%
F1 Score: 1.000
MOTA: 0.977992
Total trajectories: 63
MT Number: 63, Ratio: 100.000%
PT Number: 0, Ratio: 0.000%
ML Number: 0, Ratio: 0.000%
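
As a sanity check on the numbers above, the sketch below recomputes recall, precision, F1 and MOTA from the raw counts, using the usual MOTA definition `1 - (FN + FP + ID switches) / GT`. It also classifies a trajectory as mostly tracked (MT), partially tracked (PT) or mostly lost (ML) using the 80% and 20% coverage thresholds from the MOT16 benchmark. The function names are hypothetical, and the package may compute these values differently internally.

```python
def detection_metrics(tp, fp, fn, gt, id_switches):
    """Recompute summary metrics from raw counts."""
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    # MOTA penalizes misses, false positives and identity switches together.
    mota = 1.0 - (fn + fp + id_switches) / gt
    return {"recall": recall, "precision": precision, "f1": f1, "mota": mota}


def classify_trajectory(tracked_frames, total_frames):
    """Label a ground-truth trajectory by the fraction of its life span that was tracked."""
    ratio = tracked_frames / total_frames
    if ratio >= 0.8:
        return "MT"  # mostly tracked
    if ratio < 0.2:
        return "ML"  # mostly lost
    return "PT"      # partially tracked


m = detection_metrics(tp=11950, fp=0, fn=0, gt=11950, id_switches=263)
print(round(m["mota"], 6))  # 0.977992, matching the MOTA reported above
```

With no false positives or misses, the only term that lowers MOTA in this run is the 263 identity switches.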





The `EvaluateOnMOTDatasets` class helps you evaluate multiple datasets.

In [12]:
evalMOTData = EvaluateOnMOTDatasets()

# it is simple to pass the whole package of results into the multiple-dataset evaluator
evalMOTData(evalMOT)
evalMOTRes = evalMOTData.getResults()

Metric mota: Value -0.01489539748953983
Metric recall: Value 1.0
Metric precision: Value 0.5017845895444047
Metric accuracy: Value 1.0
Metric f1score: Value 0.6682510834614847
Metric rateMT: Value 1.0
Metric ratePT: Value 0.0
Metric rateML: Value 0.0
Metric TP: Value 23900
Metric FP: Value 23730
Metric FN: Value 0
Metric GT: Value 23900
Metric numFragments: Value 159
Metric numSwitchID: Value 263
Metric numMT: Value 63
Metric numPT: Value 0
Metric numML: Value 0
Metric numTraj: Value 63


You can also evaluate the metrics on each MOT dataset and then summarize them.

In [None]:
evalMOTData = EvaluateOnMOTDatasets()
for key in FRAME_FPS:
  LABEL_FILE_PATH = os.path.join(LABEL_PATH, key, "gt", "gt.txt")
  assert os.path.exists(LABEL_FILE_PATH), "{} is not found.".format(LABEL_FILE_PATH)
  print("Sample: {}".format(key))

  # predictions=None makes the evaluator use the ground truth as the predictions
  evalMOT = ExampleEvaluateMOTDatasets(LABEL_FILE_PATH, predictions=None, printOnScreen=True)
  evalMOTData(evalMOT)
  print("", end="\n\n")
evalMOTRes = evalMOTData.getResults()