# IOU Tracker

IOUTracker tracks objects across consecutive frames using only their Intersection-over-Union (IOU). The core idea comes from the article (http://elvera.nue.tu-berlin.de/files/1517Bochinski2017.pdf). It assumes a strong existing detector and a high frame rate, so that the bounding boxes of the same object overlap between consecutive frames. Under this assumption, you can track objects with only the localization and the IOU information. The algorithm can run at very high frame rates and provides a foundation for more complicated calculations built on top of it.
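
To make the IOU quantity concrete, here is a minimal sketch for two boxes in the `[x, y, w, h]` layout used later in this notebook. The helper name `iou` is hypothetical and is not part of the ioutracker package.

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two boxes given as [x, y, w, h]."""
    ax1, ay1, aw, ah = box_a[:4]
    bx1, by1, bw, bh = box_b[:4]
    ax2, ay2 = ax1 + aw, ay1 + ah
    bx2, by2 = bx1 + bw, by1 + bh

    # Width and height of the overlapping rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


# Two 10x10 boxes shifted by 5 pixels: intersection 50, union 150.
print(iou([0, 0, 10, 10], [5, 0, 10, 10]))
```

At a high frame rate an object moves only a few pixels between frames, so the IOU between its consecutive boxes stays high, which is what the tracker relies on.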

Such an algorithm also needs to be evaluated. The evaluation in this implementation follows two articles, the MOT16 benchmark (https://arxiv.org/abs/1603.00831) and Multi-Target Tracker (http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.309.8335&rep=rep1&type=pdf).

This implementation uses the MOT17Det dataset (https://motchallenge.net/data/MOT17Det/) as an example.

* For more information, please refer to https://github.com/jiankaiwang/ioutracker.
* For more example videos, please refer to .

In [None]:
import os
import tqdm
import time
import logging

try:
  # you must install ioutracker first
  from ioutracker import loadLabel, outputAsFramesToVideo, IOUTracker
  from ioutracker import EvaluateOnMOTDatasets, ExampleEvaluateMOTDatasets, EvaluateByFrame
  logging.warning("Load ioutracker from the installed package.")
except ImportError:
  import sys
  modulePaths = [os.path.join("..")]
  for path in modulePaths: sys.path.append(path)
  from ioutracker.dataloaders.MOTDataLoader import loadLabel
  from ioutracker.inference.MOTDet17Main import outputAsFramesToVideo
  from ioutracker.src.IOUTracker import IOUTracker
  from ioutracker.metrics.MOTMetrics import EvaluateOnMOTDatasets, ExampleEvaluateMOTDatasets, EvaluateByFrame
  logging.warning("Load ioutracker from the relative path.")

## Data Preprocessing

You can use the shell script `./ioutracker/dataloaders/MOTDataDownloader.sh` in the git repository to download the dataset.

In [None]:
SUB_DATASET = "train"
VERSION = "MOT17Det"
LOCAL_PATH = os.path.join("/", "tmp", "MOT")

# you can change the path pointing to the dataset
LABEL_PATH = os.path.join(LOCAL_PATH, "{}".format(VERSION + "Labels"), SUB_DATASET)
assert os.path.exists(LABEL_PATH), "{} is not found.".format(LABEL_PATH)

# you can change the path pointing to the dataset
FRAME_PATH = os.path.join(LOCAL_PATH, "{}".format(VERSION), SUB_DATASET)
assert os.path.exists(FRAME_PATH), "{} is not found.".format(FRAME_PATH)

## Sample Selection

In [None]:
totalSamples = next(os.walk(os.path.join(LABEL_PATH)))[1]
print(totalSamples)

In [None]:
SAMPLE = "MOT17-04"
LABEL_FILE_PATH = os.path.join(LABEL_PATH, SAMPLE, "gt", "gt.txt")
assert os.path.exists(LABEL_FILE_PATH), "{} is not found.".format(LABEL_FILE_PATH)

FRAME_FILE_PATH = os.path.join(FRAME_PATH, SAMPLE, "img1")
assert os.path.exists(FRAME_FILE_PATH), "{} is not found.".format(FRAME_FILE_PATH)

## Output the tracking result on the video

In [None]:
FRAME_FPS = {"MOT17-13": 25, "MOT17-11": 30, "MOT17-10": 30, "MOT17-09": 30,
             "MOT17-05": 14, "MOT17-04": 30, "MOT17-02": 30}
assert SAMPLE in FRAME_FPS, "{} was not found.".format(SAMPLE)
fps = FRAME_FPS[SAMPLE]
print("Sample {} with FPS: {}".format(SAMPLE, fps))

Check whether the output folder exists, and create it if it doesn't.

In [None]:
tracking_output = os.path.join(LOCAL_PATH, "tracking_output")
if not os.path.exists(tracking_output):
  try:
    os.mkdir(tracking_output)
    print("Created the output path: {}".format(tracking_output))
  except Exception as e:
    raise Exception("Can't create the folder. ({})".format(e))
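
As an alternative to the check-then-create pattern above, the standard library's `os.makedirs` with `exist_ok=True` does both steps in one call. The sketch below uses a temporary directory as a stand-in for `LOCAL_PATH` so that it runs anywhere. The names `base_dir` and `demo_output` are only for illustration.

```python
import os
import tempfile

# A temporary directory stands in for LOCAL_PATH in this sketch.
base_dir = tempfile.mkdtemp()
demo_output = os.path.join(base_dir, "tracking_output")

# exist_ok=True makes the call a no-op when the folder already exists.
os.makedirs(demo_output, exist_ok=True)
os.makedirs(demo_output, exist_ok=True)  # calling it again raises no error

print(os.path.isdir(demo_output))
```

Unlike `os.mkdir`, `os.makedirs` also creates any missing parent folders.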

This example shows how to write the tracking result on consecutive frames to a video file. Note that the flag `plotting` must be set to `True` if you want to output the video.

In [None]:
outputAsFramesToVideo(detection_conf=0.2,
                      iou_threshold=0.2,
                      min_t=fps,
                      track_min_conf=0.5,
                      labelFilePath=LABEL_FILE_PATH,
                      frameFilePath=FRAME_FILE_PATH,
                      trackingOutput=tracking_output,
                      fps=fps,
                      outputFileName=SAMPLE,
                      plotting=True)

You can move the video to the desired path like below.

```sh
mv /tmp/MOT/tracking_output/tracking_MOT17-04.mp4 ~/Desktop/
```

## A simple example from scratch

Load the MOT data first via the API `loadLabel`. You can also use `help` to inspect its parameters.

In [None]:
LABELS, DFPERSONS = loadLabel(src=LABEL_FILE_PATH, format_style="metrics_dict")

In [None]:
help(loadLabel)

`LABELS` is a dictionary whose keys are frame IDs and whose values are the object detection results of that frame. Each result keeps the localization and visibility in a list of the form `[x, y, w, h, visibility]`.

In [None]:
len(LABELS[1]), LABELS[1]
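
To illustrate this structure without the dataset, the sketch below builds a small dictionary in the same layout and filters it by visibility. It assumes the `[x, y, w, h, visibility]` layout described above. All values are made up, and `demo_labels` and `filter_by_visibility` are hypothetical names rather than ioutracker APIs.

```python
# Hypothetical frames with made-up boxes in the [x, y, w, h, visibility] layout.
demo_labels = {
    1: [[100.0, 50.0, 40.0, 80.0, 1.0], [300.0, 60.0, 35.0, 90.0, 0.15]],
    2: [[104.0, 51.0, 40.0, 80.0, 0.9]],
}


def filter_by_visibility(labels, threshold):
    """Keep only the boxes whose visibility (index 4) is at least `threshold`."""
    return {
        frame_id: [box for box in boxes if box[4] >= threshold]
        for frame_id, boxes in labels.items()
    }


visible = filter_by_visibility(demo_labels, threshold=0.2)
for frame_id in sorted(visible):
    print("Frame {}: {} visible box(es)".format(frame_id, len(visible[frame_id])))
```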

`DFPERSONS` is a pandas DataFrame that has been processed to filter out unnecessary columns.

In [None]:
DFPERSONS.head(5)

You can instantiate an IOUTracker to start the algorithm.

In [None]:
help(IOUTracker)

Create an IOUTracker instance that implements the IOU tracker algorithm. In this example, we use the default ID increment mechanism to get the track ID for each box.

In [None]:
iouTracks = IOUTracker(detection_conf=0.2,
                       iou_threshold=0.2,
                       min_t=fps,
                       track_min_conf=0.5, 
                       assignedTID=True)

differs = 0
allFrameIds = list(LABELS.keys())
start, end = min(allFrameIds), max(allFrameIds) + 1

for frameIdx in range(start, end, 1):
  # apply the IOUTracker algorithm
  # you can set runPreviousVersion to choose the latest or previous version
  # the latest version saves lots of time to return the tracker ID
  detectionMapping, finished = iouTracks(LABELS[frameIdx], returnFinishedTrackers=True, runPreviousVersion=False)

  if frameIdx % (fps * 5) == 0:
    print("Frame: {}".format(frameIdx))
    for bboxIdx in range(len(LABELS[frameIdx])):
      print("BBox: {}, and its Track ID: {}".format(LABELS[frameIdx][bboxIdx], detectionMapping[bboxIdx]["tid"]))
    print("...")

In the next example, the variable `tid_count` is used to assign a unique ID to each track.

In this implementation, the IOUTracker is designed to take object detection results frame by frame, not a whole video. It keeps the information of each track. You can access the active tracks via the `get_active_tracks()` method, and the finished tracks via the `get_finished_tracks()` method.

You can also access the `tid` attribute of each track to get its track ID.

In [None]:
iouTracks = IOUTracker(detection_conf=0.2,
                       iou_threshold=0.2,
                       min_t=fps,
                       track_min_conf=0.5, 
                       assignedTID=False)

tid_count = 1

# iterate over every frame ID in order, including the last one
for label in sorted(LABELS.keys()):
  # iou tracker
  iouTracks.read_detections_per_frame(detections=LABELS[label])

  active_tracks = iouTracks.get_active_tracks()
  finished_tracks = iouTracks.get_finished_tracks()

  if label % 50 == 0:
    print("Frame {} tracker Info: active {}, finished {}".format(label, len(active_tracks), len(finished_tracks)))

  # simple way to assign the tracker ID
  for act_track in active_tracks:
    if not act_track.tid:
      # assign a track ID (it can also be used to pick the color)
      act_track.tid = tid_count
      tid_count += 1
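
For intuition about what happens to each frame's detections, the following is a minimal, self-contained sketch of the greedy IOU tracking step described in the referenced article. It is not the ioutracker implementation. The function names are hypothetical, the fifth value of each box is treated as the detection score, and the mapping of `sigma_l`, `sigma_iou`, `sigma_h`, and `t_min` to `detection_conf`, `iou_threshold`, `track_min_conf`, and `min_t` is an assumption.

```python
def iou(a, b):
    """IOU of two boxes whose first four values are x, y, w, h."""
    iw = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def track_frames(frames, sigma_l=0.2, sigma_iou=0.2, sigma_h=0.5, t_min=2):
    """Greedy IOU tracking over a list of frames.

    Each frame is a list of [x, y, w, h, score] detections.
    Returns the finished tracks, each a list of boxes.
    """
    active, finished = [], []

    def keep(track):
        # A track is kept if it is long enough and confident enough.
        return len(track) >= t_min and max(box[4] for box in track) >= sigma_h

    for detections in frames:
        # Drop low-confidence detections first.
        candidates = [d for d in detections if d[4] >= sigma_l]
        still_active = []
        for track in active:
            # Pick the detection that overlaps most with the track's last box.
            best = max(candidates, key=lambda d: iou(track[-1], d), default=None)
            if best is not None and iou(track[-1], best) >= sigma_iou:
                track.append(best)
                candidates.remove(best)
                still_active.append(track)
            elif keep(track):
                finished.append(track)
        # Unmatched detections start new tracks.
        active = still_active + [[d] for d in candidates]

    # Flush the tracks that are still active after the last frame.
    finished.extend(t for t in active if keep(t))
    return finished


# Two objects drifting to the right over three frames (made-up values).
frames = [
    [[0, 0, 10, 10, 0.9], [100, 0, 10, 10, 0.8]],
    [[1, 0, 10, 10, 0.9], [101, 0, 10, 10, 0.8]],
    [[2, 0, 10, 10, 0.9], [102, 0, 10, 10, 0.1]],
]
tracks = track_frames(frames)
print([len(t) for t in tracks])
```

In the made-up data, the second object's last detection falls below `sigma_l`, so its track ends after two frames while the first object's track spans all three.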

## Metrics

ExampleEvaluateMOTDatasets helps you evaluate on each dataset or each video. Here we use the same ground truth data as the predictions to test the functionality due to the lack of a detector.

Notice that this step takes a long time to process.

In [None]:
Predictions, _ = loadLabel(src=LABEL_FILE_PATH, is_path=True, load_Pedestrian=True, load_Static_Person=True,
    visible_thresholde=0.2, format_style="metrics_dict")

In [None]:
evalMOT = ExampleEvaluateMOTDatasets(LABEL_FILE_PATH, predictions=Predictions, printOnScreen=True)

You can also evaluate a tracking result against the ground truth. The tracking result must contain the tracking ID.

In [None]:
def EvaluateFrameByFrame(groundTruth, predictions, printOnScreen=False):
  evalFrame = EvaluateByFrame(requiredTracking=False)  
  
  gtFKeys = list(groundTruth.keys())
  predFKeys = list(predictions.keys())
  numTNOnFrame = 0
  
  ttlFid = list(set(gtFKeys + predFKeys))
  ttlFid = sorted(ttlFid)
  start = ttlFid[0]
  end = ttlFid[-1] + 1
  logging.warning("start {} and end {}".format(start, end))
  
  for fid in tqdm.trange(start, end, 1):
    if fid not in gtFKeys and fid not in predFKeys:
      # skip the true negative
      numTNOnFrame += 1
      continue
    
    GTFrameInfo = groundTruth[fid] if fid in gtFKeys else [[]]
    prediction = predictions[fid] if fid in predFKeys else [[]]
    
    evalFrame.evaluateOnPredsWithTrackerID(GTFrameInfo, prediction)
    
  metaRes = evalFrame.getMetricsMeta(printOnScreen=printOnScreen)
  cmRes = evalFrame.getCM(printOnScreen=printOnScreen)
  motaRes = evalFrame.getMOTA(printOnScreen=printOnScreen)
  trajRes = evalFrame.getTrackQuality(printOnScreen=printOnScreen)
  
  if printOnScreen: print("The true negatives: {} frames".format(numTNOnFrame))
  
  return metaRes, cmRes, motaRes, trajRes
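
For reference, the CLEAR MOT definition of MOTA used by the MOT16 benchmark combines false negatives, false positives, and identity switches over all frames, normalized by the number of ground-truth objects. The helper below is a hypothetical sketch of that formula with made-up counts. It is not the code behind `getMOTA`.

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """MOTA = 1 - (FN + FP + IDSW) / GT, with all counts summed over all frames."""
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


# Made-up counts: 16 errors against 100 ground-truth objects.
print(mota(false_negatives=10, false_positives=5, id_switches=1, num_gt=100))
```

MOTA can be negative when the number of errors exceeds the number of ground-truth objects.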

In [None]:
def transformUID(dicts, dataType="int"):
  import copy
  # deep copy so the nested lists of the input are not modified in place
  dictCopy = copy.deepcopy(dicts)
  for key in list(dictCopy.keys()):
    if dataType in ["int", "integer"]:
      for objIdx in range(len(dictCopy[key])):
        dictCopy[key][objIdx][5] = int(dictCopy[key][objIdx][5])
    elif dataType in ["string", "str"]:
      for objIdx in range(len(dictCopy[key])):
        dictCopy[key][objIdx][5] = str(dictCopy[key][objIdx][5])
  return dictCopy

In [None]:
groundTruth, _ = loadLabel(src=LABEL_FILE_PATH, is_path=True, load_Pedestrian=True, load_Static_Person=True,
    visible_thresholde=0.2, format_style="metrics_dict")

predictions, _ = loadLabel(src=LABEL_FILE_PATH, is_path=True, load_Pedestrian=True, load_Static_Person=True,
    visible_thresholde=0.2, format_style="metrics_dict")

# transform the datatype of UID
groundTruth = transformUID(groundTruth, dataType="int")
predictions = transformUID(predictions, dataType="int")

evalMOT = EvaluateFrameByFrame(groundTruth=groundTruth, predictions=predictions, printOnScreen=True)

The `EvaluateOnMOTDatasets` class helps you evaluate multiple datasets.

In [None]:
evalMOTData = EvaluateOnMOTDatasets()

# it is simple to pass the whole package of results into the multiple-dataset evaluator
evalMOTData(evalMOT)
evalMOTRes = evalMOTData.getResults()

You can also evaluate the metrics on each MOT dataset and then summarize them.

In [None]:
evalMOTData = EvaluateOnMOTDatasets()
for key in FRAME_FPS:
  LABEL_FILE_PATH = os.path.join(LABEL_PATH, key, "gt", "gt.txt")
  assert os.path.exists(LABEL_FILE_PATH), "{} is not found.".format(LABEL_FILE_PATH)
  print("Sample: {}".format(key))
  
  # predictions is set to None here, so the ground truth is used as the predictions
  evalMOT = ExampleEvaluateMOTDatasets(LABEL_FILE_PATH, predictions=None, printOnScreen=True)
  evalMOTData(evalMOT)
  print("", end="\n\n")
evalMOTRes = evalMOTData.getResults()