# Multiple Object Tracking Datasets

Every dataset has its own class that reads its annotations. Each dataset class must extend our custom `BaseDatasetTracking` class.

The `BaseDatasetTracking` class has the following methods:

* `generate_dataset_statistics`: generates a summary of the dataset (e.g. number of videos, number of annotations, etc.)
* `save_dataset_statistics`: saves the summary to a `json` file
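To make the two methods concrete, here is a minimal sketch of how a dataset class might plug into this interface. The real `BaseDatasetTracking` implementation is not shown in this notebook, so `BaseDatasetTrackingSketch`, `ToyTrackingDataset`, the in-memory `videos`/`annotations` lists, and the `<save_path>/<dataset_name>/<file_name>` output layout are all assumptions made for illustration. Only the method names and the keyword arguments of `save_dataset_statistics` are taken from the cells in this notebook.

```python
import json
import tempfile
from pathlib import Path


class BaseDatasetTrackingSketch:
    """Hypothetical stand-in for `BaseDatasetTracking` (not the real class)."""

    def __init__(self, videos, annotations):
        # Assumed in-memory layout: lists of COCO-style dicts.
        self.videos = videos
        self.annotations = annotations
        self.stats = {}

    def generate_dataset_statistics(self):
        # Summarise the dataset with a few simple counts.
        self.stats = {
            "num_videos": len(self.videos),
            "num_annotations": len(self.annotations),
        }
        return self.stats

    def save_dataset_statistics(self, save_path, dataset_name, file_name):
        # Assumed output layout: <save_path>/<dataset_name>/<file_name>.
        out_dir = Path(save_path) / dataset_name
        out_dir.mkdir(parents=True, exist_ok=True)
        out_file = out_dir / file_name
        out_file.write_text(json.dumps(self.stats, indent=2))
        return out_file


class ToyTrackingDataset(BaseDatasetTrackingSketch):
    """Hypothetical subclass standing in for a dataset-specific reader."""


toy = ToyTrackingDataset(
    videos=[{"id": 1}, {"id": 2}],
    annotations=[
        {"id": 10, "video_id": 1},
        {"id": 11, "video_id": 2},
        {"id": 12, "video_id": 2},
    ],
)
toy_stats = toy.generate_dataset_statistics()

# Write to a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    saved_file = toy.save_dataset_statistics(tmp_dir, "toy", "toy_stats.json")
    reloaded_stats = json.loads(saved_file.read_text())
```

The dataset classes used in the rest of this notebook are called the same way: construct, generate the statistics, then save them.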




In [1]:
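# expose the project root so that project modules can be imported
import os
import sys

ROOT_DIR = os.getcwd()
# walk up the directory tree until the 'DatasetsStatistics' folder is found
while os.path.basename(ROOT_DIR) != 'DatasetsStatistics':
    parent = os.path.abspath(os.path.join(ROOT_DIR, '..'))
    if parent == ROOT_DIR:  # reached the filesystem root without a match
        raise FileNotFoundError("'DatasetsStatistics' not found above the current directory")
    ROOT_DIR = parent
sys.path.insert(0, ROOT_DIR)
os.chdir(ROOT_DIR)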

In [2]:
TASK='MOT'

## **1. MOT 2017 dataset**

``` python

from pathlib import Path

from bases.mot_coco import MOT_IN_COCO

dataset_year = 2017
dataset_stem = f"MOT{dataset_year}"
split = "train"
subset = f"{split}_cocoformat"
annotation_file = Path(f"./data/{dataset_stem}/annotations/{subset}.json")

# load the COCO-format annotations
D = MOT_IN_COCO(annotation_file=str(annotation_file))

# generate and load stats
D.generate_dataset_statistics()

# save the stats (TASK is defined in an earlier cell)
D.save_dataset_statistics(
    save_path=f"./summaries/{TASK}",
    dataset_name=dataset_stem,
    file_name=f"{subset}_stats.json",
)
print("[INFO] Saved")

```

## **2. MOT 2020 dataset**

``` python

from pathlib import Path

from bases.mot_coco import MOT_IN_COCO

dataset_year = 2020
dataset_stem = f"MOT{dataset_year}"
split = "train"
subset = f"{split}_cocoformat"
annotation_file = Path(f"./data/{dataset_stem}/annotations/{subset}.json")

# load the COCO-format annotations
D = MOT_IN_COCO(annotation_file=str(annotation_file))

# generate and load stats
D.generate_dataset_statistics()

# save the stats (TASK is defined in an earlier cell)
D.save_dataset_statistics(
    save_path=f"./summaries/{TASK}",
    dataset_name=dataset_stem,
    file_name=f"{subset}_stats.json",
)
print("[INFO] Saved")

```

## **3. VisDrone MOT dataset**

``` python

from pathlib import Path

from bases.visdrone_mot import VisDroneMOT

tag = "mot"
dataset_stem = f"visdrone_{tag}"
split = "visdrone_mot"
subset = f"{split}_cocoformat"
annotation_file = Path(f"./data/{dataset_stem}/converted_annotations/{subset}.json")

# load the converted (COCO-format) annotations
D = VisDroneMOT(annotation_file=str(annotation_file))

# generate and load stats
D.generate_dataset_statistics()

# save the stats (TASK is defined in an earlier cell)
D.save_dataset_statistics(
    save_path=f"./summaries/{TASK}",
    dataset_name=dataset_stem,
    file_name=f"{subset}_stats.json",
)
print("[INFO] Saved")

```

## **4. SKYDATA MOT dataset**

``` python

from pathlib import Path

from bases.skydata_vis_dataset import SkyDataVis

TASK = 'MOT'

dataset_year = ""  # no year suffix for this dataset
dataset_stem = f"skydata{dataset_year}"
split = "train"
# alternative subset: "train_SKYVIS_ds5_fr3_alldata"
# resolved path: data/skydata/annotations/train_SKYVIS_3_alldata.json
subset = f"{split}_SKYVIS_3_alldata"
annotation_file = Path(f"./data/{dataset_stem}/annotations/{subset}.json")

# load the annotations
D = SkyDataVis(annotation_file=str(annotation_file))

# generate and load stats
D.generate_dataset_statistics()

# save the stats
D.save_dataset_statistics(
    save_path=f"./summaries/{TASK}",
    dataset_name=dataset_stem,
    file_name=f"{subset}_stats.json",
)
print("[INFO] Saved")

```

## **5. KAIST dataset**

``` python

from pathlib import Path

from bases.kaist_mot import KaistMOT

TASK = 'MOT'

dataset_year = ""  # no year suffix for this dataset
dataset_stem = f"kaist_pedestrian{dataset_year}"
split = "kaist"
subset = f"{split}_vid_cocoformat"
annotation_file = Path(f"./data/{dataset_stem}/converted_annotations/{subset}.json")

# load the converted (COCO-format) annotations
D = KaistMOT(annotation_file=str(annotation_file))

# generate and load stats
D.generate_dataset_statistics()

# save the stats
D.save_dataset_statistics(
    save_path=f"./summaries/{TASK}",
    dataset_name=dataset_stem,
    file_name=f"{subset}_stats.json",
)
print("[INFO] Saved")

```

[INFO] loading annotations into memory...
Done (t=0.55s)
creating index...
indexing videos...
updating annotations with video_id...
indexing annotations...


  0%|          | 0/108132 [00:00<?, ?it/s]


KeyError: 'instance_id'
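
The `KeyError: 'instance_id'` raised during indexing suggests that at least one annotation has no `instance_id` field. The full traceback is not shown, so this is a guess rather than a confirmed diagnosis. A minimal sketch for checking it is given below. It assumes the annotations are a list of COCO-style dicts; `count_missing_key` is a hypothetical helper, not part of this repository, and the `track_id` key in the toy data is invented for illustration.

```python
from collections import Counter


def count_missing_key(annotations, key="instance_id"):
    """Count annotations that lack `key`, grouped by the keys they do have."""
    missing = Counter()
    for ann in annotations:
        if key not in ann:
            missing[tuple(sorted(ann))] += 1
    return missing


# Toy annotations for illustration only. In practice they would be read from
# the "annotations" list of the converted annotation file (assumed layout).
sample_annotations = [
    {"id": 1, "image_id": 1, "instance_id": 7},
    {"id": 2, "image_id": 1, "track_id": 7},  # hypothetical alternative key
    {"id": 3, "image_id": 2, "track_id": 8},
]

missing_report = count_missing_key(sample_annotations)
for present_keys, count in missing_report.items():
    print(f"{count} annotations without 'instance_id'; keys present: {present_keys}")
```

If the report shows that the identity is stored under a different key, the conversion script or the dataset class would need to map it to `instance_id` before indexing.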