[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/open-mmlab/mmtracking/blob/master/demo/MMTracking_Tutorial.ipynb)

# **Welcome to MMTracking**

In this tutorial, you will learn to:
+ Install MMTracking.
+ Perform inference with pretrained weights in MMTracking.
+ Train a new MOT model with a toy dataset.

Let's start!

## **Install MMTracking**

In [1]:
# Check nvcc version
!nvcc -V
# Check GCC version
!gcc --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Oct_12_20:09:46_PDT_2020
Cuda compilation tools, release 11.1, V11.1.105
Build cuda_11.1.TC455_06.29190527_0
gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.



In [2]:
# install PyTorch
!pip install torch==1.10.0+cu111 torchvision==0.11.0+cu111 torchaudio==0.10.0 -f https://download.pytorch.org/whl/torch_stable.html

# install MMEngine
!pip install mmengine

# install MMCV
!pip install 'mmcv>=2.0.0rc1' -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.10.0/index.html

# install MMDetection
!pip install 'mmdet>=3.0.0rc0'

# clone the MMTracking repository
!git clone -b 1.x https://github.com/open-mmlab/mmtracking.git
%cd mmtracking

# install MMTracking and its dependencies
!pip install -r requirements/build.txt
!pip install -e .
# used for MOT evaluation
!pip install git+https://github.com/JonathonLuiten/TrackEval.git

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Looking in links: https://download.pytorch.org/whl/torch_stable.html
Collecting torch==1.10.0+cu111
  Downloading https://download.pytorch.org/whl/cu111/torch-1.10.0%2Bcu111-cp37-cp37m-linux_x86_64.whl (2137.6 MB)

In [3]:
from mmengine.utils.dl_utils import collect_env
collect_env()

OrderedDict([('sys.platform', 'linux'),
             ('Python', '3.7.13 (default, Apr 24 2022, 01:04:09) [GCC 7.5.0]'),
             ('CUDA available', True),
             ('numpy_random_seed', 2147483648),
             ('GPU 0', 'Tesla T4'),
             ('CUDA_HOME', '/usr/local/cuda'),
             ('NVCC', 'Cuda compilation tools, release 11.1, V11.1.105'),
             ('GCC',
              'x86_64-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0'),
             ('PyTorch', '1.10.0+cu111'),
             ('PyTorch compiling details',
              'PyTorch built with:\n  - GCC 7.3\n  - C++ Version: 201402\n  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications\n  - Intel(R) MKL-DNN v2.2.3 (Git Hash 7336ca9f055cf1bfa13efb658fe15dc9b41f0740)\n  - OpenMP 201511 (a.k.a. OpenMP 4.5)\n  - LAPACK is enabled (usually provided by MKL)\n  - NNPACK is enabled\n  - CPU capability usage: AVX2\n  - CUDA Runtime 11.1\n  - NVCC architect

In [4]:
# Check PyTorch installation
import torch, torchvision
print(torch.__version__, torch.cuda.is_available())

# Check MMCV installation
from mmcv.ops import get_compiling_cuda_version, get_compiler_version
print(get_compiling_cuda_version())
print(get_compiler_version())

# Check MMDetection installation
import mmdet
print(mmdet.__version__)

# Check MMTracking installation
import mmtrack
print(mmtrack.__version__)

1.10.0+cu111 True
11.1
GCC 7.3
3.0.0rc0
1.0.0rc0
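
The printed versions can also be checked programmatically. The sketch below is a simplified comparison and not part of MMTracking: the helper names `release_tuple` and `meets_minimum` are hypothetical, and only the numeric release components are compared, so pre-release tags such as `rc0` and local tags such as `+cu111` are ignored.

```python
import re


def release_tuple(version):
    """Return the leading numeric release components, e.g. '1.10.0+cu111' -> (1, 10, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"no numeric release found in {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum):
    """Compare release tuples only; pre-release and local tags are ignored."""
    a, b = release_tuple(installed), release_tuple(minimum)
    # Pad the shorter tuple with zeros so that '2.0' and '2.0.0' compare as equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


# Versions printed by the cell above, checked against the pins used during installation.
print(meets_minimum("1.10.0+cu111", "1.10.0"))
print(meets_minimum("3.0.0rc0", "3.0.0"))
```

Because pre-release tags are dropped, `3.0.0rc0` is treated like `3.0.0` here; a stricter check would need a full version parser.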


## **Perform inference**

In [5]:
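# unset the proxy for downloading the pretrained models (optional)
# `!unset` runs in a throwaway subshell and would not affect later commands, so clear the variables from Python instead
import os
os.environ.pop('https_proxy', None)
os.environ.pop('http_proxy', None)

# download checkpoints
!mkdir -p checkpoints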
!wget -c https://download.openmmlab.com/mmtracking/vid/selsa/selsa_faster_rcnn_r50_dc5_1x_imagenetvid/selsa_faster_rcnn_r50_dc5_1x_imagenetvid_20201227_204835-2f5a4952.pth -P ./checkpoints
!wget -c https://download.openmmlab.com/mmtracking/sot/siamese_rpn/siamese_rpn_r50_1x_lasot/siamese_rpn_r50_1x_lasot_20211203_151612-da4b3c66.pth -P ./checkpoints
!wget -c https://download.openmmlab.com/mmtracking/vis/masktrack_rcnn/masktrack_rcnn_r50_fpn_12e_youtubevis2019/masktrack_rcnn_r50_fpn_12e_youtubevis2019_20211022_194830-6ca6b91e.pth -P ./checkpoints

--2022-09-06 08:05:44--  https://download.openmmlab.com/mmtracking/vid/selsa/selsa_faster_rcnn_r50_dc5_1x_imagenetvid/selsa_faster_rcnn_r50_dc5_1x_imagenetvid_20201227_204835-2f5a4952.pth
Resolving download.openmmlab.com (download.openmmlab.com)... 47.89.140.71
Connecting to download.openmmlab.com (download.openmmlab.com)|47.89.140.71|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 282801031 (270M) [application/octet-stream]
Saving to: ‘./checkpoints/selsa_faster_rcnn_r50_dc5_1x_imagenetvid_20201227_204835-2f5a4952.pth’


2022-09-06 08:06:16 (8.70 MB/s) - ‘./checkpoints/selsa_faster_rcnn_r50_dc5_1x_imagenetvid_20201227_204835-2f5a4952.pth’ saved [282801031/282801031]

--2022-09-06 08:06:16--  https://download.openmmlab.com/mmtracking/sot/siamese_rpn/siamese_rpn_r50_1x_lasot/siamese_rpn_r50_1x_lasot_20211203_151612-da4b3c66.pth
Resolving download.openmmlab.com (download.openmmlab.com)... 47.89.140.71
Connecting to download.openmmlab.com (download.openmmlab.com)
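
The checkpoint file names end in a short hex suffix (for example `2f5a4952`). Assuming this suffix is the first eight hex digits of the file's SHA-256 digest, which is the convention of OpenMMLab's model-publishing scripts but is not stated in this tutorial, a download can be sanity-checked as sketched below. `sha256_prefix` and `checkpoint_matches_name` are hypothetical helpers, not MMTracking APIs.

```python
import hashlib
import re
from pathlib import Path


def sha256_prefix(path, length=8, chunk_size=1 << 20):
    """Hash the file in chunks and return the first `length` hex digits."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()[:length]


def checkpoint_matches_name(path):
    """Return True/False if the name carries a '-<8 hex>.pth' suffix, else None."""
    match = re.search(r"-([0-9a-f]{8})\.pth$", Path(path).name)
    if match is None:
        return None  # no suffix to compare against
    return sha256_prefix(path) == match.group(1)
```

If the assumption holds, calling `checkpoint_matches_name` on a fully downloaded file in `./checkpoints` should return `True`; a `False` result would suggest an incomplete or corrupted download.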

In [7]:
# run mot demo
import mmcv
import mmengine
import tempfile
from mmtrack.apis import inference_mot, init_model
from mmtrack.utils import register_all_modules
from mmtrack.registry import VISUALIZERS

register_all_modules(init_default_scope=True)
mot_config = './configs/mot/deepsort/deepsort_faster-rcnn_r50_fpn_8xb2-4e_mot17halftrain_test-mot17halfval.py'
input_video = './demo/demo.mp4'
imgs = mmcv.VideoReader(input_video)
# build the model from a config file
mot_model = init_model(mot_config, device='cuda:0')

# build the visualizer; give each visualizer instance a distinct name
mot_model.cfg.visualizer.name = 'mot_visualizer'
visualizer = VISUALIZERS.build(mot_model.cfg.visualizer)
visualizer.dataset_meta = mot_model.dataset_meta

prog_bar = mmengine.ProgressBar(len(imgs))
out_dir = tempfile.TemporaryDirectory()
out_path = out_dir.name

# test and show/save the images
for i, img in enumerate(imgs):
    result = inference_mot(mot_model, img, frame_id=i)
    visualizer.add_datasample(
            'mot',
            img[..., ::-1],
            data_sample=result,
            show=False,
            out_file=f'{out_path}/{i:06d}.jpg',
            wait_time=float(1 / int(imgs.fps)),
            step=i)
    prog_bar.update()

output = './demo/mot.mp4'
print(f'\n making the output video at {output} with a FPS of {imgs.fps}')
mmcv.frames2video(out_path, output, fps=imgs.fps, fourcc='mp4v')
out_dir.cleanup()

09/06 08:08:27 - mmengine - INFO - load model from: https://download.openmmlab.com/mmtracking/mot/faster_rcnn/faster-rcnn_r50_fpn_4e_mot17-half-64ee2ed4.pth
09/06 08:08:27 - mmengine - INFO - http loads checkpoint from path: https://download.openmmlab.com/mmtracking/mot/faster_rcnn/faster-rcnn_r50_fpn_4e_mot17-half-64ee2ed4.pth
09/06 08:08:27 - mmengine - INFO - load model from: https://download.openmmlab.com/mmtracking/mot/reid/tracktor_reid_r50_iter25245-a452f51f.pth
09/06 08:08:27 - mmengine - INFO - http loads checkpoint from path: https://download.openmmlab.com/mmtracking/mot/reid/tracktor_reid_r50_iter25245-a452f51f.pth

missing keys in source state_dict: head.bn.weight, head.bn.bias, head.bn.running_mean, head.bn.running_var, head.classifier.weight, head.classifier.bias

[                                                  ] 0/8, elapsed: 0s, ETA:



[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 8/8, 1.1 task/s, elapsed: 7s, ETA:     0s
 making the output video at ./demo/mot.mp4 with a FPS of 3.0
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 8/8, 10.7 task/s, elapsed: 1s, ETA:     0s
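
The demo writes one image per frame into a temporary directory and then assembles the images into a video. The standard-library sketch below isolates that pattern without MMTracking: `write_frames` is a hypothetical helper, the byte strings stand in for rendered frames, and a `with` block replaces the explicit `out_dir.cleanup()` call. The zero-padded `{i:06d}.jpg` names keep lexicographic order equal to frame order.

```python
import os
import tempfile


def write_frames(frames, out_path):
    """Write each frame's bytes to <out_path>/<index:06d>.jpg and return the sorted file names."""
    for i, data in enumerate(frames):
        with open(os.path.join(out_path, f"{i:06d}.jpg"), "wb") as f:
            f.write(data)
    return sorted(os.listdir(out_path))


# Placeholder payloads; in the demo these would be the visualizer's rendered images.
fake_frames = [b"frame-%d" % i for i in range(12)]

with tempfile.TemporaryDirectory() as out_path:
    names = write_frames(fake_frames, out_path)
    print(names[0], names[-1])  # 000000.jpg 000011.jpg

# The directory and its files are removed when the block exits.
print(os.path.exists(out_path))  # False
```

Using the context manager means the temporary frames are cleaned up even if an exception interrupts the loop.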


In [8]:
# run vis demo
from mmtrack.apis import inference_mot
vis_config = './configs/vis/masktrack_rcnn/masktrack-rcnn_mask-rcnn_r50_fpn_8xb1-12e_youtubevis2019.py'
vis_checkpoint = './checkpoints/masktrack_rcnn_r50_fpn_12e_youtubevis2019_20211022_194830-6ca6b91e.pth'
# build the model from a config file and a checkpoint file
vis_model = init_model(vis_config, vis_checkpoint, device='cuda:0')

# build the visualizer; give each visualizer instance a distinct name
vis_model.cfg.visualizer.name = 'vis_visualizer'
visualizer = VISUALIZERS.build(vis_model.cfg.visualizer)
visualizer.dataset_meta = vis_model.dataset_meta

imgs = mmcv.VideoReader(input_video)
prog_bar = mmengine.ProgressBar(len(imgs))
out_dir = tempfile.TemporaryDirectory()
out_path = out_dir.name
for i, img in enumerate(imgs):
    result = inference_mot(vis_model, img, frame_id=i)
    visualizer.add_datasample(
            'vis',
            img[..., ::-1],
            data_sample=result,
            show=False,
            out_file=f'{out_path}/{i:06d}.jpg',
            wait_time=float(1 / int(imgs.fps)),
            step=i)
    prog_bar.update()
output = './demo/vis.mp4'
print(f'\n making the output video at {output} with a FPS of {imgs.fps}')
mmcv.frames2video(out_path, output, fps=imgs.fps, fourcc='mp4v')
out_dir.cleanup()

09/06 08:09:05 - mmengine - INFO - load model from: https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_fpn_1x_coco/mask_rcnn_r50_fpn_1x_coco_20200205-d4b0c5d6.pth
09/06 08:09:05 - mmengine - INFO - http loads checkpoint from path: https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_fpn_1x_coco/mask_rcnn_r50_fpn_1x_coco_20200205-d4b0c5d6.pth


Downloading: "https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_fpn_1x_coco/mask_rcnn_r50_fpn_1x_coco_20200205-d4b0c5d6.pth" to /root/.cache/torch/hub/checkpoints/mask_rcnn_r50_fpn_1x_coco_20200205-d4b0c5d6.pth


  0%|          | 0.00/170M [00:00<?, ?B/s]


size mismatch for roi_head.bbox_head.fc_cls.weight: copying a param with shape torch.Size([81, 1024]) from checkpoint, the shape in current model is torch.Size([41, 1024]).
size mismatch for roi_head.bbox_head.fc_cls.bias: copying a param with shape torch.Size([81]) from checkpoint, the shape in current model is torch.Size([41]).
size mismatch for roi_head.bbox_head.fc_reg.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([160, 1024]).
size mismatch for roi_head.bbox_head.fc_reg.bias: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([160]).
size mismatch for roi_head.mask_head.conv_logits.weight: copying a param with shape torch.Size([80, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([40, 256, 1, 1]).
size mismatch for roi_head.mask_head.conv_logits.bias: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model 
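
These size-mismatch messages are expected: the detector is first initialized from a COCO-pretrained Mask R-CNN checkpoint (see the `load model from` line), whose class count differs from the current model's. Under the common Faster R-CNN / Mask R-CNN head layout (an assumption, since the config is not printed here), the classification layer has `num_classes + 1` outputs, the class-specific box regressor has `4 * num_classes`, and the mask logits have `num_classes` channels. The hypothetical helper below reproduces the weight shapes in the log for 80 and 40 classes.

```python
def head_shapes(num_classes, feat_dim=1024, mask_channels=256):
    """Weight shapes of the class-dependent layers for a given number of classes."""
    return {
        "fc_cls.weight": (num_classes + 1, feat_dim),  # +1 for the background class
        "fc_reg.weight": (4 * num_classes, feat_dim),  # 4 box deltas per class
        "conv_logits.weight": (num_classes, mask_channels, 1, 1),
    }


coco = head_shapes(80)  # shapes stored in the COCO checkpoint
vis = head_shapes(40)   # shapes of the current model in the log
for name in coco:
    print(f"{name}: checkpoint {coco[name]} vs model {vis[name]}")
```

Layers whose shapes do not depend on the class count load normally; only these class-dependent layers are skipped at this stage.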

In [9]:
# run vid demo
from mmtrack.apis import inference_vid
vid_config = './configs/vid/selsa/selsa_faster-rcnn_r50-dc5_8xb1-7e_imagenetvid.py'
vid_checkpoint = './checkpoints/selsa_faster_rcnn_r50_dc5_1x_imagenetvid_20201227_204835-2f5a4952.pth'
# build the model from a config file and a checkpoint file
vid_model = init_model(vid_config, vid_checkpoint, device='cuda:0')

# build the visualizer; give each visualizer instance a distinct name
vid_model.cfg.visualizer.name = 'vid_visualizer'
visualizer = VISUALIZERS.build(vid_model.cfg.visualizer)
visualizer.dataset_meta = vid_model.dataset_meta

imgs = mmcv.VideoReader(input_video)
prog_bar = mmengine.ProgressBar(len(imgs))
out_dir = tempfile.TemporaryDirectory()
out_path = out_dir.name
for i, img in enumerate(imgs):
    result = inference_vid(vid_model, img, frame_id=i)
    visualizer.add_datasample(
            'vid',
            img[..., ::-1],
            data_sample=result,
            show=False,
            out_file=f'{out_path}/{i:06d}.jpg',
            wait_time=float(1 / int(imgs.fps)),
            step=i)
    prog_bar.update()
output = './demo/vid.mp4'
print(f'\n making the output video at {output} with a FPS of {imgs.fps}')
mmcv.frames2video(out_path, output, fps=imgs.fps, fourcc='mp4v')
out_dir.cleanup()

09/06 08:09:34 - mmengine - INFO - load model from: torchvision://resnet50
09/06 08:09:34 - mmengine - INFO - torchvision loads checkpoint from path: torchvision://resnet50


Downloading: "https://download.pytorch.org/models/resnet50-0676ba61.pth" to /root/.cache/torch/hub/checkpoints/resnet50-0676ba61.pth


  0%|          | 0.00/97.8M [00:00<?, ?B/s]


unexpected key in source state_dict: fc.weight, fc.bias

local loads checkpoint from path: ./checkpoints/selsa_faster_rcnn_r50_dc5_1x_imagenetvid_20201227_204835-2f5a4952.pth
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 8/8, 1.1 task/s, elapsed: 7s, ETA:     0s
 making the output video at ./demo/vid.mp4 with a FPS of 3.0
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 8/8, 11.1 task/s, elapsed: 1s, ETA:     0s


In [10]:
# run sot demo
from mmtrack.apis import inference_sot
sot_config = './configs/sot/siamese_rpn/siamese-rpn_r50_8xb28-20e_imagenetvid-imagenetdet-coco_test-lasot.py'
sot_checkpoint = './checkpoints/siamese_rpn_r50_1x_lasot_20211203_151612-da4b3c66.pth'
# build the model from a config file and a checkpoint file
sot_model = init_model(sot_config, sot_checkpoint, device='cuda:0')

# build the visualizer; give each visualizer instance a distinct name
sot_model.cfg.visualizer.name = 'sot_visualizer'
visualizer = VISUALIZERS.build(sot_model.cfg.visualizer)
visualizer.dataset_meta = sot_model.dataset_meta

init_bbox = [371, 411, 450, 646]
imgs = mmcv.VideoReader(input_video)
prog_bar = mmengine.ProgressBar(len(imgs))
out_dir = tempfile.TemporaryDirectory()
out_path = out_dir.name
for i, img in enumerate(imgs):
    result = inference_sot(sot_model, img, init_bbox, frame_id=i)
    visualizer.add_datasample(
            'sot',
            img[..., ::-1],
            data_sample=result,
            show=False,
            out_file=f'{out_path}/{i:06d}.jpg',
            wait_time=float(1 / int(imgs.fps)),
            step=i)
    prog_bar.update()
output = './demo/sot.mp4'
print(f'\n making the output video at {output} with a FPS of {imgs.fps}')
mmcv.frames2video(out_path, output, fps=imgs.fps, fourcc='mp4v')
out_dir.cleanup()

09/06 08:09:44 - mmengine - INFO - load model from: https://download.openmmlab.com/mmtracking/pretrained_weights/sot_resnet50.model
09/06 08:09:44 - mmengine - INFO - http loads checkpoint from path: https://download.openmmlab.com/mmtracking/pretrained_weights/sot_resnet50.model


Downloading: "https://download.openmmlab.com/mmtracking/pretrained_weights/sot_resnet50.model" to /root/.cache/torch/hub/checkpoints/sot_resnet50.model


  0%|          | 0.00/174M [00:00<?, ?B/s]

local loads checkpoint from path: ./checkpoints/siamese_rpn_r50_1x_lasot_20211203_151612-da4b3c66.pth
[                                                  ] 0/8, elapsed: 0s, ETA:



[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 8/8, 1.6 task/s, elapsed: 5s, ETA:     0s
 making the output video at ./demo/sot.mp4 with a FPS of 3.0
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 8/8, 11.0 task/s, elapsed: 1s, ETA:     0s
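
`init_bbox` marks the target in the first frame. Assuming `inference_sot` expects corner coordinates `(x1, y1, x2, y2)`, a box given as `(x, y, width, height)`, for example from an interactive ROI selector, has to be converted first. The helper name below is hypothetical.

```python
def xywh_to_xyxy(box):
    """Convert (x, y, width, height) to corner format (x1, y1, x2, y2)."""
    x, y, w, h = box
    if w <= 0 or h <= 0:
        raise ValueError(f"width and height must be positive, got {w} and {h}")
    return [x, y, x + w, y + h]


# A 79 x 235 box anchored at (371, 411) gives the init_bbox used in the demo.
print(xywh_to_xyxy([371, 411, 79, 235]))  # [371, 411, 450, 646]
```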


## **Train a MOT model with a toy dataset**

### **Prepare dataset**

In [11]:
!mkdir -p data
!wget https://download.openmmlab.com/mmtracking/data/MOT17_tiny.zip -P ./data
!unzip -q ./data/MOT17_tiny.zip -d ./data

--2022-09-06 08:10:13--  https://download.openmmlab.com/mmtracking/data/MOT17_tiny.zip
Resolving download.openmmlab.com (download.openmmlab.com)... 47.89.140.71
Connecting to download.openmmlab.com (download.openmmlab.com)|47.89.140.71|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 344566302 (329M) [application/zip]
Saving to: ‘./data/MOT17_tiny.zip’


2022-09-06 08:10:52 (8.63 MB/s) - ‘./data/MOT17_tiny.zip’ saved [344566302/344566302]



In [14]:
# convert the dataset to COCO format
!python ./tools/dataset_converters/mot/mot2coco.py -i ./data/MOT17_tiny/ -o ./data/MOT17_tiny/annotations --split-train --convert-det
# crop pedestrian patches from the original dataset to train the ReID model; this takes several minutes (about 10 in the run shown below)
!rm -rf ./data/MOT17_tiny/reid
!python ./tools/dataset_converters/mot/mot2reid.py -i ./data/MOT17_tiny/ -o ./data/MOT17_tiny/reid --val-split 0.9 --vis-threshold 0.8

Converting train set to COCO format
100% 2/2 [00:00<00:00,  2.09it/s]
train has 224 instances.
Done! Saved as ./data/MOT17_tiny/annotations/train_cocoformat.json and ./data/MOT17_tiny/annotations/train_detections.pkl
Converting test set to COCO format
0it [00:00, ?it/s]
test has 0 instances.
Done! Saved as ./data/MOT17_tiny/annotations/test_cocoformat.json and ./data/MOT17_tiny/annotations/test_detections.pkl
Converting half-train set to COCO format
100% 2/2 [00:01<00:00,  1.01it/s]
half-train has 182 instances.
Done! Saved as ./data/MOT17_tiny/annotations/half-train_cocoformat.json and ./data/MOT17_tiny/annotations/half-train_detections.pkl
Converting half-val set to COCO format
100% 2/2 [00:02<00:00,  1.01s/it]
half-val has 201 instances.
Done! Saved as ./data/MOT17_tiny/annotations/half-val_cocoformat.json and ./data/MOT17_tiny/annotations/half-val_detections.pkl
100% 2/2 [09:35<00:00, 287.68s/it]
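
The converter writes COCO-style JSON files. As a rough orientation, and assuming only the standard COCO top-level keys `images`, `annotations`, and `categories` (any MOT-specific fields the converter adds are not relied on), the sketch below counts annotations per image. `summarize_coco` is a hypothetical helper, and the in-memory dict is made-up sample data, not taken from MOT17_tiny.

```python
import json
from collections import Counter


def summarize_coco(coco):
    """Return image, annotation and category counts plus annotations per image id."""
    per_image = Counter(ann["image_id"] for ann in coco["annotations"])
    return {
        "num_images": len(coco["images"]),
        "num_annotations": len(coco["annotations"]),
        "categories": [cat["name"] for cat in coco["categories"]],
        "annotations_per_image": dict(per_image),
    }


# Made-up sample in COCO layout; bbox is [x, y, width, height].
sample = {
    "images": [{"id": 1, "file_name": "000001.jpg"}, {"id": 2, "file_name": "000002.jpg"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1, "bbox": [10, 20, 30, 60]},
        {"id": 11, "image_id": 1, "category_id": 1, "bbox": [50, 25, 28, 58]},
        {"id": 12, "image_id": 2, "category_id": 1, "bbox": [12, 21, 30, 60]},
    ],
    "categories": [{"id": 1, "name": "pedestrian"}],
}
print(summarize_coco(sample))

# To inspect a converted file instead:
# with open('./data/MOT17_tiny/annotations/half-train_cocoformat.json') as f:
#     print(summarize_coco(json.load(f)))
```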


### **Train a detector for MOT**

In [30]:
import mmengine
from mmengine.runner import set_random_seed
cfg = mmengine.Config.fromfile('./configs/det/faster-rcnn_r50_fpn_8xb2-4e_mot17halftrain_test-mot17halfval.py')
cfg.data_root = 'data/MOT17_tiny/'
cfg.train_dataloader.dataset.data_root = 'data/MOT17_tiny/'
cfg.test_dataloader = cfg.test_cfg = cfg.test_evaluator = None
cfg.val_dataloader = cfg.val_cfg = cfg.val_evaluator = None
# give each visualizer instance a distinct name
cfg.visualizer.name = 'mot_visualizer'


cfg.work_dir = './tutorial_exps/detector'
cfg.randomness = dict(seed=0, deterministic=False)
cfg.gpu_ids = range(1)
print(f'Config:\n{cfg.pretty_text}')

Config:
dataset_type = 'mmdet.CocoDataset'
data_root = 'data/MOT17_tiny/'
train_pipeline = [
    dict(type='LoadImageFromFile', to_float32=True),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(
        type='RandomResize',
        scale=(1088, 1088),
        ratio_range=(0.8, 1.2),
        keep_ratio=True,
        clip_object_border=False),
    dict(type='PhotoMetricDistortion'),
    dict(type='RandomCrop', crop_size=(1088, 1088), bbox_clip_border=False),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PackDetInputs')
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='Resize', scale=(1088, 1088), keep_ratio=True),
    dict(
        type='PackDetInputs',
        meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
                   'scale_factor'))
]
train_dataloader = dict(
    batch_size=2,
    num_workers=2,
    persistent_workers=True,
    sampler=dict(type='DefaultSampler', shuffle=True),
    batch_sampler=dict(type='mmdet.AspectRatioBat

In [31]:
import os.path as osp

from mmengine.utils import mkdir_or_exist
from mmengine.runner import Runner

mkdir_or_exist(osp.abspath(cfg.work_dir))
runner = Runner.from_cfg(cfg)
runner.train()

09/06 08:48:30 - mmengine - INFO - 
------------------------------------------------------------
System environment:
    sys.platform: linux
    Python: 3.7.13 (default, Apr 24 2022, 01:04:09) [GCC 7.5.0]
    CUDA available: True
    numpy_random_seed: 0
    GPU 0: Tesla T4
    CUDA_HOME: /usr/local/cuda
    NVCC: Cuda compilation tools, release 11.1, V11.1.105
    GCC: x86_64-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
    PyTorch: 1.10.0+cu111
    PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.2.3 (Git Hash 7336ca9f055cf1bfa13efb658fe15dc9b41f0740)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 11.1
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_5

Downloading: "http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_2x_coco/faster_rcnn_r50_fpn_2x_coco_bbox_mAP-0.384_20200504_210434-a5d8aa15.pth" to /root/.cache/torch/hub/checkpoints/faster_rcnn_r50_fpn_2x_coco_bbox_mAP-0.384_20200504_210434-a5d8aa15.pth


  0%|          | 0.00/160M [00:00<?, ?B/s]


size mismatch for roi_head.bbox_head.fc_cls.weight: copying a param with shape torch.Size([81, 1024]) from checkpoint, the shape in current model is torch.Size([2, 1024]).
size mismatch for roi_head.bbox_head.fc_cls.bias: copying a param with shape torch.Size([81]) from checkpoint, the shape in current model is torch.Size([2]).
size mismatch for roi_head.bbox_head.fc_reg.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([4, 1024]).
size mismatch for roi_head.bbox_head.fc_reg.bias: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([4]).
09/06 08:48:53 - mmengine - INFO - Checkpoints will be saved to /content/mmtracking/tutorial_exps/detector by HardDiskBackend.
09/06 08:49:17 - mmengine - INFO - Epoch(train) [1][50/414]  lr: 1.0000e-02  eta: 0:12:31  time: 0.4476  data_time: 0.0086  memory: 4097  loss_rpn_cls: 0.0432  loss_rpn_bbox: 0.1128  l

FasterRCNN(
  (data_preprocessor): DetDataPreprocessor()
  (backbone): ResNet(
    (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
    (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu): ReLU(inplace=True)
    (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
    (layer1): ResLayer(
      (0): Bottleneck(
        (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=Tru

### **Train a ReID model for MOT**

In [42]:
import mmengine

cfg = mmengine.Config.fromfile('./configs/reid/reid_r50_8xb32-6e_mot17train80_test-mot17val20.py')
cfg.train_dataloader.dataset.data_root = 'data/MOT17_tiny/'
cfg.train_dataloader.dataset.ann_file = 'reid/meta/train_9.txt'
cfg.test_dataloader = cfg.test_cfg = cfg.test_evaluator = None
cfg.val_dataloader = cfg.val_cfg = cfg.val_evaluator = None
cfg.visualizer.name = 'mot_reid_visualizer'

# learning policy
cfg.param_scheduler = [
    dict(
        type='LinearLR',
        start_factor=1.0 / 200,
        by_epoch=False,
        begin=0,
        end=200),
    dict(
        type='MultiStepLR',
        begin=0,
        end=2,
        by_epoch=True,
        milestones=[1],
        gamma=0.1)
]
cfg.train_cfg = dict(type='EpochBasedTrainLoop', max_epochs=2, val_begin=3)

cfg.work_dir = './tutorial_exps/reid'
cfg.randomness = dict(seed=0, deterministic=False)
cfg.gpu_ids = range(1)
print(f'Config:\n{cfg.pretty_text}')

Config:
dataset_type = 'ReIDDataset'
data_root = 'data/MOT17/'
train_pipeline = [
    dict(
        type='TransformBroadcaster',
        share_random_params=False,
        transforms=[
            dict(type='LoadImageFromFile', to_float32=True),
            dict(
                type='mmdet.Resize',
                scale=(128, 256),
                keep_ratio=False,
                clip_object_border=False),
            dict(type='RandomFlip', prob=0.5, direction='horizontal')
        ]),
    dict(type='PackReIDInputs')
]
test_pipeline = [
    dict(type='LoadImageFromFile', to_float32=True),
    dict(type='mmdet.Resize', scale=(128, 256), keep_ratio=False),
    dict(type='PackReIDInputs')
]
train_dataloader = dict(
    batch_size=1,
    num_workers=2,
    persistent_workers=True,
    sampler=dict(type='DefaultSampler', shuffle=True),
    dataset=dict(
        type='ReIDDataset',
        data_root='data/MOT17_tiny/',
        triplet_sampler=dict(num_ids=8, ins_per_id=4),
        data_pr
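
The two schedulers in `cfg.param_scheduler` combine a linear warm-up over the first 200 iterations with a tenfold decay from epoch 1 onward. The sketch below computes the resulting multiplier on the base learning rate in closed form. It is a textbook approximation under stated assumptions (warm-up reaches factor 1.0 at iteration 200, epochs are counted from 0, and the decay applies once a milestone epoch is reached); MMEngine's exact step-counting conventions may differ, so logged values need not match it digit for digit. `lr_factor` is a hypothetical helper.

```python
def lr_factor(iteration, epoch, start_factor=1.0 / 200, warmup_iters=200,
              milestones=(1,), gamma=0.1):
    """Multiplier applied to the base learning rate at a given iteration and epoch."""
    # Linear warm-up: interpolate from start_factor to 1.0 over warmup_iters iterations.
    if iteration < warmup_iters:
        warmup = start_factor + (1.0 - start_factor) * iteration / warmup_iters
    else:
        warmup = 1.0
    # Step decay: multiply by gamma once for every milestone epoch already reached.
    decay = gamma ** sum(epoch >= m for m in milestones)
    return warmup * decay


for it, ep in [(0, 0), (100, 0), (200, 0), (2000, 1)]:
    print(f"iter {it:4d}, epoch {ep}: factor {lr_factor(it, ep):.4f}")
```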

In [43]:
import os.path as osp

from mmengine.utils import mkdir_or_exist
from mmengine.runner import Runner

mkdir_or_exist(osp.abspath(cfg.work_dir))
runner = Runner.from_cfg(cfg)
runner.train()

09/06 09:16:26 - mmengine - INFO - 
------------------------------------------------------------
System environment:
    sys.platform: linux
    Python: 3.7.13 (default, Apr 24 2022, 01:04:09) [GCC 7.5.0]
    CUDA available: True
    numpy_random_seed: 0
    GPU 0: Tesla T4
    CUDA_HOME: /usr/local/cuda
    NVCC: Cuda compilation tools, release 11.1, V11.1.105
    GCC: x86_64-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
    PyTorch: 1.10.0+cu111
    PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.2.3 (Git Hash 7336ca9f055cf1bfa13efb658fe15dc9b41f0740)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 11.1
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_5

Downloading: "https://download.openmmlab.com/mmclassification/v0/resnet/resnet50_batch256_imagenet_20200708-cfb998bf.pth" to /root/.cache/torch/hub/checkpoints/resnet50_batch256_imagenet_20200708-cfb998bf.pth


  0%|          | 0.00/97.7M [00:00<?, ?B/s]


unexpected key in source state_dict: head.fc.weight, head.fc.bias

missing keys in source state_dict: head.fcs.0.fc.weight, head.fcs.0.fc.bias, head.fcs.0.bn.weight, head.fcs.0.bn.bias, head.fcs.0.bn.running_mean, head.fcs.0.bn.running_var, head.fc_out.weight, head.fc_out.bias, head.bn.weight, head.bn.bias, head.bn.running_mean, head.bn.running_var, head.classifier.weight, head.classifier.bias

09/06 09:16:40 - mmengine - INFO - Checkpoints will be saved to /content/mmtracking/tutorial_exps/reid by HardDiskBackend.
09/06 09:16:56 - mmengine - INFO - Epoch(train) [1][50/1576]  lr: 2.5000e-02  eta: 0:16:34  time: 0.3139  data_time: 0.0053  memory: 4097  triplet_loss: 0.0000  ce_loss: 0.0004  accuracy_top-1: 100.0000  loss: 0.0006
09/06 09:17:12 - mmengine - INFO - Epoch(train) [1][100/1576]  lr: 5.0000e-02  eta: 0:16:09  time: 0.3246  data_time: 0.0059  memory: 3519  triplet_loss: 0.0000  ce_loss: 0.0002  accuracy_top-1: 100.0000  loss: 0.0002
09/0

BaseReID(
  (data_preprocessor): ClsDataPreprocessor()
  (backbone): ResNet(
    (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
    (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu): ReLU(inplace=True)
    (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
    (layer1): ResLayer(
      (0): Bottleneck(
        (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
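
To follow training progress outside the notebook, the scalar fields of a training log line can be extracted with a regular expression. The sketch below assumes the `key: value` layout seen in the training log above, with terminal color codes already stripped. It is not an MMEngine API, and `parse_train_line` is a hypothetical helper.

```python
import re

LINE = ("09/06 09:16:56 - mmengine - INFO - Epoch(train) [1][50/1576]  "
        "lr: 2.5000e-02  eta: 0:16:34  time: 0.3139  data_time: 0.0053  "
        "memory: 4097  triplet_loss: 0.0000  ce_loss: 0.0004  "
        "accuracy_top-1: 100.0000  loss: 0.0006")

HEADER = re.compile(r"Epoch\(train\)\s+\[(\d+)\]\[(\d+)/(\d+)\]")
# Keys may contain letters, digits, '_' and '-'; values are plain or scientific floats.
FIELD = re.compile(r"([\w-]+):\s+(-?\d+(?:\.\d+)?(?:e[+-]?\d+)?)(?=\s|$)")


def parse_train_line(line):
    """Extract epoch, iteration and every numeric 'key: value' pair from one log line."""
    header = HEADER.search(line)
    if header is None:
        return None
    record = {
        "epoch": int(header.group(1)),
        "iter": int(header.group(2)),
        "iters_per_epoch": int(header.group(3)),
    }
    # Only scan the part after the header so the timestamp is not misread as a field.
    for key, value in FIELD.findall(line[header.end():]):
        record[key] = float(value)
    return record


print(parse_train_line(LINE))
```

Non-numeric fields such as `eta: 0:16:34` are skipped by design, since they do not match the float pattern.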

### **Test the DeepSORT model**

In [46]:
import mmengine

cfg = mmengine.Config.fromfile('./configs/mot/deepsort/deepsort_faster-rcnn_r50_fpn_8xb2-4e_mot17halftrain_test-mot17halfval.py')
cfg.test_dataloader.dataset.data_root = 'data/MOT17_tiny/'
cfg.test_dataloader.dataset.test_mode = True
cfg.train_dataloader = cfg.train_cfg = None
cfg.val_dataloader = cfg.val_cfg = cfg.val_evaluator = None
cfg.visualizer.name = 'deepsort_visualizer'
cfg.model.detector.init_cfg.checkpoint = './tutorial_exps/detector/epoch_4.pth'
cfg.model.reid.init_cfg.checkpoint = './tutorial_exps/reid/epoch_2.pth'

cfg.work_dir = './tutorial_exps'
cfg.randomness = dict(seed=0, deterministic=False)
cfg.gpu_ids = range(1)
print(f'Config:\n{cfg.pretty_text}')

Config:
model = dict(
    data_preprocessor=dict(
        type='TrackDataPreprocessor',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        bgr_to_rgb=True,
        rgb_to_bgr=False,
        pad_size_divisor=32),
    detector=dict(
        type='FasterRCNN',
        _scope_='mmdet',
        backbone=dict(
            type='ResNet',
            depth=50,
            num_stages=4,
            out_indices=(0, 1, 2, 3),
            frozen_stages=1,
            norm_cfg=dict(type='BN', requires_grad=True),
            norm_eval=True,
            style='pytorch',
            init_cfg=dict(
                type='Pretrained', checkpoint='torchvision://resnet50')),
        neck=dict(
            type='FPN',
            in_channels=[256, 512, 1024, 2048],
            out_channels=256,
            num_outs=5),
        rpn_head=dict(
            type='RPNHead',
            in_channels=256,
            feat_channels=256,
            anchor_generator=dict(
          

In [47]:
from mmengine.model import is_model_wrapper
from mmengine.runner import Runner

runner = Runner.from_cfg(cfg)

if is_model_wrapper(runner.model):
    runner.model.module.init_weights()
else:
    runner.model.init_weights()

runner.test()

09/06 09:37:05 - mmengine - INFO - 
------------------------------------------------------------
System environment:
    sys.platform: linux
    Python: 3.7.13 (default, Apr 24 2022, 01:04:09) [GCC 7.5.0]
    CUDA available: True
    numpy_random_seed: 0
    GPU 0: Tesla T4
    CUDA_HOME: /usr/local/cuda
    NVCC: Cuda compilation tools, release 11.1, V11.1.105
    GCC: x86_64-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
    PyTorch: 1.10.0+cu111
    PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.2.3 (Git Hash 7336ca9f055cf1bfa13efb658fe15dc9b41f0740)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 11.1
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_5

Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  time_data = np.asarray(read_data[time_key], dtype=np.float)
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  to_remove_tracker = np.array([], np.int)
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  res[field] = np.zeros((len(self.array_labels)), dtype=np.float)
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  res['IDFN'] = fn_mat[match_rows, match_cols].sum().astype(np.int)


1 eval_sequence(MOT17-02-FRCNN, default-tracker)                         0.4670 sec
2 eval_sequence(MOT17-04-FRCNN, default-tracker)                         1.3160 sec

All sequences for default-tracker finished in 1.78 seconds

HOTA: default-tracker-pedestrian   HOTA      DetA      AssA      DetRe     DetPr     AssRe     AssPr     LocA      RHOTA     HOTA(0)   LocA(0)   HOTALocA(0)
MOT17-02-FRCNN                     26.07     39.469    17.888    46.443    63.395    20.298    59.499    79.439    28.497    34.31     68.423    23.476    
MOT17-04-FRCNN                     54.514    66.438    45.771    70.914    82.049    51.423    62.818    84.856    56.637    64.719    81.721    52.889    
COMBINED                           47.481    57.937    40.091    63.815    77.25     45.144    62.333    83.649    50.152    57.312    78.488    44.983    

CLEAR: default-tracker-pedestrian  MOTA      MOTP      MODA      CLR_Re    CLR_Pr    MTR       PTR       MLR       sMOTA     CLR_TP    CLR_FN    

{'motchallenge-metric/HOTA': 0.47480527965525604,
 'motchallenge-metric/AssA': 0.40090624838232497,
 'motchallenge-metric/DetA': 0.5793687250260392,
 'motchallenge-metric/MOTA': 0.6300135063714839,
 'motchallenge-metric/MOTP': 0.8167920002689131,
 'motchallenge-metric/IDSW': 1796.0,
 'motchallenge-metric/TP': 25694.0,
 'motchallenge-metric/FP': 2441.0,
 'motchallenge-metric/FN': 8364.0,
 'motchallenge-metric/Frag': 631.0,
 'motchallenge-metric/MT': 56.0,
 'motchallenge-metric/ML': 12.0,
 'motchallenge-metric/IDF1': 0.5111829305548856,
 'motchallenge-metric/IDTP': 15896.0,
 'motchallenge-metric/IDFN': 18162.0,
 'motchallenge-metric/IDFP': 12239.0,
 'motchallenge-metric/IDP': 0.5649902256975298,
 'motchallenge-metric/IDR': 0.46673321980151505}
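
Several of the reported metrics follow directly from the raw counts in the same dictionary. Using the standard CLEAR-MOT and identity-metric definitions (stated here as background; the tutorial itself does not define them), the sketch below recomputes MOTA, IDP, IDR and IDF1 from the counts above. `clear_and_identity` is a hypothetical helper, not a TrackEval function.

```python
def clear_and_identity(tp, fp, fn, idsw, idtp, idfp, idfn):
    """Recompute MOTA, IDP, IDR and IDF1 from raw counts."""
    num_gt = tp + fn  # every ground-truth box is either matched (TP) or missed (FN)
    return {
        "MOTA": 1.0 - (fn + fp + idsw) / num_gt,
        "IDP": idtp / (idtp + idfp),
        "IDR": idtp / (idtp + idfn),
        "IDF1": 2 * idtp / (2 * idtp + idfp + idfn),
    }


metrics = clear_and_identity(tp=25694, fp=2441, fn=8364, idsw=1796,
                             idtp=15896, idfp=12239, idfn=18162)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
# Rounded to four decimals these are 0.6300, 0.5650, 0.4667 and 0.5112,
# in agreement with the MOTA, IDP, IDR and IDF1 entries reported above.
```

HOTA is not recomputed here because it is averaged over several localization thresholds and cannot be derived from these aggregate counts alone.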
