Strong-TransCenter: Improved Multi-Object Tracking based on Transformers with Dense Representations

Paper: https://arxiv.org/abs/2210.13570

Results on MOT17 and MOT20 compared to transformer-based trackers (as of October 2022): [figure]

Algorithm flowchart: [figure]

Bibtex

If you find this code useful, please star the project and consider citing:

@misc{galor2022strongtranscenter,
  doi = {10.48550/ARXIV.2210.13570},
  url = {https://arxiv.org/abs/2210.13570},
  author = {Galor, Amit and Orfaig, Roy and Bobrovsky, Ben-Zion},
  keywords = {Computer Vision and Pattern Recognition (cs.CV), FOS: Computer and information sciences},
  title = {Strong-TransCenter: Improved Multi-Object Tracking based on Transformers with Dense Representations},
  publisher = {arXiv},
  year = {2022},
  copyright = {arXiv.org perpetual, non-exclusive license}
}

Results examples:

MOT17-03-SDP-trimmed.mp4
MOT20-06-trimmed.mp4

Environment Preparation

  1. We use Anaconda to simplify package installation; you can download Anaconda (4.10.3) here: https://www.anaconda.com/products/individual
  2. Create your conda environment with:
conda env create -n <env_name> -f eTransCenter.yml

Alternatively, you can use the provided requirements.txt:

pip install -r requirements.txt

Note: make sure to install torch and torchvision versions matching your CUDA version, from the PyTorch website: https://pytorch.org/get-started/previous-versions/
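
Before compiling the custom ops below, you can quickly confirm that your installed torch build matches your CUDA setup (a minimal check, not part of the repository):

import torch

# Report the installed PyTorch version, the CUDA version it was built against,
# and whether PyTorch can see a GPU.
print("torch:", torch.__version__)
print("built with CUDA:", torch.version.cuda)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))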

  3. STC uses the Deformable Transformer from Deformable DETR, so we need to install the deformable attention modules:
cd ./to_install/ops
sh ./make.sh
# unit test (should see all checking is True)
python test.py
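
If compilation succeeded, the extension should also be importable from outside the ops directory. A minimal smoke test, assuming the extension installs under the name MultiScaleDeformableAttention as in the upstream Deformable DETR ops package:

# Assumption: the compiled extension from ./to_install/ops installs as MultiScaleDeformableAttention.
try:
    import MultiScaleDeformableAttention  # noqa: F401
    print("deformable attention extension is installed")
except ImportError as err:
    print("deformable attention extension not found:", err)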

  4. For the up-scale and merge module in TransCenter, we use a deformable convolution module (DCNv2), which you can install with:

cd ./to_install/DCNv2
./make.sh         # build
python testcpu.py    # run examples and gradient check on cpu
python testcuda.py   # run examples and gradient check on gpu

See also the known issues listed at https://github.com/CharlesShang/DCNv2. If you have CUDA-related issues with these third-party modules, please try recompiling them on the GPU that you use for training and testing. The dependencies are compatible with PyTorch 1.6 and CUDA 10.1.

If you install the DCNv2 and Deformable Transformer packages from other implementations, please replace the corresponding files with dcn_v2.py and ms_deform_attn.py from ./to_install to allow half-precision operations with the customized packages.
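
For context, "half-precision operations" here means running these custom layers under PyTorch automatic mixed precision; a minimal sketch of the usage pattern (the module and input below are placeholders, not the actual STC network):

import torch

# Placeholder model and input; in STC the network contains the customized
# DCNv2 and deformable-attention layers patched by the files above.
model = torch.nn.Conv2d(3, 16, kernel_size=3).cuda()
frame = torch.randn(1, 3, 224, 224, device="cuda")

# Under autocast, eligible ops run in float16; the patched dcn_v2.py and
# ms_deform_attn.py allow the custom layers to join this mixed-precision path.
with torch.cuda.amp.autocast():
    out = model(frame)
print(out.dtype)  # torch.float16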

Data Preparation

MS COCO: we use only the person category for pretraining TransCenter. The filtering code is provided in ./data/coco_person.py (a sketch of the idea follows the citation below).

@inproceedings{lin2014microsoft,
  title={Microsoft coco: Common objects in context},
  author={Lin, Tsung-Yi and Maire, Michael and Belongie, Serge and Hays, James and Perona, Pietro and Ramanan, Deva and Doll{\'a}r, Piotr and Zitnick, C Lawrence},
  booktitle={European conference on computer vision},
  pages={740--755},
  year={2014},
  organization={Springer}
}
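
A minimal sketch of the kind of filtering ./data/coco_person.py performs, assuming the standard COCO annotation JSON layout (file names below are illustrative, not the actual script arguments):

import json

# Load a standard COCO annotation file and keep only the 'person' category.
with open("instances_train2017.json") as f:
    coco = json.load(f)

person_ids = {c["id"] for c in coco["categories"] if c["name"] == "person"}
coco["annotations"] = [a for a in coco["annotations"] if a["category_id"] in person_ids]
coco["categories"] = [c for c in coco["categories"] if c["id"] in person_ids]

# Drop images that no longer have any annotation.
kept = {a["image_id"] for a in coco["annotations"]}
coco["images"] = [im for im in coco["images"] if im["id"] in kept]

with open("instances_train2017_person.json", "w") as f:
    json.dump(coco, f)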

CrowdHuman: CrowdHuman labels are converted to COCO format; the conversion can be done with ./data/convert_crowdhuman_to_coco.py.

@article{shao2018crowdhuman,
    title={CrowdHuman: A Benchmark for Detecting Human in a Crowd},
    author={Shao, Shuai and Zhao, Zijian and Li, Boxun and Xiao, Tete and Yu, Gang and Zhang, Xiangyu and Sun, Jian},
    journal={arXiv preprint arXiv:1805.00123},
    year={2018}
  }

MOT17: MOT17 labels are converted to COCO format; the conversion can be done with ./data/convert_mot_to_coco.py.

@article{milan2016mot16,
  title={MOT16: A benchmark for multi-object tracking},
  author={Milan, Anton and Leal-Taix{\'e}, Laura and Reid, Ian and Roth, Stefan and Schindler, Konrad},
  journal={arXiv preprint arXiv:1603.00831},
  year={2016}
}

MOT20: MOT20 labels are converted to COCO format; the conversion can be done with ./data/convert_mot20_to_coco.py. A sketch of the converted annotation structure follows the citation below.

@article{dendorfer2020mot20,
  title={Mot20: A benchmark for multi object tracking in crowded scenes},
  author={Dendorfer, Patrick and Rezatofighi, Hamid and Milan, Anton and Shi, Javen and Cremers, Daniel and Reid, Ian and Roth, Stefan and Schindler, Konrad and Leal-Taix{\'e}, Laura},
  journal={arXiv preprint arXiv:2003.09003},
  year={2020}
}
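
For reference, a minimal sketch of what a converted, tracking-oriented COCO-style annotation file looks like; the field names follow the COCO convention plus a track identifier, as in CenterTrack-style converters, and the exact fields written by the scripts may differ:

# Illustrative structure only; consult the conversion scripts for the exact fields.
converted = {
    "categories": [{"id": 1, "name": "pedestrian"}],
    "images": [
        {"id": 1, "file_name": "MOT17-02-SDP/img1/000001.jpg",
         "width": 1920, "height": 1080, "frame_id": 1, "video_id": 1},
    ],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1,
         "bbox": [912.0, 484.0, 97.0, 109.0],  # [x, y, width, height] in pixels
         "area": 10573.0, "iscrowd": 0,
         "track_id": 1},  # identity label used for tracking supervision
    ],
}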

We also provide the filtered/converted labels:

MOT17 COCO-format labels: please put the annotations and annotations_onlySDP folders (inside MOT17) into your MOT17 dataset root folder.

MOT20 COCO-format labels: please put the annotations folder (inside MOT20) into your MOT20 dataset root folder.

Model Zoo

For the main TransCenterV2 transformer model:

PVTv2 pretrained: pretrained model from deformable-DETR.

MOT17_trained_with_CH: model trained on CrowdHuman and MOT17 trainset.

MOT20_trained_with_CH: model trained on CrowdHuman and MOT20 trainset.

For the fastReID embedding network model:

MOT17_SBS_S50: model trained on the MOT17 train set.

MOT20_SBS_S50: model trained on the MOT20 train set.

Please put all the pretrained models in ./model_zoo, and the PVTv2 model in ./model_zoo/pvtv2_backbone.

For training, see the original TransCenterV2 repository for instructions: TransCenter.

Tracking

Using private detections:

  • MOT17:
cd STC_Tracker
python ./tracking/mot17_private_test.py --data_dir=YourPathTo/MOT17/
  • MOT20:
cd STC_Tracker
python ./tracking/mot20_private_test.py --data_dir=YourPathTo/MOT20/

Using public detections:

  • MOT17:
cd STC_Tracker
python ./tracking/mot17_pub_test.py --data_dir=YourPathTo/MOT17/
  • MOT20:
cd STC_Tracker
python ./tracking/mot20_pub_test.py --data_dir=YourPathTo/MOT20/

You may also run inference on a single sequence from the dataset, e.g.:

python ./tracking/mot17_private.py --data_dir YourPathTo/MOT17/ --output_dir mot17_results_dir_name --custom MOT17-02-SDP
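
To run several sequences in one go, a small driver script along these lines works (a sketch only; the sequence names and paths are illustrative, and the flags are the ones shown above):

import subprocess

# Illustrative sequence list; adjust to the sequences you actually want to run.
sequences = ["MOT17-02-SDP", "MOT17-04-SDP", "MOT17-09-SDP"]

for seq in sequences:
    # Reuse the single-sequence entry point shown above for each sequence.
    subprocess.run([
        "python", "./tracking/mot17_private.py",
        "--data_dir", "YourPathTo/MOT17/",
        "--output_dir", "mot17_results_dir_name",
        "--custom", seq,
    ], check=True)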

MOTChallenge Results

MOT17 public detections:

Tracker        HOTA   MOTA   MOTP   IDF1   FP      FN       IDSW
TransCenterV2  56.7%  75.9%  81.2%  66.0%  30,220  100,995  4,622
STC_Tracker    59.5%  75.8%  81.3%  70.8%  33,833   99,074  3,787

MOT20 public detections:

Tracker        HOTA   MOTA   MOTP   IDF1   FP      FN       IDSW
TransCenterV2  50.1%  72.8%  81.0%  57.6%  28,012  110,274  2,620
STC_Tracker    56.1%  73.0%  80.9%  67.6%  30,880  106,876  2,172

MOT17 private detections:

Tracker        HOTA   MOTA   MOTP   IDF1   FP      FN      IDSW
TransCenterV2  56.7%  76.2%  81.1%  65.5%  40,107  88,827  5,397
STC_Tracker    59.8%  75.8%  81.1%  70.9%  44,952  87,039  4,533

MOT20 private detections:

Tracker        HOTA   MOTA   MOTP   IDF1   FP      FN       IDSW
TransCenterV2  50.2%  72.9%  81.0%  57.8%  28,588  108,950  2,620
STC_Tracker    56.3%  73.0%  81.0%  67.5%  30,215  107,701  2,011

Note:

  • Results from the original TransCenterV2 code were submitted independently for comparison, in the same environment and with the same model version pretrained on CrowdHuman.
  • The results can be slightly different depending on the running environment.
  • We might keep updating the results in the near future.

Acknowledgement

The STC Tracker code is modified from, and the network pretrained weights are obtained from, the following repositories:

  1. The main framework code is derived from TransCenter.
  2. The PVTv2 backbone pretrained models are from PVTv2.
  3. The data format conversion code is modified from CenterTrack.
  4. The Kalman filter implementation is modified from StrongSORT.
  5. The embedding network code is modified from fastReID.
  6. The trained embedding network and the association implementation are from BoT-SORT.

TransCenter, CenterTrack, Deformable-DETR, Tracktor, BoT-SORT, FastReID, StrongSORT.

@article{xu2021transcenter,
      title={TransCenter: Transformers with Dense Representations for Multiple-Object Tracking}, 
      author={Yihong Xu and Yutong Ban and Guillaume Delorme and Chuang Gan and Daniela Rus and Xavier Alameda-Pineda},
      year={2021},
      eprint={2103.15145},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

@article{Aharon2022,
  title={BoT-SORT: Robust Associations Multi-Pedestrian Tracking},
  author={Aharon, Nir and Orfaig, Roy and Bobrovsky, Ben-Zion},
  journal={arXiv},
  month={6},
  year={2022}
}

@article{FastReID,
  title={FastReID: A Pytorch Toolbox for General Instance Re-identification},
  author={He, Lingxiao and Liao, Xingyu and Liu, Wu and Liu, Xinchen and Cheng, Peng and Mei, Tao},
  journal={arXiv},
  month={6},
  year={2020}
}

@article{StrongSORT,
  title={StrongSORT: Make DeepSORT Great Again},
  author={Du, Yunhao and Song, Yang and Yang, Bo and Zhao, Yanyun},
  journal={arXiv},
  month={2},
  year={2022}
}

@article{zhou2020tracking,
  title={Tracking Objects as Points},
  author={Zhou, Xingyi and Koltun, Vladlen and Kr{\"a}henb{\"u}hl, Philipp},
  journal={ECCV},
  year={2020}
}

@InProceedings{tracktor_2019_ICCV,
author = {Bergmann, Philipp and Meinhardt, Tim and Leal{-}Taix{\'{e}}, Laura},
title = {Tracking Without Bells and Whistles},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {October},
year = {2019}}

@article{zhu2020deformable,
  title={Deformable DETR: Deformable Transformers for End-to-End Object Detection},
  author={Zhu, Xizhou and Su, Weijie and Lu, Lewei and Li, Bin and Wang, Xiaogang and Dai, Jifeng},
  journal={arXiv preprint arXiv:2010.04159},
  year={2020}
}

@article{zhang2021bytetrack,
  title={ByteTrack: Multi-Object Tracking by Associating Every Detection Box},
  author={Zhang, Yifu and Sun, Peize and Jiang, Yi and Yu, Dongdong and Yuan, Zehuan and Luo, Ping and Liu, Wenyu and Wang, Xinggang},
  journal={arXiv preprint arXiv:2110.06864},
  year={2021}
}

@article{wang2021pvtv2,
  title={Pvtv2: Improved baselines with pyramid vision transformer},
  author={Wang, Wenhai and Xie, Enze and Li, Xiang and Fan, Deng-Ping and Song, Kaitao and Liang, Ding and Lu, Tong and Luo, Ping and Shao, Ling},
  journal={Computational Visual Media},
  volume={8},
  number={3},
  pages={1--10},
  year={2022},
  publisher={Springer}
}

Several modules are from:

MOT Metrics in Python: py-motmetrics

Soft-NMS: Soft-NMS

DETR: DETR

DCNv2: DCNv2

PVTv2: PVTv2

ByteTrack: ByteTrack

