[CVPR 2026] From Detection to Association: Learning Discriminative Object Embeddings for Multi-Object Tracking

TL;DR. We reveal that DETR-based end-to-end MOT suffers from overly similar object embeddings. FDTA explicitly enhances discriminativeness in this paradigm.

📢 News

[2026] Our paper has been accepted to CVPR 2026! The code is now released. Please star this repo for updates!

🚀 Getting Started

1. Environment Setup

conda create -n FDTA python=3.12
conda activate FDTA
pip install -r requirements.txt
# Compile the Deformable Attention operator:
cd models/ops/
sh make.sh

2. Data Preparation

Please organize your datasets under ./datasets/ in the following structure. Note that depth maps are required alongside the RGB images:

datasets/
├── DanceTrack/
│   ├── train/
│   │   └── <sequence>/
│   │       ├── img1/
│   │       ├── gt/
│   │       └── depth/
│   ├── val/
│   └── test/
├── SportsMOT/
│   ├── train/
│   │   └── <sequence>/
│   │       ├── img1/
│   │       ├── gt/
│   │       └── depth/
│   ├── val/
│   └── test/
└── BFT/
    ├── train/
    │   └── <sequence>/
    │       ├── img1/
    │       ├── gt/
    │       └── depth/
    └── test/
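
Before training, it can help to confirm that every sequence follows this layout. The following is a minimal sketch and not part of the FDTA codebase: the helper name `find_incomplete_sequences` is hypothetical, and it only checks that the three sub-directories shown in the tree exist for each sequence. The tree spells out the per-sequence layout for train/ only, so running it on val/ or test/ may report sub-directories that are legitimately absent there.

```python
from pathlib import Path

# Sub-directories each training sequence is expected to contain, per the tree above.
REQUIRED_SUBDIRS = ("img1", "gt", "depth")


def find_incomplete_sequences(split_dir):
    """Return {sequence_name: [missing sub-directories]} for one split directory.

    Hypothetical helper for illustration; not part of the FDTA codebase.
    """
    problems = {}
    for seq_dir in sorted(Path(split_dir).iterdir()):
        if not seq_dir.is_dir():
            continue  # ignore stray files next to the sequence folders
        missing = [name for name in REQUIRED_SUBDIRS if not (seq_dir / name).is_dir()]
        if missing:
            problems[seq_dir.name] = missing
    return problems


if __name__ == "__main__":
    # Example: check the DanceTrack training split under ./datasets/.
    split = Path("datasets") / "DanceTrack" / "train"
    if split.is_dir():
        for seq, missing in find_incomplete_sequences(split).items():
            print(f"{seq}: missing {', '.join(missing)}")
```

An empty result means every sequence in the split has img1/, gt/, and depth/.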

Dataset sources:

DanceTrack: Download from the official repository.

SportsMOT: Download from the official repository or HuggingFace.

BFT: Download from Google Drive or Baidu Pan (NetTrack).

Depth maps: Generated using Video-Depth-Anything (see tools/gen_depthmaps.py). Other depth estimators are also supported, provided their outputs follow the same directory structure.

3. Pre-trained Weights

We use COCO pre-trained Deformable DETR weights for initialization, sourced from MOTIP. Download the weights and place them under ./pretrains/:

File	Description	Download
`r50_deformable_detr_coco.pth`	COCO pre-trained (base)	link
`r50_deformable_detr_coco_dancetrack.pth`	Fine-tuned on DanceTrack	link
`r50_deformable_detr_coco_sportsmot.pth`	Fine-tuned on SportsMOT	link
`r50_deformable_detr_coco_bft.pth`	Fine-tuned on BFT	link

Our trained FDTA model weights are available on HuggingFace. Download and place them under ./checkpoints/ for inference.
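
A quick way to confirm the downloads landed in the right place is to check for the file names listed in the table. This is a sketch under stated assumptions: the helper `missing_pretrains` and the short keys (`base`, `dancetrack`, ...) are hypothetical names introduced here, while the file names and the ./pretrains/ directory come from the table and text above.

```python
from pathlib import Path

# File names copied from the table above; the dictionary keys are illustrative labels.
PRETRAINED_FILES = {
    "base": "r50_deformable_detr_coco.pth",
    "dancetrack": "r50_deformable_detr_coco_dancetrack.pth",
    "sportsmot": "r50_deformable_detr_coco_sportsmot.pth",
    "bft": "r50_deformable_detr_coco_bft.pth",
}


def missing_pretrains(pretrain_dir="./pretrains", keys=None):
    """Return the expected weight files that are not present in pretrain_dir.

    Hypothetical helper for illustration; not part of the FDTA codebase.
    """
    wanted = list(keys) if keys is not None else list(PRETRAINED_FILES)
    root = Path(pretrain_dir)
    return [
        PRETRAINED_FILES[key]
        for key in wanted
        if not (root / PRETRAINED_FILES[key]).is_file()
    ]


if __name__ == "__main__":
    for name in missing_pretrains():
        print(f"missing: ./pretrains/{name}")
```

Pass `keys=["dancetrack"]` to check only the file needed for a single dataset.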

4. Training

All training configurations are stored in the ./configs/ folder. For example, to train on DanceTrack:

accelerate launch --num_processes=4 train.py \
  --data-root /path/to/your/datasets/ \
  --exp-name fdta_dancetrack \
  --config-path ./configs/dancetrack.yaml \
  --detr-pretrain ./pretrains/r50_deformable_detr_coco_dancetrack.pth

Replace dancetrack with sportsmot or bft to train on other datasets.

Note: If your GPU has less than 24 GB of memory, you can reduce memory usage by setting --detr-num-checkpoint-frames 2 (for GPUs with less than 16 GB) or --detr-num-checkpoint-frames 1 (for GPUs with less than 12 GB).
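
If you launch training for several datasets from a script, the substitution described above can be written down once. The sketch below is illustrative only: `build_train_command` is a hypothetical helper, and it assumes the config, experiment, and pre-trained file names for sportsmot and bft follow the same pattern as the DanceTrack example. It only assembles the argument list and does not start training.

```python
import shlex

# Dataset keys named in the text above.
DATASETS = ("dancetrack", "sportsmot", "bft")


def build_train_command(dataset, data_root, num_processes=4, checkpoint_frames=None):
    """Assemble the training command for one dataset as an argument list.

    Hypothetical helper; the naming pattern for non-DanceTrack datasets is an
    assumption based on "replace dancetrack with sportsmot or bft".
    """
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset {dataset!r}; expected one of {DATASETS}")
    cmd = [
        "accelerate", "launch", f"--num_processes={num_processes}", "train.py",
        "--data-root", data_root,
        "--exp-name", f"fdta_{dataset}",
        "--config-path", f"./configs/{dataset}.yaml",
        "--detr-pretrain", f"./pretrains/r50_deformable_detr_coco_{dataset}.pth",
    ]
    if checkpoint_frames is not None:
        # Optional memory-saving flag from the note above.
        cmd += ["--detr-num-checkpoint-frames", str(checkpoint_frames)]
    return cmd


if __name__ == "__main__":
    # Print the command so it can be inspected or pasted into a shell.
    print(shlex.join(build_train_command("sportsmot", "/path/to/your/datasets/", checkpoint_frames=2)))
```

Returning a list rather than a string keeps the arguments safe to hand to `subprocess.run` without shell quoting issues.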

5. Inference

We support two inference modes:

submit: Generate tracker files for submission.
evaluate: Generate tracker files and compute evaluation metrics.

accelerate launch --num_processes=4 submit_and_evaluate.py \
  --data-root /path/to/your/datasets/ \
  --inference-mode evaluate \
  --config-path ./configs/dancetrack.yaml \
  --inference-model ./checkpoints/dancetrack.pth \
  --outputs-dir ./outputs/ \
  --inference-dataset DanceTrack \
  --inference-split val

accelerate launch --num_processes=4 submit_and_evaluate.py \
  --data-root /path/to/your/datasets/ \
  --inference-mode submit \
  --config-path ./configs/dancetrack.yaml \
  --inference-model ./checkpoints/dancetrack.pth \
  --outputs-dir ./outputs/ \
  --inference-dataset DanceTrack \
  --inference-split test

You can add --inference-dtype FP16 for faster inference with minimal loss in tracking performance.
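
Since not every dataset ships every split (the directory tree under Data Preparation lists no val/ folder for BFT), a small pre-flight check can catch a mismatched mode, dataset, or split before a multi-GPU job starts. This is a hedged sketch: `inference_args` is a hypothetical helper, and passing `SportsMOT` and `BFT` to --inference-dataset is an assumption based on the dataset folder names, since only `DanceTrack` appears in the commands above.

```python
# Splits listed for each dataset in the directory tree under "Data Preparation".
SPLITS = {
    "DanceTrack": ("train", "val", "test"),
    "SportsMOT": ("train", "val", "test"),
    "BFT": ("train", "test"),
}

# The two inference modes described above.
MODES = ("submit", "evaluate")


def inference_args(mode, dataset, split):
    """Validate a mode/dataset/split combination and return the matching CLI flags.

    Hypothetical helper for illustration; not part of the FDTA codebase.
    """
    if mode not in MODES:
        raise ValueError(f"unknown mode {mode!r}; expected one of {MODES}")
    if dataset not in SPLITS:
        raise ValueError(f"unknown dataset {dataset!r}; expected one of {tuple(SPLITS)}")
    if split not in SPLITS[dataset]:
        raise ValueError(f"{dataset} has no {split!r} split; available: {SPLITS[dataset]}")
    return [
        "--inference-mode", mode,
        "--inference-dataset", dataset,
        "--inference-split", split,
    ]


if __name__ == "__main__":
    print(" ".join(inference_args("evaluate", "DanceTrack", "val")))
```

The check only reflects which split folders the layout lists; whether ground truth is available for a given split still determines if evaluate mode can compute metrics.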

Main Results

DanceTrack

Training Data	HOTA	IDF1	AssA	MOTA	DetA
train	71.7	77.2	63.5	91.3	81.0
train+val	74.4	80.0	67.0	92.2	82.7

SportsMOT

Training Data	HOTA	IDF1	AssA	MOTA	DetA
train	74.2	78.5	65.5	93.0	84.1

BFT

Training Data	HOTA	IDF1	AssA	MOTA	DetA
train	72.2	84.2	74.5	78.2	70.1
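
As a reading aid for the tables: in the HOTA metric, the score at each localization threshold is the geometric mean of detection accuracy and association accuracy, and the final HOTA averages those scores over thresholds. The sketch below therefore only approximates the reported HOTA from the reported DetA and AssA. It is not how TrackEval computes the metric, and the row labels are just names for the table rows above.

```python
import math

# (HOTA, AssA, DetA) copied from the tables above.
RESULTS = {
    "DanceTrack (train)": (71.7, 63.5, 81.0),
    "DanceTrack (train+val)": (74.4, 67.0, 82.7),
    "SportsMOT (train)": (74.2, 65.5, 84.1),
    "BFT (train)": (72.2, 74.5, 70.1),
}


def approx_hota(ass_a, det_a):
    """Geometric mean of AssA and DetA, a rough stand-in for the threshold-averaged HOTA."""
    return math.sqrt(ass_a * det_a)


for name, (hota, ass_a, det_a) in RESULTS.items():
    approx = approx_hota(ass_a, det_a)
    print(f"{name}: reported HOTA {hota:.1f}, sqrt(AssA * DetA) = {approx:.1f}")
```

The relation makes the trade-off visible: on BFT the association score exceeds the detection score, while on DanceTrack and SportsMOT detection is the stronger component and association limits HOTA.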

Acknowledgements

The code is built on top of these awesome repositories. We thank the authors for open-sourcing their code.

Citation

If you find our work useful for your research, please consider citing:

@article{shao2025fdta,
  title={From Detection to Association: Learning Discriminative Object Embeddings for Multi-Object Tracking},
  author={Shao, Yuqing and Yang, Yuchen and Yu, Rui and Li, Weilong and Guo, Xu and Yan, Huaicheng and Wang, Wei and Sun, Xiao},
  journal={arXiv preprint arXiv:2512.02392},
  year={2025}
}
