Detection Transformers with Assignment

By Jeffrey Ouyang-Zhang, Jang Hyun Cho, Xingyi Zhou, Philipp Krähenbühl

This repository is an official implementation of the paper NMS Strikes Back.

TL;DR. Detection Transformers with Assignment (DETA) re-introduces IoU-based assignment and NMS for transformer-based detectors. DETA trains and tests about as fast as Deformable-DETR and converges much faster (50.2 mAP in 12 epochs on COCO).

[Figure: DETR's one-to-one bipartite matching vs. our many-to-one IoU-based assignment]
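
The figure contrasts DETR's one-to-one bipartite matching with DETA's many-to-one IoU-based assignment, in which several queries may be assigned to the same ground-truth box and duplicates are later removed with standard NMS. Below is a minimal sketch of such an IoU-based assignment in PyTorch; the function name, the 0.6 threshold, and the overall structure are illustrative assumptions, not the repository's exact implementation.

```python
# Illustrative sketch of many-to-one IoU-based assignment
# (assumed structure, not the repository's exact code).
import torch
from torchvision.ops import box_iou

def assign_by_iou(pred_boxes, gt_boxes, iou_threshold=0.6):
    """Assign each predicted box (xyxy, shape (N, 4)) to the ground-truth box
    (xyxy, shape (M, 4)) it overlaps most, if that IoU exceeds `iou_threshold`.
    Several predictions may share one ground truth (many-to-one), unlike
    DETR's one-to-one bipartite matching.
    Returns a (N,) tensor of GT indices, with -1 meaning background.
    """
    if gt_boxes.numel() == 0:  # no ground truth: everything is background
        return torch.full((pred_boxes.size(0),), -1, dtype=torch.long,
                          device=pred_boxes.device)
    ious = box_iou(pred_boxes, gt_boxes)               # (N, M) pairwise IoU
    max_iou, gt_idx = ious.max(dim=1)                  # best GT per prediction
    assignments = torch.where(max_iou >= iou_threshold, gt_idx,
                              torch.full_like(gt_idx, -1))
    # Keep at least one positive per ground truth: its best-overlapping prediction.
    best_pred_per_gt = ious.argmax(dim=0)              # (M,)
    assignments[best_pred_per_gt] = torch.arange(gt_boxes.size(0),
                                                 device=gt_boxes.device)
    return assignments
```

At inference, overlapping duplicate predictions are then suppressed with standard NMS (e.g. torchvision.ops.nms), which one-to-one matching in DETR-style detectors is normally designed to avoid.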

Main Results

| Method | Epochs | COCO val AP | Total train time (hours, 8 GPUs) | Batch infer speed (FPS) | URL |
|--------|--------|-------------|----------------------------------|-------------------------|-----|
| Two-stage Deformable DETR | 50 | 46.9 | 42.5 | - | see DeformDETR |
| Improved Deformable DETR | 50 | 49.6 | 66.6 | 13.4 | config / log / model |
| DETA | 12 | 50.1 | 16.3 | 12.7 | config / log / model |
| DETA | 24 | 51.1 | 32.5 | 12.7 | config / log / model |
| DETA (Swin-L) | 24 | 62.9 | 100 | 4.2 | config-O365 / model-O365 / config / model |

Notes:

  1. Unless otherwise specified, the model uses a ResNet-50 backbone, and training (ResNet-50) is done on 8 Nvidia Quadro RTX 6000 GPUs.
  2. Inference speed is measured on an Nvidia Tesla V100 GPU.
  3. "Batch Infer Speed" refers to inference with batch size = 4 to maximize GPU utilization; a rough timing sketch is given after these notes.
  4. Improved Deformable DETR implements two-stage Deformable DETR with improved hyperparameters (e.g., more queries and more feature levels; see the full list here).
  5. DETA with the Swin-L backbone is pretrained on Object-365 and fine-tuned on COCO. This model attains 63.5 AP on COCO test-dev. Times refer to fine-tuning (O365 pre-training takes 14000 GPU hours). We additionally provide the pre-trained Object365 config and model prior to fine-tuning.
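
A back-of-the-envelope way to reproduce the "Batch Infer Speed" numbers is to time forward passes at batch size 4 and report images per second. The sketch below is an assumed measurement protocol (the input resolution, warm-up count, and iteration count are placeholders), not the benchmarking code used for the table above.

```python
# Rough FPS measurement sketch (assumed protocol; values are placeholders).
import time
import torch

@torch.no_grad()
def measure_fps(model, batch_size=4, num_iters=50, device="cuda"):
    model.eval().to(device)
    # Placeholder input resolution; real benchmarking would use COCO images.
    dummy = torch.randn(batch_size, 3, 800, 1216, device=device)
    for _ in range(10):                      # warm-up iterations
        model(dummy)
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(num_iters):
        model(dummy)
    torch.cuda.synchronize()
    elapsed = time.perf_counter() - start
    return num_iters * batch_size / elapsed  # images per second (FPS)
```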

Installation

Please follow the instructions from Deformable-DETR for installation, data preparation, and additional usage examples. Tested on torch1.8.0+cuda10.1, torch1.6.0+cuda9.2, and torch1.11.0+cuda11.3.

Usage

Evaluation

You can evaluate our pretrained DETA models from the table above on the COCO 2017 validation set:

./configs/deta.sh --eval --coco_path ./data/coco --resume <path_to_model>

You can also run distributed evaluation:

GPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8 ./configs/deta.sh \
    --eval --coco_path ./data/coco --resume <path_to_model>

You can also run distributed evaluation on our Swin-L model:

GPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8 ./configs/deta_swin_ft.sh \
    --eval --coco_path ./data/coco --resume <path_to_model>

Training

Training on a single node

Training DETA on 8 GPUs:

GPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8 ./configs/deta.sh --coco_path ./data/coco

Training on a Slurm cluster

If you are using a Slurm cluster, you can simply run the following command to train on 1 node with 8 GPUs:

GPUS_PER_NODE=8 ./tools/run_dist_slurm.sh <partition> deta 8 configs/deta.sh \
    --coco_path ./data/coco

To fine-tune DETA with a Swin-L backbone on 2 nodes, each with 8 GPUs:

GPUS_PER_NODE=8 ./tools/run_dist_slurm.sh <partition> deta 16 configs/deta_swin_ft.sh \
    --coco_path ./data/coco --finetune <path_to_o365_model>

License

This project builds heavily off of Deformable-DETR and Detectron2. Please refer to their original licenses for more details. If you are using the Swin-L backbone, please also see the original Swin license.

Citing DETA

If you find DETA useful in your research, please consider citing:

@article{ouyangzhang2022nms,
  title={NMS Strikes Back},
  author={Ouyang-Zhang, Jeffrey and Cho, Jang Hyun and Zhou, Xingyi and Kr{\"a}henb{\"u}hl, Philipp},
  journal={arXiv preprint arXiv:2212.06137},
  year={2022}
}