RTMDet: An Empirical Study of Designing Real-Time Object Detectors

Abstract

In this paper, we aim to design an efficient real-time object detector that exceeds the YOLO series and is easily extensible for many object recognition tasks such as instance segmentation and rotated object detection. To obtain a more efficient model architecture, we explore an architecture that has compatible capacities in the backbone and neck, constructed by a basic building block that consists of large-kernel depth-wise convolutions. We further introduce soft labels when calculating matching costs in the dynamic label assignment to improve accuracy. Together with better training techniques, the resulting object detector, named RTMDet, achieves 52.8% AP on COCO with 300+ FPS on an NVIDIA 3090 GPU, outperforming the current mainstream industrial detectors. RTMDet achieves the best parameter-accuracy trade-off with tiny/small/medium/large/extra-large model sizes for various application scenarios, and obtains new state-of-the-art performance on real-time instance segmentation and rotated object detection. We hope the experimental results can provide new insights into designing versatile real-time object detectors for many object recognition tasks.

RTMDet-l model structure

Results and Models

Object Detection

| Model | Input size | Box AP | Params (M) | FLOPs (G) | TRT-FP16 latency (ms) | Config | Download |
| --- | --- | --- | --- | --- | --- | --- | --- |
| RTMDet-tiny | 640 | 41.0 | 4.8 | 8.1 | 0.98 | config | model / log |
| RTMDet-s | 640 | 44.6 | 8.89 | 14.8 | 1.22 | config | model / log |
| RTMDet-m | 640 | 49.3 | 24.71 | 39.27 | 1.62 | config | model / log |
| RTMDet-l | 640 | 51.4 | 52.3 | 80.23 | 2.44 | config | model / log |
| RTMDet-x | 640 | 52.8 | 94.86 | 141.67 | 3.10 | config | model / log |

Note:

  1. The inference speed of RTMDet is measured on an NVIDIA 3090 GPU with TensorRT 8.4.3, cuDNN 8.2.0, FP16, batch size=1, and without NMS.
  2. For a fair comparison, the config of bbox postprocessing is changed to be consistent with YOLOv5/6/7 after PR#9494, bringing about 0.1~0.3% AP improvement.
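
The latency column and the FPS figure quoted in the abstract are two views of the same measurement. The sketch below converts the latencies listed in the Object Detection table into throughput, assuming batch size 1 as stated in note 1. The helper name `latency_to_fps` and the `LATENCY_MS` dictionary are illustrative only and are not part of this repository.

```python
# Latency values (ms) copied from the Object Detection table.
# They were measured with TensorRT FP16, batch size 1, and without NMS,
# so end-to-end throughput that includes NMS would be lower.
LATENCY_MS = {
    "RTMDet-tiny": 0.98,
    "RTMDet-s": 1.22,
    "RTMDet-m": 1.62,
    "RTMDet-l": 2.44,
    "RTMDet-x": 3.10,
}


def latency_to_fps(latency_ms: float, batch_size: int = 1) -> float:
    """Convert per-batch latency in milliseconds to images per second."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return batch_size * 1000.0 / latency_ms


# Throughput implied by each latency entry (hypothetical helper output).
FPS = {name: latency_to_fps(ms) for name, ms in LATENCY_MS.items()}

if __name__ == "__main__":
    for name, fps in FPS.items():
        print(f"{name}: {fps:.0f} FPS")
```

Under these assumptions, the 3.10 ms latency of RTMDet-x corresponds to roughly 323 images per second, which is consistent with the "300+ FPS" figure reported alongside 52.8% AP.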

Citation

@misc{lyu2022rtmdet,
      title={RTMDet: An Empirical Study of Designing Real-Time Object Detectors},
      author={Chengqi Lyu and Wenwei Zhang and Haian Huang and Yue Zhou and Yudong Wang and Yanyi Liu and Shilong Zhang and Kai Chen},
      year={2022},
      eprint={2212.07784},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}