Fast Convergence of DETR with Spatially Modulated Co-Attention


SMCA DETR has no extra compiled components and minimal package dependencies, so the code is simple to use. We provide instructions on how to install the dependencies via conda. First, clone the repository locally:

git clone

Then, install PyTorch 1.5+ and torchvision 0.6+:

conda install -c pytorch pytorch torchvision
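To confirm that the environment meets the stated minimums (PyTorch 1.5+, torchvision 0.6+), a small check like the one below may help. It is a minimal sketch: `meets_minimum` is a hypothetical helper, not part of this repository, and it compares only the leading major.minor numbers of the version string. It also assumes the installed packages expose metadata under the names `torch` and `torchvision`.

```python
import re
from importlib import metadata


def meets_minimum(version, minimum):
    """Return True if `version` (e.g. "1.7.1+cu110") is at least `minimum`.

    Hypothetical helper, not part of SMCA-DETR. Only the leading
    major.minor numbers are compared; any suffix is ignored.
    """
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups()) >= tuple(minimum)


if __name__ == "__main__":
    # Minimum versions taken from the installation step above.
    for name, minimum in (("torch", (1, 5)), ("torchvision", (0, 6))):
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            print(f"{name}: not installed")
        else:
            status = "OK" if meets_minimum(installed, minimum) else "too old"
            print(f"{name} {installed}: {status}")
```

Comparing integer tuples rather than raw strings avoids the usual pitfall where "0.10" sorts before "0.6" lexically.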

Install pycocotools (for evaluation on COCO) and scipy (for training):

conda install cython scipy
pip install -U 'git+'

That's it; you should now be able to train and evaluate detection models.

(Optional) To work with panoptic segmentation, install panopticapi:

pip install git+

Data preparation

Download and extract the COCO 2017 train and val images with annotations. We expect the directory structure to be the following:

  annotations/  # annotation json files
  train2017/    # train images
  val2017/      # val images
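Before launching a long training run, it can be worth checking that the dataset root matches this layout. The sketch below is an illustration only: `missing_coco_dirs` is a hypothetical helper that is not part of this repository, and `/path/to/coco` is the same placeholder used in the training command.

```python
from pathlib import Path

# Sub-directories named in the expected layout above.
EXPECTED_SUBDIRS = ("annotations", "train2017", "val2017")


def missing_coco_dirs(coco_path):
    """Return the expected sub-directories that are absent under `coco_path`.

    Hypothetical helper for illustration; it checks directory names only,
    not the contents of the annotation files or image folders.
    """
    root = Path(coco_path)
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_coco_dirs("/path/to/coco")  # placeholder path
    print("layout OK" if not missing else f"missing: {missing}")
```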


To train Single Scale SMCA on a single node with 8 GPUs for 50 epochs, run:

python -m torch.distributed.launch --nproc_per_node=8 --use_env --coco_path /path/to/coco --batch_size 2 --lr_drop 40 --num_queries 300 --epochs 50 --dynamic_scale type3 --output_dir smca_single_scale
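A note on `--lr_drop 40`: in DETR, which this code builds on, the flag sets the period of a step schedule that multiplies the learning rate by 0.1. The sketch below assumes SMCA keeps that behavior and assumes DETR's default base rate of 1e-4; neither is stated in this README, so treat both as assumptions. `step_lr` is a hypothetical stand-in written in plain Python, not a function from the repository.

```python
def step_lr(base_lr, epoch, lr_drop, gamma=0.1):
    """Learning rate at a 0-indexed `epoch` under a step schedule.

    The rate is multiplied by `gamma` once every `lr_drop` epochs.
    Hypothetical stand-in for illustration only.
    """
    return base_lr * gamma ** (epoch // lr_drop)


BASE_LR = 1e-4  # assumption: DETR's default, not given in this README
LR_DROP = 40    # --lr_drop 40
EPOCHS = 50     # --epochs 50

if __name__ == "__main__":
    # Show the rate just before and just after the drop, and at the last epoch.
    for epoch in (0, LR_DROP - 1, LR_DROP, EPOCHS - 1):
        print(f"epoch {epoch:2d}: lr = {step_lr(BASE_LR, epoch, LR_DROP):.1e}")
```

Under these assumptions, the first 40 epochs run at the base rate and the final 10 epochs run at one tenth of it.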

A single epoch takes about 30 minutes, so 50-epoch training takes around 25 hours on a single machine with 8 V100 cards.
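The arithmetic behind this estimate, along with the total batch size implied by the command, can be sketched as follows. The per-GPU reading of `--batch_size` follows DETR's convention and is an assumption here, not something this README confirms.

```python
GPUS = 8                # --nproc_per_node=8
BATCH_PER_GPU = 2       # --batch_size 2, assumed to be per process as in DETR
EPOCHS = 50             # --epochs 50
MINUTES_PER_EPOCH = 30  # figure quoted above for 8 V100 cards

total_hours = EPOCHS * MINUTES_PER_EPOCH / 60  # wall-clock time
gpu_hours = total_hours * GPUS                 # summed over all cards
effective_batch = GPUS * BATCH_PER_GPU         # images per optimizer step

if __name__ == "__main__":
    print(f"wall-clock hours: {total_hours}")
    print(f"GPU-hours:        {gpu_hours}")
    print(f"total batch size: {effective_batch}")
```

Changing the number of GPUs changes the total batch size as well, so the epoch time and final accuracy quoted here may not carry over to other setups.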

Object Detection

Model Zoo

| # | name | dataset | backbone | schedule | box AP |
|---|------|---------|----------|----------|--------|
| 0 | SMCA(single scale) | MSCOCO | R50 | 50 | 41.0 |
| 1 | SMCA-Container(single scale) | MSCOCO | Container-S-Light | 50 | 44.2 |
| 2 | SMCA-Container(single scale) | MSCOCO | Container-M | 50 | 47.3 |
| 3 | SMCA(single scale) | MSCOCO | R50 | 108 | 42.7 |
| 4 | SMCA(single scale) | MSCOCO | R50 | 250 | 43.5 |
| 5 | SMCA(multi scale) | MSCOCO | R50 | 50 | 43.7 |
| 6 | SMCA(New multi scale) | MSCOCO | R50 | 50 | 44.4 |
| 7 | SMCA | Visual Genome | R50 | 50 | coming soon |
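To compare entries, the table can be held as plain records and each row measured against the 50-schedule single-scale R50 baseline. This is only a reading aid: the numbers are copied from the table above, and `ap_gains` is a hypothetical helper that is not part of the repository.

```python
# (name, backbone, schedule, box AP), copied from the table above.
# The Visual Genome row is omitted because its result is "coming soon".
MODEL_ZOO = [
    ("SMCA(single scale)", "R50", 50, 41.0),
    ("SMCA-Container(single scale)", "Container-S-Light", 50, 44.2),
    ("SMCA-Container(single scale)", "Container-M", 50, 47.3),
    ("SMCA(single scale)", "R50", 108, 42.7),
    ("SMCA(single scale)", "R50", 250, 43.5),
    ("SMCA(multi scale)", "R50", 50, 43.7),
    ("SMCA(New multi scale)", "R50", 50, 44.4),
]


def ap_gains(rows, baseline_index=0):
    """Return (name, backbone, schedule, AP gain) relative to one baseline row."""
    baseline_ap = rows[baseline_index][3]
    return [
        (name, backbone, schedule, round(ap - baseline_ap, 1))
        for name, backbone, schedule, ap in rows
    ]


if __name__ == "__main__":
    for name, backbone, schedule, gain in ap_gains(MODEL_ZOO):
        print(f"{name:30s} {backbone:18s} {schedule:4d}  {gain:+.1f} AP")
```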

Panoptic Segmentation

Model Zoo

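| # | name | dataset | backbone | schedule | PQ | SQ | RQ |
|---|------|---------|----------|----------|----|----|----|
| 1 | MASK-Former(single scale) | MSCOCO | R50 | 500 | 46.5 | 80.4 | 56.8 |
| 2 | SMCA-MASK-Former(single scale) | MSCOCO | R50 | 50 | 46.0 | 80.4 | 56.0 |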
## Original SMCA code submission during ICCV review period.

Release Steps

  1. Single-scale SMCA
  2. Single-scale SMCA with Container-Small
  3. Single-scale SMCA with Container-Medium
  4. New Multi-scale SMCA (Newly added, 9th Sep)
  5. SMCA-DETR for Fast Convergence of Panoptic Segmentation


If you find this repository useful, please consider citing our work:

  title={Fast convergence of detr with spatially modulated co-attention},
  author={Gao, Peng and Zheng, Minghang and Wang, Xiaogang and Dai, Jifeng and Li, Hongsheng},
  journal={arXiv preprint arXiv:2101.07448},

  title={Container: Context Aggregation Network},
  author={Gao, Peng and Lu, Jiasen and Li, Hongsheng and Mottaghi, Roozbeh and Kembhavi, Aniruddha},
  journal={arXiv preprint arXiv:2106.01401},

  title={End-to-end object detection with adaptive clustering transformer},
  author={Zheng, Minghang and Gao, Peng and Wang, Xiaogang and Li, Hongsheng and Dong, Hao},
  journal={arXiv preprint arXiv:2011.09315},


Peng Gao, Qiu Han, Minghang Zheng


This project borrows heavily from DETR and is partially motivated by Sparse RCNN.

