GAIA-det

More models and demos coming soon! Stay tuned.

Introduction

GAIA-det is an open-source object detection toolbox for building customized AI solutions. It is built on top of gaiavision and mmdet. This repo includes an official re-implementation of our CVPR 2021 paper, "GAIA: A Transfer Learning System of Object Detection That Fits Your Needs" (see Citation).

It provides functionality that helps you customize AI solutions:

  • Design customized search spaces of any type with little effort.
  • Manage models in search space according to your rules.
  • Integrate datasets of various sources.

Requirements

  • Python 3.6+
  • CUDA 10.0+
  • 1.2.7 <= mmcv < 1.3.0
  • 2.8.0 <= mmdet < 2.9.0
  • Others (See requirements.txt)
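
The version bounds above can be checked before training. The following is a minimal sketch, not part of GAIA-det: the helper names `parse_version`, `in_range`, and `check_installed` are hypothetical, and it assumes the installed `mmcv` and `mmdet` packages expose a `__version__` attribute.

```python
import importlib


def parse_version(text):
    """Turn '1.2.7' into (1, 2, 7); a suffix such as 'rc1' is ignored."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def in_range(version, low, high):
    """True if low <= version < high."""
    return parse_version(low) <= parse_version(version) < parse_version(high)


# Bounds copied from the requirements list above.
REQUIRED = {
    "mmcv": ("1.2.7", "1.3.0"),
    "mmdet": ("2.8.0", "2.9.0"),
}


def check_installed():
    """Return {package: (installed_version_or_None, ok)} for each requirement."""
    report = {}
    for name, (low, high) in REQUIRED.items():
        try:
            module = importlib.import_module(name)
        except Exception:
            # Missing package, or an import-time failure inside the package.
            report[name] = (None, False)
            continue
        version = getattr(module, "__version__", None)
        ok = version is not None and in_range(version, low, high)
        report[name] = (version, ok)
    return report


if __name__ == "__main__":
    for name, (version, ok) in check_installed().items():
        print(name, version, "OK" if ok else "outside the supported range")
```

The upper bounds are exclusive, so mmcv 1.3.0 and mmdet 2.9.0 are reported as outside the supported range.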

Installation

git clone https://github.com/GAIA-vision/GAIA-det.git && cd GAIA-det
pip install -r requirements.txt
pip install -e .

Prepare Supernet

| Type | Backbone | Model | Data | Cloud Storage | Password |
|---|---|---|---|---|---|
| WEIGHTS | AR50to101 | Faster | COCO+Obj365 | BaiduCloud | tm5n |
| FLOPS_LUT | AR50to101 | Faster | COCO+Obj365 | BaiduCloud | ttwq |
| WEIGHTS | AR50to101 | Faster | COCO+Obj365+OID500 | Coming soon | |
| FLOPS_LUT | AR50to101 | Faster | COCO+Obj365+OID500 | Coming soon | |

Benchmark

Finetuning (Upstream-COCO)

| Backbone | Pretrain | Model | Depth | Width | Input Scale | Lr schd | FLOPS | box AP (paper) | box AP (repo) |
|---|---|---|---|---|---|---|---|---|---|
| ResNet50 | ImageNet | Faster | 3, 4, 6, 3 | 64, 64, 128, 256, 512 | 800 | 1x | 139G | 37.1 | 37.6 |
| ResNet50 | ImageNet | Faster | 3, 4, 6, 3 | 64, 64, 128, 256, 512 | 800 | 4x | 139G | None | 40.3 |
| 45-50GF | GAIADET | Faster | 2, 4, 5, 3 | 64, 64, 96, 192, 384 | 480 | 1x | 49G | 40.4 | 40.7 |
| 70-75GF | GAIADET | Faster | 4, 6, 27, 4 | 48, 64, 128, 192, 512 | 480 | 1x | 71G | 42.6 | 43.1 |
| 85-90GF | GAIADET | Faster | 3, 4, 21, 4 | 48, 64, 160, 192, 640 | 560 | 1x | 90G | 43.6 | 44.4 |
| 110-115GF | GAIADET | Faster | 2, 4, 25, 4 | 64, 64, 160, 192, 640 | 640 | 1x | 115G | 44.5 | 44.8 |
| 135-140GF | GAIADET | Faster | 4, 4, 15, 4 | 48, 48, 128, 192, 512 | 800 | 1x | 139G | 45.3 | 45.6 |

For fairness, we also compare our results with ResNet50 trained on COCO with a 4x schedule, because COCO data was already used for a 3x schedule during upstream training.
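
Each GAIADET row in the table is a subnet described by its depth, width, and input scale, so choosing a model for a compute budget amounts to a lookup over those rows. The sketch below illustrates one possible selection rule (highest repo box AP under a FLOPS cap). The `Subnet` class and `best_under_budget` function are hypothetical names for illustration and are not the GAIA-det API; the numbers are copied from the table.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Subnet:
    name: str
    depth: Tuple[int, ...]
    width: Tuple[int, ...]
    input_scale: int
    gflops: float
    box_ap: float  # "box AP (repo)" column, 1x schedule


# GAIADET rows from the Upstream-COCO finetuning table.
SUBNETS = [
    Subnet("45-50GF", (2, 4, 5, 3), (64, 64, 96, 192, 384), 480, 49, 40.7),
    Subnet("70-75GF", (4, 6, 27, 4), (48, 64, 128, 192, 512), 480, 71, 43.1),
    Subnet("85-90GF", (3, 4, 21, 4), (48, 64, 160, 192, 640), 560, 90, 44.4),
    Subnet("110-115GF", (2, 4, 25, 4), (64, 64, 160, 192, 640), 640, 115, 44.8),
    Subnet("135-140GF", (4, 4, 15, 4), (48, 48, 128, 192, 512), 800, 139, 45.6),
]


def best_under_budget(subnets: Sequence[Subnet], max_gflops: float) -> Optional[Subnet]:
    """Return the subnet with the highest box AP whose FLOPS fit the budget."""
    candidates = [s for s in subnets if s.gflops <= max_gflops]
    if not candidates:
        return None  # nothing in the table is cheap enough
    return max(candidates, key=lambda s: s.box_ap)


if __name__ == "__main__":
    choice = best_under_budget(SUBNETS, max_gflops=100)
    print(choice.name, choice.input_scale, choice.box_ap)
```

With a 100 GFLOPS cap this picks the 85-90GF row; other rules (for example, a minimum AP with the lowest FLOPS) would only change the filter and the sort key.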

Compatibility with other methods

| Backbone | Pretrain | Model | Depth | Width | Input Scale | Lr schd | Methods | box AP (paper) | box AP (repo) |
|---|---|---|---|---|---|---|---|---|---|
| ResNet50 | ImageNet | Faster | 3, 4, 6, 3 | 64, 64, 128, 256, 512 | 800 | 1x | N | 37.1 | 37.6 |
| ResNet50 | ImageNet | Faster | 3, 4, 6, 3 | 64, 64, 128, 256, 512 | 800 | 1x | Y | 45.8 | 44.5 |
| 135-140GF | GAIADET | Faster | 4, 4, 15, 4 | 48, 48, 128, 192, 512 | 800 | 1x | N | 45.3 | 45.6 |
| 135-140GF | GAIADET | Faster | 4, 4, 15, 4 | 48, 48, 128, 192, 512 | 800 | 1x | Y | 49.1 | 48.5 |

The Methods column indicates whether Deformable Convolution and Cascaded Head are applied (Y) or not (N).

Finetuning (Downstream-BDD100k)

| Backbone | Model | Depth | Width | Input Scale | Lr schd | FLOPS | box AP (paper) | box AP (repo) |
|---|---|---|---|---|---|---|---|---|
| ResNet50 | Faster | 3, 4, 6, 3 | 64, 64, 128, 256, 512 | 800 | 1x | 139G | None | 30.1 |
| 45-50GF | Faster | 3, 4, 5, 2 | 48, 64, 96, 192, 384 | 480 | 1x | 49G | None | 27.4 |
| 70-75GF | Faster | 4, 2, 15, 2 | 48, 48, 128, 192, 512 | 560 | 1x | 71G | None | 29.5 |
| 85-90GF | Faster | 2, 2, 15, 3 | 64, 64, 128, 192, 384 | 640 | 1x | 87G | None | 32.1 |
| 135-140GF | Faster | 4, 6, 23, 3 | 48, 80, 128, 192, 512 | 720 | 1x | 139G | None | 32.9 |

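Finetuning (Downstream-UODB)

| Dataset | KITTI | VOC | WiderFace | LISA | Kitchen | DOTA | DeepLesion | Comic | Clipart | Watercolor | Avg. |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ResNet50(paper) | 67.1 | 81.5 | 62.1 | 90.0 | 89.5 | 68.3 | 57.4 | 45.5 | 31.2 | 53.4 | 64.6 |
| GAIA(paper) | 75.6 | 87.4 | 62.7 | 92.1 | 90.1 | 70.8 | 62.1 | 61.1 | 72.2 | 69.7 | 74.4 |
| ResNet50(repo) | | | | | | | | | | | |
| GAIA(repo) | | | | | | | | | | | |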

All models are around 139 GFLOPS, and the metric reported above is AP50.
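
The Avg. column matches the arithmetic mean of the ten per-dataset AP50 scores, rounded to one decimal. A small sketch that reproduces it from the paper rows and reports the dataset with the largest gain (the variable and function names are illustrative only):

```python
DATASETS = [
    "KITTI", "VOC", "WiderFace", "LISA", "Kitchen",
    "DOTA", "DeepLesion", "Comic", "Clipart", "Watercolor",
]

# AP50 values copied from the "(paper)" rows of the UODB table.
PAPER_AP50 = {
    "ResNet50": [67.1, 81.5, 62.1, 90.0, 89.5, 68.3, 57.4, 45.5, 31.2, 53.4],
    "GAIA": [75.6, 87.4, 62.7, 92.1, 90.1, 70.8, 62.1, 61.1, 72.2, 69.7],
}


def mean_ap50(scores):
    """Arithmetic mean over datasets, rounded to one decimal like the table."""
    return round(sum(scores) / len(scores), 1)


def largest_gain(baseline, improved, names):
    """Return (dataset, gain) where improved exceeds baseline by the most."""
    gains = [(name, round(b - a, 1)) for name, a, b in zip(names, baseline, improved)]
    return max(gains, key=lambda item: item[1])


if __name__ == "__main__":
    for model, scores in PAPER_AP50.items():
        print(model, mean_ap50(scores))
    print(largest_gain(PAPER_AP50["ResNet50"], PAPER_AP50["GAIA"], DATASETS))
```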

Data Preparation

Please refer to DATA_PREPARATION.

Usage

Please refer to USAGE for generic use.

Citation

If you like our work and use the code or models for your research or project, please star our repo and cite our work as follows.

@InProceedings{Bu_2021_CVPR,
    author    = {Bu, Xingyuan* and Peng, Junran* and Yan, Junjie and Tan, Tieniu and Zhang, Zhaoxiang},
    title     = {GAIA: A Transfer Learning System of Object Detection That Fits Your Needs},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {274-283}
}