This repo contains the official PyTorch implementation of D&R.
1. Check Requirements
- Linux with Python >= 3.6
- PyTorch >= 1.6 & torchvision that matches the PyTorch version.
- CUDA 10.1, 10.2
- GCC >= 4.9
2. Build
- Create a virtual environment (optional)
conda create -n dandr python=3.7
conda activate dandr
- Install PyTorch according to your CUDA version
- Install Detectron2 (the version of Detectron2 must be 0.3)
python3 -m pip install detectron2==0.3 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu101/torch1.6/index.html
- Install other requirements.
python3 -m pip install -r requirements.txt
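Once the steps above are done, the installed versions can be compared against the requirements in step 1. The following is a minimal sketch and not part of this repo: the helper names `parse_version`, `meets_minimum`, and `check_environment` are hypothetical, and only the version thresholds come from this README. GCC is not checked.

```python
import sys


def parse_version(text):
    """Turn a string such as '1.6.0+cu101' into (1, 6, 0).

    Parsing stops at the first component that is not purely numeric.
    """
    parts = []
    for piece in text.split("+")[0].split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(found, minimum):
    """Return True if version string `found` is at least `minimum`."""
    return parse_version(found) >= parse_version(minimum)


def check_environment():
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    if sys.version_info < (3, 6):
        problems.append("Python >= 3.6 is required")
    try:
        import torch
    except ImportError:
        problems.append("PyTorch is not installed")
    else:
        if not meets_minimum(torch.__version__, "1.6"):
            problems.append("PyTorch >= 1.6 is required, found " + torch.__version__)
        # torch.version.cuda is None for CPU-only builds.
        if torch.version.cuda not in ("10.1", "10.2"):
            problems.append("CUDA 10.1 or 10.2 is listed, found " + str(torch.version.cuda))
    try:
        import detectron2
    except ImportError:
        problems.append("Detectron2 is not installed")
    else:
        if detectron2.__version__ != "0.3":
            problems.append("Detectron2 must be 0.3, found " + detectron2.__version__)
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

An empty result only means the listed version constraints hold; it does not confirm that torchvision matches the PyTorch build.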
3. Prepare Data and Weights
- Data Preparation (from DeFRCN)
  | Dataset   | Size | GoogleDrive | BaiduYun | Note                           |
  |-----------|------|-------------|----------|--------------------------------|
  | VOC2007   | 0.8G | download    | download | -                              |
  | VOC2012   | 3.5G | download    | download | -                              |
  | vocsplit  | <1M  | download    | download | from TFA                       |
  | COCO      | ~19G | -           | -        | download from the official site |
  | cocosplit | 174M | download    | download | from TFA                       |
  - Unzip the downloaded data source to datasets and put it into your project directory:
    ...
    datasets
      | -- coco (trainval2014/*.jpg, val2014/*.jpg, annotations/*.json)
      | -- cocosplit
      | -- VOC2007
      | -- VOC2012
      | -- vocsplit
    defrcn
    tools
    ...
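Before training, it can help to confirm that the unzipped data matches the layout listed above. The sketch below is an illustration rather than a script shipped with this repo: `EXPECTED_DIRS` and `missing_dataset_dirs` are hypothetical names, and it only checks that the directories exist, not that their contents are complete.

```python
from pathlib import Path

# Sub-directories of `datasets` taken from the layout in this README.
EXPECTED_DIRS = [
    "coco/trainval2014",
    "coco/val2014",
    "coco/annotations",
    "cocosplit",
    "VOC2007",
    "VOC2012",
    "vocsplit",
]


def missing_dataset_dirs(root="datasets"):
    """Return the expected sub-directories that do not exist under `root`."""
    root = Path(root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    # Run from the project directory, next to `defrcn` and `tools`.
    missing = missing_dataset_dirs()
    if missing:
        print("Missing under datasets:", ", ".join(missing))
    else:
        print("All expected dataset directories were found.")
```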
- Weights Preparation
  - DeFRCN uses ImageNet-pretrained weights to initialize the model. Download the same models (provided by DeFRCN) from GoogleDrive or BaiduYun.
  - Put the checkpoints into ImageNetPretrained/MSRA/R-101.pkl and ImageNetPretrained/torchvision, respectively.
  - We provide the BASE_WEIGHT checkpoints we used (see run_*.sh):
    | Dataset | Split | Size   | GoogleDrive |
    |---------|-------|--------|-------------|
    | VOC2007 | 1     | 203.8M | download    |
    | VOC2007 | 2     | 203.8M | download    |
    | VOC2007 | 3     | 203.8M | download    |
    | COCO    | -     | 206.2M | download    |
- Text Embeddings Preparation
  - Refer to the official implementation of CLIP for text embedding generation.
  - Put the generated text embeddings into 'dataset/clip'.
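As a rough illustration of the text embedding step, the sketch below encodes class names with the official CLIP package (`clip.load`, `clip.tokenize`, `model.encode_text`). Everything specific to this project is an assumption: the prompt template, the `ViT-B/32` model choice, the output file name, and the saved tensor format are not stated in this README, so check what the code in this repo expects before relying on it. The helper names `build_prompts` and `save_text_embeddings` are hypothetical.

```python
import os

# Assumed prompt template; this README does not say which prompt is used.
PROMPT_TEMPLATE = "a photo of a {}"


def build_prompts(class_names, template=PROMPT_TEMPLATE):
    """Fill each class name into the prompt template."""
    return [template.format(name) for name in class_names]


def save_text_embeddings(class_names, output_path, model_name="ViT-B/32"):
    """Encode the prompts with CLIP and save them with torch.save.

    The model name and the saved format are assumptions, not the repo's spec.
    """
    # Imported here so the prompt helper works without torch or CLIP installed.
    import torch
    import clip  # the official OpenAI CLIP package

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model, _preprocess = clip.load(model_name, device=device)
    tokens = clip.tokenize(build_prompts(class_names)).to(device)
    with torch.no_grad():
        embeddings = model.encode_text(tokens).float().cpu()

    directory = os.path.dirname(output_path)
    if directory:
        os.makedirs(directory, exist_ok=True)
    torch.save(embeddings, output_path)
    return embeddings


# Example call (the file name is hypothetical):
# save_text_embeddings(["aeroplane", "bicycle", "bird"],
#                      "dataset/clip/voc_text_embeddings.pt")
```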
4. Training and Evaluation
- To reproduce the results on VOC, where SPLIT_ID is 1, 2 or 3:
sh run_voc.sh SPLIT_ID
- To reproduce the results on COCO:
sh run_coco.sh
- For details of the few-shot object detection pipeline, see run_*.sh.
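To run every experiment in sequence, the commands above can be wrapped in a small driver. This is a convenience sketch under the assumption that `run_voc.sh` takes the split ID as its only argument, as shown in this README; `build_commands` and `run_all` are hypothetical helpers, not files in this repo.

```python
import subprocess


def build_commands(voc_splits=(1, 2, 3), include_coco=True):
    """Return the shell commands from this README as argument lists."""
    commands = [["sh", "run_voc.sh", str(split)] for split in voc_splits]
    if include_coco:
        commands.append(["sh", "run_coco.sh"])
    return commands


def run_all(commands):
    """Run each command in order and stop at the first failure."""
    for command in commands:
        print("Running:", " ".join(command))
        subprocess.run(command, check=True)
```

Calling `run_all(build_commands())` from the project directory runs the three VOC splits and then COCO; pass `include_coco=False` to run VOC only.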
This repo is developed based on DeFRCN and Detectron2. Please check them for more details and features.