Commit
commit 8db8560 (0 parents)
Showing 45 changed files with 3,769 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,128 @@ | ||
## Introduction | ||
>[**Fine-Grained Prototypes Distillation for Few-Shot Object Detection (AAAI2024)**](https://arxiv.org/pdf/2401.07629.pdf) | ||
> | ||
![fpd_architecture](architecture.png) | ||
|
||
This repo is based on [MMFewShot](https://github.com/open-mmlab/mmfewshot). | ||
|
||
## Quick Start | ||
```bash | ||
# create a conda environment | ||
conda create -n fpd python=3.8 | ||
conda activate fpd | ||
conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio cudatoolkit=11.3 -c pytorch -c conda-forge | ||
|
||
# dependencies | ||
pip install openmim | ||
mim install mmcv-full==1.6.0 | ||
mim install mmcls==0.25.0 | ||
mim install mmdet==2.24.0 | ||
|
||
# install mmfewshot | ||
pip install git+https://github.com/open-mmlab/mmfewshot.git | ||
# or manually download the code, then | ||
# cd mmfewshot | ||
# pip install . | ||
|
||
# install FPD | ||
python setup.py develop | ||
``` | ||
|
||
## Prepare Datasets | ||
Please refer to [mmfewshot/data](https://github.com/open-mmlab/mmfewshot/blob/main/tools/data/README.md) | ||
for the data preparation steps. | ||
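
Once the data is prepared, it can help to check that the paths referenced by the dataset configs exist before launching a run. The sketch below is only an illustration: the helper name `missing_paths` is hypothetical, and the path list is an assumption copied from the COCO and VOC dataset configs in this commit, not an official checklist.

```python
from pathlib import Path

# Paths copied from the dataset configs in this commit (assumption:
# commands are run from the repository root). Adjust if your layout differs.
EXPECTED_PATHS = [
    "data/coco",
    "data/few_shot_ann/coco/annotations/train.json",
    "data/few_shot_ann/coco/annotations/val.json",
    "data/VOCdevkit/VOC2007/ImageSets/Main/trainval.txt",
    "data/VOCdevkit/VOC2007/ImageSets/Main/test.txt",
    "data/VOCdevkit/VOC2012/ImageSets/Main/trainval.txt",
]


def missing_paths(root=".", expected=EXPECTED_PATHS):
    """Return the expected paths that do not exist under `root` (hypothetical helper)."""
    root = Path(root)
    return [p for p in expected if not (root / p).exists()]


if __name__ == "__main__":
    for path in missing_paths():
        print(f"missing: {path}")
```

An empty result only means the files are present; it does not validate their contents.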
|
||
## Results on VOC Dataset | ||
* Base Training | ||
|
||
| Config | Split | Base AP50 | ckpt | | ||
|:---:|:---:|:---:|:---:| | ||
|[config](configs/fpd/voc/split1/fpd_r101_c4_2xb4_voc-split1_base-training.py)|1|89.2|[ckpt](https://github.com/wangchen1801/FPD/releases/download/ckpts/fpd_r101_c4_2xb4_voc-split1_base-training_iter_20000.pth)| | ||
|[config](configs/fpd/voc/split2/fpd_r101_c4_2xb4_voc-split2_base-training.py)|2|88.8|[ckpt](https://github.com/wangchen1801/FPD/releases/download/ckpts/fpd_r101_c4_2xb4_voc-split2_base-training_iter_20000.pth)| | ||
|[config](configs/fpd/voc/split3/fpd_r101_c4_2xb4_voc-split3_base-training.py)|3|88.7|[ckpt](https://github.com/wangchen1801/FPD/releases/download/ckpts/fpd_r101_c4_2xb4_voc-split3_base-training_iter_20000.pth)| | ||
|
||
* Few Shot Fine-tuning | ||
|
||
| Config | Split | Shot | Novel AP50 | ckpt | log | | ||
|:---:|:---:|:---:|:---:|:---:|:---:| | ||
|[config](configs/fpd/voc/split1/fpd_r101_c4_2xb4_voc-split1_10shot-fine-tuning.py)|1|10|69.0|[ckpt](https://github.com/wangchen1801/FPD/releases/download/ckpts/fpd_r101_c4_2xb4_voc-split1_10shot-fine-tuning_iter_2400.pth)|[log](https://github.com/wangchen1801/FPD/releases/download/ckpts/fpd_r101_c4_2xb4_voc-split1_10shot-fine-tuning.log)| | ||
|[config](configs/fpd/voc/split2/fpd_r101_c4_2xb4_voc-split2_10shot-fine-tuning.py)|2|10|56.5|[ckpt](https://github.com/wangchen1801/FPD/releases/download/ckpts/fpd_r101_c4_2xb4_voc-split2_10shot-fine-tuning_iter_1800.pth)|[log](https://github.com/wangchen1801/FPD/releases/download/ckpts/fpd_r101_c4_2xb4_voc-split2_10shot-fine-tuning.log)| | ||
|[config](configs/fpd/voc/split3/fpd_r101_c4_2xb4_voc-split3_10shot-fine-tuning.py)|3|10|64.1|[ckpt](https://github.com/wangchen1801/FPD/releases/download/ckpts/fpd_r101_c4_2xb4_voc-split3_10shot-fine-tuning_iter_1200.pth)|[log](https://github.com/wangchen1801/FPD/releases/download/ckpts/fpd_r101_c4_2xb4_voc-split3_10shot-fine-tuning.log)| | ||
|
||
## Results on COCO Dataset | ||
* Base Training | ||
|
||
| Config | Base mAP | ckpt | | ||
|:---:|:---:|:---:| | ||
|[config](configs/fpd/coco/fpd_r101_c4_2xb4_coco_base-training.py)|36.0|[ckpt](https://github.com/wangchen1801/FPD/releases/download/ckpts/fpd_r101_c4_2xb4_coco_base-training_iter_110000.pth)| | ||
|
||
* Few Shot Fine-tuning | ||
|
||
| Config | Shot | Novel mAP (nAP) | ckpt | log | | ||
|:---:|:---:|:---:|:---:|:---:| | ||
|[config](configs/fpd/coco/fpd_r101_c4_2xb4_coco_30shot-fine-tuning.py)|30|20.1|[ckpt]()|[log](https://github.com/wangchen1801/FPD/releases/download/ckpts/fpd_r101_c4_2xb4_coco_30shot-fine-tuning.log)| | ||
|
||
## Evaluation | ||
|
||
```bash | ||
# single-gpu test; pass --eval mAP for VOC or --eval bbox for COCO | ||
python test.py ${CONFIG} ${CHECKPOINT} --eval mAP | ||
|
||
# multi-gpu test | ||
bash dist_test.sh ${CONFIG} ${CHECKPOINT} ${NUM_GPU} --eval mAP | ||
``` | ||
|
||
* For example, test pretrained weights on VOC Split1 10-shot with 2 gpus: | ||
|
||
```bash | ||
bash dist_test.sh \ | ||
configs/fpd/voc/split1/fpd_r101_c4_2xb4_voc-split1_10shot-fine-tuning.py \ | ||
./work_dirs/fpd_r101_c4_2xb4_voc-split1_10shot-fine-tuning/fpd_r101_c4_2xb4_voc-split1_10shot-fine-tuning_iter_2400.pth 2 --eval mAP | ||
``` | ||
|
||
* Test pretrained weights on COCO 30-shot with 2 gpus: | ||
```bash | ||
bash dist_test.sh \ | ||
configs/fpd/coco/fpd_r101_c4_2xb4_coco_30shot-fine-tuning.py \ | ||
./work_dirs/fpd_r101_c4_2xb4_coco_30shot-fine-tuning/fpd_r101_c4_2xb4_coco_30shot-fine-tuning_iter_18000.pth 2 --eval bbox | ||
``` | ||
|
||
## Training | ||
```bash | ||
# single-gpu training | ||
python train.py ${CONFIG} | ||
|
||
# multi-gpu training | ||
bash dist_train.sh ${CONFIG} ${NUM_GPU} | ||
``` | ||
* Training FPD on VOC dataset with 2 gpus: | ||
```bash | ||
# base training | ||
bash dist_train.sh \ | ||
configs/fpd/voc/split1/fpd_r101_c4_2xb4_voc-split1_base-training.py 2 | ||
|
||
# few-shot fine-tuning | ||
bash dist_train.sh \ | ||
configs/fpd/voc/split1/fpd_r101_c4_2xb4_voc-split1_10shot-fine-tuning.py 2 | ||
``` | ||
* Training FPD on COCO dataset with 2 gpus: | ||
```bash | ||
# base training | ||
bash dist_train.sh \ | ||
configs/fpd/coco/fpd_r101_c4_2xb4_coco_base-training.py 2 | ||
|
||
# few-shot fine-tuning | ||
bash dist_train.sh \ | ||
configs/fpd/coco/fpd_r101_c4_2xb4_coco_30shot-fine-tuning.py 2 | ||
``` | ||
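
The config file names carry a `2xb4` tag. Reading it as "2 GPUs x 4 samples per GPU" is an assumption based on the usual OpenMMLab naming convention; it is consistent with `samples_per_gpu=4` in the dataset configs and with the 2-GPU commands above. A minimal sketch that extracts the tag and computes the total number of query images per iteration (the helper `parse_batch_tag` is hypothetical):

```python
import re


def parse_batch_tag(config_name):
    """Extract (num_gpus, samples_per_gpu) from a tag like '2xb4' in a config name."""
    match = re.search(r"_(\d+)xb(\d+)_", config_name)
    if match is None:
        raise ValueError(f"no '<gpus>xb<batch>' tag in {config_name!r}")
    return int(match.group(1)), int(match.group(2))


name = "fpd_r101_c4_2xb4_voc-split1_base-training.py"
num_gpus, samples_per_gpu = parse_batch_tag(name)
print(num_gpus * samples_per_gpu)  # total images per iteration: 8
```

If `${NUM_GPU}` differs from the value in the tag, the total batch size changes accordingly.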
|
||
## Citation | ||
If you would like to cite this paper, please use the following BibTeX entry: | ||
```BibTeX | ||
@InProceedings{wang2024fpd, | ||
title={Fine-Grained Prototypes Distillation for Few-Shot Object Detection}, | ||
author={Wang, Zichen and Yang, Bo and Yue, Haonan and Ma, Zhenghao}, | ||
booktitle = {Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI-24)}, | ||
year={2024} | ||
} | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,112 @@ | ||
# dataset settings | ||
img_norm_cfg = dict( | ||
mean=[103.530, 116.280, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False) | ||
train_multi_pipelines = dict( | ||
query=[ | ||
dict(type='LoadImageFromFile'), | ||
dict(type='LoadAnnotations', with_bbox=True), | ||
# dict(type='Resize', img_scale=(1000, 600), keep_ratio=True), | ||
dict( | ||
type='Resize', | ||
img_scale=[(1333, 640), (1333, 672), (1333, 704), (1333, 736), | ||
(1333, 768), (1333, 800)], | ||
keep_ratio=True, | ||
multiscale_mode='value'), | ||
dict(type='RandomFlip', flip_ratio=0.5), | ||
dict(type='Normalize', **img_norm_cfg), | ||
dict(type='Pad', size_divisor=32), | ||
dict(type='DefaultFormatBundle'), | ||
dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']) | ||
], | ||
support=[ | ||
dict(type='LoadImageFromFile'), | ||
dict(type='LoadAnnotations', with_bbox=True), | ||
dict( | ||
type='CropResizeInstanceByRatio', | ||
num_context_pixels=16, | ||
context_ratio=0.07, | ||
target_size=(224, 224)), | ||
# dict( | ||
# type='CropResizeInstance', | ||
# num_context_pixels=16, | ||
# target_size=(224, 224)), | ||
dict(type='Normalize', **img_norm_cfg), | ||
dict(type='GenerateMask', target_size=(224, 224)), | ||
dict(type='RandomFlip', flip_ratio=0.0), | ||
dict(type='DefaultFormatBundle'), | ||
dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']) | ||
]) | ||
test_pipeline = [ | ||
dict(type='LoadImageFromFile'), | ||
dict( | ||
type='MultiScaleFlipAug', | ||
# img_scale=(1000, 600), | ||
img_scale=(1333, 800), | ||
flip=False, | ||
transforms=[ | ||
dict(type='Resize', keep_ratio=True), | ||
dict(type='RandomFlip'), | ||
dict(type='Normalize', **img_norm_cfg), | ||
dict(type='Pad', size_divisor=32), | ||
dict(type='ImageToTensor', keys=['img']), | ||
dict(type='Collect', keys=['img']) | ||
]) | ||
] | ||
# class splits are predefined in FewShotCocoDataset | ||
data_root = 'data/coco/' | ||
data = dict( | ||
samples_per_gpu=4, | ||
workers_per_gpu=2, | ||
train=dict( | ||
type='NWayKShotDataset', | ||
num_support_ways=60, | ||
num_support_shots=1, | ||
one_support_shot_per_image=True, | ||
num_used_support_shots=200, | ||
save_dataset=False, | ||
dataset=dict( | ||
type='FewShotCocoDataset', | ||
ann_cfg=[ | ||
dict( | ||
type='ann_file', | ||
ann_file='data/few_shot_ann/coco/annotations/train.json') | ||
], | ||
img_prefix=data_root, | ||
multi_pipelines=train_multi_pipelines, | ||
classes='BASE_CLASSES', | ||
instance_wise=False, | ||
dataset_name='query_support_dataset'), | ||
), | ||
val=dict( | ||
type='FewShotCocoDataset', | ||
ann_cfg=[ | ||
dict( | ||
type='ann_file', | ||
ann_file='data/few_shot_ann/coco/annotations/val.json') | ||
], | ||
img_prefix=data_root, | ||
pipeline=test_pipeline, | ||
classes='BASE_CLASSES'), | ||
test=dict( | ||
type='FewShotCocoDataset', | ||
ann_cfg=[ | ||
dict( | ||
type='ann_file', | ||
ann_file='data/few_shot_ann/coco/annotations/val.json') | ||
], | ||
img_prefix=data_root, | ||
pipeline=test_pipeline, | ||
test_mode=True, | ||
classes='BASE_CLASSES'), | ||
model_init=dict( | ||
copy_from_train_dataset=True, | ||
samples_per_gpu=16, | ||
workers_per_gpu=1, | ||
type='FewShotCocoDataset', | ||
ann_cfg=None, | ||
img_prefix=data_root, | ||
pipeline=train_multi_pipelines['support'], | ||
instance_wise=True, | ||
classes='BASE_CLASSES', | ||
dataset_name='model_init_dataset')) | ||
evaluation = dict(interval=20000, metric='bbox', classwise=True) |
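
As an illustration of what `img_norm_cfg` above asks the `Normalize` step to do: with `std=[1.0, 1.0, 1.0]` the operation reduces to per-channel mean subtraction, and `to_rgb=False` leaves the channels in the BGR order in which the images are loaded (the mean values are listed in that order). The sketch below applies the same arithmetic to a single pixel in plain Python; `normalize_pixel` is a hypothetical helper, not the library implementation.

```python
MEAN = [103.530, 116.280, 123.675]  # per-channel mean, copied from img_norm_cfg
STD = [1.0, 1.0, 1.0]               # std of 1.0 means mean subtraction only


def normalize_pixel(pixel, mean=MEAN, std=STD):
    """Apply (value - mean) / std per channel to one pixel."""
    return [(v - m) / s for v, m, s in zip(pixel, mean, std)]


# A pixel equal to the mean maps to zero in every channel.
print(normalize_pixel([103.530, 116.280, 123.675]))  # [0.0, 0.0, 0.0]
```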
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,127 @@ | ||
# dataset settings | ||
img_norm_cfg = dict( | ||
mean=[103.530, 116.280, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False) | ||
train_multi_pipelines = dict( | ||
query=[ | ||
dict(type='LoadImageFromFile'), | ||
dict(type='LoadAnnotations', with_bbox=True), | ||
# dict(type='Resize', img_scale=(1000, 600), keep_ratio=True), | ||
dict( | ||
type='Resize', | ||
img_scale=[(1333, 640), (1333, 672), (1333, 704), (1333, 736), | ||
(1333, 768), (1333, 800)], | ||
keep_ratio=True, | ||
multiscale_mode='value'), | ||
dict(type='RandomFlip', flip_ratio=0.5), | ||
dict(type='Normalize', **img_norm_cfg), | ||
dict(type='DefaultFormatBundle'), | ||
dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']) | ||
], | ||
support=[ | ||
dict(type='LoadImageFromFile'), | ||
dict(type='LoadAnnotations', with_bbox=True), | ||
dict( | ||
type='CropResizeInstance', | ||
num_context_pixels=16, | ||
target_size=(224, 224)), | ||
dict(type='Normalize', **img_norm_cfg), | ||
dict(type='GenerateMask', target_size=(224, 224)), | ||
dict(type='RandomFlip', flip_ratio=0.0), | ||
dict(type='DefaultFormatBundle'), | ||
dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']) | ||
]) | ||
test_pipeline = [ | ||
dict(type='LoadImageFromFile'), | ||
dict( | ||
type='MultiScaleFlipAug', | ||
# img_scale=(1000, 600), | ||
img_scale=(1333, 800), | ||
flip=False, | ||
transforms=[ | ||
dict(type='Resize', keep_ratio=True), | ||
dict(type='RandomFlip'), | ||
dict(type='Normalize', **img_norm_cfg), | ||
dict(type='ImageToTensor', keys=['img']), | ||
dict(type='Collect', keys=['img']) | ||
]) | ||
] | ||
# class splits are predefined in FewShotVOCDataset | ||
data_root = 'data/VOCdevkit/' | ||
data = dict( | ||
samples_per_gpu=4, | ||
workers_per_gpu=2, | ||
train=dict( | ||
type='NWayKShotDataset', | ||
num_support_ways=15, | ||
num_support_shots=1, | ||
one_support_shot_per_image=True, | ||
num_used_support_shots=200, | ||
save_dataset=False, | ||
dataset=dict( | ||
type='FewShotVOCDataset', | ||
ann_cfg=[ | ||
dict( | ||
type='ann_file', | ||
ann_file=data_root + | ||
'VOC2007/ImageSets/Main/trainval.txt'), | ||
dict( | ||
type='ann_file', | ||
ann_file=data_root + 'VOC2012/ImageSets/Main/trainval.txt'), | ||
], | ||
img_prefix=data_root, | ||
multi_pipelines=train_multi_pipelines, | ||
classes=None, | ||
use_difficult=True, | ||
instance_wise=False, | ||
dataset_name='query_dataset'), | ||
support_dataset=dict( | ||
type='FewShotVOCDataset', | ||
ann_cfg=[ | ||
dict( | ||
type='ann_file', | ||
ann_file=data_root + | ||
'VOC2007/ImageSets/Main/trainval.txt'), | ||
dict( | ||
type='ann_file', | ||
ann_file=data_root + 'VOC2012/ImageSets/Main/trainval.txt'), | ||
], | ||
img_prefix=data_root, | ||
multi_pipelines=train_multi_pipelines, | ||
classes=None, | ||
use_difficult=False, | ||
instance_wise=False, | ||
dataset_name='support_dataset')), | ||
val=dict( | ||
type='FewShotVOCDataset', | ||
ann_cfg=[ | ||
dict( | ||
type='ann_file', | ||
ann_file=data_root + 'VOC2007/ImageSets/Main/test.txt'), | ||
], | ||
img_prefix=data_root, | ||
pipeline=test_pipeline, | ||
classes=None), | ||
test=dict( | ||
type='FewShotVOCDataset', | ||
ann_cfg=[ | ||
dict( | ||
type='ann_file', | ||
ann_file=data_root + 'VOC2007/ImageSets/Main/test.txt'), | ||
], | ||
img_prefix=data_root, | ||
pipeline=test_pipeline, | ||
test_mode=True, | ||
classes=None), | ||
model_init=dict( | ||
copy_from_train_dataset=True, | ||
samples_per_gpu=16, | ||
workers_per_gpu=1, | ||
type='FewShotVOCDataset', | ||
ann_cfg=None, | ||
img_prefix=data_root, | ||
pipeline=train_multi_pipelines['support'], | ||
use_difficult=False, | ||
instance_wise=True, | ||
classes=None, | ||
dataset_name='model_init_dataset')) | ||
evaluation = dict(interval=5000, metric='mAP') |
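
The query pipeline above resizes with `keep_ratio=True` and `multiscale_mode='value'`, which picks one of the listed `(1333, S)` scales per image during training, while testing uses the single scale `(1333, 800)`. The sketch below shows the size arithmetic under the assumption that a scale tuple is read as (long-edge limit, short-edge limit); `keep_ratio_size` is a hypothetical helper written for illustration, not the library code.

```python
import random

TRAIN_SCALES = [(1333, 640), (1333, 672), (1333, 704),
                (1333, 736), (1333, 768), (1333, 800)]  # from the query pipeline


def keep_ratio_size(width, height, scale):
    """Largest (width, height) that fits inside `scale` with the aspect ratio kept."""
    max_long, max_short = max(scale), min(scale)
    factor = min(max_long / max(width, height), max_short / min(width, height))
    return int(width * factor + 0.5), int(height * factor + 0.5)


# Training: one scale is drawn per image (multiscale_mode='value').
scale = random.choice(TRAIN_SCALES)
print(scale, keep_ratio_size(500, 375, scale))

# Testing: the single scale (1333, 800) is used.
print(keep_ratio_size(500, 375, (1333, 800)))  # (1067, 800)
```

For a 500x375 image the short edge is the binding limit, so it reaches 800 and the long edge scales to 1067, below the 1333 cap.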