Commit

first commit
wangchen1801 committed Jan 25, 2024
0 parents commit 8db8560
Showing 45 changed files with 3,769 additions and 0 deletions.
128 changes: 128 additions & 0 deletions README.md
@@ -0,0 +1,128 @@
## Introduction
>[**Fine-Grained Prototypes Distillation for Few-Shot Object Detection (AAAI2024)**](https://arxiv.org/pdf/2401.07629.pdf)
>
![fpd_architecture](architecture.png)

This repo is based on [MMFewShot](https://github.com/open-mmlab/mmfewshot).

## Quick Start
```bash
# create a conda environment
conda create -n fpd python=3.8
conda activate fpd
conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio cudatoolkit=11.3 -c pytorch -c conda-forge

# dependencies
pip install openmim
mim install mmcv-full==1.6.0
mim install mmcls==0.25.0
mim install mmdet==2.24.0

# install mmfewshot
pip install git+https://github.com/open-mmlab/mmfewshot.git
# or manually download the code, then
# cd mmfewshot
# pip install .

# install FPD
python setup.py develop
```
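
After installation, it can help to confirm that the environment matches the versions pinned above. The following is a minimal sketch, not part of the repository: the `PINNED` table simply restates the Quick Start commands, `find_mismatches` is a hypothetical helper name, and the distribution names (for example `torch` for the conda `pytorch` package) are assumptions about how the packages register themselves.

```python
from importlib import metadata

# Versions restated from the Quick Start commands (distribution names are assumed).
PINNED = {
    "torch": "1.12.1",
    "torchvision": "0.13.1",
    "mmcv-full": "1.6.0",
    "mmcls": "0.25.0",
    "mmdet": "2.24.0",
}


def find_mismatches(pinned, get_version=metadata.version):
    """Return {package: (expected, found)} for missing or differing packages."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        # Builds may carry a local tag such as "+cu113"; compare the public part only.
        if found is None or found.split("+")[0] != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in find_mismatches(PINNED).items():
        print(f"{name}: expected {expected}, found {found}")
```

An empty result means every pinned package was found at the expected version.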

## Prepare Datasets
Please refer to [mmfewshot/data](https://github.com/open-mmlab/mmfewshot/blob/main/tools/data/README.md)
for the data preparation steps.
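
Once the data is prepared, a quick existence check can catch path problems before training starts. The sketch below is an illustration only: the paths are the ones referenced by the dataset configs in this commit (`base_coco_ms.py` and `base_voc_ms.py`), while `EXPECTED_PATHS` and `missing_paths` are hypothetical names and the list is not claimed to be exhaustive.

```python
from pathlib import Path

# Paths referenced by the dataset configs in this commit (not an exhaustive list).
EXPECTED_PATHS = [
    "data/coco",
    "data/few_shot_ann/coco/annotations/train.json",
    "data/few_shot_ann/coco/annotations/val.json",
    "data/VOCdevkit/VOC2007/ImageSets/Main/trainval.txt",
    "data/VOCdevkit/VOC2012/ImageSets/Main/trainval.txt",
    "data/VOCdevkit/VOC2007/ImageSets/Main/test.txt",
]


def missing_paths(root=".", expected=EXPECTED_PATHS):
    """Return the expected paths that do not exist under `root`."""
    root = Path(root)
    return [p for p in expected if not (root / p).exists()]


if __name__ == "__main__":
    for path in missing_paths():
        print(f"missing: {path}")
```

Run it from the repository root, since the configs use paths relative to the working directory.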

## Results on VOC Dataset
* Base Training

| Config | Split | Base AP50 | ckpt |
|:---:|:---:|:---:|:---:|
|[config](configs/fpd/voc/split1/fpd_r101_c4_2xb4_voc-split1_base-training.py)|1|89.2|[ckpt](https://github.com/wangchen1801/FPD/releases/download/ckpts/fpd_r101_c4_2xb4_voc-split1_base-training_iter_20000.pth)|
|[config](configs/fpd/voc/split2/fpd_r101_c4_2xb4_voc-split2_base-training.py)|2|88.8|[ckpt](https://github.com/wangchen1801/FPD/releases/download/ckpts/fpd_r101_c4_2xb4_voc-split2_base-training_iter_20000.pth)|
|[config](configs/fpd/voc/split3/fpd_r101_c4_2xb4_voc-split3_base-training.py)|3|88.7|[ckpt](https://github.com/wangchen1801/FPD/releases/download/ckpts/fpd_r101_c4_2xb4_voc-split3_base-training_iter_20000.pth)|

* Few Shot Fine-tuning

| Config | Split | Shot | Novel AP50 | ckpt | log |
|:---:|:---:|:---:|:---:|:---:|:---:|
|[config](configs/fpd/voc/split1/fpd_r101_c4_2xb4_voc-split1_10shot-fine-tuning.py)|1|10|69.0|[ckpt](https://github.com/wangchen1801/FPD/releases/download/ckpts/fpd_r101_c4_2xb4_voc-split1_10shot-fine-tuning_iter_2400.pth)|[log](https://github.com/wangchen1801/FPD/releases/download/ckpts/fpd_r101_c4_2xb4_voc-split1_10shot-fine-tuning.log)|
|[config](configs/fpd/voc/split2/fpd_r101_c4_2xb4_voc-split2_10shot-fine-tuning.py)|2|10|56.5|[ckpt](https://github.com/wangchen1801/FPD/releases/download/ckpts/fpd_r101_c4_2xb4_voc-split2_10shot-fine-tuning_iter_1800.pth)|[log](https://github.com/wangchen1801/FPD/releases/download/ckpts/fpd_r101_c4_2xb4_voc-split2_10shot-fine-tuning.log)|
|[config](configs/fpd/voc/split3/fpd_r101_c4_2xb4_voc-split3_10shot-fine-tuning.py)|3|10|64.1|[ckpt](https://github.com/wangchen1801/FPD/releases/download/ckpts/fpd_r101_c4_2xb4_voc-split3_10shot-fine-tuning_iter_1200.pth)|[log](https://github.com/wangchen1801/FPD/releases/download/ckpts/fpd_r101_c4_2xb4_voc-split3_10shot-fine-tuning.log)|

## Results on COCO Dataset
* Base Training

| Config | Base mAP | ckpt |
|:---:|:---:|:---:|
|[config](configs/fpd/coco/fpd_r101_c4_2xb4_coco_base-training.py)|36.0|[ckpt](https://github.com/wangchen1801/FPD/releases/download/ckpts/fpd_r101_c4_2xb4_coco_base-training_iter_110000.pth)|

* Few Shot Fine-tuning

| Config | Shot | Novel mAP (nAP) | ckpt | log |
|:---:|:---:|:---:|:---:|:---:|
|[config](configs/fpd/coco/fpd_r101_c4_2xb4_coco_30shot-fine-tuning.py)|30|20.1|[ckpt]()|[log](https://github.com/wangchen1801/FPD/releases/download/ckpts/fpd_r101_c4_2xb4_coco_30shot-fine-tuning.log)|

## Evaluation

```bash
# single-GPU test (set METRIC to mAP for VOC or bbox for COCO)
python test.py ${CONFIG} ${CHECKPOINT} --eval ${METRIC}

# multi-GPU test
bash dist_test.sh ${CONFIG} ${CHECKPOINT} ${NUM_GPU} --eval ${METRIC}
```

* For example, test pretrained weights on VOC Split1 10-shot with 2 GPUs:

```bash
bash dist_test.sh \
configs/fpd/voc/split1/fpd_r101_c4_2xb4_voc-split1_10shot-fine-tuning.py \
./work_dirs/fpd_r101_c4_2xb4_voc-split1_10shot-fine-tuning/fpd_r101_c4_2xb4_voc-split1_10shot-fine-tuning_iter_2400.pth 2 --eval mAP
```

* Test pretrained weights on COCO 30-shot with 2 GPUs:
```bash
bash dist_test.sh \
configs/fpd/coco/fpd_r101_c4_2xb4_coco_30shot-fine-tuning.py \
./work_dirs/fpd_r101_c4_2xb4_coco_30shot-fine-tuning/fpd_r101_c4_2xb4_coco_30shot-fine-tuning_iter_18000.pth 2 --eval bbox
```
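
The two examples above differ only in the config, the checkpoint, and the metric (`mAP` for VOC, `bbox` for COCO). A minimal sketch of assembling the command programmatically is shown below; `build_test_command` is a hypothetical helper, and choosing the metric from the `/voc/` or `/coco/` path component is an assumption based on this repository's config layout.

```python
import shlex


def build_test_command(config, checkpoint, num_gpus):
    """Assemble the dist_test.sh argument list, picking the metric by dataset."""
    if "/coco/" in config:
        metric = "bbox"  # COCO-style evaluation
    elif "/voc/" in config:
        metric = "mAP"  # VOC-style evaluation
    else:
        raise ValueError(f"cannot infer dataset from config path: {config}")
    return ["bash", "dist_test.sh", config, checkpoint, str(num_gpus),
            "--eval", metric]


if __name__ == "__main__":
    cmd = build_test_command(
        "configs/fpd/voc/split1/fpd_r101_c4_2xb4_voc-split1_10shot-fine-tuning.py",
        "./work_dirs/fpd_r101_c4_2xb4_voc-split1_10shot-fine-tuning/"
        "fpd_r101_c4_2xb4_voc-split1_10shot-fine-tuning_iter_2400.pth",
        2,
    )
    print(shlex.join(cmd))  # pass `cmd` to subprocess.run to execute it
```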

## Training
```bash
# single-GPU training
python train.py ${CONFIG}

# multi-GPU training
bash dist_train.sh ${CONFIG} ${NUM_GPU}
```
* Training FPD on the VOC dataset with 2 GPUs:
```bash
# base training
bash dist_train.sh \
configs/fpd/voc/split1/fpd_r101_c4_2xb4_voc-split1_base-training.py 2

# few-shot fine-tuning
bash dist_train.sh \
configs/fpd/voc/split1/fpd_r101_c4_2xb4_voc-split1_10shot-fine-tuning.py 2
```
* Training FPD on the COCO dataset with 2 GPUs:
```bash
# base training
bash dist_train.sh \
configs/fpd/coco/fpd_r101_c4_2xb4_coco_base-training.py 2

# few-shot fine-tuning
bash dist_train.sh \
configs/fpd/coco/fpd_r101_c4_2xb4_coco_30shot-fine-tuning.py 2
```
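
The config names contain `2xb4`, which, under the usual OpenMMLab naming convention, reads as 2 GPUs with 4 samples per GPU (the dataset configs in this commit do set `samples_per_gpu=4`). The sketch below, with the hypothetical helper `parse_schedule`, extracts those two numbers so the effective batch size can be checked when training on a different number of GPUs. Treating the name as authoritative is an assumption; the config file itself is what the trainer uses.

```python
import re
from pathlib import Path


def parse_schedule(config_path):
    """Extract (num_gpus, samples_per_gpu) from a name such as '..._2xb4_...'."""
    stem = Path(config_path).stem
    match = re.search(r"_(\d+)xb(\d+)_", stem)
    if match is None:
        raise ValueError(f"no '<gpus>xb<batch>' token in {stem!r}")
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    gpus, per_gpu = parse_schedule(
        "configs/fpd/voc/split1/fpd_r101_c4_2xb4_voc-split1_base-training.py")
    print(f"{gpus} GPUs x {per_gpu} samples = batch size {gpus * per_gpu}")
```

If you train with a different GPU count, the total batch size changes accordingly, so the learning rate may need adjusting.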

## Citation
If you would like to cite this paper, please use the following BibTeX entry:
```BibTeX
@InProceedings{wang2024fpd,
  title     = {Fine-Grained Prototypes Distillation for Few-Shot Object Detection},
  author    = {Wang, Zichen and Yang, Bo and Yue, Haonan and Ma, Zhenghao},
  booktitle = {Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI-24)},
  year      = {2024}
}
```
Binary file added architecture.png
112 changes: 112 additions & 0 deletions configs/_base_/datasets/nway_kshot/base_coco_ms.py
@@ -0,0 +1,112 @@
# dataset settings
img_norm_cfg = dict(
    mean=[103.530, 116.280, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False)
train_multi_pipelines = dict(
    query=[
        dict(type='LoadImageFromFile'),
        dict(type='LoadAnnotations', with_bbox=True),
        # dict(type='Resize', img_scale=(1000, 600), keep_ratio=True),
        dict(
            type='Resize',
            img_scale=[(1333, 640), (1333, 672), (1333, 704), (1333, 736),
                       (1333, 768), (1333, 800)],
            keep_ratio=True,
            multiscale_mode='value'),
        dict(type='RandomFlip', flip_ratio=0.5),
        dict(type='Normalize', **img_norm_cfg),
        dict(type='Pad', size_divisor=32),
        dict(type='DefaultFormatBundle'),
        dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
    ],
    support=[
        dict(type='LoadImageFromFile'),
        dict(type='LoadAnnotations', with_bbox=True),
        dict(
            type='CropResizeInstanceByRatio',
            num_context_pixels=16,
            context_ratio=0.07,
            target_size=(224, 224)),
        # dict(
        #     type='CropResizeInstance',
        #     num_context_pixels=16,
        #     target_size=(224, 224)),
        dict(type='Normalize', **img_norm_cfg),
        dict(type='GenerateMask', target_size=(224, 224)),
        dict(type='RandomFlip', flip_ratio=0.0),
        dict(type='DefaultFormatBundle'),
        dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
    ])
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        # img_scale=(1000, 600),
        img_scale=(1333, 800),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]
# classes splits are predefined in FewShotCocoDataset
data_root = 'data/coco/'
data = dict(
    samples_per_gpu=4,
    workers_per_gpu=2,
    train=dict(
        type='NWayKShotDataset',
        num_support_ways=60,
        num_support_shots=1,
        one_support_shot_per_image=True,
        num_used_support_shots=200,
        save_dataset=False,
        dataset=dict(
            type='FewShotCocoDataset',
            ann_cfg=[
                dict(
                    type='ann_file',
                    ann_file='data/few_shot_ann/coco/annotations/train.json')
            ],
            img_prefix=data_root,
            multi_pipelines=train_multi_pipelines,
            classes='BASE_CLASSES',
            instance_wise=False,
            dataset_name='query_support_dataset'),
    ),
    val=dict(
        type='FewShotCocoDataset',
        ann_cfg=[
            dict(
                type='ann_file',
                ann_file='data/few_shot_ann/coco/annotations/val.json')
        ],
        img_prefix=data_root,
        pipeline=test_pipeline,
        classes='BASE_CLASSES'),
    test=dict(
        type='FewShotCocoDataset',
        ann_cfg=[
            dict(
                type='ann_file',
                ann_file='data/few_shot_ann/coco/annotations/val.json')
        ],
        img_prefix=data_root,
        pipeline=test_pipeline,
        test_mode=True,
        classes='BASE_CLASSES'),
    model_init=dict(
        copy_from_train_dataset=True,
        samples_per_gpu=16,
        workers_per_gpu=1,
        type='FewShotCocoDataset',
        ann_cfg=None,
        img_prefix=data_root,
        pipeline=train_multi_pipelines['support'],
        instance_wise=True,
        classes='BASE_CLASSES',
        dataset_name='model_init_dataset'))
evaluation = dict(interval=20000, metric='bbox', classwise=True)
127 changes: 127 additions & 0 deletions configs/_base_/datasets/nway_kshot/base_voc_ms.py
@@ -0,0 +1,127 @@
# dataset settings
img_norm_cfg = dict(
    mean=[103.530, 116.280, 123.675], std=[1.0, 1.0, 1.0], to_rgb=False)
train_multi_pipelines = dict(
    query=[
        dict(type='LoadImageFromFile'),
        dict(type='LoadAnnotations', with_bbox=True),
        # dict(type='Resize', img_scale=(1000, 600), keep_ratio=True),
        dict(
            type='Resize',
            img_scale=[(1333, 640), (1333, 672), (1333, 704), (1333, 736),
                       (1333, 768), (1333, 800)],
            keep_ratio=True,
            multiscale_mode='value'),
        dict(type='RandomFlip', flip_ratio=0.5),
        dict(type='Normalize', **img_norm_cfg),
        dict(type='DefaultFormatBundle'),
        dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
    ],
    support=[
        dict(type='LoadImageFromFile'),
        dict(type='LoadAnnotations', with_bbox=True),
        dict(
            type='CropResizeInstance',
            num_context_pixels=16,
            target_size=(224, 224)),
        dict(type='Normalize', **img_norm_cfg),
        dict(type='GenerateMask', target_size=(224, 224)),
        dict(type='RandomFlip', flip_ratio=0.0),
        dict(type='DefaultFormatBundle'),
        dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
    ])
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        # img_scale=(1000, 600),
        img_scale=(1333, 800),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]
# classes splits are predefined in FewShotVOCDataset
data_root = 'data/VOCdevkit/'
data = dict(
    samples_per_gpu=4,
    workers_per_gpu=2,
    train=dict(
        type='NWayKShotDataset',
        num_support_ways=15,
        num_support_shots=1,
        one_support_shot_per_image=True,
        num_used_support_shots=200,
        save_dataset=False,
        dataset=dict(
            type='FewShotVOCDataset',
            ann_cfg=[
                dict(
                    type='ann_file',
                    ann_file=data_root +
                    'VOC2007/ImageSets/Main/trainval.txt'),
                dict(
                    type='ann_file',
                    ann_file=data_root + 'VOC2012/ImageSets/Main/trainval.txt'),
            ],
            img_prefix=data_root,
            multi_pipelines=train_multi_pipelines,
            classes=None,
            use_difficult=True,
            instance_wise=False,
            dataset_name='query_dataset'),
        support_dataset=dict(
            type='FewShotVOCDataset',
            ann_cfg=[
                dict(
                    type='ann_file',
                    ann_file=data_root +
                    'VOC2007/ImageSets/Main/trainval.txt'),
                dict(
                    type='ann_file',
                    ann_file=data_root + 'VOC2012/ImageSets/Main/trainval.txt'),
            ],
            img_prefix=data_root,
            multi_pipelines=train_multi_pipelines,
            classes=None,
            use_difficult=False,
            instance_wise=False,
            dataset_name='support_dataset')),
    val=dict(
        type='FewShotVOCDataset',
        ann_cfg=[
            dict(
                type='ann_file',
                ann_file=data_root + 'VOC2007/ImageSets/Main/test.txt'),
        ],
        img_prefix=data_root,
        pipeline=test_pipeline,
        classes=None),
    test=dict(
        type='FewShotVOCDataset',
        ann_cfg=[
            dict(
                type='ann_file',
                ann_file=data_root + 'VOC2007/ImageSets/Main/test.txt'),
        ],
        img_prefix=data_root,
        pipeline=test_pipeline,
        test_mode=True,
        classes=None),
    model_init=dict(
        copy_from_train_dataset=True,
        samples_per_gpu=16,
        workers_per_gpu=1,
        type='FewShotVOCDataset',
        ann_cfg=None,
        img_prefix=data_root,
        pipeline=train_multi_pipelines['support'],
        use_difficult=False,
        instance_wise=True,
        classes=None,
        dataset_name='model_init_dataset'))
evaluation = dict(interval=5000, metric='mAP')