IbM2

This repo is the official implementation of our CVPR 2024 paper "Instance-based Max-margin for Practical Few-shot Recognition" [arXiv][paper][appendix][poster][video].

TL;DR

This paper proposes:

A practical FSL (pFSL) setting that builds on unsupervised pretrained models and recognizes many novel classes simultaneously.
IbM2, an instance-based max-margin method based on the Gaussian Annulus Theorem. IbM2 converts random noise applied to the instances into a mechanism to achieve maximum margin.
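
The Gaussian Annulus Theorem says that a d-dimensional spherical Gaussian with per-coordinate standard deviation sigma places nearly all of its mass in a thin shell of radius about sigma * sqrt(d). The sketch below only illustrates that concentration numerically with the standard library. It is not the authors' implementation, and `noise_norms`, the dimension 384, and sigma = 0.1 are arbitrary choices made for this demonstration.

```python
import math
import random
import statistics


def noise_norms(dim, sigma, num_samples, seed=0):
    """Return the Euclidean norms of `num_samples` draws from N(0, sigma^2 * I_dim)."""
    rng = random.Random(seed)
    norms = []
    for _ in range(num_samples):
        sample = [rng.gauss(0.0, sigma) for _ in range(dim)]
        norms.append(math.sqrt(sum(x * x for x in sample)))
    return norms


# Illustrative values only (assumptions, not taken from the paper).
dim, sigma = 384, 0.1
norms = noise_norms(dim, sigma, num_samples=200)
expected_radius = sigma * math.sqrt(dim)

print(f"expected radius sigma*sqrt(d): {expected_radius:.3f}")
print(f"mean norm of the noise:        {statistics.mean(norms):.3f}")
print(f"std of the norms:              {statistics.stdev(norms):.3f}")
```

The spread of the norms should be small compared with the radius, so noise added to an instance lands at a nearly fixed distance from it.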

Environment

python 3.8
pytorch >= 1.7
torchvision >= 0.8
timm 0.4.9

Data Preparation

1. Datasets

ImageNet

The ImageNet dataset is a large-scale image dataset widely used for various computer vision tasks.

Download

Register and download the dataset from the official ImageNet website.
Follow the instructions to download the ILSVRC2012 dataset, which includes training and validation images.

Structure

After downloading, organize the dataset into the following directory structure:

/path/to/imagenet/
    train/
        n01440764/
            n01440764_18.JPEG
            ...
        ...
    val/
        n01440764/
            ILSVRC2012_val_00000293.JPEG
            ...
        ...
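
Before extracting features, it can help to confirm that the folders match the layout above. The following is a minimal sketch using only the standard library. `summarize_image_folder` is a hypothetical helper, not part of this repo, and it assumes images end in `.JPEG` or `.jpg`.

```python
from pathlib import Path


def summarize_image_folder(root):
    """Count class folders and image files under root/train and root/val."""
    summary = {}
    for split in ("train", "val"):
        split_dir = Path(root) / split
        if not split_dir.is_dir():
            raise FileNotFoundError(f"missing split folder: {split_dir}")
        class_dirs = [d for d in split_dir.iterdir() if d.is_dir()]
        num_images = sum(
            1
            for d in class_dirs
            for f in d.iterdir()
            if f.suffix.lower() in {".jpeg", ".jpg"}
        )
        # (number of class folders, number of images) for this split
        summary[split] = (len(class_dirs), num_images)
    return summary
```

Calling `summarize_image_folder("/path/to/imagenet")` returns a dictionary with one `(classes, images)` pair per split. For ILSVRC2012, each split should contain 1000 class folders.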

For the 1%-ImageNet variant, please refer to imagenet_subsets and build_imagenet_subsets for more details. Alternatively, for convenience, you can download the training set directly from this link.

CUB-200-2011

The CUB-200-2011 (Caltech-UCSD Birds-200-2011) dataset is a widely used dataset for fine-grained visual categorization tasks, specifically bird species classification.

Download

Download the dataset from the official CUB-200-2011 website.
Extract the downloaded tar file.

Structure

After extracting, organize the dataset into the following directory structure:

/path/to/cub_200_2011/
    images/
        001.Black_footed_Albatross/
            Black_Footed_Albatross_0001_796111.jpg
            ...
        ...
    train_test_split.txt

Use the script datasets/cub_preprocess.py to organize the images folder into train and test folders. After running the script, you will have two additional folders named train and test in your root directory.
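
For readers who want to see what such a split involves, here is a rough sketch. It is not the code of datasets/cub_preprocess.py, which may work differently. It assumes the dataset root also contains the standard CUB `images.txt` file (lines of `<image_id> <relative_path>`) next to `train_test_split.txt` (lines of `<image_id> <is_training_image>`). The function name `split_cub` is hypothetical.

```python
import shutil
from pathlib import Path


def split_cub(root):
    """Copy images into root/train and root/test according to train_test_split.txt.

    Assumes root also holds images.txt, which maps image ids to relative paths.
    """
    root = Path(root)
    # image id -> relative path such as "001.Black_footed_Albatross/xxx.jpg"
    id_to_path = dict(
        line.split()
        for line in (root / "images.txt").read_text().splitlines()
        if line.strip()
    )
    counts = {"train": 0, "test": 0}
    for line in (root / "train_test_split.txt").read_text().splitlines():
        if not line.strip():
            continue
        image_id, is_train = line.split()
        split = "train" if is_train == "1" else "test"
        src = root / "images" / id_to_path[image_id]
        dst = root / split / id_to_path[image_id]
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy rather than move, so images/ stays intact
        counts[split] += 1
    return counts
```

The function returns the number of images copied into each split, which gives a quick sanity check on the result.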

2. Feature Extraction

IbM2 operates directly on features extracted by a backbone. To speed up evaluation, you can extract and store the features in advance using the provided scripts: extract_features.py and extract_features_imagenet_1pt.py.

python extract_features.py

Parameters:

dataset: dataset to use - choices: Imagenet, CUB.
shot: number of training shots - choices: 1, 2, 3, 4, 5, 8, 16.
save_test: whether to save the features for testing.
arch: model architecture - choices: deit_small_p16, deit_large_p7, deit_base_p4, resnet50.
batch_size: batch size used to extract the features.
pretrain_method: unsupervised pretraining method - choices: DINO, MSN, MoCov3, SimCLR, BYOL.

or

python extract_features_imagenet_1pt.py

Note: Before extracting features, you should:

download the few-shot annotations (few_shot_split folder) (link)
download the backbone checkpoints (refer to the table below)
correctly set the paths in config.py.

The template of config.py looks like:

IMAGENET_PATH: root path of imagenet dataset
CUB_PATH: root path of CUB dataset
IMAGENET_1PT_PATH: root path of 1%-ImageNet dataset
SPLIT_PATH: path of few-shot annotations files (/path/to/few_shot_split)
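
As an illustration, a filled-in config.py could look like the sketch below. It assumes the four entries are plain module-level string constants. The paths are placeholders, and `missing_paths` is a hypothetical helper added here for a quick sanity check, not something the repo provides.

```python
import os

# Placeholder locations; replace them with the paths on your machine.
IMAGENET_PATH = "/path/to/imagenet"
CUB_PATH = "/path/to/cub_200_2011"
IMAGENET_1PT_PATH = "/path/to/imagenet_1pt"  # hypothetical folder name
SPLIT_PATH = "/path/to/few_shot_split"


def missing_paths(paths):
    """Return the names whose configured directory does not exist."""
    return [name for name, path in paths.items() if not os.path.isdir(path)]


if __name__ == "__main__":
    configured = {
        "IMAGENET_PATH": IMAGENET_PATH,
        "CUB_PATH": CUB_PATH,
        "IMAGENET_1PT_PATH": IMAGENET_1PT_PATH,
        "SPLIT_PATH": SPLIT_PATH,
    }
    print("missing:", missing_paths(configured))
```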

We evaluate IbM2 with various unsupervised pretraining methods, each pretrained on ImageNet-1K. We provide the backbone checkpoints and the corresponding extracted features to reproduce our results:

Method	Architecture	Checkpoint	ImageNet Features	CUB Features	1-pct Features
DINO	ViT-S/16	backbone	imagenet features	CUB features	1%-imagenet features
MoCov3	ViT-S/16	backbone	imagenet features	-	1%-imagenet features
MSN	ViT-S/16	backbone	imagenet features	CUB features	1%-imagenet features
MSN	ViT-B/4	backbone	imagenet features	-	1%-imagenet features
MSN	ViT-L/7	backbone	imagenet features	CUB features	1%-imagenet features
SimCLR	ResNet50	backbone	imagenet features	-	1%-imagenet features
BYOL	ResNet50	backbone	imagenet features	-	1%-imagenet features

Usage

Experiments on ImageNet-1K & CUB datasets

bash scripts/bsearch_finetune_search_continue_channel_wise.sh $cuda_id $shot $dataset $arch $pretrain_method

Parameters:

cuda_id - int: CUDA device index to run the code on.
shot - int: number of training shots, choices - 1, 2, 3, 4, 5, 8, 16.
dataset - string: choices - Imagenet or CUB.
arch - string: model architecture, choices - deit_small_p16, deit_large_p7, deit_base_p4, resnet50.
pretrain_method - string: unsupervised pretraining method, choices - DINO, MSN, MoCov3, SimCLR, BYOL.

An example to run the code is:

bash scripts/bsearch_finetune_search_continue_channel_wise.sh 0 1 Imagenet deit_small_p16 DINO

This runs the 1-shot ImageNet experiment using features from DINO (ViT-S/16).
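
To sweep over several shots, the same command can be assembled from Python. The sketch below is a hypothetical wrapper, not part of the repo. It checks each argument against the choices listed above and builds the argument list for the script. It does not check that a checkpoint exists for a given arch and pretrain_method combination.

```python
import subprocess

SCRIPT = "scripts/bsearch_finetune_search_continue_channel_wise.sh"
SHOTS = (1, 2, 3, 4, 5, 8, 16)
DATASETS = ("Imagenet", "CUB")
ARCHS = ("deit_small_p16", "deit_large_p7", "deit_base_p4", "resnet50")
PRETRAIN_METHODS = ("DINO", "MSN", "MoCov3", "SimCLR", "BYOL")


def build_command(cuda_id, shot, dataset, arch, pretrain_method):
    """Validate the arguments against the documented choices and build the argv list."""
    checks = (
        ("shot", shot, SHOTS),
        ("dataset", dataset, DATASETS),
        ("arch", arch, ARCHS),
        ("pretrain_method", pretrain_method, PRETRAIN_METHODS),
    )
    for name, value, choices in checks:
        if value not in choices:
            raise ValueError(f"{name} must be one of {choices}, got {value!r}")
    return ["bash", SCRIPT, str(cuda_id), str(shot), dataset, arch, pretrain_method]


def run_all_shots(cuda_id, dataset, arch, pretrain_method):
    """Run the script once per documented shot value, stopping at the first failure."""
    for shot in SHOTS:
        cmd = build_command(cuda_id, shot, dataset, arch, pretrain_method)
        subprocess.run(cmd, check=True)
```

For instance, `build_command(0, 1, "Imagenet", "deit_small_p16", "DINO")` reproduces the example command above as a list, and `run_all_shots` has to be called from the repository root so that the relative script path resolves.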

Experiments on 1%-ImageNet semi-supervised learning

The script to run the code is similar in this case:

bash scripts/bsearch_finetune_search_continue_channel_wise_imagenet_subsets.sh $cuda_id $arch $pretrain_method

Citation

If this project is helpful for you, you can cite our paper:

@inproceedings{fu2024ibm2,
      title={Instance-based Max-margin for Practical Few-shot Recognition},
      author={Fu, Minghao and Zhu, Ke},
      booktitle={The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
      year={2024},
}

Acknowledgement

The code is built upon timm.
