Warping the Space: Weight Space Rotation for Class-Incremental Few-Shot Learning

This repository contains the official implementation code for the paper "Warping the Space: Weight Space Rotation for Class-Incremental Few-Shot Learning", accepted at ICLR 2023 as a Spotlight (notable-top-25%).

Abstract: Class-incremental few-shot learning, where new sets of classes are provided sequentially with only a few training samples, presents a great challenge due to catastrophic forgetting of old knowledge and overfitting caused by lack of data. During finetuning on new classes, the performance on previous classes deteriorates quickly even when only a small fraction of parameters are updated, since the previous knowledge is broadly associated with most of the model parameters in the original parameter space. In this paper, we introduce WaRP, the weight space rotation process, which transforms the original parameter space into a new space so that we can push most of the previous knowledge compactly into only a few important parameters. By properly identifying and freezing these key parameters in the new weight space, we can finetune the remaining parameters without affecting the knowledge of previous classes. As a result, WaRP provides an additional room for the model to effectively learn new classes in future incremental sessions. Experimental results confirm the effectiveness of our solution and show the improved performance over the state-of-the-art methods.
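
For intuition only, the following is a minimal PyTorch-style sketch of the idea described above: express a layer's weights in a rotated space, rank the rotated coordinates by an importance score, and build a gradient mask that freezes the most important ones. The random orthogonal rotation, the importance tensor, and the meaning of fraction_to_keep are all placeholder assumptions here and do not reflect the repository's actual implementation.

import torch

def warp_sketch(weight, importance, fraction_to_keep=0.1):
    """Illustrative sketch of the WaRP idea (not the repository's code)."""
    out_dim, in_dim = weight.shape

    # 1) Pick orthogonal matrices to rotate the output and input spaces.
    #    Random here; the paper constructs rotations that pack the previous
    #    knowledge compactly into a few coordinates of the new space.
    R_out, _ = torch.linalg.qr(torch.randn(out_dim, out_dim))
    R_in, _ = torch.linalg.qr(torch.randn(in_dim, in_dim))

    # 2) Express the weight matrix in the rotated (warped) space.
    rotated = R_out.T @ weight @ R_in

    # 3) Freeze the most important rotated coordinates by zeroing their
    #    entries in a gradient mask; the rest stay trainable.
    #    `importance` is assumed to share rotated's shape (one score per coordinate).
    k = int(fraction_to_keep * importance.numel())
    frozen_idx = torch.topk(importance.flatten(), k).indices
    grad_mask = torch.ones(rotated.numel())
    grad_mask[frozen_idx] = 0.0

    return R_out, R_in, rotated, grad_mask.view_as(rotated)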

Step-by-Step Guide to Running Our Code

1. Clone this repository

git clone https://github.com/EdwinKim3069/WaRP-CIFSL.git
cd WaRP-CIFSL

2. Preparing datasets

The CIFAR100 dataset will be downloaded automatically when running the code. The MiniImagenet and CUB200 datasets can be downloaded from [MiniImagenet] and [CUB200], or fetched with the commands below.

For MiniImagenet:

cd data                                         # go to data directory
gdown 1eV1TGCC8h4Ga47PRRNiy4XdGbx3hfTyw
tar -xvf miniimagenet.tar
rm -f miniimagenet.tar
cd ..

For CUB200:

cd data                                       
gdown 110xuDVF-cSb8RlLjz0rV8EdlSpIALceN
tar -xvzf CUB_200_2011.tgz
rm -f CUB_200_2011.tgz
cd ..
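
Optionally, you can run a quick sanity check that the datasets were extracted where the code expects them. The folder names below are assumptions based on the archive names above; adjust them if your layout differs.

from pathlib import Path

# Hypothetical check that the extracted dataset folders exist under data/.
# Folder names are assumed from the archive names; adjust if needed.
for name in ["miniimagenet", "CUB_200_2011"]:
    path = Path("data") / name
    print(f"{path}: {'found' if path.is_dir() else 'missing'}")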

3. Pretraining the model & Running incremental sessions with WaRP

Executing a script file in the script folder pretrains the model and then immediately runs the incremental sessions with WaRP using the pretrained model, as follows.

CIFAR100:

bash script/run_cifar100.sh

MiniImagenet:

bash script/run_mini-imagenet.sh

CUB200:

bash script/run_cub200.sh 

If you have already pretrained the model, comment out the pretraining part of the script and execute it as shown below.

# echo "Start Pretraining"
seed_num=0
gpu_num=0

# python train.py -project base \
#         -dataset cifar100 \
#         -base_mode ft_dot \
#         -new_mode avg_cos \
#         -gamma 0.1 \
#         -lr_base 0.1 \
#         -decay 0.0005 \
#         -epochs_base 210 \
#         -schedule Milestone \
#         -milestones 120 150 180 \
#         -gpu $gpu_num \
#         -temperature 16 \
#         -start_session 0 \
#         -batch_size_base 128 \
#         -seed $seed_num

echo "Run WaRP for Incremental Sessions"
model_directory="checkpoint/cifar100/base/ft_dot-avg_cos-data_init-start_0/Epo_210-Lr_0.1000-MS_120_150_180-Gam_0.10-Bs_128-Mom_0.90-Wd_0.00050-seed_$seed_num-T_16.00/session0_last_epoch.pth"

python train.py -project base \
        -dataset cifar100 \
        -new_mode ft_cos \
        -gamma 0.1 \
        -lr_base 0.1 \
        -decay 0.0005 \
        -epochs_base 0 \
        -temperature 16 \
        -start_session 0 \
        -model_dir $model_directory \
        -gpu $gpu_num \
        -lr_new 0.01 \
        -epochs_new 30 \
        -fraction_to_keep 0.1 \
        -seed $seed_num
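
The -fraction_to_keep flag presumably controls how large a share of the rotated coordinates is treated as important and frozen (an inference from its name, not a documented guarantee). As a rough, purely illustrative sketch of how gradient masking could freeze those coordinates during incremental finetuning (this is not the repository's training loop):

import torch

# Rough illustration: gradients of the frozen coordinates are zeroed before
# each optimizer step, so those coordinates keep the previous-session
# knowledge intact while the remaining coordinates learn the new classes.
rotated = torch.randn(64, 64, requires_grad=True)     # weight in the rotated space
grad_mask = (torch.rand_like(rotated) > 0.1).float()  # 0 = frozen, 1 = trainable
optimizer = torch.optim.SGD([rotated], lr=0.01)

loss = rotated.square().sum()                          # placeholder loss
loss.backward()
rotated.grad.mul_(grad_mask)                           # zero out gradients of frozen coordinates
optimizer.step()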

Prototype-based method (simple yet strong baseline)

The prototype-based baseline averages the embeddings of each class to form a class prototype, and the model makes predictions based on these prototypes. You can reproduce this simple yet strong baseline by passing avg_cos to the new_mode argument instead of ft_cos, as follows.

echo "Run WaRP for Incremental Sessions"
model_directory="checkpoint/cifar100/base/ft_dot-avg_cos-data_init-start_0/Epo_210-Lr_0.1000-MS_120_150_180-Gam_0.10-Bs_128-Mom_0.90-Wd_0.00050-seed_$seed_num-T_16.00/session0_last_epoch.pth"

python train.py -project base \
        -dataset cifar100 \
        -new_mode avg_cos \
        -gamma 0.1 \
        -lr_base 0.1 \
        -decay 0.0005 \
        -epochs_base 0 \
        -temperature 16 \
        -start_session 0 \
        -model_dir $model_directory \
        -gpu $gpu_num \
        -lr_new 0.01 \
        -epochs_new 30 \
        -fraction_to_keep 0.1 \
        -seed $seed_num
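
For reference, here is a minimal sketch of the prototype-based classification described above (illustrative only, not the avg_cos implementation in this repository): support embeddings are averaged per class, and each query is assigned to the class whose prototype has the highest cosine similarity.

import torch
import torch.nn.functional as F

def prototype_classify(support_emb, support_labels, query_emb):
    """Illustrative prototype classification: average per-class embeddings,
    then predict by cosine similarity to the class prototypes."""
    classes = support_labels.unique()
    # One prototype per class: the mean of that class's support embeddings.
    prototypes = torch.stack(
        [support_emb[support_labels == c].mean(dim=0) for c in classes]
    )
    # Cosine similarity between each query and each prototype.
    sims = F.normalize(query_emb, dim=-1) @ F.normalize(prototypes, dim=-1).T
    return classes[sims.argmax(dim=-1)]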

Acknowledgment

Our source code is built upon the implementations from the CEC-CVPR2021 and fscil repositories.

References

If you find our project useful for your research, please consider citing our paper using the following BibTeX entry.

@inproceedings{
    kim2023warping,
    title={Warping the Space: Weight Space Rotation for Class-Incremental Few-Shot Learning},
    author={Do-Yeon Kim and Dong-Jun Han and Jun Seo and Jaekyun Moon},
    booktitle={The Eleventh International Conference on Learning Representations},
    year={2023},
}
