- 🔥 Official PyTorch implementation and pre-trained models for the paper "AMU-Tuning: Effective Logit Bias for CLIP-based Few-shot Learning".
- 🤖 The MindSpore version of AMU is available at MindSpore-AMU.
- 🌟 The paper has been accepted by CVPR 2024.
This paper proposes a novel AMU-Tuning method to learn an effective logit bias for CLIP-based few-shot classification. Specifically, AMU-Tuning predicts the logit bias by exploiting appropriate Auxiliary features, which are fed into an efficient feature-initialized linear classifier with Multi-branch training. Finally, an Uncertainty-based fusion is developed to incorporate the logit bias into CLIP for few-shot classification. Experiments on several widely used benchmarks show that AMU-Tuning clearly outperforms its counterparts and achieves state-of-the-art performance for CLIP-based few-shot learning without bells and whistles.
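In short, AMU-Tuning adds a logit bias computed from auxiliary features to CLIP's zero-shot logits, with the contribution weighted by how uncertain the zero-shot prediction is. The snippet below is only a minimal sketch of such a fusion step: the names (`fuse_logits`, `clip_logits`, `aux_logits`) and the confidence-based uncertainty estimate are illustrative assumptions, not the exact implementation in this repository.

```python
import torch
import torch.nn.functional as F

def fuse_logits(clip_logits, aux_logits, alpha=0.5):
    """Illustrative sketch of an uncertainty-weighted logit-bias fusion.

    clip_logits: zero-shot logits from CLIP, shape (batch, num_classes)
    aux_logits:  logit bias predicted from auxiliary features, same shape
    alpha:       scaling factor for the logit bias (cf. the --alpha flag)
    """
    # Use the softmax confidence of the CLIP prediction as a simple
    # per-sample uncertainty estimate: the less confident CLIP is,
    # the more weight the auxiliary branch receives.
    clip_probs = F.softmax(clip_logits, dim=-1)
    uncertainty = 1.0 - clip_probs.max(dim=-1, keepdim=True).values  # (batch, 1)

    # Final prediction: CLIP logits plus an uncertainty-weighted bias.
    return clip_logits + alpha * uncertainty * aux_logits
```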
Create a conda environment and install dependencies:
git clone https://github.com/TJU-sjyj/AMU-Tuning
cd AMU-Tuning
conda create -n AMU python=3.7
conda activate AMU
pip install -r requirements.txt
# Install versions of torch and torchvision that match your CUDA setup
conda install pytorch torchvision cudatoolkit
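As an optional sanity check (not part of the official instructions), you can confirm that PyTorch was installed with working CUDA support before moving on:

```python
import torch

print(torch.__version__)          # installed PyTorch version
print(torch.cuda.is_available())  # True if a CUDA-capable GPU is usable
```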
Our dataset setup is primarily based on Tip-Adapter. Please follow DATASET.md to download the official ImageNet dataset and the 10 other datasets.
- The pre-trained weights of CLIP will be downloaded automatically when the code is run.
- The pre-trained weights of MoCo v3 can be downloaded from MoCo v3.
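Below is a hedged sketch of how a downloaded MoCo v3 ResNet-50 checkpoint might be loaded as the auxiliary feature extractor. The checkpoint path and the key-prefix handling are assumptions and may need to be adapted to the actual file and to this repository's own loading code.

```python
import torch
import torch.nn as nn
import torchvision.models as models

# Hypothetical path to the downloaded MoCo v3 checkpoint (adjust as needed).
ckpt_path = "mocov3_r50.pth.tar"

# ResNet-50 backbone used as the auxiliary feature extractor.
backbone = models.resnet50()
backbone.fc = nn.Identity()  # keep the 2048-d features, drop the classification head

checkpoint = torch.load(ckpt_path, map_location="cpu")
state_dict = checkpoint.get("state_dict", checkpoint)

# MoCo-style checkpoints typically prefix encoder keys (e.g. "module.base_encoder.");
# strip such prefixes so the keys match a plain torchvision ResNet-50.
cleaned = {k.replace("module.", "").replace("base_encoder.", ""): v
           for k, v in state_dict.items()}

missing, unexpected = backbone.load_state_dict(cleaned, strict=False)
print("missing keys:", len(missing), "| unexpected keys:", len(unexpected))
```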
We provide run.sh, with which you can run AMU-Tuning in a one-line command.
sh run.sh
- `clip_backbone`: name of the CLIP visual encoder backbone to use (e.g. RN50, RN101, ViT-B/16).
- `lr`: learning rate for adapter training.
- `shots`: number of training samples per class.
- `alpha`: controls the strength of the logit bias.
- `lambda_merge`: hyper-parameter used in multi-branch training.

More arguments can be found in parse_args.py; an illustrative sketch of how these flags might be declared is shown after this list.
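The following sketch shows how the flags above could be declared with argparse. It does not reproduce the actual parse_args.py, and the default values are assumptions taken from the example command further below.

```python
import argparse

def parse_args():
    # Illustrative sketch only; see parse_args.py for the full argument list.
    parser = argparse.ArgumentParser(description="AMU-Tuning training (sketch)")
    parser.add_argument("--clip_backbone", type=str, default="RN50",
                        help="CLIP visual encoder backbone, e.g. RN50, RN101, ViT-B-16")
    parser.add_argument("--lr", type=float, default=1e-3,
                        help="learning rate for adapter training")
    parser.add_argument("--shots", type=int, default=16,
                        help="number of training samples per class")
    parser.add_argument("--alpha", type=float, default=0.5,
                        help="scaling factor for the logit bias")
    parser.add_argument("--lambda_merge", type=float, default=0.35,
                        help="merging weight used in multi-branch training")
    return parser.parse_args()
```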
You can use the following command to train an adapter with ViT-B-16 as CLIP's image encoder under the 16-shot setting for 50 epochs.
CUDA_VISIBLE_DEVICES=0 python train.py \
--rand_seed 2 \
--torch_rand_seed 1 \
--exp_name test_16_shot \
--clip_backbone "ViT-B-16" \
--augment_epoch 1 \
--alpha 0.5 \
--lambda_merge 0.35 \
--train_epoch 50 \
--lr 1e-3 \
--batch_size 8 \
--shots 16 \
--root_path 'your root path' \
--load_aux_weight
The average results over the 11 datasets are displayed in the following figure:
The following table shows the results of AMU-Tuning on the ImageNet-1k dataset under the 1- to 16-shot settings.
Method | Top-1 Acc. (%) | Checkpoint |
---|---|---|
AMU-Tuning-MoCov3-ResNet50-16shot-ImageNet1k | 70.02 | Download |
AMU-Tuning-MoCov3-ResNet50-8shot-ImageNet1k | 68.25 | Download |
AMU-Tuning-MoCov3-ResNet50-4shot-ImageNet1k | 65.92 | Download |
AMU-Tuning-MoCov3-ResNet50-2shot-ImageNet1k | 64.25 | Download |
AMU-Tuning-MoCov3-ResNet50-1shot-ImageNet1k | 62.60 | Download |
This repo benefits from Tip-Adapter and CaFo. Thanks for their excellent work.
If you have any questions or suggestions, please feel free to contact us: tangyuwei@tju.edu.cn and linzhenyi@tju.edu.cn.