
Low-Rank Rescaled Vision Transformer Fine-Tuning: A Residual Design Approach, CVPR 2024


Low-Rank Rescaled Vision Transformer Fine-Tuning: A Residual Design Approach (RLRR)


This repository is the official implementation of RLRR. In this study, we approach the problem from the perspective of the Singular Value Decomposition (SVD) of pre-trained parameter matrices, providing insights into the tuning dynamics of existing methods.
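As a rough illustration of this SVD perspective (a minimal sketch, not the exact RLRR parameterization; the rescaling vector `s` below is a hypothetical stand-in for learnable factors), one can decompose a frozen pre-trained weight matrix and modulate its singular values:

```python
import numpy as np

# Hypothetical frozen pre-trained weight matrix (e.g., one projection layer).
rng = np.random.default_rng(0)
W0 = rng.standard_normal((8, 8))

# SVD of the frozen weights: W0 = U @ diag(sigma) @ Vt.
U, sigma, Vt = np.linalg.svd(W0, full_matrices=False)

# Per-singular-value rescaling factors; in a fine-tuning setting these would
# be the (few) learnable parameters. Initialized to 1 here, so tuning starts
# from the pre-trained weights unchanged.
s = np.ones_like(sigma)
W_tuned = U @ np.diag(s * sigma) @ Vt

# With s = 1 the rescaled matrix reproduces W0 exactly.
assert np.allclose(W_tuned, W0)
```

This initialization-at-identity property is one reason residual-style designs are attractive: the tuned model starts exactly at the pre-trained solution.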

Usage


Environment

To install requirements:

conda env create -n RLRR -f environment.yaml

Before running the code, please activate this conda environment.
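For example:

```shell
# Create the environment from the provided spec, then activate it.
conda env create -n RLRR -f environment.yaml
conda activate RLRR
```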

Data Preparation

  • FGVC & vtab-1k

You can follow the instructions in VPT to download them.

Since the original VTAB dataset is processed with TensorFlow scripts and processing some of the datasets is tricky, we also provide the extracted vtab-1k dataset on OneDrive for your convenience. You can download it from here and then use it directly with our vtab.py. (Note that the license terms of the original VTAB dataset still apply.)

Pre-trained model preparation

  • Pre-trained ViT and Swin-B models on ImageNet-21K can be downloaded manually from ViT and Swin Transformer.

Train & Inference

  • Clone this repo:
git clone https://github.com/zstarN70/RLRR.git
cd RLRR
  • To fine-tune a pre-trained ViT model on VTAB-1k, run:
CUDA_VISIBLE_DEVICES=0 python train_vtab.py --dataset_name=kitti
  • To fine-tune a pre-trained ViT model on FGVC, run the following, replacing <fgvc_dataset> with one of the FGVC dataset names (see VPT for the list):
CUDA_VISIBLE_DEVICES=0 python train_fgvc.py --dataset_name=<fgvc_dataset>

Citation

If this project is helpful to you, please cite our paper:

@inproceedings{dong2024low,
  title={Low-Rank Rescaled Vision Transformer Fine-Tuning: A Residual Design Approach},
  author={Dong, Wei and Zhang, Xing and Chen, Bihui and Yan, Dawei and Lin, Zhijun and Yan, Qingsen and Wang, Peng and Yang, Yang},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={16101--16110},
  year={2024}
}

Acknowledgement

The code is built upon timm. The processing of the vtab-1k dataset follows VPT, the VTAB GitHub repository, and NOAH.

Contact

If you have any questions, please contact me: zstar@xauat.edu.cn
