
DualAdapter

License: MIT

👀Introduction

This repository contains the code for our paper Negative Yields Positive: Unified Dual-Path Adapter for Vision-Language Models. [Paper]

⏳Setup

1. Environment

We tested our codebase with PyTorch 1.12.1 and CUDA 11.6. Please install the PyTorch and CUDA versions appropriate for your computational resources, then install the remaining required packages by running pip install -r requirements.txt.
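As a quick sanity check after installation, the short snippet below prints the detected PyTorch and CUDA versions; the version numbers in the comments are just the configuration we tested, and other versions may also work:

```python
import torch

# Confirm the installed versions match your intended setup
print(torch.__version__)          # tested with 1.12.1
print(torch.version.cuda)         # tested with 11.6
print(torch.cuda.is_available())  # should be True for GPU runs
```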

2. Dataset

Please follow the instructions provided by Tip-Adapter to download all 11 datasets used in our experiments: https://github.com/gaopengcuhk/Tip-Adapter/blob/main/DATASET.md

3. Extracting Few-Shot Features

You can extract the features by running CUDA_VISIBLE_DEVICES=0 python extract_features.py.

After running, you will find the image features for the train/val/test splits, as well as the positive/negative textual features, in caches/[dataset_name].
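For reference, the sketch below shows the general shape of CLIP feature extraction and caching. It is a minimal illustration, not the actual extract_features.py: the backbone choice, loaders, prompt lists, and cache file names here are all assumptions.

```python
import os
import torch
import clip  # OpenAI CLIP: https://github.com/openai/CLIP

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("RN50", device=device)  # backbone is an assumption

@torch.no_grad()
def extract_image_features(loader):
    """Encode one split and return L2-normalized image features with labels."""
    feats, labels = [], []
    for images, targets in loader:
        f = model.encode_image(images.to(device))
        feats.append(f / f.norm(dim=-1, keepdim=True))
        labels.append(targets)
    return torch.cat(feats), torch.cat(labels)

@torch.no_grad()
def extract_text_features(prompts):
    """Encode a list of class prompts and return normalized text embeddings."""
    tokens = clip.tokenize(prompts).to(device)
    f = model.encode_text(tokens)
    return f / f.norm(dim=-1, keepdim=True)

# Hypothetical usage, caching under caches/[dataset_name] as the repo does:
# feats, labels = extract_image_features(train_loader)
# torch.save({"feats": feats, "labels": labels},
#            os.path.join("caches", "imagenet", "train_features.pt"))
```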

📦Usage

You can simply run CUDA_VISIBLE_DEVICES=0 python main.py --config configs/[dataset_name].yaml --shot [shot_number] to train and test the DualAdapter model.

Here, dataset_name should be one of [caltech101, dtd, eurosat, fgvc, food101, imagenet, oxford_flowers, oxford_pets, stanford_cars, sun397, ucf101], and shot_number is one of 1/2/4/8/16. For example, CUDA_VISIBLE_DEVICES=0 python main.py --config configs/imagenet.yaml --shot 16 trains and evaluates the 16-shot ImageNet model.
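To make the dual-path idea concrete, here is a loose, Tip-Adapter-style sketch of how positive and negative class embeddings could be combined at inference. The function name, weighting scheme, and cache keys are illustrative assumptions, not the exact DualAdapter formulation; please refer to the paper for the actual method.

```python
import torch

@torch.no_grad()
def dual_path_logits(image_feats, pos_text, neg_text, alpha=1.0):
    """Illustrative (assumed, simplified) dual-path scoring: reward similarity
    to a class's positive prompts and penalize similarity to its negative
    prompts. All inputs are L2-normalized.

    image_feats: [N, D]; pos_text, neg_text: [C, D]
    """
    pos_logits = image_feats @ pos_text.t()  # higher = more like the class
    neg_logits = image_feats @ neg_text.t()  # higher = more like its negation
    return pos_logits - alpha * neg_logits

# Hypothetical usage with cached features:
# cache = torch.load("caches/imagenet/test_features.pt")
# logits = dual_path_logits(cache["feats"], pos_text, neg_text)
# acc = (logits.argmax(dim=1) == cache["labels"]).float().mean()
```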

🙏Acknowledgements

Our codebase is adapted from Tip-Adapter, CLIP, APE, and CuPL. We thank the authors for releasing their code!

📧Contact

If you have any questions, please contact us at cezhang@cs.cmu.edu.

📌 BibTeX & Citation

If you find this code useful, please consider citing our work:

@article{zhang2024negative,
  title={Negative Yields Positive: Unified Dual-Path Adapter for Vision-Language Models},
  author={Zhang, Ce and Stepputtis, Simon and Sycara, Katia and Xie, Yaqi},
  journal={arXiv preprint arXiv:2403.12964},
  year={2024}
}
