
[WACV 2024] Limited Data, Unlimited Potential: A Study on ViTs Augmented by Masked Autoencoders

This is the official codebase for our paper "Limited Data, Unlimited Potential: A Study on ViTs Augmented by Masked Autoencoders" presented at WACV 2024. The paper can be viewed at this link.

Figure: Overview of the self-supervised auxiliary task (SSAT).

Installation

Create the conda environment and install the necessary packages:

conda env create -f environment.yml -n limiteddatavit

or alternatively

conda create -n limiteddatavit python=3.7 -y
pip install -r requirements.txt -f https://download.pytorch.org/whl/torch_stable.html
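
After installing, a quick sanity check (a minimal sketch, not part of the repository) confirms that PyTorch, torchvision, and CUDA are visible from the new environment:

import torch
import torchvision

# Print versions and GPU visibility before launching any training runs.
print("PyTorch", torch.__version__, "| torchvision", torchvision.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPUs visible:", torch.cuda.device_count())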

Data preparation

We provide code for training on ImageNet, CIFAR10, and CIFAR100. CIFAR10 and CIFAR100 are downloaded automatically via torchvision; ImageNet must be downloaded separately.

Download and extract the ImageNet train and val images from http://image-net.org/. The directory structure is the standard layout expected by torchvision's datasets.ImageFolder: training images go in the train/ folder and validation images in the val/ folder:

/path/to/imagenet/
  train/
    class1/
      img1.jpeg
    class2/
      img2.jpeg
  val/
    class1/
      img3.jpeg
    class2/
      img4.jpeg
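
For reference, here is a minimal sketch of how these datasets can be constructed with torchvision (the repository's actual transforms and dataset wiring live in utils/datasets.py; the transform below is an illustrative assumption):

import torchvision
from torchvision import transforms

# Illustrative eval-style transform; the repository may use different augmentations.
transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])

# ImageNet: standard ImageFolder layout shown above.
imagenet_train = torchvision.datasets.ImageFolder('/path/to/imagenet/train', transform=transform)

# CIFAR10 / CIFAR100: downloaded automatically by torchvision.
cifar10_train = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)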

Pretrained model weights

Model: ViT-T + SSAT (weights)
Dataset: ImageNet-1k
Evaluation command:
python main_two_branch.py --data_path /path/to/imagenet/ --resume vittiny-ssat_imagenet1k_weights.pth --eval --model mae_vit_tiny

Model: ViT-S + SSAT (weights)
Dataset: ImageNet-1k
Evaluation command:
python main_two_branch.py --data_path /path/to/imagenet/ --resume vitsmall-ssat_imagenet1k_weights.pth --eval --model mae_vit_small
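
To inspect a checkpoint outside of main_two_branch.py, a file saved with torch.save can typically be loaded as below (the 'model' key is an assumption based on common MAE-style checkpoints; check the actual keys of the downloaded file):

import torch

# Load onto CPU so no GPU is required just to inspect the file.
ckpt = torch.load('vittiny-ssat_imagenet1k_weights.pth', map_location='cpu')
print(list(ckpt.keys()))  # e.g. ['model', 'optimizer', 'epoch'] (assumed layout)
state_dict = ckpt.get('model', ckpt)  # fall back to the raw dict if there is no 'model' key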

Training models

To train ViT-Tiny with the Self-Supervised Auxiliary Task (SSAT) on ImageNet-1k using 8 GPUs, run the following command:

CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python -m torch.distributed.launch --nproc_per_node=8 main_two_branch.py --data_path /path/to/imagenet/ --output_dir ./output_dir --epochs 100 --model mae_vit_tiny

Available arguments for --data_path are /path/to/imagenet, c10, and c100. Other datasets can be added in utils/datasets.py.

Available arguments for --model are mae_vit_tiny, mae_vit_small, mae_vit_base, mae_vit_large, and mae_vit_huge.
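
Conceptually, the two-branch training objective pairs supervised classification with the MAE-style masked-reconstruction auxiliary task. The sketch below illustrates such a combined loss; the function name, argument shapes, and the weighting term lam are illustrative assumptions, not the repository's exact implementation:

import torch.nn.functional as F

def two_branch_loss(logits, targets, reconstruction, masked_patches, lam=1.0):
    # Supervised branch: standard cross-entropy on the class predictions.
    cls_loss = F.cross_entropy(logits, targets)
    # Self-supervised branch: MAE-style pixel reconstruction on the masked patches.
    rec_loss = F.mse_loss(reconstruction, masked_patches)
    # lam balances the auxiliary task against classification (assumed hyperparameter).
    return cls_loss + lam * rec_loss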

Citation & Acknowledgement

@inproceedings{das-limiteddatavit-wacv2024,
    title={Limited Data, Unlimited Potential: A Study on ViTs Augmented by Masked Autoencoders},
    author={Srijan Das and Tanmay Jain and Dominick Reilly and Pranav Balaji and Soumyajit Karmakar and Shyam Marjit and Xiang Li and Abhijit Das and Michael Ryoo},
    booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
    year={2024}
}

This repository is built on top of the code for the paper Masked Autoencoders Are Scalable Vision Learners from Meta Research.
