This repo is the official PyTorch implementation of [OODformer: Out-Of-Distribution Detection Transformer](https://arxiv.org/abs/2107.08976), using CIFAR as an illustrative example.
## Getting started

First, install all the dependencies:

```
pip install -r requirement.txt
```
## Datasets

Please download all in-distribution (CIFAR-10, CIFAR-100, ImageNet-30) and out-of-distribution (LSUN_resize, ImageNet_resize, Places-365, DTD, Stanford Dogs, Food-101, Caltech-256, CUB-200) datasets to the `data` folder under the root directory.
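As a reference, here is a minimal sketch of one possible layout; the exact folder names are assumptions, so adjust them to match whatever you pass via `--data-dir`:

```
data/
├── cifar10/            # in-distribution
├── cifar100/
├── ImageNet30/         # matches --data-dir in the training command below
├── LSUN_resize/        # out-of-distribution
├── ImageNet_resize/
├── places365/
├── dtd/
├── stanford_dogs/
├── food-101/
├── caltech256/
└── CUB-200/
```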
## Training

To train the Vision Transformer (ViT) or its data-efficient variant (DeiT), first download the corresponding pre-trained weights from the ViT and DeiT repositories.

To fine-tune a Vision Transformer on an in-distribution dataset in a multi-GPU setting:
```
srun --gres=gpu:4 python vit/src/train.py --exp-name name_of_the_experiment --tensorboard --model-arch b16 --checkpoint-path path/to/checkpoint --image-size 224 --data-dir data/ImageNet30 --dataset ImageNet --num-classes 30 --train-steps 4590 --lr 0.01 --wd 1e-5 --n-gpu 4 --num-workers 16 --batch-size 512 --method SupCE
```
- `--model-arch`: specifies the ViT/DeiT variant (see `vit/src/config.py`)
- `--method`: currently only supervised cross-entropy (`SupCE`) is supported
- `--train-steps`: a cyclic learning-rate scheduler is used, so the number of training epochs can be computed as (#train steps × batch size) / #training samples; see the worked example after this list
- `--checkpoint-path`: path to the pre-trained Vision Transformer weights for the chosen model variant
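For instance, with the command above (4590 train steps, batch size 512), and assuming ImageNet-30's training split contains roughly 39,000 images (about 1,300 per class × 30 classes), this works out to (4590 × 512) / 39,000 ≈ 60 epochs.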
OODformer can also be trained with various supervised and self-supervised losses, such as:
- Supervised Cross Entropy
- Self-Supervised Contrastive Loss [TBD]
- Supervised Contrastive Loss [TBD]
- Self-Supervised Vision Transformer [TBD]
To train ResNet variants (e.g., ResNet-50, Wide-ResNet) as the base model on an in-distribution dataset:

```
srun --gres=gpu:4 python main_ce.py --batch_size 512 --epochs 500 --model resnet34 --learning_rate 0.8 --cosine --warm --dataset cifar10
```
## Evaluation

To evaluate the similarity distance of a sample from the mean embedding of an in-distribution (e.g., CIFAR-10) class, a list of distance metrics (e.g., Mahalanobis, cosine, Euclidean, and softmax) can be used with OODformer as shown below:

```
srun --gres=gpu:1 python OOD_Distance.py --ckpt checkpoint_path --model vit --model_arch b16 --distance Mahalanobis --dataset id_dataset --out_dataset ood_dataset
```
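For intuition, here is a minimal, self-contained sketch of Mahalanobis-based OOD scoring from class-conditional mean embeddings. This is not the repository's `OOD_Distance.py`; all function and variable names are illustrative:

```python
import numpy as np

def fit_class_statistics(embeddings, labels):
    """Per-class means and a shared (tied) precision matrix from ID embeddings."""
    classes = np.unique(labels)
    means = np.stack([embeddings[labels == c].mean(axis=0) for c in classes])
    centered = embeddings - means[np.searchsorted(classes, labels)]
    # Shared covariance across classes, with a small ridge term for stability.
    cov = centered.T @ centered / len(embeddings)
    precision = np.linalg.inv(cov + 1e-6 * np.eye(cov.shape[0]))
    return means, precision

def mahalanobis_score(x, means, precision):
    """OOD score = squared distance to the closest class mean (larger = more OOD)."""
    diffs = means - x                                    # (num_classes, dim)
    d2 = np.einsum('cd,de,ce->c', diffs, precision, diffs)
    return d2.min()

# Usage with dummy data standing in for ViT [CLS] embeddings:
rng = np.random.default_rng(0)
id_emb = rng.normal(size=(1000, 32))
id_lab = rng.integers(0, 10, size=1000)
means, precision = fit_class_statistics(id_emb, id_lab)
print(mahalanobis_score(rng.normal(size=32), means, precision))
```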
## Visualization

Various embedding visualizations can be generated using `generate_tsne.py`:
1. UMAP of in-distribution embeddings
2. UMAP of combined in- and out-of-distribution embeddings
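As a rough illustration of what such a plot involves (this is not the repository's `generate_tsne.py`; it assumes the `umap-learn` package and uses synthetic stand-in embeddings):

```python
import numpy as np
import umap                      # pip install umap-learn
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
id_emb = rng.normal(0.0, 1.0, size=(500, 32))   # stand-in for ID embeddings
ood_emb = rng.normal(3.0, 1.0, size=(500, 32))  # stand-in for OOD embeddings

# Project the combined embeddings to 2D with UMAP.
proj = umap.UMAP(n_components=2, random_state=0).fit_transform(
    np.vstack([id_emb, ood_emb]))

plt.scatter(proj[:500, 0], proj[:500, 1], s=4, label='in-distribution')
plt.scatter(proj[500:, 0], proj[500:, 1], s=4, label='out-of-distribution')
plt.legend()
plt.savefig('umap_embeddings.png')
```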
## Citation

```
@article{koner2021oodformer,
  title={OODformer: Out-Of-Distribution Detection Transformer},
  author={Koner, Rajat and Sinhamahapatra, Poulami and Roscher, Karsten and G{\"u}nnemann, Stephan and Tresp, Volker},
  journal={arXiv preprint arXiv:2107.08976},
  year={2021}
}
```
## Acknowledgements

Part of this code is inspired by [HobbitLong/SupContrast](https://github.com/HobbitLong/SupContrast).