Out-of-distribution (OOD) detection aims to detect "unknown" data whose labels have not been seen during the in-distribution (ID) training process. Recent progress in representation learning gives rise to distance-based OOD detection, which recognizes test examples as ID/OOD according to their relative distances to the training data of ID classes. Previous approaches calculate pairwise distances relying only on global image representations, which can be sub-optimal, as the inevitable background clutter and intra-class variation may drive image-level representations from the same ID class far apart in a given representation space. In this work, we tackle this challenge by proposing Multi-scale OOD DEtection (MODE), the first framework leveraging both global visual information and local region details of images to maximally benefit OOD detection. Specifically, we first find that existing models pretrained with off-the-shelf cross-entropy or contrastive losses are incompetent to capture valuable local representations for MODE, owing to the scale discrepancy between the ID training and OOD detection processes. To mitigate this issue and encourage locally discriminative representations in ID training, we propose Attention-based Local PropAgation (ALPA), a trainable objective that exploits a cross-attention mechanism to align and highlight the local regions of the target objects for pairwise examples. In test-time OOD detection, a Cross-Scale Decision (CSD) function is further devised on the most discriminative multi-scale representations to separate ID and OOD data more faithfully. We demonstrate the effectiveness and flexibility of MODE on several benchmarks. Remarkably, MODE outperforms the previous state-of-the-art by up to 19.24% in FPR and 2.77% in AUROC.
Our motivation for exploring local, region-level representations to enhance the distance calculation between pairwise examples: the inevitable background clutter and intra-class variation may drive global, image-level representations from the same ID class far apart in a given representation space. For the first time, we take advantage of both global visual information and local region details of images to maximally benefit OOD detection.
- We propose MODE, the first framework that takes advantage of multi-scale (i.e., both global and local) representations for OOD detection.
- In ID training, we develop ALPA, an end-to-end, plug-and-play, cross-attention-based learning objective tailored to encourage locally discriminative representations for MODE.
- In test-time OOD detection, we devise CSD, a simple and effective ID-OOD decision function for MODE built on multi-scale representations.
- Extensive experiments on several benchmark datasets demonstrate the effectiveness and flexibility of MODE. Remarkably, MODE outperforms the previous state-of-the-art by up to 19.24% in FPR and 2.77% in AUROC.
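For context, the distance-based scoring that MODE builds on (the KNN baseline [2]) flags a test example by its distance to its k-th nearest ID training feature. A minimal sketch, assuming L2-normalized features (the function name and the value of k are illustrative, not the repository's exact code):

```python
import numpy as np

def knn_ood_score(test_feat, train_feats, k=50):
    """Score a test example by its distance to the k-th nearest ID training
    feature, computed on L2-normalized embeddings (higher => more likely ID)."""
    z = test_feat / np.linalg.norm(test_feat)
    bank = train_feats / np.linalg.norm(train_feats, axis=1, keepdims=True)
    dists = np.linalg.norm(bank - z, axis=1)  # Euclidean distance on the unit sphere
    return -np.sort(dists)[k - 1]             # negate: larger distance => more OOD

# Toy usage with random placeholder features:
bank = np.random.randn(1000, 128)             # ID training features (placeholder)
score = knn_ood_score(np.random.randn(128), bank, k=50)
# A test sample is declared OOD if its score falls below a threshold chosen
# on held-out ID data (e.g., at 95% true-positive rate).
```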
- CIFAR-10 (ID) with ResNet-18
- CIFAR-100 (ID) with ResNet-34
- ImageNet (ID) with ResNet-50
- Visualization with tSNE (see the sketch after this list)
- Visualization analysis on k-nearest neighbors
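For the t-SNE visualization, a minimal sketch of how learned features can be projected and plotted; the feature arrays here are random placeholders, so substitute real penultimate-layer embeddings:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

feats = np.random.randn(500, 128)        # placeholder: (N, D) feature embeddings
labels = np.random.randint(0, 10, 500)   # placeholder: (N,) class ids

# Project features to 2D and color points by class.
emb = TSNE(n_components=2, perplexity=30, init="pca").fit_transform(feats)
plt.scatter(emb[:, 0], emb[:, 1], c=labels, s=4, cmap="tab10")
plt.savefig("tsne.png", dpi=200)
```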
The download will start automatically upon running the scripts.
We provide links and instructions to download each dataset:
- SVHN: download it and place it in `datasets/ood_data/svhn`. Then run `python select_svhn_data.py` to generate the test subset.
- Textures: download it and place it in `datasets/ood_data/dtd`.
- Places365: download it and place it in `datasets/ood_data/places365/test_subset`. We randomly sample 10,000 images from the original test set.
- LSUN: download it and place it in `datasets/ood_data/LSUN`.
- iSUN: download it and place it in `datasets/ood_data/iSUN`.
- LSUN_fix: download it and place it in `datasets/ood_data/LSUN_fix`.
- ImageNet_fix: download it and place it in `datasets/ood_data/ImageNet_fix`.
- ImageNet_resize: download it and place it in `datasets/ood_data/Imagenet_resize`.
Please download ImageNet-1k and place the training data and validation data in `./datasets/imagenet/train` and `./datasets/imagenet/val`, respectively.
We have curated 4 OOD datasets from iNaturalist, SUN, Places, and Textures, and removed concepts that overlap with ImageNet-1k.
For iNaturalist, SUN, and Places, we sampled 10,000 images from the selected concepts of each dataset; they can be downloaded via the following links:
```bash
wget http://pages.cs.wisc.edu/~huangrui/imagenet_ood_dataset/iNaturalist.tar.gz
wget http://pages.cs.wisc.edu/~huangrui/imagenet_ood_dataset/SUN.tar.gz
wget http://pages.cs.wisc.edu/~huangrui/imagenet_ood_dataset/Places.tar.gz
```
For Textures, we use the entire dataset, which can be downloaded from its original website.
Please put all downloaded OOD datasets into `./datasets/ood_data`.
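After all downloads, the on-disk layout should look roughly as follows (assembled from the paths above; the folder names for the extracted iNaturalist/SUN/Places/Textures archives are assumptions based on how the tarballs typically extract):

```
datasets/
├── imagenet/
│   ├── train/
│   └── val/
└── ood_data/
    ├── svhn/
    ├── dtd/
    ├── places365/test_subset/
    ├── LSUN/
    ├── iSUN/
    ├── LSUN_fix/
    ├── ImageNet_fix/
    ├── Imagenet_resize/
    ├── iNaturalist/
    ├── SUN/
    ├── Places/
    └── Textures/
```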
Baseline
```bash
python ./ALPA/main_supcon.py --batch_size 512 \
  --learning_rate 0.5 \
  --temp 0.1 \
  --cosine
```
ID training w/ ALPA-train
```bash
python ./ALPA/main_supcon.py --batch_size 128 \
  --learning_rate 0.5 \
  --temp 0.1 \
  --cosine \
  --trial 0 \
  --dataset cifar100 \
  --model resnet34 \
  --alpa_train
```
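For intuition, here is a minimal sketch of the cross-attention alignment at the heart of ALPA; the function name, shapes, and temperature are illustrative assumptions, not the repository's exact implementation:

```python
import torch
import torch.nn.functional as F

def alpa_local_similarity(fa, fb, tau=0.1):
    """Align fb's regions to fa's via cross-attention, then measure local
    similarity. fa, fb: (HW, D) local feature maps of two images."""
    fa = F.normalize(fa, dim=-1)
    fb = F.normalize(fb, dim=-1)
    attn = torch.softmax(fa @ fb.t() / tau, dim=-1)  # (HW, HW) region-to-region attention
    fb_aligned = attn @ fb                            # fb re-weighted to match fa's regions
    return F.cosine_similarity(fa, fb_aligned, dim=-1).mean()

# Toy usage: 7x7 feature maps with 512 channels, as a ResNet might produce.
fa, fb = torch.randn(49, 512), torch.randn(49, 512)
sim = alpa_local_similarity(fa, fb)
# In ALPA-style training, such pairwise local similarities would feed a
# SupCon-style contrastive loss, pulling same-class regions together.
```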
ID training w/ ALPA-finetune
```bash
python ./ALPA/main_supcon.py --batch_size 128 \
  --learning_rate 0.1 \
  --temp 0.1 \
  --cosine \
  --trial 0 \
  --dataset cifar100 \
  --model resnet34 \
  --alpa_finetune
```
Specify the path to the pretrained model, then run:

```bash
./demo_cifar.sh
```
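For intuition, a hypothetical sketch of how CSD's cross-scale decision could combine the global and local distances at test time; the min-based combination rule shown here is an assumption for illustration, not the paper's exact definition:

```python
def csd_score(global_dist, local_dist):
    """Hypothetical cross-scale decision: keep the more discriminative
    (smaller) of the global and local k-NN distances to the ID data."""
    return -min(global_dist, local_dist)   # higher score => more likely ID

def is_id(global_dist, local_dist, threshold):
    # threshold tuned on held-out ID data, e.g., at 95% TPR
    return csd_score(global_dist, local_dist) >= threshold

print(is_id(global_dist=0.3, local_dist=0.7, threshold=-0.5))  # True
```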
[1] Khosla P, Teterwak P, Wang C, Sarna A, Tian Y, Isola P, Maschinot A, Liu C, Krishnan D; Supervised Contrastive Learning (SupCon); NeurIPS 2020.
[2] Sun Y, Ming Y, Zhu X, Li Y; Out-of-Distribution Detection with Deep Nearest Neighbors (KNN); ICML 2022.
[3] Johnson J, Douze M, Jégou H; Billion-Scale Similarity Search with GPUs; IEEE Transactions on Big Data 2019.
Our code (as well as this README.md) is based on the public repositories of SupCon [1] and KNN [2].
We thank the authors for releasing their source code.