This repository contains experiments for the paper
Why Normalizing Flows Fail to Detect Out-of-Distribution Data
by Polina Kirichenko, Pavel Izmailov and Andrew Gordon Wilson.
In the paper we show that the inductive biases of the flows — implicit assumptions in their architecture and training procedure — can hinder OOD detection.
- We show that flows learn latent representations for images largely based on local pixel correlations, rather than semantic content, making it difficult to detect data with anomalous semantics.
- We identify mechanisms through which normalizing flows can simultaneously increase likelihood for all structured images.
- We show that by changing the architectural details of the coupling layers, we can encourage flows to learn transformations specific to the target data, improving OOD detection.
- We show that OOD detection is improved when flows are trained on high-level, semantically meaningful features extracted from image datasets; a sketch of this approach is shown below.
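
As an illustration of that last point, one could extract penultimate-layer features from a pretrained classifier and train a flow on the feature vectors instead of raw pixels. This is a hedged sketch; the backbone choice and interfaces are assumptions, not the paper's exact setup:

```python
import torch
import torchvision

# Illustrative sketch: use a pretrained classifier as a fixed feature extractor.
backbone = torchvision.models.resnet18(pretrained=True)
backbone.fc = torch.nn.Identity()   # keep the 512-d penultimate features
backbone.eval()

@torch.no_grad()
def extract_features(loader, device="cpu"):
    """Map a loader of (image, label) batches to a tensor of feature vectors."""
    backbone.to(device)
    feats = [backbone(x.to(device)).cpu() for x, _ in loader]
    return torch.cat(feats)

# A flow for tabular/vector data (cf. train_unsup_ood_uci.py) could then be
# trained on extract_features(train_loader) rather than on images.
```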
In this repository we provide PyTorch code for reproducing results in the paper.
The following datasets need to be downloaded manually. You can then use the path to the data folder as DATA_PATH
in the scripts below.
- CelebA:
- NotMNIST: data available here
- ImageNet 64x64: data available here; we will add ImageNet scripts soon
The other datasets can be automatically downloaded to DATA_PATH
when you run the scripts below.
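
To sanity-check your DATA_PATH before launching training, the automatically downloadable datasets can be fetched with torchvision. This is a minimal sketch assuming the standard torchvision loaders; the repo's own data-loading code may differ:

```python
import torchvision
import torchvision.transforms as transforms

DATA_PATH = "./data"  # replace with the folder you pass as --data_path
transform = transforms.ToTensor()

# These calls download the data into DATA_PATH if it is not already there.
cifar = torchvision.datasets.CIFAR10(root=DATA_PATH, train=True, download=True, transform=transform)
svhn = torchvision.datasets.SVHN(root=DATA_PATH, split="train", download=True, transform=transform)
fmnist = torchvision.datasets.FashionMNIST(root=DATA_PATH, train=True, download=True, transform=transform)
```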
The scripts for training flow models are in the `experiments/train_flows/` folder:

- `train_unsup.py`: the standard script for training flows
- `train_unsup_ood.py`: same as `train_unsup.py`, but evaluates the likelihoods on OOD data during training
- `train_unsup_ood_negative.py`: same as `train_unsup_ood.py`, but minimizes likelihood on OOD data (Appendix B)
- `train_unsup_ood_uci.py`: same as `train_unsup_ood.py`, but for tabular data (Appendix K)
Commands used to train baseline models:
```bash
# RealNVP on FashionMNIST
python3 train_unsup.py --dataset=[FashionMNIST | MNIST] --data_path=DATA_PATH --save_freq=20 \
  --flow=RealNVP --logdir=LOG_DIR --ckptdir=CKPTS_DIR --num_epochs=81 --lr=5e-5 \
  --prior=Gaussian --num_blocks=6 --batch_size=32

# RealNVP on CelebA
python3 train_unsup.py --dataset=[CelebA | CIFAR10 | SVHN] --data_path=DATA_PATH --logdir=LOG_DIR \
  --ckptdir=CKPTS_DIR --num_epochs=101 --lr=1e-4 --batch_size=32 --num_blocks=8 \
  --weight_decay=5e-5 --num_scales=3

# Glow on FashionMNIST
python3 train_unsup.py --dataset=[FashionMNIST | MNIST] --data_path=DATA_PATH --flow=Glow \
  --logdir=LOG_DIR --ckptdir=CKPTS_DIR --num_epochs=151 --lr=5e-5 --batch_size=32 \
  --optim=RMSprop --num_scales=2 --num_coupling_layers_per_scale=16 \
  --st_type=highway --num_blocks=3 --num_mid_channels=200

# Glow on CelebA
python3 train_unsup.py --dataset=[CelebA | CIFAR10 | SVHN] --data_path=DATA_PATH --flow=Glow \
  --logdir=LOG_DIR --ckptdir=CKPTS_DIR --num_epochs=161 --lr=1e-5 --batch_size=32 \
  --optim=RMSprop --num_scales=3 --num_coupling_layers_per_scale=8 \
  --st_type=highway --num_blocks=3 --num_mid_channels=400
```
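
Once a model is trained, OOD detection in the paper amounts to comparing per-image likelihoods on in-distribution and OOD test sets. A minimal sketch of that evaluation, assuming a flow object exposing a `log_prob(x)` method that returns log-likelihoods in nats (the actual interface in this repo may differ):

```python
import numpy as np
import torch

@torch.no_grad()
def bits_per_dim(flow, loader, device="cuda"):
    """Per-image negative log-likelihood in bits per dimension."""
    out = []
    for x, _ in loader:
        x = x.to(device)
        log_px = flow.log_prob(x)                    # assumed interface, shape (batch,)
        dims = float(np.prod(x.shape[1:]))
        out.append((-log_px / (dims * np.log(2.0))).cpu())
    return torch.cat(out)

# bpd_in  = bits_per_dim(flow, in_dist_test_loader)
# bpd_ood = bits_per_dim(flow, ood_test_loader)
# A flow that detects OOD data well assigns higher bits/dim (lower likelihood)
# to the OOD samples; the paper shows this often fails to hold.
```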
We provide example notebooks in `experiments/notebooks/`:

- `GLOW_fashion.ipynb`: Glow for FashionMNIST
- `realnvp_celeba.ipynb`: RealNVP for CelebA
Below we show latent representations learned by RealNVP trained on FashionMNIST and CelebA for in-distribution and OOD inputs.
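
These visualizations are produced by mapping images through the trained flow and plotting the latents in the image layout. A hedged sketch, assuming the flow's forward pass returns the latent `z` together with the log-determinant (the exact interface in the repo may differ):

```python
import torch

@torch.no_grad()
def image_latents(flow, x):
    """Map a batch of images x to latents and reshape them for plotting."""
    z, _ = flow(x)          # assumed to return (z, sum_log_det)
    return z.view(x.shape)  # the latent has the same dimensionality as x
```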
To reproduce the experiments in Appendix B, you can use the script `train_unsup_ood_negative.py`, e.g.
to maximize likelihood on CIFAR-10 and minimize likelihood on CelebA:

```bash
python3 train_unsup_ood_negative.py --ood_dataset=CelebA --ood_data_path=OOD_DATA_PATH --dataset=CIFAR10 \
  --data_path=DATA_PATH --logdir=LOG_DIR --ckptdir=CKPTS_DIR --num_epochs=101 --lr=5e-5 --batch_size=32 \
  --num_blocks=8 --num_scales=3 --negative_val=-100000 --save_freq=10 --flow=RealNVP
```
For other dataset pairs, reuse the hyper-parameters of the baseline models and set `--negative_val` to -100000 for CIFAR-10, CelebA and SVHN, and to -30000 for FashionMNIST and MNIST.
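
For reference, the objective behind these runs can be thought of as maximizing likelihood on the in-distribution batch while pushing the OOD log-likelihood down toward a floor. The sketch below assumes `--negative_val` acts as that floor; the names are illustrative and not the repo's exact implementation:

```python
import torch

def ood_negative_loss(flow, x_in, x_ood, negative_val=-100000.0):
    # standard maximum-likelihood term on in-distribution data
    nll_in = -flow.log_prob(x_in).mean()
    # penalize high likelihood on OOD data, but stop once it drops below the floor
    log_p_ood = flow.log_prob(x_ood)
    ood_term = torch.clamp(log_p_ood, min=negative_val).mean()
    return nll_in + ood_term
```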
The implementation of RealNVP and Glow was adapted from the repo for the paper Semi-Supervised Learning with Normalizing Flows.