Why Normalizing Flows Fail to Detect Out-of-Distribution Data #74

howardyclo opened this issue Jan 16, 2021 · 0 comments

Metadata

  • Authors: Polina Kirichenko, Pavel Izmailov, Andrew Gordon Wilson
  • Conference: NeurIPS 2020

Background: Training and sampling in Flows

  • Training and computing p(x): f: x -> z --> Feed x into the flow f to get z, then maximize log_prob(x) = log_prob(z) + log_det_jacobian(f(x)), where log_prob(z) is usually the log-likelihood of a standard Normal distribution N(z; mean=0, std=1).
  • Sampling: f⁻¹: z -> x --> Sample z from the prior distribution (i.e., the Normal distribution), then feed it through the inverse function f⁻¹ to get x. (A code sketch of both operations follows this list.)
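
Below is a minimal, hypothetical sketch of both operations in PyTorch. The single elementwise affine transform stands in for a real coupling-based flow (e.g., RealNVP/Glow); `AffineFlow` and `log_prob` are illustrative names, not from the paper.

```python
import torch
import torch.nn as nn

# Hypothetical minimal flow: one elementwise affine transform.
# Real flows stack many coupling layers instead.
class AffineFlow(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.log_scale = nn.Parameter(torch.zeros(dim))
        self.shift = nn.Parameter(torch.zeros(dim))

    def forward(self, x):
        # f: x -> z, plus log|det Jacobian| of f at x
        z = (x - self.shift) * torch.exp(-self.log_scale)
        log_det = -self.log_scale.sum().expand(x.shape[0])
        return z, log_det

    def inverse(self, z):
        # f⁻¹: z -> x (used for sampling)
        return z * torch.exp(self.log_scale) + self.shift

def log_prob(flow, x):
    # Change of variables: log p(x) = log p(z) + log|det df/dx|
    z, log_det = flow(x)
    prior = torch.distributions.Normal(0.0, 1.0)  # N(z; mean=0, std=1)
    return prior.log_prob(z).sum(dim=1) + log_det

# Training: maximize log p(x) on training data (MLE).
flow = AffineFlow(dim=2)
opt = torch.optim.Adam(flow.parameters(), lr=1e-2)
x_train = torch.randn(256, 2) * 2.0 + 1.0  # toy in-distribution data
for _ in range(200):
    loss = -log_prob(flow, x_train).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# Sampling: z ~ N(0, I), then x = f⁻¹(z).
with torch.no_grad():
    samples = flow.inverse(torch.randn(16, 2))
```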

TL;DR

  • Normalizing flows can compute the exact p(x), so we can train them by MLE (i.e., assign high density to training data).
  • However, most of the time p(x) fails to distinguish out-of-distribution (OOD) data from in-distribution data.
  • MLE training has limited influence on OOD detection: models are only trained to assign high density to training data, not to assign low density to OOD data.
  • That is, flows are trained to generate data, and this objective does not require learning semantics. Learning pixel correlations (e.g., nearby pixels have similar colors) is enough to generate high-quality images.
  • Whether data is in- or out-of-distribution is mainly determined by its semantics (i.e., label y), not by its pixel correlations.
  • The inductive bias of normalizing flows (the paper mainly studies coupling-layer-based NNs): they learn pixel correlations instead of semantics, which is why flows fail to detect OOD data.
  • Given image embeddings pretrained on images and labels, flows can successfully detect OOD data from those embeddings.
  • They study the intermediate outputs of the affine coupling layers by injecting different masks (e.g., the checkerboard mask, the horizontal mask, and their proposed cycle mask). With the first two masks, even when applied to intermediate layers, flows can still learn to predict pixels from their neighbors. With the proposed cycle mask, flows cannot easily predict pixels from their neighbors, and thus achieve successful OOD detection. However, since neighbors are no longer available to predict from, generation quality suffers. (A tradeoff between OOD detection and high-quality image generation?) See the mask sketch below.
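
A rough sketch of the masking idea in NumPy. The checkerboard and horizontal masks below follow the standard RealNVP-style construction; the paper's cycle mask is not reproduced here, and `scale_net`/`shift_net` are hypothetical placeholders for the coupling layer's networks.

```python
import numpy as np

def checkerboard_mask(h, w):
    # Alternating 0/1 pattern: every transformed pixel (mask == 0) has its
    # 4-neighbors in the conditioning set (mask == 1), so the coupling
    # networks can predict a pixel from its immediate neighbors.
    return (np.indices((h, w)).sum(axis=0) % 2).astype(np.float64)

def horizontal_mask(h, w):
    # Top half conditions, bottom half is transformed: pixels near the
    # boundary still have nearby conditioning pixels to copy from.
    m = np.zeros((h, w))
    m[: h // 2] = 1.0
    return m

def affine_coupling_forward(x, mask, scale_net, shift_net):
    # Affine coupling layer: the masked part passes through unchanged and
    # parameterizes an elementwise affine transform of the rest.
    x_id = x * mask
    s, t = scale_net(x_id), shift_net(x_id)
    z = x_id + (1.0 - mask) * (x * np.exp(s) + t)
    log_det = np.sum((1.0 - mask) * s)  # log|det Jacobian| of the layer
    return z, log_det
```

With the checkerboard or horizontal mask, each transformed pixel has conditioning pixels right next to it, which is exactly the local-correlation shortcut described above; the cycle mask removes those nearby neighbors, trading generation quality for OOD detection.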