
InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets #39


howardyclo commented Dec 7, 2018

Metadata

Abstract

InfoGAN is an information-theoretic extension to the GAN that is able to learn disentangled representations in a completely unsupervised manner. (Related to #33)

Vanilla GAN

  • Objective: min_{G} max_{D} V(D, G) = E_{x ~ P_data} [log D(x)] + E_{z ~ noise} [log(1 - D(G(z)))] (a minimal training-step sketch follows this list)
  • Problem: Input noise vector z has no restrictions on the manner in which the generator may use this noise. As a result, it is possible that the noise will be used by the generator in a highly entangled way, causing the individual dimensions of z to not correspond to semantic features of the data.
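A minimal sketch of this minimax objective as alternating discriminator/generator updates (PyTorch). The toy fully-connected networks, dimensions, and learning rates here are illustrative assumptions, not the paper's setup; the generator uses the standard non-saturating trick rather than literally minimizing log(1 - D(G(z))).

```python
import torch
import torch.nn as nn

# Toy networks for illustration; real GAN/InfoGAN experiments use conv architectures.
G = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 784), nn.Tanh())
D = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))

bce = nn.BCEWithLogitsLoss()
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)
opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)

def gan_step(x_real):
    batch = x_real.size(0)
    z = torch.randn(batch, 64)

    # max_D: E[log D(x)] + E[log(1 - D(G(z)))]
    x_fake = G(z).detach()
    loss_D = bce(D(x_real), torch.ones(batch, 1)) + \
             bce(D(x_fake), torch.zeros(batch, 1))
    opt_D.zero_grad(); loss_D.backward(); opt_D.step()

    # min_G: in practice G maximizes log D(G(z)) (non-saturating trick)
    x_fake = G(z)
    loss_G = bce(D(x_fake), torch.ones(batch, 1))
    opt_G.zero_grad(); loss_G.backward(); opt_G.step()
    return loss_D.item(), loss_G.item()

# Usage with a dummy batch of flattened images in [-1, 1]:
# losses = gan_step(torch.rand(16, 784) * 2 - 1)
```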

InfoGAN

  • Decompose the input noise vector into 2 parts:
    • Incompressible noise z (treated as a source of unstructured randomness in the data that cannot be captured by meaningful factors of variation)
    • Disentangled latent code c = {c_1, c_2, ..., c_L} (Encode factors of variation of dataset)
    • Note both vectors are learned in an unsupervised manner.
    • Problem: The generator may ignore the latent code: P_G(x|c) = P_G(x).
    • Apply regularization by maximizing mutual information: I(c; G(z,c)).
  • Mutual information I(X;Y):
    • Measures the “amount of information” learned from knowledge of random variable Y about the other random variable X.
    • I(X;Y) = H(X) − H(X|Y) = H(Y) − H(Y|X), where H(.) is entropy.
    • I(X;Y) is the reduction of uncertainty in X when Y is observed. If X and Y are independent, then I(X;Y) = 0, because knowing one variable reveals nothing about the other.
    • Given any x ∼ P_G(x), we want P_G(c|x) to have a small entropy. In other words, the information in the latent code c should not be lost in the generation process (Address the above problem).
  • Objective: min_{G} max_{D} V_I(D, G) = V(D, G) - λ I(c; G(z,c)) (an implementation sketch of the regularized generator loss follows this list)
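A minimal sketch of how the regularized objective is typically implemented: a recognition head Q shares a trunk with D and predicts the latent code c from generated samples, so maximizing the variational lower bound on I(c; G(z,c)) reduces to maximizing log Q(c|x) (H(c) is constant and dropped). The networks, code dimensions, λ, and the fixed-variance Gaussian (MSE) treatment of the continuous codes below are simplifying assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

NOISE_DIM, CAT_DIM, CONT_DIM = 62, 10, 2   # MNIST-like code sizes (assumption)

G = nn.Sequential(nn.Linear(NOISE_DIM + CAT_DIM + CONT_DIM, 256), nn.ReLU(),
                  nn.Linear(256, 784), nn.Tanh())
body = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2))  # shared D/Q trunk
d_head = nn.Linear(256, 1)                                     # real/fake logit
q_head = nn.Linear(256, CAT_DIM + CONT_DIM)                    # predicts code c

def sample_code(batch):
    cat = F.one_hot(torch.randint(0, CAT_DIM, (batch,)), CAT_DIM).float()
    cont = torch.rand(batch, CONT_DIM) * 2 - 1                 # Uniform(-1, 1)
    return cat, cont

def infogan_g_loss(batch, lam=1.0):
    z = torch.randn(batch, NOISE_DIM)
    cat, cont = sample_code(batch)
    x_fake = G(torch.cat([z, cat, cont], dim=1))
    h = body(x_fake)

    # Generator part of V(D, G): fool the discriminator (non-saturating form).
    adv = F.binary_cross_entropy_with_logits(d_head(h), torch.ones(batch, 1))

    # Variational MI term: -log Q(c|x), categorical + fixed-variance Gaussian parts.
    q_out = q_head(h)
    mi_cat = F.cross_entropy(q_out[:, :CAT_DIM], cat.argmax(dim=1))
    mi_cont = F.mse_loss(q_out[:, CAT_DIM:], cont)
    return adv + lam * (mi_cat + mi_cont)   # V(D,G) term - λ·L_I, up to constants
```

The MI term is also backpropagated into Q (and usually into the shared trunk), so D/Q and G cooperate on maximizing the lower bound while remaining adversarial on the real/fake game.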

Variational Mutual Information Maximization

  • Problem: I(c; G(z, c)) is hard to maximize directly as it requires access to the posterior P(c|x).
  • Obtain a lower bound of it by defining an auxiliary distribution Q(c|x) to approximate P(c|x) (a sketch of the derivation follows this list).
  • TODO: Upload lower bound derivation image in my macbook & change faster-rcnn folder name (lol)
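In the meantime, the variational lower bound from the paper can be sketched as follows; Q(c|x) is the auxiliary distribution, and the final equality uses the paper's Lemma 5.1 so that the bound no longer needs the posterior P(c|x):

```latex
\begin{aligned}
I(c; G(z,c))
  &= H(c) - H(c \mid G(z,c)) \\
  &= \mathbb{E}_{x \sim G(z,c)}\!\left[\mathbb{E}_{c' \sim P(c \mid x)}\!\left[\log P(c' \mid x)\right]\right] + H(c) \\
  &= \mathbb{E}_{x \sim G(z,c)}\!\left[\underbrace{D_{\mathrm{KL}}\!\big(P(\cdot \mid x)\,\|\,Q(\cdot \mid x)\big)}_{\ge 0}
       + \mathbb{E}_{c' \sim P(c \mid x)}\!\left[\log Q(c' \mid x)\right]\right] + H(c) \\
  &\ge \mathbb{E}_{x \sim G(z,c)}\!\left[\mathbb{E}_{c' \sim P(c \mid x)}\!\left[\log Q(c' \mid x)\right]\right] + H(c) \\
  &= \mathbb{E}_{c \sim P(c),\, x \sim G(z,c)}\!\left[\log Q(c \mid x)\right] + H(c)
  \;\equiv\; L_I(G, Q).
\end{aligned}
```

Since L_I(G, Q) ≤ I(c; G(z,c)) and is easy to estimate with Monte Carlo samples, the final InfoGAN objective maximizes it in place of the mutual information: min_{G,Q} max_{D} V(D, G) - λ L_I(G, Q).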
