## COMBINING GANs AND VAEs: ADVERSARIAL VARIATIONAL BAYES (AVB)

#### Giuseppe Onesto

### VAEs and GANs description and disadvantages

Variational Autoencoders (<b>VAEs</b>) and Generative Adversarial Networks (<b>GANs</b>) are two of the most prominent methods for learning generative models.
In more detail, VAEs are expressive latent variable models that can be used to learn complex probability distributions from training data. GANs instead frame the task as an adversarial process that simultaneously trains two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than from G. G is trained to maximize the probability of D making a mistake [1].
Although both have achieved strong results in the literature, each has disadvantages, as explained by the authors of Adversarial Variational Bayes (AVB) [2]. For VAEs, the main one is that the quality of the resulting model crucially relies on the expressiveness of the inference model. For GANs, it is very hard for the model to learn to generate discrete data, and stable training requires finding a Nash equilibrium of a game, which is not straightforward. <br>
<b>Adversarial Variational Bayes (AVB)</b> is a technique for training Variational Autoencoders with arbitrarily expressive inference models. This is achieved by introducing an auxiliary discriminative network that allows the maximum-likelihood problem to be rephrased as a two-player game, hence establishing a principled connection between VAEs and GANs.

#### The AVB Model

The model is an extension of the VAE model. In general, a VAE is specified by a parametric generative model $ p_\theta(x|z) $
of the visible variables given the latent variables, a prior $ p(z) $ over the latent variables, and an approximate inference
model $ q_\phi(z|x) $ over the latent variables given the visible variables. It can be shown that: <br>
<p style="text-align: center;"> $ \log p_\theta(x) \geq -\mathrm{KL}(q_\phi(z|x), p(z)) + E_{q_\phi(z|x)} \log p_\theta(x|z) $</p>
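
To make the bound concrete, here is a minimal numerical sketch on a toy model that is not taken from [2]: a binary latent variable and a binary observation with made-up probabilities (`prior`, `likelihood` and the helper names are my own assumptions). It checks that the right-hand side of the inequality stays below $ \log p_\theta(x) $ for a poor choice of $ q $, and reaches it when $ q $ is the true posterior.

```python
import numpy as np

# Hypothetical toy model: binary latent z, binary observation x.
prior = np.array([0.5, 0.5])          # p(z)
likelihood = np.array([[0.9, 0.1],    # p(x | z=0) for x = 0, 1
                       [0.2, 0.8]])   # p(x | z=1) for x = 0, 1

def log_marginal(x):
    """log p(x), obtained by summing the joint over z."""
    return np.log(np.sum(prior * likelihood[:, x]))

def elbo(q, x):
    """-KL(q, p(z)) + E_q[log p(x|z)] for a distribution q over z."""
    kl = np.sum(q * (np.log(q) - np.log(prior)))
    expected_log_lik = np.sum(q * np.log(likelihood[:, x]))
    return -kl + expected_log_lik

def true_posterior(x):
    """p(z | x) via Bayes' rule."""
    joint = prior * likelihood[:, x]
    return joint / joint.sum()

x = 1
q_poor = np.array([0.5, 0.5])   # ignores the observation
q_exact = true_posterior(x)     # best possible inference model

print("log p(x):         ", log_marginal(x))
print("bound, poor q:    ", elbo(q_poor, x))
print("bound, exact q:   ", elbo(q_exact, x))
```

The gap between the bound and $ \log p_\theta(x) $ is the KL divergence between $ q $ and the true posterior, which is why a more expressive inference model gives a tighter bound.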

When performing maximum-likelihood training, the goal is to maximize the expected marginal log-likelihood: <br>
<p style="text-align: center;">$ E_{p_D(x)} \log p_\theta(x) $, where $ p_D $ is the data distribution.</p>

Formulas aside, the key point is that the quality of a VAE model strictly depends on the expressiveness of the inference model $ q_\phi(z|x) $.

In the AVB model, this is handled by feeding noise to the inference model as an additional input, instead of adding it at the very end (as happens in VAEs), which allows the inference network to learn complex probability distributions. <br>
We can think of AVB as a black-box inference model $ q_\phi(z|x) $ that uses adversarial training to obtain an approximate maximum-likelihood assignment $ \theta^* $ to $ \theta $ and a close approximation $ q_{\phi^*}(z|x) $ to the true posterior $ p_{\theta^*}(z|x) $.
This is shown in the image below, taken from the authors' paper [2].

![alt text](AVB.png)
<p style="text-align: center;"><b>Figure 1:</b> Comparison of a traditional VAE model and the AVB model</p>
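
The difference between the two inference models can be sketched in a few lines of NumPy. This is only an illustration under my own assumptions (a one-hidden-layer network, random untrained weights, and hypothetical names such as `vae_encoder_sample` and `avb_encoder_sample`); it is not the authors' implementation.

```python
import numpy as np

def mlp(inputs, w1, w2):
    """Tiny one-hidden-layer network (biases omitted for brevity)."""
    return np.tanh(inputs @ w1) @ w2

def vae_encoder_sample(x, w1, w2, rng):
    """Standard VAE: the network outputs a mean and log-variance,
    and Gaussian noise is added only at the very end."""
    out = mlp(x, w1, w2)
    mu, log_var = np.split(out, 2, axis=-1)
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def avb_encoder_sample(x, w1, w2, rng, noise_dim):
    """AVB-style: noise is an extra input, concatenated to x and
    transformed by the whole network."""
    eps = rng.standard_normal((x.shape[0], noise_dim))
    return mlp(np.concatenate([x, eps], axis=-1), w1, w2)

rng = np.random.default_rng(0)
x_dim, hidden, latent_dim, noise_dim = 4, 16, 2, 3   # illustrative sizes
x = rng.integers(0, 2, size=(5, x_dim)).astype(float)

# Random weights stand in for trained parameters.
vae_w1 = rng.standard_normal((x_dim, hidden))
vae_w2 = rng.standard_normal((hidden, 2 * latent_dim))
avb_w1 = rng.standard_normal((x_dim + noise_dim, hidden))
avb_w2 = rng.standard_normal((hidden, latent_dim))

z_vae = vae_encoder_sample(x, vae_w1, vae_w2, rng)
z_avb = avb_encoder_sample(x, avb_w1, avb_w2, rng, noise_dim)
print(z_vae.shape, z_avb.shape)
```

In the first function, z given x is always Gaussian with diagonal covariance. In the second, the noise passes through the nonlinearity, so the distribution of z given x is defined only implicitly and can take more complex shapes, at the cost of no longer having a closed-form density.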


To understand the link between AVB and GANs, which is the main idea that takes us from VAEs to AVB, we need to look at the training objective of the model presented above:
 <p style="text-align: center;"> $ \max_\theta \max_\phi E_{p_D(x)} E_{q_\phi(z|x)} \left( \log p(z) - \log q_\phi(z|x) + \log p_\theta(x|z) \right) $ </p> 
 
The idea is to implicitly represent the term $ \log p(z) - \log q_\phi(z|x) $ as the optimal value of an additional real-valued discriminative network $ T(x,z) $ that is introduced to the problem.<br>
For more on the structure of the network and mathematical proofs of the model, please see [2].
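
As a small sanity check of that idea: in [2] the network $ T $ is trained with a logistic objective to tell pairs $ (x,z) $ with $ z $ drawn from $ q_\phi(z|x) $ apart from pairs with $ z $ drawn from the prior $ p(z) $, and the paper shows that the optimum is $ T^*(x,z) = \log q_\phi(z|x) - \log p(z) $, i.e. the negative of the term above. The sketch below verifies this pointwise by grid search for one pair of density values; the numbers `q_density` and `p_density` are made up for illustration.

```python
import numpy as np

def log_sigmoid(t):
    """Numerically stable log(sigmoid(t))."""
    return -np.logaddexp(0.0, -t)

def pointwise_objective(t, q_density, p_density):
    """Discriminator objective at a single (x, z):
    q * log sigmoid(T) + p * log(1 - sigmoid(T))."""
    return q_density * log_sigmoid(t) + p_density * log_sigmoid(-t)

# Hypothetical density values of q(z|x) and p(z) at one point (x, z).
q_density, p_density = 0.6, 0.15

grid = np.linspace(-5.0, 5.0, 100001)
best_t = grid[np.argmax(pointwise_objective(grid, q_density, p_density))]

print("grid-search optimum:  ", best_t)
print("log q(z|x) - log p(z):", np.log(q_density) - np.log(p_density))
```

Because the optimal discriminator recovers the log density ratio, the objective can be estimated from samples alone, without ever evaluating $ q_\phi(z|x) $ explicitly.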


#### Experimental results

Here I would like to show a result of applying AVB to learn a generative model, compared with a VAE model that uses a diagonal Gaussian posterior distribution.
The AVB neural networks were trained on a very simple synthetic dataset containing 4 data points from the space of 2×2 images, shown below:
![alt text](synthetic.png)
<p style="text-align: center;"><b>Figure 2:</b> Images used for AVB training</p>

The encoder network of AVB takes as input a data point x and a vector of Gaussian random noise and produces a latent code z. The decoder network takes as input a latent code z and produces the parameters for four independent Bernoulli distributions, one for each pixel of the output image. The encoder and the decoder of the VAE are parametrized like the AVB ones, except that the VAE encoder does not take any noise as input.
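
The decoder side of this setup can be sketched as follows. It is a rough illustration with assumptions of my own: a single linear layer with random weights, a two-dimensional latent code, and four one-hot 2×2 images standing in for the dataset (the exact data points are the ones shown in Figure 2, not necessarily these).

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def decoder_probs(z, w, b):
    """Map latent codes to four Bernoulli means, one per pixel."""
    return sigmoid(z @ w + b)

def bernoulli_log_likelihood(x, probs):
    """log p(x|z), summed over the four independent pixels."""
    return np.sum(x * np.log(probs) + (1.0 - x) * np.log(1.0 - probs), axis=-1)

rng = np.random.default_rng(0)
latent_dim = 2                              # assumed latent size
w = rng.standard_normal((latent_dim, 4))    # random, untrained weights
b = np.zeros(4)

x = np.eye(4)                               # assumed stand-in: four flattened 2x2 images
z = rng.standard_normal((4, latent_dim))    # placeholder latent codes

probs = decoder_probs(z, w, b)
log_lik = bernoulli_log_likelihood(x, probs)
print(probs.reshape(4, 2, 2))
print(log_lik)
```

The value returned by `bernoulli_log_likelihood` plays the role of the $ \log p_\theta(x|z) $ term in the objective.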
The images below show the results obtained by the authors:

![alt text](AVBvsVAE1.jpg)
<p style="text-align: center;"><b>Figure 3:</b> Distribution of VAE and AVB latent code results</p>

![alt text](AVBvsVAE2.png)
<p style="text-align: center;"><b>Figure 4:</b> Comparison of VAE and AVB results</p>


#### The AVB GitHub project

Finally, I'd like to share the authors' [GitHub project](https://github.com/LMescheder/AdversarialVariationalBayes), in which they have implemented their AVB model in Python, shared the datasets used for the experiments, and made it possible to use AVB for both variational inference and generative modeling. It's very interesting to play with.






#### REFERENCES

[1] Goodfellow, Ian, et al. "Generative adversarial nets." Advances in neural information processing systems. 2014. <br>
[2] Mescheder, Lars, Sebastian Nowozin, and Andreas Geiger. "Adversarial variational bayes: Unifying variational autoencoders and generative adversarial networks." arXiv preprint arXiv:1701.04722 (2017).