## Instructions:
- You can use NN libraries such as tensorflow, pytorch, etc. But implement GAN and VAE on your own.
- Zero tolerance for plagiarism. Do not copy from Practice and Share; only the student who originally submitted code there may reuse it. GitHub tracks who pushed what code.
- Total marks: 50
- Marks are awarded for answering the questions asked below, not for the code. Use plots, tables, etc. to convey your answers.

In [None]:
import numpy as np
np.random.seed(5)
import matplotlib.pyplot as plt

# VAEs vs GANs
The objective of this exercise is to compare VAEs and GANs for a generative modeling task (generating samples from a mixture of Gaussians).

- Prepare the target probability distribution $p_x^*$ as a GMM with 5 components.
(Using snippets from CodingQuiz1.py is allowed)
- Construct and train a VAE to model $p^*_x$.
- Construct and train a GAN to model $p^*_x$.
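The first step above, preparing $p^*_x$, could be sketched as follows. This is only an illustration: the dimensionality (2-D), the means on a circle, the isotropic covariances, the equal weights, and the names `sample_gmm`, `x_train` and `y_train` are all assumptions, not values taken from the notebooks or from CodingQuiz1.py.

```python
import numpy as np

def sample_gmm(n, means, covs, weights, rng):
    """Draw n points from a Gaussian mixture; also return the component labels."""
    k = len(weights)
    labels = rng.choice(k, size=n, p=weights)      # pick a component per sample
    samples = np.empty((n, means.shape[1]))
    for j in range(k):
        idx = labels == j
        samples[idx] = rng.multivariate_normal(means[j], covs[j], size=idx.sum())
    return samples, labels

rng = np.random.default_rng(5)
# Hypothetical 2-D target: 5 means on a circle of radius 4, isotropic covariances.
angles = 2 * np.pi * np.arange(5) / 5
means = 4.0 * np.stack([np.cos(angles), np.sin(angles)], axis=1)
covs = np.stack([0.25 * np.eye(2)] * 5)
weights = np.full(5, 0.2)
x_train, y_train = sample_gmm(10_000, means, covs, weights, rng)
```

Keeping the component labels alongside the samples makes it easy to colour each component separately in later plots.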


## Questions
1. Plot the samples from original $p^*_x$, VAE and GANs to compare them visually. (You may plot them on same or different plots) [10 marks]
2. Quantify the performance of the two models and compare them. (Think about which metrics you can use and report as many as you can) [10 marks]
3. Expressivity vs Efficiency: Which of the two models is more expressive? Which is more efficient? (Number of parameters vs performance, computations vs performance, etc.) [10 marks]
4. Compare the pros and cons of the two models in a table. [10 marks]
5. Do you find a structure in the latent space in the two models? E.g., each Gaussian component of $p^*_x$ may correspond to a specific region in the latent space. [10 marks]

## Codes and Answers
It would be appreciated if you write modular code and reuse it while answering different parts.

## Q1 and Q5
For the plots, please check the GAN.ipynb and VAE.ipynb notebooks.  

#### GAN
After the training, we plot the original data with each component of the GMM coloured with a different colour. We then sample noise from the standard normal distribution and transform it using our GAN.   
$x = G(z)$  
For each $x$ we determine which component of the GMM it most likely came from, and colour $G(z)$ and the corresponding $z$ accordingly.   
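The component assignment described above could be computed roughly as below. This is a sketch under assumptions: the two-component mixture, the 2-D latent space, and the placeholder `G` are hypothetical stand-ins, and `most_likely_component` is an illustrative name rather than a function from GAN.ipynb.

```python
import numpy as np

def most_likely_component(x, means, covs, weights):
    """Index of the component k maximising w_k * N(x | mu_k, Sigma_k), per row of x."""
    diff = x[:, None, :] - means[None, :, :]               # (n, k, d)
    prec = np.linalg.inv(covs)                             # (k, d, d)
    maha = np.einsum("nkd,kde,nke->nk", diff, prec, diff)  # squared Mahalanobis distances
    logdet = np.linalg.slogdet(covs)[1]                    # (k,)
    # The (2*pi)^(-d/2) factor is shared by all components, so it is dropped.
    log_joint = np.log(weights) - 0.5 * (maha + logdet)
    return log_joint.argmax(axis=1)

# Hypothetical two-component mixture and a stand-in for the trained generator.
means = np.array([[-3.0, 0.0], [3.0, 0.0]])
covs = np.stack([np.eye(2), np.eye(2)])
weights = np.array([0.5, 0.5])
G = lambda z: 3.0 * z  # placeholder only, not the notebook's generator

rng = np.random.default_rng(0)
z = rng.standard_normal((1000, 2))
labels = most_likely_component(G(z), means, covs, weights)
```

The resulting `labels` array can be passed as the colour argument when scattering both $z$ and $G(z)$, so that matching points share a colour.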

There is also a gan.mp4 which shows us the evolution of the model as it generates data from the same noise vector after each epoch.  

#### VAE
After training our VAE, we plot the complete reconstructed data.    

To visualize the spaces better, we sample from each component of the GMM and plot the reconstructed samples of that component in a single colour. This gives a better sense of the reconstruction. For each component we also plot the encoded data $z = E(x)$ in the same colour, which helps us visualize the structure in the latent space.    

We can see that each component has a separate cluster in the latent space. 

There is a vae.mp4 which shows us the evolution of the model as it learns to encode and decode the training data.

## Q2 
1. Log Likelihood 
2. Average Minimum distance from any training sample
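One possible reading of the two metrics is sketched below: the log likelihood is taken as the mean log-density of generated samples under the known target GMM, and the second metric as the distance from each generated sample to its nearest training sample, averaged over the generated samples. Both readings, the function names, and the toy single-component data are assumptions; the notebooks may define the metrics differently.

```python
import numpy as np

def gmm_log_density(x, means, covs, weights):
    """log p*(x) for each row of x under a Gaussian mixture."""
    d = x.shape[1]
    comp = []
    for m, c, w in zip(means, covs, weights):
        diff = x - m
        maha = np.einsum("ij,ij->i", diff, np.linalg.solve(c, diff.T).T)
        logdet = np.linalg.slogdet(c)[1]
        comp.append(np.log(w) - 0.5 * (maha + logdet + d * np.log(2 * np.pi)))
    # log-sum-exp over components, done stably
    return np.logaddexp.reduce(np.stack(comp, axis=1), axis=1)

def mean_log_likelihood(generated, means, covs, weights):
    """Average log-density of generated samples under the target mixture."""
    return float(gmm_log_density(generated, means, covs, weights).mean())

def avg_min_distance(generated, train, chunk=1024):
    """Mean Euclidean distance from each generated sample to its nearest training sample."""
    mins = []
    for start in range(0, len(generated), chunk):  # chunked to bound memory use
        g = generated[start:start + chunk]
        d2 = ((g[:, None, :] - train[None, :, :]) ** 2).sum(axis=-1)
        mins.append(np.sqrt(d2.min(axis=1)))
    return float(np.concatenate(mins).mean())

# Toy check with a single standard-normal component (hypothetical data).
means, covs, weights = np.zeros((1, 2)), np.eye(2)[None], np.ones(1)
rng = np.random.default_rng(0)
train = rng.standard_normal((2000, 2))
generated = rng.standard_normal((500, 2))
ll = mean_log_likelihood(generated, means, covs, weights)
amd = avg_min_distance(generated, train)
```

A higher mean log likelihood and a lower average minimum distance both indicate samples closer to the target, although a very small distance can also indicate memorised training points.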

## Q3 
### Expressivity
This is a more qualitative assessment. In this sense, GANs seem to be more expressive because they learn to transform a latent space into samples from $p^*(x)$. From the video and the spread of $x = G(z)$, we can see that the GAN captures the shape of the training data better than the VAE.  

On the other hand, the VAE learns to reconstruct the samples from $p^*(x)$. While the sampling part in the VAE helps to bring in some more diversity from the latent space, the main purpose is to bring more structure to the latent space.  
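For reference, the contrast between the two training signals can be written down, assuming the standard objectives (the notebooks may use variants, such as a non-saturating generator loss). Here $D$ is the discriminator, $q_\phi(z \mid x)$ the encoder distribution and $p_\theta(x \mid z)$ the decoder distribution; $\theta$, $\phi$ and the standard normal prior for the VAE are notation and assumptions introduced for this sketch.

$$\min_G \max_D \; \mathbb{E}_{x \sim p^*_x}\big[\log D(x)\big] + \mathbb{E}_{z \sim \mathcal{N}(0, I)}\big[\log\big(1 - D(G(z))\big)\big]$$

$$\max_{\theta, \phi} \; \mathbb{E}_{x \sim p^*_x}\Big[\mathbb{E}_{z \sim q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big] - \mathrm{KL}\big(q_\phi(z \mid x) \,\|\, \mathcal{N}(0, I)\big)\Big]$$

The first term of the VAE objective is the reconstruction term, and the KL term is what pulls the encoded points towards a structured latent space.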

Another way of thinking about expressivity is that a GAN is learning to be creative and original (with respect to some standard) and generates data which is as good as real. This is done from noise.  
On the other hand, the VAE has some kind of supervision since it learns to reconstruct the data. Interpolating in the latent space may give us new kinds of data, but compared to the GAN, the VAE is still restricted in terms of training.

#### Another thought on VAE 
VAEs learn a latent space to represent the real data. By interpolating between different points in the latent space we could get data that is a never-seen-before mix of the original data. This is one viewpoint on the expressivity of VAEs. However, in the experiments conducted, GANs seem to be more expressive and to model the data better.
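The interpolation idea could be sketched as follows. The helper name `interpolate_latents`, the 2-D latent vectors, and the endpoints are hypothetical; in practice the endpoints would be encodings $E(x)$ of two training points and the path would be passed through the trained decoder.

```python
import numpy as np

def interpolate_latents(z_start, z_end, steps=8):
    """Linearly interpolate between two latent vectors, endpoints included."""
    t = np.linspace(0.0, 1.0, steps)[:, None]   # (steps, 1) mixing coefficients
    return (1.0 - t) * z_start + t * z_end      # (steps, latent_dim)

# Hypothetical latent codes standing in for the encodings of two data points.
z_a = np.array([-1.0, 0.5])
z_b = np.array([2.0, -1.5])
path = interpolate_latents(z_a, z_b, steps=5)
# Decoding `path` with the trained VAE decoder (not defined here) would give
# the intermediate samples between the two original points.
```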

### Efficiency
The two models give different types of results. However, we can still use metrics such as "Number of Parameters vs performance" and "Training time vs performance".

##### Number of Parameters
##### GAN : 
4482 (G) + 17537 (D)
##### VAE :
18054

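The GAN requires more parameters for the overall training loop: 4482 + 17537 = 22019, versus 18054 for the VAE. Only the generator's 4482 parameters are needed to generate samples after training.  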

##### Training Time 
GAN training : 732.8 s for 250 epochs => 2.9312 s / epoch (batch size = 256, training data = 100000)  
VAE training : 682.2 s for 250 epochs => 2.7288 s / epoch (batch size = 128, training data = 100000)   
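The GAN takes about 7% longer per epoch even though its batch size is twice as large. It therefore performs about half as many updates per epoch (roughly 391 versus 782), so each GAN update takes roughly twice as long as a VAE update.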
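To account for the different batch sizes, the timings above can be normalised to time per update. The sketch below assumes one update per batch and that the last partial batch is kept (hence the ceiling); for the GAN, one "update" would cover whatever discriminator and generator steps are taken per batch. The function name is illustrative.

```python
import math

def time_per_update(total_seconds, epochs, n_train, batch_size):
    """Average wall-clock seconds per batch update."""
    updates_per_epoch = math.ceil(n_train / batch_size)
    return total_seconds / (epochs * updates_per_epoch)

gan = time_per_update(732.8, 250, 100_000, 256)  # about 7.5 ms per update
vae = time_per_update(682.2, 250, 100_000, 128)  # about 3.5 ms per update
ratio = gan / vae                                # a little over 2
```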

## Q4 
#### GAN
| Pros                          | Cons                                  |
| -----------                   | -----------                           |
| More expressive               | Unstable and Tricky Training          |
| Realistic data generation     | Long Training Time                    |
| -                             | Cannot model $p(z \mid x)$            | 
| -                             | Requires a lot of data                | 

#### VAE
| Pros                                                              | Cons                                                                          |
| -----------                                                       | -----------                                                                   |
| Two VAEs can be compared in a standard way using log likelihood   | Generated data is blurred because of the sampling and reconstruction          |
| Can estimate latent variables, i.e. $p(z \mid x)$                 | Long Training Time                                                            |
