# Generative Models

In the previous Generative Modeling classwork, we talked about how the goal of generative modeling is to learn how to create new inputs (images, tabular data, etc.) that have similar properties to the training data. Generally, this can be accomplished in two ways:

- **Density Estimation**: where the model learns an explicit probability density function that it can sample from (e.g. Naive Bayes, Gaussian Mixture Models, and Variational Autoencoders)
- **Sample Generation**: where the model learns a function that can generate new samples without explicit density estimation (such as Generative Adversarial Networks)


In this classwork, we'll focus on **Sample Generation** models, which do not explicitly learn a probability distribution to sample from (unlike the VAEs, GMMs, and Naive Bayes models in the last section). Instead, we ask the model to create new inputs, reward it when those inputs look like our training data, and penalize it when they don't.

The most common form of this type of model is a **Generative Adversarial Network** (GAN). GANs consist of two parts:

- **Generator**: whose job is to create convincing new images/inputs from random noise
- **Discriminator**: whose job it is to detect fake (made with the Generator) vs. real (training data) inputs

The Generator and Discriminator act like a counterfeiter and a cop: the Generator learns to make more convincing fakes over time, and the Discriminator learns to detect those fakes better over time.
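To make the two roles concrete, here is a minimal sketch in plain Python. It is not a real GAN: the data is a single number rather than an image, nothing is trained, and the linear `generator`, the logistic `discriminator`, and all of their weights are hypothetical values chosen only for illustration.

```python
import math
import random

def generator(z, weight=2.0, bias=1.0):
    """Turn a noise value z into a fake 1-D sample (hypothetical linear generator)."""
    return weight * z + bias

def discriminator(x, weight=1.5, bias=-0.5):
    """Return the predicted probability that x is real (hypothetical logistic model)."""
    return 1.0 / (1.0 + math.exp(-(weight * x + bias)))

random.seed(0)
z = random.gauss(0.0, 1.0)      # random noise fed to the Generator
fake = generator(z)             # the Generator's fake sample
p_real = discriminator(fake)    # the Discriminator's probability that the fake is real

print(f"noise={z:.3f}, fake sample={fake:.3f}, P(real)={p_real:.3f}")
```

In an actual GAN, both functions would be neural networks, and training would adjust their weights in opposite directions.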


## Loss

$$ E_x[\log(D(x))] + E_z[\log(1-D(G(z)))] $$

- $E_x$ refers to the expected value across all real examples
- $E_z$ refers to the expected value across all fake examples (one for each random noise input $z$)
- $D(x)$ is the predicted probability from the Discriminator ($D$) on *real* samples ($x$)
- $G(z)$ is a fake output from the Generator ($G$) given some random noise ($z$)
- $D(G(z))$ is the predicted probability from the Discriminator ($D$) on *fake* samples ($G(z)$)

<img src="https://drive.google.com/uc?export=view&id=1ghyQPx1N8dmU3MV4TrANvqNhGwnLni72" alt="Q" width = "200"/>

- Given that predicted probabilities can only be between 0 and 1, think through with your group what values of $D(x)$ and $D(G(z))$ would *minimize* and *maximize* the loss.
- What do the values you calculated above tell you about the performance of the Generator/Discriminator when the loss is minimized? maximized?


Because the Generator and Discriminator are largely at odds, the Generator wants to minimize the loss function, and the Discriminator wants to maximize it. 
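As a rough numerical check of this tug-of-war, the sketch below evaluates the loss for a single real example and a single fake example. The function name `gan_value` and the probabilities passed to it are made up for illustration; a real GAN would average over batches of examples.

```python
import math

def gan_value(d_real, d_fake):
    """Compute log(D(x)) + log(1 - D(G(z))) for one real and one fake example.

    d_real plays the role of D(x) and d_fake the role of D(G(z)).
    Both must be strictly between 0 and 1, because log(0) is undefined.
    """
    return math.log(d_real) + math.log(1.0 - d_fake)

# Hypothetical scenarios (the probabilities are illustrative, not measured):
strong_discriminator = gan_value(d_real=0.9, d_fake=0.1)  # reals and fakes are told apart
unsure_discriminator = gan_value(d_real=0.5, d_fake=0.5)  # the Discriminator is guessing
fooled_discriminator = gan_value(d_real=0.5, d_fake=0.9)  # fakes are rated as likely real

for name, value in [("strong", strong_discriminator),
                    ("unsure", unsure_discriminator),
                    ("fooled", fooled_discriminator)]:
    print(f"{name:>6} discriminator: value = {value:.3f}")
```

The value is highest when the Discriminator separates real from fake, and it drops as the Generator's fakes get rated as more likely to be real, which is why the two models push the same quantity in opposite directions.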

## Problems

However, because GANs are really two competing models in a trench coat, they can be difficult to train. Some major issues are:

- Mode Collapse: The Generator learns a single image (or a small set of images) that fools the Discriminator and maps all input noise to it.
- Vanishing Gradients: When the Discriminator is too good, the Generator's gradient goes to zero (or near zero), so the Generator is unable to learn.
- Lack of Convergence: Sometimes the Generator and Discriminator reach a type of equilibrium where the Generator is creating good fakes. But sometimes they just oscillate, undoing each other's progress.
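The vanishing gradient problem can be illustrated with a little calculus. The sketch below assumes (this is an assumption for illustration, not something specified above) that the Discriminator produces its probability by passing a raw score, called `logit` here, through a sigmoid. Under that assumption, the derivative of the Generator's term $\log(1-D(G(z)))$ with respect to the score is $-\text{sigmoid}(\text{logit})$, which shrinks toward zero as the Discriminator becomes more confident that a sample is fake.

```python
import math

def sigmoid(a):
    """Squash a raw score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-a))

def generator_term(logit):
    """The Generator's part of the loss: log(1 - D(G(z))), with D(G(z)) = sigmoid(logit)."""
    return math.log(1.0 - sigmoid(logit))

def generator_term_grad(logit):
    """Derivative of generator_term with respect to logit, which simplifies to -sigmoid(logit)."""
    return -sigmoid(logit)

# More negative logits mean the Discriminator is more confident the sample is fake.
for logit in (0.0, -2.0, -5.0, -10.0):
    p_fake_is_real = sigmoid(logit)
    grad = generator_term_grad(logit)
    print(f"logit={logit:6.1f}  D(G(z))={p_fake_is_real:.5f}  gradient={grad:.5f}")
```

As the printed gradients approach zero, each update gives the Generator less and less signal about how to improve its fakes.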


# Generative Models in Your Daily Life
- Star Wars
- [TikTok](https://www.theverge.com/2023/3/2/23621751/bold-glamour-tiktok-face-filter-beauty-ai-ar-body-dismorphia)

# MNIST and Pokemon GANs

First, let's take a look at TensorFlow's example GAN on MNIST data ([here](https://www.tensorflow.org/tutorials/generative/dcgan)).

Then, let's open my modified version, which uses Pokemon images (converted to grayscale) in a similar GAN ([here](https://colab.research.google.com/drive/1-cZOaHL5hREzpXc3Kw4F_BFCZRf_TuW3?usp=sharing)).
