# Generative Adversarial Networks

**Main Idea** 
We have **two** networks running:
1. The $Generator$: which creates samples by mapping random noise to the data space.
2. The $Discriminator$: which aims to classify whether its input is real or fake.

When the $Discriminator$ correctly identifies a fake image, this is sent back as a signal to the $Generator$ to improve its sample quality.

- We will see that training a GAN isn't stable.
- Although it's capable of producing **Quality Samples**, it doesn't have great **Coverage**.

## Discrimination as a signal

We want to generate new samples $\{\mathbf{x_j}^*\}$ that look as if they were drawn from the same distribution as a set of real training data $\{\mathbf{x_i}\}$.

**How we generate a new sample**
1. Choose a **simple and known** distribution.
2. **Randomly sample a point** $\mathbf{z_j}$ from that distribution.
3. Pass this point through a network $\mathbf{x_j^*} = \mathbf{g}[\mathbf{z_j}, \theta ]$, the $Generator$.
   - The network aims to find the parameters $\theta$ that make $\mathbf{x_j^*}$ look similar to the training data $\{\mathbf{x_i}\}$.
4. Our $Discriminator$ network aims to classify its input as one of $\{Real, Generated\}$.
   - If the discriminator classifies a generated image as fake, it sends the **gradient signal** back to the generator during backpropagation.
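
The four steps above can be sketched in a few lines of plain Python. This is only a toy illustration, not a real GAN: the affine `generator`, the linear `discriminator`, the standard normal latent distribution, and all the parameter values are assumptions chosen to keep the example small.

```python
import random

def sample_latent(dim, rng):
    """Steps 1-2: draw z from a simple, known distribution (standard normal here)."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]

def generator(z, theta):
    """Step 3: toy stand-in for g[z, theta], an affine map x* = W z + b."""
    W, b = theta
    return [sum(w * zk for w, zk in zip(row, z)) + bi for row, bi in zip(W, b)]

def discriminator(x, phi):
    """Step 4: toy stand-in for f[x, phi], a linear score (higher = 'more real')."""
    w, c = phi
    return sum(wk * xk for wk, xk in zip(w, x)) + c

rng = random.Random(0)
# Hypothetical parameters: theta maps a 2-D latent point to a 3-D "data" point.
theta = ([[1.0, 0.0], [0.5, 2.0], [0.0, -1.0]], [0.0, 1.0, 0.0])
phi = ([0.1, -0.2, 0.3], 0.0)

z = sample_latent(2, rng)            # random point in latent space
x_star = generator(z, theta)         # generated sample in data space
score = discriminator(x_star, phi)   # real-valued "realness" score
```

In a real GAN both functions would be deep networks, and $\theta$ and $\phi$ would be learned rather than fixed by hand.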

### GAN loss function

Our Discriminator: $f[\mathbf{x}, \phi] \in \mathbb{R}$
- The **higher** the **value**, the **more** it believes the image to be **Real**.
- The goal of the discriminator is to minimize the binary cross-entropy loss, since this is a classic binary-classification problem:
$$
\hat{\phi} = \text{arg}\min_{\phi} \left[\sum_i -(1-y_i)\log\left(1- \sigma(f[\mathbf{x}_i, \phi])\right) - y_i\log\left(\sigma(f[\mathbf{x}_i, \phi])\right)\right]
$$
- We **define** $real$ examples $\mathbf{x_i}$ to have label $y=1$ and $generated$ examples $\mathbf{x_j^*}$ to have label $y=0$, so each sum keeps only one of the two terms:
  
$$
\hat{\phi} = \text{arg}\min_{\phi} \left[\sum_j -\log\left(1- \sigma(f[\mathbf{x_j^*}, \phi])\right) - \sum_i\log\left(\sigma(f[\mathbf{x}_i, \phi])\right)\right]
$$
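
As a rough sketch, the loss above can be computed directly from the discriminator's raw scores. The function names are hypothetical, and the code assumes the scores $f[\mathbf{x}, \phi]$ have already been computed. It uses the identities $-\log\sigma(a) = \text{softplus}(-a)$ and $-\log(1-\sigma(a)) = \text{softplus}(a)$ to avoid taking the log of a value near zero.

```python
import math

def softplus(a):
    """log(1 + exp(a)), written so that large |a| does not overflow."""
    return max(a, 0.0) + math.log1p(math.exp(-abs(a)))

def discriminator_loss(real_scores, fake_scores):
    """Binary cross-entropy with real label y=1 and generated label y=0.

    real_scores: f[x_i, phi] for real examples
    fake_scores: f[x_j*, phi] for generated examples
    """
    fake_term = sum(softplus(s) for s in fake_scores)    # -log(1 - sigmoid(s))
    real_term = sum(softplus(-s) for s in real_scores)   # -log(sigmoid(s))
    return fake_term + real_term

# An undecided discriminator (all scores 0) pays log(2) per example.
undecided = discriminator_loss([0.0], [0.0])
# A discriminator that scores real high and generated low pays much less.
confident = discriminator_loss([10.0], [-10.0])
```

The loss is small only when real examples get high scores and generated examples get low scores, which is what the discriminator is trained towards.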

Our Generator: $\mathbf{x_j^*} = g[\mathbf{z_j}, \theta] \in \mathbb{R}^n$
- The generator wants to make the discriminator's loss as large as possible (its samples should be mistaken for real ones), but the discriminator is simultaneously minimizing that same loss, so the generator maximizes the minimum:

$$
\hat{\theta} = \text{arg}\max_{\theta}\left[\min_{\phi} \left[\sum_j -\log\left(1- \sigma(f[\mathbf{x_j^*}, \phi])\right) - \sum_i\log\left(\sigma(f[\mathbf{x}_i, \phi])\right)\right]\right]
$$
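
To see the generator's side of this objective in isolation, note that only the first sum depends on $\theta$, because the real examples $\mathbf{x}_i$ do not pass through the generator. The sketch below holds $\phi$ fixed and takes one gradient **ascent** step on that term. It is a 1-D toy under loud assumptions: the generator $x^* = z + \theta$, the linear discriminator $f[x, \phi] = wx + c$, the learning rate, and all numbers are hypothetical and chosen only for illustration.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def softplus(a):
    """log(1 + exp(a)), written so that large |a| does not overflow."""
    return max(a, 0.0) + math.log1p(math.exp(-abs(a)))

def generator_objective(theta, zs, w, c):
    """sum_j -log(1 - sigmoid(f[x_j*, phi])) with x_j* = z_j + theta, f = w*x + c."""
    return sum(softplus(w * (z + theta) + c) for z in zs)

def generator_ascent_step(theta, zs, w, c, lr=0.1):
    """One gradient ascent step on theta with the discriminator (w, c) held fixed."""
    # d/d(theta) softplus(w * (z + theta) + c) = sigmoid(w * (z + theta) + c) * w
    grad = sum(sigmoid(w * (z + theta) + c) * w for z in zs)
    return theta + lr * grad

zs = [-0.5, 0.0, 0.5]   # latent samples
w, c = 1.0, -2.0        # fixed toy discriminator parameters
theta0 = 0.0
theta1 = generator_ascent_step(theta0, zs, w, c)

before = generator_objective(theta0, zs, w, c)
after = generator_objective(theta1, zs, w, c)   # larger: the discriminator's loss went up
```

In actual training the discriminator would then take its own descent step on $\phi$, and the two updates alternate. That alternation is not shown here.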
