# 9. GAN
This section is an exercise. Perhaps surprisingly, you can build a GAN fairly easily using only the concepts we have learned so far.

## Preparation
Read Sections 1, 2, and 3 of the original [GAN paper](https://arxiv.org/pdf/1406.2661.pdf). Then follow the 7 steps below to implement a GAN.

To summarize, we have the following problem setup:
- $x$: data with distribution $p_{data}$
- $p_g$: distribution learned by the generator
- $z$: prior input noise variables
- $p_z$: prior of $z$
- $G(z;\theta_G)$: generator neural network with parameter $\theta_G$
- $D(x;\theta_D)$: discriminator neural network with parameter $\theta_D$

The goals of $D$ and $G$ are the following:
- $D$: $\max_D V(D) = E_{x\sim p_{data}(x)}[\log D(x)] + E_{z\sim p_z(z)}[\log(1-D(G(z)))]$
- $G$: $\min_G V(G) = E_{z\sim p_z(z)}[\log(1-D(G(z)))]$
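
The two goals above are the two halves of the single minimax game that the paper states as its Equation 1. Written in this section's notation (the combined symbol $V(D, G)$ comes from the paper, not from this notebook):

$$\min_G \max_D V(D, G) = E_{x\sim p_{data}(x)}[\log D(x)] + E_{z\sim p_z(z)}[\log(1-D(G(z)))]$$

The first term does not depend on $G$, which is why it drops out of the generator's objective.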

In [1]:
import matplotlib
matplotlib.use('TkAgg')
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf

# Import MNIST data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz


We pre-selected the hyperparameters for you this time. Usually you need to tune these yourself.

In [2]:
# Training Params
num_steps = 500000
batch_size = 128
learning_rate = 0.0002

The hidden dimensions of the generator and the discriminator are also prespecified.

In [14]:
# Network Params
image_dim = 784 # 28*28 pixels
gen_hidden_dim = 256
disc_hidden_dim = 256
noise_dim = 100 # Dimension of the noise vector z

In [4]:
tf.reset_default_graph() # Clearing all tensors before this

### 1. Implement generator and discriminator.

Let both G and D be fully connected neural networks with one hidden layer. Use ReLU as the activation function for the hidden layer. For the output layer, you should know what to use :)

Since G and D are both fully connected networks with one hidden layer, each of them needs
- weights for the hidden layer and the output layer
- biases for the hidden layer and the output layer

Use `tf.layers.dense`, which creates the weights and biases of a layer for you.

Be careful about the dimensions of the layers:
- $G$ takes in noise and generates an image.
- $D$ takes in an image and outputs a probability of the image being real.
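
To make the dimensions concrete, here is a minimal NumPy sketch of the shapes flowing through both networks. It is not the TensorFlow solution: `dense` is a hypothetical stand-in for a fully connected layer, the weights are random, and the sigmoid output activations are an assumption (MNIST pixels loaded this way lie in [0, 1], and D should output a probability).

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(x, in_dim, out_dim):
    """Hypothetical stand-in for a fully connected layer: x @ W + b."""
    W = rng.normal(scale=0.02, size=(in_dim, out_dim))
    b = np.zeros(out_dim)
    return x @ W + b

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

batch_size, noise_dim, hidden_dim, image_dim = 128, 100, 256, 784

# Generator: noise -> hidden layer -> image
z = rng.uniform(-1.0, 1.0, size=(batch_size, noise_dim))
g_hidden = relu(dense(z, noise_dim, hidden_dim))
fake_images = sigmoid(dense(g_hidden, hidden_dim, image_dim))  # assumed sigmoid output

# Discriminator: image -> hidden layer -> probability of being real
d_hidden = relu(dense(fake_images, image_dim, hidden_dim))
prob_real = sigmoid(dense(d_hidden, hidden_dim, 1))            # assumed sigmoid output

print(fake_images.shape)  # (128, 784)
print(prob_real.shape)    # (128, 1)
```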

In [5]:
# Generator
def generator(noises, reuse=False):
    with tf.variable_scope('generator') as scope:
        if (reuse):
            tf.get_variable_scope().reuse_variables()
        # TODO hidden layer with name "g_hidden"
        # TODO out layer with name "g_out"
    return out_images

# Discriminator
def discriminator(images, reuse=False):
    with tf.variable_scope('discriminator') as scope:
        if (reuse):
            tf.get_variable_scope().reuse_variables()
        # TODO hidden layer with name "d_hidden"
        # TODO out layer with name "d_out"
    return out_prob

### 2. Define the inputs to generator and discriminator.
- Input to G: (batch size) $\times$ ??
- Input to D: (batch size) $\times$ ??

Think about what ?? should be.

### 3. Generate images with G, then get predictions from D.
Input noise to G to generate images. This should be a one-liner.

For D, you should have two inputs: real data and fake data, where the fake data is the output of $G$. Call `discriminator` on the real data first, then call it again on the fake data with `reuse=True`. I won't go into detail, but basically both calls must use the same variables inside the `discriminator` function above, so the second call has to reuse the variables created by the first instead of creating new ones.

### 4. Define the objective.
Approximate each expectation by its sample mean over the batch. As a reminder, the objectives are:
- $D$: $\max_D V(D) = E_{x\sim p_{data}(x)}[\log D(x)] + E_{z\sim p_z(z)}[\log(1-D(G(z)))]$
- $G$: $\min_G V(G) = E_{z\sim p_z(z)}[\log(1-D(G(z)))]$
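
As a rough NumPy sketch of what "sample mean" means here, the functions below estimate both objectives from a batch of discriminator outputs. The function names and the example probabilities are made up for illustration; in the exercise the same arithmetic would be expressed with TensorFlow ops on the outputs of `discriminator`.

```python
import numpy as np

def discriminator_objective(d_real, d_fake):
    """Sample-mean estimate of V(D); D wants to maximize this."""
    return np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake))

def generator_objective(d_fake):
    """Sample-mean estimate of V(G); G wants to minimize this."""
    return np.mean(np.log(1.0 - d_fake))

# Hypothetical discriminator outputs for a batch of 4 real and 4 fake images.
d_real = np.array([0.9, 0.8, 0.95, 0.7])
d_fake = np.array([0.1, 0.3, 0.2, 0.05])

v_d = discriminator_objective(d_real, d_fake)
v_g = generator_objective(d_fake)
print(v_d, v_g)
```

Optimizers minimize by convention, so the loss handed to the discriminator's optimizer would be the negative of `v_d`.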

### 5. Minimize (or maximize) the objective.
The Adam optimizer is recommended. We need two optimizers, one for D and one for G. Be careful to take the gradient only with respect to the variables being optimized, namely:
- $V(D)$: weights and biases of D
- $V(G)$: weights and biases of G

We have provided the code for extracting these variables from the computation graph.

In [10]:
tvars = tf.trainable_variables()
disc_vars = [var for var in tvars if 'd_' in var.name]
gen_vars = [var for var in tvars if 'g_' in var.name]
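
To see why the substring filter separates the two networks, here it is applied to plain strings. The names below are an assumption: they are what `tf.layers.dense` would typically produce given the scopes and layer names suggested in step 1, so check `var.name` in your own graph.

```python
# Hypothetical variable names, assuming the scopes and layer names from step 1.
names = [
    "generator/g_hidden/kernel:0",
    "generator/g_hidden/bias:0",
    "generator/g_out/kernel:0",
    "generator/g_out/bias:0",
    "discriminator/d_hidden/kernel:0",
    "discriminator/d_hidden/bias:0",
    "discriminator/d_out/kernel:0",
    "discriminator/d_out/bias:0",
]

disc_names = [n for n in names if "d_" in n]
gen_names = [n for n in names if "g_" in n]

print(len(disc_names), len(gen_names))  # 4 4
```

The filter depends entirely on the naming convention: a generator layer called `bad_layer`, for example, would also match `'d_'`. Filtering on the scope prefix (for example `name.startswith('discriminator/')`) is a sturdier alternative if you choose different layer names.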

In [12]:
# TODO

### 6. Train the model.
For each iteration, take a batch of MNIST images. Generate prior noise $z$ with `np.random.uniform(-1., 1., size=[batch_size, noise_dim])`. Feed the batch and the noise to the model to update the parameters of D and G.
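
A minimal sketch of the two inputs for one iteration, using random data in place of MNIST so that it runs on its own. With the `mnist` object loaded earlier, the image batch would typically come from `mnist.train.next_batch(batch_size)` instead; the `images` array and the index sampling here are stand-ins.

```python
import numpy as np

batch_size, noise_dim, image_dim = 128, 100, 784

# Stand-in for the MNIST training images: random values in [0, 1).
images = np.random.rand(1000, image_dim)

# Take a random mini-batch of images (labels are not needed to train the GAN).
idx = np.random.choice(len(images), size=batch_size, replace=False)
batch_x = images[idx]

# Sample the prior noise z ~ Uniform(-1, 1), one row per image in the batch.
z = np.random.uniform(-1., 1., size=[batch_size, noise_dim])

print(batch_x.shape, z.shape)  # (128, 784) (128, 100)
```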

Every 10,000 steps, sample fresh noise, pass it through the generator, and plot the resulting images with matplotlib. We prepared this code for you, but read through it to understand it, and change the variable names if yours are different.

In [13]:
with tf.Session() as sess:

    # Run the initializer
    sess.run(tf.global_variables_initializer())
    
    for step in range(1, num_steps+1):

        #####
        # TODO
        # Get the next batch of MNIST data (only images are needed, not labels)
        # Generate noise to feed to the generator
        # Train, storing the generator and discriminator losses in `gl` and `dl`
        #####
        
        if step % 1000 == 0 or step == 1:
            print('Step %i: Generator Loss: %f, Discriminator Loss: %f' % (step, gl, dl))
    
        # Generate images from noise, using the generator network.
        # TODO Change the variable names if they are different from yours.
        if step % 10000 == 0 or step == 1:
            f, a = plt.subplots(4, 10, figsize=(10, 4))
            for i in range(10):
                # Noise input.
                z = np.random.uniform(-1., 1., size=[4, noise_dim])
                g = sess.run(gen_sample, feed_dict={gen_input: z})
                g = np.reshape(g, (4, 28, 28))
                # Reverse colours for better display
                g = -1 * (g - 1)
                for j in range(4):
                    # Extend the grayscale image to 3 channels for matplotlib.
                    img = np.repeat(g[j][:, :, np.newaxis], 3, axis=2)
                    a[j][i].imshow(img)

            plt.draw()
            print('gan'+str(step)+'.png')
            plt.savefig('gan'+str(step)+'.png')
            plt.close(f)  # Free the figure so figures do not pile up during training

Step 1: Generator Loss: 0.645723, Discriminator Loss: 1.310875
gan1.png
Step 1000: Generator Loss: 4.050825, Discriminator Loss: 0.045055
Step 2000: Generator Loss: 4.184691, Discriminator Loss: 0.053454
Step 3000: Generator Loss: 5.421757, Discriminator Loss: 0.016621
Step 4000: Generator Loss: 3.798069, Discriminator Loss: 0.037094
Step 5000: Generator Loss: 3.864313, Discriminator Loss: 0.095495
Step 6000: Generator Loss: 4.470815, Discriminator Loss: 0.048160
Step 7000: Generator Loss: 4.339087, Discriminator Loss: 0.114698
Step 8000: Generator Loss: 4.121165, Discriminator Loss: 0.077570
Step 9000: Generator Loss: 4.747775, Discriminator Loss: 0.084910
Step 10000: Generator Loss: 5.032366, Discriminator Loss: 0.116188
gan10000.png
Step 11000: Generator Loss: 4.475104, Discriminator Loss: 0.097629
Step 12000: Generator Loss: 4.790847, Discriminator Loss: 0.128775
Step 13000: Generator Loss: 5.186819, Discriminator Loss: 0.101345
Step 14000: Generator Loss: 5.026080, Discriminator L

Step 132000: Generator Loss: nan, Discriminator Loss: nan
Step 133000: Generator Loss: nan, Discriminator Loss: nan
Step 134000: Generator Loss: nan, Discriminator Loss: nan
Step 135000: Generator Loss: nan, Discriminator Loss: nan
Step 136000: Generator Loss: nan, Discriminator Loss: nan
Step 137000: Generator Loss: nan, Discriminator Loss: nan
Step 138000: Generator Loss: nan, Discriminator Loss: nan
Step 139000: Generator Loss: nan, Discriminator Loss: nan
Step 140000: Generator Loss: nan, Discriminator Loss: nan
gan140000.png
Step 141000: Generator Loss: nan, Discriminator Loss: nan
Step 142000: Generator Loss: nan, Discriminator Loss: nan
Step 143000: Generator Loss: nan, Discriminator Loss: nan
Step 144000: Generator Loss: nan, Discriminator Loss: nan
Step 145000: Generator Loss: nan, Discriminator Loss: nan
Step 146000: Generator Loss: nan, Discriminator Loss: nan
Step 147000: Generator Loss: nan, Discriminator Loss: nan
Step 148000: Generator Loss: nan, Discriminator Loss: nan




gan200000.png
Step 201000: Generator Loss: nan, Discriminator Loss: nan
Step 202000: Generator Loss: nan, Discriminator Loss: nan
Step 203000: Generator Loss: nan, Discriminator Loss: nan
Step 204000: Generator Loss: nan, Discriminator Loss: nan
Step 205000: Generator Loss: nan, Discriminator Loss: nan
Step 206000: Generator Loss: nan, Discriminator Loss: nan
Step 207000: Generator Loss: nan, Discriminator Loss: nan
Step 208000: Generator Loss: nan, Discriminator Loss: nan
Step 209000: Generator Loss: nan, Discriminator Loss: nan
Step 210000: Generator Loss: nan, Discriminator Loss: nan
gan210000.png
Step 211000: Generator Loss: nan, Discriminator Loss: nan
Step 212000: Generator Loss: nan, Discriminator Loss: nan
Step 213000: Generator Loss: nan, Discriminator Loss: nan
Step 214000: Generator Loss: nan, Discriminator Loss: nan
Step 215000: Generator Loss: nan, Discriminator Loss: nan
Step 216000: Generator Loss: nan, Discriminator Loss: nan
Step 217000: Generator Loss: nan, Discrimina

Step 339000: Generator Loss: nan, Discriminator Loss: nan
Step 340000: Generator Loss: nan, Discriminator Loss: nan
gan340000.png
Step 341000: Generator Loss: nan, Discriminator Loss: nan
Step 342000: Generator Loss: nan, Discriminator Loss: nan
Step 343000: Generator Loss: nan, Discriminator Loss: nan
Step 344000: Generator Loss: nan, Discriminator Loss: nan
Step 345000: Generator Loss: nan, Discriminator Loss: nan
Step 346000: Generator Loss: nan, Discriminator Loss: nan
Step 347000: Generator Loss: nan, Discriminator Loss: nan
Step 348000: Generator Loss: nan, Discriminator Loss: nan
Step 349000: Generator Loss: nan, Discriminator Loss: nan
Step 350000: Generator Loss: nan, Discriminator Loss: nan
gan350000.png
Step 351000: Generator Loss: nan, Discriminator Loss: nan
Step 352000: Generator Loss: nan, Discriminator Loss: nan
Step 353000: Generator Loss: nan, Discriminator Loss: nan
Step 354000: Generator Loss: nan, Discriminator Loss: nan
Step 355000: Generator Loss: nan, Discrimina

Step 477000: Generator Loss: nan, Discriminator Loss: nan
Step 478000: Generator Loss: nan, Discriminator Loss: nan
Step 479000: Generator Loss: nan, Discriminator Loss: nan
Step 480000: Generator Loss: nan, Discriminator Loss: nan
gan480000.png
Step 481000: Generator Loss: nan, Discriminator Loss: nan
Step 482000: Generator Loss: nan, Discriminator Loss: nan
Step 483000: Generator Loss: nan, Discriminator Loss: nan
Step 484000: Generator Loss: nan, Discriminator Loss: nan
Step 485000: Generator Loss: nan, Discriminator Loss: nan
Step 486000: Generator Loss: nan, Discriminator Loss: nan
Step 487000: Generator Loss: nan, Discriminator Loss: nan
Step 488000: Generator Loss: nan, Discriminator Loss: nan
Step 489000: Generator Loss: nan, Discriminator Loss: nan
Step 490000: Generator Loss: nan, Discriminator Loss: nan
gan490000.png
Step 491000: Generator Loss: nan, Discriminator Loss: nan
Step 492000: Generator Loss: nan, Discriminator Loss: nan
Step 493000: Generator Loss: nan, Discrimina
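
The `nan` losses in the log above deserve a comment. The log alone does not show the cause, but one plausible explanation is that a sigmoid output saturates to exactly 0 or 1 in float32, so the loss evaluates $\log(0) = -\infty$ and later updates turn the parameters into `nan`. The NumPy sketch below shows the effect and one common guard, clipping the argument of the logarithm. The value of `eps` is an assumption, not something taken from this notebook.

```python
import numpy as np

# Hypothetical discriminator outputs that have saturated to exactly 0 and 1.
d_fake = np.array([0.0, 1.0, 0.5], dtype=np.float32)

with np.errstate(divide="ignore"):
    naive = np.log(1.0 - d_fake)  # contains -inf where d_fake == 1

eps = 1e-8  # assumed small constant
clipped = np.log(np.clip(1.0 - d_fake, eps, 1.0))  # finite everywhere

print(np.isfinite(naive).all(), np.isfinite(clipped).all())  # False True
```

Another option in TensorFlow is to have D return logits and compute the loss with `tf.nn.sigmoid_cross_entropy_with_logits`, which is designed to be numerically stable.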

### 7. [Optional] Use TensorBoard to check the computation graph and loss.
You might want to read about [variable sharing](https://www.tensorflow.org/versions/r1.1/programmers_guide/variable_scope) and [variable scope](https://stackoverflow.com/questions/35919020/whats-the-difference-of-name-scope-and-a-variable-scope-in-tensorflow).

This might be a bit more involved than the previous steps...

OK, this was probably the hardest section so far since there's less hand-holding. But if you completed this exercise, it means you can build reasonably sophisticated neural network models in TensorFlow! Look back and see how far you've come :)

Check out [this tutorial](https://github.com/tensorflow/models/blob/master/research/gan/tutorial.ipynb) if you'd like to learn more about GANs.

Thanks for completing this workshop. If you liked it, please `star` this repo so that more people can learn about TensorFlow! Feedback is always welcome!
