## Difference
### Loss Function 
### GAN
- $D$: $\max_D V(D) = E_{x\sim p_{data}(x)}[\log D(x)] + E_{z\sim p_z(z)}[\log(1-D(G(z)))]$
- $G$: $\min_G V(G) = E_{z\sim p_z(z)}[\log(1-D(G(z)))]$

### WGAN
- $D$: $\max_D V(D) = E_{x\sim p_{data}}[D(x)] - E_{z\sim p_z(z)}[D(G(z))]$
- $G$: $\max_G V(G) = E_{z\sim p_z(z)}[D(G(z))]$
                                        
### WGAN-GP
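- $D$: $\max_D V(D) = E_{x\sim p_{data}}[D(x)] - E_{z\sim p_z(z)}[D(G(z))] - \lambda E_{\hat{x}}[(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_{2}-1)^2]$, where $\hat{x}$ is sampled uniformly along the straight line between a real sample $x$ and a generated sample $G(z)$. The penalty is subtracted because $D$ maximizes $V(D)$; in the equivalent loss to minimize, it is added.
- $G$: $\max_G V(G) = E_{z\sim p_z(z)}[D(G(z))]$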

| Features                      |   GAN   |  WGAN   | WGAN-GP |
| ----------------------------- |:-------:|:-------:|:-------:|
| Output layer of discriminator | Sigmoid | Linear  | Linear  |
| Optimizer                     | Adam    | RMSProp | Adam    |
| Weight clipping               | False   | True    | False   |
| Batch normalization           | False   | True    | False   |
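
To make the difference between the objectives concrete, here is a minimal NumPy sketch that evaluates them as losses to minimize (the negated objectives) on a handful of made-up discriminator outputs. The function names and the toy values are assumptions for illustration only; they are not part of the notebook's TensorFlow graph.

```python
import numpy as np

def gan_d_loss(real_logits, fake_logits):
    """Negated GAN discriminator objective, computed from raw logits."""
    d_real = 1.0 / (1.0 + np.exp(-real_logits))  # sigmoid output layer
    d_fake = 1.0 / (1.0 + np.exp(-fake_logits))
    return -(np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake)))

def wgan_d_loss(real_scores, fake_scores):
    """Negated WGAN critic objective, computed from linear outputs."""
    return -(np.mean(real_scores) - np.mean(fake_scores))

def wgan_g_loss(fake_scores):
    """Negated WGAN generator objective."""
    return -np.mean(fake_scores)

# Hypothetical discriminator outputs for three real and three fake samples
real = np.array([2.0, 1.0, 3.0])
fake = np.array([-1.0, 0.0, -2.0])

print('GAN  D loss:', gan_d_loss(real, fake))
print('WGAN D loss:', wgan_d_loss(real, fake))
print('WGAN G loss:', wgan_g_loss(fake))
```

Note that the WGAN losses are unbounded because the output layer is linear, which is why the critic needs a Lipschitz constraint (weight clipping or a gradient penalty).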

In [None]:
import matplotlib
matplotlib.use('TkAgg')
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf

# Import MNIST data (this loader ships only with TensorFlow 1.x)
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

We pre-selected the hyperparameters for you this time. You will usually need to tune them yourself.

In [None]:
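# Training Params
num_steps = 100000  # number of training iterations
batch_size = 50
learning_rate = 0.0002
Iters = 5  # discriminator updates per generator update
# c = 0.01  # weight-clipping bound (WGAN only)
_lambda = 10  # gradient penalty coefficient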

The hidden dimensions of the generator and the discriminator are also prespecified.

In [None]:
# Network Params
image_dim = 784 # 28*28 pixels
gen_hidden_dim = 256
disc_hidden_dim = 256
noise_dim = 128 # dimension of the noise vector z

In [None]:
tf.reset_default_graph() # Clearing all tensors before this

### 1. Implement generator and discriminator

In [None]:
# Generator
def generator(noises, reuse=False):
    with tf.variable_scope('generator') as scope:
        if (reuse):
            tf.get_variable_scope().reuse_variables()
        # hidden layer with name "g_hidden"
        hidden = tf.layers.dense(noises, gen_hidden_dim, tf.nn.relu, name='g_hidden')
        # out layer with name "g_out"
        out_images = tf.layers.dense(hidden, image_dim, tf.nn.sigmoid, name='g_out')
    return out_images

# Discriminator
def discriminator(images, reuse=False):
    with tf.variable_scope('discriminator') as scope:
        if (reuse):
            tf.get_variable_scope().reuse_variables()            
        # hidden layer with name "d_hidden"
        hidden = tf.layers.dense(images, disc_hidden_dim, tf.nn.relu, name='d_hidden')
        # out layer with name "d_out"; linear (no sigmoid), so it outputs a score, not a probability
        out_score = tf.layers.dense(hidden, 1, None, name='d_out')
    return out_score

### 2. Define the inputs to generator and discriminator.

In [None]:
gen_input = tf.placeholder(tf.float32, shape=[None, noise_dim], name='input_noise')
disc_input = tf.placeholder(tf.float32, shape=[None, image_dim], name='disc_input')

### 3. Input noise to G and generate images.
This should be a one-liner.

In [None]:
gen_sample = generator(gen_input)

### 3. Input real and fake images to D and get predictions.
$D$ takes two inputs: real data and fake data, where the fake data is the output of $G$. For the second call, set `reuse=True`. Both calls must share the same discriminator weights, so the second call has to reuse the variables that the first call created inside the `discriminator` function instead of creating new ones.

In [None]:
disc_real = discriminator(disc_input)
disc_fake = discriminator(gen_sample, reuse=True)

### 4. Define the objective.

In [None]:
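# Average over the batch so that each loss is a scalar
gen_loss = -tf.reduce_mean(disc_fake)
disc_loss = tf.reduce_mean(disc_fake) - tf.reduce_mean(disc_real)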

In [None]:
# WGAN weight clipping (not used for WGAN-GP):
# clip = [p.assign(tf.clip_by_value(p, -c, c)) for p in disc_vars]
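
The commented-out line above is the weight-clipping step that plain WGAN runs after every discriminator update. As a rough illustration of what it does, here is a NumPy sketch that clamps every parameter to $[-c, c]$. The helper `clip_weights` and the toy parameter arrays are hypothetical; in the notebook the same effect would come from running the `clip` assign ops in the session.

```python
import numpy as np

def clip_weights(params, c):
    """Return copies of all parameter arrays clamped to the range [-c, c]."""
    return [np.clip(p, -c, c) for p in params]

c = 0.01  # clipping bound, same value as the commented-out hyperparameter

# Hypothetical discriminator parameters (a weight matrix and a bias vector)
params = [np.array([[0.5, -0.003], [-0.2, 0.01]]),
          np.array([0.02, -0.02])]

clipped = clip_weights(params, c)
for p in clipped:
    print(p)
```

WGAN-GP drops this step and constrains the critic through the gradient penalty instead.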

In [None]:
# Gradient penalty (WGAN-GP): push the norm of the discriminator's gradient
# towards 1 at random interpolations between real and generated images.
alpha = tf.random_uniform(shape=[tf.shape(disc_input)[0], 1], minval=0., maxval=1.)
differences = gen_sample - disc_input
interpolates = disc_input + (alpha * differences)
gradients = tf.gradients(discriminator(interpolates, reuse=True), [interpolates])[0]
slopes = tf.sqrt(tf.reduce_sum(tf.square(gradients), axis=1))
gradient_penalty = tf.reduce_mean((slopes - 1.) ** 2)
disc_loss += _lambda * gradient_penalty
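
The gradient penalty can be checked by hand on a critic whose gradient is known in closed form. The sketch below is a NumPy illustration under that assumption: it uses a toy critic $D(x) = \frac{1}{2}\lVert x \rVert_2^2$, whose gradient with respect to $x$ is simply $x$. The names `gradient_penalty` and `quadratic_grad` are hypothetical and stand in for what `tf.gradients` computes automatically in the cell above.

```python
import numpy as np

def gradient_penalty(grad_fn, real, fake, rng):
    """Mean of (||grad D(x_hat)||_2 - 1)^2 over random interpolates x_hat."""
    # One interpolation coefficient per sample, shared across features
    alpha = rng.uniform(0.0, 1.0, size=(real.shape[0], 1))
    interpolates = real + alpha * (fake - real)
    gradients = grad_fn(interpolates)  # shape: (batch, features)
    slopes = np.sqrt(np.sum(gradients ** 2, axis=1))
    return float(np.mean((slopes - 1.0) ** 2))

def quadratic_grad(x):
    """Gradient of the toy critic D(x) = 0.5 * ||x||^2, which is x itself."""
    return x

rng = np.random.default_rng(0)

# Identical real and fake batches, so every interpolate equals the input
unit = np.array([[1.0, 0.0], [0.0, 1.0]])  # gradient norm 1
big = np.array([[3.0, 4.0]])               # gradient norm 5

print(gradient_penalty(quadratic_grad, unit, unit, rng))
print(gradient_penalty(quadratic_grad, big, big, rng))
```

The penalty vanishes only where the gradient norm is exactly 1 and grows quadratically otherwise, which is what pushes the critic towards being 1-Lipschitz along the interpolation lines.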

### 5. Minimize (or maximize) the objective.

In [None]:
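tvars = tf.trainable_variables()
# Select each network's variables by its enclosing variable scope
disc_vars = [var for var in tvars if var.name.startswith('discriminator/')]
gen_vars = [var for var in tvars if var.name.startswith('generator/')]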

In [None]:
optimizer_gen = tf.train.AdamOptimizer(learning_rate=learning_rate)
optimizer_disc = tf.train.AdamOptimizer(learning_rate=learning_rate)

In [None]:
train_gen = optimizer_gen.minimize(gen_loss, var_list=gen_vars)
train_disc = optimizer_disc.minimize(disc_loss, var_list=disc_vars)

### 6. Train the model.
For each iteration, take a batch of MNIST images and sample prior noise $z$ with `np.random.uniform(-1., 1., size=[batch_size, noise_dim])`. Feed both to the model and run the training ops to optimize the objectives.
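
As a quick sanity check on the noise sampling described above, the following sketch draws one batch of prior noise and inspects its shape and range. The helper `sample_noise` is a hypothetical wrapper introduced only for this example; the values of `batch_size` and `noise_dim` are copied from the parameter cells.

```python
import numpy as np

batch_size = 50
noise_dim = 128

def sample_noise(batch_size, noise_dim):
    """Draw a batch of prior noise vectors, uniform on [-1, 1)."""
    return np.random.uniform(-1., 1., size=[batch_size, noise_dim])

z = sample_noise(batch_size, noise_dim)
print(z.shape)           # one noise vector per image in the batch
print(z.min(), z.max())  # all values lie in [-1, 1)
```

The first dimension of `z` has to match the number of real images fed in the same step, because the gradient penalty interpolates between real and generated samples element by element.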

After training for a while, feed fresh noise to the generator and plot its output $x$ with matplotlib. We have prepared this code for you, but read through it to make sure you understand it, and rename any variables that differ from yours.

In [None]:
with tf.Session() as sess:

    # Run the initializer
    sess.run(tf.global_variables_initializer())
    
    for step in range(1, num_steps+1):

        batch_x, _ = mnist.train.next_batch(batch_size)
        # Generate noise to feed to the generator
        z = np.random.uniform(-1., 1., size=[batch_size, noise_dim])
        # Train
        feed_dict = {disc_input: batch_x, gen_input: z}
        _, gl = sess.run([train_gen, gen_loss], feed_dict=feed_dict)

        # Update the discriminator `Iters` times per generator update
        for i in range(Iters):
            _, dl = sess.run([train_disc, disc_loss], feed_dict=feed_dict)
        
        if step % 1000 == 0 or step == 1:
            print('Step %i: Generator Loss: %f, Discriminator Loss: %f' % (step, gl, dl))
    
        # Generate images from noise, using the generator network.
        if step % 10000 == 0 or step == 1:
            f, a = plt.subplots(4, 10, figsize=(10, 4))
            for i in range(10):
                # Noise input.
                z = np.random.uniform(-1., 1., size=[4, noise_dim])
                g = sess.run([gen_sample], feed_dict={gen_input: z})
                g = np.reshape(g, newshape=(4, 28, 28, 1))
                # Reverse colours for better display
                g = -1 * (g - 1)
                for j in range(4):
                    # Generate image from noise. Extend to 3 channels for matplot figure.
                    img = np.reshape(np.repeat(g[j][:, :, np.newaxis], 3, axis=2),
                                     newshape=(28, 28, 3))
                    a[j][i].imshow(img)

            plt.draw()
            print('gan'+str(step)+'.png')
            plt.savefig('gan'+str(step)+'.png')
    print('Done')