--------
### Generative Adversarial Networks

-----

- **Generator network** - Takes a random vector (a random point in the latent space) as input and decodes it into a synthetic image.
- **Discriminator network (or adversary)** - Takes an image (real or synthetic) as input and predicts whether the image came from the training set or was created by the generator network.
- The generator network is trained to fool the discriminator network, so as training goes on it evolves toward generating increasingly realistic images: artificial images that look indistinguishable from real ones, to the extent that the discriminator can no longer tell the two apart.
- The discriminator, meanwhile, keeps adapting to the gradually improving capabilities of the generator, setting a higher bar of realism for the generated images.
- Once training is over, the generator can turn any point in its latent space into a believable image. Unlike VAEs, this latent space has fewer explicit guarantees of meaningful structure; in particular, it isn't continuous.


-----

- In a GAN, the optimization minimum is not fixed.
- Ordinary gradient descent rolls down hills in a static loss landscape.
- With GANs, every step taken down the hill changes the entire landscape a little.
- **It is a dynamic system where the optimization process seeks not a minimum but an equilibrium between two forces. For this reason, GANs are notoriously difficult to train: getting a GAN to work requires lots of careful tuning of the model architecture and training parameters.**
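
To make the moving-landscape idea concrete, here is a toy sketch that is not a GAN: two players share the function f(x, y) = x * y, one lowering it by changing x and the other raising it by changing y. The function, the starting point, and the step size are illustrative assumptions. The only equilibrium is (0, 0), yet simultaneous gradient steps circle around it and drift outward instead of settling.

```python
import math

def simultaneous_steps(x, y, lr, steps):
    """Simultaneous gradient descent on x and ascent on y for f(x, y) = x * y."""
    radii = [math.hypot(x, y)]  # distance from the equilibrium at (0, 0)
    for _ in range(steps):
        grad_x = y  # df/dx
        grad_y = x  # df/dy
        # Both players move at once, each reacting to the other's old position
        x, y = x - lr * grad_x, y + lr * grad_y
        radii.append(math.hypot(x, y))
    return x, y, radii

x, y, radii = simultaneous_steps(x=1.0, y=1.0, lr=0.1, steps=100)
print("distance from equilibrium at start:", radii[0])
print("distance from equilibrium at end:  ", radii[-1])
```

Each player's step changes the slope the other player sees, which is why plain "roll downhill" intuition breaks down here.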

------

***Schematic GAN implementation***

-----

Here we use Conv2DTranspose layers to upsample images in the generator. The sections below define a GAN for the CIFAR10 dataset.

1. A generator network maps vectors of shape (latent_dim,) to images of shape (32,32,3). 
2. A discriminator network maps images of shape (32,32,3) to a binary score estimating the probability that the image is real. 
3. A GAN network chains the above networks together: 
```python
gan(x) = discriminator(generator(x))
```
Thus this gan network maps latent-space vectors to the discriminator's assessment of the realism of these latent vectors as decoded by the generator.
4. You train the discriminator using examples of real and fake images along with "real/fake" labels, just as you train any regular image-classification model.
5. To train the generator, you use the gradients of the gan model's loss with respect to the generator's weights. This means that at every step, you move the weights of the generator in a direction that makes the discriminator more likely to classify as "real" the images created by the generator. Essentially, we train the generator to fool the discriminator.
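
Item 5 can be illustrated with a hand-written toy model, assuming a one-weight generator g(z) = w * z and a fixed logistic discriminator d(x) = sigmoid(a * x + b) whose output is read as the probability that x is real. All names and numbers are made up for illustration, and no Keras is involved. Only w is updated; a and b play the role of the frozen discriminator.

```python
import math

def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))

def generator_loss(w, a, b, zs):
    """Mean of -log d(g(z)): low when the discriminator scores the fakes as real."""
    return sum(-math.log(sigmoid(a * w * z + b)) for z in zs) / len(zs)

def generator_grad(w, a, b, zs):
    """Derivative of generator_loss with respect to the generator weight w."""
    return sum(-(1.0 - sigmoid(a * w * z + b)) * a * z for z in zs) / len(zs)

a, b = 2.0, -1.0          # toy discriminator parameters, never updated here
w = 0.1                   # toy generator weight
zs = [0.5, 1.0, 1.5]      # stand-ins for latent samples
lr = 0.1

loss_before = generator_loss(w, a, b, zs)
w = w - lr * generator_grad(w, a, b, zs)   # one gradient step on the generator only
loss_after = generator_loss(w, a, b, zs)
print("generator loss before:", loss_before)
print("generator loss after: ", loss_after)
```

The step lowers the generator's loss, meaning the unchanged discriminator now rates the generated samples as slightly more "real".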


--------

***Some tips before jumping into implementation***

------
- We use tanh as the last activation in the generator, instead of sigmoid, which is more commonly found in other types of models.
- We sample points from the latent space using a normal (Gaussian) distribution, not a uniform distribution.
- Stochasticity helps induce robustness. Because GAN training results in a dynamic equilibrium, GANs are likely to get stuck in all sorts of ways. **Introducing randomness during training helps prevent this. We introduce randomness in two ways: by using dropout in the discriminator and by adding random noise to the labels for the discriminator.**
- Sparse gradients can hinder GAN training. In deep learning, sparsity is often a desirable property, but not in GANs. Two things can induce gradient sparsity: max pooling and ReLU activations. **Instead of max pooling, we recommend strided convolutions for downsampling, and we recommend a LeakyReLU layer instead of ReLU. It is similar to ReLU but relaxes the sparsity constraint by allowing small negative activation values.**
- In generated images, it's common to see checkerboard artifacts caused by unequal coverage of the pixel space in the generator. To fix this, **we use a kernel size that's divisible by the stride size** whenever we use a strided Conv2DTranspose or Conv2D, in both the generator and the discriminator.
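
The kernel-size tip can be checked with simple counting. The sketch below assumes a 1-D transposed convolution without padding and counts how many kernel taps land on each output position; coverage and interior are hypothetical helper names, not Keras functions. When the kernel size is a multiple of the stride, the interior is covered evenly; otherwise the count alternates, which is the pattern behind checkerboard artifacts.

```python
def coverage(input_len, kernel_size, stride):
    """Number of kernel taps that contribute to each output position."""
    output_len = (input_len - 1) * stride + kernel_size
    counts = [0] * output_len
    for i in range(input_len):
        for j in range(kernel_size):
            counts[i * stride + j] += 1
    return counts

def interior(counts, kernel_size):
    """Drop kernel_size - 1 positions at each end to ignore border effects."""
    return counts[kernel_size - 1 : len(counts) - kernel_size + 1]

even = interior(coverage(input_len=8, kernel_size=4, stride=2), kernel_size=4)
uneven = interior(coverage(input_len=8, kernel_size=3, stride=2), kernel_size=3)
print("kernel 4, stride 2:", even)
print("kernel 3, stride 2:", uneven)
```

With kernel 4 and stride 2 (the setting used for the strided layers in this notebook) every interior position receives the same number of contributions, while kernel 3 and stride 2 alternates between one and two.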

----------

***Let's move to the implementation***

---------

***The generator Network***

--------

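A common failure mode is that the generator gets stuck producing images that look like noise. **A possible remedy is to use dropout in both the generator and the discriminator networks.** Note that the generator defined below contains no Dropout layer; only the discriminator does.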

In [1]:
import keras
from keras import layers
import numpy as np 

Using TensorFlow backend.


In [2]:
latent_dim = 32
height = 32
width = 32
channels = 3

In [3]:
generator_input = layers.Input(shape = (latent_dim,))

### TRANSFORMS THE INPUT TO A 16X16, 128 CHANNEL FEATURE MAP
x = layers.Dense(128*16*16)(generator_input)
x = layers.LeakyReLU()(x)
x = layers.Reshape((16,16,128))(x)

x = layers.Conv2D(256, 5, padding = 'same')(x)
x = layers.LeakyReLU()(x)

### UPSAMPLES TO 32*32
x = layers.Conv2DTranspose(256, 4, strides=2, padding='same')(x)
x = layers.LeakyReLU()(x)

x = layers.Conv2D(256, 5, padding = 'same')(x)
x = layers.LeakyReLU()(x)
x = layers.Conv2D(256, 5, padding = 'same')(x)
x = layers.LeakyReLU()(x)

x = layers.Conv2D(channels, 7, activation = 'tanh', padding = 'same')(x)

In [4]:
generator = keras.models.Model(generator_input, x)

In [5]:
generator.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         (None, 32)                0         
_________________________________________________________________
dense_1 (Dense)              (None, 32768)             1081344   
_________________________________________________________________
leaky_re_lu_1 (LeakyReLU)    (None, 32768)             0         
_________________________________________________________________
reshape_1 (Reshape)          (None, 16, 16, 128)       0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 16, 16, 256)       819456    
_________________________________________________________________
leaky_re_lu_2 (LeakyReLU)    (None, 16, 16, 256)       0         
_________________________________________________________________
conv2d_transpose_1 (Conv2DTr (None, 32, 32, 256)       1048832   
__________
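
The parameter counts in the summary can be reproduced by hand, which is a useful check that the layers are wired as intended. The helper names below are hypothetical; the formulas are the usual weights-plus-biases counts for dense and convolutional layers (a transposed convolution has the same count as a regular one with the same kernel and channel sizes).

```python
def dense_params(inputs, units):
    return inputs * units + units  # weight matrix plus one bias per unit

def conv_params(kernel, in_channels, filters):
    return kernel * kernel * in_channels * filters + filters  # square kernel plus biases

latent_dim, channels = 32, 3
generator_layers = [
    ("dense", dense_params(latent_dim, 128 * 16 * 16)),
    ("conv2d 5x5", conv_params(5, 128, 256)),
    ("conv2d_transpose 4x4", conv_params(4, 256, 256)),
    ("conv2d 5x5", conv_params(5, 256, 256)),
    ("conv2d 5x5", conv_params(5, 256, 256)),
    ("conv2d 7x7", conv_params(7, 256, channels)),
]
for name, count in generator_layers:
    print(name, count)
generator_total = sum(count for _, count in generator_layers)
print("total", generator_total)
```

The first three counts match the rows of the summary above, and the total matches the 6,264,579 parameters reported for the generator (model_1) in the gan summary.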

----

***Discriminator Network***

----


In [6]:
discriminator_input = layers.Input(shape=(height, width,channels))

x = layers.Conv2D(128,3)(discriminator_input)
x = layers.LeakyReLU()(x)
x = layers.Conv2D(128, 4, strides=2)(x)
x = layers.LeakyReLU()(x)
x = layers.Conv2D(128, 4, strides=2)(x)
x = layers.LeakyReLU()(x)
x = layers.Conv2D(128, 4, strides=2)(x)
x = layers.LeakyReLU()(x)
x = layers.Flatten()(x)

x = layers.Dropout(0.4)(x)
x = layers.Dense(1, activation = 'sigmoid')(x)

Instructions for updating:
keep_dims is deprecated, use keepdims instead


In [7]:
discriminator = keras.models.Model(discriminator_input, x)

In [8]:
discriminator.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_2 (InputLayer)         (None, 32, 32, 3)         0         
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 30, 30, 128)       3584      
_________________________________________________________________
leaky_re_lu_6 (LeakyReLU)    (None, 30, 30, 128)       0         
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 14, 14, 128)       262272    
_________________________________________________________________
leaky_re_lu_7 (LeakyReLU)    (None, 14, 14, 128)       0         
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 6, 6, 128)         262272    
_________________________________________________________________
leaky_re_lu_8 (LeakyReLU)    (None, 6, 6, 128)         0         
__________
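
The shrinking spatial sizes in the summary follow from the default 'valid' padding, where a convolution with kernel k and stride s maps a side of length n to floor((n - k) / s) + 1. A small sketch with hypothetical helper names that reproduces the shapes and the parameter total:

```python
def valid_conv_size(n, kernel, stride=1):
    """Output side length of a convolution with 'valid' (no) padding."""
    return (n - kernel) // stride + 1

def conv_params(kernel, in_channels, filters):
    return kernel * kernel * in_channels * filters + filters

size, in_channels, total = 32, 3, 0
sizes = []
# (kernel, stride) for the four Conv2D layers of the discriminator, 128 filters each
for kernel, stride in [(3, 1), (4, 2), (4, 2), (4, 2)]:
    size = valid_conv_size(size, kernel, stride)
    total += conv_params(kernel, in_channels, 128)
    in_channels = 128
    sizes.append(size)

flat = size * size * 128   # length of the vector after Flatten
total += flat + 1          # Dense(1): one weight per input plus a bias
print("side lengths:", sizes)
print("flattened length:", flat)
print("total parameters:", total)
```

The total agrees with the 790,913 parameters reported for the discriminator (model_2) in the gan summary.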

In [9]:
### To stabilize training, use learning rate decay
### Use gradient clipping (by value)
discriminator_optimizer = keras.optimizers.RMSprop(
    lr = 0.0008,
    clipvalue = 1.0,
    decay=1e-8
)
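
As a rough picture of what the two stabilizing arguments do, the sketch below assumes the time-based schedule used by the legacy Keras 2 optimizers, lr / (1 + decay * iterations), and element-wise clipping to [-clipvalue, clipvalue]. Newer Keras releases treat these arguments differently, so take the formula as an assumption about the version used here; the function names are hypothetical.

```python
def decayed_lr(lr, decay, iterations):
    """Time-based decay: the step size shrinks slowly as updates accumulate."""
    return lr / (1.0 + decay * iterations)

def clip_by_value(gradients, clipvalue):
    """Clamp every gradient component to the range [-clipvalue, clipvalue]."""
    return [max(-clipvalue, min(clipvalue, g)) for g in gradients]

print(decayed_lr(0.0008, 1e-8, 0))        # learning rate at the first update
print(decayed_lr(0.0008, 1e-8, 10000))    # learning rate after 10000 updates
print(clip_by_value([-3.0, 0.5, 2.0], clipvalue=1.0))
```

Under this assumed schedule, a decay of 1e-8 changes the learning rate by only about 0.01% over 10000 updates.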

In [10]:
discriminator.compile(optimizer = discriminator_optimizer, 
                     loss = 'binary_crossentropy')


---------

***The adversarial Network***

-------

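**Now we set up the GAN, which chains the generator and the discriminator. The model turns latent-space points into a classification decision, 'fake' or 'real', and it is meant to be trained with labels that always say "these are real images".**

 - It is important to freeze the discriminator while training the gan model: its weights won't be updated when gan is trained. If they could be updated in this step, we would be training the discriminator to always predict "real", which is not what we want.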

In [11]:
discriminator.trainable = False

gan_input = layers.Input(shape = (latent_dim,))
gan_output = discriminator(generator(gan_input))

In [12]:
gan = keras.models.Model(gan_input, gan_output)

In [13]:
gan.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_3 (InputLayer)         (None, 32)                0         
_________________________________________________________________
model_1 (Model)              (None, 32, 32, 3)         6264579   
_________________________________________________________________
model_2 (Model)              (None, 1)                 790913    
Total params: 7,055,492.0
Trainable params: 7,055,492.0
Non-trainable params: 0.0
_________________________________________________________________


In [14]:
gan_optimizer = keras.optimizers.RMSprop(lr = 0.0004, clipvalue = 1.0, decay = 1e-8)

In [15]:
gan.compile(gan_optimizer, loss = 'binary_crossentropy')

----

***Training the DCGAN***

----

At each step of the training loop we do the following:
1. Draw random points from the latent space.
2. Generate images with the generator network by feeding it these random points.
3. Mix the generated images with real images.
4. Train the discriminator on these mixed images, with the corresponding targets: 'real' for real images and 'fake' for generated images.
5. Draw new random points from the latent space.
6. Train gan on these new points, with targets that all say 'real'. This updates only the generator's weights (the discriminator is frozen inside gan).
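
The real images for step 3 come from walking through the training array in consecutive slices and wrapping around near the end. The sketch below isolates that index logic as it is written in the training loop of this notebook; batch_bounds is a hypothetical helper name, and the sizes are small made-up numbers.

```python
def batch_bounds(num_samples, batch_size, steps):
    """Return the (start, stop) slice bounds used at each training step."""
    bounds = []
    start = 0
    for _ in range(steps):
        stop = start + batch_size
        bounds.append((start, stop))
        start += batch_size
        # Wrap around once a full batch no longer fits
        if start > num_samples - batch_size:
            start = 0
    return bounds

print(batch_bounds(num_samples=50, batch_size=20, steps=5))
```

With 50 samples and batches of 20, the last 10 samples are never visited. CIFAR10 has 5000 training images per class, so with a batch size of 20 nothing is left over.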

In [16]:
import os
from keras.preprocessing import image

In [17]:
(x_train, y_train), (_,_) = keras.datasets.cifar10.load_data()

In [18]:
### Selecting only frog images (class index 6 in CIFAR10)
x_train = x_train[y_train.flatten() == 6]

In [19]:
### Data Normalization
x_train = x_train.reshape((x_train.shape[0],) + (height, width,channels)).astype('float32')/255.
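
Dividing by 255 puts the real images in the range [0, 1], whereas a tanh output layer produces values in [-1, 1]. A common alternative, which this notebook does not use, is to scale real images to [-1, 1] so that both kinds of input share a range. The helper names below are hypothetical, and the same arithmetic also works element-wise on NumPy arrays.

```python
def to_tanh_range(pixel):
    """Map a pixel value from [0, 255] to [-1, 1]."""
    return pixel / 127.5 - 1.0

def from_tanh_range(value):
    """Map a value from [-1, 1] back to [0, 255], e.g. before saving an image."""
    return (value + 1.0) * 127.5

for pixel in (0, 127.5, 255):
    scaled = to_tanh_range(pixel)
    print(pixel, "->", scaled, "->", from_tanh_range(scaled))
```

If this scaling were adopted, the image-saving code would need the inverse mapping instead of a plain multiplication by 255.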

In [20]:
iterations = 10000
batch_size = 20
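save_dir = 'gan/'
### Create the output directory up front so the saves inside the loop do not fail
os.makedirs(save_dir, exist_ok=True)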

In [22]:
start = 0
for step in range(iterations):
    ### Generating random latent samples from Normal Distribution 
    random_latent_vectors = np.random.normal(size = (batch_size, latent_dim))
    
    ### Generator generates fake images
    generated_images = generator.predict(random_latent_vectors)
    
    ### Sampling real images from training data
    stop = start + batch_size
    real_images = x_train[start:stop]
    
    ### Creating dataset to train Discriminator
    combined_images = np.concatenate([generated_images, real_images])
    ### Label convention in this loop: 1 = generated (fake), 0 = real
    labels = np.concatenate([np.ones((batch_size, 1)), np.zeros((batch_size, 1))])
    ### Add noise to labels(Important)
    labels += 0.05*np.random.random(labels.shape)
    
    ### Train the discriminator
    d_loss = discriminator.train_on_batch(combined_images, labels)
    
    ###############################################################################
    
    random_latent_vectors = np.random.normal(size = (batch_size, latent_dim))
    ### Targets that claim "these are all real images" (0 = real), which is deliberately untrue
    misleading_targets = np.zeros((batch_size,1))
    ### Train generator (Here the discriminator is frozen)
    a_loss = gan.train_on_batch(random_latent_vectors, misleading_targets)
    
    
    #############################################################################
    
    start += batch_size
    if start > len(x_train) - batch_size:
        start = 0
    
    if step % 100 == 0:
        gan.save_weights('gan/gan.h5')
        print('discriminator loss:', d_loss)
        print('adverserial loss:', a_loss)
        
        img = image.array_to_img(generated_images[0]*255. , scale = False)
        img.save(os.path.join(save_dir, 'gen' + str(step) + '.png'))
        
        img = image.array_to_img(real_images[0]*255. , scale = False)
        img.save(os.path.join(save_dir, 'real' + str(step) + '.png'))

discriminator loss: 0.67374575
adverserial loss: 0.6432811
discriminator loss: 0.68710434
adverserial loss: 0.8233681
discriminator loss: 0.6849948
adverserial loss: 0.78601927
discriminator loss: 0.6934394
adverserial loss: 0.7108942
discriminator loss: 0.6998938
adverserial loss: 0.7576715
discriminator loss: 0.69417727
adverserial loss: 0.7376522
discriminator loss: 0.6934203
adverserial loss: 0.75727564
discriminator loss: 0.67730457
adverserial loss: 0.9952731
discriminator loss: 0.68475705
adverserial loss: 0.7456008
discriminator loss: 0.6937794
adverserial loss: 0.78125745
discriminator loss: 0.7264334
adverserial loss: 0.74640864
discriminator loss: 0.7057308
adverserial loss: 0.7664032
discriminator loss: 0.69312286
adverserial loss: 0.7578683
discriminator loss: 0.7026157
adverserial loss: 0.73570263
discriminator loss: 0.6921697
adverserial loss: 0.7253946
discriminator loss: 0.70758694
adverserial loss: 0.76370525
discriminator loss: 0.6841394
adverserial loss: 0.7474088
d

When training, you may see the adversarial loss begin to increase considerably while the discriminator loss tends to zero: the discriminator may end up dominating the generator. If that happens, try reducing the discriminator's learning rate and increasing its dropout rate.