# Generative Adversarial Networks (GANs)

GANs are an alternative to VAEs for learning latent spaces of images. They enable the generation of fairly realistic synthetic images by forcing the generated images to be statistically almost indistinguishable from real ones.

An intuitive way to understand GANs is to imagine a forger trying to create a fake Picasso painting. At first, the forger is pretty bad at the task. He mixes some of his fakes with authentic Picassos and shows them all to an art dealer. The art dealer makes an authenticity assessment for each painting and gives the forger feedback about what makes a Picasso look like a real Picasso. The forger goes back to his studio to prepare more fakes. As time goes on, the forger becomes increasingly expert at imitating Picasso's style, and the art dealer becomes increasingly expert at spotting fakes.

A GAN is made of two parts:

- *Generator network*: Takes as input a random vector (random point in the latent space), and decodes it into a synthetic image.

- *Discriminator network (or adversary)*: Takes as input an image (real or fake), and predicts whether the image came from the training set or was created by the *generator*.

![](https://cdn-images-1.medium.com/max/1200/1*pHOkZ0HJrUSP827-fZFETg.png)

## Schematic implementation of *Deep Convolutional GAN (DCGAN)*

> In this example we will be using the CIFAR-10 dataset.

1. A `generator` network maps vectors of shape `(latent_dim,)` to images of shape `(32, 32, 3)`.

2. A `discriminator` network maps images to a binary score estimating the probability that the image is real.

3. A `gan` network chains the generator and the discriminator together: `gan(x) = discriminator(generator(x))`. Thus this `gan` network maps latent-space vectors to the discriminator's assessment of the realism of these latent vectors as decoded by the generator.

4. You train the discriminator using examples of real and fake images along with "real" / "fake" labels, just as you train any regular image-classification model.

5. To train the generator, you use the gradients of the `gan` model's loss with regard to the generator's weights. This means that, at every step, you move the weights of the generator in a direction that makes the discriminator more likely to classify as real the images decoded by the generator. In other words, you train the generator to fool the discriminator.
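
Steps 4 and 5 can be sketched as a single training iteration. This is a minimal sketch rather than a definitive procedure: `train_step` is a hypothetical helper name, the three models are assumed to be compiled Keras models exposing `predict` and `train_on_batch`, and the labels follow the convention from step 2 (1 = real, 0 = fake). It also assumes the discriminator is frozen inside `gan`, so the second update only changes the generator's weights.

```python
import numpy as np


def train_step(generator, discriminator, gan, real_images, latent_dim, rng):
    """Run one discriminator update and one generator update (hypothetical helper)."""
    batch_size = real_images.shape[0]

    # Step 4: train the discriminator on a mixed batch of fake and real images.
    latent_vectors = rng.normal(size=(batch_size, latent_dim))
    fake_images = generator.predict(latent_vectors)
    images = np.concatenate([fake_images, real_images])
    # Assumed label convention: 0 = fake, 1 = real.
    labels = np.concatenate([np.zeros((batch_size, 1)), np.ones((batch_size, 1))])
    d_loss = discriminator.train_on_batch(images, labels)

    # Step 5: train the generator through `gan`, asking for the "real" label.
    latent_vectors = rng.normal(size=(batch_size, latent_dim))
    misleading_targets = np.ones((batch_size, 1))
    g_loss = gan.train_on_batch(latent_vectors, misleading_targets)
    return d_loss, g_loss
```

A call would look like `train_step(generator, discriminator, gan, x_batch, latent_dim, np.random.default_rng(0))`, where `x_batch` stands for a batch of real images of shape `(batch_size, 32, 32, 3)`.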

## GANs' tricks

The process of training GANs and tuning GAN implementations is difficult.

- We use `tanh` as the activation in the generator, instead of `sigmoid`.
- We sample points from the latent space using a *normal distribution*, not a uniform distribution.
- Stochasticity is good for robustness. GANs are likely to get stuck in all sorts of ways, and introducing randomness helps prevent this. We introduce randomness by using dropout in the discriminator and by adding random noise to the labels for the discriminator.
- Sparsity is often a desirable property, but not in GANs. Two things can induce gradient sparsity: ReLU activations and max pooling. Instead of ReLU we use LeakyReLU, which reduces sparsity by allowing small negative activation values, and instead of max pooling we use strided convolutions for downsampling.
- In generated images, it's common to see checkerboard artifacts caused by unequal coverage of the pixel space in the generator. To fix this, we use a kernel size that's divisible by the stride size whenever we use a strided convolutional layer.
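
Three of these tricks are small enough to sketch directly in NumPy. This is an illustration, not the implementation used in the models below: the helper names are hypothetical, and the label-noise scale of `0.05` and the negative slope of `0.3` are assumed values chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)


def sample_latent(batch_size, latent_dim):
    """Draw latent points from a standard normal distribution, not a uniform one."""
    return rng.normal(size=(batch_size, latent_dim))


def noisy_labels(labels, scale=0.05):
    """Add small random noise to discriminator labels (scale is an assumed value)."""
    return labels + scale * rng.random(labels.shape)


def leaky_relu(x, negative_slope=0.3):
    """Keep a small slope for negative inputs instead of zeroing them like ReLU."""
    return np.where(x > 0, x, negative_slope * x)


z = sample_latent(batch_size=4, latent_dim=32)
labels = noisy_labels(np.array([[0.0], [0.0], [1.0], [1.0]]))
activations = leaky_relu(np.array([-2.0, -0.5, 0.0, 1.5]))
print(z.shape)       # (4, 32)
print(activations)   # negative inputs are scaled, not set to zero
```

Because negative inputs keep a non-zero slope, their gradients do not vanish the way they would with a plain ReLU.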

![Checkerboards](https://distill.pub/2016/deconv-checkerboard/thumbnail.jpg)
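
The coverage argument can be checked with a one-dimensional count. The sketch below is an illustration under simplifying assumptions (one dimension, no padding, and a hypothetical helper name `transposed_conv_coverage`): it counts how many kernel positions contribute to each output pixel of a strided transposed convolution.

```python
def transposed_conv_coverage(input_length, kernel_size, stride):
    """Count how many kernel taps write to each output position (1D, no padding)."""
    output_length = (input_length - 1) * stride + kernel_size
    coverage = [0] * output_length
    for i in range(input_length):
        for k in range(kernel_size):
            # Input position i writes to output positions i*stride .. i*stride + kernel_size - 1
            coverage[i * stride + k] += 1
    return coverage


even = transposed_conv_coverage(input_length=5, kernel_size=4, stride=2)
uneven = transposed_conv_coverage(input_length=5, kernel_size=3, stride=2)
print(even)    # [1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1]
print(uneven)  # [1, 1, 2, 1, 2, 1, 2, 1, 2, 1, 1]
```

With kernel size 4 and stride 2, every interior pixel is covered twice. With kernel size 3, the coverage alternates between 2 and 1, which is the pattern that shows up as a checkerboard.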

## The generator

```python
import keras
from keras import layers
import numpy as np

latent_dim = 32
height = 32
width = 32
channels = 3

generator_input = layers.Input(shape=(latent_dim,))

# Transform the input into a 16x16x128 feature map
x = layers.Dense(128 * 16 * 16)(generator_input)
x = layers.LeakyReLU()(x)
x = layers.Reshape((16, 16, 128))(x)

x = layers.Conv2D(256, 5, padding='same')(x)
x = layers.LeakyReLU()(x)

# Upsample to 32x32 with a transposed convolution
# (kernel size 4 is divisible by stride 2, which avoids checkerboard artifacts)
x = layers.Conv2DTranspose(256, 4, strides=2, padding='same')(x)
x = layers.LeakyReLU()(x)

x = layers.Conv2D(256, 5, padding='same')(x)
x = layers.LeakyReLU()(x)
x = layers.Conv2D(256, 5, padding='same')(x)
x = layers.LeakyReLU()(x)

# Produce a 32x32x3 image (the shape of a CIFAR-10 image), with tanh as the last activation
x = layers.Conv2D(channels, 7, activation='tanh', padding='same')(x)

generator = keras.models.Model(generator_input, x)
generator.summary()
```

```text
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_3 (InputLayer)         (None, 32)                0         
_________________________________________________________________
dense_2 (Dense)              (None, 32768)             1081344   
_________________________________________________________________
leaky_re_lu_2 (LeakyReLU)    (None, 32768)             0         
_________________________________________________________________
reshape_2 (Reshape)          (None, 16, 16, 128)       0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 16, 16, 256)       819456    
_________________________________________________________________
leaky_re_lu_3 (LeakyReLU)    (None, 16, 16, 256)       0         
_________________________________________________________________
conv2d_transpose_1 (Conv2DTr (None, 32, 32, 256)       1048832   
__________
```
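
The parameter counts in the summary can be reproduced by hand, which is a useful sanity check on the layer shapes. The helper names below are hypothetical; the formulas are the standard ones for dense and convolutional layers with a bias term.

```python
def dense_params(inputs, units):
    """One weight per input-unit pair plus one bias per unit."""
    return inputs * units + units


def conv_params(kernel_size, in_channels, filters):
    """Square kernel weights plus one bias per filter (same count for Conv2DTranspose)."""
    return kernel_size * kernel_size * in_channels * filters + filters


print(dense_params(32, 128 * 16 * 16))  # 1081344, matches dense_2
print(conv_params(5, 128, 256))         # 819456, matches conv2d_1
print(conv_params(4, 256, 256))         # 1048832, matches conv2d_transpose_1
```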

## The discriminator

```python
discriminator_input = layers.Input(shape=(height, width, channels))

x = layers.Conv2D(128, 3)(discriminator_input)
x = layers.LeakyReLU()(x)

# Downsample with strided convolutions instead of max pooling
x = layers.Conv2D(128, 4, strides=2)(x)
x = layers.LeakyReLU()(x)

x = layers.Conv2D(128, 4, strides=2)(x)
x = layers.LeakyReLU()(x)

x = layers.Conv2D(128, 4, strides=2)(x)
x = layers.LeakyReLU()(x)

x = layers.Flatten()(x)

# Dropout adds stochasticity to the discriminator
x = layers.Dropout(0.4)(x)

# Single sigmoid unit: estimated probability that the input image is real
x = layers.Dense(1, activation='sigmoid')(x)

discriminator = keras.models.Model(discriminator_input, x)
discriminator.summary()
```

```text
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_6 (InputLayer)         (None, 32, 32, 3)         0         
_________________________________________________________________
conv2d_13 (Conv2D)           (None, 30, 30, 128)       3584      
_________________________________________________________________
leaky_re_lu_15 (LeakyReLU)   (None, 30, 30, 128)       0         
_________________________________________________________________
conv2d_14 (Conv2D)           (None, 14, 14, 128)       262272    
_________________________________________________________________
leaky_re_lu_16 (LeakyReLU)   (None, 14, 14, 128)       0         
_________________________________________________________________
conv2d_15 (Conv2D)           (None, 6, 6, 128)         262272    
_________________________________________________________________
leaky_re_lu_17 (LeakyReLU)   (None, 6, 6, 128)         0         
__________
```
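
The output shapes in the summary follow from the size formula for a convolution without padding (Keras's default `padding='valid'`): `floor((n - kernel) / stride) + 1`. The following is a minimal sketch, with a hypothetical helper name, that traces the spatial size through the four convolutions of the discriminator.

```python
def valid_conv_size(size, kernel_size, stride=1):
    """Spatial output size of a convolution with no padding."""
    return (size - kernel_size) // stride + 1


size = 32
sizes = []
# (kernel_size, stride) for each Conv2D layer in the discriminator
for kernel_size, stride in [(3, 1), (4, 2), (4, 2), (4, 2)]:
    size = valid_conv_size(size, kernel_size, stride)
    sizes.append(size)

flattened = size * size * 128
print(sizes)      # [30, 14, 6, 2]
print(flattened)  # 512
```

The first three sizes match the summary rows shown. The last size and the flattened length are computed from the formula only, since the summary is cut off before those layers.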

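```python
discriminator_optimizer = keras.optimizers.RMSprop(
    lr=0.0008,
    clipvalue=1.0,  # clip gradients element-wise to [-1, 1]
    decay=1e-8,     # slowly decay the learning rate to stabilize training
)

discriminator.compile(optimizer=discriminator_optimizer, loss='binary_crossentropy')
```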
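
The two less common optimizer arguments can be illustrated numerically. This is a sketch of the behaviour in the Keras 2 optimizers that accept `lr` and `decay`; the helper names are hypothetical, and other Keras versions may handle learning-rate decay differently. Under that assumption, `clipvalue` clips each gradient element to `[-clipvalue, clipvalue]`, and `decay` shrinks the learning rate as `lr / (1 + decay * iterations)`.

```python
def clip_gradient(gradient, clipvalue=1.0):
    """Clip every gradient element to the range [-clipvalue, clipvalue]."""
    return [max(-clipvalue, min(clipvalue, g)) for g in gradient]


def decayed_lr(lr, decay, iterations):
    """Time-based learning-rate decay (assumed Keras 2 behaviour)."""
    return lr / (1.0 + decay * iterations)


print(clip_gradient([-3.2, 0.4, 1.7]))  # [-1.0, 0.4, 1.0]
print(decayed_lr(0.0008, 1e-8, 10000))  # slightly below 0.0008
```

With `decay=1e-8` the learning rate changes very little over the first ten thousand updates, so the decay acts as a gentle stabilizer and not as a schedule.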

## The adversarial network

```python
# Set the discriminator's weights to non-trainable
# (this only applies to the `gan` model compiled below)
discriminator.trainable = False

gan_input = layers.Input(shape=(latent_dim,))
gan_output = discriminator(generator(gan_input))
gan = keras.models.Model(gan_input, gan_output)

gan_optimizer = keras.optimizers.RMSprop(lr=0.0004, clipvalue=1.0, decay=1e-8)
gan.compile(optimizer=gan_optimizer, loss='binary_crossentropy')
```

## How to train DCGAN

