In [1]:
from keras import layers

Using TensorFlow backend.


## DCGAN - Deep convolutional GAN

Schematically, a GAN looks like:
1. A `generator` network maps vectors of shape `(latent_dim,)` to images of shape `(32, 32, 3)`.
2. A `discriminator` network maps images of shape `(32, 32, 3)` to a binary score estimating the probability that the image is real.
3. A `gan` network that chains the generator and discriminator together: `gan(x) = discriminator(generator(x))`. Thus this `gan` network maps latent space vectors to the discriminator's assessment of the realism of these latent vectors as decoded by the generator.
4. You train the discriminator using examples of real and fake images along with 'real'/'fake' labels, just as you train any regular image-classification model.
5. To train the generator, you use the gradients of the `gan` model's loss with regard to the generator's weights. This means that, at every step, you move the weights of the generator in a direction that makes the discriminator more likely to classify the images decoded by the generator as 'real'. In other words, you train the generator to fool the discriminator.

## The generator

In [2]:
import keras
from keras import layers
import numpy as np

latent_dim = 32
height = 32
width = 32
channels = 3

In [3]:
generator_input = keras.Input(shape=(latent_dim,))

x = layers.Dense(128 * 16 * 16)(generator_input)
x = layers.LeakyReLU()(x)
x = layers.Reshape((16, 16, 128))(x)

x = layers.Conv2D(256, 5, padding='same')(x)
x = layers.LeakyReLU()(x)

x = layers.Conv2DTranspose(256, 4, strides=2, padding='same')(x)  # Upsample to 32 x 32
x = layers.LeakyReLU()(x)

x = layers.Conv2D(256, 5, padding='same')(x)
x = layers.LeakyReLU()(x)
x = layers.Conv2D(256, 5, padding='same')(x)
x = layers.LeakyReLU()(x)

x = layers.Conv2D(channels, 7, activation='tanh', padding='same')(x)
generator = keras.models.Model(generator_input, x)
generator.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         (None, 32)                0         
_________________________________________________________________
dense_1 (Dense)              (None, 32768)             1081344   
_________________________________________________________________
leaky_re_lu_1 (LeakyReLU)    (None, 32768)             0         
_________________________________________________________________
reshape_1 (Reshape)          (None, 16, 16, 128)       0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 16, 16, 256)       819456    
_________________________________________________________________
leaky_re_lu_2 (LeakyReLU)    (None, 16, 16, 256)       0         
_________________________________________________________________
conv2d_transpose_1 (Conv2DTr (None, 32, 32, 256)       1048832   
__________
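
The parameter counts in the summary above can be checked by hand. The following is a minimal sketch in plain Python, assuming the standard counting rule (one weight per input-output connection, plus one bias per unit or filter). The helper names `dense_params` and `conv_params` are made up for this illustration and are not part of Keras.

```python
def dense_params(inputs, units):
    # One weight per (input, unit) pair plus one bias per unit.
    return inputs * units + units


def conv_params(kernel, in_channels, filters):
    # A square kernel has kernel * kernel * in_channels weights per filter,
    # plus one bias per filter. The same count applies to Conv2DTranspose.
    return kernel * kernel * in_channels * filters + filters


latent_dim = 32

dense_1 = dense_params(latent_dim, 128 * 16 * 16)
conv2d_1 = conv_params(5, 128, 256)
conv2d_transpose_1 = conv_params(4, 256, 256)

print(dense_1)             # 1081344
print(conv2d_1)            # 819456
print(conv2d_transpose_1)  # 1048832
```

The three values match the `Param #` column for `dense_1`, `conv2d_1`, and `conv2d_transpose_1`. Most of the first million parameters sit in the initial `Dense` layer, which projects the 32-dimensional latent vector onto a 16 x 16 x 128 feature map.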

## The discriminator
Develop a `discriminator` model that takes as input a candidate image (real or synthetic) and classifies it into one of two classes: "generated image" or "real image that comes from the training set."

In [5]:
discriminator_input = layers.Input(shape=(height, width, channels))
x = layers.Conv2D(128, 3)(discriminator_input)
x = layers.LeakyReLU()(x)
x = layers.Conv2D(128, 4, strides=2)(x)
x = layers.LeakyReLU()(x)
x = layers.Conv2D(128, 4, strides=2)(x)
x = layers.LeakyReLU()(x)
x = layers.Conv2D(128, 4, strides=2)(x)
x = layers.LeakyReLU()(x)
x = layers.Flatten()(x)

x = layers.Dropout(0.4)(x) # One dropout layer (important trick)

x = layers.Dense(1, activation='sigmoid')(x) # Classification layer

discriminator = keras.models.Model(discriminator_input, x)
discriminator.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_3 (InputLayer)         (None, 32, 32, 3)         0         
_________________________________________________________________
conv2d_9 (Conv2D)            (None, 30, 30, 128)       3584      
_________________________________________________________________
leaky_re_lu_10 (LeakyReLU)   (None, 30, 30, 128)       0         
_________________________________________________________________
conv2d_10 (Conv2D)           (None, 14, 14, 128)       262272    
_________________________________________________________________
leaky_re_lu_11 (LeakyReLU)   (None, 14, 14, 128)       0         
_________________________________________________________________
conv2d_11 (Conv2D)           (None, 6, 6, 128)         262272    
_________________________________________________________________
leaky_re_lu_12 (LeakyReLU)   (None, 6, 6, 128)         0         
__________
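
The spatial sizes in the summary above (30, 14, 6) follow from the output-size rule for unpadded convolutions, which is what `Conv2D` uses when no `padding` argument is given. The sketch below is plain Python. `valid_conv_size` is a hypothetical helper for this illustration, and the final 2 x 2 size and the flattened length are derived from the formula, since the printed summary is cut off before those layers.

```python
def valid_conv_size(size, kernel, stride=1):
    # Output length of an unpadded ('valid') convolution along one axis.
    return (size - kernel) // stride + 1


# (kernel, stride) for the four Conv2D layers of the discriminator.
conv_layers = [(3, 1), (4, 2), (4, 2), (4, 2)]

sizes = [32]  # input images are 32 x 32
for kernel, stride in conv_layers:
    sizes.append(valid_conv_size(sizes[-1], kernel, stride))

print(sizes)  # [32, 30, 14, 6, 2]

# Length of the vector produced by Flatten (128 filters in the last layer).
flattened = sizes[-1] * sizes[-1] * 128
print(flattened)  # 512
```

Under these assumptions, the final `Dense(1)` layer sees a 512-dimensional vector, so the strided convolutions do all of the downsampling without any pooling layers.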

In [7]:
# Clip gradients by value and decay the learning rate to stabilize training
discriminator_optimizer = keras.optimizers.RMSprop(lr=0.0008,
                                                  clipvalue=1.0,
                                                  decay=1e-8)
discriminator.compile(optimizer=discriminator_optimizer,
                      loss='binary_crossentropy')

## The adversarial network
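The `gan` model chains the generator and the discriminator.
- Freeze the discriminator inside `gan`: its weights won't be updated when training `gan`, so only the generator learns from the adversarial loss. Because `discriminator` was compiled before the flag was changed, it still updates its own weights when trained directly.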

In [10]:
discriminator.trainable = False

gan_input = keras.Input(shape=(latent_dim,))
gan_output = discriminator(generator(gan_input))
gan = keras.models.Model(gan_input, gan_output)

gan_optimizer = keras.optimizers.RMSprop(lr=0.0004, clipvalue=1.0, decay=1e-8)
gan.compile(optimizer=gan_optimizer, loss='binary_crossentropy')

### How to train your DCGAN
For each training step, do the following:
1. Draw random points in the **latent space** (random noise).
2. Generate images with `generator` using this random noise.
3. Mix the generated images with real ones.
4. Train `discriminator` using these mixed images, with corresponding targets: either 'real' (for the real images) or 'fake' (for the generated images).
5. Draw new random points in the latent space.
6. Train `gan` using these random vectors, with targets that all say 'these are real images'. This updates only the weights of the generator (because the discriminator is frozen inside `gan`), moving them toward getting the discriminator to predict 'these are real images' for generated images. This trains the generator to fool the discriminator.
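
To see why the 'these are real images' targets in step 6 push the generator toward fooling the discriminator, it helps to compute the binary cross-entropy by hand. The sketch below is plain Python and assumes the label convention of this notebook's training loop (1 for generated images, 0 for real ones). The function `binary_crossentropy` and the constants `REAL` and `FAKE` are written for this illustration and are not the Keras implementation.

```python
import math


def binary_crossentropy(target, predicted, eps=1e-7):
    # Clamp the prediction away from 0 and 1 to avoid log(0).
    p = min(max(predicted, eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


# Assumed convention from the training loop: 1 = generated, 0 = real.
REAL, FAKE = 0.0, 1.0

# The discriminator is confident a generated image is fake (score 0.9),
# yet the generator is trained against the misleading target REAL.
loss_caught = binary_crossentropy(REAL, 0.9)

# The discriminator is fooled (score 0.1): the same target gives a small loss.
loss_fooled = binary_crossentropy(REAL, 0.1)

print(round(loss_caught, 4))  # 2.3026
print(round(loss_fooled, 4))  # 0.1054
```

Minimizing this loss with the discriminator frozen can only be achieved by changing the generator so that its images receive scores closer to the 'real' label.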

In [None]:
import os
from keras.preprocessing import image

(x_train, y_train), (_, _) = keras.datasets.cifar10.load_data()

x_train = x_train[y_train.flatten() == 6] # Select frog images (class 6)

x_train = x_train.reshape((x_train.shape[0],) + 
                         (height, width, channels)).astype('float32') / 255.

iterations = 10000
batch_size = 20
save_dir = 'gan_data' # Where to save generated images
os.makedirs(save_dir, exist_ok=True) # Create the output directory if it doesn't exist yet

start = 0
for step in range(iterations):
    random_latent_vectors = np.random.normal(size=(batch_size, latent_dim)) # sample random points in the latent space
    
    generated_images = generator.predict(random_latent_vectors) # decode them to fake images
    
    stop = start + batch_size                 # combines them with real images
    real_images = x_train[start: stop]
    combined_images = np.concatenate([generated_images, real_images])
    
    labels = np.concatenate([np.ones((batch_size, 1)), # Assembles labels: 1 for generated images, 0 for real images
                            np.zeros((batch_size, 1))])
    
    labels += 0.05 * np.random.random(labels.shape) # Adds random noise to the labels (an important trick)
    
    d_loss = discriminator.train_on_batch(combined_images, labels) # trains the discriminator
    
    random_latent_vectors = np.random.normal(size=(batch_size, latent_dim)) # samples random points in the latent space
    
    misleading_targets = np.zeros((batch_size, 1)) # Assembles labels that say 'these are all real images' (it is a lie!)
    
    a_loss = gan.train_on_batch(random_latent_vectors, misleading_targets) # Trains the generator (via the gan model, where the discriminator weights are frozen)
    
    start += batch_size
    if start > len(x_train) - batch_size:
        start = 0
   
    if step % 100 == 0: # Occasionally saves and plots (every 100 steps)
        gan.save_weights('gan.h5') # saves model weights
        
        print('discriminator loss:', d_loss) # print metrics
        print('adversarial loss:', a_loss)
        
        img = image.array_to_img(generated_images[0]*255., scale=False) # Saves one generated image
        img.save(os.path.join(save_dir, 'generated_frog' + str(step) + '.png'))
        
        img = image.array_to_img(real_images[0]*255., scale=False)
        img.save(os.path.join(save_dir, 'real_frog' + str(step) + '.png')) # saves one real image for comparison
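
The `start` / `stop` bookkeeping in the loop above walks through `x_train` in consecutive windows and wraps around once a full batch no longer fits. The following plain-Python sketch isolates that logic. `batch_windows` is a hypothetical helper that does not appear in the notebook, and the sample counts are illustrative.

```python
def batch_windows(num_samples, batch_size, steps):
    # Reproduces the loop's index logic: advance by one batch per step and
    # reset to 0 when the next full batch would run past the end.
    start = 0
    windows = []
    for _ in range(steps):
        windows.append((start, start + batch_size))
        start += batch_size
        if start > num_samples - batch_size:
            start = 0
    return windows


# With 50 samples and batches of 20, the last 10 samples are never used.
windows = batch_windows(num_samples=50, batch_size=20, steps=4)
print(windows)  # [(0, 20), (20, 40), (0, 20), (20, 40)]

# With 5000 samples (the number of training images per CIFAR-10 class),
# 250 batches of 20 cover the data exactly before wrapping around.
frog_windows = batch_windows(num_samples=5000, batch_size=20, steps=251)
print(frog_windows[249], frog_windows[250])  # (4980, 5000) (0, 20)
```

One consequence of this scheme is that any trailing partial batch is silently skipped, which does not matter here because the batch size divides the number of frog images evenly.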
        

### Wrapping up

* A GAN consists of a generator network coupled with a discriminator network. The discriminator is trained to differentiate between the output of the generator and real images from a training dataset, and the generator is trained to fool the discriminator. Remarkably, the generator never sees images from the training set directly; the information it has about the data comes from the discriminator.
* GANs are difficult to train, because training a GAN is a dynamic process rather than a simple gradient descent process with a fixed loss landscape. Getting a GAN to train correctly requires using a number of heuristic tricks, as well as extensive tuning.
* GANs can potentially produce highly realistic images. But unlike VAEs, the latent space they learn doesn't have a neat continuous structure and thus may not be suited for certain practical applications, such as image editing via latent-space concept vectors.