Deep convolutional GANs

We will use a DCGAN to generate images from the Fashion MNIST dataset.

In [None]:
import sklearn
import tensorflow as tf
from tensorflow import keras
import numpy as np
import pandas as pd
import os
%matplotlib inline
import matplotlib as mpl
import matplotlib.pyplot as plt

In [None]:
pip install tensorflow==2.0.0-beta1

Let's set up the Fashion MNIST dataset.

In [None]:
(X_train_full, y_train_full), (X_test, y_test) = keras.datasets.fashion_mnist.load_data()
X_train_full = X_train_full.astype(np.float32) / 255
X_test = X_test.astype(np.float32) / 255
X_train, X_valid = X_train_full[:-5000], X_train_full[-5000:]
y_train, y_valid = y_train_full[:-5000], y_train_full[-5000:]

Let's also set up a deep convolutional GAN (DCGAN).

In [None]:
codings_size = 100 
 
generator = keras.models.Sequential([ 
    keras.layers.Dense(7 * 7 * 128, input_shape=[codings_size]), 
    keras.layers.Reshape([7, 7, 128]), 
    keras.layers.BatchNormalization(), 
    keras.layers.Conv2DTranspose(64, kernel_size=5, strides=2, padding="same", 
                                 activation="selu"), 
    keras.layers.BatchNormalization(), 
    keras.layers.Conv2DTranspose(1, kernel_size=5, strides=2, padding="same", 
                                 activation="tanh")
])
discriminator = keras.models.Sequential([ 
    keras.layers.Conv2D(64, kernel_size=5, strides=2, padding="same", 
                        activation=keras.layers.LeakyReLU(0.2), 
                        input_shape=[28, 28, 1]), 
    keras.layers.Dropout(0.4), 
    keras.layers.Conv2D(128, kernel_size=5, strides=2, padding="same", 
                        activation=keras.layers.LeakyReLU(0.2)), 
    keras.layers.Dropout(0.4), 
    keras.layers.Flatten(), 
    keras.layers.Dense(1, activation="sigmoid")
])
gan = keras.models.Sequential([generator, discriminator])

The generator takes random noise sampled from a distribution (usually Gaussian) as input and outputs data (usually images).

The discriminator takes either real images from the training set or fake images from the generator as input and must guess whether each one is real or fake.
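To see why the generator ends up producing 28x28 images, it can help to trace the spatial size through the strided layers. The sketch below assumes the usual Keras behaviour for padding="same": a transposed convolution multiplies height and width by the stride, and a regular convolution divides them by the stride, rounding up. The helper names are made up for illustration.

```python
import math

def conv_transpose_same(size, stride):
    # Hypothetical helper: spatial size after Conv2DTranspose with padding="same"
    return size * stride

def conv_same(size, stride):
    # Hypothetical helper: spatial size after Conv2D with padding="same"
    return math.ceil(size / stride)

# Generator: the Reshape layer gives 7x7 maps, then two stride-2 transposed convolutions
gen_sizes = [7]
for stride in (2, 2):
    gen_sizes.append(conv_transpose_same(gen_sizes[-1], stride))

# Discriminator: 28x28 input, then two stride-2 convolutions
disc_sizes = [28]
for stride in (2, 2):
    disc_sizes.append(conv_same(disc_sizes[-1], stride))

# Number of values reaching the final Dense layer (128 feature maps)
flattened = disc_sizes[-1] ** 2 * 128

print("generator sizes:", gen_sizes)
print("discriminator sizes:", disc_sizes)
print("flattened features:", flattened)
```

The discriminator mirrors the generator: one upsamples 7 to 14 to 28, the other downsamples 28 to 14 to 7.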

In [None]:
discriminator.compile(loss="binary_crossentropy", optimizer="rmsprop")
discriminator.trainable = False
gan.compile(loss="binary_crossentropy", optimizer="rmsprop")

We use binary cross entropy loss since both our discriminator and our GAN are binary classifiers.
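As a rough illustration of how this loss behaves, here is a NumPy sketch of binary cross-entropy on made-up discriminator outputs. The function and the probability values are assumptions for illustration only; Keras computes the loss internally.

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    # Clip to avoid log(0), then average the per-sample losses
    y_pred = np.clip(y_pred, eps, 1 - eps)
    losses = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
    return float(np.mean(losses))

# Discriminator view: two fake images (label 0) followed by two real images (label 1)
labels = np.array([0., 0., 1., 1.])
confident_and_right = np.array([0.1, 0.2, 0.9, 0.8])
confident_and_wrong = np.array([0.9, 0.8, 0.1, 0.2])
loss_right = binary_crossentropy(labels, confident_and_right)
loss_wrong = binary_crossentropy(labels, confident_and_wrong)

# Generator view: fake images are labelled 1, so the loss is low
# when the discriminator is fooled and high when it is not
loss_fooled = binary_crossentropy(np.ones(2), np.array([0.9, 0.8]))
loss_caught = binary_crossentropy(np.ones(2), np.array([0.1, 0.2]))

print(loss_right, loss_wrong, loss_fooled, loss_caught)
```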

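The discriminator should only be trained during the first phase of each iteration, so we make it non-trainable before compiling the GAN. The trainable attribute is only taken into account when a model is compiled: the discriminator was compiled while still trainable, so calling train_on_batch on it directly updates its weights, whereas the GAN was compiled after freezing it, so training the GAN only updates the generator.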

In [None]:
X_train_dcgan = X_train.reshape(-1, 28, 28, 1) * 2. - 1.

Our generator's output layer uses tanh activation so our output ranges from -1 to 1. We need to rescale our training set to the same range before training.
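A small NumPy sketch of this rescaling, together with the inverse mapping that would be needed to bring generated images back to the 0 to 1 range for display. The pixel values are made up for illustration.

```python
import numpy as np

# Pixel intensities already divided by 255, so they lie in [0, 1]
pixels = np.array([0.0, 0.25, 0.5, 1.0], dtype=np.float32)

# Same transformation as applied to X_train: [0, 1] -> [-1, 1]
scaled = pixels * 2. - 1.

# Inverse transformation for tanh outputs: [-1, 1] -> [0, 1]
restored = (scaled + 1.) / 2.

print(scaled)
print(restored)
```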

In [None]:
batch_size = 32
dataset = tf.data.Dataset.from_tensor_slices(X_train_dcgan)
dataset = dataset.shuffle(1000)
dataset = dataset.batch(batch_size, drop_remainder=True).prefetch(1)

We create a dataset to iterate through images.
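One detail worth spelling out is drop_remainder=True. The training loop in this notebook builds its labels from batch_size, so every batch needs exactly batch_size images. The arithmetic below is a quick check of how many full batches one epoch contains, assuming the 55,000 training images left after holding out 5,000 for validation.

```python
n_images = 55000   # 60,000 training images minus the 5,000 held out for validation
batch_size = 32

# Number of full batches per epoch and the images that drop_remainder discards
full_batches, leftover = divmod(n_images, batch_size)

print("full batches per epoch:", full_batches)
print("images dropped per epoch:", leftover)
```

Because the dataset is shuffled, the dropped images are not the same ones every epoch.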

We cannot use the regular fit() method because the training loop is unusual, so we write our own.

In [None]:
def train_gan(gan, dataset, batch_size, codings_size, n_epochs=50):
    generator, discriminator = gan.layers
    for epoch in range(n_epochs):
        for X_batch in dataset:
            # Phase 1: train the discriminator on fake (label 0) and real (label 1) images
            noise = tf.random.normal(shape=[batch_size, codings_size])
            generated_images = generator(noise)
            X_fake_and_real = tf.concat([generated_images, X_batch], axis=0)
            y1 = tf.constant([[0.]] * batch_size + [[1.]] * batch_size)
            discriminator.trainable = True
            discriminator.train_on_batch(X_fake_and_real, y1)
            # Phase 2: train the generator through the GAN, labelling fake images as real (1)
            noise = tf.random.normal(shape=[batch_size, codings_size])
            y2 = tf.constant([[1.]] * batch_size)
            discriminator.trainable = False
            gan.train_on_batch(noise, y2)
 
train_gan(gan, dataset, batch_size, codings_size)

We feed Gaussian noise to the generator to produce fake images and concatenate them with an equal number of real images.

The discriminator tries to guess which images are fake and which images are real.
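The shapes and labels involved in one training step can be sketched with NumPy alone. Random arrays stand in for the generator output and the real batch, and a small batch size keeps the output readable; the variable names with a demo_ prefix are made up for illustration.

```python
import numpy as np

demo_batch_size = 4  # small value for illustration; the notebook uses 32
rng = np.random.default_rng(42)

# Stand-ins for generator(noise) and X_batch, both scaled to [-1, 1]
demo_fake = rng.uniform(-1, 1, size=(demo_batch_size, 28, 28, 1)).astype(np.float32)
demo_real = rng.uniform(-1, 1, size=(demo_batch_size, 28, 28, 1)).astype(np.float32)

# Phase 1 inputs: fakes first, then reals, with matching 0/1 labels
demo_X = np.concatenate([demo_fake, demo_real], axis=0)
demo_y1 = np.array([[0.]] * demo_batch_size + [[1.]] * demo_batch_size, dtype=np.float32)

# Phase 2 labels: every generated image is labelled 1 ("real"),
# which pushes the generator toward images the discriminator accepts
demo_y2 = np.ones((demo_batch_size, 1), dtype=np.float32)

print(demo_X.shape, demo_y1.ravel(), demo_y2.ravel())
```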

Let's try the same dataset with hashing, using a binary autoencoder.

In [None]:
def rounded_accuracy(y_true, y_pred):
    return keras.metrics.binary_accuracy(tf.round(y_true), tf.round(y_pred))

hashing_encoder = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[28, 28]),
    keras.layers.Dense(100, activation="selu"),
    keras.layers.GaussianNoise(15.),
    keras.layers.Dense(16, activation="sigmoid"),
])
hashing_decoder = keras.models.Sequential([
    keras.layers.Dense(100, activation="selu", input_shape=[16]),
    keras.layers.Dense(28 * 28, activation="sigmoid"),
    keras.layers.Reshape([28, 28])
])
hashing_ae = keras.models.Sequential([hashing_encoder, hashing_decoder])
hashing_ae.compile(loss="binary_crossentropy",
                   optimizer=keras.optimizers.SGD(learning_rate=1.0),
                   metrics=[rounded_accuracy])
# Pass validation data as an (inputs, targets) tuple
history = hashing_ae.fit(X_train, X_train, epochs=10,
                         validation_data=(X_valid, X_valid))

The model is split into two parts: the hashing encoder and the hashing decoder.

The encoder takes 28x28 grayscale images and outputs a vector of size 16 for each one.

This decoder takes an input of size 16 and outputs 28x28 arrays.
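One way the 16 sigmoid outputs could be turned into a hash is to round each value to a bit and pack the bits into an integer. The sketch below does this with NumPy on made-up codings; with the trained model, the codings would presumably come from hashing_encoder.predict, which is an assumption about how the encoder would be used.

```python
import numpy as np

# Made-up encoder outputs: three images, 16 sigmoid activations each
codings = np.array([
    [0.98, 0.03, 0.91, 0.02, 0.01, 0.99, 0.97, 0.04,
     0.02, 0.95, 0.03, 0.96, 0.99, 0.01, 0.02, 0.93],
    [0.97] + [0.02] * 15,
    [0.99] * 16,
])

# Round each activation to a bit, then weight bit i by 2**i and sum
bits = np.round(codings).astype(np.int64)
weights = 2 ** np.arange(16)
hashes = (bits * weights).sum(axis=1)

print(bits)
print(hashes)
```

Each image maps to an integer below 2**16, so images with similar codings can be compared by counting the bits in which their hashes differ.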

We use binary cross-entropy again because we treat reconstruction as a multilabel binary classification problem, which tends to converge faster: each pixel intensity represents the probability that the pixel should be black.
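To make the per-pixel view concrete, here is a NumPy sketch on a made-up 2x2 "image". It computes the cross-entropy for each pixel and mimics the rounded_accuracy metric by rounding both the target and the reconstruction before comparing them. The values are illustrative only.

```python
import numpy as np

# Made-up target intensities and reconstructed intensities, both in [0, 1]
target = np.array([[0.0, 0.9],
                   [1.0, 0.2]])
recon = np.array([[0.1, 0.8],
                  [0.7, 0.4]])

# Per-pixel binary cross-entropy, treating each intensity as a probability
eps = 1e-7
p = np.clip(recon, eps, 1 - eps)
pixel_loss = -(target * np.log(p) + (1 - target) * np.log(1 - p))
mean_loss = pixel_loss.mean()

# Same idea as rounded_accuracy: compare pixels after rounding to 0 or 1
rounded_acc = np.mean(np.round(target) == np.round(recon))

print(pixel_loss)
print(mean_loss, rounded_acc)
```

Note that the reconstruction is far from exact, yet every pixel lands on the correct side of 0.5, so the rounded accuracy is perfect while the loss is still well above zero.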

In [None]:
def show_reconstructions(model, images=X_valid, n_images=5):
    reconstructions = model.predict(images[:n_images])
    fig = plt.figure(figsize=(n_images * 1.5, 3))
    for image_index in range(n_images):
        plt.subplot(2, n_images, 1 + image_index)
        plot_image(images[image_index])
        plt.subplot(2, n_images, 1 + n_images + image_index)
        plot_image(reconstructions[image_index])

def plot_image(image):
    plt.imshow(image, cmap="binary")
    plt.axis("off")
        
show_reconstructions(hashing_ae)
plt.show()

We compare inputs with outputs. 

Our outputs are a bit blurry, so we should probably train the model a bit longer or add more layers to both the encoder and the decoder.

If we made the model too powerful, the reconstructions would improve, but the model would not learn useful patterns in the data, so we will leave it as is.