# Generative Adversarial Nets

## Imports

In [1]:
import tensorflow as tf
from tensorflow import keras
import numpy as np

## Model definition

### Note: I wrote the code below while following the notebook at https://www.tensorflow.org/guide/keras/customizing_what_happens_in_fit/, but without looking at its GAN solution

First of all, let me make a plan about what I need to implement this paper:

I think the discriminator needs a sigmoid activation function in the last layer (since it predicts real/fake for the image it gets as input), and its loss function needs to be binary cross-entropy.

### Note: I know that the above sentence is incorrect, but I only found that out after trying to implement the GAN myself and then looking at the TensorFlow implementation
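As a side check on the discriminator plan above: a sigmoid output paired with binary cross-entropy and a raw score (logit) paired with a from-logits binary cross-entropy compute the same quantity. The sketch below is plain Python rather than Keras, and the helper names `sigmoid`, `bce_from_probability` and `bce_from_logit` are hypothetical, introduced only for illustration.

```python
import math


def sigmoid(logit):
    return 1.0 / (1.0 + math.exp(-logit))


def bce_from_probability(label, probability):
    """Binary cross-entropy for one example, given a probability in (0, 1)."""
    return -(label * math.log(probability)
             + (1.0 - label) * math.log(1.0 - probability))


def bce_from_logit(label, logit):
    """Same loss computed directly from the raw score (numerically stable form)."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


# Compare both routes on a few (label, logit) pairs.
for label, logit in [(1.0, 2.0), (0.0, 2.0), (1.0, -3.0), (0.0, 0.0)]:
    via_sigmoid = bce_from_probability(label, sigmoid(logit))
    via_logit = bce_from_logit(label, logit)
    print(label, logit, round(via_sigmoid, 6), round(via_logit, 6))
```

The two columns agree, which is why a final `Dense(1)` layer with no activation can be combined with `keras.losses.BinaryCrossentropy(from_logits=True)` instead of a sigmoid layer plus plain binary cross-entropy.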

For the generator, the last layer doesn't need an activation function (it can be just a Dense layer). The paper suggested building 2 multi-layer perceptrons, so I guess that's OK. The loss function should be RMSE or something of that sort, since I will be measuring the difference between each pixel. With this approach I can see that it's possible for the generator to learn to re-create the data distribution (the images), but this is the best idea I have as of now.

In [2]:
class GAN(keras.Model):
    def __init__(self):
        super(GAN, self).__init__()
        
        self.generator = tf.keras.Sequential([
            tf.keras.layers.Flatten(),
            tf.keras.layers.Dense(8),
            tf.keras.layers.Dense(16),
            tf.keras.layers.Dense(32),
            tf.keras.layers.Dense(inputs.shape) # TODO: Figure out how to make this layer output the image
                                                # of the same shape
        ])
        
        self.discriminator = tf.keras.Sequential([
            tf.keras.layers.Flatten(),
            tf.keras.layers.Dense(8),
            tf.keras.layers.Dense(16),
            tf.keras.layers.Dense(32),
            tf.keras.layers.Dense(1, activation="sigmoid")
        ])
        
        self.loss_tracker_generator = keras.metrics.Mean(name="loss_generator")
        self.metric_generator = keras.metrics.MeanAbsoluteError(name="mae")
        
        self.loss_tracker_discriminator = keras.metrics.Accuracy(name="loss_discriminator")
        self.metric_discriminator = keras.metrics.BinaryAccuracy(name="binary_accuracy")
        
    def train_step(self, data):
        # Unpack the data. Its structure depends on your model and
        # on what you pass to `fit()`.
        x_batch, z_batch = data

        with tf.GradientTape() as tape:
            y_pred = self.discriminator(x_batch, training=True)  # Forward pass
            # Compute the loss value
            # (the loss function is configured in `compile()`)
            loss = self.compiled_loss(y, y_pred, regularization_losses=self.metric_discriminator)

        # code below I copy/pasted from the TensorFlow tutorial to modify it later
            
        # Compute gradients
        trainable_vars = self.trainable_variables
        gradients = tape.gradient(loss, trainable_vars)
        # Update weights
        self.optimizer.apply_gradients(zip(gradients, trainable_vars))
        # Update metrics (includes the metric that tracks the loss)
        self.compiled_metrics.update_state(y, y_pred)
        # Return a dict mapping metric names to current value
        return {m.name: m.result() for m in self.metrics}

## Things that bother me:

 - I don't know how to implement the loss suggested in the GAN paper with the Keras loss functions
 - Since I'm in the _train_step_ method, how do I do _k_ gradient descent steps over the discriminator and then one step over the generator?
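On the second question, the overall shape of one training iteration (_k_ discriminator updates followed by a single generator update) can be sketched independently of Keras. This is only a structural sketch under my own assumptions: `gan_iteration`, `sample_real_batch`, `discriminator_step` and `generator_step` are hypothetical names, and the toy functions below merely count calls instead of performing real optimizer updates.

```python
def gan_iteration(sample_real_batch, discriminator_step, generator_step, k=1):
    """One training iteration: k discriminator updates, then one generator update.

    All three callables are hypothetical placeholders:
      sample_real_batch()       -> a minibatch of real data
      discriminator_step(batch) -> updates D on real + generated data, returns its loss
      generator_step()          -> updates G with D held fixed, returns its loss
    """
    d_losses = [discriminator_step(sample_real_batch()) for _ in range(k)]
    g_loss = generator_step()
    return {"d_loss": sum(d_losses) / k, "g_loss": g_loss}


# Toy usage: counters stand in for real optimizer updates.
calls = {"d": 0, "g": 0}


def fake_d_step(batch):
    calls["d"] += 1
    return 0.5


def fake_g_step():
    calls["g"] += 1
    return 0.7


result = gan_iteration(lambda: [0.0] * 4, fake_d_step, fake_g_step, k=3)
print(calls, result)
```

Inside a custom `train_step`, the same structure would presumably become a `for` loop around the discriminator update, with fresh noise sampled on every pass, followed by the single generator update.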

I'll go ahead and look at the implementation of GANs at https://www.tensorflow.org/guide/keras/customizing_what_happens_in_fit/:

In [2]:
from tensorflow.keras import layers

# Create the discriminator
discriminator = keras.Sequential(
    [
        keras.Input(shape=(28, 28, 1)),
        layers.Conv2D(64, (3, 3), strides=(2, 2), padding="same"),
        layers.LeakyReLU(alpha=0.2),
        layers.Conv2D(128, (3, 3), strides=(2, 2), padding="same"),
        layers.LeakyReLU(alpha=0.2),
        layers.GlobalMaxPooling2D(),
        layers.Dense(1),
    ],
    name="discriminator",
)

# Create the generator
latent_dim = 128
generator = keras.Sequential(
    [
        keras.Input(shape=(latent_dim,)),
        # We want to generate 7 * 7 * 128 = 6272 coefficients to reshape into a 7x7x128 map
        layers.Dense(7 * 7 * 128),
        layers.LeakyReLU(alpha=0.2),
        layers.Reshape((7, 7, 128)),
        layers.Conv2DTranspose(128, (4, 4), strides=(2, 2), padding="same"),
        layers.LeakyReLU(alpha=0.2),
        layers.Conv2DTranspose(128, (4, 4), strides=(2, 2), padding="same"),
        layers.LeakyReLU(alpha=0.2),
        layers.Conv2D(1, (7, 7), padding="same", activation="sigmoid"),
    ],
    name="generator",
)


In [3]:
discriminator.summary()

Model: "discriminator"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              (None, 14, 14, 64)        640       
_________________________________________________________________
leaky_re_lu (LeakyReLU)      (None, 14, 14, 64)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 7, 7, 128)         73856     
_________________________________________________________________
leaky_re_lu_1 (LeakyReLU)    (None, 7, 7, 128)         0         
_________________________________________________________________
global_max_pooling2d (Global (None, 128)               0         
_________________________________________________________________
dense (Dense)                (None, 1)                 129       
Total params: 74,625
Trainable params: 74,625
Non-trainable params: 0
_________________________________________________

In [4]:
generator.summary()

Model: "generator"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_1 (Dense)              (None, 6272)              809088    
_________________________________________________________________
leaky_re_lu_2 (LeakyReLU)    (None, 6272)              0         
_________________________________________________________________
reshape (Reshape)            (None, 7, 7, 128)         0         
_________________________________________________________________
conv2d_transpose (Conv2DTran (None, 14, 14, 128)       262272    
_________________________________________________________________
leaky_re_lu_3 (LeakyReLU)    (None, 14, 14, 128)       0         
_________________________________________________________________
conv2d_transpose_1 (Conv2DTr (None, 28, 28, 128)       262272    
_________________________________________________________________
leaky_re_lu_4 (LeakyReLU)    (None, 28, 28, 128)       0 
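The shapes and parameter counts in the two summaries above can be reproduced by hand, which is a useful check that the layers do what I expect. A minimal sketch, assuming the standard formulas (kernel height x kernel width x input channels x filters, plus one bias per filter; input features x units, plus one bias per unit); the helper names are hypothetical.

```python
import math


def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter (same count for Conv2DTranspose)."""
    return kernel_h * kernel_w * in_channels * filters + filters


def dense_params(in_features, units):
    """Weights plus one bias per unit."""
    return in_features * units + units


def same_conv_output_size(size, stride):
    """Spatial size after a strided convolution with padding="same"."""
    return math.ceil(size / stride)


discriminator_counts = [
    conv2d_params(3, 3, 1, 64),    # conv2d
    conv2d_params(3, 3, 64, 128),  # conv2d_1
    dense_params(128, 1),          # dense
]
generator_counts = [
    dense_params(128, 7 * 7 * 128),  # dense_1
    conv2d_params(4, 4, 128, 128),   # conv2d_transpose
    conv2d_params(4, 4, 128, 128),   # conv2d_transpose_1
]

# Discriminator feature maps shrink 28 -> 14 -> 7 with stride 2.
first = same_conv_output_size(28, 2)
second = same_conv_output_size(first, 2)

print(discriminator_counts, sum(discriminator_counts))
print(generator_counts)
print(first, second)
```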

In [5]:
class GAN(keras.Model):
    def __init__(self, discriminator, generator, latent_dim):
        super(GAN, self).__init__()
        self.discriminator = discriminator
        self.generator = generator
        self.latent_dim = latent_dim

    def compile(self, d_optimizer, g_optimizer, loss_fn):
        super(GAN, self).compile()
        self.d_optimizer = d_optimizer
        self.g_optimizer = g_optimizer
        self.loss_fn = loss_fn

    def train_step(self, real_images):
        if isinstance(real_images, tuple):
            real_images = real_images[0]
        # Sample random points in the latent space
        batch_size = tf.shape(real_images)[0]
        random_latent_vectors = tf.random.normal(shape=(batch_size, self.latent_dim))

        # Decode them to fake images
        generated_images = self.generator(random_latent_vectors)

        # Combine them with real images
        combined_images = tf.concat([generated_images, real_images], axis=0)

        # Assemble labels discriminating real from fake images
        labels = tf.concat(
            [tf.ones((batch_size, 1)), tf.zeros((batch_size, 1))], axis=0
        )
        # Add random noise to the labels - important trick!
        labels += 0.05 * tf.random.uniform(tf.shape(labels))

        # Train the discriminator
        with tf.GradientTape() as tape:
            predictions = self.discriminator(combined_images)
            d_loss = self.loss_fn(labels, predictions)
        grads = tape.gradient(d_loss, self.discriminator.trainable_weights)
        self.d_optimizer.apply_gradients(
            zip(grads, self.discriminator.trainable_weights)
        )

        # Sample random points in the latent space
        random_latent_vectors = tf.random.normal(shape=(batch_size, self.latent_dim))

        # Assemble labels that say "all real images"
        misleading_labels = tf.zeros((batch_size, 1))

        # Train the generator (note that we should *not* update the weights
        # of the discriminator)!
        with tf.GradientTape() as tape:
            predictions = self.discriminator(self.generator(random_latent_vectors))
            g_loss = self.loss_fn(misleading_labels, predictions)
        grads = tape.gradient(g_loss, self.generator.trainable_weights)
        self.g_optimizer.apply_gradients(zip(grads, self.generator.trainable_weights))
        return {"d_loss": d_loss, "g_loss": g_loss}
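The label convention in `train_step` is easy to misread: because `generated_images` come first in `tf.concat`, label 1 means "generated" and label 0 means "real", so the all-zero `misleading_labels` ask the discriminator to call the generator's output real. Below is a small plain-Python sketch of that convention; the helper `bce_from_logit` and the example logits (4.0 and -4.0) are hypothetical and not part of the tutorial.

```python
import math
import random


def bce_from_logit(label, logit):
    """Binary cross-entropy for one example, computed from a raw score."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


batch_size = 4
random.seed(0)

# Discriminator targets: 1 = generated, 0 = real, plus uniform noise in [0, 0.05).
labels = [1.0] * batch_size + [0.0] * batch_size
noisy_labels = [label + 0.05 * random.random() for label in labels]

# Generator targets: all zeros, i.e. "real" under this convention.
misleading_label = 0.0
loss_when_caught = bce_from_logit(misleading_label, 4.0)   # D is confident: generated
loss_when_fooled = bce_from_logit(misleading_label, -4.0)  # D is confident: real

print(noisy_labels)
print(loss_when_caught, loss_when_fooled)
```

The generator loss is large when the discriminator confidently flags the sample as generated and small when it is fooled, which is the direction the generator's gradient step pushes in.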

In [6]:
print(tf.__version__)

2.2.0


In [7]:
# Prepare the dataset. We use both the training & test MNIST digits.
batch_size = 64
(x_train, _), (x_test, _) = keras.datasets.mnist.load_data()
all_digits = np.concatenate([x_train, x_test])
all_digits = all_digits.astype("float32") / 255.0
all_digits = np.reshape(all_digits, (-1, 28, 28, 1))
dataset = tf.data.Dataset.from_tensor_slices(all_digits)
dataset = dataset.shuffle(buffer_size=1024).batch(batch_size)

gan = GAN(discriminator=discriminator, generator=generator, latent_dim=latent_dim)
gan.compile(
    d_optimizer=keras.optimizers.Adam(learning_rate=0.0003),
    g_optimizer=keras.optimizers.Adam(learning_rate=0.0003),
    loss_fn=keras.losses.BinaryCrossentropy(from_logits=True),
)

# To limit the execution time, we only train on 100 batches. You can train on
# the entire dataset. You will need about 20 epochs to get nice results.
gan.fit(dataset.take(100), epochs=1)



<tensorflow.python.keras.callbacks.History at 0x7fb74e7317f0>

OK. Let me try adjusting this code to be trained on Fashion MNIST. The dimensions of the images are 28x28 and the images are grayscale, so all I need to change is the dataset being loaded:

In [8]:
# Prepare the dataset. We use both the training & test Fashion MNIST images.
batch_size = 64
(x_train, _), (x_test, _) = keras.datasets.fashion_mnist.load_data()
all_digits = np.concatenate([x_train, x_test])
all_digits = all_digits.astype("float32") / 255.0
all_digits = np.reshape(all_digits, (-1, 28, 28, 1))
dataset = tf.data.Dataset.from_tensor_slices(all_digits)
dataset = dataset.shuffle(buffer_size=1024).batch(batch_size)

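# Note: `discriminator` and `generator` are the same objects that were trained on
# MNIST above, so training continues from those weights rather than from scratch.
gan = GAN(discriminator=discriminator, generator=generator, latent_dim=latent_dim)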
gan.compile(
    d_optimizer=keras.optimizers.Adam(learning_rate=0.0003),
    g_optimizer=keras.optimizers.Adam(learning_rate=0.0003),
    loss_fn=keras.losses.BinaryCrossentropy(from_logits=True),
)

# To limit the execution time, we only train on 100 batches. You can train on
# the entire dataset. You will need about 20 epochs to get nice results.
gan.fit(dataset.take(100), epochs=1)



<tensorflow.python.keras.callbacks.History at 0x7fb70d3bc0f0>

## Conclusion

Implementing the original paper from scratch by myself was hard. I definitely need more practice implementing papers. Copy/pasting code from a TensorFlow tutorial will not make me better.

The thing that surprised me was that I was wrong about the loss functions; in the implementation, both the generator and the discriminator use binary cross-entropy. This made sense for the discriminator, but I had to look up why we use binary cross-entropy for the generator. I found a good answer here: https://stats.stackexchange.com/questions/242907/why-use-binary-cross-entropy-for-generator-in-adversarial-networks
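For reference, the connection can be written out. The first line is the value function from the original GAN paper. With label 1 for real samples and label 0 for generated ones, the discriminator's binary cross-entropy is the negative of that value function, so minimizing it maximizes $V$. The last line is the non-saturating generator loss that the paper recommends in practice, which is binary cross-entropy on generated samples with the "real" label. The symbols $L_D$ and $L_G$ are shorthand introduced here, not notation from the paper, and the TensorFlow tutorial flips the label convention (1 for generated, 0 for real), which gives the same losses up to relabeling.

```latex
\min_G \max_D V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]

L_D = -\mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big]
      - \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]
    = -V(D, G)

L_G = -\mathbb{E}_{z \sim p_z}\big[\log D(G(z))\big]
```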

Overall, I need to implement more papers!