# Exercise - GANs for learning distributions

1. You are curious if GANs are able to model just *any* multivariate normal distribution, so you decide to try a new dimensionality, mean, and standard deviation, and otherwise replicate the study from the slides.
1. As you mutter "very well, it might work, but we also start with a standard normal - in reality we are never so close to the true distribution to start off!", you decide to draw the noise from another distribution (such as the uniform distribution). Are you still able to model the true distribution?

**Notes**: In this notebook, I have made a few choices (listed below). You **do not** have to follow those - you can use other distributions, dimensionalities, means, and standard deviations, if you like.
1. The distribution to model is now the 20D normal distribution with mean $-5$ and standard deviation $2$.
1. The noise distribution for **2** is now the 20D uniform distribution (in the range $[0,1)$).

**Hint**: Consider looking at https://www.tensorflow.org/tutorials/generative/dcgan, as they go through some of the same steps.

**See slides for more details!**

# Setup

You do not have to change any of the setup code, but you are of course welcome to.

Note that we use 1 to indicate "real" data and 0 to indicate "fake" data for the discriminator.

In the loss of the generator, this is "reversed", i.e. fake data is labelled 1. This is because the generator needs to learn to create fake data that the discriminator believes is real.
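
To make the label convention concrete, here is a minimal NumPy sketch of binary cross-entropy computed on logits. It is a re-implementation for illustration only, not the `tf.keras.losses.BinaryCrossentropy` object used in the setup code; the helper name `bce_from_logits` and the example logits are made up.

```python
import numpy as np

def bce_from_logits(labels, logits):
    """Mean binary cross-entropy computed directly on logits (illustrative helper)."""
    labels = np.asarray(labels, dtype=float)
    logits = np.asarray(logits, dtype=float)
    # Numerically stable form: max(z, 0) - z * y + log(1 + exp(-|z|))
    per_example = np.maximum(logits, 0) - logits * labels + np.log1p(np.exp(-np.abs(logits)))
    return float(per_example.mean())

# Hypothetical discriminator logits: positive means "looks real", negative means "looks fake".
real_logits = np.array([2.0, 3.0])    # real data, correctly scored as real
fake_logits = np.array([-3.0, -2.0])  # generated data, correctly scored as fake

# Discriminator: real samples are labelled 1, generated samples are labelled 0.
disc_loss = (bce_from_logits(np.ones_like(real_logits), real_logits)
             + bce_from_logits(np.zeros_like(fake_logits), fake_logits))

# Generator: the same generated samples are labelled 1, because it wants them to pass as real.
gen_loss = bce_from_logits(np.ones_like(fake_logits), fake_logits)

print(f"discriminator loss: {disc_loss:.3f}")
print(f"generator loss:     {gen_loss:.3f}")
```

With these logits the discriminator is winning, so its loss is small while the generator's loss is large. Flipping the sign of `fake_logits` reverses the picture.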

In [1]:
import numpy as np
import pandas as pd
import tensorflow as tf

from matplotlib import pyplot as plt

In [None]:
cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=True)

def discriminator_loss(real_output, fake_output):
    real_loss = cross_entropy(tf.ones_like(real_output), real_output)
    fake_loss = cross_entropy(tf.zeros_like(fake_output), fake_output)
    total_loss = real_loss + fake_loss
    return total_loss

def generator_loss(fake_output):
    return cross_entropy(tf.ones_like(fake_output), fake_output)

def get_stats():
    noise = tf.random.normal([1000, 20]) # standard normal
    real = tf.random.normal([1000, 20], mean=-5, stddev=2) # to non-standard normal
    
    fake = generator.predict(noise)
    
    discr_real_pred = tf.nn.sigmoid(discriminator.predict(real)).numpy()
    discr_fake_pred = tf.nn.sigmoid(discriminator.predict(fake)).numpy()
    
    acc_real = np.mean(discr_real_pred >= 0.5)
    acc_fake = np.mean(discr_fake_pred < 0.5)
                                    
    return np.mean(fake), np.std(fake), acc_real, acc_fake
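
The accuracy computation in `get_stats` thresholds the sigmoid of the discriminator's logits at 0.5, which is equivalent to thresholding the raw logits at 0. A small NumPy check with made-up logits (the `sigmoid` helper is illustrative and not part of the notebook):

```python
import numpy as np

def sigmoid(z):
    """Logistic function, mapping logits to probabilities."""
    return 1.0 / (1.0 + np.exp(-z))

logits = np.array([-2.0, -0.1, 0.0, 0.3, 4.0])  # made-up discriminator outputs

pred_real_from_prob = sigmoid(logits) >= 0.5
pred_real_from_logit = logits >= 0.0

# Both rules classify every example identically.
print(pred_real_from_prob)
print(pred_real_from_logit)

# If all five examples were real, the accuracy on real data is the fraction predicted "real".
acc_real = np.mean(pred_real_from_prob)
print(acc_real)
```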

# Exercise 1

You are curious if GANs are able to model just *any* multivariate normal distribution, so you decide to try a new dimensionality, mean, and standard deviation, and otherwise replicate the study from the slides.
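
As a sanity check on what the generator has to learn here, note that standard normal noise can be turned into the target distribution by a simple scale and shift. The NumPy sketch below uses a hand-picked affine map in place of a trained generator; it is an illustration under the notebook's default choices (20D, mean $-5$, standard deviation $2$), not a solution to the exercise.

```python
import numpy as np

rng = np.random.default_rng(0)
noise = rng.normal(size=(1000, 20))  # standard normal noise, as in the setup code

# A hand-picked affine "generator": scale by the target std, shift by the target mean.
fake = 2.0 * noise - 5.0

print(f"mean={fake.mean():.2f}, std={fake.std():.2f}")
```

This suggests that even a small generator network may have enough capacity for this exercise, although whether GAN training actually finds such a map is exactly what the experiment tests.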

Let us start by defining the generator and discriminator, as well as their optimizers.

In [None]:
# CODE HERE

Let us proceed by defining the training-step function we want to use.

In [None]:
@tf.function
def train_step():
    noise = tf.random.normal([32, 20]) # standard normal
    real = tf.random.normal([32, 20], mean=-5, stddev=2) # must match the real distribution used in get_stats()

    with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
        generated = generator(noise, training=True)

        real_output = discriminator(real, training=True)
        fake_output = discriminator(generated, training=True)

        gen_loss = generator_loss(fake_output)
        disc_loss = discriminator_loss(real_output, fake_output)

    gradients_of_generator = gen_tape.gradient(gen_loss, generator.trainable_variables)
    gradients_of_discriminator = disc_tape.gradient(disc_loss, discriminator.trainable_variables)

    generator_optimizer.apply_gradients(zip(gradients_of_generator, generator.trainable_variables))
    discriminator_optimizer.apply_gradients(zip(gradients_of_discriminator, discriminator.trainable_variables))

Now, let us train the model!

In [None]:
results = [('Before training', *get_stats())]

for epoch in range(1, 20 + 1):
    print(f'Starting epoch: {epoch}.')
    
    for _ in range(5000): # steps per epoch
        train_step()

    results.append((epoch, *get_stats()))

results_df = pd.DataFrame(results, columns=['Epoch', 'Mean', 'Std. Dev.', 'Acc., real', 'Acc., fake'])

Finally, let us create plots like in the slides to check that everything worked.

In [None]:
plt.plot(results_df['Mean'], label='Generator mean')
plt.plot(results_df['Std. Dev.'], label='Generator std. dev.')
plt.axhline(-5, color='r', label='True mean')
plt.axhline(2, color='g', label='True std. dev.')
plt.grid()
plt.legend()
plt.xlabel('Epoch')
plt.ylabel('Value')
plt.show()

In [None]:
plt.plot(results_df['Acc., real'], label='Accuracy (real data)')
plt.plot(results_df['Acc., fake'], label='Accuracy (fake data)')
plt.plot(results_df[['Acc., real', 'Acc., fake']].mean(axis=1), label='Accuracy (overall)')
plt.axhline(0.5, color='r', label='Nash equilibrium')
plt.grid()
plt.legend()
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.show()
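
The 0.5 line in the accuracy plot reflects that once the generator's distribution matches the real one, even the best possible discriminator cannot do better than guessing. As a rough one-dimensional illustration (a simplifying assumption; the exercise itself is 20D), the sketch below uses the theoretically optimal discriminator $D^*(x) = p_\text{real}(x) / (p_\text{real}(x) + p_\text{fake}(x))$ in place of a trained network. The helper names and the "bad" generator mean of 0 are made up.

```python
import numpy as np

def normal_pdf(x, mean, std):
    """Density of a 1D normal distribution."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def optimal_accuracy(fake_mean, fake_std, real_mean=-5.0, real_std=2.0, n=10_000, seed=0):
    """Overall accuracy of the optimal discriminator on n real and n fake samples."""
    rng = np.random.default_rng(seed)
    real = rng.normal(real_mean, real_std, size=n)
    fake = rng.normal(fake_mean, fake_std, size=n)

    def d_star(x):
        p_real = normal_pdf(x, real_mean, real_std)
        p_fake = normal_pdf(x, fake_mean, fake_std)
        return p_real / (p_real + p_fake)

    acc_real = np.mean(d_star(real) >= 0.5)  # same thresholds as get_stats
    acc_fake = np.mean(d_star(fake) < 0.5)
    return (acc_real + acc_fake) / 2

acc_untrained = optimal_accuracy(fake_mean=0.0, fake_std=2.0)  # generator far from the target
acc_matched = optimal_accuracy(fake_mean=-5.0, fake_std=2.0)   # generator equals the target

print(f"far from target: {acc_untrained:.3f}")
print(f"matched target:  {acc_matched:.3f}")
```

In the matched case $D^*$ equals 0.5 everywhere, so the `>=` tie-break labels every sample as real: the per-class accuracies are degenerate, but the overall accuracy is 0.5. A trained discriminator will typically fluctuate around that value rather than sit exactly on it.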

# Exercise 2

As you mutter "very well, it might work, but we also start with a standard normal - in reality we are never so close to the true distribution to start off!", you decide to draw the noise from another distribution (such as the uniform distribution). Are you still able to model the true distribution?

To solve this exercise, we must:
1. Create a new generator and discriminator (including optimizers).
1. Create a new get_stats function to use another noise distribution.
1. Create a new train_step function to use the new noise distribution.
1. And finally train!
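
Before swapping the noise source, it can help to see how far uniform noise on $[0, 1)$ is from the target. The NumPy sketch below compares empirical moments; the sample size of 1000 mirrors `get_stats`, and the variable names are made up. In TensorFlow, the analogous sampler would be `tf.random.uniform`, whose default range for floating-point output is also $[0, 1)$.

```python
import numpy as np

rng = np.random.default_rng(0)

normal_noise = rng.normal(0.0, 1.0, size=(1000, 20))    # noise used in Exercise 1
uniform_noise = rng.uniform(0.0, 1.0, size=(1000, 20))  # candidate noise for Exercise 2
target = rng.normal(-5.0, 2.0, size=(1000, 20))         # distribution to be modelled

for name, sample in [("standard normal", normal_noise),
                     ("uniform [0, 1)", uniform_noise),
                     ("target", target)]:
    print(f"{name:>16}: mean={sample.mean():+.2f}, std={sample.std():.2f}")

# Theoretical moments of U[0, 1): mean 1/2 and standard deviation sqrt(1/12), roughly 0.29.
uniform_std_theory = np.sqrt(1.0 / 12.0)
```

Unlike the standard normal, uniform noise is bounded and has no tails, so the generator has to learn a nonlinear map rather than only a shift and rescale.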

Step 1: New generator and discriminator.

In [None]:
# CODE HERE

Step 2: New get_stats.

In [None]:
# CODE HERE

Step 3: New train_step.

In [None]:
@tf.function
def train_step():
    noise = ??
    real = ??

    with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
        generated = generator(noise, training=True)

        real_output = discriminator(real, training=True)
        fake_output = discriminator(generated, training=True)

        gen_loss = generator_loss(fake_output)
        disc_loss = discriminator_loss(real_output, fake_output)

    gradients_of_generator = gen_tape.gradient(gen_loss, generator.trainable_variables)
    gradients_of_discriminator = disc_tape.gradient(disc_loss, discriminator.trainable_variables)

    generator_optimizer.apply_gradients(zip(gradients_of_generator, generator.trainable_variables))
    discriminator_optimizer.apply_gradients(zip(gradients_of_discriminator, discriminator.trainable_variables))

Step 4: Let us train.

In [None]:
results = [('Before training', *get_stats())]

for epoch in range(1, 20 + 1):
    print(f'Starting epoch: {epoch}.')
    
    for _ in range(5000): # steps per epoch
        train_step()

    results.append((epoch, *get_stats()))

results_df = pd.DataFrame(results, columns=['Epoch', 'Mean', 'Std. Dev.', 'Acc., real', 'Acc., fake'])

And finally, let us do the plots!

In [None]:
plt.plot(results_df['Mean'], label='Generator mean')
plt.plot(results_df['Std. Dev.'], label='Generator std. dev.')
plt.axhline(-5, color='r', label='True mean')
plt.axhline(2, color='g', label='True std. dev.')
plt.grid()
plt.legend()
plt.xlabel('Epoch')
plt.ylabel('Value')
plt.show()

In [None]:
plt.plot(results_df['Acc., real'], label='Accuracy (real data)')
plt.plot(results_df['Acc., fake'], label='Accuracy (fake data)')
plt.plot(results_df[['Acc., real', 'Acc., fake']].mean(axis=1), label='Accuracy (overall)')
plt.axhline(0.5, color='r', label='Nash equilibrium')
plt.grid()
plt.legend()
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.show()