# Advanced neural networks

CREDIT: This practical was inspired by [this post on developing a GAN](https://machinelearningmastery.com/how-to-develop-a-generative-adversarial-network-for-an-mnist-handwritten-digits-from-scratch-in-keras/).

## Imports

In [3]:
from numpy import expand_dims
from numpy import zeros
from numpy import ones
from numpy import vstack
from numpy.random import randn
from numpy.random import randint
from keras.datasets.mnist import load_data
from keras.optimizers import Adam
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Reshape
from keras.layers import Flatten
from keras.layers import Conv2D
from keras.layers import Conv2DTranspose
from keras.layers import LeakyReLU
from keras.layers import Dropout
from matplotlib import pyplot

## Introduction

The goal of this practical is to create a Generative Adversarial Network (GAN) that will generate images based on the [MNIST database](https://en.wikipedia.org/wiki/MNIST_database). We will create the discriminator, the generator, and train both models in order to generate images.

## Defining the discriminator model

In this section, we will define the model that differentiates fake samples from real ones.
Here is the architecture we will use for this model:

- A 2D convolution layer with input shape `(28,28,1)`, containing 64 filters of size `(3,3)` with a stride of `(2,2)` and zero padding.
- A LeakyReLU activation function with `alpha=0.2`
- A dropout layer dropping 40% of the input units
- Another 2D convolution layer with the same parameters (its input shape differs from the first layer's, but Keras infers it automatically, so you do not need to specify it)
- Another LeakyReLU activation with the same parameters
- Another dropout with the same parameters
- A Flatten layer
- A Dense layer with sigmoid activation. The output should be a single number (probability of the image being real).
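
To see how the image shrinks through this stack, recall that a convolution with `padding='same'` outputs `ceil(size / stride)` values along each spatial axis. The following is a minimal sketch of that arithmetic in plain Python; the helper `same_conv_out` is a hypothetical name used for illustration and is not part of Keras.

```python
import math

def same_conv_out(size, stride):
    """Spatial output size of a convolution that uses padding='same'."""
    return math.ceil(size / stride)

size = 28       # MNIST images are 28x28
filters = 64    # number of filters in each convolution layer

for layer in ("first Conv2D", "second Conv2D"):
    size = same_conv_out(size, stride=2)
    print(f"{layer}: {size}x{size}x{filters}")

# number of values the Flatten layer passes on to the final Dense layer
flattened = size * size * filters
print("Flatten output:", flattened)
```

LeakyReLU and Dropout do not change the shape, so only the two convolutions and the Flatten layer appear in the trace.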

Complete the cell underneath to implement this architecture.

In [4]:
# define the standalone discriminator model
def define_discriminator(in_shape=(28,28,1)):
	model = Sequential()


	# add layers here
    

	# compile model
	opt = Adam(learning_rate=0.0002, beta_1=0.5) # `learning_rate` replaces the deprecated `lr` argument
	model.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])
	return model

In [5]:
# define the standalone discriminator model
def define_discriminator(in_shape=(28,28,1)):
	model = Sequential()
	model.add(Conv2D(64, (3,3), strides=(2, 2), padding='same', input_shape=in_shape))
	model.add(LeakyReLU(alpha=0.2))
	model.add(Dropout(0.4))
	model.add(Conv2D(64, (3,3), strides=(2, 2), padding='same'))
	model.add(LeakyReLU(alpha=0.2))
	model.add(Dropout(0.4))
	model.add(Flatten())
	model.add(Dense(1, activation='sigmoid'))
	# compile model
	opt = Adam(learning_rate=0.0002, beta_1=0.5) # `learning_rate` replaces the deprecated `lr` argument
	model.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])
	return model

## Defining the generator model

The generator model tries to create images that "fool" the discriminator. Its input is a vector drawn from an arbitrarily defined **latent space** of Gaussian-distributed values, for example with 100 dimensions. Before training, this space has no meaning; it is just "raw material" for the generator to build an image from. Once the model is trained, the latent space acts as a compressed representation of the output space, and only the generator knows how to turn its points into MNIST-like images.
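
As an illustration of what a single latent point looks like, the sketch below draws one 100-dimensional vector from a standard normal distribution using only the Python standard library. The seed and the variable names are assumptions made for this example; in the notebook itself, `numpy.random.randn` (imported above) draws from the same distribution for a whole batch at once.

```python
import random

latent_dim = 100          # dimensionality suggested in the text
rng = random.Random(0)    # seeded only to make this sketch reproducible

# one latent point: latent_dim independent draws from N(0, 1)
latent_point = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]

mean = sum(latent_point) / latent_dim
print(len(latent_point), round(mean, 3))
```

The individual values carry no meaning on their own; the vector only becomes an image once it is passed through the generator.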

The model is defined by the function in the cell underneath. You will note that we do not compile it yet: that is because the loss of the generator model depends on the discriminator, so they need to be connected first. We will be doing this in the next section.

**Question**: Describe this architecture. Explain in your own words what it does, and work out the input and output shapes of each layer.

In [6]:
# define the standalone generator model
def define_generator(latent_dim):
	model = Sequential()
	# foundation for 7x7 image
	n_nodes = 128 * 7 * 7
	model.add(Dense(n_nodes, input_dim=latent_dim))
	model.add(LeakyReLU(alpha=0.2))
	model.add(Reshape((7, 7, 128)))
	# upsample to 14x14
	model.add(Conv2DTranspose(128, (4,4), strides=(2,2), padding='same'))
	model.add(LeakyReLU(alpha=0.2))
	# upsample to 28x28
	model.add(Conv2DTranspose(128, (4,4), strides=(2,2), padding='same'))
	model.add(LeakyReLU(alpha=0.2))
	model.add(Conv2D(1, (7,7), activation='sigmoid', padding='same'))
	return model

*[Your comments here]*
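
Once you have written your own description, you can check the shapes with a short trace. This is a minimal sketch in plain Python, assuming that a `Conv2DTranspose` layer with `padding='same'` multiplies each spatial dimension by its stride; the helper `transpose_same_out` is a hypothetical name, not a Keras function.

```python
def transpose_same_out(size, stride):
    """Spatial output size of a transposed convolution with padding='same'."""
    return size * stride

n_nodes = 128 * 7 * 7     # units in the Dense layer
shape = (7, 7, 128)       # shape after the Reshape layer
assert shape[0] * shape[1] * shape[2] == n_nodes

trace = [shape]
for _ in range(2):        # the two Conv2DTranspose layers, each with 128 filters
    side = transpose_same_out(trace[-1][0], stride=2)
    trace.append((side, side, 128))

# final Conv2D: stride 1 with padding='same' keeps the size, 1 filter gives 1 channel
final_shape = (trace[-1][0], trace[-1][1], 1)
trace.append(final_shape)

for step in trace:
    print(step)
```

The last shape matches the `in_shape` expected by the discriminator, which is what allows the two models to be stacked.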

## Defining the combined generator and discriminator model

In [8]:
# define the combined generator and discriminator model, for updating the generator
def define_gan(g_model, d_model):
	d_model.trainable = False # make weights in the discriminator not trainable, because we want the backpropagation to train the generator model
	model = Sequential() # connect them
	model.add(g_model) # add the generator
	model.add(d_model) # add the discriminator
	opt = Adam(learning_rate=0.0002, beta_1=0.5) # compile model (`learning_rate` replaces the deprecated `lr` argument)
	model.compile(loss='binary_crossentropy', optimizer=opt)
	return model

## Training the network

First, we need to create functions that will generate data. The "real" data will be the MNIST images, while the "fake" data will be the images created by the generator (which will eventually be close to MNIST images). Complete the following cells to create the required functions.

In [None]:
# load and prepare mnist training images
def load_real_samples():
	(trainX, _), (_, _) = load_data() # load mnist dataset
	X = expand_dims(trainX, axis=-1) # expand to 3D, i.e. add a channels dimension
	X = X.astype('float32') # convert from unsigned ints to floats
	X = ... # scale from [0,255] to [0,1]
	return X

In [None]:
# select real samples
def generate_real_samples(dataset, n_samples):
	X = ... # retrieve n_samples images at random
	y = ... # label all the samples as real (1)
	return X, y

In [None]:
# generate points in latent space as input for the generator
def generate_latent_points(latent_dim, n_samples):
	x_input = ... # generate n_samples points of size latent_dim
	return x_input

In [None]:
# use the generator to generate n fake examples, with class labels
def generate_fake_samples(g_model, latent_dim, n_samples):
	x_input = ... # generate points in latent space
	X = ... # predict outputs
	y = ... # label all the samples as fake (0)
	return X, y

In [None]:
# create and save a plot of generated images (reversed grayscale)
def save_plot(examples, epoch, n=10):
	# plot images
	for i in range(n * n):
		pyplot.subplot(n, n, 1 + i) # define subplot
		pyplot.axis('off') # turn off axis
		pyplot.imshow(examples[i, :, :, 0], cmap='gray_r') # plot raw pixel data
	# save plot to file
	filename = 'generated_plot_e%03d.png' % (epoch+1)
	pyplot.savefig(filename)
	pyplot.close()

## Train the model

In this section, we create the function that will let us train the model.

Using the functions created previously, complete the following cells.

In [2]:
# evaluate the discriminator, plot generated images, save generator model
def summarize_performance(epoch, g_model, d_model, dataset, latent_dim, n_samples=100):
	X_real, y_real = ... # prepare real samples
	_, acc_real = ... # evaluate discriminator on real examples
	
	x_fake, y_fake = ... # prepare fake examples
	_, acc_fake = ... # evaluate discriminator on fake examples
	
	print('>Accuracy real: %.0f%%, fake: %.0f%%' % (acc_real*100, acc_fake*100)) # summarize discriminator performance
	
	save_plot(x_fake, epoch) # save plot
	filename = 'generator_model_%03d.h5' % (epoch + 1) # save the generator model to file
	g_model.save(filename)
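
The file name is built from a 0-based epoch index, which is easy to get wrong when reloading a model later. The sketch below isolates the formatting expression; `checkpoint_name` is a hypothetical helper introduced only for this example.

```python
def checkpoint_name(epoch):
    """Same formatting as in summarize_performance: 1-based, zero-padded to 3 digits."""
    return 'generator_model_%03d.h5' % (epoch + 1)

print(checkpoint_name(9))    # epoch index 9 is the 10th epoch
print(checkpoint_name(99))   # epoch index 99 is the 100th epoch
```

So the file loaded at the end of this practical, `generator_model_100.h5`, corresponds to epoch index 99.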

### Important notice on the training

The discriminator is expected to output a low probability of being real for "fake" (generated) images. When we train the generator through the combined model, we want the loss to be high whenever the generator fails to "fool" the discriminator. Therefore, in this step, **we want the label of fake images to be 1**.
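
The effect of labelling fake images as 1 can be checked numerically. The sketch below computes the binary cross-entropy for a single sample in plain Python; the function is a simplified stand-in written for this example (Keras additionally clips the probabilities to avoid `log(0)`).

```python
import math

def binary_crossentropy(y_true, p):
    """Binary cross-entropy for one sample; p is the predicted probability of 'real'."""
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

# p is the discriminator's output for a generated image, the target label is 1
for p in (0.1, 0.5, 0.9):
    loss = binary_crossentropy(1, p)
    print(f"D(G(z)) = {p}: generator loss = {loss:.3f}")
```

The loss shrinks as the discriminator's output for generated images approaches 1, so minimizing it pushes the generator towards images that the discriminator classifies as real.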

In [None]:
# train the generator and discriminator
def train(g_model, d_model, gan_model, dataset, latent_dim, n_epochs=100, n_batch=256):
	bat_per_epo = int(dataset.shape[0] / n_batch)
	half_batch = int(n_batch / 2)
	# manually enumerate epochs
	for i in range(n_epochs):
		# enumerate batches over the training set
		for j in range(bat_per_epo):
			X_real, y_real = ... # get randomly selected 'real' samples
			X_fake, y_fake = ... # generate 'fake' examples
			X, y = ... # create training set for the discriminator by stacking real and fake examples
			d_loss, _ = d_model.train_on_batch(X, y) # update discriminator model weights
			
			X_gan = ... # prepare points in latent space as input for the generator
			y_gan = ... # create inverted labels for the fake samples
			g_loss = gan_model.train_on_batch(X_gan, y_gan) # update the generator via the discriminator's error
			
			# summarize loss on this batch
			print('>%d, %d/%d, d=%.3f, g=%.3f' % (i+1, j+1, bat_per_epo, d_loss, g_loss))
		# evaluate the model performance every 10 epochs
		if (i+1) % 10 == 0:
			summarize_performance(i, g_model, d_model, dataset, latent_dim)
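
To get a feel for the cost of a full run, the batch arithmetic at the top of `train` can be evaluated on its own. This is a minimal sketch assuming the standard MNIST training set of 60,000 images and the default arguments of `train`; `n_train` and `total_updates` are names introduced for this example.

```python
n_train = 60000     # number of images in the MNIST training set
n_batch = 256       # default batch size in train()
n_epochs = 100      # default number of epochs in train()

bat_per_epo = int(n_train / n_batch)     # batches per epoch, rounded down
half_batch = int(n_batch / 2)            # half of a batch
total_updates = bat_per_epo * n_epochs   # generator updates over the whole run

print(bat_per_epo, half_batch, total_updates)
```

Each of these updates also prints one line, so reducing `n_epochs` is a reasonable choice for a first test run.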

## Running the algorithm and displaying the result

In [3]:
# size of the latent space
latent_dim = ...
# create the discriminator
d_model = ...
# create the generator
g_model = ...
# create the gan
gan_model = ...
# load image data
dataset = load_real_samples()
# train model
...

In [None]:
# example of loading the generator model and generating images
from keras.models import load_model

# load model
model = load_model('generator_model_100.h5')
# generate points in latent space
latent_points = generate_latent_points(100, 25)
# generate images
X = model.predict(latent_points)
# plot the result as a 5x5 grid (25 images); the epoch argument only sets the file name
save_plot(X, 100, n=5)