<a href="https://colab.research.google.com/github/mneunhoe/ds3_gan/blob/main/01_ds3_gan_complete.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# $\text{DS}^3$: Generative Adversarial Nets
(Marcel Neunhoeffer, LMU Munich)

This is the first workbook for the Data Science Summer School course "Generative Adversarial Nets". 

You can find the slides and the following workbooks for this course at: https://github.com/mneunhoe/ds3_gan


In this workbook we will implement a simple Generative Adversarial Net from scratch. Along the way we will learn a lot about basic and advanced design choices of GANs.



Before we get started, we install and load some helpful packages into our R environment. Installing `torch` can take some time (around 6 minutes for me).

In [None]:
p_needed <- c("viridis", "skimr", "torch", "RGAN")
packages <- rownames(installed.packages())
p_to_install <- p_needed[!(p_needed %in% packages)]
if (length(p_to_install) > 0) {
  install.packages(p_to_install)
}
sapply(p_needed, require, character.only = TRUE)

## A GAN from scratch


### Setting up some toy data

First, we generate some simple two-dimensional toy data.

In [None]:
# We set up some toy data where we know the true data generating process.

# Using a seed we will all get the same "random" data
set.seed(220803)

# The number of observations
n <- 1000

# x is drawn from a standard normal distribution
x <- c(stats::rnorm(n))

# y is x^2 plus some random noise from a normal distribution with sd = 0.3
y <- c(stats::rnorm(n, x^2, 0.3))

# x and y put together is our toydata
toydata <- cbind(x, y)

# We later need the data as a torch tensor
toydata_tensor <- torch::torch_tensor(toydata)

# Get an overview of the dataset

# With a skim summary
skimr::skim(toydata)

# With a simple scatterplot
plot(toydata[,1], toydata[,2],
      main = "Toydata", xlab = "x", ylab = "y", las = 1,
      col = viridis::viridis(1, 0.6), 
      pch = 19, bty = "n")
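
Because we know the data generating process, we can check the simulated data against it. The following is a minimal sketch in base R, assuming the same seed and settings as above; the names `fit`, `slope_hat`, and `noise_sd_hat` are introduced here only for illustration. Regressing y on x^2 without an intercept should recover a coefficient close to 1 and a residual standard deviation close to 0.3.

```r
# Recreate the toy data with the same seed and settings as above
set.seed(220803)
n <- 1000
x <- stats::rnorm(n)
y <- stats::rnorm(n, x^2, 0.3)

# Regress y on x^2 without an intercept (the true model has none)
fit <- stats::lm(y ~ 0 + I(x^2))

# The coefficient on x^2 should be close to 1
slope_hat <- unname(stats::coef(fit))

# The residual standard deviation should be close to 0.3
noise_sd_hat <- stats::sigma(fit)

print(c(slope = slope_hat, noise_sd = noise_sd_hat))
```

A GAN never sees these two numbers directly; it only sees samples. They are still a handy yardstick for judging synthetic data later on.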

### Setting up the neural networks
As we have seen, a GAN consists of two neural networks:
- A Generator that gets random noise as input and produces fake samples as output.
- A Discriminator that gets real and fake samples as input and tries to classify which examples are real and which are fake.

In this example we want to generate fake samples from our toydata. Before we can observe the adversarial game between Generator and Discriminator we need to set up the neural networks.

Here we use the `torch` library to set up the networks.

In [None]:
Generator <- torch::nn_module(
  initialize = function(noise_dim,
                        data_dim,
                        hidden_units = list(128, 128),
                        dropout_rate = 0
                        ) {
                        # Initialize an empty nn_sequential module
                        self$seq <- torch::nn_sequential()

                        # For the hidden layers we need to keep track of our input and output dimensions. The first input will be our noise vector, therefore, it will be noise_dim
                        dim <- noise_dim

                        # i will be a simple counter to keep track of our network depth
                        i <- 1

                        # Now we loop over the list of hidden units and add the hidden layers to the nn_sequential module
                        for (neurons in hidden_units) {
                          # First, we add a fully connected linear layer of the respective size.
                          self$seq$add_module(module =  torch::nn_linear(dim, neurons),
                                              name = paste0("Linear_", i))

                          # Add a ReLU activation
                          self$seq$add_module(module = torch::nn_relu(),
                                              name = paste0("Activation_", i))
                          # And then a Dropout layer.
                          self$seq$add_module(module = torch::nn_dropout(dropout_rate),
                                              name = paste0("Dropout_", i))
                          # Now we update our dim for the next hidden layer.
                          dim <- neurons
                          # Update the counter
                          i <- i + 1
                        }
                        # Finally, we add the output layer. The output dimension must be the same as our data dimension (data_dim).
                        self$seq$add_module(module = torch::nn_linear(dim, data_dim),
                                            name = "Output")
                        },
  forward = function(input) {
    input <- self$seq(input)
    input
  }
)

In [None]:
g_net <- Generator(noise_dim = 2, data_dim = 2, hidden_units = list(128, 128))
g_net

In [None]:
Discriminator <- torch::nn_module(
  initialize = function(data_dim,
                        hidden_units = list(128, 128),
                        dropout_rate = 0,
                        sigmoid = TRUE) {
    # Initialize an empty nn_sequential module
    self$seq <- torch::nn_sequential()

    # For the hidden layers we need to keep track of our input and output dimensions. The first input will be our real and fake data examples, therefore, it will be data_dim
    dim <- data_dim

    # i will be a simple counter to keep track of our network depth
    i <- 1

    # Now we loop over the list of hidden units and add the hidden layers to the nn_sequential module
    for (neurons in hidden_units) {
      # We start with a fully connected linear layer
      self$seq$add_module(module = torch::nn_linear(dim, neurons),
                          name = paste0("Linear_", i))
      # Add a ReLU activation
      self$seq$add_module(module = torch::nn_relu(),
                          name = paste0("Activation_", i))
      # And a Dropout layer
      self$seq$add_module(module = torch::nn_dropout(dropout_rate),
                          name = paste0("Dropout_", i))
      # Update the input dimension to the next layer
      dim <- neurons
      # Update the counter
      i <- i + 1
    }
    # Add an output layer to the net. Since it will be one score for each example we only need a dimension of 1.
    self$seq$add_module(module = torch::nn_linear(dim, 1),
                        name = "Output")
    if (sigmoid) {
      self$seq$add_module(module = torch::nn_sigmoid(),
                          name = "Sigmoid_Output")
    }

  },
  forward = function(input) {
    data <- self$seq(input)
    data
  }
)

In [None]:
d_net <- Discriminator(data_dim = 2, hidden_units = list(128, 128))
d_net
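
To make the roles of the two networks concrete, here is a minimal sketch that pushes a small batch of noise through a Generator and then a Discriminator and inspects the shapes. It uses small stand-in networks built directly with `torch::nn_sequential()` so that it runs on its own. The names `g_demo` and `d_demo` and the hidden size of 16 are assumptions for illustration, not the networks defined above.

```r
library(torch)
torch_manual_seed(1)

# Stand-in Generator: 2-dimensional noise -> 2-dimensional fake data
g_demo <- nn_sequential(
  nn_linear(2, 16),
  nn_relu(),
  nn_linear(16, 2)
)

# Stand-in Discriminator: 2-dimensional data -> one score in (0, 1)
d_demo <- nn_sequential(
  nn_linear(2, 16),
  nn_relu(),
  nn_linear(16, 1),
  nn_sigmoid()
)

# A batch of 5 noise vectors
z <- torch_randn(c(5, 2))

# Forward passes without tracking gradients
fake <- with_no_grad(g_demo(z))
scores <- with_no_grad(d_demo(fake))

fake_dim <- dim(fake)      # one row per noise vector, one column per data dimension
scores_dim <- dim(scores)  # one score per example
score_values <- as.numeric(as_array(scores))

print(fake_dim)
print(scores_dim)
print(score_values)
```

The sigmoid at the end of the Discriminator is what keeps every score strictly between 0 and 1, so it can be read as the estimated probability that an example is real.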

### Setting up optimizers

To train the neural networks we need one optimizer for each network. Here we use Adam with a learning rate of 0.0001 for both.

In [None]:
g_optim <- torch::optim_adam(g_net$parameters, lr = 0.0001)

d_optim <- torch::optim_adam(d_net$parameters, lr = 0.0001)
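
Both optimizers are used with the same three-call pattern: `zero_grad()`, `backward()`, `step()`. As a minimal sketch of that pattern on a problem where we know the answer, the following minimizes (w - 3)^2 for a single parameter `w` with Adam. The toy objective, the learning rate of 0.1, and the 500 steps are assumptions for illustration only.

```r
library(torch)

# A single trainable parameter, started at 0
w <- torch_tensor(0, requires_grad = TRUE)

# Adam takes a list of parameters, just like g_net$parameters
opt <- optim_adam(list(w), lr = 0.1)

for (step in 1:500) {
  # Clear gradients left over from the previous step
  opt$zero_grad()
  # Compute the loss for the current parameter value
  loss <- ((w - 3)^2)$sum()
  # Backpropagate to get d(loss)/d(w)
  loss$backward()
  # Let Adam update w using that gradient
  opt$step()
}

w_hat <- w$item()
print(w_hat)
```

After the loop `w_hat` should sit close to the minimizer 3. In the GAN, the same pattern is applied twice per step, once with each network's own optimizer.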

### Walking through one update step
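
The update step below follows the original GAN minimax game. As a reference, in the usual notation ($D$ is the Discriminator, $G$ the Generator, $x$ a real example, $z$ a noise vector), the value function is

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z}\left[\log\left(1 - D(G(z))\right)\right]$$

On a batch of size $m$, the two losses computed in the code are (the labels $L_D$ and $L_G$ are introduced here for convenience; $z_i'$ denotes the fresh noise sampled for the Generator update):

$$L_D = -\frac{1}{m} \sum_{i=1}^{m} \left[\log D(x_i) + \log\left(1 - D(G(z_i))\right)\right]$$

$$L_G = \frac{1}{m} \sum_{i=1}^{m} \log\left(1 - D(G(z_i'))\right)$$

Minimizing $L_D$ is the same as maximizing the batch estimate of $V$ with respect to $D$, and minimizing $L_G$ minimizes the only term of $V$ that depends on $G$.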

In [None]:
# Update the Discriminator

# Get a batch of real data
real_data <- toydata_tensor[sample(nrow(toydata_tensor), size = 64)]

# Sample new noise
z <- torch::torch_randn(c(64, 2))

# Generate a batch of fake data
fake_data <- torch::with_no_grad(g_net(input = z))

# Calculate the discriminator scores for real and fake data
dis_real <- d_net(real_data)
dis_fake <- d_net(fake_data)

# Calculate the loss
d_loss <- torch::torch_log(dis_real) + torch::torch_log(1 - dis_fake)
d_loss <- -d_loss$mean()

# Take one update step for the parameters of the Discriminator
d_optim$zero_grad()
d_loss$backward()
d_optim$step()

# Update the Generator  

# Sample new noise
z <- torch::torch_randn(c(64, 2))

# Generate a batch of fake data
fake_data <- g_net(z)

# Calculate the discriminator score for fake data
dis_fake <- d_net(fake_data)

# Calculate the loss
g_loss <- torch::torch_log(1 - dis_fake)
g_loss <- g_loss$mean()

# Take one update step for the parameters of the Generator
g_optim$zero_grad()
g_loss$backward()
g_optim$step()
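
The hand-written Discriminator loss above is the binary cross-entropy with label 1 for real and label 0 for fake examples. The following sketch checks this numerically with `torch::nnf_binary_cross_entropy()`. Random stand-in scores replace the Discriminator outputs so that the block runs on its own; `manual_value` and `bce_value` are names introduced for illustration.

```r
library(torch)
torch_manual_seed(1)

# Stand-in Discriminator scores in (0, 1) for 64 real and 64 fake examples
dis_real <- torch_sigmoid(torch_randn(c(64, 1)))
dis_fake <- torch_sigmoid(torch_randn(c(64, 1)))

# Loss as written in the update step
manual <- -(torch_log(dis_real) + torch_log(1 - dis_fake))$mean()

# Same loss as two binary cross-entropy terms:
# real examples with target 1, fake examples with target 0
bce <- nnf_binary_cross_entropy(dis_real, torch_ones_like(dis_real)) +
  nnf_binary_cross_entropy(dis_fake, torch_zeros_like(dis_fake))

manual_value <- manual$item()
bce_value <- bce$item()

print(c(manual = manual_value, bce = bce_value))
```

The two numbers should agree up to floating point error, because both batches have the same size and both use the mean as reduction.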


### A look at the synthetic data
We can use the Generator to produce synthetic data and look at it. After just one training step it will not yet look anything like the toydata that we want to copy. We will use the same `fixed_z` later on to track training.

In [None]:
fixed_z <- torch::torch_randn(c(1000, 2))

synthetic_data <- torch::as_array(g_net(fixed_z))

# With a simple scatterplot
plot(synthetic_data[,1], synthetic_data[,2],
      main = "Synthetic Data", xlab = "x", ylab = "y", las = 1,
      col = viridis::viridis(2, 0.6)[2], 
      pch = 19, bty = "n")
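
When we only want to look at samples, no gradients are needed. Here is a minimal sketch, again with a small stand-in network (`g_demo` and `fixed_z_demo` are hypothetical names, not the objects above). Wrapping the forward pass in `torch::with_no_grad()`, as the Discriminator update already does, gives a tensor that is not attached to the computation graph.

```r
library(torch)
torch_manual_seed(1)

# Stand-in Generator and a fixed batch of noise
g_demo <- nn_sequential(nn_linear(2, 16), nn_relu(), nn_linear(16, 2))
fixed_z_demo <- torch_randn(c(10, 2))

# Regular forward pass: the output is part of the computation graph
tracked <- g_demo(fixed_z_demo)

# Forward pass without gradient tracking
untracked <- with_no_grad(g_demo(fixed_z_demo))

tracked_flag <- tracked$requires_grad
untracked_flag <- untracked$requires_grad

# Convert the untracked samples to an R array for plotting
samples <- as_array(untracked)

print(c(tracked = tracked_flag, untracked = untracked_flag))
print(dim(samples))
```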


### The Training loop

Now we repeat this update step many times (here 3000 times). Every 100 steps we also plot synthetic data generated from `fixed_z` to track the progress of our training.

In [None]:
# Objects to track the losses
d_losses <- NULL
g_losses <- NULL

In [None]:

for (i in 1:3000) {
  # Update the Discriminator
  real_data <- toydata_tensor[sample(nrow(toydata_tensor), size = 64)]
  z <- torch::torch_randn(c(64, 2))

  fake_data <- torch::with_no_grad(g_net(input = z))

  dis_real <- d_net(real_data)
  dis_fake <- d_net(fake_data)

  d_loss <- torch::torch_log(dis_real) + torch::torch_log(1 - dis_fake)
  d_loss <- -d_loss$mean()

  d_losses <- c(d_losses, torch::as_array(d_loss$detach()))

  d_optim$zero_grad()
  d_loss$backward()
  d_optim$step()

  # Update the Generator
  z <- torch::torch_randn(c(64, 2))
  fake_data <- g_net(z)

  dis_fake <- d_net(fake_data)

  g_loss <- torch::torch_log(1 - dis_fake)
  g_loss <- g_loss$mean()

  g_losses <- c(g_losses, torch::as_array(g_loss$detach()))

  g_optim$zero_grad()
  g_loss$backward()
  g_optim$step()

  # Every 100 steps, plot synthetic data generated from the fixed noise
  if (i %% 100 == 0) {
    synthetic_data <- torch::as_array(g_net(fixed_z))

    plot(synthetic_data[,1], synthetic_data[,2],
         main = paste0("Synthetic Data after step ", i), xlab = "x", ylab = "y", las = 1,
         col = viridis::viridis(2, 0.6)[2],
         pch = 19, bty = "n")
  }
}

In [None]:
plot(toydata[,1], toydata[,2],
      main = "Real and Synthetic data", xlab = "x", ylab = "y", las = 1,
      col = viridis::viridis(2, 0.6)[1], 
      pch = 19, bty = "n")
points(synthetic_data[,1], synthetic_data[,2],
      col = viridis::viridis(2, 0.6)[2], pch = 19)
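
A scatterplot comparison can be complemented with a few numbers. The following sketch compares column means, standard deviations, and the coefficient from regressing y on x^2 between two matrices. `summarize_xy()` and `compare_summaries()` are hypothetical helpers, and `synthetic_demo` is a stand-in drawn from the true data generating process so that the block runs without a trained GAN; in the workbook you would pass `synthetic_data` instead.

```r
# Hypothetical helper: summary statistics for a two-column matrix (x, y)
summarize_xy <- function(m) {
  c(mean_x = mean(m[, 1]),
    mean_y = mean(m[, 2]),
    sd_x = stats::sd(m[, 1]),
    sd_y = stats::sd(m[, 2]),
    # Coefficient from regressing y on x^2 without intercept (true value: 1)
    slope_x2 = unname(stats::coef(stats::lm(m[, 2] ~ 0 + I(m[, 1]^2)))))
}

# Hypothetical helper: stack the summaries of both data sets
compare_summaries <- function(real, synthetic) {
  rbind(real = summarize_xy(real), synthetic = summarize_xy(synthetic))
}

# Real toy data, generated as in the workbook
set.seed(220803)
n <- 1000
x <- stats::rnorm(n)
y <- stats::rnorm(n, x^2, 0.3)
toydata <- cbind(x, y)

# Stand-in for GAN output (assumption: drawn from the true process)
x_s <- stats::rnorm(n)
synthetic_demo <- cbind(x = x_s, y = stats::rnorm(n, x_s^2, 0.3))

comparison <- compare_summaries(toydata, synthetic_demo)
print(round(comparison, 2))
```

Large gaps between the two rows point to aspects of the distribution that the Generator has not captured, for example a too-narrow spread in x.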

In [None]:
par(mfrow = c(1, 2))
plot(1:length(d_losses), d_losses, type = "l", lwd = 2, col = viridis::viridis(3)[1], main = "Discriminator Loss", xlab = "Step", ylab = "Loss", bty = "n", las = 1)
plot(1:length(g_losses), g_losses, type = "l", lwd = 2, col = viridis::viridis(3)[1], main = "Generator Loss", xlab = "Step", ylab = "Loss", bty = "n", las = 1)
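
A useful reference point for reading these curves: if the Discriminator cannot tell real from fake and outputs 0.5 for every example, the Discriminator loss as defined above equals 2 log 2 (about 1.386) and the Generator loss equals log 0.5 (about -0.693). The sketch below computes both with the loss formulas from the update step, using constant stand-in scores; `d_loss_ref` and `g_loss_ref` are names introduced for illustration.

```r
# Stand-in scores: an undecided Discriminator outputs 0.5 everywhere
dis_real <- rep(0.5, 64)
dis_fake <- rep(0.5, 64)

# Same formulas as in the update step, written for plain R vectors
d_loss_ref <- -mean(log(dis_real) + log(1 - dis_fake))
g_loss_ref <- mean(log(1 - dis_fake))

print(c(d_loss = d_loss_ref, g_loss = g_loss_ref))
```

Losses hovering near these values are consistent with a balanced game, but they do not prove that the synthetic data is good, so the plots of the samples remain the main check.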


## Using the `RGAN` package 

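To make our lives a bit easier we can use the `RGAN` package. It provides ready-made `Generator` and `Discriminator` modules and a `gan_trainer()` function that runs the training loop for us. This will also make it easy to customize GANs and experiment with the different design choices that we will look at in the next part of the course.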

In [None]:
torch::torch_manual_seed(20220803)

g_net <- RGAN::Generator(noise_dim = 2, data_dim = 2, hidden_units = list(128, 128), dropout_rate = 0)

d_net <- RGAN::Discriminator(data_dim = 2, hidden_units = list(128, 128), dropout_rate = 0, sigmoid = TRUE)

trained_gan <- RGAN::gan_trainer(data = toydata, noise_dim = 2, generator = g_net, discriminator = d_net, plot_progress = TRUE, plot_interval = 100)


We can now use our `trained_gan` to sample synthetic data.

In [None]:
synthetic_data <- RGAN::sample_synthetic_data(trained_gan)

plot(toydata[,1], toydata[,2],
      main = "Real and Synthetic data", xlab = "x", ylab = "y", las = 1,
      col = viridis::viridis(2, 0.6)[1], 
      pch = 19, bty = "n")
points(synthetic_data[,1], synthetic_data[,2],
      col = viridis::viridis(2, 0.6)[2], pch = 19)