# GAN implementation in Python
| Group members | Registration Number |
|---|:---:|
| Yashvardhan Singh | 20BAI10135 |
| Aditya Lanka | 20BAI10236 |
| Siddhartha Maratha | 20BAI10257 |
| Anshumann Ravichandar | 20BAI10281 |
| Mukul Sharma | 20BAI10222 |

### Setup

In [None]:
!pip install torch matplotlib

### Preparing training data
The training data is composed of pairs (x<sub>1</sub>, x<sub>2</sub>), where x<sub>2</sub> is the sine of x<sub>1</sub> and x<sub>1</sub> is drawn at random from the interval from 0 to 2π \
(0 to 2π so that the sine wave covers one complete period)

In [None]:
import math
import torch
from torch import nn

torch.manual_seed(111)
train_data_length = 1024

train_data = torch.zeros((train_data_length, 2))
train_data[:, 0] = 2 * math.pi * torch.rand(train_data_length)
train_data[:, 1] = torch.sin(train_data[:, 0])
print(train_data)

train_labels = torch.zeros(train_data_length)
train_set = [(train_data[i], train_labels[i]) for i in range(train_data_length)]
print(train_set)

In [None]:
import matplotlib.pyplot as plt

plt.plot(train_data[:, 0], train_data[:, 1], ".")

### Preparing training and testing samples for the models

In [None]:
batch_size = 32 # we will train the network in batches of data
train_loader = torch.utils.data.DataLoader(train_set, batch_size=batch_size, shuffle=True) # Training Data
print('Training Batches:')
for batch_idx, sample in enumerate(train_loader):
  if batch_idx == 2:
    print('...')
    break
  print(batch_idx, sample)

print('Testing Data:')
latent_space_samples = torch.randn(train_data_length, 2) # Testing Data
print(latent_space_samples)

### Implementing the Discriminator
- **nn.Linear(n, m)** - a module that creates a single-layer feed-forward network with `n` inputs and `m` outputs. Mathematically, it applies the affine transformation y = xA<sup>T</sup> + b, where x is the input tensor, A is the weight matrix, b is the bias vector and y is the output.\
<img src='./pytorch_nn_linear_fn.png' alt='Neural network with 3 inputs connected to 2 nodes to generate 2 outputs'></img>
- **nn.ReLU()** - applies the rectified linear unit function, max(0, x), element-wise to an *input tensor*.
- **nn.Dropout(p)** - During training, randomly zeroes some of the elements of the input tensor with probability `p` using samples from a Bernoulli distribution. Furthermore, the outputs are scaled by a factor of `1 / (1- p)` during training.
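- **nn.Sigmoid()** - applies the function 1 / (1 + e<sup>-x</sup>) element-wise to the input tensor, squashing each value into the range (0, 1). In the discriminator below it receives the *single output value* of the last layer and turns it into the probability that a sample is real.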
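
The four building blocks above can be checked on a tiny tensor before they are stacked into a network. The following is a minimal sketch, not part of the original notebook: the input values, the layer sizes (3 inputs, 2 outputs, matching the figure) and the dropout probability of 0.5 are arbitrary choices made for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)

x = torch.tensor([[1.0, -2.0, 3.0]])  # one sample with 3 attributes

# nn.Linear(3, 2) computes y = x @ A.T + b with a learnable weight A and bias b
linear = nn.Linear(3, 2)
y = linear(x)
manual_y = x @ linear.weight.T + linear.bias
print("Linear matches x @ A.T + b:", torch.allclose(y, manual_y))

# nn.ReLU() keeps positive entries and replaces negative entries with 0
relu_out = nn.ReLU()(x)
print("ReLU output:", relu_out)

# nn.Dropout(p) in training mode zeroes entries with probability p and
# scales the surviving entries by 1 / (1 - p); in eval mode it does nothing
dropout = nn.Dropout(p=0.5)
ones = torch.ones(1, 1000)
dropout.train()
dropped = dropout(ones)
print("Values after dropout (training):", dropped.unique())
dropout.eval()
unchanged = dropout(ones)
print("Dropout is a no-op in eval mode:", torch.equal(unchanged, ones))

# nn.Sigmoid() applies 1 / (1 + e^-x) to every element
sigmoid_out = nn.Sigmoid()(x)
manual_sigmoid = 1 / (1 + torch.exp(-x))
print("Sigmoid matches the formula:", torch.allclose(sigmoid_out, manual_sigmoid))
```

Because dropout behaves differently in training and evaluation mode, calling `.eval()` on a model that contains it is the usual way to get deterministic outputs once training is done.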

In [None]:
class Discriminator(nn.Module):
  def __init__(self):
    super().__init__()

    self.model = nn.Sequential(
      nn.Linear(2, 256), # takes a vector with 2 attributes as input and maps it to 256 neurons
      nn.ReLU(), # Applies the rectified linear unit activation function
      nn.Dropout(0.3), # Randomly drops neurons with given probability

      nn.Linear(256, 128), # 256 nodes connected to 128 nodes
      nn.ReLU(),
      nn.Dropout(0.3),

      nn.Linear(128, 64), # 128 nodes connected to 64 nodes
      nn.ReLU(),
      nn.Dropout(0.3),

      nn.Linear(64, 1), # 64 nodes connected to 1 output node
      nn.Sigmoid(),
    )

  def forward(self, x):
    output = self.model(x)
    return output

discriminator = Discriminator()

### Implementing the Generator
The generator receives a tensor with 2 attributes, sampled at random from the latent space, and outputs a tensor with 2 attributes. In other words, it maps random latent points to points in the same 2-dimensional space as the training data, which is how the network *generates* new samples.

In [None]:
class Generator(nn.Module):
    def __init__(self):
        super().__init__()

        self.model = nn.Sequential(
            nn.Linear(2, 16), # takes a vector with 2 attributes as input and maps it to 16 neurons
            nn.ReLU(), # Applies the rectified linear unit activation function

            nn.Linear(16, 32), # 16 nodes connected to 32 nodes
            nn.ReLU(),

            nn.Linear(32, 2), # 32 nodes connected to 2 nodes
        )

    def forward(self, x):
        output = self.model(x)
        return output

generator = Generator()

### Train the model
- **nn.BCELoss()** - creates a criterion that measures the Binary Cross Entropy between the target and the input probabilities. With the default `reduction='mean'`, it returns the loss averaged over all pairs of input and target samples in the batch. (source: [PyTorch Docs > Torch.nn > BCELoss](https://pytorch.org/docs/stable/generated/torch.nn.BCELoss.html))
- **torch.optim.Adam(...)** - implements the Adam algorithm, a variant of gradient descent that adapts the step size of each parameter using running averages of past gradients (momentum) and of their squares (as in RMSProp). These averages are updated on every optimization step, not once per epoch. You can read further in [PyTorch Docs > torch.optim > Adam](https://pytorch.org/docs/stable/generated/torch.optim.Adam.html)
- **torch.optim.Optimizer.zero_grad()** - from ["Why do we need to call zero_grad() in PyTorch?" (Stackoverflow)](https://stackoverflow.com/a/48009142/16084581):
> In `PyTorch`, for every mini-batch during the training phase, we typically want to explicitly set the gradients to zero before starting to do backpropagation (i.e., updating the Weights and biases) because PyTorch accumulates the gradients on subsequent backward passes.\
...\
Because of this, when you start your training loop, ideally you should zero out the gradients so that you do the parameter update correctly. Otherwise, the gradient would be a combination of the old gradient, which you have already used to update your model parameters and the newly-computed gradient. It would therefore point in some other direction than the intended direction towards the optimum.\
...\
The accumulation (i.e., sum) of gradients happens when `.backward()` is called on the loss tensor.
- **torch.Tensor.backward()** - Computes the gradient of current tensor w.r.t. graph leaves (i.e. in the backward pass of the neural network training). The graph is differentiated using the chain rule.
- **torch.optim.Optimizer.step()** - performs a single optimization step (parameter update) using the gradients computed by `backward()`. In this model the update rule is `Adam`.
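
The behaviour described above can be reproduced on a toy example. This is a minimal sketch, not part of the original notebook: the probabilities, the labels, the single parameter `w` and the learning rate of 0.1 are made-up values chosen so the numbers are easy to follow.

```python
import torch
from torch import nn

# Hypothetical discriminator outputs (probabilities) and their labels
probs = torch.tensor([[0.9], [0.2], [0.6]])
labels = torch.tensor([[1.0], [0.0], [1.0]])

# nn.BCELoss() with the default reduction='mean' averages the per-sample
# binary cross entropy -[t * log(p) + (1 - t) * log(1 - p)]
loss = nn.BCELoss()(probs, labels)
manual_loss = -(labels * torch.log(probs)
                + (1 - labels) * torch.log(1 - probs)).mean()
print("BCELoss matches the formula:", torch.allclose(loss, manual_loss))

# A single learnable parameter, so the gradients are easy to read
w = torch.tensor([1.0], requires_grad=True)
optimizer = torch.optim.Adam([w], lr=0.1)

# backward() ADDS to w.grad instead of overwriting it
(3 * w).sum().backward()
first_grad = w.grad.clone()        # gradient of 3w with respect to w is 3
(3 * w).sum().backward()
accumulated_grad = w.grad.clone()  # second backward pass adds another 3
print("After one backward pass:", first_grad)
print("After two backward passes:", accumulated_grad)

# zero_grad() clears the stale gradient, so the next backward pass starts fresh
optimizer.zero_grad()
(3 * w).sum().backward()
fresh_grad = w.grad.clone()
print("After zero_grad() and backward():", fresh_grad)

# step() updates w against the gradient, which lowers the value of 3w
optimizer.step()
print("w after one Adam step:", w.detach())
```

The training loop below follows the same order for each network: clear the gradients, run the forward pass, compute the loss, call `backward()`, then call `step()`.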

In [None]:
from time import sleep
from IPython.display import clear_output

lr = 0.0001
num_epochs = 500
loss_function = nn.BCELoss()

optimizer_discriminator = torch.optim.Adam(discriminator.parameters(), lr=lr)
optimizer_generator = torch.optim.Adam(generator.parameters(), lr=lr)

loss_values = ""
for epoch in range(1, num_epochs + 1):
  for n, (real_samples, _) in enumerate(train_loader):
    # Data for training the discriminator
    real_samples_labels = torch.ones((batch_size, 1)) # 1 - means real data

    # generate samples for discriminator to discriminate
    latent_space_samples = torch.randn((batch_size, 2))
    generated_samples = generator(latent_space_samples)
    generated_samples_labels = torch.zeros((batch_size, 1)) # 0 - means fake data

    # concatenate all the data into a single input and target (or labels) tensor
    all_samples = torch.cat((real_samples, generated_samples))
    all_samples_labels = torch.cat((real_samples_labels, generated_samples_labels))

    # Training the discriminator
    discriminator.zero_grad()
    output_discriminator = discriminator(all_samples)
    
    # Discriminator loss
    loss_discriminator = loss_function(output_discriminator, all_samples_labels)
    loss_discriminator.backward()
    optimizer_discriminator.step()

    # Data for training the generator
    latent_space_samples = torch.randn((batch_size, 2))

    # Training the generator
    generator.zero_grad()
    generated_samples = generator(latent_space_samples)
    output_discriminator_generated = discriminator(generated_samples)

    # Generator loss
    loss_generator = loss_function(output_discriminator_generated, real_samples_labels)
    loss_generator.backward()
    optimizer_generator.step()

    # Show loss and visualize the outputs
    # (only after the last batch of every 5th epoch)
    if (epoch % 5 == 0) and (n == len(train_loader) - 1):
      clear_output(wait=True)
      loss_values += f"Epoch: {epoch} Loss in D.: {loss_discriminator.item()}\n" + f"Epoch: {epoch} Loss in G.: {loss_generator.item()}\n" + "--------------------\n"
      generated_samples = generated_samples.detach() # converts into a normal tensor (which can be converted to numpy)
      plt.title(f"Output of Generator at epoch: {epoch}")
      plt.plot(generated_samples[:, 0], generated_samples[:, 1], ".")
      plt.show()
      sleep(0.1)

### Testing the model

In [None]:
# Loss at various epochs
print(loss_values)

latent_space_samples = torch.randn(train_data_length, 2)
generated_samples = generator(latent_space_samples)

generated_samples = generated_samples.detach()
plt.plot(generated_samples[:, 0], generated_samples[:, 1], ".")
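
The plot gives a visual impression only. As a rough numeric complement, one can measure how far each generated point lies from the sine curve. The sketch below is an illustration under stated assumptions: `sine_fit_error` is a hypothetical helper that does not appear in the notebook, and the stand-in tensors exist only so the block runs on its own.

```python
import math
import torch

def sine_fit_error(samples):
    """Mean absolute gap between x2 and sin(x1) over all (x1, x2) rows.

    Hypothetical helper: 0 means every point lies exactly on the sine curve.
    """
    return (samples[:, 1] - torch.sin(samples[:, 0])).abs().mean().item()

torch.manual_seed(111)

# Stand-in data so the sketch is self-contained: points that lie on the
# curve (built like the training data) and points that are pure noise
x1 = 2 * math.pi * torch.rand(1024)
on_curve = torch.stack((x1, torch.sin(x1)), dim=1)
noise = torch.randn(1024, 2)

on_curve_error = sine_fit_error(on_curve)
noise_error = sine_fit_error(noise)
print("Error for points on the curve:", on_curve_error)
print("Error for random noise:", noise_error)
```

In the notebook, passing the detached `generated_samples` to the helper would give a single number to compare across epochs. Note that it only measures closeness to the curve: a generator that covers just part of the interval from 0 to 2π could still score well.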

### How the GAN's generator learns to mimic the real data:
<img src="./GAN training 500 epochs.gif" alt="GAN training on 500 epochs"></img>