
# Dropout
How does adding Dropout to a neural network help reduce overfitting?

To set the context, consider a very simple dataset (y = x, with added noise) and a deliberately complex neural network designed to fit it.
+ Why a simple dataset and a complex NN?
  + These conditions make it easy for the NN to overfit the dataset.
  + Adding Dropout layers then shows how dropout helps the model avoid overfitting.
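Before building the models, it may help to see what a Dropout layer does on its own. The following is a minimal sketch, not part of the original experiment: the all-ones input and the seed are assumptions chosen only for illustration. In training mode `nn.Dropout(p)` zeroes each element with probability `p` and scales the survivors by `1 / (1 - p)`; in evaluation mode it passes the input through unchanged.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # seed chosen only to make the sketch repeatable

p = 0.5
dropout = nn.Dropout(p)
x = torch.ones(1, 10)  # illustrative input, not the notebook's data

# Training mode: each element is zeroed with probability p,
# and the surviving elements are scaled by 1 / (1 - p).
dropout.train()
y_train_mode = dropout(x)

# Evaluation mode: dropout acts as the identity function.
dropout.eval()
y_eval_mode = dropout(x)

print(y_train_mode)
print(y_eval_mode)
```

Because a different random subset of units is silenced on every forward pass, the network cannot rely on any single unit to memorise the noise, which is the intuition the rest of this notebook tests.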

## Import libraries

In [0]:
import torch
import torch.nn as nn
import torch.optim as optim

import numpy as np
import matplotlib.pyplot as plt

# plt.style.use('dark_background')

## Generate dataset

In [0]:
N = 100
noise = 0.4

In [0]:
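# Inputs are evenly spaced in [-1, 1]; targets follow y = x plus Gaussian noise.
X_train = torch.unsqueeze(torch.linspace(-1, 1, N), 1)
Y_train = X_train + noise * torch.normal(torch.zeros(N, 1), torch.ones(N, 1))

# The test set uses the same inputs with an independent noise draw.
X_test = torch.unsqueeze(torch.linspace(-1, 1, N), 1)
Y_test = X_test + noise * torch.normal(torch.zeros(N, 1), torch.ones(N, 1))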

In [0]:
plt.figure(figsize=(10,5))
plt.scatter(X_train.data.numpy(), Y_train.data.numpy(), color='purple', alpha=0.5, label='Train')
plt.scatter(X_test.data.numpy(), Y_test.data.numpy(), color='orange',alpha=0.5, label='Test')
plt.plot(np.linspace(-1, 1, N), np.linspace(-1, 1, N), 'g--', label='Actual function')
plt.legend()
plt.show()

## Build a custom FC NN to learn the dataset

In [0]:
model = nn.Sequential(
    nn.Linear(1, 100),
    nn.ReLU(),
    nn.Linear(100, 100),
    nn.ReLU(),
    nn.Linear(100, 1)
)

model_dropout = nn.Sequential(
    nn.Linear(1, 100),
    nn.Dropout(0.5),
    nn.ReLU(),
    nn.Linear(100, 100),
    nn.Dropout(0.5),
    nn.ReLU(),
    nn.Linear(100, 1)
)
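To make "complex NN" concrete, the sketch below counts trainable parameters for the two architectures defined above. The helper `count_parameters` and the names `plain` and `with_dropout` are hypothetical, introduced here only for illustration. Dropout layers hold no weights, so both networks should report the same count, far more than the 100 training points.

```python
import torch.nn as nn


def count_parameters(module: nn.Module) -> int:
    """Sum the sizes of all trainable tensors in a module."""
    return sum(p.numel() for p in module.parameters() if p.requires_grad)


# Same layer layout as the two models in this notebook,
# rebuilt under different names so nothing is overwritten.
plain = nn.Sequential(
    nn.Linear(1, 100),
    nn.ReLU(),
    nn.Linear(100, 100),
    nn.ReLU(),
    nn.Linear(100, 1),
)

with_dropout = nn.Sequential(
    nn.Linear(1, 100),
    nn.Dropout(0.5),
    nn.ReLU(),
    nn.Linear(100, 100),
    nn.Dropout(0.5),
    nn.ReLU(),
    nn.Linear(100, 1),
)

n_plain = count_parameters(plain)
n_dropout = count_parameters(with_dropout)
print(n_plain, n_dropout)  # 200 + 10100 + 101 = 10401 for each
```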

In [0]:
loss_fn = nn.MSELoss()

model_optim = optim.Adam(model.parameters(), lr=0.01)
model_dropout_optim = optim.Adam(model_dropout.parameters(), lr=0.01)

In [0]:
epochs = 1000

for epoch in range(epochs):

  model_optim.zero_grad()
  outputs = model(X_train)
  loss = loss_fn(outputs, Y_train)
  loss.backward()
  model_optim.step()

  model_dropout_optim.zero_grad()
  outputs_dropout = model_dropout(X_train)
  loss_dropout = loss_fn(outputs_dropout, Y_train)
  loss_dropout.backward()
  model_dropout_optim.step()

  if epoch % 50 == 0:

    model.eval()
    model_dropout.eval()

    outputs_test = model(X_test)
    loss_test = loss_fn(outputs_test, Y_test)

    outputs_dropout_test = model_dropout(X_test)
    loss_dropout_test = loss_fn(outputs_dropout_test, Y_test)

    plt.figure(figsize=(10, 5))
    plt.scatter(X_train.data.numpy(), Y_train.data.numpy(), color='purple', alpha=0.5, label='Train')
    plt.scatter(X_test.data.numpy(), Y_test.data.numpy(), color='orange', alpha=0.5, label='Test')
    plt.plot(np.linspace(-1, 1, N), np.linspace(-1, 1, N), 'g--', label='Actual fn')
    plt.plot(X_test.data.numpy(), outputs_test.data.numpy(), 'r', lw=3, label='Normal prediction')
    plt.plot(X_test.data.numpy(), outputs_dropout_test.data.numpy(), 'b--', lw=3, label='Dropout prediction')
    plt.title(f'E:{epoch}, loss:{loss_test.item():.4f}, loss_dropout:{loss_dropout_test.item():.4f}')
    plt.legend()
    plt.show()

    model.train()
    model_dropout.train()

Comparing the plots above, we can observe that the NN without dropout overfits the noisy training data, while the model with dropout avoids overfitting to some extent: its prediction stays closer to the actual function.
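The visual comparison can be backed by a number. The sketch below is one possible way to do that, under stated assumptions: the helper `generalization_gap` is hypothetical and not part of the original notebook, and the small untrained network and data here are stand-ins that only show how the helper is called. It measures train and test MSE in evaluation mode (so dropout is disabled) and returns their difference; a larger positive gap suggests more overfitting.

```python
import torch
import torch.nn as nn


def generalization_gap(net, x_train, y_train, x_test, y_test):
    """Return (train_mse, test_mse, test_mse - train_mse), measured in eval mode."""
    loss_fn = nn.MSELoss()
    was_training = net.training  # remember the mode so it can be restored
    net.eval()                   # disable dropout while measuring
    with torch.no_grad():        # no gradients are needed for evaluation
        train_mse = loss_fn(net(x_train), y_train).item()
        test_mse = loss_fn(net(x_test), y_test).item()
    net.train(was_training)      # restore the original mode
    return train_mse, test_mse, test_mse - train_mse


# Stand-in data and model, only to demonstrate the call.
torch.manual_seed(0)
x = torch.unsqueeze(torch.linspace(-1, 1, 100), 1)
y_a = x + 0.4 * torch.randn(100, 1)  # plays the role of the training targets
y_b = x + 0.4 * torch.randn(100, 1)  # plays the role of the test targets
net = nn.Sequential(nn.Linear(1, 100), nn.Dropout(0.5), nn.ReLU(), nn.Linear(100, 1))

train_mse, test_mse, gap = generalization_gap(net, x, y_a, x, y_b)
print(f'train: {train_mse:.4f}, test: {test_mse:.4f}, gap: {gap:.4f}')
```

Calling the same helper on `model` and `model_dropout` after training, with `X_train`, `Y_train`, `X_test`, and `Y_test`, would give two gaps that can be compared directly.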