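# When and how is dropout applied?
## Is it per forward pass, per training example, per batch or per epoch?
A forward pass sends one batch of training examples through the network; with a batch size of 1, as in the example further down, one forward pass corresponds to one training example. Several batches make one epoch.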

In PyTorch, a new dropout mask (marking which elements are zeroed out) is sampled for the input tensor on every forward pass during training. The mask is drawn independently for each element, so every training example in a batch gets its own pattern.

This effectively creates a different sub-network **for each forward pass, and for each training example within it**. The dropout probability stays the same, but which neurons are switched off is drawn at random each time, so the fraction actually dropped varies around that probability.
The same neurons remain switched off through the backpropagation and weight update that belong to that forward pass.

For the next {forward pass, backpropagation, weight update} cycle, a different set of neurons is switched off.
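
The two claims above (a separate mask per example, and a fresh mask on every forward pass) can be checked with a small sketch. The all-ones input, the batch shape and the variable names are assumptions chosen only to make the mask easy to see.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # only to make the run repeatable

drop = nn.Dropout(p=0.5)
drop.train()  # dropout is active only in training mode

# A batch of 4 examples with 1000 features each; all ones so the mask is visible.
batch = torch.ones(4, 1000)

first = drop(batch)   # forward pass 1
second = drop(batch)  # forward pass 2 on the same input

# Within one forward pass, each row (example) gets its own mask.
rows_differ = not torch.equal(first[0], first[1])
# On the next forward pass, a new mask is drawn.
passes_differ = not torch.equal(first, second)

print(rows_differ, passes_differ)
```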

Here's a more detailed explanation:

### Dropout Purpose:
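Dropout is a regularization technique used to prevent overfitting in neural networks. During each training forward pass, every element of the layer's input is set to zero independently with probability p. PyTorch also scales the surviving elements by 1/(1-p), so the expected value of each activation stays the same.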

### Unique Mask:
For each forward pass during training, a new random mask is generated, determining which neurons will be "dropped out" (set to zero). This means that the network effectively trains on different sub-networks in each iteration, which helps it to generalize better.
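
A minimal sketch of what one mask does to a tensor, assuming an all-ones input so the effect is easy to read: the zeroed fraction should land close to p, and every surviving element should be scaled by 1/(1-p). The tensor size and variable names are illustrative.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # only to make the run repeatable

p = 0.5
drop = nn.Dropout(p=p)
drop.train()

x = torch.ones(10_000)
y = drop(x)

# Survivors all carry the same value, 1 / (1 - p).
kept_value = y[y != 0].unique()
# Fraction of elements the mask zeroed out; close to p, not exactly p.
zero_fraction = (y == 0).float().mean().item()

print(kept_value, zero_fraction, y.mean().item())
```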

### During Training vs Inference:
Dropout is only applied during the training phase. During evaluation (inference), once the model has been switched with model.eval(), the dropout layer passes its input through unchanged, so the model is evaluated on the full network.

### PyTorch Implementation:
In PyTorch, you can implement dropout using the torch.nn.Dropout module. You specify the probability p of an element being zeroed out; the default value is 0.5.
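
The difference between training and evaluation mode can be seen on a bare nn.Dropout module, without any model around it. This is a small sketch; the input shape is arbitrary.

```python
import torch
import torch.nn as nn

drop = nn.Dropout()  # no argument given, so p takes its default of 0.5
x = torch.ones(2, 6)

drop.train()
train_out = drop(x)  # entries are either zeroed or scaled to 1 / (1 - 0.5) = 2.0

drop.eval()
eval_out = drop(x)   # input is returned unchanged

print(drop.p)
print(train_out)
print(torch.equal(eval_out, x))
```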

## Example
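To implement dropout in PyTorch and verify how much is actually dropped: define an nn.Dropout layer with a chosen probability (p=0.8 below), use it in your model's forward method, and check the percentage of zeroed elements in the output tensor while the model is in training mode. Because the output here has only 5 elements, the measured percentage moves in steps of 20% and scatters around p rather than matching it exactly.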


In [None]:
# 1. Define the Dropout Layer:

import torch
import torch.nn as nn

# Define the dropout probability (here 80%)
dropout_prob = 0.8

# Create a dropout layer
dropout_layer = nn.Dropout(p=dropout_prob)

In [None]:
# 2. Incorporate into your Model:
# NOTE: applying dropout to the network's final output (dropout2) is not something you would normally do.
# It is done here only so that the effect of dropout is visible in the model output.
# TODO: replace this with a better demonstration.
# TODO: find a way to check the dropout of individual layers; calling the model only exposes the final output.

class MyModel(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.dropout1 = nn.Dropout(p=dropout_prob) # Define dropout here
        self.fc2 = nn.Linear(hidden_size, output_size)
        self.dropout2 = nn.Dropout(p=dropout_prob) # Define dropout here
        
    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = self.dropout1(x) # Apply dropout
        x = self.fc2(x)
        x = self.dropout2(x) # Apply dropout
        return x
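
One possible way to look at the dropout of an individual layer, rather than only the final output, is a forward hook registered on each nn.Dropout module. The sketch below uses a hypothetical stand-in model built with nn.Sequential; `zero_fractions` and `make_hook` are illustrative names. Since ReLU already produces zeros, the hook counts only units that were non-zero on the way in.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # only to make the run repeatable

# Hypothetical stand-in model with a single hidden dropout layer.
model = nn.Sequential(
    nn.Linear(10, 200),
    nn.ReLU(),
    nn.Dropout(p=0.8),
    nn.Linear(200, 5),
)

zero_fractions = {}  # module name -> fraction of active units that were dropped

def make_hook(name):
    def hook(module, inputs, output):
        active = inputs[0] != 0            # units that were non-zero going in
        dropped = active & (output == 0)   # of those, the ones dropout zeroed
        zero_fractions[name] = (dropped.sum() / active.sum()).item()
    return hook

# Attach a hook to every dropout layer in the model.
for name, module in model.named_modules():
    if isinstance(module, nn.Dropout):
        module.register_forward_hook(make_hook(name))

model.train()
model(torch.randn(32, 10))  # one forward pass on a dummy batch
print(zero_fractions)
```

The hook returns nothing, so it only observes the layer and does not change its output.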

In [None]:
# 3. Verify Dropout Application during training
# Example usage
input_size = 10
hidden_size = 20
output_size = 5
model = MyModel(input_size, hidden_size, output_size)

print("Checking dropout rates during model TRAINING")
# Switch to training mode so that dropout is active
model.train()
for i in range(10):
    # Create a dummy input
    input_tensor = torch.randn(1, input_size)
    output_tensor = model(input_tensor)
    
    # Check the percentage of zeroed elements
    print(f"i={i}")
    print(f"Dropout Probability: {dropout_prob}")
    print(f"Percentage of zeroed elements: {(1-(torch.count_nonzero(output_tensor) / output_tensor.numel())) * 100:.2f}%")
    print("----------------------")

In [None]:
# 4. Verify Dropout Application during evaluation
# Example usage
input_size = 10
hidden_size = 20
output_size = 5
model = MyModel(input_size, hidden_size, output_size)

print("Checking dropout rates during model EVALUATION")
# Switch to evaluation mode so that dropout is disabled
model.eval()
for i in range(10):
    # Create a dummy input
    input_tensor = torch.randn(1, input_size)
    output_tensor = model(input_tensor)
    
    # Check the percentage of zeroed elements
    print(f"i={i}")
    print(f"Dropout Probability: {dropout_prob}")
    print(f"Percentage of zeroed elements: {(1-(torch.count_nonzero(output_tensor) / output_tensor.numel())) * 100:.2f}%")
    print("----------------------")