# Activation Functions

Activation functions apply a non-linear transformation to a layer's output and decide whether a neuron should be activated or not. Without them, a stack of linear layers would collapse into a single linear transformation. We have already seen and used a few activation functions: `Binary Step`, `Sigmoid`, `ReLU`, `TanH`, `Softmax`, etc. There are many variations of these functions, such as `Leaky ReLU`, and the common ones are implemented in PyTorch and directly usable from there. Read more about them online to understand which one should be used in which situation.

- Binary Step: Outputs 1 if the input is greater than a threshold, and 0 otherwise. It is not used in practice.
- Sigmoid: Maps any real input to a value between 0 and 1, which can be read as a probability. Typically used in the last layer of a neural net for a binary classification task.
- TanH: The hyperbolic tangent function. It is a scaled and shifted version of Sigmoid that outputs a value between -1 and 1. It is usually used in hidden layers.
- ReLU: Outputs 0 for negative inputs and returns the input unchanged for positive inputs. Although it is linear on each side of 0, the function as a whole is non-linear. It is the most popular choice for hidden layers. As a rule of thumb, if you don't know what to use in hidden layers, use ReLU.
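- Leaky ReLU: The same as ReLU for positive inputs, but it multiplies negative inputs by a very small value instead of setting them to 0. It is a variant that tries to solve the problem of ReLU having zero gradient for negative inputs, which can leave some neurons unable to learn (often called the "dying ReLU" problem).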
- Softmax: We discussed this in the previous notebook. It turns a vector of scores into probabilities by squashing each value between 0 and 1 so that they sum to 1. Usually used in the last layer of a multi-class classification problem.
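
To make the descriptions above concrete, here is a minimal sketch that applies each function to a small tensor. The input values and the step threshold of 0 are assumptions chosen for illustration, and `negative_slope=0.01` is simply written out explicitly (it matches PyTorch's default for leaky ReLU).

```python
import torch
import torch.nn.functional as F

# Sample inputs (illustrative values, not from the notebook)
x = torch.tensor([-2.0, -0.5, 0.0, 0.5, 2.0])

# Binary step: 1 above the threshold, 0 otherwise (threshold 0 is an assumption)
step = (x > 0).float()

# Sigmoid: squashes each value into (0, 1)
sig = torch.sigmoid(x)

# TanH: squashes each value into (-1, 1)
tanh = torch.tanh(x)

# ReLU: zero for negative inputs, identity for positive inputs
relu = torch.relu(x)

# Leaky ReLU: small slope for negative inputs instead of zero
leaky = F.leaky_relu(x, negative_slope=0.01)

# Softmax: values in (0, 1) that sum to 1 along the chosen dimension
soft = torch.softmax(x, dim=0)

for name, value in [("step", step), ("sigmoid", sig), ("tanh", tanh),
                    ("relu", relu), ("leaky_relu", leaky), ("softmax", soft)]:
    print(f"{name:>10}: {value}")
```

Printing the results side by side shows the different output ranges, and in particular that only Leaky ReLU keeps a small non-zero output for negative inputs.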

In [1]:
# Imports
import torch
import torch.nn as nn
import torch.nn.functional as F

In [3]:
# Option 1: Create nn modules
class NeuralNet(nn.Module):
    def __init__(self, input_size, hidden_size):
        super(NeuralNet, self).__init__()
        self.linear1 = nn.Linear(input_size, hidden_size)
        self.relu = nn.ReLU()
        self.linear2 = nn.Linear(hidden_size, 1)  # binary output
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        out = self.linear1(x)
        out = self.relu(out)
        out = self.linear2(out)
        out = self.sigmoid(out)
        return out

# Option 2: Use activation functions directly in the forward pass
class NeuralNet2(nn.Module):
    def __init__(self, input_size, hidden_size):
        super(NeuralNet2, self).__init__()
        self.linear1 = nn.Linear(input_size, hidden_size)
        self.linear2 = nn.Linear(hidden_size, 1)  # binary output

    def forward(self, x):
        out = torch.relu(self.linear1(x))
        out = torch.sigmoid(self.linear2(out))
        return out

# Some activation functions might not be available in the torch API directly,
# so for those we use torch.nn.functional instead, e.g.:
# F.leaky_relu()
# It's a matter of knowing what is available where, and of preference
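
As a rough illustration of the `torch.nn.functional` route mentioned above, the sketch below swaps `F.leaky_relu` into the hidden layer of the same two-layer network. The class name `NeuralNetLeaky`, the layer sizes, and the random input batch are hypothetical and chosen only for this example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical variant of the network above, using F.leaky_relu in the hidden layer
class NeuralNetLeaky(nn.Module):
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.linear1 = nn.Linear(input_size, hidden_size)
        self.linear2 = nn.Linear(hidden_size, 1)  # binary output

    def forward(self, x):
        # Functional call: no module needs to be created in __init__
        out = F.leaky_relu(self.linear1(x), negative_slope=0.01)
        out = torch.sigmoid(self.linear2(out))
        return out

# Illustrative sizes: 3 samples with 4 features each
model = NeuralNetLeaky(input_size=4, hidden_size=8)
batch = torch.randn(3, 4)
probs = model(batch)
print(probs.shape)  # one value per sample
```

The functional form is convenient for stateless activations like this one; the module form (`nn.LeakyReLU`) is an equivalent alternative if you prefer to declare every layer in `__init__`.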