In [1]:
"""
Activation functions are a key component of neural networks

Activation functions apply a non-linear transform to a layer's output and decide whether a neuron
should be activated or not

Without activation functions our network is basically just a stacked linear regression model, which
is not suitable for more complex tasks (a stack of linear layers is still just one linear function)

With non-linear transformations our network can learn better and perform more complex tasks
After each layer we typically use an activation function

The most popular activation functions (watch from 2:43 onwards for examples):
    1. Step function
        > 1 if x > 0
        > 0 otherwise
        > Not used in practice (its gradient is 0 almost everywhere, so backpropagation can't train it)
    2. Sigmoid
        > f(x) = 1 / (1 + e^(-x)), output is between 0 and 1
        > Typically used in the last layer of a binary classification problem
    3. TanH
        > Hyperbolic tangent function, output is between -1 and 1
        > Basically a scaled and shifted sigmoid function: tanh(x) = 2 * sigmoid(2x) - 1
        > A good choice in hidden layers
    4. ReLU
        > Most popular choice in most networks
        > f(x) = max(0, x)
            - 0 for x <= 0, x for x > 0
        > Rule of thumb: If you don't know what to use, just use ReLU for hidden layers
        > Potential vanishing gradient ("dying ReLU") problem: the gradient for -ve inputs is 0, so during
          backpropagation the weights behind a neuron that only sees -ve inputs never update (the neuron is "dead")
    5. Leaky ReLU
        > Improved version of ReLU, tries to solve the vanishing gradient problem
        > a * x if x is -ve, x otherwise (a is very small, e.g. 0.01)
    6. Softmax
        > Gives you probabilities as output - squashes values to be between 0 and 1 so that they add up to 1
        > Good in the last layer of multi-class classification problems
"""

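The claim in the notes that a network without activation functions is just a linear model can be checked numerically. This is only a minimal sketch: the layer sizes, the batch size and the seed are arbitrary choices made for illustration, not values from the notes.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # reproducible random weights (arbitrary seed)

# Two linear layers with no activation in between (sizes are made up)
linear1 = nn.Linear(4, 8)
linear2 = nn.Linear(8, 3)

x = torch.randn(5, 4)  # batch of 5 samples with 4 features each

with torch.no_grad():
    stacked = linear2(linear1(x))

    # The same mapping written as ONE linear layer:
    # W = W2 @ W1 and b = W2 @ b1 + b2
    W = linear2.weight @ linear1.weight
    b = linear2.weight @ linear1.bias + linear2.bias
    collapsed = x @ W.T + b

    # With a ReLU in between, the output no longer matches that single linear layer
    with_relu = linear2(torch.relu(linear1(x)))

print(torch.allclose(stacked, collapsed, atol=1e-5))
print(torch.allclose(with_relu, collapsed, atol=1e-5))
```

The first comparison should agree up to floating point error. The second should not, because ReLU zeroes the negative hidden values.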

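To make the list of activation functions more concrete, here is a small sketch that applies each one to the same tensor. The sample values are arbitrary, and the leaky ReLU slope of 0.01 is an assumption (it matches PyTorch's default `negative_slope`, not necessarily the value used in the video).

```python
import torch
import torch.nn.functional as F

x = torch.tensor([-2.0, -0.5, 0.0, 0.5, 2.0])  # arbitrary sample inputs

step = (x > 0).float()                        # 1 if x > 0, 0 otherwise
sigmoid = torch.sigmoid(x)                    # values in (0, 1)
tanh = torch.tanh(x)                          # values in (-1, 1)
relu = torch.relu(x)                          # max(0, x)
leaky = F.leaky_relu(x, negative_slope=0.01)  # 0.01 * x for -ve x, x otherwise
softmax = torch.softmax(x, dim=0)             # values in (0, 1) that add up to 1

results = {
    "step": step,
    "sigmoid": sigmoid,
    "tanh": tanh,
    "relu": relu,
    "leaky_relu": leaky,
    "softmax": softmax,
}
for name, y in results.items():
    print(f"{name:>10}: {y}")

# tanh really is a scaled and shifted sigmoid: tanh(x) = 2 * sigmoid(2x) - 1
print(torch.allclose(tanh, 2 * torch.sigmoid(2 * x) - 1, atol=1e-6))
```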
In [2]:
import torch
import torch.nn as nn
import torch.nn.functional as F # Functional API, e.g. F.leaky_relu (see the note in the last cell)

# We have 2 options for using activation functions in a model...

In [3]:
# Option 1 - create nn modules

class NeuralNet(nn.Module):
    def __init__(self, input_size, hidden_size):
        super(NeuralNet, self).__init__()
        self.linear1 = nn.Linear(input_size, hidden_size) # in_features, out_features
        self.relu = nn.ReLU() # ReLU activation function
        self.linear2 = nn.Linear(hidden_size, 1) # in_features, out_features (1 output for binary classification)
        self.sigmoid = nn.Sigmoid() # Sigmoid activation function
        
    def forward(self, x):
        # Forward pass just goes through all of the layers in order
        out = self.linear1(x) # First input is x
        out = self.relu(out) # Then we just pass out along to the next layer
        out = self.linear2(out)
        out = self.sigmoid(out) # Squashes the output to between 0 and 1
        
        return out
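A quick way to sanity-check a model like this is to push a dummy batch through it. The sketch below rebuilds the same layer stack with `nn.Sequential` purely so that it runs on its own. That is a stand-in, not how the class above is written. The sizes and the 0.5 threshold are assumptions made for illustration.

```python
import torch
import torch.nn as nn

input_size, hidden_size = 10, 5  # hypothetical sizes

# Same layer order as Option 1: Linear -> ReLU -> Linear -> Sigmoid
model = nn.Sequential(
    nn.Linear(input_size, hidden_size),
    nn.ReLU(),
    nn.Linear(hidden_size, 1),
    nn.Sigmoid(),
)

x = torch.randn(3, input_size)  # dummy batch of 3 samples

with torch.no_grad():
    probs = model(x)            # one value between 0 and 1 per sample

preds = (probs > 0.5).float()   # assumed decision threshold of 0.5

print(probs.shape)
print(preds.squeeze(1))
```

Because the last layer is a sigmoid, each output can be read as the probability of the positive class.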

In [5]:
# Option 2 - use activation functions directly in forward pass

class NeuralNet(nn.Module):
    def __init__(self, input_size, hidden_size):
        super(NeuralNet, self).__init__()
        self.linear1 = nn.Linear(input_size, hidden_size)
        self.linear2 = nn.Linear(hidden_size, 1)
        
    def forward(self, x):
        out = torch.relu(self.linear1(x)) # torch.relu used here instead of nn.ReLU
        out = torch.sigmoid(self.linear2(out)) # torch.sigmoid used here instead of nn.Sigmoid
        
        return out
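The module style and the function style compute the same values, which can be checked directly on a random tensor. This is a minimal sketch and the tensor shape is arbitrary.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(4, 6)  # arbitrary shape

# ReLU three ways: nn module, torch function, functional API
relu_module = nn.ReLU()(x)
relu_torch = torch.relu(x)
relu_functional = F.relu(x)

# Sigmoid two ways: nn module and torch function
sigmoid_module = nn.Sigmoid()(x)
sigmoid_torch = torch.sigmoid(x)

print(torch.equal(relu_module, relu_torch))
print(torch.equal(relu_torch, relu_functional))
print(torch.allclose(sigmoid_module, sigmoid_torch))
```

Activations like these have no learnable parameters, so the choice between the two options mostly comes down to code style.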

In [None]:
"""
Both ways achieve the same thing, it's just a matter of how you prefer your code

Some of the functions aren't available in the torch API directly, so you need to use, for example,
F.leaky_relu() (i.e. torch.nn.functional.leaky_relu())
"""