# Chapter 4 Activation Functions

An activation function is applied to the output of a neuron (or a layer of neurons) to modify it. Networks generally use two kinds of activation function: one in the hidden layers and another in the output layer.

Nonlinear activation functions are what allow a network to map nonlinear functions.

## The Step Activation Function

The step activation function outputs 1 when its input is greater than 0 and 0 when its input is less than or equal to 0. The neuron "fires" only when the input is greater than 0. It is rarely used now because the output is only ever 0 or 1, which gives the optimizer no way to tell how close the neuron was to firing.
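As a minimal sketch of the behavior described above (the helper name `step` and the sample values in `z` are illustrative, not taken from the book), the step function can be written with NumPy like this:

```python
import numpy as np

def step(x):
    """Step activation: 1 where the input is greater than 0, otherwise 0."""
    return np.where(np.asarray(x) > 0, 1, 0)

# Hypothetical neuron outputs before activation
z = np.array([-2.0, -0.001, 0.0, 0.001, 2.0])

step_out = step(z)
print(step_out)  # [0 0 0 1 1]
```

Note that -2.0 and -0.001 both map to 0, so the result carries no hint that the second value was much closer to firing.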

## The Linear Activation Function

A linear function is the equation of a line; used as an activation, the output simply equals the input (y = x). It is most commonly used in the output layer of a regression model, which predicts a scalar value rather than a class.

## The Sigmoid Activation Function

The sigmoid function is more informative than the step function because it returns a continuous value in the range 0 to 1 instead of only 0 or 1. That said, it is now rarely used in hidden layers, where the Rectified Linear Activation Function has largely replaced it.
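These notes do not give the formula, so the sketch below assumes the standard definition of the sigmoid, 1 / (1 + e^(-x)). The helper name `sigmoid` and the sample inputs are illustrative.

```python
import numpy as np

def sigmoid(x):
    """Standard logistic sigmoid: squashes any real input into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical neuron outputs before activation
z = np.array([-5.0, -1.0, 0.0, 1.0, 5.0])

sig_out = sigmoid(z)
print(np.round(sig_out, 4))
```

Unlike the step function, inputs of -5 and -1 produce different outputs (roughly 0.007 and 0.269), so an optimizer can see how far each one is from the midpoint of 0.5.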

## The Rectified Linear Activation Function

This function is actually simpler than the sigmoid function. The output is x when x is greater than 0, and 0 when x is less than or equal to 0:

$$
y = \begin{cases} x & x > 0 \\ 0 & x \le 0 \end{cases}
$$

ReLU is popular because it is fast and cheap to compute. It is very close to a linear function, yet the bend at 0 makes it nonlinear.
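One way to see the nonlinearity is to check additivity: a linear function f satisfies f(a + b) = f(a) + f(b), and ReLU does not once a negative input is involved. This is only a small sketch; the helper `relu` and the values of `a` and `b` are illustrative.

```python
def relu(x):
    """ReLU for a single number: x if x > 0, otherwise 0."""
    return max(0.0, x)

a, b = 3.0, -5.0

lhs = relu(a + b)        # relu(-2.0) -> 0.0
rhs = relu(a) + relu(b)  # 3.0 + 0.0  -> 3.0

print(lhs, rhs)  # 0.0 3.0
```

For two positive inputs the two sides agree, which is the sense in which ReLU is "almost" linear.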

## Why use activation functions?

To fit a nonlinear function, a neural network needs hidden layers (the book uses two or more) whose outputs pass through a nonlinear activation function.

A nonlinear function cannot be represented accurately by a straight line.

## Linear Activation in the Hidden Layers

If the hidden layers use a linear activation function, the network's output is always a linear function of its input, no matter how many layers it has, so it can only fit linear data.
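The reason is that stacking linear layers just produces another linear layer. The sketch below checks this numerically; the layer sizes, the random weights, and all variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical network: 2 inputs -> 4 hidden neurons -> 1 output
W1, b1 = rng.normal(size=(2, 4)), rng.normal(size=(1, 4))
W2, b2 = rng.normal(size=(4, 1)), rng.normal(size=(1, 1))

X = rng.normal(size=(5, 2))  # batch of 5 samples

# Two dense layers with a linear (identity) activation in between
two_layer = (X @ W1 + b1) @ W2 + b2

# The same mapping written as a single dense layer
W = W1 @ W2
b = b1 @ W2 + b2
one_layer = X @ W + b

print(np.allclose(two_layer, one_layer))  # True
```

Because the two-layer network collapses to `X @ W + b`, adding more linear layers never adds expressive power.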

## ReLU Activation in a Pair of Neurons

See the book, which shows how adjusting the weights and biases lets just two neurons with the ReLU activation function fit a nonlinear shape.
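A rough numerical sketch of the idea follows: two ReLU neurons chained one after the other, each with a single input. The weights, biases, and sample inputs are illustrative values chosen for this sketch and are not claimed to match the book's figures.

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

# Illustrative parameters for the two chained neurons
w1, b1 = 1.0, 0.0    # first neuron
w2, b2 = -2.0, 1.0   # second neuron, fed by the first

x = np.array([-1.0, 0.0, 0.25, 0.5, 1.0])

h = relu(w1 * x + b1)          # output of the first neuron
pair_out = relu(w2 * h + b2)   # output of the second neuron

print(pair_out)  # values: 1, 1, 0.5, 0, 0
```

The output is flat at 1, ramps down between x = 0 and x = 0.5, then stays flat at 0. Changing the weights and biases moves and reshapes that ramp, which is what lets combinations of such pairs trace out a nonlinear curve.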

## ReLU Activation in the Hidden Layers

See the book, which shows how adjusting the weights and biases lets a full neural network with the ReLU activation function fit a nonlinear function.

## ReLU Activation with Code

```python
inputs = [0, 2, -1, 3.3, -2.7, 1.1, 2.2, -100]

# Keep positive values and clamp everything else to 0
output = []
for i in inputs:
    if i > 0:
        output.append(i)
    else:
        output.append(0)

print(output)
```

```
[0, 2, 0, 3.3, 0, 1.1, 2.2, 0]
```


We can simplify this further by using the built-in `max()` function:

```python
inputs = [0, 2, -1, 3.3, -2.7, 1.1, 2.2, -100]

# max(0, i) returns i when i is positive and 0 otherwise
output = []
for i in inputs:
    output.append(max(0, i))

print(output)
```

```
[0, 2, 0, 3.3, 0, 1.1, 2.2, 0]
```


NumPy provides an element-wise equivalent of `max()` called `np.maximum()`. Here it compares each element of the input with 0 and returns an array of the same shape containing the larger value at each position.

```python
import numpy as np

inputs = [0, 2, -1, 3.3, -2.7, 1.1, 2.2, -100]

# Element-wise maximum of 0 and each input value
output = np.maximum(0, inputs)
print(output)
```

```
[0.  2.  0.  3.3 0.  1.1 2.2 0. ]
```


Applying this function to the dense layer's outputs in the code:

```python
import numpy as np
from nnfs.datasets import spiral_data

# Dense layer
class Layer_Dense:

    # Layer initialization
    def __init__(self, n_inputs, n_neurons):

        # Initialize weights with small random values and biases with zeros
        self.weights = 0.01 * np.random.randn(n_inputs, n_neurons)
        self.biases = np.zeros((1, n_neurons))

    # Forward pass
    def forward(self, inputs):
        # Calculate output values from inputs, weights, and biases
        self.output = np.dot(inputs, self.weights) + self.biases

# ReLU activation
class Activation_ReLU:

    # Forward pass
    def forward(self, inputs):
        # Clamp negative values to 0, element-wise
        self.output = np.maximum(0, inputs)

# Create dataset
X, y = spiral_data(samples=100, classes=3)

# Create Dense layer with 2 input features and 3 output values
dense1 = Layer_Dense(2, 3)

# Create ReLU Activation (to be used with dense layer)
activation1 = Activation_ReLU()

# Make a forward pass of our training data through this layer
dense1.forward(X)

# Forward pass through activation function
# Takes in output from previous layer
activation1.forward(dense1.output)

# View first few samples:
print(activation1.output[:5])
```

```
[[0.00000000e+00 0.00000000e+00 0.00000000e+00]
 [1.23085807e-04 9.14717560e-05 0.00000000e+00]
 [2.00419555e-04 1.79987627e-04 0.00000000e+00]
 [2.14781301e-04 2.52986393e-04 0.00000000e+00]
 [4.11037891e-04 3.61288384e-04 0.00000000e+00]]
```
