In [None]:
# Chapter 4: Activation Functions

In this chapter, we will cover a few activation functions and discuss their roles. Different
activation functions suit different cases, and understanding how they work helps you pick the
best one for your task. An activation function is applied to the output of a neuron (or layer
of neurons) and transforms that output. We use activation functions because, when the
activation function is nonlinear, a neural network with usually two or more hidden layers can
map nonlinear functions.

In [None]:
# Different Types of Activation Functions

<font size="8"> Step Activation Function </font>
<br />
<br />
If weights · inputs + bias results in a value greater than 0, the neuron fires and outputs a 1;
otherwise, it outputs a 0.
<br />
<br />
<img src="./stepActivation.png" />
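
The rule above can be sketched in plain Python. This is a minimal illustration, not a definitive implementation: the function name `step` and the input, weight, and bias values are made up for the example.

```python
def step(x):
    """Return 1 if the pre-activation value is greater than 0, else 0."""
    return 1 if x > 0 else 0

# Hypothetical values for a single neuron with three inputs
neuron_inputs = [1.0, -2.0, 3.0]
neuron_weights = [0.2, 0.8, -0.5]
neuron_bias = 2.0

# weights · inputs + bias
z = sum(w * i for w, i in zip(neuron_weights, neuron_inputs)) + neuron_bias

# z is negative here (about -0.9), so the neuron does not fire
step_output = step(z)
print(step_output)
```

Note that an input of exactly 0 also produces 0, since the neuron only fires for values greater than 0.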







<font size="8"> Linear Activation Function</font>
<br />
(basically a straight line, y = x)
<br />
<br />
This activation function is usually applied to the last layer's output in a regression
model, that is, a model that outputs a scalar value instead of a classification.
<br />
<br />
<img src="linearActivation.png" />
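
As a small sketch, assuming the linear activation is the identity y = x, the function simply passes its input through. The name `linear` and the sample values are illustrative only.

```python
def linear(x):
    """Identity activation: the output equals the input."""
    return x

# Hypothetical outputs of a regression model's last layer
values = [-2.5, 0.0, 4.2]
linear_outputs = [linear(v) for v in values]
print(linear_outputs)
```

Because the values are unchanged, the output can be any real number, which is what a regression target needs.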


<font size="8"> Sigmoid Activation Function</font>
<br />
<br />
This function returns a value between 0 and 1: it approaches 0 as the input goes to negative
infinity, returns 0.5 for an input of 0, and approaches 1 as the input goes to positive infinity.
<br />
<br />
<img src="sigmoidActivation.png" />


Because the output of the sigmoid function is bounded between 0 and 1, it is easier for a
neural network to work with than an unbounded value ranging from negative to positive
infinity. The sigmoid also adds nonlinearity.
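
The behavior described above can be checked with a short sketch. It uses the standard sigmoid formula y = 1 / (1 + e^(-x)), which is not written out in this chapter, and only Python's built-in `math` module; the function name `sigmoid` and the sample inputs are chosen for illustration.

```python
import math

def sigmoid(x):
    """Standard sigmoid: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

# Moderate sample inputs; very large negative inputs would overflow math.exp
for x in [-10.0, 0.0, 10.0]:
    print(x, sigmoid(x))
```

The value at 0 is exactly 0.5, while the values at -10 and 10 are already very close to 0 and 1 respectively.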


<font size="8"> Rectified Linear Activation Function</font>
<br />
<br />
The rectified linear (ReLU) activation function is simpler than the sigmoid. It's quite literally y=x, clipped at 0 from the negative side: if x is less than or equal to 0, then y is 0; otherwise, y is equal to x.
<br />
<br />
<img src="reluActivation.png" />


This simple yet powerful activation function is the most widely used activation function at the time of writing, mainly because of its speed and efficiency. While the sigmoid activation function isn't the most complicated, it is still more expensive to compute than the ReLU
activation function. The ReLU activation function is extremely close to being a linear activation function while remaining nonlinear, due to the bend at 0. This simple property is very effective.
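
One way to see that the bend makes ReLU nonlinear is to check additivity: a linear function f satisfies f(a + b) = f(a) + f(b), and ReLU does not. The sketch below uses a hypothetical helper `relu` and two arbitrary sample values.

```python
def relu(x):
    """y = x clipped at 0 from the negative side."""
    return max(x, 0)

a, b = 2.0, -3.0

# For a linear function these two results would be equal
relu_of_sum = relu(a + b)        # relu(-1.0)
sum_of_relus = relu(a) + relu(b)  # 2.0 + 0

print(relu_of_sum, sum_of_relus)
```

The two results differ (0 versus 2.0), so ReLU cannot be a linear function even though each of its two pieces is a straight line.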


In [3]:
# ReLU activation function

inputs = [0, 2, -1, 3.3, -2.7, 1.1, 2.2, -100]
output = []

# Plain-Python version: keep positive values, clip everything else to 0
for i in inputs:
    output.append(max(i, 0))

print(output)

[0, 2, 0, 3.3, 0, 1.1, 2.2, 0]


In [4]:
import numpy as np
# NumPy applies the same element-wise maximum in a single call
print(np.maximum(0, inputs))

[0.  2.  0.  3.3 0.  1.1 2.2 0. ]


In [5]:
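class Activation_ReLU:
    """ReLU activation applied to a layer's outputs."""

    def forward(self, inputs):
        # Element-wise max(0, x); the result is stored on the object
        self.output = np.maximum(0, inputs)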
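
A short usage sketch for the class above: the batch values here are made up, and the class is repeated only so the snippet runs on its own.

```python
import numpy as np

class Activation_ReLU:
    def forward(self, inputs):
        self.output = np.maximum(0, inputs)

# Hypothetical batch: two samples with three values each
layer_outputs = np.array([[1.0, -2.0, 0.5],
                          [-0.3, 4.0, -1.5]])

activation = Activation_ReLU()
activation.forward(layer_outputs)

# Negative entries are replaced by 0; positive entries pass through
print(activation.output)
```

Note that `forward` does not return anything; the result is read from the `output` attribute afterwards.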

In [6]:
import nnfs
print(nnfs.__version__)
nnfs.init()

from nnfs.datasets import spiral_data
import matplotlib.pyplot as plt

0.5.1


In [None]:
X, y = spiral_data(samples=100, classes=3)