# Activation Functions

Here we visualize different activation functions commonly used in neural networks.
**Activation functions introduce non-linearity to the model**, enabling it to learn complex patterns and relationships within the data. Without them, a stack of linear layers would collapse into a single linear transformation.

Let's have a look at a few of them:
1. **ReLU (Rectified Linear Unit)** is one of the most widely used activation functions in neural networks. It introduces non-linearity by simply thresholding the input at zero, setting all negative values to zero.
2. **Leaky ReLU** is a variant of ReLU that addresses the "dying ReLU" problem (neurons that output zero for every input and therefore stop learning). Instead of setting negative values to zero, Leaky ReLU multiplies them by a small positive slope (0.01 by default in `nn.LeakyReLU`), so the gradient for negative inputs is small but non-zero.
3. **GELU (Gaussian Error Linear Unit)** weights the input by the standard Gaussian Cumulative Distribution Function (CDF), i.e. it computes x · Φ(x). The result is a smooth curve that behaves like ReLU for large positive and large negative inputs.
4. **ELU (Exponential Linear Unit)** is another variant of ReLU that is smooth for negative inputs. For negative values it follows α(exp(x) − 1), saturating at −α, which keeps gradients non-zero near the origin and helps mitigate the vanishing gradient problem.
5. **Tanh (Hyperbolic Tangent)** squashes the input values into the range (-1, 1) and is centered at zero, making it useful for models where the output needs to be bounded and symmetric around zero.
6. **Sigmoid (Logistic Sigmoid)** compresses the input values into the range (0, 1), making it suitable for binary classification tasks where the output needs to represent a probability.
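
To make the descriptions above concrete, here is a minimal sketch that writes each function out by hand and compares it with the corresponding `torch.nn` module. It assumes PyTorch's default hyperparameters (`negative_slope=0.01` for Leaky ReLU, `alpha=1.0` for ELU, and the exact, non-approximated GELU); the names `manual`, `builtin`, and `matches` are illustrative only.

```python
import math

import torch
import torch.nn as nn

x = torch.linspace(-3.0, 3.0, steps=13)

# Hand-written definitions. Assumed defaults: negative_slope=0.01, alpha=1.0.
manual = {
    "ReLU": torch.clamp(x, min=0.0),
    "LeakyReLU": torch.where(x > 0, x, 0.01 * x),
    # x * Phi(x), with Phi the standard normal CDF expressed through erf
    "GELU": x * 0.5 * (1.0 + torch.erf(x / math.sqrt(2.0))),
    "ELU": torch.where(x > 0, x, torch.exp(x) - 1.0),
    "Tanh": (torch.exp(x) - torch.exp(-x)) / (torch.exp(x) + torch.exp(-x)),
    "Sigmoid": 1.0 / (1.0 + torch.exp(-x)),
}

# The built-in modules with their default arguments.
builtin = {
    "ReLU": nn.ReLU(),
    "LeakyReLU": nn.LeakyReLU(),
    "GELU": nn.GELU(),
    "ELU": nn.ELU(),
    "Tanh": nn.Tanh(),
    "Sigmoid": nn.Sigmoid(),
}

# True for every entry if the formulas agree with the modules.
matches = {
    name: torch.allclose(manual[name], builtin[name](x), atol=1e-6)
    for name in manual
}
print(matches)
```

If any entry came back `False`, that would point to a mismatch between the formula as written here and the module's default settings.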

In [None]:
import matplotlib.pyplot as plt
%config InlineBackend.figure_format = 'svg'
import torch
import torch.nn as nn

from util import show_act_functions

In [None]:
functions = [nn.ReLU, nn.LeakyReLU, nn.GELU, nn.ELU, nn.Tanh, nn.Sigmoid]

In [None]:
x = torch.arange(-10, 10, 0.1)

show_act_functions(x, functions, num_cols=3)
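
The `show_act_functions` helper comes from a local `util` module that is not shown here. If that module is unavailable, a rough stand-in could look like the sketch below. The name `plot_activations` and its layout choices are hypothetical and are not the actual implementation in `util`; the sketch only assumes that each entry of the list is a module class that can be instantiated without arguments.

```python
import math

import matplotlib.pyplot as plt
import torch
import torch.nn as nn


def plot_activations(x, functions, num_cols=3):
    """Hypothetical stand-in for util.show_act_functions: one subplot per activation."""
    num_rows = math.ceil(len(functions) / num_cols)
    fig, axes = plt.subplots(
        num_rows, num_cols, figsize=(4 * num_cols, 3 * num_rows), squeeze=False
    )
    flat_axes = axes.flatten()
    for ax, fn_cls in zip(flat_axes, functions):
        y = fn_cls()(x)  # instantiate the module, then apply it element-wise
        ax.plot(x.numpy(), y.numpy())
        ax.set_title(fn_cls.__name__)
        ax.grid(True)
    # Hide unused subplots when len(functions) is not a multiple of num_cols.
    for ax in flat_axes[len(functions):]:
        ax.set_visible(False)
    fig.tight_layout()
    return fig


fig = plot_activations(
    torch.arange(-10, 10, 0.1),
    [nn.ReLU, nn.LeakyReLU, nn.GELU, nn.ELU, nn.Tanh, nn.Sigmoid],
    num_cols=3,
)
```

In a notebook the returned figure is displayed automatically at the end of the cell; in a script, call `plt.show()` afterwards.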

Let's apply an activation function to the output of a linear layer.

In [None]:
# Linear layer mapping 784 input features to 128 output features
lin = nn.Linear(784, 128)

In [None]:
x = torch.randn((8, 784))  # Fake data: batch of 8 samples with 784 features each
x = lin(x)  # Output of the linear layer, shape (8, 128)

In [None]:
# Compare the first sample before and after the activation function
plt.plot(x[0].detach().numpy(), lw=1, label='Linear layer output')
plt.plot(nn.ReLU()(x[0]).detach().numpy(), alpha=0.6, lw=2, label='ReLU applied')
plt.legend()
plt.show()
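
The plot can be complemented with a quick numeric check. The sketch below repeats the same steps with a fixed seed and measures how many outputs ReLU sets to zero. The variable names (`pre`, `post`, `zero_fraction`) are illustrative. With the default `nn.Linear` initialization and standard-normal inputs, the pre-activations should be roughly symmetric around zero, so about half of them are expected to be clipped, though the exact fraction depends on the seed.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # fixed seed so the numbers are reproducible

lin = nn.Linear(784, 128)
act = nn.ReLU()

x = torch.randn((8, 784))  # fake batch: 8 samples, 784 features
pre = lin(x)               # pre-activations, shape (8, 128)
post = act(pre)            # negative values clipped to zero, same shape

# Fraction of entries that ReLU set to zero.
zero_fraction = (post == 0).float().mean().item()

print(pre.shape, post.shape)
print(f"min before: {pre.min().item():.3f}, min after: {post.min().item():.3f}")
print(f"fraction of outputs zeroed by ReLU: {zero_fraction:.2f}")
```

Positive entries pass through unchanged, so `post` and `pre` agree wherever `pre > 0`.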