# Activation functions

## Imports

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline
import numpy as np

## Sigmoid

The sigmoid function has been used as an activation function in neural networks for several decades, and has only recently been partly replaced by ReLU.  It is still used quite frequently, though.

In [None]:
def sigmoid(x):
    return 1.0/(1.0 + np.exp(-x))

In [None]:
x = np.linspace(-5.0, 5.0, 101)

In [None]:
_ = plt.plot(x, sigmoid(x))

Note that the output of the sigmoid function lies in the open interval $(0, 1)$: it approaches 0 and 1 asymptotically, but never reaches them.
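The bounds on the output can be checked numerically, and the same values give the slope of the curve through the identity $\sigma'(x) = \sigma(x)(1 - \sigma(x))$. The following is a minimal sketch; `sigmoid_derivative` is a name introduced here for illustration and is not part of the notebook.

```python
import numpy as np

def sigmoid(x):
    return 1.0/(1.0 + np.exp(-x))

def sigmoid_derivative(x):
    # hypothetical helper, using the identity s'(x) = s(x)*(1 - s(x))
    s = sigmoid(x)
    return s*(1.0 - s)

x = np.linspace(-5.0, 5.0, 101)
y = sigmoid(x)
print(y.min(), y.max())           # both strictly between 0 and 1 on this interval
print(sigmoid_derivative(0.0))    # the slope is largest at x = 0, where it is 0.25
```

The derivative shrinks towards 0 for large $|x|$, which is where the curve in the plot flattens out.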

## Hyperbolic tangent

Whereas the output of the sigmoid function is always positive, the hyperbolic tangent produces values in $(-1, 1)$ and is used when negative output values are required.

In [None]:
x = np.linspace(-5.0, 5.0, 101)

In [None]:
_ = plt.plot(x, np.tanh(x))
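The two curves have the same shape: the hyperbolic tangent is a rescaled and shifted sigmoid, $\tanh(x) = 2\sigma(2x) - 1$. A short numerical check of that identity, as a sketch (the variable name `rescaled_sigmoid` is introduced here for illustration):

```python
import numpy as np

def sigmoid(x):
    return 1.0/(1.0 + np.exp(-x))

x = np.linspace(-5.0, 5.0, 101)

# stretch the sigmoid output from (0, 1) to (-1, 1) and double its input
rescaled_sigmoid = 2.0*sigmoid(2.0*x) - 1.0

# True if both curves agree up to floating-point rounding
print(np.allclose(np.tanh(x), rescaled_sigmoid))
```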

## ReLU versus SoftPlus

An activation function that is used quite often in the context of deep learning is ReLU (Rectified Linear Unit).  The SoftPlus function is a smooth approximation of it.  Although ReLU is not differentiable at $x = 0$, it is far cheaper computationally.

In [None]:
def relu(x):
    return np.maximum(0, x)

In [None]:
def softplus(x):
    return np.log(1.0 + np.exp(x))

In [None]:
x = np.linspace(-5.0, 5.0, 101)

In [None]:
plt.plot(x, relu(x), label='ReLU')
plt.plot(x, softplus(x), label='SoftPlus')
plt.legend(loc='upper left');
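How close the two functions are can be quantified: the difference between SoftPlus and ReLU equals $\log(1 + e^{-|x|})$, which is largest at $x = 0$ (where it is $\log 2$) and decays quickly on both sides. The sketch below checks this numerically. It uses `np.logaddexp` as one possible overflow-safe way to evaluate SoftPlus; `softplus_stable` is a name introduced here and not part of the notebook.

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

def softplus_stable(x):
    # log(exp(0) + exp(x)) = log(1 + exp(x)), evaluated without overflow
    return np.logaddexp(0.0, x)

x = np.linspace(-5.0, 5.0, 101)
gap = softplus_stable(x) - relu(x)

print(gap.max(), np.log(2.0))     # the largest gap occurs at x = 0
print(softplus_stable(1000.0))    # stays finite, whereas np.exp(1000.0) overflows
```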

## SoftMax

The SoftMax function is often used for an output layer that represents categorical data. It emphasizes high values relative to low values.  More importantly, for categorical output represented by a one-hot encoding, it normalizes the outputs such that they are positive and their sum is equal to 1, so they can be interpreted as the probabilities of the categories.

In [None]:
def softmax(x):
    exp_x = np.exp(x)  # compute the exponentials only once
    return exp_x/np.sum(exp_x)

In [None]:
x = np.random.uniform(low=-1.0, high=1.0, size=20)

In [None]:
plt.plot(x, softmax(x), 'o');

The sum of the softmax values is indeed equal to 1, up to floating-point rounding.

In [None]:
np.sum(softmax(x))
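The inputs used above are small, so exponentiating them directly is safe. For large inputs `np.exp` overflows, and a common remedy is to subtract the maximum before exponentiating, which leaves the result mathematically unchanged because the common factor cancels in the ratio. A minimal sketch, assuming a one-dimensional input; `softmax_stable`, `logits` and `small` are names introduced here for illustration:

```python
import numpy as np

def softmax_stable(x):
    # shifting by the maximum makes the largest exponent exp(0) = 1
    shifted = x - np.max(x)
    exps = np.exp(shifted)
    return exps/np.sum(exps)

logits = np.array([1000.0, 1001.0, 1002.0])   # np.exp would overflow on these
small = np.array([0.0, 1.0, 2.0])             # same values shifted by 1000

print(softmax_stable(logits))
# the output depends only on differences between the inputs
print(np.allclose(softmax_stable(logits), softmax_stable(small)))
```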