In [1]:
import numpy as np

## Activation Functions
- Non-linear functions
- Applied to the output of a layer before it is passed to the next layer or returned as the final output
- Make the network non-linear
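
To illustrate why the non-linearity matters, here is a minimal sketch. The layer sizes and random weights are arbitrary assumptions, not taken from this notebook. It checks that two stacked linear layers without an activation collapse into a single linear layer, while putting a non-linearity between them breaks that equivalence.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))   # 4 samples, 3 features (hypothetical sizes)
W1 = rng.normal(size=(3, 5))  # hypothetical first-layer weights
W2 = rng.normal(size=(5, 2))  # hypothetical second-layer weights

# Two linear layers without an activation...
two_layers = (x @ W1) @ W2
# ...are equivalent to one linear layer with weights W1 @ W2
one_layer = x @ (W1 @ W2)
linear_collapses = np.allclose(two_layers, one_layer)

# With a non-linearity in between, the equivalence no longer holds
with_activation = np.maximum(0, x @ W1) @ W2
activation_changes_output = not np.allclose(with_activation, one_layer)
```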

### Sigmoid
$$y = \frac{1}{1 + e^{-x}}$$
- Returns values in the range (0, 1)
- **[-]**
    + Not 0 centered
    + Subject to vanishing gradients
    + $e^{-x}$ is relatively expensive to compute

<img src="./Figs/10.jpg" alt="Drawing" style="width: 550px;"/>

In [2]:
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))
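
The vanishing-gradient issue can be made concrete with the sigmoid derivative $\sigma'(x) = \sigma(x)(1 - \sigma(x))$. The sketch below is only illustrative; the helper name `sigmoid_grad` is an assumption and does not appear elsewhere in this notebook.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    # Derivative of the sigmoid: s * (1 - s)
    s = sigmoid(x)
    return s * (1.0 - s)

# The gradient peaks at 0.25 for x = 0 and shrinks quickly as |x| grows
grads = sigmoid_grad(np.array([0.0, 5.0, 10.0]))
```

Since the gradient never exceeds 0.25, multiplying many such factors during backpropagation through deep networks tends to shrink the signal.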

### Softmax - Multiclass Classification
- For `Multiclass Classification`
- Returns a list of values, each in the range `[0,1]`, that sum to 1

$$\sigma(z)_j = \frac{e^{z_j}}{\sum\limits_{k=1}^{K}e^{z_k}}$$

- Softmax as activation function
$$z_i = \sum_jw_{i,j}x_j + b_i$$
$$y_i = \mathrm{softmax}(z)_i = \frac{e^{z_i}}{\sum_j e^{z_j}}$$

<img src="./Figs/17.jpg" alt="Drawing" style="width: 550px;"/>
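
The softmax formula can be sketched in NumPy as follows. Subtracting the maximum before exponentiating is a common numerical-stability trick, added here as an assumption; it does not change the result because the factor cancels in the ratio.

```python
import numpy as np

def softmax(z):
    z = np.asarray(z, dtype=float)
    # Shift by the max so that exp() cannot overflow; the ratio is unchanged
    e = np.exp(z - np.max(z, axis=-1, keepdims=True))
    return e / np.sum(e, axis=-1, keepdims=True)

probs = softmax([1.0, 2.0, 3.0])  # largest input gets the largest probability
```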

### Step
- Returns 0 or 1
<img src="./Figs/11.jpg" alt="Drawing" style="width: 550px;"/>

In [3]:
def step(x):
    # Return 1.0 where x > 0 and 0.0 elsewhere (rather than booleans)
    return np.where(x > 0, 1.0, 0.0)

### tanh
- Returns values in the range (-1, 1)
- **[+]**
    + 0 centered
- **[-]**
    + Like sigmoid, it saturates for large $|x|$, so it is also subject to vanishing gradients

<img src="./Figs/12.jpg" alt="Drawing" style="width: 550px;"/>

In [4]:
def tanh(x):
    return np.tanh(x)

### Relu - Rectified Linear Units
$$y = \max(0, x)$$
- Returns values in the range $[0, +\infty)$
- Passes positive values through unchanged and sets negative values to 0

- **[+]**
    + Fast to compute
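    + No vanishing gradient for $x > 0$ (the gradient is 1)
    + Faster convergence => faster training
- **[-]**
    + Not 0 centered
    + Can die: a unit whose input stays negative outputs 0 and has zero gradient, so its weights stop updating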
    
<img src="./Figs/13.jpg" alt="Drawing" style="width: 550px;"/>

In [5]:
def relu(x):
    # Element-wise max(0, x); also well-behaved for infinite inputs
    return np.maximum(0, x)

### Leaky Relu
$$y = \max(x, ax), \quad 0 < a < 1$$
- **[+]**
    + Does not die easily: the gradient is $a$ rather than 0 for $x < 0$
    
<img src="./Figs/18.jpg" alt="Drawing" style="width: 550px;"/>
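
A minimal NumPy sketch of Leaky ReLU, in the same style as the other activation cells. The function name `leaky_relu` and the default slope `a=0.01` are assumptions; the slope is a small positive constant chosen by the user.

```python
import numpy as np

def leaky_relu(x, a=0.01):
    # Keep positive inputs as-is; scale non-positive inputs by the small slope a
    return np.where(x > 0, x, a * x)

out = leaky_relu(np.array([-2.0, 0.0, 3.0]))
```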

### Softplus
$$y = \log(1 + e^x)$$
- Returns values in the range $(0, +\infty)$
- Smoother near 0 than ReLU
<img src="./Figs/14.jpg" alt="Drawing" style="width: 550px;"/>

In [6]:
def softplus(x):
    # logaddexp(0, x) = log(e^0 + e^x), which avoids overflow for large x
    return np.logaddexp(0.0, x)

## Fully Connected - Feed forward - Matrix size

- Layer 1
<img src="./Figs/15.jpg" alt="Drawing" style="width: 550px;"/>
- Layer 2
<img src="./Figs/16.jpg" alt="Drawing" style="width: 550px;"/>