In [3]:
import numpy as np
import math

### Sigmoid activation function
Advantages:
1. Squashes numbers to the range [0, 1]
2. Historically popular because it has a nice interpretation as a saturating “firing rate” of a neuron

Disadvantages:
1. Saturated neurons “kill” the gradients
2. Sigmoid outputs are not zero-centered
3. exp() is somewhat expensive to compute
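
The gradient-killing effect can be seen from the derivative, sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x)). The sketch below is illustrative only: `sigmoid` and `sigmoid_grad` are hypothetical helper names, not functions defined elsewhere in this notebook.

```python
import math

def sigmoid(x):
    # Illustrative helper, written so exp() never sees a large positive argument
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def sigmoid_grad(x):
    # Derivative of the sigmoid: s * (1 - s)
    s = sigmoid(x)
    return s * (1.0 - s)

# The gradient peaks at 0.25 for x = 0 and shrinks quickly as |x| grows
for x in (0.0, 5.0, 10.0):
    print(x, sigmoid_grad(x))
```

At x = 10 the gradient is already below 1e-4, so very little signal flows back through a saturated unit.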

In [51]:
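def sigmoid_function(x):
    # Numerically stable form: exp() is only called on non-positive arguments,
    # so large negative inputs no longer raise OverflowError
    if x >= 0:
        return 1 / (1 + math.exp(-x))
    z = math.exp(x)
    return z / (1 + z)

In [52]:
sigmoid_function(10)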

0.9999546021312976

### Tanh activation function
Advantages:
1. Squashes numbers to the range [-1, 1]
2. Zero-centered output

Disadvantages:
1. Still kills gradients when saturated
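
A minimal sketch of both points, using hypothetical helper names (`tanh_grad`, `tanh_via_sigmoid`) that are not part of the original notebook. It relies on two standard identities: tanh'(x) = 1 - tanh(x)^2, and tanh(x) = 2 * sigmoid(2x) - 1, which shows that tanh is a rescaled, zero-centered sigmoid.

```python
import math

def tanh_grad(x):
    # Derivative of tanh: 1 - tanh(x)^2
    t = math.tanh(x)
    return 1.0 - t * t

def tanh_via_sigmoid(x):
    # Identity tanh(x) = 2 * sigmoid(2x) - 1 (intended for moderate x only)
    return 2.0 / (1.0 + math.exp(-2.0 * x)) - 1.0

print(tanh_grad(0.0))  # gradient is largest at the origin
print(tanh_grad(5.0))  # close to zero once the unit saturates
print(math.tanh(0.5), tanh_via_sigmoid(0.5))  # the two values agree
```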

In [53]:
def tanh_function(x):
    thf= math.tanh(x)
    # print(thf)
    return thf

In [54]:
tanh_function(-1000)

-1.0

### ReLU activation function
`max(0.0, x)`

Advantages:
- Does not saturate (in the positive region)
- Very computationally efficient
- Converges much faster than sigmoid/tanh in practice (e.g. 6x)
- Arguably more biologically plausible than sigmoid

Disadvantages:
- Output is not zero-centered
- An annoyance: the gradient is zero for x < 0, so a unit that only receives negative inputs can “die” and stop updating
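
Both the non-saturating positive region and the flat negative region show up directly in the gradient. The sketch below uses hypothetical helpers (`relu`, `relu_grad`) and assumes the common convention of a zero gradient at exactly x = 0.

```python
def relu(x):
    return max(0.0, x)

def relu_grad(x):
    # Assumed convention: gradient is 0 at x == 0
    return 1.0 if x > 0 else 0.0

# Positive inputs pass the gradient through unchanged, however large they are
print([relu_grad(z) for z in (0.5, 10.0, 1000.0)])

# A unit whose pre-activation is negative for every input receives no gradient
pre_activations = [-3.0, -0.5, -1.2]
print([relu_grad(z) for z in pre_activations])
```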

In [55]:
def relu_function(x):
    relu = max(0.0, x)
    # print("Relu Function:", relu)
    return relu

In [56]:
relu_function(-10)

0.0

### Leaky ReLU function
`max(0.01 * x, x)`

- Does not saturate
- Computationally efficient
- Converges much faster than sigmoid/tanh in practice (e.g. 6x)
- Will not “die”
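
A vectorized sketch of the same idea with NumPy, including the gradient that keeps the unit from dying. The names `leaky_relu`, `leaky_relu_grad`, and the `alpha` parameter are assumptions made for illustration; 0.01 is used as the negative slope to match this notebook.

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    # alpha is the slope applied to negative inputs (0.01 assumed here)
    x = np.asarray(x, dtype=float)
    return np.where(x > 0, x, alpha * x)

def leaky_relu_grad(x, alpha=0.01):
    # Gradient is 1 for positive inputs and alpha (never zero) otherwise
    x = np.asarray(x, dtype=float)
    return np.where(x > 0, 1.0, alpha)

xs = np.array([-100.0, -1.0, 0.0, 2.0])
print(leaky_relu(xs))
print(leaky_relu_grad(xs))
```

Because the negative-side gradient is `alpha` rather than zero, a unit stuck in the negative region still receives a small update.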

In [46]:
def leaky_relu_function(x):
    if x>0:
        return x
    else:
        return .01*x

In [47]:
leaky_relu_function(-100)

-1.0

### Exponential Linear Unit (ELU) function
- All benefits of ReLU
- Closer to zero-mean outputs
- Unlike Leaky ReLU, it saturates for negative inputs, which adds some robustness to noise
- Drawback: computation requires exp()
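
The difference in negative behavior can be sketched as follows. This is an illustration under stated assumptions: `elu` and `leaky_relu` are hypothetical helpers, and `alpha=1.0` is assumed for ELU (a common choice) so that the saturation level of -alpha is easy to see.

```python
import math

def elu(x, alpha=1.0):
    # alpha sets the negative saturation level (1.0 assumed here);
    # expm1(x) computes exp(x) - 1 accurately for small |x|
    return x if x > 0 else alpha * math.expm1(x)

def leaky_relu(x, slope=0.01):
    return x if x > 0 else slope * x

# ELU levels off near -alpha, while Leaky ReLU keeps decreasing linearly
for x in (-1.0, -10.0, -100.0):
    print(x, elu(x), leaky_relu(x))
```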

In [57]:
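def exp_relu_function(x, alpha=0.01):
    '''
    Exponential linear unit: x for x > 0, alpha * (exp(x) - 1) otherwise.
    The default alpha=0.01 keeps this notebook's scale; alpha=1.0 is the more common choice.
    '''
    if x > 0:
        return x
    else:
        return alpha * (np.exp(x) - 1)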

In [58]:
exp_relu_function(-10)

-0.009999546000702375

### Key points
- Use ReLU, but be careful with your learning rates
- Try out Leaky ReLU / Maxout / ELU
- Try out tanh, but don’t expect much
- Don’t use sigmoid
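
Maxout is mentioned above but not defined in this notebook. As a rough sketch, a maxout unit takes the maximum over k affine functions of its input. The function name `maxout` and the weight layout (`W` of shape `(k, n_in)`, `b` of shape `(k,)`) are assumptions made for illustration.

```python
import numpy as np

def maxout(x, W, b):
    """One maxout unit: the max over k affine functions of x.

    W has shape (k, n_in) and b has shape (k,).
    """
    return np.max(W @ x + b)

# With pieces (x, 0) the unit reduces to ReLU; with (x, 0.01x) to Leaky ReLU
b = np.zeros(2)
relu_W = np.array([[1.0], [0.0]])
leaky_W = np.array([[1.0], [0.01]])

x = np.array([-2.0])
print(maxout(x, relu_W, b))
print(maxout(x, leaky_W, b))
```

The trade-off is that each unit needs k sets of weights instead of one.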