# Activation Functions 
- decide whether a neuron fires, and how strongly, given its weighted input.
- provide the non-linearity a network needs to approach more complex problems; without it, stacked layers collapse into a single linear model.
![image.png](attachment:image.png)
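
A minimal sketch of why the non-linearity matters, using scalar "layers" of the form `w * x + b`. The weights and biases are hypothetical values chosen only for illustration. Two stacked linear layers give exactly the same result as one linear layer, while putting a ReLU between them does not.

```python
def linear(x, w, b):
    # A one-input "layer": weighted input plus bias.
    return w * x + b

def two_linear_layers(x):
    # Hypothetical parameters, for illustration only.
    hidden = linear(x, w=2.0, b=1.0)
    return linear(hidden, w=3.0, b=-4.0)

def single_equivalent_layer(x):
    # 3 * (2x + 1) - 4 simplifies to 6x - 1.
    return linear(x, w=6.0, b=-1.0)

def relu(x):
    return max(0.0, x)

def two_layers_with_relu(x):
    # Same parameters, but with a non-linearity in between.
    hidden = relu(linear(x, w=2.0, b=1.0))
    return linear(hidden, w=3.0, b=-4.0)

for x in (-2.0, 0.0, 1.5):
    print(x, two_linear_layers(x), single_equivalent_layer(x), two_layers_with_relu(x))
```

For `x = -2.0` the purely linear stack and its single-layer equivalent agree, while the ReLU version gives a different value because the hidden unit is clamped to 0.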

# Step Function
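- not too popular
- outputs only 0 or 1, with a hard jump at the threshold; the gradient is zero everywhere else, so gradient-based training gets no signal from it
- similar to using linear regression for binary classification and applying a threshold to decide whether the neuron outputs 1 or 0
- very bad for multiclass classification: several output neurons can output 1 at once, with no way to rank them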
![image-2.png](attachment:image-2.png)
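
A small sketch of the multiclass problem with a step activation. The class labels, raw scores, and the threshold of 0 are assumptions made up for this illustration.

```python
def step(x, threshold=0.0):
    '''Returns 1 if x is above the threshold, else 0'''
    return 1 if x > threshold else 0

# Hypothetical raw scores for three classes, for illustration only.
scores = {"cat": 2.3, "dog": 0.4, "bird": -1.7}

fired = {label: step(s) for label, s in scores.items()}
print(fired)
```

Both `cat` and `dog` fire with the same output of 1, so the step function gives no basis for choosing between them even though their raw scores differ.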

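# Sigmoid/Logistic Function
- smooth curve between 0 and 1
- works better than the step function for multiclass classification, since each class gets a graded score that can be compared
- ties between classes are unlikely because the output is a continuous floating-point value rather than a hard 0 or 1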

![image-8.png](attachment:image-8.png)
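
A short sketch of how graded sigmoid outputs can be compared across classes. The labels and raw scores are hypothetical. Note that these per-class values are independent scores and are not guaranteed to sum to 1.

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

# Hypothetical raw scores for three classes, for illustration only.
scores = {"cat": 2.3, "dog": 0.4, "bird": -1.7}

activations = {label: sigmoid(s) for label, s in scores.items()}
best = max(activations, key=activations.get)

print(activations)
print(best)
```

Each class now has a distinct value strictly between 0 and 1, so picking the largest one is unambiguous.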

# tanh Function
- range is (-1, 1), centered at 0
- both sigmoid and tanh suffer from the vanishing gradient problem: at the extremities the curve flattens, the gradient approaches 0, and learning becomes slow.

![image-4.png](attachment:image-4.png)
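
The vanishing gradient can be made concrete with the standard derivatives: for sigmoid it is `s * (1 - s)` where `s = sigmoid(x)`, and for tanh it is `1 - tanh(x)**2`. The sketch below prints both at a few sample inputs; the helper names are illustrative.

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def sigmoid_grad(x):
    # d/dx sigmoid(x) = sigmoid(x) * (1 - sigmoid(x))
    s = sigmoid(x)
    return s * (1 - s)

def tanh_grad(x):
    # d/dx tanh(x) = 1 - tanh(x)^2
    return 1 - math.tanh(x) ** 2

for x in (0, 2, 5, 10):
    print(x, sigmoid_grad(x), tanh_grad(x))
```

The gradients peak at `x = 0` (0.25 for sigmoid, 1.0 for tanh) and shrink rapidly toward 0 as `x` grows, which is why weight updates become tiny when a neuron saturates.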

# ReLU - Rectified Linear Unit
- avoids the vanishing gradient problem faced by sigmoid and tanh, but only for positive x; for negative x the output and the gradient are both 0.
- very lightweight: just max(0, x), with no exponentials
- default choice for hidden layers because it is so cheap to compute.

![image-6.png](attachment:image-6.png)
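
A minimal sketch of the ReLU gradient, to show the "positive side only" caveat. Treating the gradient at exactly `x == 0` as 0 is a convention assumed here.

```python
def relu(x):
    return max(0, x)

def relu_grad(x):
    # Assumed convention: the gradient at exactly x == 0 is taken as 0.
    return 1 if x > 0 else 0

for x in (-5, -0.1, 0.1, 5, 500):
    print(x, relu(x), relu_grad(x))
```

For positive inputs the gradient stays at 1 no matter how large `x` gets, so it never shrinks the way sigmoid and tanh gradients do. For negative inputs it is 0, so a neuron stuck there stops learning.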

# Leaky ReLU
- keeps a small non-zero slope for negative x instead of clamping to 0, so the gradient does not vanish on either side of the x-axis.

![image-7.png](attachment:image-7.png)
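
A sketch of Leaky ReLU and its gradient with the negative-side slope exposed as a parameter. The name `alpha` and its default of 0.1 are assumptions for illustration; smaller slopes are also common.

```python
def leaky_relu(x, alpha=0.1):
    # Positive inputs pass through; negative inputs are scaled by alpha.
    return x if x > 0 else alpha * x

def leaky_relu_grad(x, alpha=0.1):
    # The gradient is 1 on the positive side and alpha on the negative side.
    return 1 if x > 0 else alpha

for x in (-100, -1, 1, 100):
    print(x, leaky_relu(x), leaky_relu_grad(x))
```

Unlike plain ReLU, the gradient for negative inputs is `alpha` rather than 0, so a neuron receiving negative inputs still gets a small update.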

# Most Popular Activation Functions:
![image-5.png](attachment:image-5.png)

#### Guideline: Use Sigmoid in the output layer, use tanh at other places if possible.
- tanh is good for other layers because its output is centered around 0.
- for hidden layers, ReLU is the most popular choice due to its computational efficiency.
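
The guideline can be sketched as a tiny forward pass: ReLU in the hidden layer and sigmoid at the output. The network shape, the `forward` helper, and all parameter values are hypothetical and hand-picked for illustration. Reading the output as the probability of the positive class assumes a binary classification task.

```python
import math

def relu(x):
    return max(0.0, x)

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    # Hidden layer: weighted sum followed by ReLU.
    hidden = [
        relu(sum(w * x for w, x in zip(weights, inputs)) + b)
        for weights, b in zip(hidden_weights, hidden_biases)
    ]
    # Output layer: weighted sum followed by sigmoid, giving a value in (0, 1).
    z = sum(w * h for w, h in zip(output_weights, hidden)) + output_bias
    return sigmoid(z)

# Hypothetical, hand-picked parameters for illustration only.
hidden_weights = [[0.5, -1.0], [1.5, 2.0]]
hidden_biases = [0.0, -1.0]
output_weights = [1.0, -2.0]
output_bias = 0.5

p = forward([1.0, 2.0], hidden_weights, hidden_biases, output_weights, output_bias)
print(p)
```

Swapping `relu` for `math.tanh` in the hidden layer would give the tanh variant mentioned in the guideline, with the output layer unchanged.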

In [18]:
import math

def sigmoid(x):
    '''Maps x into the range (0, 1)'''
    return 1/(1+math.exp(-x))

In [19]:
# All outputs lie between 0 and 1 (sigmoid(100) rounds to exactly 1.0 in floating point)
print(sigmoid(100))
print(sigmoid(20))
print(sigmoid(12))
print(sigmoid(-90))
print(sigmoid(1))

1.0
0.9999999979388463
0.9999938558253978
8.194012623990515e-40
0.7310585786300049
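
One caveat with the direct formula: for large negative inputs such as `x = -1000`, `math.exp(-x)` raises `OverflowError`. A common workaround, sketched below under the hypothetical name `stable_sigmoid`, is to branch on the sign of `x` so that only non-positive values are ever exponentiated.

```python
import math

def stable_sigmoid(x):
    '''Sigmoid that never exponentiates a large positive number'''
    if x >= 0:
        # -x <= 0, so math.exp(-x) is at most 1.
        return 1 / (1 + math.exp(-x))
    # x < 0, so z is at most 1 and cannot overflow.
    z = math.exp(x)
    return z / (1 + z)

print(stable_sigmoid(-1000))
print(stable_sigmoid(1000))
print(stable_sigmoid(0))
```

The two branches are algebraically the same function; they differ only in which form is safe to evaluate for a given sign of `x`.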


In [20]:
def tanh(x):
    '''Maps x into (-1, 1); extreme inputs round to -1.0 or 1.0 in floating point'''
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))


In [21]:
print(tanh(100))
print(tanh(-100))
print(tanh(1))
print(tanh(-76))
print(tanh(6))

1.0
-1.0
0.7615941559557649
-1.0
0.9999877116507956
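
The hand-written formula also overflows for large inputs, because `math.exp(1000)` is too big for a float. The standard library's `math.tanh` handles such inputs. The sketch below compares the two; `naive_tanh` is just a local copy of the formula above.

```python
import math

def naive_tanh(x):
    # Same formula as the hand-written version; overflows for large |x|.
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))

overflowed = False
try:
    naive_tanh(1000)
except OverflowError as err:
    overflowed = True
    print("naive_tanh(1000) failed:", err)

print(math.tanh(1000))
print(math.tanh(-1000))
print(math.tanh(1))
```

For moderate inputs the two agree closely, so the hand-written version is fine for learning purposes, while `math.tanh` is the safer choice in practice.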


In [22]:
def ReLU(x):
    '''Returns max(0,x)'''
    return max(0,x)



In [23]:
print(ReLU(100))
print(ReLU(-100000))
print(ReLU(1))
print(ReLU(23))
print(ReLU(100))

100
0
1
23
100


In [24]:
def leaky_ReLU(x):
    '''returns max(0.1*x, x)'''
    return max(0.1*x, x)

In [26]:
print(leaky_ReLU(100))
print(leaky_ReLU(-100000))
print(leaky_ReLU(1))
print(leaky_ReLU(23))
print(leaky_ReLU(-100))

100
-10000.0
1
23
-10.0
