**ACTIVATION FUNCTIONS IN DEEP LEARNING LAYERS**

**SIGMOID FUNCTION**

> The sigmoid function produces an S-shaped curve. It maps any real input to a value between 0 and 1.

> Formula: sigmoid(x) = 1/(1+e^-x)

> It maps large negative values to values close to 0 and large positive values to values close to 1.

> It is smooth and differentiable everywhere, which makes it suitable for gradient-based optimization algorithms.
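
The link to gradient-based optimization can be made concrete: the derivative of the sigmoid is sigmoid(x) * (1 - sigmoid(x)). The sketch below is illustrative only; `sigmoid_grad` and `numeric_grad` are hypothetical helper names that do not appear in the notebook, and the finite-difference step `h` is an arbitrary choice.

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def sigmoid_grad(x):
    # Analytic derivative: s * (1 - s), where s = sigmoid(x)
    s = sigmoid(x)
    return s * (1 - s)

def numeric_grad(f, x, h=1e-6):
    # Central finite difference, used only as a sanity check
    return (f(x + h) - f(x - h)) / (2 * h)

grad_at_zero = sigmoid_grad(0.0)   # the derivative peaks at x = 0
grad_at_ten = sigmoid_grad(10.0)   # and shrinks toward 0 for large |x|
```

The gradient is largest at x = 0 (where it equals 0.25) and becomes very small for inputs far from zero.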

In [1]:
import math

def sigmoid(x):
    # Logistic sigmoid: 1 / (1 + e^-x)
    return 1 / (1 + math.exp(-x))

In [2]:
sigmoid(100), sigmoid(1), sigmoid(23), sigmoid(-17), sigmoid(2.5)

(1.0,
 0.7310585786300049,
 0.9999999998973812,
 4.1399375473943306e-08,
 0.9241418199787566)

sigmoid(100): The input is very large and positive, so the output is so close to 1 that it rounds to exactly 1.0 in floating point.

sigmoid(1): The input is a moderate positive number, so the output is around 0.73.

sigmoid(23): The input is very large and positive, so the output is very close to 1.

sigmoid(-17): The input is very large and negative, so the output is very close to 0.

sigmoid(2.5): The input is a moderate positive number, so the output is around 0.92.
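
One caveat with the direct formula: `math.exp(-x)` raises `OverflowError` once `x` drops below roughly -709. A possible workaround, sketched here under the assumption that scalar inputs are enough, is to branch on the sign of `x` so the exponent is never positive. `stable_sigmoid` is a hypothetical name, not part of the notebook.

```python
import math

def stable_sigmoid(x):
    if x >= 0:
        # exp(-x) is at most 1 here, so it cannot overflow
        return 1 / (1 + math.exp(-x))
    # For negative x, rewrite as e^x / (1 + e^x); exp(x) stays below 1
    z = math.exp(x)
    return z / (1 + z)

far_negative = stable_sigmoid(-1000)
far_positive = stable_sigmoid(1000)
```

Both branches compute the same mathematical function; only the arithmetic path differs.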

**Tanh FUNCTION**

> Tanh also produces an S-shaped curve. It is similar to the sigmoid but has a different range: it maps values to between -1 and 1.

> Formula: tanh(x) = (e^x - e^-x)/(e^x + e^-x)

> It maps large negative values to values close to -1 and large positive values to values close to 1.

> The tanh function is zero-centered, so its output is symmetric about the origin. This can help with faster convergence.
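
The similarity to the sigmoid is exact up to scaling: tanh(x) = 2 * sigmoid(2x) - 1. The short check below is a sketch, with `tanh_via_sigmoid` as a hypothetical helper name, and uses the standard library's `math.tanh` as the reference.

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def tanh_via_sigmoid(x):
    # Rescale the sigmoid from (0, 1) to (-1, 1)
    return 2 * sigmoid(2 * x) - 1

inputs = [-2.0, -0.5, 0.0, 0.5, 2.0]
# Differences between the rescaled sigmoid and math.tanh
diffs = [abs(tanh_via_sigmoid(x) - math.tanh(x)) for x in inputs]

# Zero-centering: tanh(0) is 0, while sigmoid(0) is 0.5
center_tanh = math.tanh(0.0)
center_sigmoid = sigmoid(0.0)
```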

In [3]:
def tanh(x):
    # Hyperbolic tangent: (e^x - e^-x) / (e^x + e^-x)
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))

In [4]:
tanh(100), tanh(1), tanh(17), tanh(50), tanh(-50), tanh(-25.2), tanh(-12.6)

(1.0,
 0.7615941559557649,
 0.9999999999999966,
 1.0,
 -1.0,
 -1.0,
 -0.999999999977259)

tanh(100): The input is very large and positive, so the output is very close to 1.

tanh(1): The input is a moderate positive number, so the output is around 0.76.

tanh(17): The input is large and positive, so the output is very close to 1.

tanh(50): The input is very large and positive, so the output is very close to 1.

tanh(-50): The input is very large and negative, so the output is very close to -1.

tanh(-25.2): The input is very large and negative, so the output is very close to -1.

tanh(-12.6): The input is large and negative, so the output is very close to -1.
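
The hand-written formula works for the inputs above, but `math.exp` overflows once the magnitude of `x` exceeds roughly 709. A minimal sketch of the difference, assuming the standard library's `math.tanh` is an acceptable substitute; `naive_tanh` simply repeats the formula from the cell above under a hypothetical name.

```python
import math

def naive_tanh(x):
    # Same formula as above; exp(x) overflows for very large |x|
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))

try:
    naive_tanh(1000)
    overflowed = False
except OverflowError:
    overflowed = True

# math.tanh saturates cleanly instead of raising
upper = math.tanh(1000)
lower = math.tanh(-1000)
```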

**ReLU (RECTIFIED LINEAR UNIT) FUNCTION**

> ReLU keeps the positive part of its input and replaces everything else with zero.

> Formula: ReLU(x)=max(0,x)

> The output of the ReLU function ranges from 0 to positive infinity, and it is zero for all negative input values.

In [5]:
def relu(x):
    return max(0,x)

In [6]:
relu(-100), relu(14)

(0, 14)

relu(-100): The input is -100, which is a negative value. So the result is 0.

relu(14): The input is 14, which is a positive value. So the result is 14.   
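
In a layer, ReLU is applied element by element, and its gradient is 1 for positive inputs and 0 for negative ones. A small sketch over a plain Python list; `relu_grad` is a hypothetical helper, and treating the gradient at exactly 0 as 0 is a convention assumed here.

```python
def relu(x):
    return max(0, x)

def relu_grad(x):
    # 1 for positive inputs, 0 otherwise (the value at x == 0 is a convention)
    return 1 if x > 0 else 0

values = [-3.0, -0.5, 0.0, 2.0, 7.5]
activated = [relu(v) for v in values]   # negatives are clipped to 0
grads = [relu_grad(v) for v in values]  # no gradient flows for non-positive inputs
```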

**LEAKY ReLU FUNCTION**

> ReLU has a disadvantage: it outputs zero for all negative inputs, which can lead to the "dying ReLU" problem, where neurons permanently output zero.

> Leaky ReLU allows a small, positive gradient for negative inputs, ensuring that the neurons never completely "die".

> Formula: Leaky ReLU(x) = x if x > 0, and a * x if x <= 0, where a is a small positive slope coefficient, typically around 0.01.

In [7]:
def leaky_relu(x):
    # Slope a = 0.1 for negative inputs; max() picks a*x when x < 0 because 0 < a < 1
    return max(0.1*x, x)

In [8]:
leaky_relu(-100), leaky_relu(14)

(-10.0, 14)

leaky_relu(-100): The input is negative, so Leaky ReLU outputs the input multiplied by the slope coefficient. The code above uses a slope of 0.1, so the result is -10.0.

leaky_relu(14): The input is positive, so Leaky ReLU outputs the input directly, which is 14 in this case.
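
The slope does not have to be hard-coded. A variant that takes it as a parameter might look like the sketch below; the parameter name `alpha`, its default of 0.01, and the helper `leaky_relu_grad` are assumptions for illustration, not part of the notebook.

```python
def leaky_relu(x, alpha=0.01):
    # Pass positive inputs through; scale the rest by alpha
    return x if x > 0 else alpha * x

def leaky_relu_grad(x, alpha=0.01):
    # The gradient is alpha rather than 0 for non-positive inputs,
    # so the unit keeps receiving a small update signal
    return 1.0 if x > 0 else alpha

default_slope = leaky_relu(-100)             # uses alpha = 0.01
larger_slope = leaky_relu(-100, alpha=0.1)   # matches the 0.1 slope used above
positive_input = leaky_relu(14)
```

Writing the branch explicitly, rather than using `max`, keeps the function correct even if `alpha` is set to 1 or more.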