# Week 3
# Day 3: 24 Aug 2022
## Activation functions

1. Why is a non-linear activation function important? In particular, the underlying function being learnt by the ML algorithm may be totally unrelated to the activation function being used. So how does this non-linearity help?

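    Without a non-linear activation function, a neural network of any depth collapses to a single linear (affine) function of its input, so it can only separate data that are linearly separable. A non-linear activation function introduces the non-linearity needed to separate data that are not linearly separable. The activation does not have to resemble the function being learnt: composing many simple non-linear units is what lets the network approximate functions that look nothing like the activation itself.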

2. Demonstrate your understanding of different activation functions (at least 5) by implementing them in NumPy
3. Implement equivalent "backward" functions that compute the derivatives of the activation functions you've implemented above in NumPy
4. Verify the gradients computed by your functions against those of PyTorch's autograd
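
To make the answer to question 1 concrete, here is a minimal sketch (not part of the original exercise) showing that two stacked linear layers with no activation in between are equivalent to one linear layer. The weights, shapes, and random seed are arbitrary assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two "layers" with no activation in between (arbitrary illustrative weights).
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)
x = rng.normal(size=(5, 3))  # batch of 5 inputs with 3 features each

# Stacked linear layers: h = x W1^T + b1, then y = h W2^T + b2
y_stacked = (x @ W1.T + b1) @ W2.T + b2

# The same map written as a single linear layer: W = W2 W1, b = W2 b1 + b2
W, b = W2 @ W1, W2 @ b1 + b2
y_single = x @ W.T + b

print(np.allclose(y_stacked, y_single))  # True

# With a ReLU between the layers, the single-layer rewrite no longer matches.
y_relu = np.maximum(0.0, x @ W1.T + b1) @ W2.T + b2
print(np.allclose(y_relu, y_single))
```

The first comparison holds for any choice of weights; the second generally fails as soon as at least one hidden pre-activation is negative.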

In [76]:
import math
import numpy as np

Activation functions

In [77]:
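# Sigmoid: works element-wise on NumPy arrays
def sigmoid_forward(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_backward(x):
    # d/dx sigmoid(x) = sigmoid(x) * (1 - sigmoid(x))
    s = sigmoid_forward(x)
    return s * (1 - s)

# Tanh: works element-wise on NumPy arrays
def tanh_forward(x):
    return np.tanh(x)

def tanh_backward(x):
    # d/dx tanh(x) = 1 - tanh(x)^2
    return 1 - np.tanh(x) ** 2

# The functions below expect a scalar x; wrap them with np.vectorize for arrays.

# ReLU
def relu_forward(x):
    return max(0, x)

def relu_backward(x):
    # The derivative is undefined at x == 0; 0 is used there by convention
    return 0 if x <= 0 else 1

# Leaky ReLU with a fixed negative slope of 0.01
def leaky_relu_forward(x):
    return max(0.01 * x, x)

def leaky_relu_backward(x):
    return 0.01 if x <= 0 else 1

# ELU
def elu_forward(x, alpha=1):
    return x if x > 0 else alpha * (np.exp(x) - 1)

def elu_backward(x, alpha=1):
    # For x <= 0: d/dx alpha * (exp(x) - 1) = alpha * exp(x)
    return 1 if x > 0 else alpha * np.exp(x)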
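
The ReLU, Leaky ReLU, and ELU definitions above work on scalars and are wrapped with `np.vectorize` in later cells. `np.vectorize` is essentially a Python loop, and unless `otypes` is given it infers the output dtype from the first result. As an alternative, the sketch below uses `np.maximum` and `np.where` to operate on whole arrays directly. The function names (`relu`, `relu_grad`, and so on) and the `slope` parameter are hypothetical, chosen here so they do not shadow the notebook's own functions.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def relu_grad(x):
    # Convention assumed here: gradient 0 at x == 0
    return np.where(x > 0, 1.0, 0.0)

def leaky_relu(x, slope=0.01):
    return np.where(x > 0, x, slope * x)

def leaky_relu_grad(x, slope=0.01):
    return np.where(x > 0, 1.0, slope)

def elu(x, alpha=1.0):
    # np.where evaluates both branches, so clamp the argument of exp()
    # to avoid overflow on large positive inputs.
    return np.where(x > 0, x, alpha * (np.exp(np.minimum(x, 0.0)) - 1.0))

def elu_grad(x, alpha=1.0):
    return np.where(x > 0, 1.0, alpha * np.exp(np.minimum(x, 0.0)))

x = np.array([-5.0, 0.0, 2.0, 3.0, 5.0])
print("ReLU:", relu(x), relu_grad(x))
print("Leaky ReLU:", leaky_relu(x), leaky_relu_grad(x))
print("ELU:", elu(x), elu_grad(x))
```

Because these versions always return floating-point arrays, a non-integer input such as `2.5` passes through `relu` unchanged instead of depending on the dtype inferred from the first element.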

Sigmoid 

In [78]:
x = np.array([-5.,0.,2.,3.,5.])
print("Sigmoid : ",sigmoid_forward(x))
print("Sigmoid backward:",sigmoid_backward(x))

Sigmoid :  [0.00669285 0.5        0.88079708 0.95257413 0.99330715]
Sigmoid backward: [0.00664806 0.25       0.10499359 0.04517666 0.00664806]
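
One caveat about the sigmoid above: for large negative inputs (for example `x = -1000`), `np.exp(-x)` overflows to `inf` and NumPy emits a `RuntimeWarning`, even though the final result still rounds to 0. The sketch below is one common way to avoid that, using `exp(-|x|)` so the exponent is never positive. The name `stable_sigmoid` is hypothetical and not part of the original notebook.

```python
import numpy as np

def stable_sigmoid(x):
    x = np.asarray(x, dtype=float)
    # exp(-|x|) always lies in [0, 1], so it cannot overflow.
    z = np.exp(-np.abs(x))
    # x >= 0: 1 / (1 + e^{-x});  x < 0: e^{x} / (1 + e^{x})
    return np.where(x >= 0, 1.0 / (1.0 + z), z / (1.0 + z))

x = np.array([-1000.0, -5.0, 0.0, 5.0, 1000.0])
print(stable_sigmoid(x))
```

For moderate inputs such as the ones used in this notebook, it agrees with `sigmoid_forward`.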


In [79]:
import torch
x = torch.tensor([-5.,0.,2.,3.,5.], requires_grad=True)
y = torch.sigmoid(x)
print("Sigmoid:",y)
external_grad = torch.ones_like(x)
y.backward(gradient=external_grad)
print("Sigmoid backward:",x.grad)

Sigmoid: tensor([0.0067, 0.5000, 0.8808, 0.9526, 0.9933], grad_fn=<SigmoidBackward0>)
Sigmoid backward: tensor([0.0066, 0.2500, 0.1050, 0.0452, 0.0066])


Tanh

In [80]:
x = np.array([-5.,0.,2.,3.,5.])
print("Tanh : ",tanh_forward(x))
print("Tanh backward: ",tanh_backward(x))

Tanh :  [-0.9999092   0.          0.96402758  0.99505475  0.9999092 ]
Tanh backward:  [1.81583231e-04 1.00000000e+00 7.06508249e-02 9.86603717e-03
 1.81583231e-04]


In [81]:
import torch
x = torch.tensor([-5.,0.,2.,3.,5.], requires_grad=True)
y = torch.tanh(x)
print("Tanh:",y)
external_grad = torch.ones_like(x)
y.backward(gradient=external_grad)
print("Tanh backward:",x.grad)

Tanh: tensor([-0.9999,  0.0000,  0.9640,  0.9951,  0.9999], grad_fn=<TanhBackward0>)
Tanh backward: tensor([1.8155e-04, 1.0000e+00, 7.0651e-02, 9.8660e-03, 1.8155e-04])


Relu

In [82]:
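x = np.array([-5.,0.,2.,3.,5.])
# otypes=[float] stops np.vectorize from inferring an integer dtype from the first result (0),
# which would truncate non-integer inputs such as 2.5
relu_forward_v = np.vectorize(relu_forward, otypes=[float])
print("Relu : ",relu_forward_v(x))
relu_backward_v= np.vectorize(relu_backward, otypes=[float])
print("Relu backward:",relu_backward_v(x))


Relu :  [0. 0. 2. 3. 5.]
Relu backward: [0. 0. 1. 1. 1.]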


In [83]:
import torch

x = torch.tensor([-5.,0.,2.,3.,5.], requires_grad=True)
y = torch.nn.functional.relu(x)
print("relu:",y)
external_grad = torch.ones_like(x)
y.backward(gradient=external_grad)
print("relu backward:",x.grad)

relu: tensor([0., 0., 2., 3., 5.], grad_fn=<ReluBackward0>)
relu backward: tensor([0., 0., 1., 1., 1.])


Leaky Relu

In [84]:
x = np.array([-5.,0.,2.,3.,5.])
leaky_relu_forward_v = np.vectorize(leaky_relu_forward)
print("Leaky Relu:",leaky_relu_forward_v(x))
leaky_relu_backward_v = np.vectorize(leaky_relu_backward)
print("Leaky Relu backward:",leaky_relu_backward_v(x))

Leaky Relu: [-0.05  0.    2.    3.    5.  ]
Leaky Relu backward: [0.01 0.01 1.   1.   1.  ]


In [85]:
import torch

x = torch.tensor([-5.,0.,2.,3.,5.], requires_grad=True)
y = torch.nn.functional.leaky_relu(x)
print("Leaky Relu:",y)
external_grad = torch.ones_like(x)
y.backward(gradient=external_grad)
print("Leaky Relu backward:",x.grad)


Leaky Relu: tensor([-0.0500,  0.0000,  2.0000,  3.0000,  5.0000],
       grad_fn=<LeakyReluBackward0>)
Leaky Relu backward: tensor([0.0100, 0.0100, 1.0000, 1.0000, 1.0000])


Elu

In [86]:
x = np.array([-5.,0.,2.,3.,5.])
elu_forward_v = np.vectorize(elu_forward)
print("Elu forward:",elu_forward_v(x))
elu_backward_v = np.vectorize(elu_backward)
print("Elu backward:",elu_backward_v(x))

Elu forward: [-0.99326205  0.          2.          3.          5.        ]
Elu backward: [0.00673795 1.         1.         1.         1.        ]


In [87]:
import torch
x = torch.tensor([-5.,0.,2.,3.,5.], requires_grad=True)
y = torch.nn.functional.elu(x)
print("ELU:",y)
external_grad = torch.ones_like(x)
y.backward(gradient=external_grad)
print("ELU backward:",x.grad)

ELU: tensor([-0.9933,  0.0000,  2.0000,  3.0000,  5.0000], grad_fn=<EluBackward0>)
ELU backward: tensor([0.0067, 1.0000, 1.0000, 1.0000, 1.0000])
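
The comparisons above are done by eye against PyTorch's autograd. As a complementary check that needs only NumPy, the sketch below compares each analytic derivative with a central finite-difference estimate. This is a different technique from autograd, and the helper `numerical_grad`, the step size `eps`, the tolerance, and the test points are assumptions made for illustration. The test points avoid 0, where the ReLU-style functions have a kink and a finite difference is not meaningful.

```python
import numpy as np

def numerical_grad(f, x, eps=1e-6):
    """Central-difference estimate of df/dx, applied element-wise."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

# (forward, analytic derivative) pairs, written to work on whole arrays
checks = {
    "sigmoid": (sigmoid, lambda v: sigmoid(v) * (1.0 - sigmoid(v))),
    "tanh": (np.tanh, lambda v: 1.0 - np.tanh(v) ** 2),
    "relu": (lambda v: np.maximum(0.0, v), lambda v: np.where(v > 0, 1.0, 0.0)),
    "leaky_relu": (lambda v: np.where(v > 0, v, 0.01 * v),
                   lambda v: np.where(v > 0, 1.0, 0.01)),
    "elu": (lambda v: np.where(v > 0, v, np.exp(v) - 1.0),
            lambda v: np.where(v > 0, 1.0, np.exp(v))),
}

# Test points chosen away from the kink at 0
x = np.array([-5.0, -1.0, 2.0, 3.0, 5.0])

results = {}
for name, (f, df) in checks.items():
    results[name] = bool(np.allclose(numerical_grad(f, x), df(x), atol=1e-6))
    print(f"{name}: analytic derivative matches finite difference = {results[name]}")
```

A mismatch here would point to an error in the hand-written derivative rather than in the forward function, which makes this a useful sanity check before comparing against autograd.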
