## Activation Function form Scratch

In [1]:
import numpy as np

<h4>Set of inputs

In [2]:
X = [[1, 2.2, 1.1, 5],     # sample of 4 inputs
     [0.1, 4, 0.9, 3], 
     [4, 2.1, 0.5, 3.2]]

<h4>Function to create a layer

In [3]:
class Layer_Dense:
    def __init__(self, n_inputs, n_neurons):          
        # weights matrix is of n_inputs * n_neurons
        self.weights = 0.1 * np.random.randn(n_inputs, n_neurons)
        # biases will of same size of neurons we have
        self.biases = np.zeros((1, n_neurons))

    def forward(self, inputs):
        # final output is going to be as (weights * inputs) + bias
        self.output = np.dot(inputs, self.weights) + self.biases

### ReLU Activation Function

In [22]:
class Activation_ReLU:
    def forward(self, inputs):
        self.output = np.maximum(0, inputs)

### Softmax Activation Function

In [7]:
class Activation_Softmax:
    def forward(self, inputs):
        exp_values = np.exp(inputs - np.max(inputs, axis = 1, keepdims = True))
        probabilities = exp_values / np.sum(exp_values, axis = 1, keepdims = True)
        self.output = probabilities

<h4>Creating Layer 1 and applying ReLU Activation Function</h4>
This layer has 4 input features and 5 neurons

In [23]:
layer1 = Layer_Dense(4, 5)
activation1 = Activation_ReLU()
layer1.forward(X)
activation1.forward(layer1.output)
activation1.output

array([[0.        , 0.        , 0.74610986, 0.67315742, 0.        ],
       [0.        , 0.        , 0.24890027, 0.6755653 , 0.        ],
       [0.        , 0.        , 0.71651797, 0.62955556, 0.        ]])

## Creating Layer 2 and applying Softmax Activation Function 
<p>Output of Layer 1 is input to Layer 2</p>
So, it'll have 5 input and any number of neurons we can have.

In [27]:
layer2 = Layer_Dense(5, 2)
activation2 = Activation_Softmax()
layer2.forward(layer1.output)
layer2.output
activation2.forward(layer2.output)
activation2.output

array([[0.47525765, 0.52474235],
       [0.47528659, 0.52471341],
       [0.49065911, 0.50934089]])

### We get final output from which we can calculate loss and accordingly back-propagation can be done.