In [61]:
import numpy as np

# Activation functions
## Sigmoid function
The sigmoid function squashes any real input into the range (0, 1), so its output can be read as a probability; for this reason it is commonly used as the output activation in binary classification models. As a non-linear activation, it also allows the network to learn more complex decision boundaries than a purely linear model could. The formula for the sigmoid function is $$ σ(x) = \frac{1}{1 + e^{-x}}. $$

In [74]:
def sigmoid_forward(x):
    return 1 / (1 + np.exp(-x))

## Derivative of sigmoid
Backpropagation computes the gradient of the loss function with respect to the weights and biases of a neural network. It lets the network learn from its errors: each weight is updated using gradients that pass through the derivative of every activation function along the way. The backward pass for sigmoid is therefore the derivative of the sigmoid function, which can be expressed as $$ \frac{\mathrm{d}σ}{\mathrm{d}x}(x) = σ(x) \cdot \bigl(1 - σ(x)\bigr) $$

In [76]:
def sigmoid_backward(x):
    s = sigmoid_forward(x)  # evaluate the sigmoid once and reuse it
    return s * (1 - s)
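
A quick way to gain confidence in a hand-written derivative is to compare it with a central finite difference. The sketch below does this for the sigmoid formula above; the names `sigmoid`, `sigmoid_grad`, the step size `h`, and the sample points are illustrative choices, not part of the original notebook.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    # Analytic derivative: sigma(x) * (1 - sigma(x))
    s = sigmoid(x)
    return s * (1.0 - s)

# Sample points and step size are arbitrary choices for the check
x = np.linspace(-5.0, 5.0, 11)
h = 1e-5

# Central difference approximation of the derivative
numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2.0 * h)

# Largest disagreement between the numeric and analytic derivative
max_error = np.max(np.abs(numeric - sigmoid_grad(x)))
print(max_error)
```

If the analytic formula is right, `max_error` should be tiny compared with the gradient values themselves, which peak at 0.25 at the origin.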

## Tanh function
The output of the tanh function lies in the range (-1, 1) and is symmetric around the origin (zero-centred), which can help learning algorithms converge. For this reason it is often preferred to the sigmoid function for the hidden layers of multi-layer neural networks. The formula for the tanh function can be expressed as $$ \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} $$

In [None]:
def forward_tanh(x):
    # np.tanh evaluates the formula above without overflowing in np.exp for large |x|
    return np.tanh(x)

As with the sigmoid function, the backward pass of the tanh function is its derivative, which can be expressed as $$ \frac{\mathrm{d}}{\mathrm{d}x}\tanh(x) = 1 - \tanh^{2}(x) $$

In [None]:
def backward_tanh(x):
    return 1 - (forward_tanh(x) ** 2)
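
The symmetry claim and the derivative formula above can both be checked numerically. This is a minimal sketch, assuming NumPy's built-in `np.tanh`; the helper `sigmoid` and the sample points are illustrative. It also shows that tanh is a rescaled, shifted sigmoid, $\tanh(x) = 2σ(2x) - 1$.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.linspace(-3.0, 3.0, 13)
t = np.tanh(x)

# Odd symmetry around the origin: tanh(-x) == -tanh(x)
is_odd = np.allclose(np.tanh(-x), -t)

# tanh is a rescaled and shifted sigmoid: tanh(x) = 2 * sigmoid(2x) - 1
matches_sigmoid = np.allclose(t, 2.0 * sigmoid(2.0 * x) - 1.0)

# Central difference check of the derivative 1 - tanh(x)^2
h = 1e-5
numeric = (np.tanh(x + h) - np.tanh(x - h)) / (2.0 * h)
grad_ok = np.allclose(numeric, 1.0 - t ** 2, atol=1e-8)

print(is_odd, matches_sigmoid, grad_ok)
```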

## ReLU function
The ReLU (Rectified Linear Unit) function is non-linear, which lets the model learn more complex relationships in the data, and it is computationally efficient because it only requires a comparison with zero. The ReLU function can be expressed as 
$$
\text{ReLU}(x) = 
\begin{cases} 
x, & \text{if } x \geq 0 \\ 
0, & \text{if } x < 0 
\end{cases}
$$



In [78]:
def relu_forward(x):
    return np.maximum(0, x)

The backward pass for the ReLU function, taking the derivative at $x = 0$ (where it is undefined) to be 0 by convention, can be expressed as 
$$
\text{ReLU}'(x) = 
\begin{cases} 
1, & \text{if } x > 0 \\ 
0, & \text{if } x \leq 0 
\end{cases}
$$


In [84]:
def relu_backward(x):
    # Elementwise, so it works on NumPy arrays as well as scalars
    return np.where(x > 0, 1.0, 0.0)
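
In practice the input to an activation is an array of pre-activations rather than a single number, so the piecewise derivative above has to be applied elementwise. The following sketch shows one way to do that with `np.where`; the names `relu`, `relu_grad`, and the sample values are illustrative only.

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

def relu_grad(x):
    # 1 where x > 0, 0 elsewhere (including x == 0, by convention)
    return np.where(x > 0, 1.0, 0.0)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])

# Hypothetical upstream gradient arriving from the next layer
upstream = np.full_like(x, 3.0)

activations = relu(x)
# Chain rule: the gradient passes through only where the input was positive
downstream = upstream * relu_grad(x)

print(activations)
print(downstream)
```

A Python conditional such as `1 if x > 0 else 0` only works for scalars; on an array with more than one element the comparison is ambiguous and NumPy raises a `ValueError`.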

## Softmax function
Unlike the sigmoid function, the softmax function is used in multiclass classification tasks: it converts a vector of raw scores into probabilities that sum to 1, where each probability represents the likelihood of the input belonging to the corresponding class. The softmax function can be expressed as $$\text{softmax}(z_i) = \frac{e^{z_i}}{\sum_{j=1}^n e^{z_j}}$$

In [82]:
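def softmax_forward(vector):
    # Subtracting the maximum leaves the result unchanged but avoids overflow in np.exp
    e = np.exp(vector - np.max(vector))
    return e / np.sum(e)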
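
Because softmax only depends on the differences between scores, adding a constant to every input leaves the output unchanged. A common trick that relies on this is to subtract the maximum score before exponentiating, which keeps `np.exp` from overflowing. The sketch below illustrates the idea; `softmax_stable` and the test vectors are illustrative names and values.

```python
import numpy as np

def softmax_stable(z):
    # Shift so the largest score is 0; exp() then never exceeds 1
    shifted = z - np.max(z)
    e = np.exp(shifted)
    return e / np.sum(e)

small = np.array([1.0, 2.0, 3.0])
large = small + 1000.0  # same differences, much larger magnitude

p_small = softmax_stable(small)
p_large = softmax_stable(large)

print(p_small, p_small.sum())
print(p_large)
```

Without the shift, `np.exp(1003.0)` overflows to `inf` and the division yields `nan` for the large-magnitude input.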

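The backward pass for the softmax function is the partial derivative of each output with respect to each input, which can be expressed as
$$
\frac{\partial\, \text{softmax}(z_i)}{\partial z_k} = \text{softmax}(z_i) \cdot \bigl(\delta_{ik} - \text{softmax}(z_k)\bigr)
$$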

$$
\text{where } \delta_{ik} = 
\begin{cases} 
1, & \text{if } i = k \\ 
0, & \text{if } i \neq k
\end{cases}
$$


In [80]:
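def softmax_backward(vector, y):
    # Gradient of the cross-entropy loss with respect to the inputs, assuming y is a
    # one-hot target: the softmax Jacobian combined with the loss derivative gives p - y
    p = softmax_forward(vector)
    return p - y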
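
The Jacobian formula and the compact `p - y` expression are related through the chain rule. The sketch below assumes a cross-entropy loss $L = -\sum_i y_i \log p_i$ with a one-hot target `y`, which the notebook does not state explicitly; the names `softmax`, `softmax_jacobian`, and the sample values are illustrative.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / np.sum(e)

def softmax_jacobian(z):
    # J[i, k] = p_i * (delta_ik - p_k)
    p = softmax(z)
    return np.diag(p) - np.outer(p, p)

z = np.array([0.5, -1.0, 2.0])
y = np.array([0.0, 0.0, 1.0])  # assumed one-hot target

p = softmax(z)
J = softmax_jacobian(z)

# Derivative of the assumed cross-entropy loss with respect to p
dL_dp = -y / p

# Chain rule: dL/dz_k = sum_i dL/dp_i * J[i, k]
dL_dz = J.T @ dL_dp

print(dL_dz)
print(p - y)
```

Under these assumptions the two printed vectors agree, which is why the backward pass can return `p - y` directly instead of building the full Jacobian.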

# Dropout function
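Dropout is a regularisation technique that reduces overfitting by randomly "dropping" (zeroing) a fraction of the activations passed between successive layers during training. The surviving activations are scaled by $1 / (1 - p)$, where $p$ is the dropout rate, so that the expected activation is unchanged and no rescaling is needed at inference time.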

In [2]:
def dropout(X, dropout_rate, training=True):
    if training:
        mask = np.random.rand(*X.shape) < (1 - dropout_rate)
        X = X * mask / (1 - dropout_rate)
    return X
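
The effect of the scaling by `1 / (1 - dropout_rate)` can be seen empirically: the mean activation stays roughly the same even though a fraction of the entries is zeroed. This is a minimal sketch; `inverted_dropout`, the explicit `rng` argument, the seed, and the array shape are assumptions made for reproducibility and are not part of the original function.

```python
import numpy as np

def inverted_dropout(X, dropout_rate, rng, training=True):
    # At inference time the input passes through unchanged
    if not training:
        return X
    keep_prob = 1.0 - dropout_rate
    # Keep each entry independently with probability keep_prob
    mask = rng.random(X.shape) < keep_prob
    # Rescale the survivors so the expected value is preserved
    return X * mask / keep_prob

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily
X = np.ones((1000, 100))

dropped = inverted_dropout(X, dropout_rate=0.3, rng=rng)
mean_train = dropped.mean()                # close to 1.0
fraction_zero = np.mean(dropped == 0.0)    # close to the dropout rate

unchanged = inverted_dropout(X, dropout_rate=0.3, rng=rng, training=False)

print(mean_train, fraction_zero)
```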

# Implemented Neural network

In [59]:
class NeuralNetwork:
    def __init__(self):
        pass

    def sigmoid_forward(self, x):
        return 1 / (1 + np.exp(-x))

    def sigmoid_backward(self, x):
        # Call the method on self rather than relying on a global function
        s = self.sigmoid_forward(x)
        return s * (1 - s)

    def relu_forward(self, x):
        return np.maximum(0, x)

    def relu_backward(self, x):
        # Elementwise, so it works on NumPy arrays as well as scalars
        return np.where(x > 0, 1.0, 0.0)

    def softmax_forward(self, vector):
        # Shift by the maximum to avoid overflow in np.exp
        e = np.exp(vector - np.max(vector))
        return e / np.sum(e)

    def softmax_backward(self, vector, y):
        p = self.softmax_forward(vector)
        return p - y

    def dropout(self, X, dropout_rate, training=True):
        if training:
            mask = np.random.rand(*X.shape) < (1 - dropout_rate)
            X = X * mask / (1 - dropout_rate)
        return X

