### Building a neural network from scratch with NumPy.
1. Define the architechture: Input, hidden, and output layers.
2. Initialise weights and biases: Random initialisation
3. Forward Propagation: Compute outputs layer by layer using activation functions
4. Loss Function: Calculate the error (Mean Squared Error/Cross Entropy)
5. Backward Propagation: Compute gradients using the chain rule
6. Update the weights and biases: using gradient decent (an optimization algorithm)

Let us start with a simple case, a fully connected neural network with:
- 1 input layer, 1 hidden layer, and 1 output
- Activation functions: Sigmoid for hidden/output layers
- Loss function: Mean Squared Error (MSE)

In [13]:
import numpy as np

# Sigmoid activation function and its derivative
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# This function simplifies gradient calculations in training
def sigmoid_derivative(x):
    return x * (1 - x)

# Mean Squared Error loss
def mse_loss(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)

- Sigmoid in Forward Pass: Computes the output for each neuron and enables the network to make predictions.
- Sigmoid Derivative in Backward Pass: Helps adjust the weights and biases by determining how each weight and bias contributes to the error.
- The Mean Squared Error (MSE) is a loss function which measures the difference between the predicted and actual values

In [14]:
# Specify the Neural Network Parameters
np.random.seed(42)
input_size = 2 # Number of features
hidden_size = 3 # Number of neurons in the hidden layer
output_size = 1 # Number of output neurons

- Now you want to initialise the weights for the connection between the input layer and the hidden layer. Here, you are creating a 2D array (matrix) of shape (input_size, hidden_size) filled with random values uniformly distributed between o and 1.
- -0.5 shifts the range of the random values from [0, 1] to [-0.5, 0.5]. Ensure that the weights are centred around zero, which is important for better training dynamics in neural networks

In [15]:
# Initialise weights and biases
weights_input_hidden = np.random.rand(input_size, hidden_size) - 0.5
weights_hidden_output = np.random.rand(hidden_size, output_size) - 0.5
bias_hidden = np.random.rand(1, hidden_size) - 0.5
bias_output = np.random.rand(1, output_size) - 0.5

# learning rate
lr = 0.1

In [19]:
# Training data (XOR problem)
x = np.array([[0, 1], [0, 1], [1, 0], [1, 1], [0, 1], [1, 0], [1, 1], [1, 1]])
y = np.array([[1], [1], [1], [0], [1], [1], [0], [0]])

epochs = 10000
for epoch in range(epochs):
    # Forward propagation
    hidden_input = np.dot(x, weights_input_hidden) + bias_hidden
    hidden_ouput = sigmoid(hidden_input)
    
    final_input = np.dot(hidden_ouput, weights_hidden_output) + bias_output
    final_output = sigmoid(final_input)
    
    # Compute loss
    loss = mse_loss(y, final_output)
    
    # Background Propagation
    error = y - final_output
    d_ouput = error * sigmoid_derivative(final_output)
    
    error_hidden = d_ouput.dot(weights_hidden_output.T)
    d_hidden = error_hidden * sigmoid_derivative(hidden_ouput)
    
    # Update weights and biases
    weights_input_hidden += x.T.dot(d_hidden) * lr
    bias_hidden += np.sum(d_hidden, axis=0, keepdims=True) * lr
    
    # Print loss every 1000 epochs
    if epoch % 1000 == 0:
        print(f"Epoch  {epoch}: {loss:.4f}")
    
# Test the network
print("\nTesting Neural Network:")
for i in range(len(x)):
    hidden_input = np.dot(x[i], weights_input_hidden) + bias_hidden
    hidden_ouput = sigmoid(hidden_input)
    
    final_input = np.dot(hidden_ouput, weights_hidden_output) + bias_output
    final_output = sigmoid(final_input)
    
    print(f"Input: {x[i]}, Predicted Output: {final_output[0][0]}, Rounded: {round(final_output[0][0])}")

Epoch  0: 0.2135
Epoch  1000: 0.2133
Epoch  2000: 0.2131
Epoch  3000: 0.2129
Epoch  4000: 0.2126
Epoch  5000: 0.2117
Epoch  6000: 0.2099
Epoch  7000: 0.2091
Epoch  8000: 0.2086
Epoch  9000: 0.2082

Testing Neural Network:
Input: [0 1], Predicted Output: 0.49476577644884345, Rounded: 0
Input: [0 1], Predicted Output: 0.49476577644884345, Rounded: 0
Input: [1 0], Predicted Output: 0.6761432843204604, Rounded: 1
Input: [1 1], Predicted Output: 0.4787089258581361, Rounded: 0
Input: [0 1], Predicted Output: 0.49476577644884345, Rounded: 0
Input: [1 0], Predicted Output: 0.6761432843204604, Rounded: 1
Input: [1 1], Predicted Output: 0.4787089258581361, Rounded: 0
Input: [1 1], Predicted Output: 0.4787089258581361, Rounded: 0
