## CS-471: Machine Learning
### **Submitted By**:
#### **Name**: Ayesh Ahmad
#### **CMS**: 365966
#### **Class**: BESE-12A
---
## Lab 11
#### Learn backpropagation through coding exercises at two levels of complexity:
- #### Basic Back Propagation: This exercise introduces the concept of backpropagation and its fundamental steps.
- #### Back Propagation for Multi-layer Neural Network: In this exercise, you'll deepen your understanding of backpropagation by implementing it for a multi-layer neural network.

--- 

## Exercise 1: Basic Back Propagation

##### Step 1: Importing Libraries
---

In [1]:
import numpy as np

##### Step 2: Defining the Network Architecture
---

In [2]:
input_size = 1
hidden_size = 1
output_size = 1

##### Step 3: Initializing Weights and Biases
---

In [3]:
np.random.seed(0)  

# Hidden layer parameters
W1 = np.random.rand(input_size, hidden_size)
b1 = np.random.rand(1, hidden_size)

# Output layer parameters
W2 = np.random.rand(hidden_size, output_size)
b2 = np.random.rand(1, output_size)

##### Step 4: Defining the Activation Function (Sigmoid)
---

In [4]:
def sigmoid(x):
    return 1 / (1 + np.exp(-x))
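
The backward pass in Step 7 relies on the identity sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x)). The following is a minimal sketch that checks this identity numerically with central differences. The helper name `sigmoid_grad` and the sample points are illustrative assumptions, not part of the lab code.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_grad(x):
    # Hypothetical helper: derivative written in terms of the sigmoid's own output
    s = sigmoid(x)
    return s * (1 - s)

xs = np.linspace(-4, 4, 9)   # a few sample points
eps = 1e-6

# Central-difference approximation of the derivative
numeric = (sigmoid(xs + eps) - sigmoid(xs - eps)) / (2 * eps)
sigmoid_max_diff = np.max(np.abs(numeric - sigmoid_grad(xs)))
print(sigmoid_max_diff)      # expected to be tiny (numerical noise only)
```

Because the derivative can be computed from the activation alone, the backward pass only needs to keep `a1` and `output`, not `z1` and `z2`.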

##### Step 5: Forward Pass
---

In [5]:
def forward(input_data):
    global W1, W2, b1, b2, a1

    # Hidden layer
    z1 = np.dot(input_data, W1) + b1
    a1 = sigmoid(z1)
    
    # Output layer
    z2 = np.dot(a1, W2) + b2
    output = sigmoid(z2)
    
    return output

##### Step 6: Loss Function (Mean Squared Error)
---

In [6]:
def loss(output, target):
    return np.mean(np.square(output - target))

##### Step 7: Backward Pass (Backpropagation)
---

In [7]:
def backward(input_data, target, output, learning_rate):
    global W1, W2, b1, b2
    
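    # Backpropagate through output layer
    # MSE gradient is 2 * (output - target) / N; the constant 2 / N is folded into the learning rate
    output_error = output - target
    dz2 = output_error * output * (1 - output)  # sigmoid'(z2) = output * (1 - output)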
    dW2 = np.dot(a1.T, dz2)
    db2 = np.sum(dz2, axis=0, keepdims=True)

    # Backpropagate through hidden layer
    hidden_error = np.dot(dz2, W2.T)
    dz1 = hidden_error * a1 * (1 - a1)  
    dW1 = np.dot(input_data.T, dz1)
    db1 = np.sum(dz1, axis=0, keepdims=True)

    # Update weights and biases
    W2 -= learning_rate * dW2
    b2 -= learning_rate * db2
    W1 -= learning_rate * dW1
    b1 -= learning_rate * db1
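
One way to gain confidence in hand-written gradients like these is a finite-difference gradient check. The sketch below re-implements the same formulas in a side-effect-free form and compares them against numerical derivatives. It assumes the loss 0.5 * sum((output - target)^2), which is the loss whose gradient is exactly `output - target`; the function names `half_sse`, `analytic_grads`, and `numeric_grads` are hypothetical and do not appear in the lab code.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def half_sse(params, x, t):
    # Loss whose gradient w.r.t. the output is exactly (output - target)
    W1, b1, W2, b2 = params
    a1 = sigmoid(x @ W1 + b1)
    out = sigmoid(a1 @ W2 + b2)
    return 0.5 * np.sum((out - t) ** 2)

def analytic_grads(params, x, t):
    # Same formulas as the backward pass above, without the parameter update
    W1, b1, W2, b2 = params
    a1 = sigmoid(x @ W1 + b1)
    out = sigmoid(a1 @ W2 + b2)
    dz2 = (out - t) * out * (1 - out)
    dW2 = a1.T @ dz2
    db2 = dz2.sum(axis=0, keepdims=True)
    dz1 = (dz2 @ W2.T) * a1 * (1 - a1)
    dW1 = x.T @ dz1
    db1 = dz1.sum(axis=0, keepdims=True)
    return [dW1, db1, dW2, db2]

def numeric_grads(params, x, t, eps=1e-6):
    # Central differences: perturb one parameter entry at a time
    grads = []
    for p in params:
        g = np.zeros_like(p)
        for idx in np.ndindex(p.shape):
            old = p[idx]
            p[idx] = old + eps
            up = half_sse(params, x, t)
            p[idx] = old - eps
            down = half_sse(params, x, t)
            p[idx] = old  # restore the original value
            g[idx] = (up - down) / (2 * eps)
        grads.append(g)
    return grads

rng = np.random.default_rng(0)
params = [rng.random((1, 1)) for _ in range(4)]  # W1, b1, W2, b2
x = np.array([[0.0], [1.0]])
t = np.array([[0.0], [1.0]])

grad_max_diff = max(
    np.max(np.abs(a - n))
    for a, n in zip(analytic_grads(params, x, t), numeric_grads(params, x, t))
)
print(grad_max_diff)  # a very small value suggests the formulas agree
```

A large difference would usually point to a sign error, a missing transpose, or a forgotten activation derivative.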

##### Step 8: Training Loop
---

In [8]:
num_epochs = 1000
learning_rate = 0.1

# Input and target data
input_data = np.array([[0], [1]])
target = np.array([[0], [1]])

for epoch in range(1, num_epochs+1):
    output = forward(input_data)
    current_loss = loss(output, target)
    backward(input_data, target, output, learning_rate)

    # Print current loss every 100 epochs
    if epoch % 100 == 0:
        print(f"Epoch {epoch}: Loss = {current_loss}")

Epoch 100: Loss = 0.24790425470680683
Epoch 200: Loss = 0.24419961091272355
Epoch 300: Loss = 0.2412532405672978
Epoch 400: Loss = 0.23724138675368395
Epoch 500: Loss = 0.2316736164398244
Epoch 600: Loss = 0.2239170347183977
Epoch 700: Loss = 0.21315264327149527
Epoch 800: Loss = 0.19851427288626994
Epoch 900: Loss = 0.17955877188368535
Epoch 1000: Loss = 0.15699501239343513


##### Step 9: Testing the Trained Model
---

In [9]:
test_data = np.array([[0.5]])
print(f"Prediction for input {test_data[0][0]}: {forward(test_data)[0][0]}")

Prediction for input 0.5: 0.5366546209733737


--- 
## Exercise 2: Back Propagation for Multi-layer Neural Network

##### Step 1: Importing Libraries
---

In [10]:
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

##### Step 2: Defining the Network Architecture
---

In [11]:
input_size = 4     # four Iris features
hidden_size1 = 10  # first hidden layer
hidden_size2 = 10  # second hidden layer
output_size = 3    # three Iris classes

##### Step 3: Initializing Weights and Biases
---

In [12]:
np.random.seed(0)  

# Hidden layer 1 parameters
W1 = np.random.randn(input_size, hidden_size1)
b1 = np.random.randn(1, hidden_size1)

# Hidden layer 2 parameters
W2 = np.random.randn(hidden_size1, hidden_size2)
b2 = np.random.randn(1, hidden_size2)

# Output layer parameters
W3 = np.random.randn(hidden_size2, output_size)
b3 = np.random.randn(1, output_size)

##### Step 4: Defining the Activation Function
---

In [13]:
def relu(x):
    return np.maximum(0, x)

def softmax(x):
    exps = np.exp(x - np.max(x, axis=1, keepdims=True))
    return exps / np.sum(exps, axis=1, keepdims=True)
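
The `softmax` above subtracts the row-wise maximum before exponentiating. The sketch below illustrates why: with large logits a naive implementation overflows, while the shifted version gives the same result as for small logits. The example logits are made up for illustration.

```python
import numpy as np

def softmax(x):
    exps = np.exp(x - np.max(x, axis=1, keepdims=True))
    return exps / np.sum(exps, axis=1, keepdims=True)

large_logits = np.array([[1000.0, 1001.0, 1002.0]])

# Naive version: exp(1000) overflows to inf, and inf / inf is nan
with np.errstate(over="ignore", invalid="ignore"):
    naive = np.exp(large_logits) / np.sum(np.exp(large_logits), axis=1, keepdims=True)

stable = softmax(large_logits)
reference = softmax(np.array([[0.0, 1.0, 2.0]]))  # same logits shifted by 1000

print(naive)      # all nan
print(stable)     # valid probabilities
print(reference)  # matches `stable`, since softmax is shift-invariant
```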

##### Step 5: Forward Pass
---

In [14]:
def forward(input_data):
    global W1, W2, W3, b1, b2, b3, a1, z1, a2, z2

    # Hidden layer 1
    z1 = np.dot(input_data, W1) + b1
    a1 = relu(z1)

    # Hidden layer 2
    z2 = np.dot(a1, W2) + b2
    a2 = relu(z2)

    # Output layer
    z3 = np.dot(a2, W3) + b3
    output = softmax(z3)
    return output
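
To make the data flow through the three layers concrete, the following sketch traces array shapes for a small random batch. It uses its own local weights inside a hypothetical `trace_shapes` function so that it does not touch the notebook's global parameters; the batch size of 5 is an arbitrary assumption.

```python
import numpy as np

def trace_shapes(batch_size=5):
    rng = np.random.default_rng(0)
    x = rng.random((batch_size, 4))  # batch of 4-feature inputs
    w1, b1 = rng.standard_normal((4, 10)), rng.standard_normal((1, 10))
    w2, b2 = rng.standard_normal((10, 10)), rng.standard_normal((1, 10))
    w3, b3 = rng.standard_normal((10, 3)), rng.standard_normal((1, 3))

    a1 = np.maximum(0, x @ w1 + b1)   # ReLU, hidden layer 1
    a2 = np.maximum(0, a1 @ w2 + b2)  # ReLU, hidden layer 2
    z3 = a2 @ w3 + b3                 # output logits
    exps = np.exp(z3 - z3.max(axis=1, keepdims=True))
    probs = exps / exps.sum(axis=1, keepdims=True)  # softmax

    shapes = {"x": x.shape, "a1": a1.shape, "a2": a2.shape, "probs": probs.shape}
    return shapes, probs

shapes, probs = trace_shapes()
print(shapes)  # {'x': (5, 4), 'a1': (5, 10), 'a2': (5, 10), 'probs': (5, 3)}
```

The biases have shape `(1, n)` and are broadcast across the batch dimension, which is why the same code works for any batch size.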

##### Step 6: Loss Function (Cross Entropy Loss)
---

In [15]:
def loss(output, target):
    m = target.shape[0]
    # Pick the predicted probability of the true class for each sample
    log_likelihood = -np.log(output[np.arange(m), target.argmax(axis=1)])
    loss = np.sum(log_likelihood) / m
    return loss
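
The indexing in this loss selects, for each sample, the predicted probability of the true class. A small worked example, with made-up probabilities, shows that this matches the textbook form of cross-entropy, -sum(target * log(output)) / m:

```python
import numpy as np

# Two samples, three classes (illustrative values)
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])
onehot = np.array([[1, 0, 0],
                   [0, 1, 0]])

m = onehot.shape[0]

# Indexing form, as used in the loss function above
indexed = np.sum(-np.log(probs[np.arange(m), onehot.argmax(axis=1)])) / m

# Textbook form: only the true-class terms survive the multiplication
textbook = -np.sum(onehot * np.log(probs)) / m

# Computed by hand from the two true-class probabilities
by_hand = -(np.log(0.7) + np.log(0.8)) / 2

print(indexed, textbook, by_hand)  # all three agree, roughly 0.29
```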

##### Step 7: Backward Pass (Backpropagation)
---

In [16]:
def backward(input_data, target, output, learning_rate):
    global W1, W2, W3, b1, b2, b3

    m = input_data.shape[0]

    # Backpropagate through output layer
    dz3 = output - target
    dW3 = (1 / m) * np.dot(a2.T, dz3)
    db3 = (1 / m) * np.sum(dz3, axis=0, keepdims=True)
    
    # Backpropagate through hidden layer 2
    dA2 = np.dot(dz3, W3.T)
    dz2 = dA2 * (z2 > 0)
    dW2 = (1 / m) * np.dot(a1.T, dz2)
    db2 = (1 / m) * np.sum(dz2, axis=0, keepdims=True)
    
    # Backpropagate through hidden layer 1
    dA1 = np.dot(dz2, W2.T)
    dz1 = dA1 * (z1 > 0)
    dW1 = (1 / m) * np.dot(input_data.T, dz1)
    db1 = (1 / m) * np.sum(dz1, axis=0, keepdims=True)
    
    W3 -= learning_rate * dW3
    b3 -= learning_rate * db3
    W2 -= learning_rate * dW2
    b2 -= learning_rate * db2
    W1 -= learning_rate * dW1
    b1 -= learning_rate * db1
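
The line `dz3 = output - target` uses the fact that, for softmax followed by cross-entropy, the gradient of the loss with respect to the logits simplifies to the difference between predictions and one-hot targets (divided by `m` for the mean loss). The sketch below checks this numerically on random logits; the helper `cross_entropy_from_logits` and the sample labels are assumptions made for illustration.

```python
import numpy as np

def softmax(x):
    exps = np.exp(x - np.max(x, axis=1, keepdims=True))
    return exps / np.sum(exps, axis=1, keepdims=True)

def cross_entropy_from_logits(z, target):
    # Hypothetical helper: mean cross-entropy of softmax(z) against one-hot targets
    m = target.shape[0]
    return -np.sum(target * np.log(softmax(z))) / m

rng = np.random.default_rng(0)
z = rng.normal(size=(4, 3))           # logits for 4 samples, 3 classes
target = np.eye(3)[[0, 2, 1, 2]]      # one-hot labels

# Analytic gradient of the mean loss w.r.t. the logits
analytic = (softmax(z) - target) / z.shape[0]

# Central-difference estimate, one logit at a time
eps = 1e-6
numeric = np.zeros_like(z)
for idx in np.ndindex(z.shape):
    z_plus, z_minus = z.copy(), z.copy()
    z_plus[idx] += eps
    z_minus[idx] -= eps
    numeric[idx] = (
        cross_entropy_from_logits(z_plus, target)
        - cross_entropy_from_logits(z_minus, target)
    ) / (2 * eps)

logit_grad_max_diff = np.max(np.abs(analytic - numeric))
print(logit_grad_max_diff)  # expected to be close to zero
```

In the backward pass above the division by `m` is applied when computing `dW3` and `db3` rather than in `dz3` itself, which gives the same parameter gradients for the output layer.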

##### Step 8: Training Loop
---

In [17]:
num_epochs = 1000
learning_rate = 0.01

iris = load_iris()
X = iris.data
y = iris.target
y_encoded = np.eye(output_size)[y]

X_train, X_test, y_train, y_test = train_test_split(X, y_encoded, test_size=0.2, random_state=0)

for epoch in range(1, num_epochs+1):
    y_pred = forward(X_train)
    train_loss = loss(y_pred, y_train)
    backward(X_train, y_train, y_pred, learning_rate)
    
    if epoch % 100 == 0:
        print(f"Epoch {epoch}, Loss: {train_loss}")

Epoch 100, Loss: 0.11834210622229858
Epoch 200, Loss: 0.11205369203497231
Epoch 300, Loss: 0.10700120093391824
Epoch 400, Loss: 0.10275262896894041
Epoch 500, Loss: 0.09908939510165261
Epoch 600, Loss: 0.09587098897416459
Epoch 700, Loss: 0.09300394524293992
Epoch 800, Loss: 0.09042092363773625
Epoch 900, Loss: 0.08807544602450816
Epoch 1000, Loss: 0.08593401950838968
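
The training cell builds its targets with `np.eye(output_size)[y]`, which turns integer class labels into one-hot rows by indexing an identity matrix. A short illustration with made-up labels:

```python
import numpy as np

labels = np.array([0, 2, 1, 2])  # illustrative integer class labels
num_classes = 3

# Row i of the identity matrix is the one-hot vector for class i
onehot_labels = np.eye(num_classes)[labels]
print(onehot_labels)

# argmax along each row recovers the original labels
recovered = onehot_labels.argmax(axis=1)
print(recovered)  # [0 2 1 2]
```

This is the same round trip used in Step 9, where `np.argmax(y_test, axis=1)` converts the one-hot test targets back to class indices.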


##### Step 9: Testing the Trained Model
---

In [18]:
y_pred_test = forward(X_test)
y_pred_labels = np.argmax(y_pred_test, axis=1)
print(classification_report(np.argmax(y_test, axis=1), y_pred_labels))

              precision    recall  f1-score   support

           0       1.00      1.00      1.00        11
           1       1.00      1.00      1.00        13
           2       1.00      1.00      1.00         6

    accuracy                           1.00        30
   macro avg       1.00      1.00      1.00        30
weighted avg       1.00      1.00      1.00        30



--- 
## Results & Conclusion

##### Exercise 1: Basic Back Propagation
- **Architecture**: Neural network with one hidden layer: 1 input neuron, 1 hidden neuron, and 1 output neuron, all using sigmoid activations.
- **Training**: The network was trained for 1000 epochs with a learning rate of 0.1 and reached a final MSE loss of about 0.157.
- **Prediction**: For input 0.5, the network predicted an output of approximately 0.537.

##### Exercise 2: Back Propagation for Multi-layer Neural Network
- **Architecture**: Multi-layer neural network with 4 input neurons, 10 neurons in each of the two hidden layers, and 3 output neurons.
- **Training**: The network was trained for 1000 epochs using backpropagation and achieved a final loss of 0.086.
- **Performance**: The model achieved perfect precision, recall, and F1-score on the 30-sample test set, indicating that it learned to classify the held-out Iris samples correctly.

##### Conclusion
- Backpropagation reduced the training loss in both exercises: the basic network's MSE fell from 0.248 (epoch 100) to 0.157 (epoch 1000) and was still decreasing, while the multi-layer network reached a cross-entropy loss of 0.086 and classified all 30 Iris test samples correctly.
- The multi-layer network with two hidden layers produced the stronger result, but the two exercises differ in task, data, and loss function, so this is not a controlled comparison of shallow and deep networks.
- Further tuning of hyperparameters such as learning rate, number of epochs, and network architecture could potentially improve the performance even more.