<a href="https://www.kaggle.com/code/sacrum/ml-labs-12-neural-networks?scriptVersionId=178243829" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

# Exercise 1: Basic Back Propagation

### Step 1: Import Libraries
Import NumPy, the only library this exercise needs.

In [1]:
import numpy as np

### Step 2: Define the Network Architecture
Define the architecture of your neural network. It should have the following components:
- `input_size`: 1 input neuron. Think of this as the input data.
- `hidden_size`: 1 hidden neuron. This is an intermediate processing unit.
- `output_size`: 1 output neuron. This represents the network's prediction.

In [2]:
# Define the network architecture
input_size = 1
hidden_size = 1
output_size = 1

### Step 3: Initialize Weights and Biases
Initialize the parameters of the neural network with random values. The network learns by adjusting these parameters during training. Think of `W1` and `W2` as connection weights and `b1` and `b2` as biases.

In [3]:
np.random.seed(42)  # For reproducibility

W1 = np.random.randn(input_size, hidden_size)
b1 = np.random.randn(hidden_size)
W2 = np.random.randn(hidden_size, output_size)
b2 = np.random.randn(output_size)


### Step 4: Define the Activation Function (Sigmoid)
Define the sigmoid activation function, which maps any real input to a value between 0 and 1.

In [4]:
def sigmoid(x):
    return 1 / (1 + np.exp(-x))
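
As a quick check of the function above, the sketch below evaluates `sigmoid` at a few points and compares its derivative, σ'(x) = σ(x)(1 − σ(x)), against a finite-difference estimate. The helper name `sigmoid_derivative` and the sample points are assumptions made for illustration; they are not part of the lab.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    # Hypothetical helper: the derivative expressed through the sigmoid itself
    s = sigmoid(x)
    return s * (1 - s)

xs = np.array([-2.0, 0.0, 2.0])  # arbitrary sample points
eps = 1e-6

# Central finite difference as an independent estimate of the slope
numeric = (sigmoid(xs + eps) - sigmoid(xs - eps)) / (2 * eps)

print(sigmoid(xs))
print(sigmoid_derivative(xs))
print(np.allclose(numeric, sigmoid_derivative(xs)))
```

The backward pass in Step 7 relies on this identity when it multiplies by `output * (1 - output)` and `a1 * (1 - a1)`.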

### Step 5: Forward Pass
The `forward` function computes the forward pass of the neural network. It takes an `input_data` and passes it through the network to generate an `output`. Here's what happens:
- `z1` is the result of multiplying the input data by the weight matrix `W1`, and then adding the bias `b1`. This is the input to the first (and only) hidden neuron.
- `a1` is the result of applying the sigmoid activation function to `z1`. This is the output of the hidden layer.
- `z2` is computed similarly using `a1`, `W2`, and `b2`. It's the input to the output neuron.
- Finally, `output` is the result of applying the sigmoid activation function to `z2`. This is the final output of the network.

In [5]:
def forward(input_data):
    z1 = np.dot(input_data, W1) + b1
    a1 = sigmoid(z1)
    z2 = np.dot(a1, W2) + b2
    output = sigmoid(z2)
    return output, a1
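
To make the shapes in the forward pass concrete, the sketch below traces the same four steps with small, made-up parameter values. The numbers are assumptions chosen for readability, not the values drawn in Step 3.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Made-up parameters for illustration (not the values drawn in Step 3)
W1 = np.array([[0.5]])    # shape (input_size, hidden_size) = (1, 1)
b1 = np.array([0.0])      # shape (hidden_size,)
W2 = np.array([[-1.0]])   # shape (hidden_size, output_size) = (1, 1)
b2 = np.array([0.5])      # shape (output_size,)

x = np.array([[0.0], [1.0]])  # two samples, one feature each: shape (2, 1)

z1 = np.dot(x, W1) + b1       # b1 is broadcast across the two rows
a1 = sigmoid(z1)              # hidden activations, shape (2, 1)
z2 = np.dot(a1, W2) + b2
output = sigmoid(z2)          # predictions, shape (2, 1)

print(z1.shape, a1.shape, z2.shape, output.shape)
print(output)
```

Each row of `x` is one sample, so every intermediate array keeps one row per sample. For the first sample, `z1` is 0, `a1` is 0.5, `z2` is 0, and the output is exactly 0.5.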

### Step 6: Loss Function (Mean Squared Error)
The `loss` function computes the mean squared error between the predicted output and the target output. This tells us how far off our predictions are from the desired targets. During training, the goal is to minimize this loss.

In [6]:
# Loss Function (Mean Squared Error)
def loss(predicted_output, target_output):
    return np.mean((predicted_output - target_output) ** 2)
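
A small worked example can help calibrate what the loss values mean. The sketch below reuses the `loss` definition above on two made-up sets of predictions for the targets 0 and 1; the arrays `undecided` and `better` are illustrative assumptions.

```python
import numpy as np

def loss(predicted_output, target_output):
    return np.mean((predicted_output - target_output) ** 2)

target = np.array([[0.0], [1.0]])

undecided = np.array([[0.5], [0.5]])  # the network outputs 0.5 for both inputs
better = np.array([[0.1], [0.9]])     # predictions closer to the targets

print(loss(undecided, target))  # 0.25
print(loss(better, target))     # approximately 0.01
```

A loss of 0.25 therefore corresponds to a network that answers 0.5 regardless of its input, which is a useful baseline when reading the training log in Step 8.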

### Step 7: Backward Pass (Backpropagation)
The `backward` function implements backpropagation, which is the key to training the neural network. Here's a breakdown:
- `output` is the result of a forward pass through the network. It's used to compute the output error, which is the difference between the predicted output and the target output.
- `dW2` and `db2` are gradients that indicate how much the weights and biases in the output layer need to be adjusted to reduce the error.
- `hidden_error` is a measure of how much the hidden layer contributed to the error. It's used to compute `dW1` and `db1`, which represent the gradients for the weights and biases in the hidden layer.
- Finally, the weights and biases are updated using these gradients. The learning rate (`learning_rate`) controls the size of the updates.

In [7]:
def backward(input_data, target_output, output, a1, learning_rate=0.1):
    global W1, b1, W2, b2  # Declare globals at the start
    
    output_error = output - target_output
    dW2 = np.dot(a1.T, output_error * output * (1 - output))
    db2 = np.sum(output_error * output * (1 - output), axis=0)
    
    hidden_error = np.dot(output_error * output * (1 - output), W2.T)
    dW1 = np.dot(input_data.T, hidden_error * a1 * (1 - a1))
    db1 = np.sum(hidden_error * a1 * (1 - a1), axis=0)
    
    W1 -= learning_rate * dW1
    b1 -= learning_rate * db1
    W2 -= learning_rate * dW2
    b2 -= learning_rate * db2
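
One common way to gain confidence in hand-written gradients is a numerical gradient check. The sketch below is an assumption-laden illustration rather than part of the lab: `forward_with`, `half_sse`, `analytic_gradients`, and `numeric_gradients` are hypothetical helpers, and the parameters are drawn from a separate random generator. The formulas in `backward` are the gradients of half the sum of squared errors, so that is the objective used here. It differs from the mean squared error only by a constant factor, and for the two-sample dataset in this exercise that factor is 1.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def forward_with(params, x):
    # Hypothetical variant of `forward` that takes the parameters explicitly
    W1, b1, W2, b2 = params
    a1 = sigmoid(np.dot(x, W1) + b1)
    output = sigmoid(np.dot(a1, W2) + b2)
    return output, a1

def half_sse(params, x, target):
    # Objective whose gradient the update rule follows: 0.5 * sum of squared errors
    output, _ = forward_with(params, x)
    return 0.5 * np.sum((output - target) ** 2)

def analytic_gradients(params, x, target):
    # Same formulas as in `backward`, without the parameter update
    W1, b1, W2, b2 = params
    output, a1 = forward_with(params, x)
    delta2 = (output - target) * output * (1 - output)
    dW2 = np.dot(a1.T, delta2)
    db2 = np.sum(delta2, axis=0)
    delta1 = np.dot(delta2, W2.T) * a1 * (1 - a1)
    dW1 = np.dot(x.T, delta1)
    db1 = np.sum(delta1, axis=0)
    return [dW1, db1, dW2, db2]

def numeric_gradients(params, x, target, eps=1e-6):
    # Central finite differences, perturbing one parameter entry at a time
    grads = []
    for p in params:
        g = np.zeros_like(p)
        for idx in np.ndindex(p.shape):
            original = p[idx]
            p[idx] = original + eps
            plus = half_sse(params, x, target)
            p[idx] = original - eps
            minus = half_sse(params, x, target)
            p[idx] = original  # restore the entry before moving on
            g[idx] = (plus - minus) / (2 * eps)
        grads.append(g)
    return grads

rng = np.random.default_rng(0)
params = [rng.standard_normal((1, 1)), rng.standard_normal(1),
          rng.standard_normal((1, 1)), rng.standard_normal(1)]
x = np.array([[0.0], [1.0]])
target = np.array([[0.0], [1.0]])

analytic = analytic_gradients(params, x, target)
numeric = numeric_gradients(params, x, target)
max_diff = max(np.max(np.abs(a - n)) for a, n in zip(analytic, numeric))
print(f'Largest difference: {max_diff:.2e}')
```

If the two sets of gradients agree to within a small tolerance, the backpropagation formulas are very likely consistent with the objective.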


### Step 8: Training Loop
This is the training loop. The input data (`input_data`) and target values (`target`) are defined once, before the loop: the inputs are `[0]` and `[1]`, and the corresponding targets are also `[0]` and `[1]`. The loop then runs for `num_epochs` epochs, and each epoch consists of the following steps:
- A forward pass is performed to get the predicted output.
- The current loss is computed using the `loss` function.
- Backpropagation is executed to adjust the weights and biases.
- Every 100 epochs, the current loss is printed to monitor the training progress.

In [8]:
input_data = np.array([[0], [1]])
target = np.array([[0], [1]])

num_epochs = 1000

for epoch in range(num_epochs):
    output, a1 = forward(input_data)
    current_loss = loss(output, target)
    backward(input_data, target, output, a1)
    
    if (epoch + 1) % 100 == 0:
        print(f'Epoch {epoch + 1}, Loss: {current_loss}')


Epoch 100, Loss: 0.2772080817526419
Epoch 200, Loss: 0.24919477102920234
Epoch 300, Loss: 0.24460396348390362
Epoch 400, Loss: 0.2398881473343858
Epoch 500, Loss: 0.23289412423845401
Epoch 600, Loss: 0.2226766568799951
Epoch 700, Loss: 0.2085477071770783
Epoch 800, Loss: 0.19029012839390697
Epoch 900, Loss: 0.16847095788115848
Epoch 1000, Loss: 0.14466209300946115


### Step 9: Test the Trained Model
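Once training is done, we test the network on an input it has not seen (`test_data`). We feed in 0.5 and look at the prediction. The result of about 0.54 lies between the two targets, which is what an increasing mapping trained to send 0 toward 0 and 1 toward 1 would give. With only two training points, this is a sanity check rather than a measure of generalization.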

In [9]:
test_data = np.array([[0.5]])
predicted_output, _ = forward(test_data)
print(f'Prediction for input 0.5: {predicted_output}')


Prediction for input 0.5: [[0.53594128]]


# Exercise 2: Back Propagation for Multi-layer Neural Network

### Step 1. Importing Libraries:
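Import NumPy, along with the scikit-learn utilities used to load the Iris dataset, split it into training and test sets, and one-hot encode the labels.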

In [10]:
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder


### Step 2. Defining Network Architecture:
Here, you define the architecture of the neural network:
- `input_size`: The number of input neurons, set to 4 (one per feature in the Iris dataset).
- `hidden_size`: The number of neurons in the hidden layer. This is a hyperparameter, set to 5 here.
- `output_size`: The number of output neurons, set to 3 (one per class in the Iris dataset).

In [11]:
input_size = 4       # Number of input neurons (features in IRIS dataset)
hidden_size = 5      # Number of neurons in the hidden layer (hyperparameter, can be adjusted)
output_size = 3      # Number of output neurons (classes in IRIS dataset)


### Step 3. Initializing Weights and Biases:
Here, you initialize the weights and biases with random values. These parameters will be adjusted during training to make the network learn.

In [12]:
np.random.seed(42)  # For reproducibility

W1 = np.random.randn(input_size, hidden_size)
b1 = np.random.randn(hidden_size)
W2 = np.random.randn(hidden_size, output_size)
b2 = np.random.randn(output_size)


### Step 4. Defining the Activation Function:
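Define the activation function for each layer: sigmoid for the hidden layer and softmax for the output layer, so that the three outputs form a probability distribution over the classes.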

In [13]:
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

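def softmax(x):
    # Subtract each row's maximum for numerical stability; the result is mathematically unchanged
    exp_x = np.exp(x - np.max(x, axis=1, keepdims=True))
    return exp_x / exp_x.sum(axis=1, keepdims=True)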
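
The sketch below illustrates two properties worth checking for any softmax: each row of the output sums to 1, and subtracting the maximum keeps the computation finite even for very large inputs. It uses a softmax that subtracts each row's own maximum; the name `stable_softmax` and the sample `logits` are introduced here for illustration only.

```python
import numpy as np

def stable_softmax(x):
    # Subtract each row's maximum before exponentiating; the result is
    # mathematically unchanged, but np.exp never sees a large positive argument
    shifted = x - np.max(x, axis=1, keepdims=True)
    exp_x = np.exp(shifted)
    return exp_x / exp_x.sum(axis=1, keepdims=True)

logits = np.array([[1.0, 2.0, 3.0],
                   [1000.0, 1001.0, 1002.0]])  # second row would overflow a naive exp

probs = stable_softmax(logits)
print(probs)
print(probs.sum(axis=1))  # each row sums to 1
```

Both rows produce the same probabilities, because softmax depends only on the differences between the entries in a row.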


### Step 5. Forward Pass:
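The `forward` function performs the forward pass: the hidden layer applies sigmoid, and the output layer applies softmax to produce class probabilities. It returns the output together with the hidden activations `a1`, which the backward pass needs.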

In [14]:
def forward(input_data):
    z1 = np.dot(input_data, W1) + b1
    a1 = sigmoid(z1)
    z2 = np.dot(a1, W2) + b2
    output = softmax(z2)
    return output, a1


### Step 6. Loss Function (Cross Entropy Loss):
The `loss` function calculates the cross-entropy loss between the predicted output and the target output. This measures how well the network is performing, and during training, the goal is to minimize this error.

In [15]:
def loss(predicted_output, target_output):
    m = target_output.shape[0]
    log_likelihood = -np.log(predicted_output[range(m), target_output.argmax(axis=1)])
    return np.sum(log_likelihood) / m
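
To get a feel for the scale of this loss, the sketch below applies the same `loss` definition to two made-up prediction matrices for three classes. The arrays `uniform` and `confident` are illustrative assumptions, not outputs of the network.

```python
import numpy as np

def loss(predicted_output, target_output):
    m = target_output.shape[0]
    log_likelihood = -np.log(predicted_output[range(m), target_output.argmax(axis=1)])
    return np.sum(log_likelihood) / m

# Two samples whose true classes are 0 and 2, one-hot encoded
target = np.array([[1, 0, 0],
                   [0, 0, 1]])

uniform = np.full((2, 3), 1 / 3)          # no preference among the 3 classes
confident = np.array([[0.9, 0.05, 0.05],
                      [0.1, 0.1, 0.8]])   # most probability on the correct class

print(loss(uniform, target))    # ln(3), about 1.0986
print(loss(confident, target))  # smaller, since the correct classes get high probability
```

A loss near ln(3) therefore means the network is doing no better than guessing uniformly among the three classes.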


### Step 7. Backward Pass (Backpropagation):
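The `backward` function implements backpropagation to update the weights and biases of the neural network. Because the output layer combines softmax with cross-entropy loss, the output error is simply `output - target_output`, and the gradients are averaged over the `m` training samples.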

In [16]:
def backward(input_data, target_output, output, a1, learning_rate=0.1):
    global W1, b1, W2, b2  # Declare globals at the start
    
    m = input_data.shape[0]

    output_error = output - target_output
    dW2 = np.dot(a1.T, output_error) / m
    db2 = np.sum(output_error, axis=0) / m
    
    hidden_error = np.dot(output_error, W2.T) * a1 * (1 - a1)
    dW1 = np.dot(input_data.T, hidden_error) / m
    db1 = np.sum(hidden_error, axis=0) / m
    
    W1 -= learning_rate * dW1
    b1 -= learning_rate * db1
    W2 -= learning_rate * dW2
    b2 -= learning_rate * db2
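
The reason `output_error` in the code above carries no extra activation-derivative factor can be seen from a short derivation for a single sample. The symbols are notation assumed here for illustration: $z$ stands for `z2`, $p$ for `output`, $y$ for the one-hot `target_output`, and $\delta_{kj}$ is 1 when $k = j$ and 0 otherwise.

```latex
\begin{aligned}
p_k &= \frac{e^{z_k}}{\sum_{i} e^{z_i}}, \qquad
L = -\sum_{k} y_k \log p_k, \\
\frac{\partial p_k}{\partial z_j} &= p_k\,(\delta_{kj} - p_j), \\
\frac{\partial L}{\partial z_j}
  &= -\sum_{k} \frac{y_k}{p_k}\, p_k\,(\delta_{kj} - p_j)
   = -y_j + p_j \sum_{k} y_k
   = p_j - y_j .
\end{aligned}
```

The last step uses the fact that a one-hot vector sums to 1. Averaging this quantity over the `m` samples gives the `/ m` factor in `dW2` and `db2`.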


### Step 8. Training Loop:
This section loads the Iris dataset, one-hot encodes the targets, and splits the data into training and test sets. It then runs a training loop for a specified number of epochs (`num_epochs`). In each epoch:
- A forward pass is performed to get the predicted output and the intermediate activations.
- The current loss is computed using the `loss` function.
- Backpropagation is executed to adjust the weights and biases based on the error.
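
The training cell one-hot encodes the class labels with scikit-learn's `OneHotEncoder`. As a minimal sketch of what that encoding produces, the same result can be built in NumPy by indexing an identity matrix. The `labels` array below is made up for illustration.

```python
import numpy as np

labels = np.array([0, 2, 1, 2])  # made-up class indices for illustration
num_classes = 3

# Row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(num_classes)[labels]
print(one_hot)

# argmax along each row recovers the original labels
print(one_hot.argmax(axis=1))
```

This is also why the `loss` and accuracy computations call `argmax(axis=1)` on the targets: it converts each one-hot row back into a class index.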

In [17]:
# Load and preprocess the IRIS dataset
iris = load_iris()
input_data = iris.data
target = iris.target.reshape(-1, 1)

# One-hot encode the target values
# (`sparse_output` requires scikit-learn >= 1.2; older versions use `sparse=False`)
encoder = OneHotEncoder(sparse_output=False)
target = encoder.fit_transform(target)

# Split the data into training and testing sets
train_data, test_data, train_target, test_target = train_test_split(input_data, target, test_size=0.2, random_state=42)

num_epochs = 1000

for epoch in range(num_epochs):
    output, a1 = forward(train_data)
    current_loss = loss(output, train_target)
    backward(train_data, train_target, output, a1)
    
    if (epoch + 1) % 100 == 0:
        print(f'Epoch {epoch + 1}, Loss: {current_loss}')


Epoch 100, Loss: 0.7258009559928861
Epoch 200, Loss: 0.6099738404180975
Epoch 300, Loss: 0.5557907540900652
Epoch 400, Loss: 0.5269849886784278
Epoch 500, Loss: 0.5100887697979113
Epoch 600, Loss: 0.49916775267736113
Epoch 700, Loss: 0.49136864163446153
Epoch 800, Loss: 0.4850218876415384
Epoch 900, Loss: 0.4783990161707043
Epoch 1000, Loss: 0.4649933768182768




### Step 9. Testing the Trained Model:
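After training, the code evaluates the model on the held-out test set (`test_data`) and prints the test accuracy. The test set holds only 30 samples (20% of the 150 in the dataset), so a single misclassification would change the accuracy by more than 3 percentage points.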

In [18]:
def predict(input_data):
    output, _ = forward(input_data)
    return np.argmax(output, axis=1)

# Predict on test data
predictions = predict(test_data)
accuracy = np.mean(predictions == test_target.argmax(axis=1))
print(f'Test Accuracy: {accuracy * 100:.2f}%')


Test Accuracy: 100.00%
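
Accuracy is a single number, so it can hide which classes are being confused. One way to look further is a confusion matrix. The sketch below is a NumPy-only illustration with a hypothetical `confusion_matrix` helper and made-up labels; it does not use the lab's predictions.

```python
import numpy as np

def confusion_matrix(true_labels, predicted_labels, num_classes):
    # Hypothetical helper: rows index the true class, columns the predicted class
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    for t, p in zip(true_labels, predicted_labels):
        matrix[t, p] += 1
    return matrix

# Made-up labels for illustration; these are not the lab's predictions
true_labels = np.array([0, 0, 1, 1, 2, 2])
predicted_labels = np.array([0, 0, 1, 2, 2, 2])

cm = confusion_matrix(true_labels, predicted_labels, num_classes=3)
print(cm)
print(np.trace(cm) / cm.sum())  # accuracy = correct predictions / all predictions
```

With the lab's variables, the inputs would be `test_target.argmax(axis=1)` and `predictions`.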


**Find More Labs**

This lab is from my Machine Learning course, which is part of my [Software Engineering](https://seecs.nust.edu.pk/program/bachelor-of-software-engineering-for-fall-2021-onward) degree at [NUST](https://nust.edu.pk).

The notebooks listed below cover a range of topics in **machine learning** and **data analysis**, implemented from scratch or with popular libraries such as **NumPy**, **pandas**, **scikit-learn**, **seaborn**, and **matplotlib**. They include an introduction to NumPy that showcases its efficiency for mathematical operations, followed by **linear regression**, **logistic regression**, **decision trees**, **K-nearest neighbors (KNN)**, **support vector machines (SVM)**, **Naive Bayes**, **K-means** clustering, principal component analysis (**PCA**), and **neural networks** with **backpropagation**. Each notebook demonstrates a practical implementation and applies it to datasets such as the **California Housing** dataset, the **MNIST** dataset, the **Iris** dataset, the **Auto-MPG** dataset, and the **UCI Adult Census Income** dataset. The notebooks also cover **gradient descent optimization**, model evaluation metrics (e.g., **accuracy, precision, recall, f1 score**), **regularization** techniques (e.g., **Lasso**, **Ridge**), and **data visualization**.

| Title                                                                                                                   | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| ----------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| [01 - Intro to Numpy](https://www.kaggle.com/code/sacrum/ml-labs-01-intro-to-numpy)                                     | The notebook demonstrates NumPy's efficiency for mathematical operations like array `reshaping`, `sigmoid`, `softmax`, `dot` and `outer products`, `L1 and L2 losses`, and matrix operations. It highlights NumPy's superiority over standard Python lists in speed and convenience for scientific computing and machine learning tasks.                                                                                                                                                                                              |
| [02 - Linear Regression From Scratch](https://www.kaggle.com/code/sacrum/ml-labs-02-linear-regression-from-scratch)     | This notebook implements `linear regression` and `gradient descent` from scratch in Python using `NumPy`, focusing on predicting house prices with the `California Housing Dataset`. It defines functions for prediction, `MSE` calculation, and gradient computation. Batch gradient descent is used for optimization. The dataset is loaded, scaled, and split. `Batch, stochastic, and mini-batch gradient descents` are applied with varying hyperparameters. Finally, the MSEs of the predictions from each method are compared. |
| [03 - Logistic Regression from Scratch](https://www.kaggle.com/code/sacrum/ml-labs-03-logistic-regression-from-scratch) | This notebook implements `logistic regression` from scratch in Python using `NumPy`, including functions for prediction, loss calculation, gradient computation, and batch `gradient descent` optimization, and applies it to the `MNIST` dataset for handwritten digit recognition and to the `Iris` data. It also includes metrics such as `accuracy`, `precision`, `recall`, and `f1 score`. |
| [04 - Auto-MPG Regression](https://www.kaggle.com/code/sacrum/ml-labs-04-auto-mpg-regression)                           | The notebook uses `pandas` for data manipulation, `seaborn` and `matplotlib` for visualization, and `sklearn` for `linear regression` and `regularization` techniques (`Lasso` and `Ridge`). It includes data loading, processing, visualization, model training, and evaluation on the `Auto-MPG dataset`.                                                                                                                                                                                                                           |
| [05 - Decision Trees from Scratch](https://www.kaggle.com/code/sacrum/ml-labs-05-desicion-trees-from-scratch) | In this notebook, the `DecisionTree` algorithm is implemented from scratch and applied to a dummy dataset |
| [06 - KNN from Scratch](https://www.kaggle.com/code/sacrum/ml-labs-06-knn-from-scratch)                                 | In this notebook, `K-Nearest Neighbour` algorithm has been implemented from scratch and compared with KNN provided in scikit-learn package                                                                                                                                                                                                                                                                                                                                                                                            |
| [07 - SVM](https://www.kaggle.com/code/sacrum/ml-labs-07-svm)                                                           | This notebook implements `SVM classifier` on `Iris Dataset`                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| [08 - Naive Bayes](https://www.kaggle.com/code/sacrum/ml-labs-08-naive-bayes)                                           | This notebook trains `Naive Bayes` and compares it with other algorithms `Decision Trees`, `SVM` and `Logistic Regression`                                                                                                                                                                                                                                                                                                                                                                                                            |
| [09 - K-means](https://www.kaggle.com/code/sacrum/ml-labs-09-k-means)                                                   | In this notebook `K-means` algorithm has been implemented using `scikit-learn` and different values of `k` are compared to understand the `elbow method` in `Calinski Harabasz Scores`                                                                                                                                                                                                                                                                                                                                                |
| [10 - UCI Adult Census Income](https://www.kaggle.com/code/sacrum/ml-labs-10-uci-adult-census-income)                   | Here I have used the UCI Adult Income dataset and applied different machine learning algorithms to find the best model configuration for predicting salary from the given information                                                                                                                                                                                                                                                                                                                                                 |
| [11 - PCA](https://www.kaggle.com/code/sacrum/ml-labs-11-pca) | `Principal Component Analysis` implemented from scratch |
| [12 - Neural Networks](https://www.kaggle.com/code/sacrum/ml-labs-12-neural-networks)                                   | This code implements neural networks with back propagation from scratch                                                                                                                                                                                                                                                                                                                                                                                                                                                               |