# Lab 9: Neural Networks

In this coding lab, we will explore the fundamentals of building a simple neural network using Python. We will use the well-known Iris dataset to classify the flowers based on their features.

Throughout this lesson, you will learn how to:

1. **Prepare the Data**: Load the Iris dataset, preprocess it by normalizing the features, and apply one-hot encoding to the target labels.
2. **Build a Neural Network**: Implement a basic neural network from scratch, including forward propagation, loss calculation, backpropagation, and parameter updates.
3. **Train the Model**: Train the neural network on the Iris dataset and monitor the loss to ensure convergence.
4. **Evaluate Performance**: Assess the model's accuracy on unseen data and visualize the training process.

By the end of this lab, you will have a solid understanding of how neural networks work and how to implement them, from scratch, for classification tasks!

***Note***: This lab is structured like our Decision Tree lab. There won't be many "<font color='red'>**TRY IT**</font> &#x1f9e0;s", but rather, we will walk through this live together. If you are following along from outside of our course, try to fill in wherever you see "XXXX". If you get stuck, refer to the answer key that's also posted in the repository.

### Data Preparation

This section prepares the Iris dataset for training a neural network, in four steps:

1. **Loading the Data:** The `load_iris()` function retrieves the Iris dataset, which consists of features (measurements of iris flowers) and target labels (species of the flowers).

2. **One-Hot Encoding:** The target labels are reshaped into a column and converted to one-hot vectors using `OneHotEncoder`. Each label becomes a vector with a 1 at its class index and 0 elsewhere, which matches the shape of the network's output (one probability per class) and makes multi-class classification straightforward.

3. **Feature Normalization:** The features are standardized using `StandardScaler`, which rescales each feature to have a mean of 0 and a standard deviation of 1. Putting all features on a comparable scale keeps any single feature from dominating the gradient updates, which helps training converge.

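4. **Train-Test Split:** The dataset is split into training and testing sets using `train_test_split`, with 30% of the samples held out for testing. Holding out data is crucial for evaluating the model's performance on examples it never saw during training. Note that the scaler in this lab is fit on the full dataset before the split; fitting it on the training set alone is the stricter practice.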

Together, these steps ensure that the data is in the right format and scale for training the neural network.
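
To make the two transformations concrete, here is a minimal sketch in plain NumPy on a made-up toy array. It is not the lab's code: `X_toy`, `y_toy`, `y_onehot`, and `X_scaled` are illustrative names, and the values are invented. It assumes labels are integers starting at 0, and it uses the population standard deviation (`np.std` with its default `ddof=0`), which is also what `StandardScaler` uses.

```python
import numpy as np

# Toy data: 4 samples, 2 features, labels from 3 classes (made-up values)
X_toy = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]])
y_toy = np.array([0, 2, 1, 2])

# One-hot encoding: row i of the identity matrix is the code for class i
y_onehot = np.eye(3)[y_toy]

# Standardization: subtract each column's mean, divide by its standard deviation
X_scaled = (X_toy - X_toy.mean(axis=0)) / X_toy.std(axis=0)

print(y_onehot)
print(X_scaled.mean(axis=0), X_scaled.std(axis=0))
```

After scaling, both columns have mean 0 and standard deviation 1 even though the second feature was originally ten times larger than the first.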


In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.preprocessing import OneHotEncoder
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load the Iris dataset
iris = load_iris()
X = iris.data
y = iris.target

# One-Hot Encoding the target labels
y = y.reshape(-1, 1)
encoder = OneHotEncoder(sparse_output=False)
y_encoded = encoder.fit_transform(y)

# Normalize the features
scaler = StandardScaler()
X_normalized = scaler.fit_transform(X)

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X_normalized, y_encoded, test_size=0.3, random_state=42)

### Define the Neural Network Structure
We'll create a simple one-layer neural network: the four input features connect directly to three output units, one per Iris species, with no hidden layer. A softmax turns the three outputs into class probabilities.
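
For reference, the model, loss, and gradients can be written compactly as follows. The notation is an assumption made for this sketch and does not appear in the lab itself: $X$ is the $m \times 4$ feature matrix, $W$ the $4 \times 3$ weight matrix, $b$ the bias vector, $Y$ the one-hot label matrix, and $c_i$ the true class of sample $i$.

```latex
% Forward pass: linear scores followed by a row-wise softmax
Z = XW + b, \qquad
\hat{Y}_{ik} = \frac{e^{Z_{ik}}}{\sum_{j=1}^{3} e^{Z_{ij}}}

% Cross-entropy loss, averaged over the m samples
L = -\frac{1}{m} \sum_{i=1}^{m} \log \hat{Y}_{i c_i}

% Gradients used in the backward pass (\hat{Y}_i and Y_i denote row i)
\frac{\partial L}{\partial W} = \frac{1}{m} X^{\top} (\hat{Y} - Y), \qquad
\frac{\partial L}{\partial b} = \frac{1}{m} \sum_{i=1}^{m} (\hat{Y}_{i} - Y_{i})
```

The gradient expressions match what the `backward` method computes: subtract 1 from the predicted probability of each true class, divide by $m$, then multiply by $X^{\top}$ for the weights and sum over the samples for the bias.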

In [None]:
class SimpleNeuralNetwork:
    def __init__(self, input_size, output_size, learning_rate=0.01):
        """
        Initialize the neural network with random weights and biases.

        Parameters:
        input_size (int): Number of input features.
        output_size (int): Number of output classes.
        learning_rate (float): Learning rate for weight updates.
        """
        self.learning_rate = learning_rate
        self.weights = XXXX
        self.bias = XXXX

    def softmax(self, z):
        """Compute softmax values for each class in z."""
        exp_z = np.exp(z - np.max(z, axis=1, keepdims=True))  # subtract each row's max so np.exp cannot overflow
        return XXXX / XXXX

    def forward(self, X):
        """Forward pass: computes predicted class probabilities."""
        z = np.dot(XXXX, XXXX) + XXXX  # Calculate the linear combination of inputs and weights
        return XXXX # Apply the softmax function

    def compute_loss(self, y_pred, y_true):
        """Compute the cross-entropy loss."""
        m = y_true.shape[0]  # Number of samples

        # Get the index of the true class for each sample
        true_class_indices = np.argmax(y_true, axis=1)

        # Calculate log-likelihood using the true class indices
        log_likelihood = -np.log(y_pred[range(m), true_class_indices])
        loss = XXXX / XXXX  # Average loss

        return loss

    def backward(self, X, y_true, y_pred):
        """Backward pass: computes gradients."""
        m = y_true.shape[0]  # Number of samples

        # Gradient of the loss w.r.t. the scores z for softmax + cross-entropy:
        # (y_pred - y_true) / m. Work on a copy so the caller's y_pred is not modified.
        dz = y_pred.copy()
        true_class_indices = np.argmax(y_true, axis=1)
        dz[range(m), true_class_indices] -= 1  # Subtract 1 at each sample's true class
        dz /= m  # Average over the number of samples

        dw = np.dot(X.T, dz)  # Gradient for weights
        db = np.sum(dz, axis=0)  # Gradient for bias

        return dw, db

    def update_parameters(self, dw, db):
        """
        Update weights and biases using computed gradients.

        This is where the gradient descent step occurs:
        The weights and biases are updated to reduce the loss.
        """
        # Update weights and biases using gradient descent
        self.weights -= XXXX  # Adjust weights
        self.bias -= XXXX  # Adjust biases

    def train(self, X, y, epochs=1000):
        """
        Train the neural network using the provided data.
        """
        loss_history = []  # To store loss values

        for epoch in range(epochs):

            # Forward pass: calculate predictions
            y_pred = XXXX
            # Compute the loss for the current predictions
            loss = XXXX

            # Backward pass: compute gradients
            dw, db = XXXX
            # Update parameters based on gradients (Gradient Descent)
            XXXX

            # track the loss for plotting purposes
            loss_history.append(loss)

            # Print loss every 50 epochs for monitoring
            if epoch % 50 == 0:
                print(f'Epoch {epoch}, Loss: {loss:.4f}')

        # Plot the loss over epochs
        plt.plot(loss_history)
        plt.title('Loss over Epochs')
        plt.xlabel('Epochs')
        plt.ylabel('Loss')
        plt.grid()
        plt.show()
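
The `softmax` method in the class above shifts its input by a maximum before exponentiating. The sketch below illustrates why, using made-up scores that are far too large for a naive `np.exp`. It is a standalone illustration, not part of the class, and the variable names are hypothetical.

```python
import numpy as np

z = np.array([[1000.0, 1001.0, 1002.0]])  # large made-up scores for one sample

# Naive exponentials overflow to inf for scores this large
with np.errstate(over="ignore"):
    naive = np.exp(z)
print(np.isinf(naive).all())

# After subtracting the row maximum, the largest exponent is 0,
# so every exponential lies in (0, 1] and nothing overflows
shifted = np.exp(z - np.max(z, axis=1, keepdims=True))
probs = shifted / np.sum(shifted, axis=1, keepdims=True)
print(probs, probs.sum(axis=1))
```

Subtracting a constant from every score in a row leaves the softmax output unchanged, because the same factor cancels in the numerator and the denominator. The shift therefore costs nothing in accuracy.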

In [None]:
# Create and train the neural network
input_size = X_train.shape[1] # this is the number of features
output_size = y_train.shape[1] # this is the number of classes
nn = SimpleNeuralNetwork(XXXX, XXXX, XXXX=XXXX)
nn.train(X_train, y_train, epochs=XXXX)
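
The learning rate chosen above controls how large each gradient descent step is. As a rough intuition, the sketch below runs gradient descent on a toy one-dimensional loss, $f(w) = w^2$, rather than on the network itself. The function `run_gradient_descent` and the three learning rates are hypothetical choices made for illustration only.

```python
def run_gradient_descent(learning_rate, steps=50, w=1.0):
    """Minimize the toy loss f(w) = w**2 and return the final loss."""
    for _ in range(steps):
        grad = 2 * w                # derivative of w**2
        w -= learning_rate * grad   # gradient descent update
    return w ** 2

for lr in (0.01, 0.1, 1.1):
    print(f"learning_rate={lr}: final loss = {run_gradient_descent(lr):.6f}")
```

On this toy problem, a small learning rate makes slow progress, a moderate one gets close to the minimum in the same number of steps, and one that is too large overshoots further on every step so the loss grows instead of shrinking. The best value for the Iris network will differ, so treat these numbers only as a starting point for experimenting.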

### Making Predictions
After training, we can use our neural network to make predictions. We call the `forward` method we wrote to run the data through the network, then use `np.argmax` to pick the index of the largest output probability, which is the most likely class for each observation.
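
The difference between the largest probability and its index is easy to see on a small hand-made example. The `probs` matrix below is invented for illustration and is not output from the trained network.

```python
import numpy as np

# Made-up softmax outputs for 3 samples and 3 classes (each row sums to 1)
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.2, 0.5, 0.3],
])

# np.max returns the largest probability in each row...
print(np.max(probs, axis=1))     # [0.7 0.6 0.5]
# ...while np.argmax returns its column index, which is the predicted class
print(np.argmax(probs, axis=1))  # [0 2 1]
```

The same `np.argmax` call converts one-hot test labels back to integer class labels, which is why it is applied to both the predictions and `y_test` before computing accuracy.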

In [None]:
# Evaluate the model on the test set
y_pred = nn.forward(X_test)
y_pred_classes = np.argmax(y_pred, axis=1)
y_test_classes = np.argmax(y_test, axis=1)

accuracy = accuracy_score(y_test_classes, y_pred_classes)
print(f'Accuracy: {accuracy:.2f}')

How were your results? Not so good, right? That's because we've only done a few epochs!

"<font color='red'>**TRY IT**</font> &#x1f9e0;": Increase the number of epochs and retrain. Still not happy with your results? Try a different learning rate and compare the loss curves!