# **Practice Lab Neural Network 1**

 **Name** : Mayur Kapgate                
 **Roll Number** : 27   
 **PRN No.** : 202201040065                                   
 **Batch** : DL 2

# **Neural Network Implementation from Scratch**

**Objective** : Implement a simple feedforward neural network from scratch in Python without using
 any built-in deep learning libraries. The implementation focuses on the basic components:
 forward pass, backward pass (backpropagation), and training with gradient descent.

# **Problem Definition**

**Dataset** : We'll use the Iris dataset, a well-known dataset of 150 iris flower samples, 50 from each of 3
species.

 **Task**: The task is a multi-class classification problem, where we will classify the flowers into one of
three species based on the features (sepal length, sepal width, petal length, and petal width).

# **Neural Network Architecture**

 **Input Layer** : 4 neurons (since there are 4 features).

 **Hidden Layer** : 10 neurons (this is a hyperparameter).

 **Output Layer** : 3 neurons (since we have 3 classes).
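The layer sizes above fix the shape of every parameter array. As a rough sketch (the `shapes` dictionary is an illustrative helper, not part of the notebook's code), the shapes and the total number of trainable values can be worked out as follows, assuming one weight matrix and one bias row per layer:

```python
# Layer sizes taken from the architecture described above.
input_size, hidden_size, output_size = 4, 10, 3

# Shape of each parameter array in a fully connected 4-10-3 network.
# The dictionary itself is a hypothetical helper used only for this illustration.
shapes = {
    "W1": (input_size, hidden_size),   # input -> hidden weights
    "b1": (1, hidden_size),            # hidden biases
    "W2": (hidden_size, output_size),  # hidden -> output weights
    "b2": (1, output_size),            # output biases
}

# Total number of trainable values: product of each shape, summed over all arrays.
total_params = sum(rows * cols for rows, cols in shapes.values())

for name, shape in shapes.items():
    print(name, shape)
print("Total trainable parameters:", total_params)
```

The count is 40 + 10 + 30 + 3 = 83, which is small enough that full-batch training on 120 samples is cheap.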

 # **Methodology**

* **Activation Functions:**

 a. **Hidden layer:** ReLU (Rectified Linear Unit).

 b. **Output layer:** Softmax (for multi-class classification).

* **Loss Function:** Cross-Entropy Loss (because this is a classification problem).

* **Optimization:** **Gradient Descent** (we use batch gradient descent, so every update uses the full training set).

* **Data Preprocessing:**

 a. We load the Iris dataset using **sklearn.datasets.load_iris**.

 b. Labels are one-hot encoded using **OneHotEncoder** because we have multiple classes.

 c. Features are standardized using **StandardScaler** to improve the training performance.

* **Neural Network Initialization:** We define the number of neurons in each layer, initialize the weights (W1, W2) with small random values, and initialize the biases (b1, b2) with zeros.

* **Activation Functions:**

 a. **ReLU** is used for the hidden layer to introduce non-linearity.

 b. **Softmax** is used in the output layer for multi-class classification.

* **Loss Function:** We use Cross-Entropy Loss which is appropriate for multi-class classification tasks.

* **Gradient Descent (Backpropagation):** We compute the gradients of the loss with respect to the weights and biases and update the parameters using gradient descent.

* **Training:** We train the model for 1000 epochs (you can adjust this) and print the loss every 100 epochs.

* **Evaluation:** After training, we evaluate the model on both the training and test sets by computing
accuracy.
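
The steps listed above can be written compactly for a batch of $m$ samples. This is a sketch in notation introduced here for illustration ($X$ for the input batch, $Y$ for the one-hot labels, $Z^{(l)}$ and $A^{(l)}$ for the pre-activations and activations of layer $l$, $\eta$ for the learning rate); only $W_1, b_1, W_2, b_2$ are names taken from the text.

$$
\begin{aligned}
Z^{(1)} &= X W_1 + b_1, & A^{(1)} &= \max\left(0, Z^{(1)}\right) \\
Z^{(2)} &= A^{(1)} W_2 + b_2, & A^{(2)} &= \operatorname{softmax}\left(Z^{(2)}\right) \\
L &= -\frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{3} Y_{ik} \log A^{(2)}_{ik}
\end{aligned}
$$

When softmax is paired with cross-entropy, the gradient with respect to the output pre-activations reduces to a simple difference, and the remaining gradients follow from the chain rule:

$$
\begin{aligned}
\frac{\partial L}{\partial Z^{(2)}} &= \frac{1}{m}\left(A^{(2)} - Y\right) \\
\frac{\partial L}{\partial W_2} &= {A^{(1)}}^{\top} \frac{\partial L}{\partial Z^{(2)}}, &
\frac{\partial L}{\partial b_2} &= \sum_{i=1}^{m} \left(\frac{\partial L}{\partial Z^{(2)}}\right)_{i,:} \\
\frac{\partial L}{\partial Z^{(1)}} &= \left(\frac{\partial L}{\partial Z^{(2)}} W_2^{\top}\right) \odot \mathbb{1}\left[Z^{(1)} > 0\right] \\
\frac{\partial L}{\partial W_1} &= X^{\top} \frac{\partial L}{\partial Z^{(1)}}, &
\frac{\partial L}{\partial b_1} &= \sum_{i=1}^{m} \left(\frac{\partial L}{\partial Z^{(1)}}\right)_{i,:}
\end{aligned}
$$

Each parameter $\theta \in \{W_1, b_1, W_2, b_2\}$ is then updated as $\theta \leftarrow \theta - \eta \, \partial L / \partial \theta$.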

In [None]:
from google.colab import drive
drive.mount('/content/drive')

# **Neural Network** Implementation from Scratch using **Iris Dataset**

In [13]:
# Import necessary libraries

import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder
from sklearn.preprocessing import StandardScaler

In [14]:
# Load Iris dataset

data = load_iris()
X = data.data
y = data.target

**Link to the dataset :** https://www.kaggle.com/datasets/uciml/iris

In [15]:
# One-hot encode the labels

encoder = OneHotEncoder(sparse_output=False)
y_one_hot = encoder.fit_transform(y.reshape(-1, 1))

In [16]:
# Standardize the input features

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

In [17]:
# Split the data into training and testing sets

X_train, X_test, y_train, y_test = train_test_split(X_scaled, y_one_hot, test_size=0.2, random_state=42)
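
Note that the cells above fit the scaler on all 150 samples before splitting, so the test samples contribute to the mean and standard deviation used for scaling. A common alternative is to split first and compute the statistics on the training rows only; with scikit-learn that would mean calling `fit_transform` on the training split and `transform` on the test split. The sketch below shows the same idea in plain NumPy. It uses synthetic data as a stand-in, and all variable names are illustrative rather than taken from the notebook.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a feature matrix (150 samples, 4 features); not the Iris data.
X = rng.normal(loc=5.0, scale=2.0, size=(150, 4))

# Shuffle the row indices and hold out 20% of them as a test set.
indices = rng.permutation(len(X))
n_test = int(0.2 * len(X))
test_idx, train_idx = indices[:n_test], indices[n_test:]
X_train_raw, X_test_raw = X[train_idx], X[test_idx]

# Compute the scaling statistics on the training rows only.
mean = X_train_raw.mean(axis=0)
std = X_train_raw.std(axis=0)  # population std (ddof=0), as StandardScaler uses

# Apply the same training-set statistics to both splits.
X_train_std = (X_train_raw - mean) / std
X_test_std = (X_test_raw - mean) / std

print("train mean per feature:", X_train_std.mean(axis=0).round(6))
print("train std per feature: ", X_train_std.std(axis=0).round(6))
print("test mean per feature: ", X_test_std.mean(axis=0).round(3))
```

The training split ends up with zero mean and unit variance by construction, while the test split is only approximately standardized, which is the expected behaviour for unseen data.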

In [18]:
# Neural Network Parameters

input_size = X_train.shape[1]  # 4 features
hidden_size = 10  # 10 neurons in the hidden layer
output_size = y_train.shape[1]  # 3 output classes

In [19]:
# Initialize weights and biases

np.random.seed(42)

W1 = np.random.randn(input_size, hidden_size) * 0.01  # Weight for input to hidden layer
b1 = np.zeros((1, hidden_size))  # Bias for hidden layer

W2 = np.random.randn(hidden_size, output_size) * 0.01  # Weight for hidden to output layer
b2 = np.zeros((1, output_size))  # Bias for output layer

In [20]:
# Activation functions, the ReLU derivative, and the loss function

def relu(x):
    return np.maximum(0, x)

def relu_derivative(x):
    return (x > 0).astype(float)

def softmax(x):
    exp_x = np.exp(x - np.max(x, axis=1, keepdims=True))  # For numerical stability
    return exp_x / np.sum(exp_x, axis=1, keepdims=True)

def cross_entropy_loss(y_pred, y_true):
    m = y_true.shape[0]
    return -np.sum(y_true * np.log(y_pred + 1e-8)) / m  # Add small epsilon for numerical stability
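
The `softmax` above subtracts the row-wise maximum before exponentiating. A small sketch of why that matters: `softmax_naive` and `softmax_stable` are illustrative names introduced here, and the large logits are made-up values chosen to trigger overflow.

```python
import numpy as np

def softmax_naive(x):
    # Direct definition; np.exp overflows to inf for large inputs.
    exp_x = np.exp(x)
    return exp_x / np.sum(exp_x, axis=1, keepdims=True)

def softmax_stable(x):
    # Same shift as the notebook's softmax: subtract the row maximum first.
    exp_x = np.exp(x - np.max(x, axis=1, keepdims=True))
    return exp_x / np.sum(exp_x, axis=1, keepdims=True)

# Made-up logits, large enough that exp() overflows in float64.
logits = np.array([[1000.0, 1001.0, 1002.0]])

# Silence the overflow/invalid warnings; inf / inf evaluates to nan.
with np.errstate(over="ignore", invalid="ignore"):
    naive = softmax_naive(logits)

stable = softmax_stable(logits)

print("naive :", naive)
print("stable:", stable)
```

Subtracting a constant from every logit in a row does not change the softmax output, so the shifted version returns the same probabilities as it would for the logits `[0, 1, 2]` while keeping every exponent at or below zero.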

In [21]:
# Forward Pass

def forward(X):
    z1 = np.dot(X, W1) + b1
    a1 = relu(z1)
    z2 = np.dot(a1, W2) + b2
    a2 = softmax(z2)
    return a1, a2

In [22]:
# Backward Pass (Gradient Descent and Backpropagation)

def backward(X, y, a1, a2):
    m = X.shape[0]

    # Output layer error
    dz2 = a2 - y
    dW2 = np.dot(a1.T, dz2) / m
    db2 = np.sum(dz2, axis=0, keepdims=True) / m

    # Hidden layer error
    # a1 > 0 exactly where z1 > 0, so the ReLU mask can be computed from a1
    dz1 = np.dot(dz2, W2.T) * relu_derivative(a1)
    dW1 = np.dot(X.T, dz1) / m
    db1 = np.sum(dz1, axis=0, keepdims=True) / m

    return dW1, db1, dW2, db2
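
One way to gain confidence in hand-written backpropagation is a finite-difference gradient check. The sketch below is self-contained and does not reuse the notebook's global weights: it rebuilds the same 4-10-3 network on small random data, and `loss_fn`, `analytic_grads`, and `numeric_grads` are names introduced here for illustration. For simplicity it omits the `1e-8` epsilon from the loss so that the analytic and numeric gradients refer to exactly the same function.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z, axis=1, keepdims=True))
    return e / np.sum(e, axis=1, keepdims=True)

def loss_fn(params, X, y):
    # Forward pass followed by mean cross-entropy (no epsilon, see note above).
    W1, b1, W2, b2 = params
    a1 = np.maximum(0, X @ W1 + b1)
    a2 = softmax(a1 @ W2 + b2)
    return -np.sum(y * np.log(a2)) / X.shape[0]

def analytic_grads(params, X, y):
    # Same formulas as the backward pass in the notebook.
    W1, b1, W2, b2 = params
    m = X.shape[0]
    z1 = X @ W1 + b1
    a1 = np.maximum(0, z1)
    a2 = softmax(a1 @ W2 + b2)
    dz2 = a2 - y
    dW2 = a1.T @ dz2 / m
    db2 = dz2.sum(axis=0, keepdims=True) / m
    dz1 = (dz2 @ W2.T) * (z1 > 0)
    dW1 = X.T @ dz1 / m
    db1 = dz1.sum(axis=0, keepdims=True) / m
    return [dW1, db1, dW2, db2]

def numeric_grads(params, X, y, eps=1e-5):
    # Central differences: perturb one entry at a time and re-evaluate the loss.
    grads = []
    for p in params:
        g = np.zeros_like(p)
        for idx in np.ndindex(p.shape):
            old = p[idx]
            p[idx] = old + eps
            plus = loss_fn(params, X, y)
            p[idx] = old - eps
            minus = loss_fn(params, X, y)
            p[idx] = old  # restore the original value
            g[idx] = (plus - minus) / (2 * eps)
        grads.append(g)
    return grads

# Small random problem (made-up data, not Iris): 8 samples, 4 features, 3 classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 4))
y = np.eye(3)[rng.integers(0, 3, size=8)]
params = [rng.normal(size=(4, 10)) * 0.5, np.zeros((1, 10)),
          rng.normal(size=(10, 3)) * 0.5, np.zeros((1, 3))]

analytic = analytic_grads(params, X, y)
numeric = numeric_grads(params, X, y)
max_diff = max(np.max(np.abs(a - n)) for a, n in zip(analytic, numeric))
print(f"Largest absolute difference: {max_diff:.2e}")
```

If the backward formulas are right, the largest difference should be many orders of magnitude smaller than the gradients themselves. A check like this can disagree at isolated entries when a perturbation pushes a hidden unit across the ReLU kink at zero, so an occasional outlier is not necessarily a bug.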

In [23]:
# Training the model using gradient descent

def train(X_train, y_train, learning_rate=0.01, epochs=1000):
    global W1, b1, W2, b2

    for epoch in range(epochs):
        # Forward pass
        a1, a2 = forward(X_train)

        # Compute the loss
        loss = cross_entropy_loss(a2, y_train)

        # Backward pass
        dW1, db1, dW2, db2 = backward(X_train, y_train, a1, a2)

        # Update the weights and biases using gradient descent
        W1 -= learning_rate * dW1
        b1 -= learning_rate * db1
        W2 -= learning_rate * dW2
        b2 -= learning_rate * db2

        if epoch % 100 == 0:
            print(f"Epoch {epoch}/{epochs}, Loss: {loss:.4f}")

In [24]:
# Evaluating the model

def predict(X):
    _, a2 = forward(X)
    return np.argmax(a2, axis=1)

In [25]:
# Training the network

train(X_train, y_train, learning_rate=0.1, epochs=1000)

Epoch 0/1000, Loss: 1.0985
Epoch 100/1000, Loss: 0.5751
Epoch 200/1000, Loss: 0.2933
Epoch 300/1000, Loss: 0.1786
Epoch 400/1000, Loss: 0.1196
Epoch 500/1000, Loss: 0.0927
Epoch 600/1000, Loss: 0.0788
Epoch 700/1000, Loss: 0.0707
Epoch 800/1000, Loss: 0.0654
Epoch 900/1000, Loss: 0.0618
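
The epoch-0 loss of 1.0985 is close to ln 3 ≈ 1.0986. That is what cross-entropy gives when the network predicts an almost uniform distribution over the three classes, which is expected here because the weights start near zero. A minimal sketch of that calculation (the variable names are illustrative):

```python
import numpy as np

num_classes = 3

# With near-zero weights the logits are almost equal, so softmax is close to uniform.
uniform_probs = np.full((1, num_classes), 1.0 / num_classes)

# Any single true class gives the same loss under a uniform prediction.
one_hot_label = np.array([[1.0, 0.0, 0.0]])

# Same cross-entropy formula as in the notebook, including the 1e-8 epsilon.
m = one_hot_label.shape[0]
expected_initial_loss = -np.sum(one_hot_label * np.log(uniform_probs + 1e-8)) / m

print(f"Expected initial loss: {expected_initial_loss:.4f}")  # ln(3) is about 1.0986
```

Comparing the first printed loss against ln(number of classes) is a quick sanity check that the forward pass and the loss are wired up correctly.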


In [26]:
# Evaluate the model

y_pred_train = predict(X_train)
y_pred_test = predict(X_test)

y_true_train = np.argmax(y_train, axis=1)
y_true_test = np.argmax(y_test, axis=1)

train_accuracy = np.mean(y_pred_train == y_true_train) * 100
test_accuracy = np.mean(y_pred_test == y_true_test) * 100

print(f"\nTrain Accuracy: {train_accuracy:.2f}%")
print(f"\nTest Accuracy: {test_accuracy:.2f}%")


Train Accuracy: 98.33%

Test Accuracy: 100.00%
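
With `test_size=0.2` the test set holds only 30 samples, so each one is worth about 3.33 percentage points, and the 98.33% training accuracy corresponds to 118 of 120 samples. A per-class breakdown can be more informative than a single number. The sketch below builds a confusion matrix with NumPy; the `confusion_matrix` helper and the demo labels are hypothetical and are not the notebook's predictions.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    # Rows index the true class, columns index the predicted class.
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    # np.add.at accumulates correctly even when index pairs repeat.
    np.add.at(matrix, (y_true, y_pred), 1)
    return matrix

# Hypothetical labels for illustration only.
y_true_demo = np.array([0, 0, 1, 1, 2, 2, 2])
y_pred_demo = np.array([0, 0, 1, 2, 2, 2, 1])

cm = confusion_matrix(y_true_demo, y_pred_demo, num_classes=3)

# Correct predictions lie on the diagonal.
accuracy = np.trace(cm) / cm.sum() * 100

print(cm)
print(f"Accuracy: {accuracy:.2f}%")
```

In the notebook, the same helper could be called with `y_true_test` and `y_pred_test` to see which species, if any, get confused with each other.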


 **Declaration:**

 I, Mayur Kapgate, confirm that the work submitted in this assignment is my own and has been completed following academic integrity guidelines. The code is uploaded to my GitHub repository, and the repository link is provided below:

 **GitHub Repository Link:** [Insert GitHub Link]

 **Signature:** Mayur Ashok Kapgate

 **Submission Checklist:**

 ● Code file (Python Notebook or Script)

 ● Dataset or link to the dataset

 ● Visualizations (if applicable)

 ● Screenshots of model performance metrics

 ● README file