# 🧠 Simple Neural Network from Scratch using NumPy

## 📌 Project Overview

This notebook demonstrates how to build and train a simple neural network **from scratch** using **NumPy**, without relying on high-level deep learning frameworks like TensorFlow or PyTorch.

The project follows these key goals:

1. **Implement Dense (Fully Connected) Layers**
2. **Use ReLU and Softmax Activation Functions**
3. **Perform Forward and Backward Propagation**
4. **Train with Gradient Descent using Batch Processing**
5. **Use Cross-Entropy Loss Function**
6. **Evaluate Accuracy on the Wine Dataset**

---

## 🧪 Dataset

We use the **Wine dataset** from the UCI Machine Learning Repository:
- 178 samples with 13 numerical features each.
- 3 target classes representing different cultivars of wine.
- Labels are one-hot encoded for training.
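
As a small illustration of the one-hot step, the sketch below encodes a few toy labels and decodes them again. The `labels` array is made up for the example; it only assumes the same 1-based class ids (1, 2, 3) that the dataset uses.

```python
import numpy as np

# Hypothetical toy labels using 1-based class ids, as in the Wine data
labels = np.array([1, 3, 2, 1])

# Row k of the identity matrix is the one-hot vector for class k + 1
one_hot = np.eye(3)[labels - 1]
print(one_hot)

# argmax recovers the 0-based class index; add 1 to get the original id back
decoded = np.argmax(one_hot, axis=1) + 1
print(decoded)
```

Each row contains exactly one `1`, so the encoding can always be inverted with `argmax`.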

---

## 🔧 Core Components Implemented

- **Activation Functions**: `ReLU` (with its derivative) and `Softmax`
- **Loss Function**: `Cross-Entropy`, with the combined softmax and cross-entropy gradient
- **Dense Layer Class**: Handles weights, bias, forward and backward propagation
- **Neural Network Class**: Composed of 3 dense layers (each with 1 neuron for simplicity), followed by a linear projection to the 3 output classes
- **Training Loop**: Supports batch training with manual weight updates

---


## 📎 Note

This notebook is structured for clarity and ease of understanding. Each code cell carries short docstring explanations, and printed outputs follow the cells that produce them.



In [None]:
import urllib.request
import numpy as np

# Load the Wine dataset from the UCI Machine Learning Repository
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/wine/wine.data'
raw_data = urllib.request.urlopen(url)
data = np.genfromtxt(raw_data, delimiter=',')

"""
The dataset has class labels in the first column (1, 2, or 3) and
13 numerical features in the remaining columns.
We separate them into `y` (labels) and `X` (features).
"""
y = data[:, 0].astype(int)
X = data[:, 1:]

"""
Normalize the features to have zero mean and unit variance.
This helps the neural network converge faster during training.
"""
X = (X - X.mean(axis=0)) / X.std(axis=0)

"""
Define a one-hot encoding function.
For a multiclass classification task, we convert the integer class labels into
binary vectors. For example, class 1 becomes [1, 0, 0], class 2 becomes [0, 1, 0], etc.
"""
def one_hot_encode(y, num_classes=None):
    if num_classes is None:
        num_classes = np.max(y)
    return np.eye(num_classes)[y - 1]  # subtract 1 to shift classes to 0-based index

# Apply one-hot encoding to the labels
y_encoded = one_hot_encode(y, num_classes=3)
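
The standardization step above can be sanity-checked without downloading anything. The sketch below uses a synthetic matrix (`X_demo`, an assumption for the example, with the same 178 x 13 shape) and confirms that each column ends up with zero mean and unit variance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the feature matrix: columns on very different scales
column_scales = [1.0, 10.0, 100.0] + [2.0] * 10
X_demo = rng.normal(loc=5.0, scale=column_scales, size=(178, 13))

# Same formula as in the cell above: per-column mean and standard deviation
X_std = (X_demo - X_demo.mean(axis=0)) / X_demo.std(axis=0)

print(np.allclose(X_std.mean(axis=0), 0.0))
print(np.allclose(X_std.std(axis=0), 1.0))
```

Note that the formula divides by the column's standard deviation, so it would fail for a constant column; that case does not appear to arise in this dataset.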


In [None]:
"""
Activation and loss functions:
- relu: activation function, replaces negatives with 0
- relu_derivative: gradient for relu
- softmax: normalizes outputs into probabilities
- cross_entropy: loss function for classification
- cross_entropy_derivative: gradient of the loss w.r.t. the logits (softmax and cross-entropy combined)
"""

def relu(x):
    return np.maximum(0, x)

def relu_derivative(x):
    return (x > 0).astype(float)

def softmax(x):
    e_x = np.exp(x - np.max(x, axis=1, keepdims=True))
    return e_x / np.sum(e_x, axis=1, keepdims=True)

def cross_entropy(predictions, targets):
    m = targets.shape[0]
    return -np.sum(targets * np.log(predictions + 1e-15)) / m

def cross_entropy_derivative(predictions, targets):
    # Gradient w.r.t. the pre-softmax logits, summed (not averaged) over the batch
    return predictions - targets
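
The expression `predictions - targets` is the gradient of the cross-entropy loss with respect to the logits fed into softmax, not with respect to the predictions themselves. A minimal finite-difference check, on made-up logits and targets, can be sketched as follows. The helper `summed_cross_entropy` is introduced only for this check and sums over the batch so that it matches the unscaled gradient.

```python
import numpy as np

def softmax(x):
    e_x = np.exp(x - np.max(x, axis=1, keepdims=True))
    return e_x / np.sum(e_x, axis=1, keepdims=True)

def summed_cross_entropy(logits, targets):
    # Sum (not mean) over the batch, to match `predictions - targets` exactly
    return -np.sum(targets * np.log(softmax(logits)))

rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 3))      # hypothetical batch of 4 samples, 3 classes
targets = np.eye(3)[[0, 2, 1, 0]]     # hypothetical one-hot labels

analytic = softmax(logits) - targets

# Central differences: perturb one logit at a time
numeric = np.zeros_like(logits)
eps = 1e-6
for i in range(logits.shape[0]):
    for j in range(logits.shape[1]):
        up = logits.copy()
        up[i, j] += eps
        down = logits.copy()
        down[i, j] -= eps
        numeric[i, j] = (summed_cross_entropy(up, targets)
                         - summed_cross_entropy(down, targets)) / (2 * eps)

max_diff = np.max(np.abs(analytic - numeric))
print(max_diff)
```

A very small `max_diff` indicates that the analytic and numerical gradients agree.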


In [None]:
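class DenseLayer:
    """
    DenseLayer:
    - Initializes weights and bias for a single-neuron layer.
    - Supports ReLU and Softmax activation.
    - Performs forward pass (linear + activation).
    - Performs backward pass (computes gradients and updates weights).
    """

    def __init__(self, input_size, activation):
        # One output neuron: weights have shape (input_size, 1)
        self.weights = np.random.randn(input_size, 1) * 0.01
        self.bias = np.zeros((1,))
        self.activation = activation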

    def forward(self, input):
        self.input = input
        self.linear_output = np.dot(input, self.weights) + self.bias

        if self.activation == 'relu':
            self.output = relu(self.linear_output)
        elif self.activation == 'softmax':
            # linear_output has a single column here, so softmax over axis=1 returns 1.0 for every row
            self.output = softmax(self.linear_output)
        return self.output

    def backward(self, output_error, learning_rate):
        if self.activation == 'relu':
            delta = output_error * relu_derivative(self.linear_output)
        elif self.activation == 'softmax':
            # Passes the error through unchanged, i.e. treats the activation as identity in the backward pass
            delta = output_error

        input_error = np.dot(delta, self.weights.T)
        weights_error = np.dot(self.input.T, delta)

        self.weights -= learning_rate * weights_error
        self.bias -= learning_rate * np.sum(delta, axis=0)

        return input_error
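
One consequence of using a single neuron per layer is worth seeing in isolation: softmax normalizes across `axis=1`, so applied to a one-column array it returns 1.0 for every row, whatever the input. The arrays below are made up for the illustration.

```python
import numpy as np

def softmax(x):
    e_x = np.exp(x - np.max(x, axis=1, keepdims=True))
    return e_x / np.sum(e_x, axis=1, keepdims=True)

# Shape (3, 1): what a one-neuron layer produces for a batch of 3 samples
single_column = np.array([[-3.0], [0.0], [7.5]])
single_out = softmax(single_column)
print(single_out)

# Shape (1, 3): softmax only carries information when there are several columns
three_columns = np.array([[1.0, 2.0, 3.0]])
multi_out = softmax(three_columns)
print(multi_out)
```

A layer built as `DenseLayer(1, 'softmax')` therefore emits a constant, so whatever comes after it cannot depend on the input features.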

In [None]:
class SimpleNeuralNetwork:
    """
    SimpleNeuralNetwork:
    - Stacks three single-neuron DenseLayer objects (ReLU, ReLU, Softmax).
    - Projects the last layer's output to `output_dim` logits and applies softmax.
    - Backpropagates the softmax/cross-entropy gradient through every layer.
    - Trains with shuffled mini-batches and predicts classes via argmax.
    """

    def __init__(self, input_dim, output_dim):
        self.layer1 = DenseLayer(input_dim, 'relu')
        self.layer2 = DenseLayer(1, 'relu')
        # Class probabilities come from the softmax over the logits in forward();
        # this single-column softmax layer always outputs 1.0
        self.layer3 = DenseLayer(1, 'softmax')

        # Final transformation layer
        self.output_weights = np.random.randn(1, output_dim) * 0.01
        self.output_bias = np.zeros((1, output_dim))

    def forward(self, x):
        out1 = self.layer1.forward(x)
        out2 = self.layer2.forward(out1)
        out3 = self.layer3.forward(out2)
        self.logits = np.dot(out3, self.output_weights) + self.output_bias
        self.output = softmax(self.logits)
        return self.output

    def backward(self, y_true, learning_rate):
        error = cross_entropy_derivative(self.output, y_true)
        d_logits = error

        d_output_weights = np.dot(self.layer3.output.T, d_logits)
        d_output_bias = np.sum(d_logits, axis=0, keepdims=True)
        d_hidden = np.dot(d_logits, self.output_weights.T)

        self.output_weights -= learning_rate * d_output_weights
        self.output_bias -= learning_rate * d_output_bias

        d_out3 = self.layer3.backward(d_hidden, learning_rate)
        d_out2 = self.layer2.backward(d_out3, learning_rate)
        self.layer1.backward(d_out2, learning_rate)

    def train(self, X, y, epochs=1000, batch_size=16, learning_rate=0.1):
        for epoch in range(epochs):
            indices = np.random.permutation(X.shape[0])
            X_shuffled = X[indices]
            y_shuffled = y[indices]

            for i in range(0, X.shape[0], batch_size):
                X_batch = X_shuffled[i:i+batch_size]
                y_batch = y_shuffled[i:i+batch_size]

                output = self.forward(X_batch)
                loss = cross_entropy(output, y_batch)
                self.backward(y_batch, learning_rate)

            if epoch % 100 == 0:
                # Reports the loss of the last mini-batch in this epoch, not an epoch average
                print(f"Epoch {epoch}, Loss: {loss:.4f}")

    def predict(self, X):
        output = self.forward(X)
        return np.argmax(output, axis=1)
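
To make the batching in `train` concrete, the sketch below reproduces only the slicing logic for 178 samples and a batch size of 16 (the values used later in the notebook) and reports how the samples are split.

```python
import numpy as np

n_samples, batch_size = 178, 16
indices = np.random.default_rng(0).permutation(n_samples)

# Same slicing pattern as the training loop: the final slice may be shorter
batch_sizes = [len(indices[i:i + batch_size])
               for i in range(0, n_samples, batch_size)]

print(len(batch_sizes))   # number of mini-batches per epoch
print(batch_sizes[-1])    # size of the final mini-batch
```

With these settings there are 12 mini-batches per epoch and the last one holds only 2 samples. Since the logged loss is taken from that last mini-batch, the printed values can be expected to fluctuate noticeably from one report to the next.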


In [None]:
"""
Initialize and train the neural network:
- Creates a model with input size equal to feature dimension and 3 output classes.
- Trains the model using cross-entropy loss and mini-batch gradient descent.
"""

model = SimpleNeuralNetwork(input_dim=X.shape[1], output_dim=3)

# Train
model.train(X, y_encoded, epochs=1000, batch_size=16, learning_rate=0.001)


Epoch 0, Loss: 1.0829
Epoch 100, Loss: 1.1067
Epoch 200, Loss: 1.1067
Epoch 300, Loss: 1.1047
Epoch 400, Loss: 1.2083
Epoch 500, Loss: 1.1051
Epoch 600, Loss: 1.0119
Epoch 700, Loss: 1.2082
Epoch 800, Loss: 1.0121
Epoch 900, Loss: 1.3132
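
For reference when reading these numbers, a classifier that assigns probability 1/3 to each of the three classes has a cross-entropy of ln(3). The sketch below computes that value with the same loss function, using made-up targets.

```python
import numpy as np

def cross_entropy(predictions, targets):
    m = targets.shape[0]
    return -np.sum(targets * np.log(predictions + 1e-15)) / m

targets = np.eye(3)[[0, 1, 2, 1]]       # hypothetical one-hot labels
uniform = np.full((4, 3), 1.0 / 3.0)    # equal probability for every class

chance_loss = cross_entropy(uniform, targets)
print(f"{chance_loss:.4f}")
print(f"{np.log(3):.4f}")
```

Both lines print 1.0986. The logged losses stay close to this value over all 1000 epochs, which suggests the network is learning little beyond a near-uniform guess.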


In [None]:
"""
Evaluate model performance:
- Predict class labels on training data.
- Compare with true labels to calculate accuracy.
"""

y_pred = model.predict(X)
y_true = np.argmax(y_encoded, axis=1)

# Accuracy
accuracy = np.mean(y_pred == y_true)
print(f"\nFinal Accuracy: {accuracy * 100:.2f}%")



Final Accuracy: 39.89%
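
A useful point of comparison is the majority-class baseline, that is, the accuracy obtained by always predicting the most frequent class. The sketch below uses the class sizes listed in the dataset's documentation (59, 71, and 48 samples), hard-coded here as an assumption rather than computed from `y`.

```python
import numpy as np

# Class sizes for cultivars 1, 2, 3 as given in the dataset documentation
class_counts = np.array([59, 71, 48])

majority_baseline = class_counts.max() / class_counts.sum()
print(f"Majority-class baseline: {majority_baseline * 100:.2f}%")
```

The baseline comes out at 39.89%, the same as the reported accuracy. That is consistent with the model predicting a single class for every sample, although the notebook does not check this directly; inspecting `np.unique(y_pred, return_counts=True)` would confirm it.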
