# Neural Networks from Scratch

## Objective

This notebook walks through the implementation of a **feed-forward neural network (MLP)** using **only NumPy**.

The purpose is to understand:

* how neural networks compute outputs
* how gradients are calculated via backpropagation
* how learning happens through gradient descent

In brief, we have
* **Layer:** `Z = X @ W + b`, `A = f(Z)`
* **Activation (ReLU):** `ReLU(z) = max(0, z)`
* **Loss (MSE):** `L = mean((y - ŷ)²)`
* **Backprop:**

  ```
  dW = Xᵀ @ dZ
  db = sum(dZ)
  dX = dZ @ Wᵀ
  ```
* **Update:** `W -= lr * dW`, `b -= lr * db`

> Linear algebra + non-linearity + gradients = learning.


## Core Building Blocks

In [1]:
import numpy as np

In [10]:
### Activations

def sigmoid(x):
    return 1/(1 + np.exp(-x))

def sigmoid_derivative(x):
    s = sigmoid(x)
    return s * (1 - s)

def relu(x):
    return np.maximum(0, x)

def relu_derivative(x):
    return np.where(x > 0, 1, 0)

In [11]:
### Loss Functions

def mse(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)

def mse_derivative(y_true, y_pred):
    return -2 * (y_true - y_pred) / y_true.size

In [12]:
### Fully Connected Layer or Dense Layer

class DenseLayer:
    def __init__(self, input_dim, output_dim):
        self.input_dim = input_dim
        self.output_dim = output_dim
        self.w = np.random.randn(input_dim, output_dim) * 0.01
        self.b = np.zeros((1, output_dim))

    def forward(self, X):
        self.X = X
        self.Z = self.X @ self.w + self.b
        return self.Z
    
    def backward(self, dZ, learning_rate=0.01):
        m = self.X.shape[0]
        self.dw = self.X.T @ dZ / m
        self.db = np.sum(dZ, axis=0, keepdims=True) / m
        dX = dZ @ self.w.T
        
        # Update weights and biases
        self.w -= learning_rate * self.dw
        self.b -= learning_rate * self.db
        
        return dX

In [19]:
class Activation:
    def __init__(self, func, func_derivative):
        self.func = func
        self.func_derivative = func_derivative

    def forward(self, Z):
        self.Z = Z
        return self.func(Z)

    def backward(self, dA):
        return dA * self.func_derivative(self.Z)


class NeuralNetwork:
    def __init__(self):
        self.layers = []

    def add(self, layer):
        self.layers.append(layer)

    def forward(self, X):
        for layer in self.layers:
            X = layer.forward(X)
        return X

    def backward(self, dLoss, lr):
        for layer in reversed(self.layers):
            dLoss = layer.backward(dLoss)


### Lets try it out!!

In [20]:
# Dummy dataset
np.random.seed(0)
X = np.random.rand(100, 1)
y = 3 * X + 2

# Build network
nn = NeuralNetwork()
nn.add(DenseLayer(1, 16))
nn.add(Activation(relu, relu_derivative))
nn.add(DenseLayer(16, 1))

# Train
epochs = 1000
lr = 0.1

for epoch in range(epochs):
    y_pred = nn.forward(X)
    loss = mse(y, y_pred)

    dLoss = mse_derivative(y, y_pred)
    nn.backward(dLoss, lr)

    if epoch % 100 == 0:
        print(f"Epoch {epoch}, Loss: {loss:.4f}")


Epoch 0, Loss: 12.4316
Epoch 100, Loss: 11.9718
Epoch 200, Loss: 11.5301
Epoch 300, Loss: 11.1056
Epoch 400, Loss: 10.6976
Epoch 500, Loss: 10.3056
Epoch 600, Loss: 9.9289
Epoch 700, Loss: 9.5667
Epoch 800, Loss: 9.2186
Epoch 900, Loss: 8.8840
