# Neural Network from Scratch Demo

This notebook builds, trains, and evaluates a small neural network using `micrograd`, a library implemented from scratch. The network code lives in `nn.py`, which provides a minimal framework for constructing multi-layer perceptrons (MLPs) and performing automatic differentiation.

## Notebook Overview

- **Data Preparation:** We define a small dataset of eight inputs, each a list of four numbers. The target label is 1 if the sum of the input is positive and -1 otherwise.
- **Model Construction:** The neural network model is created using the custom `MLP` class from `micrograd.nn`.
- **Training Loop:** The model is trained using gradient descent, with the loss and parameter updates computed manually.
- **Evaluation:** The notebook shows predictions before and after training, and demonstrates how the model learns to classify the inputs correctly.

This notebook serves as an educational example for understanding the inner workings of neural networks, forward and backward passes, and gradient-based optimization, all implemented from scratch for transparency and learning.
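To make the forward and backward passes concrete, here is a minimal sketch of scalar automatic differentiation. The `Scalar` class is hypothetical and written only for illustration; it is not the code in `nn.py`, and it supports just addition and multiplication.

```python
class Scalar:
    """Hypothetical stand-in for a micrograd-style value (illustration only)."""

    def __init__(self, data, parents=()):
        self.data = data                  # result of the forward computation
        self.grad = 0.0                   # d(output)/d(this node), filled in by backward()
        self._parents = parents           # nodes this one was computed from
        self._backward = lambda: None     # pushes this node's grad to its parents

    def __add__(self, other):
        out = Scalar(self.data + other.data, (self, other))

        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Scalar(self.data * other.data, (self, other))

        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Order the graph so each node comes after the nodes it depends on.
        order, seen = [], set()

        def visit(node):
            if node not in seen:
                seen.add(node)
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        # Apply the chain rule from the output back to the inputs.
        for node in reversed(order):
            node._backward()


a, b, c = Scalar(2.0), Scalar(-3.0), Scalar(10.0)
out = a * b + c   # forward pass
out.backward()    # backward pass
print(out.data, a.grad, b.grad, c.grad)
```

The forward pass records how each value was produced, and the backward pass walks that record in reverse to accumulate gradients, which is the same idea the training loop relies on when it calls `loss.backward()`.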

In [8]:
from micrograd.nn import MLP

x = [
    [1.0, -2.0, 3.0, 4.0],      # sum = 6.0  -> y = 1
    [-0.5, -1.5, 2.5, 3.5],     # sum = 4.0  -> y = 1
    [-2.0, -2.0, 2.0, 1.0],     # sum = -1.0 -> y = -1
    [0.0, 0.0, 0.0, 0.0],       # sum = 0.0  -> y = -1
    [5.0, -1.0, 0.0, -4.0],     # sum = 0.0  -> y = -1
    [-3.0, 2.0, 1.0, 0.0],      # sum = 0.0  -> y = -1
    [1.0, 1.0, -1.0, -1.0],     # sum = 0.0  -> y = -1
    [2.5, 2.5, -2.5, 2.5]       # sum = 5.0  -> y = 1
]

y = [1 if sum(row) > 0 else -1 for row in x]
y

[1, 1, -1, -1, -1, -1, -1, 1]

In [3]:
model = MLP(input_size=4, hidden_sizes=[5, 5], output_size=1)
predictions = [model.forward(x_i) for x_i in x]
predictions

[Value(0.6285093447143044),
 Value(0.6463391763868636),
 Value(0.7243709615384885),
 Value(0.5239761348711873),
 Value(0.5885857598967051),
 Value(0.6148290660715675),
 Value(0.6866854366042201),
 Value(0.4722425749376248)]

## Training the Neural Network

The goal is to train a neural network to classify inputs based on whether their sum is positive (output = 1) or negative/zero (output = -1).

**How Gradient Descent Works:**
During training, we update each parameter by subtracting a fraction of its gradient (scaled by the learning rate). This moves the parameters in the direction that reduces the loss:
- ❌ `param.data += param.grad` (moves in direction of increasing loss)
- ✅ `param.data -= learning_rate * param.grad` (moves in direction of decreasing loss - gradient descent)
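The difference between the two update rules can be seen on a toy problem. The sketch below is self-contained and does not use `micrograd`; the loss f(w) = (w - 3)^2, the function names, and the step count are assumptions chosen for illustration. Both runs scale the step by the learning rate so that only the sign differs.

```python
def grad(w):
    """Gradient of the toy loss f(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)


def run(w, step_sign, learning_rate=0.1, steps=20):
    """Apply `steps` updates; step_sign=-1 subtracts the gradient, +1 adds it."""
    for _ in range(steps):
        w += step_sign * learning_rate * grad(w)
    return w


w_descent = run(0.0, step_sign=-1)  # moves toward the minimum at w = 3
w_ascent = run(0.0, step_sign=+1)   # moves away from the minimum
print(w_descent, w_ascent)
```

Subtracting the gradient shrinks the distance to the minimum on every step, while adding it makes that distance grow.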


In [18]:
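# Assumption: Value is importable from micrograd.engine, as in upstream micrograd;
# adjust the import if this project defines it elsewhere.
from micrograd.engine import Value

# Convert inputs to Value objects for proper gradient tracking
x_values = [[Value(xi) for xi in row] for row in x]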

learning_rate = 0.01

for epoch in range(100):
    
    # Forward pass - use Value objects as inputs
    predictions = [model.forward(x_i) for x_i in x_values]

    # Compute loss (sum of squared errors over all samples)
    loss = sum((pred - target) ** 2 for target, pred in zip(y, predictions))

    # Backward pass - zero gradients first
    for param in model.parameters():
        param.grad = 0.0

    loss.backward()
    
    # Update parameters (gradient descent: subtract gradient)
    for param in model.parameters():
        param.data -= learning_rate * param.grad
    
    if epoch % 10 == 0:  # Print every 10 epochs to reduce output
        print(f"Epoch {epoch+1}, Loss: {loss.data:.4f}")

print(f"Final Loss: {loss.data:.4f}")


Epoch 1, Loss: 14.0827
Epoch 11, Loss: 3.2389
Epoch 21, Loss: 1.0345
Epoch 31, Loss: 0.3792
Epoch 41, Loss: 0.1982
Epoch 51, Loss: 0.1273
Epoch 61, Loss: 0.0918
Epoch 71, Loss: 0.0710
Epoch 81, Loss: 0.0576
Epoch 91, Loss: 0.0483
Final Loss: 0.0420
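The loss in the training loop adds up the squared errors over all samples without dividing by the number of samples. A small sketch on plain floats shows how that sum relates to the mean squared error; the values below are made up for illustration and are not taken from the model.

```python
# Illustrative predictions and targets (not model outputs)
example_predictions = [0.5, -0.5]
example_targets = [1.0, -1.0]

squared_errors = [
    (pred - target) ** 2
    for pred, target in zip(example_predictions, example_targets)
]

sum_squared_error = sum(squared_errors)                        # what the loop computes
mean_squared_error = sum_squared_error / len(squared_errors)   # the sum divided by N
print(sum_squared_error, mean_squared_error)
```

Because the two differ only by the constant factor N, their gradients differ by the same factor, so switching to the mean with the same learning rate would take steps N times smaller.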


In [19]:
predictions

[Value(0.9390086676532606),
 Value(0.8925452207542912),
 Value(-0.8975398669649308),
 Value(-0.9061631461857372),
 Value(-0.9837607365348159),
 Value(-0.9546030516888907),
 Value(-0.9651620796473922),
 Value(0.9374773135189293)]
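One way to turn these outputs into class labels is to threshold them at zero. The sketch below copies the numbers printed above into plain floats and compares the resulting labels with the targets; the zero threshold and the variable names are assumptions, not part of `micrograd`.

```python
# Output values copied from the predictions shown above
final_outputs = [
    0.9390086676532606,
    0.8925452207542912,
    -0.8975398669649308,
    -0.9061631461857372,
    -0.9837607365348159,
    -0.9546030516888907,
    -0.9651620796473922,
    0.9374773135189293,
]
targets = [1, 1, -1, -1, -1, -1, -1, 1]

# Assumed decision rule: positive output -> 1, otherwise -> -1
predicted_labels = [1 if out > 0 else -1 for out in final_outputs]
accuracy = sum(p == t for p, t in zip(predicted_labels, targets)) / len(targets)
print(predicted_labels, accuracy)
```

Under this rule every training example is assigned its target label. This measures fit on the training data only, since the notebook has no held-out set.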