# Micrograd: Train a Neural Net

This notebook uses the **engine** (autograd) and **nn** (MLP) from micrograd to train a small neural network on a toy regression task.

## 1. Imports and setup

In [2]:
import random
from engine import Value
from nn import MLP

## 2. Toy dataset

We'll learn a simple function: **target = x0 + 2*x1** (2 inputs → 1 output). Each sample is a list of floats; we'll wrap them in `Value` during the forward pass so the graph is built and gradients flow.
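To illustrate why the inputs need to be wrapped, here is a minimal sketch of a scalar autograd node. `TinyValue` is a hypothetical stand-in written for this example, not micrograd's actual `Value` class; it supports only `+` and `*`, which is enough to show how each operation records its parents so that `backward()` can apply the chain rule.

```python
# Hypothetical stand-in for an autograd scalar; NOT micrograd's Value class.
class TinyValue:
    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents          # nodes this value was computed from
        self._backward = lambda: None    # leaves have nothing to propagate

    def __add__(self, other):
        other = other if isinstance(other, TinyValue) else TinyValue(other)
        out = TinyValue(self.data + other.data, (self, other))

        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, TinyValue) else TinyValue(other)
        out = TinyValue(self.data * other.data, (self, other))

        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Visit nodes in topological order, then apply the chain rule
        # from the output back to the leaves.
        order, seen = [], set()

        def visit(node):
            if node not in seen:
                seen.add(node)
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            node._backward()


w = TinyValue(2.0)
x = TinyValue(3.0)
y = w * x + x          # y = w*x + x
y.backward()
print(y.data, w.grad, x.grad)  # 9.0 3.0 3.0
```

A plain float carries no record of how it was computed, so nothing could be propagated back through it; wrapping it in a graph node is what makes the gradient available.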

In [3]:
def make_data(n_samples=100, seed=42):
    random.seed(seed)
    data = []
    for _ in range(n_samples):
        x0 = random.uniform(-2, 2)
        x1 = random.uniform(-2, 2)
        target = x0 + 2 * x1  # function we want to learn
        data.append(([x0, x1], target))
    return data

data = make_data(n_samples=200)
print(f"Dataset size: {len(data)}")
print(f"Example: input {data[0][0]} -> target {data[0][1]:.4f}")

Dataset size: 200
Example: input [0.557707193831535, -1.8999569791093323] -> target -3.2422


## 3. Model and training loop

- **Model**: MLP with 2 inputs, one hidden layer of size 4, 1 output (regression).
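- **Loss**: mean squared error (MSE), i.e. the per-sample squared error `(pred - target) ** 2` averaged over the dataset.
- **Optimizer**: manual full-batch gradient descent: gradients are accumulated over all `n` samples, then each parameter is updated with `p.data -= lr * p.grad / n`.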
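The loss and update rule above can be sketched without any autograd at all. The example below is an illustration under stated assumptions: it swaps the MLP for a hypothetical linear model `pred = w0*x0 + w1*x1 + b` so the gradients can be written by hand, and it uses its own small dataset and seed rather than the notebook's.

```python
import random

random.seed(0)

# Same toy target as the notebook: y = x0 + 2*x1
samples = []
for _ in range(50):
    x0, x1 = random.uniform(-2, 2), random.uniform(-2, 2)
    samples.append((x0, x1, x0 + 2 * x1))

# Hypothetical linear model (not the notebook's MLP)
w0, w1, b = 0.0, 0.0, 0.0
lr = 0.05
n = len(samples)


def mse(w0, w1, b):
    """Mean squared error of the linear model over all samples."""
    return sum((w0 * x0 + w1 * x1 + b - y) ** 2 for x0, x1, y in samples) / n


initial_mse = mse(w0, w1, b)

for _ in range(500):
    g0 = g1 = gb = 0.0
    for x0, x1, y in samples:
        err = w0 * x0 + w1 * x1 + b - y
        # Gradient of err**2 with respect to each parameter, summed over samples
        g0 += 2 * err * x0
        g1 += 2 * err * x1
        gb += 2 * err
    # Dividing the summed gradient by n gives the gradient of the mean loss
    w0 -= lr * g0 / n
    w1 -= lr * g1 / n
    b -= lr * gb / n

final_mse = mse(w0, w1, b)
print(f"w0={w0:.3f} w1={w1:.3f} b={b:.3f}")
print(f"MSE: {initial_mse:.4f} -> {final_mse:.6f}")
```

Because the target is exactly linear, this model should approach `w0 ≈ 1`, `w1 ≈ 2`, `b ≈ 0`. With an MLP the same update applies, but the gradients come from `backward()` instead of hand-written formulas.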

In [4]:
nin = 2
nouts = [4, 1]  # hidden size 4, output size 1
model = MLP(nin, nouts)
lr = 0.05
epochs = 100

print(model)

MLP of [Layer of [ReLUNeuron(2), ReLUNeuron(2), ReLUNeuron(2), ReLUNeuron(2)], Layer of [LinearNeuron(4)]]
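
As a quick sanity check on the printed architecture, the number of trainable parameters can be counted by hand. The sketch below assumes each neuron holds one weight per input plus a single bias, as in the standard micrograd `nn` module; `count_mlp_params` is a hypothetical helper written for this example and is not part of micrograd.

```python
def count_mlp_params(nin, nouts):
    """Count weights and biases of a fully connected MLP.

    Assumes each neuron has `fan_in` weights and one bias.
    """
    sizes = [nin] + list(nouts)
    total = 0
    for fan_in, n_neurons in zip(sizes, sizes[1:]):
        total += n_neurons * (fan_in + 1)
    return total


n_params = count_mlp_params(2, [4, 1])
print(n_params)  # 4*(2+1) + 1*(4+1) = 17
```

Under that assumption, `len(model.parameters())` should report the same number.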


In [5]:
for epoch in range(epochs):
    model.zero_grad()  # backward() accumulates into p.grad, so reset once per epoch
    total_loss = 0.0
    
    for (x_list, target) in data:
        # Wrap inputs in Value so the computation graph is built
        x = [Value(xi) for xi in x_list]
        pred = model(x)
        loss = (pred - target) ** 2
        total_loss += loss.data
        loss.backward()
    
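    # Gradient descent: p.grad holds the sum over all samples, so divide by n
    # to step along the gradient of the mean loss (full-batch update)
    n = len(data)
    for p in model.parameters():
        p.data -= lr * (p.grad / n)
    
    mse = total_loss / n  # measured during the forward passes, before this update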
    if (epoch + 1) % 20 == 0 or epoch == 0:
        print(f"Epoch {epoch + 1:3d}  MSE = {mse:.6f}")

Epoch   1  MSE = 9.669970
Epoch  20  MSE = 0.143600
Epoch  40  MSE = 0.059303
Epoch  60  MSE = 0.046844
Epoch  80  MSE = 0.038989
Epoch 100  MSE = 0.032848


## 4. Check predictions

Compare the model's predictions with the true function **target = x0 + 2*x1** on a few points.

In [6]:
test_points = [[0, 0], [1, 0], [0, 1], [1, 1], [-1, 2]]
print("Input        True (x0+2*x1)  Predicted")
print("-" * 45)
for x0, x1 in test_points:
    x = [Value(x0), Value(x1)]
    pred = model(x)
    true_val = x0 + 2 * x1
    print(f"[{x0:4.1f}, {x1:4.1f}]      {true_val:8.2f}       {pred.data:.4f}")

Input        True (x0+2*x1)  Predicted
---------------------------------------------
[ 0.0,  0.0]          0.00       -0.2636
[ 1.0,  0.0]          1.00       1.1590
[ 0.0,  1.0]          2.00       2.0168
[ 1.0,  1.0]          3.00       2.9675
[-1.0,  2.0]          3.00       2.9752
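
To summarize the table in a single number, one option is the mean absolute error over the test points. The sketch below copies the (true, predicted) pairs from the printed output above rather than re-running the model, so it only reflects this particular run.

```python
# (true, predicted) pairs copied from the printed table above
results = [
    (0.0, -0.2636),
    (1.0, 1.1590),
    (2.0, 2.0168),
    (3.0, 2.9675),
    (3.0, 2.9752),
]

abs_errors = [abs(pred - true) for true, pred in results]
mae = sum(abs_errors) / len(abs_errors)
worst = max(abs_errors)
print(f"MAE = {mae:.4f}, max abs error = {worst:.4f}")
# MAE = 0.0993, max abs error = 0.2636
```

The largest error in this run is at the origin `[0, 0]`, while the other four points are within about 0.16 of the target.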
