
# PyTorch Multilayer Perceptron (MLP) Tutorial

This notebook walks through building, training, and evaluating a simple **Multilayer Perceptron (MLP)** in PyTorch.
We'll use a toy two-class dataset (`make_moons`) because it is two-dimensional, which makes both the data and the learned decision boundary easy to plot.


In [None]:

import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
import numpy as np


## 1 Create a Toy Dataset

In [None]:

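# Generate 1,000 noisy 2-D points in two interleaving half-moons (labels 0 and 1)
X, y = make_moons(n_samples=1000, noise=0.2, random_state=42)

# Split the NumPy arrays first (80% train, 20% test), then convert each piece to a tensor
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
X_train = torch.tensor(X_train, dtype=torch.float32)
X_test = torch.tensor(X_test, dtype=torch.float32)
y_train = torch.tensor(y_train, dtype=torch.long)
y_test = torch.tensor(y_test, dtype=torch.long)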

plt.scatter(X_train[:,0], X_train[:,1], c=y_train, cmap='viridis', edgecolor='k')
plt.title('Training Data (Two Moons)')
plt.show()


## 2 Define the MLP Model

In [None]:

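class MLP(nn.Module):
    """One hidden layer: Linear -> ReLU -> Linear."""

    def __init__(self, input_dim, hidden_dim, output_dim):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),   # input features -> hidden units
            nn.ReLU(),                          # nonlinearity
            nn.Linear(hidden_dim, output_dim)   # hidden units -> class logits
        )

    def forward(self, x):
        # Returns raw logits; nn.CrossEntropyLoss applies log-softmax internally
        return self.layers(x)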

model = MLP(input_dim=2, hidden_dim=16, output_dim=2)
print(model)
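
As a quick sanity check on the architecture, we can count the learnable parameters. The sketch below rebuilds the same layer stack under a hypothetical name (`demo_layers`) so that it runs on its own; it is an illustration, not part of the tutorial's model.

```python
import torch.nn as nn

# Same layer sizes as MLP(input_dim=2, hidden_dim=16, output_dim=2),
# rebuilt here so the snippet is self-contained.
demo_layers = nn.Sequential(
    nn.Linear(2, 16),   # 2*16 weights + 16 biases = 48 parameters
    nn.ReLU(),          # no learnable parameters
    nn.Linear(16, 2),   # 16*2 weights + 2 biases = 34 parameters
)

n_params = sum(p.numel() for p in demo_layers.parameters() if p.requires_grad)
print(n_params)  # 82
```

The same one-liner works on the notebook's `model` object, since `nn.Module.parameters()` walks every submodule.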


## 3 Define Loss Function and Optimizer

In [None]:

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)
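
To see why the model outputs raw logits with no softmax layer, it may help to check what `nn.CrossEntropyLoss` computes. The sketch below uses made-up logits and targets (not values from this notebook) and compares the built-in loss with a manual log-softmax followed by the mean negative log-likelihood.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Made-up raw scores for 2 samples and 2 classes, plus their class indices.
demo_logits = torch.tensor([[2.0, -1.0], [0.5, 1.5]])
demo_targets = torch.tensor([0, 1])

# Built-in loss (default reduction is the mean over samples).
loss_builtin = nn.CrossEntropyLoss()(demo_logits, demo_targets)

# Manual version: log-softmax, pick the log-probability of the true class, negate, average.
log_probs = F.log_softmax(demo_logits, dim=1)
loss_manual = -log_probs[torch.arange(len(demo_targets)), demo_targets].mean()

print(torch.allclose(loss_builtin, loss_manual))  # True
```

Because the softmax is folded into the loss, adding an explicit `nn.Softmax` to the model would apply it twice.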


## 4 Train the Model

In [None]:

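epochs = 100
for epoch in range(epochs):
    # Forward pass on the whole training set (full-batch training)
    outputs = model(X_train)
    loss = criterion(outputs, y_train)

    # Backward pass: clear old gradients, backpropagate, update the weights
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Report the training loss every 10 epochs
    if (epoch + 1) % 10 == 0:
        print(f"Epoch [{epoch+1}/{epochs}], Loss: {loss.item():.4f}")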
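
The loop calls `optimizer.zero_grad()` on every iteration because PyTorch adds new gradients to whatever is already stored in `.grad`. The sketch below shows this on a hypothetical two-element tensor `w`, which is not part of the model.

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)

# First backward pass: d/dw sum(w^2) = 2w
(w ** 2).sum().backward()
first = w.grad.clone()          # tensor([2., 4.])

# Second backward pass without a reset: the new gradient is added to the old one
(w ** 2).sum().backward()
accumulated = w.grad.clone()    # tensor([4., 8.])

# Resetting the stored gradient first gives the expected value again
w.grad.zero_()
(w ** 2).sum().backward()
after_reset = w.grad.clone()    # tensor([2., 4.])

print(first, accumulated, after_reset)
```

`optimizer.zero_grad()` performs this reset for every parameter the optimizer manages; depending on the PyTorch version it may set `.grad` to `None` instead of filling it with zeros, which has the same effect on the next backward pass.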


## 5 Evaluate Model Accuracy

In [None]:

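model.eval()  # no effect for this model (no dropout or batch norm), but good practice before inference
with torch.no_grad():
    preds = model(X_test)                    # raw logits, shape (n_test, 2)
    predicted = torch.argmax(preds, dim=1)   # index of the larger logit = predicted class
    acc = (predicted == y_test).float().mean().item()
    print(f"Test Accuracy: {acc:.4f}")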
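
The evaluation takes `argmax` directly over the logits. Softmax preserves the ordering of the scores, so converting to probabilities first would give the same predictions. The sketch below checks this on made-up logits and labels (not outputs of the trained model) and repeats the accuracy computation by hand.

```python
import torch

# Made-up logits for 4 samples and 2 classes, with made-up true labels.
demo_logits = torch.tensor([[2.0, -1.0], [0.5, 1.5], [0.1, 0.0], [-1.0, 3.0]])
demo_labels = torch.tensor([0, 1, 1, 1])

from_logits = torch.argmax(demo_logits, dim=1)
from_probs = torch.argmax(torch.softmax(demo_logits, dim=1), dim=1)

# Accuracy = fraction of predictions that match the labels (3 of 4 here).
demo_acc = (from_logits == demo_labels).float().mean().item()
print(from_logits.tolist(), demo_acc)  # [0, 1, 0, 1] 0.75
```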


## 6 Visualize Decision Boundary

In [None]:

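# Evaluate the model on a 100 x 100 grid that covers the range of the data
xx, yy = np.meshgrid(np.linspace(-2, 3, 100), np.linspace(-1.5, 2, 100))
grid = torch.tensor(np.c_[xx.ravel(), yy.ravel()], dtype=torch.float32)
with torch.no_grad():
    # Predicted class for each grid point, converted to NumPy for matplotlib
    Z = torch.argmax(model(grid), dim=1).reshape(xx.shape).numpy()

# Shade the predicted class regions, then overlay the test points
plt.contourf(xx, yy, Z, alpha=0.5, cmap='viridis')
plt.scatter(X_test[:,0], X_test[:,1], c=y_test, cmap='viridis', edgecolor='k')
plt.title("MLP Decision Boundary")
plt.show()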


## Exercises

### Exercise 1
Modify the MLP to have **two hidden layers** with dimensions `[16, 8]` and ReLU activations.


### Exercise 2
Replace `nn.ReLU()` with `nn.Tanh()` and retrain. How does the test accuracy change?

### Exercise 3
Add **L2 regularization (weight decay)** to the optimizer and observe its effect.
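
As a starting hint, PyTorch optimizers accept a `weight_decay` argument. The sketch below builds a stand-in model under hypothetical names (`demo_model`, `demo_optimizer`), and the value `1e-4` is an arbitrary assumption rather than a tuned setting.

```python
import torch.nn as nn
import torch.optim as optim

# Stand-in model with the same layer sizes as the tutorial's MLP.
demo_model = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 2))

# weight_decay is the L2 coefficient; 1e-4 is an arbitrary starting value.
demo_optimizer = optim.Adam(demo_model.parameters(), lr=0.01, weight_decay=1e-4)

# The setting is stored per parameter group.
print(demo_optimizer.param_groups[0]["weight_decay"])
```

With `optim.Adam`, the decay term is added to the gradient before the adaptive update. `optim.AdamW` applies the decay separately from the gradient, so the two can behave differently for the same coefficient.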