# Multilayer Perceptrons (MLP) - Theory and Practice

## Introduction
A Multilayer Perceptron (MLP) is a type of feedforward artificial neural network. An MLP consists of at least three layers of nodes: an input layer, one or more hidden layers, and an output layer. Each neuron in one layer is connected to every neuron in the next layer, and each connection has an associated weight.

In this notebook, we will explore the theory and practice of MLPs using examples from the sports domain, such as predicting a player's performance based on their past game statistics.


## Why Use MLP?
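MLPs are well-suited for problems where the relationship between inputs and outputs is complex and non-linear. By stacking layers and using non-linear activation functions, an MLP with enough hidden units can approximate any continuous function on a compact (closed and bounded) input domain to arbitrary accuracy. This result is known as the universal approximation theorem.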
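
To see why the non-linear activation matters, here is a minimal NumPy sketch. The small weight matrices `W1`, `W2` and the input `x` are made up for illustration. It shows that two stacked linear layers collapse into a single linear layer, while placing a ReLU between them changes the result.

```python
import numpy as np

# Hypothetical toy weights, chosen by hand so the arithmetic is easy to follow
W1 = np.array([[1.0, -1.0],
               [-1.0, 1.0]])   # first layer: 2 inputs -> 2 hidden units
W2 = np.array([[1.0, 1.0]])    # second layer: 2 hidden units -> 1 output
x = np.array([1.0, 2.0])

# Two linear layers in a row equal one linear layer with weights W2 @ W1
stacked = W2 @ (W1 @ x)
collapsed = (W2 @ W1) @ x
print(np.allclose(stacked, collapsed))  # True

# With a ReLU in between, the hidden value -1 is clipped to 0,
# so the network no longer reduces to the single matrix W2 @ W1
def relu(z):
    return np.maximum(z, 0.0)

with_relu = W2 @ relu(W1 @ x)
print(stacked, with_relu)  # [0.] [1.]
```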

In [1]:
import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np
import matplotlib.pyplot as plt

## Step 1: Generate Data
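We'll simulate some sports data, for example basketball players' statistics. Each sample has four features (points, assists, rebounds, and steals), and the target is the player's performance category (poor, average, or excellent). Note that the features and labels below are drawn at random and independently of each other, so there is no real pattern to learn: the model can only memorize the training set, which is still enough to demonstrate the training mechanics.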

In [19]:
np.random.seed(0)
n_samples = 500
X = np.random.rand(n_samples, 4) * 100  # 4 features: points, assists, rebounds, steals
y = np.random.randint(0, 3, n_samples)  # 3 classes: 0 - poor, 1 - average, 2 - excellent

X.shape, y.shape

((500, 4), (500,))

In [20]:
X[:3]

array([[54.88135039, 71.51893664, 60.27633761, 54.4883183 ],
       [42.36547993, 64.58941131, 43.75872113, 89.17730008],
       [96.36627605, 38.34415188, 79.17250381, 52.88949198]])

In [21]:
y[:3]

array([1, 0, 1])
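
Before training, it can help to know how the classes are distributed and what accuracy a trivial "always predict the most common class" rule would reach. The sketch below regenerates the same data with the same seed so that it runs on its own. The names `class_counts` and `majority_baseline` are introduced here for illustration only.

```python
import numpy as np

# Same data generation as above, repeated so this cell is self-contained
np.random.seed(0)
n_samples = 500
X = np.random.rand(n_samples, 4) * 100
y = np.random.randint(0, 3, n_samples)

# Number of samples in each of the three classes
class_counts = np.bincount(y, minlength=3)

# Accuracy of always predicting the most frequent class
majority_baseline = class_counts.max() / n_samples

print("class counts:", class_counts)
print("majority-class accuracy:", round(float(majority_baseline), 3))
```

With three roughly balanced classes, this baseline should sit a little above one third. Any model evaluated on unseen data would need to beat it to be useful.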

## Step 2: Create MLP Model
Now, let's create an MLP model using PyTorch. The architecture will include an input layer, two hidden layers, and an output layer corresponding to the three categories of player performance.

In [22]:
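class MLP(nn.Module):
    """MLP with two hidden layers that returns raw class scores (logits)."""

    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)   # input -> hidden 1
        self.relu = nn.ReLU()                           # stateless, reused after fc1 and fc2
        self.fc2 = nn.Linear(hidden_size, hidden_size)  # hidden 1 -> hidden 2
        self.fc3 = nn.Linear(hidden_size, output_size)  # hidden 2 -> output

    def forward(self, x):
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        x = self.relu(x)
        # No softmax here: nn.CrossEntropyLoss expects unnormalized logits
        x = self.fc3(x)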
        return x
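
As a rough sanity check on the size of this architecture, the sketch below counts its trainable parameters by hand, assuming the layer sizes used later in this notebook (4 inputs, 128 hidden units, 3 outputs). `linear_params` is a helper made up for this example.

```python
def linear_params(n_in, n_out):
    """Parameters of one fully connected layer: a weight per connection plus a bias per output."""
    return n_in * n_out + n_out

# input -> hidden 1 -> hidden 2 -> output
layer_sizes = [4, 128, 128, 3]

per_layer = [linear_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)

print(per_layer)  # [640, 16512, 387]
print(total)      # 17539
```

In PyTorch, the same total can be read from an instantiated model with `sum(p.numel() for p in model.parameters())`.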

## Step 3: Set Up the Loss and Optimizer
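We'll use `nn.CrossEntropyLoss`, the standard loss for multi-class classification; it takes the raw logits produced by the model together with integer class labels. The optimizer will be Stochastic Gradient Descent (`optim.SGD`). Because the training loop in this notebook passes the full dataset through the model at every step, each update is effectively a full-batch gradient descent step.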
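
To make the loss less of a black box, here is a small pure-Python sketch of what cross-entropy computes for a single sample: a softmax over the logits followed by the negative log of the probability assigned to the true class. The functions `softmax` and `cross_entropy` are written for illustration and are not PyTorch's implementation.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    """Negative log-probability of the true class."""
    return -math.log(softmax(logits)[target])

# A model with no preference among 3 classes scores ln(3), about 1.0986
uniform_loss = cross_entropy([0.0, 0.0, 0.0], target=1)

# A confident, correct prediction gives a loss close to 0
confident_loss = cross_entropy([0.0, 5.0, 0.0], target=1)

print(round(uniform_loss, 4))  # 1.0986
print(confident_loss < uniform_loss)  # True
```

This gives a useful reference point: for three classes, a loss near ln(3) ≈ 1.0986 means the model is doing no better than guessing uniformly.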

In [23]:
input_size = 4
hidden_size = 128
output_size = 3
learning_rate = 0.01
num_epochs = 5000

model = MLP(input_size, hidden_size, output_size)
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=learning_rate)
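
With its default settings (no momentum, no weight decay), `optim.SGD` updates each parameter as `p = p - lr * grad`. The sketch below applies that same rule by hand to a one-dimensional toy problem, minimizing `(w - 3)^2`. The function and the starting point are assumptions chosen only to keep the example small.

```python
def grad(w):
    """Derivative of the toy loss (w - 3)^2 with respect to w."""
    return 2.0 * (w - 3.0)

w = 0.0    # starting value
lr = 0.1   # learning rate for this toy problem

for step in range(50):
    w = w - lr * grad(w)  # the plain gradient descent update

print(abs(w - 3.0) < 1e-3)  # True: w has moved close to the minimum at 3
```

Each step shrinks the distance to the minimum by a constant factor here. On a real network the loss surface is not this simple, so progress is slower and noisier.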

## Step 4: Train and Evaluate the Model

In [24]:
X_train = torch.tensor(X, dtype=torch.float32)
y_train = torch.tensor(y, dtype=torch.long)

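for epoch in range(num_epochs):
    # Forward pass on the full dataset (no mini-batches)
    outputs = model(X_train)
    loss = criterion(outputs, y_train)

    # Backward pass: clear old gradients, backpropagate, update the weights
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()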

    if (epoch+1) % 100 == 0:
        print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}')

Epoch [100/5000], Loss: 1.0633
Epoch [200/5000], Loss: 1.0406
Epoch [300/5000], Loss: 1.0224
Epoch [400/5000], Loss: 1.0049
Epoch [500/5000], Loss: 0.9938
Epoch [600/5000], Loss: 0.9837
Epoch [700/5000], Loss: 0.9728
Epoch [800/5000], Loss: 0.9621
Epoch [900/5000], Loss: 0.9415
Epoch [1000/5000], Loss: 0.9325
Epoch [1100/5000], Loss: 0.9265
Epoch [1200/5000], Loss: 0.9177
Epoch [1300/5000], Loss: 0.9039
Epoch [1400/5000], Loss: 0.9062
Epoch [1500/5000], Loss: 0.8957
Epoch [1600/5000], Loss: 0.8820
Epoch [1700/5000], Loss: 0.8785
Epoch [1800/5000], Loss: 0.8676
Epoch [1900/5000], Loss: 0.8616
Epoch [2000/5000], Loss: 0.8640
Epoch [2100/5000], Loss: 0.8600
Epoch [2200/5000], Loss: 0.8606
Epoch [2300/5000], Loss: 0.8618
Epoch [2400/5000], Loss: 0.8357
Epoch [2500/5000], Loss: 0.8295
Epoch [2600/5000], Loss: 0.8224
Epoch [2700/5000], Loss: 0.8429
Epoch [2800/5000], Loss: 0.8071
Epoch [2900/5000], Loss: 0.8333
Epoch [3000/5000], Loss: 0.8241
Epoch [3100/5000], Loss: 0.7998
Epoch [3200/5000]
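
The log above tracks only the loss. One possible way to evaluate the classifier is to compare the arg-max of the logits with the true labels. The sketch below does this in NumPy on a few hand-written logits; `accuracy` is a hypothetical helper and the numbers are made up, not outputs of the model trained above.

```python
import numpy as np

def accuracy(logits, labels):
    """Fraction of rows whose highest-scoring class matches the integer label."""
    preds = np.argmax(logits, axis=1)
    return float(np.mean(preds == labels))

# Four made-up samples with three class scores each
logits = np.array([[2.0, 0.1, -1.0],
                   [0.3, 0.2, 0.9],
                   [0.0, 1.5, 0.2],
                   [1.0, 0.0, 0.5]])
labels = np.array([0, 2, 1, 2])

print(accuracy(logits, labels))  # 0.75
```

For the model in this notebook, the equivalent PyTorch expression would be `(model(X_train).argmax(dim=1) == y_train).float().mean()`, ideally wrapped in `torch.no_grad()`. Since that is measured on the training data, it reflects how well the model has memorized the samples, not how well it generalizes; a held-out split would be needed for that.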

## Conclusion
In this notebook, we built a basic MLP to predict player performance from game statistics. Because stacked layers with non-linear activations can model non-linear relationships between inputs and outputs, MLPs are a useful tool for sports analytics.

# Your Turn!

Build your own MLP. Take the input and output data defined below and try to build an MLP that trains. How do you know that it trains? The loss decreases.

Try to build a 5-layer MLP with at least one ReLU. You can use the same loss function and optimizer as we've seen in class.
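
For the "loss decreases" check, comparing two individual loss values can be misleading, because the loss often fluctuates from one epoch to the next. One possible approach is to compare the average of the first few recorded losses with the average of the last few. `loss_is_decreasing` is a hypothetical helper, and the sample values are taken from the training log earlier in this notebook.

```python
def loss_is_decreasing(losses, window=10):
    """Return True if the mean of the last `window` losses is below the mean of the first `window`."""
    if len(losses) < 2 * window:
        raise ValueError("need at least 2 * window loss values")
    start = sum(losses[:window]) / window
    end = sum(losses[-window:]) / window
    return end < start

# Losses logged at epochs 100-400 and 2800-3100 in the run above
logged = [1.0633, 1.0406, 1.0224, 1.0049, 0.8071, 0.8333, 0.8241, 0.7998]

print(loss_is_decreasing(logged, window=4))  # True
```

In your own training loop, you could append `loss.item()` to a list every epoch and pass that list to the helper.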

In [26]:
np.random.seed(0)
n_samples = 1000
X = np.random.rand(n_samples, 6) * 100  # 6 features
y = np.random.randint(0, 4, n_samples)  # 4 classes

X.shape, y.shape

((1000, 6), (1000,))