**UE Initiation à la R&D - ENSIIE (2022/2023)**

# Introduction

In this notebook, we will illustrate two concepts:

- supervised classification with a logistic regression,
- supervised classification with a feed-forward neural network.

# Setup

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import torch
import torch.nn as nn
from einops import asnumpy, rearrange, reduce
from einops.layers.torch import Rearrange
from torchvision import datasets
from torchvision.transforms import ToTensor

# Multiclass classification: MNIST

## Data

In [None]:
# load the data
train_dataset = datasets.MNIST(
    root="data", train=True, download=True, transform=ToTensor()
)

test_dataset = datasets.MNIST(
    root="data", train=False, download=True, transform=ToTensor()
)

Plot some images.

In [None]:
figure = plt.figure(figsize=(8, 8))
cols, rows = 3, 3
for i in range(1, cols * rows + 1):
    sample_idx = torch.randint(len(train_dataset), size=(1,)).item()
    img, label = train_dataset[sample_idx]
    figure.add_subplot(rows, cols, i)
    plt.title(f"Label: {label}")  # show the ground-truth digit above each image
    plt.axis("off")
    plt.imshow(img.squeeze(), cmap="gray")
plt.show()

<div class="alert alert-success" role="alert">
    <p><b>Question</b></p>
    <p>What are the sizes of the train and test sets? What is the size of an image? The number of classes?</p>
</div>

In [None]:
input_size = ...
n_classes = ...

## Logistic regression

**Define the model.**

In [None]:
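model = nn.Sequential(
    Rearrange("b c h w -> b (c h w)"),  # flatten each image: (batch, channel, height, width) -> (batch, features)
    nn.Linear(in_features=input_size, out_features=n_classes),
    # No activation here: nn.CrossEntropyLoss expects raw scores (logits)
    # and applies the log-softmax itself.
)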
model
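
As a rough illustration of what the `Rearrange` and `nn.Linear` layers compute, here is a NumPy sketch on a toy batch. The shapes (2 images of size 1×3×3, 4 classes) and the random weights are made up for the example and are not the MNIST values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy batch: 2 images, 1 channel, 3x3 pixels (hypothetical sizes, not MNIST)
batch = rng.random((2, 1, 3, 3))

# Equivalent of Rearrange("b c h w -> b (c h w)"): one flat vector per image
flat = batch.reshape(batch.shape[0], -1)

# Equivalent of nn.Linear: one weight row and one bias per class
toy_n_classes = 4
W = rng.normal(size=(toy_n_classes, flat.shape[1]))
b = np.zeros(toy_n_classes)
logits = flat @ W.T + b

print(flat.shape, logits.shape)
```

Each row of `W` has as many entries as there are pixels in an image, so it can later be reshaped back into an image and displayed.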

**Define the loss.**

In [None]:
loss_func = nn.CrossEntropyLoss()
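
`nn.CrossEntropyLoss` takes unnormalized scores (logits) and integer class labels. With its default settings it applies a log-softmax and averages the negative log-probability of the true class over the batch. A NumPy sketch of that computation, with made-up logits and labels:

```python
import numpy as np

def cross_entropy_from_logits(logits, targets):
    """Mean negative log-likelihood of the target classes under softmax(logits)."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets].mean()

# Hypothetical scores for 2 samples and 3 classes
toy_logits = np.array([[2.0, 0.5, -1.0], [0.0, 0.0, 0.0]])
toy_targets = np.array([0, 2])
toy_loss = cross_entropy_from_logits(toy_logits, toy_targets)
print(f"{toy_loss:.4f}")
```

When all scores are equal, the loss of a sample is the logarithm of the number of classes, which is a useful reference value at the start of training.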

**Parameters for the optimization.**

In [None]:
# Hyper-parameters
n_epochs = 5
batch_size = 100
learning_rate = 0.001

# batch loader
train_loader = torch.utils.data.DataLoader(
    dataset=train_dataset, batch_size=batch_size, shuffle=True
)

test_loader = torch.utils.data.DataLoader(
    dataset=test_dataset, batch_size=batch_size, shuffle=False
)

n_batches = len(train_loader)
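
The length of a `DataLoader` is the number of batches per epoch. With the default `drop_last=False`, this is the dataset size divided by the batch size, rounded up. A small sketch with a made-up dataset size (not the MNIST one):

```python
import math

toy_n_samples = 1050  # hypothetical dataset size, not the MNIST value
toy_batch_size = 100  # same batch size as in the cell above

# With drop_last=False, the last batch may hold fewer samples than the others
toy_n_batches = math.ceil(toy_n_samples / toy_batch_size)
last_batch_size = toy_n_samples - (toy_n_batches - 1) * toy_batch_size
print(toy_n_batches, last_batch_size)
```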

In [None]:
# Optimizer
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
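
To give an idea of what `optimizer.step()` does, here is a simplified NumPy sketch of a single Adam update (no weight decay, no AMSGrad). The default values of `beta1`, `beta2` and `eps` match the PyTorch defaults; the parameter and gradient values are made up.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One simplified Adam update; returns the new parameter and moment estimates."""
    m = beta1 * m + (1 - beta1) * grad       # running mean of the gradients
    v = beta2 * v + (1 - beta2) * grad ** 2  # running mean of the squared gradients
    m_hat = m / (1 - beta1 ** t)             # bias correction (t starts at 1)
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

toy_param = np.array([1.0, -2.0])
toy_grad = np.array([0.5, -4.0])  # hypothetical gradient
new_param, m, v = adam_step(toy_param, toy_grad, np.zeros(2), np.zeros(2), t=1)
print(new_param)
```

On the first step, each coordinate moves by roughly `lr` in the direction opposite to its gradient, whatever the gradient's magnitude.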

### Train the model

In [None]:
for k_epoch in range(n_epochs):
    for k_batch, (images, y_true) in enumerate(train_loader):

        # Forward pass
        y_pred = model(images)
        loss = loss_func(y_pred, y_true)

        # Backward and optimize
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if (k_batch + 1) % 100 == 0:
            msg = f"Epoch [{k_epoch+1}/{n_epochs}], Batch [{k_batch + 1}/{n_batches}], Loss: {loss.item():.4f}"
            print(msg, end="\r")

### Compute the test error

In [None]:
# In test phase, we don't need to compute gradients (for memory efficiency)
with torch.no_grad():
    correct = 0
    total = 0
    for images, y_true in test_loader:
        scores = model(images)
        y_pred = scores.argmax(dim=1)  # predicted class for each test instance

        total += y_true.size(0)
        correct += (y_pred == y_true).sum().item()

    print(f"Accuracy of the network on the {total} test images: {100 * correct / total:.2f} %")
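
The evaluation loop above boils down to taking the highest-scoring class for each sample and counting how often it matches the label. A NumPy sketch on made-up scores and labels; the test error is then one minus the accuracy:

```python
import numpy as np

# Hypothetical scores for 4 test samples and 3 classes
toy_scores = np.array([
    [0.1, 2.0, -1.0],
    [1.5, 0.2, 0.3],
    [0.0, 0.1, 0.9],
    [0.4, 0.3, 0.2],
])
toy_labels = np.array([1, 0, 1, 0])

toy_pred = toy_scores.argmax(axis=1)        # predicted class per sample
accuracy = (toy_pred == toy_labels).mean()  # fraction of correct predictions
test_error = 1 - accuracy
print(toy_pred, accuracy, test_error)
```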

### Display the learned weights

In [None]:
linear_layer = model[1]
weights = asnumpy(rearrange(linear_layer.weight, "c (h w) -> c h w", h=28, w=28))

In [None]:
scale = np.abs(weights).max()
plt.figure(figsize=(10, 5))

for i in range(10):  # 0-9
    coef_plot = plt.subplot(2, 5, i + 1)  # 2x5 plot

    coef_plot.imshow(
        weights[i], cmap=plt.cm.RdBu, vmin=-scale, vmax=scale, interpolation="bilinear"
    )

    coef_plot.set_xticks(())
    coef_plot.set_yticks(())  # remove ticks
    coef_plot.set_xlabel(f"Class {i}")

plt.suptitle("Coefficients for various classes");

<div class="alert alert-success" role="alert">
    <p><b>Question</b></p>
    <p>How do you interpret these weight maps?</p>
</div>

## Feed-forward neural network

<div class="alert alert-success" role="alert">
    <p><b>Question</b></p>
    <p>Describe a simple neural network.</p>
</div>
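
One possible starting point, as a sketch only: a feed-forward network with a single hidden layer applies an affine map, a nonlinearity (ReLU here), and a second affine map that produces one score per class. The layer sizes and random weights below are arbitrary choices for illustration, not values from this notebook.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_forward(x, W1, b1, W2, b2):
    """Forward pass of a one-hidden-layer network; returns raw class scores."""
    hidden = np.maximum(0.0, x @ W1.T + b1)  # affine map followed by ReLU
    return hidden @ W2.T + b2                # output layer: one logit per class

toy_in, toy_hidden, toy_out = 9, 5, 4  # hypothetical layer sizes
W1 = rng.normal(size=(toy_hidden, toy_in))
b1 = np.zeros(toy_hidden)
W2 = rng.normal(size=(toy_out, toy_hidden))
b2 = np.zeros(toy_out)

x = rng.random((2, toy_in))  # batch of 2 flattened inputs
toy_output = mlp_forward(x, W1, b1, W2, b2)
print(toy_output.shape)
```

In PyTorch, the same structure can be written by stacking `nn.Linear`, `nn.ReLU` and `nn.Linear` inside an `nn.Sequential`, after the `Rearrange` layer that flattens the images.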

<div class="alert alert-success" role="alert">
    <p><b>Question</b></p>
    <p>Is this model better or worse than the logistic regression? Try different hidden sizes.</p>
</div>

<div class="alert alert-success" role="alert">
    <p><b>Question</b></p>
    <p>Which error do you report on this task?</p>
</div>