# **MNIST-1D**: Observing deep double descent

This notebook was adapted by Gabriel V. Cardoso from the [deep double descent notebook in the MNIST-1D GitHub repository](https://github.com/greydanus/mnist1d/blob/master/notebooks/deep-double-descent.ipynb).

The goal of this notebook is to reproduce the deep double descent experiment on MNIST-1D with two network families: an MLP and a CNN.

In [1]:
import matplotlib.pyplot as plt
import numpy as np

import torch
import torch.nn as nn
from torch.utils.data import TensorDataset, DataLoader
from tqdm import tqdm
from typing import Dict, Tuple

In [2]:
from mnist1d.data import make_dataset, get_dataset_args

args = get_dataset_args()
args.num_samples = 16_000
args.train_split = 0.25

data = make_dataset(args)

print(data['x'].shape)
print(data['x_test'].shape)

(4000, 40)
(12000, 40)


In [3]:
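# Add label noise: resample about 15% of the training labels uniformly at random
# (a resampled label may coincide with the original one)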

import copy
data_with_label_noise = copy.deepcopy(data)

for i in range(len(data['y'])):
    if np.random.random_sample() < 0.15:
        data_with_label_noise['y'][i] = np.random.randint(0, 10)
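
Because the replacement label is drawn uniformly from all 10 classes, it equals the original label about one time in ten, so the fraction of labels that actually change is expected to be closer to 0.15 × 0.9 = 13.5% than to 15%. The sketch below is a vectorized, seeded variant of the loop above that measures this on synthetic labels. The helper name `add_label_noise` and its arguments are illustrative assumptions, not part of the original notebook.

```python
import numpy as np


def add_label_noise(y, noise_rate=0.15, n_classes=10, seed=0):
    """Resample a `noise_rate` fraction of labels uniformly at random.

    Hypothetical helper: a seeded, vectorized variant of the loop above.
    """
    rng = np.random.default_rng(seed)
    y_noisy = np.array(y, copy=True)
    # Boolean mask of the positions whose label gets resampled.
    resample = rng.random(len(y_noisy)) < noise_rate
    # New labels are uniform over all classes, so some match the old label.
    y_noisy[resample] = rng.integers(0, n_classes, size=int(resample.sum()))
    return y_noisy


# Synthetic labels standing in for data['y'] (assumption for illustration).
y = np.random.default_rng(1).integers(0, 10, size=4000)
y_noisy = add_label_noise(y)
print(f"fraction of labels changed: {(y_noisy != y).mean():.3f}")
```

Fixing the seed makes the noisy labels reproducible across runs, which helps when comparing models of different sizes on the same corrupted training set.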

In [4]:
# Define MLP architecture with two hidden layers of width n_hidden

def get_model(n_hidden):
    return nn.Sequential(
        nn.Linear(40, n_hidden),
        nn.ReLU(),
        nn.Linear(n_hidden, n_hidden),
        nn.ReLU(),
        nn.Linear(n_hidden, 10)
    )
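
Double descent curves are often plotted against model size, so it can help to know how the number of trainable parameters grows with `n_hidden`. The sketch below counts the weights and biases of the three linear layers above by hand; the function name `mlp_param_count` is an illustrative assumption. The same number should be obtainable from a model instance with `sum(p.numel() for p in model.parameters())`.

```python
def mlp_param_count(n_hidden, n_in=40, n_out=10):
    """Number of trainable parameters of the MLP returned by get_model.

    Hypothetical helper: each nn.Linear(a, b) has a * b weights and b biases.
    """
    first = n_in * n_hidden + n_hidden        # Linear(40, n_hidden)
    second = n_hidden * n_hidden + n_hidden   # Linear(n_hidden, n_hidden)
    last = n_hidden * n_out + n_out           # Linear(n_hidden, 10)
    return first + second + last


for n_hidden in (10, 50, 100, 200):
    print(n_hidden, mlp_param_count(n_hidden))
```

With the default input and output sizes this simplifies to `n_hidden**2 + 52 * n_hidden + 10`, so the count grows quadratically for large widths.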

## Question 1:

Write a `fit_model` function according to the docstring defined below.

In [None]:
def fit_model(
    model: torch.nn.Module,
    data: Dict[str, np.ndarray],
    n_epoch: int,
    optimizer_params: Dict[str, float],
    return_last_only: bool = True,
    ) -> Tuple[torch.Tensor, torch.Tensor]:
    """
    Train `model` on the training split and return its train and test error.

    Args:
        model (torch.nn.Module): The model used for classification.
        data (Dict[str, np.ndarray]): A dictionary with keys ["x", "y", "x_test", "y_test"].
        n_epoch (int): Number of training epochs.
        optimizer_params (Dict[str, float]): Parameters for the optimizer, such as ["lr", "momentum", ...].
        return_last_only (bool): If True, return only the errors after the last epoch;
            otherwise return the errors measured after every epoch.

    Returns:
        Tuple[torch.Tensor, torch.Tensor]: Train and test error of the trained model.
    """
    return None
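
For orientation, here is a minimal sketch of one way such a function could look. It is not the reference solution: the name `fit_model_sketch`, the use of `torch.optim.SGD` with cross-entropy loss, the `batch_size` argument, and reporting the error as a fraction in [0, 1] are all assumptions made for illustration. It also assumes `n_epoch >= 1` and inputs of shape `(N, 40)` as expected by the MLP.

```python
from typing import Dict, Tuple

import numpy as np
import torch
import torch.nn as nn
from torch.utils.data import TensorDataset, DataLoader


def fit_model_sketch(
    model: torch.nn.Module,
    data: Dict[str, np.ndarray],
    n_epoch: int,
    optimizer_params: Dict[str, float],
    return_last_only: bool = True,
    batch_size: int = 100,  # assumption: not part of the original signature
) -> Tuple[torch.Tensor, torch.Tensor]:
    """One possible fit loop: SGD on cross-entropy, error as a fraction."""
    x_train = torch.as_tensor(data["x"], dtype=torch.float32)
    y_train = torch.as_tensor(data["y"], dtype=torch.long)
    x_test = torch.as_tensor(data["x_test"], dtype=torch.float32)
    y_test = torch.as_tensor(data["y_test"], dtype=torch.long)

    loader = DataLoader(
        TensorDataset(x_train, y_train), batch_size=batch_size, shuffle=True
    )
    optimizer = torch.optim.SGD(model.parameters(), **optimizer_params)
    loss_fn = nn.CrossEntropyLoss()

    def error_rate(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        # Fraction of misclassified samples, evaluated without gradients.
        model.eval()
        with torch.no_grad():
            predictions = model(x).argmax(dim=1)
        return (predictions != y).float().mean()

    train_errors, test_errors = [], []
    for _ in range(n_epoch):
        model.train()
        for x_batch, y_batch in loader:
            optimizer.zero_grad()
            loss_fn(model(x_batch), y_batch).backward()
            optimizer.step()
        # Record both errors after every epoch.
        train_errors.append(error_rate(x_train, y_train))
        test_errors.append(error_rate(x_test, y_test))

    train_curve = torch.stack(train_errors)
    test_curve = torch.stack(test_errors)
    if return_last_only:
        return train_curve[-1], test_curve[-1]
    return train_curve, test_curve
```

Recording the error after every epoch costs two extra forward passes per epoch, but it makes it possible to see when the train error first reaches zero rather than only its final value.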

## Question 2:

Test your implementation and its parameterization: running it with an MLP of hidden size 100 should give you 0 train error. Make sure this is the case, and that 0 train error is reached before the end of training rather than only at the last epoch.
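
One way to check the second condition is to record the train error after every epoch and look for the first epoch at which it hits zero. The helper below is a small sketch of that check; the name `first_zero_epoch` is an illustrative assumption, and it expects a per-epoch sequence of train errors such as a list or NumPy array.

```python
import numpy as np


def first_zero_epoch(train_errors):
    """Return the index of the first epoch with zero train error, or None.

    Hypothetical helper: `train_errors` holds one error value per epoch.
    """
    errors = np.asarray(train_errors, dtype=float)
    zero_epochs = np.flatnonzero(errors == 0.0)
    return int(zero_epochs[0]) if zero_epochs.size else None


# Made-up error curve, for illustration only.
example_curve = [0.40, 0.10, 0.0, 0.0]
epoch = first_zero_epoch(example_curve)
print(epoch, len(example_curve) - 1)
```

If the returned index is `None` or equal to the last epoch, the run probably needs more epochs or a different learning rate before it can be trusted as an interpolating model.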

## Question 3:

Investigate double descent for the MLP class, on both the clean data and the data with label noise.
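
A possible way to organize this experiment is to loop over a range of hidden sizes for each dataset and collect the final train and test errors. The sketch below only shows that bookkeeping: `sweep_widths` is a hypothetical helper, and `build_model` and `fit` are placeholders for `get_model` and for a call to your own `fit_model` with fixed `n_epoch` and `optimizer_params`.

```python
def sweep_widths(widths, build_model, fit, datasets):
    """Train one fresh model per width and dataset, and collect the errors.

    Hypothetical helper. `build_model(width)` returns a new model and
    `fit(model, data)` returns a (train_error, test_error) pair.
    """
    results = {}
    for name, data in datasets.items():
        # A new model is built for every width so that runs stay independent.
        results[name] = [fit(build_model(width), data) for width in widths]
    return results


# Stub callables, used only to show the shape of the result.
demo = sweep_widths(
    widths=[2, 4, 8],
    build_model=lambda width: width,
    fit=lambda model, data: (1.0 / model, 2.0 / model),
    datasets={"clean": None, "noisy": None},
)
print(demo["clean"])
```

In the notebook, `datasets` could be `{"clean": data, "noisy": data_with_label_noise}`, and plotting the test errors against the width for both entries shows whether label noise changes the shape of the curve.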

## Question 4:

Do the same for the CNN defined below.

In [11]:
# Define CNN architecture

def get_model_cnn(channels):
    return nn.Sequential(
        nn.Conv1d(1, channels, 5, stride=2, padding=1),
        nn.ReLU(),
        nn.Conv1d(channels, channels, 3, stride=2, padding=1),
        nn.ReLU(),
        nn.Conv1d(channels, channels, 3, stride=2, padding=1),
        nn.ReLU(),
        nn.Flatten(),
        nn.Linear(channels * 5, 10),
    )
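
The factor 5 in `nn.Linear(channels * 5, 10)` is the sequence length left after the three strided convolutions on an input of length 40. The sketch below checks this with the standard output-length formula for a 1D convolution without dilation; the helper name `conv1d_out_length` is an illustrative assumption. Note also that `nn.Conv1d` expects input of shape `(batch, channels, length)`, so the `(N, 40)` arrays need a channel axis, for example `x[:, None, :]`, before being passed to this model.

```python
def conv1d_out_length(length, kernel_size, stride, padding):
    """Output length of a 1D convolution with dilation 1 (hypothetical helper)."""
    return (length + 2 * padding - kernel_size) // stride + 1


length = 40  # MNIST-1D sequence length
lengths = []
# (kernel_size, stride, padding) of the three Conv1d layers above.
for kernel_size, stride, padding in [(5, 2, 1), (3, 2, 1), (3, 2, 1)]:
    length = conv1d_out_length(length, kernel_size, stride, padding)
    lengths.append(length)

print(lengths)  # prints [19, 10, 5]
```

Since the final length is 5, the flattened feature vector has `channels * 5` entries, which matches the input size of the last linear layer.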

## Question 5:

Write a short report about your findings and the impact of the different factors.