This notebook (approximately) replicates Figure 4a of the paper "*Amplified Early Stopping Bias: Overestimated Performance with Deep Learning*" as an example. Specifically, we train a multi-layer perceptron on random Gaussian vectors for multiple network depths, training sample sizes, and input feature sizes.

Run the cell below if you are using Google Colab (it will probably also work on other cloud services).

In [1]:
!git clone https://github.com/NonaRjb/DeepOverestimation.git
%cd DeepOverestimation
%pwd

fatal: destination path 'DeepOverestimation' already exists and is not an empty directory.
/content/DeepOverestimation


'/content/DeepOverestimation'

In [2]:
# Install the required pip packages into the current Jupyter kernel
import sys
!{sys.executable} -m pip install mat73
!{sys.executable} -m pip install mne
!{sys.executable} -m pip install torchmetrics

Collecting torchmetrics
  Downloading torchmetrics-1.5.1-py3-none-any.whl.metadata (20 kB)
Collecting lightning-utilities>=0.8.0 (from torchmetrics)
  Downloading lightning_utilities-0.11.8-py3-none-any.whl.metadata (5.2 kB)
Downloading torchmetrics-1.5.1-py3-none-any.whl (890 kB)
Downloading lightning_utilities-0.11.8-py3-none-any.whl (26 kB)
Installing collected packages: lightning-utilities, torchmetrics
Successfully installed lightning-utilities-0.11.8 torchmetrics-1.5.1


## The original values
In the original experiment that produced the numbers shown in Fig. 4a, we passed the following values to the Python program:

1. **D (input feature size):** d_array = (4 8 16 32 64 128 256 512)
2. **N (number of training samples):** n_array = (50 100 200 400 800 1600 3200)
3. **L (network depth):** l_array = (1 4 32 64)
4. **H (network width):** 16
5. **O (optimizer):** AdamW

We also used a batch size of 16 and a learning rate of 0.0001, and trained for at most 500 epochs.
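To make the mapping between these settings and the training script concrete, here is a minimal sketch that assembles the command line for one configuration. The flag names are copied from the command used in the run cell of this notebook; which flag corresponds to which setting (for example `-b` for the batch size) is inferred from the values passed and is not confirmed against the script's documentation. `build_command` is a hypothetical helper and is not part of the repository. Flags that the text does not describe (such as `--n_test`, `-r`, and `--seed`) are left out, and whether the script runs without them has not been verified.

```python
import shlex

# Hyperparameters stated above for the original experiment.
BATCH_SIZE = 16
LEARNING_RATE = 0.0001
MAX_EPOCHS = 500
HIDDEN_SIZE = 16  # H (network width)
OPTIMIZER = "adamw"


def build_command(l, n, d):
    """Return the argument list for one (L, N, D) configuration.

    Hypothetical helper: flag names come from the command in this notebook,
    and their meanings are inferred from the values passed to them.
    """
    return [
        "python3", "train_random.py",
        "-b", str(BATCH_SIZE),
        "--lr", str(LEARNING_RATE),
        "--epochs", str(MAX_EPOCHS),
        "--hidden_size", str(HIDDEN_SIZE),
        "-l", str(l),
        "-n", str(n),
        "-d", str(d),
        "--optim", OPTIMIZER,
    ]


command = build_command(l=1, n=50, d=8)
print(shlex.join(command))
```

Keeping the arguments as a list (rather than one formatted string) makes it straightforward to pass them to `subprocess.run` when working outside a notebook.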

## What we run here
Running every combination of the values listed above takes a long time, so we use smaller lists here to test the code. Specifically, we run the code with:
1. **D:** d_array = (8 16 32 64)
2. **N:** n_array = (50 100 200 400 800)
3. **L:** l_array = (1 32)
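To get a rough sense of how much the reduced lists shrink the sweep, the sketch below counts the (L, N, D) combinations in each grid. It only counts invocations of the training script; it says nothing about wall-clock time, since larger values of N, D, and L also make each run slower. The dictionary layout and the `count_runs` helper are illustrative names, not part of the repository.

```python
import itertools

# Value lists from "The original values" section.
full_grid = {
    "d": [4, 8, 16, 32, 64, 128, 256, 512],
    "n": [50, 100, 200, 400, 800, 1600, 3200],
    "l": [1, 4, 32, 64],
}

# Reduced lists used in this notebook.
reduced_grid = {
    "d": [8, 16, 32, 64],
    "n": [50, 100, 200, 400, 800],
    "l": [1, 32],
}


def count_runs(grid):
    """Count the (l, n, d) combinations, i.e. training-script invocations."""
    return sum(1 for _ in itertools.product(grid["l"], grid["n"], grid["d"]))


print(count_runs(full_grid), count_runs(reduced_grid))  # prints: 224 40
```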

In [None]:
import itertools

# Reduced lists of values to sweep over
list_l = [1, 32]  # network depths (L)
list_n = [50, 100, 200, 400, 800]  # numbers of training samples (N)
list_d = [8, 16, 32, 64]  # input feature sizes (D)

# Iterate over all combinations of l, n, and d
for l, n, d in itertools.product(list_l, list_n, list_d):
    print(f"Running command with l={l}, n={n}, d={d}")

    # Build the shell command for this configuration and run it with IPython's `!`
    command = f"python3 train_random.py -b 16 --lr 0.0001 --epochs 500 --hidden_size 16 -l {l} -n {n} --n_test 5000 -d {d} -r 2 --optim adamw --seed 42 --save_path ./out/ --experiment N1000_dnl"
    print(f"Running: {command}")
    !{command}

Running command with l=1, n=50, d=8
Running: python3 train_random.py -b 16 --lr 0.0001 --epochs 500 --hidden_size 16 -l 1 -n 50 --n_test 5000 -d 8 -r 2 --optim adamw --seed 42 --save_path ./out/ --experiment N1000_dnl
device:  cuda
best epoch = 499
Train Loss = 0.6434607803821564, Train ROC-AUC = 0.75
Val Loss = 0.6659126281738281, Val ROC-AUC = 0.5
Test Loss = 0.7630192161368105, Test ROC-AUC = 0.498751699924469

best epoch = 499
Train Loss = 0.6517908573150635, Train ROC-AUC = 0.6761133670806885
Val Loss = 0.694635272026062, Val ROC-AUC = 0.6666666865348816
Test Loss = 0.7294797962085127, Test ROC-AUC = 0.49832502007484436

best epoch = 499
Train Loss = 0.6461367905139923, Train ROC-AUC = 0.591093122959137
Val Loss = 0.6334357261657715, Val ROC-AUC = 0.6666666865348816
Test Loss = 0.7095257243790185, Test ROC-AUC = 0.5004978775978088

best epoch = 494
Train Loss = 0.6313289105892181, Train ROC-AUC = 0.612500011920929
Val Loss = 0.6464171409606934, Val ROC-AUC = 0.8333333730697632
Tes