# A1_4 – Regularisation

In this notebook we compare different regularisation techniques applied to our neuralnet_torch model.

More specifically, we compare:

- L1/L2 Regularisation
- Dropout Regularisation

For each:

- We experiment with different parameters
- We present the results of the evaluation

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import sys, os
import joblib

base = os.path.dirname(os.getcwd())  
sys.path.append(os.path.join(base, "models"))
sys.path.append(os.path.join(base, "utils"))

from sklearn.metrics import mean_squared_error, mean_absolute_error

from utils import predict_batch, mape, evaluate_regression
from NeuralNet import NeuralNet                      # manual BP implementation
from mlr_sklearn import MultipleLinearRegressionSK   # simple MLR wrapper
from neuralnet_torch import NeuralNetTorch           # PyTorch implementation

In [2]:
# Load preprocessed data from ./data
X_trainval_np = np.load("../data/X_trainval_np.npy")
X_test_np     = np.load("../data/X_test_np.npy")

y_trainval = np.load("../data/y_trainval.npy")
y_test     = np.load("../data/y_test.npy")

y_trainval_scaled = np.load("../data/y_trainval_scaled.npy")
y_test_scaled     = np.load("../data/y_test_scaled.npy")

x_scaler = joblib.load("../data/x_scaler.joblib")
y_scaler = joblib.load("../data/y_scaler.joblib")

n_features = X_trainval_np.shape[1]
print("Loaded preprocessed data from ../data")
print("X_trainval_np:", X_trainval_np.shape)
print("X_test_np    :", X_test_np.shape)
print("n_features   :", n_features)

Loaded preprocessed data from ../data
X_trainval_np: (1200, 61)
X_test_np    : (300, 61)
n_features   : 61


In [3]:
# Selected configuration for manual BP: hyperparameters copied from notebook 2

hidden_layers_bp = [40, 15]
epochs_bp = 600
lr_bp = 0.005
momentum_bp = 0.9
activation_bp = "tanh"

print("Manual BP selected configuration:")
print("Hidden layers :", hidden_layers_bp)
print("Epochs        :", epochs_bp)
print("Learning rate :", lr_bp)
print("Momentum      :", momentum_bp)
print("Activation    :", activation_bp)


Manual BP selected configuration:
Hidden layers : [40, 15]
Epochs        : 600
Learning rate : 0.005
Momentum      : 0.9
Activation    : tanh


In [4]:
# PyTorch Neural Network 

hidden_layers_torch = hidden_layers_bp
layers_torch = [n_features] + hidden_layers_torch + [1]

results = []   
configs = [
    {"name": "none",    "reg_type": None, "dropout": 0.0, "lambda_reg": 0.0},
    {"name": "L1",      "reg_type": "L1", "dropout": 0.0, "lambda_reg": 1e-5},
    {"name": "L2",      "reg_type": "L2", "dropout": 0.0, "lambda_reg": 1e-4},
    {"name": "dropout", "reg_type": "Dropout", "dropout": 0.3, "lambda_reg": 0.0},
]

for cfg in configs:    
    print(f"\n=== Training with {cfg['name']} ===")

    net_torch = NeuralNetTorch(
        n=layers_torch,
        fact=activation_bp,       # activation
        eta=lr_bp,                # learning rate
        alpha=momentum_bp,        # momentum
        epochs=epochs_bp,         # number of epochs
        val_split=0.2,            # validation split
        dropout=cfg['dropout'],   # dropout
    )
    
    # Train with scaled data
    net_torch.fit(X_trainval_np, y_trainval_scaled, cfg['reg_type'], cfg['lambda_reg'])

    # Loss history for later plots
    train_err_torch, val_err_torch = net_torch.loss_epochs()
    
    # Predictions in scaled space
    y_trainval_pred_torch_scaled = net_torch.predict(X_trainval_np).reshape(-1, 1)
    y_test_pred_torch_scaled     = net_torch.predict(X_test_np).reshape(-1, 1)
    
    # Back to original target scale (cnt_log)
    y_trainval_pred_torch = y_scaler.inverse_transform(y_trainval_pred_torch_scaled).ravel()
    y_test_pred_torch     = y_scaler.inverse_transform(y_test_pred_torch_scaled).ravel()
    
    # Metrics (in original cnt_log scale)
    metrics_trainval = evaluate_regression(y_trainval, y_trainval_pred_torch)
    metrics_test     = evaluate_regression(y_test,     y_test_pred_torch)

    print("=== PyTorch Neural Network (same config) ===")
    print("TRAIN+VAL:", metrics_trainval)
    print("TEST     :", metrics_test)

    # Store results
    # Add TRAIN+VAL row
    results.append({
        "Regularisation": cfg['reg_type'],
        "Split": "Train+Val",
        "MSE":  metrics_trainval["MSE"],
        "MAE":  metrics_trainval["MAE"],
        "MAPE": metrics_trainval["MAPE"],
    })

    # Add TEST row
    results.append({
        "Regularisation": cfg['reg_type'],
        "Split": "Test",
        "MSE":  metrics_test["MSE"],
        "MAE":  metrics_test["MAE"],
        "MAPE": metrics_test["MAPE"],
    })



=== Training with none ===
NeuralNetTorch (PyTorch) initialized
 - Layers: [61, 40, 15, 1]
 - Activation: tanh
 - Learning rate: 0.005 | Momentum: 0.9
 - Epochs: 600 | Val split: 0.2
Epoch 0: Train MSE=1.026734 | Val MSE=1.062613
Epoch 100: Train MSE=0.096878 | Val MSE=0.089626
Epoch 200: Train MSE=0.081147 | Val MSE=0.079150
Epoch 300: Train MSE=0.069471 | Val MSE=0.072786
Epoch 400: Train MSE=0.059091 | Val MSE=0.066860
Epoch 500: Train MSE=0.049878 | Val MSE=0.061230
=== PyTorch Neural Network (same config) ===
TRAIN+VAL: {'MSE': 0.08867711487377777, 'MAE': 0.20712834657094129, 'MAPE': 7.391251864780278}
TEST     : {'MSE': 0.11917111798456015, 'MAE': 0.22467764350878033, 'MAPE': 9.080963105963232}

=== Training with L1 ===
NeuralNetTorch (PyTorch) initialized
 - Layers: [61, 40, 15, 1]
 - Activation: tanh
 - Learning rate: 0.005 | Momentum: 0.9
 - Epochs: 600 | Val split: 0.2
Epoch 0: Train MSE=1.060467 | Val MSE=1.094181
Epoch 100: Train MSE=0.098581 | Val MSE=0.090045
Epoch 200: 
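
The metric dictionaries printed above come from `evaluate_regression` in `utils`, whose source is not shown in this notebook. As a rough guide to what the three keys mean, here is a minimal NumPy sketch. The function name `evaluate_regression_sketch` is hypothetical, and the assumption that MAPE is reported in percent is inferred from the magnitude of the printed values, not confirmed by the source.

```python
import numpy as np

def evaluate_regression_sketch(y_true, y_pred):
    """Hypothetical stand-in for utils.evaluate_regression (assumed behaviour)."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    err = y_true - y_pred
    return {
        "MSE": float(np.mean(err ** 2)),                       # mean squared error
        "MAE": float(np.mean(np.abs(err))),                    # mean absolute error
        "MAPE": float(np.mean(np.abs(err / y_true)) * 100.0),  # percent; needs y_true != 0
    }

# Tiny illustrative input (not data from this notebook)
demo = evaluate_regression_sketch([2.0, 4.0, 5.0], [1.0, 4.0, 6.0])
print(demo)
```

MAPE divides by the true value, so this sketch is only meaningful when the target never equals zero.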

In [5]:

# Comparison tables: TRAIN+VAL and TEST metrics
df_results = pd.DataFrame(results)

print("=== Evaluation metrics w/ Regularisation ===")
display(df_results)




=== Evaluation metrics w/ Regularisation ===


|   | Regularisation | Split | MSE | MAE | MAPE |
|---|---|---|---|---|---|
| 0 | None | Train+Val | 0.088677 | 0.207128 | 7.391252 |
| 1 | None | Test | 0.119171 | 0.224678 | 9.080963 |
| 2 | L1 | Train+Val | 0.082692 | 0.195285 | 7.200499 |
| 3 | L1 | Test | 0.122859 | 0.220391 | 9.409757 |
| 4 | L2 | Train+Val | 0.09019 | 0.19069 | 7.289969 |
| 5 | L2 | Test | 0.137705 | 0.219934 | 9.633988 |
| 6 | Dropout | Train+Val | 0.144926 | 0.236235 | 9.44481 |
| 7 | Dropout | Test | 0.169812 | 0.240575 | 11.057191 |
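
To make the test-set comparison easier to read, the snippet below expresses each regulariser's test MSE as a percentage change against the unregularised run. The values are copied by hand from the table above; the variable names are illustrative and not part of the notebook.

```python
# Test MSE values copied from the results table above
test_mse = {"none": 0.119171, "L1": 0.122859, "L2": 0.137705, "Dropout": 0.169812}

baseline = test_mse["none"]

# Percentage change in test MSE relative to the unregularised baseline
rel_change = {
    name: 100.0 * (mse - baseline) / baseline
    for name, mse in test_mse.items()
    if name != "none"
}

for name, pct in rel_change.items():
    print(f"{name:8s} test MSE change vs. baseline: {pct:+.1f}%")
```

A positive value means the regularised model did worse on the test set than the baseline for this single run.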


## Summary 
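
- L1/L2: added as a penalty on the weights in the loss function.
- Dropout: added as a layer in the model architecture.

With the settings tried here, no regulariser lowered the test MSE relative to the unregularised baseline (0.119): L1 gave 0.123, L2 0.138 and Dropout 0.170. L1 and L2 did slightly reduce the test MAE (0.220 in both cases, against 0.225).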
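
The internals of `NeuralNetTorch` are not shown in this notebook, so the following is only a minimal NumPy sketch of the two mechanisms named in the summary, not the model's actual code. The helper names `penalised_mse` and `dropout_forward` are hypothetical. It assumes the penalty is summed over all weight matrices without a 1/2 factor and that dropout uses the common "inverted" scaling.

```python
import numpy as np

def penalised_mse(y_true, y_pred, weights, reg_type=None, lambda_reg=0.0):
    """MSE plus an optional L1 or L2 penalty summed over all weight matrices."""
    mse = np.mean((y_true - y_pred) ** 2)
    if reg_type == "L1":
        penalty = sum(np.abs(W).sum() for W in weights)   # sum of |w|
    elif reg_type == "L2":
        penalty = sum((W ** 2).sum() for W in weights)    # sum of w^2
    else:
        penalty = 0.0
    return mse + lambda_reg * penalty

def dropout_forward(h, p, rng, training=True):
    """Inverted dropout: zero each activation with probability p while training
    and rescale the survivors by 1/(1-p); identity at evaluation time."""
    if not training or p == 0.0:
        return h
    mask = rng.random(h.shape) >= p
    return h * mask / (1.0 - p)

rng = np.random.default_rng(0)

# Random weights with the layer shapes used above: [61, 40, 15, 1]
weights = [rng.normal(size=(61, 40)), rng.normal(size=(40, 15)), rng.normal(size=(15, 1))]
y_true = rng.normal(size=100)
y_pred = y_true + 0.1 * rng.normal(size=100)

base = penalised_mse(y_true, y_pred, weights)
l1 = penalised_mse(y_true, y_pred, weights, "L1", 1e-5)
l2 = penalised_mse(y_true, y_pred, weights, "L2", 1e-4)

h = np.ones((4, 15))
h_train = dropout_forward(h, 0.3, rng, training=True)   # some units zeroed
h_eval = dropout_forward(h, 0.3, rng, training=False)   # unchanged
```

The penalty changes only the training objective, while dropout changes the forward pass and must be switched off for prediction.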