<p style="font-family: 'Courier New', Courier, monospace; font-size: 30px; font-weight: bold; color: blue;  text-align: left;">
Neural Networks (NN) - MLP Modeling
</p>


## Purpose and evaluation protocol

This notebook builds a time-aware multilayer perceptron (MLP) pipeline for path loss prediction. Training and test data are kept strictly separated in time so that no information from the test window leaks into training or model selection.

Key principles:
- Train/test split is time-ordered (train window precedes test window).
- Model selection is done via time-aware cross-validation on the training window only.
- Feature scaling is fit on the training window and applied to validation/test.
- The held-out test window is used once for final reporting.
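
The first and third principles above can be sketched on synthetic data. This is a minimal illustration, not the notebook's actual preparation step: the 80/20 cut, the array names, and the helper `time_ordered_split` are assumptions made for the example. The manual standardization uses the population standard deviation, which is also what `StandardScaler` uses.

```python
import numpy as np

def time_ordered_split(timestamps, train_fraction=0.8):
    """Return (train_idx, test_idx) so that every training row precedes every test row."""
    order = np.argsort(timestamps, kind="stable")
    cut = int(len(order) * train_fraction)
    return order[:cut], order[cut:]

rng = np.random.default_rng(0)
timestamps = rng.permutation(1000)                  # hypothetical, unsorted time stamps
X = rng.normal(loc=5.0, scale=3.0, size=(1000, 4))  # hypothetical feature matrix

train_idx, test_idx = time_ordered_split(timestamps)

# Fit the scaling statistics on the training window only ...
mean = X[train_idx].mean(axis=0)
std = X[train_idx].std(axis=0)

# ... and reuse them unchanged for the test window.
X_train_scaled = (X[train_idx] - mean) / std
X_test_scaled = (X[test_idx] - mean) / std

# The latest training time stamp is earlier than the earliest test time stamp.
print(timestamps[train_idx].max() < timestamps[test_idx].min())
```

Because the test window is scaled with training statistics, its scaled features will generally not have exactly zero mean and unit variance. That is expected and is what keeps test information out of preprocessing.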

Metrics reported: MSE, MAE, RMSE, R2, MAPE, Median AE.
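
For reference, the six metrics can be written out in plain NumPy. This is a sketch for illustration only: the notebook itself uses Keras and scikit-learn for these values, the helper name `regression_metrics` is hypothetical, and the MAPE line assumes the targets are never zero.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Compute the six reported regression metrics from true and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mse = np.mean(err ** 2)
    ss_res = np.sum(err ** 2)                        # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return {
        "MSE": mse,
        "MAE": np.mean(np.abs(err)),
        "RMSE": np.sqrt(mse),
        "R2": 1.0 - ss_res / ss_tot,
        "MAPE (%)": 100.0 * np.mean(np.abs(err) / np.abs(y_true)),  # assumes y_true != 0
        "Median AE": np.median(np.abs(err)),
    }

# Toy values, not taken from the dataset.
metrics = regression_metrics([100.0, 110.0, 120.0, 130.0],
                             [102.0, 108.0, 121.0, 126.0])
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```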


<p style="font-family: 'Courier New', Courier, monospace; font-size: 30px; font-weight: bold; color: blue;  text-align: left;">
Libraries and Reproducibility
</p>


In [3]:
# Libraries for data manipulation, visualization, and modeling

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

import tensorflow as tf
from keras.models import Sequential
from keras.layers import (Dense, Input, BatchNormalization, Dropout, LeakyReLU)
from keras.regularizers import l2
from keras.callbacks import EarlyStopping, ModelCheckpoint, ReduceLROnPlateau

from sklearn.metrics import (mean_squared_error, r2_score, mean_absolute_percentage_error, median_absolute_error)
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import PredefinedSplit
import os  # Import the os module

# Set seed for reproducibility
GLOBAL_SEED = 50
np.random.seed(GLOBAL_SEED)                      # Seed for NumPy
tf.random.set_seed(GLOBAL_SEED)                  # Seed for TensorFlow



<p style="font-family: 'Courier New', Courier, monospace; font-size: 30px; font-weight: bold; color: blue;  text-align: left;">
Data and Preprocessing
</p>


In [4]:
# Time-aware data load (from Data Preparation.ipynb outputs)

base_path = '../../Comprehensive ML - Files & Plots etc'

df_train = pd.read_csv(f"{base_path}/train.csv", parse_dates=['time'])
df_test  = pd.read_csv(f"{base_path}/test.csv", parse_dates=['time'])
fold_assignments = np.load(f"{base_path}/train_folds.npy")
fold_assignments = np.asarray(fold_assignments).ravel()

if len(fold_assignments) != len(df_train):
    raise ValueError("fold_assignments length does not match df_train rows.")

# Feature columns used as model inputs
feature_columns = [
    'distance',
    'frequency',
    'c_walls',
    'w_walls',
    'co2',
    'humidity',
    'pm25',
    'pressure',
    'temperature',
    'snr'
]

target_column = 'PL'  # Target column

# Check for missing columns in train/test
missing_train = set(feature_columns + [target_column]) - set(df_train.columns)
missing_test  = set(feature_columns + [target_column]) - set(df_test.columns)
if missing_train:
    raise ValueError(f"Missing columns in train.csv: {missing_train}")
if missing_test:
    raise ValueError(f"Missing columns in test.csv: {missing_test}")

# Extract features (X) and target (y)
X_train = df_train[feature_columns].to_numpy()
X_test  = df_test[feature_columns].to_numpy()
y_train = df_train[target_column].to_numpy()
y_test  = df_test[target_column].to_numpy()

# Feature scaling: fit on the training window only, then reuse the same statistics for the test window
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled  = scaler.transform(X_test)

# Time-aware folds
time_train = df_train['time'].to_numpy()
time_test  = df_test['time'].to_numpy()
ps = PredefinedSplit(fold_assignments)

# Hold out the most recent training fold for final validation
fold_ids = np.unique(fold_assignments)
fold_ids = fold_ids[fold_ids >= 0]
if fold_ids.size == 0:
    raise ValueError("fold_assignments must contain at least one non-negative fold id.")
val_fold = fold_ids.max()
val_mask = fold_assignments == val_fold

X_train_time = X_train_scaled[~val_mask]
y_train_time = y_train[~val_mask]
X_val_time = X_train_scaled[val_mask]
y_val_time = y_train[val_mask]

# Aliases to keep downstream code consistent
X_train_all_scaled = X_train_scaled
X_test_all_scaled  = X_test_scaled
PL_train_all = y_train
PL_test_all  = y_test

print(f"Train: {len(df_train)} rows, Test: {len(df_test)} rows")
print(f"Train window: {df_train.time.min()} -> {df_train.time.max()}")
print(f"Test window:  {df_test.time.min()} -> {df_test.time.max()}")
print(f"Validation fold: {val_fold} (rows: {val_mask.sum()})")
print("Time-based split + feature scaling completed...")

Train: 1663627 rows, Test: 415907 rows
Train window: 2024-10-01 00:01:07.420593+00:00 -> 2025-08-12 17:18:53.293125+00:00
Test window:  2025-08-12 17:19:02.126782+00:00 -> 2025-09-30 23:59:55.971870+00:00
Validation fold: 4 (rows: 277271)
Time-based split + feature scaling completed...
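
The cell above treats the largest fold id as the most recent block of the training window. That only holds if fold ids increase with time, which depends on how `train_folds.npy` was built in the data preparation notebook (not shown here). A small sanity check, sketched under that assumption with a hypothetical helper `folds_are_time_ordered`:

```python
import numpy as np

def folds_are_time_ordered(times, fold_ids):
    """True if each fold covers a time block that starts after the previous fold id ends."""
    times = np.asarray(times)
    fold_ids = np.asarray(fold_ids)
    previous_end = None
    for fold in np.unique(fold_ids[fold_ids >= 0]):   # np.unique returns sorted ids
        block = times[fold_ids == fold]
        if previous_end is not None and block.min() <= previous_end:
            return False
        previous_end = block.max()
    return True

times = np.arange(12)                    # toy time stamps
contiguous = np.repeat([0, 1, 2], 4)     # 0,0,0,0,1,1,1,1,2,2,2,2
interleaved = np.tile([0, 1, 2], 4)      # 0,1,2,0,1,2,...

print(folds_are_time_ordered(times, contiguous))
print(folds_are_time_ordered(times, interleaved))
```

In the notebook this would be called as `folds_are_time_ordered(time_train, fold_assignments)` before `val_fold` is chosen.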


<p style="font-family: 'Courier New', Courier, monospace; font-size: 30px; font-weight: bold; color: blue;  text-align: left;">
Model Definition
</p>


In [5]:
# Flexible model creation function

def create_ann_model(layer_units, input_dim, 
                     l2_reg=0.001, 
                     dropout_rate=0.3, 
                     negative_slope=0.1):
    '''
    Creates an ANN model for regression with configurable architecture and 
    hyperparameters like L2 regularization, dropout, and LeakyReLU slope.
    
    Arguments:
        layer_units    : list of integers (e.g., [64, 32]) specifying 
                         the number of neurons in each hidden layer
        input_dim      : int, dimension of the input layer
        l2_reg         : float, L2 regularization factor
        dropout_rate   : float, dropout rate
        negative_slope : float, negative slope for LeakyReLU
    Returns:
        model          : Compiled Keras Sequential model
    '''
    model = Sequential()

    # Explicit input layer
    model.add(Input(shape=(input_dim,)))  

    # Add hidden layers based on the list of units
    for units in layer_units:
        model.add(Dense(units, kernel_regularizer=l2(l2_reg)))
        model.add(LeakyReLU(negative_slope=negative_slope))
        model.add(BatchNormalization())
        model.add(Dropout(dropout_rate))
    
    # Output layer for regression
    model.add(Dense(1, activation='linear'))

    # Compile the model
    model.compile(optimizer='adam', loss='mean_squared_error', metrics=['mae'])
    
    return model
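
To get a feel for how small these networks are, the parameter count of the stack built by `create_ann_model` can be worked out by hand. The sketch below assumes Keras defaults: each `Dense` layer has weights plus biases, each `BatchNormalization` layer has two trainable vectors (gamma, beta) and two non-trainable ones (moving mean, moving variance), and `LeakyReLU` and `Dropout` have no parameters. The helper `count_mlp_parameters` is hypothetical, and the result should be checked against `model.summary()`.

```python
def count_mlp_parameters(layer_units, input_dim):
    """Return (total, trainable) parameter counts for Dense -> LeakyReLU -> BatchNorm -> Dropout blocks."""
    total = 0
    trainable = 0
    fan_in = input_dim
    for units in layer_units:
        dense = fan_in * units + units   # weights + biases
        bn_trainable = 2 * units         # gamma and beta
        bn_frozen = 2 * units            # moving mean and moving variance
        total += dense + bn_trainable + bn_frozen
        trainable += dense + bn_trainable
        fan_in = units
    output = fan_in + 1                  # single linear output unit
    return total + output, trainable + output

# Ten input features, as in feature_columns.
print(count_mlp_parameters([8], input_dim=10))      # (129, 113)
print(count_mlp_parameters([5, 4], input_dim=10))   # (120, 102)
```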

<p style="font-family: 'Courier New', Courier, monospace; font-size: 30px; font-weight: bold; color: blue;  text-align: left;">
Architecture Grid
</p>


In [6]:
# Define candidate architectures
architectures = [
    # 1-Layer Models
    {'name': 'A1', 'units': [1]}, 
    {'name': 'A2', 'units': [2]}, 
    {'name': 'A3', 'units': [3]}, 
    {'name': 'A4', 'units': [4]}, 
    {'name': 'A5', 'units': [5]},
    {'name': 'A6', 'units': [6]},
    {'name': 'A7', 'units': [7]},
    {'name': 'A8', 'units': [8]},
    {'name': 'A9', 'units': [9]},
    
    # 2-Layer Models
    {'name': 'B1', 'units': [1, 1]},     # Total units: 2
    {'name': 'B2', 'units': [1, 2]},     # Total units: 3
    {'name': 'B3', 'units': [1, 3]},     # Total units: 4
    {'name': 'B4', 'units': [1, 4]},     # Total units: 5
    {'name': 'B5', 'units': [1, 5]},     # Total units: 6
    {'name': 'B6', 'units': [1, 6]},     # Total units: 7
    {'name': 'B7', 'units': [1, 7]},     # Total units: 8
    {'name': 'B8', 'units': [1, 8]},     # Total units: 9

    {'name': 'C1', 'units': [2, 1]},     # Total units: 3
    {'name': 'C2', 'units': [2, 2]},     # Total units: 4
    {'name': 'C3', 'units': [2, 3]},     # Total units: 5
    {'name': 'C4', 'units': [2, 4]},     # Total units: 6
    {'name': 'C5', 'units': [2, 5]},     # Total units: 7
    {'name': 'C6', 'units': [2, 6]},     # Total units: 8
    {'name': 'C7', 'units': [2, 7]},     # Total units: 9

    {'name': 'D1', 'units': [3, 1]},     # Total units: 4
    {'name': 'D2', 'units': [3, 2]},     # Total units: 5
    {'name': 'D3', 'units': [3, 3]},     # Total units: 6
    {'name': 'D4', 'units': [3, 4]},     # Total units: 7
    {'name': 'D5', 'units': [3, 5]},     # Total units: 8
    {'name': 'D6', 'units': [3, 6]},     # Total units: 9

    {'name': 'E1', 'units': [4, 1]},     # Total units: 5
    {'name': 'E2', 'units': [4, 2]},     # Total units: 6
    {'name': 'E3', 'units': [4, 3]},     # Total units: 7
    {'name': 'E4', 'units': [4, 4]},     # Total units: 8
    {'name': 'E5', 'units': [4, 5]},     # Total units: 9

    {'name': 'F1', 'units': [5, 1]},     # Total units: 6
    {'name': 'F2', 'units': [5, 2]},     # Total units: 7
    {'name': 'F3', 'units': [5, 3]},     # Total units: 8
    {'name': 'F4', 'units': [5, 4]},     # Total units: 9

    {'name': 'G1', 'units': [6, 1]},     # Total units: 7
    {'name': 'G2', 'units': [6, 2]},     # Total units: 8
    {'name': 'G3', 'units': [6, 3]},     # Total units: 9

    {'name': 'H1', 'units': [7, 1]},     # Total units: 8
    {'name': 'H2', 'units': [7, 2]},     # Total units: 9

    {'name': 'I1', 'units': [8, 1]},     # Total units: 9
]
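
The grid above follows a regular pattern: every single-layer network with 1 to 9 units, plus every two-layer network whose unit counts sum to at most 9. As a sketch, the same list can be generated programmatically. The helper `build_architectures` and its `max_total_units` parameter are hypothetical names introduced here, not part of the notebook.

```python
from string import ascii_uppercase

def build_architectures(max_total_units=9):
    """Generate the one- and two-layer grid with at most `max_total_units` hidden units in total."""
    # One-layer models: A1 ... A9
    archs = [{'name': f'A{u}', 'units': [u]} for u in range(1, max_total_units + 1)]
    # Two-layer models: the letter encodes the first layer (1 -> B, 2 -> C, ...),
    # the number encodes the second layer.
    for first in range(1, max_total_units):
        letter = ascii_uppercase[first]
        for second in range(1, max_total_units - first + 1):
            archs.append({'name': f'{letter}{second}', 'units': [first, second]})
    return archs

generated = build_architectures()
print(len(generated))   # 45
print(generated[38])    # {'name': 'F4', 'units': [5, 4]}
```

Generating the list this way keeps names and unit counts in sync if the unit budget is changed later.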

<p style="font-family: 'Courier New', Courier, monospace; font-size: 30px; font-weight: bold; color: blue;  text-align: left;">
Time-Aware Cross-Validation (Model Selection)
</p>


In [None]:
# Time-aware cross-validation for each architecture.
# PredefinedSplit holds out each fold id in turn and trains on all remaining folds.
kfold_results = []

n_splits = ps.get_n_splits()

for arch in architectures:
    print(f"Performing {n_splits}-Fold CV for Architecture: {arch['name']}")
    fold_metrics = []
    
    for fold_num, (train_idx, val_idx) in enumerate(ps.split(X_train_all_scaled), start=1):
        print(f"  Fold {fold_num}/{n_splits}...")

        # Split data
        X_train_fold, X_val_fold = X_train_all_scaled[train_idx], X_train_all_scaled[val_idx]
        y_train_fold, y_val_fold = PL_train_all[train_idx], PL_train_all[val_idx]
        
        # Build model
        model_cv = create_ann_model(
            layer_units=arch['units'], 
            input_dim=X_train_all_scaled.shape[1],
            l2_reg=0.001,
            dropout_rate=0.3,
            negative_slope=0.1
        )
        
        # Define callbacks
        early_stop_cv = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)
        reduce_lr_cv = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5, min_lr=1e-6, verbose=0)
        
        # Train
        model_cv.fit(
            X_train_fold,
            y_train_fold,
            validation_data=(X_val_fold, y_val_fold),
            epochs=100,
            batch_size=128,
            verbose=0,
            callbacks=[early_stop_cv, reduce_lr_cv]
        )
        
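        # Evaluate on the validation fold.
        # Note: the loss returned by evaluate() includes the L2 kernel penalty,
        # so 'Val MSE' is slightly larger than the pure MSE (Val RMSE squared).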
        val_loss_cv, val_mae_cv = model_cv.evaluate(X_val_fold, y_val_fold, verbose=0)
        val_pred = model_cv.predict(X_val_fold).flatten()
        
        # Compute metrics
        rmse_cv = np.sqrt(mean_squared_error(y_val_fold, val_pred))
        r2_cv = r2_score(y_val_fold, val_pred)
        mape_cv = mean_absolute_percentage_error(y_val_fold, val_pred) * 100
        median_ae_cv = median_absolute_error(y_val_fold, val_pred)
        
        fold_metrics.append({
            'Fold': fold_num,
            'Val MSE': val_loss_cv,
            'Val MAE': val_mae_cv,
            'Val RMSE': rmse_cv,
            'R2 Score': r2_cv,
            'Val MAPE (%)': mape_cv,
            'Val Median AE': median_ae_cv
        })
        
        print(f" Fold {fold_num} Metrics - Val MSE: {val_loss_cv:.4f} | Val RMSE: {rmse_cv:.4f} | R2: {r2_cv:.4f} | MAPE: {mape_cv:.2f}%")
    
    # After CV loop, summarize
    arch_cv_df = pd.DataFrame(fold_metrics)
    arch_cv_mean = arch_cv_df.mean(numeric_only=True)
    arch_cv_std = arch_cv_df.std(numeric_only=True)
    
    kfold_results.append({
        'Architecture': arch['name'],
        'Hidden Layers': str(arch['units']),
        'Mean Val MSE': arch_cv_mean['Val MSE'],
        'Std Val MSE': arch_cv_std['Val MSE'],
        'Mean Val MAE': arch_cv_mean['Val MAE'],
        'Std Val MAE': arch_cv_std['Val MAE'],
        'Mean Val RMSE': arch_cv_mean['Val RMSE'],
        'Std Val RMSE': arch_cv_std['Val RMSE'],
        'Mean R2': arch_cv_mean['R2 Score'],
        'Std R2': arch_cv_std['R2 Score'],
        'Mean Val MAPE (%)': arch_cv_mean['Val MAPE (%)'],
        'Std Val MAPE (%)': arch_cv_std['Val MAPE (%)'],
        'Mean Val MedAE': arch_cv_mean['Val Median AE'],
        'Std Val MedAE': arch_cv_std['Val Median AE']
    })

# Display aggregated CV results for all architectures
kfold_results_df = pd.DataFrame(kfold_results)
kfold_results_df_sorted = kfold_results_df.sort_values(by='Mean Val RMSE', ascending=True)
print("Time-aware CV summary (sorted by Mean Val RMSE):")
display(kfold_results_df_sorted)

# Save the results to a CSV file
kfold_results_df_sorted.to_csv('kfold_results_summary_1_.csv', index=False)
print("CV results saved to kfold_results_summary_1_.csv")

# Select the best architecture by CV
best_arch_name = kfold_results_df_sorted.iloc[0]['Architecture']
best_arch_units = next(a['units'] for a in architectures if a['name'] == best_arch_name)
print(f"Selected architecture: {best_arch_name} with layers {best_arch_units}")

Performing 5-Fold CV for Architecture: A1
  Fold 1/5...
17330/17330 ━━━━━━━━━━━━━━━━━━━━ 4s 232us/step
    Fold 1 Metrics - Val MSE: 90.4003 | Val RMSE: 9.5079 | R2: 0.7477 | MAPE: 8.89%
  Fold 2/5...
8665/8665 ━━━━━━━━━━━━━━━━━━━━ 2s 232us/step
    Fold 2 Metrics - Val MSE: 92.2137 | Val RMSE: 9.6027 | R2: 0.7433 | MAPE: 8.75%
  Fold 3/5...
8665/8665 ━━━━━━━━━━━━━━━━━━━━ 2s 235us/step
    Fold 3 Metrics - Val MSE: 88.5639 | Val RMSE: 9.4107 | R2: 0.7230 | MAPE: 7.90%
  Fold 4/5...
8665/8665 ━━━━━━━━━━━━━━━━━━━━ 2s 243us/step
    Fold 4 Metrics - Val MSE: 110.5229 | Val RMSE: 10.5129 | R2: 0.7239 | MAPE: 10.38%
  Fold 5/5...
8665/8665 ━━━━━━━━━━━━━━━━━━━━ 2s 248us/step
    Fold 5 Metrics - Val MSE: 77.1082 | Val RMSE: 8.7811 | R2: 0.7629 | MAPE: 8.18%
Performing 5-Fold CV for Architecture: A2
  Fold 1/5...
17330/17

| Index | Architecture | Hidden Layers | Mean Val MSE | Std Val MSE | Mean Val MAE | Std Val MAE | Mean Val RMSE | Std Val RMSE | Mean R2 | Std R2 | Mean Val MAPE (%) | Std Val MAPE (%) | Mean Val MedAE | Std Val MedAE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 7 | A8 | [8] | 48.497184 | 8.5217 | 5.139373 | 0.479855 | 6.94041 | 0.623192 | 0.861796 | 0.027476 | 6.08902 | 0.603354 | 4.037642 | 0.416148 |
| 8 | A9 | [9] | 49.247688 | 9.195287 | 5.220267 | 0.535948 | 6.989234 | 0.686046 | 0.860515 | 0.022977 | 6.199997 | 0.677894 | 4.127295 | 0.442025 |
| 6 | A7 | [7] | 49.304791 | 9.4567 | 5.21356 | 0.599309 | 6.991538 | 0.711412 | 0.86051 | 0.022644 | 6.179393 | 0.736659 | 4.134825 | 0.534327 |
| 5 | A6 | [6] | 50.880038 | 9.943422 | 5.256035 | 0.599111 | 7.102141 | 0.729547 | 0.855664 | 0.027172 | 6.216893 | 0.761939 | 4.161173 | 0.525506 |
| 38 | F4 | [5, 4] | 50.847356 | 5.941883 | 5.475363 | 0.359955 | 7.119593 | 0.421885 | 0.855344 | 0.017407 | 6.484675 | 0.478268 | 4.563373 | 0.470515 |
| 41 | G3 | [6, 3] | 53.027991 | 6.815009 | 5.603134 | 0.46907 | 7.26867 | 0.468287 | 0.849371 | 0.017191 | 6.629193 | 0.678768 | 4.689668 | 0.598437 |
| 3 | A4 | [4] | 54.223961 | 8.286845 | 5.482252 | 0.504179 | 7.344459 | 0.580798 | 0.846382 | 0.018329 | 6.544784 | 0.660726 | 4.366522 | 0.488977 |
| 37 | F3 | [5, 3] | 55.16692 | 7.571812 | 5.664406 | 0.497975 | 7.411614 | 0.523846 | 0.843425 | 0.018785 | 6.679468 | 0.687132 | 4.691504 | 0.588158 |
| 34 | E5 | [4, 5] | 55.657772 | 9.510686 | 5.686042 | 0.525953 | 7.436059 | 0.657549 | 0.841845 | 0.026518 | 6.719969 | 0.675875 | 4.661445 | 0.516061 |
| 43 | H2 | [7, 2] | 55.743166 | 8.396536 | 5.787663 | 0.582731 | 7.447414 | 0.569491 | 0.84231 | 0.015075 | 6.890792 | 0.776413 | 4.892592 | 0.693606 |


CV results saved to kfold_results_summary_1_.csv
Selected architecture: A8 with layers [8]
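
`PredefinedSplit` validates on each fold in turn and trains on all remaining folds, so for the early folds the training data includes later time blocks. A stricter alternative, which this notebook does not use, is forward chaining: validate on fold k and train only on folds that precede it. The sketch below assumes fold ids increase with time; the helper `expanding_window_splits` is hypothetical, and unlike `PredefinedSplit` it drops rows with fold id -1 instead of always training on them.

```python
import numpy as np

def expanding_window_splits(fold_ids):
    """Yield (train_idx, val_idx): validate on fold k, train only on folds with smaller ids."""
    fold_ids = np.asarray(fold_ids)
    ordered = np.unique(fold_ids[fold_ids >= 0])
    for k in ordered[1:]:                       # the first fold has no earlier data to train on
        train_idx = np.flatnonzero((fold_ids >= 0) & (fold_ids < k))
        val_idx = np.flatnonzero(fold_ids == k)
        yield train_idx, val_idx

# Toy fold layout: the first block is twice as long as the others.
toy_folds = np.repeat([0, 1, 2, 3, 4], [4, 2, 2, 2, 2])
splits = list(expanding_window_splits(toy_folds))

for train_idx, val_idx in splits:
    print(len(train_idx), "training rows -> validate on rows", val_idx.tolist())
```

With five folds this gives four splits instead of five, and the earliest splits train on less data, so the resulting scores are not directly comparable with the table above.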


<p style="font-family: 'Courier New', Courier, monospace; font-size: 30px; font-weight: bold; color: blue;  text-align: left;">
Final Training and Test Evaluation
</p>


In [8]:
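# Train the selected architecture on the training window, excluding the most recent
# fold, which serves as the validation set for early stopping and LR scheduling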
final_model = create_ann_model(
    layer_units=best_arch_units, 
    input_dim=X_train_all_scaled.shape[1],
    l2_reg=0.001,
    dropout_rate=0.3,
    negative_slope=0.1
)

early_stop = EarlyStopping(monitor='val_loss', patience=30, restore_best_weights=True)
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=10, min_lr=1e-6, verbose=0)

history_final = final_model.fit(
    X_train_time,
    y_train_time,
    validation_data=(X_val_time, y_val_time),
    epochs=500,
    batch_size=128,
    verbose=0,
    callbacks=[early_stop, reduce_lr]
)

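# Evaluate on the held-out test set.
# Note: the loss returned by evaluate() includes the L2 kernel penalty,
# so 'Test MSE' is slightly larger than Test RMSE squared.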
final_test_mse, final_test_mae = final_model.evaluate(X_test_all_scaled, PL_test_all, verbose=0)
final_pred = final_model.predict(X_test_all_scaled).flatten()

final_test_rmse = np.sqrt(mean_squared_error(PL_test_all, final_pred))
final_test_r2 = r2_score(PL_test_all, final_pred)
final_test_mape = mean_absolute_percentage_error(PL_test_all, final_pred) * 100
final_test_median_ae = median_absolute_error(PL_test_all, final_pred)

final_results_df = pd.DataFrame([
    {
        'Architecture': best_arch_name,
        'Hidden Layers': str(best_arch_units),
        'Test MSE': final_test_mse,
        'Test MAE': final_test_mae,
        'Test RMSE': final_test_rmse,
        'R2 Score': final_test_r2,
        'Test MAPE (%)': final_test_mape,
        'Test Median AE': final_test_median_ae
    }
])

print("Final model evaluation on held-out test set:")
display(final_results_df)

final_results_df.to_csv('final_model_results.csv', index=False)
print("Final results saved to final_model_results.csv")

12998/12998 ━━━━━━━━━━━━━━━━━━━━ 3s 247us/step
Final model evaluation on held-out test set:


| Index | Architecture | Hidden Layers | Test MSE | Test MAE | Test RMSE | R2 Score | Test MAPE (%) | Test Median AE |
|---|---|---|---|---|---|---|---|---|
| 0 | A8 | [8] | 52.303398 | 5.052016 | 7.230896 | 0.852511 | 5.830328 | 3.888727 |


Final results saved to final_model_results.csv
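
A quick consistency check on the reported numbers: the Test MSE comes from `final_model.evaluate()`, while the Test RMSE is computed from the predictions with scikit-learn, and squaring the RMSE does not reproduce the MSE exactly. The likely explanation (an inference, not something stated in the notebook) is that the Keras loss includes the L2 kernel penalty added by `kernel_regularizer`, so the value labeled MSE is the mean squared error plus a small regularization term. The arithmetic, using the two values from the results table:

```python
test_mse_reported = 52.303398    # 'Test MSE' column, from final_model.evaluate()
test_rmse_reported = 7.230896    # 'Test RMSE' column, from the predictions

mse_from_rmse = test_rmse_reported ** 2
gap = test_mse_reported - mse_from_rmse

print(f"RMSE squared:        {mse_from_rmse:.4f}")
print(f"Gap to reported MSE: {gap:.4f}")
```

If a penalty-free value is needed, `mean_squared_error(PL_test_all, final_pred)` gives the pure MSE directly.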


<p style="font-family: 'Courier New', Courier, monospace; font-size: 30px; font-weight: bold; color: blue;  text-align: left;">
Exploratory Test Sweep
</p>


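This sweep trains every architecture and evaluates it on the held-out test set. Use it only for exploratory plots (e.g., layer-by-layer comparisons), never for model selection: picking an architecture from these numbers would leak test information into the choice.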


In [9]:
RUN_TEST_SWEEP = False  # Set True if you need per-architecture test metrics for plots

if RUN_TEST_SWEEP:
    model_results = []
    arch_histories = {}
    arch_predictions = {}

    early_stop = EarlyStopping(monitor='val_loss', patience=30, restore_best_weights=True)
    reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=10, min_lr=1e-6, verbose=0)

    for arch in architectures:
        print(f"Training Architecture: {arch['name']} with layers {arch['units']}")
        
        model = create_ann_model(
            layer_units=arch['units'], 
            input_dim=X_train_all_scaled.shape[1],
            l2_reg=0.001,
            dropout_rate=0.3,
            negative_slope=0.1
        )
        
        history = model.fit(
            X_train_time,
            y_train_time,
            validation_data=(X_val_time, y_val_time),
            epochs=500,
            batch_size=128,
            verbose=0,
            callbacks=[early_stop, reduce_lr]
        )
        
        arch_histories[arch['name']] = history
        
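        # Evaluate on the full training window (including the validation fold) and on the test window.
        # The returned losses include the L2 kernel penalty, so they sit slightly above the pure MSE.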
        train_loss, train_mae = model.evaluate(X_train_all_scaled, PL_train_all, verbose=0)
        test_loss, test_mae = model.evaluate(X_test_all_scaled, PL_test_all, verbose=0)
        
        PL_pred = model.predict(X_test_all_scaled).flatten()
        arch_predictions[arch['name']] = PL_pred
        
        rmse_test = np.sqrt(mean_squared_error(PL_test_all, PL_pred))
        r2_test = r2_score(PL_test_all, PL_pred)
        mape_test = mean_absolute_percentage_error(PL_test_all, PL_pred) * 100
        median_ae_test = median_absolute_error(PL_test_all, PL_pred)
        
        model_results.append({
            'Architecture': arch['name'],
            'Hidden Layers': str(arch['units']),
            'Train MSE': train_loss,
            'Train MAE': train_mae,
            'Test MSE': test_loss,
            'Test MAE': test_mae,
            'Test RMSE': rmse_test,
            'R2 Score': r2_test,
            'Test MAPE (%)': mape_test,
            'Test Median AE': median_ae_test
        })
        
        print(f"Completed {arch['name']} -> Test MSE: {test_loss:.4f}, Test MAE: {test_mae:.4f}, R2: {r2_test:.4f}")

    model_results_df = pd.DataFrame(model_results)
    print("All architectures trained. Here is the summary:")
    display(model_results_df)

    model_results_df.to_csv('model_results_summary_1_.csv', index=False)
    print("Results saved to model_results_summary_1_.csv")