># **Multi-Layer Perceptron for Regression using Keras**
>
>
>The dataframe used in this notebook originates from the preprocessing steps performed in the `"1_4b-preprocessing-feature-engineering-and-preprocessing-for-predictive-models.ipynb"` notebook, along with some additional steps taken directly before running the Gradient Boosting Regressor (feature-target split, Winsorization to deal with outliers, logarithmic transformation of the electric range variable, and train-test split).
>The final refinement of the selected variables is carried out here to meet the specific requirements of the models being developed, based on insights from the aforementioned notebook.

In [None]:
# Importing Required Libraries
import numpy as np
import matplotlib.pyplot as plt
import json
import joblib
import os
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential, load_model
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

  from .autonotebook import tqdm as notebook_tqdm


In [2]:
# Load datasets
x_train, x_test, y_train, y_test = joblib.load("train_test_split.pkl")

print("Train-test split loaded successfully!")

Train-test split loaded successfully!


In [None]:
# Paths for Saving MLP Model and Training History
CHECKPOINT_DIR = "mlp_checkpoints"
BEST_MODEL_PATH = os.path.join(CHECKPOINT_DIR, "best_model.keras")
FINAL_MODEL_PATH = "mlp_model_final.keras"
HISTORY_PATH = "training_history.json"
os.makedirs(CHECKPOINT_DIR, exist_ok = True)

In [None]:
# Normalize inputs
scaler = StandardScaler()
x_train_scaled = scaler.fit_transform(x_train)
x_test_scaled = scaler.transform(x_test)

In [None]:
# MLP Model – Definition, Training, and Evaluation

# Define a lightweight MLP
model = Sequential([
    Dense(64, activation = "relu", input_shape = (x_train_scaled.shape[1],)),
    Dropout(0.1),
    Dense(32, activation = "relu"),
    Dropout(0.1),
    Dense(1) 
])

# Compile the model
model.compile(optimizer = "adam", loss = "mse", metrics = ["mae"])

# Early stopping: prevent overfitting, save memory/time
early_stop = [EarlyStopping(monitor = "val_loss", patience = 10, restore_best_weights = True),
    ModelCheckpoint(
        filepath = BEST_MODEL_PATH,
        monitor = "val_loss",
        save_best_only = True,
        save_weights_only = False,
        verbose = 1
    )]

# Train with smaller batch size to reduce memory load
history = model.fit(
    x_train_scaled, y_train,
    validation_split = 0.2,
    epochs = 100,
    batch_size = 32,
    callbacks = [early_stop],
    verbose = 1
)

# Save model
model.save(FINAL_MODEL_PATH)
with open(HISTORY_PATH, "w") as f:
    json.dump(history.history, f)

# Predict and evaluate
y_pred = model.predict(x_test_scaled).flatten()

print(f"MAE: {mean_absolute_error(y_test, y_pred):.2f}")
print(f"RMSE: {mean_squared_error(y_test, y_pred, squared = False):.2f}")
print(f"R²: {r2_score(y_test, y_pred):.2f}")

# Plot loss curves
plt.figure(figsize = (10, 5))
plt.plot(history.history["loss"], label = "Train Loss")
plt.plot(history.history["val_loss"], label = "Validation Loss")
plt.title("Loss Curve (MSE)")
plt.xlabel("Epoch")
plt.ylabel("Loss")
plt.legend()
plt.grid(True)
plt.show()


Epoch 1/100
[1m473627/473654[0m [32m━━━━━━━━━━━━━━━━━━━[0m[37m━[0m [1m0s[0m 2ms/step - loss: 98.0964 - mae: 2.4328
Epoch 1: val_loss improved from inf to 43.70820, saving model to mlp_checkpoints\best_model.keras
[1m473654/473654[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1004s[0m 2ms/step - loss: 98.0953 - mae: 2.4328 - val_loss: 43.7082 - val_mae: 1.1444
Epoch 2/100
[1m473630/473654[0m [32m━━━━━━━━━━━━━━━━━━━[0m[37m━[0m [1m0s[0m 2ms/step - loss: 65.3117 - mae: 2.3942
Epoch 2: val_loss improved from 43.70820 to 37.42362, saving model to mlp_checkpoints\best_model.keras
[1m473654/473654[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m985s[0m 2ms/step - loss: 65.3117 - mae: 2.3942 - val_loss: 37.4236 - val_mae: 1.0258
Epoch 3/100
[1m473645/473654[0m [32m━━━━━━━━━━━━━━━━━━━[0m[37m━[0m [1m0s[0m 2ms/step - loss: 62.1824 - mae: 2.3430
Epoch 3: val_loss improved from 37.42362 to 34.93877, saving model to mlp_checkpoints\best_model.keras
[1m473654/473654[0m [

TypeError: got an unexpected keyword argument 'squared'

>### Training Interruption and Resumption
>
> This notebook originally contained several repeated training runs due to kernel interruptions and runtime errors (e.g., `unexpected keyword argument 'squared'`). To ensure reproducibility and clarity, all intermediate training cells have been removed.
>
>The training process was **interrupted after epoch 32** due to a runtime error related to an unexpected keyword argument (`squared`) in a metric or loss function. To prevent loss of progress, **checkpoints were used**, and the model was later **resumed from the last valid saved state**.
>
>The continuation of training — starting from **epoch 92** — is shown in the following cell. This was done by reloading the model and its training history from JSON and checkpoint files.
>
>All configurations for the model architecture (activation functions, dropout, optimizers, etc.) remain exactly as originally defined in the first cell. Only the training loop was resumed.

In [None]:
# Load training history from JSON file to resume training
with open("training_history.json", "r") as f:
    history_data = json.load(f)

last_epoch = len(history_data["loss"])

In [None]:
# Continue Training from Checkpoint – MLP Model

# Paths
CHECKPOINT_DIR = "mlp_checkpoints"
BEST_MODEL_PATH = os.path.join(CHECKPOINT_DIR, "best_model.keras")
HISTORY_PATH = "training_history.json"

# Reload saved model
model = load_model(BEST_MODEL_PATH)

# Load saved training history
with open(HISTORY_PATH, "r") as f:
    history_data = json.load(f)

# Determine the last completed epoch
last_epoch = len(history_data["loss"])

# Redefine callbacks
early_stop = [
    EarlyStopping(monitor = "val_loss", patience = 10, restore_best_weights = True),
    ModelCheckpoint(
        filepath = BEST_MODEL_PATH,
        monitor = "val_loss",
        save_best_only = True,
        save_weights_only = False,
        verbose = 1
    )
]

# Resume training
history_new = model.fit(
    x_train_scaled, y_train,
    validation_split = 0.2,
    epochs = 100,                   
    initial_epoch = last_epoch,    
    batch_size = 2,
    callbacks = early_stop,
    verbose = 1
)

# Update complete training history
for key in history_new.history:
    history_data[key].extend(history_new.history[key])

with open(HISTORY_PATH, "w") as f:
    json.dump(history_data, f)

print(f"Training resumed from epoch {last_epoch}. History updated and saved.")


Epoch 93/100
[1m473626/473654[0m [32m━━━━━━━━━━━━━━━━━━━[0m[37m━[0m [1m0s[0m 1ms/step - loss: 49.2141 - mae: 2.3370
Epoch 93: val_loss improved from inf to 25.40499, saving model to mlp_checkpoints\best_model.keras
[1m473654/473654[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m763s[0m 2ms/step - loss: 49.2141 - mae: 2.3370 - val_loss: 25.4050 - val_mae: 0.7681
Epoch 94/100
[1m473640/473654[0m [32m━━━━━━━━━━━━━━━━━━━[0m[37m━[0m [1m0s[0m 2ms/step - loss: 49.7111 - mae: 2.3445
Epoch 94: val_loss improved from 25.40499 to 24.89479, saving model to mlp_checkpoints\best_model.keras
[1m473654/473654[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m838s[0m 2ms/step - loss: 49.7111 - mae: 2.3445 - val_loss: 24.8948 - val_mae: 1.0835
Epoch 95/100
[1m473618/473654[0m [32m━━━━━━━━━━━━━━━━━━━[0m[37m━[0m [1m0s[0m 2ms/step - loss: 49.5739 - mae: 2.3419
Epoch 95: val_loss improved from 24.89479 to 24.01681, saving model to mlp_checkpoints\best_model.keras
[1m473654/473654[

In [None]:
# Evaluate Model Performance on Training Set

# Make predictions on the training set
y_pred_train = model.predict(x_train_scaled).flatten()

# Evaluate performance on the training set
r2_train = r2_score(y_train, y_pred_train)
mae_train = mean_absolute_error(y_train, y_pred_train)
mse_train = mean_squared_error(y_train, y_pred_train)
rmse_train = np.sqrt(mse_train)
mean_y_train = np.mean(y_train)

# Display metrics
print("=== Train Set ===")
print(f"R² Score: {r2_train:.4f}")
print(f"MAE: {mae_train:.4f}")
print(f"MSE: {mse_train:.4f}")
print(f"RMSE: {rmse_train:.4f}")
print(f"RMSE (% of mean): {100 * rmse_train / mean_y_train:.2f}%")
print(f"Within 5% threshold? {'Yes' if rmse_train / mean_y_train < 0.05 else 'No'}")


[1m592068/592068[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m393s[0m 662us/step
=== Train Set ===
R² Score: -0.0776
MAE: 45.6552
MSE: 3076.3042
RMSE: 55.4644
RMSE (% of mean): 340.48%
Within 5% threshold? No


In [None]:
# Evaluate Model Performance on Test Set

# Make predictions on the test set
y_pred = model.predict(x_test_scaled).flatten()

# Evaluate performance on the test set
r2 = r2_score(y_test, y_pred)
mae = mean_absolute_error(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)
mean_y = np.mean(y_test)

# Display metrics
print(f"R² Score: {r2:.4f}")
print(f"MAE: {mae:.4f}")
print(f"MSE: {mse:.4f}")
print(f"RMSE: {rmse:.4f}")
print(f"RMSE (% of mean): {100 * rmse / mean_y:.2f}%")
print(f"Within 5% threshold? {'Yes' if rmse / mean_y < 0.05 else 'No'}")

[1m148017/148017[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m107s[0m 722us/step
R² Score: -0.0779
MAE: 45.6559
MSE: 3077.1706
RMSE: 55.4723
RMSE (% of mean): 340.45%
Within 5% threshold? No


### Analysis of the Results

**Train Set Evaluation:**  
- **R² Score:** -0.0776  
- **Mean Absolute Error:** 45.6552  
- **Mean Squared Error:** 3076.3042  
- **Root Mean Squared Error:** 55.4644  
- **RMSE as % of mean:** 340.48%  
- **Within 5% threshold?** No

**Test Set Evaluation:**  
- **R² Score:** -0.0779  
- **Mean Absolute Error:** 45.6559  
- **Mean Squared Error:** 3077.1706  
- **Root Mean Squared Error:** 55.4723  
- **RMSE as % of mean:** 340.45%  
- **Within 5% threshold?** No

**Additional Insight:**  
- **Mean Electric Energy Consumption (excluding zeros):** 182.3966  

The deep learning model, despite its theoretical capacity to capture complex patterns, **failed to produce meaningful results** in this phase. The R² values are negative for both training and test sets, indicating that the model performs **worse than a constant prediction of the mean**. The RMSE exceeds 340% of the target mean, which points to **large deviations** between predictions and actual values.

Moreover, the MAE values are high and practically identical across both sets, suggesting that the model is not overfitting — it simply **did not learn relevant patterns from the data**.

These results suggest that either the model architecture is unsuitable, the data preprocessing was not compatible with deep learning requirements, or the model was **not adequately trained or optimized** for this task. Further work would be required to investigate:

- Network depth and layer design  
- Normalization and feature scaling strategies  
- Loss function suitability and optimizer settings  
- Number of training epochs and batch size  
- Volume and structure of the training data

At this stage, however, the deep learning model serves more as a **control case** than a viable solution. It underscores the importance of careful tuning and preprocessing when working with neural networks and highlights the robustness of tree-based models like HistGradientBoosting and LightGBM for this particular dataset.
