# Enhanced REDf (Recurrent Energy Demand forecasting) Model Implementation

This notebook implements an enhanced version of the REDf model for energy demand forecasting. The implementation uses a hybrid CNN-LSTM architecture with a Temporal Pattern Attention (TPA) mechanism and tunes hyperparameters with a grid search. The model processes hourly energy consumption data from four regions (AEP, COMED, DAYTON, PJME) to predict the demand of the next hour.

## Import Required Libraries

In [2]:
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import LSTM, Dense, Dropout, Input, Conv1D, MaxPooling1D, Multiply, GlobalAveragePooling1D
import tensorflow as tf
import os
import requests
import warnings
from tensorflow.keras.callbacks import EarlyStopping

warnings.filterwarnings("ignore")

# Set random seed for reproducibility
np.random.seed(42)
tf.random.set_seed(42)


## Data Download and Preparation

The following cell downloads the required energy consumption datasets from GitHub, skipping any file that already exists locally.

In [3]:

os.makedirs("data", exist_ok=True)
dataset_urls = {
    "AEP": "https://raw.githubusercontent.com/panambY/Hourly_Energy_Consumption/refs/heads/master/data/AEP_hourly.csv",
    "COMED": "https://raw.githubusercontent.com/panambY/Hourly_Energy_Consumption/refs/heads/master/data/COMED_hourly.csv",
    "DAYTON": "https://raw.githubusercontent.com/panambY/Hourly_Energy_Consumption/refs/heads/master/data/DAYTON_hourly.csv",
    "PJME": "https://raw.githubusercontent.com/panambY/Hourly_Energy_Consumption/refs/heads/master/data/PJME_hourly.csv",
}

dataset_paths = {}
for name, url in dataset_urls.items():
    file_path = f"data/{name}_hourly.csv"
    dataset_paths[name] = file_path
    if not os.path.exists(file_path):
        print(f"Downloading {name} dataset...")
        try:
            response = requests.get(url, timeout=30)
            response.raise_for_status()
            # Write the raw bytes so line endings are not altered
            with open(file_path, "wb") as f:
                f.write(response.content)
            print(f"{name} dataset downloaded successfully.")
        except Exception as e:
            print(f"Error downloading {name} dataset: {e}")
    else:
        print(f"{name} dataset already exists.")

AEP dataset already exists.
COMED dataset already exists.
DAYTON dataset already exists.
PJME dataset already exists.


## Algorithm 1: Data Pre-processing

Transform raw data into pre-processed data suitable for model training and evaluation.

### Steps:
- **Input**: Raw data `Draw`
- **Output**: Pre-processed data `Dpre`
- **Procedure**:
   1. Load the raw data into the program: `Draw ← load_data()`
   2. Check for missing values and handle them accordingly: `Dpre ← handle_missing_values(Draw)`
   3. Check for outliers and handle them accordingly: `Dpre ← handle_outliers(Dpre)`
   4. Normalize the data: `Dpre ← normalize(Dpre)`
   5. Divide the data into training and testing sets: `(Dtrain, Dtest) ← split_data(Dpre)`
   6. Return the pre-processed data: `return Dpre`

In this notebook, `preprocess_data` covers loading, outlier handling, and normalization. The split (step 5) is performed later, in the main workflow.

In [None]:
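def preprocess_data(file_path):
    """Load one hourly CSV and return (scaled_data, scaler, energy_column)."""
    data = pd.read_csv(file_path)

    # Parse timestamps, index by them, and sort in case rows are not chronological
    data["Datetime"] = pd.to_datetime(data["Datetime"])
    data.set_index("Datetime", inplace=True)
    data.sort_index(inplace=True)

    # The first remaining column holds the energy demand (e.g. AEP_MW)
    energy_column = data.columns[0]

    # Handle missing values by dropping rows without a demand reading
    data = data.dropna(subset=[energy_column])

    # Cast to float so the median replacement below is never truncated
    energy_demand = data[energy_column].to_numpy(dtype=float).reshape(-1, 1)

    # Flag outliers: points more than 3 standard deviations from the mean
    z_scores = (energy_demand - np.mean(energy_demand)) / np.std(energy_demand)
    outlier_indices = (np.abs(z_scores) > 3).flatten()

    # Replace outliers with the median of the series
    median_value = np.median(energy_demand)
    energy_demand[outlier_indices] = median_value

    # Scale to [0, 1]; note that the scaler is fitted on the full series,
    # i.e. before the train/validation/test split
    scaler = MinMaxScaler(feature_range=(0, 1))
    scaled_data = scaler.fit_transform(energy_demand)

    return scaled_data, scaler, energy_column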
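
To make the outlier rule and the scaling concrete, here is a minimal sketch on a made-up series (the numbers are illustrative, not taken from any of the datasets). It applies the same z-score threshold, median replacement, and min-max mapping with plain NumPy instead of `MinMaxScaler`.

```python
import numpy as np

# Toy hourly series: 20 plausible readings plus one obvious spike (made-up numbers)
demand = 100.0 + np.arange(20, dtype=float)
demand[5] = 1000.0

# Z-score rule: flag points more than 3 standard deviations from the mean
z_scores = (demand - demand.mean()) / demand.std()
outlier_mask = np.abs(z_scores) > 3

# Replace flagged points with the median of the (uncleaned) series
cleaned = demand.copy()
cleaned[outlier_mask] = np.median(demand)

# Min-max scaling to [0, 1], the mapping MinMaxScaler applies by default
scaled = (cleaned - cleaned.min()) / (cleaned.max() - cleaned.min())

print("outliers flagged:", int(outlier_mask.sum()))
print("replacement value:", cleaned[5])
print("scaled range:", scaled.min(), scaled.max())
```

Only the spike is flagged here; the remaining readings keep their relative spacing after scaling.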

## Sequence Creation for Time Series

This section defines the function that generates input-output pairs for time series forecasting with a sliding window. Each input sequence holds `time_steps` consecutive hourly values (24 in this notebook), and the target is the value of the hour that immediately follows.

In [5]:
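def create_sequences(data, time_steps):
    """Return X of shape (n, time_steps) and y of shape (n,) from a (len, 1) array."""
    X, y = [], []
    for i in range(len(data) - time_steps):
        # Window of `time_steps` values, followed by the next value as target
        X.append(data[i : (i + time_steps), 0])
        y.append(data[i + time_steps, 0])
    return np.array(X), np.array(y)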
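
For long series, the Python loop above can be replaced by a vectorized equivalent. The sketch below is one possible alternative, not part of the original notebook: the name `create_sequences_vectorized` is hypothetical, it relies on `numpy.lib.stride_tricks.sliding_window_view` (NumPy 1.20 or later), and it assumes the series is longer than `time_steps`.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def create_sequences_vectorized(data, time_steps):
    """Vectorized sliding window over a (len, 1) array (hypothetical helper)."""
    series = data[:, 0]
    # All windows of length `time_steps`; the last one has no following value
    X = sliding_window_view(series, time_steps)[:-1]
    # Target for window i is the value right after it
    y = series[time_steps:]
    # Copy because sliding_window_view returns a read-only view
    return X.copy(), y.copy()


data = np.arange(10, dtype=float).reshape(-1, 1)
X, y = create_sequences_vectorized(data, time_steps=3)
print(X.shape, y.shape)
print(X[0], y[0])
```

With ten values and a window of three, this yields seven input windows, the same pairs the loop version produces.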

## Enhanced REDf Model Architecture: Hybrid CNN-LSTM with Temporal Pattern Attention

This section introduces an enhanced architecture for the REDf model that incorporates CNN layers for feature extraction, multiple stacked LSTM layers for temporal processing, and a Temporal Pattern Attention (TPA) mechanism.

### Key Components of the Enhanced Model:

1. **CNN Feature Extraction Block**:
   - Uses 1D convolutional layers to extract local patterns from the time series data
   - Applies max pooling to reduce dimensionality and focus on important features

2. **Multi-layer LSTM Block**:
   - Uses three stacked LSTM layers to capture complex temporal dependencies
   - Each LSTM layer is followed by a dropout layer to prevent overfitting

3. **Temporal Pattern Attention Mechanism**:
   - Applies attention weights to focus on the most relevant temporal patterns
   - Uses convolutional features to generate attention scores
   - Weights the LSTM outputs according to their importance for prediction

4. **Dense Output Layer**:
   - Produces the final energy demand prediction

This architecture enhances the original REDf model by incorporating feature extraction capabilities and attention mechanisms that can better capture complex patterns in energy demand data.
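
The attention step can be illustrated with a small NumPy sketch. It is a simplification under stated assumptions: random arrays stand in for the LSTM outputs and for trained weights, and a pointwise projection replaces the `Conv1D` layer with kernel size 3 that `build_redf_model` uses. With a 24-hour input and a pooling size of 2, the LSTM block sees 12 time steps. As coded in the notebook, the scores come from a sigmoid, so they act as independent per-time-step gates rather than softmax weights that sum to one.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, steps, units, filters = 2, 12, 4, 3  # small illustrative sizes

# Stand-in for the output of the last LSTM layer: (batch, steps, units)
lstm_out = rng.normal(size=(batch, steps, units))

# Stand-in for Conv1D(..., activation="tanh"); a pointwise projection is an
# assumption made to keep the sketch short
w_feat = rng.normal(size=(units, filters))
features = np.tanh(lstm_out @ w_feat)  # (batch, steps, filters)

# Dense(1, activation="sigmoid"): one gate value per time step
w_score = rng.normal(size=(filters, 1))
scores = 1.0 / (1.0 + np.exp(-(features @ w_score)))  # (batch, steps, 1)

# Multiply(): the gate is broadcast across all units of its time step
weighted = lstm_out * scores  # (batch, steps, units)

# GlobalAveragePooling1D(): average over the time axis
context = weighted.mean(axis=1)  # (batch, units)

print(scores.shape, weighted.shape, context.shape)
```

The resulting context vector has one value per LSTM unit and is what the final `Dense(1)` layer consumes.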

In [6]:
def build_redf_model(input_shape, lstm_units=250, dropout_rate=0.1):
    inputs = Input(shape=(input_shape, 1))
    
    # CNN Feature Extraction Block
    x = Conv1D(filters=64, kernel_size=3, padding="same", activation="relu")(inputs)
    x = MaxPooling1D(pool_size=2)(x)
    
    # LSTM Temporal Processing Block with multiple LSTM layers
    x = LSTM(lstm_units, return_sequences=True)(x)
    x = Dropout(dropout_rate)(x)
    x = LSTM(lstm_units, return_sequences=True)(x)
    x = Dropout(dropout_rate)(x)
    x = LSTM(lstm_units, return_sequences=True)(x)
    x = Dropout(dropout_rate)(x)
    
    # Temporal Pattern Attention Mechanism
    conv_features = Conv1D(64, 3, padding="same", activation="tanh")(x)
    attention_scores = Dense(1, activation="sigmoid")(conv_features)
    weighted_features = Multiply()([x, attention_scores])
    context_vector = GlobalAveragePooling1D()(weighted_features)
    
    outputs = Dense(1)(context_vector)
    
    model = Model(inputs=inputs, outputs=outputs)
    model.compile(optimizer="adam", loss="mse")
    
    return model

## Hyperparameter Tuning with Grid Search

This section implements a grid search to select hyperparameters for the model. Unlike the original implementation, which used fixed hyperparameters, this enhanced version tries several combinations and keeps the configuration with the lowest validation loss for each dataset.

### Grid Search Parameters:

- **LSTM Units**: Number of units in each LSTM layer (options: 200, 250)
- **Dropout Rate**: Rate of dropout for regularization (options: 0.1, 0.2)
- **Batch Size**: Number of samples per gradient update (options: 500)
- **Epochs**: Maximum number of passes through the training dataset (options: 30, 40, 50); early stopping with a patience of 3 can end a run sooner

The function performs an exhaustive search over all combinations of these parameters and selects the best based on the minimum validation loss reached during each run.
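
As a rough illustration of the size of this search, the sketch below enumerates the grid with `itertools.product`, using the option lists from the grid search cell. It only lists the combinations and does not train anything; the dictionary name `param_grid` is an assumption made for this example.

```python
from itertools import product

# Option lists as defined in the grid search cell of this notebook
param_grid = {
    "lstm_units": [200, 250],
    "dropout_rate": [0.1, 0.2],
    "batch_size": [500],
    "epochs": [30, 40, 50],
}

# product() varies the last key fastest, matching the order of the nested loops
combinations = [
    dict(zip(param_grid, values)) for values in product(*param_grid.values())
]

print(len(combinations))
print(combinations[0])
```

Each combination corresponds to one full training run, so the cost of the search grows multiplicatively with every option added.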

In [None]:
def grid_search_hyperparameters(X_train, y_train, X_val, y_val, input_shape):
    """Train one model per parameter combination; return the best params and history."""
    # Define parameter ranges
    lstm_units_options = [200, 250]
    dropout_rate_options = [0.1, 0.2]
    batch_size_options = [500]
    epoch_options = [30, 40, 50]

    best_val_loss = np.inf
    best_params = {}
    best_history = None

    for lstm_units in lstm_units_options:
        for dropout_rate in dropout_rate_options:
            for batch_size in batch_size_options:
                for epochs in epoch_options:
                    print(
                        f"Testing params: LSTM Units={lstm_units}, Dropout={dropout_rate}, Batch Size={batch_size}, Epochs={epochs}"
                    )
                    model = build_redf_model(
                        input_shape, lstm_units=lstm_units, dropout_rate=dropout_rate
                    )

                    early_stopping = EarlyStopping(
                        monitor="val_loss", patience=3, restore_best_weights=True
                    )

                    history = model.fit(
                        X_train,
                        y_train,
                        epochs=epochs,
                        batch_size=batch_size,
                        validation_data=(X_val, y_val),
                        verbose=0,
                        callbacks=[early_stopping],
                    )
                    val_loss = min(history.history["val_loss"])
                    print(f"Validation loss: {val_loss:.4f}")

                    if val_loss < best_val_loss:
                        best_val_loss = val_loss
                        best_params = {
                            "lstm_units": lstm_units,
                            "dropout_rate": dropout_rate,
                            "batch_size": batch_size,
                            "epochs": epochs,
                        }
                        best_history = history

    print("Best Hyperparameters:", best_params)
    print(f"Best Validation Loss: {best_val_loss:.4f}")
    return best_params, best_history

## Training and Evaluation Process

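This function trains the enhanced REDf model with the hyperparameters selected by the grid search and returns the trained model together with its training history. Evaluation on the test set is done separately, after the saved models are reloaded.

### Key Components:
- Early stopping (patience of 5 epochs, restoring the best weights) to prevent overfitting
- A printed model summary and a per-epoch training log
- Validation loss monitored on a separate validation set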
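
The effect of the patience setting can be sketched in plain Python. This is a simplified model of patience-based early stopping, not Keras's actual implementation: the function name `simulate_early_stopping` and the loss values are made up for illustration.

```python
def simulate_early_stopping(val_losses, patience):
    """Return (stopped_epoch, best_epoch) for a list of validation losses.

    Simplified illustration: training stops once the loss has failed to
    improve for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # Improvement: remember this epoch and reset the counter
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Ran through all epochs without triggering the stop
    return len(val_losses) - 1, best_epoch


val_losses = [0.50, 0.40, 0.35, 0.36, 0.37, 0.38, 0.30]  # made-up values
stopped_epoch, best_epoch = simulate_early_stopping(val_losses, patience=3)
print(stopped_epoch, best_epoch)
```

In this toy run the later improvement to 0.30 is never reached, which is the trade-off a small patience makes. Restoring the best weights corresponds to going back to `best_epoch` after stopping.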

In [None]:
def train_redf(
    X_train, y_train, X_val, y_val, epochs, batch_size, lstm_units, dropout_rate
):
    
    X_train = X_train.reshape(X_train.shape[0], X_train.shape[1], 1)
    X_val = X_val.reshape(X_val.shape[0], X_val.shape[1], 1)
    
    model = build_redf_model(
        X_train.shape[1], lstm_units=lstm_units, dropout_rate=dropout_rate
    )
    
    model.summary()
    
    early_stopping = EarlyStopping(
        monitor="val_loss", patience=5, restore_best_weights=True
    )
    
    history = model.fit(
        X_train,
        y_train,
        epochs=epochs,
        batch_size=batch_size,
        validation_data=(X_val, y_val),
        verbose=1,
        callbacks=[early_stopping],
    )
    
    
    return model, history

## Main Workflow

This section orchestrates the complete workflow for the enhanced REDf model implementation:

1. **Preprocess Data**: For each dataset, clean the raw data (outliers are replaced with the median) and normalize the energy demand values to [0, 1].
2. **Create Sequences**: Generate input-output sequences for time series forecasting using a sliding window approach.
3. **Split Data**: Divide the sequences chronologically into training (80%), validation (10%), and testing (10%) sets.
4. **Hyperparameter Tuning**: Run the grid search on the training and validation sets to select the hyperparameters.
5. **Train the Model**: Train the REDf model with the selected hyperparameters, using the validation set for early stopping.
6. **Save the Model**: Save the trained model for each dataset for future use.

This workflow ensures a systematic approach to model training and evaluation, enabling reproducibility and consistent performance assessment across datasets.
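
The split used in the workflow is chronological: the data is cut at fixed positions and never shuffled, so the validation and test samples always come after the training samples in time. A minimal sketch of that indexing follows; the helper name `chronological_split` and its parameters are assumptions for this example, while the cut points mirror the 0.8 and 0.9 fractions in the cell below.

```python
def chronological_split(n_samples, train_end_frac=0.8, val_end_frac=0.9):
    """Return (train, val, test) slices that preserve time order (hypothetical helper)."""
    train_end = int(n_samples * train_end_frac)
    val_end = int(n_samples * val_end_frac)
    return slice(0, train_end), slice(train_end, val_end), slice(val_end, n_samples)


samples = list(range(100))  # stand-in for 100 time-ordered sequences
train_idx, val_idx, test_idx = chronological_split(len(samples))

train, val, test = samples[train_idx], samples[val_idx], samples[test_idx]
print(len(train), len(val), len(test))
print(train[-1], val[0], val[-1], test[0])
```

Keeping the order intact avoids training on sequences that lie in the future relative to the ones used for evaluation.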


In [None]:
time_steps = 24  # 24 hours (1 day) sequence length
best_hyperparams = {}

for name, path in dataset_paths.items():
    print(f"\n{'='*50}")
    print(f"Processing {name} dataset...")
    print(f"{'='*50}")
    
    try:
        scaled_data, scaler, energy_column = preprocess_data(path)
        
        X, y = create_sequences(scaled_data, time_steps)
        
        total_samples = len(X)
        train_end = int(total_samples * 0.8)
        val_end = int(total_samples * 0.9)
        
        X_train, y_train = X[:train_end], y[:train_end]
        X_val, y_val = X[train_end:val_end], y[train_end:val_end]
        X_test, y_test = X[val_end:], y[val_end:]
        
        print(f"Training set shape: {X_train.shape}")
        print(f"Validation set shape: {X_val.shape}")
        print(f"Testing set shape: {X_test.shape}")
        
        best_params, _ = grid_search_hyperparameters(
            X_train,
            y_train,
            X_val,
            y_val,
            input_shape=X_train.shape[1],
        )
        
        best_hyperparams[name] = best_params
        
        model, history = train_redf(
            X_train,
            y_train,
            X_val,
            y_val,
            epochs=best_params["epochs"],
            batch_size=best_params["batch_size"],
            lstm_units=best_params["lstm_units"],
            dropout_rate=best_params["dropout_rate"],
        )
        
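        # Save the model in the folder that the evaluation cell reads from
        os.makedirs("models_new", exist_ok=True)
        model_path = os.path.join("models_new", f"{name}.h5")
        model.save(model_path)
        print(f"Model saved as {model_path}")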
        
    except Exception as e:
        print(f"Error processing {name} dataset: {e}")


print("\nBest Hyperparameters per Dataset:")
for name, params in best_hyperparams.items():
    print(f"{name}: {params}")

## Model Loading and Evaluation

1. **Load the Model**: The saved model can be loaded using the `load_model()` function from Keras.
2. **Inference**: The loaded model is used to make predictions on the test data.
3. **Evaluation**: The predictions are evaluated using MAE, RMSE, and R² to assess the model's performance.
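
Because the model is trained on min-max-scaled values, the metrics computed in the cell below are on the scaled [0, 1] range. The sketch that follows shows, with made-up numbers, how the three metrics are defined and how they change when predictions are mapped back to megawatts: MAE and RMSE are multiplied by the data range, while R² stays the same. The values of `data_min` and `data_max` are hypothetical; in the notebook, the fitted `scaler` returned by `preprocess_data` provides `inverse_transform` for the same mapping.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R²) for two arrays of equal length."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    errors = y_true - y_pred
    mae = np.mean(np.abs(errors))
    rmse = np.sqrt(np.mean(errors**2))
    r2 = 1.0 - np.sum(errors**2) / np.sum((y_true - y_true.mean()) ** 2)
    return mae, rmse, r2


def to_megawatts(scaled, data_min, data_max):
    """Invert min-max scaling to [0, 1]."""
    return np.asarray(scaled, dtype=float).ravel() * (data_max - data_min) + data_min


# Made-up scaled targets and predictions (predictions shaped like model output)
y_true_scaled = np.array([0.2, 0.4, 0.6, 0.8])
y_pred_scaled = np.array([[0.25], [0.35], [0.65], [0.75]])

# Hypothetical demand range in MW, for illustration only
data_min, data_max = 10_000.0, 20_000.0

scaled_metrics = regression_metrics(y_true_scaled, y_pred_scaled)
mw_metrics = regression_metrics(
    to_megawatts(y_true_scaled, data_min, data_max),
    to_megawatts(y_pred_scaled, data_min, data_max),
)
print("scaled:", scaled_metrics)
print("MW:    ", mw_metrics)
```

Reporting MAE and RMSE in megawatts makes the errors comparable with operational quantities, whereas the scaled values are only comparable across datasets as fractions of each dataset's own range.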


This implementation provides several improvements over the original REDf model:

1. **Enhanced Architecture**: Hybrid CNN-LSTM with Temporal Pattern Attention
2. **Hyperparameter Tuning**: Systematic grid search for optimal parameters
3. **Improved Regularization**: Multiple dropout layers and early stopping
4. **Better Data Utilization**: Separate validation set for hyperparameter selection

In [None]:
from tensorflow.keras.models import load_model
import glob

# Map each dataset name to the path of its saved model in ./models_new
model_files = glob.glob("./models_new/*.h5")
loaded_models = {}

for model_file in model_files:
    # Use os.path so the name is extracted correctly on any platform
    model_name = os.path.splitext(os.path.basename(model_file))[0]
    loaded_models[model_name] = model_file

results = {}
for name, path in dataset_paths.items():
    print(f"\n{'='*50}")
    print(f"Processing {name} dataset...")
    print(f"{'='*50}")
    
    
    scaled_data, scaler, energy_column = preprocess_data(path)
    
    time_steps = 24
    X, y = create_sequences(scaled_data, time_steps)
    
    total_samples = len(X)
    train_end = int(total_samples * 0.8)
    val_end = int(total_samples * 0.9)
    
    X_train, y_train = X[:train_end], y[:train_end]
    X_val, y_val = X[train_end:val_end], y[train_end:val_end]
    X_test, y_test = X[val_end:], y[val_end:]

    model = load_model(loaded_models[name], custom_objects={"mse": tf.keras.losses.MeanSquaredError()})

    
    y_pred = model.predict(X_test)
    
    mae = mean_absolute_error(y_test, y_pred)
    rmse = np.sqrt(mean_squared_error(y_test, y_pred))
    r2 = r2_score(y_test, y_pred)

    results[name] = (mae, rmse, r2)



Processing AEP dataset...
379/379 ━━━━━━━━━━━━━━━━━━━━ 4s 9ms/step

Processing COMED dataset...
208/208 ━━━━━━━━━━━━━━━━━━━━ 3s 12ms/step

Processing DAYTON dataset...
379/379 ━━━━━━━━━━━━━━━━━━━━ 5s 12ms/step

Processing PJME dataset...
455/455 ━━━━━━━━━━━━━━━━━━━━ 4s 9ms/step


## Final Output

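This section summarizes the performance of the REDf model on the held-out test set of each dataset. The metrics are computed on the min-max-scaled demand values, so MAE and RMSE are expressed as fractions of each dataset's [0, 1] range rather than in megawatts.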

In [None]:

print("Dataset | MAE | RMSE | R²")
print("---------------------------------")
for name, metrics in results.items():
    mae, rmse, r2 = metrics
    print(f"{name.ljust(7)} | {mae:.4f} | {rmse:.4f} | {r2:.4f}")


Summary of Results:
------------------
Dataset | MAE | RMSE | R²
---------------------------------
AEP     | 0.0143 | 0.0222 | 0.9848
COMED   | 0.0213 | 0.0414 | 0.9432
DAYTON  | 0.0151 | 0.0238 | 0.9797
PJME    | 0.0131 | 0.0254 | 0.9750
