In [1]:
import os
import time
from datetime import datetime  # the bare `import datetime` was shadowed by this import and is dropped

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

In [2]:
from helpers import prepare_features

In [3]:
# Save current directory
current_directory = os.getcwd()

# Set print options to suppress scientific notation and show 5 decimal places
np.set_printoptions(suppress=True, precision=5)
pd.options.display.float_format = '{:.5f}'.format

# Suppress all warnings globally
import warnings
warnings.filterwarnings("ignore")

### Load cleaned data

In [5]:
augmented_dataset = False  # Set to True to load X_augmented.csv instead of X.csv
if augmented_dataset:
    file_path = os.path.join(current_directory, 'data_augmented/X_augmented.csv')
else:
    file_path = os.path.join(current_directory, 'data_augmented/X.csv')

X = pd.read_csv(file_path, index_col = 0)
df = X 

file_path = os.path.join(current_directory, 'data_augmented/timestamps.csv')
timestamps = pd.read_csv(file_path, index_col = 0)

df['timestamp'] = timestamps
df.set_index("timestamp", inplace=True)

df.index = pd.to_datetime(df.index)
df = df.asfreq('H')  # 'H' for hourly frequency

### Train-Test split

In time series forecasting for electricity market bidding, the dataset must be split into training and test sets in a way that mirrors operational practice. Here, the test set is aligned with the market schedule, in which bids are submitted daily at 10 AM for the next 24 hours.

To achieve this, the code ensures that:
1. The test set starts at the first 11 AM timestamp at or after the 95% split point, so the model forecasts the next 24 hours from data available just before the bidding deadline.
2. The test set ends at the last 10 AM timestamp in the dataset, so it consists of complete 24-hour blocks that match the bidding window.

In [7]:
# Train-Test Split
train_test_split_ratio = 0.95
train_size = int(len(df) * train_test_split_ratio)  
initial_test_start = train_size  

while df.index[initial_test_start].hour != 11: # Adjust test start to align with the next occurrence of 11 AM
    initial_test_start += 1

final_test_end = len(df) - 1
while df.index[final_test_end].hour != 10: # Adjust test end to align with the last 10 AM in the dataset
    final_test_end -= 1

# Copy the slices so that the later column assignments do not act on views of df
train = df.iloc[:initial_test_start].copy()
test = df.iloc[initial_test_start:final_test_end+1].copy()  # Include the last index

In [8]:
print('Train set dimensions:')
print(train.shape) # (num_train_samples, num_features)
print('Test set dimensions:')
print(test.shape) # (num_test_samples, num_features)

Train set dimensions:
(18145, 32)
Test set dimensions:
(936, 32)


### Standardization

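Training and test data are standardized separately, each with its own `StandardScaler`, so that no test-set statistics enter the training data (**data leakage**) and the test set stays independent of the training process. The aim is for the measured performance to reflect how the model behaves on unseen data. Note that fitting a scaler on the test set uses the test set's own mean and standard deviation, which would not be known in advance in live operation.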
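A common alternative to the two separate scalers used in this notebook is to fit a single scaler on the training data and reuse its statistics for the test data. The following is a minimal sketch of that variant, not the method applied below; the toy frames `toy_train` and `toy_test` are hypothetical stand-ins for `train` and `test`.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data standing in for the notebook's train/test frames
rng = np.random.default_rng(0)
cols = ["power_consumption", "temp"]
toy_train = pd.DataFrame({"power_consumption": rng.normal(100, 10, 200),
                          "temp": rng.normal(15, 5, 200)})
toy_test = pd.DataFrame({"power_consumption": rng.normal(110, 10, 50),
                         "temp": rng.normal(18, 5, 50)})

# Fit on the training data only, then apply the same mean/std to both sets
scaler = StandardScaler().fit(toy_train[cols])
train_scaled = toy_train.copy()
test_scaled = toy_test.copy()
train_scaled[cols] = scaler.transform(toy_train[cols])
test_scaled[cols] = scaler.transform(toy_test[cols])

# Values in scaled units map back to original units with the train statistics
k = cols.index("power_consumption")
restored = test_scaled["power_consumption"] * scaler.scale_[k] + scaler.mean_[k]
```

With this variant the test set is generally not exactly zero-mean and unit-variance, which is expected: it is expressed in the units the model was trained on.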

In [10]:
# Standardize data
from sklearn.preprocessing import StandardScaler

if augmented_dataset:
    columns_not_to_scale = [
        'is_monday', 'is_tuesday', 'is_wednesday', 'is_thursday', 'is_friday',
        'is_saturday', 'is_sunday', 'is_weekend',
        'is_spring', 'is_summer', 'is_autumn', 'is_winter', 'is_holiday',
        'is_daylight',
        'hour_1', 'hour_2', 'hour_3', 'hour_4', 'hour_5', 'hour_6',
        'hour_7', 'hour_8', 'hour_9', 'hour_10', 'hour_11', 'hour_12',
        'hour_13', 'hour_14', 'hour_15', 'hour_16', 'hour_17', 'hour_18',
        'hour_19', 'hour_20', 'hour_21', 'hour_22', 'hour_23',
        'month_2', 'month_3', 'month_4', 'month_5', 'month_6', 'month_7',
        'month_8', 'month_9', 'month_10', 'month_11', 'month_12',
    ]
    columns_to_scale = [col for col in train.columns if col not in columns_not_to_scale]
else:
    columns_to_scale = ['power_consumption', 'temp']

# Train data
scaler_train = StandardScaler()
scaled_train = pd.DataFrame(
    scaler_train.fit_transform(train[columns_to_scale]),
    columns=columns_to_scale
)

means_train = pd.DataFrame(columns = columns_to_scale)
means_train.loc[0] = scaler_train.mean_
stds_train = pd.DataFrame(columns = columns_to_scale)
stds_train.loc[0] = scaler_train.scale_

train[columns_to_scale] = scaled_train.values

# Test data
scaler_test = StandardScaler()
scaled_test = pd.DataFrame(
    scaler_test.fit_transform(test[columns_to_scale]),
    columns=columns_to_scale
)

means_test = pd.DataFrame(columns = columns_to_scale)
means_test.loc[0] = scaler_test.mean_
stds_test = pd.DataFrame(columns = columns_to_scale)
stds_test.loc[0] = scaler_test.scale_

test[columns_to_scale] = scaled_test.values

In [11]:
# Save train and test data
file_path = os.path.join(current_directory, 'data_augmented/train.csv')
train.to_csv(file_path)
file_path = os.path.join(current_directory, 'data_augmented/test.csv')
test.to_csv(file_path)
file_path = os.path.join(current_directory, 'data_augmented/means_train.csv')
means_train.to_csv(file_path)
file_path = os.path.join(current_directory, 'data_augmented/means_test.csv')
means_test.to_csv(file_path)
file_path = os.path.join(current_directory, 'data_augmented/stds_train.csv')
stds_train.to_csv(file_path)
file_path = os.path.join(current_directory, 'data_augmented/stds_test.csv')
stds_test.to_csv(file_path)

### Pre-process data for LSTM and TCN Models

When training temporal models like LSTMs or TCNs, the data must be structured into sequences to capture temporal dependencies effectively. The function `prepare_features` preprocesses the data by creating feature tensors (`X`) and target tensors (`y`) suitable for training these models. Additionally, the function allows for the inclusion of forecasted exogenous variables (e.g., temperature or wind forecasts) alongside historical data, ensuring that the model can utilize all relevant information available at prediction time. Missing values are handled using forward and backward filling to ensure data consistency without losing samples.


If forecasted variables are included, the function generates new columns for each forecasted variable, representing hourly predictions over the forecast horizon (e.g., `temp_forecast_1h`, `temp_forecast_2h`, ..., `temp_forecast_24h`). These columns are created by shifting the original exogenous variable backward in time to align forecasted features with their corresponding prediction intervals. This allows the model to incorporate known forecasted inputs when predicting power consumption for the next 24 hours.
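The shifting step can be sketched as follows. This is an illustration under assumptions, not the code of `prepare_features`: the helper name `add_forecast_columns` is hypothetical, and only the column naming (`temp_forecast_1h`, ...) and the forward/backward filling are taken from the description above. Because the columns are derived from the observed series, they behave like perfect forecasts.

```python
import pandas as pd

def add_forecast_columns(frame, col, horizon):
    """Hypothetical helper: append col_forecast_1h ... col_forecast_{horizon}h."""
    out = frame.copy()
    for h in range(1, horizon + 1):
        # shift(-h) moves the value observed at time t+h onto the row for time t
        out[f"{col}_forecast_{h}h"] = out[col].shift(-h)
    # The last rows have no future values; fill them forward, then backward
    return out.ffill().bfill()

# Toy hourly series (hypothetical values)
idx = pd.date_range("2024-01-01", periods=6, freq=pd.Timedelta(hours=1))
toy = pd.DataFrame({"temp": [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]}, index=idx)
aug = add_forecast_columns(toy, "temp", horizon=2)
print(aug)
```

In the toy output, the row for hour `t` carries the temperature at `t+1` and `t+2` in the two new columns, and the final rows repeat the last known value because of the forward fill.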

##### Structure of `X` and `y` Tensors

1. **Feature Tensor (`X`)**:
   - **Shape**: `(num_samples, window_length, num_features)`
   - **Content**: 
     - For each sequence, the historical window (`window_length`) includes the target variable (e.g., power consumption) and exogenous variables (e.g., temperature).
     - If forecasted variables are included, each sequence also contains the aligned forecasted values for the entire horizon.

2. **Target Tensor (`y`)**:
   - **Shape**: `(num_samples, forecast_horizon)`
   - **Content**: The target tensor contains the power consumption values for the next 24 hours for each sequence.
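The tensor shapes described above can be reproduced with a small sliding-window sketch. The helper `make_windows` is hypothetical and assumes a stride of one hour between consecutive samples; the actual `prepare_features` function may select samples differently.

```python
import numpy as np

def make_windows(values, target_idx, window_length, forecast_horizon):
    """Hypothetical helper; values has shape (num_timesteps, num_features)."""
    num_samples = len(values) - window_length - forecast_horizon + 1
    # Each sample holds window_length consecutive rows with all features
    X = np.stack([values[i:i + window_length] for i in range(num_samples)])
    # Each target holds the next forecast_horizon values of the target column
    y = np.stack([
        values[i + window_length:i + window_length + forecast_horizon, target_idx]
        for i in range(num_samples)
    ])
    return X, y

# 200 hourly steps with 2 features (toy values)
values = np.arange(400, dtype=float).reshape(200, 2)
X_demo, y_demo = make_windows(values, target_idx=0,
                              window_length=168, forecast_horizon=24)
print(X_demo.shape)  # (num_samples, window_length, num_features)
print(y_demo.shape)  # (num_samples, forecast_horizon)
```

Under the stride-one assumption, the number of samples is `num_timesteps - window_length - forecast_horizon + 1`.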

The sequence length of **168 hours** corresponds to one full week of historical data, which lets the model capture **weekly seasonal patterns** such as differences between weekdays and weekends, as well as weather-driven fluctuations. This length gives enough context to detect **trends, seasonal fluctuations, and anomalies** while keeping the computational cost manageable. Longer sequences would increase the computational cost with diminishing returns, which makes 168 hours a reasonable choice here. It also matches the weekly cycles that are significant when forecasting power consumption in energy markets.

In [14]:
# Prepare input data to NN: no forecasted variables included
include_forecast = False

target_col = 'power_consumption'
forecast_cols = ['temp'] # Choose between: ['temp'], [col for col in df.columns if col != target_col]
window_length = 168  # 7 days
forecast_horizon = 24  # 24 hours

X_train, y_train, timestamps_train = prepare_features(train, target_col, forecast_cols, window_length, forecast_horizon, include_forecast)
X_test, y_test, timestamps_test = prepare_features(test, target_col, forecast_cols, window_length, forecast_horizon, include_forecast)

In [15]:
# Save input data to NN
file_path = os.path.join(current_directory, 'data_augmented/X_train.npy')
np.save(file_path, X_train)
file_path = os.path.join(current_directory, 'data_augmented/y_train.npy')
np.save(file_path, y_train)
file_path = os.path.join(current_directory, 'data_augmented/timestamps_train.csv')
timestamps_train.to_series().to_csv(file_path, index=False) 
file_path = os.path.join(current_directory, 'data_augmented/X_test.npy')
np.save(file_path, X_test)
file_path = os.path.join(current_directory, 'data_augmented/y_test.npy')
np.save(file_path, y_test)
file_path = os.path.join(current_directory, 'data_augmented/timestamps_test.csv')
timestamps_test.to_series().to_csv(file_path, index=False) 

In [16]:
# Prepare input data to NN: forecasted variables included
include_forecast = True

X_train, y_train, timestamps_train = prepare_features(train, target_col, forecast_cols, window_length, forecast_horizon, include_forecast)
X_test, y_test, timestamps_test = prepare_features(test, target_col, forecast_cols, window_length, forecast_horizon, include_forecast)

In [17]:
# Save input data to NN
file_path = os.path.join(current_directory, 'data_augmented/X_train_include_forecast.npy')
np.save(file_path, X_train)
file_path = os.path.join(current_directory, 'data_augmented/X_test_include_forecast.npy')
np.save(file_path, X_test)

### Metrics to Evaluate Prediction Accuracy

Prediction accuracy is evaluated with several metrics, since each highlights a different aspect of model performance.

1. **Root Mean Squared Error (RMSE)**  
   $$
   \text{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^n (\hat{y}_i - y_i)^2}
   $$
   - Sensitive to large errors, because squaring penalizes outliers.  
   - Captures overall accuracy but is not robust to extreme values.

2. **Mean Absolute Error (MAE)**  
   $$
   \text{MAE} = \frac{1}{n} \sum_{i=1}^n |\hat{y}_i - y_i|
   $$  
   - Weights every error in proportion to its magnitude.  
   - Provides a robust measure of the average error magnitude.

  
3. **Maximum Error (ME)**  
   $$
   \text{ME} = \max_{1 \le i \le n} |\hat{y}_i - y_i|
   $$  
   - Highlights the largest deviation, capturing extreme behavior.

  
4. **Mean Absolute Percentage Error (MAPE)**  
   $$
   \text{MAPE} = \frac{100}{n} \sum_{i=1}^n \left| \frac{\hat{y}_i - y_i}{y_i} \right|
   $$  
   - Expresses errors as a percentage, providing relative interpretability.  
   - Undefined when $y_i = 0$ and inflated when $y_i$ is close to zero.

These metrics together offer a holistic evaluation of a model, balancing overall accuracy, robustness, and sensitivity to extreme deviations.
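The four formulas translate directly into NumPy. The function name `point_forecast_errors` and the toy values are hypothetical; the returned keys match the columns of the `errors` table created below.

```python
import numpy as np

def point_forecast_errors(y_true, y_pred):
    """Hypothetical helper returning RMSE, MAE, ME and MAPE as a dict."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    err = y_pred - y_true
    return {
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "MAE": float(np.mean(np.abs(err))),
        "ME": float(np.max(np.abs(err))),
        # Assumes y_true contains no zeros
        "MAPE": float(100.0 * np.mean(np.abs(err / y_true))),
    }

scores = point_forecast_errors([100.0, 200.0, 400.0], [110.0, 190.0, 400.0])
print(scores)
```

Since standardized values are centered on zero, MAPE is best computed after mapping predictions and targets back to their original units.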

In [19]:
errors = pd.DataFrame(columns = ['RMSE', 'MAE', 'ME', 'MAPE'])
file_path = os.path.join(current_directory, 'results/errors.csv')
errors.to_csv(file_path)

### Metrics to Evaluate Prediction Uncertainty

Prediction uncertainty is quantified with two metrics, PICP and PINAW.

1. **Prediction Interval Coverage Probability (PICP)**:
   - Measures the percentage of true values captured within the prediction intervals.
   - A PICP close to the nominal confidence level (e.g., 95% for a 95% interval) indicates well-calibrated intervals.
   - Too low: Intervals miss actual values (underestimating uncertainty).
   - Too high: Intervals are wider than necessary (too conservative).

2. **Prediction Interval Normalized Average Width (PINAW)**:
   - Quantifies the average width of prediction intervals relative to the range of true values.
   - A low PINAW means the intervals are narrow (sharp).
   - Too narrow: Risk of poor coverage (low PICP).
   - Too wide: Excessive conservatism, reducing utility.

The goal is to balance **PICP** (coverage) and **PINAW** (sharpness) to create reliable and efficient prediction intervals.
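Both metrics can be computed from the true values and the interval bounds. This is a minimal sketch with a hypothetical function name, `interval_scores`; it returns PICP as a fraction in [0, 1] (multiply by 100 for a percentage) and normalizes the mean interval width by the range of the true values.

```python
import numpy as np

def interval_scores(y_true, lower, upper):
    """Hypothetical helper returning PICP and PINAW as a dict."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    lower = np.asarray(lower, dtype=float).ravel()
    upper = np.asarray(upper, dtype=float).ravel()
    # PICP: share of true values that fall inside their interval
    covered = (y_true >= lower) & (y_true <= upper)
    picp = float(np.mean(covered))
    # PINAW: mean interval width divided by the range of the true values
    pinaw = float(np.mean(upper - lower) / (y_true.max() - y_true.min()))
    return {"PICP": picp, "PINAW": pinaw}

demo = interval_scores([10, 20, 30, 40],
                       lower=[8, 15, 31, 35],
                       upper=[12, 25, 35, 45])
print(demo)
```

In the toy example, three of the four true values lie inside their intervals, and the average width of 7 is compared with a target range of 30.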

In [21]:
uncertainty_evaluation = pd.DataFrame(columns = ['PICP', 'PINAW'])
file_path = os.path.join(current_directory, 'results/uncertainty_evaluation.csv')
uncertainty_evaluation.to_csv(file_path) 