# Electricity Price Forecasting on the German day-ahead market

This notebook is the main interface to the accompanying EPF library. The library's parameters can be adjusted through its config file, and each parameter is described in more detail in the corresponding configuration class. The `exploratory_analysis` notebook explores the raw data sets and visualizes the results. Based on those results, features were selected from the data sets for use in the deep learning pipeline.

The forecasting pipeline automatically performs data preprocessing, including data cleaning, outlier removal and seasonal decomposition. In the configuration, feature engineering can be toggled on and off for each feature. Three models are available: an LSTM, a GRU and a CNN. Each can be retrained and saved at any time. All of them produce multi-step, single-shot forecasts: by default, each forecast covers 24 time steps into the future, and all 24 steps are predicted together in one computation.
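To make the single-shot idea concrete, the sketch below splits a series into input windows and 24-step label windows, so that one model call maps an input window to all 24 future values at once. It is a minimal illustration and not the library's implementation: `make_windows`, the toy `prices` series and the input width of 48 are hypothetical, and how the library actually derives its input width is not shown here.

```python
def make_windows(series, input_width, out_steps):
    """Split a sequence into (inputs, labels) pairs for single-shot forecasting.

    Each label window holds the `out_steps` values that directly follow
    its input window, so a model can predict them in a single pass.
    """
    total = input_width + out_steps
    n_windows = len(series) - total + 1
    if n_windows <= 0:
        raise ValueError("series is shorter than one full window")
    inputs = [series[i:i + input_width] for i in range(n_windows)]
    labels = [series[i + input_width:i + total] for i in range(n_windows)]
    return inputs, labels


# Hypothetical hourly series; real data would be the prepared price features.
prices = list(range(100))

# input_width=48 is an assumption for illustration; out_steps=24 is the default horizon.
inputs, labels = make_windows(prices, input_width=48, out_steps=24)
print(len(inputs), len(inputs[0]), len(labels[0]))
```

An autoregressive model would instead predict one step and feed it back 24 times. The single-shot layout trades that feedback loop for a wider output layer.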

In [1]:
from datetime import datetime

from epf.pipeline import EpfPipeline

pipeline = EpfPipeline()

2025-05-10 23:57:14.039 | INFO     | epf.config:<module>:19 - PROJ_ROOT path is: C:\Users\valen\PycharmProjects\epf
2025-05-10 23:57:14.054 | INFO     | epf.config:create_dir:15 - DATA_DIR path is: C:/Users/valen/PycharmProjects/epf/data
2025-05-10 23:57:14.054 | INFO     | epf.config:create_dir:15 - RAW_DATA_DIR path is: C:/Users/valen/PycharmProjects/epf/data/raw
2025-05-10 23:57:14.054 | INFO     | epf.config:create_dir:15 - INTERIM_DATA_DIR path is: C:/Users/valen/PycharmProjects/epf/data/interim
2025-05-10 23:57:14.054 | INFO     | epf.config:create_dir:15 - PROCESSED_DATA_DIR path is: C:/Users/valen/PycharmProjects/epf/data/processed
2025-05-10 23:57:14.054 | INFO     | epf.config:create_dir

Pipeline initialized with:
FeatureConfig: 
WINDOW_LENGTH: 24
N_SIGMA: 3
METHOD: nearest

ModelConfig: 
TRAIN_SPLIT: 2023-09-30 22:00:00+00:00
VALIDATION_SPLIT: 2023-12-31 22:00:00+00:00
MAX_EPOCHS: 20
OUT_STEPS: 24
SEASONALITY_PERIOD: 168
INPUT_WIDTH_FACTOR: 1.25
MODEL_BUILDER: LSTM
NUM_FEATURES: 22
UNIT_MIN_VALUE: 32
UNIT_MAX_VALUE: 128
UNIT_STEP: 32
LR_MIN_VALUE: 0.001
LR_MAX_VALUE: 0.1
LR_STEP: 0.05
DROPOUT_RATE_MIN_VALUE: 0.2
DROPOUT_RATE_MAX_VALUE: 0.7
DROPOUT_RATE_STEP: 0.1
USE_HIDDEN_LAYERS: True
NUM_LAYERS_MIN: 1
NUM_LAYERS_MAX: 3
NUM_LAYERS_STEP: 1
MAX_TRIALS: 50
LABEL_COL: de_prices_hat_rm_seasonal
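The `*_MIN_VALUE`, `*_MAX_VALUE` and `*_STEP` entries above appear to define the hyperparameter search space. Assuming each triple is sampled on a linear grid starting at the minimum (an assumption about the tuner, though the learning rates 0.001 and 0.051 in the trial log below are consistent with it), the candidate values can be listed as in this sketch. `stepped_values` is a hypothetical helper, not part of the library.

```python
from decimal import Decimal


def stepped_values(min_value, max_value, step):
    """Return min_value, min_value + step, ... up to and including max_value.

    Decimal arithmetic avoids floating-point drift such as 0.30000000000000004.
    """
    lo, hi, st = (Decimal(str(v)) for v in (min_value, max_value, step))
    values = []
    current = lo
    while current <= hi:
        values.append(float(current))
        current += st
    return values


# Values copied from the ModelConfig printout above.
units = stepped_values(32, 128, 32)
learning_rates = stepped_values(0.001, 0.1, 0.05)
dropout_rates = stepped_values(0.2, 0.7, 0.1)
num_layers = stepped_values(1, 3, 1)

print("units:", units)
print("learning rates:", learning_rates)
print("dropout rates:", dropout_rates)
print("layers:", num_layers)
```

Under this assumption the learning rate has only two candidates, because a third step of 0.05 would exceed `LR_MAX_VALUE`. A smaller `LR_STEP` or a logarithmic scale would give a finer search, if that is desired.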




In [None]:
pipeline.train('test', overwrite=True, prep_data=False, use_tuned_hyperparams=False)

Trial 1 Complete [00h 01m 40s]
mean_absolute_error: 0.5627419352531433

Best mean_absolute_error So Far: 0.5627419352531433
Total elapsed time: 00h 01m 40s

Search: Running Trial #2

Value             |Best Value So Far |Hyperparameter
0.001             |0.051             |learning_rate
3                 |2                 |num_layers
32                |96                |units
0.4               |0.2               |dropout
64                |64                |units_0
0.6               |0.2               |dropout_0
96                |32                |units_1
0.4               |0.2               |dropout_1

Epoch 1/20
198/198 ━━━━━━━━━━━━━━━━━━━━ 36s 160ms/step - loss: 0.6086 - mean_absolute_error: 0.6086 - mean_absolute_percentage_error: 262.0644 - root_mean_squared_error: 0.8427 - val_loss: 0.7742 - val_mean_absolute_error: 0.7742 - val_mean_absolute_percentage_error: 200.4095 - val_root_mean_squared_error: 1.0436
Epoch 2/20
198/198 ━━━
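In the log above, `loss` equals `mean_absolute_error`, which suggests the model is trained on MAE, while the percentage error is well above 100. One plausible reason, assuming the label column is a normalized, deseasonalized series with values near zero, is that MAPE divides by the true value and so grows very large for small targets. The sketch below illustrates this with plain-Python versions of the three metrics and made-up numbers. It is a simplified illustration, not the Keras implementation.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Mean absolute percentage error; undefined if any true value is zero."""
    return 100 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up values: the first target is close to zero, as residuals often are.
y_true = [0.01, -0.5, 1.2, 0.8]
y_pred = [0.30, -0.4, 1.0, 0.9]

print(f"MAE:  {mae(y_true, y_pred):.4f}")
print(f"RMSE: {rmse(y_true, y_pred):.4f}")
print(f"MAPE: {mape(y_true, y_pred):.1f} %")
```

The single near-zero target dominates the MAPE even though the absolute errors are small, so MAE and RMSE are likely the more informative metrics for this label.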

In [None]:
pipeline.evaluate('lstm_all_features_hl0_drY')

In [None]:
import pickle as pkl

with open("../data/processed/performance.pkl", "rb") as f:
    performance = pkl.load(f)

with open("../data/processed/val_performance.pkl", "rb") as f:
    val_performance = pkl.load(f)

In [None]:
from matplotlib import pyplot as plt
import numpy as np

x = np.arange(len(performance))
width = 0.3

metric_name = 'mean_absolute_error'
val_mae = [v[metric_name] for v in val_performance.values()]
test_mae = [v[metric_name] for v in performance.values()]

plt.bar(x - 0.17, val_mae, width, label='Validation')
plt.bar(x + 0.17, test_mae, width, label='Test')
# One tick per model name; list() gives matplotlib a plain sequence of labels.
plt.xticks(ticks=x, labels=list(performance.keys()),
           rotation=45)
plt.ylabel('MAE (average over all times and outputs), normalized')
_ = plt.legend()

In [None]:
import pickle as pkl

with open("../models/lstm_all_features.pkl", "rb") as f:
    model_obj = pkl.load(f)

window = model_obj['window']
model = model_obj['best_model']
window.plot(model)

In [None]:
from epf.config import PREDICTIONS_DIR, MODELS_DIR

model_path = MODELS_DIR / "lstm_all_features.keras"
predictions_dir = PREDICTIONS_DIR

In [None]:
pipeline.predict(data=window.test, model_path=model_path, predictions_dir=predictions_dir)