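# Workflow of the `evaluate_ngboost_model` Function

## 1. **Function Definition and Parameters**
The function `evaluate_ngboost_model` takes the following parameters:
- `entsoe`: A DataFrame containing power data
- `target_column`: Name of the column containing the target variable (default: `'power'`)
- `dist`: Probability distribution (default: `Normal`; `LogNormal` is the alternative)
- `case`: Integer (1–16) determining feature selection and loss function (default: 1)
- `n_estimators`: Number of boosting iterations (default: 100)
- `learning_rate`: Learning rate for the NGBoost model (default: 0.03)
- `random_state`: Random seed (default: 42)
- `output_file`: Directory under which the result files are saved (default: `'../results/NGBoost/'`)
- `train_start`, `train_end`, `validation_start`, `validation_end`: Date bounds of the training and validation windows (defaults: 2016-01-01 to 2022-12-31 and 2023-01-01 to 2023-12-31)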

## 2. **Preprocessing the Data**
- **Scaling the power data**: The maximum power value is identified and rounded up to the nearest 1000, and the target variable (`power`) is divided by this rounded maximum. A small epsilon is added to the quotient so that the scaled target stays strictly positive. In `evaluate_ngboost_model` the scaled target is additionally log-transformed.
- **Lag feature**: A new feature, `power_t-96`, is created by shifting the target column by 96 time steps (one day at 15-minute resolution).
- **Interval index**: `interval_index` numbers the 15-minute slots of each day from 1 to 96.
- **Missing values**: Any rows containing NaN values are dropped.
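
The arithmetic behind these steps can be sketched in plain Python. This is only an illustration on a synthetic two-day series: the values, the variable names, and the use of lists instead of a pandas DataFrame are assumptions made for the example, not the notebook's implementation.

```python
import math
from datetime import datetime, timedelta

# Synthetic 15-minute series covering two days (hypothetical values, not ENTSO-E data).
start = datetime(2023, 1, 1, 0, 0)
timestamps = [start + timedelta(minutes=15 * k) for k in range(192)]
power = [100.0 * (k % 96) for k in range(192)]  # daily ramp with a maximum of 9500.0

# Round the maximum up to the next multiple of 1000 and scale the target.
max_power_rounded = math.ceil(max(power) / 1000) * 1000
epsilon = 1e-5  # keeps the scaled value strictly positive so a log can be taken
scaled = [p / max_power_rounded + epsilon for p in power]

# Lag feature: the value observed 96 steps (one day) earlier, None where unavailable.
lag = 96
power_lag = [None] * lag + scaled[:-lag]

# Interval index: position of the 15-minute slot within its day, from 1 to 96.
interval_index = [((t.hour * 60 + t.minute) // 15) + 1 for t in timestamps]

# Drop rows without a lag value, which mirrors the effect of dropna().
rows = [
    (t, s, lagged, idx)
    for t, s, lagged, idx in zip(timestamps, scaled, power_lag, interval_index)
    if lagged is not None
]
```

Only the second day survives the final filter, because the first 96 rows have no value from one day earlier.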

## 3. **Train-Validation-Test Split**
- The dataset is split into three subsets:
  - **Train**: Data from 2016–2022.
  - **Validation**: Data from 2023.
  - **Test**: Data from 2024.

## 4. **Feature Selection Based on Case Parameter**
Different cases dictate which features are included and which loss function is used:
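- Cases 1–16 select different combinations of the lagged power (`power_t-96`), wind speed at 10 m and 100 m (the location means `ws_10m_loc_mean` and `ws_100m_loc_mean`, and the per-location columns `ws_10m_loc_1` to `ws_10m_loc_10` and `ws_100m_loc_1` to `ws_100m_loc_10`), and `interval_index`.
- Loss functions used: `CRPScore` (odd-numbered cases) or `LogScore` (even-numbered cases).
- The output file name is derived from the selected case (`case{case}.xlsx`).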
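
One way to express this case table compactly is a lookup dictionary instead of a chain of conditionals. The sketch below is an assumption about how that could look; the names `FEATURE_SETS`, `CASES`, and `select_case` are hypothetical, and the losses are plain strings standing in for `CRPScore` and `LogScore` so that the example runs without NGBoost installed.

```python
from itertools import product

# Feature groups; the column names follow the case list in this notebook.
P = ["power_t-96"]
WS_MEAN = ["ws_10m_loc_mean", "ws_100m_loc_mean"]
WS_LOC = [f"ws_{height}m_loc_{k}" for height in (10, 100) for k in range(1, 11)]
T = ["interval_index"]

# One entry per pair of cases, in the order the cases are numbered.
FEATURE_SETS = [
    P,                         # cases 1, 2
    WS_MEAN,                   # cases 3, 4
    P + WS_MEAN,               # cases 5, 6
    P + WS_LOC,                # cases 7, 8
    P + WS_MEAN + WS_LOC,      # cases 9, 10
    P + WS_MEAN + WS_LOC + T,  # cases 11, 12
    P + T,                     # cases 13, 14
    WS_MEAN + T,               # cases 15, 16
]
LOSSES = ["CRPS", "NLL"]  # odd cases use CRPS, even cases use NLL

# case number = 2 * feature_set_index + loss_index + 1
CASES = {
    2 * fi + li + 1: {"features": list(feats), "loss": loss}
    for (fi, feats), (li, loss) in product(enumerate(FEATURE_SETS), enumerate(LOSSES))
}


def select_case(case):
    """Return the feature columns and loss name for a case, or fail loudly."""
    if case not in CASES:
        raise ValueError(f"Unsupported case: {case}")
    return CASES[case]
```

A table like this makes an unsupported case number an explicit error rather than an undefined `feature_columns` variable further down.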

## 5. **Extract Training and Validation Data**
- Feature columns and target values are extracted for training and validation.

## 6. **Initialize and Train NGBoost Model**
- `NGBRegressor` is instantiated with:
  - Distribution: `dist` (e.g., LogNormal)
  - Loss function: Chosen based on case
  - `n_estimators`, `learning_rate`, and `random_state`
- The model is trained using `model.fit()` with validation data included.

## 7. **Model Evaluation**
- **Interval-Based Scoring**:
  - Validation data is split into 96 sub-arrays by stride slicing (`[i::96]`), one per 15-minute interval of the day.
  - The model's score is computed separately for each interval and overall.
- **Predictions**:
  - `y_val_pred`: Point predictions.
  - `y_val_dists`: Predicted probability distributions.
- **Compute CRPS and NLL**:
  - Continuous Ranked Probability Score (CRPS) and Negative Log-Likelihood (NLL) are computed for each sample.
  - Two CRPS variants are computed: the Gaussian closed form (applied to `log(y)` when `dist` is LogNormal) and the log-normal closed form.
  - Scores are stored in lists for later aggregation.
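
The closed-form scores used in the code cells can be written out as small standalone functions. The formulas below are transcribed from the notebook's per-sample loop; the function names are hypothetical, and the standard library's `math.erf` is used in place of `scipy.stats.norm` so that the sketch has no dependencies.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def crps_normal(y, mu, sigma):
    """CRPS of N(mu, sigma^2) at observation y."""
    z = (y - mu) / sigma
    return sigma * (z * (2 * norm_cdf(z) - 1) + 2 * norm_pdf(z) - 1 / math.sqrt(math.pi))


def crps_lognormal(y, mu, sigma):
    """CRPS of LogNormal(mu, sigma) at observation y > 0."""
    z = (math.log(y) - mu) / sigma
    return y * (2 * norm_cdf(z) - 1) - 2 * math.exp(mu + 0.5 * sigma**2) * (
        norm_cdf(z - sigma) + norm_cdf(sigma / math.sqrt(2)) - 1
    )


def nll_normal(y, mu, sigma):
    """Negative log-likelihood of N(mu, sigma^2) at y."""
    return 0.5 * math.log(2 * math.pi * sigma**2) + 0.5 * ((y - mu) / sigma) ** 2


def nll_lognormal(y, mu, sigma):
    """Negative log-likelihood of LogNormal(mu, sigma) at y > 0.

    The log-normal density is the normal density of log(y) divided by y.
    """
    return nll_normal(math.log(y), mu, sigma) + math.log(y)
```

Both scores are negatively oriented (smaller is better), which is why the NLL carries a minus sign relative to the log-density.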

## 8. **Compute Per-Interval Statistics**
- For each of the 96 intervals:
  - Mean, min, and max CRPS (Gaussian and log-normal) are computed.
  - Mean, min, and max NLL are computed.
  - Results are stored in dictionaries.
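
The per-interval grouping relies on stride slicing, which can be illustrated on a toy list. The scores below are made-up numbers and the names are hypothetical. The slice `scores[i::96]` only lines up with interval `i + 1` if the series starts at the first slot of a day and has no missing timestamps, so the sketch also shows grouping by an explicit index as a cross-check.

```python
from statistics import mean

N_INTERVALS = 96

# Hypothetical per-sample scores for three complete days.
scores = [float(k) for k in range(3 * N_INTERVALS)]

# scores[i::96] picks every 96th sample starting at offset i,
# i.e. the same 15-minute slot on each day.
sub_arrays = [scores[i::N_INTERVALS] for i in range(N_INTERVALS)]

stats = {
    "mean": [mean(arr) for arr in sub_arrays],
    "min": [min(arr) for arr in sub_arrays],
    "max": [max(arr) for arr in sub_arrays],
}

# Cross-check: group by an explicit interval index instead of by position.
interval_index = [k % N_INTERVALS + 1 for k in range(len(scores))]
grouped = {}
for idx, score in zip(interval_index, scores):
    grouped.setdefault(idx, []).append(score)
```

If rows were dropped from the validation set, the explicit grouping would still be correct while the stride slices would silently mix intervals.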

## 9. **Store Results in DataFrames**
- **`results_per_time_interval_df`**: Stores per-interval CRPS, NLL, and model scores.
- **`results_summary_stats_df`**: Stores overall summary statistics (means, min, max for CRPS, NLL, and model scores).
- **`results_per_row_df`**: Stores per-sample CRPS and NLL values.
- **`hyperparameters_df`**: Stores model hyperparameters and dataset details.

## 10. **Save Results to Excel**
- An Excel file is created with multiple sheets:
  - `Interval_Scores` (per-interval stats)
  - `Summary_Scores` (aggregated stats)
  - `Detailed_Scores` (per-row CRPS and NLL values)
  - `Hyperparameters` (model settings)

## 11. **Return DataFrames for Display**
- The function returns and displays the four main result DataFrames.

In [1]:
from analysis.datasets import load_entsoe
#from analysis.splits_old import to_train_validation_test_data
from analysis.splits import to_train_validation_test_data

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from ngboost import NGBRegressor
from sklearn.metrics import mean_squared_error
from ngboost.scores import LogScore, CRPScore
from ngboost.distns import Normal
from ngboost.distns import LogNormal
from ngboost.distns.normal import NormalCRPScore

from scipy.stats import lognorm
from scipy.integrate import quad
from scipy.stats.distributions import norm
from scipy import stats
import openpyxl
from openpyxl.drawing.image import Image
import glob

from pathlib import Path

# Manually

In [2]:
entsoe = load_entsoe()
feature_columns = ['ws_100m_loc_mean', 'ws_10m_loc_mean']
target_column = ['power']
dist=LogNormal
loss_function=CRPScore
n_estimators=3
learning_rate=0.03
random_state=42
#output_file='model_evaluation.xlsx'

Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)


In [3]:
# Scaling power data
max_power_value = entsoe[target_column].max()
print("max power:", max_power_value)
max_power_value_rounded = np.ceil(max_power_value / 1000) * 1000
epsilon = 1e-9
entsoe[target_column] = entsoe[target_column] / max_power_value_rounded + epsilon
entsoe['power_t-96'] = entsoe[target_column].shift(96)
entsoe['interval_index'] = ((entsoe.index.hour * 60 + entsoe.index.minute) // 15) + 1
entsoe.dropna(inplace=True)

max power: power    16676.0
dtype: float64


In [5]:
X_train = train[feature_columns]
y_train = train[target_column]

In [6]:
X_validation, y_validation = validation[feature_columns], validation[target_column]

In [7]:
# Train the model with the chosen loss function
model = NGBRegressor(Dist=dist, Score=loss_function, n_estimators=n_estimators, learning_rate=learning_rate, random_state=random_state, verbose = True, verbose_eval = True)

model.fit(X_train, y_train.squeeze(), X_val=X_validation, Y_val=y_validation.squeeze())
#print("model.fit ended")



[iter 0] loss=0.6450 val_loss=0.8581 scale=1.0000 norm=1.1166
[iter 1] loss=0.6284 val_loss=0.8343 scale=1.0000 norm=1.0961
[iter 2] loss=0.6126 val_loss=0.8121 scale=1.0000 norm=1.0779


In [8]:
# Split validation data into 96 intervals
X_validation_sub_arrays = [X_validation[i::96] for i in range(96)]
y_validation_sub_arrays = [y_validation[i::96] for i in range(96)]

#print("X_validation_sub_arrays[0] = ", display(X_validation_sub_arrays[0]))

In [9]:
model_scores_intervals = [model.score(np.array(X_validation_sub_arrays[i]), np.array(y_validation_sub_arrays[i])) for i in range(96)]
model_scores_overall = model.score(np.array(X_validation), np.array(y_validation))

y_val_pred = model.predict(X_validation)
y_val_dists = model.pred_dist(X_validation)

print("y_val_pred[0]", y_val_pred[0])

y_val_pred[0] 0.30951383057142146




In [10]:
crps_gaussian = [] # Gaussian formula with log(y) used for crps rather than the correct log normal crps formula by NGBoost
crps_log_gaussian = []
nll = []


for i in range(len(y_val_pred)):
    y = y_validation.iloc[i, 0]
    sigma = y_val_dists[i].scale
    mu = y_val_dists[i].loc
    ylog = np.log(y)
    z = (ylog - mu) / sigma

    # Gaussian CRPS evaluated on log(y)
    crps_i = sigma * (
        z * (2 * norm.cdf(z) - 1)
        + 2 * norm.pdf(z)
        - 1 / np.sqrt(np.pi)
    )

    # Closed-form CRPS of the log-normal distribution
    crps_full_i = y * (2 * norm.cdf(z) - 1) - 2 * np.exp(mu + 0.5 * sigma**2) * (
        norm.cdf(z - sigma) + norm.cdf(sigma / np.sqrt(2)) - 1)

    # Negative log-likelihood: note the minus sign in front of the log-density
    nll_i = -lognorm.logpdf(y, s=sigma, scale=np.exp(mu))

    crps_log_gaussian.append(crps_full_i)
    crps_gaussian.append(crps_i)
    nll.append(nll_i)

# Group the per-sample scores by interval once, after the loop has finished
nll_sub_arrays = [nll[i::96] for i in range(96)]
crps_gaussian_sub_arrays = [crps_gaussian[i::96] for i in range(96)]
crps_lognormal_sub_arrays = [crps_log_gaussian[i::96] for i in range(96)]



#display('CRPS_gaussian shape:', len(crps_gaussian_sub_arrays))
#display('CRPS_lognormal shape', len(crps_lognormal_sub_arrays))
#display('NLL shape:', len(nll_sub_arrays))
#display('model_scores shape', len(model_scores_intervals))

#display('CRPS_gaussian shape', len(crps_gaussian))
#display('CRPS_lognormal shape', len(crps_log_gaussian))
#display('NLL shape', len(nll))
#display('model_scores shape', model_scores_overall)


#with pd.ExcelWriter(output_file) as writer:
    #results_df.to_excel(writer, sheet_name='Interval_Scores', index=False)
    #overall_df.to_excel(writer, sheet_name='Overall_Scores', index=False)



In [11]:
# Compute statistics per interval

crps_gaussian_min_per_interval = [np.min(crps_gaussian_sub_arrays[i]) for i in range(96)]
crps_gaussian_max_per_interval = [np.max(crps_gaussian_sub_arrays[i]) for i in range(96)]
crps_gaussian_mean_per_interval = [np.mean(crps_gaussian_sub_arrays[i]) for i in range(96)]

nll_min_per_interval = [np.min(nll_sub_arrays[i]) for i in range(96)]
nll_max_per_interval = [np.max(nll_sub_arrays[i]) for i in range(96)]
nll_mean_per_interval = [np.mean(nll_sub_arrays[i]) for i in range(96)]


crps_lognormal_min_per_interval = [np.min(crps_lognormal_sub_arrays[i]) for i in range(96)]
crps_lognormal_max_per_interval = [np.max(crps_lognormal_sub_arrays[i]) for i in range(96)]
crps_lognormal_mean_per_interval = [np.mean(crps_lognormal_sub_arrays[i]) for i in range(96)]

# Collect the results in DataFrames
results_per_time_interval_df = pd.DataFrame({
    'Interval': list(range(1, 97)),
    'CRPS_gaussian_mean': crps_gaussian_mean_per_interval,
    'CRPS_gaussian_min': crps_gaussian_min_per_interval,
    'CRPS_gaussian_max': crps_gaussian_max_per_interval,

    'CRPS_lognormal_mean': crps_lognormal_mean_per_interval,
    'CRPS_lognormal_min': crps_lognormal_min_per_interval,
    'CRPS_lognormal_max': crps_lognormal_max_per_interval,

    'NLL_mean': nll_mean_per_interval,
    'NLL_min': nll_min_per_interval,
    'NLL_max': nll_max_per_interval,
    'model_scores': model_scores_intervals
})

results_summary_stats_df = pd.DataFrame({
    'Interval': list(range(1, 2)),
    'CRPS_gaussian_mean': np.mean(crps_gaussian_mean_per_interval),
    'CRPS_gaussian_min': np.min(crps_gaussian_min_per_interval),
    'CRPS_gaussian_max': np.max(crps_gaussian_max_per_interval),

    'CRPS_lognormal_mean': np.mean(crps_lognormal_mean_per_interval),
    'CRPS_lognormal_min': np.min(crps_lognormal_min_per_interval),
    'CRPS_lognormal_max': np.max(crps_lognormal_max_per_interval),

    'NLL_mean': np.mean(nll_mean_per_interval),
    'NLL_min': np.min(nll_min_per_interval),
    'NLL_max': np.max(nll_max_per_interval),

    # One summary row: use the overall score. Passing the 96 per-interval scores here
    # raises "ValueError: All arrays must be of the same length".
    'model_scores': model_scores_overall
})

results_per_row_df = pd.DataFrame({
    'Entry_no': list(range(1, y_validation.shape[0] + 1)),
    'CRPS_gaussian': crps_gaussian,
    'CRPS_lognormal': crps_log_gaussian,
    'NLL': nll,
})

hyperparameters_df = pd.DataFrame({
    'dataset': 'entsoe',
    'feature_columns': [feature_columns],
    'distribution': str(dist),          # store the class name as text so it can be written to Excel
    'loss_function': str(loss_function),
    'iterations': n_estimators,
    'learning_rate': learning_rate,
    'random_state': random_state
})

In [82]:
with pd.ExcelWriter('../data/dummy.xlsx') as writer:
    results_per_time_interval_df.to_excel(writer, sheet_name='Interval_Scores', index=False)
    results_summary_stats_df.to_excel(writer, sheet_name='Summary_Scores', index=False)
    results_per_row_df.to_excel(writer, sheet_name='Detailed_Scores', index=False)
    hyperparameters_df.to_excel(writer, sheet_name='Hyperparameters', index=False)

In [90]:
y_validation.iloc[i,0]

0.8262941186470588

# Function (automatic)

    """
    case 1: feature_columns = ['power_t-96'], loss: CRPS
    case 2: feature_columns = ['power_t-96'], loss: NLL
    case 3: feature_columns = ['ws_10m_loc_mean', 'ws_100m_loc_mean'], loss: CRPS
    case 4: feature_columns = ['ws_10m_loc_mean', 'ws_100m_loc_mean'], loss: NLL
    case 5: feature_columns = ['power_t-96', 'ws_10m_loc_mean', 'ws_100m_loc_mean'], loss: CRPS
    case 6: feature_columns = ['power_t-96', 'ws_10m_loc_mean', 'ws_100m_loc_mean'], loss: NLL
    case 7: feature_columns = ['power_t-96', 'ws_10m_loc_1', ..., 'ws_10m_loc_10', 'ws_100m_loc_1', ..., 'ws_100m_loc_10'], loss: CRPS
    case 8: feature_columns = ['power_t-96', 'ws_10m_loc_1', ..., 'ws_10m_loc_10', 'ws_100m_loc_1', ..., 'ws_100m_loc_10'], loss: NLL
    case 9: feature_columns = ['power_t-96', 'ws_10m_loc_mean', 'ws_100m_loc_mean', 'ws_10m_loc_1', ..., 'ws_10m_loc_10', 'ws_100m_loc_1', ..., 
    'ws_100m_loc_10'], loss: CRPS
    case 10: feature_columns = ['power_t-96', 'ws_10m_loc_mean', 'ws_100m_loc_mean', 'ws_10m_loc_1', ..., 'ws_10m_loc_10', 'ws_100m_loc_1', ..., 
    'ws_100m_loc_10'], loss: NLL
    case 11: feature_columns = ['power_t-96', 'ws_10m_loc_mean', 'ws_100m_loc_mean', 'ws_10m_loc_1', ..., 'ws_10m_loc_10', 'ws_100m_loc_1', ..., 
    'ws_100m_loc_10', 'interval_index'], loss: CRPS
    case 12: feature_columns = ['power_t-96', 'ws_10m_loc_mean', 'ws_100m_loc_mean', 'ws_10m_loc_1', ..., 'ws_10m_loc_10', 'ws_100m_loc_1', ..., 
    'ws_100m_loc_10', 'interval_index'], loss: NLL
    case 13:...
    case 14...
    case 15...
    case 16...

    """

In [7]:
import os
import warnings


def evaluate_ngboost_model(
              entsoe, 
              target_column='power', 
              dist=Normal, 
              case=1, 
              n_estimators=100, 
              learning_rate=0.03, 
              random_state=42, 
              output_file='../results/NGBoost/',
              train_start = "2016-01-01",
              train_end = "2022-12-31",
              validation_start = "2023-01-01",
              validation_end = "2023-12-31"
    ):
    
    print("passed output file path:", output_file)
    
    if train_start == "2022-10-01" and train_end == "2022-12-31":
           output_file=f"{output_file}q4_train/"

    if train_start == "2016-01-01" and train_end == "2022-12-31":
           output_file=f"{output_file}full_year/"
           print("output_file in dist=normal", output_file)

    #if dist == Normal:
           #output_file=f"{output_file}Normal_dist/"
           #print("output_file in dist=normal", output_file)


    #else:
    #       output_file='../results/NGBoost/Lognormal_dist/'

    # Scale power data
    max_power_value = entsoe[target_column].max()
    max_power_value_rounded = np.ceil(max_power_value / 1000) * 1000
    #epsilon = 1e-9
    epsilon = 1e-5
    entsoe[target_column] = np.log(entsoe[target_column] / max_power_value_rounded + epsilon)
    entsoe['power_t-96'] = entsoe[target_column].shift(96)
    entsoe['interval_index'] = ((entsoe.index.hour * 60 + entsoe.index.minute) // 15) + 1
    entsoe.dropna(inplace=True)

    # Train-test split
    train_X, train_y, validation_X, validation_y, test_X, test_y = to_train_validation_test_data(entsoe, train_start, train_end, validation_start, validation_end)

    #display(validation_y)
    #train, validation, test = to_train_validation_test_data(entsoe, "2022-12-31 23:45:00", "2023-12-31 23:45:00")
    
    output_dir = output_file
    print("Creating output_dir:", output_dir)

    if case == 1:
            feature_columns = ['power_t-96']
            loss_function = CRPScore
            output_file = f'{output_file}case{case}.xlsx'
            print(f"output file case = {case}: {output_file}")
            feature_abbr = "P"
    
    elif case == 2:
            feature_columns = ['power_t-96']
            loss_function = LogScore
            output_file = f'{output_file}case{case}.xlsx'
            feature_abbr = "P"


    elif case == 3:
            feature_columns = ['ws_10m_loc_mean', 'ws_100m_loc_mean']
            loss_function = CRPScore
            output_file = f'{output_file}case{case}.xlsx'
            feature_abbr = "ws_mean"

    
    elif case == 4:
            feature_columns = ['ws_10m_loc_mean', 'ws_100m_loc_mean']
            loss_function = LogScore
            output_file = f'{output_file}case{case}.xlsx'
            feature_abbr = "ws_mean"
    
    elif case == 5:
            feature_columns = ['power_t-96', 'ws_10m_loc_mean', 'ws_100m_loc_mean']
            loss_function = CRPScore
            output_file = f'{output_file}case{case}.xlsx'
            feature_abbr = "p, ws_mean"


    elif case == 6:
            feature_columns = ['power_t-96', 'ws_10m_loc_mean', 'ws_100m_loc_mean']
            loss_function = LogScore
            output_file = f'{output_file}case{case}.xlsx'
            feature_abbr = "p, ws_mean"


    elif case == 7:
            feature_columns = ['power_t-96', 'ws_10m_loc_1', 'ws_10m_loc_2', 'ws_10m_loc_3', 'ws_10m_loc_4', 'ws_10m_loc_5', 'ws_10m_loc_6',
                               'ws_10m_loc_7', 'ws_10m_loc_8', 'ws_10m_loc_9', 'ws_10m_loc_10',
                               'ws_100m_loc_1', 'ws_100m_loc_2', 'ws_100m_loc_3', 'ws_100m_loc_4', 'ws_100m_loc_5', 'ws_100m_loc_6',
                               'ws_100m_loc_7', 'ws_100m_loc_8', 'ws_100m_loc_9', 'ws_100m_loc_10']
            loss_function = CRPScore
            output_file = f'{output_file}case{case}.xlsx'
            feature_abbr = "p, ws_10_loc"



    elif case == 8:
            feature_columns = ['power_t-96', 'ws_10m_loc_1', 'ws_10m_loc_2', 'ws_10m_loc_3', 'ws_10m_loc_4', 'ws_10m_loc_5', 'ws_10m_loc_6',
                               'ws_10m_loc_7', 'ws_10m_loc_8', 'ws_10m_loc_9', 'ws_10m_loc_10',
                               'ws_100m_loc_1', 'ws_100m_loc_2', 'ws_100m_loc_3', 'ws_100m_loc_4', 'ws_100m_loc_5', 'ws_100m_loc_6',
                               'ws_100m_loc_7', 'ws_100m_loc_8', 'ws_100m_loc_9', 'ws_100m_loc_10']
            loss_function = LogScore
            output_file = f'{output_file}case{case}.xlsx'
            feature_abbr = "p, ws_10_loc"


    elif case == 9:
            feature_columns = ['power_t-96', 'ws_10m_loc_mean', 'ws_100m_loc_mean',
                               'ws_10m_loc_1', 'ws_10m_loc_2', 'ws_10m_loc_3', 'ws_10m_loc_4', 'ws_10m_loc_5', 'ws_10m_loc_6',
                               'ws_10m_loc_7', 'ws_10m_loc_8', 'ws_10m_loc_9', 'ws_10m_loc_10',
                               'ws_100m_loc_1', 'ws_100m_loc_2', 'ws_100m_loc_3', 'ws_100m_loc_4', 'ws_100m_loc_5', 'ws_100m_loc_6',
                               'ws_100m_loc_7', 'ws_100m_loc_8', 'ws_100m_loc_9', 'ws_100m_loc_10']
            loss_function = CRPScore
            output_file = f'{output_file}case{case}.xlsx'
            feature_abbr = "p, ws_mean, ws_10_loc"


    elif case == 10:
            feature_columns = ['power_t-96', 'ws_10m_loc_mean', 'ws_100m_loc_mean',
                               'ws_10m_loc_1', 'ws_10m_loc_2', 'ws_10m_loc_3', 'ws_10m_loc_4', 'ws_10m_loc_5', 'ws_10m_loc_6',
                               'ws_10m_loc_7', 'ws_10m_loc_8', 'ws_10m_loc_9', 'ws_10m_loc_10',
                               'ws_100m_loc_1', 'ws_100m_loc_2', 'ws_100m_loc_3', 'ws_100m_loc_4', 'ws_100m_loc_5', 'ws_100m_loc_6',
                               'ws_100m_loc_7', 'ws_100m_loc_8', 'ws_100m_loc_9', 'ws_100m_loc_10']
            loss_function = LogScore
            output_file = f'{output_file}case{case}.xlsx'
            feature_abbr = "p, ws_mean, ws_10_loc"

    
    elif case == 11:
            feature_columns = ['power_t-96', 'ws_10m_loc_mean', 'ws_100m_loc_mean',
                               'ws_10m_loc_1', 'ws_10m_loc_2', 'ws_10m_loc_3', 'ws_10m_loc_4', 'ws_10m_loc_5', 'ws_10m_loc_6',
                               'ws_10m_loc_7', 'ws_10m_loc_8', 'ws_10m_loc_9', 'ws_10m_loc_10',
                               'ws_100m_loc_1', 'ws_100m_loc_2', 'ws_100m_loc_3', 'ws_100m_loc_4', 'ws_100m_loc_5', 'ws_100m_loc_6',
                               'ws_100m_loc_7', 'ws_100m_loc_8', 'ws_100m_loc_9', 'ws_100m_loc_10', 'interval_index']
            loss_function = CRPScore
            output_file = f'{output_file}case{case}.xlsx'
            feature_abbr = "p, ws_mean, ws_10_loc, t_index"

    elif case == 12:
            feature_columns = ['power_t-96', 'ws_10m_loc_mean', 'ws_100m_loc_mean',
                               'ws_10m_loc_1', 'ws_10m_loc_2', 'ws_10m_loc_3', 'ws_10m_loc_4', 'ws_10m_loc_5', 'ws_10m_loc_6',
                               'ws_10m_loc_7', 'ws_10m_loc_8', 'ws_10m_loc_9', 'ws_10m_loc_10',
                               'ws_100m_loc_1', 'ws_100m_loc_2', 'ws_100m_loc_3', 'ws_100m_loc_4', 'ws_100m_loc_5', 'ws_100m_loc_6',
                               'ws_100m_loc_7', 'ws_100m_loc_8', 'ws_100m_loc_9', 'ws_100m_loc_10', 'interval_index']
            loss_function = LogScore
            output_file = f'{output_file}case{case}.xlsx'
            feature_abbr = "p, ws_mean, ws_10_loc, t_index"

    elif case == 13:
            feature_columns = ['power_t-96', 'interval_index']
            loss_function = CRPScore
            output_file = f'{output_file}case{case}.xlsx'
            feature_abbr = "p, t_index"

    elif case == 14:
            feature_columns = ['power_t-96', 'interval_index']
            loss_function = LogScore
            output_file = f'{output_file}case{case}.xlsx'
            feature_abbr = "p, t_index"

    elif case == 15:
            feature_columns = ['ws_10m_loc_mean', 'ws_100m_loc_mean', 'interval_index']
            loss_function = CRPScore
            output_file = f'{output_file}case{case}.xlsx'
            feature_abbr = "ws_mean, t_index"

    elif case == 16:
            feature_columns = ['ws_10m_loc_mean', 'ws_100m_loc_mean', 'interval_index']
            loss_function = LogScore
            output_file = f'{output_file}case{case}.xlsx'
            feature_abbr = "ws_mean, t_index"



    #X_train, y_train = train_X[feature_columns], train_y[target_column]
    #X_validation, y_validation = validation_X[feature_columns], validation_y[target_column]
   
    X_train, y_train = train_X[feature_columns], train_y
    X_validation, y_validation = validation_X[feature_columns], validation_y

    with warnings.catch_warnings():
       warnings.simplefilter("ignore", category=FutureWarning)
       # Train model
       model = NGBRegressor(
                Dist=dist, Score=loss_function, 
                n_estimators=n_estimators, learning_rate=learning_rate, 
                random_state=random_state, verbose=True, verbose_eval=True
        )
       model.fit(X_train, y_train.squeeze(), X_val=X_validation, Y_val=y_validation.squeeze())

    # Split validation data into 96 intervals
    X_validation_sub_arrays = [X_validation[i::96] for i in range(96)]
    y_validation_sub_arrays = [y_validation[i::96] for i in range(96)]

    model_scores_intervals = [model.score(np.array(X_validation_sub_arrays[i]), np.array(y_validation_sub_arrays[i])) for i in range(96)]
    model_scores_overall = model.score(np.array(X_validation), np.array(y_validation))

    y_val_pred = model.predict(X_validation)
    y_val_dists = model.pred_dist(X_validation)


    # Compute predictions
    #y_val_pred = model.predict(X_validation)
    #y_val_dists = model.pred_dist(X_validation)

    # Compute CRPS and NLL per sample
    crps_gaussian, crps_log_gaussian, nll, pit_values = [], [], [], []

    for i in range(len(y_val_pred)):
        y = y_validation.iloc[i]
        sigma, mu = y_val_dists[i].scale, y_val_dists[i].loc

        if dist == Normal:
               pit_value = norm.cdf(y, scale=sigma, loc=mu) # Note: loc = mean, scale = standard deviation (scipy)
               z = (y - mu) / sigma
               crps_gaussian.append(
                      sigma * (z * (2 * norm.cdf(z) - 1) + 2 * norm.pdf(z) - 1 / np.sqrt(np.pi)))
               
               crps_log_gaussian.append(0)

               nll.append(-norm.logpdf(y, scale=sigma, loc=mu))
        
        # NGBoost uses the CRPS formula of the Normal distribution with y -> Ln(y) rather than the correct CRPS formula for the LogNormal distribution
        # If dist == Normal only crps_gaussian is to be used.
        # If dist == LogNormal then CRPS log_gaussian is the correct formula. CRPS_Gaussian is calculated to double check that this is the score that NGBoost returns
        else:
               pit_value = lognorm.cdf(y, s=sigma, scale=np.exp(mu)) # Note: s = sigma and scale = exp(mu) (scipy)
               ylog = np.log(y)
               z = (ylog - mu) / sigma
               crps_gaussian.append(sigma * (z * (2 * norm.cdf(z) - 1) + 2 * norm.pdf(z) - 1 / np.sqrt(np.pi))
                                    )
               crps_log_gaussian.append(
                      y * (2 * norm.cdf(z) - 1) - 2 * np.exp(mu + 0.5 * sigma**2) * (norm.cdf(z - sigma) + norm.cdf(sigma/np.sqrt(2)) - 1)
                      )
               nll.append(-lognorm.logpdf(y, s=sigma, scale=np.exp(mu)))


        pit_values.append(pit_value)   


    # Compute per-interval statistics
    crps_gaussian_sub_arrays = [crps_gaussian[i::96] for i in range(96)]

    nll_sub_arrays = [nll[i::96] for i in range(96)]
    pit_sub_arrays = [pit_values[i::96] for i in range(96)]
    
    crps_lognormal_sub_arrays = [crps_log_gaussian[i::96] for i in range(96)]

    crps_lognormal_stats = {
           'mean': [np.mean(arr) for arr in crps_lognormal_sub_arrays],
           'min': [np.min(arr) for arr in crps_lognormal_sub_arrays],
           'max': [np.max(arr) for arr in crps_lognormal_sub_arrays]
    }

    crps_gaussian_stats = {
        'mean': [np.mean(arr) for arr in crps_gaussian_sub_arrays],
        'min': [np.min(arr) for arr in crps_gaussian_sub_arrays],
        'max': [np.max(arr) for arr in crps_gaussian_sub_arrays]
    }

    nll_stats = {
        'mean': [np.mean(arr) for arr in nll_sub_arrays],
        'min': [np.min(arr) for arr in nll_sub_arrays],
        'max': [np.max(arr) for arr in nll_sub_arrays]
    }

    # Calculates deciles per time interval
    deciles = []
    
    for i in range(0, 96):
        pit_a = pit_sub_arrays[i]
        bin_edges = np.arange(0, 1.1, 0.1)  # Creating bin edges from 0 to 1 with a step of 0.1
        decile, bins = np.histogram(pit_a, bins=bin_edges, density=True)
        #decile, bin_edges = np.histogram(pit_a, bins=10, density=True)
        deciles.append(decile)

    bin_edges = np.arange(0, 1.1, 0.1)  # Creating bin edges from 0 to 1 with a step of 0.1
    sum_deciles, bins = np.histogram(pit_values, bins=bin_edges, density=True)
    
    # Create DataFrames
    results_per_time_interval_df = pd.DataFrame({
       'Interval': list(range(1, 97)),
        **{f'CRPS_gaussian_{k}': v for k, v in crps_gaussian_stats.items()},
        **{f'CRPS_lognormal_{k}': v for k, v in crps_lognormal_stats.items()},
        **{f'NLL_{k}': v for k, v in nll_stats.items()},
        'model_scores': model_scores_intervals,
        'pit_values': deciles
        })

    results_summary_stats_df = pd.DataFrame({
        'CRPS_gaussian_mean': np.mean(crps_gaussian_stats['mean']),
        'CRPS_gaussian_min': np.min(crps_gaussian_stats['min']),
        'CRPS_gaussian_max': np.max(crps_gaussian_stats['max']),
        'CRPS_lognormal_mean': np.mean(crps_lognormal_stats['mean']),
        'CRPS_lognormal_min': np.min(crps_lognormal_stats['min']),
        'CRPS_lognormal_max': np.max(crps_lognormal_stats['max']),
        'NLL_mean': np.mean(nll_stats['mean']),
        'NLL_min': np.min(nll_stats['min']),
        'NLL_max': np.max(nll_stats['max']),
        'model_scores_mean': np.mean(model_scores_intervals),
        'pit_overall': [sum_deciles]

    }, index=[0])

    results_per_row_df = pd.DataFrame({
        'Entry_no': list(range(1, len(y_validation) + 1)),
        'CRPS_gaussian': crps_gaussian,
        'CRPS_lognormal': crps_log_gaussian,
        'NLL': nll
    })

    hyperparameters_df = pd.DataFrame({
        'dataset': 'entsoe',
        'feature_abbr': feature_abbr,
        'feature_columns': [feature_columns],
        'distribution': str(dist),
        'loss_function': str(loss_function),
        'iterations': n_estimators,
        'learning_rate': learning_rate,
        'random_state': random_state
    })

    # Save results to an Excel file
    with pd.ExcelWriter(output_file) as writer:
        results_per_time_interval_df.to_excel(writer, sheet_name='Interval_Scores', index=False)
        results_summary_stats_df.to_excel(writer, sheet_name='Summary_Scores', index=False)
        results_per_row_df.to_excel(writer, sheet_name='Detailed_Scores', index=False)
        hyperparameters_df.to_excel(writer, sheet_name='Hyperparameters', index=False)
        
        
        #output_dir = os.path.join(output_dir.rstrip('/'), '')

        # Collect every workbook in the output directory. The lookup is the same for both
        # training windows, so file_paths is always defined before the merge step.
        file_paths = list(Path(output_dir).glob("*.xlsx"))

        if train_start == "2022-10-01" and train_end == "2022-12-31":
               print("file_paths\n", file_paths)

        elif train_start == "2016-01-01" and train_end == "2022-12-31":
               print("output_file in dist=normal", output_file)


        #if dist == Normal:
        #       file_paths = list(Path(output_dir).glob("*.xlsx"))  # Use Path.glob to find files
        #       print("file_paths\n", file_paths)
        
        #else:
        #        file_paths = glob.glob(f"{output_file}*.xlsx")  # Update with the correct path


    # Step 1: Get all Excel files in a folder

    # Step 2: Check if there are exactly 16 Excel files
    #print(file_paths)
    if len(file_paths) == 16:
        merged_data = []

        # Step 3: Loop through each file and extract both sheets
        for file in file_paths:
                try:
                        # Read "Summary_Scores" sheet
                        df_scores = pd.read_excel(file, sheet_name="Summary_Scores")
                        df_scores["Source_File"] = file  # Optional: Track source file

                        # Read "Hyperparameters" sheet
                        df_hyperparams = pd.read_excel(file, sheet_name="Hyperparameters")
                        df_hyperparams["Source_File"] = file  # Optional: Track source file

                        # Combine the two dataframes horizontally (side by side)
                        combined_df = pd.concat([df_scores, df_hyperparams], axis=1)
                        merged_data.append(combined_df)

                except Exception as e:
                        print(f"Could not read {file}: {e}")

        # Step 4: Merge all data into one DataFrame
        final_merged_df = pd.concat(merged_data, ignore_index=True)

        # Step 5: Save to a new Excel file
        
        #final_merged_df.to_excel(f"{file_paths}/Merged_Sheet.xlsx", index=False)
        final_merged_df.to_excel(f"{output_dir}/Merged_Sheet.xlsx", index=False)

        print("Merge completed! The final file is 'Merged_Sheet.xlsx'.")

        plt.hist(bin_edges[:-1], bin_edges, weights=results_summary_stats_df['pit_overall'].values, edgecolor='black', alpha=0.7)
        plt.xlabel('Bin Edges')
        plt.ylabel('Density')
        plt.title('Histogram of Deciles')
        # Save the plot as an image (e.g., PNG format)
        image_filename = 'histogram.png'
        plt.savefig(image_filename)
        plt.close()
        
        # Load the existing Excel file (if it already exists)
        excel_file = f"{output_dir}/Merged_Sheet.xlsx"

        wb = openpyxl.load_workbook(excel_file)
        
        # Select the specific sheet where you want to insert the image
        new_sheet = wb.create_sheet('Histogram Sheet')
        
        # Load the image you saved earlier
        img = Image(image_filename)
        
        # Specify the location where you want the image to appear in the sheet (e.g., cell 'A1')
        new_sheet.add_image(img, 'A1')
        
        # Save the modified Excel file
        wb.save(excel_file)
        
    else:
           print(f"Expected 16 Excel files, but found {len(file_paths)} files. Skipping the merge step.")


    return results_per_time_interval_df, results_summary_stats_df, results_per_row_df, hyperparameters_df

## Results for FY Training, FY Validation, epsilon = 1e-5 (status: 11.04.2025)

In [9]:
for i in range(2, 17):
    entsoe = load_entsoe()
    evaluate_ngboost_model(
        entsoe=entsoe, 
        target_column="power", 
        dist=Normal, 
        output_file="C:/Users/Minu/Documents/NGboost/", 
        case=i,
        train_start="2016-01-01",
        train_end="2022-12-31",
        validation_start="2023-01-01",
        validation_end="2023-12-31",
        learning_rate=0.03,
        n_estimators=100,
        random_state=42
    )

passed output file path: C:/Users/Minu/Documents/NGboost/
output_file in dist=normal C:/Users/Minu/Documents/NGboost/full_year/
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
Creating output_dir: C:/Users/Minu/Documents/NGboost/full_year/
[iter 0] loss=1.5614 val_loss=1.6075 scale=1.0000 norm=1.1072
[iter 1] loss=1.5546 val_loss=1.6003 scale=1.0000 norm=1.0999
[iter 2] loss=1.5482 val_loss=1.5935 scale=1.0000 norm=1.0932
[iter 3] loss=1.5423 val_loss=1.5873 scale=1.0000 norm=1.0870
[iter 4] loss=1.5368 val_loss=1.5815 scale=1.0000 norm=1.0813
[iter 5] loss=1.5317 val_loss=1.5760 scale=1.0000 norm=1.0759
[iter 6] loss=1.5269 val_loss=1.5709 scale=1.0000 norm=1.0710
[iter 7] loss=1.5223 val_loss=1.5660 scale=1.0000 norm=1.0663
[iter 8] loss=1.5181 val_loss=1.5614 scale=1.0000 norm=1.0620
[iter 9] loss=1.5140 val_loss=1.5572 scale=1.0000 norm=1.0579
[iter 10] loss=1.5102 val_loss=1.5531 scale=1.0000 norm=1.05



output_file in dist=normal C:/Users/Minu/Documents/NGboost/full_year/case2.xlsx
Expected 16 Excel files, but found 2 files. Skipping the merge step.
passed output file path: C:/Users/Minu/Documents/NGboost/
output_file in dist=normal C:/Users/Minu/Documents/NGboost/full_year/
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
Creating output_dir: C:/Users/Minu/Documents/NGboost/full_year/
[iter 0] loss=0.6446 val_loss=0.6717 scale=1.0000 norm=1.1201
[iter 1] loss=0.6278 val_loss=0.6544 scale=1.0000 norm=1.0992
[iter 2] loss=0.6120 val_loss=0.6382 scale=1.0000 norm=1.0808
[iter 3] loss=0.5970 val_loss=0.6227 scale=1.0000 norm=1.0646
[iter 4] loss=0.5828 val_loss=0.6081 scale=1.0000 norm=1.0502
[iter 5] loss=0.5692 val_loss=0.5942 scale=1.0000 norm=1.0376
[iter 6] loss=0.5564 val_loss=0.5810 scale=1.0000 norm=1.0266
[iter 7] loss=0.5442 val_loss=0.5685 scale=1.0000 norm=1.0169
[iter 8] loss=0.5325 val_loss=0.556



output_file in dist=normal C:/Users/Minu/Documents/NGboost/full_year/case3.xlsx
Expected 16 Excel files, but found 3 files. Skipping the merge step.
passed output file path: C:/Users/Minu/Documents/NGboost/
output_file in dist=normal C:/Users/Minu/Documents/NGboost/full_year/
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
Creating output_dir: C:/Users/Minu/Documents/NGboost/full_year/
[iter 0] loss=1.5614 val_loss=1.5774 scale=1.0000 norm=1.1072
[iter 1] loss=1.5242 val_loss=1.5447 scale=1.0000 norm=1.0725
[iter 2] loss=1.4922 val_loss=1.5156 scale=1.0000 norm=1.0427
[iter 3] loss=1.4640 val_loss=1.4647 scale=2.0000 norm=2.0325
[iter 4] loss=1.4145 val_loss=1.4213 scale=2.0000 norm=1.9396
[iter 5] loss=1.3724 val_loss=1.3833 scale=2.0000 norm=1.8612
[iter 6] loss=1.3355 val_loss=1.3489 scale=2.0000 norm=1.7936
[iter 7] loss=1.3020 val_loss=1.3176 scale=2.0000 norm=1.7337
[iter 8] loss=1.2711 val_loss=1.288



output_file in dist=normal C:/Users/Minu/Documents/NGboost/full_year/case4.xlsx
Expected 16 Excel files, but found 4 files. Skipping the merge step.
passed output file path: C:/Users/Minu/Documents/NGboost/
output_file in dist=normal C:/Users/Minu/Documents/NGboost/full_year/
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
Creating output_dir: C:/Users/Minu/Documents/NGboost/full_year/
[iter 0] loss=0.6446 val_loss=0.6717 scale=1.0000 norm=1.1201
[iter 1] loss=0.6278 val_loss=0.6544 scale=1.0000 norm=1.0992
[iter 2] loss=0.6120 val_loss=0.6381 scale=1.0000 norm=1.0808
[iter 3] loss=0.5970 val_loss=0.6227 scale=1.0000 norm=1.0646
[iter 4] loss=0.5827 val_loss=0.6080 scale=1.0000 norm=1.0502
[iter 5] loss=0.5692 val_loss=0.5941 scale=1.0000 norm=1.0376
[iter 6] loss=0.5564 val_loss=0.5810 scale=1.0000 norm=1.0265
[iter 7] loss=0.5441 val_loss=0.5684 scale=1.0000 norm=1.0169
[iter 8] loss=0.5325 val_loss=0.556



output_file in dist=normal C:/Users/Minu/Documents/NGboost/full_year/case5.xlsx
Expected 16 Excel files, but found 5 files. Skipping the merge step.
passed output file path: C:/Users/Minu/Documents/NGboost/
output_file in dist=normal C:/Users/Minu/Documents/NGboost/full_year/
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
Creating output_dir: C:/Users/Minu/Documents/NGboost/full_year/
[iter 0] loss=1.5614 val_loss=1.5774 scale=1.0000 norm=1.1072
[iter 1] loss=1.5238 val_loss=1.5445 scale=1.0000 norm=1.0722
[iter 2] loss=1.4916 val_loss=1.5152 scale=1.0000 norm=1.0422
[iter 3] loss=1.4633 val_loss=1.4640 scale=2.0000 norm=2.0315
[iter 4] loss=1.4136 val_loss=1.4203 scale=2.0000 norm=1.9386
[iter 5] loss=1.3716 val_loss=1.3824 scale=2.0000 norm=1.8603
[iter 6] loss=1.3346 val_loss=1.3478 scale=2.0000 norm=1.7926
[iter 7] loss=1.3011 val_loss=1.3164 scale=2.0000 norm=1.7328
[iter 8] loss=1.2702 val_loss=1.286



output_file in dist=normal C:/Users/Minu/Documents/NGboost/full_year/case6.xlsx
Expected 16 Excel files, but found 6 files. Skipping the merge step.
passed output file path: C:/Users/Minu/Documents/NGboost/
output_file in dist=normal C:/Users/Minu/Documents/NGboost/full_year/
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
Creating output_dir: C:/Users/Minu/Documents/NGboost/full_year/
[iter 0] loss=0.6446 val_loss=0.6711 scale=1.0000 norm=1.1201
[iter 1] loss=0.6271 val_loss=0.6530 scale=1.0000 norm=1.0984
[iter 2] loss=0.6105 val_loss=0.6357 scale=1.0000 norm=1.0790
[iter 3] loss=0.5946 val_loss=0.6195 scale=1.0000 norm=1.0616
[iter 4] loss=0.5796 val_loss=0.6039 scale=1.0000 norm=1.0463
[iter 5] loss=0.5651 val_loss=0.5890 scale=1.0000 norm=1.0327
[iter 6] loss=0.5513 val_loss=0.5748 scale=1.0000 norm=1.0207
[iter 7] loss=0.5381 val_loss=0.5613 scale=1.0000 norm=1.0100
[iter 8] loss=0.5254 val_loss=0.548



output_file in dist=normal C:/Users/Minu/Documents/NGboost/full_year/case7.xlsx
Expected 16 Excel files, but found 7 files. Skipping the merge step.
passed output file path: C:/Users/Minu/Documents/NGboost/
output_file in dist=normal C:/Users/Minu/Documents/NGboost/full_year/
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
Creating output_dir: C:/Users/Minu/Documents/NGboost/full_year/
[iter 0] loss=1.5614 val_loss=1.5724 scale=1.0000 norm=1.1072
[iter 1] loss=1.5198 val_loss=1.5362 scale=1.0000 norm=1.0689
[iter 2] loss=1.4845 val_loss=1.5051 scale=1.0000 norm=1.0365
[iter 3] loss=1.4543 val_loss=1.4762 scale=1.0000 norm=1.0088
[iter 4] loss=1.4266 val_loss=1.4504 scale=1.0000 norm=0.9834
[iter 5] loss=1.4019 val_loss=1.4263 scale=1.0000 norm=0.9608
[iter 6] loss=1.3788 val_loss=1.4040 scale=1.0000 norm=0.9397
[iter 7] loss=1.3571 val_loss=1.3830 scale=1.0000 norm=0.9202
[iter 8] loss=1.3366 val_loss=1.362



output_file in dist=normal C:/Users/Minu/Documents/NGboost/full_year/case8.xlsx
Expected 16 Excel files, but found 8 files. Skipping the merge step.
passed output file path: C:/Users/Minu/Documents/NGboost/
output_file in dist=normal C:/Users/Minu/Documents/NGboost/full_year/
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
Creating output_dir: C:/Users/Minu/Documents/NGboost/full_year/
[iter 0] loss=0.6446 val_loss=0.6711 scale=1.0000 norm=1.1201
[iter 1] loss=0.6272 val_loss=0.6530 scale=1.0000 norm=1.0984
[iter 2] loss=0.6106 val_loss=0.6359 scale=1.0000 norm=1.0791
[iter 3] loss=0.5948 val_loss=0.6196 scale=1.0000 norm=1.0618
[iter 4] loss=0.5797 val_loss=0.6040 scale=1.0000 norm=1.0465
[iter 5] loss=0.5652 val_loss=0.5891 scale=1.0000 norm=1.0327
[iter 6] loss=0.5514 val_loss=0.5750 scale=1.0000 norm=1.0207
[iter 7] loss=0.5382 val_loss=0.5614 scale=1.0000 norm=1.0101
[iter 8] loss=0.5255 val_loss=0.548



output_file in dist=normal C:/Users/Minu/Documents/NGboost/full_year/case9.xlsx
Expected 16 Excel files, but found 9 files. Skipping the merge step.
passed output file path: C:/Users/Minu/Documents/NGboost/
output_file in dist=normal C:/Users/Minu/Documents/NGboost/full_year/
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
Creating output_dir: C:/Users/Minu/Documents/NGboost/full_year/
[iter 0] loss=1.5614 val_loss=1.5723 scale=1.0000 norm=1.1072
[iter 1] loss=1.5196 val_loss=1.5362 scale=1.0000 norm=1.0688
[iter 2] loss=1.4846 val_loss=1.5049 scale=1.0000 norm=1.0367
[iter 3] loss=1.4545 val_loss=1.4761 scale=1.0000 norm=1.0092
[iter 4] loss=1.4268 val_loss=1.4504 scale=1.0000 norm=0.9838
[iter 5] loss=1.4020 val_loss=1.4263 scale=1.0000 norm=0.9611
[iter 6] loss=1.3787 val_loss=1.4040 scale=1.0000 norm=0.9399
[iter 7] loss=1.3572 val_loss=1.3828 scale=1.0000 norm=0.9204
[iter 8] loss=1.3366 val_loss=1.363



output_file in dist=normal C:/Users/Minu/Documents/NGboost/full_year/case10.xlsx
Expected 16 Excel files, but found 10 files. Skipping the merge step.
passed output file path: C:/Users/Minu/Documents/NGboost/
output_file in dist=normal C:/Users/Minu/Documents/NGboost/full_year/
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
Creating output_dir: C:/Users/Minu/Documents/NGboost/full_year/
[iter 0] loss=0.6446 val_loss=0.6711 scale=1.0000 norm=1.1201
[iter 1] loss=0.6272 val_loss=0.6530 scale=1.0000 norm=1.0984
[iter 2] loss=0.6106 val_loss=0.6359 scale=1.0000 norm=1.0791
[iter 3] loss=0.5948 val_loss=0.6196 scale=1.0000 norm=1.0618
[iter 4] loss=0.5797 val_loss=0.6040 scale=1.0000 norm=1.0465
[iter 5] loss=0.5652 val_loss=0.5891 scale=1.0000 norm=1.0327
[iter 6] loss=0.5514 val_loss=0.5750 scale=1.0000 norm=1.0207
[iter 7] loss=0.5382 val_loss=0.5614 scale=1.0000 norm=1.0101
[iter 8] loss=0.5255 val_loss=0.5



output_file in dist=normal C:/Users/Minu/Documents/NGboost/full_year/case11.xlsx
Expected 16 Excel files, but found 11 files. Skipping the merge step.
passed output file path: C:/Users/Minu/Documents/NGboost/
output_file in dist=normal C:/Users/Minu/Documents/NGboost/full_year/
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
Creating output_dir: C:/Users/Minu/Documents/NGboost/full_year/
[iter 0] loss=1.5614 val_loss=1.5723 scale=1.0000 norm=1.1072
[iter 1] loss=1.5196 val_loss=1.5362 scale=1.0000 norm=1.0688
[iter 2] loss=1.4846 val_loss=1.5049 scale=1.0000 norm=1.0367
[iter 3] loss=1.4545 val_loss=1.4761 scale=1.0000 norm=1.0092
[iter 4] loss=1.4268 val_loss=1.4504 scale=1.0000 norm=0.9838
[iter 5] loss=1.4020 val_loss=1.4263 scale=1.0000 norm=0.9611
[iter 6] loss=1.3787 val_loss=1.4040 scale=1.0000 norm=0.9399
[iter 7] loss=1.3572 val_loss=1.3828 scale=1.0000 norm=0.9204
[iter 8] loss=1.3366 val_loss=1.3



output_file in dist=normal C:/Users/Minu/Documents/NGboost/full_year/case12.xlsx
Expected 16 Excel files, but found 12 files. Skipping the merge step.
passed output file path: C:/Users/Minu/Documents/NGboost/
output_file in dist=normal C:/Users/Minu/Documents/NGboost/full_year/
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
Creating output_dir: C:/Users/Minu/Documents/NGboost/full_year/
[iter 0] loss=0.6446 val_loss=0.6805 scale=2.0000 norm=2.2401
[iter 1] loss=0.6369 val_loss=0.6720 scale=2.0000 norm=2.2233
[iter 2] loss=0.6301 val_loss=0.6644 scale=2.0000 norm=2.2100
[iter 3] loss=0.6239 val_loss=0.6575 scale=2.0000 norm=2.1996
[iter 4] loss=0.6184 val_loss=0.6513 scale=2.0000 norm=2.1914
[iter 5] loss=0.6133 val_loss=0.6457 scale=2.0000 norm=2.1853
[iter 6] loss=0.6088 val_loss=0.6407 scale=2.0000 norm=2.1810
[iter 7] loss=0.6047 val_loss=0.6361 scale=2.0000 norm=2.1782
[iter 8] loss=0.6009 val_loss=0.6



output_file in dist=normal C:/Users/Minu/Documents/NGboost/full_year/case13.xlsx
Expected 16 Excel files, but found 13 files. Skipping the merge step.
passed output file path: C:/Users/Minu/Documents/NGboost/
output_file in dist=normal C:/Users/Minu/Documents/NGboost/full_year/
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
Creating output_dir: C:/Users/Minu/Documents/NGboost/full_year/
[iter 0] loss=1.5614 val_loss=1.6072 scale=1.0000 norm=1.1072
[iter 1] loss=1.5542 val_loss=1.5997 scale=1.0000 norm=1.0997
[iter 2] loss=1.5473 val_loss=1.5927 scale=1.0000 norm=1.0927
[iter 3] loss=1.5410 val_loss=1.5861 scale=1.0000 norm=1.0862
[iter 4] loss=1.5351 val_loss=1.5800 scale=1.0000 norm=1.0802
[iter 5] loss=1.5296 val_loss=1.5743 scale=1.0000 norm=1.0746
[iter 6] loss=1.5244 val_loss=1.5689 scale=1.0000 norm=1.0694
[iter 7] loss=1.5196 val_loss=1.5638 scale=1.0000 norm=1.0646
[iter 8] loss=1.5149 val_loss=1.5



output_file in dist=normal C:/Users/Minu/Documents/NGboost/full_year/case14.xlsx
Expected 16 Excel files, but found 14 files. Skipping the merge step.
passed output file path: C:/Users/Minu/Documents/NGboost/
output_file in dist=normal C:/Users/Minu/Documents/NGboost/full_year/
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
Creating output_dir: C:/Users/Minu/Documents/NGboost/full_year/
[iter 0] loss=0.6446 val_loss=0.6717 scale=1.0000 norm=1.1201
[iter 1] loss=0.6278 val_loss=0.6544 scale=1.0000 norm=1.0992
[iter 2] loss=0.6120 val_loss=0.6381 scale=1.0000 norm=1.0808
[iter 3] loss=0.5970 val_loss=0.6227 scale=1.0000 norm=1.0646
[iter 4] loss=0.5827 val_loss=0.6080 scale=1.0000 norm=1.0502
[iter 5] loss=0.5692 val_loss=0.5941 scale=1.0000 norm=1.0376
[iter 6] loss=0.5564 val_loss=0.5809 scale=1.0000 norm=1.0265
[iter 7] loss=0.5442 val_loss=0.5684 scale=1.0000 norm=1.0169
[iter 8] loss=0.5325 val_loss=0.5



output_file in dist=normal C:/Users/Minu/Documents/NGboost/full_year/case15.xlsx
Expected 16 Excel files, but found 15 files. Skipping the merge step.
passed output file path: C:/Users/Minu/Documents/NGboost/
output_file in dist=normal C:/Users/Minu/Documents/NGboost/full_year/
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
Creating output_dir: C:/Users/Minu/Documents/NGboost/full_year/
[iter 0] loss=1.5614 val_loss=1.5773 scale=1.0000 norm=1.1072
[iter 1] loss=1.5240 val_loss=1.5445 scale=1.0000 norm=1.0723
[iter 2] loss=1.4919 val_loss=1.5152 scale=1.0000 norm=1.0424
[iter 3] loss=1.4636 val_loss=1.4642 scale=2.0000 norm=2.0320
[iter 4] loss=1.4140 val_loss=1.4205 scale=2.0000 norm=1.9390
[iter 5] loss=1.3719 val_loss=1.3824 scale=2.0000 norm=1.8606
[iter 6] loss=1.3350 val_loss=1.3478 scale=2.0000 norm=1.7930
[iter 7] loss=1.3014 val_loss=1.3164 scale=2.0000 norm=1.7332
[iter 8] loss=1.2705 val_loss=1.2



output_file in dist=normal C:/Users/Minu/Documents/NGboost/full_year/case16.xlsx
Merge completed! The final file is 'Merged_Sheet.xlsx'.
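
The per-iteration lines in the output above follow a fixed pattern (`[iter N] loss=… val_loss=… scale=… norm=…`), so they can be collected into records for comparing cases. The sketch below is one possible way to do this; it assumes the four-decimal format seen in this log, and the helper name `parse_ngboost_log` is hypothetical. Lines that were cut off in the captured output do not match the full pattern and are skipped rather than parsed with a truncated value.

```python
import re

# Full-line pattern based on the log format shown in this notebook (assumed fixed)
LOG_LINE = re.compile(
    r"\[iter (?P<iter>\d+)\] "
    r"loss=(?P<loss>-?\d+\.\d{4}) "
    r"val_loss=(?P<val_loss>-?\d+\.\d{4}) "
    r"scale=(?P<scale>\d+\.\d{4}) "
    r"norm=(?P<norm>\d+\.\d{4})"
)


def parse_ngboost_log(text):
    """Return one dict per complete '[iter N] ...' line; skip everything else."""
    records = []
    for line in text.splitlines():
        match = LOG_LINE.fullmatch(line.strip())
        if match is None:
            continue  # non-log line or truncated log line
        records.append(
            {
                "iter": int(match["iter"]),
                "loss": float(match["loss"]),
                "val_loss": float(match["val_loss"]),
                "scale": float(match["scale"]),
                "norm": float(match["norm"]),
            }
        )
    return records


# Two complete lines and one truncated line copied from the output above
sample = """[iter 0] loss=1.5614 val_loss=1.6075 scale=1.0000 norm=1.1072
[iter 1] loss=1.5546 val_loss=1.6003 scale=1.0000 norm=1.0999
[iter 10] loss=1.5102 val_loss=1.5531 scale=1.0000 norm=1.05"""

records = parse_ngboost_log(sample)
best = min(records, key=lambda r: r["val_loss"])  # iteration with lowest validation loss
```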


In [107]:
entsoe = load_entsoe()
evaluate_ngboost_model(
    entsoe=entsoe, 
    target_column="power", 
    dist=Normal, 
    output_file="C:/Users/Minu/Documents/NGboost/", 
    case=1,
    learning_rate=0.03,
    n_estimators=100,
    random_state=42,
    train_start="2022-10-01",
    train_end="2022-12-31",
    validation_start="2023-01-01",
    validation_end="2023-12-31"
    )

passed output file path: C:/Users/Minu/Documents/NGboost/
# of training observations: 8832 | 2.80%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
Creating output_dir: C:/Users/Minu/Documents/NGboost/q4_train/
output file case = 1: C:/Users/Minu/Documents/NGboost/q4_train/case1.xlsx
[iter 0] loss=0.5706 val_loss=0.6721 scale=1.0000 norm=1.1580
[iter 1] loss=0.5666 val_loss=0.6675 scale=1.0000 norm=1.1527
[iter 2] loss=0.5629 val_loss=0.6632 scale=1.0000 norm=1.1479
[iter 3] loss=0.5594 val_loss=0.6593 scale=1.0000 norm=1.1437
[iter 4] loss=0.5561 val_loss=0.6557 scale=1.0000 norm=1.1400
[iter 5] loss=0.5530 val_loss=0.6522 scale=1.0000 norm=1.1368
[iter 6] loss=0.5501 val_loss=0.6490 scale=1.0000 norm=1.1340
[iter 7] loss=0.5473 val_loss=0.6460 scale=1.0000 norm=1.1316
[iter 8] loss=0.5447 val_loss=0.6432 scale=1.0000 norm=1.1295
[iter 9] loss=0.5422 val_loss=0.6404 scale=1.0000 norm=1.1278
[iter 10] loss=0.5399 val_loss=0.6379 scale=1.0000 norm=1.12

  self.scale = np.exp(params[1])
  self.var = self.scale**2


[iter 99] loss=0.4928 val_loss=0.6088 scale=0.0039 norm=0.0057




file_paths
 [WindowsPath('C:/Users/Minu/Documents/NGboost/q4_train/case1.xlsx'), WindowsPath('C:/Users/Minu/Documents/NGboost/q4_train/case10.xlsx'), WindowsPath('C:/Users/Minu/Documents/NGboost/q4_train/case11.xlsx'), WindowsPath('C:/Users/Minu/Documents/NGboost/q4_train/case12.xlsx'), WindowsPath('C:/Users/Minu/Documents/NGboost/q4_train/case13.xlsx'), WindowsPath('C:/Users/Minu/Documents/NGboost/q4_train/case14.xlsx'), WindowsPath('C:/Users/Minu/Documents/NGboost/q4_train/case15.xlsx'), WindowsPath('C:/Users/Minu/Documents/NGboost/q4_train/case16.xlsx'), WindowsPath('C:/Users/Minu/Documents/NGboost/q4_train/case2.xlsx'), WindowsPath('C:/Users/Minu/Documents/NGboost/q4_train/case3.xlsx'), WindowsPath('C:/Users/Minu/Documents/NGboost/q4_train/case4.xlsx'), WindowsPath('C:/Users/Minu/Documents/NGboost/q4_train/case5.xlsx'), WindowsPath('C:/Users/Minu/Documents/NGboost/q4_train/case6.xlsx'), WindowsPath('C:/Users/Minu/Documents/NGboost/q4_train/case7.xlsx'), WindowsPath('C:/Users/Minu/D
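
The `file_paths` listing above is in lexicographic order (`case1`, `case10`, `case11`, …, `case2`). If the merge step iterates over the files in this order, which is an assumption because that part of the function is not shown here, the rows of the merged sheet would follow the same order. A minimal sketch of a natural sort key that orders the files by case number instead is given below; `natural_key` is a hypothetical helper.

```python
import re
from pathlib import PurePath


def natural_key(path):
    """Split the file stem into text and integer parts so 'case2' sorts before 'case10'."""
    stem = PurePath(path).stem
    # re.split with a capturing group alternates text, digits, text, ...
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", stem)]


# File names as they appear in the listing above (directory omitted)
names = ["case1.xlsx", "case10.xlsx", "case11.xlsx", "case2.xlsx", "case3.xlsx"]
ordered = sorted(names, key=natural_key)
```

The same key can be passed to `sorted` for a list of `Path` objects, since only the file stem is inspected.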

(    Interval  CRPS_gaussian_mean  CRPS_gaussian_min  CRPS_gaussian_max  \
 0          1            0.526354           0.019125           3.438288   
 1          2            0.527628           0.019509           3.527394   
 2          3            0.529027           0.022135           3.706398   
 3          4            0.535547           0.026766           3.963583   
 4          5            0.533503           0.042718           4.107701   
 ..       ...                 ...                ...                ...   
 91        92            0.528508           0.023917           2.894188   
 92        93            0.524756           0.016621           2.914808   
 93        94            0.527796           0.016166           3.000210   
 94        95            0.529677           0.021006           3.116768   
 95        96            0.527597           0.018813           3.253602   
 
     CRPS_lognormal_mean  CRPS_lognormal_min  CRPS_lognormal_max     NLL_mean  \
 0               

In [None]:
for i in range(1, 17):
    entsoe = load_entsoe()
    evaluate_ngboost_model(
        entsoe=entsoe, 
        target_column="power", 
        dist=Normal, 
        output_file="C:/Users/Minu/Documents/NGboost/", 
        case=i,
        learning_rate=0.03,
        n_estimators=100,
        random_state=42,
        train_start="2022-10-01",
        train_end="2022-12-31",
        validation_start="2023-01-01",
        validation_end="2023-12-31"
        )

KeyboardInterrupt: 

## Results for Q4 Training, Full-Year (FY) Validation, epsilon = 1e-5 (status: 11.04.2025)

In [44]:
for i in range(1, 17):
    entsoe = load_entsoe()
    evaluate_ngboost_model(
        entsoe=entsoe, 
        target_column="power", 
        dist=Normal, 
        output_file="C:/Users/Minu/Documents/NGboost/q4_train/", 
        case=i,
        learning_rate=0.03,
        n_estimators=100,
        random_state=42
    )

# of training observations: 8832 | 2.80%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
C:/Users/Minu/Documents/NGboost/q4_train/case1.xlsx
[iter 0] loss=0.5706 val_loss=0.6721 scale=1.0000 norm=1.1580
[iter 1] loss=0.5666 val_loss=0.6675 scale=1.0000 norm=1.1527
[iter 2] loss=0.5629 val_loss=0.6632 scale=1.0000 norm=1.1479
[iter 3] loss=0.5594 val_loss=0.6593 scale=1.0000 norm=1.1437
[iter 4] loss=0.5561 val_loss=0.6557 scale=1.0000 norm=1.1400
[iter 5] loss=0.5530 val_loss=0.6522 scale=1.0000 norm=1.1368
[iter 6] loss=0.5501 val_loss=0.6490 scale=1.0000 norm=1.1340
[iter 7] loss=0.5473 val_loss=0.6460 scale=1.0000 norm=1.1316
[iter 8] loss=0.5447 val_loss=0.6432 scale=1.0000 norm=1.1295
[iter 9] loss=0.5422 val_loss=0.6404 scale=1.0000 norm=1.1278




[iter 10] loss=0.5399 val_loss=0.6379 scale=1.0000 norm=1.1263
[iter 11] loss=0.5376 val_loss=0.6355 scale=1.0000 norm=1.1252
[iter 12] loss=0.5355 val_loss=0.6332 scale=1.0000 norm=1.1243
[iter 13] loss=0.5335 val_loss=0.6311 scale=1.0000 norm=1.1237
[iter 14] loss=0.5316 val_loss=0.6292 scale=1.0000 norm=1.1233
[iter 15] loss=0.5298 val_loss=0.6274 scale=1.0000 norm=1.1231
[iter 16] loss=0.5280 val_loss=0.6256 scale=1.0000 norm=1.1231
[iter 17] loss=0.5264 val_loss=0.6240 scale=1.0000 norm=1.1233
[iter 18] loss=0.5248 val_loss=0.6225 scale=1.0000 norm=1.1236
[iter 19] loss=0.5234 val_loss=0.6210 scale=1.0000 norm=1.1242
[iter 20] loss=0.5220 val_loss=0.6196 scale=1.0000 norm=1.1248
[iter 21] loss=0.5206 val_loss=0.6183 scale=1.0000 norm=1.1256
[iter 22] loss=0.5193 val_loss=0.6171 scale=1.0000 norm=1.1265
[iter 23] loss=0.5181 val_loss=0.6159 scale=1.0000 norm=1.1275
[iter 24] loss=0.5169 val_loss=0.6148 scale=1.0000 norm=1.1287
[iter 25] loss=0.5158 val_loss=0.6136 scale=1.0000 norm

  self.scale = np.exp(params[1])
  self.var = self.scale**2


[iter 99] loss=0.4928 val_loss=0.6088 scale=0.0039 norm=0.0057




Expected 16 Excel files, but found 0 files. Skipping the merge step.
# of training observations: 8832 | 2.80%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=1.4392 val_loss=1.6236 scale=1.0000 norm=1.0144
[iter 1] loss=1.4319 val_loss=1.6154 scale=1.0000 norm=1.0072
[iter 2] loss=1.4252 val_loss=1.6076 scale=1.0000 norm=1.0006
[iter 3] loss=1.4189 val_loss=1.6006 scale=1.0000 norm=0.9945
[iter 4] loss=1.4131 val_loss=1.5942 scale=1.0000 norm=0.9890
[iter 5] loss=1.4077 val_loss=1.5883 scale=1.0000 norm=0.9838
[iter 6] loss=1.4026 val_loss=1.5830 scale=1.0000 norm=0.9790
[iter 7] loss=1.3978 val_loss=1.5781 scale=1.0000 norm=0.9746
[iter 8] loss=1.3933 val_loss=1.5689 scale=2.0000 norm=1.9410
[iter 9] loss=1.3849 val_loss=1.5650 scale=1.0000 norm=0.9630
[iter 10] loss=1.3810 val_loss=1.5579 scale=2.0000 norm=1.9192




[iter 11] loss=1.3737 val_loss=1.5506 scale=2.0000 norm=1.9069
[iter 12] loss=1.3668 val_loss=1.5443 scale=2.0000 norm=1.8957
[iter 13] loss=1.3608 val_loss=1.5387 scale=2.0000 norm=1.8866
[iter 14] loss=1.3552 val_loss=1.5336 scale=2.0000 norm=1.8784
[iter 15] loss=1.3500 val_loss=1.5289 scale=2.0000 norm=1.8712
[iter 16] loss=1.3452 val_loss=1.5254 scale=2.0000 norm=1.8647
[iter 17] loss=1.3404 val_loss=1.5226 scale=2.0000 norm=1.8588
[iter 18] loss=1.3360 val_loss=1.5204 scale=2.0000 norm=1.8535
[iter 19] loss=1.3320 val_loss=1.5183 scale=2.0000 norm=1.8490
[iter 20] loss=1.3282 val_loss=1.5157 scale=2.0000 norm=1.8452
[iter 21] loss=1.3247 val_loss=1.5144 scale=2.0000 norm=1.8418
[iter 22] loss=1.3214 val_loss=1.5134 scale=2.0000 norm=1.8389
[iter 23] loss=1.3183 val_loss=1.5122 scale=2.0000 norm=1.8363
[iter 24] loss=1.3154 val_loss=1.5116 scale=2.0000 norm=1.8342
[iter 25] loss=1.3126 val_loss=1.5107 scale=2.0000 norm=1.8322
[iter 26] loss=1.3100 val_loss=1.5108 scale=2.0000 norm



Expected 16 Excel files, but found 0 files. Skipping the merge step.
# of training observations: 8832 | 2.80%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=0.5706 val_loss=0.6567 scale=1.0000 norm=1.1580
[iter 1] loss=0.5535 val_loss=0.6376 scale=1.0000 norm=1.1327
[iter 2] loss=0.5374 val_loss=0.6196 scale=1.0000 norm=1.1110
[iter 3] loss=0.5224 val_loss=0.6028 scale=1.0000 norm=1.0925
[iter 4] loss=0.5081 val_loss=0.5869 scale=1.0000 norm=1.0767
[iter 5] loss=0.4947 val_loss=0.5722 scale=1.0000 norm=1.0632
[iter 6] loss=0.4820 val_loss=0.5581 scale=1.0000 norm=1.0518
[iter 7] loss=0.4700 val_loss=0.5448 scale=1.0000 norm=1.0424
[iter 8] loss=0.4586 val_loss=0.5321 scale=1.0000 norm=1.0347
[iter 9] loss=0.4477 val_loss=0.5202 scale=1.0000 norm=1.0285




[iter 10] loss=0.4374 val_loss=0.5088 scale=1.0000 norm=1.0237
[iter 11] loss=0.4275 val_loss=0.4979 scale=1.0000 norm=1.0202
[iter 12] loss=0.4181 val_loss=0.4877 scale=1.0000 norm=1.0180
[iter 13] loss=0.4091 val_loss=0.4778 scale=1.0000 norm=1.0170
[iter 14] loss=0.4005 val_loss=0.4684 scale=1.0000 norm=1.0170
[iter 15] loss=0.3922 val_loss=0.4594 scale=1.0000 norm=1.0181
[iter 16] loss=0.3843 val_loss=0.4509 scale=1.0000 norm=1.0202
[iter 17] loss=0.3767 val_loss=0.4428 scale=1.0000 norm=1.0233
[iter 18] loss=0.3694 val_loss=0.4349 scale=1.0000 norm=1.0274
[iter 19] loss=0.3624 val_loss=0.4273 scale=1.0000 norm=1.0323
[iter 20] loss=0.3556 val_loss=0.4200 scale=1.0000 norm=1.0381
[iter 21] loss=0.3492 val_loss=0.4129 scale=1.0000 norm=1.0448
[iter 22] loss=0.3429 val_loss=0.4063 scale=1.0000 norm=1.0523
[iter 23] loss=0.3369 val_loss=0.4000 scale=1.0000 norm=1.0607
[iter 24] loss=0.3311 val_loss=0.3940 scale=1.0000 norm=1.0700
[iter 25] loss=0.3256 val_loss=0.3881 scale=1.0000 norm

  self.scale = np.exp(params[1])
  self.var = self.scale**2


[iter 93] loss=0.2158 val_loss=0.2780 scale=0.0156 norm=0.0415
[iter 94] loss=0.2158 val_loss=0.2780 scale=0.0312 norm=0.0824
[iter 95] loss=0.2158 val_loss=0.2780 scale=0.5000 norm=1.3120
[iter 96] loss=0.2157 val_loss=0.2781 scale=0.5000 norm=1.3222
[iter 97] loss=0.2155 val_loss=0.2781 scale=0.0312 norm=0.0832
[iter 98] loss=0.2155 val_loss=0.2780 scale=0.5000 norm=1.3255
[iter 99] loss=0.2154 val_loss=0.2780 scale=1.0000 norm=2.6651




Expected 16 Excel files, but found 0 files. Skipping the merge step.
# of training observations: 8832 | 2.80%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=1.4392 val_loss=1.5753 scale=1.0000 norm=1.0144
[iter 1] loss=1.3990 val_loss=1.4876 scale=2.0000 norm=1.9579
[iter 2] loss=1.3337 val_loss=1.4233 scale=2.0000 norm=1.8443
[iter 3] loss=1.2830 val_loss=1.3721 scale=2.0000 norm=1.7576
[iter 4] loss=1.2403 val_loss=1.3285 scale=2.0000 norm=1.6861
[iter 5] loss=1.2026 val_loss=1.2907 scale=2.0000 norm=1.6248
[iter 6] loss=1.1686 val_loss=1.2568 scale=2.0000 norm=1.5713
[iter 7] loss=1.1372 val_loss=1.2259 scale=2.0000 norm=1.5241
[iter 8] loss=1.1078 val_loss=1.1971 scale=2.0000 norm=1.4821
[iter 9] loss=1.0799 val_loss=1.1701 scale=2.0000 norm=1.4445
[iter 10] loss=1.0535 val_loss=1.1448 scale=2.0000 norm=1.4110
[iter 11] loss=1.0281 val_loss=1.1208 scale=2.0000 norm=1.3810




[iter 12] loss=1.0037 val_loss=1.0978 scale=2.0000 norm=1.3541
[iter 13] loss=0.9803 val_loss=1.0756 scale=2.0000 norm=1.3300
[iter 14] loss=0.9575 val_loss=1.0546 scale=2.0000 norm=1.3084
[iter 15] loss=0.9354 val_loss=1.0342 scale=2.0000 norm=1.2889
[iter 16] loss=0.9139 val_loss=1.0141 scale=2.0000 norm=1.2714
[iter 17] loss=0.8930 val_loss=0.9947 scale=2.0000 norm=1.2557
[iter 18] loss=0.8726 val_loss=0.9761 scale=2.0000 norm=1.2416
[iter 19] loss=0.8529 val_loss=0.9582 scale=2.0000 norm=1.2290
[iter 20] loss=0.8336 val_loss=0.9407 scale=2.0000 norm=1.2178
[iter 21] loss=0.8148 val_loss=0.9240 scale=2.0000 norm=1.2077
[iter 22] loss=0.7965 val_loss=0.9083 scale=2.0000 norm=1.1986
[iter 23] loss=0.7787 val_loss=0.8927 scale=2.0000 norm=1.1905
[iter 24] loss=0.7613 val_loss=0.8774 scale=2.0000 norm=1.1833
[iter 25] loss=0.7444 val_loss=0.8630 scale=2.0000 norm=1.1768
[iter 26] loss=0.7280 val_loss=0.8489 scale=2.0000 norm=1.1711
[iter 27] loss=0.7121 val_loss=0.8356 scale=2.0000 norm



Expected 16 Excel files, but found 0 files. Skipping the merge step.
# of training observations: 8832 | 2.80%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=0.5706 val_loss=0.6567 scale=1.0000 norm=1.1580
[iter 1] loss=0.5534 val_loss=0.6376 scale=1.0000 norm=1.1326
[iter 2] loss=0.5373 val_loss=0.6196 scale=1.0000 norm=1.1108
[iter 3] loss=0.5221 val_loss=0.6027 scale=1.0000 norm=1.0921
[iter 4] loss=0.5078 val_loss=0.5868 scale=1.0000 norm=1.0761
[iter 5] loss=0.4943 val_loss=0.5718 scale=1.0000 norm=1.0626




[iter 6] loss=0.4815 val_loss=0.5577 scale=1.0000 norm=1.0512
[iter 7] loss=0.4694 val_loss=0.5444 scale=1.0000 norm=1.0417
[iter 8] loss=0.4579 val_loss=0.5318 scale=1.0000 norm=1.0339
[iter 9] loss=0.4469 val_loss=0.5199 scale=1.0000 norm=1.0276
[iter 10] loss=0.4364 val_loss=0.5085 scale=1.0000 norm=1.0228
[iter 11] loss=0.4265 val_loss=0.4977 scale=1.0000 norm=1.0193
[iter 12] loss=0.4169 val_loss=0.4873 scale=1.0000 norm=1.0170
[iter 13] loss=0.4078 val_loss=0.4774 scale=1.0000 norm=1.0159
[iter 14] loss=0.3991 val_loss=0.4680 scale=1.0000 norm=1.0160
[iter 15] loss=0.3907 val_loss=0.4592 scale=1.0000 norm=1.0171
[iter 16] loss=0.3827 val_loss=0.4505 scale=1.0000 norm=1.0192
[iter 17] loss=0.3749 val_loss=0.4423 scale=1.0000 norm=1.0223
[iter 18] loss=0.3674 val_loss=0.4345 scale=1.0000 norm=1.0263
[iter 19] loss=0.3602 val_loss=0.4269 scale=1.0000 norm=1.0314
[iter 20] loss=0.3533 val_loss=0.4197 scale=1.0000 norm=1.0372
[iter 21] loss=0.3466 val_loss=0.4127 scale=1.0000 norm=1.0



Expected 16 Excel files, but found 0 files. Skipping the merge step.
# of training observations: 8832 | 2.80%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=1.4392 val_loss=1.5738 scale=1.0000 norm=1.0144
[iter 1] loss=1.3983 val_loss=1.5266 scale=1.0000 norm=0.9783
[iter 2] loss=1.3638 val_loss=1.4526 scale=2.0000 norm=1.8967
[iter 3] loss=1.3065 val_loss=1.3952 scale=2.0000 norm=1.7980
[iter 4] loss=1.2596 val_loss=1.3490 scale=2.0000 norm=1.7190
[iter 5] loss=1.2193 val_loss=1.3085 scale=2.0000 norm=1.6527
[iter 6] loss=1.1833 val_loss=1.2724 scale=2.0000 norm=1.5953




[iter 7] loss=1.1503 val_loss=1.2401 scale=2.0000 norm=1.5450
[iter 8] loss=1.1196 val_loss=1.2100 scale=2.0000 norm=1.5004
[iter 9] loss=1.0907 val_loss=1.1819 scale=2.0000 norm=1.4607
[iter 10] loss=1.0633 val_loss=1.1553 scale=2.0000 norm=1.4251
[iter 11] loss=1.0369 val_loss=1.1301 scale=2.0000 norm=1.3929
[iter 12] loss=1.0112 val_loss=1.1062 scale=2.0000 norm=1.3636
[iter 13] loss=0.9866 val_loss=1.0833 scale=2.0000 norm=1.3374
[iter 14] loss=0.9629 val_loss=1.0612 scale=2.0000 norm=1.3142
[iter 15] loss=0.9397 val_loss=1.0399 scale=2.0000 norm=1.2929
[iter 16] loss=0.9172 val_loss=1.0197 scale=2.0000 norm=1.2738
[iter 17] loss=0.8951 val_loss=0.9999 scale=2.0000 norm=1.2566
[iter 18] loss=0.8736 val_loss=0.9808 scale=2.0000 norm=1.2408
[iter 19] loss=0.8526 val_loss=0.9627 scale=2.0000 norm=1.2267
[iter 20] loss=0.8320 val_loss=0.9450 scale=2.0000 norm=1.2140
[iter 21] loss=0.8119 val_loss=0.9276 scale=2.0000 norm=1.2023
[iter 22] loss=0.7924 val_loss=0.9112 scale=2.0000 norm=1.



Expected 16 Excel files, but found 0 files. Skipping the merge step.
# of training observations: 8832 | 2.80%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=0.5706 val_loss=0.6560 scale=1.0000 norm=1.1580
[iter 1] loss=0.5522 val_loss=0.6365 scale=1.0000 norm=1.1304




[iter 2] loss=0.5349 val_loss=0.6182 scale=1.0000 norm=1.1066
[iter 3] loss=0.5186 val_loss=0.6006 scale=1.0000 norm=1.0862
[iter 4] loss=0.5030 val_loss=0.5841 scale=1.0000 norm=1.0688
[iter 5] loss=0.4884 val_loss=0.5683 scale=1.0000 norm=1.0540
[iter 6] loss=0.4743 val_loss=0.5534 scale=1.0000 norm=1.0412
[iter 7] loss=0.4609 val_loss=0.5394 scale=1.0000 norm=1.0305
[iter 8] loss=0.4480 val_loss=0.5257 scale=1.0000 norm=1.0217
[iter 9] loss=0.4358 val_loss=0.5125 scale=1.0000 norm=1.0147
[iter 10] loss=0.4240 val_loss=0.5002 scale=1.0000 norm=1.0095
[iter 11] loss=0.4127 val_loss=0.4881 scale=1.0000 norm=1.0058
[iter 12] loss=0.4018 val_loss=0.4769 scale=1.0000 norm=1.0035
[iter 13] loss=0.3913 val_loss=0.4658 scale=1.0000 norm=1.0026
[iter 14] loss=0.3812 val_loss=0.4551 scale=1.0000 norm=1.0032
[iter 15] loss=0.3714 val_loss=0.4448 scale=1.0000 norm=1.0050
[iter 16] loss=0.3619 val_loss=0.4349 scale=1.0000 norm=1.0081
[iter 17] loss=0.3528 val_loss=0.4250 scale=1.0000 norm=1.0127




Expected 16 Excel files, but found 0 files. Skipping the merge step.
# of training observations: 8832 | 2.80%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=1.4392 val_loss=1.5638 scale=1.0000 norm=1.0144
[iter 1] loss=1.3908 val_loss=1.5108 scale=1.0000 norm=0.9721




[iter 2] loss=1.3521 val_loss=1.4675 scale=1.0000 norm=0.9390
[iter 3] loss=1.3191 val_loss=1.4316 scale=1.0000 norm=0.9111
[iter 4] loss=1.2902 val_loss=1.4008 scale=1.0000 norm=0.8871
[iter 5] loss=1.2643 val_loss=1.3727 scale=1.0000 norm=0.8658
[iter 6] loss=1.2404 val_loss=1.3475 scale=1.0000 norm=0.8464
[iter 7] loss=1.2181 val_loss=1.3237 scale=1.0000 norm=0.8286
[iter 8] loss=1.1970 val_loss=1.3017 scale=1.0000 norm=0.8119
[iter 9] loss=1.1772 val_loss=1.2810 scale=1.0000 norm=0.7965
[iter 10] loss=1.1581 val_loss=1.2610 scale=1.0000 norm=0.7821
[iter 11] loss=1.1398 val_loss=1.2419 scale=1.0000 norm=0.7685
[iter 12] loss=1.1220 val_loss=1.2065 scale=2.0000 norm=1.5111
[iter 13] loss=1.0881 val_loss=1.1735 scale=2.0000 norm=1.4632
[iter 14] loss=1.0551 val_loss=1.1425 scale=2.0000 norm=1.4190
[iter 15] loss=1.0234 val_loss=1.1119 scale=2.0000 norm=1.3792
[iter 16] loss=0.9927 val_loss=1.0817 scale=2.0000 norm=1.3431
[iter 17] loss=0.9628 val_loss=1.0538 scale=2.0000 norm=1.3104




Expected 16 Excel files, but found 0 files. Skipping the merge step.
# of training observations: 8832 | 2.80%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=0.5706 val_loss=0.6561 scale=1.0000 norm=1.1580
[iter 1] loss=0.5523 val_loss=0.6365 scale=1.0000 norm=1.1305




[iter 2] loss=0.5351 val_loss=0.6181 scale=1.0000 norm=1.1070
[iter 3] loss=0.5187 val_loss=0.6006 scale=1.0000 norm=1.0864
[iter 4] loss=0.5033 val_loss=0.5842 scale=1.0000 norm=1.0688
[iter 5] loss=0.4888 val_loss=0.5685 scale=1.0000 norm=1.0539
[iter 6] loss=0.4748 val_loss=0.5534 scale=1.0000 norm=1.0412
[iter 7] loss=0.4613 val_loss=0.5390 scale=1.0000 norm=1.0304
[iter 8] loss=0.4484 val_loss=0.5254 scale=1.0000 norm=1.0216
[iter 9] loss=0.4364 val_loss=0.5125 scale=1.0000 norm=1.0147
[iter 10] loss=0.4245 val_loss=0.5001 scale=1.0000 norm=1.0091
[iter 11] loss=0.4131 val_loss=0.4879 scale=1.0000 norm=1.0051
[iter 12] loss=0.4022 val_loss=0.4765 scale=1.0000 norm=1.0027
[iter 13] loss=0.3916 val_loss=0.4655 scale=1.0000 norm=1.0016
[iter 14] loss=0.3814 val_loss=0.4547 scale=1.0000 norm=1.0020
[iter 15] loss=0.3716 val_loss=0.4445 scale=1.0000 norm=1.0038
[iter 16] loss=0.3621 val_loss=0.4346 scale=1.0000 norm=1.0068
[iter 17] loss=0.3529 val_loss=0.4250 scale=1.0000 norm=1.0113




Expected 16 Excel files, but found 0 files. Skipping the merge step.
# of training observations: 8832 | 2.80%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=1.4392 val_loss=1.5639 scale=1.0000 norm=1.0144
[iter 1] loss=1.3907 val_loss=1.5107 scale=1.0000 norm=0.9719




[iter 2] loss=1.3519 val_loss=1.4679 scale=1.0000 norm=0.9387
[iter 3] loss=1.3190 val_loss=1.4319 scale=1.0000 norm=0.9110
[iter 4] loss=1.2900 val_loss=1.4007 scale=1.0000 norm=0.8870
[iter 5] loss=1.2641 val_loss=1.3725 scale=1.0000 norm=0.8656
[iter 6] loss=1.2403 val_loss=1.3466 scale=1.0000 norm=0.8463
[iter 7] loss=1.2178 val_loss=1.3229 scale=1.0000 norm=0.8283
[iter 8] loss=1.1969 val_loss=1.3011 scale=1.0000 norm=0.8119
[iter 9] loss=1.1767 val_loss=1.2798 scale=1.0000 norm=0.7962
[iter 10] loss=1.1574 val_loss=1.2605 scale=1.0000 norm=0.7815
[iter 11] loss=1.1393 val_loss=1.2419 scale=1.0000 norm=0.7680
[iter 12] loss=1.1213 val_loss=1.2060 scale=2.0000 norm=1.5100
[iter 13] loss=1.0869 val_loss=1.1892 scale=1.0000 norm=0.7309
[iter 14] loss=1.0705 val_loss=1.1730 scale=1.0000 norm=0.7199
[iter 15] loss=1.0542 val_loss=1.1413 scale=2.0000 norm=1.4186
[iter 16] loss=1.0229 val_loss=1.1113 scale=2.0000 norm=1.3792
[iter 17] loss=0.9923 val_loss=1.0817 scale=2.0000 norm=1.3434




Expected 16 Excel files, but found 0 files. Skipping the merge step.
# of training observations: 8832 | 2.80%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=0.5706 val_loss=0.6561 scale=1.0000 norm=1.1580
[iter 1] loss=0.5523 val_loss=0.6365 scale=1.0000 norm=1.1305




[iter 2] loss=0.5351 val_loss=0.6181 scale=1.0000 norm=1.1070
[iter 3] loss=0.5187 val_loss=0.6006 scale=1.0000 norm=1.0864
[iter 4] loss=0.5033 val_loss=0.5842 scale=1.0000 norm=1.0688
[iter 5] loss=0.4888 val_loss=0.5685 scale=1.0000 norm=1.0539
[iter 6] loss=0.4748 val_loss=0.5534 scale=1.0000 norm=1.0412
[iter 7] loss=0.4613 val_loss=0.5390 scale=1.0000 norm=1.0304
[iter 8] loss=0.4484 val_loss=0.5254 scale=1.0000 norm=1.0216
[iter 9] loss=0.4364 val_loss=0.5125 scale=1.0000 norm=1.0147
[iter 10] loss=0.4245 val_loss=0.5001 scale=1.0000 norm=1.0091
[iter 11] loss=0.4131 val_loss=0.4879 scale=1.0000 norm=1.0051
[iter 12] loss=0.4022 val_loss=0.4765 scale=1.0000 norm=1.0027
[iter 13] loss=0.3916 val_loss=0.4655 scale=1.0000 norm=1.0016
[iter 14] loss=0.3814 val_loss=0.4547 scale=1.0000 norm=1.0020
[iter 15] loss=0.3716 val_loss=0.4445 scale=1.0000 norm=1.0038
[iter 16] loss=0.3621 val_loss=0.4346 scale=1.0000 norm=1.0068
[iter 17] loss=0.3529 val_loss=0.4250 scale=1.0000 norm=1.0113




Expected 16 Excel files, but found 0 files. Skipping the merge step.
# of training observations: 8832 | 2.80%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=1.4392 val_loss=1.5639 scale=1.0000 norm=1.0144
[iter 1] loss=1.3907 val_loss=1.5107 scale=1.0000 norm=0.9719




[iter 2] loss=1.3519 val_loss=1.4679 scale=1.0000 norm=0.9387
[iter 3] loss=1.3190 val_loss=1.4319 scale=1.0000 norm=0.9110
[iter 4] loss=1.2900 val_loss=1.4007 scale=1.0000 norm=0.8870
[iter 5] loss=1.2641 val_loss=1.3725 scale=1.0000 norm=0.8656
[iter 6] loss=1.2403 val_loss=1.3466 scale=1.0000 norm=0.8463
[iter 7] loss=1.2178 val_loss=1.3229 scale=1.0000 norm=0.8283
[iter 8] loss=1.1969 val_loss=1.3011 scale=1.0000 norm=0.8119
[iter 9] loss=1.1767 val_loss=1.2798 scale=1.0000 norm=0.7962
[iter 10] loss=1.1574 val_loss=1.2605 scale=1.0000 norm=0.7815
[iter 11] loss=1.1393 val_loss=1.2419 scale=1.0000 norm=0.7680
[iter 12] loss=1.1213 val_loss=1.2060 scale=2.0000 norm=1.5100
[iter 13] loss=1.0869 val_loss=1.1892 scale=1.0000 norm=0.7309
[iter 14] loss=1.0705 val_loss=1.1730 scale=1.0000 norm=0.7199
[iter 15] loss=1.0542 val_loss=1.1413 scale=2.0000 norm=1.4186
[iter 16] loss=1.0229 val_loss=1.1113 scale=2.0000 norm=1.3792
[iter 17] loss=0.9923 val_loss=1.0817 scale=2.0000 norm=1.3434




Expected 16 Excel files, but found 0 files. Skipping the merge step.
# of training observations: 8832 | 2.80%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=0.5706 val_loss=0.6724 scale=1.0000 norm=1.1580
[iter 1] loss=0.5666 val_loss=0.6677 scale=1.0000 norm=1.1525
[iter 2] loss=0.5628 val_loss=0.6636 scale=1.0000 norm=1.1475
[iter 3] loss=0.5592 val_loss=0.6597 scale=1.0000 norm=1.1431
[iter 4] loss=0.5558 val_loss=0.6560 scale=1.0000 norm=1.1392
[iter 5] loss=0.5525 val_loss=0.6526 scale=1.0000 norm=1.1357
[iter 6] loss=0.5495 val_loss=0.6494 scale=1.0000 norm=1.1327
[iter 7] loss=0.5465 val_loss=0.6465 scale=1.0000 norm=1.1300
[iter 8] loss=0.5438 val_loss=0.6437 scale=1.0000 norm=1.1277
[iter 9] loss=0.5412 val_loss=0.6411 scale=1.0000 norm=1.1259




[iter 10] loss=0.5387 val_loss=0.6385 scale=1.0000 norm=1.1243
[iter 11] loss=0.5363 val_loss=0.6362 scale=1.0000 norm=1.1230
[iter 12] loss=0.5341 val_loss=0.6338 scale=1.0000 norm=1.1221
[iter 13] loss=0.5320 val_loss=0.6317 scale=1.0000 norm=1.1214
[iter 14] loss=0.5300 val_loss=0.6296 scale=1.0000 norm=1.1208
[iter 15] loss=0.5280 val_loss=0.6276 scale=1.0000 norm=1.1205
[iter 16] loss=0.5262 val_loss=0.6257 scale=1.0000 norm=1.1204
[iter 17] loss=0.5245 val_loss=0.6240 scale=1.0000 norm=1.1206
[iter 18] loss=0.5228 val_loss=0.6223 scale=1.0000 norm=1.1209
[iter 19] loss=0.5212 val_loss=0.6209 scale=1.0000 norm=1.1214
[iter 20] loss=0.5197 val_loss=0.6195 scale=1.0000 norm=1.1222
[iter 21] loss=0.5182 val_loss=0.6182 scale=1.0000 norm=1.1228
[iter 22] loss=0.5168 val_loss=0.6170 scale=1.0000 norm=1.1237
[iter 23] loss=0.5154 val_loss=0.6155 scale=1.0000 norm=1.1247
[iter 24] loss=0.5141 val_loss=0.6144 scale=1.0000 norm=1.1259
[iter 25] loss=0.5129 val_loss=0.6134 scale=1.0000 norm



Expected 16 Excel files, but found 0 files. Skipping the merge step.
# of training observations: 8832 | 2.80%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=1.4392 val_loss=1.6235 scale=1.0000 norm=1.0144
[iter 1] loss=1.4313 val_loss=1.6150 scale=1.0000 norm=1.0069
[iter 2] loss=1.4242 val_loss=1.6072 scale=1.0000 norm=1.0000
[iter 3] loss=1.4175 val_loss=1.6001 scale=1.0000 norm=0.9937
[iter 4] loss=1.4114 val_loss=1.5934 scale=1.0000 norm=0.9879
[iter 5] loss=1.4057 val_loss=1.5870 scale=1.0000 norm=0.9826
[iter 6] loss=1.4003 val_loss=1.5817 scale=1.0000 norm=0.9777
[iter 7] loss=1.3954 val_loss=1.5765 scale=1.0000 norm=0.9731
[iter 8] loss=1.3906 val_loss=1.5717 scale=1.0000 norm=0.9689




[iter 9] loss=1.3860 val_loss=1.5631 scale=2.0000 norm=1.9295
[iter 10] loss=1.3774 val_loss=1.5591 scale=1.0000 norm=0.9572
[iter 11] loss=1.3732 val_loss=1.5555 scale=1.0000 norm=0.9536
[iter 12] loss=1.3692 val_loss=1.5521 scale=1.0000 norm=0.9501
[iter 13] loss=1.3653 val_loss=1.5458 scale=2.0000 norm=1.8939
[iter 14] loss=1.3583 val_loss=1.5428 scale=1.0000 norm=0.9413
[iter 15] loss=1.3548 val_loss=1.5400 scale=1.0000 norm=0.9388
[iter 16] loss=1.3515 val_loss=1.5374 scale=1.0000 norm=0.9363
[iter 17] loss=1.3484 val_loss=1.5316 scale=2.0000 norm=1.8682
[iter 18] loss=1.3427 val_loss=1.5294 scale=1.0000 norm=0.9302
[iter 19] loss=1.3398 val_loss=1.5273 scale=1.0000 norm=0.9283
[iter 20] loss=1.3370 val_loss=1.5236 scale=2.0000 norm=1.8530
[iter 21] loss=1.3319 val_loss=1.5203 scale=2.0000 norm=1.8466
[iter 22] loss=1.3269 val_loss=1.5160 scale=2.0000 norm=1.8406
[iter 23] loss=1.3224 val_loss=1.5135 scale=2.0000 norm=1.8356
[iter 24] loss=1.3180 val_loss=1.5107 scale=2.0000 norm=



Expected 16 Excel files, but found 0 files. Skipping the merge step.
# of training observations: 8832 | 2.80%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=0.5706 val_loss=0.6567 scale=1.0000 norm=1.1580
[iter 1] loss=0.5535 val_loss=0.6377 scale=1.0000 norm=1.1327
[iter 2] loss=0.5374 val_loss=0.6197 scale=1.0000 norm=1.1109
[iter 3] loss=0.5223 val_loss=0.6028 scale=1.0000 norm=1.0924
[iter 4] loss=0.5081 val_loss=0.5870 scale=1.0000 norm=1.0766
[iter 5] loss=0.4947 val_loss=0.5722 scale=1.0000 norm=1.0631
[iter 6] loss=0.4820 val_loss=0.5581 scale=1.0000 norm=1.0518
[iter 7] loss=0.4699 val_loss=0.5448 scale=1.0000 norm=1.0423
[iter 8] loss=0.4585 val_loss=0.5321 scale=1.0000 norm=1.0346




[iter 9] loss=0.4476 val_loss=0.5202 scale=1.0000 norm=1.0284
[iter 10] loss=0.4373 val_loss=0.5087 scale=1.0000 norm=1.0236
[iter 11] loss=0.4275 val_loss=0.4979 scale=1.0000 norm=1.0202
[iter 12] loss=0.4180 val_loss=0.4876 scale=1.0000 norm=1.0180
[iter 13] loss=0.4090 val_loss=0.4778 scale=1.0000 norm=1.0170
[iter 14] loss=0.4004 val_loss=0.4684 scale=1.0000 norm=1.0170
[iter 15] loss=0.3921 val_loss=0.4593 scale=1.0000 norm=1.0180
[iter 16] loss=0.3842 val_loss=0.4508 scale=1.0000 norm=1.0201
[iter 17] loss=0.3766 val_loss=0.4427 scale=1.0000 norm=1.0233
[iter 18] loss=0.3693 val_loss=0.4347 scale=1.0000 norm=1.0274
[iter 19] loss=0.3623 val_loss=0.4271 scale=1.0000 norm=1.0323
[iter 20] loss=0.3555 val_loss=0.4198 scale=1.0000 norm=1.0381
[iter 21] loss=0.3490 val_loss=0.4127 scale=1.0000 norm=1.0447
[iter 22] loss=0.3427 val_loss=0.4061 scale=1.0000 norm=1.0523
[iter 23] loss=0.3367 val_loss=0.3998 scale=1.0000 norm=1.0606
[iter 24] loss=0.3309 val_loss=0.3937 scale=1.0000 norm=

  self.scale = np.exp(params[1])
  self.var = self.scale**2


Expected 16 Excel files, but found 0 files. Skipping the merge step.
# of training observations: 8832 | 2.80%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=1.4392 val_loss=1.5746 scale=1.0000 norm=1.0144
[iter 1] loss=1.3986 val_loss=1.4865 scale=2.0000 norm=1.9572
[iter 2] loss=1.3334 val_loss=1.4218 scale=2.0000 norm=1.8438
[iter 3] loss=1.2826 val_loss=1.3710 scale=2.0000 norm=1.7571
[iter 4] loss=1.2393 val_loss=1.3275 scale=2.0000 norm=1.6848
[iter 5] loss=1.2017 val_loss=1.2900 scale=2.0000 norm=1.6236
[iter 6] loss=1.1674 val_loss=1.2558 scale=2.0000 norm=1.5700
[iter 7] loss=1.1360 val_loss=1.2246 scale=2.0000 norm=1.5228




[iter 8] loss=1.1066 val_loss=1.1955 scale=2.0000 norm=1.4810
[iter 9] loss=1.0786 val_loss=1.1684 scale=2.0000 norm=1.4434
[iter 10] loss=1.0520 val_loss=1.1433 scale=2.0000 norm=1.4099
[iter 11] loss=1.0266 val_loss=1.1191 scale=2.0000 norm=1.3798
[iter 12] loss=1.0021 val_loss=1.0961 scale=2.0000 norm=1.3529
[iter 13] loss=0.9785 val_loss=1.0739 scale=2.0000 norm=1.3288
[iter 14] loss=0.9557 val_loss=1.0527 scale=2.0000 norm=1.3072
[iter 15] loss=0.9335 val_loss=1.0323 scale=2.0000 norm=1.2876
[iter 16] loss=0.9119 val_loss=1.0121 scale=2.0000 norm=1.2701
[iter 17] loss=0.8909 val_loss=0.9925 scale=2.0000 norm=1.2544
[iter 18] loss=0.8705 val_loss=0.9738 scale=2.0000 norm=1.2403
[iter 19] loss=0.8507 val_loss=0.9559 scale=2.0000 norm=1.2277
[iter 20] loss=0.8313 val_loss=0.9384 scale=2.0000 norm=1.2164
[iter 21] loss=0.8125 val_loss=0.9214 scale=2.0000 norm=1.2062
[iter 22] loss=0.7941 val_loss=0.9057 scale=2.0000 norm=1.1972
[iter 23] loss=0.7761 val_loss=0.8899 scale=2.0000 norm=1



Expected 16 Excel files, but found 0 files. Skipping the merge step.
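
The verbose training output above is hard to compare across cases by eye. The following is a minimal sketch, not part of the original notebook, of how such lines could be collected into records. The helper name `parse_ngboost_log` is hypothetical, and the pattern assumes the four-decimal format shown in the output, so lines cut off mid-value (for example `norm=1.`) are skipped rather than parsed as if complete.

```python
import re

# Matches only complete lines in the format printed above, e.g.
# "[iter 3] loss=1.3065 val_loss=1.3952 scale=2.0000 norm=1.7980".
# Requiring four decimals rejects lines that were truncated mid-number.
LINE_RE = re.compile(
    r"\[iter (\d+)\] loss=(\d+\.\d{4}) val_loss=(\d+\.\d{4}) "
    r"scale=(\d+\.\d{4}) norm=(\d+\.\d{4})"
)


def parse_ngboost_log(text):
    """Return one dict per complete '[iter N] ...' line in `text`."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.fullmatch(line.strip())
        if match is None:
            continue  # blank, truncated, or unrelated line
        iteration, loss, val_loss, scale, norm = match.groups()
        rows.append(
            {
                "iter": int(iteration),
                "loss": float(loss),
                "val_loss": float(val_loss),
                "scale": float(scale),
                "norm": float(norm),
            }
        )
    return rows
```

Applied to the captured output of a single run, the last element of the returned list gives the latest fully logged `val_loss`, which may be earlier than the final iteration if the output was truncated.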


In [None]:


train, validation, test = to_train_validation_test_data(entsoe, "2016-12-31 23:45:00", "2023-12-31 23:45:00")

# NGBoost with log(power) and a Normal distribution

This variant allows a comparison with TabPFN. A CRPS computed with a LogNormal distribution on power cannot be converted into a CRPS computed with a Normal distribution on log(power), so the runs below use `dist=Normal`.
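
For reference, the CRPS of a Normal forecast has a well-known closed form, which is what makes scores on log(power) straightforward to compute for any model that outputs a mean and standard deviation on that scale. The sketch below is illustrative only: the function names `crps_normal` and `mean_crps_log_power` are hypothetical and are not taken from the notebook's `evaluate_ngboost_model`, and it assumes the power values are strictly positive so that the logarithm is defined.

```python
import math


def crps_normal(y, mu, sigma):
    """Closed-form CRPS of a Normal(mu, sigma) forecast at observation y."""
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))


def mean_crps_log_power(power, mu_log, sigma_log):
    """Average CRPS on the log scale.

    Assumption: `power` holds strictly positive observations, and `mu_log` and
    `sigma_log` are the predicted Normal parameters for log(power).
    """
    scores = [
        crps_normal(math.log(p), m, s)
        for p, m, s in zip(power, mu_log, sigma_log)
    ]
    return sum(scores) / len(scores)
```

Because both models would be scored on the same scale, log(power), with the same Normal-based formula, the resulting averages are directly comparable.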

In [3]:
entsoe = load_entsoe()
results_per_time_interval_df, results_summary_stats_df, results_per_row_df, hyperparameters_df = evaluate_ngboost_model(entsoe, case=1, dist=Normal)


Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%




[iter 0] loss=0.6450 val_loss=0.6806 scale=2.0000 norm=2.2332
[iter 1] loss=0.6375 val_loss=0.6722 scale=2.0000 norm=2.2170
[iter 2] loss=0.6307 val_loss=0.6647 scale=2.0000 norm=2.2042
[iter 3] loss=0.6246 val_loss=0.6579 scale=2.0000 norm=2.1944
[iter 4] loss=0.6192 val_loss=0.6519 scale=2.0000 norm=2.1869
[iter 5] loss=0.6143 val_loss=0.6464 scale=2.0000 norm=2.1814
[iter 6] loss=0.6099 val_loss=0.6415 scale=2.0000 norm=2.1775
[iter 7] loss=0.6059 val_loss=0.6370 scale=2.0000 norm=2.1751
[iter 8] loss=0.6022 val_loss=0.6330 scale=2.0000 norm=2.1740
[iter 9] loss=0.5989 val_loss=0.6293 scale=2.0000 norm=2.1739
[iter 10] loss=0.5959 val_loss=0.6259 scale=2.0000 norm=2.1747
[iter 11] loss=0.5932 val_loss=0.6228 scale=2.0000 norm=2.1763
[iter 12] loss=0.5907 val_loss=0.6200 scale=2.0000 norm=2.1787
[iter 13] loss=0.5884 val_loss=0.6175 scale=2.0000 norm=2.1816
[iter 14] loss=0.5863 val_loss=0.6152 scale=2.0000 norm=2.1851
[iter 15] loss=0.5844 val_loss=0.6131 scale=2.0000 norm=2.1890
[i



Expected 16 Excel files, but found 1 files. Skipping the merge step.


In [4]:
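# Run the remaining cases (2 through 16) with a Normal distribution; case 1 was run in the previous cell.
for i in range(2, 17):
    entsoe = load_entsoe()
    results_per_time_interval_df, results_summary_stats_df, results_per_row_df, hyperparameters_df = evaluate_ngboost_model(entsoe, case=i, dist=Normal)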


Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%




[iter 0] loss=1.5650 val_loss=1.6075 scale=1.0000 norm=1.1098
[iter 1] loss=1.5581 val_loss=1.6003 scale=1.0000 norm=1.1025
[iter 2] loss=1.5517 val_loss=1.5936 scale=1.0000 norm=1.0957
[iter 3] loss=1.5458 val_loss=1.5874 scale=1.0000 norm=1.0895
[iter 4] loss=1.5403 val_loss=1.5816 scale=1.0000 norm=1.0837
[iter 5] loss=1.5351 val_loss=1.5762 scale=1.0000 norm=1.0783
[iter 6] loss=1.5303 val_loss=1.5711 scale=1.0000 norm=1.0733
[iter 7] loss=1.5257 val_loss=1.5663 scale=1.0000 norm=1.0686
[iter 8] loss=1.5214 val_loss=1.5617 scale=1.0000 norm=1.0642
[iter 9] loss=1.5174 val_loss=1.5575 scale=1.0000 norm=1.0602
[iter 10] loss=1.5135 val_loss=1.5535 scale=1.0000 norm=1.0563
[iter 11] loss=1.5099 val_loss=1.5497 scale=1.0000 norm=1.0527
[iter 12] loss=1.5064 val_loss=1.5461 scale=1.0000 norm=1.0494
[iter 13] loss=1.5032 val_loss=1.5426 scale=1.0000 norm=1.0462
[iter 14] loss=1.5001 val_loss=1.5361 scale=2.0000 norm=2.0866
[iter 15] loss=1.4942 val_loss=1.5304 scale=2.0000 norm=2.0757
[i



Expected 16 Excel files, but found 2 files. Skipping the merge step.
Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%




[iter 0] loss=0.6450 val_loss=0.6719 scale=1.0000 norm=1.1166
[iter 1] loss=0.6284 val_loss=0.6547 scale=1.0000 norm=1.0961
[iter 2] loss=0.6126 val_loss=0.6384 scale=1.0000 norm=1.0779
[iter 3] loss=0.5976 val_loss=0.6230 scale=1.0000 norm=1.0618
[iter 4] loss=0.5835 val_loss=0.6085 scale=1.0000 norm=1.0476
[iter 5] loss=0.5700 val_loss=0.5946 scale=1.0000 norm=1.0351
[iter 6] loss=0.5572 val_loss=0.5815 scale=1.0000 norm=1.0242
[iter 7] loss=0.5450 val_loss=0.5690 scale=1.0000 norm=1.0146
[iter 8] loss=0.5334 val_loss=0.5570 scale=1.0000 norm=1.0064
[iter 9] loss=0.5223 val_loss=0.5457 scale=1.0000 norm=0.9993
[iter 10] loss=0.5117 val_loss=0.5347 scale=1.0000 norm=0.9933
[iter 11] loss=0.5015 val_loss=0.5243 scale=1.0000 norm=0.9883
[iter 12] loss=0.4917 val_loss=0.5143 scale=1.0000 norm=0.9843
[iter 13] loss=0.4824 val_loss=0.5047 scale=1.0000 norm=0.9812
[iter 14] loss=0.4734 val_loss=0.4955 scale=1.0000 norm=0.9790
[iter 15] loss=0.4647 val_loss=0.4866 scale=1.0000 norm=0.9776
[i



Expected 16 Excel files, but found 4 files. Skipping the merge step.
Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%




[iter 0] loss=1.5650 val_loss=1.5775 scale=1.0000 norm=1.1098
[iter 1] loss=1.5274 val_loss=1.5449 scale=1.0000 norm=1.0747
[iter 2] loss=1.4952 val_loss=1.5160 scale=1.0000 norm=1.0446
[iter 3] loss=1.4669 val_loss=1.4899 scale=1.0000 norm=1.0181
[iter 4] loss=1.4414 val_loss=1.4430 scale=2.0000 norm=1.9884
[iter 5] loss=1.3959 val_loss=1.4030 scale=2.0000 norm=1.9030
[iter 6] loss=1.3565 val_loss=1.3672 scale=2.0000 norm=1.8300
[iter 7] loss=1.3215 val_loss=1.3344 scale=2.0000 norm=1.7662
[iter 8] loss=1.2893 val_loss=1.3043 scale=2.0000 norm=1.7097
[iter 9] loss=1.2597 val_loss=1.2760 scale=2.0000 norm=1.6596
[iter 10] loss=1.2316 val_loss=1.2494 scale=2.0000 norm=1.6145
[iter 11] loss=1.2052 val_loss=1.2242 scale=2.0000 norm=1.5742
[iter 12] loss=1.1800 val_loss=1.2000 scale=2.0000 norm=1.5380
[iter 13] loss=1.1559 val_loss=1.1770 scale=2.0000 norm=1.5054
[iter 14] loss=1.1327 val_loss=1.1546 scale=2.0000 norm=1.4762
[iter 15] loss=1.1103 val_loss=1.1332 scale=2.0000 norm=1.4498
[i



Expected 16 Excel files, but found 4 files. Skipping the merge step.
Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%




[iter 0] loss=0.6450 val_loss=0.6719 scale=1.0000 norm=1.1166
[iter 1] loss=0.6284 val_loss=0.6547 scale=1.0000 norm=1.0960
[iter 2] loss=0.6126 val_loss=0.6384 scale=1.0000 norm=1.0779
[iter 3] loss=0.5976 val_loss=0.6230 scale=1.0000 norm=1.0618
[iter 4] loss=0.5834 val_loss=0.6084 scale=1.0000 norm=1.0476
[iter 5] loss=0.5700 val_loss=0.5946 scale=1.0000 norm=1.0351
[iter 6] loss=0.5572 val_loss=0.5814 scale=1.0000 norm=1.0241
[iter 7] loss=0.5450 val_loss=0.5689 scale=1.0000 norm=1.0146
[iter 8] loss=0.5333 val_loss=0.5569 scale=1.0000 norm=1.0063
[iter 9] loss=0.5222 val_loss=0.5455 scale=1.0000 norm=0.9992
[iter 10] loss=0.5116 val_loss=0.5346 scale=1.0000 norm=0.9933
[iter 11] loss=0.5014 val_loss=0.5242 scale=1.0000 norm=0.9883
[iter 12] loss=0.4916 val_loss=0.5142 scale=1.0000 norm=0.9843
[iter 13] loss=0.4823 val_loss=0.5045 scale=1.0000 norm=0.9812
[iter 14] loss=0.4733 val_loss=0.4953 scale=1.0000 norm=0.9789
[iter 15] loss=0.4646 val_loss=0.4865 scale=1.0000 norm=0.9775
[i



Expected 16 Excel files, but found 5 files. Skipping the merge step.
Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%




[iter 0] loss=1.5650 val_loss=1.5773 scale=1.0000 norm=1.1098
[iter 1] loss=1.5268 val_loss=1.5446 scale=1.0000 norm=1.0742
[iter 2] loss=1.4944 val_loss=1.5156 scale=1.0000 norm=1.0440
[iter 3] loss=1.4660 val_loss=1.4894 scale=1.0000 norm=1.0173
[iter 4] loss=1.4404 val_loss=1.4424 scale=2.0000 norm=1.9869
[iter 5] loss=1.3948 val_loss=1.4023 scale=2.0000 norm=1.9015
[iter 6] loss=1.3554 val_loss=1.3664 scale=2.0000 norm=1.8286
[iter 7] loss=1.3203 val_loss=1.3333 scale=2.0000 norm=1.7648
[iter 8] loss=1.2882 val_loss=1.3032 scale=2.0000 norm=1.7084
[iter 9] loss=1.2584 val_loss=1.2746 scale=2.0000 norm=1.6583
[iter 10] loss=1.2304 val_loss=1.2479 scale=2.0000 norm=1.6133
[iter 11] loss=1.2039 val_loss=1.2225 scale=2.0000 norm=1.5729
[iter 12] loss=1.1787 val_loss=1.1983 scale=2.0000 norm=1.5368
[iter 13] loss=1.1545 val_loss=1.1750 scale=2.0000 norm=1.5042
[iter 14] loss=1.1312 val_loss=1.1527 scale=2.0000 norm=1.4749
[iter 15] loss=1.1088 val_loss=1.1311 scale=2.0000 norm=1.4485
[i



Expected 16 Excel files, but found 6 files. Skipping the merge step.
Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%




[iter 0] loss=0.6450 val_loss=0.6712 scale=1.0000 norm=1.1166
[iter 1] loss=0.6277 val_loss=0.6532 scale=1.0000 norm=1.0952
[iter 2] loss=0.6111 val_loss=0.6360 scale=1.0000 norm=1.0760
[iter 3] loss=0.5953 val_loss=0.6198 scale=1.0000 norm=1.0588
[iter 4] loss=0.5803 val_loss=0.6043 scale=1.0000 norm=1.0438
[iter 5] loss=0.5659 val_loss=0.5895 scale=1.0000 norm=1.0302
[iter 6] loss=0.5522 val_loss=0.5753 scale=1.0000 norm=1.0183
[iter 7] loss=0.5389 val_loss=0.5618 scale=1.0000 norm=1.0078
[iter 8] loss=0.5262 val_loss=0.5488 scale=1.0000 norm=0.9986
[iter 9] loss=0.5140 val_loss=0.5364 scale=1.0000 norm=0.9906
[iter 10] loss=0.5022 val_loss=0.5244 scale=1.0000 norm=0.9837
[iter 11] loss=0.4909 val_loss=0.5128 scale=1.0000 norm=0.9780
[iter 12] loss=0.4800 val_loss=0.5015 scale=1.0000 norm=0.9735
[iter 13] loss=0.4693 val_loss=0.4908 scale=1.0000 norm=0.9697
[iter 14] loss=0.4591 val_loss=0.4802 scale=1.0000 norm=0.9670
[iter 15] loss=0.4491 val_loss=0.4701 scale=1.0000 norm=0.9653
[i



Expected 16 Excel files, but found 7 files. Skipping the merge step.
Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%




[iter 0] loss=1.5650 val_loss=1.5738 scale=1.0000 norm=1.1098
[iter 1] loss=1.5229 val_loss=1.5375 scale=1.0000 norm=1.0710
[iter 2] loss=1.4873 val_loss=1.5069 scale=1.0000 norm=1.0382
[iter 3] loss=1.4571 val_loss=1.4780 scale=1.0000 norm=1.0104
[iter 4] loss=1.4293 val_loss=1.4520 scale=1.0000 norm=0.9848
[iter 5] loss=1.4045 val_loss=1.4280 scale=1.0000 norm=0.9621
[iter 6] loss=1.3814 val_loss=1.4057 scale=1.0000 norm=0.9410
[iter 7] loss=1.3598 val_loss=1.3847 scale=1.0000 norm=0.9214
[iter 8] loss=1.3392 val_loss=1.3647 scale=1.0000 norm=0.9030
[iter 9] loss=1.3197 val_loss=1.3457 scale=1.0000 norm=0.8857
[iter 10] loss=1.3012 val_loss=1.3274 scale=1.0000 norm=0.8695
[iter 11] loss=1.2833 val_loss=1.3098 scale=1.0000 norm=0.8541
[iter 12] loss=1.2657 val_loss=1.2926 scale=1.0000 norm=0.8393
[iter 13] loss=1.2488 val_loss=1.2757 scale=1.0000 norm=0.8252
[iter 14] loss=1.2322 val_loss=1.2435 scale=2.0000 norm=1.6235
[iter 15] loss=1.2005 val_loss=1.2119 scale=2.0000 norm=1.5735
[i



Expected 16 Excel files, but found 8 files. Skipping the merge step.
Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%




[iter 0] loss=0.6450 val_loss=0.6712 scale=1.0000 norm=1.1166
[iter 1] loss=0.6277 val_loss=0.6532 scale=1.0000 norm=1.0953
[iter 2] loss=0.6112 val_loss=0.6362 scale=1.0000 norm=1.0762
[iter 3] loss=0.5955 val_loss=0.6199 scale=1.0000 norm=1.0591
[iter 4] loss=0.5804 val_loss=0.6044 scale=1.0000 norm=1.0439
[iter 5] loss=0.5660 val_loss=0.5896 scale=1.0000 norm=1.0303
[iter 6] loss=0.5523 val_loss=0.5755 scale=1.0000 norm=1.0184
[iter 7] loss=0.5391 val_loss=0.5620 scale=1.0000 norm=1.0079
[iter 8] loss=0.5264 val_loss=0.5490 scale=1.0000 norm=0.9987
[iter 9] loss=0.5142 val_loss=0.5365 scale=1.0000 norm=0.9907
[iter 10] loss=0.5023 val_loss=0.5244 scale=1.0000 norm=0.9837
[iter 11] loss=0.4909 val_loss=0.5128 scale=1.0000 norm=0.9780
[iter 12] loss=0.4799 val_loss=0.5017 scale=1.0000 norm=0.9733
[iter 13] loss=0.4694 val_loss=0.4908 scale=1.0000 norm=0.9697
[iter 14] loss=0.4591 val_loss=0.4802 scale=1.0000 norm=0.9670
[iter 15] loss=0.4491 val_loss=0.4701 scale=1.0000 norm=0.9652
[i



Expected 16 Excel files, but found 9 files. Skipping the merge step.
Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%




[iter 0] loss=1.5650 val_loss=1.5737 scale=1.0000 norm=1.1098
[iter 1] loss=1.5228 val_loss=1.5375 scale=1.0000 norm=1.0709
[iter 2] loss=1.4874 val_loss=1.5067 scale=1.0000 norm=1.0384
[iter 3] loss=1.4573 val_loss=1.4778 scale=1.0000 norm=1.0107
[iter 4] loss=1.4295 val_loss=1.4522 scale=1.0000 norm=0.9852
[iter 5] loss=1.4046 val_loss=1.4281 scale=1.0000 norm=0.9624
[iter 6] loss=1.3813 val_loss=1.4058 scale=1.0000 norm=0.9411
[iter 7] loss=1.3597 val_loss=1.3846 scale=1.0000 norm=0.9216
[iter 8] loss=1.3391 val_loss=1.3648 scale=1.0000 norm=0.9031
[iter 9] loss=1.3197 val_loss=1.3456 scale=1.0000 norm=0.8859
[iter 10] loss=1.3010 val_loss=1.3093 scale=2.0000 norm=1.7391
[iter 11] loss=1.2656 val_loss=1.2921 scale=1.0000 norm=0.8390
[iter 12] loss=1.2487 val_loss=1.2583 scale=2.0000 norm=1.6500
[iter 13] loss=1.2153 val_loss=1.2271 scale=2.0000 norm=1.5964
[iter 14] loss=1.1843 val_loss=1.1972 scale=2.0000 norm=1.5490
[iter 15] loss=1.1544 val_loss=1.1678 scale=2.0000 norm=1.5055
[i



Expected 16 Excel files, but found 10 files. Skipping the merge step.
Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%




[iter 0] loss=0.6450 val_loss=0.6712 scale=1.0000 norm=1.1166
[iter 1] loss=0.6277 val_loss=0.6532 scale=1.0000 norm=1.0953
[iter 2] loss=0.6112 val_loss=0.6362 scale=1.0000 norm=1.0762
[iter 3] loss=0.5955 val_loss=0.6199 scale=1.0000 norm=1.0591
[iter 4] loss=0.5804 val_loss=0.6044 scale=1.0000 norm=1.0439
[iter 5] loss=0.5660 val_loss=0.5896 scale=1.0000 norm=1.0303
[iter 6] loss=0.5523 val_loss=0.5755 scale=1.0000 norm=1.0184
[iter 7] loss=0.5391 val_loss=0.5620 scale=1.0000 norm=1.0079
[iter 8] loss=0.5264 val_loss=0.5490 scale=1.0000 norm=0.9987
[iter 9] loss=0.5142 val_loss=0.5365 scale=1.0000 norm=0.9907
[iter 10] loss=0.5023 val_loss=0.5244 scale=1.0000 norm=0.9837
[iter 11] loss=0.4909 val_loss=0.5128 scale=1.0000 norm=0.9780
[iter 12] loss=0.4799 val_loss=0.5017 scale=1.0000 norm=0.9733
[iter 13] loss=0.4694 val_loss=0.4908 scale=1.0000 norm=0.9697
[iter 14] loss=0.4591 val_loss=0.4802 scale=1.0000 norm=0.9670
[iter 15] loss=0.4491 val_loss=0.4701 scale=1.0000 norm=0.9652
[i



Expected 16 Excel files, but found 11 files. Skipping the merge step.
Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%




[iter 0] loss=1.5650 val_loss=1.5737 scale=1.0000 norm=1.1098
[iter 1] loss=1.5228 val_loss=1.5375 scale=1.0000 norm=1.0709
[iter 2] loss=1.4874 val_loss=1.5067 scale=1.0000 norm=1.0384
[iter 3] loss=1.4573 val_loss=1.4778 scale=1.0000 norm=1.0107
[iter 4] loss=1.4295 val_loss=1.4522 scale=1.0000 norm=0.9852
[iter 5] loss=1.4046 val_loss=1.4281 scale=1.0000 norm=0.9624
[iter 6] loss=1.3813 val_loss=1.4058 scale=1.0000 norm=0.9411
[iter 7] loss=1.3597 val_loss=1.3846 scale=1.0000 norm=0.9216
[iter 8] loss=1.3391 val_loss=1.3648 scale=1.0000 norm=0.9031
[iter 9] loss=1.3197 val_loss=1.3456 scale=1.0000 norm=0.8859
[iter 10] loss=1.3010 val_loss=1.3093 scale=2.0000 norm=1.7391
[iter 11] loss=1.2656 val_loss=1.2921 scale=1.0000 norm=0.8390
[iter 12] loss=1.2487 val_loss=1.2583 scale=2.0000 norm=1.6500
[iter 13] loss=1.2153 val_loss=1.2271 scale=2.0000 norm=1.5964
[iter 14] loss=1.1843 val_loss=1.1972 scale=2.0000 norm=1.5490
[iter 15] loss=1.1544 val_loss=1.1678 scale=2.0000 norm=1.5055
[i



Expected 16 Excel files, but found 12 files. Skipping the merge step.
Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%




[iter 0] loss=0.6450 val_loss=0.6806 scale=2.0000 norm=2.2332
[iter 1] loss=0.6374 val_loss=0.6721 scale=2.0000 norm=2.2169
[iter 2] loss=0.6306 val_loss=0.6645 scale=2.0000 norm=2.2040
[iter 3] loss=0.6245 val_loss=0.6577 scale=2.0000 norm=2.1939
[iter 4] loss=0.6189 val_loss=0.6515 scale=2.0000 norm=2.1860
[iter 5] loss=0.6139 val_loss=0.6459 scale=2.0000 norm=2.1802
[iter 6] loss=0.6094 val_loss=0.6409 scale=2.0000 norm=2.1761
[iter 7] loss=0.6052 val_loss=0.6363 scale=2.0000 norm=2.1736
[iter 8] loss=0.6015 val_loss=0.6321 scale=2.0000 norm=2.1724
[iter 9] loss=0.5981 val_loss=0.6283 scale=2.0000 norm=2.1723
[iter 10] loss=0.5950 val_loss=0.6248 scale=2.0000 norm=2.1732
[iter 11] loss=0.5922 val_loss=0.6218 scale=2.0000 norm=2.1750
[iter 12] loss=0.5896 val_loss=0.6189 scale=2.0000 norm=2.1776
[iter 13] loss=0.5872 val_loss=0.6163 scale=2.0000 norm=2.1807
[iter 14] loss=0.5851 val_loss=0.6139 scale=2.0000 norm=2.1844
[iter 15] loss=0.5831 val_loss=0.6118 scale=2.0000 norm=2.1885
[i



Expected 16 Excel files, but found 13 files. Skipping the merge step.
Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%




[iter 0] loss=1.5650 val_loss=1.6072 scale=1.0000 norm=1.1098
[iter 1] loss=1.5577 val_loss=1.5997 scale=1.0000 norm=1.1022
[iter 2] loss=1.5508 val_loss=1.5927 scale=1.0000 norm=1.0951
[iter 3] loss=1.5444 val_loss=1.5862 scale=1.0000 norm=1.0886
[iter 4] loss=1.5385 val_loss=1.5801 scale=1.0000 norm=1.0825
[iter 5] loss=1.5329 val_loss=1.5743 scale=1.0000 norm=1.0769
[iter 6] loss=1.5277 val_loss=1.5689 scale=1.0000 norm=1.0716
[iter 7] loss=1.5228 val_loss=1.5638 scale=1.0000 norm=1.0668
[iter 8] loss=1.5181 val_loss=1.5590 scale=1.0000 norm=1.0622
[iter 9] loss=1.5137 val_loss=1.5545 scale=1.0000 norm=1.0579
[iter 10] loss=1.5095 val_loss=1.5501 scale=1.0000 norm=1.0539
[iter 11] loss=1.5056 val_loss=1.5459 scale=1.0000 norm=1.0501
[iter 12] loss=1.5019 val_loss=1.5419 scale=1.0000 norm=1.0466
[iter 13] loss=1.4984 val_loss=1.5382 scale=1.0000 norm=1.0434
[iter 14] loss=1.4950 val_loss=1.5347 scale=1.0000 norm=1.0403
[iter 15] loss=1.4918 val_loss=1.5313 scale=1.0000 norm=1.0375
[i



Expected 16 Excel files, but found 14 files. Skipping the merge step.
Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%




[iter 0] loss=0.6450 val_loss=0.6719 scale=1.0000 norm=1.1166
[iter 1] loss=0.6284 val_loss=0.6547 scale=1.0000 norm=1.0960
[iter 2] loss=0.6126 val_loss=0.6384 scale=1.0000 norm=1.0779
[iter 3] loss=0.5976 val_loss=0.6230 scale=1.0000 norm=1.0618
[iter 4] loss=0.5835 val_loss=0.6084 scale=1.0000 norm=1.0476
[iter 5] loss=0.5700 val_loss=0.5946 scale=1.0000 norm=1.0351
[iter 6] loss=0.5572 val_loss=0.5814 scale=1.0000 norm=1.0242
[iter 7] loss=0.5450 val_loss=0.5689 scale=1.0000 norm=1.0146
[iter 8] loss=0.5334 val_loss=0.5569 scale=1.0000 norm=1.0063
[iter 9] loss=0.5222 val_loss=0.5455 scale=1.0000 norm=0.9993
[iter 10] loss=0.5116 val_loss=0.5346 scale=1.0000 norm=0.9933
[iter 11] loss=0.5014 val_loss=0.5242 scale=1.0000 norm=0.9883
[iter 12] loss=0.4917 val_loss=0.5142 scale=1.0000 norm=0.9843
[iter 13] loss=0.4823 val_loss=0.5045 scale=1.0000 norm=0.9812
[iter 14] loss=0.4733 val_loss=0.4953 scale=1.0000 norm=0.9789
[iter 15] loss=0.4647 val_loss=0.4864 scale=1.0000 norm=0.9775
[i



Expected 16 Excel files, but found 15 files. Skipping the merge step.
Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%




[iter 0] loss=1.5650 val_loss=1.5774 scale=1.0000 norm=1.1098
[iter 1] loss=1.5271 val_loss=1.5447 scale=1.0000 norm=1.0745
[iter 2] loss=1.4948 val_loss=1.5157 scale=1.0000 norm=1.0443
[iter 3] loss=1.4665 val_loss=1.4896 scale=1.0000 norm=1.0178
[iter 4] loss=1.4409 val_loss=1.4425 scale=2.0000 norm=1.9878
[iter 5] loss=1.3953 val_loss=1.4024 scale=2.0000 norm=1.9023
[iter 6] loss=1.3559 val_loss=1.3665 scale=2.0000 norm=1.8293
[iter 7] loss=1.3208 val_loss=1.3334 scale=2.0000 norm=1.7655
[iter 8] loss=1.2887 val_loss=1.3033 scale=2.0000 norm=1.7090
[iter 9] loss=1.2590 val_loss=1.2748 scale=2.0000 norm=1.6589
[iter 10] loss=1.2309 val_loss=1.2482 scale=2.0000 norm=1.6139
[iter 11] loss=1.2044 val_loss=1.2230 scale=2.0000 norm=1.5735
[iter 12] loss=1.1792 val_loss=1.1986 scale=2.0000 norm=1.5373
[iter 13] loss=1.1550 val_loss=1.1753 scale=2.0000 norm=1.5048
[iter 14] loss=1.1318 val_loss=1.1529 scale=2.0000 norm=1.4755
[iter 15] loss=1.1093 val_loss=1.1314 scale=2.0000 norm=1.4490
[i



Merge completed! The final file is 'Merged_Sheet.xlsx'.


In [3]:
for i in range(1, 17):
    entsoe = load_entsoe()
    results_per_time_interval_df, results_summary_stats_df, results_per_row_df, hyperparameters_df = evaluate_ngboost_model(entsoe, case=i)


Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%


  m, s = sp.stats.norm.fit(np.log(Y))


ValueError: The data contains non-finite values.
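
The traceback points at `np.log(Y)`: the log-normal fit takes the logarithm of the target, so any zero in `Y` becomes `-inf` (and any negative value becomes `nan`), which `norm.fit` rejects. Below is a minimal sketch of the failure mode and two possible guards, assuming the target is a NumPy array. The sample values and the `eps` floor are illustrative assumptions, not the fix used in the original pipeline:

```python
import numpy as np

# Hypothetical scaled power values; the zero stands in for a period without generation.
Y = np.array([0.25, 0.0, 0.6, 0.1])

# np.log(0) evaluates to -inf, which is what makes the subsequent fit fail.
with np.errstate(divide="ignore"):
    log_Y = np.log(Y)
print(np.isfinite(log_Y))  # shows which entries survive the log transform

# Possible guard 1 (an assumption): clip the target to a small positive floor.
eps = 1e-6
Y_safe = np.clip(Y, eps, None)
log_Y_safe = np.log(Y_safe)

# Possible guard 2: drop the non-positive rows instead of clipping them.
Y_positive = Y[Y > 0]
```

Which guard is appropriate depends on whether zero output is meaningful for the forecast: clipping keeps every row but distorts the lower tail, while dropping rows changes the sample.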

In [None]:
entsoe = load_entsoe()

results_per_time_interval_df, results_summary_stats_df, results_per_row_df, hyperparameters_df = evaluate_ngboost_model(entsoe, case=1)

Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=0.6450 val_loss=0.6806 scale=2.0000 norm=2.2332
[iter 1] loss=0.6375 val_loss=0.6722 scale=2.0000 norm=2.2170
[iter 2] loss=0.6307 val_loss=0.6647 scale=2.0000 norm=2.2042
[iter 3] loss=0.6246 val_loss=0.6579 scale=2.0000 norm=2.1944
[iter 4] loss=0.6192 val_loss=0.6519 scale=2.0000 norm=2.1869
[iter 5] loss=0.6143 val_loss=0.6464 scale=2.0000 norm=2.1814
[iter 6] loss=0.6099 val_loss=0.6415 scale=2.0000 norm=2.1775
[iter 7] loss=0.6059 val_loss=0.6370 scale=2.0000 norm=2.1751
[iter 8] loss=0.6022 val_loss=0.6330 scale=2.0000 norm=2.1740
[iter 9] loss=0.5989 val_loss=0.6293 scale=2.0000 norm=2.1739
[iter 10] loss=0.5959 val_loss=0.6259 scale=2.0000 norm=2.1747
[iter 11] loss=0.5932 val_loss=0.6228 scale=2.0000 norm=2.1763
[iter 12] loss=0.5907 val_loss=0.6200 scale=2.0000 n

In [4]:
entsoe = load_entsoe()

results_per_time_interval_df, results_summary_stats_df, results_per_row_df, hyperparameters_df = evaluate_model(entsoe, case=2)

Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=-0.4862 val_loss=-0.2717 scale=1.0000 norm=1.1098
[iter 1] loss=-0.4931 val_loss=-0.2789 scale=1.0000 norm=1.1025
[iter 2] loss=-0.4995 val_loss=-0.2856 scale=1.0000 norm=1.0957
[iter 3] loss=-0.5054 val_loss=-0.2918 scale=1.0000 norm=1.0895
[iter 4] loss=-0.5109 val_loss=-0.2976 scale=1.0000 norm=1.0837
[iter 5] loss=-0.5161 val_loss=-0.3031 scale=1.0000 norm=1.0783
[iter 6] loss=-0.5209 val_loss=-0.3082 scale=1.0000 norm=1.0733
[iter 7] loss=-0.5255 val_loss=-0.3130 scale=1.0000 norm=1.0686
[iter 8] loss=-0.5297 val_loss=-0.3175 scale=1.0000 norm=1.0642
[iter 9] loss=-0.5338 val_loss=-0.3217 scale=1.0000 norm=1.0602
[iter 10] loss=-0.5376 val_loss=-0.3258 scale=1.0000 norm=1.0563
[iter 11] loss=-0.5413 val_loss=-0.3296 scale=1.0000 norm=1.0527
[iter 12] loss=-0.5447 val_l

In [5]:
entsoe = load_entsoe()

results_per_time_interval_df, results_summary_stats_df, results_per_row_df, hyperparameters_df = evaluate_model(entsoe, case=3)

Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=0.6450 val_loss=0.6719 scale=1.0000 norm=1.1166
[iter 1] loss=0.6284 val_loss=0.6547 scale=1.0000 norm=1.0961
[iter 2] loss=0.6126 val_loss=0.6384 scale=1.0000 norm=1.0779
[iter 3] loss=0.5976 val_loss=0.6230 scale=1.0000 norm=1.0618
[iter 4] loss=0.5835 val_loss=0.6085 scale=1.0000 norm=1.0476
[iter 5] loss=0.5700 val_loss=0.5946 scale=1.0000 norm=1.0351
[iter 6] loss=0.5572 val_loss=0.5815 scale=1.0000 norm=1.0242
[iter 7] loss=0.5450 val_loss=0.5690 scale=1.0000 norm=1.0146
[iter 8] loss=0.5334 val_loss=0.5570 scale=1.0000 norm=1.0064
[iter 9] loss=0.5223 val_loss=0.5457 scale=1.0000 norm=0.9993
[iter 10] loss=0.5117 val_loss=0.5347 scale=1.0000 norm=0.9933
[iter 11] loss=0.5015 val_loss=0.5243 scale=1.0000 norm=0.9883
[iter 12] loss=0.4917 val_loss=0.5143 scale=1.0000 n

In [6]:
entsoe = load_entsoe()

results_per_time_interval_df, results_summary_stats_df, results_per_row_df, hyperparameters_df = evaluate_model(entsoe, case=4)

Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=-0.4862 val_loss=-0.3017 scale=1.0000 norm=1.1098
[iter 1] loss=-0.5238 val_loss=-0.3343 scale=1.0000 norm=1.0747
[iter 2] loss=-0.5560 val_loss=-0.3633 scale=1.0000 norm=1.0446
[iter 3] loss=-0.5843 val_loss=-0.3893 scale=1.0000 norm=1.0181
[iter 4] loss=-0.6097 val_loss=-0.4362 scale=2.0000 norm=1.9884
[iter 5] loss=-0.6553 val_loss=-0.4763 scale=2.0000 norm=1.9030
[iter 6] loss=-0.6946 val_loss=-0.5120 scale=2.0000 norm=1.8300
[iter 7] loss=-0.7297 val_loss=-0.5448 scale=2.0000 norm=1.7662
[iter 8] loss=-0.7618 val_loss=-0.5749 scale=2.0000 norm=1.7097
[iter 9] loss=-0.7915 val_loss=-0.6032 scale=2.0000 norm=1.6596
[iter 10] loss=-0.8195 val_loss=-0.6298 scale=2.0000 norm=1.6145
[iter 11] loss=-0.8460 val_loss=-0.6550 scale=2.0000 norm=1.5742
[iter 12] loss=-0.8712 val_l

In [7]:
entsoe = load_entsoe()

results_per_time_interval_df, results_summary_stats_df, results_per_row_df, hyperparameters_df = evaluate_model(entsoe, case=5)

Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=0.6450 val_loss=0.6719 scale=1.0000 norm=1.1166
[iter 1] loss=0.6284 val_loss=0.6547 scale=1.0000 norm=1.0960
[iter 2] loss=0.6126 val_loss=0.6384 scale=1.0000 norm=1.0779
[iter 3] loss=0.5976 val_loss=0.6230 scale=1.0000 norm=1.0618
[iter 4] loss=0.5834 val_loss=0.6084 scale=1.0000 norm=1.0476
[iter 5] loss=0.5700 val_loss=0.5946 scale=1.0000 norm=1.0351
[iter 6] loss=0.5572 val_loss=0.5814 scale=1.0000 norm=1.0241
[iter 7] loss=0.5450 val_loss=0.5689 scale=1.0000 norm=1.0146
[iter 8] loss=0.5333 val_loss=0.5569 scale=1.0000 norm=1.0063
[iter 9] loss=0.5222 val_loss=0.5455 scale=1.0000 norm=0.9992
[iter 10] loss=0.5116 val_loss=0.5346 scale=1.0000 norm=0.9933
[iter 11] loss=0.5014 val_loss=0.5242 scale=1.0000 norm=0.9883
[iter 12] loss=0.4916 val_loss=0.5142 scale=1.0000 n

In [8]:
entsoe = load_entsoe()

results_per_time_interval_df, results_summary_stats_df, results_per_row_df, hyperparameters_df = evaluate_model(entsoe, case=6)

Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=-0.4862 val_loss=-0.3020 scale=1.0000 norm=1.1098
[iter 1] loss=-0.5244 val_loss=-0.3346 scale=1.0000 norm=1.0742
[iter 2] loss=-0.5567 val_loss=-0.3637 scale=1.0000 norm=1.0440
[iter 3] loss=-0.5852 val_loss=-0.3899 scale=1.0000 norm=1.0173
[iter 4] loss=-0.6108 val_loss=-0.4368 scale=2.0000 norm=1.9869
[iter 5] loss=-0.6564 val_loss=-0.4769 scale=2.0000 norm=1.9015
[iter 6] loss=-0.6957 val_loss=-0.5129 scale=2.0000 norm=1.8285
[iter 7] loss=-0.7308 val_loss=-0.5459 scale=2.0000 norm=1.7648
[iter 8] loss=-0.7630 val_loss=-0.5761 scale=2.0000 norm=1.7084
[iter 9] loss=-0.7927 val_loss=-0.6046 scale=2.0000 norm=1.6583
[iter 10] loss=-0.8208 val_loss=-0.6313 scale=2.0000 norm=1.6132
[iter 11] loss=-0.8473 val_loss=-0.6567 scale=2.0000 norm=1.5729
[iter 12] loss=-0.8725 val_l

In [9]:
entsoe = load_entsoe()

results_per_time_interval_df, results_summary_stats_df, results_per_row_df, hyperparameters_df = evaluate_model(entsoe, case=7)

Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=0.6450 val_loss=0.6712 scale=1.0000 norm=1.1166
[iter 1] loss=0.6277 val_loss=0.6532 scale=1.0000 norm=1.0952
[iter 2] loss=0.6111 val_loss=0.6360 scale=1.0000 norm=1.0760
[iter 3] loss=0.5953 val_loss=0.6198 scale=1.0000 norm=1.0588
[iter 4] loss=0.5803 val_loss=0.6043 scale=1.0000 norm=1.0438
[iter 5] loss=0.5659 val_loss=0.5895 scale=1.0000 norm=1.0302
[iter 6] loss=0.5522 val_loss=0.5753 scale=1.0000 norm=1.0183
[iter 7] loss=0.5389 val_loss=0.5618 scale=1.0000 norm=1.0078
[iter 8] loss=0.5262 val_loss=0.5488 scale=1.0000 norm=0.9986
[iter 9] loss=0.5140 val_loss=0.5364 scale=1.0000 norm=0.9906
[iter 10] loss=0.5022 val_loss=0.5244 scale=1.0000 norm=0.9837
[iter 11] loss=0.4909 val_loss=0.5128 scale=1.0000 norm=0.9780
[iter 12] loss=0.4800 val_loss=0.5015 scale=1.0000 n

In [10]:
entsoe = load_entsoe()

results_per_time_interval_df, results_summary_stats_df, results_per_row_df, hyperparameters_df = evaluate_model(entsoe, case=8)

Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=-0.4862 val_loss=-0.3054 scale=1.0000 norm=1.1098
[iter 1] loss=-0.5282 val_loss=-0.3417 scale=1.0000 norm=1.0710
[iter 2] loss=-0.5639 val_loss=-0.3723 scale=1.0000 norm=1.0382
[iter 3] loss=-0.5940 val_loss=-0.4012 scale=1.0000 norm=1.0103
[iter 4] loss=-0.6219 val_loss=-0.4272 scale=1.0000 norm=0.9848
[iter 5] loss=-0.6467 val_loss=-0.4512 scale=1.0000 norm=0.9621
[iter 6] loss=-0.6698 val_loss=-0.4735 scale=1.0000 norm=0.9410
[iter 7] loss=-0.6914 val_loss=-0.4945 scale=1.0000 norm=0.9214
[iter 8] loss=-0.7120 val_loss=-0.5145 scale=1.0000 norm=0.9030
[iter 9] loss=-0.7315 val_loss=-0.5335 scale=1.0000 norm=0.8857
[iter 10] loss=-0.7500 val_loss=-0.5518 scale=1.0000 norm=0.8695
[iter 11] loss=-0.7679 val_loss=-0.5695 scale=1.0000 norm=0.8541
[iter 12] loss=-0.7855 val_l

In [11]:
entsoe = load_entsoe()

results_per_time_interval_df, results_summary_stats_df, results_per_row_df, hyperparameters_df = evaluate_model(entsoe, case=9)

Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=0.6450 val_loss=0.6712 scale=1.0000 norm=1.1166
[iter 1] loss=0.6277 val_loss=0.6532 scale=1.0000 norm=1.0953
[iter 2] loss=0.6112 val_loss=0.6362 scale=1.0000 norm=1.0762
[iter 3] loss=0.5955 val_loss=0.6199 scale=1.0000 norm=1.0591
[iter 4] loss=0.5804 val_loss=0.6044 scale=1.0000 norm=1.0439
[iter 5] loss=0.5660 val_loss=0.5896 scale=1.0000 norm=1.0303
[iter 6] loss=0.5523 val_loss=0.5755 scale=1.0000 norm=1.0184
[iter 7] loss=0.5391 val_loss=0.5620 scale=1.0000 norm=1.0079
[iter 8] loss=0.5264 val_loss=0.5490 scale=1.0000 norm=0.9987
[iter 9] loss=0.5142 val_loss=0.5365 scale=1.0000 norm=0.9907
[iter 10] loss=0.5023 val_loss=0.5244 scale=1.0000 norm=0.9837
[iter 11] loss=0.4909 val_loss=0.5128 scale=1.0000 norm=0.9780
[iter 12] loss=0.4799 val_loss=0.5017 scale=1.0000 n

In [12]:
entsoe = load_entsoe()

results_per_time_interval_df, results_summary_stats_df, results_per_row_df, hyperparameters_df = evaluate_model(entsoe, case=10)

Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=-0.4862 val_loss=-0.3056 scale=1.0000 norm=1.1098
[iter 1] loss=-0.5283 val_loss=-0.3417 scale=1.0000 norm=1.0709
[iter 2] loss=-0.5638 val_loss=-0.3725 scale=1.0000 norm=1.0384
[iter 3] loss=-0.5939 val_loss=-0.4014 scale=1.0000 norm=1.0107
[iter 4] loss=-0.6217 val_loss=-0.4271 scale=1.0000 norm=0.9852
[iter 5] loss=-0.6465 val_loss=-0.4512 scale=1.0000 norm=0.9624
[iter 6] loss=-0.6699 val_loss=-0.4735 scale=1.0000 norm=0.9411
[iter 7] loss=-0.6915 val_loss=-0.4946 scale=1.0000 norm=0.9216
[iter 8] loss=-0.7120 val_loss=-0.5145 scale=1.0000 norm=0.9031
[iter 9] loss=-0.7315 val_loss=-0.5336 scale=1.0000 norm=0.8859
[iter 10] loss=-0.7501 val_loss=-0.5700 scale=2.0000 norm=1.7391
[iter 11] loss=-0.7856 val_loss=-0.5871 scale=1.0000 norm=0.8390
[iter 12] loss=-0.8025 val_l

In [13]:
entsoe = load_entsoe()

results_per_time_interval_df, results_summary_stats_df, results_per_row_df, hyperparameters_df = evaluate_model(entsoe, case=11)

Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=0.6450 val_loss=0.6712 scale=1.0000 norm=1.1166
[iter 1] loss=0.6277 val_loss=0.6532 scale=1.0000 norm=1.0953
[iter 2] loss=0.6112 val_loss=0.6362 scale=1.0000 norm=1.0762
[iter 3] loss=0.5955 val_loss=0.6199 scale=1.0000 norm=1.0591
[iter 4] loss=0.5804 val_loss=0.6044 scale=1.0000 norm=1.0439
[iter 5] loss=0.5660 val_loss=0.5896 scale=1.0000 norm=1.0303
[iter 6] loss=0.5523 val_loss=0.5755 scale=1.0000 norm=1.0184
[iter 7] loss=0.5391 val_loss=0.5620 scale=1.0000 norm=1.0079
[iter 8] loss=0.5264 val_loss=0.5490 scale=1.0000 norm=0.9987
[iter 9] loss=0.5142 val_loss=0.5365 scale=1.0000 norm=0.9907
[iter 10] loss=0.5023 val_loss=0.5244 scale=1.0000 norm=0.9837
[iter 11] loss=0.4909 val_loss=0.5128 scale=1.0000 norm=0.9780
[iter 12] loss=0.4799 val_loss=0.5017 scale=1.0000 n

In [14]:
entsoe = load_entsoe()

results_per_time_interval_df, results_summary_stats_df, results_per_row_df, hyperparameters_df = evaluate_model(entsoe, case=12)

Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=-0.4862 val_loss=-0.3056 scale=1.0000 norm=1.1098
[iter 1] loss=-0.5283 val_loss=-0.3417 scale=1.0000 norm=1.0709
[iter 2] loss=-0.5638 val_loss=-0.3725 scale=1.0000 norm=1.0384
[iter 3] loss=-0.5939 val_loss=-0.4014 scale=1.0000 norm=1.0107
[iter 4] loss=-0.6217 val_loss=-0.4271 scale=1.0000 norm=0.9852
[iter 5] loss=-0.6465 val_loss=-0.4512 scale=1.0000 norm=0.9624
[iter 6] loss=-0.6699 val_loss=-0.4735 scale=1.0000 norm=0.9411
[iter 7] loss=-0.6915 val_loss=-0.4946 scale=1.0000 norm=0.9216
[iter 8] loss=-0.7120 val_loss=-0.5145 scale=1.0000 norm=0.9031
[iter 9] loss=-0.7315 val_loss=-0.5336 scale=1.0000 norm=0.8859
[iter 10] loss=-0.7501 val_loss=-0.5700 scale=2.0000 norm=1.7391
[iter 11] loss=-0.7856 val_loss=-0.5871 scale=1.0000 norm=0.8390
[iter 12] loss=-0.8025 val_l

In [3]:
entsoe = load_entsoe()

results_per_time_interval_df, results_summary_stats_df, results_per_row_df, hyperparameters_df = evaluate_ngboost_model(entsoe, case=13)

Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=0.6450 val_loss=0.6806 scale=2.0000 norm=2.2332
[iter 1] loss=0.6374 val_loss=0.6721 scale=2.0000 norm=2.2169
[iter 2] loss=0.6306 val_loss=0.6645 scale=2.0000 norm=2.2040
[iter 3] loss=0.6245 val_loss=0.6577 scale=2.0000 norm=2.1939
[iter 4] loss=0.6189 val_loss=0.6515 scale=2.0000 norm=2.1860
[iter 5] loss=0.6139 val_loss=0.6459 scale=2.0000 norm=2.1802
[iter 6] loss=0.6094 val_loss=0.6409 scale=2.0000 norm=2.1761
[iter 7] loss=0.6052 val_loss=0.6363 scale=2.0000 norm=2.1736
[iter 8] loss=0.6015 val_loss=0.6321 scale=2.0000 norm=2.1724
[iter 9] loss=0.5981 val_loss=0.6283 scale=2.0000 norm=2.1723
[iter 10] loss=0.5950 val_loss=0.6248 scale=2.0000 norm=2.1732
[iter 11] loss=0.5922 val_loss=0.6218 scale=2.0000 norm=2.1750
[iter 12] loss=0.5896 val_loss=0.6189 scale=2.0000 n

In [4]:
entsoe = load_entsoe()

results_per_time_interval_df, results_summary_stats_df, results_per_row_df, hyperparameters_df = evaluate_ngboost_model(entsoe, case=14)

Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=-0.4862 val_loss=-0.2720 scale=1.0000 norm=1.1098
[iter 1] loss=-0.4935 val_loss=-0.2795 scale=1.0000 norm=1.1022
[iter 2] loss=-0.5004 val_loss=-0.2865 scale=1.0000 norm=1.0951
[iter 3] loss=-0.5067 val_loss=-0.2931 scale=1.0000 norm=1.0886
[iter 4] loss=-0.5127 val_loss=-0.2992 scale=1.0000 norm=1.0825
[iter 5] loss=-0.5182 val_loss=-0.3049 scale=1.0000 norm=1.0769
[iter 6] loss=-0.5235 val_loss=-0.3103 scale=1.0000 norm=1.0716
[iter 7] loss=-0.5284 val_loss=-0.3154 scale=1.0000 norm=1.0667
[iter 8] loss=-0.5331 val_loss=-0.3202 scale=1.0000 norm=1.0621
[iter 9] loss=-0.5375 val_loss=-0.3248 scale=1.0000 norm=1.0579
[iter 10] loss=-0.5416 val_loss=-0.3292 scale=1.0000 norm=1.0539
[iter 11] loss=-0.5456 val_loss=-0.3333 scale=1.0000 norm=1.0501
[iter 12] loss=-0.5493 val_l

In [5]:
entsoe = load_entsoe()

results_per_time_interval_df, results_summary_stats_df, results_per_row_df, hyperparameters_df = evaluate_ngboost_model(entsoe, case=15)

Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=0.6450 val_loss=0.6719 scale=1.0000 norm=1.1166
[iter 1] loss=0.6284 val_loss=0.6547 scale=1.0000 norm=1.0960
[iter 2] loss=0.6126 val_loss=0.6384 scale=1.0000 norm=1.0779
[iter 3] loss=0.5976 val_loss=0.6230 scale=1.0000 norm=1.0618
[iter 4] loss=0.5835 val_loss=0.6084 scale=1.0000 norm=1.0476
[iter 5] loss=0.5700 val_loss=0.5946 scale=1.0000 norm=1.0351
[iter 6] loss=0.5572 val_loss=0.5814 scale=1.0000 norm=1.0242
[iter 7] loss=0.5450 val_loss=0.5689 scale=1.0000 norm=1.0146
[iter 8] loss=0.5334 val_loss=0.5569 scale=1.0000 norm=1.0063
[iter 9] loss=0.5222 val_loss=0.5455 scale=1.0000 norm=0.9993
[iter 10] loss=0.5116 val_loss=0.5346 scale=1.0000 norm=0.9933
[iter 11] loss=0.5014 val_loss=0.5242 scale=1.0000 norm=0.9883
[iter 12] loss=0.4917 val_loss=0.5142 scale=1.0000 n

In [6]:
entsoe = load_entsoe()

results_per_time_interval_df, results_summary_stats_df, results_per_row_df, hyperparameters_df = evaluate_ngboost_model(entsoe, case=16)

Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=-0.4862 val_loss=-0.3018 scale=1.0000 norm=1.1098
[iter 1] loss=-0.5241 val_loss=-0.3345 scale=1.0000 norm=1.0745
[iter 2] loss=-0.5563 val_loss=-0.3636 scale=1.0000 norm=1.0443
[iter 3] loss=-0.5847 val_loss=-0.3897 scale=1.0000 norm=1.0178
[iter 4] loss=-0.6102 val_loss=-0.4367 scale=2.0000 norm=1.9877
[iter 5] loss=-0.6558 val_loss=-0.4768 scale=2.0000 norm=1.9023
[iter 6] loss=-0.6952 val_loss=-0.5127 scale=2.0000 norm=1.8293
[iter 7] loss=-0.7303 val_loss=-0.5458 scale=2.0000 norm=1.7655
[iter 8] loss=-0.7625 val_loss=-0.5759 scale=2.0000 norm=1.7090
[iter 9] loss=-0.7922 val_loss=-0.6044 scale=2.0000 norm=1.6589
[iter 10] loss=-0.8203 val_loss=-0.6310 scale=2.0000 norm=1.6139
[iter 11] loss=-0.8467 val_loss=-0.6563 scale=2.0000 norm=1.5735
[iter 12] loss=-0.8720 val_l

# Old code without figure and Excel document for merged information

In [3]:
entsoe = load_entsoe()
evaluate_model(entsoe, n_estimators=100, case=1)

Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=0.6450 val_loss=0.6806 scale=2.0000 norm=2.2332
[iter 1] loss=0.6375 val_loss=0.6722 scale=2.0000 norm=2.2170
[iter 2] loss=0.6307 val_loss=0.6647 scale=2.0000 norm=2.2042
[iter 3] loss=0.6246 val_loss=0.6579 scale=2.0000 norm=2.1944
[iter 4] loss=0.6192 val_loss=0.6519 scale=2.0000 norm=2.1869
[iter 5] loss=0.6143 val_loss=0.6464 scale=2.0000 norm=2.1814
[iter 6] loss=0.6099 val_loss=0.6415 scale=2.0000 norm=2.1775
[iter 7] loss=0.6059 val_loss=0.6370 scale=2.0000 norm=2.1751
[iter 8] loss=0.6022 val_loss=0.6330 scale=2.0000 norm=2.1740
[iter 9] loss=0.5989 val_loss=0.6293 scale=2.0000 norm=2.1739
[iter 10] loss=0.5959 val_loss=0.6259 scale=2.0000 norm=2.1747
[iter 11] loss=0.5932 val_loss=0.6228 scale=2.0000 norm=2.1763
[iter 12] loss=0.5907 val_loss=0.6200 scale=2.0000 n

|  | Interval | CRPS_gaussian_mean | CRPS_gaussian_min | CRPS_gaussian_max | CRPS_lognormal_mean | CRPS_lognormal_min | CRPS_lognormal_max | NLL_mean | NLL_min | NLL_max | model_scores |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0.527452 | 0.127293 | 2.945731 | 0.104590 | 0.021858 | 0.464429 | 0.301276 | -16.310341 | 2.462182 | 0.527452 |
| 1 | 2 | 0.528015 | 0.117377 | 3.034884 | 0.104495 | 0.021690 | 0.473311 | 0.301891 | -17.337429 | 2.458429 | 0.528015 |
| 2 | 3 | 0.529974 | 0.118462 | 3.208672 | 0.103955 | 0.023160 | 0.479594 | 0.306262 | -18.670506 | 2.451273 | 0.529974 |
| 3 | 4 | 0.533789 | 0.126418 | 3.471371 | 0.103474 | 0.023219 | 0.475667 | 0.302312 | -22.804050 | 2.451273 | 0.533789 |
| 4 | 5 | 0.534015 | 0.125654 | 3.623714 | 0.102753 | 0.023301 | 0.467540 | 0.303279 | -26.445000 | 2.450815 | 0.534015 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 91 | 92 | 0.527363 | 0.121338 | 2.567617 | 0.107025 | 0.021376 | 0.445425 | 0.288973 | -11.218587 | 2.417542 | 0.527363 |
| 92 | 93 | 0.526984 | 0.121263 | 2.658296 | 0.106515 | 0.023216 | 0.455137 | 0.289712 | -11.682353 | 2.441640 | 0.526984 |
| 93 | 94 | 0.527614 | 0.122556 | 2.759381 | 0.105974 | 0.023686 | 0.450364 | 0.286348 | -12.547662 | 2.449185 | 0.527614 |
| 94 | 95 | 0.528161 | 0.119704 | 2.856737 | 0.105456 | 0.023459 | 0.461679 | 0.293559 | -13.897368 | 2.457971 | 0.528161 |


|  | CRPS_gaussian_mean | CRPS_gaussian_min | CRPS_gaussian_max | CRPS_lognormal_mean | CRPS_lognormal_min | CRPS_lognormal_max | NLL_mean | NLL_min | NLL_max | model_scores_mean |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.594162 | 0.097035 | 5.890023 | 0.106901 | 0.019255 | 0.67271 | 0.357223 | -76.403841 | 2.462182 | 0.594162 |


|  | Entry_no | CRPS_gaussian | CRPS_lognormal | NLL |
|---|---|---|---|---|
| 0 | 1 | 0.318985 | 0.135220 | -0.336709 |
| 1 | 2 | 0.329743 | 0.141667 | -0.374002 |
| 2 | 3 | 0.344702 | 0.140830 | -0.389813 |
| 3 | 4 | 0.370102 | 0.155723 | -0.480130 |
| 4 | 5 | 0.352404 | 0.143660 | -0.406276 |
| ... | ... | ... | ... | ... |
| 35035 | 35036 | 0.588794 | 0.169652 | -0.636659 |
| 35036 | 35037 | 0.599610 | 0.174756 | -0.668718 |
| 35037 | 35038 | 0.589348 | 0.169897 | -0.638686 |
| 35038 | 35039 | 0.594754 | 0.172441 | -0.654711 |


|  | dataset | feature_columns | distribution | loss_function | iterations | learning_rate | random_state |
|---|---|---|---|---|---|---|---|
| 0 | entsoe | `[power_t-96]` | `<class 'ngboost.distns.lognormal.LogNormal'>` | `<class 'ngboost.scores.CRPScore'>` | 100 | 0.03 | 42 |


(None, None, None, None)
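
The per-interval table in this output reports the mean, minimum and maximum of each per-row score within each 15-minute interval of the day. Here is a minimal sketch of how such a summary could be derived from a per-row table with pandas. The toy values and the way `Interval` is assigned are assumptions for illustration, not the original implementation:

```python
import pandas as pd

# Toy per-row scores; column names follow the per-row table, values are made up.
per_row = pd.DataFrame({
    "Entry_no": [1, 2, 3, 4, 5, 6],
    "CRPS_gaussian": [0.30, 0.34, 0.32, 0.36, 0.31, 0.35],
    "NLL": [-0.30, -0.40, -0.35, -0.45, -0.20, -0.50],
})

# Assumption: rows are ordered in time and cycle through the intervals of a day.
# Two intervals are used here instead of the 96 quarter-hours of a real day.
n_intervals = 2
per_row["Interval"] = (per_row["Entry_no"] - 1) % n_intervals + 1

# Mean, min and max of every score within each interval.
per_interval = (
    per_row.groupby("Interval")[["CRPS_gaussian", "NLL"]].agg(["mean", "min", "max"])
)

# Flatten the two-level column index into names such as CRPS_gaussian_mean.
per_interval.columns = ["_".join(col) for col in per_interval.columns]
per_interval = per_interval.reset_index()
print(per_interval)
```

Applying the same three aggregations to the whole per-row table, without the `groupby`, would give a single-row summary of the kind shown in the overall statistics table.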

In [4]:
entsoe = load_entsoe()
evaluate_model(entsoe, n_estimators=100, case=2)

Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=-0.4862 val_loss=-0.2717 scale=1.0000 norm=1.1098
[iter 1] loss=-0.4931 val_loss=-0.2789 scale=1.0000 norm=1.1025
[iter 2] loss=-0.4995 val_loss=-0.2856 scale=1.0000 norm=1.0957
[iter 3] loss=-0.5054 val_loss=-0.2918 scale=1.0000 norm=1.0895
[iter 4] loss=-0.5109 val_loss=-0.2976 scale=1.0000 norm=1.0837
[iter 5] loss=-0.5161 val_loss=-0.3031 scale=1.0000 norm=1.0783
[iter 6] loss=-0.5209 val_loss=-0.3082 scale=1.0000 norm=1.0733
[iter 7] loss=-0.5255 val_loss=-0.3130 scale=1.0000 norm=1.0686
[iter 8] loss=-0.5297 val_loss=-0.3175 scale=1.0000 norm=1.0642
[iter 9] loss=-0.5338 val_loss=-0.3217 scale=1.0000 norm=1.0602
[iter 10] loss=-0.5376 val_loss=-0.3258 scale=1.0000 norm=1.0563
[iter 11] loss=-0.5413 val_loss=-0.3296 scale=1.0000 norm=1.0527
[iter 12] loss=-0.5447 val_l

|  | Interval | CRPS_gaussian_mean | CRPS_gaussian_min | CRPS_gaussian_max | CRPS_lognormal_mean | CRPS_lognormal_min | CRPS_lognormal_max | NLL_mean | NLL_min | NLL_max | model_scores |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0.537649 | 0.145426 | 2.751427 | 0.106279 | 0.021762 | 0.470965 | 0.310102 | -8.188232 | 2.620279 | -0.310102 |
| 1 | 2 | 0.538279 | 0.140797 | 2.840580 | 0.106227 | 0.021418 | 0.482642 | 0.313692 | -8.773340 | 2.593023 | -0.313692 |
| 2 | 3 | 0.540054 | 0.130817 | 3.017881 | 0.105717 | 0.022286 | 0.493840 | 0.318430 | -9.876306 | 2.567069 | -0.318430 |
| 3 | 4 | 0.543939 | 0.127473 | 3.277068 | 0.105252 | 0.022260 | 0.485165 | 0.319234 | -11.912503 | 2.567069 | -0.319234 |
| 4 | 5 | 0.544089 | 0.126221 | 3.452851 | 0.104539 | 0.022252 | 0.475086 | 0.315138 | -16.359392 | 2.565647 | -0.315138 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 91 | 92 | 0.537392 | 0.140868 | 2.455417 | 0.108759 | 0.022798 | 0.451250 | 0.299585 | -5.844949 | 2.610450 | -0.299585 |
| 92 | 93 | 0.536985 | 0.140934 | 2.557743 | 0.108168 | 0.023186 | 0.459892 | 0.300358 | -6.087454 | 2.682225 | -0.300358 |
| 93 | 94 | 0.538411 | 0.141086 | 2.662264 | 0.107935 | 0.023695 | 0.458133 | 0.296767 | -6.608451 | 2.673838 | -0.296767 |
| 94 | 95 | 0.538613 | 0.140691 | 2.766034 | 0.107272 | 0.022558 | 0.471411 | 0.298518 | -7.351262 | 2.656880 | -0.298518 |


|  | CRPS_gaussian_mean | CRPS_gaussian_min | CRPS_gaussian_max | CRPS_lognormal_mean | CRPS_lognormal_min | CRPS_lognormal_max | NLL_mean | NLL_min | NLL_max | model_scores_mean |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.598208 | 0.1217 | 5.76988 | 0.107759 | 0.020418 | 0.681694 | 0.391899 | -22.42936 | 2.703592 | -0.391899 |


|  | Entry_no | CRPS_gaussian | CRPS_lognormal | NLL |
|---|---|---|---|---|
| 0 | 1 | 0.362262 | 0.148818 | -0.432126 |
| 1 | 2 | 0.374076 | 0.155768 | -0.473793 |
| 2 | 3 | 0.387520 | 0.153610 | -0.478837 |
| 3 | 4 | 0.414538 | 0.169451 | -0.570695 |
| 4 | 5 | 0.388123 | 0.154426 | -0.478541 |
| ... | ... | ... | ... | ... |
| 35035 | 35036 | 0.641102 | 0.179656 | -0.718106 |
| 35036 | 35037 | 0.652189 | 0.184888 | -0.750035 |
| 35037 | 35038 | 0.645731 | 0.180766 | -0.725969 |
| 35038 | 35039 | 0.651295 | 0.183384 | -0.741976 |


|  | dataset | feature_columns | distribution | loss_function | iterations | learning_rate | random_state |
|---|---|---|---|---|---|---|---|
| 0 | entsoe | `[power_t-96]` | `<class 'ngboost.distns.lognormal.LogNormal'>` | `<class 'ngboost.scores.LogScore'>` | 100 | 0.03 | 42 |


(None, None, None, None)

In [5]:
entsoe = load_entsoe()
evaluate_model(entsoe, n_estimators=100, case=3)

Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=0.6450 val_loss=0.6719 scale=1.0000 norm=1.1166
[iter 1] loss=0.6284 val_loss=0.6547 scale=1.0000 norm=1.0961
[iter 2] loss=0.6126 val_loss=0.6384 scale=1.0000 norm=1.0779
[iter 3] loss=0.5976 val_loss=0.6230 scale=1.0000 norm=1.0618
[iter 4] loss=0.5835 val_loss=0.6085 scale=1.0000 norm=1.0476
[iter 5] loss=0.5700 val_loss=0.5946 scale=1.0000 norm=1.0351
[iter 6] loss=0.5572 val_loss=0.5815 scale=1.0000 norm=1.0242
[iter 7] loss=0.5450 val_loss=0.5690 scale=1.0000 norm=1.0146
[iter 8] loss=0.5334 val_loss=0.5570 scale=1.0000 norm=1.0064
[iter 9] loss=0.5223 val_loss=0.5457 scale=1.0000 norm=0.9993
[iter 10] loss=0.5117 val_loss=0.5347 scale=1.0000 norm=0.9933
[iter 11] loss=0.5015 val_loss=0.5243 scale=1.0000 norm=0.9883
[iter 12] loss=0.4917 val_loss=0.5143 scale=1.0000 n

|  | Interval | CRPS_gaussian_mean | CRPS_gaussian_min | CRPS_gaussian_max | CRPS_lognormal_mean | CRPS_lognormal_min | CRPS_lognormal_max | NLL_mean | NLL_min | NLL_max | model_scores |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0.264119 | 0.039231 | 1.843364 | 0.043700 | 0.007590 | 0.246034 | 1.181318 | -4.452546 | 3.239751 | 0.264119 |
| 1 | 2 | 0.256047 | 0.040304 | 1.826902 | 0.042521 | 0.007338 | 0.245063 | 1.220266 | -4.875345 | 3.197452 | 0.256047 |
| 2 | 3 | 0.255210 | 0.042522 | 1.847871 | 0.042479 | 0.007353 | 0.250181 | 1.224061 | -5.582254 | 3.184421 | 0.255210 |
| 3 | 4 | 0.263589 | 0.040177 | 1.935265 | 0.043666 | 0.007335 | 0.256068 | 1.181143 | -6.810906 | 3.194656 | 0.263589 |
| 4 | 5 | 0.276389 | 0.037387 | 2.350113 | 0.046109 | 0.007356 | 0.264569 | 1.095351 | -13.806473 | 3.280074 | 0.276389 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 91 | 92 | 0.248994 | 0.042107 | 1.477869 | 0.042808 | 0.007356 | 0.492718 | 1.220676 | -7.945705 | 3.371682 | 0.248994 |
| 92 | 93 | 0.255955 | 0.042142 | 1.573762 | 0.043565 | 0.007415 | 0.510068 | 1.188542 | -8.208379 | 3.409821 | 0.255955 |
| 93 | 94 | 0.248534 | 0.041178 | 1.656464 | 0.042216 | 0.007543 | 0.455700 | 1.228028 | -6.596305 | 3.403516 | 0.248534 |
| 94 | 95 | 0.247176 | 0.040990 | 1.762140 | 0.041710 | 0.007644 | 0.367759 | 1.239769 | -5.086997 | 3.355049 | 0.247176 |


|  | CRPS_gaussian_mean | CRPS_gaussian_min | CRPS_gaussian_max | CRPS_lognormal_mean | CRPS_lognormal_min | CRPS_lognormal_max | NLL_mean | NLL_min | NLL_max | model_scores_mean |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.274447 | 0.036167 | 5.30723 | 0.041566 | 0.007335 | 0.517243 | 1.300109 | -25.411731 | 3.654129 | 0.274447 |


Unnamed: 0,Entry_no,CRPS_gaussian,CRPS_lognormal,NLL
0,1,0.128530,0.084984,0.672531
1,2,0.120391,0.080946,0.713430
2,3,0.147393,0.096697,0.542058
3,4,0.122406,0.083131,0.689244
4,5,0.163583,0.108400,0.408125
...,...,...,...,...
35035,35036,0.317015,0.120779,-0.374866
35036,35037,0.345987,0.132514,-0.614721
35037,35038,0.317944,0.121215,-0.382897
35038,35039,0.301698,0.116607,-0.269452


Unnamed: 0,dataset,feature_columns,distribution,loss_function,iterations,learning_rate,random_state
0,entsoe,"[ws_10m_loc_mean, ws_100m_loc_mean]",<class 'ngboost.distns.lognormal.LogNormal'>,<class 'ngboost.scores.CRPScore'>,100,0.03,42


(None, None, None, None)

In [7]:
entsoe = load_entsoe()
evaluate_model(entsoe, n_estimators=100, case=5)

Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=0.6450 val_loss=0.6719 scale=1.0000 norm=1.1166
[iter 1] loss=0.6284 val_loss=0.6547 scale=1.0000 norm=1.0960
[iter 2] loss=0.6126 val_loss=0.6384 scale=1.0000 norm=1.0779
[iter 3] loss=0.5976 val_loss=0.6230 scale=1.0000 norm=1.0618
[iter 4] loss=0.5834 val_loss=0.6084 scale=1.0000 norm=1.0476
[iter 5] loss=0.5700 val_loss=0.5946 scale=1.0000 norm=1.0351
[iter 6] loss=0.5572 val_loss=0.5814 scale=1.0000 norm=1.0241
[iter 7] loss=0.5450 val_loss=0.5689 scale=1.0000 norm=1.0146
[iter 8] loss=0.5333 val_loss=0.5569 scale=1.0000 norm=1.0063
[iter 9] loss=0.5222 val_loss=0.5455 scale=1.0000 norm=0.9992
[iter 10] loss=0.5116 val_loss=0.5346 scale=1.0000 norm=0.9933
[iter 11] loss=0.5014 val_loss=0.5242 scale=1.0000 norm=0.9883
[iter 12] loss=0.4916 val_loss=0.5142 scale=1.0000 n

Unnamed: 0,Interval,CRPS_gaussian_mean,CRPS_gaussian_min,CRPS_gaussian_max,CRPS_lognormal_mean,CRPS_lognormal_min,CRPS_lognormal_max,NLL_mean,NLL_min,NLL_max,model_scores
0,1,0.255146,0.035575,1.950867,0.042082,0.007416,0.256997,1.212219,-4.500322,3.416249,0.255146
1,2,0.247892,0.037963,1.923008,0.041063,0.007087,0.237951,1.246805,-4.644070,3.410094,0.247892
2,3,0.247589,0.039241,1.925165,0.040956,0.007136,0.240701,1.249662,-5.315924,3.399780,0.247589
3,4,0.255812,0.040879,1.928825,0.042054,0.007784,0.246844,1.211155,-5.384542,3.242941,0.255812
4,5,0.267912,0.039314,2.191476,0.044357,0.007525,0.258814,1.136819,-9.660994,3.278190,0.267912
...,...,...,...,...,...,...,...,...,...,...,...
91,92,0.239934,0.043625,1.534282,0.040816,0.007527,0.492859,1.260699,-8.247268,3.329846,0.239934
92,93,0.246606,0.043568,1.661790,0.041559,0.007375,0.507060,1.227222,-8.634752,3.365323,0.246606
93,94,0.239288,0.043730,1.753675,0.040279,0.007595,0.459845,1.271025,-6.714806,3.380517,0.239288
94,95,0.237746,0.041460,1.856388,0.039906,0.007591,0.379221,1.279859,-5.601854,3.353540,0.237746


Unnamed: 0,CRPS_gaussian_mean,CRPS_gaussian_min,CRPS_gaussian_max,CRPS_lognormal_mean,CRPS_lognormal_min,CRPS_lognormal_max,NLL_mean,NLL_min,NLL_max,model_scores_mean
0,0.267949,0.035494,5.215438,0.040297,0.006735,0.50706,1.32888,-22.550611,3.848754,0.267949


Unnamed: 0,Entry_no,CRPS_gaussian,CRPS_lognormal,NLL
0,1,0.124840,0.082706,0.696168
1,2,0.115007,0.077389,0.744579
2,3,0.140523,0.092272,0.599469
3,4,0.113316,0.076876,0.747241
4,5,0.146735,0.096920,0.550612
...,...,...,...,...
35035,35036,0.313481,0.120620,-0.526205
35036,35037,0.343048,0.132635,-0.813442
35037,35038,0.314431,0.121065,-0.535712
35038,35039,0.301583,0.117587,-0.429829


Unnamed: 0,dataset,feature_columns,distribution,loss_function,iterations,learning_rate,random_state
0,entsoe,"[power_t-96, ws_10m_loc_mean, ws_100m_loc_mean]",<class 'ngboost.distns.lognormal.LogNormal'>,<class 'ngboost.scores.CRPScore'>,100,0.03,42


(None, None, None, None)
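
The per-iteration lines in the training output above follow a fixed `key=value` layout, so they can be turned into records for plotting or for comparing runs. The following is a minimal sketch, assuming the verbose output has been captured as plain text. The helper name `parse_ngboost_log` is hypothetical and is not part of NGBoost or of this notebook. Truncated lines, such as the last log line shown above, are skipped rather than guessed.

```python
import re

# Matches complete lines such as:
# [iter 3] loss=0.5976 val_loss=0.6230 scale=1.0000 norm=1.0618
LINE_RE = re.compile(
    r"\[iter (?P<iter>\d+)\]\s+"
    r"loss=(?P<loss>-?\d+\.\d+)\s+"
    r"val_loss=(?P<val_loss>-?\d+\.\d+)\s+"
    r"scale=(?P<scale>-?\d+\.\d+)\s+"
    r"norm=(?P<norm>-?\d+\.\d+)"
)


def parse_ngboost_log(text):
    """Return one dict per complete log line; incomplete lines are ignored."""
    records = []
    for line in text.splitlines():
        match = LINE_RE.fullmatch(line.strip())
        if match is None:
            continue  # truncated or unrelated line
        record = {"iter": int(match["iter"])}
        for key in ("loss", "val_loss", "scale", "norm"):
            record[key] = float(match[key])
        records.append(record)
    return records


# Lines copied from the case 5 output; the last one is cut off in the export
sample = """\
[iter 0] loss=0.6450 val_loss=0.6719 scale=1.0000 norm=1.1166
[iter 1] loss=0.6284 val_loss=0.6547 scale=1.0000 norm=1.0960
[iter 12] loss=0.4916 val_loss=0.5142 scale=1.0000 n
"""
rows = parse_ngboost_log(sample)
print(len(rows), rows[-1]["val_loss"])
```

The resulting list of dicts can be passed directly to `pd.DataFrame(rows)` if a table is more convenient.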

In [8]:
entsoe = load_entsoe()
evaluate_model(entsoe, n_estimators=100, case=6)

Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=-0.4862 val_loss=-0.3020 scale=1.0000 norm=1.1098
[iter 1] loss=-0.5244 val_loss=-0.3346 scale=1.0000 norm=1.0742
[iter 2] loss=-0.5567 val_loss=-0.3637 scale=1.0000 norm=1.0440
[iter 3] loss=-0.5852 val_loss=-0.3899 scale=1.0000 norm=1.0173
[iter 4] loss=-0.6108 val_loss=-0.4368 scale=2.0000 norm=1.9869
[iter 5] loss=-0.6564 val_loss=-0.4769 scale=2.0000 norm=1.9015
[iter 6] loss=-0.6957 val_loss=-0.5129 scale=2.0000 norm=1.8285
[iter 7] loss=-0.7308 val_loss=-0.5459 scale=2.0000 norm=1.7648
[iter 8] loss=-0.7630 val_loss=-0.5761 scale=2.0000 norm=1.7084
[iter 9] loss=-0.7927 val_loss=-0.6046 scale=2.0000 norm=1.6583
[iter 10] loss=-0.8208 val_loss=-0.6313 scale=2.0000 norm=1.6132
[iter 11] loss=-0.8473 val_loss=-0.6567 scale=2.0000 norm=1.5729
[iter 12] loss=-0.8725 val_l

Unnamed: 0,Interval,CRPS_gaussian_mean,CRPS_gaussian_min,CRPS_gaussian_max,CRPS_lognormal_mean,CRPS_lognormal_min,CRPS_lognormal_max,NLL_mean,NLL_min,NLL_max,model_scores
0,1,0.258365,0.041534,1.873890,0.042607,0.005138,0.258824,1.212362,-4.052442,3.727156,-1.212362
1,2,0.250206,0.041294,1.869984,0.041449,0.004610,0.234926,1.248774,-3.170278,3.658750,-1.248774
2,3,0.250014,0.040297,1.876125,0.041387,0.004609,0.228437,1.254218,-3.213207,3.551091,-1.254218
3,4,0.258486,0.040369,1.878056,0.042423,0.004176,0.236317,1.222145,-3.388918,3.343097,-1.222145
4,5,0.270761,0.039846,1.969375,0.044661,0.004215,0.256110,1.153924,-7.971978,3.442451,-1.153924
...,...,...,...,...,...,...,...,...,...,...,...
91,92,0.245529,0.038566,1.518742,0.041301,0.004428,0.488119,1.251773,-7.315616,3.603943,-1.251773
92,93,0.250590,0.038851,1.624872,0.041804,0.004340,0.502634,1.229226,-7.800260,3.661389,-1.229226
93,94,0.242244,0.038633,1.712369,0.040708,0.004246,0.457102,1.263849,-6.205735,3.741193,-1.263849
94,95,0.241131,0.039353,1.813981,0.040483,0.004277,0.371905,1.269855,-4.795446,3.683912,-1.269855


Unnamed: 0,CRPS_gaussian_mean,CRPS_gaussian_min,CRPS_gaussian_max,CRPS_lognormal_mean,CRPS_lognormal_min,CRPS_lognormal_max,NLL_mean,NLL_min,NLL_max,model_scores_mean
0,0.269698,0.037543,5.155103,0.040694,0.002293,0.502634,1.334739,-18.039244,5.241196,-1.334739


Unnamed: 0,Entry_no,CRPS_gaussian,CRPS_lognormal,NLL
0,1,0.130655,0.085513,0.638052
1,2,0.117562,0.078117,0.739481
2,3,0.133611,0.086698,0.644791
3,4,0.101255,0.067743,0.860962
4,5,0.152563,0.100113,0.478771
...,...,...,...,...
35035,35036,0.318113,0.121147,-0.382802
35036,35037,0.345592,0.132168,-0.577432
35037,35038,0.319043,0.121583,-0.390849
35038,35039,0.314351,0.120946,-0.363308


Unnamed: 0,dataset,feature_columns,distribution,loss_function,iterations,learning_rate,random_state
0,entsoe,"[power_t-96, ws_10m_loc_mean, ws_100m_loc_mean]",<class 'ngboost.distns.lognormal.LogNormal'>,<class 'ngboost.scores.LogScore'>,100,0.03,42


(None, None, None, None)
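
In the LogScore run above (case 6), the `model_scores` column is the exact negative of `NLL_mean` (interval 1: `-1.212362` against `1.212362`), so the two columns evidently follow opposite sign conventions. The sketch below spells out both conventions for a log-normal density using only the standard library. The function names and the example values of `mu` and `sigma` are illustrative assumptions; they are not taken from `evaluate_model`, whose source is not shown here.

```python
import math


def lognormal_logpdf(y, mu, sigma):
    """Log-density of a log-normal with log-space mean mu and std sigma."""
    if y <= 0 or sigma <= 0:
        raise ValueError("y and sigma must be positive")
    z = (math.log(y) - mu) / sigma
    return -math.log(y) - math.log(sigma) - 0.5 * math.log(2 * math.pi) - 0.5 * z * z


def mean_log_likelihood(ys, mu, sigma):
    """Average log-density: higher is better."""
    return sum(lognormal_logpdf(y, mu, sigma) for y in ys) / len(ys)


def mean_nll(ys, mu, sigma):
    """Average negative log-likelihood: lower is better."""
    return -mean_log_likelihood(ys, mu, sigma)


# Illustrative values on a normalized scale, not data from the notebook
ys = [0.12, 0.15, 0.11, 0.14]
mu, sigma = math.log(0.13), 0.2

ll = mean_log_likelihood(ys, mu, sigma)
nll = mean_nll(ys, mu, sigma)
print(f"mean log-likelihood = {ll:.4f}, mean NLL = {nll:.4f}")
```

On a normalized target the density can exceed 1, so a positive mean log-likelihood (and a negative NLL) is plausible; this is consistent with the negative `loss` values in the LogScore training log. When comparing columns across sheets, it is worth confirming which of the two conventions each column follows.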

In [9]:
entsoe = load_entsoe()
evaluate_model(entsoe, n_estimators=100, case=7)

Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=0.6450 val_loss=0.6712 scale=1.0000 norm=1.1166
[iter 1] loss=0.6277 val_loss=0.6532 scale=1.0000 norm=1.0952
[iter 2] loss=0.6111 val_loss=0.6360 scale=1.0000 norm=1.0760
[iter 3] loss=0.5953 val_loss=0.6198 scale=1.0000 norm=1.0588
[iter 4] loss=0.5803 val_loss=0.6043 scale=1.0000 norm=1.0438
[iter 5] loss=0.5659 val_loss=0.5895 scale=1.0000 norm=1.0302
[iter 6] loss=0.5522 val_loss=0.5753 scale=1.0000 norm=1.0183
[iter 7] loss=0.5389 val_loss=0.5618 scale=1.0000 norm=1.0078
[iter 8] loss=0.5262 val_loss=0.5488 scale=1.0000 norm=0.9986
[iter 9] loss=0.5140 val_loss=0.5364 scale=1.0000 norm=0.9906
[iter 10] loss=0.5022 val_loss=0.5244 scale=1.0000 norm=0.9837
[iter 11] loss=0.4909 val_loss=0.5128 scale=1.0000 norm=0.9780
[iter 12] loss=0.4800 val_loss=0.5015 scale=1.0000 n

Unnamed: 0,Interval,CRPS_gaussian_mean,CRPS_gaussian_min,CRPS_gaussian_max,CRPS_lognormal_mean,CRPS_lognormal_min,CRPS_lognormal_max,NLL_mean,NLL_min,NLL_max,model_scores
0,1,0.177076,0.022424,1.211708,0.030956,0.003223,0.176491,1.533382,-5.130505,3.781915,0.177076
1,2,0.174571,0.023203,1.229947,0.030062,0.003590,0.187393,1.576459,-4.672855,3.849287,0.174571
2,3,0.171087,0.026813,1.260057,0.028845,0.003484,0.199236,1.625567,-4.621095,3.881627,0.171087
3,4,0.172172,0.025586,1.308767,0.028484,0.003397,0.203061,1.636395,-4.929359,3.822051,0.172172
4,5,0.172720,0.020645,1.672848,0.028300,0.003492,0.208911,1.653026,-5.168518,3.845731,0.172720
...,...,...,...,...,...,...,...,...,...,...,...
91,92,0.174026,0.021626,1.158200,0.030272,0.003905,0.289317,1.534440,-6.884829,3.707459,0.174026
92,93,0.176072,0.021976,1.271622,0.030546,0.003622,0.299818,1.512130,-6.377614,3.739484,0.176072
93,94,0.171679,0.023134,1.286986,0.030105,0.003629,0.339642,1.551462,-6.374514,3.696741,0.171679
94,95,0.171986,0.022217,1.178905,0.030349,0.003354,0.296615,1.552474,-5.856139,3.799814,0.171986


Unnamed: 0,CRPS_gaussian_mean,CRPS_gaussian_min,CRPS_gaussian_max,CRPS_lognormal_mean,CRPS_lognormal_min,CRPS_lognormal_max,NLL_mean,NLL_min,NLL_max,model_scores_mean
0,0.182779,0.020159,4.144601,0.028352,0.002734,0.357295,1.714674,-22.048296,4.087833,0.182779


Unnamed: 0,Entry_no,CRPS_gaussian,CRPS_lognormal,NLL
0,1,0.197566,0.131057,-0.452582
1,2,0.185209,0.124767,-0.276310
2,3,0.220739,0.145497,-0.826033
3,4,0.172337,0.116527,-0.088154
4,5,0.211078,0.138978,-0.663309
...,...,...,...,...
35035,35036,0.253844,0.101345,-0.351559
35036,35037,0.250614,0.101653,-0.297950
35037,35038,0.236827,0.095143,-0.134052
35038,35039,0.229735,0.093310,-0.077951


Unnamed: 0,dataset,feature_columns,distribution,loss_function,iterations,learning_rate,random_state
0,entsoe,"[power_t-96, ws_10m_loc_1, ws_10m_loc_2, ws_10...",<class 'ngboost.distns.lognormal.LogNormal'>,<class 'ngboost.scores.CRPScore'>,100,0.03,42


(None, None, None, None)
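
The split sizes printed at the top of each run can be cross-checked with a little arithmetic. The sketch below assumes 96 fifteen-minute intervals per day and the 2016–2022 / 2023 / 2024 split described earlier. The helper name `expected_rows` is hypothetical, and the explanation of the small shortfalls (one day lost to the `power_t-96` shift, a few rows dropped as NaN) is an inference from the counts, not something the output states.

```python
import calendar

INTERVALS_PER_DAY = 96  # 24 hours * 4 quarter-hours


def expected_rows(first_year, last_year):
    """Number of 15-minute rows in a span of whole calendar years."""
    days = sum(
        366 if calendar.isleap(year) else 365
        for year in range(first_year, last_year + 1)
    )
    return days * INTERVALS_PER_DAY


# Observation counts as printed in the run output
observed = {"train": 245376, "validation": 35040, "test": 35133}
expected = {
    "train": expected_rows(2016, 2022),
    "validation": expected_rows(2023, 2023),
    "test": expected_rows(2024, 2024),
}

total = sum(observed.values())
for name, n in observed.items():
    share = 100 * n / total
    missing = expected[name] - n
    print(f"{name:<10} {n:>7} rows | {share:5.2f}% | {missing} fewer than full calendar years")
```

Under these assumptions the training set is exactly 96 rows (one day) short of seven full years, the validation set matches a full 2023, and the test set is 3 rows short of a full leap year.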

In [10]:
entsoe = load_entsoe()
evaluate_model(entsoe, n_estimators=100, case=8)

Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=-0.4862 val_loss=-0.3054 scale=1.0000 norm=1.1098
[iter 1] loss=-0.5282 val_loss=-0.3417 scale=1.0000 norm=1.0710
[iter 2] loss=-0.5639 val_loss=-0.3723 scale=1.0000 norm=1.0382
[iter 3] loss=-0.5940 val_loss=-0.4012 scale=1.0000 norm=1.0103
[iter 4] loss=-0.6219 val_loss=-0.4272 scale=1.0000 norm=0.9848
[iter 5] loss=-0.6467 val_loss=-0.4512 scale=1.0000 norm=0.9621
[iter 6] loss=-0.6698 val_loss=-0.4735 scale=1.0000 norm=0.9410
[iter 7] loss=-0.6914 val_loss=-0.4945 scale=1.0000 norm=0.9214
[iter 8] loss=-0.7120 val_loss=-0.5145 scale=1.0000 norm=0.9030
[iter 9] loss=-0.7315 val_loss=-0.5335 scale=1.0000 norm=0.8857
[iter 10] loss=-0.7500 val_loss=-0.5518 scale=1.0000 norm=0.8695
[iter 11] loss=-0.7679 val_loss=-0.5695 scale=1.0000 norm=0.8541
[iter 12] loss=-0.7855 val_l

Unnamed: 0,Interval,CRPS_gaussian_mean,CRPS_gaussian_min,CRPS_gaussian_max,CRPS_lognormal_mean,CRPS_lognormal_min,CRPS_lognormal_max,NLL_mean,NLL_min,NLL_max,model_scores
0,1,0.175829,0.038148,1.117058,0.031442,0.001966,0.155610,1.545846,-4.761534,4.215046,-1.545846
1,2,0.176617,0.038518,1.263309,0.030852,0.002212,0.160704,1.558906,-3.997456,4.404543,-1.558906
2,3,0.173037,0.038551,1.258002,0.030113,0.002393,0.175959,1.590409,-3.332317,4.414499,-1.590409
3,4,0.173785,0.037935,1.415674,0.029588,0.002271,0.193886,1.608170,-3.164027,4.447552,-1.608170
4,5,0.175685,0.040427,1.563752,0.029703,0.002180,0.207698,1.607669,-3.817663,4.467974,-1.607669
...,...,...,...,...,...,...,...,...,...,...,...
91,92,0.178285,0.039121,1.358801,0.031757,0.001928,0.310332,1.504342,-5.587657,4.086741,-1.504342
92,93,0.179326,0.039299,1.300189,0.032081,0.002030,0.324806,1.497556,-4.816944,4.189585,-1.497556
93,94,0.175256,0.040386,1.172254,0.031502,0.002029,0.361505,1.533636,-5.043850,4.176753,-1.533636
94,95,0.171501,0.039683,1.142880,0.030961,0.002589,0.256387,1.557580,-3.557047,4.230513,-1.557580


Unnamed: 0,CRPS_gaussian_mean,CRPS_gaussian_min,CRPS_gaussian_max,CRPS_lognormal_mean,CRPS_lognormal_min,CRPS_lognormal_max,NLL_mean,NLL_min,NLL_max,model_scores_mean
0,0.182838,0.035524,4.066981,0.029525,0.000958,0.361505,1.704331,-9.897822,5.258404,-1.704331


Unnamed: 0,Entry_no,CRPS_gaussian,CRPS_lognormal,NLL
0,1,0.149164,0.099364,0.506768
1,2,0.133439,0.090114,0.613282
2,3,0.156476,0.102871,0.462424
3,4,0.119907,0.081202,0.710100
4,5,0.146436,0.096152,0.542360
...,...,...,...,...
35035,35036,0.226275,0.090508,0.108893
35036,35037,0.241319,0.097446,-0.028520
35037,35038,0.228734,0.091421,0.099090
35038,35039,0.227761,0.091697,0.112676


Unnamed: 0,dataset,feature_columns,distribution,loss_function,iterations,learning_rate,random_state
0,entsoe,"[power_t-96, ws_10m_loc_1, ws_10m_loc_2, ws_10...",<class 'ngboost.distns.lognormal.LogNormal'>,<class 'ngboost.scores.LogScore'>,100,0.03,42


(None, None, None, None)

In [11]:
entsoe = load_entsoe()
evaluate_model(entsoe, n_estimators=100, case=9)

Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=0.6450 val_loss=0.6712 scale=1.0000 norm=1.1166
[iter 1] loss=0.6277 val_loss=0.6532 scale=1.0000 norm=1.0953
[iter 2] loss=0.6112 val_loss=0.6362 scale=1.0000 norm=1.0762
[iter 3] loss=0.5955 val_loss=0.6199 scale=1.0000 norm=1.0591
[iter 4] loss=0.5804 val_loss=0.6044 scale=1.0000 norm=1.0439
[iter 5] loss=0.5660 val_loss=0.5896 scale=1.0000 norm=1.0303
[iter 6] loss=0.5523 val_loss=0.5755 scale=1.0000 norm=1.0184
[iter 7] loss=0.5391 val_loss=0.5620 scale=1.0000 norm=1.0079
[iter 8] loss=0.5264 val_loss=0.5490 scale=1.0000 norm=0.9987
[iter 9] loss=0.5142 val_loss=0.5365 scale=1.0000 norm=0.9907
[iter 10] loss=0.5023 val_loss=0.5244 scale=1.0000 norm=0.9837
[iter 11] loss=0.4909 val_loss=0.5128 scale=1.0000 norm=0.9780
[iter 12] loss=0.4799 val_loss=0.5017 scale=1.0000 n

Unnamed: 0,Interval,CRPS_gaussian_mean,CRPS_gaussian_min,CRPS_gaussian_max,CRPS_lognormal_mean,CRPS_lognormal_min,CRPS_lognormal_max,NLL_mean,NLL_min,NLL_max,model_scores
0,1,0.179121,0.027055,1.209181,0.031229,0.003592,0.173216,1.525175,-4.992516,3.737618,0.179121
1,2,0.174966,0.027752,1.251602,0.030172,0.003601,0.182676,1.577816,-5.071438,3.817570,0.174966
2,3,0.172346,0.029181,1.230794,0.029307,0.003332,0.201156,1.613788,-5.324899,3.830098,0.172346
3,4,0.173594,0.027057,1.360593,0.028786,0.003457,0.209069,1.628338,-5.532285,3.777957,0.173594
4,5,0.174370,0.023169,1.743915,0.028755,0.003215,0.216978,1.637097,-4.923237,3.816740,0.174370
...,...,...,...,...,...,...,...,...,...,...,...
91,92,0.174575,0.027414,1.188012,0.030692,0.003942,0.332640,1.519523,-7.394523,3.655408,0.174575
92,93,0.177200,0.027359,1.292626,0.031123,0.003788,0.349515,1.496291,-7.060641,3.699331,0.177200
93,94,0.172869,0.028389,1.280126,0.030599,0.003493,0.334727,1.536134,-6.108559,3.672218,0.172869
94,95,0.173467,0.026941,1.202586,0.030708,0.003340,0.282736,1.535249,-6.205893,3.743309,0.173467


Unnamed: 0,CRPS_gaussian_mean,CRPS_gaussian_min,CRPS_gaussian_max,CRPS_lognormal_mean,CRPS_lognormal_min,CRPS_lognormal_max,NLL_mean,NLL_min,NLL_max,model_scores_mean
0,0.183389,0.021488,4.164072,0.028494,0.002954,0.354697,1.713057,-15.890658,4.043016,0.183389


Unnamed: 0,Entry_no,CRPS_gaussian,CRPS_lognormal,NLL
0,1,0.181373,0.120383,0.004713
1,2,0.168156,0.113316,0.148110
2,3,0.202171,0.133162,-0.249565
3,4,0.160059,0.108482,0.236545
4,5,0.192656,0.126782,-0.126919
...,...,...,...,...
35035,35036,0.258377,0.102980,-0.403313
35036,35037,0.277321,0.111418,-0.617665
35037,35038,0.245464,0.098343,-0.239932
35038,35039,0.242244,0.097878,-0.196059


Unnamed: 0,dataset,feature_columns,distribution,loss_function,iterations,learning_rate,random_state
0,entsoe,"[power_t-96, ws_10m_loc_mean, ws_100m_loc_mean...",<class 'ngboost.distns.lognormal.LogNormal'>,<class 'ngboost.scores.CRPScore'>,100,0.03,42


(None, None, None, None)

In [19]:
entsoe = load_entsoe()
evaluate_model(entsoe, n_estimators=100, case=10)

Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=-0.4862 val_loss=-0.3056 scale=1.0000 norm=1.1098
[iter 1] loss=-0.5283 val_loss=-0.3417 scale=1.0000 norm=1.0709
[iter 2] loss=-0.5638 val_loss=-0.3725 scale=1.0000 norm=1.0384
[iter 3] loss=-0.5939 val_loss=-0.4014 scale=1.0000 norm=1.0107
[iter 4] loss=-0.6217 val_loss=-0.4271 scale=1.0000 norm=0.9852
[iter 5] loss=-0.6465 val_loss=-0.4512 scale=1.0000 norm=0.9624
[iter 6] loss=-0.6699 val_loss=-0.4735 scale=1.0000 norm=0.9411
[iter 7] loss=-0.6915 val_loss=-0.4946 scale=1.0000 norm=0.9216
[iter 8] loss=-0.7120 val_loss=-0.5145 scale=1.0000 norm=0.9031
[iter 9] loss=-0.7315 val_loss=-0.5336 scale=1.0000 norm=0.8859
[iter 10] loss=-0.7501 val_loss=-0.5700 scale=2.0000 norm=1.7391
[iter 11] loss=-0.7856 val_loss=-0.5871 scale=1.0000 norm=0.8390
[iter 12] loss=-0.8025 val_l

Unnamed: 0,Interval,CRPS_gaussian_mean,CRPS_gaussian_min,CRPS_gaussian_max,CRPS_lognormal_mean,CRPS_lognormal_min,CRPS_lognormal_max,NLL_mean,NLL_min,NLL_max,model_scores
0,1,0.178169,0.038029,1.184753,0.031700,0.002227,0.151036,1.534851,-5.312266,4.217456,-1.534851
1,2,0.177840,0.038137,1.253203,0.031059,0.002392,0.171275,1.557586,-4.155692,4.422803,-1.557586
2,3,0.172996,0.039860,1.324569,0.029883,0.002500,0.175911,1.599451,-4.379843,4.413207,-1.599451
3,4,0.175010,0.037047,1.355670,0.029685,0.002104,0.187715,1.607184,-3.100128,4.468014,-1.607184
4,5,0.176349,0.037241,1.552844,0.029621,0.002148,0.207818,1.609195,-4.037515,4.471000,-1.609195
...,...,...,...,...,...,...,...,...,...,...,...
91,92,0.177790,0.038247,1.270670,0.031717,0.001677,0.333711,1.510412,-5.663024,4.080989,-1.510412
92,93,0.180427,0.038298,1.282409,0.032297,0.001754,0.330069,1.490699,-5.489529,4.162583,-1.490699
93,94,0.174646,0.039109,1.219714,0.031446,0.001715,0.360213,1.533654,-5.119319,4.199768,-1.533654
94,95,0.173535,0.038604,1.155947,0.031039,0.002514,0.257108,1.547640,-3.734538,4.264084,-1.547640


Unnamed: 0,CRPS_gaussian_mean,CRPS_gaussian_min,CRPS_gaussian_max,CRPS_lognormal_mean,CRPS_lognormal_min,CRPS_lognormal_max,NLL_mean,NLL_min,NLL_max,model_scores_mean
0,0.182783,0.034712,4.123051,0.029426,0.000709,0.360213,1.706868,-10.243708,5.234022,-1.706868


Unnamed: 0,Entry_no,CRPS_gaussian,CRPS_lognormal,NLL
0,1,0.151351,0.100617,0.477488
1,2,0.135213,0.091125,0.594954
2,3,0.158860,0.104269,0.429735
3,4,0.119152,0.080472,0.718087
4,5,0.146431,0.095901,0.534611
...,...,...,...,...
35035,35036,0.214751,0.086341,0.197117
35036,35037,0.267074,0.106986,-0.296853
35037,35038,0.233532,0.093332,0.030637
35038,35039,0.211973,0.086051,0.215467


Unnamed: 0,dataset,feature_columns,distribution,loss_function,iterations,learning_rate,random_state
0,entsoe,"[power_t-96, ws_10m_loc_mean, ws_100m_loc_mean...",<class 'ngboost.distns.lognormal.LogNormal'>,<class 'ngboost.scores.LogScore'>,100,0.03,42


(None, None, None, None)

In [20]:
entsoe = load_entsoe()
evaluate_model(entsoe, n_estimators=100, case=11)

Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=0.6450 val_loss=0.6712 scale=1.0000 norm=1.1166
[iter 1] loss=0.6277 val_loss=0.6532 scale=1.0000 norm=1.0953
[iter 2] loss=0.6112 val_loss=0.6362 scale=1.0000 norm=1.0762
[iter 3] loss=0.5955 val_loss=0.6199 scale=1.0000 norm=1.0591
[iter 4] loss=0.5804 val_loss=0.6044 scale=1.0000 norm=1.0439
[iter 5] loss=0.5660 val_loss=0.5896 scale=1.0000 norm=1.0303
[iter 6] loss=0.5523 val_loss=0.5755 scale=1.0000 norm=1.0184
[iter 7] loss=0.5391 val_loss=0.5620 scale=1.0000 norm=1.0079
[iter 8] loss=0.5264 val_loss=0.5490 scale=1.0000 norm=0.9987
[iter 9] loss=0.5142 val_loss=0.5365 scale=1.0000 norm=0.9907
[iter 10] loss=0.5023 val_loss=0.5244 scale=1.0000 norm=0.9837
[iter 11] loss=0.4909 val_loss=0.5128 scale=1.0000 norm=0.9780
[iter 12] loss=0.4799 val_loss=0.5017 scale=1.0000 n

Unnamed: 0,Interval,CRPS_gaussian_mean,CRPS_gaussian_min,CRPS_gaussian_max,CRPS_lognormal_mean,CRPS_lognormal_min,CRPS_lognormal_max,NLL_mean,NLL_min,NLL_max,model_scores
0,1,0.187359,0.027213,1.242805,0.033236,0.003582,0.180066,1.447614,-4.902627,3.732604,0.187359
1,2,0.183309,0.030142,1.276275,0.032025,0.003599,0.186674,1.499161,-5.574791,3.804623,0.183309
2,3,0.180570,0.030806,1.261597,0.031090,0.003311,0.202221,1.539715,-5.925210,3.808504,0.180570
3,4,0.181470,0.028249,1.366303,0.030578,0.003465,0.209256,1.554281,-6.054287,3.801946,0.181470
4,5,0.181890,0.025968,1.777764,0.030555,0.003138,0.218246,1.563996,-5.395435,3.803171,0.181890
...,...,...,...,...,...,...,...,...,...,...,...
91,92,0.170613,0.027771,1.166702,0.030042,0.004123,0.331374,1.549593,-6.907003,3.648384,0.170613
92,93,0.173528,0.028062,1.310493,0.030509,0.003848,0.344137,1.520153,-6.567456,3.681160,0.173528
93,94,0.168380,0.029265,1.304247,0.029837,0.003525,0.327877,1.563779,-5.731185,3.654294,0.168380
94,95,0.168914,0.028313,1.216650,0.029943,0.003442,0.285835,1.565079,-5.894972,3.680665,0.168914


Unnamed: 0,CRPS_gaussian_mean,CRPS_gaussian_min,CRPS_gaussian_max,CRPS_lognormal_mean,CRPS_lognormal_min,CRPS_lognormal_max,NLL_mean,NLL_min,NLL_max,model_scores_mean
0,0.183625,0.021821,4.185913,0.028695,0.002956,0.344415,1.707729,-13.281745,4.013085,0.183625


Unnamed: 0,Entry_no,CRPS_gaussian,CRPS_lognormal,NLL
0,1,0.165027,0.109887,0.325691
1,2,0.150321,0.101555,0.445656
2,3,0.181013,0.119368,0.183357
3,4,0.137799,0.093448,0.552358
4,5,0.173414,0.114351,0.256454
...,...,...,...,...
35035,35036,0.253795,0.101586,-0.441114
35036,35037,0.272171,0.109718,-0.614304
35037,35038,0.240283,0.096597,-0.225651
35038,35039,0.236011,0.095744,-0.174640


Unnamed: 0,dataset,feature_columns,distribution,loss_function,iterations,learning_rate,random_state
0,entsoe,"[power_t-96, ws_10m_loc_mean, ws_100m_loc_mean...",<class 'ngboost.distns.lognormal.LogNormal'>,<class 'ngboost.scores.CRPScore'>,100,0.03,42


(None, None, None, None)

In [21]:
entsoe = load_entsoe()
evaluate_model(entsoe, n_estimators=100, case=12)

Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
[iter 0] loss=-0.4862 val_loss=-0.3056 scale=1.0000 norm=1.1098
[iter 1] loss=-0.5283 val_loss=-0.3417 scale=1.0000 norm=1.0709
[iter 2] loss=-0.5638 val_loss=-0.3725 scale=1.0000 norm=1.0384
[iter 3] loss=-0.5939 val_loss=-0.4014 scale=1.0000 norm=1.0107
[iter 4] loss=-0.6217 val_loss=-0.4271 scale=1.0000 norm=0.9852
[iter 5] loss=-0.6465 val_loss=-0.4512 scale=1.0000 norm=0.9624
[iter 6] loss=-0.6699 val_loss=-0.4735 scale=1.0000 norm=0.9411
[iter 7] loss=-0.6915 val_loss=-0.4946 scale=1.0000 norm=0.9216
[iter 8] loss=-0.7120 val_loss=-0.5145 scale=1.0000 norm=0.9031
[iter 9] loss=-0.7315 val_loss=-0.5336 scale=1.0000 norm=0.8859
[iter 10] loss=-0.7501 val_loss=-0.5700 scale=2.0000 norm=1.7391
[iter 11] loss=-0.7856 val_loss=-0.5871 scale=1.0000 norm=0.8390
[iter 12] loss=-0.8025 val_l

Unnamed: 0,Interval,CRPS_gaussian_mean,CRPS_gaussian_min,CRPS_gaussian_max,CRPS_lognormal_mean,CRPS_lognormal_min,CRPS_lognormal_max,NLL_mean,NLL_min,NLL_max,model_scores
0,1,0.187704,0.038029,1.164785,0.033303,0.002158,0.156425,1.473215,-6.075313,4.263320,-1.473215
1,2,0.187533,0.038372,1.246101,0.032628,0.002150,0.178364,1.494113,-4.719543,4.484756,-1.494113
2,3,0.182307,0.040349,1.336160,0.031388,0.002226,0.185239,1.541484,-4.821566,4.479520,-1.541484
3,4,0.184142,0.037082,1.496363,0.031137,0.001851,0.201854,1.548836,-4.184379,4.534267,-1.548836
4,5,0.184317,0.037989,1.628630,0.030952,0.001867,0.218838,1.555159,-4.703335,4.541864,-1.555159
...,...,...,...,...,...,...,...,...,...,...,...
91,92,0.173753,0.038429,1.334697,0.031001,0.001916,0.336356,1.535138,-5.305507,3.883163,-1.535138
92,93,0.176047,0.038493,1.352622,0.031599,0.001543,0.332419,1.516063,-5.038930,4.063579,-1.516063
93,94,0.170049,0.039501,1.289128,0.030654,0.001396,0.357609,1.563181,-4.799214,4.152831,-1.563181
94,95,0.168666,0.038720,1.196543,0.030240,0.002529,0.254798,1.578751,-3.481848,4.170091,-1.578751


Unnamed: 0,CRPS_gaussian_mean,CRPS_gaussian_min,CRPS_gaussian_max,CRPS_lognormal_mean,CRPS_lognormal_min,CRPS_lognormal_max,NLL_mean,NLL_min,NLL_max,model_scores_mean
0,0.181416,0.035303,4.081012,0.029501,0.000928,0.357609,1.709748,-10.35608,5.25084,-1.709748


Unnamed: 0,Entry_no,CRPS_gaussian,CRPS_lognormal,NLL
0,1,0.144633,0.095899,0.536357
1,2,0.128880,0.086633,0.647949
2,3,0.152793,0.100019,0.482264
3,4,0.113857,0.076694,0.760556
4,5,0.140725,0.091927,0.582825
...,...,...,...,...
35035,35036,0.205726,0.082926,0.283437
35036,35037,0.256158,0.102968,-0.180527
35037,35038,0.223055,0.089436,0.132752
35038,35039,0.201806,0.082178,0.309181


Unnamed: 0,dataset,feature_columns,distribution,loss_function,iterations,learning_rate,random_state
0,entsoe,"[power_t-96, ws_10m_loc_mean, ws_100m_loc_mean...",<class 'ngboost.distns.lognormal.LogNormal'>,<class 'ngboost.scores.LogScore'>,100,0.03,42


(None, None, None, None)

# Merge summary statistics into one Excel file

In [26]:
import glob
import os

import pandas as pd

# Step 1: Get all per-case Excel files in the folder, skipping a previously merged output
output_path = "../results/NGBoost/Merged_Sheet.xlsx"
file_paths = [
    path
    for path in sorted(glob.glob("../results/NGBoost/*.xlsx"))  # Update with the correct path
    if os.path.basename(path) != os.path.basename(output_path)
]
merged_data = []

# Step 2: Loop through each file and extract both sheets
for file in file_paths:
    try:
        # Read "Summary_Scores" sheet
        df_scores = pd.read_excel(file, sheet_name="Summary_Scores")

        # Read "Hyperparameters" sheet
        df_hyperparams = pd.read_excel(file, sheet_name="Hyperparameters")

        # Combine the two dataframes horizontally (side by side)
        combined_df = pd.concat([df_scores, df_hyperparams], axis=1)
        combined_df["Source_File"] = file  # Optional: track source file (set once to avoid a duplicate column)
        merged_data.append(combined_df)

    except Exception as e:
        print(f"Could not read {file}: {e}")

# Step 3: Merge all data into one DataFrame (pd.concat raises on an empty list)
if merged_data:
    final_merged_df = pd.concat(merged_data, ignore_index=True)

    # Step 4: Save to a new Excel file
    final_merged_df.to_excel(output_path, index=False)

    print("Merge completed! The final file is 'Merged_Sheet.xlsx'.")
else:
    print("No readable Excel files found. Skipping the merge step.")


Merge completed! The final file is 'Merged_Sheet.xlsx'.


In [30]:
pd.set_option('display.max_colwidth', None)  # Do not truncate content in columns
final_merged_df[['CRPS_gaussian_mean', 'NLL_mean', 'NLL_min', 'loss_function', 'feature_columns']]

Unnamed: 0,CRPS_gaussian_mean,NLL_mean,NLL_min,loss_function,feature_columns
0,0.594162,0.357223,-76.403841,<class 'ngboost.scores.CRPScore'>,['power_t-96']
1,0.182783,1.706868,-10.243708,<class 'ngboost.scores.LogScore'>,"['power_t-96', 'ws_10m_loc_mean', 'ws_100m_loc_mean', 'ws_10m_loc_1', 'ws_10m_loc_2', 'ws_10m_loc_3', 'ws_10m_loc_4', 'ws_10m_loc_5', 'ws_10m_loc_6', 'ws_10m_loc_7', 'ws_10m_loc_8', 'ws_10m_loc_9', 'ws_10m_loc_10', 'ws_100m_loc_1', 'ws_100m_loc_2', 'ws_100m_loc_3', 'ws_100m_loc_4', 'ws_100m_loc_5', 'ws_100m_loc_6', 'ws_100m_loc_7', 'ws_100m_loc_8', 'ws_100m_loc_9', 'ws_100m_loc_10']"
2,0.183625,1.707729,-13.281745,<class 'ngboost.scores.CRPScore'>,"['power_t-96', 'ws_10m_loc_mean', 'ws_100m_loc_mean', 'ws_10m_loc_1', 'ws_10m_loc_2', 'ws_10m_loc_3', 'ws_10m_loc_4', 'ws_10m_loc_5', 'ws_10m_loc_6', 'ws_10m_loc_7', 'ws_10m_loc_8', 'ws_10m_loc_9', 'ws_10m_loc_10', 'ws_100m_loc_1', 'ws_100m_loc_2', 'ws_100m_loc_3', 'ws_100m_loc_4', 'ws_100m_loc_5', 'ws_100m_loc_6', 'ws_100m_loc_7', 'ws_100m_loc_8', 'ws_100m_loc_9', 'ws_100m_loc_10', 'interval_index']"
3,0.181416,1.709748,-10.35608,<class 'ngboost.scores.LogScore'>,"['power_t-96', 'ws_10m_loc_mean', 'ws_100m_loc_mean', 'ws_10m_loc_1', 'ws_10m_loc_2', 'ws_10m_loc_3', 'ws_10m_loc_4', 'ws_10m_loc_5', 'ws_10m_loc_6', 'ws_10m_loc_7', 'ws_10m_loc_8', 'ws_10m_loc_9', 'ws_10m_loc_10', 'ws_100m_loc_1', 'ws_100m_loc_2', 'ws_100m_loc_3', 'ws_100m_loc_4', 'ws_100m_loc_5', 'ws_100m_loc_6', 'ws_100m_loc_7', 'ws_100m_loc_8', 'ws_100m_loc_9', 'ws_100m_loc_10', 'interval_index']"
4,0.598208,0.391899,-22.42936,<class 'ngboost.scores.LogScore'>,['power_t-96']
5,0.274447,1.300109,-25.411731,<class 'ngboost.scores.CRPScore'>,"['ws_10m_loc_mean', 'ws_100m_loc_mean']"
6,0.277985,1.296382,-19.878298,<class 'ngboost.scores.LogScore'>,"['ws_10m_loc_mean', 'ws_100m_loc_mean']"
7,0.267949,1.32888,-22.550611,<class 'ngboost.scores.CRPScore'>,"['power_t-96', 'ws_10m_loc_mean', 'ws_100m_loc_mean']"
8,0.269698,1.334739,-18.039244,<class 'ngboost.scores.LogScore'>,"['power_t-96', 'ws_10m_loc_mean', 'ws_100m_loc_mean']"
9,0.182779,1.714674,-22.048296,<class 'ngboost.scores.CRPScore'>,"['power_t-96', 'ws_10m_loc_1', 'ws_10m_loc_2', 'ws_10m_loc_3', 'ws_10m_loc_4', 'ws_10m_loc_5', 'ws_10m_loc_6', 'ws_10m_loc_7', 'ws_10m_loc_8', 'ws_10m_loc_9', 'ws_10m_loc_10', 'ws_100m_loc_1', 'ws_100m_loc_2', 'ws_100m_loc_3', 'ws_100m_loc_4', 'ws_100m_loc_5', 'ws_100m_loc_6', 'ws_100m_loc_7', 'ws_100m_loc_8', 'ws_100m_loc_9', 'ws_100m_loc_10']"
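
To make the comparison table above easier to read, the rows can be ranked by `CRPS_gaussian_mean`. The sketch below copies the ten values from the table and uses only the standard library. The short feature-set labels ("means" for the two `*_loc_mean` columns, "per-location" for the twenty `ws_10m_loc_*` and `ws_100m_loc_*` columns) are shorthand introduced here, not names from the notebook.

```python
# (row index, CRPS_gaussian_mean, loss function, shorthand for the feature set)
rows = [
    (0, 0.594162, "CRPScore", "power_t-96"),
    (1, 0.182783, "LogScore", "power_t-96 + means + per-location"),
    (2, 0.183625, "CRPScore", "power_t-96 + means + per-location + interval_index"),
    (3, 0.181416, "LogScore", "power_t-96 + means + per-location + interval_index"),
    (4, 0.598208, "LogScore", "power_t-96"),
    (5, 0.274447, "CRPScore", "means"),
    (6, 0.277985, "LogScore", "means"),
    (7, 0.267949, "CRPScore", "power_t-96 + means"),
    (8, 0.269698, "LogScore", "power_t-96 + means"),
    (9, 0.182779, "CRPScore", "power_t-96 + per-location"),
]

# Lower CRPS is better, so sort in ascending order
ranked = sorted(rows, key=lambda r: r[1])
for row, crps, loss, features in ranked:
    print(f"row {row}: CRPS={crps:.6f}  {loss:<8}  {features}")

best = ranked[0]
```

With `final_merged_df` in memory, `final_merged_df.sort_values("CRPS_gaussian_mean")` gives the same ordering. In this table the four configurations that include per-location wind speeds lie within about 0.002 of each other (0.1814 to 0.1836), well ahead of the mean-only feature sets (about 0.27) and the lagged-power-only feature sets (about 0.59).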


In [14]:
entsoe = load_entsoe()
evaluate_ngboost_model(entsoe=entsoe, output_file="C:/Users/Minu/Documents/NGboost/", case=1)

before
# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%
after
C:/Users/Minu/Documents/NGboost/case1.xlsx




[iter 0] loss=0.6450 val_loss=0.6806 scale=2.0000 norm=2.2332
[iter 1] loss=0.6375 val_loss=0.6722 scale=2.0000 norm=2.2170
[iter 2] loss=0.6307 val_loss=0.6647 scale=2.0000 norm=2.2042
[iter 3] loss=0.6246 val_loss=0.6579 scale=2.0000 norm=2.1944
[iter 4] loss=0.6192 val_loss=0.6519 scale=2.0000 norm=2.1869
[iter 5] loss=0.6143 val_loss=0.6464 scale=2.0000 norm=2.1814
[iter 6] loss=0.6099 val_loss=0.6415 scale=2.0000 norm=2.1775
[iter 7] loss=0.6059 val_loss=0.6370 scale=2.0000 norm=2.1751
[iter 8] loss=0.6022 val_loss=0.6330 scale=2.0000 norm=2.1740
[iter 9] loss=0.5989 val_loss=0.6293 scale=2.0000 norm=2.1739
[iter 10] loss=0.5959 val_loss=0.6259 scale=2.0000 norm=2.1747
[iter 11] loss=0.5932 val_loss=0.6228 scale=2.0000 norm=2.1763
[iter 12] loss=0.5907 val_loss=0.6200 scale=2.0000 norm=2.1787
[iter 13] loss=0.5884 val_loss=0.6175 scale=2.0000 norm=2.1816
[iter 14] loss=0.5863 val_loss=0.6152 scale=2.0000 norm=2.1851
[iter 15] loss=0.5844 val_loss=0.6131 scale=2.0000 norm=2.1890
[i



Expected 16 Excel files, but found 0 files. Skipping the merge step.


(    Interval  CRPS_gaussian_mean  CRPS_gaussian_min  CRPS_gaussian_max  \
 0          1            0.527452           0.127293           2.945731   
 1          2            0.528015           0.117377           3.034884   
 2          3            0.529974           0.118462           3.208672   
 3          4            0.533789           0.126418           3.471371   
 4          5            0.534015           0.125654           3.623714   
 ..       ...                 ...                ...                ...   
 91        92            0.527363           0.121338           2.567617   
 92        93            0.526984           0.121263           2.658296   
 93        94            0.527614           0.122556           2.759381   
 94        95            0.528161           0.119704           2.856737   
 95        96            0.528321           0.127422           2.947207   
 
     CRPS_lognormal_mean  CRPS_lognormal_min  CRPS_lognormal_max  NLL_mean  \
 0                  