# 📊 Timeline for Training, Validation, and Test

**Overall constraint:**  
- At most **10,000 entries** in the training set  

---

## 📅 Original data split:
- **Training data:** 2016-01-01 – 2022-12-31  
- **Validation data:** 2023-01-01 – 2023-12-31  
- **Test data:** 2024-01-01 – 2024-12-31  

---

## 🔄 New strategy for a better comparison with NGBoost:
Because my other model, **NGBoost**, can be evaluated directly on the full validation set (2023), I use an alternative strategy for this model.  
Instead of using all data from 2016–2022, only the last available year (**2022**) serves as the training basis.  
For each quarter of 2023, the corresponding quarter of 2022 is used as the training set.
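
As a quick plausibility check of the 10,000-entry constraint, the sketch below counts the 15-minute intervals in each quarter of 2022. The 15-minute resolution is an assumption inferred from the timestamps and the 96-step lag used in this notebook. The helper `rows_in_range` is a made-up name for illustration, and the real row count can be slightly lower after `dropna` or because of gaps in the data.

```python
from datetime import date

STEPS_PER_DAY = 96          # assumption: 15-minute resolution (24 * 4)
MAX_TRAINING_ROWS = 10_000  # constraint stated at the top of this notebook


def rows_in_range(start: date, end_inclusive: date) -> int:
    """Number of 15-minute intervals between two dates, both days included."""
    return ((end_inclusive - start).days + 1) * STEPS_PER_DAY


quarters_2022 = {
    "Q1": (date(2022, 1, 1), date(2022, 3, 31)),
    "Q2": (date(2022, 4, 1), date(2022, 6, 30)),
    "Q3": (date(2022, 7, 1), date(2022, 9, 30)),
    "Q4": (date(2022, 10, 1), date(2022, 12, 31)),
}

row_counts = {
    name: rows_in_range(start, end) for name, (start, end) in quarters_2022.items()
}

for name, count in row_counts.items():
    print(name, count, "fits" if count <= MAX_TRAINING_ROWS else "too large")
```

Under this assumption every quarter stays below the limit, whereas a full year (about 35,000 rows) would not.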

### 📌 Quarterly training & validation split:

| Split  | Training period       | Validation period    | CRPS  | NLL  |
|--------|----------------------|----------------------|-------|------|
| **1️⃣ Q1**  | 2022-01-01 – 2022-03-31  | 2023-01-01 – 2023-03-31  | CRPS₁ | NLL₁ |
| **2️⃣ Q2**  | 2022-04-01 – 2022-06-30  | 2023-04-01 – 2023-06-30  | CRPS₂ | NLL₂ |
| **3️⃣ Q3**  | 2022-07-01 – 2022-09-30  | 2023-07-01 – 2023-09-30  | CRPS₃ | NLL₃ |
| **4️⃣ Q4**  | 2022-10-01 – 2022-12-31  | 2023-10-01 – 2023-12-31  | CRPS₄ | NLL₄ |
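
The quarter boundaries in the table can also be generated instead of typed by hand. The following is a minimal sketch using only the standard library; `quarter_bounds` and `build_quarterly_splits` are hypothetical helper names, and the dictionary layout simply mirrors the `splits` dictionary defined later in this notebook.

```python
import calendar
from datetime import date


def quarter_bounds(year: int, quarter: int) -> tuple[str, str]:
    """First and last calendar day of a quarter as ISO date strings."""
    first_month = 3 * (quarter - 1) + 1
    last_month = first_month + 2
    last_day = calendar.monthrange(year, last_month)[1]
    return (
        date(year, first_month, 1).isoformat(),
        date(year, last_month, last_day).isoformat(),
    )


def build_quarterly_splits(train_year: int, validation_year: int) -> dict:
    """Pair each quarter of the training year with the same quarter one year later."""
    names = ["first_q", "second_q", "third_q", "fourth_q"]
    return {
        name: {
            "train": quarter_bounds(train_year, q),
            "validation": quarter_bounds(validation_year, q),
        }
        for q, name in enumerate(names, start=1)
    }


generated_splits = build_quarterly_splits(2022, 2023)
for name, periods in generated_splits.items():
    print(name, periods)
```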

### 📊 Average evaluation:
After running the four experiments, the final metrics are computed as:

Average CRPS:
(CRPS₁ + CRPS₂ + CRPS₃ + CRPS₄) / 4

Average NLL:
(NLL₁ + NLL₂ + NLL₃ + NLL₄) / 4
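
One detail worth keeping in mind: the simple average above weights every quarter equally, while a mean taken over all pooled validation rows weights each quarter by its size. The sketch below compares both, using the quarter sizes and the rounded "CDF Linear" mean CRPS values printed in the outputs of this notebook. Because the values are rounded to four decimals, the result is only illustrative.

```python
# Quarter sizes and rounded mean CRPS (CDF Linear) taken from the printed outputs
quarter_sizes = [8640, 8736, 8832, 8832]
quarter_mean_crps = [0.3729, 0.3799, 0.4282, 0.3598]

# Simple average: each quarter counts once
simple_average = sum(quarter_mean_crps) / len(quarter_mean_crps)

# Pooled mean: each validation row counts once
pooled_mean = sum(
    n * m for n, m in zip(quarter_sizes, quarter_mean_crps)
) / sum(quarter_sizes)

print(f"simple average: {simple_average:.5f}")
print(f"pooled mean:    {pooled_mean:.5f}")
```

The quarters are almost equally long, so the two numbers differ only slightly here, but they are not the same quantity.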


In [1]:
%load_ext autoreload
%autoreload 2

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
import torch
from tabpfn import TabPFNRegressor
from torchinfo import summary

from analysis.datasets import load_entsoe
from analysis.splits import to_train_validation_test_data
from analysis.TabPFN_copy import (
    evaluate,
    fit_tail_distribution,
    plot_cdf_pdf_dynamic,
    plot_pdf_from_logits,
)
from analysis.transformations import (
    add_interval_index,
    add_lagged_features,
    scale_power_data,
)

In [2]:
def compute_crps_pytorch(logits, bin_edges, y_values):
    """
    Computes the CRPS for multiple rows of logits and corresponding y-values using PyTorch.

    Args:
        logits: Tensor of shape (N, 5000) - unnormalized logits for each row.
        bin_edges: Tensor of shape (5001,) - common bin edges for all rows.
        y_values: Tensor of shape (N,) - target values for each row.

    Returns:
        Tensor of shape (N,) containing the CRPS values for each row.
    """

    # Convert logits to probabilities using softmax
    probs = torch.softmax(logits, dim=1)  # (N, 5000)

    # Compute CDF (cumulative sum of probabilities)
    cdf = torch.cumsum(probs, dim=1)  # (N, 5000)

    # Compute the indicator function (1 if bin edge >= y, else 0)
    # We need to compare each y_value with bin_edges and broadcast correctly
    indicators = (bin_edges[1:].unsqueeze(0) >= y_values.unsqueeze(1)).float()  # (N, 5000)

    # Compute bin widths
    bin_widths = (bin_edges[1:] - bin_edges[:-1]).unsqueeze(0)  # (1, 5000)

    # Approximate the CRPS integral for each row as a sum over bins,
    # with the CDF and the indicator both evaluated at the right bin edges
    crps = torch.sum((cdf - indicators) ** 2 * bin_widths, dim=1)  # (N,)

    return crps
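
A small sanity check can make the discretisation easier to trust. The sketch below is a NumPy mirror of the same formula (not the function above itself), applied to a uniform forecast on [0, 1] for which the CRPS has a closed form. The function name `crps_from_logits` and the toy setup with 10 bins are assumptions made for this illustration only.

```python
import numpy as np


def crps_from_logits(logits, bin_edges, y_values):
    """Discretised CRPS: sum over bins of (CDF - 1{right edge >= y})^2 * bin width."""
    logits = np.asarray(logits, dtype=float)
    bin_edges = np.asarray(bin_edges, dtype=float)
    y_values = np.asarray(y_values, dtype=float)

    # Numerically stable softmax over the bins
    shifted = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=1, keepdims=True)

    cdf = np.cumsum(probs, axis=1)  # CDF at the right bin edges
    indicators = (bin_edges[1:][None, :] >= y_values[:, None]).astype(float)
    widths = np.diff(bin_edges)[None, :]
    return np.sum((cdf - indicators) ** 2 * widths, axis=1)


# Uniform forecast on [0, 1] with 10 equal bins, observation y = 0.55
edges = np.linspace(0.0, 1.0, 11)
y_obs = 0.55
approx = crps_from_logits(np.zeros((1, 10)), edges, [y_obs])[0]

# Closed-form CRPS of Uniform(0, 1) for y in [0, 1]: y^3 / 3 + (1 - y)^3 / 3
exact = (y_obs ** 3 + (1.0 - y_obs) ** 3) / 3.0
print(approx, exact)
```

With only 10 bins the approximation is already close to the exact value. With 5000 bins the discretisation error should be much smaller wherever the bins are narrow.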


In [3]:
feature_columns = ['ws_10m_loc_mean', 'ws_100m_loc_mean', 'power_t-96']
target_column='power'

entsoe = load_entsoe()
entsoe = scale_power_data(entsoe)
entsoe = add_lagged_features(entsoe)
entsoe = add_interval_index(entsoe)
entsoe.dropna(inplace=True)

Data loaded and transformed successfully. Shape of DataFrame: (78912, 22)


In [None]:
train_end = "2022-12-31 23:45:00"
validation_end = "2023-12-31 23:45:00"

train, validation, test = to_train_validation_test_data(entsoe, train_end, validation_end)
X_train, y_train = train[feature_columns], train[target_column]
X_validation, y_validation = validation[feature_columns], validation[target_column]

# of training observations: 245376 | 77.76%
# of validation observations: 35040 | 11.10%
# of test observations: 35133 | 11.13%


# Train one model per quarter (4 models in total)

In [5]:
# Define the train-validation splits
splits = {
    "first_q": {
        "train": ("2022-01-01", "2022-03-31"),
        "validation": ("2023-01-01", "2023-03-31"),
        #"train": ("2022-01-01", "2022-01-03"),
        #"validation": ("2023-01-01", "2023-01-03"),
    },
    "second_q": {
        "train": ("2022-04-01", "2022-06-30"),
        "validation": ("2023-04-01", "2023-06-30"),
        #"train": ("2022-01-04", "2022-01-06"),
        #"validation": ("2023-01-04", "2023-01-06"),
    },
    "third_q": {
        "train": ("2022-07-01", "2022-09-30"),
        "validation": ("2023-07-01", "2023-09-30"),
    },
    "fourth_q": {
        "train": ("2022-10-01", "2022-12-31"),
        "validation": ("2023-10-01", "2023-12-31"),
        #"validation": ("2023-12-26", "2023-12-31")
    }
}
# Initialize dictionary to store results
results_dict = {}
summary_stats_overall = {}

# Loop through each quarter
for quarter, dates in splits.items():
    print(f"Processing {quarter}...")

    # Extract train and validation data
    X_train_q = X_train.loc[dates["train"][0]:dates["train"][1]]
    y_train_q = y_train.loc[dates["train"][0]:dates["train"][1]]
    
    X_validation_q = X_validation.loc[dates["validation"][0]:dates["validation"][1]]
    y_validation_q = y_validation.loc[dates["validation"][0]:dates["validation"][1]]

    # Train model
    model = TabPFNRegressor(device='auto', fit_mode='low_memory', random_state=42)
    model.fit(X_train_q, y_train_q)

    # Define quantiles
    quantiles_custom = np.arange(0.1, 1, 0.1)
    probabilities_q = quantiles_custom

    # Predict
    probs_val_q = model.predict(X_validation_q, output_type="full", quantiles=quantiles_custom)
    logits_q = probs_val_q["logits"]
    borders_q = probs_val_q["criterion"].borders
    all_quantiles_q = np.array(probs_val_q["quantiles"])

    # Convert y_validation to tensor
    y_validation_q_torch = torch.tensor(y_validation_q.values, dtype=torch.float32)

    # Compute CRPS and NLL using the 5000 logits and 5001 borders from TabPFN
    crps_values_torch_q = compute_crps_pytorch(logits_q, borders_q, y_validation_q_torch)
    print(f"CRPS shape for {quarter}:", crps_values_torch_q.shape)


    # Per-sample NLL from TabPFN's criterion (5000 bins)
    nll_torch_q = probs_val_q["criterion"].forward(logits_q, y_validation_q_torch)

    # Largest finite NLL of this quarter, used to cap non-finite values;
    # the low sentinel is only kept if no NLL is finite
    max_nll_so_far = -9999.0
    finite_mask = torch.isfinite(nll_torch_q)
    finite_nlls = nll_torch_q[finite_mask]

    if finite_nlls.numel() > 0:  # numel() = number of elements in finite_nlls
        max_nll_so_far = max(max_nll_so_far, finite_nlls.max().item())
        print(f"Updated max_nll_so_far to: {max_nll_so_far} based on the current finite NLLs")

    # Indices of the non-finite values; flatten() keeps a 1-D tensor even for a single hit
    infinite_indices = torch.nonzero(~finite_mask).flatten()

    # Replace non-finite values with max_nll_so_far
    nll_torch_q[~finite_mask] = max_nll_so_far

    # Report which indices were replaced
    print(f"Replaced infinite values at indices {infinite_indices.tolist()} in nll_torch_q with the value {max_nll_so_far}.")


    # Fit tails
    yt = entsoe["power"]
    quantile_10 = np.percentile(yt, probabilities_q * 100)
    mu_left_asym, sigma_left_asym = fit_tail_distribution(quantile_10[:2], probabilities_q[:2])
    mu_right_asym, sigma_right_asym = fit_tail_distribution(quantile_10[-2:], probabilities_q[-2:])

    # Initialize lists for CRPS and NLL calculations of the 9 deciles
    crps_cdf_linear_a_full_q = []
    crps_hybrid_cdf_a_full_q = []
    crps_normal_a_full_q = []
    nll_pdf_linear_a_full_q = []
    nll_pdf_hybrid_a_full_q = []
    nll_normal_a_full_q = []

    # Iterate through validation samples
    for i in range(y_validation_q.shape[0]):
        quantile_i = all_quantiles_q[:, i]
        y_i = y_validation_q.iloc[i]

        # Compute evaluation metrics
        cdf_linear, hybrid_cdf, crps_normal, pdf_linear, pdf_hybrid, nll_normal = evaluate(
            quantile_i, probabilities_q, y_i, -20, 5, mu_left_asym, sigma_left_asym, mu_right_asym, sigma_right_asym
        )

        # Store results
        crps_cdf_linear_a_full_q.append(cdf_linear)
        crps_hybrid_cdf_a_full_q.append(hybrid_cdf)
        crps_normal_a_full_q.append(crps_normal)
        nll_pdf_linear_a_full_q.append(pdf_linear)
        nll_pdf_hybrid_a_full_q.append(pdf_hybrid)
        nll_normal_a_full_q.append(nll_normal)

    # Compute mean values
    mean_crps_cdf_linear_q = np.mean(crps_cdf_linear_a_full_q)
    mean_crps_hybrid_cdf_q = np.mean(crps_hybrid_cdf_a_full_q)
    mean_crps_normal_q = np.mean(crps_normal_a_full_q)
    mean_nll_pdf_linear_q = np.mean(nll_pdf_linear_a_full_q)
    mean_nll_pdf_hybrid_q = np.mean(nll_pdf_hybrid_a_full_q)
    mean_nll_normal_q = np.mean(nll_normal_a_full_q)

    # Print mean values
    print(f"Mean CRPS for {quarter} - CDF Linear: {mean_crps_cdf_linear_q:.4f}")
    print(f"Mean CRPS for {quarter} - Hybrid CDF: {mean_crps_hybrid_cdf_q:.4f}")
    print(f"Mean CRPS for {quarter} - Normal: {mean_crps_normal_q:.4f}")
    print(f"Mean NLL for {quarter} - PDF Linear: {mean_nll_pdf_linear_q:.4f}")
    print(f"Mean NLL for {quarter} - PDF Hybrid: {mean_nll_pdf_hybrid_q:.4f}")
    print(f"Mean NLL for {quarter} - Normal: {mean_nll_normal_q:.4f}")

    # Store results in dictionary
    result_q = {
        'CRPS Linear': crps_cdf_linear_a_full_q,
        'CRPS Hybrid': crps_hybrid_cdf_a_full_q,
        'CRPS Normal': crps_normal_a_full_q,
        'CRPS (5000 quantiles)': crps_values_torch_q,
        'NLL Linear': nll_pdf_linear_a_full_q,
        'NLL Hybrid': nll_pdf_hybrid_a_full_q,
        'NLL Normal': nll_normal_a_full_q,
        'NLL (5000 quantiles)': nll_torch_q,
        'y values': y_validation_q_torch,
    }
        
        
        
    # Convert to DataFrame and round values
    results_q = pd.DataFrame(result_q).round(8)
    results_dict[quarter] = results_q

    summary_stats_q = {
        'CRPS Linear': [mean_crps_cdf_linear_q, np.min(crps_cdf_linear_a_full_q), np.max(crps_cdf_linear_a_full_q), np.median(crps_cdf_linear_a_full_q)],
        'CRPS Hybrid': [mean_crps_hybrid_cdf_q, np.min(crps_hybrid_cdf_a_full_q), np.max(crps_hybrid_cdf_a_full_q), np.median(crps_hybrid_cdf_a_full_q)],
        'CRPS Normal': [mean_crps_normal_q, np.min(crps_normal_a_full_q), np.max(crps_normal_a_full_q), np.median(crps_normal_a_full_q)],
        'CRPS (5000 quantiles)': [crps_values_torch_q.mean().item(), np.min(crps_values_torch_q.cpu().numpy()), np.max(crps_values_torch_q.cpu().numpy()), np.median(crps_values_torch_q.cpu().numpy())],
        'NLL Linear': [mean_nll_pdf_linear_q, np.min(nll_pdf_linear_a_full_q), np.max(nll_pdf_linear_a_full_q), np.median(nll_pdf_linear_a_full_q)],
        'NLL Hybrid': [mean_nll_pdf_hybrid_q, np.min(nll_pdf_hybrid_a_full_q), np.max(nll_pdf_hybrid_a_full_q), np.median(nll_pdf_hybrid_a_full_q)],
        'NLL Normal': [mean_nll_normal_q, np.min(nll_normal_a_full_q), np.max(nll_normal_a_full_q), np.median(nll_normal_a_full_q)],
        'NLL (5000 quantiles)': [nll_torch_q.mean().item(), np.min(nll_torch_q.cpu().numpy()), np.max(nll_torch_q.cpu().numpy()), np.median(nll_torch_q.cpu().numpy())],
    }

    # Convert the summary stats into a DataFrame
    summary_stats_q_pd = pd.DataFrame(summary_stats_q, index=['Mean', 'Min', 'Max', 'Median'])
    summary_stats_overall[quarter] = summary_stats_q_pd

    # Print summary
    print("✅ Processing complete. Results stored in `results_dict`.")

Processing first_q...




CRPS shape for first_q: torch.Size([8640])
Updated max_nll_so_far to: 13.211125373840332 based on the current finite NLLs
Replaced infinite values at indices [] in nll_torch_q with the value 13.211125373840332.


  If increasing the limit yields no improvement it is advised to analyze 
  the integrand in order to determine the difficulties.  If the position of a 
  local difficulty can be determined (singularity, discontinuity) one will 
  probably gain from splitting up the interval and calling the integrator 
  on the subranges.  Perhaps a special-purpose integrator should be used.
  crps_value, _ = quad(integrand, y_min, y_max)
  the requested tolerance from being achieved.  The error may be 
  underestimated.
  crps_value, _ = quad(integrand, y_min, y_max)


Mean CRPS for first_q - CDF Linear: 0.3729
Mean CRPS for first_q - Hybrid CDF: 0.3588
Mean CRPS for first_q - Normal: 0.3155
Mean NLL for first_q - PDF Linear: 2.1223
Mean NLL for first_q - PDF Hybrid: 0.5613
Mean NLL for first_q - Normal: 5.4108
✅ Processing complete. Results stored in `results_dict`.
Processing second_q...




CRPS shape for second_q: torch.Size([8736])
Updated max_nll_so_far to: 10.347389221191406 based on the current finite NLLs
Replaced infinite values at indices [] in nll_torch_q with the value 10.347389221191406.


  If increasing the limit yields no improvement it is advised to analyze 
  the integrand in order to determine the difficulties.  If the position of a 
  local difficulty can be determined (singularity, discontinuity) one will 
  probably gain from splitting up the interval and calling the integrator 
  on the subranges.  Perhaps a special-purpose integrator should be used.
  crps_value, _ = quad(integrand, y_min, y_max)
  the requested tolerance from being achieved.  The error may be 
  underestimated.
  crps_value, _ = quad(integrand, y_min, y_max)


Mean CRPS for second_q - CDF Linear: 0.3799
Mean CRPS for second_q - Hybrid CDF: 0.3241
Mean CRPS for second_q - Normal: 0.3113
Mean NLL for second_q - PDF Linear: 1.9890
Mean NLL for second_q - PDF Hybrid: 0.8770
Mean NLL for second_q - Normal: 2.4954
✅ Processing complete. Results stored in `results_dict`.
Processing third_q...




CRPS shape for third_q: torch.Size([8832])
Updated max_nll_so_far to: 11.481844902038574 based on the current finite NLLs
Replaced infinite values at indices [] in nll_torch_q with the value 11.481844902038574.


  If increasing the limit yields no improvement it is advised to analyze 
  the integrand in order to determine the difficulties.  If the position of a 
  local difficulty can be determined (singularity, discontinuity) one will 
  probably gain from splitting up the interval and calling the integrator 
  on the subranges.  Perhaps a special-purpose integrator should be used.
  crps_value, _ = quad(integrand, y_min, y_max)
  the requested tolerance from being achieved.  The error may be 
  underestimated.
  crps_value, _ = quad(integrand, y_min, y_max)


Mean CRPS for third_q - CDF Linear: 0.4282
Mean CRPS for third_q - Hybrid CDF: 0.3749
Mean CRPS for third_q - Normal: 0.3648
Mean NLL for third_q - PDF Linear: 2.2197
Mean NLL for third_q - PDF Hybrid: 1.0821
Mean NLL for third_q - Normal: 3.9140
✅ Processing complete. Results stored in `results_dict`.
Processing fourth_q...




CRPS shape for fourth_q: torch.Size([8832])
Updated max_nll_so_far to: 11.55695915222168 based on the current finite NLLs
Replaced infinite values at indices 8488 in nll_torch_q with the value 11.55695915222168.


  If increasing the limit yields no improvement it is advised to analyze 
  the integrand in order to determine the difficulties.  If the position of a 
  local difficulty can be determined (singularity, discontinuity) one will 
  probably gain from splitting up the interval and calling the integrator 
  on the subranges.  Perhaps a special-purpose integrator should be used.
  crps_value, _ = quad(integrand, y_min, y_max)
  in the extrapolation table.  It is assumed that the requested tolerance
  cannot be achieved, and that the returned result (if full_output = 1) is 
  the best which can be obtained.
  crps_value, _ = quad(integrand, y_min, y_max)
  the requested tolerance from being achieved.  The error may be 
  underestimated.
  crps_value, _ = quad(integrand, y_min, y_max)


Mean CRPS for fourth_q - CDF Linear: 0.3598
Mean CRPS for fourth_q - Hybrid CDF: 0.3337
Mean CRPS for fourth_q - Normal: 0.3013
Mean NLL for fourth_q - PDF Linear: 2.5172
Mean NLL for fourth_q - PDF Hybrid: 1.2181
Mean NLL for fourth_q - Normal: 10.2370
✅ Processing complete. Results stored in `results_dict`.
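
The truncated warnings in the output above come from SciPy's `quad` (the call `quad(integrand, y_min, y_max)` is visible in the warning lines) and suggest splitting the interval at a local difficulty. The `integrand` used inside `evaluate` is not shown in this notebook, so the following is only a sketch under the assumption that it is the usual CRPS integrand (F(x) - 1{x >= y})^2, which has a jump at x = y. The function names and the standard normal test forecast are made up for illustration.

```python
import math

from scipy.integrate import quad


def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of a normal distribution, used here only as a test forecast."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def crps_split_at_observation(cdf, y, lower, upper):
    """CRPS integral evaluated in two smooth pieces, left and right of the jump at y."""
    if not lower < y < upper:
        raise ValueError("y must lie strictly inside (lower, upper)")
    left, _ = quad(lambda x: cdf(x) ** 2, lower, y)
    right, _ = quad(lambda x: (1.0 - cdf(x)) ** 2, y, upper)
    return left + right


y_obs = 0.3
numeric = crps_split_at_observation(normal_cdf, y_obs, -10.0, 10.0)

# Closed-form CRPS of a standard normal forecast, used only as a reference value
pdf = math.exp(-0.5 * y_obs ** 2) / math.sqrt(2.0 * math.pi)
closed_form = (
    y_obs * (2.0 * normal_cdf(y_obs) - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi)
)
print(numeric, closed_form)
```

An alternative with the same intent is to keep a single `quad` call and pass the observation through its `points` argument, which marks break points inside a bounded interval. Whether either variant removes the warnings here depends on the actual integrand in `evaluate`.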


# Create a meta table

In [6]:
epsilon = 1e-3
lag=96
transformation = "log(power / max power value + epsilon)"

# Table for metadata

split_rows = []
for quarter, dates in splits.items():
    # Create a row with both 'train' and 'validation' in the same row
    split_rows.append({
        'quarter': quarter,
        'train_start_date': dates['train'][0],
        'train_end_date': dates['train'][1],
        'validation_start_date': dates['validation'][0],
        'validation_end_date': dates['validation'][1],
        'random_seed': 42,
        'num_splits': 4,
        'epsilon': epsilon,
        'lag': lag,
        'transformation': transformation,
        'features': feature_columns,
        'model_type': "TabPFNRegressor",
        'device': "auto",
        'fit_mode': "low_memory",
        "ignore_pretraining_limits": "False"
    })

meta_info_df = pd.DataFrame(split_rows)

# Create a summary statistics table

In [7]:
# Combine all DataFrames from results_dict into one big DataFrame
all_quarters_df = pd.concat(results_dict.values(), ignore_index=True)

# Compute mean, min, median, and max for all numeric columns
summary_stats = all_quarters_df.describe().loc[['mean', 'min', '50%', 'max']].rename(index={'50%': 'median'})

# Compute mean of "CRPS (5000 quantiles)" separately
crps_mean = all_quarters_df["CRPS (5000 quantiles)"].mean()

# Compute mean of "NLL (5000 quantiles)" separately
nll_mean = all_quarters_df["NLL (5000 quantiles)"].mean()

# Convert them into separate DataFrames
crps_mean_df = pd.DataFrame({"Metric": ["Mean CRPS (5000 quantiles)"], "Value": [crps_mean]})
nll_mean_df = pd.DataFrame({"Metric": ["Mean NLL (5000 quantiles)"], "Value": [nll_mean]})

# Write all tables to Excel

In [8]:
# Define the output Excel file name
output_excel_file = '../../../OneDrive/Arbeit/HTWG/Master/results/TabPFN/results_ws10m_ws100m_pt_96.xlsx'
#output_excel_file = '../../../OneDrive/Arbeit/HTWG/Master/results/TabPFN/tests/2_results_ws10m_ws100m_pt_96.xlsx'

# Write everything to the Excel file
with pd.ExcelWriter(output_excel_file) as writer:
    # Write each split DataFrame to its own sheet
    for quarter, df in results_dict.items():
        df.to_excel(writer, sheet_name=quarter, index=False)
        summary_stats_overall[quarter].to_excel(writer, sheet_name=quarter, startcol=len(df.columns) + 3, index=True)

    # Write the meta_info DataFrame
    meta_info_df.to_excel(writer, sheet_name='meta_info', index=False)
    
    # Write summary statistics
    summary_stats.to_excel(writer, sheet_name="summary_stats")
    
    # Append the mean CRPS and NLL separately below the summary stats
    crps_mean_df.to_excel(writer, sheet_name="summary_stats", startrow=len(summary_stats) + 2, index=False)
    nll_mean_df.to_excel(writer, sheet_name="summary_stats", startrow=len(summary_stats) + 4, index=False)

print(f'Summary statistics and mean CRPS/NLL added to "summary_stats" sheet in {output_excel_file}')

Summary statistics and mean CRPS/NLL added to "summary_stats" sheet in ../../../OneDrive/Arbeit/HTWG/Master/results/TabPFN/results_ws10m_ws100m_pt_96.xlsx


In [9]:
print("debugged the summary stats per quarter and wrote everything into an excel file")

debugged the summary stats per quarter and wrote everything into an excel file


In [17]:
print("done with new changes of replacing inf values in q4 with max nll and writing the new summary stats per quarter into the respective sheets")

done with new changes of replacing inf values in q4 with max nll and writing the new summary stats per quarter into the respective sheets


In [20]:
# Find rows where NLL (5000 quantiles) is inf
inf_rows = results_dict["fourth_q"][results_dict["fourth_q"]["NLL (5000 quantiles)"] == float("inf")]
inf_rows


Unnamed: 0,CRPS Linear,CRPS Hybrid,CRPS Normal,CRPS (5000 quantiles),NLL Linear,NLL Hybrid,NLL Normal,NLL (5000 quantiles),y values
8488,0.215054,0.279779,0.157575,0.158623,4.010459,8.96477,17.167528,inf,-0.370978


In [None]:
pd.DataFrame(nll_torch_q)

Unnamed: 0,0
0,0.262002
1,1.082626
2,3.098203
3,3.836962
4,5.466805
...,...
8827,3.267937
8828,4.434358
8829,2.142652
8830,2.570757


In [None]:
nll_torch_q_np = nll_torch_q.cpu().numpy()  # Move to CPU if it's on GPU

inf_indices = np.where(np.isinf(nll_torch_q_np))[0]

# Default to an empty frame so the final expression also works when no inf is found
logits_df = pd.DataFrame()

if len(inf_indices) > 0:
    print(f"⚠️ Found {len(inf_indices)} cases where NLL is inf in {quarter}.")

    # Extract corresponding logits
    inf_logits = logits_q[inf_indices]

    # Collect the logits in a DataFrame for inspection
    logits_df = pd.DataFrame(inf_logits)

logits_df



⚠️ Found 1 cases where NLL is inf in fourth_q.


Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,4990,4991,4992,4993,4994,4995,4996,4997,4998,4999
0,-25.652138,-25.23842,-27.545704,-27.573008,-27.749456,-27.88328,-28.299463,-28.315201,-28.511505,-28.58597,...,-inf,-inf,-inf,-15.670451,-inf,-inf,-inf,-inf,-inf,-15.719242


In [48]:
y_validation_q_torch[1]

tensor(-2.3246)

In [53]:
np.searchsorted(borders_q, y_validation_q_torch[8488], side='right') - 1

tensor(2940)

In [57]:
borders_q[2939:2943]

tensor([-0.3768, -0.3737, -0.3703, -0.3669])

In [60]:
pd.DataFrame(logits_q)

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,4990,4991,4992,4993,4994,4995,4996,4997,4998,4999
0,-19.013660,-16.121935,-18.066088,-19.185160,-18.809097,-19.559652,-20.388332,-20.174355,-20.303059,-20.440792,...,-inf,-inf,-inf,-14.953773,-inf,-inf,-inf,-inf,-17.616362,-14.204114
1,-20.445480,-17.047121,-19.200077,-20.411682,-19.994869,-20.745758,-21.517900,-21.266085,-21.467499,-21.575579,...,-inf,-inf,-inf,-15.051413,-inf,-inf,-inf,-inf,-inf,-14.452294
2,-19.991667,-17.829699,-19.708517,-20.733616,-20.404200,-20.983063,-21.502640,-21.344337,-21.562677,-21.560261,...,-inf,-inf,-inf,-15.051413,-inf,-inf,-inf,-inf,-inf,-14.823153
3,-19.075420,-18.534924,-19.857212,-20.369707,-20.388901,-20.625570,-21.006407,-20.982119,-21.177557,-21.225622,...,-inf,-inf,-inf,-14.744682,-inf,-inf,-inf,-inf,-inf,-14.977304
4,-18.726599,-19.765089,-20.627991,-20.863737,-21.163416,-21.268990,-21.700020,-21.741943,-21.883339,-22.085009,...,-inf,-inf,-inf,-14.725990,-inf,-inf,-inf,-inf,-inf,-15.347678
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
8827,-18.452223,-18.032745,-19.962942,-20.401678,-20.568182,-20.946655,-21.449667,-21.411539,-21.640854,-21.782778,...,-inf,-inf,-inf,-15.313777,-inf,-inf,-inf,-inf,-inf,-14.604100
8828,-18.149229,-18.262783,-19.873194,-20.269188,-20.455511,-20.775591,-21.263411,-21.269014,-21.464455,-21.634327,...,-inf,-inf,-inf,-15.770535,-inf,-inf,-inf,-inf,-inf,-15.496099
8829,-19.353964,-19.132006,-21.005367,-21.449732,-21.600035,-21.951370,-22.444838,-22.388763,-22.615458,-22.780737,...,-inf,-inf,-inf,-15.382770,-inf,-inf,-inf,-inf,-inf,-14.823153
8830,-19.192339,-19.081923,-20.843849,-21.220961,-21.428885,-21.700876,-22.195786,-22.145071,-22.394833,-22.565186,...,-inf,-inf,-inf,-15.579479,-inf,-inf,-inf,-inf,-inf,-14.953773


For row 8488 (y = -0.370978), the observation falls into bin 2940 (borders -0.3737 to -0.3703), and the logits of bins 2939 to 2941 are -inf; bin 2942 is the next finite one (-18.7150).
In TabPFN, the log pdf for a given y is calculated as follows:
The index of the bin that contains y is looked up, and the logit of that bin gives the log pdf as log_softmax - log(bin_width). Because there are so many bins (5000), the logits fluctuate from bin to bin, and some of them are -inf, which makes the NLL infinite. A more robust approach would be to average over neighboring bins.
In this case the nine deciles range from -0.596 to -0.517, so evaluating y = -0.370978 means evaluating the pdf far out in the right tail of the predictive distribution.
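
The bin lookup and the neighbor averaging described above can be sketched in NumPy as follows. This is not TabPFN's implementation: `bin_log_density` and its `half_window` parameter are hypothetical, and averaging the probability mass over a window of bins is just one possible reading of "average over neighboring bins". The toy distribution has five unit-width bins, one of which has a logit of -inf.

```python
import numpy as np


def bin_log_density(logits, borders, y, half_window=0):
    """Log density of y under a piecewise-constant (histogram) distribution.

    half_window=0 gives log_softmax(logits)[k] - log(width_k) for the bin k
    containing y. half_window > 0 pools the probability mass of the bins
    k - half_window ... k + half_window and divides by their total width.
    """
    logits = np.asarray(logits, dtype=float)
    borders = np.asarray(borders, dtype=float)

    # Softmax over the bins (exp(-inf) is exactly 0)
    shifted = logits - logits.max()
    probs = np.exp(shifted) / np.exp(shifted).sum()

    # Index of the bin that contains y, clipped to the valid range
    k = int(np.clip(np.searchsorted(borders, y, side="right") - 1, 0, len(logits) - 1))
    lo = max(k - half_window, 0)
    hi = min(k + half_window, len(logits) - 1)

    mass = probs[lo:hi + 1].sum()
    width = borders[hi + 1] - borders[lo]
    with np.errstate(divide="ignore"):  # log(0) -> -inf without a warning
        return np.log(mass) - np.log(width)


toy_borders = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
toy_logits = [0.0, 0.0, -np.inf, 0.0, 0.0]

plain = bin_log_density(toy_logits, toy_borders, y=2.5)
smoothed = bin_log_density(toy_logits, toy_borders, y=2.5, half_window=1)
print(plain, smoothed)
```

Without smoothing, the observation in the empty bin gets a log density of -inf (infinite NLL). With one neighbor on each side the value becomes finite. The smoothed values are a heuristic and no longer integrate exactly to one near the edges of the support.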

In [72]:
all_quantiles_q[:,8488]

array([-0.59614944, -0.5627999 , -0.5508739 , -0.54409146, -0.53867894,
       -0.53359264, -0.5287777 , -0.5243036 , -0.5172177 ], dtype=float32)

In [79]:
pd.DataFrame(logits_q).describe().iloc[:, 2920:2943]


  sqr = _ensure_numeric((avg - values) ** 2)

Unnamed: 0,2920,2921,2922,2923,2924,2925,2926,2927,2928,2929,...,2933,2934,2935,2936,2937,2938,2939,2940,2941,2942
count,8832.0,8832.0,8832.0,8832.0,8832.0,8832.0,8832.0,8832.0,8832.0,8832.0,...,8832.0,8832.0,8832.0,8832.0,8832.0,8832.0,8832.0,8832.0,8832.0,8832.0
mean,-10.789545,-10.965722,-inf,-10.938675,-inf,-inf,-inf,-11.154449,-inf,-11.177657,...,-inf,-11.255788,-inf,-inf,-inf,-11.598496,-inf,-inf,-inf,-inf
std,4.1372,4.159431,,4.1437,,,,4.123433,,4.069353,...,,3.919912,,,,3.835953,,,,
min,-18.714973,-18.714973,-inf,-18.714973,-inf,-inf,-inf,-18.714973,-inf,-18.714973,...,-inf,-18.714973,-inf,-inf,-inf,-18.714973,-inf,-inf,-inf,-inf
25%,-14.272323,-14.466478,-14.495466,-14.410909,-14.654531,-14.452294,-14.525319,-14.571839,-14.689622,-14.556091,...,-14.495466,-14.495466,-14.480867,-14.560028,-14.802951,-14.76373,-14.744682,-14.802951,-14.886333,-14.72599
50%,-12.298242,-12.475674,-12.511427,-12.431773,-12.667602,-12.512439,-12.630477,-12.681888,-12.772175,-12.662885,...,-12.651189,-12.675911,-12.648866,-12.752681,-12.948218,-12.973576,-12.93269,-12.993026,-13.058982,-12.956072
75%,-6.655285,-6.829577,-6.874124,-6.87539,-7.225013,-6.937379,-7.11969,-7.205738,-7.498533,-7.311081,...,-7.529891,-7.589352,-7.65183,-7.763685,-8.161169,-8.062538,-8.113452,-8.307739,-8.666306,-8.518025
max,-2.522554,-2.630617,-2.64137,-2.589702,-2.869349,-2.497407,-2.626566,-2.59139,-2.890034,-2.712614,...,-2.850115,-2.999733,-3.039183,-3.018382,-3.3728,-3.123116,-3.035992,-3.03507,-3.294441,-2.922632


In [65]:
logits_q[8488,2939:2943]

tensor([    -inf,     -inf,     -inf, -18.7150])

In [68]:
logits_q[8488,2920:2960]

tensor([-15.4961, -16.2301, -16.1500, -16.6355, -16.6355, -17.3287, -17.6164,
        -17.1055, -17.6164, -17.3287, -16.6355, -18.0218, -17.3287, -18.7150,
        -18.0218,     -inf, -18.0218, -18.7150, -17.6164,     -inf,     -inf,
            -inf, -18.7150,     -inf, -18.0218,     -inf, -18.7150, -18.7150,
            -inf,     -inf,     -inf, -18.7150, -18.0218, -18.7150, -17.6164,
        -18.7150,     -inf,     -inf, -17.6164, -18.0218])

In [59]:
logits_q[2939:2943]

tensor([[-16.7685, -17.2323, -18.4928,  ...,     -inf,     -inf, -14.7831],
        [-17.8030, -18.4433, -19.5984,  ...,     -inf,     -inf, -14.8648],
        [-21.1969, -21.9796, -23.0072,  ...,     -inf,     -inf, -15.3828],
        [-21.3522, -21.6257, -22.8704,  ...,     -inf,     -inf, -15.0774]])

In [None]:
-0.370978


In [44]:
pd.DataFrame(borders_q)

Unnamed: 0,0
0,-95.082840
1,-28.012060
2,-24.506130
3,-23.045065
4,-21.867979
...,...
4996,18.966341
4997,19.977930
4998,21.901922
4999,25.367764



# Old

## Q1
- Train: 2022-01-01 – 2022-03-31
- Validation: 2023-01-01 – 2023-03-31

In [None]:
X_train_first_quarter = X_train.loc["2022-01-01":"2022-03-31"]
y_train_first_quarter = y_train.loc["2022-01-01":"2022-03-31"]

X_validation_first_quarter = X_validation.loc["2023-01-01":"2023-03-31"]
y_validation_first_quarter = y_validation.loc["2023-01-01":"2023-03-31"]

## Q2
- Train: 2022-04-01 – 2022-06-30
- Validation: 2023-04-01 – 2023-06-30

In [None]:
X_train_second_quarter = X_train.loc["2022-04-01":"2022-06-30"]
y_train_second_quarter = y_train.loc["2022-04-01":"2022-06-30"]

X_validation_second_quarter = X_validation.loc["2023-04-01":"2023-06-30"]
y_validation_second_quarter = y_validation.loc["2023-04-01":"2023-06-30"]

## Q3
- Train: 2022-07-01 – 2022-09-30
- Validation: 2023-07-01 – 2023-09-30

In [None]:
X_train_third_quarter = X_train.loc["2022-07-01":"2022-09-30"]
y_train_third_quarter = y_train.loc["2022-07-01":"2022-09-30"]

X_validation_third_quarter = X_validation.loc["2023-07-01":"2023-09-30"]
y_validation_third_quarter = y_validation.loc["2023-07-01":"2023-09-30"]

## Q4
- Train: 2022-10-01 – 2022-12-31
- Validation: 2023-10-01 – 2023-12-31

In [None]:
X_train_fourth_quarter = X_train.loc["2022-10-01":"2022-12-31"]
y_train_fourth_quarter = y_train.loc["2022-10-01":"2022-12-31"]

X_validation_fourth_quarter = X_validation.loc["2023-10-01":"2023-12-31"]
y_validation_fourth_quarter = y_validation.loc["2023-10-01":"2023-12-31"]

In [None]:
print("done")

done


In [None]:
# Check that each quarter is correctly split
print("Train First Quarter:", X_train_first_quarter.index.min(), "to", X_train_first_quarter.index.max())
print("Validation First Quarter:", X_validation_first_quarter.index.min(), "to", X_validation_first_quarter.index.max())

print("Train Second Quarter:", X_train_second_quarter.index.min(), "to", X_train_second_quarter.index.max())
print("Validation Second Quarter:", X_validation_second_quarter.index.min(), "to", X_validation_second_quarter.index.max())

print("Train Third Quarter:", X_train_third_quarter.index.min(), "to", X_train_third_quarter.index.max())
print("Validation Third Quarter:", X_validation_third_quarter.index.min(), "to", X_validation_third_quarter.index.max())

print("Train Fourth Quarter:", X_train_fourth_quarter.index.min(), "to", X_train_fourth_quarter.index.max())
print("Validation Fourth Quarter:", X_validation_fourth_quarter.index.min(), "to", X_validation_fourth_quarter.index.max())


Train First Quarter: 2022-01-01 00:00:00 to 2022-03-31 23:45:00
Validation First Quarter: 2023-01-01 00:00:00 to 2023-03-31 23:45:00
Train Second Quarter: 2022-04-01 00:00:00 to 2022-06-30 23:45:00
Validation Second Quarter: 2023-04-01 00:00:00 to 2023-06-30 23:45:00
Train Third Quarter: 2022-07-01 00:00:00 to 2022-09-30 23:45:00
Validation Third Quarter: 2023-07-01 00:00:00 to 2023-09-30 23:45:00
Train Fourth Quarter: 2022-10-01 00:00:00 to 2022-12-31 23:45:00
Validation Fourth Quarter: 2023-10-01 00:00:00 to 2023-12-31 23:45:00


# Fit model to 1st quarter

In [None]:
model_first_q = TabPFNRegressor(device='auto', fit_mode='low_memory', random_state=42)
model_first_q.fit(X_train_first_quarter, y_train_first_quarter)
quantiles_custom_first_q = np.arange(0.1, 1, 0.1)  # probability levels 0.1, 0.2, ..., 0.9 at which quantiles are returned
probabilities_first_q = quantiles_custom_first_q  # alias used by the decile-based evaluation

probs_val_first_q = model_first_q.predict(X_validation_first_quarter, output_type="full", quantiles=quantiles_custom_first_q)
logits_first_q = probs_val_first_q["logits"]
borders_first_q = probs_val_first_q["criterion"].borders # returns borders appropriate for the training data used and not the standard borders from TabPFN
all_quantiles_first_q = np.array(probs_val_first_q["quantiles"])
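
A side note on the decile grid: `np.arange` with a float step accumulates rounding error, and the NumPy documentation recommends `np.linspace` for non-integer steps. The short sketch below compares both for the nine levels used here. The variable names are illustrative only and are not used elsewhere in the notebook.

```python
import numpy as np

# Nine decile levels built two ways.
levels_arange = np.arange(0.1, 1, 0.1)      # step-based; entries carry rounding error
levels_linspace = np.linspace(0.1, 0.9, 9)  # endpoint-based; the length is explicit

print(levels_arange.tolist())
print(levels_linspace.tolist())
print("same length:", len(levels_arange) == len(levels_linspace))
print("numerically close:", np.allclose(levels_arange, levels_linspace))
```

For this particular grid both calls yield nine values, so the results above are unaffected; the difference only matters for grids where the float step makes the element count ambiguous.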



## CRPS from 5000 logits

In [None]:
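# Convert via NumPy first; passing the datetime-indexed Series directly triggers a warning
y_validation_first_q_torch = torch.tensor(y_validation_first_quarter.to_numpy(), dtype=torch.float32)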
crps_values_torch_first_q = compute_crps_pytorch(logits_first_q, borders_first_q, y_validation_first_q_torch)
print("CRPS shape:", crps_values_torch_first_q.shape)  # Should be (N,)
print("First few CRPS values:", crps_values_torch_first_q)

CRPS shape: torch.Size([8640])
First few CRPS values: tensor([0.2588, 0.2914, 0.3574,  ..., 0.2643, 0.3331, 0.4453])
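
`compute_crps_pytorch` is defined elsewhere, so the following is only a sketch of how a CRPS could be computed from bucket logits and borders. It assumes a plain histogram distribution (uniform density inside each bucket, hence a piecewise-linear CDF) and integrates the squared difference between that CDF and the step function at `y` exactly. TabPFN's criterion may treat the outermost buckets differently; the function name `crps_histogram` is hypothetical.

```python
import numpy as np


def crps_histogram(logits, borders, y):
    """CRPS of a histogram distribution (assumption: uniform density per bucket)."""
    logits = np.asarray(logits, dtype=float)
    borders = np.asarray(borders, dtype=float)

    # Softmax turns logits into bucket probabilities.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    # CDF values at the bucket borders; linear in between.
    cdf_at_borders = np.concatenate([[0.0], np.cumsum(probs)])
    cdf_at_borders[-1] = 1.0  # guard against rounding in the cumulative sum

    # Add y as a knot so that the step function is constant on every segment.
    knots = np.unique(np.concatenate([borders, [y]]))
    cdf = np.interp(knots, borders, cdf_at_borders)  # clamps to 0 / 1 outside the borders
    step = (knots[:-1] >= y).astype(float)           # indicator 1{x >= y} per segment

    # Exact integral of a squared linear function on each segment.
    d_left = cdf[:-1] - step
    d_right = cdf[1:] - step
    widths = np.diff(knots)
    return float(np.sum(widths * (d_left**2 + d_left * d_right + d_right**2) / 3.0))


# Toy check: four equal buckets on [0, 1] form a uniform distribution.
toy_borders = np.linspace(0.0, 1.0, 5)
toy_logits = np.zeros(4)
print(crps_histogram(toy_logits, toy_borders, 0.5))  # analytic value for U(0, 1) at 0.5 is 1/12
```

For a uniform distribution on [0, 1] the CRPS at `y` is `y**3 / 3 + (1 - y)**3 / 3`, which gives a quick way to sanity-check an implementation before applying it to all 5000 buckets.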




## NLL from 5000 logits

In [None]:
#probs_val["criterion"].forward(logits, torch.tensor(y_val_lim, dtype=torch.float32))
nll_torch_first_q = probs_val_first_q["criterion"].forward(logits_first_q, y_validation_first_q_torch)
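
To make explicit what a bucket-based NLL measures, here is a minimal sketch under the same simplifying assumption of a plain histogram distribution: the density at `y` is the bucket probability divided by the bucket width. It is not the implementation behind `criterion.forward`; the name `nll_histogram` and the clipping of out-of-range targets into the outer buckets are my own assumptions.

```python
import numpy as np


def nll_histogram(logits, borders, y):
    """Negative log density of y under a histogram distribution (assumed form)."""
    logits = np.asarray(logits, dtype=float)
    borders = np.asarray(borders, dtype=float)

    # Log-softmax for numerically stable log probabilities.
    shifted = logits - logits.max()
    log_probs = shifted - np.log(np.exp(shifted).sum())

    # Index of the bucket containing y; targets outside the borders are
    # clipped into the first or last bucket (assumption for this sketch).
    k = int(np.clip(np.searchsorted(borders, y, side="right") - 1, 0, len(logits) - 1))
    width = borders[k + 1] - borders[k]

    # density = p_k / width  ->  NLL = -(log p_k - log width)
    return float(-(log_probs[k] - np.log(width)))


# Uniform on [0, 1]: density 1, so the NLL is 0.
print(nll_histogram(np.zeros(4), np.linspace(0.0, 1.0, 5), 0.3))
# Uniform on [0, 0.5]: density 2, so the NLL is -log(2), a negative value.
print(nll_histogram(np.zeros(4), np.linspace(0.0, 0.5, 5), 0.3))
```

Because this is the negative log of a density rather than of a probability, negative NLL values are possible whenever the density at the target exceeds 1.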

## CRPS, NLL from 9 deciles

In [None]:
# Empirical deciles of the full power series, used to fit the tail distributions
yt = entsoe["power"]
quantile_10 = np.percentile(yt, probabilities_first_q * 100)

# Fit left and right tails from the two outermost deciles on each side
mu_left_asym, sigma_left_asym = fit_tail_distribution(quantile_10[:2], probabilities_first_q[:2])
mu_right_asym, sigma_right_asym = fit_tail_distribution(quantile_10[-2:], probabilities_first_q[-2:])

# Initialize lists to store the results
crps_cdf_linear_a_full_first_q = []
crps_hybrid_cdf_a_full_first_q = []
crps_normal_a_full_first_q = []

nll_pdf_linear_a_full_first_q = []
nll_pdf_hybrid_a_full_first_q = []
nll_normal_a_full_first_q = []

for i in range(len(y_validation_first_quarter)):
    quantile_i = all_quantiles_first_q[:, i]
    # Positional access; plain [i] on a datetime-indexed Series is deprecated
    y_i = y_validation_first_quarter.iloc[i]

    # evaluate returns six values: three CRPS variants followed by three NLL variants
    cdf_linear, hybrid_cdf, crps_normal, pdf_linear, pdf_hybrid, nll_normal = evaluate(quantile_i, probabilities_first_q, y_i, -20, 5, mu_left_asym, sigma_left_asym, mu_right_asym, sigma_right_asym)
    
    # Append the results to respective lists
    crps_cdf_linear_a_full_first_q.append(cdf_linear)
    crps_hybrid_cdf_a_full_first_q.append(hybrid_cdf)
    crps_normal_a_full_first_q.append(crps_normal)
    
    nll_pdf_linear_a_full_first_q.append(pdf_linear)
    nll_pdf_hybrid_a_full_first_q.append(pdf_hybrid)
    nll_normal_a_full_first_q.append(nll_normal)


#print("crps linear", crps_cdf_linear_a)
#print("crps hybrid", crps_hybrid_cdf_a)
#print("crps normal", crps_normal_a)

#print("NLL linear", nll_pdf_linear_a)
#print("NLL hybrid", nll_pdf_hybrid_a)
#print("NLL normal", nll_normal_a)

# Calculate and print the mean values
mean_crps_cdf_linear_full_first_q = np.mean(crps_cdf_linear_a_full_first_q)
mean_crps_hybrid_cdf_a_full_first_q = np.mean(crps_hybrid_cdf_a_full_first_q)
mean_crps_normal_a_full_first_q = np.mean(crps_normal_a_full_first_q)

mean_nll_pdf_linear_a_full_first_q = np.mean(nll_pdf_linear_a_full_first_q)
mean_nll_pdf_hybrid_a_full_first_q = np.mean(nll_pdf_hybrid_a_full_first_q)
mean_nll_normal_a_full_first_q = np.mean(nll_normal_a_full_first_q)

# Print the results
print(f"Mean CRPS for CDF Linear interpolation: {mean_crps_cdf_linear_full_first_q:.4f}")
print(f"Mean CRPS for Hybrid CDF interpolation: {mean_crps_hybrid_cdf_a_full_first_q:.4f}")
print(f"Mean CRPS for Normal distribution interpolation: {mean_crps_normal_a_full_first_q:.4f}")

print(f"Mean NLL for PDF Linear interpolation: {mean_nll_pdf_linear_a_full_first_q:.4f}")
print(f"Mean NLL for PDF Hybrid interpolation: {mean_nll_pdf_hybrid_a_full_first_q:.4f}")
print(f"Mean NLL for Normal distribution interpolation: {mean_nll_normal_a_full_first_q:.4f}")


Warnings emitted by this cell (messages truncated in the export; each message is followed by the source line it refers to):

  [...]
  y_i = y_validation_first_quarter[i]

  [...] If increasing the limit yields no improvement it is advised to analyze the integrand in order to determine the difficulties. If the position of a local difficulty can be determined (singularity, discontinuity) one will probably gain from splitting up the interval and calling the integrator on the subranges. Perhaps a special-purpose integrator should be used.
  crps_value, _ = quad(integrand, y_min, y_max)

  [...] the requested tolerance from being achieved. The error may be underestimated.
  crps_value, _ = quad(integrand, y_min, y_max)


Mean CRPS for CDF Linear interpolation: 0.3233
Mean CRPS for Hybrid CDF interpolation: 0.2984
Mean CRPS for Normal distribution interpolation: 0.2546
Mean NLL for PDF Linear interpolation: 1.2714
Mean NLL for PDF Hybrid interpolation: 0.1639
Mean NLL for Normal distribution interpolation: 0.7070
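
The internals of `evaluate` are not shown here, so the following is only a sketch of what a "Normal" variant could look like: fit a normal distribution to the nine deciles by regressing the predicted quantiles on the standard-normal quantiles, then score it with the closed-form CRPS of a Gaussian and the Gaussian NLL. All function names are hypothetical, and the sketch assumes the fitted slope (sigma) is positive.

```python
import math
from statistics import NormalDist

import numpy as np


def fit_normal_to_quantiles(quantiles, probs):
    """Least-squares fit of q_p = mu + sigma * z_p (z_p = standard-normal quantile)."""
    z = np.array([NormalDist().inv_cdf(float(p)) for p in probs])
    sigma, mu = np.polyfit(z, np.asarray(quantiles, dtype=float), 1)  # slope, intercept
    return float(mu), float(sigma)


def crps_normal(mu, sigma, y):
    """Closed-form CRPS of N(mu, sigma^2) at observation y."""
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))


def nll_normal(mu, sigma, y):
    """Negative log density of N(mu, sigma^2) at y."""
    return 0.5 * math.log(2.0 * math.pi * sigma**2) + 0.5 * ((y - mu) / sigma) ** 2


# Toy example: deciles generated from N(2, 0.5^2) should be recovered by the fit.
probs = np.linspace(0.1, 0.9, 9)
toy_quantiles = [2.0 + 0.5 * NormalDist().inv_cdf(float(p)) for p in probs]
mu_hat, sigma_hat = fit_normal_to_quantiles(toy_quantiles, probs)
print(mu_hat, sigma_hat)
print(crps_normal(mu_hat, sigma_hat, 2.3), nll_normal(mu_hat, sigma_hat, 2.3))
```

The NLL grows with the squared standardized error `((y - mu) / sigma) ** 2`, so a narrow fitted distribution that misses the target is penalized far more heavily in NLL than in CRPS, which grows only linearly in the miss distance.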


## Result

In [None]:
result_first_q = {
    'CRPS Linear': crps_cdf_linear_a_full_first_q,
    'CRPS Hybrid': crps_hybrid_cdf_a_full_first_q,
    'CRPS Normal': crps_normal_a_full_first_q,
    #'CRPS (5000 quantiles)': crps_values[0:10],
    'CRPS (5000 quantiles)': crps_values_torch_first_q,
    'NLL Linear': nll_pdf_linear_a_full_first_q,
    'NLL Hybrid': nll_pdf_hybrid_a_full_first_q,
    'NLL Normal': nll_normal_a_full_first_q,
    #'NLL (5000 quantiles)': probs_val["criterion"].forward(logits[:10,:], torch.tensor(y_validation.head(10), dtype=torch.float32)),
    'NLL (5000 quantiles)': nll_torch_first_q,
    'y values': y_validation_first_q_torch,
    #'first quantile': all_quantiles[0,:],
    #'last quantile': all_quantiles[-1,:]
}

# Create DataFrame
results_first_q = pd.DataFrame(result_first_q)
results_first_q = results_first_q.round(8)
results_first_q

| | CRPS Linear | CRPS Hybrid | CRPS Normal | CRPS (5000 quantiles) | NLL Linear | NLL Hybrid | NLL Normal | NLL (5000 quantiles) | y values |
|------|----------|----------|----------|----------|-----------|-----------|-----------|-----------|-----------|
| 0 | 0.335531 | 0.351515 | 0.269427 | 0.258756 | 5.274736 | 0.873116 | 3.379422 | 0.394301 | -0.539165 |
| 1 | 0.354329 | 0.387881 | 0.319414 | 0.291448 | 5.284776 | -0.004560 | 19.760483 | 1.392653 | -0.518601 |
| 2 | 0.396949 | 0.390129 | 0.379295 | 0.357435 | 5.287517 | -0.090864 | 52.659648 | 1.875144 | -0.555433 |
| 3 | 0.364116 | 0.375514 | 0.339769 | 0.327781 | 5.288160 | -0.207164 | 51.216794 | 3.055342 | -0.509552 |
| 4 | 0.397876 | 0.383197 | 0.381887 | 0.370157 | 5.288161 | -0.116637 | 64.281385 | 3.429277 | -0.552363 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 8635 | 0.250530 | 0.178434 | 0.175355 | 0.172086 | -0.123527 | -0.344297 | 0.186721 | -0.487889 | -1.343460 |
| 8636 | 0.252027 | 0.179799 | 0.177976 | 0.171470 | -0.131602 | -0.346046 | 0.203189 | -0.086346 | -1.336049 |
| 8637 | 0.344401 | 0.273301 | 0.266790 | 0.264319 | 0.255278 | 0.430754 | 0.617904 | 0.083238 | -1.332029 |
| 8638 | 0.406260 | 0.339873 | 0.331750 | 0.333067 | 4.154343 | 0.971536 | 0.926192 | 0.914429 | -1.319186 |
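
The averaging step described at the top of the notebook takes the mean of the four quarterly means. Since the quarters differ in length (90 to 92 days), this is not identical to the mean over all validation points pooled together. The sketch below computes both; `average_over_quarters` is a hypothetical helper and the toy scores are made-up numbers for illustration, not results from this notebook.

```python
import numpy as np


def average_over_quarters(per_quarter_scores):
    """Return (quarter means, mean of quarter means, pooled mean over all points)."""
    quarter_means = {
        name: float(np.mean(scores)) for name, scores in per_quarter_scores.items()
    }
    # Unweighted: (m1 + m2 + m3 + m4) / 4, as in the formula at the top.
    mean_of_means = float(np.mean(list(quarter_means.values())))
    # Pooled: every validation point has equal weight.
    pooled = np.concatenate(
        [np.asarray(scores, dtype=float) for scores in per_quarter_scores.values()]
    )
    return quarter_means, mean_of_means, float(pooled.mean())


# Made-up toy scores with unequal group sizes (illustration only).
toy_scores = {"Q1": [1.0, 1.0], "Q2": [3.0, 3.0, 3.0, 3.0]}
quarter_means, mean_of_means, pooled_mean = average_over_quarters(toy_scores)
print(quarter_means, mean_of_means, pooled_mean)
```

In the notebook, the per-quarter inputs would be columns such as `results_first_q["CRPS Linear"]` and their counterparts for the other quarters. For quarters of 90 to 92 days the two averages should differ only slightly, but it is worth stating which one is reported when comparing against NGBoost.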
