## Understanding What Drives the Forecast
In this notebook, we focus on interpretability, i.e., understanding why the model makes its predictions. We introduce a way to measure how much each input factor contributes to the forecast. This helps identify the key drivers behind the predictions and supports informed decision-making.

For this example, we use an energy day-ahead hourly prediction dataset with two main features:
- Consumption: the forecasted electricity demand.
- Residual load: the demand not covered by renewable generation.

By examining the relative importance of these features, we can see which factors most influence the forecast at different times, enabling better operational and strategic planning.
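
To make "relative importance" concrete, here is a minimal sketch of one common, model-agnostic way to measure it: permutation importance. This notebook does not describe the method behind `explain`, so treat this as a generic illustration rather than a description of that function. The data is synthetic, and `permutation_importance`, `predict_fn`, and the feature names are hypothetical helpers introduced only for this example.

```python
import numpy as np


def permutation_importance(predict_fn, X, y, n_repeats=10, seed=0):
    """Mean increase in MAE when each column of X is shuffled in turn."""
    rng = np.random.default_rng(seed)
    baseline = np.mean(np.abs(y - predict_fn(X)))
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Shuffling one column breaks its relationship with the target
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            shuffled_error = np.mean(np.abs(y - predict_fn(X_perm)))
            increases.append(shuffled_error - baseline)
        importances[j] = np.mean(increases)
    return importances


# Synthetic data (illustrative only): the first feature drives the target
# strongly, the second only weakly.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(500, 2))
y = 3.0 * X[:, 0] + 0.1 * X[:, 1] + data_rng.normal(scale=0.1, size=500)

# Stand-in model: ordinary least squares fitted with NumPy
coef, *_ = np.linalg.lstsq(X, y, rcond=None)


def predict_fn(features):
    return features @ coef


feature_names = ["strong_driver", "weak_driver"]  # hypothetical names
importances = permutation_importance(predict_fn, X, y)
for name, value in zip(feature_names, importances):
    print(f"{name}: {value:.3f}")
```

A feature whose shuffling raises the error more is one the model relies on more heavily. The same reading applies to any importance score: it describes the model's behaviour, not necessarily a causal effect.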

### Prerequisites

Let's import the relevant packages and load the credentials that you've inserted in `credentials.txt`.

In [None]:
import sys
import pathlib

import pandas as pd

# Add the parent directory to the import path so that `inait` can be imported
sys.path.append(pathlib.Path().resolve().parent.as_posix())

from inait import explain, predict, plot, load_credentials

base_url, auth_key = load_credentials("../credentials.txt")

### Load the dataset and forecasting settings

We set the target variable to the price, "DE_Spot_EPEX_1H_A". In addition, we use two features to train the model: "DE_Residual_Load_15M_A_AVG" and "DE_Consumption_15M_A_AVG".

In [None]:
# Load the data
data_path = "../data/power_day_ahead.csv"
data = pd.read_csv(
    data_path, index_col=0
)  # dataset must have a valid datetime index with fixed frequency

# Configure prediction parameters
target_columns = [
    "DE_Spot_EPEX_1H_A"
]  # List of target columns to predict in the dataset
feature_columns = [
    "DE_Residual_Load_15M_A_AVG",
    "DE_Consumption_15M_A_AVG",
]  # Optional: List of feature columns to use for prediction
forecasting_horizon = 24  # Predict 24 time steps ahead
observation_length = 24  # Use last 24 time steps as historical context
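
The loading cell notes that the dataset must have a valid datetime index with fixed frequency. A minimal sketch of how that requirement could be checked with pandas before calling the API is shown below. `check_regular_datetime_index` is a hypothetical helper, not part of `inait`, and the toy frames stand in for the real dataset.

```python
import pandas as pd


def check_regular_datetime_index(df):
    """Return a copy of df with a DatetimeIndex, raising if the spacing is irregular."""
    out = df.copy()
    out.index = pd.to_datetime(out.index)
    # Differences between consecutive timestamps; a fixed frequency means
    # exactly one distinct, strictly positive step.
    steps = out.index.to_series().diff().dropna()
    if steps.empty or steps.nunique() != 1 or steps.iloc[0] <= pd.Timedelta(0):
        raise ValueError("index is not a sorted datetime index with a fixed frequency")
    return out


# Toy example (illustrative values only): hourly timestamps pass the check
regular = pd.DataFrame(
    {"value": [1.0, 2.0, 3.0]},
    index=["2024-01-01 00:00", "2024-01-01 01:00", "2024-01-01 02:00"],
)
checked = check_regular_datetime_index(regular)

# A gap in the timestamps is rejected
irregular = pd.DataFrame(
    {"value": [1.0, 2.0, 3.0]},
    index=["2024-01-01 00:00", "2024-01-01 01:00", "2024-01-01 03:00"],
)
try:
    check_regular_datetime_index(irregular)
    irregular_rejected = False
except ValueError:
    irregular_rejected = True
```

Applied to this notebook, the helper would be called as `data = check_regular_datetime_index(data)` right after `pd.read_csv`.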

In [None]:
results = {}  # forecasts, keyed by run label
session_ids = {}  # session IDs, keyed by run label
for features in ["No Features", "With Features"]:
    if features == "With Features":
        _feature_columns = feature_columns
    else:
        _feature_columns = None

    result = predict(
        base_url=base_url,
        auth_key=auth_key,
        data=data.iloc[:-forecasting_horizon, :],  # hold out the last 24 observations as ground truth
        target_columns=target_columns,
        feature_columns=_feature_columns,
        forecasting_horizon=forecasting_horizon,
        observation_length=observation_length,
    )
    
    results[features] = result["prediction"]
    session_ids[features] = result["session_id"]

results["With Features"]

In [None]:
plot(
    historical_data=data.loc[:, target_columns],  # full series, including the 24 held-out observations used as ground truth
    predicted_data=results,
    observation_length=24*5,
)

In [None]:
explain(
    session_id=session_ids["With Features"],
    base_url=base_url,
    auth_key=auth_key,
    historical_data=data.iloc[:-forecasting_horizon, :],  # same history that was passed to predict
    max_features_displayed=10,
)

### Comments on results

We ran two models: one without features and one with features. In this run, the model with features is the more accurate of the two when its forecast is compared with the 24 held-out observations. The explanation plot shows that consumption is one of the main drivers of the prediction. This illustrates the value of domain knowledge when selecting features for forecasting models: in this example, well-chosen features improve both accuracy and interpretability.
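
The accuracy comparison can be made quantitative with an error metric such as the mean absolute error (MAE) over the held-out window. The sketch below assumes each forecast can be expressed as a pandas Series indexed by timestamp; the structure of `result["prediction"]` is not documented here, so some reshaping may be needed. `mean_absolute_error` is a hypothetical helper, and the numbers are placeholders for illustration, not outputs of this notebook.

```python
import pandas as pd


def mean_absolute_error(actual, predicted):
    """MAE over the timestamps that both series share."""
    actual, predicted = actual.align(predicted, join="inner")
    return float((actual - predicted).abs().mean())


# Placeholder values for illustration only (not results from the dataset)
index = pd.date_range("2024-01-01", periods=4, freq="h")
actual = pd.Series([50.0, 52.0, 55.0, 53.0], index=index)
forecasts = {
    "No Features": pd.Series([48.0, 49.0, 51.0, 50.0], index=index),
    "With Features": pd.Series([49.0, 51.0, 54.0, 54.0], index=index),
}

# One score per run; a lower MAE means a more accurate forecast
scores = {name: mean_absolute_error(actual, forecast) for name, forecast in forecasts.items()}
for name, score in scores.items():
    print(f"{name}: MAE = {score:.2f}")
```

In this notebook, `actual` would correspond to the last 24 rows of the target column, and `forecasts` to the entries of `results`.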