## Understanding What Drives the Forecast
In this notebook, we focus on interpretability, i.e., understanding why the model makes its predictions. We introduce a way to measure how much each input factor contributes to the forecast. This helps identify the key drivers behind the predictions and supports informed decision-making.

For this example, we use the German energy day-ahead hourly prediction dataset with two main features:
- Consumption: the actual electricity demand.
- Residual load: the demand not covered by renewable generation.

By examining the relative importance of these features, we can see which factors most influence the forecast at different times, enabling better operational and strategic planning.

### Prerequisites

Let's import the relevant packages and load the credentials that you've inserted in `credentials.txt`.

In [None]:
import sys
import pathlib

import pandas as pd

sys.path.append(pathlib.Path().resolve().parent.as_posix())

from inait import explain, predict, plot, load_credentials

base_url, auth_key = load_credentials("../credentials.txt")

### Load the dataset



In [None]:
# Load the data; the dataset must have a valid datetime index with fixed frequency
data_path = "../data/power_day_ahead.csv"
data = pd.read_csv(data_path, index_col=0)
plot(historical_data=data, observation_length=int(data.shape[0] * 0.5))
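
The fixed-frequency requirement can be checked before any data is sent for prediction. The following is a minimal sketch on synthetic data; `check_fixed_frequency` is a hypothetical helper written for illustration and is not part of `inait`.

```python
import pandas as pd


def check_fixed_frequency(df: pd.DataFrame) -> pd.Timedelta:
    """Return the constant step of a datetime index, or raise if it is irregular."""
    index = pd.to_datetime(df.index)  # parses string timestamps if needed
    if not index.is_monotonic_increasing:
        raise ValueError("index is not sorted in increasing order")
    steps = index.to_series().diff().dropna().unique()
    if len(steps) != 1:
        raise ValueError(f"index has {len(steps)} distinct step sizes")
    return pd.Timedelta(steps[0])


# Synthetic hourly frame standing in for the real dataset
demo = pd.DataFrame(
    {"value": range(48)},
    index=pd.date_range("2024-01-01", periods=48, freq=pd.Timedelta(hours=1)),
)
step = check_fixed_frequency(demo)
print(step)
```

A missing row or a duplicated timestamp produces more than one step size, so the helper raises instead of returning a misleading frequency.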

### Forecasting setup and run
We set the target variable to the price, `DE_Spot_EPEX_1H_A`, and use two features to train the model: `DE_Residual_Load_15M_A_AVG` and `DE_Consumption_15M_A_AVG`.

In [None]:
# Configure prediction parameters
target_columns = [
    "DE_Spot_EPEX_1H_A"
]  # List of target columns to predict in the dataset
feature_columns = [
    "DE_Residual_Load_15M_A_AVG",
    "DE_Consumption_15M_A_AVG",
]  # Optional: List of feature columns to use for prediction
forecasting_horizon = 24  # Predict 24 time steps ahead (1 day)
observation_length = 24  # Use last 24 time steps as historical context (1 day)
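
To make the two window parameters concrete, the sketch below lays them out on a synthetic hourly series. It assumes the model reads the last `observation_length` rows of the history it receives and forecasts the `forecasting_horizon` steps that follow; the frame and the column name `price` are made up for illustration.

```python
import pandas as pd

forecasting_horizon = 24
observation_length = 24

# Synthetic three-day hourly series standing in for the real dataset
step = pd.Timedelta(hours=1)
index = pd.date_range("2024-01-01", periods=72, freq=step)
demo = pd.DataFrame({"price": range(72)}, index=index)

# Most recent window of history (assumed to be what the model conditions on)
context = demo.iloc[-observation_length:]

# Timestamps that the forecast would cover, starting one step after the history
forecast_index = pd.date_range(
    index[-1] + step, periods=forecasting_horizon, freq=step
)

print("context :", context.index[0], "->", context.index[-1])
print("forecast:", forecast_index[0], "->", forecast_index[-1])
```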

In [None]:
results, session_ids = {}, {}
for features in ["Price-only model", "Price + other drivers model"]:
    # The price-only model is trained without any additional feature columns
    if features == "Price + other drivers model":
        _feature_columns = feature_columns
    else:
        _feature_columns = None

    result = predict(
        base_url=base_url,
        auth_key=auth_key,
        data=data.iloc[
            :-forecasting_horizon, :
        ],  # we keep the last 24 observations as ground truth
        target_columns=target_columns,
        feature_columns=_feature_columns,
        forecasting_horizon=forecasting_horizon,
        observation_length=observation_length,
        model="inait-basic",  # Name of the model used for prediction
    )

    results[features] = result["prediction"]
    session_ids[features] = result["session_id"]

results["Price + other drivers model"]

In [None]:
plot(
    historical_data=data.loc[:, target_columns],
    predicted_data=results,
    observation_length=24 * 5,
)
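
A visual comparison can be backed by a number. The sketch below computes the mean absolute error of two forecasts against held-out values. It uses synthetic series throughout: the forecasts are made-up stand-ins, and it assumes each forecast can be turned into a pandas Series indexed by timestamp, which may not match the exact structure returned by `predict`.

```python
import numpy as np
import pandas as pd


def mean_absolute_error(truth: pd.Series, forecast: pd.Series) -> float:
    """Average absolute gap between forecast and truth on their shared index."""
    truth, forecast = truth.align(forecast, join="inner")
    return float((truth - forecast).abs().mean())


index = pd.date_range("2024-01-02", periods=24, freq=pd.Timedelta(hours=1))
truth = pd.Series(np.linspace(50.0, 73.0, 24), index=index)  # synthetic prices

# Hypothetical forecasts, not real model output
forecasts = {
    "Price-only model": truth + 4.0,
    "Price + other drivers model": truth - 1.0,
}
scores = {name: mean_absolute_error(truth, f) for name, f in forecasts.items()}
print(scores)
```

With real data, `truth` would be the last `forecasting_horizon` rows of the target column that were withheld from `predict`.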

Now let's run inait's `explain` function to identify the main drivers behind the predictions. Each forecasted step may have different key drivers (by default, the drivers of the first forecasted step are shown, but you can change this with `forecasted_step`).

In this example, we display the drivers for the prediction at 19:00, a classic peak time in the energy market that is notoriously difficult to forecast accurately. The bars in the plot are ordered by their importance. Each bar is labeled with a variable name (e.g., price, consumption, or residual load) and a time reference in the format `t-1`, `t-2`, …, `t-observation_length`.

This time reference indicates which historical time step of that variable is influential. For example, if `price (t-1)` appears in the top 10, it means that the most recent observation of the price variable is among the most important predictors.
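
The lag labels can be translated back into timestamps. The sketch below assumes labels of the form `name (t-k)`, as in the example above, and that `t-1` is the last observed row; `lag_to_timestamp` is a hypothetical helper, not an `inait` function.

```python
import re

import pandas as pd


def lag_to_timestamp(
    label: str, last_observed: pd.Timestamp, step: pd.Timedelta
) -> pd.Timestamp:
    """Map a label such as 'price (t-3)' to the timestamp of that observation."""
    match = re.search(r"t-(\d+)", label)
    if match is None:
        raise ValueError(f"no lag found in {label!r}")
    lag = int(match.group(1))
    # Assumption: t-1 is the last observed row, so t-k lies k-1 steps before it
    return last_observed - (lag - 1) * step


last_observed = pd.Timestamp("2024-01-01 23:00")
step = pd.Timedelta(hours=1)

most_recent = lag_to_timestamp("price (t-1)", last_observed, step)
oldest = lag_to_timestamp("consumption (t-24)", last_observed, step)
print(most_recent, oldest)
```

With an hourly step and `observation_length = 24`, `t-24` is the oldest row in the context window, 23 hours before the last observation.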

In [None]:
explain(
    session_id=session_ids["Price + other drivers model"],
    base_url=base_url,
    auth_key=auth_key,
    historical_data=data.iloc[
        :-forecasting_horizon, :
    ],  # we keep the last 24 observations as ground truth
    max_drivers_displayed=10,
    forecasted_step=18,  # 19:00 is a classic peak time in the energy market
)
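
This notebook does not describe how `explain` computes its driver scores. To build intuition for what an importance score can mean, the sketch below uses permutation importance, a generic technique that is not necessarily what inait uses. It shuffles one input at a time and measures how much the error grows. The data, the linear stand-in model, and the input names are all synthetic assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Two hypothetical lagged inputs; by construction the first dominates the target
X = rng.normal(size=(n, 2))
y = 3.0 * X[:, 0] + 0.1 * X[:, 1] + rng.normal(scale=0.1, size=n)

# Ordinary least squares as a simple stand-in for a forecasting model
coef, *_ = np.linalg.lstsq(X, y, rcond=None)


def mse(features: np.ndarray) -> float:
    """Mean squared error of the fitted linear model on the given inputs."""
    return float(np.mean((features @ coef - y) ** 2))


baseline = mse(X)
importance = {}
for j, name in enumerate(["consumption (t-1)", "residual load (t-1)"]):
    shuffled = X.copy()
    shuffled[:, j] = rng.permutation(shuffled[:, j])  # break the link to y
    importance[name] = mse(shuffled) - baseline  # error increase when shuffled

# Rank inputs from most to least important, as a bar plot of drivers would
for name, score in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.3f}")
```

An input whose shuffling barely changes the error contributes little to the prediction, which is the same reading one applies to the short bars in a driver plot.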

### Comments on results

We ran two models: one that uses only the price history and one that also uses the two features. In this run, the model with features is the more accurate of the two. The explanation plot shows that consumption is one of the main drivers of the prediction, which illustrates the value of domain knowledge when selecting features for forecasting models. These results suggest that well-chosen features can substantially improve both model performance and interpretability, although a single 24-hour holdout is limited evidence.