## Prediction intervals for demand forecasting 

In this notebook, we introduce *prediction intervals* for the first time. Unlike a point forecast, which gives a single best estimate of the future value, a prediction interval gives a range of plausible values based on past data. For example, with an 80% prediction interval, if past patterns continue to hold, the actual value should fall within the bounds about 8 times out of 10.
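
To make the idea of an 80% range concrete, here is a minimal sketch in plain Python that takes the central 80% of a short series of past values. The function name `empirical_interval` and the sales figures are hypothetical and used for illustration only; this is not how `inait_forecasting_client` computes its intervals, which come from the forecasting model.

```python
from statistics import quantiles

def empirical_interval(values, level=80):
    """Return (lower, upper) bounds of the central `level`% of `values`.

    Assumes `level` is an even integer percentage below 100, such as 80 or 90.
    """
    tail = (100 - level) // 2  # percent left out on each side, e.g. 10 for level=80
    cuts = quantiles(values, n=100, method="inclusive")  # percentiles 1..99
    return cuts[tail - 1], cuts[100 - tail - 1]

# Hypothetical daily unit sales for one product (illustrative values only)
past_sales = [3, 5, 4, 6, 2, 7, 5, 4, 8, 3, 5, 6, 4, 5, 9, 2, 4, 6, 5, 3]

lower, upper = empirical_interval(past_sales, level=80)
share_inside = sum(lower <= v <= upper for v in past_sales) / len(past_sales)
print(f"80% interval: [{lower:.1f}, {upper:.1f}], share inside: {share_inside:.0%}")
```

By construction, about 80% of the past values fall between the two bounds. A forecasting model aims for the same property on future values, which is harder and is what the coverage check later in this notebook evaluates.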

This is important because it turns the forecast into a **decision-making tool**. For example, in a retail setting:
- If the priority is to minimize stock surplus, decisions can be guided closer to the lower bound.
- If the priority is to reduce the risk of stockouts, the upper bound will be more relevant.
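
As a rough illustration of the two priorities above, the sketch below interpolates between the interval bounds according to how strongly stockouts should be avoided. The function `planning_quantity`, its `stockout_aversion` parameter, and the bound values are hypothetical; a real replenishment policy would also account for costs, lead times, and current inventory.

```python
def planning_quantity(lower, upper, stockout_aversion):
    """Interpolate between the interval bounds.

    stockout_aversion=0.0 plans at the lower bound (minimize surplus),
    stockout_aversion=1.0 plans at the upper bound (avoid stockouts).
    """
    if not 0.0 <= stockout_aversion <= 1.0:
        raise ValueError("stockout_aversion must be between 0 and 1")
    return lower + stockout_aversion * (upper - lower)

# Hypothetical interval bounds for one product over the planning period
lower_bound, upper_bound = 12.0, 30.0

lean_plan = planning_quantity(lower_bound, upper_bound, 0.0)    # favors low surplus
safe_plan = planning_quantity(lower_bound, upper_bound, 1.0)    # favors availability
middle_plan = planning_quantity(lower_bound, upper_bound, 0.5)  # compromise
```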

For demonstration, we use the well-known M5 dataset (source: Makridakis, Spyros, et al., ["M5 accuracy competition: Results, findings, and conclusions"](https://www.sciencedirect.com/science/article/pii/S0169207021001874), International Journal of Forecasting, 2022), which contains daily sales data for thousands of products sold in Walmart stores across the U.S. We work with only a subset of products for a single store to speed up computation, but the same approach scales to larger sets.

### Prerequisites

Let's import the relevant packages.

In [None]:
from inait_forecasting_client import predict, plot, read_file, check_coverage

### Load the dataset 

Here, we load the 3,049 products sold in one of the stores from the M5 dataset. To speed up computation, we predict only a random subset of 6 time series. Training on the full set of products would require running the computation on a cluster (e.g., Azure ML Platform) and could take a couple of hours.

In [None]:
# Load data
data_path = "../data/M5_store_CA_1.csv"
data = read_file(data_path, index_col=0)

selected_nb_columns = 6  # Number of products to forecast; larger values cover more products but increase computation time
data_sample = data.sample(n=selected_nb_columns, axis=1, random_state=20250).sort_index(
    axis=1
)  # Try to change the random_state to see different products
target_columns = list(data_sample.columns)  # List of target columns to predict

# Configure prediction parameters
forecasting_horizon = 28  # Predict 28 days ahead (4 weeks)
observation_length = 28 * 2  # Use last 56 days as historical context (8 weeks)
prediction_interval_level = 80  # Prediction interval level in percent (here, an 80% prediction interval)
positive_predictions_only = True  # Ensure all predictions are non-negative (in this case sales cannot be negative)

In [None]:
plot(
    historical_data=data_sample,
    observation_length=len(data_sample),
)

### Predict and plot results

In [None]:
results = predict(
    data=data_sample,
    historical_data_start_date=data_sample.index[
        200
    ],  # You can use a subset of the data by specifying a start date; this might speed up the computation at the expense of losing some historical context for training
    historical_data_end_date=data_sample.index[
        -forecasting_horizon
    ],  # Keep the last `forecasting_horizon` points as ground truth
    target_columns=target_columns,
    forecasting_horizon=forecasting_horizon,
    observation_length=observation_length,
    prediction_interval_level=prediction_interval_level,
    # target_transformer='NO_TARGET_TRANSFORMER',
    positive_predictions_only=positive_predictions_only,
)["prediction"]

results

In [None]:
plot(
    historical_data=data_sample,
    predicted_data=results,
    observation_length=observation_length
    * 3,  # change this to see more or less historical data; less historical data means more zoomed-in plot
)

### How well do our prediction intervals capture reality?

Now we will check how often the actual sales values fall inside the predicted interval for each product and time point. For example, with an 80% prediction interval, we expect about 8 out of 10 actual values to be within the predicted range. This helps us see if our uncertainty estimates are well-calibrated: if the percentage is much lower, our intervals may be too narrow; if much higher, they may be too wide. This is a practical way to validate whether our forecasts provide reliable guidance for decision-making.

In [None]:
check_coverage(
    historical_data=data_sample,
    prediction_data=results,
    prediction_interval_level=prediction_interval_level,
)

We observe that about 67% of the actual sales fall within the 80% prediction interval. This is below the nominal 80%, which suggests that the intervals are somewhat too narrow for this sample. Since we are working with only 6 products and 28 time points each, some deviation from the nominal level is expected, so treat this result as a rough check rather than a definitive verdict on calibration.
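
The idea behind a coverage check can be sketched in a few lines of plain Python. This is not the implementation of `check_coverage`, whose internals are not shown in this notebook; the function `coverage` and all values below are hypothetical and only illustrate the calculation.

```python
def coverage(actuals, lowers, uppers):
    """Fraction of actual values that fall inside [lower, upper]."""
    hits = [lo <= y <= hi for y, lo, hi in zip(actuals, lowers, uppers)]
    return sum(hits) / len(hits)

# Hypothetical actual sales and interval bounds for 10 days of one product
actual = [4, 0, 7, 3, 9, 2, 5, 1, 6, 3]
lower = [1, 0, 2, 1, 3, 0, 2, 0, 2, 1]
upper = [6, 3, 6, 5, 8, 4, 7, 4, 8, 5]

empirical_coverage = coverage(actual, lower, upper)
print(f"Empirical coverage: {empirical_coverage:.0%}")
```

Comparing this fraction with the nominal level (80% in this notebook) shows whether the intervals tend to be too narrow or too wide.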

### Now it's your turn:

This notebook shows how our forecasting can not only predict future demand but also quantify uncertainty, giving decision-makers a clear range to plan for.

Try adjusting the prediction interval level; higher levels produce wider bands that contain the actual values more often, at the cost of being less precise. You can also change the list of products to predict and see how the model behaves on a different set, or experiment with more advanced models by following the example in Notebook 1.
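
The trade-off between level and width can be illustrated under a simplifying assumption: if forecast errors were normally distributed, the interval would be the point forecast plus or minus a multiple of the error's standard deviation. The sketch below uses that assumption only for illustration; the function `normal_interval` and the numbers are hypothetical, and the forecasting model in this notebook does not necessarily build its intervals this way.

```python
from statistics import NormalDist

def normal_interval(mean, std, level):
    """Central prediction interval assuming normally distributed forecast errors."""
    z = NormalDist().inv_cdf(0.5 + level / 200)  # e.g. level=80 -> 0.90 quantile
    return mean - z * std, mean + z * std

# Hypothetical point forecast of 20 units with an error standard deviation of 5
for level in (50, 80, 95):
    low, high = normal_interval(mean=20.0, std=5.0, level=level)
    print(f"{level}% interval: [{low:.1f}, {high:.1f}], width {high - low:.1f}")
```

The width grows with the level, which is why a very high level gives bounds that are rarely violated but may be too loose to guide decisions.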

### Possible extensions for this Notebook:

1. **Introduce hierarchical forecasting**: Compare two approaches:
    - Predict all product-level time series in a department and sum the forecasts.
    - Predict the department's total demand directly.

    This choice often depends on business priorities, but ensuring forecasts are coherent across levels is key.

2. **Add what-if scenarios**: Include selling price variations as explanatory variables to explore questions like: *What happens to demand if we raise prices by 5%?*

3. **Demonstrate the value of additional explanatory variables**: Show how forecast accuracy improves when we include extra explanatory variables. In future releases, we plan to offer a built-in explanatory variable engineering tool to make this process seamless.

4. **Handle intermittent demand patterns**: Retail demand often has long periods of zero sales. A future release could include models designed specifically for these challenging patterns.
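
For the hierarchical forecasting extension, the coherence question can be sketched as follows: sum the product-level point forecasts (bottom-up) and compare them with a direct forecast of the department total. All names and numbers below are hypothetical, and the sketch covers point forecasts only. Interval bounds cannot in general be summed in the same way, because the sum of the lower bounds is not the lower bound of the total at the same level.

```python
# Hypothetical 3-day point forecasts for three products in one department
product_forecasts = {
    "product_a": [4.0, 5.0, 3.0],
    "product_b": [1.0, 0.0, 2.0],
    "product_c": [7.0, 6.0, 8.0],
}

# Hypothetical direct forecast of the department's total demand
department_forecast = [12.5, 10.0, 13.0]

# Bottom-up: add up the product-level forecasts day by day
bottom_up = [sum(day) for day in zip(*product_forecasts.values())]

# Gap between the two approaches; non-zero values mean the forecasts are not coherent
gaps = [direct - summed for direct, summed in zip(department_forecast, bottom_up)]
print("Bottom-up total:", bottom_up)
print("Direct minus bottom-up:", gaps)
```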