# Nixtla Tutorial
The Flash team is excited to share a short tutorial on Nixtla.
Before starting, we recommend reading this [README](README.md) to get familiar with Nixtla and its pros and cons.

With that said, let's work through a small example that uses a hierarchical dataset to forecast quarterly Australian tourism visits.

To do so, we'll explore the following features in Nixtla:

1. Define statistical models using Nixtla's StatsForecast package
2. Reconcile and evaluate the base predictions using the HierarchicalForecast package


## Import libraries

In [None]:
import pandas as pd 
import numpy as np
from datasetsforecast.hierarchical import HierarchicalData
from statsforecast import StatsForecast
import random

random.seed(0)

## Import data

In this example we will use the TourismSmall dataset.
The following cell loads:
1. `df`: the time series for the different levels in the hierarchy,
2. `S`: the summing matrix, which recovers every series in the hierarchy from the bottom-level series, and
3. `tags`: a mapping from each hierarchy level to the series that belong to it.
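
To make the role of the summing matrix concrete, here is a minimal sketch on a hypothetical two-level hierarchy (`Total = A + B`). The names `S_toy`, `y_bottom` and the numbers are made up for illustration and are not part of TourismSmall; the point is only that multiplying `S` by the bottom-level series yields every series in the hierarchy.

```python
import numpy as np

# Hypothetical hierarchy: Total = A + B.
# Rows are all series (Total, A, B); columns are the bottom-level series (A, B).
S_toy = np.array([
    [1, 1],  # Total = A + B
    [1, 0],  # A
    [0, 1],  # B
])

# Made-up bottom-level observations; each column is one quarter.
y_bottom = np.array([
    [10.0, 12.0],  # A
    [5.0, 7.0],    # B
])

# Every level of the hierarchy is recovered from the bottom level.
y_all = S_toy @ y_bottom
print(y_all)
```

The first row of `y_all` is the aggregated series, and the remaining rows reproduce the bottom-level series unchanged.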

In [None]:
df, S, tags = HierarchicalData.load(directory='data', group='TourismSmall')

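df['ds'] = pd.to_datetime(df['ds'])
# Hold out the last 4 quarters of every series as the test set
test_df  = df.groupby('unique_id').tail(4).sort_values(by='ds')
# Use all remaining observations for training
train_df = df.drop(test_df.index).sort_values(by='ds')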

When using your own dataset, you need to follow these column naming conventions:
- `unique_id` for the time series identifier,
- `ds` for the date,
- `y` for the target variable.
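
As an illustration of these conventions, the sketch below reshapes a hypothetical wide table (one column per series) into the long format with `unique_id`, `ds` and `y` columns. The table, its column names and its values are invented for the example; your own data will likely need a different reshaping step.

```python
import pandas as pd

# Hypothetical wide table: one row per quarter, one column per series.
wide = pd.DataFrame({
    "date": pd.to_datetime(["2020-03-31", "2020-06-30", "2020-09-30"]),
    "sydney": [100.0, 110.0, 120.0],
    "perth": [40.0, 42.0, 45.0],
})

# Reshape to long format: one row per (series, date) pair.
long_df = (
    wide.melt(id_vars="date", var_name="unique_id", value_name="y")
        .rename(columns={"date": "ds"})
        [["unique_id", "ds", "y"]]
        .sort_values(["unique_id", "ds"])
        .reset_index(drop=True)
)
print(long_df)
```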

In [None]:
StatsForecast.plot(train_df, engine='plotly')

## Model definition with StatsForecast

We define the following models:

- Historical Average: The arithmetic mean of the observed values, used as the forecast for every future step.
- AutoARIMA: An implementation of ARIMA (Autoregressive Integrated Moving Average) that automatically selects the model parameters for a given time series.
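
To show what the simpler of the two models does, here is a minimal sketch of a historical-average forecast written by hand. The helper `historic_average_forecast` is hypothetical and is not the StatsForecast implementation; it only illustrates the idea of repeating the training mean over the horizon.

```python
import numpy as np

def historic_average_forecast(y, h):
    """Repeat the mean of the training values for each of the h future steps."""
    return np.full(h, np.mean(y))

# Made-up training values for a single series.
y_train = np.array([10.0, 12.0, 11.0, 13.0])
print(historic_average_forecast(y_train, h=4))
```

Because the forecast ignores trend and seasonality, it mainly serves as a baseline against which richer models such as AutoARIMA can be compared.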

In [None]:
from statsforecast.models import (
    HistoricAverage,
    AutoARIMA
    )

HORIZON = 4

In [None]:
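models = [
    HistoricAverage(),
    AutoARIMA(season_length=4)  # quarterly data: 4 observations per year
]

wrapper_models = StatsForecast( 
    models=models,
    freq='Q',                         # quarterly frequency
    n_jobs=-1,                        # use all available cores
    fallback_model=HistoricAverage()  # used if a model fails to fit
)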

In [None]:
forecasts_df = wrapper_models.forecast(df=train_df, h=HORIZON)

In [None]:
StatsForecast.plot(train_df, forecasts_df=forecasts_df, models=["AutoARIMA", "HistoricAverage"], plot_random=False)

## Prediction reconciliation

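Large collections of time series organized at different aggregation levels often require forecasts that respect the aggregation constraints: for example, the forecasts for all states should add up to the country-level forecast. Forecasts that satisfy these constraints are called coherent. Base forecasts produced independently for each series are generally not coherent, so reconciliation methods adjust them until they are.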

**HierarchicalForecast** offers a collection of reconciliation methods, including:
- BottomUp: Sums the bottom-level forecasts up to the upper levels.
- TopDown: Distributes the top-level forecast down through the hierarchy.
- MiddleOut: Anchors the base predictions at a middle level. The levels above it are reconciled bottom-up, while the levels below it are reconciled top-down.

The full list of reconciliation methods is available [here](https://nixtlaverse.nixtla.io/hierarchicalforecast/index.html).
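
The bottom-up and top-down ideas can be sketched by hand on a hypothetical two-level hierarchy (`Total = A + B`). This is a simplified illustration with made-up base forecasts, not the HierarchicalForecast implementation; the top-down variant shown splits the top-level forecast according to the shares of the bottom-level base forecasts, which is what forecast proportions reduce to when there are only two levels.

```python
import numpy as np

# Hypothetical hierarchy: rows are (Total, A, B), columns are bottom series (A, B).
S_toy = np.array([
    [1, 1],
    [1, 0],
    [0, 1],
])

# Made-up base forecasts for one step, ordered as (Total, A, B).
# They are not coherent: 18 != 10 + 5.
y_hat = np.array([18.0, 10.0, 5.0])

# Bottom-up: keep the bottom-level forecasts and re-aggregate them.
bottom_up = S_toy @ y_hat[1:]

# Top-down: keep the top-level forecast and split it by the bottom-level shares.
shares = y_hat[1:] / y_hat[1:].sum()
top_down = S_toy @ (y_hat[0] * shares)

print("bottom-up:", bottom_up)
print("top-down: ", top_down)
```

In both cases the reconciled total equals the sum of the reconciled bottom-level forecasts; the methods differ only in which level they trust.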

In [None]:
from hierarchicalforecast.methods import BottomUp, TopDown, MiddleOut
from hierarchicalforecast.core import HierarchicalReconciliation
from hierarchicalforecast.evaluation import HierarchicalEvaluation

In [None]:
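# The training data must be passed explicitly, since it was not given to the StatsForecast constructor
predictions = wrapper_models.forecast(df=train_df, h=HORIZON)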

In [None]:
reconcilers = [
    BottomUp(),
    TopDown(method='forecast_proportions'),
    MiddleOut(middle_level='Country/Purpose/State',
              top_down_method='forecast_proportions')
]

hrec = HierarchicalReconciliation(reconcilers=reconcilers)
reconciled_predictions = hrec.reconcile(Y_hat_df=predictions, Y_df=train_df, S=S, tags=tags)

In [None]:
def mse(y, y_hat):
    return np.mean((y-y_hat)**2)

evaluator = HierarchicalEvaluation(evaluators=[mse])
evaluation = evaluator.evaluate(
        Y_hat_df=reconciled_predictions, Y_test_df=test_df.set_index('unique_id'),
        tags=tags, benchmark='HistoricAverage'
)
evaluation.filter(like='ARIMA', axis=1).T