# TODOs
* Graphical or Graph pipeline?


# 2 Advanced Forecasting Pipelines
This notebook covers pipelining and tuning (grid search) for time series forecasting with `sktime`.

In the previous notebook, we considered
* sequential pipelines
* tuning their hyperparameters

However, pipelines have further aspects worth covering. Besides the hyperparameters of the objects in a pipeline, the structure of the pipeline itself can be treated as a hyperparameter.
Furthermore, pipelines need not be sequential; they can also be non-sequential (graphical pipelines).
Pipelines also require specific diagnostic methods and a way to persist them.

This notebook covers these topics.




In [1]:
import warnings
import numpy as np

warnings.filterwarnings("ignore")

## 2.1 Tuning the Pipeline's Structure (AutoML)
In the previous notebook, we performed a grid search to find the best hyperparameters. However, the structure of the pipeline is also a hyperparameter. To tune it, `sktime` provides
* `MultiplexForecaster`
* `OptionalPassthrough`
* ...

These are explained below, followed by an AutoML example.


### 2.1.1 Tuning Components: Model Selection

Choices about pipeline structure influence performance.

`sktime` exposes these choices via structural compositors:

* switch between transformers/forecasters: `MultiplexTransformer`, `MultiplexForecaster`
* transformer on/off: `OptionalPassthrough`
* order of transformers: `Permute`

These can be combined with pipelines and `FeatureUnion` to build a rich space of structures.

#### `MultiplexForecaster`
We can use the `MultiplexForecaster` to compare the performance of different forecasters. This is useful, for example, when comparing forecasters that have already been tuned and fitted separately. The `MultiplexForecaster` is a forecaster composition that provides a parameter `selected_forecaster: str`, which can be tuned with a grid search over the forecaster names. The other parameters of the forecasters are not tuned.


In [28]:
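from sktime.forecasting.compose import MultiplexForecaster
from sktime.forecasting.model_selection import ForecastingGridSearchCV
from sktime.forecasting.naive import NaiveForecaster
from sktime.forecasting.theta import ThetaForecaster
from sktime.forecasting.trend import STLForecaster

# `y` (the series) and `cv` (the splitter) are assumed to be defined in earlier cells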

forecaster = MultiplexForecaster(
    forecasters=[
        ("naive", NaiveForecaster()),
        ("stl", STLForecaster()),
        ("theta", ThetaForecaster()),
    ]
)
gscv = ForecastingGridSearchCV(
    forecaster=forecaster,
    param_grid={"selected_forecaster": ["naive", "stl", "theta"]},
    cv=cv,
    n_jobs=-1,
)
gscv.fit(y)
gscv.best_params_

{'selected_forecaster': 'theta'}
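
Conceptually, tuning `selected_forecaster` amounts to scoring each named candidate on the same cross-validation folds and keeping the name with the lowest error. The following is a minimal pure-Python sketch of that idea, not `sktime`'s implementation: the toy forecasters, the `expanding_window_score` helper, and the series values are all hypothetical and chosen only for illustration.

```python
from statistics import mean

# Hypothetical toy forecasters: each maps a training history to a one-step forecast.
candidates = {
    # repeat the last observation
    "last": lambda history: history[-1],
    # average of all past values
    "mean": lambda history: mean(history),
    # extend the straight line through the first and last observation
    "drift": lambda history: history[-1]
    + (history[-1] - history[0]) / (len(history) - 1),
}


def expanding_window_score(forecast, series, initial_window=3):
    """Mean absolute one-step-ahead error over expanding training windows."""
    errors = []
    for end in range(initial_window, len(series)):
        prediction = forecast(series[:end])
        errors.append(abs(series[end] - prediction))
    return mean(errors)


# Hypothetical series with a steady upward trend.
series = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 22.0]

# Score every named candidate on the same folds and keep the best name.
scores = {name: expanding_window_score(f, series) for name, f in candidates.items()}
best_name = min(scores, key=scores.get)

print(scores)
print({"selected_forecaster": best_name})
```

As in the grid search above, only the name of the candidate is selected; the candidates' own parameters stay fixed.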

#### Optional Passthrough

`sktime` provides a transformer composition called `OptionalPassthrough`, which takes a transformer as an argument together with a parameter `passthrough: bool`. Setting `passthrough=True` returns the data unchanged (identity transformation). Setting `passthrough=False` applies the inner transformer to the data.

In [32]:
from sktime.transformations.series.compose import OptionalPassthrough
from sktime.transformations.series.detrend import Detrender

transformer = OptionalPassthrough(transformer=Detrender(), passthrough=True)
transformer.fit_transform(y_train).head()

1991-01    266.0
1991-02    145.9
1991-03    183.1
1991-04    119.3
1991-05    180.3
Freq: M, Name: Number of shampoo sales, dtype: float64

In [33]:
y_train.head()

1991-01    266.0
1991-02    145.9
1991-03    183.1
1991-04    119.3
1991-05    180.3
Freq: M, Name: Number of shampoo sales, dtype: float64

In [34]:
transformer = OptionalPassthrough(transformer=Detrender(), passthrough=False)
transformer.fit_transform(y_train).head()

1991-01    130.376344
1991-02      1.503263
1991-03     29.930182
1991-04    -42.642900
1991-05      9.584019
Freq: M, dtype: float64
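
The on/off behaviour shown above can be sketched in plain Python. This is only a toy illustration of the semantics, assuming a hypothetical `OptionalStep` wrapper and a hypothetical `remove_mean` inner transform; it is not the `sktime` class and it does not reproduce the `Detrender` output.

```python
class OptionalStep:
    """Toy stand-in for an on/off transformer wrapper (not the sktime class)."""

    def __init__(self, transform, passthrough=False):
        self.transform = transform
        self.passthrough = passthrough

    def fit_transform(self, values):
        # passthrough=True: return the data unchanged (identity)
        if self.passthrough:
            return list(values)
        # passthrough=False: apply the wrapped transform
        return self.transform(values)


def remove_mean(values):
    """Hypothetical inner transform: subtract the sample mean."""
    centre = sum(values) / len(values)
    return [v - centre for v in values]


# First five values of the series, as printed in the cells above.
y_head = [266.0, 145.9, 183.1, 119.3, 180.3]

unchanged = OptionalStep(remove_mean, passthrough=True).fit_transform(y_head)
centred = OptionalStep(remove_mean, passthrough=False).fit_transform(y_head)

print(unchanged)
print(centred)
```

Because `passthrough` is an ordinary boolean parameter, a grid containing `[True, False]` for it lets the search decide whether the step is applied at all.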

#### Permutation of transformers

Given a set of four different transformers, we would like to know which permutation (ordering) of the four transformers yields the lowest error. In total there are `4! = 24` different permutations. We can use `ForecastingGridSearchCV` to find the best permutation among a set of candidates; the example below compares four of them.

In [38]:
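from sklearn.preprocessing import PowerTransformer, RobustScaler
from sktime.forecasting.compose import Permute, TransformedTargetForecaster
from sktime.forecasting.exp_smoothing import ExponentialSmoothing
from sktime.performance_metrics.forecasting import MeanSquaredError
from sktime.transformations.series.adapt import TabularToSeriesAdaptor
from sktime.transformations.series.detrend import Deseasonalizer, Detrender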

In [39]:
forecaster = TransformedTargetForecaster(
    steps=[
        ("detrender", Detrender()),
        ("deseasonalizer", Deseasonalizer()),
        ("power", TabularToSeriesAdaptor(PowerTransformer())),
        ("scaler", TabularToSeriesAdaptor(RobustScaler())),
        ("forecaster", ExponentialSmoothing()),
    ]
)

param_grid = {
    "permutation": [
        ["detrender", "deseasonalizer", "power", "scaler", "forecaster"],
        ["power", "scaler", "detrender", "deseasonalizer", "forecaster"],
        ["scaler", "deseasonalizer", "power", "detrender", "forecaster"],
        ["deseasonalizer", "power", "scaler", "detrender", "forecaster"],
    ]
}
permuted = Permute(estimator=forecaster, permutation=None)

gscv = ForecastingGridSearchCV(
    forecaster=permuted,
    param_grid=param_grid,
    cv=cv,
    n_jobs=-1,
    verbose=1,
    scoring=MeanSquaredError(square_root=True),
)

permuted = gscv.fit(y, fh=fh)

Fitting 19 folds for each of 4 candidates, totalling 76 fits


In [40]:
permuted.cv_results_

| | mean_test_MeanSquaredError | mean_fit_time | mean_pred_time | params | rank_test_MeanSquaredError |
|---|---|---|---|---|---|
| 0 | 104.529303 | 0.119823 | 0.032105 | {'permutation': ['detrender', 'deseasonalizer'... | 4.0 |
| 1 | 95.172818 | 0.11884 | 0.033758 | {'permutation': ['power', 'scaler', 'detrender... | 1.5 |
| 2 | 95.243942 | 0.114363 | 0.033743 | {'permutation': ['scaler', 'deseasonalizer', '... | 3.0 |
| 3 | 95.172818 | 0.119377 | 0.030739 | {'permutation': ['deseasonalizer', 'power', 's... | 1.5 |


In [41]:
gscv.best_params_

{'permutation': ['power',
  'scaler',
  'detrender',
  'deseasonalizer',
  'forecaster']}

In [42]:
# parameters of the worst candidate: sort by error (ascending) and take the last row
gscv.cv_results_.sort_values(by="mean_test_MeanSquaredError", ascending=True).iloc[-1][
    "params"
]

{'permutation': ['detrender',
  'deseasonalizer',
  'power',
  'scaler',
  'forecaster']}
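
The grid above lists only four of the 24 possible orderings. As a rough sketch, the full set could be generated with `itertools.permutations`, keeping the forecaster fixed as the last step. The variable names `transformer_names`, `all_orderings`, and `param_grid_full` are introduced here for illustration only.

```python
from itertools import permutations

# Step names of the four transformers, as used in the pipeline above.
transformer_names = ["detrender", "deseasonalizer", "power", "scaler"]

# Every ordering of the four transformers, with the forecaster always last.
all_orderings = [
    list(order) + ["forecaster"] for order in permutations(transformer_names)
]

# Same layout as the `param_grid` above, but covering all orderings.
param_grid_full = {"permutation": all_orderings}

print(len(all_orderings))  # 24
print(all_orderings[0])
```

Searching all 24 orderings multiplies the cost accordingly: with the 19 folds reported above, that would be 24 × 19 = 456 fits instead of 76.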

### 2.1.2 AutoML and Forecasting

Taking all the ingredients from the examples above, we can build a forecaster that comes close to what is usually called [AutoML](https://en.wikipedia.org/wiki/Automated_machine_learning).
With AutoML, we aim to automate as many steps of creating an ML model as possible. The main `sktime` compositors we can use for this are:
- `TransformedTargetForecaster`
- `ForecastingPipeline`
- `ForecastingGridSearchCV`
- `OptionalPassthrough`
- `Permute`

### Univariate example
Please see the appendix section for an example with exogenous data.

In [43]:
pipe_y = TransformedTargetForecaster(
    steps=[
        ("detrender", OptionalPassthrough(Detrender())),
        ("deseasonalizer", OptionalPassthrough(Deseasonalizer())),
        ("scaler", OptionalPassthrough(TabularToSeriesAdaptor(RobustScaler()))),
        ("forecaster", STLForecaster()),
    ]
)
permuted_y = Permute(estimator=pipe_y, permutation=None)

param_grid = {
    "permutation": [
        ["detrender", "deseasonalizer", "scaler", "forecaster"],
        ["scaler", "deseasonalizer", "detrender", "forecaster"],
    ],
    "estimator__detrender__passthrough": [True, False],
    "estimator__deseasonalizer__passthrough": [True, False],
    "estimator__scaler__passthrough": [True, False],
    "estimator__scaler__transformer__transformer__with_scaling": [True, False],
    "estimator__scaler__transformer__transformer__with_centering": [True, False],
    "estimator__forecaster__sp": [4, 8, 12],
}

gscv = ForecastingGridSearchCV(
    forecaster=permuted_y,
    param_grid=param_grid,
    cv=cv,
    n_jobs=-1,
    verbose=1,
    scoring=MeanSquaredError(square_root=True),
    error_score="raise",
)

gscv.fit(y=y_train, fh=fh)

Fitting 13 folds for each of 192 candidates, totalling 2496 fits
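
The candidate count in the log line follows from the Cartesian product of all value lists in `param_grid`. The sketch below reproduces that arithmetic with the standard library only; it copies the grid from the cell above and takes the fold count of 13 from the log rather than computing it.

```python
from itertools import product

# Copy of the grid from the cell above.
param_grid = {
    "permutation": [
        ["detrender", "deseasonalizer", "scaler", "forecaster"],
        ["scaler", "deseasonalizer", "detrender", "forecaster"],
    ],
    "estimator__detrender__passthrough": [True, False],
    "estimator__deseasonalizer__passthrough": [True, False],
    "estimator__scaler__passthrough": [True, False],
    "estimator__scaler__transformer__transformer__with_scaling": [True, False],
    "estimator__scaler__transformer__transformer__with_centering": [True, False],
    "estimator__forecaster__sp": [4, 8, 12],
}

# One candidate per combination of values: 2 * 2 * 2 * 2 * 2 * 2 * 3.
candidates = [
    dict(zip(param_grid, combo)) for combo in product(*param_grid.values())
]

n_folds = 13  # taken from the log line above
print(len(candidates))  # 192
print(len(candidates) * n_folds)  # 2496
```

Whenever `estimator__scaler__passthrough` is `True`, the two `RobustScaler` settings have no effect, so some of these candidates should be functionally identical; the exhaustive grid evaluates them anyway.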


In [44]:
gscv.cv_results_["mean_test_MeanSquaredError"].min()

83.44136911735615

In [45]:
gscv.cv_results_["mean_test_MeanSquaredError"].max()

125.54124792614672

## 3.2 Graphical Pipeline Introduction here :)

* Motivation: avoid nesting
* Facilitate the building of complex pipelines


#### Definition what is a graphical Pipeline

#### Potential Use-Cases

#### Stuff from the Graphical Pipeline Notebook.
### 3.2.2 Tuning graphical pipelines


## 3.3 Pipeline Diagnostics

## 3.4 Persisting Pipelines

---

### Credits: notebook 2 - pipelines

notebook creation: aiwalter

forecaster pipelines: fkiraly, aiwalter\
transformer pipelines & compositors: fkiraly, mloning, miraep8\
dunder interface: fkiraly, miraep8\

tuning, autoML: mloning, fkiraly, aiwalter\
CV and splitters: mloning, kkoralturk, khrapovs\
forecasting metrics: mloning, aiwalter, rnkuhns, fkiraly\
backtesting, evaluation: aiwalter, mloning, fkiraly, topher-lo