**Set-up instructions:** this notebook gives a tutorial on transformers in `sktime`.
On binder, this should run out-of-the-box.

To run this notebook as intended, ensure that `sktime` with basic dependency requirements is installed in your Python environment.

To run this notebook with a local development version of sktime, either uncomment and run the below, or `pip install -e` a local clone of the `sktime` `main` branch.

In [None]:
# import sys
# sys.path.append("..")

# Using sktime transformers

Transformers are modularized data processing steps commonly used in machine learning. `sktime` introduces transformer types related to time series, which can be used in pipeline constructs across learning tasks. For instance, the same time series feature extractor could be used in a forecasting pipeline, or a time series classification pipeline.

Note: the term "transformer" is used in the sense of `scikit-learn`. This should be distinguished from the "transformer" that refers to a specific neural network architecture.

`sktime` provides common interfaces to different types of transformations for stand-alone or composition use. Like all `sktime` estimators, transformers expose parameters which can be tuned inside pipelines and composites.

**Section 1** provides an overview of the different kinds of transformers in `sktime`.

**Section 2** discusses in more detail time series and panel transformers, and common uses in pipelines.

**Section 3** discusses in more detail pairwise transformers, including time series distances and time series kernels.

**Section 4** gives an introduction to how to write custom transformers compliant with the `sktime` interface.

## Table of Contents

* [1. Overview of transformers in `sktime`](#chapter1)
    * [1.1 Data container format](#section_1_1)
    * [1.2 General transformer signature - simple transformers](#section_1_2)
* [2. Forecasters in `sktime` - main families](#chapter2)
* [4. Extension guide - implementing your own transformer](#chapter4)
* [5. Summary](#chapter5)

#### package imports

In [None]:
import numpy as np
import pandas as pd

## 1. Overview of transformers in `sktime` <a class="anchor" id="chapter1"></a>

This section explains the different types of transformers found in `sktime`.

There are four main types of transformation in `sktime`:

* transforming a series/sequence into scalar- or category-valued features. Examples: `tsfresh`, or extracting `mean` and `variance` overall.
* transforming a series into another series. Examples: detrending, smoothing, filtering, lagging.
* transforming a panel into another panel. Examples: principal component projection; applying individual series transformation to all series in the panel.
* transforming a pair of series into a scalar value. Examples: dynamic time warping distance between series/sequences; generalized alignment kernel between series/sequences.

Notably, the first three (series to primitive features, series to series, panel to panel) are covered by the same base class template and module.
Since kernels and distances for time series and sequences have the same mathematical signature and differ only in mathematical properties (e.g., definiteness assumptions), they are covered by the more abstract scientific type of "pairwise transformer".

Below, we give an overview in sub-sections:
* reviewing common data container formats for series and panels
* showcasing the signature of transformers that transform a series or panel
* showcasing the signature of transformers that transform a pair of series to a scalar, e.g., distances or kernels
* how to search `sktime` for transformers of a certain type

### 1.1 Data container format<a class="anchor" id="section_1_1"></a>

`sktime` transformers apply to individual time series and panels (= collections of time series).
This is formalized as abstract "scientific types" `Series` and `Panel`, with multiple possible in-memory representations, so-called "mtypes".

For the purpose of this tutorial, we will be working with the most common mtypes. For more details and formal data type specifications, see the "datatypes and datasets" tutorial.

`Series` are commonly represented as:

* `pandas.Series` for univariate time series and sequences
* `pandas.DataFrame` for uni- or multivariate time series and sequences

The `Series.index` and `DataFrame.index` are used for representing the time series or sequence index. `sktime` supports pandas integer, period and timestamp indices.
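
As a small illustration of the three index kinds mentioned above, the following sketch builds the same univariate series with an integer, a period, and a timestamp index. It uses plain `pandas` only; the values, dates and frequencies are made up for the example.

```python
import pandas as pd

values = [1.0, 2.0, 3.0]  # illustrative values, not sktime example data

# integer index: positions 0, 1, 2
y_int = pd.Series(values, index=pd.RangeIndex(3))

# period index: three consecutive months
y_period = pd.Series(
    values, index=pd.period_range("2020-01", periods=3, freq="M")
)

# timestamp index: three consecutive days
y_time = pd.Series(
    values, index=pd.date_range("2020-01-01", periods=3, freq="D")
)

for y in (y_int, y_period, y_time):
    print(type(y.index).__name__)
```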

`Panel`-s are commonly represented as:
* a `pandas.DataFrame` in a specific format, defined by the `pd-multiindex` mtype - this has a double row index, for instances and time points
* a `list` of `pandas.DataFrame`, where all `pandas.DataFrame` are in the `Series` format. The different `list` elements are the different instances

In either case, the "time" index must be a `sktime` compatible time index type, as for `Series`.

In [None]:
from sktime.datatypes import get_examples

In [None]:
# example of a univariate series
get_examples("pd.Series", "Series")[0]

In [None]:
# example of a multivariate series
get_examples("pd.DataFrame", "Series")[1]

In [None]:
# example of a panel in pd-multiindex format
get_examples("pd-multiindex", "Panel")[0]

In [None]:
# example of the same panel in df-list format
get_examples("df-list", "Panel")[0]

`sktime` supports more in-memory formats, see the "datatypes and datasets" tutorial for more details.
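
The two `Panel` representations described above can also be sketched by hand with plain `pandas`. The snippet below is only an illustration: the data are made up, and the index level names `instances` and `timepoints` as well as the level order are assumptions made for the example, not a formal specification of the mtypes.

```python
import pandas as pd

# panel as a list of DataFrames: one DataFrame per instance
df_list = [
    pd.DataFrame({"var_0": [1.0, 2.0, 3.0]}),
    pd.DataFrame({"var_0": [4.0, 5.0, 6.0]}),
]

# the same panel as one DataFrame with a two-level row index;
# level names are assumptions for this illustration
panel = pd.concat(
    df_list,
    keys=range(len(df_list)),
    names=["instances", "timepoints"],
)

# going back: slice out one DataFrame per instance
instance_ids = panel.index.get_level_values("instances").unique()
recovered = [panel.xs(i, level="instances") for i in instance_ids]

print(panel)
```

Stacking with `pd.concat` keeps each instance's time index as the inner level, so slicing by instance returns the original per-instance frames.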

### 1.2 General transformer signature - simple transformers<a class="anchor" id="section_1_2"></a>

Transformers for `Series` and `Panel` have the same high-level interface. Depending on which data type they are more commonly used for, they are found in either the `transformations.series` or the `transformations.panel` module. The module location does not imply a separate interface.

The most important interface points of transformers are:

1. construction, this is as with any other `sktime` estimator
2. fitting the transformer, via `fit`
3. transforming data, via `transform`
4. inverse transforming, via `inverse_transform`
5. updating the transformer, via `update` - not all transformers have this interface point (`update` is currently work in progress, as of v0.8.x, contributions are appreciated)
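
As a rough sketch of how these interface points fit together, here is a toy class written in plain `pandas`. `ToyStandardizer` is a hypothetical name invented for this illustration and is not part of `sktime`; real `sktime` transformers add input checks, tags and parameter handling on top of this pattern.

```python
import pandas as pd


class ToyStandardizer:
    """Hypothetical stand-in, not an sktime class: centers and scales a series."""

    def __init__(self, with_std=True):
        # 1. construction: only stores parameter settings
        self.with_std = with_std

    def fit(self, X):
        # 2. fitting: learn state from the data
        self.mean_ = X.mean()
        self.std_ = X.std() if self.with_std else 1.0
        return self

    def transform(self, X):
        # 3. transforming: apply the fitted state
        return (X - self.mean_) / self.std_

    def inverse_transform(self, X):
        # 4. inverse transforming: undo transform
        return X * self.std_ + self.mean_

    def fit_transform(self, X):
        # shorthand for fit followed by transform on the same data
        return self.fit(X).transform(X)


X = pd.Series([1.0, 2.0, 3.0, 4.0])
trafo = ToyStandardizer()
Xt = trafo.fit_transform(X)
X_back = trafo.inverse_transform(Xt)
print(Xt)
print(X_back)
```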

We show this in two examples below.

We will apply transformations to the following `Series` and `Panel` data:

In [None]:
from sktime.datatypes import get_examples
# univariate series used in the examples
X_series = get_examples("pd.Series", "Series")[0][0:3]
# panel used in the examples
X_panel = get_examples("pd-multiindex", "Panel")[0][["var_1"]]
# sub-setting is needed since the example transformer (box-cox) 
#   wants univariate and positive data

In [None]:
X_series

In [None]:
X_panel

#### Example: transforming series to series

The Box-Cox transformer applies the Box-Cox transform to individual values in series or panels. First, the transformer needs to be constructed with parameter settings. This works the same way as for any `sktime` estimator.

In [None]:
# constructing the transformer
from sktime.transformations.series.boxcox import BoxCoxTransformer

my_boxcox_trafo = BoxCoxTransformer(method="mle")

Now, we apply the constructed transformer `my_boxcox_trafo` to a (single, univariate) series. First, the transformer is fitted:

In [None]:
# fitting the transformer
my_boxcox_trafo.fit(X_series)

Next, the transformer is applied, which results in a transformed series.

In [None]:
# transforming the series
my_boxcox_trafo.transform(X_series)

Generally, the series passed to `transform` need not be the same as in `fit` - if it is, the shorthand `fit_transform` can instead be used:

In [None]:
my_boxcox_trafo.fit_transform(X_series)

The transformer can also be applied to `Panel` data:

In [None]:
my_boxcox_trafo.fit_transform(X_panel)
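
For reference, the Box-Cox transform with parameter `lam` maps a positive value `x` to `(x**lam - 1) / lam` if `lam` is non-zero, and to `log(x)` if `lam` is zero. The sketch below implements this formula and its inverse with `numpy` for a fixed `lam`. The helper names are made up for this illustration, and unlike `BoxCoxTransformer(method="mle")` the sketch does not estimate `lam` from the data.

```python
import numpy as np


def boxcox_fixed_lambda(x, lam):
    """Box-Cox transform for a fixed lambda (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    if np.any(x <= 0):
        raise ValueError("Box-Cox requires strictly positive data")
    if lam == 0:
        return np.log(x)
    return (x**lam - 1.0) / lam


def inverse_boxcox_fixed_lambda(y, lam):
    """Inverse of boxcox_fixed_lambda for the same lambda."""
    y = np.asarray(y, dtype=float)
    if lam == 0:
        return np.exp(y)
    return (lam * y + 1.0) ** (1.0 / lam)


x = np.array([1.0, 4.0, 9.0])
xt = boxcox_fixed_lambda(x, lam=0.5)
x_back = inverse_boxcox_fixed_lambda(xt, lam=0.5)
print(xt)      # [0. 2. 4.]
print(x_back)  # recovers x up to floating point error
```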

#### Example: transforming series to primitive features

The summary transformer can be used to extract sample statistics such as mean and variance from a series. First, we construct the transformer:

In [None]:
# constructing the transformer
from sktime.transformations.series.summarize import SummaryTransformer

my_summary_trafo = SummaryTransformer()

As before, we can fit/apply with `fit`, `transform`, and `fit_transform`.

`SummaryTransformer` returns primitive features, hence the output will be a `pandas.DataFrame`, each row corresponding to one series in the input. 

If the input is a single series, the output of `transform` and `fit_transform` will be a one-row `DataFrame` containing the mean etc. of that one series:

In [None]:
my_summary_trafo.fit_transform(X_series)

If the input is a panel, the output of `transform` and `fit_transform` will be a `DataFrame` with as many rows as the `Panel` had series. Each row contains the mean etc. of the series in `X_panel` that has the same instance index:

In [None]:
my_summary_trafo.fit_transform(X_panel)
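
Conceptually, the panel output above is a per-instance aggregation. The following sketch reproduces that idea with a plain `pandas` group-by on a hand-made panel. The data, the level names `instances` and `timepoints`, and the choice of statistics are assumptions for illustration only and are not necessarily the defaults of `SummaryTransformer`.

```python
import pandas as pd

# hand-made panel: 2 instances with 3 time points each
index = pd.MultiIndex.from_product(
    [[0, 1], [0, 1, 2]], names=["instances", "timepoints"]
)
panel = pd.DataFrame({"var_0": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]}, index=index)

# one row of primitive features per instance
summary = panel["var_0"].groupby(level="instances").agg(
    ["mean", "std", "min", "max"]
)
print(summary)
```

The result has one row per instance and one column per statistic, which mirrors the "one row per series" layout described above.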

#### transformers with series/panel output vs primitive output

Whether a transformer will return primitives, i.e., a `pandas.DataFrame` in `transform`, or time series like objects (`Series` or `Panel`), can be checked by using the `"scitype:transform-output"` tag. This is `"Series"` for behaviour as in the first example (box-cox), and `"Primitives"` for behaviour as in the second example (summarizer):

In [None]:
my_boxcox_trafo.get_tag("scitype:transform-output")

In [None]:
my_summary_trafo.get_tag("scitype:transform-output")

Use of tags to characterize and search for transformers will be discussed in more detail in section 4.

## 2. Forecasters in `sktime` - main families<a class="anchor" id="chapter2"></a>

`sktime` supports a number of commonly used forecasters, many of them interfaced from state-of-the-art forecasting packages. All forecasters are available under the unified `sktime` interface.

The main classes that are currently stably supported are:

* `ExponentialSmoothing`, `ThetaForecaster`, and `AutoETS` from `statsmodels`
* `ARIMA` and `AutoARIMA` from `pmdarima`
* `BATS` and `TBATS` from `tbats`
* `PolynomialTrend` for forecasting polynomial trends
* `Prophet` which interfaces Facebook `prophet`

For illustration, all estimators below will be presented on the basic forecasting workflow - though they also support the advanced forecasting and evaluation workflows under the unified `sktime` interface (see Section 1).

For use in the other workflows, simply replace the "forecaster specification block" ("`forecaster=`") by the forecaster specification block in the examples presented below.

Generally, all forecasters available in `sktime` can be listed with the `all_estimators` command.

## 4. Extension guide - implementing your own transformer<a class="anchor" id="chapter4"></a>

`sktime` is meant to be easily extensible, for direct contribution to `sktime` as well as for local/private extension with custom methods.

To extend `sktime` with a new local or contributed transformer, a good workflow to follow is:

1. read through the [transformer extension template](https://github.com/alan-turing-institute/sktime/blob/main/extension_templates/transformer.py) - this is a `python` file with `todo` blocks that mark the places in which changes need to be added.
2. optionally, if you are planning any major surgeries to the interface: look at the [base class architecture](https://github.com/alan-turing-institute/sktime/blob/main/sktime/transformations/base.py) - note that "ordinary" extension (e.g., new algorithm) should be easily doable without this.
3. copy the transformer extension template to a local folder in your own repository (local/private extension), or to a suitable location in your clone of the `sktime` or affiliated repository (if contributed extension), inside `sktime.transformations`; rename the file and update the file docstring appropriately.
4. address the "todo" parts. Usually, this means: changing the name of the class, setting the tag values, specifying hyper-parameters, filling in `__init__`, `_fit`, `_transform`, and optional methods such as `_inverse_transform` or `_update`. You can add private methods as long as they do not override the default public interface. For more details, see the extension template.
5. to test your estimator manually: import your estimator and run it in the workflows in Section 1; then use it in the compositors in Section 3.
6. to test your estimator automatically: call `sktime.tests.test_all_estimators.test_estimator` on your estimator - note that the function takes the class, not an object instance. Before the call, you need to register the new estimator in `sktime.tests._config`, as an import, and by adding default parameter settings to the `ESTIMATOR_TEST_PARAMS` variable (the `dict` entry `key` is the class, and entry is a `scikit-learn` parameter set). `pytest` will also add the call to its automated tests in a working clone of the `sktime` repository.
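
The split between public methods (`fit`, `transform`) and the private methods filled in by the implementer (`_fit`, `_transform`, `_inverse_transform`) can be sketched without `sktime` installed. In the snippet below, `ToyBaseTransformer` and `LogShiftTransformer` are hypothetical names invented for this illustration. The toy base class is a heavily simplified stand-in, not the actual `sktime` base class, and the authoritative list of methods to implement is the extension template.

```python
import numpy as np
import pandas as pd


class ToyBaseTransformer:
    """Hypothetical, heavily simplified stand-in for a transformer base class."""

    # tags describe the estimator; concrete classes may override the values
    _tags = {"scitype:transform-output": "Series"}

    def get_tag(self, name):
        return self._tags[name]

    def fit(self, X, y=None):
        # public method: bookkeeping lives here, the algorithm lives in _fit
        self._fit(X, y)
        self._is_fitted = True
        return self

    def transform(self, X, y=None):
        self._check_is_fitted()
        return self._transform(X, y)

    def inverse_transform(self, X, y=None):
        self._check_is_fitted()
        return self._inverse_transform(X, y)

    def _check_is_fitted(self):
        if not getattr(self, "_is_fitted", False):
            raise RuntimeError("call fit before transform or inverse_transform")

    def _fit(self, X, y=None):
        # default: nothing to learn
        return self

    def _transform(self, X, y=None):
        raise NotImplementedError

    def _inverse_transform(self, X, y=None):
        raise NotImplementedError


class LogShiftTransformer(ToyBaseTransformer):
    """Custom transformer: log(X + shift), with an exact inverse."""

    def __init__(self, shift=1.0):
        self.shift = shift  # hyper-parameter, stored unchanged

    def _transform(self, X, y=None):
        return np.log(X + self.shift)

    def _inverse_transform(self, X, y=None):
        return np.exp(X) - self.shift


X = pd.Series([0.0, 1.0, 3.0])
trafo = LogShiftTransformer(shift=1.0).fit(X)
Xt = trafo.transform(X)
X_back = trafo.inverse_transform(Xt)
print(Xt)
print(X_back)
```

The custom class only overrides private methods and `__init__`, so the public interface stays the same for every transformer built this way.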

In case of direct contribution to `sktime` or one of its affiliated packages, additionally:
* add yourself as an author to the code, and to the `CODEOWNERS` for the new estimator file(s).
* create a pull request that contains only the new estimators (and their inheritance tree, if it's not just one class), as well as the automated tests as described above.
* in the pull request, describe the estimator and optimally provide a publication or other technical reference for the strategy it implements.
* before making the pull request, ensure that you have all necessary permissions to contribute the code to a permissive license (BSD-3) open source project.

## 5. Summary<a class="anchor" id="chapter5"></a>

* `sktime` comes with several forecasting algorithms (or forecasters), all of which share a common interface. The interface is fully interoperable with the `scikit-learn` interface, and provides dedicated interface points for forecasting in batch and rolling mode.

* `sktime` comes with rich composition functionality that makes it easy to build complex pipelines and to connect with other parts of the open source ecosystem, such as `scikit-learn` and individual algorithm libraries.

* `sktime` is easy to extend, and comes with user-friendly tools to facilitate implementing and testing your own forecasters and composition principles.