# Graphical Pipelines

The pipelines presented so far are sequential, i.e., their steps are executed in a fixed linear order. More complex tasks can be handled by nesting such pipelines, but nesting makes a pipeline hard to understand and makes it difficult to apply a grid search. We therefore also propose a generalised graphical pipeline.
* Graphical means that the pipeline is no longer sequential. Instead, it is a directed acyclic graph (DAG) of nodes and edges: the nodes are the steps of the pipeline, and the edges connect the steps and specify which output feeds which input argument. The graphical pipeline is thus a generalisation of the sequential pipeline.
* Generalised means that the pipeline is not limited to one task (e.g., forecasting). Instead, it is applicable to various tasks, e.g., forecasting, classification, and regression. This also enables combining different tasks in one pipeline, e.g., a pipeline could consist of a regression step and a classification step.
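
To make the DAG idea concrete, here is a minimal sketch that uses only the Python standard library (`graphlib`, Python 3.9+). It is not how sktime implements the pipeline; the step names `forecaster_a`, `forecaster_b`, and `combiner` are hypothetical. Each step is mapped to the steps or raw inputs it reads from, and a topological sort yields an order in which every step runs after its inputs are available.

```python
from graphlib import TopologicalSorter

# Hypothetical steps: each key maps a step to the nodes it reads from.
dependencies = {
    "forecaster_a": {"X"},  # branching: the raw input X feeds two steps
    "forecaster_b": {"X"},
    "combiner": {"forecaster_a", "forecaster_b"},  # merging: two outputs are combined
}

# static_order() returns the nodes so that predecessors always come first.
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```

The raw input `X` always comes first and `combiner` last; the relative order of the two forecasters is not fixed, because they do not depend on each other.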

Note that the graphical pipeline is still experimental and should not be used in production. However, we would be happy to receive feedback on it.



### Potential Use-Cases
There are various potential use cases for the graphical pipeline. In the following, we focus on a forecasting and a classification pipeline.
#### Forecasting Use-Case for Graphical Pipelines
In forecasting tasks, the input of a forecaster might depend on the output of other forecasters for exogenous variables. These forecasters for exogenous variables might share the same input arguments, so the data flow branches: the same input is used by different forecasters. If the outputs of these forecasters are combined afterwards, the data flow merges again.
The graphical pipeline is therefore a natural fit for such forecasting tasks. It also enables combining different forecasters in one pipeline, e.g., a forecaster for the trend and a forecaster for the seasonality.
Integrating this into a graphical pipeline makes the pipeline easier to understand and makes it easier to apply a grid search.

#### Classification Use-Case for Graphical Pipelines

In classification tasks, the input of a classifier may rely on different features. Potentially, not all of these features are always observable, so a soft sensor is required. Such a soft sensor could be realised using a regressor.
For such a scenario, the graphical pipeline is a natural fit since it enables combining different tasks in one pipeline.
Note that in the current experimental state of the graphical pipeline, this use case is not fully supported. However, we are working on this.


### Content of this Notebook:
* Understanding the API of graph pipelines
* Examples of simple pipelines and how they can be implemented with graph pipelines.
* More complex graphical pipeline
    * Forecasting
* Grid Search with such a Graphical Pipeline

### Credits
The graphical pipeline was first developed in pyWATTS and was then adapted for sktime. The original implementation can be found [here](). pyWATTS is an open-source library developed at the Institute of Applied Informatics and Automation at KIT.


In [46]:
from sklearn.linear_model import Lasso, Ridge
from sklearn.preprocessing import StandardScaler

from sktime.classification.distance_based import KNeighborsTimeSeriesClassifier
from sktime.datasets import load_arrow_head, load_longley, load_macroeconomic
from sktime.forecasting.base import ForecastingHorizon
from sktime.forecasting.compose import MultiplexForecaster, make_reduction
from sktime.forecasting.model_selection import (
    ForecastingGridSearchCV,
    SlidingWindowSplitter,
    temporal_train_test_split,
)
from sktime.forecasting.sarimax import SARIMAX
from sktime.performance_metrics.forecasting import mean_absolute_error
from sktime.pipeline.pipeline import Pipeline
from sktime.transformations.series.adapt import TabularToSeriesAdaptor
from sktime.transformations.series.detrend import Deseasonalizer, Detrender
from sktime.transformations.series.difference import Differencer
from sktime.transformations.series.exponent import ExponentTransformer

## How to build a Graphical Pipeline
The API of the graphical pipeline differs from the API of the standard pipeline. Instead of providing the full pipeline specification at initialisation, the standard way is to build the pipeline step by step:
1. Create the pipeline object with `Pipeline()`
2. Each step is added to the pipeline with the `add_step` method. The `add_step` method takes the following arguments:
    * skobject: The sktime object that should be added to the pipeline
    * name: The name of the step
    * edges: A dictionary that specifies the edges of the graph. The keys of the dictionary are the input arguments of the sktime object and the values are the names of the steps that should be connected to the input argument.
    * method: The method of the sktime object that should be called. If no method is specified, the default method is inferred from the added skobject. Specifying it explicitly is needed, for example, to call `inverse_transform`.
    * kwargs: Additional keyword arguments that should be passed to the sktime object.

   E.g., `pipeline = pipeline.add_step(Differencer(), "differencer", edges={"X": "y"})` adds a differencer to the pipeline.
   Note that `add_step` does not modify the pipeline in place. Instead, it returns a new pipeline object that contains the added step, so you usually want to reassign the pipeline variable to the returned object.
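
The consequence of this "returns a new object" behaviour can be illustrated with a small toy class. `ToyPipeline` below is a hypothetical stand-in written for this illustration only; it is not the sktime `Pipeline` and does not reproduce its internals. It merely mimics the calling convention described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyPipeline:
    """Hypothetical stand-in that mimics the add_step calling convention."""

    steps: tuple = ()

    def add_step(self, skobject, name, edges, method=None):
        # Build a NEW object; the existing one is left untouched.
        new_step = {"skobject": skobject, "name": name, "edges": edges, "method": method}
        return ToyPipeline(self.steps + (new_step,))


empty = ToyPipeline()

# Correct usage: keep the returned object.
extended = empty.add_step("differencer-object", "differencer", edges={"X": "y"})

# Easy mistake: the return value is discarded, so this step is lost.
empty.add_step("other-object", "lost_step", edges={"X": "y"})

print(len(empty.steps), len(extended.steps))
```

In this sketch, `empty` still has no steps after both calls, while `extended` holds the single step that was kept.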

In the following, we show a few simple examples of the graphical pipeline before moving on to more complex ones.

## Examples
### Forecasting Pipeline
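In the following, we show how a simple forecasting pipeline could be implemented using the graphical pipeline. The pipeline consists of the following steps:
1. A `Differencer` that differences the target `y`.
2. A `SARIMAX` forecaster that is fitted on the differenced target, with `X` as exogenous data.
3. The inverse transformation of the same `Differencer`, applied to the `SARIMAX` forecast to bring it back to the original scale.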


In [41]:
general_pipeline = Pipeline()
differencer = Differencer()

general_pipeline = general_pipeline.add_step(
    differencer, "differencer", edges={"X": "y"}
)
general_pipeline = general_pipeline.add_step(
    SARIMAX(), "sarimax", edges={"X": "X", "y": "differencer"}
)
general_pipeline = general_pipeline.add_step(
    differencer, "differencer_inv", edges={"X": "sarimax"}, method="inverse_transform"
)

In [42]:
y, X = load_longley()
y_train, y_test, X_train, X_test = temporal_train_test_split(y, X)

general_pipeline.fit(y=y_train, X=X_train, fh=[1, 2, 3, 4])
general_pipeline.predict(X=X_test)

1959    67213.735360
1960    68328.076304
1961    68737.861389
1962    71322.894013
Freq: A-DEC, Name: TOTEMP, dtype: float64

**Alternative Way in Defining the Pipeline**
Alternatively, a graphical pipeline can be defined by passing a list of steps to `Pipeline` at creation. This looks as follows:

In [37]:
differencer = Differencer()

general_pipeline = Pipeline(
    [
        {"skobject": differencer, "name": "differencer", "edges": {"X": "y"}},
        {
            "skobject": SARIMAX(),
            "name": "sarimax",
            "edges": {"X": "X", "y": "differencer"},
        },
        {
            "skobject": differencer,
            "name": "differencer_inv",
            "edges": {"X": "sarimax"},
            "method": "inverse_transform",
        },
    ]
)

### Classification Pipeline
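In the following, we show how a simple classification pipeline could be implemented using the graphical pipeline. The pipeline consists of the following steps:
1. An `ExponentTransformer` applied to `X`.
2. A `KNeighborsTimeSeriesClassifier` that takes the transformed `X` and the labels `y`.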

In [44]:
general_pipeline = Pipeline()
general_pipeline = general_pipeline.add_step(
    ExponentTransformer(), "exponent", edges={"X": "X"}
)
general_pipeline = general_pipeline.add_step(
    KNeighborsTimeSeriesClassifier(), "classifier", edges={"X": "exponent", "y": "y"}
)

In [47]:
X, y = load_arrow_head(split="train", return_X_y=True)
general_pipeline.fit(X=X, y=y)
general_pipeline.predict(X=X)

array(['0', '1', '2', '0', '1', '2', '0', '1', '2', '0', '1', '2', '0',
       '1', '2', '0', '1', '2', '0', '1', '2', '0', '1', '2', '0', '1',
       '2', '0', '1', '2', '0', '1', '2', '0', '1', '2'], dtype='<U1')

## More Complex Examples with Grid Search
The previous example pipelines could also easily be built with the sequential implementations. In the following, we therefore show a more complex pipeline that can only be implemented as a sequential pipeline with a lot of nesting, which makes it more difficult to apply a grid search.

The considered use case is to forecast inflation using forecasts of the real gross domestic product, the real disposable personal income, and the unemployment rate. Furthermore, the unemployment rate is forecast using the same features except the unemployment rate itself.
The data is taken from the macrodata dataset from the statsmodels package.

We want to find the best combination of regressors for the different forecasts, choosing between Ridge and Lasso regression for each forecaster.

#### Pipeline Definition

In [28]:
pipeline = Pipeline()
sklearn_scaler = StandardScaler()
sktime_scaler = TabularToSeriesAdaptor(sklearn_scaler)
deseasonalizer = Deseasonalizer(sp=4)
detrender = Detrender()

pipeline = pipeline.add_step(
    sktime_scaler, name="scaler", edges={"X": "X__realgdp_realdpi_unemp"}
)
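# Note: this step is named "deseasonalizer" but applies the Detrender;
# the Deseasonalizer instance created above is not used in this pipeline.
pipeline = pipeline.add_step(
    detrender, name="deseasonalizer", edges={"X": "X__realgdp_realdpi"}
)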

pipeline = pipeline.add_step(
    MultiplexForecaster(
        [
            (
                "ridge",
                make_reduction(Ridge(), windows_identical=False, window_length=5),
            ),
            (
                "lasso",
                make_reduction(Lasso(), windows_identical=False, window_length=5),
            ),
        ]
    ),
    name="forecaster_gdp",
    edges={"y": "deseasonalizer__realgdp"},
)

pipeline = pipeline.add_step(
    MultiplexForecaster(
        [
            (
                "ridge",
                make_reduction(Ridge(), windows_identical=False, window_length=5),
            ),
            (
                "lasso",
                make_reduction(Lasso(), windows_identical=False, window_length=5),
            ),
        ]
    ),
    name="forecaster_dpi",
    edges={"y": "deseasonalizer__realdpi"},
)

pipeline = pipeline.add_step(
    MultiplexForecaster(
        [
            (
                "ridge",
                make_reduction(Ridge(), windows_identical=False, window_length=5),
            ),
            (
                "lasso",
                make_reduction(Lasso(), windows_identical=False, window_length=5),
            ),
        ]
    ),
    name="forecaster_unemp",
    edges={
        "y": "scaler__unemp",
        "X": [
            "forecaster_gdp",
            "forecaster_dpi",
        ],
    },
)

pipeline = pipeline.add_step(
    MultiplexForecaster(
        [
            (
                "ridge",
                make_reduction(Ridge(), windows_identical=False, window_length=5),
            ),
            (
                "lasso",
                make_reduction(Lasso(), windows_identical=False, window_length=5),
            ),
        ]
    ),
    name="forecaster_inflation",
    edges={"X": ["forecaster_dpi", "forecaster_unemp"], "y": "y"},
)

In [29]:
data = load_macroeconomic()

X = data[["realgdp", "realdpi", "unemp"]]
y = data[["infl"]]
fh = ForecastingHorizon([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])

y_train, y_test, X_train, X_test = temporal_train_test_split(y, X=X, fh=fh)
X_train

| Period | realgdp | realdpi | unemp |
|---|---|---|---|
| 1959Q1 | 2710.349 | 1886.9 | 5.8 |
| 1959Q2 | 2778.801 | 1919.7 | 5.1 |
| 1959Q3 | 2775.488 | 1916.4 | 5.3 |
| 1959Q4 | 2785.204 | 1931.3 | 5.6 |
| 1960Q1 | 2847.699 | 1955.5 | 5.2 |
| ... | ... | ... | ... |
| 2005Q3 | 12683.153 | 9308.0 | 5.0 |
| 2005Q4 | 12748.699 | 9358.7 | 4.9 |
| 2006Q1 | 12915.938 | 9533.8 | 4.7 |
| 2006Q2 | 12962.462 | 9617.3 | 4.7 |


In [30]:
pipeline.fit(y=y_train, X=X_train, fh=fh)
result = pipeline.predict(X=None, fh=y_test.index)
((result - y_test) ** 2).mean()

infl    18.041837
dtype: float64

In [31]:
ridge = make_reduction(Ridge(), windows_identical=False, window_length=5)
ridge.fit(y=y_train, fh=fh)
((ridge.predict() - y_test) ** 2).mean()

infl    19.608558
dtype: float64

#### Grid Search

This pipeline has multiple parameters that can be varied to find the best configuration. These parameters include:
* which forecaster should be used for which variable
* what the hyperparameters of the forecasters should be
* what the preprocessing should look like for the different forecasters
* which features should be used for the different forecasters
* ...

Doing this manually is tedious, which is why hyperparameter search tools exist, e.g., `ForecastingGridSearchCV` in sktime.
Since the graphical pipeline performs a forecasting task, we use this grid search to find the best configuration.
To do so, we have to specify a parameter grid that contains the different configurations that should be tested.
The parameter grid is a dictionary. Each key identifies a step of the pipeline together with the setting to vary, and each value is the list of candidates to test for that setting. To change a parameter of a skobject in the pipeline, the key follows the scheme `step_name__skobject__parameter_name`. To change the inputs of a step, you need to vary its edges, which is done with keys of the form `step_name__edges__X` or `step_name__edges__y`.
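
To get a feel for how quickly such a grid grows, the following sketch enumerates all candidates of a parameter grid with plain Python. The keys and values mirror the grid used for the grid search in this notebook; the enumeration via `itertools.product` only illustrates what an exhaustive grid search has to evaluate and is not sktime's implementation.

```python
from itertools import product

# Same keys and candidate values as the grid passed to ForecastingGridSearchCV.
param_grid = {
    "forecaster_inflation__skobject__selected_forecaster": ["ridge", "lasso"],
    "forecaster_unemp__skobject__selected_forecaster": ["ridge", "lasso"],
    "forecaster_dpi__skobject__selected_forecaster": ["ridge", "lasso"],
    "forecaster_gdp__skobject__selected_forecaster": ["ridge", "lasso"],
    "forecaster_inflation__edges__X": [
        ["forecaster_unemp"],
        ["forecaster_unemp", "forecaster_dpi"],
    ],
    "forecaster_unemp__edges__X": [
        [],
        ["forecaster_dpi"],
        ["forecaster_gdp", "forecaster_dpi"],
    ],
}

# Cartesian product over all value lists: one dict per candidate configuration.
keys = list(param_grid)
candidates = [
    dict(zip(keys, values)) for values in product(*(param_grid[k] for k in keys))
]

print(len(candidates))  # 2 * 2 * 2 * 2 * 2 * 3 = 96
```

Each of these 96 candidates is then fitted and scored on every cross-validation split, which is why the search takes noticeably longer than a single fit.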

In the following, we initialise the grid search and perform a hyperparameter search.

In [32]:
grid = ForecastingGridSearchCV(
    pipeline,
    cv=SlidingWindowSplitter(
        window_length=len(X_train) - 20,
        step_length=4,
        fh=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
    ),
    scoring=mean_absolute_error,
    # refit=False,
    error_score="raise",
    param_grid={
        "forecaster_inflation__skobject__selected_forecaster": ["ridge", "lasso"],
        "forecaster_unemp__skobject__selected_forecaster": ["ridge", "lasso"],
        "forecaster_dpi__skobject__selected_forecaster": ["ridge", "lasso"],
        "forecaster_gdp__skobject__selected_forecaster": ["ridge", "lasso"],
        "forecaster_inflation__edges__X": [
            ["forecaster_unemp"],
            ["forecaster_unemp", "forecaster_dpi"],
        ],
        "forecaster_unemp__edges__X": [
            [],
            ["forecaster_dpi"],
            ["forecaster_gdp", "forecaster_dpi"],
        ],
    },
)
grid.fit(y=y_train, X=X_train)
result = grid.predict(X=None, fh=y_test.index)

  model = cd_fast.enet_coordinate_descent(

In [33]:
grid.cv_results_

|  | mean_test__DynamicForecastingErrorMetric | mean_fit_time | mean_pred_time | params | rank_test__DynamicForecastingErrorMetric |
|---|---|---|---|---|---|
| 0 | 1.539329 | 0.093885 | 0.053009 | `{'forecaster_dpi__skobject__selected_forecaste...` | 54.5 |
| 1 | 1.720565 | 0.082869 | 0.049055 | `{'forecaster_dpi__skobject__selected_forecaste...` | 58.5 |
| 2 | 1.311031 | 0.366696 | 0.160758 | `{'forecaster_dpi__skobject__selected_forecaste...` | 1.5 |
| 3 | 3.116475 | 0.183237 | 0.085956 | `{'forecaster_dpi__skobject__selected_forecaste...` | 95.5 |
| 4 | 1.851588 | 0.171509 | 0.089728 | `{'forecaster_dpi__skobject__selected_forecaste...` | 65.0 |
| ... | ... | ... | ... | ... | ... |
| 91 | 1.443361 | 0.076905 | 0.037015 | `{'forecaster_dpi__skobject__selected_forecaste...` | 46.5 |
| 92 | 1.443361 | 0.079046 | 0.041127 | `{'forecaster_dpi__skobject__selected_forecaste...` | 46.5 |
| 93 | 1.443361 | 0.081455 | 0.040803 | `{'forecaster_dpi__skobject__selected_forecaste...` | 46.5 |
| 94 | 1.443361 | 0.108554 | 0.060533 | `{'forecaster_dpi__skobject__selected_forecaste...` | 46.5 |


In [34]:
((result - y_test) ** 2).mean()

infl    19.244087
dtype: float64

In [35]:
grid.best_params_

{'forecaster_dpi__skobject__selected_forecaster': 'ridge',
 'forecaster_gdp__skobject__selected_forecaster': 'ridge',
 'forecaster_inflation__edges__X': ['forecaster_unemp'],
 'forecaster_inflation__skobject__selected_forecaster': 'ridge',
 'forecaster_unemp__edges__X': ['forecaster_dpi'],
 'forecaster_unemp__skobject__selected_forecaster': 'ridge'}

In [36]:
grid.best_score_

1.3110312019415922