# Overview

The sktime network pipeline allows users to execute steps in a non-sequential manner. This makes it possible to apply transformations to the input data that cannot be expressed inside a standard linear pipeline. In addition, the sktime network pipeline supports applying transformations not only to `X` but also to `y`. In contrast, sklearn pipelines support transformations on `X` only. Furthermore, the sktime pipeline allows different behaviour at `fit`, `predict` and `update` of the pipeline.

The sktime network pipeline currently supports only **forecasting tasks** and can be used by importing `sktime.forecasting.compose.NetworkPipelineForecaster`. Users need to specify the *steps* that will be executed inside the pipeline. The steps need to be provided as a *list of tuples* where each element of the list follows the convention below:

```Python
(step_name, estimator, parameters)
```
where:
1. `step_name` is a string, 
2. `estimator` is an object that can be either a `transformer` or `forecaster` 
3. `parameters` is a dictionary
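
As a rough illustration of this convention, the sketch below checks that a list of steps has the expected shape. `check_steps` is a hypothetical helper written for this document, not part of sktime, and plain `object()` placeholders stand in for real estimators. The uniqueness check on step names is an assumption, made so that a name can refer to one step unambiguously.

```python
def check_steps(steps):
    """Validate (step_name, estimator, parameters) tuples and return the step names."""
    names = []
    for step in steps:
        if not (isinstance(step, tuple) and len(step) == 3):
            raise ValueError(f"expected a 3-tuple, got {step!r}")
        step_name, estimator, parameters = step
        if not isinstance(step_name, str):
            raise TypeError("step_name must be a string")
        if not isinstance(parameters, dict):
            raise TypeError("parameters must be a dictionary")
        # Assumption: names must be unique so later steps can reference them.
        if step_name in names:
            raise ValueError(f"duplicate step name: {step_name}")
        names.append(step_name)
    return names


# Placeholders stand in for a transformer and a forecaster.
example_steps = [
    ("boxcox", object(), {"Z": "original_y"}),
    ("arima", object(), {"fh": "original_fh", "y": "boxcox"}),
]
step_names = check_steps(example_steps)
print(step_names)
```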

The `parameters` dictionary deserves special attention.

The dictionary should be structured as *keyword arguments* taken by the signatures of the `fit`, `predict` and `update` methods of the estimator in the same step. In other words, the *dictionary keys* must correspond to the arguments in the signature of the `fit`, `predict` or `update` methods, and the *dictionary values* must correspond to the values the user wants to assign. The *dictionary values* can be either:
* **The original input variables.** In this case the special strings `original_X`, `original_y` and `original_fh` must be used. For example, the tuple
```Python
('ft1', BoxCoxTransformer(), {'X':'original_X'}) 
```

specifies a step called `ft1` that uses `BoxCoxTransformer` and passes the value of `X` supplied to `NetworkPipelineForecaster.fit()`, `NetworkPipelineForecaster.predict()` or `NetworkPipelineForecaster.update()` on to the corresponding `BoxCoxTransformer.fit()` or `BoxCoxTransformer.transform()` method.

* **The output of previous steps in the pipeline.** In this case the name of the earlier step must be used. For example:
```Python
('ft1', BoxCoxTransformer(), {'X':'transformer1'}) 
```
specifies a step called `ft1` that uses `BoxCoxTransformer` and receives as `X` the output of an earlier step in the pipeline called `transformer1`.
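
The two kinds of dictionary values can be pictured with a small lookup routine. This is only a sketch of the idea under stated assumptions, not the actual sktime implementation: `resolve_arguments`, `originals` and `step_outputs` are hypothetical names, plain lists stand in for real data, and the rule that any other value is passed through unchanged is an assumption.

```python
ORIGINAL_KEYS = ("original_X", "original_y", "original_fh")


def resolve_arguments(parameters, originals, step_outputs):
    """Turn a step's `parameters` dictionary into concrete keyword arguments."""

    def resolve(value):
        if isinstance(value, list):
            # A list of references resolves to a list of values.
            return [resolve(item) for item in value]
        if isinstance(value, str):
            if value in ORIGINAL_KEYS:
                return originals[value]  # original pipeline input
            if value in step_outputs:
                return step_outputs[value]  # output of an earlier step
        # Assumption: anything else is a literal argument and is kept as-is.
        return value

    return {key: resolve(value) for key, value in parameters.items()}


# Toy data: plain lists stand in for pandas objects.
originals = {
    "original_X": [[1.0], [2.0]],
    "original_y": [10.0, 20.0],
    "original_fh": [1, 2],
}
step_outputs = {"transformer1": [[0.0], [0.69]]}

kwargs = resolve_arguments(
    {"X": "transformer1", "fh": "original_fh"}, originals, step_outputs
)
print(kwargs)
```

In this sketch the resolved dictionary would then be unpacked into the estimator call, for example `estimator.fit(**kwargs)`.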

# Examples

## Simple pipeline

This example offers no benefit over a simple linear pipeline; it is included only for illustration.

```Python
from sktime.datasets import load_airline
from sktime.forecasting.arima import AutoARIMA
from sktime.forecasting.compose import NetworkPipelineForecaster
from sktime.forecasting.model_selection import temporal_train_test_split
from sktime.transformations.series.boxcox import BoxCoxTransformer

y = load_airline()
y_train, y_test = temporal_train_test_split(y, test_size=4)
network_pipeline = NetworkPipelineForecaster(
    steps=[
        ("boxcox", BoxCoxTransformer(), {"Z": "original_y"}),
        (
            "arima",
            AutoARIMA(suppress_warnings=True),
            {"fh": "original_fh", "y": "boxcox"},
        ),
    ]
)
network_pipeline.fit(y_train)
network_pipeline.predict(fh=[1, 2, 3, 4])
```

```
1960-09    552.658605
1960-10    498.021039
1960-11    463.562260
1960-12    447.898732
Freq: M, dtype: float64
```

## Advanced use case

This example illustrates how the pipeline can be used to apply non-sequential transformations to `X` as well as transformations to `y`.

```Python
from sktime.datasets import load_longley
from sktime.forecasting.arima import AutoARIMA
from sktime.forecasting.compose import NetworkPipelineForecaster
from sktime.forecasting.model_selection import temporal_train_test_split
from sktime.transformations.panel.dataset_manipulation import (
    Concatenator,
    Converter,
    Selector,
)
from sktime.transformations.series.acf import AutoCorrelationTransformer
from sktime.transformations.series.boxcox import BoxCoxTransformer

y, X = load_longley()
y_train, y_test = temporal_train_test_split(y, test_size=4)
X_train, X_test = temporal_train_test_split(X, test_size=4)
pipe = NetworkPipelineForecaster(
    [
        ("feature_X1", Selector(1, return_dataframe=False), {"X": "original_X"}),
        ("feature_X2", Selector(2, return_dataframe=False), {"X": "original_X"}),
        ("ft1", BoxCoxTransformer(), {"Z": "feature_X1"}),
        (
            "ft1_converted",
            Converter(),
            {"obj": "ft1", "to_type": "pd.DataFrame", "as_scitype": "Series"},
        ),
        ("ft2", BoxCoxTransformer(), {"Z": "feature_X2"}),
        ("concat", Concatenator(), {"X": ["ft1_converted", "ft2"]}),
        ("new_y", AutoCorrelationTransformer(), {"Z": "original_y"}),
        (
            "y_out",
            AutoARIMA(suppress_warnings=True),
            {"fh": "original_fh", "y": "original_y", "X": "concat"},
        ),
    ]
)
pipe.fit(y_train, X_train)
pipe.predict(fh=[1, 2, 3, 4], X=X_test)
```



```
1959    69684.582566
1960    70058.809273
1961    70780.147271
1962    72042.380096
Freq: A-DEC, dtype: float64
```
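
To make the non-sequential structure of this example easier to see, the sketch below derives which earlier steps each step reads from, using only the step names and `parameters` dictionaries copied from the listing above. It does not use sktime, and `step_dependencies` is a hypothetical helper written for illustration. It assumes that a value counts as a reference only when it matches the name of an earlier step.

```python
# Step names and parameters copied from the advanced example (estimators omitted).
advanced_steps = [
    ("feature_X1", {"X": "original_X"}),
    ("feature_X2", {"X": "original_X"}),
    ("ft1", {"Z": "feature_X1"}),
    ("ft1_converted", {"obj": "ft1", "to_type": "pd.DataFrame", "as_scitype": "Series"}),
    ("ft2", {"Z": "feature_X2"}),
    ("concat", {"X": ["ft1_converted", "ft2"]}),
    ("new_y", {"Z": "original_y"}),
    ("y_out", {"fh": "original_fh", "y": "original_y", "X": "concat"}),
]


def step_dependencies(steps):
    """Map each step name to the earlier steps its parameters refer to."""
    seen = set()
    deps = {}
    for name, parameters in steps:
        refs = []
        for value in parameters.values():
            candidates = value if isinstance(value, list) else [value]
            # Only names of earlier steps count as references (assumption).
            refs.extend(v for v in candidates if v in seen)
        deps[name] = refs
        seen.add(name)
    return deps


deps = step_dependencies(advanced_steps)
for name, refs in deps.items():
    print(f"{name} <- {refs}")

# Steps, other than the final one, whose output no later step reads.
consumed = {ref for refs in deps.values() for ref in refs}
unused = [name for name, _ in advanced_steps[:-1] if name not in consumed]
print("unused:", unused)
```

The two branches `feature_X1 -> ft1 -> ft1_converted` and `feature_X2 -> ft2` meet in `concat`, which is the kind of structure a linear pipeline cannot express. In the listing as written, no later step references `new_y`, since `y_out` reads `original_y`.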