# Create and run pipelines

In part 2 we learn about pipelines. A pipeline is a sequence of actions that we want to apply to a batch. First we describe what we want to do with each batch. Then, at some later point in the code, we pass a dataset to the pipeline and the calculations actually run. This "lazy run" keeps the code compact and clear.

## Define pipeline

The following example shows how to define a pipeline. It simply lists actions:

```python
preprocess_pipeline = (ds.Pipeline()
                       .load(fmt="wfdb", components=["signal", "meta"])
                       .random_resample_signals("normal", loc=300, scale=10)
                       .drop_short_signals(4000)
                       .segment_signals(3000, 3000)
                       .run(batch_size=300, n_epochs=None, lazy=True))
```

The only thing that needs clarification is the last action, ```run```. We set the parameter ```lazy=True``` because we want to run the pipeline later. When the pipeline is actually run, it loads batches of size ```batch_size``` and applies the pipeline's actions to each batch. Iteration stops when no ECGs are left.
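To make the idea of a lazy run concrete, here is a minimal sketch in plain Python. It is not the library's implementation: ```ToyPipeline```, ```add```, ```run_on```, ```scale``` and ```drop_small``` are hypothetical names invented for this illustration. The sketch only shows that listing actions computes nothing, and that the work happens batch by batch once data is supplied.

```python
def scale(batch, factor):
    # Hypothetical action: multiply every item in the batch by a factor.
    return [x * factor for x in batch]


def drop_small(batch, threshold):
    # Hypothetical action: keep only items that reach the threshold.
    return [x for x in batch if x >= threshold]


class ToyPipeline:
    """Toy stand-in for a lazy pipeline (an assumption, not the real class)."""

    def __init__(self):
        self.actions = []  # recorded (function, args) pairs, not yet executed

    def add(self, func, *args):
        # Record the action and return self so that calls can be chained.
        self.actions.append((func, args))
        return self

    def run_on(self, items, batch_size):
        # Only now is data processed: split items into batches of size
        # batch_size and apply the recorded actions to each batch in order.
        results = []
        for start in range(0, len(items), batch_size):
            batch = items[start:start + batch_size]
            for func, args in self.actions:
                batch = func(batch, *args)
            results.append(batch)
        return results


# Describing the pipeline performs no computation.
toy = ToyPipeline().add(scale, 10).add(drop_small, 25)

# The computation starts when data is passed in; iteration stops
# when no items are left.
processed_batches = toy.run_on([1, 2, 3, 4, 5], batch_size=2)
```

Because the actions are stored rather than executed, the same description can be reused with different data or a different batch size.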

## Run pipeline

To start the calculation we pass an ECG dataset into the pipeline and call the action ```run```:

```python
processed = (eds >> preprocess_pipeline).run()
```

Note that the transformed ECGs are NOT assigned to ```processed```, which is again a pipeline. To save the results we should add a ```dump``` action to the pipeline or save the result into a pipeline variable. We explain how to work with pipeline variables in the next part of the tutorial.
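The following plain-Python sketch illustrates why ```run``` can hand back a pipeline rather than the data. It is an assumption-laden toy, not the library's code: ```ToyPipeline```, its ```variables``` dictionary, and the use of ```__rrshift__``` to support ```>>``` are all invented here for illustration.

```python
class ToyPipeline:
    """Toy pipeline whose run() returns the pipeline itself (hypothetical)."""

    def __init__(self):
        self.actions = []    # functions to apply to every item
        self.dataset = None  # data attached with `dataset >> pipeline`
        self.variables = {}  # toy storage for results kept inside the pipeline

    def add(self, func):
        self.actions.append(func)
        return self

    def __rrshift__(self, dataset):
        # Python calls this for `dataset >> pipeline` when the left operand
        # (here a plain list) does not define `>>` itself. A copy is returned
        # so the pipeline template stays reusable.
        bound = ToyPipeline()
        bound.actions = list(self.actions)
        bound.dataset = dataset
        return bound

    def run(self):
        for item in self.dataset:
            for func in self.actions:
                item = func(item)
            # Results are stored in the pipeline, not returned to the caller.
            self.variables.setdefault("results", []).append(item)
        return self


template = ToyPipeline().add(lambda x: x * 2)
processed = ([1, 2, 3] >> template).run()

# `processed` is a pipeline object; the data must be read from where it was saved.
saved = processed.variables["results"]
```

Returning the pipeline keeps calls chainable, which is why results have to be written somewhere explicitly.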

## Add custom action

Suppose you want to include in a pipeline a new action called ```add_value``` that adds a given value to each signal.
We start with two decorators. The decorator ```action``` makes the function usable in a pipeline. The decorator ```inbatch_parallel``` splits the batch into individual ECGs and processes each ECG in a separate thread. Under these decorators the function ```add_value``` receives the index of an ECG and every parameter passed from the pipeline. The function ```add_value``` does two things: it gets the position of the ECG in the batch by its index and adds the value to the signal. Combining these pieces, we obtain the desired action:

```python
@ds.action
@ds.inbatch_parallel(init="indices", target="threads")
def add_value(self, index, value):
    i = self.get_pos(None, "signal", index)
    self.signal[i] += value
```
The action ```add_value``` can now be included in a pipeline:
```python
ppl = (eds.pipeline()
       .do_some_actions_before()
       .add_value(0.01)
       .do_some_actions_after())
```
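To give some intuition for what a per-item, thread-based decorator does, here is a minimal sketch built only on the Python standard library. It does not reproduce ```inbatch_parallel```: the decorator ```per_item_in_threads``` and the class ```ToyBatch``` (including its one-argument ```get_pos```) are hypothetical and exist only for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import wraps


def per_item_in_threads(method):
    """Hypothetical decorator: call `method` once per batch index, in threads."""

    @wraps(method)
    def wrapper(self, *args, **kwargs):
        with ThreadPoolExecutor() as pool:
            # One task per index; extra arguments are forwarded unchanged.
            futures = [pool.submit(method, self, index, *args, **kwargs)
                       for index in self.indices]
            for future in futures:
                future.result()  # re-raise any exception from a worker thread
        return self  # return the batch so that calls can be chained

    return wrapper


class ToyBatch:
    """Toy batch holding indices and one signal value per index (an assumption)."""

    def __init__(self, indices, signals):
        self.indices = list(indices)
        self.signal = list(signals)

    def get_pos(self, index):
        # Position of the item in the batch, looked up by its index.
        return self.indices.index(index)

    @per_item_in_threads
    def add_value(self, index, value):
        i = self.get_pos(index)
        self.signal[i] += value


batch = ToyBatch(["A01", "A02", "A03"], [1.0, 2.0, 3.0])
batch.add_value(0.5)
```

Each call works on a different position of ```signal```, so the threads do not write to the same element.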

In the next [part 3]() we will give an overview of the various methods of EcgBatch.