# ```auto_pipeline()```
----------------

Say you want to create a quick linear pipeline that takes events from one data source, transforms them, and sends them on to a data sink. Because the flow is a straight line, we can treat the Jupyter notebook itself as the pipeline. In the simplest case, events come in at the top of the notebook, are processed, and come out at the bottom.

Because you often need to do some imports and setup before launching the pipeline, the notebook is actually divided into two sections: the setup section and the pipeline section.

`auto_pipeline()` lets you turn simple Jupyter notebooks into pipelines.


To use it, call `auto_pipeline(source=<source>, sink=<sink>)` at some point in your notebook; the cells after that call become processors in the pipeline. The special variable `event` is set in the pipeline after the `auto_pipeline` call, and its value is sent to the sink at the end of the pipeline.
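
The split between a one-time setup section and a per-event pipeline section can be pictured with a small plain-Python model. This is only a conceptual sketch, not how `bspump` implements `auto_pipeline`: the helper `run_linear_pipeline`, the three processor functions, and the list used as a sink are all hypothetical names invented for illustration.

```python
import json


def run_linear_pipeline(source_events, processors, sink):
    """Toy stand-in for a linear pipeline: source -> processors -> sink."""
    for event in source_events:
        # Each processor plays the role of one notebook cell after auto_pipeline().
        for process in processors:
            event = process(event)
        # Whatever `event` holds after the last processor goes to the sink.
        sink.append(event)


# "Setup section": evaluated once, before any event arrives.
suffix = "!"


# "Pipeline section": evaluated again for every incoming event.
def decode(event):
    return json.loads(event.decode("utf8"))


def shout(event):
    event["foo"] = event["foo"].upper() + suffix
    return event


def encode(event):
    return json.dumps(event).encode()


sink = []
run_linear_pipeline(
    [b'{"foo": "bar"}', b'{"foo": "baz"}'],
    [decode, shout, encode],
    sink,
)
print(sink)  # [b'{"foo": "BAR!"}', b'{"foo": "BAZ!"}']
```

In the notebook you do not write any of this plumbing yourself: the cells after the `auto_pipeline` call take the place of the processor functions, and `event` takes the place of the value passed between them.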


----------------
### Setup: Imports & Connections

This section runs once at launch and pulls in the ```bspump``` components and any connectors you need.

In [None]:
from bspump.jupyter import *
import bspump.kafka
import json

In [None]:
some_constant = 3

----------------
### Registering a named connection in ```bspump```:

In [None]:
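# Register a Kafka connection under the ID "KafkaConnection";
# the source and sink below refer to it by that ID.
@register_connection
def connection(app):
    return bspump.kafka.KafkaConnection(app, "KafkaConnection")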

----------------
### Sample Event (for local testing)
A small JSON or bytes buffer you define so you can run the transformations interactively in Jupyter:

In [None]:
# Define a sample event to test the pipeline.
event = b"""{"foo":"bap"}"""

When you “Run All” in Jupyter, this event simulates the first incoming record.


----------------
### Pipeline Definition (```auto_pipeline```)
We use `auto_pipeline` to mark the start of the *pipeline section*. We also specify the source and sink for our pipeline at this point.

In [None]:
auto_pipeline(
    source=lambda app, pipeline: bspump.kafka.KafkaSource(app, pipeline, "KafkaConnection"),
    sink=lambda app, pipeline: bspump.kafka.KafkaSink(app, pipeline, "KafkaConnection")
)

----------------
### Pipeline section
Every cell after the `auto_pipeline` call is rerun each time an event comes in. At run time, the `event` variable is automatically set to the value of the event that comes from the source.

We can apply whatever transformations we please. Whatever value `event` holds at the end of the notebook is automatically sent to the sink.

In [None]:
event = json.loads(event.decode("utf8"))
event

In [None]:
event["foo"] = event["foo"].upper()
event

In [None]:
event["foo"] = (" " * some_constant).join(reversed(list(event["foo"])))
event

In [None]:
event = json.dumps(event).encode()
event

----------------
### Testing
You can run the notebook yourself by typing ```bitswan notebook AutoPipeline/main.ipynb```. This connects to Kafka, and if you send events to Kafka, you will see them flow through the pipeline.
Alternatively, you can test it directly in the notebook by executing the cells with different events.
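
If re-running cells by hand for each event gets tedious, one option is to copy the logic of the pipeline cells into a throwaway helper and loop over a few sample payloads. The sketch below does that in plain Python with no Kafka or `bspump` involved. The helper name `run_cells` and the second sample event are assumptions made for illustration; `some_constant` mirrors the value from the setup section.

```python
import json

some_constant = 3  # same value as in the setup section


def run_cells(event: bytes) -> bytes:
    """Apply the four pipeline cells to one event, in notebook order."""
    event = json.loads(event.decode("utf8"))
    event["foo"] = event["foo"].upper()
    event["foo"] = (" " * some_constant).join(reversed(list(event["foo"])))
    return json.dumps(event).encode()


# The first sample is the notebook's own test event; the second is made up.
samples = [b'{"foo":"bap"}', b'{"foo":"hello"}']
results = [run_cells(sample) for sample in samples]

for sample, result in zip(samples, results):
    print(sample, "->", result)
# b'{"foo":"bap"}' -> b'{"foo": "P   A   B"}'
# b'{"foo":"hello"}' -> b'{"foo": "O   L   L   E   H"}'
```

A helper like this belongs in a scratch notebook or a test file rather than after the `auto_pipeline` call, since every cell in the pipeline section is rerun for each event.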