`auto_pipeline`
----------------

Say you want to create a quick linear pipeline that takes events from one data source, transforms them, and sends them on to a data sink. Because the flow is a straight line, we can treat the Jupyter notebook itself as the pipeline. In the simplest case, events come in at the top of the notebook, are processed, and come out at the bottom.

Because you may want to do some imports or setup before launching the pipeline, the notebook is in practice divided into two sections: the setup section and the pipeline section.

`auto_pipeline()` lets you turn simple Jupyter notebooks into pipelines.


To use it, call `auto_pipeline(source=<source>, sink=<sink>)` at some point in your notebook; the cells after that call become processors in the pipeline. The special variable `event` is set in the pipeline after the `auto_pipeline` call, and its final value is sent to the sink at the end of the pipeline.
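The split between cells that run once and cells that run per event can be pictured with a small plain-Python model. This is only a conceptual sketch and not how bspump implements `auto_pipeline`: the names `setup_cells`, `pipeline_cells`, and `run_notebook` are hypothetical, and each string stands in for one notebook cell.

```python
# Conceptual model only: this is NOT how bspump implements auto_pipeline.
# Each string stands in for one notebook cell (hypothetical example cells).
setup_cells = [
    "import json",
    "foovar = 3",
]
pipeline_cells = [
    "event = json.loads(event.decode('utf8'))",
    "event['foovar'] = foovar",
]


def run_notebook(source_events):
    """Run the setup cells once, then the pipeline cells once per event."""
    namespace = {}
    for cell in setup_cells:
        exec(cell, namespace)  # setup section: executed a single time

    sink = []
    for raw_event in source_events:
        namespace["event"] = raw_event  # the special variable, set per event
        for cell in pipeline_cells:
            exec(cell, namespace)  # pipeline section: rerun for every event
        sink.append(namespace["event"])  # final value of `event` goes to the sink
    return sink


results = run_notebook([b'{"foo":"aaa"}', b'{"foo":"aab"}'])
print(results)
```

In this model the list `sink` plays the role of the data sink, and the byte strings passed to `run_notebook` play the role of the source.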


Setup section 
-------------

This section runs once, at launch.

In [None]:
from bspump.jupyter import *
from bspump.unittest import ProcessorTestCase, TestCase
from bspump.test import TestSink, TestSource
from bspump.trigger import CronTrigger
from bspump.abc.source import TriggerSource
from bspump.http.web.server import *
from datetime import datetime
import json

class MyTestSink(TestSink):
    pass

class MyTestSource(TestSource):
    async def cycle(self, *args, **kwargs):
        await self.Pipeline.ready()
        event = {"time_triggered": datetime.now().timestamp()}
        await self.Pipeline.process(event)


In [None]:
foovar=3

In [None]:
# Define a sample event and a set of test cases for the pipeline.
event = b"""{"foo":"bap"}"""
test_events = {
    b"""{"foo":"aaa"}""": {
        "expect": [{"foo": "aaa", "foovar": 3, "barvar": "hello", "bazvar": "aaa"}]
    },
    b"""{"foo":"aab"}""": {
        "expect": [{"foo": "aab", "foovar": 3, "barvar": "hello", "bazvar": "aab"}]
    }
}


We use `auto_pipeline` to mark the start of the *pipeline section*. We also specify the source and sink for our pipeline at this point.

In [None]:
auto_pipeline(
    source=lambda app, pipeline: TestSource(app, pipeline, "TestSource"),
    sink=lambda app, pipeline: TestSink(app, pipeline, "TestSink")
)


The test currently fails here. Presumably the subsequent block is treated as a processor step, so the variables it defines are limited in scope. Those variables should instead be allowed to propagate to subsequent steps.

In [None]:
barvar="hello"
bazvar=event["foo"]
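The scoping concern can be illustrated in plain Python. The sketch below assumes, purely for illustration, that each cell is executed as its own step; it does not reflect bspump's actual internals, and `make_isolated_step`, `make_shared_step`, and `run` are hypothetical helpers. It contrasts a step that runs in a private copy of the namespace with one that runs in a shared namespace.

```python
# Hypothetical illustration of the scoping issue; not bspump's real internals.

def make_isolated_step(cell_source):
    """Wrap a cell so that names it assigns stay private to that step."""
    def step(event, shared):
        local_ns = dict(shared, event=event)  # copy: new names are discarded
        exec(cell_source, local_ns)
        return local_ns["event"]
    return step


def make_shared_step(cell_source):
    """Wrap a cell so that names it assigns remain visible to later steps."""
    def step(event, shared):
        shared["event"] = event
        exec(cell_source, shared)  # same dict: new names persist
        return shared["event"]
    return step


cells = ['barvar = "hello"', 'event["barvar"] = barvar']


def run(make_step):
    """Feed one sample event through every cell using the given wrapper."""
    shared = {}
    event = {"foo": "aaa"}
    for cell in cells:
        event = make_step(cell)(event, shared)
    return event


try:
    run(make_isolated_step)
    isolated_outcome = "ok"
except NameError as exc:
    isolated_outcome = f"NameError: {exc}"

shared_outcome = run(make_shared_step)
print(isolated_outcome)
print(shared_outcome)
```

With isolated steps the second cell cannot see `barvar` and raises a `NameError`; with a shared namespace the variable propagates and the event is enriched as intended.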

Pipeline section
----------------

Everything after this is rerun every time an event comes in. At run time, the `event` variable is automatically set with the value of the event that comes from the source.

We can do whatever transformations we please, and then, by setting `event` at the end of the notebook, the value of `event` will automatically be sent to the sink.
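As a rough check of the intended transformation outside any pipeline, the steps can be written as one ordinary function and compared against expectations shaped like the notebook's `test_events`. The function `process` is a hypothetical stand-in for the pipeline cells, not part of bspump, and it assumes `bazvar` is read only after the event has been decoded.

```python
import json

foovar = 3  # setup-section value, as in the notebook

# Same shape as the notebook's `test_events`: raw input mapped to expected sink output.
test_events = {
    b'{"foo":"aaa"}': {
        "expect": [{"foo": "aaa", "foovar": 3, "barvar": "hello", "bazvar": "aaa"}]
    },
    b'{"foo":"aab"}': {
        "expect": [{"foo": "aab", "foovar": 3, "barvar": "hello", "bazvar": "aab"}]
    },
}


def process(event):
    """Hypothetical stand-in for the pipeline cells, written as one function."""
    event = json.loads(event.decode("utf8"))  # bytes -> dict
    barvar = "hello"
    bazvar = event["foo"]  # assumption: read only after decoding
    event["foovar"] = foovar
    event["barvar"] = barvar
    event["bazvar"] = bazvar
    return event


# Collect every raw input whose processed result differs from its expectation.
failures = [
    raw for raw, case in test_events.items()
    if [process(raw)] != case["expect"]
]
print(failures)  # an empty list means every sample matched
```

Each expectation is a list because a single input event could, in principle, yield more than one event at the sink; here every input yields exactly one.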

In [None]:
event = json.loads(event.decode("utf8"))
foovar
barvar
bazvar
event

{'foo': 'bap'}

In [None]:
event["foovar"] = foovar
event["barvar"] = barvar
event["bazvar"] = bazvar
event

{'foo': 'BAP'}