# How real time pipelines work

A graph is composed of the following:

* Step: A Step runs a function or class handler or a REST API call. MLRun comes with a list of [pre-built steps](./available-steps.md) that include data manipulation, readers, writers and model serving. You can also write your own steps using 
    standard Python functions or custom functions/classes, or call an external REST API (the special `$remote` class).
* Router: A special type of step is a router with routing logic and multiple child routes/models. The basic 
    routing logic is to route to the child routes based on the Event.path. More advanced or custom routing can be used,
    for example, the ensemble router sends the event to all child routes in parallel, aggregates the result and responds.
* Queue: A queue or stream that accepts data from one or more source steps and publishes to one or more output steps. 
    Queues are best used to connect independent functions/containers. A queue can run in-memory or be implemented using a stream, which allows it to span processes/containers.
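
The two routing behaviours described above can be sketched in plain Python. This is a conceptual illustration only, not MLRun's router implementation: the `Event` dataclass, `route_by_path` and `ensemble_mean` are hypothetical names, the path is simplified to a single segment, and averaging is just one possible way to aggregate the child results.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Any, Callable, Dict


# Hypothetical stand-in for the serving Event; only the fields used here.
@dataclass
class Event:
    body: Any
    path: str = "/"


Route = Callable[[Any], float]


def route_by_path(event: Event, routes: Dict[str, Route]) -> float:
    """Basic routing: pick the child route named by the first path segment."""
    name = event.path.strip("/").split("/")[0]
    if name not in routes:
        raise KeyError(f"no child route named {name!r}")
    return routes[name](event.body)


def ensemble_mean(event: Event, routes: Dict[str, Route]) -> float:
    """Ensemble-style routing: call every child route and average the results.

    The routes are called sequentially here to keep the sketch short.
    """
    return mean(route(event.body) for route in routes.values())


# Two toy "models" registered as child routes.
routes = {
    "m1": lambda x: x * 2.0,
    "m2": lambda x: x + 1.0,
}

single = route_by_path(Event(body=3.0, path="/m1"), routes)
combined = ensemble_mean(Event(body=3.0), routes)
```

Here `single` is produced by `m1` alone, while `combined` is the average of both child routes for the same input.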
    
The Graph server has two modes of operation (topologies):

* Router topology (default): A minimal configuration with a single router and child tasks/routes. This can be used for simple model serving or single hop configurations.
* Flow topology: A full graph/DAG. The flow topology is implemented using two engines: `async` (the default),
which is based on [Storey](https://github.com/mlrun/storey) and an asynchronous event loop; and `sync`, which supports a simple
sequence of steps.

The first graph element accepts an `Event` object, transforms/processes the event and passes the result to the next steps 
in the graph. The final result can be written out to some destination (file, DB, stream, etc.) 
or returned to the caller (one of the graph steps can be marked with `.respond()`).
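
As a rough mental model, a simple sequence of steps with one responding step can be sketched as below. This is not how the MLRun engines are implemented; `run_flow`, `respond_at` and the `written` list (standing in for a file, DB or stream target) are hypothetical names used only for illustration.

```python
from typing import Any, Callable, List, Optional, Tuple

Step = Tuple[str, Callable[[Any], Any]]

written: List[Any] = []  # stands in for a file, DB or stream target


def store(body: Any) -> Any:
    """Final step: write the result out and pass it along unchanged."""
    written.append(body)
    return body


def run_flow(steps: List[Step], body: Any, respond_at: Optional[str] = None) -> Any:
    """Run the steps in order; return the output of the responding step.

    If no step is marked as the responder, the last step's output is returned.
    """
    response = None
    for name, handler in steps:
        body = handler(body)
        if name == respond_at:
            response = body  # this is what the caller gets back
    return response if respond_at is not None else body


steps: List[Step] = [
    ("double", lambda x: x * 2),
    ("increment", lambda x: x + 1),
    ("store", store),
]

response = run_flow(steps, 5, respond_at="increment")
```

The caller receives the output of the `increment` step, while the `store` step still runs and records the same value in `written`.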

## The Event object

The Graph state machine accepts an Event object (similar to a Nuclio Event) and passes 
it along the pipeline. An Event object hosts the event `body` along with other attributes 
such as `path` (http request path), `method` (GET, POST, etc.), and `id` (unique event ID).

In some cases the events represent a record with a unique `key`, which can be read/set 
through `event.key`. Records have an associated `event.time` that, by default, is 
the arrival time, but can also be set by a step.

Task steps are called with the `event.body` by default. If a task step needs to 
read or set other event elements (key, path, time, etc.), set the task `full_event`
argument to `True`.
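
The difference between the two calling conventions can be illustrated with a small stand-alone sketch. The `Event` dataclass and the `run_task` helper below are hypothetical stand-ins written for this example; they are not the MLRun classes.

```python
import uuid
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


# Hypothetical stand-in for the serving Event.
@dataclass
class Event:
    body: Any
    key: Optional[str] = None
    path: str = "/"
    id: str = field(default_factory=lambda: uuid.uuid4().hex)


def run_task(handler: Callable, event: Event, full_event: bool = False) -> Event:
    """Call the handler with event.body, or with the whole event if full_event."""
    if full_event:
        return handler(event)  # the handler reads/updates the event itself
    event.body = handler(event.body)  # the handler only sees the body
    return event


def double(body: int) -> int:
    """Body-only handler: knows nothing about keys, paths or IDs."""
    return body * 2


def set_key(event: Event) -> Event:
    """Full-event handler: derives the record key from the body."""
    event.key = str(event.body["user"])
    return event


e1 = run_task(double, Event(body=21))
e2 = run_task(set_key, Event(body={"user": "u7"}), full_event=True)
```

`double` can only transform the body, whereas `set_key` needs `full_event=True` because it writes `event.key`.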

Task steps support optional `input_path` and `result_path` attributes that allow controlling which portion of 
the event is sent as input to the step, and where to update the returned result.

For example, for an event body `{"req": {"body": "x"}}`, `input_path="req.body"` and `result_path="resp"`, 
the step gets `"x"` as the input. The output after the step is `{"req": {"body": "x"}, "resp": <step output>}`.
Note that `input_path` and `result_path` do not work together with `full_event=True`.
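
The example above can be reproduced with a short, self-contained sketch of the dotted-path logic. The helpers `get_path`, `set_path` and `run_with_paths` are hypothetical and written only for this illustration; MLRun's own implementation may differ in details such as error handling.

```python
import copy
from typing import Any, Callable, Optional


def get_path(body: dict, path: str) -> Any:
    """Read a nested value using a dotted path such as "req.body"."""
    for part in path.split("."):
        body = body[part]
    return body


def set_path(body: dict, path: str, value: Any) -> None:
    """Write a nested value, creating intermediate dicts as needed."""
    *parents, last = path.split(".")
    for part in parents:
        body = body.setdefault(part, {})
    body[last] = value


def run_with_paths(
    handler: Callable[[Any], Any],
    body: dict,
    input_path: Optional[str] = None,
    result_path: Optional[str] = None,
) -> Any:
    """Select the step input with input_path and place the result at result_path."""
    body = copy.deepcopy(body)  # do not mutate the caller's dict
    step_input = get_path(body, input_path) if input_path else body
    result = handler(step_input)
    if result_path is None:
        return result  # the result replaces the whole body
    set_path(body, result_path, result)
    return body


out = run_with_paths(
    str.upper,
    {"req": {"body": "x"}},
    input_path="req.body",
    result_path="resp",
)
```

With `str.upper` as the step, the handler receives `"x"` and the returned value is stored under `resp` next to the untouched `req` entry.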

## The Context object

The step classes are initialized with a `context` object (when they have `context` in their `__init__` args).
The context is used to pass data and for interfacing with system services. The context object has the 
following attributes and methods.

Attributes:
* **logger**: Central logger (Nuclio logger when running in Nuclio).
* **verbose**: True if in verbose/debug mode.
* **root**: The graph object.
* **current_function**: When running in a distributed graph, the current child function name.

Methods:
* **get_param(key, default=None)**: Get the graph parameter by key. Parameters are set at the
  serving function (e.g. `function.spec.parameters = {"param1": "x"}`).
* **get_secret(key)**: Get the value of a project/user secret.
* **get_store_resource(uri, use_cache=True)**: Get the MLRun store object (data item, artifact, model, feature set, feature vector).
* **get_remote_endpoint(name, external=False)**: Return the remote Nuclio/serving function http(s) endpoint given its [project/]function-name[:tag].
* **Response(headers=None, body=None, content_type=None, status_code=200)**: Create a Nuclio response object, for returning detailed http responses.

Example, using the context:

    if self.context.verbose:
        self.context.logger.info('my message', some_arg='text')
    x = self.context.get_param('x', 0)
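
To show where such a snippet typically lives, here is a minimal sketch of a step class that receives the context in `__init__` and uses it in its handler method. `StubLogger` and `StubContext` are hypothetical test doubles written for this example so that it runs without a serving runtime; they mimic only the `logger`, `verbose` and `get_param` members listed above. The class name `AddOffset` and the handler name `do` are also illustrative choices.

```python
class StubLogger:
    """Hypothetical logger double that records calls instead of printing."""

    def __init__(self):
        self.records = []

    def info(self, message, **kwargs):
        self.records.append((message, kwargs))


class StubContext:
    """Hypothetical context double exposing logger, verbose and get_param."""

    def __init__(self, parameters=None, verbose=False):
        self.logger = StubLogger()
        self.verbose = verbose
        self._parameters = parameters or {}

    def get_param(self, key, default=None):
        return self._parameters.get(key, default)


class AddOffset:
    """Example step: adds the graph parameter 'x' to the incoming body."""

    def __init__(self, context=None, name=None, **kwargs):
        # The context is passed in because 'context' is in the __init__ args.
        self.context = context
        self.name = name
        self.offset = context.get_param("x", 0)

    def do(self, body):
        if self.context.verbose:
            self.context.logger.info("my message", some_arg="text")
        return body + self.offset


ctx = StubContext(parameters={"x": 5}, verbose=True)
step = AddOffset(context=ctx, name="add-offset")
result = step.do(10)
```

Reading the parameter once in `__init__` avoids a lookup on every event; reading it inside the handler would work equally well.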

## Building distributed graphs

Graphs can be hosted by a single function (using zero to N containers), or span multiple functions
where each function can have its own container image and resources (replicas, GPUs/CPUs, volumes, etc.).
A graph has a `root` function, which is where you configure triggers (http, incoming stream, cron, etc.), 
and optional downstream child functions.

You can specify the `function` attribute in `Task` or `Router` steps. This indicates where 
this step should run. When the `function` attribute is not specified it runs on the root function.
`function="*"` means the step can run in any of the child functions.

Steps on different functions should be connected using a `Queue` step (a stream).
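
The placement rule can be made concrete with a small checker that flags edges crossing a function boundary without a queue in between. This is purely an illustrative sketch: `StepSpec`, `placement` and `missing_queues` are hypothetical names, not part of MLRun, and the `"*"` wildcard is not handled.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

ROOT = "<root>"  # hypothetical marker for "runs on the root function"


@dataclass
class StepSpec:
    name: str
    kind: str = "task"              # "task", "router" or "queue"
    function: Optional[str] = None  # None means the root function
    after: Optional[str] = None     # name of the preceding step


def placement(step: StepSpec) -> str:
    """Resolve where a step runs: its own function, else the root function."""
    return step.function or ROOT


def missing_queues(steps: List[StepSpec]) -> List[Tuple[str, str]]:
    """Return (source, target) edges that cross functions without a queue."""
    by_name = {s.name: s for s in steps}
    problems = []
    for step in steps:
        if step.after is None:
            continue
        prev = by_name[step.after]
        if "queue" in (prev.kind, step.kind):
            continue  # a queue on either side bridges the functions
        if placement(prev) != placement(step):
            problems.append((prev.name, step.name))
    return problems


direct = [
    StepSpec("pre-process"),
    StepSpec("extract", function="enrich", after="pre-process"),
]
bridged = [
    StepSpec("pre-process"),
    StepSpec("q1", kind="queue", after="pre-process"),
    StepSpec("extract", function="enrich", after="q1"),
]

direct_problems = missing_queues(direct)
bridged_problems = missing_queues(bridged)
```

In the `direct` layout the `pre-process` step on the root function feeds a step on the `enrich` child function directly, so the edge is reported; adding the `q1` queue step resolves it.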

**Adding a child function:**

```python
fn.add_child_function('enrich', 
                      './entity_extraction.ipynb', 
                      image='mlrun/mlrun',
                      requirements=["storey", "sklearn"])
```

See a [complete example](./model-serving-get-started.html#example-nlp-processing-pipeline-with-real-time-streaming).  

## Error handling and catchers

Graph steps can raise an exception, and you might want to have an error handling flow.
You can specify an exception handling step/branch that triggers on error.
The error handler step receives the event that entered the failed step, with two extra
attributes: `event.origin_state` indicates the name of the failed step, and `event.error`
holds the error string.

Use `graph.error_handler()` (applies to all steps) or `step.error_handler()` 
(applies to a specific step) if you want the error from the graph or the step to be 
fed into a specific step (catcher).
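
Conceptually, the mechanism resembles the following stand-alone sketch: when a step fails, the event that entered it is annotated and handed to the catcher. `run_step`, `parse_number` and `catcher` are hypothetical helpers written for this illustration, with a `SimpleNamespace` standing in for the event; the actual graph engine is more involved.

```python
from types import SimpleNamespace


def run_step(name, handler, event, catcher=None):
    """Run one step; on failure, pass the incoming event to the catcher."""
    try:
        event.body = handler(event.body)
        return event
    except Exception as exc:
        if catcher is None:
            raise  # no error handler configured: propagate the exception
        event.origin_state = name  # name of the failed step
        event.error = str(exc)     # error string
        return catcher(event)


def parse_number(body):
    """Step that fails on non-numeric input."""
    return float(body)


def catcher(event):
    """Error handler: turn the failure into a structured body."""
    event.body = {"failed": event.origin_state, "reason": event.error}
    return event


ok = run_step("parse", parse_number, SimpleNamespace(body="1.5"), catcher)
bad = run_step("parse", parse_number, SimpleNamespace(body="abc"), catcher)
```

The successful event continues with the parsed value, while the failed one reaches the catcher with `origin_state` and `error` set.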

Example, setting an error catcher per step: 

    graph.add_step("MyClass", name="my-class", after="pre-process").error_handler("catcher")
    graph.add_step("ErrHandler", name="catcher", full_event=True, after="")
    
```{note}
Additional steps can follow the `catcher` step.
```
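
A catcher class such as the `ErrHandler` referenced above might look roughly like the sketch below. The class body is an assumption made for illustration (the source does not show it); it relies only on the `origin_state` and `error` attributes described earlier, and a `SimpleNamespace` stands in for the event so the snippet runs on its own.

```python
from types import SimpleNamespace


class ErrHandler:
    """Hypothetical catcher step; it is added with full_event=True,
    so its handler receives the whole event rather than only the body."""

    def __init__(self, context=None, name=None, **kwargs):
        self.context = context
        self.name = name

    def do(self, event):
        # Replace the body with a structured error report that keeps
        # the original input for later inspection.
        event.body = {
            "failed_step": event.origin_state,
            "error": str(event.error),
            "input": event.body,
        }
        return event


# Stand-in for an event that failed in the "my-class" step.
failed = SimpleNamespace(
    body={"text": "hello"},
    origin_state="my-class",
    error="ValueError: bad input",
)
handled = ErrHandler(name="catcher").do(failed)
```

Keeping the original input in the report makes it easier for any steps that follow the catcher to retry or store the failed record.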

See the [full example](./use-cases.html#example-advanced-data-processing-and-serving-ensemble).

**Exception stream:**

The graph errors/exceptions can be pushed into a special error stream. This is very convenient 
in the case of distributed and production graphs. 

Set the exception stream address (using a v3io stream URI):

    function.spec.error_stream = 'users/admin/my-err-stream'

Another example, using an existing graph:

    graph2.plot(rankdir='LR')

The `graph2_enrich` step was kept in a variable when the graph was built, so you can add an error handler to it as follows:

    graph2_enrich.error_handler("catcher")
    graph2.add_step("ErrHandler", name="catcher", full_event=True, after="")

Now, display the graph again to see the result:

    graph2.plot(rankdir='LR')
