# 2. Incoming Event Handler
  --------------------------------------------------------------------

Handle incoming events and write them to an output V3IO Stream `incoming-events-stream`.
The received data is partitioned by `user_id`.
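
Partitioning by `user_id` keeps all events of one user on the same shard. The sketch below illustrates the idea with a stable hash taken modulo the shard count. The platform's actual hashing scheme is not described in this notebook, so the MD5-based mapping, the `shard_for_key` helper, the shard count, and the sample events are all assumptions made for illustration.

```python
import hashlib


def shard_for_key(partition_key: str, shard_count: int) -> int:
    """Map a partition key to a shard index with a stable hash (illustrative scheme)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


# Hypothetical events; the first and third share a user_id.
events = [
    {"user_id": 111111, "event_type": "spin"},
    {"user_id": 222222, "event_type": "bet"},
    {"user_id": 111111, "event_type": "win"},
]

# Events with the same user_id always map to the same shard index.
shards = [shard_for_key(str(e["user_id"]), shard_count=8) for e in events]
```

Because the mapping depends only on the key, per-user ordering is preserved within a shard regardless of how many producers write to the stream.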

![Model deployment with streaming Real-time operational Pipeline](../../assets/images/model-deployment-with-streaming.png)

The remaining notebooks in this demo depend only on the output stream of this notebook, so you can change the input data without affecting the rest of the workflow.

## Initialize

Load the project.

In [1]:
from mlrun import load_project
from os import path

project_path = path.abspath('conf')
project = load_project(project_path)

Get the path of the generated stream. This stream is the input of the function; the function writes its output to the incoming-events stream, which other functions consume later.

In [2]:
input_stream = project.params.get('STREAM_CONFIGS').get('generated-stream')
input_stream_path =  input_stream.get('path')
print(f'Input stream path: {input_stream_path}')

Input stream path: iguazio/examples/model-deployment-with-streaming/data/generated-stream


Nuclio leverages consumer groups. When one or more Nuclio replicas join a consumer group, each replica receives its equal share of the shards, based on the number of replicas that are defined in the function.
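
As a rough illustration of "an equal share of the shards", the following sketch spreads shard IDs across replicas in round-robin order. This is not Nuclio's actual assignment algorithm; the `assign_shards` helper and the shard and replica counts are hypothetical.

```python
from typing import Dict, List


def assign_shards(shard_count: int, replica_count: int) -> Dict[int, List[int]]:
    """Spread shard IDs across replicas as evenly as possible (illustrative only)."""
    assignment: Dict[int, List[int]] = {replica: [] for replica in range(replica_count)}
    for shard_id in range(shard_count):
        assignment[shard_id % replica_count].append(shard_id)
    return assignment


# Hypothetical setup: 8 shards shared by 2 replicas, 4 shards each.
assignment = assign_shards(shard_count=8, replica_count=2)
```

When the shard count is not a multiple of the replica count, the shares in this sketch differ by at most one shard.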

We set up the input stream URL below. A consumer-group URL has the form `http://v3io-webapi:8081/<container name>/<stream path>@<consumer group name>`. In this case, we use the `WEB_API_USERS` project parameter as the URL prefix (`http://v3io-webapi:8081/<container name>`) and a consumer group named **`incomingeventhandler`**.

For more information, refer to the [Nuclio v3iostream trigger reference documentation](https://nuclio.io/docs/latest/reference/triggers/v3iostream/).
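
The URL format described above can be sketched with plain string handling. The helper names `consumer_group_url` and `split_consumer_group_url` are hypothetical and are not part of Nuclio or MLRun; the sample values are taken from the outputs shown in this notebook. Unlike `os.path.join`, this approach does not depend on the operating system's path separator.

```python
def consumer_group_url(web_api_prefix: str, stream_path: str, group: str) -> str:
    """Build '<prefix>/<stream path>@<consumer group>' without duplicate slashes."""
    return f"{web_api_prefix.rstrip('/')}/{stream_path.lstrip('/')}@{group}"


def split_consumer_group_url(url: str):
    """Split a consumer-group URL into (stream URL, consumer group name)."""
    stream_url, _, group = url.rpartition("@")
    return stream_url, group


url = consumer_group_url(
    "http://v3io-webapi:8081/users",
    "iguazio/examples/model-deployment-with-streaming/data/generated-stream",
    "incomingeventhandler",
)
stream_url, group = split_consumer_group_url(url)
```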

In [3]:
WEB_API_USERS = project.params.get('WEB_API_USERS')
input_stream_url = path.join(WEB_API_USERS, input_stream_path) + "@incomingeventhandler"
print(f'Input stream URL: {input_stream_url}')

Input stream URL: http://v3io-webapi:8081/users/iguazio/examples/model-deployment-with-streaming/data/generated-stream@incomingeventhandler


Get the incoming-events stream path, this is where we output the data

In [4]:
output_stream = project.params.get('STREAM_CONFIGS').get('incoming-events-stream')
output_stream_path =  output_stream.get('path')
print(f'Output stream path: {output_stream_path}')

Output stream path: iguazio/examples/model-deployment-with-streaming/data/incoming-events-stream


## Create and Test a Local Function 

[Nuclio](https://nuclio.io/) is a high-performance open-source and managed serverless framework, which is available as a predefined tenant-wide platform service (`nuclio`).
The demo uses Nuclio to create and deploy serverless functions.
Therefore, you need to import the Nuclio package and configure Nuclio for your project.

The platform's Jupyter Notebook service preinstalls the [nuclio-jupyter SDK](https://github.com/nuclio/nuclio-jupyter/blob/master/README.md) for creating and deploying Nuclio functions with Python and Jupyter Notebook.
The tutorial uses the Nuclio magic commands and annotation comments of this SDK to automate function code generation.
The magic commands are initialized when you import the `nuclio` package.<br>
The `%nuclio` magic commands are used to run Nuclio commands from Jupyter notebooks (`%nuclio <Nuclio command>`).
You can also use `%%nuclio` at the start of a cell to identify the entire cell as containing Nuclio code.
The `# nuclio: start-code`, `# nuclio: end-code`, and `# nuclio: ignore` section-marker annotations notify Nuclio of the beginning or end of code sections.
Nuclio ignores all notebook code before a `# nuclio: start-code` marker or after a `# nuclio: end-code` marker.
Nuclio translates all other notebook code sections into function code, except for sections that are marked with the `# nuclio: ignore` marker.
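
The marker rules above can be pictured as a filter over the notebook's cells. The following is a simplified sketch of that reading, not the nuclio-jupyter implementation: it assumes each marker appears on the first line of its cell and that start and end markers sit in cells of their own. The `select_function_cells` helper and the sample cells are hypothetical.

```python
START = "# nuclio: start-code"
END = "# nuclio: end-code"
IGNORE = "# nuclio: ignore"


def first_line(source: str) -> str:
    """Return the first line of a cell, or an empty string for an empty cell."""
    stripped = source.strip()
    return stripped.splitlines()[0].strip() if stripped else ""


def select_function_cells(cells):
    """Keep cells between the start and end markers, skipping ignored cells."""
    # Without a start marker, nothing precedes one, so collect from the top.
    collecting = not any(first_line(cell) == START for cell in cells)
    selected = []
    for cell in cells:
        marker = first_line(cell)
        if marker == START:
            collecting = True
        elif marker == END:
            break
        elif collecting and marker != IGNORE:
            selected.append(cell)
    return selected


# Hypothetical notebook cells
cells = [
    "import nuclio",
    "# nuclio: start-code",
    "import os",
    "# nuclio: ignore\nprint('debug')",
    "def handler(context, event):\n    return 'ok'",
    "# nuclio: end-code",
    "handler(None, None)",
]
selected = select_function_cells(cells)
```

In this example only the `import os` cell and the `handler` definition would end up in the function code.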

### Import Nuclio

The following code imports the `nuclio` Python package.

In [5]:
import nuclio

#### Configure Nuclio

The following code uses the `# nuclio: start-code` marker to instruct Nuclio to start processing code only from this location, and then performs basic Nuclio function configuration &mdash; defining the name of the function's container image (`mlrun/ml-models`), the function type (`nuclio`), and some additional package installation commands.

> **Note:** You can add code to define function dependencies and perform additional configuration after the `# nuclio: start-code` marker.

In [6]:
# nuclio: start-code

In [7]:
%nuclio cmd -c pip install v3io

In [8]:
%%nuclio config
spec.build.baseImage = "mlrun/ml-models"
kind = "nuclio"

%nuclio: setting spec.build.baseImage to 'mlrun/ml-models'
%nuclio: setting kind to 'nuclio'


## Function Code

In [9]:
import os
import json

import v3io.dataplane

def init_context(context):
    # Read the function configuration from environment variables
    v3io_access_key = os.getenv('V3IO_ACCESS_KEY')
    web_api = os.getenv('WEB_API')
    container = os.getenv('CONTAINER')
    output_stream_path = os.getenv('OUTPUT_STREAM_PATH')
    partition_attr = os.getenv('PARTITION_ATTR')

    # Create the V3IO client once and reuse it across invocations
    v3io_client = v3io.dataplane.Client(endpoint=web_api, access_key=v3io_access_key)

    # Store the client and settings on the context for use in the handler
    context.v3io_client = v3io_client
    context.partition_attr = partition_attr
    context.container = container
    context.output_stream_path = output_stream_path


def handler(context, event):
    # The event body arrives either as a parsed dict or as raw JSON (str or bytes)
    if isinstance(event.body, dict):
        event_dict = event.body
    else:
        event_dict = json.loads(event.body)

    context.logger.info_with('Got invoked',
                             trigger_kind=event.trigger.kind,
                             event_body=event_dict)

    # Use the configured attribute (for example, user_id) as the partition key
    partition_key = event_dict.get(context.partition_attr)
    record = event_to_record(event_dict, partition_key)

    print(context.output_stream_path)
    # RaiseForStatus.never: failures are reported in the response instead of raised
    resp = context.v3io_client.put_records(container=context.container,
                                           path=context.output_stream_path,
                                           records=[record],
                                           raise_for_status=v3io.dataplane.RaiseForStatus.never)

    context.logger.info_with('Sent event to stream',
                             record=record,
                             response_status=resp.status_code,
                             response_body=resp.body.decode('utf-8'))

    return resp.status_code


def event_to_record(event_dict, partition_key):
    event_str = json.dumps(event_dict)
    return {'data': event_str, 'partition_key': str(partition_key)}

The following cell uses the `# nuclio: end-code` marker to mark the end of a Nuclio code section and instruct Nuclio to stop parsing the notebook at this point.<br>
> **IMPORTANT:** Do not remove the end-code cell.

In [10]:
# nuclio: end-code
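
Outside the function code section, the record format produced by `event_to_record` can be checked in isolation, without a V3IO connection. The sketch below repeats the helper so that it is self-contained; the sample events are hypothetical. It also shows a side effect of `str(partition_key)`: an event that lacks the partition attribute gets the literal key `'None'`.

```python
import json


def event_to_record(event_dict, partition_key):
    # Same logic as in the function code, repeated to keep the sketch self-contained.
    return {'data': json.dumps(event_dict), 'partition_key': str(partition_key)}


event_dict = {"user_id": 111111, "event_type": "spin"}
record = event_to_record(event_dict, event_dict.get("user_id"))

# The payload is a JSON string that decodes back to the original event.
decoded = json.loads(record['data'])

# An event without the partition attribute ends up with the key 'None'.
orphan = event_to_record({"event_type": "spin"}, None)
```

If events without a `user_id` are possible in your data, consider rejecting them or assigning an explicit fallback key, because every such event would otherwise share one partition key.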

## Environment Variables

Define a dictionary with the environment variables that the function uses.

In [11]:
envs = {'V3IO_ACCESS_KEY': os.getenv('V3IO_ACCESS_KEY'),
        'WEB_API' : project.params.get('WEB_API'),
        'CONTAINER': project.params.get('CONTAINER'),
        'OUTPUT_STREAM_PATH': output_stream_path,
        'PARTITION_ATTR': project.params.get('PARTITION_ATTR')}
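
Any project parameter or environment variable that is not set yields `None` in this dictionary, and converting such a value with `str()` produces the string `'None'` rather than an error. A minimal check, assuming a hypothetical `missing_envs` helper and made-up sample values, could look like this:

```python
def missing_envs(envs):
    """Return the names of environment variables whose value is None."""
    return sorted(name for name, value in envs.items() if value is None)


# Hypothetical values for illustration; PARTITION_ATTR is deliberately unset.
example_envs = {
    'WEB_API': 'http://v3io-webapi:8081',
    'CONTAINER': 'users',
    'PARTITION_ATTR': None,
}
missing = missing_envs(example_envs)
if missing:
    print(f'Missing configuration values: {missing}')
```

Running such a check on `envs` before the local test or the deployment surfaces configuration gaps early.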

## Test Locally

In [12]:
event = nuclio.Event(body=b'{"user_id" : 111111 , "event_type": "spin"}')
for key, value in envs.items():
    os.environ[key] = str(value)
init_context(context)
handler(context, event)

Python> 2020-08-24 18:29:25,634 [info] Got invoked: {'trigger_kind': '', 'event_body': {'user_id': 111111, 'event_type': 'spin'}}
iguazio/examples/model-deployment-with-streaming/data/incoming-events-stream
Python> 2020-08-24 18:29:25,636 [info] Sent event to stream: {'record': {'data': '{"user_id": 111111, "event_type": "spin"}', 'partition_key': '111111'}, 'response_status': 200, 'response_body': '{ "FailedRecordCount":0,"Records": [{ "SequenceNumber":1,"ShardId":5 } ] }'}


200
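
The handler returns only the HTTP status code, while the response body logged above also carries a `FailedRecordCount` field. The following sketch parses a body of that shape to detect rejected records. The `failed_record_count` helper is hypothetical, the first body is copied from the log output above, and the second body is a made-up example of a partial failure.

```python
import json


def failed_record_count(response_body: str) -> int:
    """Extract FailedRecordCount from a put-records response body (0 if absent)."""
    return int(json.loads(response_body).get("FailedRecordCount", 0))


# Body copied from the local test output above
ok_body = '{ "FailedRecordCount":0,"Records": [{ "SequenceNumber":1,"ShardId":5 } ] }'
ok_failures = failed_record_count(ok_body)

# Hypothetical body in which two records were rejected
bad_body = '{ "FailedRecordCount":2,"Records": [] }'
bad_failures = failed_record_count(bad_body)
```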

## Nuclio Deploy

### Convert code to function

We use the MLRun `code_to_function` method to convert the Python code to a Nuclio function. We then set the relevant environment variables and the streaming trigger.

In [13]:
from mlrun import code_to_function

gen_func = code_to_function(name='incoming')
project.set_function(gen_func)
incoming_event_handler = project.func('incoming')
incoming_event_handler.set_envs(envs)
incoming_event_handler.add_trigger('incoming',
                                   nuclio.triggers.V3IOStreamTrigger(url=input_stream_url,
                                                                     access_key=os.getenv('V3IO_ACCESS_KEY'),
                                                                     maxWorkers=10))

<mlrun.runtimes.function.RemoteRuntime at 0x7fb1f66cf4d0>

In [14]:
project.save()

### Deploy

In [15]:
# Build the function image and deploy the function
incoming_event_handler.deploy()

> 2020-08-24 18:29:27,123 [info] deploy started
[nuclio] 2020-08-24 18:29:28,226 (info) Build complete
[nuclio] 2020-08-24 18:29:32,277 (info) Function deploy complete
[nuclio] 2020-08-24 18:29:32,284 done creating model-deployment-with-streaming-iguazio-incoming, function address: 3.131.62.169:31720


'http://3.131.62.169:31720'

## Done

Continue to [**2a-stream-to-parquet.ipynb**](2a-stream-to-parquet.ipynb) to store all incoming data in Parquet files.