# Real-Time Stream Viewer (HTTP)
The following function responds to HTTP requests with the last 10 processed messages and their sentiment scores. It reads new records from the enriched stream and keeps the 10 most recent ones. The function uses the nuclio context to cache the latest results and the stream pointer between invocations, so each request reads only the records added since the previous one.

The code is automatically converted into a nuclio (serverless) function that responds to HTTP requests.

The example demonstrates the use of `%nuclio` magic commands to specify environment variables, package dependencies, and configurations, and to deploy functions automatically onto a cluster.


## Initialize nuclio emulation, environment variables and configuration
Use `# nuclio: ignore` to mark sections that don't need to be copied to the function.

In [1]:
# nuclio: ignore
# if the nuclio-jupyter package is not installed run !pip install nuclio-jupyter
import nuclio 

In [2]:
%nuclio env -c V3IO_ACCESS_KEY=${V3IO_ACCESS_KEY}
%nuclio env -c V3IO_USERNAME=${V3IO_USERNAME}
%nuclio env -c V3IO_API=${V3IO_API}
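
Inside the function, values set this way arrive as plain environment variables and are read with `os.getenv`. The block below is a minimal sketch of reading the settings used later in this notebook (`V3IO_CONTAINER`, `STOCKS_STREAM`, `LIMIT`, with the same defaults as the function code) in one place. The `ViewerSettings` class and the `load_settings` helper are hypothetical names for illustration, not part of nuclio or v3io.

```python
import os
from dataclasses import dataclass


@dataclass
class ViewerSettings:
    container: str
    stream_path: str
    limit: int


def load_settings(env=os.environ) -> ViewerSettings:
    """Read viewer settings from a mapping of environment variables, applying defaults."""
    return ViewerSettings(
        container=env.get("V3IO_CONTAINER", "bigdata"),
        stream_path=env.get("STOCKS_STREAM", "stocks/stocks_stream"),
        # Environment values are always strings, so numeric settings need a cast.
        limit=int(env.get("LIMIT", "10")),
    )


# Pass a plain dict to try it out without touching the real environment.
settings = load_settings({"LIMIT": "25"})
print(settings)
```

Casting `LIMIT` up front matters because the value is later used as a slice bound and as a record limit, both of which need an integer.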

### Set function configuration 
Define the function kind and the base image.<br>
For more details, see the [nuclio function configuration reference](https://github.com/nuclio/nuclio/blob/master/docs/reference/function-configuration/function-configuration-reference.md).

In [3]:
%%nuclio config 
kind = "nuclio"
spec.build.baseImage = "mlrun/mlrun"

%nuclio: setting spec.build.baseImage to 'mlrun/mlrun'


### Install required packages
`%nuclio cmd` lets you run image build instructions and install packages.<br>
Note: with the `-c` option, the command runs only in the nuclio build, not locally.

In [4]:
%nuclio cmd -c pip install v3io

## Nuclio function implementation
This function can run in Jupyter or in nuclio (real-time serverless).

In [5]:
import v3io.dataplane
import json
import os

def init_context(context):
    access_key = os.getenv('V3IO_ACCESS_KEY', None)
    setattr(context, 'container', os.getenv('V3IO_CONTAINER', 'bigdata'))
    setattr(context, 'stream_path', os.getenv('STOCKS_STREAM', 'stocks/stocks_stream'))
    
    v3io_client = v3io.dataplane.Client(endpoint=os.getenv('V3IO_API', None), access_key=access_key)
    setattr(context, 'data', [])
    setattr(context, 'v3io_client', v3io_client) 
    # environment variables are strings, so cast the limit to int before it is used for slicing
    setattr(context, 'limit', int(os.getenv('LIMIT', 10)))
    try:
        resp = v3io_client.seek_shard(container=context.container, path=f'{context.stream_path}/0', seek_type='EARLIEST')
        setattr(context, 'next_location', resp.output.location)
    except Exception:
        context.logger.info('Stream not updated yet')
    

    
def handler(context, event):
    
    shard_path = f'{context.stream_path}/0'
    # if the stream did not exist at init time, seek to its start now
    if not hasattr(context, 'next_location'):
        resp = context.v3io_client.seek_shard(container=context.container, path=shard_path, seek_type='EARLIEST')
        setattr(context, 'next_location', resp.output.location)

    # read new records from the saved location, then advance the stream pointer
    resp = context.v3io_client.get_records(container=context.container, path=shard_path, location=context.next_location, limit=context.limit)
    context.next_location = resp.output.next_location
    context.logger.info('location: %s', context.next_location)

    for rec in resp.output.records:
        rec_data = rec.data.decode('utf-8')
        rec_json = json.loads(rec_data)
        context.data.append({'Time': rec_json['time'],
                             'Symbol': rec_json['symbol'],
                             'Sentiment': rec_json['sentiment'],
                             'Link': rec_json['link'],
                             'Content': rec_json['content']})

    context.data = context.data[-context.limit:]
    
    columns = [{'text': key, 'type': 'object'} for key in ['Time', 'Symbol', 'Sentiment', 'Link', 'Content']]
    data = [list(item.values()) for item in context.data]
    response = [{'columns': columns,
                'rows': data,
                'type': 'table'}]

    return response
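
The handler keeps only the last `context.limit` records by slicing the list after each read. As an alternative sketch (not what the notebook code does), a `collections.deque` with `maxlen` gives the same bounded behaviour without the explicit slice, and `reversed` gives a newest-first view if that ordering is wanted. `make_buffer` and `newest_first` are hypothetical helper names.

```python
from collections import deque


def make_buffer(limit: int) -> deque:
    """Create a buffer that drops its oldest item once `limit` items are stored."""
    return deque(maxlen=limit)


def newest_first(buffer: deque) -> list:
    """Return the buffered items with the most recent one first."""
    return list(reversed(buffer))


buffer = make_buffer(3)
for i in range(5):
    # Placeholder records; real records would carry the stream fields.
    buffer.append({"Time": f"t{i}"})

print(list(buffer))          # oldest to newest
print(newest_first(buffer))  # newest to oldest
```

With a limit of 3 and five appended records, only the last three remain in the buffer.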
                            

In [6]:
# nuclio: end-code

## Function invocation
The following section simulates a nuclio function invocation and prints the function results.

In [7]:
# create a test event and invoke the function locally
init_context(context)
event = nuclio.Event(body='')
handler(context, event)

Python> 2020-10-07 14:00:36,471 [info] location: AQAAAGUAAAAnA0BRFgAAAA==


[{'columns': [{'text': 'Time', 'type': 'object'},
   {'text': 'Symbol', 'type': 'object'},
   {'text': 'Sentiment', 'type': 'object'},
   {'text': 'Link', 'type': 'object'},
   {'text': 'Content', 'type': 'object'}],
  'rows': [['2020-09-30 20:25:40',
    'AMZN',
    -0.5384615384615384,
    'https://www.investing.com/news/stock-market-news/fcc-commissioner-calls-for-new-scrutiny-of-undersea-data-cables-2312181',
    'By David Shepardson\nWASHINGTON (Reuters) - A member of the U.S. Federal Communications Commission on Wednesday called for new scrutiny of undersea cables that transmit nearly all the world s internet data traffic.\n"We must take a closer look at cables with landing locations in adversary countries," FCC Commissioner Geoffrey Starks said Wednesday at a commission meeting. "This includes the four existing submarine cables connecting the US and China, most of which are partially owned by Chinese state-owned companies."\nThe United States has repeatedly expressed concerns ab
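
The response is a one-element list holding a table with `columns`, `rows`, and `type` keys. The handler builds each row with `list(item.values())`, which relies on every record dict having its keys in the same order as the column list. The sketch below shows a variant that looks up each column by name instead; `to_table` is a hypothetical helper, and the sample record is made up for illustration.

```python
COLUMNS = ["Time", "Symbol", "Sentiment", "Link", "Content"]


def to_table(records, columns=COLUMNS):
    """Convert a list of record dicts into the columns/rows table structure."""
    header = [{"text": name, "type": "object"} for name in columns]
    # Look up each column by name so row order does not depend on dict key order.
    rows = [[record.get(name) for name in columns] for record in records]
    return [{"columns": header, "rows": rows, "type": "table"}]


# Made-up sample record with keys deliberately out of column order.
sample = [{"Symbol": "AMZN", "Time": "2020-09-30 20:25:40", "Sentiment": -0.5,
           "Link": "https://example.com/a", "Content": "sample text"}]
table = to_table(sample)
print(table[0]["rows"][0])
```

A missing key becomes `None` in the row rather than shifting the remaining values into the wrong columns.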

## Deploy a function onto a cluster
The `%nuclio deploy` command deploys functions to a cluster. Make sure the notebook is saved before running it.<br>See the help (`%nuclio help deploy`) for more information. The cell below uses mlrun's `code_to_function` to export the function spec and deploy it.

In [8]:
from mlrun import code_to_function

# Export the bare function
fn = code_to_function('stream-viewer',
                      handler='handler')
fn.export('03-stream-viewer.yaml')

# Set parameters for current deployment
fn.set_envs({'V3IO_CONTAINER': 'bigdata',
             'STOCKS_STREAM': 'stocks/stocks_stream'})
fn.spec.max_replicas = 2

> 2020-10-07 14:00:45,434 [info] function spec saved to path: 03-stream-viewer.yaml


<mlrun.runtimes.function.RemoteRuntime at 0x7f47c1fca890>

In [9]:
addr = fn.deploy(project='stocks')

> 2020-10-07 14:00:52,039 [info] deploy started
[nuclio] 2020-10-07 14:00:54,156 (info) Build complete
[nuclio] 2020-10-07 14:00:58,202 (info) Function deploy complete
[nuclio] 2020-10-07 14:00:58,209 done updating stocks-stream-viewer, function address: 192.168.224.209:30617


In [10]:
# nuclio: ignore
# test the new API endpoint using the address returned by deploy
!curl {addr}

[{"columns": [{"text": "Time", "type": "object"}, {"text": "Symbol", "type": "object"}, {"text": "Sentiment", "type": "object"}, {"text": "Link", "type": "object"}, {"text": "Content", "type": "object"}], "rows": [["2020-09-22 00:00:00", "GOOGL", -0.5333333333333333, "https://www.investing.com/news/stock-market-news/sp-500-nasdaq-futures-rebound-after-selloff-2303071", "By Shreyashi Sanyal and Devik Jain\n(Reuters) - The S&P 500 and Nasdaq indexes were set to open higher on Tuesday, with beaten-down shares of technology-related companies leading early gains, while Dow futures were subdued on uncertainty over more U.S. fiscal stimulus.\nAll three of Wall Street s main indexes started the week on the back foot as fears about a new round of lockdowns in Europe and a stalemate in Congress over the size and shape of another coronavirus-response bill dented hopes of a swift economic recovery.\nThe blue-chip Dow (DJI) closed Monday with its worst session in two weeks, while the benchmark S&P 
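
To work with the response in Python rather than reading the raw JSON, the table can be turned back into one dict per row. This is a minimal sketch that assumes the payload has the shape shown above; `rows_as_dicts` is a hypothetical helper, and the sample payload is made up so the block runs without a deployed function.

```python
import json


def rows_as_dicts(payload: str):
    """Turn the JSON table response into a list of {column name: value} dicts."""
    tables = json.loads(payload)
    table = tables[0]  # the function returns a single table
    names = [col["text"] for col in table["columns"]]
    return [dict(zip(names, row)) for row in table["rows"]]


# Made-up sample payload in the same shape as the response above.
sample_payload = json.dumps([{
    "columns": [{"text": "Time", "type": "object"},
                {"text": "Symbol", "type": "object"}],
    "rows": [["2020-09-22 00:00:00", "GOOGL"]],
    "type": "table",
}])

for row in rows_as_dicts(sample_payload):
    print(row["Time"], row["Symbol"])
```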