# Ingest Real-Time Stock Data to Iguazio NoSQL and Time-series DB
The following example function ingests real-time stock information from an internet service (World Trading Data) into the Iguazio platform.<br>
Every time the data is updated, it updates a NoSQL table with the latest metadata and updates the time-series DB with the new metrics (price and volume).
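
The "only when the data is updated" behavior can be sketched in isolation. The following is a minimal example under stated assumptions: `select_new_trades` is a hypothetical helper that is not part of the notebook, and the second timestamp is made up for illustration. It remembers the last trade time seen per symbol and returns only records that are newer, which is the same idea the handler further down applies before writing to the NoSQL table and the TSDB.

```python
import datetime

TIME_FORMAT = '%Y-%m-%d %H:%M:%S'

def select_new_trades(records, last_seen):
    """Return the records whose last_trade_time is newer than the one in last_seen.

    last_seen maps symbol -> datetime and is updated in place.
    """
    fresh = []
    for record in records:
        symbol = record['symbol']
        traded = datetime.datetime.strptime(record['last_trade_time'], TIME_FORMAT)
        previous = last_seen.get(symbol)
        if previous is None or traded > previous:
            last_seen[symbol] = traded
            fresh.append(record)
    return fresh

last_seen = {}
quote = {'symbol': 'AAPL', 'last_trade_time': '2019-03-04 16:00:01'}
later_quote = {'symbol': 'AAPL', 'last_trade_time': '2019-03-05 16:00:01'}  # illustrative value

first = select_new_trades([quote], last_seen)        # first sighting, so it is returned
second = select_new_trades([quote], last_seen)       # unchanged, so nothing is returned
third = select_new_trades([later_quote], last_seen)  # newer trade time, so it is returned
print(len(first), len(second), len(third))
```

Keeping the state in a module-level dictionary, as the handler does, means it lasts only as long as the function process; after a restart every symbol is treated as new once.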

The same code can run inside a nuclio (serverless) function and be triggered automatically on a predefined schedule (cron) or through HTTP requests.<br>

The example demonstrates the use of `%nuclio` magic commands to specify environment variables, package dependencies,<br>configurations (such as the cron schedule), and to deploy functions automatically onto a cluster.

### Initialization 
You need to set the `API_TOKEN` environment variable to real credentials.<br>
Obtain your free API token from [World Trading Data](https://www.worldtradingdata.com).<br>

## Initialize nuclio emulation, environment variables and configuration
Use `# nuclio: ignore` for sections that don't need to be copied to the function.

In [1]:
# nuclio: ignore
# if the nuclio-jupyter package is not installed run !pip install nuclio-jupyter
import nuclio 

Copy the local credentials to the nuclio function config (the `-c` option doesn't initialize locally).

In [2]:
%nuclio env -c V3IO_ACCESS_KEY=${V3IO_ACCESS_KEY}
%nuclio env -c V3IO_USERNAME=${V3IO_USERNAME}
%nuclio env -c V3IO_API=${V3IO_API}

Initialize the token and frames URL both locally and in the function config.

In [3]:
%nuclio env API_TOKEN= <Insert world trading data token>

%nuclio: setting 'API_TOKEN' environment variable
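
Because the handler reads `API_TOKEN` with `os.getenv`, a missing or empty value only surfaces later as a failed API request. A small guard such as the following sketch can make that failure explicit. `require_env` is a hypothetical helper, not part of nuclio or of this notebook, and the variable name `EXAMPLE_TOKEN` is used only so the example does not overwrite a real token.

```python
import os

def require_env(name):
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.getenv(name, '').strip()
    if not value:
        raise RuntimeError('environment variable {0} is not set'.format(name))
    return value

# illustration only: set a dummy value so the example runs on its own
os.environ['EXAMPLE_TOKEN'] = 'dummy-token'
token = require_env('EXAMPLE_TOKEN')
print(token)
```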


### Set function configuration 
Use a cron trigger with a 5-minute interval and define the base image.<br>
For more details, check the [nuclio function configuration reference](https://github.com/nuclio/nuclio/blob/master/docs/reference/function-configuration/function-configuration-reference.md).

In [4]:
%%nuclio config 
spec.triggers.secs.kind = "cron"
spec.triggers.secs.attributes.interval = "300s"
spec.build.baseImage = "python:3.6-jessie"

%nuclio: setting spec.triggers.secs.kind to 'cron'
%nuclio: setting spec.triggers.secs.attributes.interval to '300s'
%nuclio: setting spec.build.baseImage to 'python:3.6-jessie'
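
Each `%%nuclio config` line assigns a value to a dotted path in the function configuration. As a rough illustration of the structure this produces, the sketch below builds the equivalent nested mapping for the three settings above. `set_dotted` is a hypothetical helper written for this example; it is not how nuclio-jupyter is implemented.

```python
def set_dotted(config, path, value):
    """Assign value at a dotted path, creating intermediate dicts as needed."""
    node = config
    keys = path.split('.')
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value
    return config

config = {}
set_dotted(config, 'spec.triggers.secs.kind', 'cron')
set_dotted(config, 'spec.triggers.secs.attributes.interval', '300s')
set_dotted(config, 'spec.build.baseImage', 'python:3.6-jessie')
print(config)
```

Here `secs` is simply the name given to the trigger; the `kind` and `attributes.interval` keys under it are what make it a cron trigger that fires every 300 seconds.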


### Install required packages
`%nuclio cmd` allows you to run image build instructions and install packages.<br>
Note: the `-c` option installs only in nuclio, not locally.

In [5]:
%%nuclio cmd -c 
pip install requests
pip install v3io_frames

## Nuclio function implementation
This function can run in Jupyter or in nuclio (real-time serverless).

In [6]:
import json
import requests
import os
import pandas as pd
import datetime
import v3io_frames as v3f

stocks = os.getenv('STOCK_LIST','GOOG,MSFT,AMZN,AAPL,INTC')
token = os.getenv('API_TOKEN') 
url = 'https://www.worldtradingdata.com/api/v1/stock?symbol={0}&api_token={1}'.format(stocks,token)
client = v3f.Client('framesd:8081',container='bigdata')

# v3io update expression template 
expr_template = "symbol='{symbol}';name='{name}';currency='{currency}';exchange='{stock_exchange_short}';" + \
    "timezone='{timezone}';price={price};volume={volume};last_trade='{last_trade_time}'"

last_trade_times = {}

# nuclio handler function 
def handler(context, event):
    
    # reading latest stock information 
    r = requests.get(url)
    stocks=[]; dates=[]; volumes=[]; prices=[]; exchanges=[]
    
    context.logger.debug('read symbols')
    for stock in r.json()['data']:
        
        symbol = stock['symbol']
        last = last_trade_times.get(symbol)
        date = datetime.datetime.strptime(stock['last_trade_time'], '%Y-%m-%d %H:%M:%S')
        
        # update the stocks table and TSDB metrics in case of new data 
        if not last or date > last:
            # update NoSQL table with stock data
            expr = expr_template.format(**stock)
            context.logger.debug_with('update expression', symbol=symbol, expr=expr)
            client.execute('kv','stocks','update', args={'key':symbol, 'expression': expr})
        
            # collect price and volume metrics; they are written to the time-series DB after the loop
            last_trade_times[symbol] = date 
            stocks += [symbol]
            exchanges += [stock['stock_exchange_short']]
            dates +=[date]
            volumes += [float(stock['volume'])]
            prices += [float(stock['price'])]
               
    # write price and volume metrics to the Time-Series DB, add exchange label
    if len(stocks)>0:
        df = pd.DataFrame({'volume':volumes,'price': prices}, index=[dates,stocks,exchanges], columns=['volume','price'])
        df.index.names=['time','symbol','exchange']
        context.logger.debug_with('writing data to TSDB', stocks=stocks)
        client.write(backend='tsdb', table='stock_metrics',dfs=df)
        
    return r.json()
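
To make the NoSQL update step concrete, the following sketch renders the update-expression template for a single record without contacting the platform. The record is an abridged copy of the AAPL entry from the sample output in this notebook, with one extra field kept to show that `str.format(**stock)` ignores keys the template does not use.

```python
# same template as in the handler above
expr_template = "symbol='{symbol}';name='{name}';currency='{currency}';exchange='{stock_exchange_short}';" + \
    "timezone='{timezone}';price={price};volume={volume};last_trade='{last_trade_time}'"

# abridged AAPL record taken from the sample output
stock = {
    'symbol': 'AAPL',
    'name': 'Apple Inc.',
    'currency': 'USD',
    'price': '175.85',
    'day_high': '177.75',  # not referenced by the template, so it is ignored
    'volume': '27417664',
    'stock_exchange_short': 'NASDAQ',
    'timezone': 'EST',
    'last_trade_time': '2019-03-04 16:00:01',
}

expr = expr_template.format(**stock)
print(expr)
```

String attributes are wrapped in single quotes while `price` and `volume` are inserted bare, so they are written as numbers. A value that itself contains a single quote (for example a company name with an apostrophe) would likely need escaping before being formatted into the expression.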

## Function invocation
The following section simulates a nuclio function invocation and emits the function results.
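
In the notebook, `context` is not defined by any cell shown here, so it is presumably supplied by the nuclio-jupyter emulation. When running the handler as a plain Python script, a stand-in is needed. The sketch below is one hypothetical way to provide it: `StubContext` and `StubLogger` are names invented for this example and implement only the two logger methods the handler calls (`debug` and `debug_with`).

```python
import logging

class StubLogger:
    """Mimics the two logger methods the handler uses: debug and debug_with."""

    def __init__(self, name='stub'):
        self._log = logging.getLogger(name)
        self.records = []  # kept so the logged calls can be inspected afterwards

    def debug(self, message):
        self.records.append((message, {}))
        self._log.debug(message)

    def debug_with(self, message, **fields):
        self.records.append((message, fields))
        self._log.debug('%s %s', message, fields)

class StubContext:
    """Hypothetical stand-in for the nuclio context, exposing only a logger."""

    def __init__(self):
        self.logger = StubLogger()

def demo_handler(context, event):
    # toy handler used only to exercise the stub
    context.logger.debug_with('got event', body=event)
    return 'ok'

context = StubContext()
result = demo_handler(context, '')
print(result, context.logger.records)
```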

In [7]:
# nuclio: ignore
# create a test event and invoke the function locally 
event = nuclio.Event(body='')
handler(context, event)

{'symbols_requested': 5,
 'symbols_returned': 5,
 'data': [{'symbol': 'AAPL',
   'name': 'Apple Inc.',
   'currency': 'USD',
   'price': '175.85',
   'price_open': '175.69',
   'day_high': '177.75',
   'day_low': '173.97',
   '52_week_high': '233.47',
   '52_week_low': '142.00',
   'day_change': '0.88',
   'change_pct': '0.50',
   'close_yesterday': '174.97',
   'market_cap': '829182016779',
   'volume': '27417664',
   'volume_avg': '28138059',
   'shares': '4715280000',
   'stock_exchange_long': 'NASDAQ Stock Exchange',
   'stock_exchange_short': 'NASDAQ',
   'timezone': 'EST',
   'timezone_name': 'America/New_York',
   'gmt_offset': '-18000',
   'last_trade_time': '2019-03-04 16:00:01'},
  {'symbol': 'AMZN',
   'name': 'Amazon.com, Inc.',
   'currency': 'USD',
   'price': '1696.17',
   'price_open': '1685.00',
   'day_high': '1709.43',
   'day_low': '1674.36',
   '52_week_high': '2050.50',
   '52_week_low': '1307.00',
   'day_change': '24.44',
   'change_pct': '1.46',
   'close_yeste

## Deploy a function onto a cluster
The `%nuclio deploy` command deploys functions into a cluster. Make sure the notebook is saved before running it!<br>Check the help (`%nuclio help deploy`) for more information.

In [8]:
%nuclio deploy -p stocks -c

%nuclio: [nuclio.deploy] 2019-03-04 23:22:48,635 project name not found created new (stocks)
%nuclio: [nuclio.deploy] 2019-03-04 23:22:50,702 (info) Building processor image
%nuclio: [nuclio.deploy] 2019-03-04 23:23:24,001 (info) Pushing image
%nuclio: [nuclio.deploy] 2019-03-04 23:23:43,192 (info) Build complete
%nuclio: [nuclio.deploy] 2019-03-04 23:23:47,233 (info) Function deploy complete
%nuclio: [nuclio.deploy] 2019-03-04 23:23:47,238 done creating read-stocks, function address: 3.122.204.208:31641
%nuclio: function deployed
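
Once deployed, the function can also be triggered over HTTP, as noted in the introduction. The sketch below uses only the standard library. The address is the one printed in the deploy log above and will differ per cluster; whether it is reachable depends on network access to that cluster, so the request is wrapped in a function and not executed here. Both helper names are hypothetical.

```python
import json
import urllib.request

def function_url(address):
    """Build an http URL from the 'host:port' address printed by the deploy command."""
    return 'http://{0}'.format(address)

def invoke(address, timeout=30):
    """Send a GET request to the deployed function and decode its JSON response."""
    with urllib.request.urlopen(function_url(address), timeout=timeout) as response:
        return json.loads(response.read().decode('utf-8'))

url = function_url('3.122.204.208:31641')
print(url)
# invoke('3.122.204.208:31641') should return the same kind of dictionary as the
# local call, assuming the address is reachable from where this runs
```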
