# Running Presto SQL Queries from a Nuclio Function

This notebook demonstrates how to create, test, and deploy a Nuclio function that reads data by running SQL queries through Presto.

In [1]:
# nuclio: ignore
import nuclio

<a id="read-presto-init"></a>
## Initialize Nuclio Emulation, Environment Variables, and Configuration

> **Note:** Use `# nuclio: ignore` for sections that don't need to be copied to the function.

In [2]:
# nuclio: ignore
# Copy the secret files needed for Presto to the v3io directory.
# The secret files store the credentials for the Presto session.

!mkdir -p /v3io/${V3IO_HOME}/secrets
!cp /var/run/iguazio/secrets/* /v3io/${V3IO_HOME}/secrets

In [3]:
%nuclio env -c DATABASE_URL=${DATABASE_URL}
%nuclio env -c V3IO_USERNAME=${V3IO_USERNAME}
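
The handler below reads both variables with `os.getenv`, which returns `None` when a variable is unset. A minimal sketch of a fail-fast lookup, assuming a hypothetical helper named `require_env` (it is not part of Nuclio or of this notebook):

```python
import os


def require_env(name):
    """Return the value of an environment variable, or raise a clear error.

    Hypothetical helper for illustration only; not part of Nuclio.
    """
    value = os.environ.get(name)
    if not value:
        # Treat both "unset" and "empty" as a configuration error
        raise RuntimeError(f"environment variable {name} is not set")
    return value
```

Inside a handler, this could replace a bare lookup, for example `create_engine(require_env('DATABASE_URL'))`, so that a missing setting is reported by name.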

In [4]:
%%nuclio cmd -c
pip install pandas
pip install requests
pip install git+https://github.com/v3io/PyHive.git@v0.6.999 # Pyhive version with bug fixes
pip install sqlalchemy

In [5]:
%%nuclio config 
spec.build.baseImage = "python:3.6-jessie"

%nuclio: setting spec.build.baseImage to 'python:3.6-jessie'


In [6]:
# Mount the secrets at /var/run/iguazio/secrets, the location that DATABASE_URL is configured to read from
%nuclio mount /var/run/iguazio/secrets ~/secrets

mounting volume path /var/run/iguazio/secrets as ~/secrets


In [7]:
import pandas as pd 
import os 
from sqlalchemy.engine import create_engine 
import pyhive

<a id="read-presto-Handler"></a>
## Function Handler - connect to Presto and use SQL to read data

In [8]:
def handler(context, event):
    # The event body carries the stock symbol (mnemonic) to look up
    mnemonic = event.body.decode('utf-8').strip()
    context.logger.info(mnemonic)

    # DATABASE_URL contains the Presto URL, as well as the access key and the location of the secrets
    engine = create_engine(os.getenv('DATABASE_URL'))

    # Note: create and populate the stocks_tab table in advance
    # (see the getting-started section of the collect-n-explore notebook)
    table_path = 'v3io.users."' + str(os.getenv('V3IO_USERNAME')) + '/examples/stocks_tab"'
    # The mnemonic is interpolated directly into the SQL text, so validate it if events can come from untrusted callers
    query = 'select max(endprice) endprice from ' + table_path + "  where mnemonic = '" + mnemonic + "'"
    context.logger.info(query)

    # Run the query through Presto and return the single aggregated value
    df = pd.read_sql(query, engine)
    return df.loc[0, 'endprice']
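
Because the handler concatenates the event body into the SQL text, a caller controls part of the query. The following is a minimal sketch of validating the symbol before building the query. The helper name `build_query` and the allowed symbol pattern (1 to 10 letters, digits, or dots) are assumptions for illustration, not something the original notebook defines:

```python
import re

# Assumed symbol format: 1-10 letters, digits, or dots. Adjust to the real data.
_MNEMONIC_RE = re.compile(r"[A-Za-z0-9.]{1,10}")


def build_query(username, mnemonic):
    """Build the max-endprice query after validating the mnemonic.

    Hypothetical helper; raises ValueError for anything outside the
    assumed symbol format instead of passing it on to Presto.
    """
    if not _MNEMONIC_RE.fullmatch(mnemonic):
        raise ValueError(f"invalid mnemonic: {mnemonic!r}")
    table_path = f'v3io.users."{username}/examples/stocks_tab"'
    return f"select max(endprice) endprice from {table_path} where mnemonic = '{mnemonic}'"
```

A handler could call `build_query(os.getenv('V3IO_USERNAME'), mnemonic)` and pass the result to `pd.read_sql`. Rejecting unexpected input up front keeps quotes and other SQL syntax out of the query string.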

<a id="read-presto-trigger"></a>
## Trigger the Function - test locally

In [9]:
# nuclio: ignore
# Note: populate the stocks_tab table first by running the getting-started section of the collect-n-explore notebook
stock = 'BAYN' 

event = nuclio.Event(body=bytes(stock, 'utf-8'))
output = handler(context, event)
print(output)

Python> 2019-06-11 11:14:27,404 [info] BAYN
Python> 2019-06-11 11:14:27,474 [info] select max(endprice) endprice from v3io.users."iguazio/examples/stocks_tab"  where mnemonic = 'BAYN'
90.85


In [10]:
%nuclio show

%nuclio: notebook read-from-presto exported
Config:
apiVersion: nuclio.io/v1
kind: Function
metadata:
  annotations:
    nuclio.io/generated_by: function generated at 11-06-2019 by iguazio from /User/read-from-presto.ipynb
  labels: {}
  name: read-from-presto
spec:
  build:
    baseImage: python:3.6-jessie
    commands:
    - pip install pandas
    - pip install requests
    - 'pip install git+https://github.com/v3io/PyHive.git@v0.6.999 # Pyhive version
      with bug fixes'
    - pip install sqlalchemy
    functionSourceCode: IyBHZW5lcmF0ZWQgYnkgbnVjbGlvLmV4cG9ydC5OdWNsaW9FeHBvcnRlciBvbiAyMDE5LTA2LTExIDExOjE0CgppbXBvcnQgcGFuZGFzIGFzIHBkIAppbXBvcnQgb3MgCmZyb20gc3FsYWxjaGVteS5lbmdpbmUgaW1wb3J0IGNyZWF0ZV9lbmdpbmUgCmltcG9ydCBweWhpdmUKCmRlZiBoYW5kbGVyKGNvbnRleHQsIGV2ZW50KToKICAgIAogICAgbW5lbW9uaWMgPSBldmVudC5ib2R5LmRlY29kZSgndXRmLTgnKS5zdHJpcCgpCiAgICBjb250ZXh0LmxvZ2dlci5pbmZvKG1uZW1vbmljKQogICAgCiAgICBlbmdpbmUgPSBjcmVhdGVfZW5naW5lIChvcy5nZXRlbnYoJ0RBVEFCQVNFX1VSTCcpKQoKICAgIHRhYmxlX3BhdG
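
The `functionSourceCode` value in the configuration above is the exported function source encoded as Base64. As a small sketch, the snippet below decodes only the leading characters of that value, since the value is truncated in this output; the variable names are illustrative:

```python
import base64

# Leading characters of the functionSourceCode value shown above
encoded_prefix = "IyBHZW5lcmF0ZWQgYnkgbnVjbGlv"

# Base64 decodes in 4-character groups, so a prefix whose length is a
# multiple of 4 can be decoded on its own
decoded = base64.b64decode(encoded_prefix).decode("utf-8")
print(decoded)
```

The decoded text is the start of the header comment that the exporter writes at the top of the generated source.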

<a id="read-presto-deploy"></a>
## Deploy the Function

Run the following command to deploy the function:

In [11]:
%nuclio deploy -n read-from-presto -p examples -c

[nuclio.deploy] 2019-06-11 11:14:35,358 (info) Building processor image
[nuclio.deploy] 2019-06-11 11:14:53,512 (info) Pushing image
[nuclio.deploy] 2019-06-11 11:14:54,520 (info) Build complete
[nuclio.deploy] 2019-06-11 11:14:58,555 (info) Function deploy complete
[nuclio.deploy] 2019-06-11 11:14:58,561 done updating read-from-presto, function address: 52.58.191.99:31618
%nuclio: function deployed


<a id="read-presto-test"></a>
## Test the Function

In [None]:
# nuclio: ignore
# Test the new API endpoint; take the address from the deploy log above
!curl -X POST -d "BAYN" URL:PORT
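
The same request can be sent from Python. This is a minimal sketch that uses only the standard library and assumes the function is served over plain HTTP at the address printed in the deploy log; the helper names `build_invoke_request` and `invoke` are hypothetical:

```python
import urllib.request


def build_invoke_request(address, symbol):
    """Build a POST request that sends the stock symbol as the request body.

    `address` is a host:port string, assumed to be reachable over plain HTTP.
    """
    return urllib.request.Request(
        url=f"http://{address}",
        data=symbol.encode("utf-8"),
        method="POST",
    )


def invoke(address, symbol, timeout=10):
    """Send the request and return the response body as text."""
    request = build_invoke_request(address, symbol)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

For example, `invoke("URL:PORT", "BAYN")`, with `URL:PORT` replaced by the function address from the deploy log, sends the same body as the `curl` command.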