# Drum

In [1]:
!drum --version

drum 1.3.0


# Test scoring with Sklearn model using DRUM
<a id="setup_complete"></a>

The next snippet tests scoring locally. The same command can also be used for batch scoring with the model.

`../src/custom_model` contains the sklearn pickle (`.pkl`) file as well as a `custom.py` file, whose hooks tell DRUM how to load the pickle and score with it.
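To make the idea of hooks more concrete, here is a minimal sketch of what such a `custom.py` might look like. It is not the file shipped in `../src/custom_model`: the pickle file name `model.pkl` is a hypothetical placeholder, and the sketch assumes the model exposes a scikit-learn style `predict` method. The `Predictions` column name matches the output DRUM prints for regression.

```python
import os
import pickle

import pandas as pd

# Hypothetical file name; the actual pickle in ../src/custom_model may differ.
MODEL_FILENAME = "model.pkl"


def load_model(input_dir):
    """DRUM hook: load the serialized model from the code directory."""
    with open(os.path.join(input_dir, MODEL_FILENAME), "rb") as f:
        return pickle.load(f)


def score(data, model, **kwargs):
    """DRUM hook: return one prediction per input row.

    For a regression target the result is a single column named "Predictions".
    """
    return pd.DataFrame({"Predictions": model.predict(data)})
```

DRUM calls `load_model` once at startup and passes the returned object to `score` for every scoring request.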

In [2]:
!drum score --code-dir ../src/custom_model --input ../data/boston_housing_test.csv --target-type regression --verbose

Detected score mode
Detected /Users/timothy.whittaker/Desktop/ODSC/odsc-ml-drum/src/custom_model/custom.py .. trying to load hooks

Component: generic_predictor
Language:  Python
Output:
------------------------------------------------------------
------------------------------------------------------------
Runtime:    0.0 sec
NR outputs: 0

    Predictions
0        26.210
1        22.140
2        34.930
3        34.735
4        35.200
5        26.795
6        20.985
7        25.035
8        18.205
9        18.805
10       16.365
11       19.440
12       21.695
13       20.010
14       18.400
15       19.685
16       22.185
17       18.350
18       19.790
19       18.740


# Start the inference server locally

We'll start the server with `subprocess.Popen` so that it runs in the background and we can keep interacting with it from this notebook.

In [49]:
import subprocess
import requests
import pandas as pd
from io import BytesIO
import yaml
import time
import os
import datarobot as dr
from pprint import pprint

In [4]:
run_inference_server = ["drum",
              "server",
              "--code-dir","../src/custom_model", 
              "--address", "0.0.0.0:6789", 
              "--show-perf",
              "--target-type", "regression",
              "--logging-level", "info",
              "--show-stacktrace",
              "--verbose"
              ]

In [5]:
inference_server = subprocess.Popen(run_inference_server, stdout=subprocess.PIPE)

## Ping the Server to make sure it is running

In [6]:
## confirm the server is running
time.sleep(5) ## snoozing before pinging the server to give it time to actually start
print('check status')
requests.request("GET", "http://0.0.0.0:6789/").content

check status


b'{"message":"OK"}\n'
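A fixed `time.sleep(5)` works here, but it either waits longer than needed or not long enough on a slow machine. One alternative is to poll the ping route until it answers. The sketch below uses only the standard library; the helper names `ping` and `wait_for_server` are made up for this example and are not part of DRUM.

```python
import time
import urllib.error
import urllib.request


def ping(url):
    """Return True if a GET request to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, HTTP error, etc.: not ready yet
        return False


def wait_for_server(url, timeout=30.0, interval=0.5, probe=ping):
    """Poll `url` until `probe` succeeds or `timeout` seconds have elapsed."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if probe(url):
            return True
        time.sleep(interval)
    return False
```

In this notebook, `wait_for_server("http://localhost:6789/")` could stand in for the sleep; it returns `False` if the server never comes up within the timeout.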

In [7]:
# df = pd.read_csv('/content/datarobot-user-models/tests/testdata/boston_housing_inference.csv')
df = pd.read_csv('../data/boston_housing_test.csv')

In [8]:
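def score(data):
    """POST a DataFrame to the local DRUM server as CSV and return the response."""
    # Serialize the DataFrame to an in-memory CSV file
    b_buf = BytesIO()
    b_buf.write(data.to_csv(index=False).encode("utf-8"))
    b_buf.seek(0)

    url = "http://localhost:6789/predict/"
    # The dataset is sent as a multipart file upload under the key 'X'
    files = [
        ('X', b_buf)
    ]
    response = requests.request("POST", url, files=files, timeout=None, verify=False)
    return response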

## Send data to server for inference

In [9]:
df.head()

Unnamed: 0,crim,zn,indus,chas,nox,rm,age,dis,rad,tax,ptratio,b,lstat
0,0.00632,18.0,2.31,0,0.538,6.575,65.2,4.09,1,296,15.3,396.9,4.98
1,0.02731,0.0,7.07,0,0.469,6.421,78.9,4.9671,2,242,17.8,396.9,9.14
2,0.02729,0.0,7.07,0,0.469,7.185,61.1,4.9671,2,242,17.8,392.83,4.03
3,0.03237,0.0,2.18,0,0.458,6.998,45.8,6.0622,3,222,18.7,394.63,2.94
4,0.06905,0.0,2.18,0,0.458,7.147,54.2,6.0622,3,222,18.7,396.9,5.33


In [10]:
# %%timeit
predictions = score(df).json() ## score the entire dataset and print all predictions
pprint(predictions)

{'predictions': [26.21,
                 22.14,
                 34.93,
                 34.735,
                 35.2,
                 26.795,
                 20.985,
                 25.035,
                 18.205,
                 18.805,
                 16.365,
                 19.44,
                 21.695,
                 20.01,
                 18.4,
                 19.685,
                 22.185,
                 18.35,
                 19.79,
                 18.74]}
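The test file here has only 20 rows, so one request is enough. For larger batch jobs it may be preferable to send the data in slices. The helper below is a sketch of that idea; `score_in_chunks` and its `score_fn` argument are hypothetical names, and the function only assumes that `rows` supports `len()` and integer slicing.

```python
def score_in_chunks(rows, score_fn, chunk_size=1000):
    """Score `rows` in slices of `chunk_size` and concatenate the predictions.

    `score_fn` takes one slice of rows and returns a list of predictions,
    one per row in that slice.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    predictions = []
    for start in range(0, len(rows), chunk_size):
        chunk = rows[start:start + chunk_size]
        predictions.extend(score_fn(chunk))
    return predictions
```

With the `score` helper defined earlier, a call might look like `score_in_chunks(df, lambda chunk: score(chunk).json()["predictions"], chunk_size=5)`, since plain integer slicing on a DataFrame selects rows by position.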


In [11]:
pd.DataFrame(predictions)

Unnamed: 0,predictions
0,26.21
1,22.14
2,34.93
3,34.735
4,35.2
5,26.795
6,20.985
7,25.035
8,18.205
9,18.805


## Start the Flask App

Set a few environment variables for the Flask app.

In [12]:
os.environ["LC_ALL"] = "C.UTF-8"
os.environ["LANG"] = "C.UTF-8"
os.environ["FLASK_APP"] = "server.app"
os.environ["FLASK_ENV"] = "development"
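Setting `os.environ` changes the environment of the notebook kernel and of every process started from it afterwards. If that is not wanted, the same variables can be handed to a single child process instead. The sketch below shows one way to build such an environment; `FLASK_SETTINGS` and `build_env` are names invented for this example.

```python
import os

# Same values as the cell above, kept in a plain dict
FLASK_SETTINGS = {
    "LC_ALL": "C.UTF-8",
    "LANG": "C.UTF-8",
    "FLASK_APP": "server.app",
    "FLASK_ENV": "development",
}


def build_env(overrides, base=None):
    """Return a copy of `base` (default: os.environ) updated with `overrides`."""
    env = dict(os.environ if base is None else base)
    env.update(overrides)
    return env
```

The result could then be passed as `env=build_env(FLASK_SETTINGS)` together with `cwd="../src"` to `subprocess.Popen`, leaving the kernel's own environment untouched.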

Run the Flask app. This cell blocks the notebook kernel until it is interrupted (CTRL+C).

In [13]:
!cd ../src && python -m flask run --host 0.0.0.0 --port 8080

 * Serving Flask app "server.app" (lazy loading)
 * Environment: development
 * Debug mode: on
 * Running on http://0.0.0.0:8080/ (Press CTRL+C to quit)
 * Restarting with stat
 * Debugger is active!
 * Debugger PIN: 245-021-255
127.0.0.1 - - [24/Oct/2020 14:38:04] "GET /frontend HTTP/1.1" 200 -
      crim    zn  indus  chas    nox  ...  rad    tax  ptratio      b  lstat
0  0.00632  18.0   2.31   0.0  0.538  ...  1.0  296.0     15.3  396.9   4.98

[1 rows x 13 columns]
making request
prediciton [26.21]
heylksdfmlsdmsdflklmsdfsdf
127.0.0.1 - - [24/Oct/2020 14:38:06] "POST /frontend HTTP/1.1" 200 -
^C


In [57]:
# requests.request("POST","http://localhost:6789/shutdown/").json()

In [15]:
inference_server.terminate()
inference_server.stdout.readlines()

[b'Detected REST server mode - this is an advanced option\n',
 b'Detected /Users/timothy.whittaker/Desktop/ODSC/odsc-ml-drum/src/custom_model/custom.py .. trying to load hooks\n',
 b'\x1b[32m \x1b[0m\n',
 b'\x1b[32m \x1b[0m\n',
 b'\x1b[32mComponent: prediction_server\x1b[0m\n',
 b'\x1b[32mLanguage:  Python\x1b[0m\n',
 b'\x1b[32mOutput:\x1b[0m\n',
 b'\x1b[32m------------------------------------------------------------\x1b[0m\n',
 b' * Serving Flask app "datarobot_drum.drum.server" (lazy loading)\n',
 b' * Environment: production\n',
 b'   Use a production WSGI server instead.\n',
 b' * Debug mode: off\n']
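The raw output above is a list of byte strings that still carry the terminal color codes (`\x1b[32m` and so on). A small helper can stop the process and return readable text in one step. This is only a sketch: `stop_and_collect` is a hypothetical name, and the regular expression covers just the simple color sequences seen in this log.

```python
import re
import subprocess

# Matches ANSI color sequences such as b"\x1b[32m" and b"\x1b[0m"
ANSI_ESCAPE = re.compile(rb"\x1b\[[0-9;]*m")


def stop_and_collect(proc, timeout=10):
    """Terminate `proc`, kill it if it does not exit in time, and return
    its stdout as a list of decoded lines without color codes."""
    proc.terminate()
    try:
        out, _ = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()
        out, _ = proc.communicate()
    cleaned = ANSI_ESCAPE.sub(b"", out or b"")
    return [line.decode("utf-8", errors="replace").rstrip()
            for line in cleaned.splitlines()]
```

Using `communicate` rather than `stdout.readlines()` also waits for the process to exit, so it does not linger as a zombie.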

# Monitoring Deployments

What follows requires a DataRobot account. You can get a trial account at [https://www.datarobot.com/trial/](https://www.datarobot.com/trial/)

The following cell runs a setup script that does several things. Specifically, it will
* deploy the model package on DataRobot, which will be used for our external model
* download the agents service
* configure and start up the agents service

In [41]:
! cd .. && ./setup.sh

Uploading training data - ./data/boston_housing.csv. This may take some time...
Training dataset uploaded. Catalog ID 5f94b867c35abe1be8b018f5.
Create model package
Deploy model package
Enable feature drift

Done.
DEPLOYMENT_ID=5f94b88a75e84a30d5431c79, MODEL_ID=5f94b889678d7458b6e8ba10
deployment details written to deployment_detail.yaml
grab agents tarball
unpack agents tarball
configuring agents
You should consider upgrading via the 'pip install --upgrade pip' command.
INFO: AGENT_CONFIG_YAML=/Users/timothy.whittaker/Desktop/ODSC/odsc-ml-drum/datarobot-mlops-agent-6.2.4/conf/mlops.agent.conf.yaml
INFO: AGENT_LOG_PROPERTIES=/Users/timothy.whittaker/Desktop/ODSC/odsc-ml-drum/datarobot-mlops-agent-6.2.4/conf/mlops.log4j2.properties
INFO: AGENT_JVM_OPT=-Xmx1G
INFO: AGENT_JAR_PATH=/Users/timothy.whittaker/Desktop/ODSC/odsc-ml-drum/datarobot-mlops-agent-6.2.4/lib/mlops-agent-6.2.4.jar
INFO: AGENT_LOG_PATH=/Users/timothy.whittaker/Desktop/ODSC/odsc-ml-drum/datarobot-mlops-agent-6.2.4/l

# Adding Monitoring with MLOps Monitoring Agents

## Monitoring With DRUM

There are a few additional parameters we need to set. We can either pass them to the command line utility or create environment variables and let the drum utility pick up the details from there.

```
  --monitor             Monitor predictions using DataRobot MLOps. True or
                        False. (env: MONITOR).Monitoring can not be used in
                        unstructured mode.
  --deployment-id DEPLOYMENT_ID
                        Deployment id to use for monitoring model predictions
                        (env: DEPLOYMENT_ID)
  --model-id MODEL_ID   MLOps model id to use for monitoring predictions (env:
                        MODEL_ID)
  --monitor-settings MONITOR_SETTINGS
                        MLOps setting to use for connecting with the MLOps
                        Agent (env: MONITOR_SETTINGS)
```
For today, we'll set environment variables to add monitoring. 
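For comparison, the flags listed in the help text could be appended to the command list instead of using environment variables. The sketch below assumes that `--monitor` is a plain switch that takes no value, which is how the help text reads (it shows no argument placeholder), but this has not been checked against every DRUM version. `with_monitoring` is a hypothetical helper name.

```python
def with_monitoring(base_cmd, deployment_id, model_id, monitor_settings):
    """Return a copy of `base_cmd` extended with DRUM's monitoring flags.

    Assumption: `--monitor` is a switch without a value.
    """
    return list(base_cmd) + [
        "--monitor",
        "--deployment-id", deployment_id,
        "--model-id", model_id,
        "--monitor-settings", monitor_settings,
    ]
```

The IDs would come from `deployment_detail.yaml`, and the returned list could be passed to `subprocess.Popen` in the same way as `run_inference_server`.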


In [44]:
with open("../deployment_detail.yaml", "r") as f:
    dep_details = yaml.load(f, Loader = yaml.FullLoader)

In [49]:
os.environ["MONITOR"] = "True"
os.environ["DEPLOYMENT_ID"] = dep_details["DEPLOYMENT_ID"]
os.environ["MODEL_ID"] = dep_details["MODEL_ID"]
os.environ["MONITOR_SETTINGS"] = "spooler_type=filesystem;directory=/tmp/ta;max_files=5;file_max_size=1045876000"
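The `MONITOR_SETTINGS` value is a list of `key=value` pairs separated by semicolons. Writing it by hand is easy to get wrong, so it can be handy to build it from a dictionary. The helper name `format_monitor_settings` is made up for this sketch; the keys and values are the ones used in the cell above.

```python
def format_monitor_settings(settings):
    """Join the items of `settings` as 'key=value' pairs separated by ';'."""
    return ";".join(f"{key}={value}" for key, value in settings.items())


# Same spooler configuration as in the cell above
spooler_settings = {
    "spooler_type": "filesystem",
    "directory": "/tmp/ta",
    "max_files": 5,
    "file_max_size": 1045876000,
}

monitor_settings = format_monitor_settings(spooler_settings)
```

`monitor_settings` can then be assigned to `os.environ["MONITOR_SETTINGS"]`.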

In [50]:
inference_server_with_monitoring = subprocess.Popen(run_inference_server, stdout=subprocess.PIPE)

In [51]:
!cd ../src && python -m flask run --host 0.0.0.0 --port 8080

 * Serving Flask app "server.app" (lazy loading)
 * Environment: development
 * Debug mode: on
 * Running on http://0.0.0.0:8080/ (Press CTRL+C to quit)
 * Restarting with stat
 * Debugger is active!
 * Debugger PIN: 245-021-255
      crim    zn  indus  chas    nox  ...  rad    tax  ptratio      b  lstat
0  0.00632  18.0   2.31   0.0  0.538  ...  2.0  296.0     12.1  396.9   4.98

[1 rows x 13 columns]
making request
prediciton [26.5]
heylksdfmlsdmsdflklmsdfsdf
127.0.0.1 - - [24/Oct/2020 14:55:07] "POST /frontend HTTP/1.1" 200 -
      crim    zn  indus  chas    nox  ...  rad    tax  ptratio      b  lstat
0  0.00632  18.0   2.31   0.0  0.538  ...  2.0  296.0     12.1  396.9   4.98

[1 rows x 13 columns]
making request
prediciton [27.03]
heylksdfmlsdmsdflklmsdfsdf
127.0.0.1 - - [24/Oct/2020 14:55:11] "POST /frontend HTTP/1.1" 200 -
      crim    zn  indus  chas    nox  ...  rad    tax  ptratio      b  lstat
0  0.00632  18.0   2.31   0.0  0.538  ...  2.0  350.0     12.1 

In [53]:
subprocess.call("../{}/bin/stop-agent.sh".format(agents_dir))

0

In [54]:
## check that agent is stopped 
check = subprocess.Popen(["../{}/bin/status-agent.sh".format(agents_dir)], stdout=subprocess.PIPE)
print(check.stdout.readlines())
check.terminate()

[b'DataRobot MLOps-Agent is not running as a service.\n']
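For short-lived commands such as the status check, `subprocess.run` is a somewhat simpler option than `Popen` followed by `readlines` and `terminate`: it waits for the command to finish and returns the output as text. The wrapper below is a sketch, and `run_and_capture` is a hypothetical name.

```python
import subprocess


def run_and_capture(cmd, timeout=30):
    """Run `cmd` to completion and return (exit_code, combined_output_text)."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold stderr into stdout
        text=True,                 # decode bytes to str
        timeout=timeout,
    )
    return result.returncode, result.stdout
```

Here it could be called with the same one-element list that the cell above passes to `subprocess.Popen`.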


In [50]:
client = dr.Client(token=os.environ["TOKEN"], endpoint=os.path.join(os.environ["ENDPOINT"], "api/v2"))

In [53]:
deployment = dr.Deployment.get(dep_details["DEPLOYMENT_ID"])
deployment
deployment.get_service_stats()

DataError: {'metrics': DataError({'executionTime': DataError({'value': DataError({0: DataError(value is not int), 1: DataError(value is not float)})}), 'responseTime': DataError({'value': DataError({0: DataError(value is not int), 1: DataError(value is not float)})})})}

In [None]:
service_stats = deployment.get_service_stats()
service_stats.metrics