## Using MLOps DRUM to test your custom models
**Author**: Tim Whittaker

#### Scope
We'll get our hands dirty by:

* Building a simple regression model with scikit-learn
* Using DRUM for batch scoring
* Using DRUM to get a REST API endpoint
* Showing a simple example app connected to the REST API
* Running through other frameworks: H2O, Keras, XGBoost, and DataRobot
* Adding a DataRobot remote agent, if you are interested in further model monitoring



In [None]:
!pip uninstall pandas pyarrow -y -q

In [None]:
!pip install datarobot-drum -q


In [None]:
!pip install tensorflow -q

[?25l[K     |▊                               | 10 kB 26.0 MB/s eta 0:00:01[K     |█▍                              | 20 kB 30.5 MB/s eta 0:00:01[K     |██▏                             | 30 kB 36.4 MB/s eta 0:00:01[K     |██▉                             | 40 kB 22.0 MB/s eta 0:00:01[K     |███▌                            | 51 kB 11.6 MB/s eta 0:00:01[K     |████▎                           | 61 kB 13.3 MB/s eta 0:00:01[K     |█████                           | 71 kB 9.9 MB/s eta 0:00:01[K     |█████▊                          | 81 kB 10.9 MB/s eta 0:00:01[K     |██████▍                         | 92 kB 12.0 MB/s eta 0:00:01[K     |███████                         | 102 kB 10.2 MB/s eta 0:00:01[K     |███████▉                        | 112 kB 10.2 MB/s eta 0:00:01[K     |████████▌                       | 122 kB 10.2 MB/s eta 0:00:01[K     |█████████▏                      | 133 kB 10.2 MB/s eta 0:00:01[K     |██████████                      | 143 kB 10.2 MB/s eta 0:0

In [None]:
!pip install scikit-learn -q

In [None]:
!pip install PyYAML -q

In [None]:
!pip install xgboost -q

## Train a regression model

A simple `RandomForestRegressor` that predicts concrete compressive strength, trained on the Concrete Compressive Strength dataset from this paper:

`I-Cheng Yeh, "Modeling of strength of high performance concrete using artificial neural networks," Cement and Concrete Research, Vol. 28, No. 12, pp. 1797-1808 (1998).`

In [None]:
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
import pickle
import datetime

# Read the train and test data
TRAIN_DATA_REG = "/content/mlops-examples/drum_overview/data/concrete_train.csv"  # 8 features plus the target
TEST_DATA_REG = "/content/mlops-examples/drum_overview/data/concrete_test.csv"  # 8 features - target is removed

reg_X_train = pd.read_csv(TRAIN_DATA_REG)
reg_Y_train = reg_X_train.pop('concrete_compressive_strength')

reg_X_test = pd.read_csv(TEST_DATA_REG)

# Fit the model
rf = RandomForestRegressor()
rf.fit(reg_X_train, reg_Y_train)

# Pickle the file and write it to the file system
with open("/content/mlops-examples/drum_overview/custom_model_reg/reg_rf_model.pkl", 'wb') as pkl:
    pickle.dump(rf, pkl)
    
# Call predict to confirm it works
rf.predict(reg_X_test)

array([49.82509346, 16.17489519, 30.26148085, 10.86747245,  4.08555209,
       25.16874222, 41.43588339,  5.80623598, 33.4288026 , 35.72417162,
       51.30772196, 19.29676668,  9.84905434,  9.02206925, 40.05919681,
        9.74501242,  5.77914646, 10.02351246, 43.37829291, 48.58469855,
       12.48972674, 16.54577615, 27.35344345, 36.08659468, 38.22351523,
       14.20869383, 16.87669016, 32.05100202, 38.03291648, 39.37887016,
       14.92698993, 12.01591194, 33.75278737, 40.07624755, 40.62613224,
       11.6422711 , 17.58414769, 17.84625199, 24.72558152, 26.545481  ,
       45.04287807, 52.88341594, 36.666968  , 14.95502402, 22.91273538,
       17.32868305, 26.51189466, 34.3441332 , 36.85799422, 45.51898193,
       52.17451051, 56.35563086, 66.55284301, 68.96538848, 70.67135895,
       72.50357247, 18.86535957, 23.26905658, 28.70276177, 30.22302385,
       31.55678562, 32.45887534, 10.00872911, 16.79444552, 23.93053296,
       26.72652361, 34.35783703, 39.21809125, 40.40861635, 15.13

## Testing the Model

Pass `drum perf-test` the prediction dataset, which includes all features except the target feature.

In [None]:
%%sh 
drum perf-test --code-dir /content/mlops-examples/drum_overview/custom_model_reg \
--input /content/mlops-examples/drum_overview/data/concrete_test.csv \
--target-type regression

DRUM performance test
Model:      /content/mlops-examples/drum_overview/custom_model_reg
Data:       /content/mlops-examples/drum_overview/data/concrete_test.csv
# Features: 8
Preparing test data...



Running test case with timeout: 600
Running test case: 41 bytes - 1 samples, 100 iterations
Running test case with timeout: 600
Running test case: 0.1MB - 2529 samples, 50 iterations
Running test case with timeout: 600
Running test case: 10MB - 252956 samples, 5 iterations
Running test case with timeout: 600
Running test case: 50MB - 1264781 samples, 1 iterations
Test is done stopping drum server

size   samples   iters   min     avg     max     total (s)   used (MB)   total physical (MB)
41     1         100     0.017   0.018   0.030   1.844       NA




## Validating the Model

In [None]:
%%sh 
drum validation --code-dir /content/mlops-examples/drum_overview/custom_model_reg \
--input /content/mlops-examples/drum_overview/data/concrete_test.csv \
--target-type regression > drum_validation.log

In [None]:
%%sh
tail drum_validation.log



Validation checks results
      Test case          Status   Details
Basic batch prediction   PASSED          
Null value imputation    PASSED          


# Batch Scoring with DRUM
<a id="setup_complete"></a>

At this point our model has been written to disk, and we want to start making predictions with it. To do this, we'll leverage DRUM and its ability to natively handle our scikit-learn model. All we need to do is tell DRUM where the model resides and which data we wish to score.

DRUM supports many frameworks natively. For those it doesn't support off the shelf, we just need to write some custom hooks so DRUM knows how to load the model and score with it. In this example, we'll highlight some very simple custom hooks and provide links to more complex examples.
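As a rough illustration of what such hooks can look like, here is a minimal sketch of a `custom.py`. It is not the file shipped in the example repository. It assumes DRUM's `load_model(code_dir)` and `score(data, model, **kwargs)` hook signatures, and that a regression model must return a single `Predictions` column (which matches the output shown in this notebook). Check the hook reference for the DRUM version you have installed before relying on it.

```python
# custom.py: a hypothetical sketch, not the repository's actual file
import os
import pickle

import pandas as pd

MODEL_FILENAME = "reg_rf_model.pkl"  # the pickle written in the training cell


def load_model(code_dir):
    """Hook DRUM is assumed to call once at startup with the --code-dir path."""
    with open(os.path.join(code_dir, MODEL_FILENAME), "rb") as pkl:
        return pickle.load(pkl)


def score(data, model, **kwargs):
    """Hook DRUM is assumed to call with the input rows as a DataFrame.

    For a regression target, return one column named "Predictions".
    """
    return pd.DataFrame({"Predictions": model.predict(data)})
```

Because the hooks only rely on the model exposing `predict`, the same file would work for any pickled estimator with that method.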

In [None]:
%%sh 
drum score --code-dir /content/mlops-examples/drum_overview/custom_model_reg \
--input /content/mlops-examples/drum_overview/data/concrete_test.csv \
--output /content/mlops-examples/drum_overview/data/predictions.csv --target-type regression

In [None]:
pd.read_csv("/content/mlops-examples/drum_overview/data/predictions.csv").head()

Unnamed: 0,Predictions
0,49.825093
1,16.174895
2,30.261481
3,10.867472
4,4.085552
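The `Unnamed: 0` column above is the name pandas gives to a first column whose header cell is empty. If a downstream consumer only needs the prediction values and does not have pandas available, the file can be read with the standard library. The sketch below is an illustration only: `read_predictions` is a hypothetical helper, and the sample text is built from the rounded values displayed above rather than from the real file.

```python
import csv
import io


def read_predictions(csv_file):
    """Return the "Predictions" column of a DRUM output CSV as floats."""
    reader = csv.DictReader(csv_file)
    return [float(row["Predictions"]) for row in reader]


# Illustrative sample mirroring the head() output shown above:
# an empty header cell for the row index, then the Predictions column.
sample = io.StringIO(",Predictions\n0,49.825093\n1,16.174895\n2,30.261481\n")
print(read_predictions(sample))
```

With a real file, the same helper can be called as `read_predictions(open(path, newline=""))`. Within pandas itself, passing `index_col=0` to `pd.read_csv` would avoid the extra column.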


# Start the inference server locally

Batch scoring can be very useful, but the utility DRUM offers does not stop there. We can also leverage DRUM to serve our model as a RESTful API endpoint. The only thing that changes is how we structure the command: we use `server` mode instead of `score` mode. We'll also need to provide an address that is NOT already in use.

When starting the server, we'll use `subprocess.Popen` so that we can interact with the server from this notebook.

In [None]:
import subprocess
import requests
import pandas as pd
from io import BytesIO
import yaml
import time
import os
import datarobot as dr
from pprint import pprint

In [None]:
run_inference_server = ["drum",
              "server",
              "--code-dir","/content/mlops-examples/drum_overview/custom_model_reg", 
              "--address", "0.0.0.0:6789", 
              "--show-perf",
              "--target-type", "regression",
              "--logging-level", "info",
              "--show-stacktrace",
              "--verbose"
              ]

In [None]:
inference_server = subprocess.Popen(run_inference_server, stdout=subprocess.PIPE)

In [None]:
## confirm the server is running
time.sleep(10) ## snoozing before pinging the server to give it time to actually start
print('check status')
requests.request("GET", "http://0.0.0.0:6789").content

check status


b'{"message":"OK"}\n'
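A fixed ten-second sleep is simple, but it either wastes time or is too short on a slow machine. One alternative is to poll the root endpoint until it answers. The sketch below assumes only what the output above shows, namely that a running server answers `GET /` with HTTP 200. `wait_for_server` and its parameters are hypothetical names, and it uses `urllib` from the standard library rather than `requests`.

```python
import http.client
import time
import urllib.request


def wait_for_server(url, timeout=30.0, interval=0.5, request_timeout=2.0):
    """Poll `url` until it returns HTTP 200 or `timeout` seconds elapse."""
    # The server is local, so bypass any HTTP proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with opener.open(url, timeout=request_timeout) as resp:
                if resp.status == 200:
                    return True
        except (OSError, http.client.HTTPException):
            pass  # not accepting connections yet, or not answering cleanly
        time.sleep(interval)
    return False
```

In this notebook it could replace the sleep with something like `wait_for_server("http://localhost:6789/")`, continuing only when the call returns `True`.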

## Send data to server for inference

The request must provide our dataset as form data. To do so, we'll create a simple Python function that passes the data over appropriately. We'll leverage the same function in our simple Flask app a little later.

In [None]:
def score(data, port = "6789"):
    b_buf = BytesIO()
    b_buf.write(data.to_csv(index=False).encode("utf-8"))
    b_buf.seek(0)
  
    url = "http://localhost:{}/predict/".format(port)
    files = [
        ('X', b_buf)
    ]
    response = requests.request("POST", url, files = files, timeout=None, verify=False)
    return response

In [None]:
# %%timeit
scoring_data = pd.read_csv("/content/mlops-examples/drum_overview/data/concrete_test.csv")
predictions = score(scoring_data).json() ## score the entire dataset and print every prediction
pprint(predictions)

{'predictions': [49.8250934562,
                 16.1748951924,
                 30.2614808518,
                 10.8674724487,
                 4.0855520908,
                 25.1687422202,
                 41.4358833915,
                 5.8062359755,
                 33.4288025954,
                 35.7241716208,
                 51.3077219623,
                 19.2967666781,
                 9.8490543438,
                 9.0220692504,
                 40.0591968127,
                 9.7450124154,
                 5.7791464634,
                 10.0235124561,
                 43.3782929081,
                 48.5846985531,
                 12.4897267396,
                 16.5457761524,
                 27.3534434528,
                 36.0865946802,
                 38.223515226,
                 14.208693829,
                 16.8766901586,
                 32.0510020203,
                 38.0329164806,
                 39.3788701592,
                 14.9269899258,
                

In [None]:
requests.request("GET", "http://0.0.0.0:6789/").content

b'{"message":"OK"}\n'

In [None]:
inference_server.terminate()
inference_server.stdout.readlines()

[b'Detected REST server mode - this is an advanced option\n',
 b'Detected /content/mlops-examples/drum_overview/custom_model_reg/custom.py .. trying to load hooks\n',
 b'\x1b[32m \x1b[0m\n',
 b'\x1b[32m \x1b[0m\n',
 b'\x1b[32mComponent: Prediction Server\x1b[0m\n',
 b'\x1b[32mOutput:\x1b[0m\n',
 b'\x1b[32m------------------------------------------------------------\x1b[0m\n']
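Calling `terminate()` and then reading the pipe works here, but it blocks indefinitely if the process ignores the signal. A more defensive pattern, taken from the general `subprocess` documentation rather than from DRUM, is to wait with a timeout and fall back to `kill()`. In the sketch below `stop_process` and `grace_seconds` are hypothetical names.

```python
import subprocess


def stop_process(proc, grace_seconds=5.0):
    """Terminate `proc`, wait up to `grace_seconds`, then kill it if needed.

    Returns the exit code and whatever was still unread on stdout.
    """
    proc.terminate()
    try:
        out, _ = proc.communicate(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()  # the process did not exit in time
        out, _ = proc.communicate()
    return proc.returncode, out
```

Applied to this notebook, `stop_process(inference_server)` would return the exit code together with the captured log bytes.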

In [None]:
%%sh
# Stop the Flask server
fuser -n tcp -k 6789

## Value Prop

One may ask: what is the benefit here? Well, first off, there is no need for me to write an API to get the model up and running. Second, DRUM allows me to abstract the framework away, provided I'm using one that is natively supported or I can write enough Python for DRUM to understand how to hook up to the model.

For example, I could hot swap models as I see fit.
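One way to picture the swap: the server command stays the same except for the directory passed to `--code-dir`. The sketch below only builds the argument lists, using flags that appear earlier in this notebook. `drum_server_command` is a hypothetical helper, and `/path/to/another_model` is a placeholder, not a directory from the example repository.

```python
def drum_server_command(code_dir, address="0.0.0.0:6789", target_type="regression"):
    """Build the `drum server` argument list for the model in `code_dir`."""
    return [
        "drum", "server",
        "--code-dir", code_dir,
        "--address", address,
        "--target-type", target_type,
    ]


current = drum_server_command("/content/mlops-examples/drum_overview/custom_model_reg")
replacement = drum_server_command("/path/to/another_model")  # hypothetical directory

# Only the --code-dir value differs between the two commands.
changed = [(old, new) for old, new in zip(current, replacement) if old != new]
print(changed)
```

Either list can be handed to `subprocess.Popen` exactly as `run_inference_server` was. The client-side `score` function would not need to change, assuming the new model expects the same input columns.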

While we will run through several other frameworks in `score` mode, you can bet they are supported in `server` mode as well!