<a href="https://colab.research.google.com/github/datarobot-community/mlops-examples/blob/master/MLOps20DRUM/Main_Script.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Using MLOps DRUM to test your custom models
**Author**: Matthew Cohen

#### Scope
This notebook provides examples of using the MLOps DRUM library to test custom models locally.

It covers both a regression and a binary classification example. For each:
1. Create a new random forest model
1. Implement a function in custom.py to do additional pre/post-processing of prediction requests
1. Validate that the model stands up to errors in the input data
1. Request predictions with a test dataset

There are also examples to: 
- Test batch predictions
- Run drum as a web service
- Train a custom model


In [None]:
#Clone the repository
!git clone https://github.com/datarobot-community/mlops-examples

In [None]:
!pip install -r /content/mlops-examples/'MLOps DRUM'/requirements.txt

In [1]:
import pandas as pd
import numpy as np
import os
from sklearn.svm import SVR
from sklearn.ensemble import RandomForestRegressor
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
import pickle

## Train a regression model

A simple RandomForestRegressor to predict house prices in Boston.

In [2]:
# Read the train and test data
TRAIN_DATA_REG = "/content/mlops-examples/MLOps DRUM/data/boston_housing_train.csv"  # 14 columns: 13 features plus the MEDV target
TEST_DATA_REG = "/content/mlops-examples/MLOps DRUM/data/boston_housing_test.csv"  # 13 features; the target is removed

reg_X_train = pd.read_csv(TRAIN_DATA_REG)
reg_Y_train = reg_X_train.pop('MEDV')

reg_X_test = pd.read_csv(TEST_DATA_REG)

# Fit the model
reg_rf_model = RandomForestRegressor()
reg_rf_model.fit(reg_X_train, reg_Y_train)

# Pickle the file and write it to the file system
with open("/content/mlops-examples/MLOps DRUM/custom_model_reg/reg_rf_model.pkl", 'wb') as pkl:
    pickle.dump(reg_rf_model, pkl)
    
# Call predict to confirm it works
reg_rf_model.predict(reg_X_test)

array([25.82 , 21.722, 34.747, 33.971, 35.756, 26.917, 21.962, 23.533,
       16.948])

## Generate the model template file for any additional pipeline processing

This file, custom.py, is optional but allows you to insert additional processing steps into the flow of getting predictions.  The following functions are available:

* init
* load_model
* transform
* score
* post_process

Place the file in the location specified by the --code-dir argument.  For this example, you must edit the transform function in custom.py to impute any null values to 0.  See the comments in custom.py for a description of each function.
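
As a rough illustration of the edit described above, the `transform` function might look like the following. This is a minimal sketch, not the repository's actual file: it assumes DRUM passes the input to `transform` as a pandas DataFrame together with the loaded model and expects a DataFrame back. The small sample frame is hypothetical; check the comments in custom.py for the exact signature in your DRUM version.

```python
import pandas as pd


def transform(data, model):
    """Impute missing values before the data reaches the model.

    Assumption: `data` is a pandas DataFrame and `model` is the loaded
    model object (unused here).
    """
    # Replace every null value with 0, as this example requires
    return data.fillna(0)


# Quick local check with a small hypothetical frame
sample = pd.DataFrame({"CRIM": [0.1, None], "RM": [None, 6.2]})
cleaned = transform(sample, model=None)
print(cleaned)
```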

## Validate that the regression model can handle data with errors

The validation check takes the input file and alters it to test various failure conditions, such as setting column values to null.  This is why the transform function in custom.py must impute null values to 0 in this example.

In [None]:
!drum validation --code-dir /content/mlops-examples/'MLOps DRUM'/custom_model_reg --input /content/mlops-examples/'MLOps DRUM'/data/boston_housing_test.csv --target-type regression
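
To get a feel for what a null-value check exercises, you can run a similar test by hand. The sketch below is not DRUM's implementation. It blanks out one column at a time, applies the same fill-with-0 imputation, and confirms that nothing null would reach the model. The helper names and the tiny data frame are hypothetical.

```python
import numpy as np
import pandas as pd


def impute_nulls(data):
    # Same imputation this example asks for in custom.py's transform function
    return data.fillna(0)


def columns_surviving_null_check(data):
    """Blank out each column in turn and report which ones are fully imputed."""
    survived = []
    for column in data.columns:
        corrupted = data.copy()
        corrupted[column] = np.nan  # simulate a column arriving as all nulls
        cleaned = impute_nulls(corrupted)
        if not cleaned.isnull().values.any():
            survived.append(column)
    return survived


# Hypothetical stand-in for a few rows of the test dataset
frame = pd.DataFrame({"CRIM": [0.02, 0.03], "RM": [6.5, 7.1], "AGE": [65.2, 78.9]})
print(columns_surviving_null_check(frame))
```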

## Test that the regression model can return predictions

Input the prediction dataset that includes all features except the target feature.

In [11]:
!drum score --code-dir /content/mlops-examples/'MLOps DRUM'/custom_model_reg --input /content/mlops-examples/'MLOps DRUM'/data/boston_housing_test.csv --target-type regression --output cmrunner_test_pred_results.csv --verbose

Detected score mode
Detected /Users/thodoris.petropoulos/github/mlops-examples-wip/MLOps DRUM/custom_model_reg/custom.py .. trying to load hooks
[32m [0m
[32m [0m
[32mComponent: generic_predictor[0m
[32mLanguage:  Python[0m
[32mOutput:[0m
[32m------------------------------------------------------------[0m
[32m------------------------------------------------------------[0m
[32mRuntime:    0.0 sec[0m
[32mNR outputs: 0[0m
[32m [0m


## Testing model performance

Use this to assess model response time for prediction requests.

In [15]:
!drum perf-test --code-dir /content/mlops-examples/'MLOps DRUM'/custom_model_reg --input /content/mlops-examples/'MLOps DRUM'/data/boston_housing_test.csv   --target-type regression

DRUM performance test
Model:      /Users/thodoris.petropoulos/github/mlops-examples-wip/MLOps DRUM/custom_model_reg
Data:       /Users/thodoris.petropoulos/github/mlops-examples-wip/MLOps DRUM/data/boston_housing_test.csv
# Features: 13
Preparing test data...



Running test case with timeout: 180
Running test case: 72 bytes - 1 samples, 100 iterations
Processing |################################| 100/100
Running test case with timeout: 180
Running test case: 0.1MB - 1449 samples, 50 iterations
Processing |################################| 50/50
Running test case with timeout: 180
Running test case: 10MB - 144964 samples, 5 iterations
Processing |################################| 5/5
Running test case with timeout: 180
Running test case: 50MB - 724823 samples, 1 iterations
Processing |################################| 1/1
Test is done stopping drum server
  size     samples   iters    min     avg     max    used (MB) 
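
The min, avg, and max columns in that table are per-call timing summaries. If you want a quick approximation of the same idea without the CLI, a small timing loop is enough. The sketch below is not how `drum perf-test` measures anything. It uses only the standard library, and `time_calls` and `fake_predict` are hypothetical names standing in for your own prediction call.

```python
import time


def time_calls(predict_fn, payload, iterations):
    """Call predict_fn(payload) repeatedly and return (min, avg, max) in seconds."""
    durations = []
    for _ in range(iterations):
        start = time.perf_counter()
        predict_fn(payload)
        durations.append(time.perf_counter() - start)
    return min(durations), sum(durations) / len(durations), max(durations)


# Hypothetical stand-in for a model's predict call
def fake_predict(rows):
    return [sum(row) for row in rows]


fastest, average, slowest = time_calls(fake_predict, [[1.0, 2.0, 3.0]] * 1000, iterations=50)
print(f"min={fastest:.6f}s avg={average:.6f}s max={slowest:.6f}s")
```

To time a real model, pass something like `reg_rf_model.predict` and a DataFrame in place of the stand-ins.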

## Prediction server mode

The code below launches DRUM as a server and blocks the notebook until the server is stopped.  To test that it responds to prediction requests, issue this command in a terminal shell or another notebook environment:

curl -F "X=@./data/boston_housing_test.csv" localhost:6789/predict/
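
The same request can be issued from Python. The sketch below assumes the `requests` package is installed and reuses the form field name `X` and the address from the curl command. The helper name `build_predict_request` is hypothetical. It only builds the multipart POST, so it can be inspected without a running server; the commented lines show how it might be sent.

```python
import io

import requests


def build_predict_request(csv_bytes, url="http://localhost:6789/predict/"):
    """Build (but do not send) the multipart POST that the curl command issues."""
    # The field name "X" mirrors `-F "X=@..."` in the curl command
    files = {"X": ("boston_housing_test.csv", io.BytesIO(csv_bytes), "text/csv")}
    return requests.Request("POST", url, files=files).prepare()


# Hypothetical two-line CSV payload, used only to inspect the request
prepared = build_predict_request(b"CRIM,RM\n0.02,6.5\n")
print(prepared.method, prepared.url)
print(prepared.headers["Content-Type"])

# To send it while the server is running:
# with requests.Session() as session:
#     response = session.send(prepared)
#     print(response.text)
```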

In [19]:
!drum server --code-dir /content/mlops-examples/'MLOps DRUM'/custom_model_reg --target-type regression --address localhost:6789

For examples of fitting models using DRUM, see the [Readmissions examples](https://github.com/datarobot-community/mlops-examples/tree/master/Custom%20Model%20Examples/Readmissions).