<header style="padding:1px;background:#f9f9f9;border-top:3px solid #00b2b1"><img id="Teradata-logo" src="https://www.teradata.com/Teradata/Images/Rebrand/Teradata_logo-two_color.png" alt="Teradata" width="220" align="right" />

<b style = 'font-size:28px;font-family:Arial;color:#E37C4D'>ModelOps demo - ONNX with ModelOps BYOM no code</b>
</header>

![image](images/byom_meth.png) 

## Introduction

This notebook shows how to work with the Bring Your Own Model (BYOM) pattern and BYOM In-Vantage Scoring. With this deployment pattern you can train a model on whatever data science platform you prefer, then import it into Vantage for direct deployment and scoring. To learn more about BYOM, see the official user documentation.

With BYOM you can deploy models directly from an IDE environment, or operationalize them properly through ModelOps. With ModelOps, you get **full governance** over your model deployments and you can **leverage ModelOps automations** for Validation, Scoring and Monitoring. Its user interface lets you audit all the information about your models at any time and provides dashboards to monitor deployments and review their alerts.

This notebook covers the operationalization of the PIMA diabetes use case with the ONNX standard BYOM format. **ONNX** (Open Neural Network Exchange) is an efficient, open model format, originally developed by Microsoft and Facebook, whose adoption as a standard is growing rapidly. Although the name suggests it is primarily related to neural networks, it can be used with most machine learning libraries and algorithms, such as those in sklearn.

## Steps in this Notebook

<li>1. Configure the Environment </li>
    <li>2. Connect to Vantage</li>
    <li>3. Train a model and export to ONNX</li>
    <li>4. Import the ONNX into ModelOps</li>
    <li>5. Go through Lifecycle - Evaluation (Automated Model Report)</li>
    <li>6. Go through Lifecycle - Approve </li>
    <li>7. Go through Lifecycle - Deploy (Publish and Schedule)</li>
    <li>8. Go through Lifecycle - Monitor (Data Drift and Performance)</li>
    <li>9. Configure Monitoring alert threshold (Optional) </li>
    <li>10. On demand Scoring from SQL (Optional)</li>

## Step 1. Configure the Environment

Here, we import the required libraries, set environment variables and environment paths (if required).



#### 1.1 Libraries installation

**A restart of the kernel is needed for the changes to take effect**. We use the -q parameter to keep the installation log quiet; remove it if you want to see every step of the pip installation.

In [None]:
%pip install -q teradataml==17.20.0.3 aoa==7.0.1 pandas==1.1.5 numpy==1.21.6 xgboost==0.90 nyoka==4.3.0 scikit-learn==0.24.2 onnx==1.10.2 skl2onnx==1.11.2 onnxruntime==1.9.0 protobuf==3.20.1 onnxmltools==1.7.0
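
After the kernel restart, it can be useful to confirm that the pinned versions are the ones actually loaded. The following is a minimal sketch, assuming Python 3.8 or newer (for `importlib.metadata`). The helper name `find_version_mismatches` is hypothetical and not part of any Teradata package; the pins are a subset copied from the pip command above.

```python
from importlib import metadata

# Subset of the pins from the pip command above.
EXPECTED = {
    "teradataml": "17.20.0.3",
    "aoa": "7.0.1",
    "scikit-learn": "0.24.2",
    "skl2onnx": "1.11.2",
    "onnxruntime": "1.9.0",
}

def find_version_mismatches(expected):
    """Return {package: (expected, installed)} for every pin that is not met.

    The installed value is None when the package is not installed at all.
    """
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches

# Prints a dict of problems, or a short confirmation when every pin matches.
print(find_version_mismatches(EXPECTED) or "All pinned versions match.")
```

An empty result means every listed package is installed at the pinned version.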

#### 1.2 Libraries import

In [None]:
from teradataml import (
    create_context, 
    remove_context,
    get_context,
    get_connection,
    DataFrame,
    retrieve_byom,
    ONNXPredict,
    configure)
from teradatasqlalchemy.types import *
import os
import pandas as pd
import getpass
import logging
import sys
from xgboost import XGBClassifier
from sklearn.preprocessing import MinMaxScaler
from sklearn.pipeline import Pipeline

## Step 2. Connect to Vantage

<p style = 'font-size:16px;font-family:Arial'>You will be prompted to provide the password. Enter your password, press the Enter key, then use the down arrow to go to the next cell. Run each step with the Shift + Enter keys.</p>

In [None]:
%run -i ../UseCases/startup.ipynb
eng = create_context(host = 'host.docker.internal', username='demo_user', password = password)
print(eng)
eng.execute('''SET query_band='DEMO=04_ModelOps_BYOM_PIMA_ONNX;' UPDATE FOR SESSION; ''')

## Step 3. Train a model and export to ONNX

In [None]:
from xgboost import XGBClassifier
from sklearn.preprocessing import MinMaxScaler
from sklearn.pipeline import Pipeline


train_pdf = DataFrame.from_query(f"""
SELECT 
    F.*, D.hasdiabetes 
FROM pima_patient_features F
JOIN pima_patient_diagnoses D
    ON F.patientid = D.patientid 
    WHERE F.patientid MOD 5 <> 0
""").to_pandas(all_rows=True)

features = ["NumTimesPrg", "Age", "PlGlcConc", "BloodP", "SkinThick", "TwoHourSerIns", "BMI", "DiPedFunc"]
target = "HasDiabetes"

# split data into X and y
X_train = train_pdf[features]
y_train = train_pdf[target]

model = Pipeline([('scaler', MinMaxScaler()),
                  ('xgb', XGBClassifier(eta=0.2, max_depth=6))])

model.fit(X_train, y_train)
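
The `WHERE F.patientid MOD 5 <> 0` filter in the training query acts as a deterministic train/hold-out split: about four out of five patients are used for training, and the remaining ids (those divisible by 5) are the ones scored in Step 10. The sketch below mimics that filter in plain Python. It assumes, purely for illustration, that patient ids are consecutive integers starting at 1, which the source does not state.

```python
def split_by_modulus(patient_ids, modulus=5):
    """Mimic `patientid MOD 5 <> 0` (train) versus `patientid MOD 5 = 0` (hold-out)."""
    train = [pid for pid in patient_ids if pid % modulus != 0]
    holdout = [pid for pid in patient_ids if pid % modulus == 0]
    return train, holdout

# Hypothetical ids 1..20, for illustration only.
train_ids, holdout_ids = split_by_modulus(range(1, 21))
print(len(train_ids), len(holdout_ids))  # 16 4
print(holdout_ids)                       # [5, 10, 15, 20]
```

Because the split depends only on the id, it is reproducible across runs without storing a random seed.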

#### Convert the model to ONNX

Next, we convert the model to ONNX format. This is a bit more involved because the client libraries for converting from sklearn/xgboost to ONNX are not yet very mature.

In [None]:
import os
import numpy as np
from skl2onnx import to_onnx, update_registered_converter
from skl2onnx.common.shape_calculator import calculate_linear_classifier_output_shapes
from onnxmltools.convert.xgboost.operator_converters.XGBoost import convert_xgboost

# skl2onnx has no built-in converter for XGBClassifier, so register the
# onnxmltools converter for it before converting the pipeline.
update_registered_converter(
    XGBClassifier, 'XGBoostXGBClassifier',
    calculate_linear_classifier_output_shapes, convert_xgboost,
    options={'nocl': [True, False], 'zipmap': [True, False, 'columns']})

# Convert the fitted pipeline; X_train (cast to float32) is used to infer the input types.
model_onnx = to_onnx(model, X_train.astype(np.float32), target_opset=15)

# Make sure the output folder exists before writing the serialized model.
os.makedirs("artifacts", exist_ok=True)
with open("artifacts/model.onnx", "wb") as f:
    f.write(model_onnx.SerializeToString())

## Step 4. Import into ModelOps to Operationalize

Note that some of the images may show PMML, because the operationalization steps are the same. Go to the ModelOps UI and import the exported ONNX file as a new model inside the **ModelOps Getting Started** project.

Navigate to the projects screen and select the **ModelOps Getting Started** project. Click on it to navigate to the models screen of the project.

<img src="images/04_01.png" alt="Projects screen" />

If you haven't added a default connection, you will be prompted to add one. Check how to [add a default connection](link here).

Once there, click on the **DEFINE BYOM MODEL** button on the top right of the screen.

<img src="images/04_02.png" alt="Models screen" />

Ensure that **Enable Model Monitoring** is checked and that classification is selected as the model type. Then set the table to the one you have been using (probably your username) and use the following as the Prediction Expression:

```sql
CAST(CAST(json_report AS JSON).JSONExtractValue('$.output_label[0]') AS INT)
```
<img src="images/04_03.png" alt="BYOM model import side sheet" />
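
The Prediction Expression tells ModelOps how to pull the predicted class out of the `json_report` column that the expression refers to. As a rough Python analogue, the sketch below does the same extraction on a hypothetical report string. Only the `output_label` key is taken from the expression above; the sample payload is illustrative, and real reports may carry additional fields.

```python
import json

def extract_prediction(json_report: str) -> int:
    """Python analogue of
    CAST(CAST(json_report AS JSON).JSONExtractValue('$.output_label[0]') AS INT).
    """
    report = json.loads(json_report)
    # '$.output_label[0]' selects the first element of the output_label array.
    return int(report["output_label"][0])

# Hypothetical report payload, for illustration only.
sample = '{"output_label": [1]}'
print(extract_prediction(sample))  # 1
```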

Then click on **SAVE**, and the import job will start.

<img src="images/04_04.png" alt="BYOM model import job log" />





Go and try this Step by yourself. Launch ModelOps from this button below:

[![image](images/launchModelOps.png)](/modelops)

## Step 5. Go through Lifecycle - Evaluation (Automated Model Report)

Open the imported ModelID Lifecycle screen. Review the imported artifacts and evaluate the model with the default logic (no code needed).

<img src="images/04_05.png" alt="BYOM model lifecycle"/>


Select the evaluation dataset and run the evaluation process.
                                 
<img src="images/04_06.png" alt="Evaluation" width="500" height="500"/>

Now you can review the model evaluation report, generated with default metrics and a confusion matrix plot. This can be customized with custom evaluation logic (later in this notebook).

Click on View Report from the ModelID Lifecycle screen.

<img src="images/04_07.png" alt="Model Evaluation Report"/>

Go and try this Step by yourself. Launch ModelOps from this button below:

[![image](images/launchModelOps.png)](/modelops)

## Step 6. Go through Lifecycle - Approve 

Open the imported ModelID Lifecycle screen. Click on the Approve button to move on to the next stage.

Include a description of the approval and accept.

<img src="images/04_08.png" alt="Approval" width="500" height="500"/>



Go and try this Step by yourself. Launch ModelOps from this button below:

[![image](images/launchModelOps.png)](/modelops)

## Step 7. Go through Lifecycle - Deploy (Publish and Schedule)

Open the imported ModelID Lifecycle screen. Click on the Deploy button to publish the model in Vantage.

<img src="images/04_09.png" alt="Model Lifecycle screen"/>

Now select the In-Vantage engine and click Next.

<img src="images/04_10.png" alt="Deployment Engine" width="500" height="500"/>

Now select the Connection (optionally, you could use a Service Connection here to run in Production), the Database and the table where you want to publish your BYOM model. Use your user and the table "aoa_byom_models".

<img src="images/04_11.png" alt="Deployment Publish" width="500" height="500"/>

Now choose whether you want to schedule your model scoring, and select the dataset template that determines where your predictions are stored. Then run the deploy process.

<img src="images/04_12.png" alt="Deployment Schedule" width="500" height="500"/>


Go and try this Step by yourself. Launch ModelOps from this button below:

[![image](images/launchModelOps.png)](/modelops)

## Step 8. Go through Lifecycle - Monitor (Data Drift and Performance)

Open the Deployment and review details of the deployed model

<img src="images/04_14.png" alt="Deployment details"/>

You can run the prediction job from here: click the button to start it.

<img src="images/04_15.png" alt="Deployment Scoring Job" />


You can also review the predictions. You can copy this query and run it in your SQL IDE, or later in a notebook.

<img src="images/04_16.png" alt="Deployment Predictions" width="500" height="500"/> <img src="images/04_17.png" alt="Deployment Predictions query" width="500" height="500"/>

Review your Feature Drift and Prediction Drift. Here ModelOps compares the distribution of the data between training and evaluation/scoring. The comparison is updated on every scoring run, and different KPIs are calculated. The one used by default for monitoring is the Population Stability Index (PSI).

<img src="images/04_18.png" alt="Feature Drift" />
<img src="images/04_19.png" alt="Prediction Drift" />
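
To make the PSI concrete, here is a minimal sketch of the textbook formula, PSI = Σ (aᵢ − eᵢ) · ln(aᵢ / eᵢ), where eᵢ and aᵢ are the shares of rows falling into bin i at training time and at scoring time. The source does not describe how ModelOps bins the data, so the sketch assumes the bin shares are already computed; the small `eps` floor for empty bins is also an assumption.

```python
import math

def population_stability_index(expected, actual, eps=1e-6):
    """PSI between two binned distributions given as per-bin proportions."""
    if len(expected) != len(actual):
        raise ValueError("both distributions need the same number of bins")
    psi = 0.0
    for e, a in zip(expected, actual):
        e = max(e, eps)  # floor empty bins so the logarithm is defined
        a = max(a, eps)
        psi += (a - e) * math.log(a / e)
    return psi

training = [0.25, 0.25, 0.25, 0.25]  # hypothetical bin shares at training time
scoring = [0.10, 0.20, 0.30, 0.40]   # hypothetical bin shares at scoring time

print(round(population_stability_index(training, training), 4))  # 0.0
print(round(population_stability_index(training, scoring), 4))   # 0.2282
```

Identical distributions give a PSI of 0, and the value grows as the scoring distribution moves away from the training one.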


For Performance Monitoring, we track the model's metrics over time. To generate new metrics, we need to create a new evaluation dataset and run an evaluation job.

<img src="images/04_20.png" alt="Performance Drift" />

This has been generated using the following query as the evaluation dataset target:

`SELECT * FROM pima_patient_diagnoses F WHERE F.patientid MOD 8 <> 0`

Now the performance has changed:

<img src="images/04_21.png" alt="Performance Drift change" />
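
The evaluation query above keeps rows where `patientid MOD 8 <> 0`, while the training query in Step 3 kept rows where `patientid MOD 5 <> 0`. The two filters are independent, so the new evaluation dataset mixes rows the model was trained on with rows it has not seen. The sketch below counts both groups, again assuming hypothetical consecutive ids for illustration.

```python
def evaluation_overlap(patient_ids, train_mod=5, eval_mod=8):
    """Count evaluation rows that were also training rows versus unseen rows."""
    seen = unseen = 0
    for pid in patient_ids:
        if pid % eval_mod == 0:
            continue      # excluded by `patientid MOD 8 <> 0`
        if pid % train_mod != 0:
            seen += 1     # also matched the training filter `MOD 5 <> 0`
        else:
            unseen += 1   # held out from training
    return seen, unseen

# Hypothetical ids 1..40.
print(evaluation_overlap(range(1, 41)))  # (28, 7)
```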

Go and try this Step by yourself. Launch ModelOps from this button below:

[![image](images/launchModelOps.png)](/modelops)

## Step 9. Configure Monitoring alert threshold (Optional) 

We can update the alert configuration for the models.

First enable Alert from Model catalog

<img src="images/04_23.png" alt="Alert Configuration" /> 

Now, you can change your default alerting mechanism.

Go to your Model, and find the Tab "Alert Configuration"

<img src="images/04_22.png" alt="Alert Configuration" /> 

Let's change the PSI threshold from 0.2 to 0.01. Use the Edit option:

<img src="images/04_24.png" alt="Alert Configuration PSI" width="500" height="500"/> <img src="images/04_25.png" alt="Alert Configuration PSI" width="500" height="500"/> 
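
Lowering the PSI threshold from 0.2 to 0.01 makes the alert much more sensitive: drift values that previously passed now trigger. The sketch below illustrates this with hypothetical per-feature PSI values. The feature names come from Step 3, the numbers are invented, and the strict "greater than" comparison is an assumption, since the source does not state the exact rule ModelOps applies.

```python
def features_over_threshold(psi_by_feature, threshold):
    """Return the features whose PSI is strictly above the alert threshold."""
    return sorted(name for name, psi in psi_by_feature.items() if psi > threshold)

# Hypothetical PSI values, not taken from a real ModelOps run.
psi_by_feature = {"Age": 0.004, "BMI": 0.03, "PlGlcConc": 0.12, "BloodP": 0.25}

print(features_over_threshold(psi_by_feature, 0.2))   # ['BloodP']
print(features_over_threshold(psi_by_feature, 0.01))  # ['BMI', 'BloodP', 'PlGlcConc']
```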

After a few minutes, the lower threshold generates new alerts on the deployed model. Go to Menu -> Alert and review the alerts generated for your model.

<img src="images/04_26.png" alt="Alerts" />

Now go to ModelID and click on View Model Drift

<img src="images/04_27.png" alt="Model Lifecycle View Model Drift" width="500" height="500"/>

Now review the Feature Drift. You can also reach this screen from your deployments.
<img src="images/04_28.png" alt="Feature Drift" />

Go and try this Step by yourself. Launch ModelOps from this button below:

[![image](images/launchModelOps.png)](/modelops)

## Step 10. On demand Scoring from SQL (Optional)

### 10.1 View Published Models

Once deployed via ModelOps, we can view the models published to Vantage by querying the table they are published to. Note that this information is also available via the AOA APIs.

In [None]:
DataFrame.from_query("SELECT TOP 10 * FROM aoa_byom_models WHERE model_type='ONNX'").head(10)

### 10.2 On-Demand Scoring
Configuring VAL and BYOM locations:

In [None]:
# configure byom/val installation
configure.val_install_location = "VAL"
configure.byom_install_location = "MLDB"


Replace model_version with the one we have operationalized:

In [None]:
#teradataml version
model_version="32d3f5c0-4ac8-45b7-944a-352c3b1ffc7a"

model = DataFrame.from_query(f"""
SELECT * FROM aoa_byom_models 
    WHERE model_version='{model_version}'
""")


preds = ONNXPredict(
        modeldata=model,
        newdata=DataFrame.from_query("SELECT * FROM pima_patient_features WHERE patientid MOD 5 = 0"),
        accumulate=['PatientId'])

preds.result.head(10)

In [None]:
#SQL version
query = f"""
SELECT * FROM MLDB.ONNXPredict (
    ON (SELECT * FROM pima_patient_features WHERE patientid MOD 5 = 0) AS InputTable
    ON (SELECT * FROM aoa_byom_models 
            WHERE model_version='{model_version}') AS ModelTable DIMENSION
    USING
      Accumulate ('patientid')
) AS td;
"""

DataFrame.from_query(query).head(10)
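
Both scoring cells interpolate `model_version` into the SQL text through an f-string. Since the version id shown above is a UUID, one cautious option is to validate the value before building the query. The following is a sketch under that assumption (model versions are always UUID strings); the helper name `build_onnx_predict_query` is hypothetical.

```python
import uuid

def build_onnx_predict_query(model_version: str) -> str:
    """Build the MLDB.ONNXPredict query after checking that model_version is a UUID."""
    # uuid.UUID raises ValueError for anything that is not a well-formed UUID;
    # str() then renders it in canonical form, which keeps stray quotes or
    # SQL fragments out of the query text.
    version = str(uuid.UUID(model_version))
    return f"""
SELECT * FROM MLDB.ONNXPredict (
    ON (SELECT * FROM pima_patient_features WHERE patientid MOD 5 = 0) AS InputTable
    ON (SELECT * FROM aoa_byom_models
            WHERE model_version='{version}') AS ModelTable DIMENSION
    USING
      Accumulate ('patientid')
) AS td;
"""

query = build_onnx_predict_query("32d3f5c0-4ac8-45b7-944a-352c3b1ffc7a")
print(query)
```

The resulting string can be passed to `DataFrame.from_query(query)` exactly as in the SQL cell above.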

<footer style="padding:10px;background:#f9f9f9;border-bottom:3px solid #394851">©2023 Teradata. All Rights Reserved</footer>