In [39]:
from datetime import datetime
import time
import joblib
import pytz
import sys

from pathlib import Path
import numpy as np
import pandas as pd

from arthurai import ArthurAI
from arthurai.common.constants import InputType, OutputType, Stage

sys.path.append("..")
from model_utils import load_datasets

In this guide, we'll use the credit dataset (and a pre-trained model) to onboard a new batch model to the Arthur platform. We'll register the model using a sample of the training data, mark a few attributes to be monitored for bias, and enable explainability.

## Set Up Connection
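Authenticate with the platform using your API key. In this example, the credentials are passed to the client through environment variables rather than typed into the notebook.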

In [41]:
# credentials are being passed to the client via environment variables
connection = ArthurAI()
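Because the client above reads its credentials from the environment, a missing variable only shows up when the connection fails. The sketch below is one way to fail fast before calling `ArthurAI()`. It assumes the SDK looks for `ARTHUR_ENDPOINT_URL` and `ARTHUR_API_KEY`; confirm those names against the documentation for your SDK version. `missing_credentials` is a hypothetical helper, not part of the SDK.

```python
import os

# Assumed variable names; verify them against your SDK version's documentation.
REQUIRED_VARS = ("ARTHUR_ENDPOINT_URL", "ARTHUR_API_KEY")


def missing_credentials(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_credentials()
if missing:
    print("Set these variables before connecting:", ", ".join(missing))
```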

## Create Model

We'll instantiate a model object with a small amount of metadata about the model input and output types. Then, we'll use a sample of the training data to register the full data schema for this Tabular model.

In [42]:
arthur_model = connection.model(partner_model_id=f"CreditRiskModel_Batch_FG-{datetime.now().strftime('%Y%m%d%H%M%S')}",
                                display_name="Credit Risk Batch",
                                input_type=InputType.Tabular,
                                output_type=OutputType.Multiclass,
                                is_batch=True)

In [43]:
(X_train, Y_train), (X_test, Y_test) = load_datasets("../fixtures/datasets/credit_card_default.csv")

In [44]:
Y_train.head()

24240    0
8264     0
19104    1
25383    0
20824    0
Name: default payment next month, dtype: int64

In [45]:
X_train.head()

Unnamed: 0,LIMIT_BAL,SEX,EDUCATION,MARRIAGE,AGE,PAY_0,PAY_2,PAY_3,PAY_4,PAY_5,...,BILL_AMT3,BILL_AMT4,BILL_AMT5,BILL_AMT6,PAY_AMT1,PAY_AMT2,PAY_AMT3,PAY_AMT4,PAY_AMT5,PAY_AMT6
24240,360000,1,2,2,31,0,0,0,0,0,...,19013,17929,20759,3436,5059,1383,2000,3030,2009,2018
8264,360000,1,2,1,36,0,0,-1,-1,-1,...,1901,2355,4206,5889,1204,1901,2355,4206,5889,0
19104,30000,2,2,1,39,2,0,0,2,2,...,14911,15819,15346,16666,1500,3000,1500,0,1500,2000
25383,30000,2,2,2,29,2,2,0,0,0,...,36775,36914,33305,33688,0,2206,1500,2000,6000,6000
20824,220000,1,2,2,30,0,0,0,0,0,...,189016,137788,106713,96588,7667,7261,5170,4054,4500,1225


In [46]:
# load our pre-trained classifier so we can generate predictions
sk_model = joblib.load("../fixtures/serialized_models/credit_model.pkl")

# get model predictions
preds = sk_model.predict_proba(X_train)
X_train["prediction_1"] = preds[:, 1]

# get ground truth labels
X_train["gt"] = Y_train

We need to register the schema for the outputs of the model: what will a typical prediction look like and what will a typical ground truth look like? What names, shapes, and datatypes should Arthur expect for these objects?

We'll do this all in one step with the *.build()* method. All we need to supply is:
  * the training dataframe
  * the name of the ground truth column
  * the mapping that relates predictions to ground truth
  
Our classifier makes predictions about class *0* and class *1* and returns a probability score for each class. Here we register only the positive-class score, under the name *prediction_1*, and keep the ground truth in a single column, *gt*, that holds either 0 or 1. We link these in a dictionary that maps *prediction_1* to the ground truth value 1 and pass that to the model.
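A typo in the mapping, or a ground truth value that never occurs in the data, is easier to catch before the schema is inferred. The sketch below checks a prediction-to-ground-truth dictionary against a list of column names and the observed labels. `check_prediction_map` is a hypothetical helper written for this guide, and the column list is a shortened illustration, not the full dataset.

```python
def check_prediction_map(columns, ground_truth_values, pred_to_gt_map):
    """Return a list of problems found in a prediction-to-ground-truth map."""
    problems = []
    observed = set(ground_truth_values)
    for pred_column, gt_value in pred_to_gt_map.items():
        # Every prediction column named in the map must exist in the data.
        if pred_column not in columns:
            problems.append(f"missing prediction column: {pred_column}")
        # Every mapped ground truth value should appear at least once.
        if gt_value not in observed:
            problems.append(f"ground truth value never observed: {gt_value}")
    return problems


# Shortened, illustrative column list and labels.
columns = ["LIMIT_BAL", "SEX", "prediction_1", "gt"]
labels = [0, 0, 1, 0, 0]
print(check_prediction_map(columns, labels, {"prediction_1": 1}))
```

An empty list means the mapping is consistent with the data it was checked against.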

In [47]:
# Map our prediction attribute to the ground truth value
prediction_to_ground_truth_map = {
    "prediction_1": 1
}

arthur_model.build(X_train, 
                   ground_truth_column="gt",
                   pred_to_ground_truth_map=prediction_to_ground_truth_map)

2022-07-29 10:51:03,375 - arthurai.core.models - INFO - Please review the inferred schema. If everything looks correct, lock in your model by calling arthur_model.save()


Unnamed: 0,name,stage,value_type,categorical,is_unique,categories,bins,range,monitor_for_bias
0,LIMIT_BAL,PIPELINE_INPUT,INTEGER,False,False,[],,"[10000, 1000000]",False
1,SEX,PIPELINE_INPUT,INTEGER,True,False,"[{value: 1}, {value: 2}]",,"[None, None]",False
2,EDUCATION,PIPELINE_INPUT,INTEGER,True,False,"[{value: 0}, {value: 1}, {value: 2}, {value: 3...",,"[None, None]",False
3,MARRIAGE,PIPELINE_INPUT,INTEGER,True,False,"[{value: 0}, {value: 1}, {value: 2}, {value: 3}]",,"[None, None]",False
4,AGE,PIPELINE_INPUT,INTEGER,False,False,[],,"[21, 75]",False
5,PAY_0,PIPELINE_INPUT,INTEGER,True,False,"[{value: 0}, {value: 1}, {value: 2}, {value: 3...",,"[None, None]",False
6,PAY_2,PIPELINE_INPUT,INTEGER,True,False,"[{value: 0}, {value: 1}, {value: 2}, {value: 3...",,"[None, None]",False
7,PAY_3,PIPELINE_INPUT,INTEGER,True,False,"[{value: 0}, {value: 1}, {value: 2}, {value: 3...",,"[None, None]",False
8,PAY_4,PIPELINE_INPUT,INTEGER,True,False,"[{value: 0}, {value: 1}, {value: 2}, {value: 3...",,"[None, None]",False
9,PAY_5,PIPELINE_INPUT,INTEGER,True,False,"[{value: 0}, {value: 2}, {value: 3}, {value: 4...",,"[None, None]",False


## Bias Monitoring

Next, we will specify that a few attributes (SEX, AGE, and EDUCATION) should be monitored for bias. We can do this for either categorical or continuous attributes. For continuous attributes, we break up the continuous range into bins by providing cutoff values. For more details, please see https://docs.arthur.ai/user-guide/bias.html

In [48]:
arthur_model.get_attribute("SEX", stage=Stage.ModelPipelineInput).monitor_for_bias = True
arthur_model.get_attribute("EDUCATION", stage=Stage.ModelPipelineInput).monitor_for_bias = True
arthur_model.get_attribute("AGE",stage=Stage.ModelPipelineInput).monitor_for_bias = True
arthur_model.get_attribute("AGE", stage=Stage.ModelPipelineInput).set(bins = [None, 35, 55, None])

ArthurAttribute(name='AGE', value_type='INTEGER', stage='PIPELINE_INPUT', id=None, label=None, position=4, categorical=False, min_range=21, max_range=75, monitor_for_bias=True, categories=None, bins=[AttributeBin(continuous_start=None, continuous_end=35), AttributeBin(continuous_start=35, continuous_end=55), AttributeBin(continuous_start=55, continuous_end=None)], is_unique=False, is_positive_predicted_attribute=False, attribute_link=None, gt_class_link=None)
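The output above shows how the cutoff list `[None, 35, 55, None]` becomes three bins, with `None` marking an open end. The sketch below reproduces that pairing in plain Python and assigns ages to bins. Which side of a cutoff is inclusive is an assumption here (start inclusive, end exclusive); check the Arthur documentation for the platform's actual rule. Both functions are hypothetical helpers.

```python
def bins_from_cutoffs(cutoffs):
    """Pair adjacent cutoffs into (start, end) bins; None means unbounded."""
    return list(zip(cutoffs[:-1], cutoffs[1:]))


def bin_index(value, bins):
    """Return the index of the bin that contains value.

    Assumption: the start of a bin is inclusive and the end is exclusive.
    """
    for index, (start, end) in enumerate(bins):
        above_start = start is None or value >= start
        below_end = end is None or value < end
        if above_start and below_end:
            return index
    raise ValueError(f"{value} falls outside every bin")


age_bins = bins_from_cutoffs([None, 35, 55, None])
print(age_bins)
print([bin_index(age, age_bins) for age in (21, 35, 54, 75)])
```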

Finally, for ease of reading and later use, we can optionally supply human-readable labels for the values of the attributes. For example, if a column such as SEX is encoded as the integers 1 and 2, we can provide a mapping to string names so that things are easier to understand in the dashboard.

In [49]:
arthur_model.set_attribute_labels(attribute_name="SEX",
                            labels={1: "Male", 2: "Female"})
arthur_model.set_attribute_labels(attribute_name="EDUCATION",
                            labels={1: "Graduate School", 2: "University",
                                    3: "High School", 4: "Less Than High School",
                                    5: "Unknown", 6: "Unreported", 0: "Other"})
arthur_model.set_attribute_labels(attribute_name="MARRIAGE",
                            labels={1: "Married", 2: "Single",
                                    3: "Other", 0: "Unknown"})
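Before attaching labels, it can help to confirm that the mapping covers every encoded value in the data. The sketch below uses the MARRIAGE labels from the cell above with two hypothetical helpers: one lists values that have no label, the other shows how encoded values would read once labelled. It only illustrates the idea and does not call the SDK.

```python
def unlabeled_values(values, labels):
    """Return the sorted encoded values that have no entry in labels."""
    return sorted(set(values) - set(labels))


def apply_labels(values, labels):
    """Replace encoded values with labels; unlabeled values pass through as strings."""
    return [labels.get(value, str(value)) for value in values]


marriage_labels = {1: "Married", 2: "Single", 3: "Other", 0: "Unknown"}

print(unlabeled_values([0, 1, 2, 3, 4], marriage_labels))
print(apply_labels([1, 2, 0], marriage_labels))
```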

Before saving, be sure to review your model to make sure everything is correct.

When saving your model, the data is saved as the reference set, which is used as the baseline data for tracking data drift. Often, this is the training data for the associated model. Our reference dataset should include:
  * inputs 
  * ground truth
  * model predictions
  
This way, Arthur can monitor for drift and stability in all of these aspects. 
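To make "drift against a reference set" concrete, the sketch below computes the population stability index (PSI) for one categorical attribute, comparing a reference sample with a later batch. PSI is one common drift score chosen here for illustration; this guide does not state which metrics Arthur computes, so treat it as an example of the idea, not a description of the platform. The sample values are made up.

```python
import math
from collections import Counter


def category_shares(values, categories, eps=1e-6):
    """Share of each category, floored at eps so the logarithm stays defined."""
    counts = Counter(values)
    total = len(values)
    return [max(counts[category] / total, eps) for category in categories]


def population_stability_index(reference, current, categories):
    """Sum of (current - reference) * ln(current / reference) over categories."""
    ref_shares = category_shares(reference, categories)
    cur_shares = category_shares(current, categories)
    return sum(
        (cur - ref) * math.log(cur / ref)
        for ref, cur in zip(ref_shares, cur_shares)
    )


# Made-up samples of an attribute encoded as 1 or 2.
reference = [1] * 40 + [2] * 60
same_mix = [1] * 4 + [2] * 6
shifted = [1] * 70 + [2] * 30

print(population_stability_index(reference, same_mix, [1, 2]))
print(population_stability_index(reference, shifted, [1, 2]))
```

A score of zero means the two samples have the same category mix; larger values mean the batch has moved further from the reference.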

In [50]:
arthur_model.review()

Unnamed: 0,name,stage,value_type,categorical,is_unique,categories,bins,range,monitor_for_bias
0,LIMIT_BAL,PIPELINE_INPUT,INTEGER,False,False,[],,"[10000, 1000000]",False
1,SEX,PIPELINE_INPUT,INTEGER,True,False,"[{label: Male, value: 1}, {label: Female, valu...",,"[None, None]",True
2,EDUCATION,PIPELINE_INPUT,INTEGER,True,False,"[{label: Graduate School, value: 1}, {label: U...",,"[None, None]",True
3,MARRIAGE,PIPELINE_INPUT,INTEGER,True,False,"[{label: Married, value: 1}, {label: Single, v...",,"[None, None]",False
4,AGE,PIPELINE_INPUT,INTEGER,False,False,[],"[AttributeBin(continuous_start=None, continuou...","[21, 75]",True
5,PAY_0,PIPELINE_INPUT,INTEGER,True,False,"[{value: 0}, {value: 1}, {value: 2}, {value: 3...",,"[None, None]",False
6,PAY_2,PIPELINE_INPUT,INTEGER,True,False,"[{value: 0}, {value: 1}, {value: 2}, {value: 3...",,"[None, None]",False
7,PAY_3,PIPELINE_INPUT,INTEGER,True,False,"[{value: 0}, {value: 1}, {value: 2}, {value: 3...",,"[None, None]",False
8,PAY_4,PIPELINE_INPUT,INTEGER,True,False,"[{value: 0}, {value: 1}, {value: 2}, {value: 3...",,"[None, None]",False
9,PAY_5,PIPELINE_INPUT,INTEGER,True,False,"[{value: 0}, {value: 2}, {value: 3}, {value: 4...",,"[None, None]",False


Before saving, you can also review your model to make sure everything is correct from the output of `arthur_model.build()` or via `arthur_model.review()`.

If you've already created your model, you can fetch it from the Arthur API. Retrieve a Model ID from the output of the `arthur_model.save()` call below, or the URL of your model page in the Arthur Dashboard.

In [51]:
model_id = arthur_model.save()

2022-07-29 10:51:14,625 - arthurai.core.data_service - INFO - Starting upload (1.374 MB in 1 files), depending on data size this may take a few minutes
2022-07-29 10:51:14,917 - arthurai.core.data_service - INFO - Upload completed: /var/folders/hl/bdslq5454bx2hb8xz6s19ggm0000gn/T/tmpplrc1032/cac1db5d-b4c7-40e6-aa68-04723ee55831-0.parquet


In [52]:
# you can fetch a model by ID. for example pull the last-created model:
# with open("fullguide_model_id.txt", "r") as f:
#     model_id = f.read()
# arthur_model = connection.get_model(model_id)
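The commented cell above reads a model ID back from `fullguide_model_id.txt`, which implies the ID was written there after `save()`. A minimal sketch of that round trip follows. The helper names and the placeholder ID are hypothetical, and a temporary directory is used so the example leaves no files behind.

```python
import tempfile
from pathlib import Path


def save_model_id(model_id, path):
    """Write the model ID to a text file for later sessions."""
    Path(path).write_text(model_id)


def load_model_id(path):
    """Read the model ID back, dropping any trailing newline."""
    return Path(path).read_text().strip()


with tempfile.TemporaryDirectory() as tmp_dir:
    id_file = Path(tmp_dir) / "fullguide_model_id.txt"
    save_model_id("example-model-id", id_file)  # placeholder, not a real ID
    restored_id = load_model_id(id_file)

print(restored_id)
```

In the notebook, the restored value would then be passed to `connection.get_model(model_id)` as in the commented cell.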

## Enable Explainability

Next, we will enable Explainability. We will point to an environment requirements file and an entrypoint file. For more details, please see https://docs.arthur.ai/user-guide/explainability.html

In [53]:
import os
path = Path(os.getcwd())
arthur_model.enable_explainability(
    df=X_train,
    project_directory=path.parents[0],
    requirements_file="requirements.txt",
    user_predict_function_import_path="xai_entrypoint",
    explanation_algo="lime",
    streaming_explainability_enabled=True)

Ignoring folder: /Users/karthik/Documents/Arthur/arthur-sandbox/examples/example_projects/credit_risk_batch
Ignoring folder: /Users/karthik/Documents/Arthur/arthur-sandbox/examples/example_projects/credit_risk_batch/__pycache__
Ignoring folder: /Users/karthik/Documents/Arthur/arthur-sandbox/examples/example_projects/credit_risk_batch/fixtures
Ignoring folder: /Users/karthik/Documents/Arthur/arthur-sandbox/examples/example_projects/credit_risk_batch/fixtures/serialized_models
Ignoring folder: /Users/karthik/Documents/Arthur/arthur-sandbox/examples/example_projects/credit_risk_batch/fixtures/datasets
Ignoring folder: /Users/karthik/Documents/Arthur/arthur-sandbox/examples/example_projects/credit_risk_batch/notebooks
Ignoring folder: /Users/karthik/Documents/Arthur/arthur-sandbox/examples/example_projects/credit_risk_batch/notebooks/.ipynb_checkpoints


2022-07-29 10:51:17,758 - arthurai.explainability.arthur_explainer - INFO - Testing model predict() function on provided data
2022-07-29 10:51:18,904 - arthurai.explainability.arthur_explainer - INFO - Model predict() function test successful


'ok'
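The log lines above mention testing the model's `predict()` function, which suggests that the module named by `user_predict_function_import_path` exposes a function called `predict`. The sketch below shows what such an entrypoint module might look like. It assumes `predict` receives a batch of feature rows and returns one probability per class for each row; check the Explainability documentation for the exact contract. A stub classifier stands in for the real model so the sketch runs on its own.

```python
# Sketch of a possible xai_entrypoint.py; the real file is not shown in this guide.


class StubClassifier:
    """Stand-in for the deserialized classifier, used only in this sketch."""

    def predict_proba(self, rows):
        # Return a fixed [P(class 0), P(class 1)] pair for every input row.
        return [[0.8, 0.2] for _ in rows]


# In a real entrypoint this would be the pre-trained model loaded from disk.
model = StubClassifier()


def predict(x):
    """Return class probabilities for a batch of feature rows."""
    return model.predict_proba(x)


print(predict([[360000, 1, 2], [30000, 2, 2]]))
```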

## Sending Inferences

The predictions/scores you send must use the column names in the registered schema. Recall from the *build()* step above that the classifier's positive-class probability is registered as "prediction_1" and the ground truth as "gt".

Aside from these model-specific columns, there are two standard inputs which are needed to identify inferences.
* First, each inference needs a unique identifier so that it can later be joined with ground truth. Include a column named **partner_inference_id** and ensure these IDs are unique across batches. For example, if you run predictions across your customer base on a daily-batch cadence, then a unique identifier could be composed of your customer_id plus the date.
* Second, each inference needs an **inference_timestamp**; these don't have to be unique. If you omit it, the SDK fills in the current time, as the log output below shows.
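The customer-ID-plus-date scheme described above can be sketched as follows. The customer IDs, the run date, and the `make_inference_id` helper are all made up for illustration. The timestamp is timezone-aware UTC, which is a choice made for this sketch, not a documented requirement.

```python
from datetime import date, datetime, timezone


def make_inference_id(customer_id, run_date):
    """Combine a customer ID with the batch date, e.g. 'c001-2022-07-29'."""
    return f"{customer_id}-{run_date.isoformat()}"


customer_ids = ["c001", "c002", "c003"]  # made-up customers
run_date = date(2022, 7, 29)

inference_ids = [make_inference_id(c, run_date) for c in customer_ids]
# One timestamp can be shared by every row in the batch.
inference_timestamp = datetime.now(timezone.utc)

print(inference_ids)
```

Because the date is part of the ID, scoring the same customer on another day yields a different identifier, which keeps IDs unique across daily batches.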

We'll use our classifier to score a batch of inputs and then assemble those inputs and predictions into a dataframe with the matching column names.

In [54]:
from uuid import uuid4

num_batches = 20
batch_ids = []

for i in range(num_batches):
    batch_size = np.random.randint(1000, 5000)
    batch_id = f"batch_{str(uuid4()).split('-')[1]}"
    batch_ids.append(batch_id)

    # sample a batch of rows from the test set, create a unique id for each row
    batch_df = X_test.sample(batch_size)
    inference_ids = [f"{batch_id}-inf_{idx}" for idx in batch_df.index]
    
    # calculate predictions on those rows, fetch ground truth for those rows
    batch_predictions = sk_model.predict_proba(batch_df)
    batch_ground_truths = Y_test[batch_df.index]
    
    # need to include model prediction columns, and partner_inference_id
    batch_df["prediction_1"] = batch_predictions[:, 1]
    
    # assemble the inference-wise groundtruth and upload
    ground_truth_df = pd.DataFrame({'gt': batch_ground_truths})

    arthur_model.send_inferences(batch_df, batch_id=batch_id, partner_inference_ids=inference_ids)
    arthur_model.update_inference_ground_truths(ground_truth_df, partner_inference_ids=inference_ids)

2022-07-29 10:51:39,924 - arthurai.core.models - INFO - 2832 rows were missing inference_timestamp fields, so the current time was populated
2022-07-29 10:51:40,301 - arthurai.core.models - INFO - 2832 rows were missing ground_truth_timestamp fields, so the current time was populated
2022-07-29 10:51:40,751 - arthurai.core.models - INFO - 3100 rows were missing inference_timestamp fields, so the current time was populated
2022-07-29 10:51:41,150 - arthurai.core.models - INFO - 3100 rows were missing ground_truth_timestamp fields, so the current time was populated
2022-07-29 10:51:41,601 - arthurai.core.models - INFO - 3142 rows were missing inference_timestamp fields, so the current time was populated
2022-07-29 10:51:41,960 - arthurai.core.models - INFO - 3142 rows were missing ground_truth_timestamp fields, so the current time was populated
2022-07-29 10:51:42,414 - arthurai.core.models - INFO - 3235 rows were missing inference_timestamp fields, so the current time was populated
2022

Realistically, there will be some delay before you have ground truth for your model's predictions. Whether that ground truth becomes available after one minute or one year, the *update_inference_ground_truths()* method used above can be called at any later time. The ground truth labels will be joined with their corresponding predictions to yield accuracy measures.
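The join described above can be pictured with plain dictionaries keyed by inference ID. In the sketch below, predictions arrive first and labels arrive later for only some of them; accuracy is computed over the matched pairs. The IDs, scores, and the 0.5 decision threshold are made up for illustration and say nothing about how the platform computes its metrics.

```python
def accuracy_after_join(predictions, ground_truths, threshold=0.5):
    """Join on inference ID and return (accuracy, number of matched rows)."""
    matched = [
        (predictions[inf_id] >= threshold, bool(ground_truths[inf_id]))
        for inf_id in predictions
        if inf_id in ground_truths
    ]
    if not matched:
        return None, 0
    correct = sum(predicted == actual for predicted, actual in matched)
    return correct / len(matched), len(matched)


# Made-up positive-class scores, sent at inference time.
predictions = {"b1-inf_1": 0.91, "b1-inf_2": 0.12, "b1-inf_3": 0.40, "b1-inf_4": 0.75}
# Labels that arrived later; "b1-inf_4" has no label yet.
ground_truths = {"b1-inf_1": 1, "b1-inf_2": 0, "b1-inf_3": 1}

accuracy, matched_count = accuracy_after_join(predictions, ground_truths)
print(matched_count, round(accuracy, 3))
```

Inferences without a label are left out of the calculation until their ground truth arrives.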