In [11]:
import time
import sys
from datetime import datetime, timedelta
import pytz
import joblib

import numpy as np
import pandas as pd
from pathlib import Path

from arthurai import ArthurAI
from arthurai.common.constants import InputType, OutputType, Stage

sys.path.append("..")
from model_utils import load_datasets

In this guide, we'll use the credit dataset and a pre-trained model to onboard a new model to the Arthur platform. We'll register the model using a sample of the training data, mark a few attributes to be monitored for bias, and enable explainability. This is an example of a streaming model.

#### Set up connection
Supply your API key to authenticate with the platform. In the cell below, the credentials are passed to the client through environment variables.

In [12]:
# credentials are being passed to the client via environment variables
connection = ArthurAI()

MissingParameterError: Please set api key and url either via environment variables (`ARTHUR_API_KEY` and `ARTHUR_ENDPOINT_URL`) or by passing parameters `access_key` and `url`.
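
The error above means the client found neither credential. A minimal sketch of a pre-flight check follows, using only the two environment variable names quoted in the error message. The helper `missing_arthur_env` is a hypothetical name introduced here and is not part of the Arthur SDK.

```python
import os

# Variable names taken from the MissingParameterError message above.
REQUIRED_VARS = ("ARTHUR_API_KEY", "ARTHUR_ENDPOINT_URL")


def missing_arthur_env(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_arthur_env()
if missing:
    print("Set these before calling ArthurAI():", ", ".join(missing))
```

Running a check like this first gives a readable message before the client raises.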

## Create Model

### Loading the Data

In [5]:
(X_train, Y_train), (X_test, Y_test) = load_datasets("../fixtures/datasets/credit_card_default.csv")

In [6]:
Y_train.head()

584      0
17832    0
11647    0
1234     0
9561     0
Name: default payment next month, dtype: int64

In [7]:
X_train.head()

Unnamed: 0,LIMIT_BAL,SEX,EDUCATION,MARRIAGE,AGE,PAY_0,PAY_2,PAY_3,PAY_4,PAY_5,...,BILL_AMT3,BILL_AMT4,BILL_AMT5,BILL_AMT6,PAY_AMT1,PAY_AMT2,PAY_AMT3,PAY_AMT4,PAY_AMT5,PAY_AMT6
584,50000,1,2,2,42,0,0,0,0,0,...,49111,48943,45775,0,2200,1600,1700,1700,0,0
17832,30000,1,2,2,24,2,4,3,2,0,...,29991,29192,28210,28543,0,0,0,1100,1300,2000
11647,100000,1,2,2,26,0,0,0,0,0,...,97369,98232,97752,0,10000,43000,5000,3000,0,0
1234,280000,2,1,2,30,0,0,0,0,0,...,263734,268216,262895,264508,11000,10000,10004,10020,10100,10000
9561,100000,2,3,2,27,0,0,0,0,0,...,86981,81522,81171,79766,3151,3065,2892,2936,3000,3000


In [8]:
# load our pre-trained classifier so we can generate predictions
sk_model = joblib.load("../fixtures/serialized_models/credit_model.pkl")

# get model predictions
preds = sk_model.predict_proba(X_train)
X_train["prediction_1"] = preds[:, 1]

# get ground truth labels
X_train["gt"] = Y_train

### Registering the Model

We'll instantiate a model object with a small amount of metadata about the model input and output types. Then, we'll use a sample of the training data to register the full data schema for this Tabular model.

In [None]:
arthur_model = connection.model(partner_model_id=f"CreditRiskModel_FG_{datetime.now().strftime('%Y%m%d%H%M%S')}",
                                display_name="Credit Risk",
                                input_type=InputType.Tabular,
                                output_type=OutputType.Multiclass)

We need to register the schema for the outputs of the model: what will a typical prediction look like and what will a typical ground truth look like? What names, shapes, and datatypes should Arthur expect for these objects?

We'll do this all in one step with the *.build()* method. All we need to supply is:
  * the training dataframe
  * the name of the ground truth column
  * the mapping that relates each prediction column to a ground truth value
  
Our classifier makes predictions about class *0* and class *1* and returns a probability score for each class. Earlier, we stored the score for the positive class in a column named *prediction_1* and the ground truth, which is either 0 or 1, in a single column named *gt*. We link the two in a dictionary that maps *prediction_1* to the ground truth value 1, and pass that to the model.

In [None]:
# Map our prediction attribute to the ground truth value
prediction_to_ground_truth_map = {
    "prediction_1": 1
}

arthur_model.build(X_train, 
                   ground_truth_column="gt",
                   pred_to_ground_truth_map=prediction_to_ground_truth_map)
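
One way to read the mapping: the score in `prediction_1` is the model's estimate that `gt` equals 1. The sketch below uses plain Python and made-up rows (not the credit dataset), and the 0.5 threshold is an assumption chosen for illustration. It does not reproduce what `build()` does internally.

```python
# Hypothetical rows: a positive-class score and the observed label.
rows = [
    {"prediction_1": 0.91, "gt": 1},
    {"prediction_1": 0.12, "gt": 0},
    {"prediction_1": 0.67, "gt": 0},
    {"prediction_1": 0.08, "gt": 1},
]

prediction_to_ground_truth_map = {"prediction_1": 1}


def accuracy(rows, mapping, threshold=0.5):
    """Fraction of rows where the thresholded score agrees with the label."""
    pred_col, positive_value = next(iter(mapping.items()))
    hits = 0
    for row in rows:
        predicted_positive = row[pred_col] >= threshold
        actually_positive = row["gt"] == positive_value
        hits += predicted_positive == actually_positive
    return hits / len(rows)


print(accuracy(rows, prediction_to_ground_truth_map))
```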

## Bias Monitoring

Next, we will specify that a few attributes (SEX, AGE, and EDUCATION) should be monitored for bias. We can do this for either categorical or continuous attributes. For continuous attributes, we break up the continuous range into bins by providing cutoff values. For more details, please see the [Bias Guide](https://docs.arthur.ai/user-guide/bias.html).

In [None]:
arthur_model.get_attribute("SEX", stage=Stage.ModelPipelineInput).monitor_for_bias = True
arthur_model.get_attribute("EDUCATION", stage=Stage.ModelPipelineInput).monitor_for_bias = True
arthur_model.get_attribute("AGE", stage=Stage.ModelPipelineInput).monitor_for_bias = True
arthur_model.get_attribute("AGE", stage=Stage.ModelPipelineInput).set(bins=[None, 35, 55, None])
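
The `bins` list can be read as a sequence of cutoffs in which `None` appears to stand for an open end, so `[None, 35, 55, None]` describes three age groups. The sketch below shows one way such cutoffs partition values. Whether a value equal to a cutoff falls in the lower or the upper bin is an assumption here (the sketch puts it in the upper bin), so check the Bias Guide for the exact convention. `assign_bin` is a hypothetical helper, not an SDK function.

```python
from bisect import bisect_right


def assign_bin(value, cutoffs):
    """Return the 0-based bin index for value.

    Assumption: a value equal to a cutoff belongs to the bin above it.
    """
    inner = [c for c in cutoffs if c is not None]  # drop the open ends
    return bisect_right(inner, value)


age_cutoffs = [None, 35, 55, None]
for age in (24, 35, 42, 55, 60):
    print(age, "-> bin", assign_bin(age, age_cutoffs))
```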

Finally, for ease of reading and later use, we can optionally supply human-readable labels for the values of the attributes. For example, the SEX column is encoded as integers 1 and 2, so we can provide a mapping to string names that makes the dashboard easier to understand.

In [None]:
arthur_model.set_attribute_labels(attribute_name="SEX",
                                  labels={1: "Male", 2: "Female"})
arthur_model.set_attribute_labels(attribute_name="EDUCATION",
                                  labels={1: "Graduate School", 2: "University",
                                          3: "High School", 4: "Less Than High School",
                                          5: "Unknown", 6: "Unreported", 0: "Other"})
arthur_model.set_attribute_labels(attribute_name="MARRIAGE",
                                  labels={1: "Married", 2: "Single",
                                          3: "Other", 0: "Unknown"})
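
To preview what these labels look like on raw values, here is a small sketch in plain Python. The fallback for unmapped codes (showing the raw value as text) is an assumption made for this example and says nothing about how the dashboard treats them.

```python
sex_labels = {1: "Male", 2: "Female"}
marriage_labels = {1: "Married", 2: "Single", 3: "Other", 0: "Unknown"}


def to_label(value, labels):
    """Return the readable label, or the raw value as text if unmapped."""
    return labels.get(value, str(value))


raw_sex = [1, 2, 2, 1]
print([to_label(v, sex_labels) for v in raw_sex])
```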

Before saving, be sure to review your model to make sure everything is correct.

When saving your model, the data is saved as the reference set, which is used as the baseline data for tracking data drift. Often, this is the training data for the associated model. Our reference dataset should include:
  * inputs 
  * ground truth
  * model predictions
  
This way, Arthur can monitor for drift and stability in all of these aspects. 
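
A quick sanity check before saving is to confirm that the reference data really carries all three kinds of columns. The sketch below works on a plain list of column names, such as `list(X_train.columns)`. The helper name and the shortened column list are hypothetical.

```python
def missing_reference_columns(columns, input_columns, prediction_columns,
                              ground_truth_column):
    """Return the required column names that are absent from columns."""
    required = list(input_columns) + list(prediction_columns) + [ground_truth_column]
    present = set(columns)
    return [name for name in required if name not in present]


# Shortened, illustrative column list; the real dataframe has more inputs.
columns = ["LIMIT_BAL", "SEX", "AGE", "prediction_1"]
print(missing_reference_columns(columns,
                                input_columns=["LIMIT_BAL", "SEX", "AGE"],
                                prediction_columns=["prediction_1"],
                                ground_truth_column="gt"))
```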

In [None]:
arthur_model.review()

The same summary also appears in the output of `arthur_model.build()`.

If you have already saved a model, you can fetch it from the Arthur API by its model ID. The ID is returned by the `arthur_model.save()` call below and also appears in the URL of your model page in the Arthur Dashboard.

In [None]:
model_id = arthur_model.save()
with open("fullguide_model_id.txt", "w") as f:
    f.write(model_id)

In [None]:
# You can fetch a model by ID. For example, pull the last-created model:
# with open("fullguide_model_id.txt", "r") as f:
#     model_id = f.read()
# arthur_model = connection.get_model(model_id)
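
The save-then-reload pattern for the ID file can be sketched on its own, without the Arthur client. The ID string below is a placeholder, the helper names are hypothetical, and the `strip()` call is a defensive assumption in case the file picks up a trailing newline.

```python
import tempfile
from pathlib import Path


def save_model_id(model_id, path):
    """Write the model ID to a text file."""
    Path(path).write_text(model_id)


def load_model_id(path):
    """Read the model ID back, dropping surrounding whitespace."""
    return Path(path).read_text().strip()


with tempfile.TemporaryDirectory() as tmp:
    id_file = Path(tmp) / "fullguide_model_id.txt"
    save_model_id("example-model-id", id_file)  # placeholder, not a real ID
    restored = load_model_id(id_file)

print(restored)
```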

## Enable Explainability

Next, we will enable Explainability. We will point to an environment requirements file and an entrypoint file. For more details, please see https://docs.arthur.ai/user-guide/explainability.html

In [9]:
path = Path.cwd()  # the notebook directory; its parent is the project directory
arthur_model.enable_explainability(
    df=X_train,
    project_directory=path.parent,
    requirements_file="requirements.txt",
    user_predict_function_import_path="xai_entrypoint",
    explanation_algo="lime",
    streaming_explainability_enabled=True)

Ignoring folder: /Users/karthik/Documents/Arthur/arthur-sandbox/examples/example_projects/credit_risk
Ignoring folder: /Users/karthik/Documents/Arthur/arthur-sandbox/examples/example_projects/credit_risk/__pycache__
Ignoring folder: /Users/karthik/Documents/Arthur/arthur-sandbox/examples/example_projects/credit_risk/fixtures
Ignoring folder: /Users/karthik/Documents/Arthur/arthur-sandbox/examples/example_projects/credit_risk/fixtures/serialized_models
Ignoring folder: /Users/karthik/Documents/Arthur/arthur-sandbox/examples/example_projects/credit_risk/fixtures/datasets
Ignoring folder: /Users/karthik/Documents/Arthur/arthur-sandbox/examples/example_projects/credit_risk/notebooks
Ignoring folder: /Users/karthik/Documents/Arthur/arthur-sandbox/examples/example_projects/credit_risk/notebooks/.ipynb_checkpoints


UserValueError: Failed to validate requirements file: 
Explainability requirements file lists joblib==1.1.0 but found version 1.0.1 currently installed. Packages listed in the requirements file must match the currently installed version. To resolve, update the requirements file to version joblib==1.0.1
Explainability requirements file lists numpy==1.20.3 but found version 1.20.1 currently installed. Packages listed in the requirements file must match the currently installed version. To resolve, update the requirements file to version numpy==1.20.1
Explainability requirements file lists pandas==1.1.4 but found version 1.2.1 currently installed. Packages listed in the requirements file must match the currently installed version. To resolve, update the requirements file to version pandas==1.2.1
Explainability requirements file lists pytz==2021.3 but found version 2021.1 currently installed. Packages listed in the requirements file must match the currently installed version. To resolve, update the requirements file to version pytz==2021.1
Explainability requirements file lists scikit-learn==1.0.1 but found version 0.24.2 currently installed. Packages listed in the requirements file must match the currently installed version. To resolve, update the requirements file to version scikit-learn==0.24.2
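
The error above lists every pinned package whose version differs from the installed one. The same comparison can be run locally before calling `enable_explainability`. The sketch below uses the standard library's `importlib.metadata` and handles only simple `name==version` lines. `find_version_mismatches` is a hypothetical helper, and the demo uses a fake lookup with two of the versions from the message so it does not depend on what is installed.

```python
from importlib import metadata


def find_version_mismatches(requirement_lines, get_version=metadata.version):
    """Return (name, pinned, installed) for each pin that does not match.

    installed is None when the package is not found.
    """
    mismatches = []
    for line in requirement_lines:
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue  # skip blanks, comments, and unpinned entries
        name, pinned = (part.strip() for part in line.split("==", 1))
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != pinned:
            mismatches.append((name, pinned, installed))
    return mismatches


# Fake "installed" versions, copied from the error message above.
fake_installed = {"joblib": "1.0.1", "numpy": "1.20.1"}


def fake_lookup(name):
    if name not in fake_installed:
        raise metadata.PackageNotFoundError(name)
    return fake_installed[name]


print(find_version_mismatches(["joblib==1.1.0", "numpy==1.20.1"], fake_lookup))
```

To check a real file, pass `Path("requirements.txt").read_text().splitlines()` and leave `get_version` at its default.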

## Sending Inferences

However you currently invoke your model's prediction (e.g., through a .predict() or .predict_proba() call), you can wrap this call so that the inputs and outputs are logged with Arthur.


In [None]:
from arthurai.core.decorators import log_prediction

In [None]:
@log_prediction(arthur_model)
def model_predict(input_vec):
    return sk_model.predict_proba(input_vec)[0]
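
For intuition about what a logging decorator does, here is a toy stand-in written in plain Python. The call pattern (an extra `inference_timestamp` keyword and a `(prediction, inference_id)` return value) mirrors how `model_predict` is used later in this guide. Everything inside the decorator is an assumption for illustration and is not Arthur's implementation: `log_calls` is a hypothetical name and the "log" is just a list.

```python
import functools
import uuid


def log_calls(log):
    """Hypothetical stand-in for a prediction-logging decorator."""
    def decorator(predict_fn):
        @functools.wraps(predict_fn)
        def wrapper(inputs, inference_timestamp=None):
            prediction = predict_fn(inputs)
            inference_id = uuid.uuid4().hex  # made-up ID scheme
            log.append({"id": inference_id,
                        "inputs": inputs,
                        "prediction": prediction,
                        "timestamp": inference_timestamp})
            return prediction, inference_id
        return wrapper
    return decorator


records = []


@log_calls(records)
def toy_predict(inputs):
    # Placeholder model: always returns the same class probabilities.
    return [0.3, 0.7]


prediction, inference_id = toy_predict({"AGE": 42})
print(prediction, len(records))
```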

We'll create some timestamps to mimic sending the data over a period of time. If these are left out, the current time is used.

In [None]:
# 10 timestamps over the last month
timestamps = pd.date_range(start=datetime.now(pytz.utc) - timedelta(days=30),
                           end=datetime.now(pytz.utc),
                           periods=10)
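
The spacing logic can also be written with the standard library alone, as in the sketch below. `evenly_spaced` is a hypothetical helper. One small difference from the cell above: the sketch reads the clock once and reuses it, so the window is exactly 30 days.

```python
from datetime import datetime, timedelta, timezone


def evenly_spaced(start, end, periods):
    """Return `periods` datetimes from start to end inclusive, equally spaced."""
    step = (end - start) / (periods - 1)
    return [start + i * step for i in range(periods)]


end = datetime.now(timezone.utc)
stamps = evenly_spaced(end - timedelta(days=30), end, periods=10)
print(stamps[0].strftime("%m/%d"), "to", stamps[-1].strftime("%m/%d"))
```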

Now, as we iterate through a dataset and invoke our model for predictions, the model inputs and outputs are logged.

In [None]:
inference_ids = {}
for timestamp in timestamps:
    for i in range(np.random.randint(7, 10)):
        datarecord = X_test.sample(1)  # fetch a random row
        prediction, inference_id = model_predict(datarecord, inference_timestamp=timestamp)  # predict and log
        inference_ids[inference_id] = datarecord.index[0]  # record the inference ID with the Pandas index
    print(f"Logged {i+1} inferences with Arthur from {timestamp.strftime('%m/%d')}")


If your model scoring system is set up in a batch processor where you run a daily, weekly, or monthly job, then we recommend setting up a batch model with Arthur and using the corresponding *send_batch_inferences()* method. An example batch model can be found [here](../../credit_risk_batch/notebooks/Quickstart.ipynb).

## Updating with Ground Truth

In the future, when your ground truth labels come in, you can [update each inference](https://docs.arthur.ai/sdk/sdk_v3/arthurai.core.html#arthurai.core.models.ArthurModel.update_inference_ground_truths) by ID with its corresponding label.

In [10]:
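# dict views are converted to lists so the DataFrame is built from plain sequences
gt_df = pd.DataFrame({'partner_inference_id': list(inference_ids.keys()),
                      'gt': Y_test.loc[list(inference_ids.values())].to_numpy()})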
gt_df.head(5)

NameError: name 'inference_ids' is not defined
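
The `NameError` above most likely appears because the inference loop that fills `inference_ids` was not run in that session. The join itself is simple and can be sketched in plain Python: each logged inference ID is paired with the label found at its stored row index. The IDs, indices, and labels below are made up.

```python
# Made-up example data: inference ID -> row index, and row index -> label.
inference_ids = {"id-a": 101, "id-b": 202, "id-c": 303}
labels_by_index = {101: 0, 202: 1, 303: 0}

gt_records = [
    {"partner_inference_id": inference_id, "gt": labels_by_index[row_index]}
    for inference_id, row_index in inference_ids.items()
]

for record in gt_records:
    print(record)
```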

In [None]:
_ = arthur_model.update_inference_ground_truths(gt_df)