In [3]:
from arthurai import ArthurAI
from arthurai.client.apiv3 import InputType, OutputType, Stage
import numpy as np
import joblib
import datetime
import time

In [4]:
import sys
sys.path.append("..")
from model_utils import transformations, load_datasets

In this guide, we'll use the credit dataset (and a pre-trained model) to onboard a new model to the Arthur platform. We'll walk through registering the model using a sample of the training data. This is an example of a streaming model.

#### Set up connection
Supply your API key below to authenticate with the platform.

In [5]:
URL = "app.arthur.ai"
ACCESS_KEY = "..."

connection = ArthurAI(url=URL, access_key=ACCESS_KEY, client_version=3)

## Create Model

We'll instantiate a model object with a small amount of metadata about the model input and output types. Then, we'll use a sample of the training data to register the full data schema for this Tabular model.

In [6]:
arthur_model = connection.model(partner_model_id="CreditRiskModel_v0.0.1",
                               input_type=InputType.Tabular,
                               output_type=OutputType.Multiclass)

In [7]:
(X_train, Y_train), (X_test, Y_test) = load_datasets("../fixtures/datasets/credit_card_default.csv")

In [6]:
Y_train.head()

26917    0
25425    0
16118    0
9508     0
12653    0
Name: default payment next month, dtype: int64

In [6]:
X_train.head()

Unnamed: 0,LIMIT_BAL,SEX,EDUCATION,MARRIAGE,AGE,PAY_0,PAY_2,PAY_3,PAY_4,PAY_5,...,BILL_AMT3,BILL_AMT4,BILL_AMT5,BILL_AMT6,PAY_AMT1,PAY_AMT2,PAY_AMT3,PAY_AMT4,PAY_AMT5,PAY_AMT6
16763,20000,1,2,1,38,0,0,0,0,0,...,17012,17992,18350,18738,1565,1299,1279,637,663,1469
17133,20000,2,3,1,27,3,2,0,0,0,...,16892,17396,14017,0,0,1696,1200,22,0,0
22795,260000,2,2,1,33,0,0,0,0,0,...,135593,120909,102524,40157,4002,6067,10000,3000,40157,1466
21984,30000,2,3,2,28,-1,-1,-1,-1,-1,...,557,1299,600,450,25460,1000,1306,600,0,11961
11879,70000,2,2,2,24,0,0,-2,-1,-1,...,9660,6208,702,4320,1000,9660,6208,702,4320,1650


We need to register the data schema for the inputs to the model. Since your model might have hundreds or thousands of input features, you can just pass us a pandas DataFrame of your training data, and we'll handle the rest.

In [8]:
arthur_model.from_dataframe(X_train, Stage.ModelPipelineInput)
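
To get a feel for what a schema derived from a DataFrame contains, the sketch below summarizes each column locally as either categorical (with its categories) or continuous (with its range). It is only an illustration: Arthur's actual inference rules are not described in this guide, and the helper name `summarize_schema` and the `max_categories` cutoff are assumptions made for this example.

```python
import pandas as pd

# Hypothetical cutoff: columns with at most this many distinct values are treated as categorical.
MAX_CATEGORIES = 20


def summarize_schema(df: pd.DataFrame, max_categories: int = MAX_CATEGORIES) -> pd.DataFrame:
    """Return one row per column with its dtype, categorical flag, categories, and range."""
    rows = []
    for name in df.columns:
        col = df[name]
        categorical = col.nunique() <= max_categories
        rows.append({
            "name": name,
            "dtype": str(col.dtype),
            "categorical": categorical,
            # Categories are listed only for categorical columns.
            "categories": sorted(col.dropna().unique().tolist()) if categorical else [],
            # A min/max range is reported only for continuous columns.
            "range": [None, None] if categorical else [col.min(), col.max()],
        })
    return pd.DataFrame(rows)


# A few values taken from the X_train.head() output above.
sample = pd.DataFrame({
    "LIMIT_BAL": [20000, 20000, 260000, 30000, 70000],
    "SEX": [1, 2, 2, 2, 2],
    "AGE": [38, 27, 33, 28, 24],
})
summary = summarize_schema(sample, max_categories=2)
print(summary)
```

Comparing a local summary like this with the output of `arthur_model.review()` later in the guide is one way to spot columns that were classified differently from what you expected.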

We need to register the schema for the outputs of the model: what will a typical prediction look like and what will a typical ground truth look like? What names, shapes, and datatypes should Arthur expect for these objects?

Since this is a binary classification model, we'll do this all in one step with the *.add_binary_classifier_output_attributes()* method. All we need to supply is a mapping that establishes:
  * names for the model's predictions
  * names for the model's ground truth
  * the mapping that relates these two
  
Our classifier will be making predictions about class *0* and class *1* and will return a probability score for each class. Therefore, we'll set up a name *prediction_0* and a name *prediction_1*. Additionally, our ground truth will be either a 0 or 1, but we'll always represent ground truth in one-hot-encoded form. Therefore, we create two fields called *gt_0* and *gt_1*. We link these all up in a dictionary and pass that to the model.

In [9]:
prediction_to_ground_truth_map = {
    "prediction_0": "gt_0",
    "prediction_1": "gt_1"
}

arthur_model.add_binary_classifier_output_attributes("prediction_1", prediction_to_ground_truth_map)

{'prediction_0': <arthurai.client.apiv3.attributes.ArthurAttribute at 0x10916da50>,
 'gt_0': <arthurai.client.apiv3.attributes.ArthurAttribute at 0x10916d150>,
 'prediction_1': <arthurai.client.apiv3.attributes.ArthurAttribute at 0x10916d610>,
 'gt_1': <arthurai.client.apiv3.attributes.ArthurAttribute at 0x10916dc50>}

Note that the first argument to *.add_binary_classifier_output_attributes()* is the name of the "positive predicted class", for purposes of calculating accuracy metrics. 
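
To make the one-hot representation of ground truth concrete, here is a small sketch using toy labels in place of `Y_train`. The label values are made up for illustration; only the column names *gt_0* and *gt_1* come from the mapping above.

```python
import pandas as pd

# Toy binary labels standing in for Y_train.
labels = pd.Series([0, 1, 0, 0, 1], name="default payment next month")

# One column per class: gt_1 is the label itself, gt_0 is its complement.
ground_truth = pd.DataFrame({"gt_0": 1 - labels, "gt_1": labels})

# In a one-hot encoding, exactly one column is set in each row.
assert (ground_truth.sum(axis=1) == 1).all()
print(ground_truth)
```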

Before saving, you can review a model to make sure everything is correct.

In [10]:
arthur_model.review()

Unnamed: 0,name,stage,value_type,categorical,is_unique,categories,range,monitor_for_bias
0,gt_0,GROUND_TRUTH,INTEGER,True,False,"[{value: 0}, {value: 1}]","[None, None]",False
1,gt_1,GROUND_TRUTH,INTEGER,True,False,"[{value: 0}, {value: 1}]","[None, None]",False
2,LIMIT_BAL,PIPELINE_INPUT,INTEGER,False,False,[],"[10000, 1000000]",False
3,SEX,PIPELINE_INPUT,INTEGER,True,False,"[{value: 1}, {value: 2}]","[None, None]",False
4,EDUCATION,PIPELINE_INPUT,INTEGER,True,False,"[{value: 0}, {value: 1}, {value: 2}, {value: 3...","[None, None]",False
5,MARRIAGE,PIPELINE_INPUT,INTEGER,True,False,"[{value: 0}, {value: 1}, {value: 2}, {value: 3}]","[None, None]",False
6,AGE,PIPELINE_INPUT,INTEGER,False,False,[],"[21, 75]",False
7,PAY_0,PIPELINE_INPUT,INTEGER,True,False,"[{value: 0}, {value: 1}, {value: 2}, {value: 3...","[None, None]",False
8,PAY_2,PIPELINE_INPUT,INTEGER,True,False,"[{value: 0}, {value: 1}, {value: 2}, {value: 3...","[None, None]",False
9,PAY_3,PIPELINE_INPUT,INTEGER,True,False,"[{value: 0}, {value: 1}, {value: 2}, {value: 3...","[None, None]",False


In [11]:
arthur_model.save()

'07f9805c-ee2e-4202-9550-5dee8abc53da'

### Setting baseline data
Next, we'll use the training data to set a baseline reference for calculating data drift.

For tracking data drift, you can upload a dataset to serve as the baseline or reference set. Often, this is a sample of your training data for the associated model. Our reference dataset should ideally include examples of
  * inputs 
  * ground truth
  * model predictions
  
for a sample of the training set. This way, Arthur can monitor for drift and stability in all of these aspects. 

In [12]:
# load our pre-trained classifier so we can generate predictions
sk_model = joblib.load("../fixtures/serialized_models/credit_model.pkl")

In [13]:
# get all input columns
reference_set = X_train.copy()

# get ground truth labels
reference_set["gt_1"] = Y_train
reference_set["gt_0"] = 1-Y_train

# get model predictions
preds = sk_model.predict_proba(X_train)
reference_set["prediction_1"] = preds[:, 1]
reference_set["prediction_0"] = preds[:, 0]
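
# Optional sanity check before uploading: a minimal sketch, not part of the Arthur SDK.
# The helper name `check_reference_set` is hypothetical. It verifies that the four output
# columns exist, that ground truth is one-hot, and that the class probabilities sum to 1.
# It is shown on a toy frame here; the same call would apply to `reference_set`.

```python
import numpy as np
import pandas as pd


def check_reference_set(df: pd.DataFrame) -> None:
    """Raise ValueError if the ground truth or prediction columns are inconsistent."""
    required = ["gt_0", "gt_1", "prediction_0", "prediction_1"]
    missing = [c for c in required if c not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    # Ground truth is one-hot: exactly one of gt_0 / gt_1 equals 1 in each row.
    if not ((df["gt_0"] + df["gt_1"]) == 1).all():
        raise ValueError("gt_0 and gt_1 must sum to 1 in every row")
    # The two class probabilities should sum to 1, up to floating point error.
    if not np.allclose(df["prediction_0"] + df["prediction_1"], 1.0):
        raise ValueError("prediction_0 and prediction_1 must sum to 1 in every row")


# Toy rows for illustration only.
toy_reference = pd.DataFrame({
    "gt_1": [0, 1],
    "gt_0": [1, 0],
    "prediction_1": [0.05, 0.80],
    "prediction_0": [0.95, 0.20],
})
check_reference_set(toy_reference)  # passes silently
```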


In [14]:
arthur_model.set_reference_data(data=reference_set)

{'counts': {'success': 21000, 'failure': 0, 'total': 21000}, 'failures': [[]]}

## Sending Inferences

Load test data and trained model. Let's familiarize ourselves with the data and the model.


In [9]:
X_test.shape

In [15]:
sk_model

RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
                       max_depth=15, max_features='auto', max_leaf_nodes=None,
                       min_impurity_decrease=0.0, min_impurity_split=None,
                       min_samples_leaf=1, min_samples_split=2,
                       min_weight_fraction_leaf=0.0, n_estimators=500,
                       n_jobs=None, oob_score=False, random_state=None,
                       verbose=0, warm_start=False)

In [10]:
sk_model.predict_proba(X_train.iloc[0:1, :])

array([[0.94844585, 0.05155415]])

To send inferences, we'll iterate through datapoints in a test set and send telemetry to Arthur.
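
Each inference carries a UTC timestamp and an external ID. The loop below builds these with `datetime.datetime.utcnow()` and a random integer; as an alternative sketch, the helpers here use a timezone-aware timestamp and a UUID, which makes ID collisions far less likely. This is a swapped-in technique, not what the notebook does, and the helper names `utc_timestamp` and `new_external_id` are hypothetical.

```python
import uuid
from datetime import datetime, timezone


def utc_timestamp() -> str:
    """Current UTC time as an ISO 8601 string with microseconds and a trailing 'Z'."""
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%fZ")


def new_external_id() -> str:
    """Random 32-character hexadecimal identifier."""
    return uuid.uuid4().hex


print(utc_timestamp())
print(new_external_id())
```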

In [21]:
for i in range(X_test.shape[0]):
    datarecord = X_test.iloc[i:i+1, :]
    predicted_probs = sk_model.predict_proba(datarecord)[0]
    ground_truth = int(Y_test.iloc[i])  # built-in int; the np.int alias was removed in NumPy 1.24
    ext_id = str(np.random.randint(1e9))


    arthur_model.send_inference(
        inference_timestamp=datetime.datetime.utcnow().isoformat() + 'Z',
        external_id=ext_id,
        model_pipeline_input=datarecord.to_dict(orient='records')[0],
        predicted_value={"prediction_1":predicted_probs[1], 
                         "prediction_0":predicted_probs[0]},
        ground_truth={"gt_1": ground_truth, 
                      "gt_0":1-ground_truth}
    )
    print("Sent inference with id {}".format(ext_id))
    time.sleep(0.001 * np.random.random())

Sent inference with id 110927066
Sent inference with id 255753356
Sent inference with id 723516092
Sent inference with id 6193237
Sent inference with id 573486094
Sent inference with id 760169700
Sent inference with id 682239036
Sent inference with id 3540189
Sent inference with id 627422534
Sent inference with id 357513312
Sent inference with id 115064955
Sent inference with id 402885534
Sent inference with id 156557413
Sent inference with id 506634584
Sent inference with id 487627058
Sent inference with id 49279783
Sent inference with id 715848128
Sent inference with id 153962765
Sent inference with id 261196540
Sent inference with id 669700456
Sent inference with id 334150000
Sent inference with id 305646724
Sent inference with id 202127406
Sent inference with id 897471150
Sent inference with id 27126008
Sent inference with id 961171463
Sent inference with id 257411693
Sent inference with id 826573105
Sent inference with id 26340974


KeyboardInterrupt: 

You can send inferences one at a time, but you can also send them in small bunches using the *send_inferences()* method. In that case, you would send a list of dictionaries, each of which is similar to the one above.
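
As a rough sketch of preparing such bunches, the helper below splits a list of inference dictionaries into fixed-size groups. It deliberately stops short of calling the Arthur client. The helper name `chunked`, the group size, and the record keys (assumed to mirror the keyword arguments of `send_inference()` used in the loop above) are assumptions for illustration.

```python
from typing import Any, Dict, Iterator, List


def chunked(records: List[Dict[str, Any]], size: int) -> Iterator[List[Dict[str, Any]]]:
    """Yield consecutive slices of `records`, each holding at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(records), size):
        yield records[start:start + size]


# Toy records; the keys are assumed to mirror the keyword arguments of send_inference().
records = [
    {
        "external_id": str(i),
        "predicted_value": {"prediction_1": 0.05, "prediction_0": 0.95},
        "ground_truth": {"gt_1": 0, "gt_0": 1},
    }
    for i in range(5)
]

batches = list(chunked(records, size=2))
print([len(batch) for batch in batches])  # [2, 2, 1]
```

Each element of `batches` is then a list of dictionaries that could be passed to the bulk method in a single call.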

If your model scoring system is set up as a batch processor that runs a daily, weekly, or monthly job, then we recommend setting up a batch model with Arthur and using the corresponding *send_batch_inferences()* method. An example batch model can be found [here](../../credit_risk_batch/notebooks/Quickstart.ipynb).