<img src="https://github.com/pmservice/ai-openscale-tutorials/raw/master/notebooks/images/banner.png" align="left" alt="banner">

# Working with Watson Machine Learning

The notebook will train, create and deploy a Credit Risk model, configure OpenScale to monitor that deployment, and inject seven days' worth of historical records and measurements for viewing in the OpenScale Insights dashboard.

### Contents

- [Setup](#setup)
- [Model building and deployment](#model)
- [OpenScale configuration](#openscale)
- [Quality monitor and feedback logging](#quality)
- [Fairness, drift monitoring and explanations](#fairness)
- [Custom monitors and metrics](#custom)
- [Payload analytics](#analytics)
- [Historical data](#historical)

# 1.0 Setup <a name="setup"></a>

## 1.1 Package installation

In [1]:
from IPython.utils import io
import warnings
warnings.filterwarnings('ignore')

In [2]:
!rm -rf /home/spark/shared/user-libs/python3.6*

!pip install --upgrade numpy==1.19.2 --user --no-cache | tail -n 1
!pip install --upgrade pandas==0.25.3 --user --no-cache | tail -n 1
!pip install --upgrade requests==2.23 --user --no-cache | tail -n 1
!pip install --upgrade SciPy==1.5.2 --user --no-cache | tail -n 1
!pip install --upgrade lime==0.2.0.1 --user --no-cache | tail -n 1
!pip install --upgrade pixiedust==1.1.18 --user --no-cache | tail -n 1
!pip install --upgrade pyspark==2.4.0 --user --no-cache | tail -n 1

!pip install --upgrade ibm-cloud-sdk-core==3.3.0 --user --no-cache | tail -n 1
!pip install --upgrade ibm-watson-machine-learning==1.0.22 --user --no-cache | tail -n 1
!pip install --upgrade ibm-watson-openscale==3.0.1 --user --no-cache | tail -n 1

!pip uninstall watson-machine-learning-client  -y | tail -n 1
!pip uninstall watson-machine-learning-client-V4 -y | tail -n 1

ERROR: tensorflow 2.1.0 has requirement scipy==1.4.1; python_version >= "3", but you'll have scipy 1.5.0 which is incompatible.
Successfully installed numpy-1.19.2
Successfully installed pandas-0.25.3
Successfully installed requests-2.23.0
ERROR: tensorflow 2.1.0 has requirement scipy==1.4.1; python_version >= "3", but you'll have scipy 1.5.2 which is incompatible.
Successfully installed SciPy-1.5.2
Successfully installed lime-0.2.0.1
Successfully installed astunparse-1.6.3 colour-0.1.5 geojson-2.5.0 mpld3-0.5.1 pixiedust-1.1.18
Successfully installed py4j-0.10.7 pyspark-2.4.0
Successfully installed ibm-cloud-sdk-core-3.3.0
Successfully installed ibm-watson-machine-learning-1.0.22


## 1.2 Configure credentials

To authenticate to the Watson Machine Learning service and Watson OpenScale on IBM Cloud, you need to provide a platform `api_key` and an endpoint URL, where the endpoint URL is based on the `location` of the WML instance. To get these values, you can use either the IBM Cloud CLI or the IBM Cloud UI.


#### IBM Cloud CLI

You can use the [IBM Cloud CLI](https://cloud.ibm.com/docs/cli/index.html) to create a platform API Key and retrieve your instance location.

- To generate the Cloud API Key, run the following commands:
```
ibmcloud login
ibmcloud iam api-key-create API_KEY_NAME
```
  - Copy the value of `api_key` from the output.


- To retrieve the location of your WML instance, run the following commands:
```
ibmcloud login --apikey API_KEY -a https://cloud.ibm.com
ibmcloud resource service-instance "WML_INSTANCE_NAME"
```
> Note: WML_INSTANCE_NAME is the name of your Watson Machine Learning instance and should be quoted in the command.

  - Copy the value of `Location` from the output.

#### IBM Cloud UI

To generate a Cloud API key:
- Go to the [**Users** section of the Cloud console](https://cloud.ibm.com/iam#/users). 
- From that page, click your name in the top right corner, scroll down to the **API Keys** section, and click **Create an IBM Cloud API key**. 
- Give your key a name and click **Create**, then copy the created key to use it below.

To retrieve the location of your WML instance:
- Go to the [**Resources List** section of the Cloud console](https://cloud.ibm.com/resources).
- From that page, expand the **Services** section and find your Watson Machine Learning Instance.
- Based on the location displayed on that page, select one of the following values for the `WML_LOCATION` variable:

|Displayed Location|Location|
|-|-|
|Dallas|us-south|
|London|eu-gb|
|Frankfurt|eu-de|
|Tokyo|jp-tok|


In [3]:
CLOUD_API_KEY = "xxxxxxxxxxxxxxxxxxxxxxxxxx"
WML_LOCATION = "us-south" # example: "us-south"

In [4]:
WML_CREDENTIALS = {
    "apikey": CLOUD_API_KEY,
    "url": "https://" + WML_LOCATION + ".ml.cloud.ibm.com"
}
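
The mapping from the displayed location to the location value, and from there to the endpoint URL, can be sketched as a small helper. The names `DISPLAYED_TO_LOCATION` and `wml_url_for` are illustrative and not part of any IBM library; the region codes and the URL pattern are taken from the table and the cell above.

```python
# Displayed location -> region code, copied from the table above.
DISPLAYED_TO_LOCATION = {
    "Dallas": "us-south",
    "London": "eu-gb",
    "Frankfurt": "eu-de",
    "Tokyo": "jp-tok",
}

def wml_url_for(location):
    """Build the WML endpoint URL from a region code or a displayed location name."""
    # Accept either "Dallas" or "us-south".
    region = DISPLAYED_TO_LOCATION.get(location, location)
    if region not in DISPLAYED_TO_LOCATION.values():
        raise ValueError("Unknown WML location: {0}".format(location))
    return "https://" + region + ".ml.cloud.ibm.com"

print(wml_url_for("Dallas"))
```

Rejecting unknown values early makes a typo in the location fail here instead of surfacing later as an authentication or connection error.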

In the next cells, you will need to paste credentials for Cloud Object Storage (COS). If you haven't worked with COS yet, please visit the [getting started with COS tutorial](https://cloud.ibm.com/docs/cloud-object-storage?topic=cloud-object-storage-getting-started). 
- You can find `COS_API_KEY_ID` and `COS_RESOURCE_CRN` in the **Service Credentials** menu of your COS instance (copy the `apikey` and `resource_instance_id` values, respectively). The COS service credentials must be created with the Role parameter set to Writer. 
- The `COS_ENDPOINT` value can be found in the Endpoint panel of your COS instance (note: this is not the same as the `Endpoint` in the service credentials). From this page, click on one of the regions and copy the public endpoint.
- The `BUCKET_NAME` can be anything you like as long as it's globally unique. You can use the suggested value, appended with your initials and the date.

Later in the notebook, the training data file will be uploaded to this bucket and used as the training data reference in the subscription.

In [5]:
COS_API_KEY_ID = "xxxxxxxxxxxxxxxxxxxxxxxxxx"
COS_RESOURCE_CRN = "crn:v1:bluemix:public:cloud-object-storage:global:a/xxxxxxxxxxxxxxxxxxxxxxxxxx::"
COS_ENDPOINT = "https://s3.us.cloud-object-storage.appdomain.cloud" #Example: "https://s3.us.cloud-object-storage.appdomain.cloud"

BUCKET_NAME = "credit-risk-training-data-jrt1020-v2" #Example: "credit-risk-training-data-uniqueID"
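
One way to follow the suggestion of appending initials and a date is sketched below. The helper name `make_bucket_name` is hypothetical, and the naming rules it checks (3 to 63 characters, lowercase letters, digits and hyphens, starting and ending with a letter or digit) are an assumption based on common S3-style constraints; a name that passes this check can still collide with an existing bucket.

```python
import re
from datetime import date

def make_bucket_name(base, initials, on=None):
    """Append initials and a date to a base name and check basic naming rules."""
    on = on or date.today()
    name = "{0}-{1}-{2}".format(base, initials, on.strftime("%Y%m%d")).lower()
    # Assumed rules: 3-63 characters, lowercase letters, digits and hyphens,
    # starting and ending with a letter or digit.
    if not re.fullmatch(r"[a-z0-9][a-z0-9-]{1,61}[a-z0-9]", name):
        raise ValueError("Invalid bucket name: {0}".format(name))
    return name

print(make_bucket_name("credit-risk-training-data", "JRT", date(2020, 10, 20)))
```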

This tutorial can use Databases for PostgreSQL, Db2 Warehouse, or a free internal version of PostgreSQL to create a datamart for OpenScale.

**For most scenarios, do not update the cell below (leave the values as they are).**

If you have previously configured OpenScale, it will use your existing datamart, and not interfere with any models you are currently monitoring. Or, if you do not have a paid Cloud account or would prefer not to provision this paid service, you may use the free internal PostgreSQL service with OpenScale. Leave the values as is.

Otherwise, if you want to use an external datastore as the datamart or if you previously used the internal datastore but want to delete it and create a new one:

- To use a new Db2 Warehouse or Databases for PostgreSQL instance as the datamart, provision the desired service via the Cloud catalog and create a set of credentials. Copy and paste the credentials from that service into the cell below. 
- If you previously configured OpenScale to use the free internal version of PostgreSQL, you can switch to a new datamart using a paid database service. 
- If you would like to delete the internal PostgreSQL configuration and create a new one using the service credentials supplied in the cell below, set the `KEEP_MY_INTERNAL_POSTGRES` variable to `False`. In this case, the notebook will remove your existing internal PostgreSQL datamart and create a new one with the supplied credentials. ***NO DATA MIGRATION WILL OCCUR.***



In [6]:
DB_CREDENTIALS = None
SCHEMA_NAME = None
KEEP_MY_INTERNAL_POSTGRES = True
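
If you do supply `DB_CREDENTIALS` for a Databases for PostgreSQL instance, the datamart setup cell later in this notebook reads a specific set of nested keys from it. The sketch below checks for those keys up front. The function name `missing_postgres_keys` is hypothetical and the example values are placeholders, not real credentials.

```python
def missing_postgres_keys(creds):
    """Return the key paths read by the datamart setup that are absent from creds."""
    required = [
        ("hosts", 0, "hostname"),
        ("hosts", 0, "port"),
        ("authentication", "username"),
        ("authentication", "password"),
        ("database",),
        ("query_options", "sslmode"),
        ("certificate", "certificate_base64"),
    ]
    missing = []
    for path in required:
        node = creds.get("connection", {}).get("postgres", {})
        try:
            # Walk down the nested dicts and lists one key at a time.
            for key in path:
                node = node[key]
        except (KeyError, IndexError, TypeError):
            missing.append("connection.postgres." + ".".join(str(k) for k in path))
    return missing

# Placeholder values only, shaped like the structure the setup cell expects.
example = {"connection": {"postgres": {
    "hosts": [{"hostname": "db.example.com", "port": 5432}],
    "authentication": {"username": "user", "password": "secret"},
    "database": "exampledb",
    "query_options": {"sslmode": "verify-full"},
    "certificate": {"certificate_base64": "PLACEHOLDER"},
}}}
print(missing_postgres_keys(example))
```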

## 1.3 Set Custom Name

Provide a custom name that is used to build the model name, the deployment name, and the OpenScale monitor name. A sample value could be `CUSTOM_NAME = 'JRT_WOSTest1020'`.

**<font color='red'><< UPDATE THE VARIABLE 'CUSTOM_NAME' TO A UNIQUE NAME OF YOUR CHOOSING>></font>**


In [7]:
CUSTOM_NAME = 'JRT_WOSTest1021'

## 1.4 Run the notebook

At this point, the notebook is ready to run. You can either run the cells one at a time, or click the **Kernel** option above and select **Restart and Run All** to run all the cells.

# 2.0 Model building and deployment <a name="model"></a>

In this section, you will learn how to train a Spark MLlib model and then deploy it as a web service using the Watson Machine Learning service.

## 2.1 Load the training data

In [8]:
!rm german_credit_data_biased_training.csv
with io.capture_output() as captured:
    !wget https://raw.githubusercontent.com/pmservice/ai-openscale-tutorials/master/assets/historical_data/german_credit_risk/wml/german_credit_data_biased_training.csv -O german_credit_data_biased_training.csv
        
!ls -lh german_credit_data_biased_training.csv

rm: cannot remove 'german_credit_data_biased_training.csv': No such file or directory
-rw-r----- 1 wsuser watsonstudio 674K Oct 21 16:16 german_credit_data_biased_training.csv


In [9]:
from pyspark.sql import SparkSession
import pandas as pd
import json

spark = SparkSession.builder.getOrCreate()
training_data_file_name = "german_credit_data_biased_training.csv"
pd_data = pd.read_csv(training_data_file_name, sep=",", header=0)
df_data = spark.createDataFrame(pd_data)
df_data.head()

Row(CheckingStatus='0_to_200', LoanDuration=31, CreditHistory='credits_paid_to_date', LoanPurpose='other', LoanAmount=1889, ExistingSavings='100_to_500', EmploymentDuration='less_1', InstallmentPercent=3, Sex='female', OthersOnLoan='none', CurrentResidenceDuration=3, OwnsProperty='savings_insurance', Age=32, InstallmentPlans='none', Housing='own', ExistingCreditsCount=1, Job='skilled', Dependents=1, Telephone='none', ForeignWorker='yes', Risk='No Risk')

In [10]:
print("Number of records: " + str(df_data.count()))

Number of records: 5000


In [11]:
spark_df = df_data
(train_data, test_data) = spark_df.randomSplit([0.8, 0.2], 24)

print("Number of records for training: " + str(train_data.count()))
print("Number of records for evaluation: " + str(test_data.count()))

spark_df.printSchema()

Number of records for training: 4039
Number of records for evaluation: 961
root
 |-- CheckingStatus: string (nullable = true)
 |-- LoanDuration: long (nullable = true)
 |-- CreditHistory: string (nullable = true)
 |-- LoanPurpose: string (nullable = true)
 |-- LoanAmount: long (nullable = true)
 |-- ExistingSavings: string (nullable = true)
 |-- EmploymentDuration: string (nullable = true)
 |-- InstallmentPercent: long (nullable = true)
 |-- Sex: string (nullable = true)
 |-- OthersOnLoan: string (nullable = true)
 |-- CurrentResidenceDuration: long (nullable = true)
 |-- OwnsProperty: string (nullable = true)
 |-- Age: long (nullable = true)
 |-- InstallmentPlans: string (nullable = true)
 |-- Housing: string (nullable = true)
 |-- ExistingCreditsCount: long (nullable = true)
 |-- Job: string (nullable = true)
 |-- Dependents: long (nullable = true)
 |-- Telephone: string (nullable = true)
 |-- ForeignWorker: string (nullable = true)
 |-- Risk: string (nullable = true)



## 2.2 Save training data to COS

In [12]:
import ibm_boto3
from ibm_botocore.client import Config, ClientError

cos_client = ibm_boto3.resource(
    "s3",
    ibm_api_key_id=COS_API_KEY_ID,
    ibm_service_instance_id=COS_RESOURCE_CRN,
    ibm_auth_endpoint="https://iam.bluemix.net/oidc/token",
    config=Config(signature_version="oauth"),
    endpoint_url=COS_ENDPOINT
)

In [13]:
create_bucket = True
try:
    buckets = cos_client.buckets.all()
    for bucket in buckets:
        if BUCKET_NAME == bucket.name:
            print("Existing Bucket Found: {0}".format(bucket.name))
            create_bucket = False
            break
            
except ClientError as be:
    print("Client Error: {0}\n".format(be))
except Exception as e:
    print("Unable to retrieve list buckets: {0}\n".format(e))
    
if create_bucket:
    print("Creating new bucket: {0}".format(BUCKET_NAME))
    try:
        cos_client.create_bucket(Bucket=BUCKET_NAME)
        print("Bucket: {0} created!".format(BUCKET_NAME))
    except ClientError as be:
        print("Client Error: {0}\n".format(be))
    except Exception as e:
        print("Unable to create bucket: {0}".format(e))

Existing Bucket Found: credit-risk-training-data-jrt1020-v2


In [14]:
try:
    with open(training_data_file_name, "rb") as file_data:
        cos_client.Object(BUCKET_NAME, training_data_file_name).upload_fileobj(
            Fileobj=file_data
        )
except Exception as e:
    print("An exception occurred: {0}".format(e))

## 2.3 Create a model

The code below will use the CUSTOM_NAME variable set earlier to create a model name and online deployment name.

In [15]:
MODEL_NAME = CUSTOM_NAME + '_WOSNotebook_Model'
DEPLOYMENT_NAME = CUSTOM_NAME + '_WOSNotebook_Deployment'

The code below creates a Random Forest Classifier with Spark, setting up string indexers for the categorical features and the label column. Finally, this notebook creates a pipeline including the indexers and the model, and does an initial Area Under ROC evaluation of the model.

In [16]:
from pyspark.ml.feature import OneHotEncoder, StringIndexer, IndexToString, VectorAssembler
from pyspark.ml.evaluation import BinaryClassificationEvaluator
from pyspark.ml import Pipeline, Model

si_CheckingStatus = StringIndexer(inputCol = "CheckingStatus", outputCol = "CheckingStatus_IX")
si_CreditHistory = StringIndexer(inputCol = "CreditHistory", outputCol = "CreditHistory_IX")
si_LoanPurpose = StringIndexer(inputCol = "LoanPurpose", outputCol = "LoanPurpose_IX")
si_ExistingSavings = StringIndexer(inputCol = "ExistingSavings", outputCol = "ExistingSavings_IX")
si_EmploymentDuration = StringIndexer(inputCol = "EmploymentDuration", outputCol = "EmploymentDuration_IX")
si_Sex = StringIndexer(inputCol = "Sex", outputCol = "Sex_IX")
si_OthersOnLoan = StringIndexer(inputCol = "OthersOnLoan", outputCol = "OthersOnLoan_IX")
si_OwnsProperty = StringIndexer(inputCol = "OwnsProperty", outputCol = "OwnsProperty_IX")
si_InstallmentPlans = StringIndexer(inputCol = "InstallmentPlans", outputCol = "InstallmentPlans_IX")
si_Housing = StringIndexer(inputCol = "Housing", outputCol = "Housing_IX")
si_Job = StringIndexer(inputCol = "Job", outputCol = "Job_IX")
si_Telephone = StringIndexer(inputCol = "Telephone", outputCol = "Telephone_IX")
si_ForeignWorker = StringIndexer(inputCol = "ForeignWorker", outputCol = "ForeignWorker_IX")

In [17]:
si_Label = StringIndexer(inputCol="Risk", outputCol="label").fit(spark_df)
label_converter = IndexToString(inputCol="prediction", outputCol="predictedLabel", labels=si_Label.labels)

In [18]:
va_features = VectorAssembler(inputCols=["CheckingStatus_IX", "CreditHistory_IX", "LoanPurpose_IX", "ExistingSavings_IX", "EmploymentDuration_IX", "Sex_IX", \
                                         "OthersOnLoan_IX", "OwnsProperty_IX", "InstallmentPlans_IX", "Housing_IX", "Job_IX", "Telephone_IX", "ForeignWorker_IX", \
                                         "LoanDuration", "LoanAmount", "InstallmentPercent", "CurrentResidenceDuration", "LoanDuration", "Age", "ExistingCreditsCount", \
                                         "Dependents"], outputCol="features")

In [19]:
from pyspark.ml.classification import RandomForestClassifier
classifier = RandomForestClassifier(featuresCol="features")

pipeline = Pipeline(stages=[si_CheckingStatus, si_CreditHistory, si_EmploymentDuration, si_ExistingSavings, si_ForeignWorker, si_Housing, si_InstallmentPlans, si_Job, si_LoanPurpose, si_OthersOnLoan,\
                               si_OwnsProperty, si_Sex, si_Telephone, si_Label, va_features, classifier, label_converter])
model = pipeline.fit(train_data)


## 2.4 Evaluate Model

In [20]:
predictions = model.transform(test_data)
evaluatorDT = BinaryClassificationEvaluator(rawPredictionCol="prediction",  metricName="areaUnderROC")
area_under_curve = evaluatorDT.evaluate(predictions)

evaluatorDT = BinaryClassificationEvaluator(rawPredictionCol="prediction",  metricName="areaUnderPR")
area_under_PR = evaluatorDT.evaluate(predictions)

print("areaUnderROC = %g" % area_under_curve, "areaUnderPR = %g" % area_under_PR)

areaUnderROC = 0.711137 areaUnderPR = 0.64943


In [21]:
from sklearn.metrics import classification_report
y_pred = predictions.toPandas()['prediction']
y_pred = ['Risk' if pred == 1.0 else 'No Risk' for pred in y_pred]
y_test = test_data.toPandas()['Risk']
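# scikit-learn orders labels alphabetically ('No Risk' before 'Risk'), so target_names must follow that order
print(classification_report(y_test, y_pred, target_names=['No Risk', 'Risk']))

              precision    recall  f1-score   support

     No Risk       0.79      0.92      0.85       643
        Risk       0.75      0.50      0.60       318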

    accuracy                           0.78       961
   macro avg       0.77      0.71      0.73       961
weighted avg       0.78      0.78      0.77       961
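
Because the evaluator above is given the `prediction` column (hard 0/1 labels) instead of scores, the ROC curve has a single operating point, and the area under it reduces to the mean of the two per-class recalls. The sketch below works this through in plain Python. The function name `roc_auc_hard` is illustrative, the sample labels are made up, and the function assumes both classes are present in `y_true`.

```python
def roc_auc_hard(y_true, y_pred):
    """Area under the three-point ROC curve (0,0) -> (FPR,TPR) -> (1,1)."""
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tpr = tp / pos
    fpr = fp / neg
    # Left triangle plus right trapezoid; this simplifies to (1 + TPR - FPR) / 2.
    return 0.5 * fpr * tpr + 0.5 * (1 - fpr) * (tpr + 1)

print(roc_auc_hard([1, 1, 0, 0], [1, 0, 0, 0]))  # 0.75
```

With the two recall values in the report above, (0.92 + 0.50) / 2 = 0.71, which agrees with the reported `areaUnderROC` of 0.711137 and with the macro-average recall.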



## 2.5 Publish the model

In this section, the notebook uses Watson Machine Learning to save the model (including the pipeline) to the WML instance. Previous versions of the model are removed so that the notebook can be run again, resetting all data for another demo.

In [22]:
import json
from ibm_watson_machine_learning import APIClient

wml_client = APIClient(WML_CREDENTIALS)
wml_client.version

'1.0.29'

### 2.5.1 Set default space

To deploy a model, you need a deployment space. You can list all existing spaces with `wml_client.spaces.list()`, or create a new one from the menu in the top left corner: **Analyze** > **Analytics deployments** > **New Deployment Space**. Once you know which space you want to deploy to, pass its GUID to `wml_client.set.default_space()` below.


In [23]:
wml_client.spaces.list()

Note: 'limit' is not provided. Only first 50 records will be displayed if the number of records exceed 50
------------------------------------  -----------------------------------------------------------  ------------------------
ID                                    NAME                                                         CREATED
c3a71249-4da6-45c7-939f-0e8d135ad694  openscale-express-path-99318397-f315-40be-a572-961b511856bd  2020-10-13T20:51:31.431Z
79a3d376-144b-4bbc-8c92-fdf187c50b67  CreditRiskDep1007                                            2020-10-07T15:21:16.858Z
50e0a36c-f923-46f1-a7b6-3acedb6762e7  creditriskspace09252020                                      2020-09-25T20:50:33.359Z
------------------------------------  -----------------------------------------------------------  ------------------------


**<font color='red'><< UPDATE THE VARIABLE 'DEPLOYMENT_SPACE_NAME' TO THE NAME OF THE DEPLOYMENT SPACE YOU CREATED PREVIOUSLY>></font>**

You should copy the name of your deployment space from the output of the previous cell to the variable in the next cell. The deployment space ID will be looked up based on the name specified below. If you do not receive a space GUID as an output to the next cell, do not proceed until you have created a deployment space.

In [24]:
DEPLOYMENT_SPACE_NAME = "CreditRiskDep1007"

In [25]:
wml_client.spaces.list()
all_spaces = wml_client.spaces.get_details()['resources']
space_id = None
for space in all_spaces:
    if space['entity']['name'] == DEPLOYMENT_SPACE_NAME:
        space_id = space["metadata"]["id"]
        print("\nDeployment Space ID: ", space_id)

if space_id is None:
    print("WARNING: Your space does not exist. Create a deployment space before proceeding to the next cell.")
    #space_id = client.spaces.store(meta_props={client.spaces.ConfigurationMetaNames.NAME: space_name})["metadata"]["guid"]

Note: 'limit' is not provided. Only first 50 records will be displayed if the number of records exceed 50
------------------------------------  -----------------------------------------------------------  ------------------------
ID                                    NAME                                                         CREATED
c3a71249-4da6-45c7-939f-0e8d135ad694  openscale-express-path-99318397-f315-40be-a572-961b511856bd  2020-10-13T20:51:31.431Z
79a3d376-144b-4bbc-8c92-fdf187c50b67  CreditRiskDep1007                                            2020-10-07T15:21:16.858Z
50e0a36c-f923-46f1-a7b6-3acedb6762e7  creditriskspace09252020                                      2020-09-25T20:50:33.359Z
------------------------------------  -----------------------------------------------------------  ------------------------

Deployment Space ID:  79a3d376-144b-4bbc-8c92-fdf187c50b67


In [26]:
WML_SPACE_ID = space_id
wml_client.set.default_space(WML_SPACE_ID)

'SUCCESS'
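
When the notebook is run with **Restart and Run All**, a missing space only produces a warning and later cells continue with `space_id` set to `None`. A stricter variant of the lookup, sketched here as a plain function over the same dictionary layout, stops immediately. The name `find_space_id` is hypothetical, and the sample dictionary only imitates the fields the lookup cell above reads.

```python
def find_space_id(spaces_details, name):
    """Return the ID of the space called `name`, or raise if it is absent."""
    matches = [
        space["metadata"]["id"]
        for space in spaces_details.get("resources", [])
        if space["entity"]["name"] == name
    ]
    if not matches:
        raise LookupError("No deployment space named {0!r}".format(name))
    return matches[0]

# Sample data shaped like the structure the lookup cell above reads.
sample = {"resources": [
    {"metadata": {"id": "79a3d376-144b-4bbc-8c92-fdf187c50b67"},
     "entity": {"name": "CreditRiskDep1007"}},
]}
print(find_space_id(sample, "CreditRiskDep1007"))
```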

### 2.5.2 Remove existing model and deployment

In [27]:
wml_client.repository.list_models()
wml_client.deployments.list()

------------------------------------  ---------------------------------------------------  ------------------------  --------------
ID                                    NAME                                                 CREATED                   TYPE
a4203511-0f56-4422-807f-d28c4f9eab13  WOS German Risk Model WML V4 - JRT1020v1             2020-10-20T21:44:18.002Z  mllib_2.4
dd433ad8-98dc-4cae-b1a6-32eff7f168c2  WOS German Risk Model WML V4 - JRT1012v1             2020-10-12T15:42:11.002Z  mllib_2.4
8d5e55ff-6cbb-4aa9-af37-730f1815f33f  WOS German Risk Model WML V4 - JRT1007v2             2020-10-08T01:11:50.002Z  mllib_2.4
17f79410-cbb7-48b0-86e8-ef8ff950d4d5  RiskAutoAI - P4 GradientBoostingClassifierEstimator  2020-10-07T18:35:00.002Z  wml-hybrid_0.1
e0e81244-c8c3-4e1b-8c3b-7974e5eca9d6  RiskSpark1007                                        2020-10-07T16:14:01.002Z  mllib_2.4
------------------------------------  ---------------------------------------------------  ---------------

In [28]:
try:
    deployments_list = wml_client.deployments.get_details()
    for deployment in deployments_list["resources"]:
        model_id = deployment["entity"]["asset"]["id"]
        deployment_id = deployment["metadata"]["id"]
        if deployment["metadata"]["name"] == DEPLOYMENT_NAME:
            print("Deleting deployment id", deployment_id)
            wml_client.deployments.delete(deployment_id)
            print("Deleting model id", model_id)
            wml_client.repository.delete(model_id)
            wml_client.repository.list_models()
            wml_client.deployments.list()
except Exception as e:
    print("An exception occurred: {0}".format(e))

In [29]:
software_spec_uid = wml_client.software_specifications.get_id_by_name("spark-mllib_2.4")
print("Software Specification ID: {}".format(software_spec_uid))

model_props = {
    wml_client.repository.ModelMetaNames.NAME: "{}".format(MODEL_NAME),
    wml_client._models.ConfigurationMetaNames.SPACE_UID: WML_SPACE_ID,
    wml_client.repository.ModelMetaNames.TYPE: 'mllib_2.4',
    wml_client.repository.ModelMetaNames.SOFTWARE_SPEC_UID: software_spec_uid
}

Software Specification ID: 390d21f8-e58b-4fac-9c55-d7ceda621326


In [30]:
published_model_details = wml_client.repository.store_model(model, model_props, training_data=df_data, pipeline=pipeline)
model_uid = wml_client.repository.get_model_uid(published_model_details)

print("Published Model Details: ")
print(json.dumps(published_model_details, indent=2))

Published Model Details: 
{
  "entity": {
    "label_column": "Risk",
    "pipeline": {
      "id": "c056ac86-13f8-4f2d-8cc0-b32e9e3c16a0"
    },
    "software_spec": {
      "id": "390d21f8-e58b-4fac-9c55-d7ceda621326",
      "name": "spark-mllib_2.4"
    },
    "training_data_references": [
      {
        "connection": {
          "access_key_id": "not_applicable",
          "endpoint_url": "not_applicable",
          "secret_access_key": "not_applicable"
        },
        "id": "1",
        "location": {},
        "schema": {
          "fields": [
            {
              "metadata": {},
              "name": "CheckingStatus",
              "nullable": true,
              "type": "string"
            },
            {
              "metadata": {},
              "name": "LoanDuration",
              "nullable": true,
              "type": "long"
            },
            {
              "metadata": {},
              "name": "CreditHistory",
              "nullable": true,
      

## 2.6 Deploy the model

The next section of the notebook deploys the model as a RESTful web service in Watson Machine Learning. The deployed model will have a scoring URL you can use to send data to the model for predictions.

In [31]:
print("Deploying model...")
meta_props = {
    wml_client.deployments.ConfigurationMetaNames.NAME: DEPLOYMENT_NAME,
    wml_client.deployments.ConfigurationMetaNames.ONLINE: {}
}
deployment = wml_client.deployments.create(model_uid, meta_props=meta_props)
deployment_uid = wml_client.deployments.get_uid(deployment)
scoring_url = wml_client.deployments.get_scoring_href(deployment)
 
print("Model id: {}".format(model_uid))
print("Deployment id: {}".format(deployment_uid))
print("Scoring URL:{}".format(scoring_url))

Deploying model...


#######################################################################################

Synchronous deployment creation for uid: '44daffcc-6310-417c-9911-07c65a4adb2a' started

#######################################################################################


initializing.
ready


------------------------------------------------------------------------------------------------
Successfully finished deployment creation, deployment_uid='8a6a243d-9622-42bb-9021-1ba3e7791012'
------------------------------------------------------------------------------------------------


Model id: 44daffcc-6310-417c-9911-07c65a4adb2a
Deployment id: 8a6a243d-9622-42bb-9021-1ba3e7791012
Scoring URL:https://us-south.ml.cloud.ibm.com/ml/v4/deployments/8a6a243d-9622-42bb-9021-1ba3e7791012/predictions


In [32]:
wml_client.repository.list_models()
wml_client.deployments.list()

------------------------------------  ---------------------------------------------------  ------------------------  --------------
ID                                    NAME                                                 CREATED                   TYPE
44daffcc-6310-417c-9911-07c65a4adb2a  JRT_WOSTest1021_WOSNotebook_Model                    2020-10-21T16:18:34.002Z  mllib_2.4
a4203511-0f56-4422-807f-d28c4f9eab13  WOS German Risk Model WML V4 - JRT1020v1             2020-10-20T21:44:18.002Z  mllib_2.4
dd433ad8-98dc-4cae-b1a6-32eff7f168c2  WOS German Risk Model WML V4 - JRT1012v1             2020-10-12T15:42:11.002Z  mllib_2.4
8d5e55ff-6cbb-4aa9-af37-730f1815f33f  WOS German Risk Model WML V4 - JRT1007v2             2020-10-08T01:11:50.002Z  mllib_2.4
17f79410-cbb7-48b0-86e8-ef8ff950d4d5  RiskAutoAI - P4 GradientBoostingClassifierEstimator  2020-10-07T18:35:00.002Z  wml-hybrid_0.1
e0e81244-c8c3-4e1b-8c3b-7974e5eca9d6  RiskSpark1007                                        2020-10-07T16:1

### 2.6.1 Call the model

In [33]:
fields = ["CheckingStatus", "LoanDuration", "CreditHistory", "LoanPurpose", "LoanAmount", "ExistingSavings",
                  "EmploymentDuration", "InstallmentPercent", "Sex", "OthersOnLoan", "CurrentResidenceDuration",
                  "OwnsProperty", "Age", "InstallmentPlans", "Housing", "ExistingCreditsCount", "Job", "Dependents",
                  "Telephone", "ForeignWorker"]
values = [
            ["no_checking", 13, "credits_paid_to_date", "car_new", 1343, "100_to_500", "1_to_4", 2, "female", "none", 3,
             "savings_insurance", 46, "none", "own", 2, "skilled", 1, "none", "yes"],
            ["no_checking", 24, "prior_payments_delayed", "furniture", 4567, "500_to_1000", "1_to_4", 4, "male", "none",
             4, "savings_insurance", 36, "none", "free", 2, "management_self-employed", 1, "none", "yes"],
        ]

scoring_payload = {"input_data": [{"fields": fields, "values": values}]}
predictions = wml_client.deployments.score(deployment_uid, scoring_payload)
predictions

{'predictions': [{'fields': ['CheckingStatus',
    'LoanDuration',
    'CreditHistory',
    'LoanPurpose',
    'LoanAmount',
    'ExistingSavings',
    'EmploymentDuration',
    'InstallmentPercent',
    'Sex',
    'OthersOnLoan',
    'CurrentResidenceDuration',
    'OwnsProperty',
    'Age',
    'InstallmentPlans',
    'Housing',
    'ExistingCreditsCount',
    'Job',
    'Dependents',
    'Telephone',
    'ForeignWorker',
    'CheckingStatus_IX',
    'CreditHistory_IX',
    'EmploymentDuration_IX',
    'ExistingSavings_IX',
    'ForeignWorker_IX',
    'Housing_IX',
    'InstallmentPlans_IX',
    'Job_IX',
    'LoanPurpose_IX',
    'OthersOnLoan_IX',
    'OwnsProperty_IX',
    'Sex_IX',
    'Telephone_IX',
    'features',
    'rawPrediction',
    'probability',
    'prediction',
    'predictedLabel'],
   'values': [['no_checking',
     13,
     'credits_paid_to_date',
     'car_new',
     1343,
     '100_to_500',
     '1_to_4',
     2,
     'female',
     'none',
     3,
     'savings
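
The scoring response pairs one `fields` list with a list of `values` rows, which is awkward to read directly. A minimal sketch for turning it into one dictionary per scored row follows. The function name `prediction_rows` is illustrative, and the sample response is abbreviated and made up; it only reuses the layout and a few field names from the output above.

```python
def prediction_rows(response):
    """Turn a scoring response into one dict per scored row."""
    rows = []
    for block in response["predictions"]:
        for values in block["values"]:
            # Pair each field name with the value at the same position.
            rows.append(dict(zip(block["fields"], values)))
    return rows

# Abbreviated, made-up response using the same layout as the output above.
sample_response = {"predictions": [{
    "fields": ["probability", "prediction", "predictedLabel"],
    "values": [[[0.8, 0.2], 0.0, "No Risk"],
               [[0.3, 0.7], 1.0, "Risk"]],
}]}
for row in prediction_rows(sample_response):
    print(row["predictedLabel"], max(row["probability"]))
```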

# 3.0 Configure OpenScale <a name="openscale"></a>

The notebook will now import the necessary libraries and set up a Python OpenScale client.

In [34]:
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

from ibm_watson_openscale import *
from ibm_watson_openscale.supporting_classes.enums import *
from ibm_watson_openscale.supporting_classes import *
from ibm_watson_openscale.base_classes.watson_open_scale_v2 import *

import time

In [35]:
authenticator = IAMAuthenticator(apikey=CLOUD_API_KEY)
wos_client = APIClient(authenticator=authenticator)

#wos_client = APIClient(
#    service_instance_id="xxxxxxxxxxxxxxxxxxxx",
#    authenticator=authenticator
#)

wos_client.version

'3.0.1'

## 3.1 Create datamart

### 3.1.1 Set up datamart

Watson OpenScale uses a database to store payload logs and calculated metrics. If database credentials were **not** supplied above, the notebook will use the free, internal lite database. If database credentials were supplied, the datamart will be created there **unless** there is an existing datamart **and** the **KEEP_MY_INTERNAL_POSTGRES** variable is set to **True**. If an OpenScale datamart exists in Db2 or PostgreSQL, the existing datamart will be used and no data will be overwritten.

Prior instances of the German Credit model will be removed from OpenScale monitoring.

In [36]:
wos_client.data_marts.show()

0,1,2,3,4,5
wosfastpath,,True,active,2020-10-13 20:52:14.086000+00:00,99318397-f315-40be-a572-961b511856bd


In [37]:
data_marts = wos_client.data_marts.list().result.data_marts
if len(data_marts) == 0:
    if DB_CREDENTIALS is not None:
        if SCHEMA_NAME is None:
            raise ValueError("Please specify the SCHEMA_NAME and rerun the cell")

        print("Setting up external datamart")
        added_data_mart_result = wos_client.data_marts.add(
                background_mode=False,
                name="WOS Data Mart",
                description="Data Mart created by WOS tutorial notebook",
                database_configuration=DatabaseConfigurationRequest(
                  database_type=DatabaseType.POSTGRESQL,
                    credentials=PrimaryStorageCredentialsLong(
                        hostname=DB_CREDENTIALS["connection"]["postgres"]["hosts"][0]["hostname"],
                        username=DB_CREDENTIALS["connection"]["postgres"]["authentication"]["username"],
                        password=DB_CREDENTIALS["connection"]["postgres"]["authentication"]["password"],
                        db=DB_CREDENTIALS["connection"]["postgres"]["database"],
                        port=DB_CREDENTIALS["connection"]["postgres"]["hosts"][0]["port"],
                        ssl=True,
                        sslmode=DB_CREDENTIALS["connection"]["postgres"]["query_options"]["sslmode"],
                        certificate_base64=DB_CREDENTIALS["connection"]["postgres"]["certificate"]["certificate_base64"]
                    ),
                    location=LocationSchemaName(
                        schema_name= SCHEMA_NAME
                    )
                )
             ).result
    else:
        print("Setting up internal datamart")
        added_data_mart_result = wos_client.data_marts.add(
                background_mode=False,
                name="WOS Data Mart",
                description="Data Mart created by WOS tutorial notebook", 
                internal_database = True).result
        
    data_mart_id = added_data_mart_result.metadata.id
    
else:
    data_mart_id=data_marts[0].metadata.id
    print("Using existing datamart {}".format(data_mart_id))

Using existing datamart 99318397-f315-40be-a572-961b511856bd
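
The external-datamart branch above reads several values out of a deeply nested `DB_CREDENTIALS` dictionary. As a rough sketch, the same lookups can be gathered into one helper so that a missing key fails in a single place. The helper name `postgres_settings` and the sample values are hypothetical; only the key paths are taken from the cell above.

```python
def postgres_settings(db_credentials):
    """Collect the fields used for an external datamart from the nested credentials dict."""
    pg = db_credentials["connection"]["postgres"]
    host = pg["hosts"][0]  # first host entry, as in the cell above
    return {
        "hostname": host["hostname"],
        "port": host["port"],
        "username": pg["authentication"]["username"],
        "password": pg["authentication"]["password"],
        "db": pg["database"],
        "sslmode": pg["query_options"]["sslmode"],
        "certificate_base64": pg["certificate"]["certificate_base64"],
    }


# Placeholder values only; none of these come from a real service instance.
sample_credentials = {
    "connection": {
        "postgres": {
            "hosts": [{"hostname": "db.example.com", "port": 5432}],
            "authentication": {"username": "user", "password": "secret"},
            "database": "exampledb",
            "query_options": {"sslmode": "verify-full"},
            "certificate": {"certificate_base64": "PLACEHOLDER"},
        }
    }
}

settings = postgres_settings(sample_credentials)
print(settings["hostname"], settings["port"])
```

A `KeyError` raised by this helper names the missing key, which can be easier to read than a failure inside the `data_marts.add` call.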


## 3.2  Add Service Provider

Watson OpenScale needs to be bound to the Watson Machine Learning instance to capture payload data into and out of the model. If a service provider with the same name already exists, the code below deletes it and adds a new one.

In [38]:
wos_client.service_providers.show()

0,1,2,3,4,5
7e90d3dd-0994-411a-bc3e-2fb322acf604,active,Watson Machine Learning V2 Instance,watson_machine_learning,2020-10-20 21:45:33.824000+00:00,6dda1465-c30a-439f-a89e-e329bae8db70
7e90d3dd-0994-411a-bc3e-2fb322acf604,active,WOS ExpressPath WML pre_production binding,watson_machine_learning,2020-10-13 20:52:34.878000+00:00,4cd2b18e-3d4c-4cf2-a12c-6c1b7dbb4d64
7e90d3dd-0994-411a-bc3e-2fb322acf604,active,WOS ExpressPath WML production binding,watson_machine_learning,2020-10-13 20:52:19.518000+00:00,303da567-f37d-45e5-8188-97a430ef44f7


In [39]:
SERVICE_PROVIDER_NAME = "Watson Machine Learning V2 Instance"
SERVICE_PROVIDER_DESCRIPTION = "WML Instance"

### 3.2.1 Remove the existing service provider for this WML instance

Watson OpenScale allows multiple service providers for the same engine instance. To avoid accumulating duplicates for the WML instance used in this tutorial, the following code deletes any existing service provider with the same name before a new one is added.

In [40]:
service_providers = wos_client.service_providers.list().result.service_providers
for service_provider in service_providers:
    service_instance_name = service_provider.entity.name
    if service_instance_name == SERVICE_PROVIDER_NAME:
        service_provider_id = service_provider.metadata.id
        wos_client.service_providers.delete(service_provider_id)
        print("Deleted existing service_provider for WML instance: {}".format(service_provider_id))

Deleted existing service_provider for WML instance: 6dda1465-c30a-439f-a89e-e329bae8db70


### 3.2.2 Add Service Provider

Watson OpenScale needs to be bound to the Watson Machine Learning instance to capture payload data into and out of the model.

**Note:** You can bind more than one engine instance if needed by calling the `wos_client.service_providers.add` method. You can then refer to a particular service provider by its `service_provider_id`.

In [41]:
added_service_provider_result = wos_client.service_providers.add(
        name=SERVICE_PROVIDER_NAME,
        description=SERVICE_PROVIDER_DESCRIPTION,
        service_type=ServiceTypes.WATSON_MACHINE_LEARNING,
        deployment_space_id = WML_SPACE_ID,
        operational_space_id = "production",
        credentials=WMLCredentialsCloud(
            apikey=CLOUD_API_KEY,
            url=WML_CREDENTIALS["url"],
            instance_id=None
        ),
        background_mode=False
    ).result
service_provider_id = added_service_provider_result.metadata.id




 Waiting for end of adding service provider f010caca-81d0-4345-abc6-a43628ead294 




active

-----------------------------------------------
 Successfully finished adding service provider 
-----------------------------------------------




In [42]:
wos_client.service_providers.show()

0,1,2,3,4,5
7e90d3dd-0994-411a-bc3e-2fb322acf604,active,Watson Machine Learning V2 Instance,watson_machine_learning,2020-10-21 16:20:47.866000+00:00,f010caca-81d0-4345-abc6-a43628ead294
7e90d3dd-0994-411a-bc3e-2fb322acf604,active,WOS ExpressPath WML pre_production binding,watson_machine_learning,2020-10-13 20:52:34.878000+00:00,4cd2b18e-3d4c-4cf2-a12c-6c1b7dbb4d64
7e90d3dd-0994-411a-bc3e-2fb322acf604,active,WOS ExpressPath WML production binding,watson_machine_learning,2020-10-13 20:52:19.518000+00:00,303da567-f37d-45e5-8188-97a430ef44f7


In [43]:
asset_deployment_details = wos_client.service_providers.list_assets(data_mart_id=data_mart_id, service_provider_id=service_provider_id, deployment_space_id = WML_SPACE_ID).result['resources'][0]
model_asset_details_from_deployment = wos_client.service_providers.get_deployment_asset(data_mart_id=data_mart_id,service_provider_id=service_provider_id,deployment_id=deployment_uid,deployment_space_id=WML_SPACE_ID)

all_assets_response = wos_client.service_providers.list_assets(
        data_mart_id,
        service_provider_id,
        deployment_space_id=WML_SPACE_ID
     ).result
print(json.dumps(all_assets_response, indent=2))

{
  "resources": [
    {
      "metadata": {
        "guid": "1cfae0db-dce3-42cd-bcd7-0773912772d2",
        "url": "https://us-south.ml.cloud.ibm.com/ml/v4/deployments/1cfae0db-dce3-42cd-bcd7-0773912772d2?space_id=79a3d376-144b-4bbc-8c92-fdf187c50b67",
        "created_at": "2020-10-20T21:44:35.728Z",
        "modified_at": "2020-10-20T21:44:35.728Z"
      },
      "entity": {
        "name": "WOS German Risk Deployment WML V4 - JRT1020v1",
        "type": "online",
        "scoring_endpoint": {
          "url": "https://us-south.ml.cloud.ibm.com/ml/v4/deployments/1cfae0db-dce3-42cd-bcd7-0773912772d2/predictions"
        },
        "asset": {},
        "asset_properties": {}
      }
    },
    {
      "metadata": {
        "guid": "3299e567-0627-4020-b7d9-08b9ddd15216",
        "url": "https://us-south.ml.cloud.ibm.com/ml/v4/deployments/3299e567-0627-4020-b7d9-08b9ddd15216?space_id=79a3d376-144b-4bbc-8c92-fdf187c50b67",
        "created_at": "2020-10-12T15:43:45.480Z",
        "modifi

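The cell above takes the first entry of `resources`, which is not necessarily the deployment created in this notebook. A minimal sketch of looking up an entry by its `guid` is shown below. The helper name `find_asset` is hypothetical, and the sample list is a trimmed copy of the response shape printed above.

```python
def find_asset(resources, deployment_guid):
    """Return the asset entry whose metadata.guid matches, or None if absent."""
    for resource in resources:
        if resource["metadata"]["guid"] == deployment_guid:
            return resource
    return None


# Trimmed copy of the response shape; only the fields needed for the lookup are kept.
sample_resources = [
    {
        "metadata": {"guid": "1cfae0db-dce3-42cd-bcd7-0773912772d2"},
        "entity": {"name": "WOS German Risk Deployment WML V4 - JRT1020v1", "type": "online"},
    },
    {
        "metadata": {"guid": "3299e567-0627-4020-b7d9-08b9ddd15216"},
        "entity": {"type": "online"},
    },
]

match = find_asset(sample_resources, "3299e567-0627-4020-b7d9-08b9ddd15216")
missing = find_asset(sample_resources, "not-a-real-guid")
print(match["metadata"]["guid"], missing)
```

In the notebook, the same function could be called with `all_assets_response["resources"]` and `deployment_uid`.
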
## 3.3 Subscriptions

In [44]:
wos_client.subscriptions.show()

0,1,2,3,4,5,6,7,8
191a125b-ef98-48d9-ba9c-2da7a25ee789,,99318397-f315-40be-a572-961b511856bd,d9abdbb4-cf7a-4783-abf8-57635c54c787,GermanCreditRiskModel,303da567-f37d-45e5-8188-97a430ef44f7,active,2020-10-13 20:56:10.279000+00:00,4737de0d-5e04-4c91-80b1-3b934d9e9a6c
fc979971-7d8e-438b-87e1-433a3bfbb3c9,,99318397-f315-40be-a572-961b511856bd,057913ba-62a8-477a-8fc7-1644fdb846b0,GermanCreditRiskModelPreProd,4cd2b18e-3d4c-4cf2-a12c-6c1b7dbb4d64,active,2020-10-13 20:54:33.174000+00:00,b449f4a6-4249-459a-85ab-139f015429ce
c55713e8-a62a-458a-815d-80e1059bdf28,,99318397-f315-40be-a572-961b511856bd,53429ba1-b729-4f8e-a17e-f8f31094d33b,GermanCreditRiskModelChallenger,4cd2b18e-3d4c-4cf2-a12c-6c1b7dbb4d64,active,2020-10-13 20:53:05.201000+00:00,ae64e4a6-e2fa-4cfd-b477-6df4219383fc


### 3.3.1 Remove existing credit risk subscriptions

This code removes previous subscriptions to the Credit model to refresh the monitors with the new model and new data.

In [45]:
subscriptions = wos_client.subscriptions.list().result.subscriptions
for subscription in subscriptions:
    sub_model_id = subscription.entity.asset.asset_id
    if sub_model_id == model_uid:
        wos_client.subscriptions.delete(subscription.metadata.id)
        print("Deleted existing subscription for model: {}".format(model_uid))

### 3.3.2 Add new credit risk model subscription.

This code creates the model subscription in OpenScale using the Python client API. Note that we need to provide the model unique identifier, and some information about the model itself.

In [46]:
asset = Asset(
    asset_id=model_uid,
    url=deployment["entity"]["status"]["online_url"]["url"],
    name=model_asset_details_from_deployment["entity"]["asset"]["name"],
    asset_type=AssetTypes.MODEL,
    input_data_type=InputDataType.STRUCTURED,
    problem_type=ProblemType.BINARY_CLASSIFICATION
)
asset_deployment = AssetDeploymentRequest(
    deployment_id=deployment_uid,
    name=DEPLOYMENT_NAME,
    deployment_type=DeploymentTypes.ONLINE,
    url=deployment["entity"]["status"]["online_url"]["url"]
)
training_data_reference = TrainingDataReference(
   type="cos",
   location=COSTrainingDataReferenceLocation(
       bucket=BUCKET_NAME,
       file_name=training_data_file_name
   ),
   connection=COSTrainingDataReferenceConnection.from_dict(
       {
           "resource_instance_id": COS_RESOURCE_CRN,
           "url": COS_ENDPOINT,
           "api_key": COS_API_KEY_ID,
           "iam_url": "https://iam.bluemix.net/oidc/token"
       }
   )
)

asset_properties_request = AssetPropertiesRequest(
    label_column="Risk",
    prediction_field='predictedLabel',
    probability_fields=["probability"],
    feature_fields = ["CheckingStatus","LoanDuration","CreditHistory","LoanPurpose","LoanAmount","ExistingSavings","EmploymentDuration","InstallmentPercent","Sex","OthersOnLoan","CurrentResidenceDuration","OwnsProperty","Age","InstallmentPlans","Housing","ExistingCreditsCount","Job","Dependents","Telephone","ForeignWorker"],
    categorical_fields = ["CheckingStatus","CreditHistory","LoanPurpose","ExistingSavings","EmploymentDuration","Sex","OthersOnLoan","OwnsProperty","InstallmentPlans","Housing","Job","Telephone","ForeignWorker"],
    training_data_reference=training_data_reference,
    training_data_schema=SparkStruct.from_dict(model_asset_details_from_deployment["entity"]["asset_properties"]["training_data_schema"])
)

In [47]:
subscription_details = wos_client.subscriptions.add(
        data_mart_id=data_mart_id,
        service_provider_id=service_provider_id,
        asset=asset,
        deployment=asset_deployment,
        asset_properties=asset_properties_request).result
subscription_id = subscription_details.metadata.id
print(subscription_details)

{
  "metadata": {
    "id": "26415665-819e-4217-a105-dafe3c8e9c5d",
    "crn": "crn:v1:bluemix:public:aiopenscale:us-south:a/f365091ee8c74129813329068811cc54:99318397-f315-40be-a572-961b511856bd:subscription:26415665-819e-4217-a105-dafe3c8e9c5d",
    "url": "/v2/subscriptions/26415665-819e-4217-a105-dafe3c8e9c5d",
    "created_at": "2020-10-21T16:21:27.179000Z",
    "created_by": "IBMid-060000GRS6"
  },
  "entity": {
    "data_mart_id": "99318397-f315-40be-a572-961b511856bd",
    "service_provider_id": "f010caca-81d0-4345-abc6-a43628ead294",
    "asset": {
      "asset_id": "44daffcc-6310-417c-9911-07c65a4adb2a",
      "url": "https://us-south.ml.cloud.ibm.com/ml/v4/deployments/8a6a243d-9622-42bb-9021-1ba3e7791012/predictions",
      "name": "JRT_WOSTest1021_WOSNotebook_Model",
      "asset_type": "model",
      "problem_type": "binary",
      "input_data_type": "structured"
    },
    "asset_properties": {
      "training_data_reference": {
        "secret_id": "7b2cffe9-1937-43c9-82a

### 3.3.3 Check Payload Logging Dataset

In [48]:
time.sleep(5)
payload_data_set_id = None
# Indexing an empty list would raise IndexError, so check the list before taking the first entry
payload_data_sets = wos_client.data_sets.list(type=DataSetTypes.PAYLOAD_LOGGING, 
                                              target_target_id=subscription_id, 
                                              target_target_type=TargetTypes.SUBSCRIPTION).result.data_sets
if payload_data_sets:
    payload_data_set_id = payload_data_sets[0].metadata.id
if payload_data_set_id is None:
    print("Payload data set not found. Please check subscription status.")
else:
    print("Payload data set id: {}".format(payload_data_set_id))

Payload data set id: 3e4d7c26-0135-4b1e-984c-b8ad6950e638


In [49]:
wos_client.data_sets.show()

0,1,2,3,4,5,6
99318397-f315-40be-a572-961b511856bd,active,26415665-819e-4217-a105-dafe3c8e9c5d,subscription,manual_labeling,2020-10-21 16:21:28.217000+00:00,681ccb78-4b6c-48fc-bdb6-2fa8bc19627a
99318397-f315-40be-a572-961b511856bd,active,26415665-819e-4217-a105-dafe3c8e9c5d,subscription,payload_logging,2020-10-21 16:21:28.102000+00:00,3e4d7c26-0135-4b1e-984c-b8ad6950e638
99318397-f315-40be-a572-961b511856bd,active,4737de0d-5e04-4c91-80b1-3b934d9e9a6c,subscription,manual_labeling,2020-10-13 20:56:12.560000+00:00,66fce8c7-1579-4a5f-a5f3-10bce8a0352f
99318397-f315-40be-a572-961b511856bd,active,4737de0d-5e04-4c91-80b1-3b934d9e9a6c,subscription,payload_logging,2020-10-13 20:56:12.464000+00:00,0dcb54d4-b1f2-4eb2-b36b-ab3b89fcebcf
99318397-f315-40be-a572-961b511856bd,active,4737de0d-5e04-4c91-80b1-3b934d9e9a6c,subscription,feedback,2020-10-13 20:56:37.059000+00:00,28d37bc4-c624-4064-b610-e38ecf814e30
99318397-f315-40be-a572-961b511856bd,active,b449f4a6-4249-459a-85ab-139f015429ce,subscription,manual_labeling,2020-10-13 20:54:34.088000+00:00,94cc0bac-697f-4e9d-918a-6b2c2cd0fab2
99318397-f315-40be-a572-961b511856bd,active,4737de0d-5e04-4c91-80b1-3b934d9e9a6c,subscription,training,2020-10-13 20:56:12.642000+00:00,60e51703-ab82-4428-afbb-5b61789e469a
99318397-f315-40be-a572-961b511856bd,active,b449f4a6-4249-459a-85ab-139f015429ce,subscription,payload_logging,2020-10-13 20:54:33.989000+00:00,e81d262b-1379-40e9-9a4f-161987a0898f
99318397-f315-40be-a572-961b511856bd,active,b449f4a6-4249-459a-85ab-139f015429ce,subscription,feedback,2020-10-13 20:54:59.381000+00:00,3377a80a-8f36-4c69-9dce-dc6a1d256539
99318397-f315-40be-a572-961b511856bd,active,b449f4a6-4249-459a-85ab-139f015429ce,subscription,training,2020-10-13 20:54:34.171000+00:00,aa1b8ae8-973c-4dfc-8199-586eb1d1eaa8


Note: First 10 records were displayed.


In [50]:
wos_client.subscriptions.show()

0,1,2,3,4,5,6,7,8
44daffcc-6310-417c-9911-07c65a4adb2a,JRT_WOSTest1021_WOSNotebook_Model,99318397-f315-40be-a572-961b511856bd,8a6a243d-9622-42bb-9021-1ba3e7791012,JRT_WOSTest1021_WOSNotebook_Deployment,f010caca-81d0-4345-abc6-a43628ead294,active,2020-10-21 16:21:27.179000+00:00,26415665-819e-4217-a105-dafe3c8e9c5d
191a125b-ef98-48d9-ba9c-2da7a25ee789,,99318397-f315-40be-a572-961b511856bd,d9abdbb4-cf7a-4783-abf8-57635c54c787,GermanCreditRiskModel,303da567-f37d-45e5-8188-97a430ef44f7,active,2020-10-13 20:56:10.279000+00:00,4737de0d-5e04-4c91-80b1-3b934d9e9a6c
fc979971-7d8e-438b-87e1-433a3bfbb3c9,,99318397-f315-40be-a572-961b511856bd,057913ba-62a8-477a-8fc7-1644fdb846b0,GermanCreditRiskModelPreProd,4cd2b18e-3d4c-4cf2-a12c-6c1b7dbb4d64,active,2020-10-13 20:54:33.174000+00:00,b449f4a6-4249-459a-85ab-139f015429ce
c55713e8-a62a-458a-815d-80e1059bdf28,,99318397-f315-40be-a572-961b511856bd,53429ba1-b729-4f8e-a17e-f8f31094d33b,GermanCreditRiskModelChallenger,4cd2b18e-3d4c-4cf2-a12c-6c1b7dbb4d64,active,2020-10-13 20:53:05.201000+00:00,ae64e4a6-e2fa-4cfd-b477-6df4219383fc


### 3.3.4 Score the model so we can configure monitors

Now that the WML service has been bound and the subscription has been created, we need to send a request to the model before we configure OpenScale. This allows OpenScale to create a payload log in the datamart with the correct schema, so it can capture data coming into and out of the model. First, the code gets the model deployment's endpoint URL, and then sends a few records for predictions.

In [51]:
fields = ["CheckingStatus","LoanDuration","CreditHistory","LoanPurpose","LoanAmount","ExistingSavings","EmploymentDuration",
          "InstallmentPercent","Sex","OthersOnLoan","CurrentResidenceDuration","OwnsProperty","Age","InstallmentPlans",
          "Housing","ExistingCreditsCount","Job","Dependents","Telephone","ForeignWorker"]
values = [
  ["no_checking",13,"credits_paid_to_date","car_new",1343,"100_to_500","1_to_4",2,"female","none",3,"savings_insurance",46,"none","own",2,"skilled",1,"none","yes"],
  ["no_checking",24,"prior_payments_delayed","furniture",4567,"500_to_1000","1_to_4",4,"male","none",4,"savings_insurance",36,"none","free",2,"management_self-employed",1,"none","yes"],
  ["0_to_200",26,"all_credits_paid_back","car_new",863,"less_100","less_1",2,"female","co-applicant",2,"real_estate",38,"none","own",1,"skilled",1,"none","yes"],
  ["0_to_200",14,"no_credits","car_new",2368,"less_100","1_to_4",3,"female","none",3,"real_estate",29,"none","own",1,"skilled",1,"none","yes"],
  ["0_to_200",4,"no_credits","car_new",250,"less_100","unemployed",2,"female","none",3,"real_estate",23,"none","rent",1,"management_self-employed",1,"none","yes"],
  ["no_checking",17,"credits_paid_to_date","car_new",832,"100_to_500","1_to_4",2,"male","none",2,"real_estate",42,"none","own",1,"skilled",1,"none","yes"],
  ["no_checking",33,"outstanding_credit","appliances",5696,"unknown","greater_7",4,"male","co-applicant",4,"unknown",54,"none","free",2,"skilled",1,"yes","yes"],
  ["0_to_200",13,"prior_payments_delayed","retraining",1375,"100_to_500","4_to_7",3,"male","none",3,"real_estate",37,"none","own",2,"management_self-employed",1,"none","yes"]
]

payload_scoring = {"input_data": [{"fields": fields, "values": values}]}
predictions = wml_client.deployments.score(deployment_uid, payload_scoring)
for pred in predictions["predictions"][0]["values"]:
    # The last item in each values array is the prediction for a Spark classification model
    print("Scoring result: {}".format(pred[-1]))

Scoring result: No Risk
Scoring result: No Risk
Scoring result: No Risk
Scoring result: No Risk
Scoring result: No Risk
Scoring result: No Risk
Scoring result: Risk
Scoring result: No Risk
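
The scoring payload pairs one list of field names with rows of values, so every row must have exactly as many entries as there are fields. The sketch below checks this before building the payload. The helper name `build_scoring_payload` is hypothetical, and the demo uses only three of the twenty fields to keep it short.

```python
def build_scoring_payload(fields, values):
    """Return a scoring payload, raising ValueError if a row does not match the field list."""
    for index, row in enumerate(values):
        if len(row) != len(fields):
            raise ValueError(
                "row {} has {} values, expected {}".format(index, len(row), len(fields))
            )
    return {"input_data": [{"fields": fields, "values": values}]}


# Three of the twenty fields, with values taken from the first two rows above
demo_fields = ["CheckingStatus", "LoanDuration", "LoanAmount"]
demo_values = [["no_checking", 13, 1343], ["no_checking", 24, 4567]]
demo_payload = build_scoring_payload(demo_fields, demo_values)

# A row with a missing value is rejected before any request is sent
try:
    build_scoring_payload(demo_fields, [["no_checking", 13]])
    error_message = None
except ValueError as err:
    error_message = str(err)
print(error_message)
```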


#### 3.3.4.1 Validate Payload Logging

Check that the scoring requests were automatically stored in the payload logging table. If the table does not have the expected number of records, we can store the records manually.

The number of records in the payload logging table should equal the number of records submitted in the previous cell (i.e., 8).

In [52]:
import uuid
from ibm_watson_openscale.supporting_classes.payload_record import PayloadRecord

time.sleep(5)
pl_records_count = wos_client.data_sets.get_records_count(payload_data_set_id)
print("Number of records in the payload logging table: {}".format(pl_records_count))

Number of records in the payload logging table: 8


In [53]:
# If automatic logging did not happen, log the records manually
if pl_records_count == 0:
    print("Payload logging did not happen, performing explicit payload logging.")
    wos_client.data_sets.store_records(data_set_id=payload_data_set_id, request_body=[PayloadRecord(
                   scoring_id=str(uuid.uuid4()),
                   request=payload_scoring,
                   response=predictions,
                   response_time=460
               )])
    time.sleep(5)
    pl_records_count = wos_client.data_sets.get_records_count(payload_data_set_id)
    print("Number of records in the payload logging table: {}".format(pl_records_count))
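
The cells above wait a fixed five seconds before counting records, which may be too short on a slow connection. One alternative is to poll until the expected count appears or a timeout elapses. The sketch below is a generic helper, not part of the OpenScale client. The name `wait_for_count` and its defaults are assumptions, and the counter used in the demo is a stand-in for the real call.

```python
import time


def wait_for_count(get_count, expected, timeout=60.0, interval=2.0, sleep=time.sleep):
    """Poll get_count() until it returns at least `expected` or the timeout elapses.

    Returns the last count observed, which may still be below `expected`.
    """
    deadline = time.monotonic() + timeout
    count = get_count()
    while count < expected and time.monotonic() < deadline:
        sleep(interval)
        count = get_count()
    return count


# Stand-in for the records-count call: reports 0, 0, then 8.
_observed = iter([0, 0, 8])
final_count = wait_for_count(lambda: next(_observed), expected=8, interval=0.01)

# A source that never fills up returns its last count once the timeout passes.
timed_out_count = wait_for_count(lambda: 0, expected=1, timeout=0.05, interval=0.01)
print(final_count, timed_out_count)
```

In the notebook, `get_count` could be `lambda: wos_client.data_sets.get_records_count(payload_data_set_id)`.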

# 4.0 Quality monitoring and feedback logging <a name="quality"></a>

In [54]:
wos_client.monitor_instances.show()

0,1,2,3,4,5,6
99318397-f315-40be-a572-961b511856bd,active,4737de0d-5e04-4c91-80b1-3b934d9e9a6c,subscription,fairness,2020-10-13 20:56:30.974000+00:00,291b3734-a491-4bbb-a36a-641d7b47e3d4
99318397-f315-40be-a572-961b511856bd,active,4737de0d-5e04-4c91-80b1-3b934d9e9a6c,subscription,drift,2020-10-13 20:56:42.555000+00:00,9104b117-7ebe-4882-b55a-1daf08ebf86d
99318397-f315-40be-a572-961b511856bd,active,c28c505d-3db2-46ce-94e6-52e8acac36f2,instance,performance,2020-10-20 22:57:20.236000+00:00,e6ddec48-986f-4faf-86ac-31cd0054d5b5
99318397-f315-40be-a572-961b511856bd,active,4737de0d-5e04-4c91-80b1-3b934d9e9a6c,subscription,mrm,2020-10-13 20:56:47.963000+00:00,0e34a447-2f10-4941-bbb2-1fddaa22e1a7
99318397-f315-40be-a572-961b511856bd,active,4737de0d-5e04-4c91-80b1-3b934d9e9a6c,subscription,quality,2020-10-13 20:56:36.510000+00:00,50f05fbf-ebc3-4cc4-9ebe-52b6d1f1ceba
99318397-f315-40be-a572-961b511856bd,active,4737de0d-5e04-4c91-80b1-3b934d9e9a6c,subscription,explainability,2020-10-13 20:56:25.514000+00:00,d8352727-d28a-44cc-9b8e-70c4181c4ecb
99318397-f315-40be-a572-961b511856bd,active,b449f4a6-4249-459a-85ab-139f015429ce,subscription,fairness,2020-10-13 20:54:53.345000+00:00,d0fa9fa2-d812-4e11-947e-13c2f7febaf8
99318397-f315-40be-a572-961b511856bd,active,b449f4a6-4249-459a-85ab-139f015429ce,subscription,drift,2020-10-13 20:55:04.790000+00:00,36be10bd-ebe1-460d-8d46-15572f944532
99318397-f315-40be-a572-961b511856bd,active,b449f4a6-4249-459a-85ab-139f015429ce,subscription,mrm,2020-10-13 20:55:10.242000+00:00,8286d5db-33c2-446e-a3c8-cf9878fe6c78
99318397-f315-40be-a572-961b511856bd,active,b449f4a6-4249-459a-85ab-139f015429ce,subscription,quality,2020-10-13 20:54:58.853000+00:00,876893f2-ded3-4245-b52d-9444cd0d2db0


Note: First 10 records were displayed.


## 4.1 Enable quality monitoring

The code below waits ten seconds to allow the payload logging table to be set up before it begins enabling monitors. First, it turns on the quality (accuracy) monitor and sets an alert threshold of 70%. OpenScale will show an alert on the dashboard if the model accuracy measurement (area under the curve, in the case of a binary classifier) falls below this threshold.

The `min_feedback_data_size` parameter specifies the minimum number of feedback records OpenScale needs before it calculates a new measurement. The quality monitor runs hourly, but the accuracy reading in the dashboard will not change until an additional 50 feedback records have been added, via the user interface, the Python client, or the supplied feedback endpoint.
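
To make the interaction between the 0.7 lower limit and the 50-record minimum concrete, here is a simplified model of the behaviour described above. It is an illustration only, not OpenScale's implementation. The function name `quality_check` and the input values are hypothetical.

```python
def quality_check(area_under_roc, new_feedback_records,
                  lower_limit=0.7, min_feedback_size=50):
    """Classify one evaluation attempt as 'skipped', 'alert', or 'ok'."""
    if new_feedback_records < min_feedback_size:
        return "skipped"  # too little new feedback: the reading stays unchanged
    if area_under_roc < lower_limit:
        return "alert"    # the measurement fell below the threshold
    return "ok"


# Illustrative inputs, not measurements from the service
results = [
    quality_check(0.82, 30),  # not enough feedback yet
    quality_check(0.82, 60),  # enough feedback, above the limit
    quality_check(0.65, 60),  # enough feedback, below the limit
]
print(results)
```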

In [55]:
time.sleep(10)
target = Target(
        target_type=TargetTypes.SUBSCRIPTION,
        target_id=subscription_id
)

parameters = {
    "min_feedback_data_size": 50
}

thresholds = [
    MetricThresholdOverride(
        metric_id="area_under_roc",
        type=MetricThresholdTypes.LOWER_LIMIT,
        value=0.7
    )
]

quality_monitor_details = wos_client.monitor_instances.create(
    data_mart_id=data_mart_id,
    background_mode=False,
    monitor_definition_id=wos_client.monitor_definitions.MONITORS.QUALITY.ID,
    target=target,
    parameters=parameters,
    thresholds=thresholds
).result

quality_monitor_instance_id = quality_monitor_details.metadata.id
print("Quality Monitor ID: {}".format(quality_monitor_instance_id))




 Waiting for end of monitor instance creation 5fe2e258-6112-4445-89e4-c74ac96c1922 




active

---------------------------------------
 Monitor instance successfully created 
---------------------------------------


Quality Monitor ID: 5fe2e258-6112-4445-89e4-c74ac96c1922


## 4.2 Feedback logging

The code below downloads and stores enough feedback data to meet the minimum threshold so that OpenScale can calculate a new accuracy measurement. It then kicks off the accuracy monitor. The monitors run hourly, or can be initiated via the Python API, the REST API, or the graphical user interface.

In [56]:
!rm additional_feedback_data_v2.json
with io.capture_output() as captured:
    !wget https://raw.githubusercontent.com/pmservice/ai-openscale-tutorials/master/assets/historical_data/german_credit_risk/wml/additional_feedback_data_v2.json

!ls -lh additional_feedback_data_v2.json

rm: cannot remove 'additional_feedback_data_v2.json': No such file or directory
-rw-r----- 1 wsuser watsonstudio 50K Oct 21 16:22 additional_feedback_data_v2.json


In [57]:
feedback_dataset_id = None
feedback_dataset = wos_client.data_sets.list(type=DataSetTypes.FEEDBACK, 
                                                target_target_id=subscription_id, 
                                                target_target_type=TargetTypes.SUBSCRIPTION).result

# Guard against an empty list, which would otherwise raise IndexError on [0]
if feedback_dataset.data_sets:
    feedback_dataset_id = feedback_dataset.data_sets[0].metadata.id
if feedback_dataset_id is None:
    print("Feedback data set not found. Please check quality monitor status.")

In [58]:
with open('additional_feedback_data_v2.json') as feedback_file:
    additional_feedback_data = json.load(feedback_file)

In [59]:
wos_client.data_sets.get_records_count(data_set_id=feedback_dataset_id)

0

In [60]:
wos_client.data_sets.store_records(feedback_dataset_id, request_body=additional_feedback_data, background_mode=False)




 Waiting for end of storing records with request id: 265ca40d-d006-4e05-8ad4-81788b00ed5a 




active

---------------------------------------
 Successfully finished storing records 
---------------------------------------




<ibm_cloud_sdk_core.detailed_response.DetailedResponse at 0x7f819eb7ea10>

In [61]:
wos_client.data_sets.get_records_count(data_set_id=feedback_dataset_id)

98

## 4.3 Run Quality Monitor

In [62]:
run_details = wos_client.monitor_instances.run(monitor_instance_id=quality_monitor_instance_id, background_mode=False).result




 Waiting for end of monitoring run 79c0772f-e766-4587-b83d-e3c46af30813 




running..
finished

---------------------------
 Successfully finished run 
---------------------------




In [63]:
wos_client.monitor_instances.show_metrics(monitor_instance_id=quality_monitor_instance_id)

0,1,2,3,4,5,6,7,8,9,10,11
2020-10-21 16:22:54.168000+00:00,true_positive_rate,1ffaedb2-6872-413c-b2c8-2932d81a325f,0.3939393939393939,,,['model_type:original'],quality,5fe2e258-6112-4445-89e4-c74ac96c1922,79c0772f-e766-4587-b83d-e3c46af30813,subscription,26415665-819e-4217-a105-dafe3c8e9c5d
2020-10-21 16:22:54.168000+00:00,area_under_roc,1ffaedb2-6872-413c-b2c8-2932d81a325f,0.6662004662004662,0.7,,['model_type:original'],quality,5fe2e258-6112-4445-89e4-c74ac96c1922,79c0772f-e766-4587-b83d-e3c46af30813,subscription,26415665-819e-4217-a105-dafe3c8e9c5d
2020-10-21 16:22:54.168000+00:00,precision,1ffaedb2-6872-413c-b2c8-2932d81a325f,0.7647058823529411,,,['model_type:original'],quality,5fe2e258-6112-4445-89e4-c74ac96c1922,79c0772f-e766-4587-b83d-e3c46af30813,subscription,26415665-819e-4217-a105-dafe3c8e9c5d
2020-10-21 16:22:54.168000+00:00,f1_measure,1ffaedb2-6872-413c-b2c8-2932d81a325f,0.5199999999999999,,,['model_type:original'],quality,5fe2e258-6112-4445-89e4-c74ac96c1922,79c0772f-e766-4587-b83d-e3c46af30813,subscription,26415665-819e-4217-a105-dafe3c8e9c5d
2020-10-21 16:22:54.168000+00:00,accuracy,1ffaedb2-6872-413c-b2c8-2932d81a325f,0.7551020408163265,,,['model_type:original'],quality,5fe2e258-6112-4445-89e4-c74ac96c1922,79c0772f-e766-4587-b83d-e3c46af30813,subscription,26415665-819e-4217-a105-dafe3c8e9c5d
2020-10-21 16:22:54.168000+00:00,log_loss,1ffaedb2-6872-413c-b2c8-2932d81a325f,0.4527625127690075,,,['model_type:original'],quality,5fe2e258-6112-4445-89e4-c74ac96c1922,79c0772f-e766-4587-b83d-e3c46af30813,subscription,26415665-819e-4217-a105-dafe3c8e9c5d
2020-10-21 16:22:54.168000+00:00,false_positive_rate,1ffaedb2-6872-413c-b2c8-2932d81a325f,0.0615384615384615,,,['model_type:original'],quality,5fe2e258-6112-4445-89e4-c74ac96c1922,79c0772f-e766-4587-b83d-e3c46af30813,subscription,26415665-819e-4217-a105-dafe3c8e9c5d
2020-10-21 16:22:54.168000+00:00,area_under_pr,1ffaedb2-6872-413c-b2c8-2932d81a325f,0.6350176434210048,,,['model_type:original'],quality,5fe2e258-6112-4445-89e4-c74ac96c1922,79c0772f-e766-4587-b83d-e3c46af30813,subscription,26415665-819e-4217-a105-dafe3c8e9c5d
2020-10-21 16:22:54.168000+00:00,recall,1ffaedb2-6872-413c-b2c8-2932d81a325f,0.3939393939393939,,,['model_type:original'],quality,5fe2e258-6112-4445-89e4-c74ac96c1922,79c0772f-e766-4587-b83d-e3c46af30813,subscription,26415665-819e-4217-a105-dafe3c8e9c5d
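
As a sanity check on the metrics above, the reported values are consistent with a confusion matrix of 13 true positives, 4 false positives, 20 false negatives, and 61 true negatives, which sums to the 98 feedback records stored earlier. These counts are inferred from the reported ratios and are not returned by the service. The sketch below recomputes the metrics from them using the standard definitions.

```python
from fractions import Fraction

# Counts inferred from the reported ratios (an assumption, not service output)
tp, fp, fn, tn = 13, 4, 20, 61
total = tp + fp + fn + tn

precision = Fraction(tp, tp + fp)
recall = Fraction(tp, tp + fn)          # also the true positive rate
false_positive_rate = Fraction(fp, fp + tn)
accuracy = Fraction(tp + tn, total)
f1_measure = 2 * precision * recall / (precision + recall)

for name, value in [
    ("precision", precision),
    ("recall", recall),
    ("false_positive_rate", false_positive_rate),
    ("accuracy", accuracy),
    ("f1_measure", f1_measure),
]:
    print("{:<20} {:.4f}".format(name, float(value)))
```

The low recall next to a relatively high precision shows that the model misses most of the positive class in this feedback set, even though overall accuracy looks acceptable.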


## 4.4 Check Monitors

We can list the monitors that are currently enabled. At this point, the only monitor for the new subscription should be the quality monitor.

In [64]:
wos_client.monitor_instances.show()
wos_client.data_sets.show()

0,1,2,3,4,5,6
99318397-f315-40be-a572-961b511856bd,active,26415665-819e-4217-a105-dafe3c8e9c5d,subscription,quality,2020-10-21 16:22:06.708000+00:00,5fe2e258-6112-4445-89e4-c74ac96c1922
99318397-f315-40be-a572-961b511856bd,active,4737de0d-5e04-4c91-80b1-3b934d9e9a6c,subscription,fairness,2020-10-13 20:56:30.974000+00:00,291b3734-a491-4bbb-a36a-641d7b47e3d4
99318397-f315-40be-a572-961b511856bd,active,4737de0d-5e04-4c91-80b1-3b934d9e9a6c,subscription,drift,2020-10-13 20:56:42.555000+00:00,9104b117-7ebe-4882-b55a-1daf08ebf86d
99318397-f315-40be-a572-961b511856bd,active,c28c505d-3db2-46ce-94e6-52e8acac36f2,instance,performance,2020-10-20 22:57:20.236000+00:00,e6ddec48-986f-4faf-86ac-31cd0054d5b5
99318397-f315-40be-a572-961b511856bd,active,4737de0d-5e04-4c91-80b1-3b934d9e9a6c,subscription,mrm,2020-10-13 20:56:47.963000+00:00,0e34a447-2f10-4941-bbb2-1fddaa22e1a7
99318397-f315-40be-a572-961b511856bd,active,4737de0d-5e04-4c91-80b1-3b934d9e9a6c,subscription,quality,2020-10-13 20:56:36.510000+00:00,50f05fbf-ebc3-4cc4-9ebe-52b6d1f1ceba
99318397-f315-40be-a572-961b511856bd,active,4737de0d-5e04-4c91-80b1-3b934d9e9a6c,subscription,explainability,2020-10-13 20:56:25.514000+00:00,d8352727-d28a-44cc-9b8e-70c4181c4ecb
99318397-f315-40be-a572-961b511856bd,active,b449f4a6-4249-459a-85ab-139f015429ce,subscription,fairness,2020-10-13 20:54:53.345000+00:00,d0fa9fa2-d812-4e11-947e-13c2f7febaf8
99318397-f315-40be-a572-961b511856bd,active,b449f4a6-4249-459a-85ab-139f015429ce,subscription,drift,2020-10-13 20:55:04.790000+00:00,36be10bd-ebe1-460d-8d46-15572f944532
99318397-f315-40be-a572-961b511856bd,active,b449f4a6-4249-459a-85ab-139f015429ce,subscription,mrm,2020-10-13 20:55:10.242000+00:00,8286d5db-33c2-446e-a3c8-cf9878fe6c78


Note: First 10 records were displayed.


0,1,2,3,4,5,6
99318397-f315-40be-a572-961b511856bd,active,26415665-819e-4217-a105-dafe3c8e9c5d,subscription,feedback,2020-10-21 16:22:07.305000+00:00,ca792e3b-1564-459d-b552-23aa77d5dd0f
99318397-f315-40be-a572-961b511856bd,active,26415665-819e-4217-a105-dafe3c8e9c5d,subscription,payload_logging,2020-10-21 16:21:28.102000+00:00,3e4d7c26-0135-4b1e-984c-b8ad6950e638
99318397-f315-40be-a572-961b511856bd,active,26415665-819e-4217-a105-dafe3c8e9c5d,subscription,manual_labeling,2020-10-21 16:21:28.217000+00:00,681ccb78-4b6c-48fc-bdb6-2fa8bc19627a
99318397-f315-40be-a572-961b511856bd,active,4737de0d-5e04-4c91-80b1-3b934d9e9a6c,subscription,manual_labeling,2020-10-13 20:56:12.560000+00:00,66fce8c7-1579-4a5f-a5f3-10bce8a0352f
99318397-f315-40be-a572-961b511856bd,active,4737de0d-5e04-4c91-80b1-3b934d9e9a6c,subscription,payload_logging,2020-10-13 20:56:12.464000+00:00,0dcb54d4-b1f2-4eb2-b36b-ab3b89fcebcf
99318397-f315-40be-a572-961b511856bd,active,4737de0d-5e04-4c91-80b1-3b934d9e9a6c,subscription,feedback,2020-10-13 20:56:37.059000+00:00,28d37bc4-c624-4064-b610-e38ecf814e30
99318397-f315-40be-a572-961b511856bd,active,b449f4a6-4249-459a-85ab-139f015429ce,subscription,manual_labeling,2020-10-13 20:54:34.088000+00:00,94cc0bac-697f-4e9d-918a-6b2c2cd0fab2
99318397-f315-40be-a572-961b511856bd,active,4737de0d-5e04-4c91-80b1-3b934d9e9a6c,subscription,training,2020-10-13 20:56:12.642000+00:00,60e51703-ab82-4428-afbb-5b61789e469a
99318397-f315-40be-a572-961b511856bd,active,b449f4a6-4249-459a-85ab-139f015429ce,subscription,payload_logging,2020-10-13 20:54:33.989000+00:00,e81d262b-1379-40e9-9a4f-161987a0898f
99318397-f315-40be-a572-961b511856bd,active,b449f4a6-4249-459a-85ab-139f015429ce,subscription,feedback,2020-10-13 20:54:59.381000+00:00,3377a80a-8f36-4c69-9dce-dc6a1d256539


Note: First 10 records were displayed.


# 5.0 Fairness, drift monitoring and explanations <a name="fairness"></a>

The code below configures fairness monitoring for our model. It turns on monitoring for two features, Sex and Age. In each case, we must specify:
  * Which model feature to monitor
  * One or more **majority** groups, which are values of that feature that we expect to receive a higher percentage of favorable outcomes
  * One or more **minority** groups, which are values of that feature that we expect to receive a higher percentage of unfavorable outcomes
  * The threshold below which OpenScale should display an alert for the fairness measurement (in this case, 95%)

Additionally, we must specify which outcomes from the model are favourable outcomes, and which are unfavourable. We must also provide the number of records OpenScale will use to calculate the fairness score. In this case, OpenScale's fairness monitor will run hourly, but will not calculate a new fairness rating until at least 100 records have been added. Finally, to calculate fairness, OpenScale must perform some calculations on the training data, which it reads through the training data reference supplied with the subscription.
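
One widely used fairness measure is the disparate impact ratio: the rate of favourable outcomes in the minority group divided by the rate in the majority group. The sketch below assumes that measure to show how a 95% threshold would be applied. The monitor's own computation is not shown in this notebook and may differ. The function names and the sample predictions are made up for illustration.

```python
def favourable_rate(outcomes, favourable_class):
    """Fraction of outcomes that belong to the favourable class."""
    if not outcomes:
        raise ValueError("no outcomes to evaluate")
    return sum(1 for outcome in outcomes if outcome in favourable_class) / len(outcomes)


def disparate_impact(minority_outcomes, majority_outcomes, favourable_class):
    """Ratio of the minority favourable rate to the majority favourable rate."""
    majority_rate = favourable_rate(majority_outcomes, favourable_class)
    if majority_rate == 0:
        raise ValueError("majority group has no favourable outcomes")
    return favourable_rate(minority_outcomes, favourable_class) / majority_rate


# Made-up predictions for illustration only
majority = ["No Risk"] * 8 + ["Risk"] * 2
minority = ["No Risk"] * 6 + ["Risk"] * 4

score = disparate_impact(minority, majority, favourable_class=["No Risk"])
below_threshold = score < 0.95
print(round(score, 2), below_threshold)
```

Under this measure, a score of 1.0 means both groups receive favourable outcomes at the same rate, and the made-up data here would fall below the 0.95 threshold.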

## 5.1 Enable Fairness Monitoring

In [65]:
target = Target(
    target_type=TargetTypes.SUBSCRIPTION,
    target_id=subscription_id
)

parameters = {
    "features": [
        {"feature": "Sex",
         "majority": ['male'],
         "minority": ['female'],
         "threshold": 0.95
         },
        {"feature": "Age",
         "majority": [[26, 75]],
         "minority": [[18, 25]],
         "threshold": 0.95
         }
    ],
    "favourable_class": ["No Risk"],
    "unfavourable_class": ["Risk"],
    "min_records": 100
}

thresholds = [
    MetricThresholdOverride(
        metric_id="fairness_value",
        type=MetricThresholdTypes.LOWER_LIMIT,
        value=0.95,
        specific_values=[
            MetricSpecificThresholdShortObject(
                value=.95,
                applies_to = [
                    ThresholdConditionObject(
                        type = "tag",
                        value = "Sex",
                        key = "feature"
                    )
                ]
            ),
            MetricSpecificThresholdShortObject(
                value=.95,
                applies_to = [
                    ThresholdConditionObject(
                        type = "tag",
                        value = "Age",
                        key = "feature"
                    )
                ]
            )
        ]
    )
]

fairness_monitor_details = wos_client.monitor_instances.create(
    data_mart_id=data_mart_id,
    background_mode=False,
    monitor_definition_id=wos_client.monitor_definitions.MONITORS.FAIRNESS.ID,
    target=target,
    parameters=parameters,
    thresholds=thresholds
).result

fairness_monitor_instance_id = fairness_monitor_details.metadata.id
print("Fairness Monitor ID: {}".format(fairness_monitor_instance_id))
#print(fairness_monitor_details)




 Waiting for end of monitor instance creation ee3be66b-27e8-4471-903d-098873a4749e 




active

---------------------------------------
 Monitor instance successfully created 
---------------------------------------


Fairness Monitor ID: ee3be66b-27e8-4471-903d-098873a4749e


## 5.2 Enable Drift Monitoring

The `enable_model_drift` and `enable_data_drift` parameters in the configuration control whether model drift, data drift, or both are monitored.

In [66]:
monitor_instances = wos_client.monitor_instances.list().result.monitor_instances
for monitor_instance in monitor_instances:
    monitor_def_id=monitor_instance.entity.monitor_definition_id
    if monitor_def_id == "drift" and monitor_instance.entity.target.target_id == subscription_id:
        wos_client.monitor_instances.delete(monitor_instance.metadata.id, background_mode=False)
        print('Deleted existing drift monitor instance with id: ', monitor_instance.metadata.id)


In [67]:
target = Target(
    target_type=TargetTypes.SUBSCRIPTION,
    target_id=subscription_id
)
parameters = {
    "min_samples": 100,
    "drift_threshold": 0.05,
    "train_drift_model": True,
    "enable_model_drift": True,
    "enable_data_drift": True
}

drift_monitor_details = wos_client.monitor_instances.create(
    data_mart_id=data_mart_id,
    background_mode=False,
    monitor_definition_id=wos_client.monitor_definitions.MONITORS.DRIFT.ID,
    target=target,
    parameters=parameters
).result

drift_monitor_instance_id = drift_monitor_details.metadata.id
print("Drift Monitor ID: {}".format(drift_monitor_instance_id))
print(drift_monitor_details)




 Waiting for end of monitor instance creation 7d270e68-b31a-4586-ab91-fe8b31bef8a3 




preparing.....................
active

---------------------------------------
 Monitor instance successfully created 
---------------------------------------


Drift Monitor ID: 7d270e68-b31a-4586-ab91-fe8b31bef8a3
{
  "metadata": {
    "id": "7d270e68-b31a-4586-ab91-fe8b31bef8a3",
    "crn": "crn:v1:bluemix:public:aiopenscale:us-south:a/f365091ee8c74129813329068811cc54:99318397-f315-40be-a572-961b511856bd:monitor_instance:7d270e68-b31a-4586-ab91-fe8b31bef8a3",
    "url": "/v2/monitor_instances/7d270e68-b31a-4586-ab91-fe8b31bef8a3",
    "created_at": "2020-10-21T16:23:37.439000Z",
    "created_by": "IBMid-060000GRS6",
    "modified_at": "2020-10-21T16:25:32.579000Z",
    "modified_by": "iam-ServiceId-aeb7c04d-ba53-4adc-8008-8896ff6b462e"
  },
  "entity": {
    "data_mart_id": "99318397-f315-40be-a572-961b511856bd",
    "monitor_definition_id": "drift",
    "target": {
      "target_type": "subsc

## 5.3 Score the model again now that monitoring is configured

This next section randomly selects 200 records from the data feed and sends those records to the model for predictions. This is enough to exceed the minimum threshold for records set in the previous section, which allows OpenScale to begin calculating fairness.
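The sampling step can be sketched in isolation as follows. It assumes the feed uses the `fields`/`values` layout read by the scoring cell below; the miniature in-memory feed and the helper name `build_scoring_payload` are hypothetical. Note that calling `random.choice` in a loop samples with replacement, so the same record may be selected more than once, whereas `random.sample` never repeats a record.

```python
import random


def build_scoring_payload(feed, n_records, with_replacement=True, seed=None):
    """Return a scoring payload with n_records randomly chosen rows."""
    rng = random.Random(seed)
    if with_replacement:
        # Same behaviour as calling random.choice in a loop: duplicates are possible.
        values = [rng.choice(feed["values"]) for _ in range(n_records)]
    else:
        # random.sample never repeats a row, so n_records must not exceed the feed size.
        values = rng.sample(feed["values"], n_records)
    return {"input_data": [{"fields": feed["fields"], "values": values}]}


# Hypothetical miniature feed; the real file has many more columns and rows.
toy_feed = {
    "fields": ["CheckingStatus", "LoanDuration"],
    "values": [["less_0", 30], ["no_checking", 12], ["0_to_200", 24]],
}

payload = build_scoring_payload(toy_feed, 5, seed=42)
print(len(payload["input_data"][0]["values"]), "records selected")
```

Sampling with replacement is sufficient here because the goal is only to push the record count past the monitor's minimum, not to score each record exactly once.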

In [68]:
!rm german_credit_feed.json
with io.capture_output() as captured:
    !wget https://raw.githubusercontent.com/pmservice/ai-openscale-tutorials/master/assets/historical_data/german_credit_risk/wml/german_credit_feed.json -O german_credit_feed.json
        
!ls -lh german_credit_feed.json

rm: cannot remove 'german_credit_feed.json': No such file or directory
-rw-r----- 1 wsuser watsonstudio 3.0M Oct 21 16:25 german_credit_feed.json


Score 200 randomly chosen records

In [69]:
import random

with open('german_credit_feed.json', 'r') as scoring_file:
    scoring_data = json.load(scoring_file)

fields = scoring_data['fields']
values = []
for _ in range(200):
    values.append(random.choice(scoring_data['values']))
payload_scoring = {"input_data": [{"fields": fields, "values": values}]}
scoring_response = wml_client.deployments.score(deployment_uid, payload_scoring)
print("Single record scoring result:", scoring_response["predictions"][0]["values"][0][-1])

Single record scoring result: No Risk


In [70]:
time.sleep(5)

# Refresh the count first; the value held over from the earlier cell is stale.
pl_records_count = wos_client.data_sets.get_records_count(payload_data_set_id)

# Manually log the scoring response if the table still holds only the original 8 records (it should hold 208 at this point).
if pl_records_count == 8:
    print("Payload logging did not happen, performing explicit payload logging.")
    wos_client.data_sets.store_records(data_set_id=payload_data_set_id, request_body=[PayloadRecord(
                   scoring_id=str(uuid.uuid4()),
                   request=payload_scoring,
                   response=scoring_response,
                   response_time=460
               )])
    pl_records_count = wos_client.data_sets.get_records_count(payload_data_set_id)

print("Number of records in the payload logging table: {}".format(pl_records_count))

Payload logging did not happen, performing explicit payload logging.
Number of records in the payload logging table: 208


>Note: The payload table should have a total of 208 records. However, this can vary depending on how many scoring requests have been sent to the model.

## 5.4 Run Monitors


### 5.4.1 Fairness Monitor

Kick off a fairness monitor run on current data. The monitor runs hourly, but it can also be started manually through the Python client, the REST API, or the graphical user interface. The cell sleeps for 10 seconds first so that the scoring of the 200 payloads above can complete. NOTE: if the cell below finishes with errors, skip it, complete the rest of the notebook, then return and try again.
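As an alternative to rerunning the cell by hand, the retry advice above can be automated. The sketch below is a generic retry helper, not part of the OpenScale client; `run_with_retries` and the `flaky_run` stand-in are hypothetical names used only for illustration.

```python
import time


def run_with_retries(action, attempts=3, delay_seconds=10, sleep=time.sleep):
    """Call action() until it succeeds or the attempts are exhausted."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as error:  # broad on purpose, like the cell below
            last_error = error
            print("Attempt {} failed: {}".format(attempt, error))
            if attempt < attempts:
                sleep(delay_seconds)
    raise last_error


# Stand-in for a monitor run that fails twice before succeeding (hypothetical).
calls = {"count": 0}


def flaky_run():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("records not available yet")
    return "finished"


result = run_with_retries(flaky_run, attempts=3, delay_seconds=0)
print(result)
```

In the notebook, `action` could be a lambda wrapping the `wos_client.monitor_instances.run(...)` call used in the next cell.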

In [71]:
time.sleep(10)
run_details = None
try:
    run_details = wos_client.monitor_instances.run(monitor_instance_id=fairness_monitor_instance_id, background_mode=False)
except Exception as e:
    print("An exception occurred: {0}".format(e))




 Waiting for end of monitoring run cef89b9a-f22a-437b-8744-c4176bd55f7f 




running..
finished

---------------------------
 Successfully finished run 
---------------------------




In [72]:
time.sleep(10)
wos_client.monitor_instances.show_metrics(monitor_instance_id=fairness_monitor_instance_id)

0,1,2,3,4,5,6,7,8,9,10,11
2020-10-21 16:26:13.731327+00:00,fairness_value,7140b349-feaa-4c30-8029-a69afed19221,96.4,95.0,,"['feature:Sex', 'fairness_metric_type:debiased_fairness', 'feature_value:female']",fairness,ee3be66b-27e8-4471-903d-098873a4749e,cef89b9a-f22a-437b-8744-c4176bd55f7f,subscription,26415665-819e-4217-a105-dafe3c8e9c5d
2020-10-21 16:26:13.731327+00:00,fairness_value,7140b349-feaa-4c30-8029-a69afed19221,102.4,95.0,,"['feature:Age', 'fairness_metric_type:debiased_fairness', 'feature_value:18-25']",fairness,ee3be66b-27e8-4471-903d-098873a4749e,cef89b9a-f22a-437b-8744-c4176bd55f7f,subscription,26415665-819e-4217-a105-dafe3c8e9c5d
2020-10-21 16:26:13.731327+00:00,fairness_value,f17832d6-c7de-42f6-a23d-9c7c6d51cd49,96.4,95.0,,"['feature:Sex', 'fairness_metric_type:fairness', 'feature_value:female']",fairness,ee3be66b-27e8-4471-903d-098873a4749e,cef89b9a-f22a-437b-8744-c4176bd55f7f,subscription,26415665-819e-4217-a105-dafe3c8e9c5d
2020-10-21 16:26:13.731327+00:00,fairness_value,f17832d6-c7de-42f6-a23d-9c7c6d51cd49,102.4,95.0,,"['feature:Age', 'fairness_metric_type:fairness', 'feature_value:18-25']",fairness,ee3be66b-27e8-4471-903d-098873a4749e,cef89b9a-f22a-437b-8744-c4176bd55f7f,subscription,26415665-819e-4217-a105-dafe3c8e9c5d


### 5.4.2 Drift Monitor

Kick off a drift monitor run on current data. The monitor runs every hour, but it can also be started manually using the Python client or the REST API.

In [73]:
time.sleep(5)
drift_run_details = None
try:
    drift_run_details = wos_client.monitor_instances.run(monitor_instance_id=drift_monitor_instance_id, background_mode=False)
except Exception as e:
    print("An exception occurred: {0}".format(e))




 Waiting for end of monitoring run 361773f9-bca4-484c-b39e-c7eec79fef69 




running
finished

---------------------------
 Successfully finished run 
---------------------------




In [74]:
time.sleep(5)
wos_client.monitor_instances.show_metrics(monitor_instance_id=drift_monitor_instance_id)

0,1,2,3,4,5,6,7,8,9,10,11
2020-10-21 16:26:57.649066+00:00,data_drift_magnitude,c84a1203-a354-4b68-97ee-1d553dae90ac,0.0637254901960784,,,[],drift,7d270e68-b31a-4586-ab91-fe8b31bef8a3,361773f9-bca4-484c-b39e-c7eec79fef69,subscription,26415665-819e-4217-a105-dafe3c8e9c5d
2020-10-21 16:26:57.649066+00:00,drift_magnitude,c84a1203-a354-4b68-97ee-1d553dae90ac,0.0545294117647058,,0.05,[],drift,7d270e68-b31a-4586-ab91-fe8b31bef8a3,361773f9-bca4-484c-b39e-c7eec79fef69,subscription,26415665-819e-4217-a105-dafe3c8e9c5d
2020-10-21 16:26:57.649066+00:00,predicted_accuracy,c84a1203-a354-4b68-97ee-1d553dae90ac,0.7444705882352942,,,[],drift,7d270e68-b31a-4586-ab91-fe8b31bef8a3,361773f9-bca4-484c-b39e-c7eec79fef69,subscription,26415665-819e-4217-a105-dafe3c8e9c5d
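Each metric row can be read as a value checked against an optional lower and upper limit. The sketch below applies that reading to the values printed by the two `show_metrics` cells in section 5.4. It assumes that the fourth, fifth and sixth columns of that output are the value, the lower limit and the upper limit; the columns are unlabeled, so this mapping is an assumption. The helper `check_limits` is hypothetical.

```python
def check_limits(value, lower_limit=None, upper_limit=None):
    """Return 'ok' or a short description of the violated limit."""
    if lower_limit is not None and value < lower_limit:
        return "below lower limit"
    if upper_limit is not None and value > upper_limit:
        return "above upper limit"
    return "ok"


# (metric, feature tag, value, lower limit, upper limit), copied from the outputs above.
rows = [
    ("fairness_value", "Sex", 96.4, 95.0, None),
    ("fairness_value", "Age", 102.4, 95.0, None),
    ("drift_magnitude", "", 0.0545294117647058, None, 0.05),
    ("predicted_accuracy", "", 0.7444705882352942, None, None),
]

for metric, tag, value, lower, upper in rows:
    print(metric, tag, check_limits(value, lower, upper))
```

Under that reading, both fairness values sit above their 95% lower limit, while the drift magnitude exceeds the 0.05 limit configured through `drift_threshold`.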


## 5.5 Configure Explainability

Finally, we provide OpenScale with the training data to enable and configure the explainability features.

In [75]:
target = Target(
    target_type=TargetTypes.SUBSCRIPTION,
    target_id=subscription_id
)
parameters = {
    "enabled": True
}
explainability_details = wos_client.monitor_instances.create(
    data_mart_id=data_mart_id,
    background_mode=False,
    monitor_definition_id=wos_client.monitor_definitions.MONITORS.EXPLAINABILITY.ID,
    target=target,
    parameters=parameters
).result

explainability_monitor_id = explainability_details.metadata.id




 Waiting for end of monitor instance creation eade92f0-6c93-41ca-8538-0ff8a1c96f17 




active

---------------------------------------
 Monitor instance successfully created 
---------------------------------------




### 5.5.1 Run Explanation for Sample Record

In [76]:
pl_records_resp = wos_client.data_sets.get_list_of_records(data_set_id=payload_data_set_id, limit=1, offset=0).result
scoring_ids = [pl_records_resp["records"][0]["entity"]["values"]["scoring_id"]]
explanation_types = ["lime", "contrastive"]
print("Running explanations on scoring IDs: {}".format(scoring_ids))
post_exp_task_resp = wos_client.monitor_instances.explanation_tasks(scoring_ids=scoring_ids, explanation_types=explanation_types).result
print(post_exp_task_resp)
explanation_task_id = post_exp_task_resp.metadata.explanation_task_ids[0] #Pick up the first explanation task id from list.
print("Explanations task ID: {}".format(explanation_task_id))

Running explanations on scoring IDs: ['40587d3c-cea1-43c8-b181-a3877d3c2f56-1']
{
  "metadata": {
    "explanation_task_ids": [
      "e76b7471-43b8-43d6-b972-4150a22c9608"
    ],
    "created_by": "IBMid-060000GRS6",
    "created_at": "2020-10-21T16:27:41.123376Z"
  }
}
Explanations task ID: e76b7471-43b8-43d6-b972-4150a22c9608


In [78]:
import time

poll_limit = 5  # maximum number of 5-second waits before giving up

def poll_explainability_task(client, exp_task_id):
    """Poll the explanation task until it finishes or poll_limit waits have elapsed."""
    pcounter = 1
    while True:
        exp_task = client.monitor_instances.get_explanation_tasks(exp_task_id).result
        state = exp_task.entity.status.state
        if state == 'finished':
            return exp_task
        elif pcounter > poll_limit:
            print("Explanation task has not completed.")
            return None
        else:
            print("Explanation task not ready...")
            pcounter += 1
            time.sleep(5)

exp_resp = poll_explainability_task(wos_client, explanation_task_id)
print(exp_resp)

Explanation task not ready...
Explanation task not ready...
Explanation task not ready...
Explanation task not ready...
Explanation task not ready...
{
  "metadata": {
    "explanation_task_id": "e76b7471-43b8-43d6-b972-4150a22c9608",
    "created_by": "IBMid-060000GRS6",
    "created_at": "2020-10-21T16:27:41.123376Z",
    "updated_at": "2020-10-21T16:28:43.571553Z"
  },
  "entity": {
    "status": {
      "state": "finished"
    },
    "asset": {
      "id": "44daffcc-6310-417c-9911-07c65a4adb2a",
      "name": "JRT_WOSTest1021_WOSNotebook_Model",
      "input_data_type": "structured",
      "problem_type": "binary",
      "deployment": {
        "id": "8a6a243d-9622-42bb-9021-1ba3e7791012",
        "name": "JRT_WOSTest1021_WOSNotebook_Deployment"
      }
    },
    "input_features": [
      {
        "name": "CheckingStatus",
        "value": "less_0",
        "feature_type": "categorical"
      },
      {
        "name": "LoanDuration",
        "value": "30",
        "feature_type"

## 5.6 Check Monitors

We can show which monitors are currently enabled. At this point, these should be the Quality, Fairness, Drift and Explainability monitors.

In [79]:
wos_client.monitor_instances.show()
wos_client.data_sets.show()

0,1,2,3,4,5,6
99318397-f315-40be-a572-961b511856bd,active,26415665-819e-4217-a105-dafe3c8e9c5d,subscription,explainability,2020-10-21 16:27:32.991000+00:00,eade92f0-6c93-41ca-8538-0ff8a1c96f17
99318397-f315-40be-a572-961b511856bd,active,26415665-819e-4217-a105-dafe3c8e9c5d,subscription,drift,2020-10-21 16:23:37.439000+00:00,7d270e68-b31a-4586-ab91-fe8b31bef8a3
99318397-f315-40be-a572-961b511856bd,active,26415665-819e-4217-a105-dafe3c8e9c5d,subscription,fairness,2020-10-21 16:23:26.220000+00:00,ee3be66b-27e8-4471-903d-098873a4749e
99318397-f315-40be-a572-961b511856bd,active,26415665-819e-4217-a105-dafe3c8e9c5d,subscription,quality,2020-10-21 16:22:06.708000+00:00,5fe2e258-6112-4445-89e4-c74ac96c1922
99318397-f315-40be-a572-961b511856bd,active,4737de0d-5e04-4c91-80b1-3b934d9e9a6c,subscription,fairness,2020-10-13 20:56:30.974000+00:00,291b3734-a491-4bbb-a36a-641d7b47e3d4
99318397-f315-40be-a572-961b511856bd,active,4737de0d-5e04-4c91-80b1-3b934d9e9a6c,subscription,drift,2020-10-13 20:56:42.555000+00:00,9104b117-7ebe-4882-b55a-1daf08ebf86d
99318397-f315-40be-a572-961b511856bd,active,c28c505d-3db2-46ce-94e6-52e8acac36f2,instance,performance,2020-10-20 22:57:20.236000+00:00,e6ddec48-986f-4faf-86ac-31cd0054d5b5
99318397-f315-40be-a572-961b511856bd,active,4737de0d-5e04-4c91-80b1-3b934d9e9a6c,subscription,mrm,2020-10-13 20:56:47.963000+00:00,0e34a447-2f10-4941-bbb2-1fddaa22e1a7
99318397-f315-40be-a572-961b511856bd,active,4737de0d-5e04-4c91-80b1-3b934d9e9a6c,subscription,quality,2020-10-13 20:56:36.510000+00:00,50f05fbf-ebc3-4cc4-9ebe-52b6d1f1ceba
99318397-f315-40be-a572-961b511856bd,active,4737de0d-5e04-4c91-80b1-3b934d9e9a6c,subscription,explainability,2020-10-13 20:56:25.514000+00:00,d8352727-d28a-44cc-9b8e-70c4181c4ecb


Note: First 10 records were displayed.


0,1,2,3,4,5,6
99318397-f315-40be-a572-961b511856bd,active,26415665-819e-4217-a105-dafe3c8e9c5d,subscription,payload_logging,2020-10-21 16:21:28.102000+00:00,3e4d7c26-0135-4b1e-984c-b8ad6950e638
99318397-f315-40be-a572-961b511856bd,active,26415665-819e-4217-a105-dafe3c8e9c5d,subscription,manual_labeling,2020-10-21 16:21:28.217000+00:00,681ccb78-4b6c-48fc-bdb6-2fa8bc19627a
99318397-f315-40be-a572-961b511856bd,active,26415665-819e-4217-a105-dafe3c8e9c5d,subscription,feedback,2020-10-21 16:22:07.305000+00:00,ca792e3b-1564-459d-b552-23aa77d5dd0f
99318397-f315-40be-a572-961b511856bd,active,4737de0d-5e04-4c91-80b1-3b934d9e9a6c,subscription,manual_labeling,2020-10-13 20:56:12.560000+00:00,66fce8c7-1579-4a5f-a5f3-10bce8a0352f
99318397-f315-40be-a572-961b511856bd,active,4737de0d-5e04-4c91-80b1-3b934d9e9a6c,subscription,payload_logging,2020-10-13 20:56:12.464000+00:00,0dcb54d4-b1f2-4eb2-b36b-ab3b89fcebcf
99318397-f315-40be-a572-961b511856bd,active,4737de0d-5e04-4c91-80b1-3b934d9e9a6c,subscription,feedback,2020-10-13 20:56:37.059000+00:00,28d37bc4-c624-4064-b610-e38ecf814e30
99318397-f315-40be-a572-961b511856bd,active,b449f4a6-4249-459a-85ab-139f015429ce,subscription,manual_labeling,2020-10-13 20:54:34.088000+00:00,94cc0bac-697f-4e9d-918a-6b2c2cd0fab2
99318397-f315-40be-a572-961b511856bd,active,4737de0d-5e04-4c91-80b1-3b934d9e9a6c,subscription,training,2020-10-13 20:56:12.642000+00:00,60e51703-ab82-4428-afbb-5b61789e469a
99318397-f315-40be-a572-961b511856bd,active,b449f4a6-4249-459a-85ab-139f015429ce,subscription,payload_logging,2020-10-13 20:54:33.989000+00:00,e81d262b-1379-40e9-9a4f-161987a0898f
99318397-f315-40be-a572-961b511856bd,active,b449f4a6-4249-459a-85ab-139f015429ce,subscription,feedback,2020-10-13 20:54:59.381000+00:00,3377a80a-8f36-4c69-9dce-dc6a1d256539


Note: First 10 records were displayed.


# 6.0 Custom monitors and metrics <a name="custom"></a>

## 6.1 Register custom monitor

In [80]:
def get_definition(monitor_name):
    monitor_definitions = wos_client.monitor_definitions.list().result.monitor_definitions
    
    for definition in monitor_definitions:
        if monitor_name == definition.entity.name:
            return definition
    
    return None

In [81]:
monitor_name = CUSTOM_NAME + '_WOSNotebook_CustomMonitor'
metrics = [MonitorMetricRequest(name='sensitivity',
                                thresholds=[MetricThreshold(type=MetricThresholdTypes.LOWER_LIMIT, default=0.8)]),
          MonitorMetricRequest(name='specificity',
                                thresholds=[MetricThreshold(type=MetricThresholdTypes.LOWER_LIMIT, default=0.75)])]
tags = [MonitorTagRequest(name='region', description='customer geographical region')]

existing_definition = get_definition(monitor_name)

if existing_definition is None:
    custom_monitor_details = wos_client.monitor_definitions.add(name=monitor_name, metrics=metrics, tags=tags, background_mode=False).result
else:
    custom_monitor_details = existing_definition




 Waiting for end of adding monitor definition jrt_wostest1021_wosnotebook_custommonitor 




finished

-------------------------------------------------
 Successfully finished adding monitor definition 
-------------------------------------------------




In [82]:
wos_client.monitor_definitions.show()

0,1,2
jrt_wostest1021_wosnotebook_custommonitor,JRT_WOSTest1021_WOSNotebook_CustomMonitor,"['sensitivity', 'specificity']"
my_model_performance,My model performance,"['sensitivity', 'specificity']"
assurance,Assurance,"['Uncertainty', 'Confidence']"
fairness,Fairness,"['Fairness value', 'Average Odds Difference metric value', 'False Discovery Rate Difference metric value', 'Error Rate Difference metric value', 'False Negative Rate Difference metric value', 'False Omission Rate Difference metric value', 'False Positive Rate Difference metric value', 'True Positive Rate Difference metric value']"
performance,Performance,['Number of records']
explainability,Explainability,[]
mrm,Model risk management monitoring,"['Tests run', 'Tests passed', 'Tests failed', 'Tests skipped', 'Fairness score', 'Quality score', 'Drift score']"
correlations,Correlations,"['Maximum positive correlation coefficient', 'Maximum negative correlation coefficient', 'Mean absolute correlation coefficient', 'Significant correlation coefficients count']"
drift,Drift,"['Drop in accuracy', 'Predicted accuracy', 'Drop in data consistency']"
quality,Quality,"['Area under ROC', 'Area under PR', 'Proportion explained variance', 'Mean absolute error', 'Mean squared error', 'R squared', 'Root of mean squared error', 'Accuracy', 'Weighted True Positive Rate (wTPR)', 'True positive rate (TPR)', 'Weighted False Positive Rate (wFPR)', 'False positive rate (FPR)', 'Weighted recall', 'Recall', 'Weighted precision', 'Precision', 'Weighted F1-Measure', 'F1-Measure', 'Logarithmic loss']"


In [83]:
custom_monitor_id = custom_monitor_details.metadata.id
print("Custom Monitor ID: {}".format(custom_monitor_id))

Custom Monitor ID: jrt_wostest1021_wosnotebook_custommonitor


In [84]:
custom_monitor_details = wos_client.monitor_definitions.get(monitor_definition_id=custom_monitor_id).result
print("Custom monitor definition details:", custom_monitor_details)

Custom monitor definition details: {
  "metadata": {
    "id": "jrt_wostest1021_wosnotebook_custommonitor",
    "crn": "crn:v1:bluemix:public:aiopenscale:us-south:a/f365091ee8c74129813329068811cc54:99318397-f315-40be-a572-961b511856bd:monitor_definition:jrt_wostest1021_wosnotebook_custommonitor",
    "url": "/v2/monitor_definitions/jrt_wostest1021_wosnotebook_custommonitor",
    "created_at": "2020-10-21T16:29:00.762000Z",
    "created_by": "IBMid-060000GRS6"
  },
  "entity": {
    "name": "JRT_WOSTest1021_WOSNotebook_CustomMonitor",
    "metrics": [
      {
        "name": "sensitivity",
        "thresholds": [
          {
            "type": "lower_limit",
            "default": 0.8
          }
        ],
        "expected_direction": "increasing",
        "id": "sensitivity"
      },
      {
        "name": "specificity",
        "thresholds": [
          {
            "type": "lower_limit",
            "default": 0.75
          }
        ],
        "expected_direction": "increasing

## 6.2 Enable custom monitor for subscription

In [85]:
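target = Target(
    target_type=TargetTypes.SUBSCRIPTION,
    target_id=subscription_id
)

# NOTE: this override is defined but not passed to create() below, so the
# instance keeps the defaults from the monitor definition (0.8 and 0.75).
# Pass thresholds=thresholds to create() to apply it.
thresholds = [MetricThresholdOverride(metric_id='sensitivity', type=MetricThresholdTypes.LOWER_LIMIT, value=0.9)]

custom_monitor_instance_details = wos_client.monitor_instances.create(
    data_mart_id=data_mart_id,
    background_mode=False,
    monitor_definition_id=custom_monitor_id,
    target=target
).result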




 Waiting for end of monitor instance creation 87ab02b2-e7e0-4f4e-8f0a-cfbe1d5b6791 




active

---------------------------------------
 Monitor instance successfully created 
---------------------------------------




### 6.2.1 Get monitor configuration details

In [86]:
custom_monitor_instance_id = custom_monitor_instance_details.metadata.id
custom_monitor_instance_details = wos_client.monitor_instances.get(custom_monitor_instance_id).result
print(custom_monitor_instance_details)

{
  "metadata": {
    "id": "87ab02b2-e7e0-4f4e-8f0a-cfbe1d5b6791",
    "crn": "crn:v1:bluemix:public:aiopenscale:us-south:a/f365091ee8c74129813329068811cc54:99318397-f315-40be-a572-961b511856bd:monitor_instance:87ab02b2-e7e0-4f4e-8f0a-cfbe1d5b6791",
    "url": "/v2/monitor_instances/87ab02b2-e7e0-4f4e-8f0a-cfbe1d5b6791",
    "created_at": "2020-10-21T16:29:17.211000Z",
    "created_by": "IBMid-060000GRS6"
  },
  "entity": {
    "data_mart_id": "99318397-f315-40be-a572-961b511856bd",
    "monitor_definition_id": "jrt_wostest1021_wosnotebook_custommonitor",
    "target": {
      "target_type": "subscription",
      "target_id": "26415665-819e-4217-a105-dafe3c8e9c5d"
    },
    "thresholds": [
      {
        "metric_id": "sensitivity",
        "type": "lower_limit",
        "value": 0.8
      },
      {
        "metric_id": "specificity",
        "type": "lower_limit",
        "value": 0.75
      }
    ],
    "schedule": {
      "repeat_interval": 60,
      "repeat_unit": "minute",
    

## 6.3 Storing custom metrics
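The custom monitor registered in section 6.1 tracks `sensitivity` and `specificity`, and the cell below stores fixed example values for them. As a sketch of how such values could be derived, the snippet computes both from confusion-matrix counts using the standard definitions: sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP). The counts and the helper name are hypothetical.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one actual positive and one actual negative")
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return sensitivity, specificity


# Hypothetical counts, chosen so the results match the example values stored below.
sens, spec = sensitivity_specificity(tp=67, fn=33, tn=78, fp=22)
metrics = {"sensitivity": round(sens, 2), "specificity": round(spec, 2), "region": "us-south"}
print(metrics)
```

The resulting dictionary has the same shape as the entry passed in the `metrics` list of `MonitorMeasurementRequest` in the next cell.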

In [87]:
from datetime import datetime, timezone, timedelta
from ibm_watson_openscale.base_classes.watson_open_scale_v2 import MonitorMeasurementRequest
custom_monitoring_run_id = CUSTOM_NAME + '_WOSNotebook_CustomMonitorRun'
measurement_request = [MonitorMeasurementRequest(timestamp=datetime.now(timezone.utc), 
                                                 metrics=[{"specificity": 0.78, "sensitivity": 0.67, "region": "us-south"}], run_id=custom_monitoring_run_id)]
print(measurement_request[0])

{
  "timestamp": "2020-10-21T16:29:26.250432Z",
  "run_id": "JRT_WOSTest1021_WOSNotebook_CustomMonitorRun",
  "metrics": [
    {
      "specificity": 0.78,
      "sensitivity": 0.67,
      "region": "us-south"
    }
  ]
}


In [88]:
published_measurement_response = wos_client.monitor_instances.measurements.add(
    monitor_instance_id=custom_monitor_instance_id,
    monitor_measurement_request=measurement_request).result
published_measurement_id = published_measurement_response[0]["measurement_id"]
print(published_measurement_response)

[{'measurement_id': 'fa975233-9740-4e1e-b3ef-b6031e83da43', 'metrics': [{'region': 'us-south', 'sensitivity': 0.67, 'specificity': 0.78}], 'run_id': 'JRT_WOSTest1021_WOSNotebook_CustomMonitorRun', 'timestamp': '2020-10-21T16:29:26.250432Z'}]


### 6.3.1 Get custom metrics

In [89]:
time.sleep(5)
published_measurement = wos_client.monitor_instances.measurements.get(monitor_instance_id=custom_monitor_instance_id, measurement_id=published_measurement_id).result
print(published_measurement)

{
  "metadata": {
    "id": "fa975233-9740-4e1e-b3ef-b6031e83da43",
    "crn": "crn:v1:bluemix:public:aiopenscale:us-south:a/f365091ee8c74129813329068811cc54:99318397-f315-40be-a572-961b511856bd:measurement:fa975233-9740-4e1e-b3ef-b6031e83da43",
    "url": "/v2/monitor_instances/87ab02b2-e7e0-4f4e-8f0a-cfbe1d5b6791/measurements/fa975233-9740-4e1e-b3ef-b6031e83da43",
    "created_at": "2020-10-21T16:29:28.066000Z",
    "created_by": "IBMid-060000GRS6"
  },
  "entity": {
    "timestamp": "2020-10-21T16:29:26.250432Z",
    "run_id": "JRT_WOSTest1021_WOSNotebook_CustomMonitorRun",
    "values": [
      {
        "metrics": [
          {
            "id": "sensitivity",
            "value": 0.67,
            "lower_limit": 0.8
          },
          {
            "id": "specificity",
            "value": 0.78,
            "lower_limit": 0.75
          }
        ],
        "tags": [
          {
            "id": "region",
            "value": "us-south"
          }
        ]
      }
    ],
 

# 7.0 Historical data <a name="historical"></a>

In [90]:
historyDays = 7
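Each cell in this section backdates one measurement per hour across `historyDays` days. The timestamp arithmetic they share can be sketched on its own as follows; the helper name `backdated_timestamps` and the fixed reference time are hypothetical, chosen so that the result is reproducible.

```python
from datetime import datetime, timedelta, timezone


def backdated_timestamps(days, now=None):
    """Return one timezone-aware UTC timestamp per hour, newest first."""
    now = now or datetime.now(timezone.utc)
    stamps = []
    for day in range(days):
        for hour in range(24):
            # Same offset as the cells below: 1 hour ago, 2 hours ago, and so on.
            stamps.append(now - timedelta(hours=24 * day + hour + 1))
    return stamps


reference = datetime(2020, 10, 21, 16, 30, tzinfo=timezone.utc)  # fixed for the example
stamps = backdated_timestamps(7, now=reference)
print(len(stamps), "timestamps from", stamps[-1].isoformat(), "to", stamps[0].isoformat())
```

Seven days at 24 measurements per day gives 168 timestamps, the newest one hour before the reference time and the oldest exactly seven days before it.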

## 7.1 Insert historical debias metrics

In [91]:
!rm history_debias_v2.json
with io.capture_output() as captured:
    !wget https://raw.githubusercontent.com/pmservice/ai-openscale-tutorials/master/assets/historical_data/german_credit_risk/wml/history_debias_v2.json

!ls -lh history_debias_v2.json

rm: cannot remove 'history_debias_v2.json': No such file or directory
-rw-r----- 1 wsuser watsonstudio 37K Oct 21 16:30 history_debias_v2.json


In [92]:
with open('history_debias_v2.json', 'r') as history_file:
    payloads = json.load(history_file)

for day in range(historyDays):
    print('Loading day', day + 1)
    daily_measurement_requests = []
    for hour in range(24):
        score_time = datetime.now(timezone.utc) + timedelta(hours=(-(24*day + hour + 1)))
        index = (day * 24 + hour) % len(payloads) # wrap around and reuse values if needed

        measurement_request = MonitorMeasurementRequest(timestamp=score_time,metrics = [payloads[index][0], payloads[index][1]])
        
        daily_measurement_requests.append(measurement_request)
        
    response = wos_client.monitor_instances.measurements.add(
                                            monitor_instance_id=fairness_monitor_instance_id,
                                            monitor_measurement_request=daily_measurement_requests).result     

print('Finished')

Loading day 1
Loading day 2
Loading day 3
Loading day 4
Loading day 5
Loading day 6
Loading day 7
Finished


## 7.2 Insert historical quality metrics

In [93]:
measurements = [0.76, 0.78, 0.68, 0.72, 0.73, 0.77, 0.80]
for day in range(historyDays):
    quality_measurement_requests = []
    print('Loading day', day + 1)
    for hour in range(24):
        score_time = datetime.utcnow() + timedelta(hours=(-(24*day + hour + 1)))
        score_time = score_time.isoformat() + "Z"
        
        metric = {"area_under_roc": measurements[day]}
                
        measurement_request = MonitorMeasurementRequest(timestamp=score_time,metrics = [metric])
        quality_measurement_requests.append(measurement_request)
        
        
    response = wos_client.monitor_instances.measurements.add(
                                            monitor_instance_id=quality_monitor_instance_id,
                                            monitor_measurement_request=quality_measurement_requests).result    
    
print('Finished')

Loading day 1
Loading day 2
Loading day 3
Loading day 4
Loading day 5
Loading day 6
Loading day 7
Finished


## 7.3 Insert historical confusion matrices

In [94]:
!rm history_quality_metrics.json
with io.capture_output() as captured:
    !wget https://raw.githubusercontent.com/IBM/credit-risk-workshop-cpd/master/data/openscale/history_quality_metrics.json

!ls -lh history_quality_metrics.json

rm: cannot remove 'history_quality_metrics.json': No such file or directory
-rw-r----- 1 wsuser watsonstudio 79K Oct 21 16:31 history_quality_metrics.json


In [95]:
from ibm_watson_openscale.base_classes.watson_open_scale_v2 import Source

with open('history_quality_metrics.json') as json_file:
    records = json.load(json_file)
    
for day in range(historyDays):
    index = 0
    cm_measurement_requests = []
    print('Loading day', day + 1)
    
    for hour in range(24):
        score_time = datetime.utcnow() + timedelta(hours=(-(24*day + hour + 1)))
        score_time = score_time.isoformat() + "Z"

        metric = records[index]['metrics']
        source = records[index]['sources']

        measurement_request = {"timestamp": score_time, "metrics": [metric], "sources": [source]}
        cm_measurement_requests.append(measurement_request)

        index+=1

    response = wos_client.monitor_instances.measurements.add(monitor_instance_id=quality_monitor_instance_id, monitor_measurement_request=cm_measurement_requests).result    

print('Finished')

Loading day 1
Loading day 2
Loading day 3
Loading day 4
Loading day 5
Loading day 6
Loading day 7
Finished


## 7.4 Insert historical performance metrics

In [96]:
target = Target(
        target_type=TargetTypes.INSTANCE,
        target_id=payload_data_set_id
    )

performance_monitor_instance_details = wos_client.monitor_instances.create(
            data_mart_id=data_mart_id,
            background_mode=False,
            monitor_definition_id=wos_client.monitor_definitions.MONITORS.PERFORMANCE.ID,
            target=target
).result
performance_monitor_instance_id = performance_monitor_instance_details.metadata.id

for day in range(historyDays):
    performance_measurement_requests = []
    print('Loading day', day + 1)
    for hour in range(24):
        score_time = datetime.utcnow() + timedelta(hours=(-(24*day + hour + 1)))
        score_time = score_time.isoformat() + "Z"
        score_count = random.randint(60, 600)
        
        metric = {"record_count": score_count, "data_set_type": "scoring_payload"}
        
        measurement_request = {"timestamp": score_time, "metrics": [metric]}
        performance_measurement_requests.append(measurement_request)
        
    response = wos_client.monitor_instances.measurements.add(
                                            monitor_instance_id=performance_monitor_instance_id,
                                            monitor_measurement_request=performance_measurement_requests).result    

print('Finished')




 Waiting for end of monitor instance creation 2af05746-5cdd-497c-b218-b42d69f46697 




active

---------------------------------------
 Monitor instance successfully created 
---------------------------------------


Loading day 1
Loading day 2
Loading day 3
Loading day 4
Loading day 5
Loading day 6
Loading day 7
Finished
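
The loop above backdates each measurement by one more hour, so seven days of history yield 168 hourly timestamps ending one hour before the current time. The following is a minimal standalone sketch of that timestamp arithmetic only; the helper name `hourly_timestamps` and the fixed reference time are assumptions made for illustration, and no OpenScale call is involved.

```python
from datetime import datetime, timedelta

def hourly_timestamps(now, history_days):
    """Return ISO-8601 timestamps with a 'Z' suffix, one per hour, newest first."""
    stamps = []
    for day in range(history_days):
        for hour in range(24):
            # Same offset as the notebook loop: 24*day + hour + 1 hours back.
            ts = now - timedelta(hours=24 * day + hour + 1)
            stamps.append(ts.isoformat() + "Z")
    return stamps

# A fixed reference time (hypothetical) keeps the example reproducible.
now = datetime(2020, 10, 21, 16, 0, 0)
stamps = hourly_timestamps(now, 7)
print(len(stamps))   # 168
print(stamps[0])     # 2020-10-21T15:00:00Z
print(stamps[-1])    # 2020-10-14T16:00:00Z
```

Taking the reference time as a parameter, rather than calling `datetime.utcnow()` inside the loop, makes the offsets easy to check before sending anything to the service.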


## 7.5 Insert historical drift measurements

In [98]:
!rm -f history_drift_measurement_*.json

with io.capture_output() as captured_0:
    !wget https://raw.githubusercontent.com/IBM/credit-risk-workshop-cpd/master/data/openscale/history_drift_measurement_0.json
    !wget https://raw.githubusercontent.com/IBM/credit-risk-workshop-cpd/master/data/openscale/history_drift_measurement_1.json
    !wget https://raw.githubusercontent.com/IBM/credit-risk-workshop-cpd/master/data/openscale/history_drift_measurement_2.json
    !wget https://raw.githubusercontent.com/IBM/credit-risk-workshop-cpd/master/data/openscale/history_drift_measurement_3.json
    !wget https://raw.githubusercontent.com/IBM/credit-risk-workshop-cpd/master/data/openscale/history_drift_measurement_4.json
    !wget https://raw.githubusercontent.com/IBM/credit-risk-workshop-cpd/master/data/openscale/history_drift_measurement_5.json
    !wget https://raw.githubusercontent.com/IBM/credit-risk-workshop-cpd/master/data/openscale/history_drift_measurement_6.json

!ls -lh history_drift_measurement_*.json

-rw-r----- 1 wsuser watsonstudio 832K Oct 21 16:32 history_drift_measurement_0.json
-rw-r----- 1 wsuser watsonstudio 868K Oct 21 16:32 history_drift_measurement_1.json
-rw-r----- 1 wsuser watsonstudio 870K Oct 21 16:32 history_drift_measurement_2.json
-rw-r----- 1 wsuser watsonstudio 910K Oct 21 16:32 history_drift_measurement_3.json
-rw-r----- 1 wsuser watsonstudio 841K Oct 21 16:32 history_drift_measurement_4.json
-rw-r----- 1 wsuser watsonstudio 836K Oct 21 16:32 history_drift_measurement_5.json
-rw-r----- 1 wsuser watsonstudio 840K Oct 21 16:32 history_drift_measurement_6.json


In [99]:
for day in range(historyDays):
    drift_measurements = []

    with open("history_drift_measurement_{}.json".format(day), 'r') as history_file:
        drift_daily_measurements = json.load(history_file)
    print('Loading day', day + 1)

    # Historical data contains 8 records per day, each representing a 3-hour drift window.
    for nb_window, records in enumerate(drift_daily_measurements):
        for record in records:
            # Use a single reference time so each window spans exactly 3 hours.
            now = datetime.utcnow()
            # Timestamp of the first (oldest) payload record in the window.
            window_start = now - timedelta(hours=24 * day + (nb_window + 1) * 3 + 1)
            # Timestamp of the last (most recent) payload record in the window.
            window_end = now - timedelta(hours=24 * day + nb_window * 3 + 1)

            # Rewrite the start and end time of each record relative to now.
            record['sources'][0]['data']['start'] = window_start.isoformat() + "Z"
            record['sources'][0]['data']['end'] = window_end.isoformat() + "Z"

            metric = record['metrics'][0]
            source = record['sources'][0]

            measurement_request = {"timestamp": window_start.isoformat() + "Z", "metrics": [metric], "sources": [source]}

            drift_measurements.append(measurement_request)

    # Submit one day's worth of drift measurements per request.
    response = wos_client.monitor_instances.measurements.add(
        monitor_instance_id=drift_monitor_instance_id,
        monitor_measurement_request=drift_measurements
    ).result

    print("Daily loading finished.")

Loading day 1
Daily loading finished.
Loading day 2
Daily loading finished.
Loading day 3
Daily loading finished.
Loading day 4
Daily loading finished.
Loading day 5
Daily loading finished.
Loading day 6
Daily loading finished.
Loading day 7
Daily loading finished.
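
The drift loop above maps each of the eight daily windows to a 3-hour interval counted back from the current time. The sketch below isolates that window arithmetic so it can be checked without the service; the helper name `drift_window` and the fixed reference time are assumptions made for illustration.

```python
from datetime import datetime, timedelta

def drift_window(now, day, nb_window):
    """Return (start, end) of the 3-hour drift window, using the notebook's offsets."""
    start = now - timedelta(hours=24 * day + (nb_window + 1) * 3 + 1)  # oldest record
    end = now - timedelta(hours=24 * day + nb_window * 3 + 1)          # most recent record
    return start, end

# A fixed reference time (hypothetical) keeps the example reproducible.
now = datetime(2020, 10, 21, 16, 0, 0)

windows = [drift_window(now, day, w) for day in range(7) for w in range(8)]

# Each window spans exactly three hours.
spans_ok = all(end - start == timedelta(hours=3) for start, end in windows)

# Windows are contiguous: each one ends where the next (older) one starts,
# including across the day boundary.
contiguous = all(windows[i][0] == windows[i + 1][1] for i in range(len(windows) - 1))

print(spans_ok, contiguous)
print(windows[0][1].isoformat() + "Z")   # end of the newest window
print(windows[-1][0].isoformat() + "Z")  # start of the oldest window
```

With these offsets the 56 windows tile the seven days preceding one hour ago, with no gaps or overlaps.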


# 8.0 Additional data to help debugging

In [100]:
print("Datamart id: {}".format(data_mart_id))
print("Model id: {}".format(model_uid))
print("Deployment id: {}".format(deployment_uid))
print("Scoring URL: {}".format(scoring_url))

Datamart id: 99318397-f315-40be-a572-961b511856bd
Model id: 44daffcc-6310-417c-9911-07c65a4adb2a
Deployment id: 8a6a243d-9622-42bb-9021-1ba3e7791012
Scoring URL: https://us-south.ml.cloud.ibm.com/ml/v4/deployments/8a6a243d-9622-42bb-9021-1ba3e7791012/predictions


# 9.0 Identify transactions for Explainability

Transaction IDs identified by the cells below can be copied and pasted into the Explainability tab of the OpenScale dashboard.

In [101]:
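# Alternative: wos_client.data_sets.show_records(payload_data_set_id, limit=5)
# Fetch the first five payload records as a pandas DataFrame.
pl_pd = wos_client.data_sets.get_list_of_records(data_set_id=payload_data_set_id, limit=5, output_type=ResponseTypes.PANDAS).result
# Keep only the columns needed to pick a transaction to explain.
df = pl_pd[['scoring_id', 'predictedLabel', 'probability']]
df.head(5)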

| | scoring_id | predictedLabel | probability |
|---|---|---|---|
| 0 | 40587d3c-cea1-43c8-b181-a3877d3c2f56-1 | No Risk | [0.7816576666900658, 0.21834233330993413] |
| 1 | 40587d3c-cea1-43c8-b181-a3877d3c2f56-10 | No Risk | [0.7037593951439151, 0.296240604856085] |
| 2 | 40587d3c-cea1-43c8-b181-a3877d3c2f56-100 | No Risk | [0.8881266777755187, 0.11187332222448132] |
| 3 | 40587d3c-cea1-43c8-b181-a3877d3c2f56-101 | No Risk | [0.9288427665537337, 0.07115723344626637] |
| 4 | 40587d3c-cea1-43c8-b181-a3877d3c2f56-102 | No Risk | [0.8580685752103218, 0.14193142478967818] |
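
Any of the scoring IDs listed above can be pasted into the Explainability tab. One possible way to choose among them, shown here as a sketch rather than a recommended procedure, is to prefer the transactions where the model was least confident, since those often give more informative explanations. The rows are copied from the output above; the helper name `least_confident` is an assumption made for illustration.

```python
# Sample rows copied from the output above: (scoring_id, predictedLabel, probability).
records = [
    ("40587d3c-cea1-43c8-b181-a3877d3c2f56-1", "No Risk", [0.7816576666900658, 0.21834233330993413]),
    ("40587d3c-cea1-43c8-b181-a3877d3c2f56-10", "No Risk", [0.7037593951439151, 0.296240604856085]),
    ("40587d3c-cea1-43c8-b181-a3877d3c2f56-100", "No Risk", [0.8881266777755187, 0.11187332222448132]),
    ("40587d3c-cea1-43c8-b181-a3877d3c2f56-101", "No Risk", [0.9288427665537337, 0.07115723344626637]),
    ("40587d3c-cea1-43c8-b181-a3877d3c2f56-102", "No Risk", [0.8580685752103218, 0.14193142478967818]),
]

def least_confident(rows, n=3):
    """Return the n scoring IDs whose highest class probability is lowest."""
    ranked = sorted(rows, key=lambda row: max(row[2]))
    return [scoring_id for scoring_id, _label, _probability in ranked[:n]]

candidate_ids = least_confident(records)
for scoring_id in candidate_ids:
    print(scoring_id)
```

The same ranking could be applied to the `df` DataFrame from the cell above by sorting on the maximum of the `probability` column.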


## Congratulations!

You have finished the hands-on lab for IBM Watson OpenScale. To view the OpenScale dashboard, choose the `OpenScale` service instance and launch the application UI. Click the tile for the model you created to see its fairness, accuracy, and performance monitors. Click the time series graph to get detailed information on transactions during a specific time window.

OpenScale shows model performance over time. You have two options to keep data flowing to your OpenScale graphs:
  * Download, configure and schedule the [model feed notebook](https://raw.githubusercontent.com/emartensibm/german-credit/master/german_credit_scoring_feed.ipynb). This notebook can be set up with your WML credentials, and scheduled to provide a consistent flow of scoring requests to your model, which will appear in your OpenScale monitors.
  * Re-run this notebook. Running this notebook from the beginning deletes and re-creates the model and deployment, and re-creates the historical data. Note that the payload and measurement logs for the previous deployment remain stored in your datamart and can be deleted if necessary.