<img src="https://github.com/pmservice/ai-openscale-tutorials/raw/master/notebooks/images/banner.png" align="left" alt="banner">

# Working with Watson Machine Learning

This notebook should be run in a Watson Studio project, using the **Default Spark Python** runtime environment. **If you are viewing this in Watson Studio and do not see Python 3.6 with Spark in the upper right corner of your screen, please update the runtime now.** It requires service credentials for the following Cloud services:
  * Watson OpenScale
  * Watson Machine Learning
  
If you have a paid Cloud account, you may also provision a **Databases for PostgreSQL** or **Db2 Warehouse** service to take full advantage of integration with Watson Studio and continuous learning services. If you choose not to provision this paid service, you can use the free internal PostgreSQL storage with OpenScale, but will not be able to configure continuous learning for your model.

The notebook will train, create and deploy a German Credit Risk model, configure OpenScale to monitor that deployment, and inject seven days' worth of historical records and measurements for viewing in the OpenScale Insights dashboard.

### Contents

- [Setup](#setup)
- [Model building and deployment](#model)
- [OpenScale configuration](#openscale)
- [Quality monitor and feedback logging](#quality)
- [Fairness, drift monitoring and explanations](#fairness)
- [Custom monitors and metrics](#custom)
- [Historical data](#historical)

### Note: This sample uses the latest OpenScale V2 client from Test PyPI and will switch to the production release once it is available. It does not yet cover the following aspects:

- Historical payload logging 
- Historical manual labeling

# Setup <a name="setup"></a>

## Spark check

In [None]:
!pip install pyspark==2.3.0 --no-cache | tail -n 1

In [1]:
try:
    from pyspark.sql import SparkSession
except ImportError:
    # Catch only the missing-package case, then re-raise so the notebook stops here.
    print('Error: Spark runtime is missing. If you are using Watson Studio change the notebook runtime to Spark.')
    raise

## Package installation

In [3]:
import warnings
warnings.filterwarnings('ignore')

In [1]:
!rm -rf /home/spark/shared/user-libs/python3.6*

!pip install --upgrade pandas==0.25.3 --no-cache | tail -n 1
!pip install --upgrade requests==2.23 --no-cache | tail -n 1
!pip install numpy==1.16.4 --no-cache | tail -n 1
!pip install SciPy --no-cache | tail -n 1
!pip install lime --no-cache | tail -n 1

!pip install --upgrade watson-machine-learning-client | tail -n 1
!pip install ibm-cloud-sdk-core --no-cache | tail -n 1
!pip install ibm-watson-openscale --extra-index-url https://test.pypi.org/simple/ --no-cache | tail -n 1
#!pip install ibm-watson-openscale --no-cache | tail -n 1



In [3]:
!pip show ibm-watson-openscale

Name: ibm-watson-openscale
Version: 3.0.0.17
Summary: Client library for IBM Watson OpenScale
Home-page: https://github.ibm.com/watson-developer-cloud/openscale-python-sdk
Author: IBM Watson
Author-email: watdevex@us.ibm.com
License: Apache 2.0
Location: /opt/conda/envs/Python36/lib/python3.6/site-packages
Requires: requests, python-dateutil, ibm-cloud-sdk-core, pandas
Required-by: 


## Provision services and configure credentials

If you have not already, provision an instance of IBM Watson OpenScale using the [OpenScale link in the Cloud catalog](https://cloud.ibm.com/catalog/services/watson-openscale).

Your Cloud API key can be generated by going to the [**Users** section of the Cloud console](https://cloud.ibm.com/iam#/users). From that page, click your name, scroll down to the **API Keys** section, and click **Create an IBM Cloud API key**. Give your key a name and click **Create**, then copy the created key and paste it below.

**NOTE:** You can also create the OpenScale `API_KEY` with the IBM Cloud CLI.

To install the IBM Cloud (Bluemix) CLI, follow these [instructions](https://console.bluemix.net/docs/cli/reference/ibmcloud/download_cli.html#install_use).

To create an API key with the CLI:
```
bx login --sso
bx iam api-key-create 'my_key'
```

In [5]:
CLOUD_API_KEY = "***"

Next you will need credentials for Watson Machine Learning. If you already have a WML instance, you may use credentials for it. To provision a new Lite instance of WML, use the [Cloud catalog](https://cloud.ibm.com/catalog/services/machine-learning), give your service a name, and click **Create**. Once your instance is created, click the **Service Credentials** link on the left side of the screen. Click the **New credential** button, give your credentials a name, and click **Add**. Your new credentials can be accessed by clicking the **View credentials** button. Copy and paste your WML credentials into the cell below.

In [6]:
WML_CREDENTIALS = {

}
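
Later cells read the `apikey`, `url`, and `instance_id` entries of `WML_CREDENTIALS`, so an incomplete paste only fails much further down. The following is a minimal sketch of an early check; `missing_wml_keys` is a hypothetical helper written for this notebook, not part of the WML client, and the example values are placeholders.

```python
# Keys that later cells of this notebook read from WML_CREDENTIALS.
REQUIRED_WML_KEYS = ("apikey", "url", "instance_id")


def missing_wml_keys(credentials):
    """Return the required keys that are absent or empty in `credentials`."""
    return [key for key in REQUIRED_WML_KEYS if not credentials.get(key)]


# Placeholder values for illustration only; paste your own service credentials.
example_credentials = {
    "apikey": "***",
    "url": "https://example.invalid",
    "instance_id": "***",
}

print(missing_wml_keys(example_credentials))  # []
print(missing_wml_keys({}))                   # ['apikey', 'url', 'instance_id']
```

Calling `missing_wml_keys(WML_CREDENTIALS)` right after the cell above would list anything still to be filled in.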

### Cloud object storage details

In [7]:
COS_API_KEY_ID = "***"
COS_RESOURCE_CRN = "***" # eg "crn:v1:bluemix:public:cloud-object-storage:global:a/3bf0d9003abfb5d29761c3e97696b71c:d6f04d83-6c4f-4a62-a165-696756d63903::"
COS_ENDPOINT = "***" # Current list available at https://control.cloud-object-storage.cloud.ibm.com/v2/endpoints

In [8]:
BUCKET_NAME = "***" #example: "credit-risk-training-data"

This tutorial can use Databases for PostgreSQL, Db2 Warehouse, or a free internal version of PostgreSQL to create a datamart for OpenScale.

If you have previously configured OpenScale, it will use your existing datamart, and not interfere with any models you are currently monitoring. Do not update the cell below.

If you do not have a paid Cloud account or would prefer not to provision this paid service, you may use the free internal PostgreSQL service with OpenScale. Do not update the cell below.

To provision a new instance of Db2 Warehouse, locate [Db2 Warehouse in the Cloud catalog](https://cloud.ibm.com/catalog/services/db2-warehouse), give your service a name, and click **Create**. Once your instance is created, click the **Service Credentials** link on the left side of the screen. Click the **New credential** button, give your credentials a name, and click **Add**. Your new credentials can be accessed by clicking the **View credentials** button. Copy and paste your Db2 Warehouse credentials into the cell below.

To provision a new instance of Databases for PostgreSQL, locate [Databases for PostgreSQL in the Cloud catalog](https://cloud.ibm.com/catalog/services/databases-for-postgresql), give your service a name, and click **Create**. Once your instance is created, click the **Service Credentials** link on the left side of the screen. Click the **New credential** button, give your credentials a name, and click **Add**. Your new credentials can be accessed by clicking the **View credentials** button. Copy and paste your Databases for PostgreSQL credentials into the cell below.

In [8]:
DB_CREDENTIALS = None

__If you previously configured OpenScale to use the free internal version of PostgreSQL, you can switch to a new datamart using a paid database service.__ If you would like to delete the internal PostgreSQL configuration and create a new one using the service credentials supplied in the cell above, set the __KEEP_MY_INTERNAL_POSTGRES__ variable below to __False__. In this case, the notebook will remove your existing internal PostgreSQL datamart and create a new one with the supplied credentials. __*NO DATA MIGRATION WILL OCCUR.*__

In [9]:
KEEP_MY_INTERNAL_POSTGRES = True

## Run the notebook

At this point, the notebook is ready to run. You can either run the cells one at a time, or click the **Kernel** option above and select **Restart and Run All** to run all the cells.

# Model building and deployment <a name="model"></a>

In this section you will learn how to train a Spark MLlib model and then deploy it as a web service using the Watson Machine Learning service.

## Load the training data from GitHub

In [11]:
from IPython.utils import io

with io.capture_output() as captured:
    !wget https://raw.githubusercontent.com/pmservice/ai-openscale-tutorials/master/assets/historical_data/german_credit_risk/wml/german_credit_data_biased_training.csv -O german_credit_data_biased_training.csv
!ls -lh german_credit_data_biased_training.csv

-rw-r----- 1 dsxuser dsxuser 674K Aug 19 16:54 german_credit_data_biased_training.csv


In [12]:
from pyspark.sql import SparkSession
import pandas as pd
import json
import datetime

spark = SparkSession.builder.getOrCreate()
pd_data = pd.read_csv("german_credit_data_biased_training.csv", sep=",", header=0)
df_data = spark.read.csv(path="german_credit_data_biased_training.csv", sep=",", header=True, inferSchema=True)
training_data_file_name = "german_credit_data_biased_training.csv"
df_data.head()

Row(CheckingStatus='0_to_200', LoanDuration=31, CreditHistory='credits_paid_to_date', LoanPurpose='other', LoanAmount=1889, ExistingSavings='100_to_500', EmploymentDuration='less_1', InstallmentPercent=3, Sex='female', OthersOnLoan='none', CurrentResidenceDuration=3, OwnsProperty='savings_insurance', Age=32, InstallmentPlans='none', Housing='own', ExistingCreditsCount=1, Job='skilled', Dependents=1, Telephone='none', ForeignWorker='yes', Risk='No Risk')

## Explore data

In [13]:
df_data.printSchema()

root
 |-- CheckingStatus: string (nullable = true)
 |-- LoanDuration: integer (nullable = true)
 |-- CreditHistory: string (nullable = true)
 |-- LoanPurpose: string (nullable = true)
 |-- LoanAmount: integer (nullable = true)
 |-- ExistingSavings: string (nullable = true)
 |-- EmploymentDuration: string (nullable = true)
 |-- InstallmentPercent: integer (nullable = true)
 |-- Sex: string (nullable = true)
 |-- OthersOnLoan: string (nullable = true)
 |-- CurrentResidenceDuration: integer (nullable = true)
 |-- OwnsProperty: string (nullable = true)
 |-- Age: integer (nullable = true)
 |-- InstallmentPlans: string (nullable = true)
 |-- Housing: string (nullable = true)
 |-- ExistingCreditsCount: integer (nullable = true)
 |-- Job: string (nullable = true)
 |-- Dependents: integer (nullable = true)
 |-- Telephone: string (nullable = true)
 |-- ForeignWorker: string (nullable = true)
 |-- Risk: string (nullable = true)



In [14]:
print("Number of records: " + str(df_data.count()))

Number of records: 5000


## Save training data to Cloud Object Storage

In [15]:
import ibm_boto3
from ibm_botocore.client import Config, ClientError

cos_client = ibm_boto3.resource("s3",
    ibm_api_key_id=COS_API_KEY_ID,
    ibm_service_instance_id=COS_RESOURCE_CRN,
    ibm_auth_endpoint="https://iam.bluemix.net/oidc/token",
    config=Config(signature_version="oauth"),
    endpoint_url=COS_ENDPOINT
)

In [16]:
with open(training_data_file_name, "rb") as file_data:
    cos_client.Object(BUCKET_NAME, training_data_file_name).upload_fileobj(
        Fileobj=file_data
    )

## Create a model

In [17]:
spark_df = df_data
(train_data, test_data) = spark_df.randomSplit([0.8, 0.2], 24)

MODEL_NAME = "Spark German Risk Model - Final"
DEPLOYMENT_NAME = "Spark German Risk Deployment - Final"

print("Number of records for training: " + str(train_data.count()))
print("Number of records for evaluation: " + str(test_data.count()))

Number of records for training: 4016
Number of records for evaluation: 984


The code below creates a Random Forest Classifier with Spark, setting up string indexers for the categorical features and the label column. Finally, this notebook creates a pipeline including the indexers and the model, and does an initial Area Under ROC evaluation of the model.

In [18]:
from pyspark.ml.feature import OneHotEncoder, StringIndexer, IndexToString, VectorAssembler
from pyspark.ml.evaluation import BinaryClassificationEvaluator
from pyspark.ml import Pipeline, Model
from pyspark.ml.feature import SQLTransformer

features = [x for x in spark_df.columns if x != 'Risk']
categorical_features = ['CheckingStatus', 'CreditHistory', 'LoanPurpose', 'ExistingSavings', 'EmploymentDuration', 'Sex', 'OthersOnLoan', 'OwnsProperty', 'InstallmentPlans', 'Housing', 'Job', 'Telephone', 'ForeignWorker']
categorical_num_features = [x + '_IX' for x in categorical_features]
si_list = [StringIndexer(inputCol=x, outputCol=y) for x, y in zip(categorical_features, categorical_num_features)]
va_features = VectorAssembler(inputCols=categorical_num_features + [x for x in features if x not in categorical_features], outputCol="features")
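
To make the role of the string indexers concrete, here is a small pure-Python sketch of the idea behind the `_IX` columns: each distinct category is replaced by a numeric index, with the most frequent value receiving index 0. This illustrates the concept only and is not Spark's implementation; the function names are hypothetical, and the alphabetical tie-breaking is an assumption made so the result is deterministic.

```python
from collections import Counter


def fit_string_index(values):
    """Map each distinct string to an index, most frequent value first.

    Ties are broken alphabetically here; that rule is an assumption of
    this sketch, chosen only to keep the output deterministic.
    """
    counts = Counter(values)
    ordered = sorted(counts, key=lambda value: (-counts[value], value))
    return {value: float(index) for index, value in enumerate(ordered)}


def apply_string_index(values, mapping):
    """Replace each string with its numeric index."""
    return [mapping[value] for value in values]


# Toy column, loosely modelled on the Housing feature.
housing = ["own", "rent", "own", "free", "own", "rent"]
mapping = fit_string_index(housing)

print(mapping)                                      # {'own': 0.0, 'rent': 1.0, 'free': 2.0}
print(apply_string_index(["rent", "own"], mapping))  # [1.0, 0.0]
```

The `VectorAssembler` then concatenates these indexed columns with the numeric features into the single `features` vector the classifier consumes.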

In [19]:
si_label = StringIndexer(inputCol="Risk", outputCol="label").fit(spark_df)
label_converter = IndexToString(inputCol="prediction", outputCol="predictedLabel", labels=si_label.labels)

In [20]:
from pyspark.ml.classification import RandomForestClassifier

classifier = RandomForestClassifier(featuresCol="features")
feature_filter = SQLTransformer(statement="SELECT * FROM __THIS__")
pipeline = Pipeline(stages= si_list + [si_label, va_features, classifier, label_converter, feature_filter])
model = pipeline.fit(train_data)

**Note:** If you want to filter features from the model output, replace **`*`** in the **`SQLTransformer`** statement with the names of the features to retain.

In [21]:
predictions = model.transform(test_data)
evaluatorDT = BinaryClassificationEvaluator(rawPredictionCol="prediction")
area_under_curve = evaluatorDT.evaluate(predictions)

print("areaUnderROC = %g" % area_under_curve)

areaUnderROC = 0.708922
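
As a reference for what the metric measures, the sketch below computes area under ROC in plain Python as the probability that a randomly chosen positive example is scored above a randomly chosen negative one, with ties counting as half. It is an illustration on made-up toy values, not a reproduction of Spark's evaluator; `area_under_roc` is a hypothetical helper. The second call shows that feeding hard 0/1 predictions as scores, as the cell above does by passing `prediction` as the raw prediction column, can yield a lower value than ranking by continuous scores.

```python
def area_under_roc(labels, scores):
    """Probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data for illustration only.
labels = [0, 0, 1, 1]
probabilities = [0.1, 0.6, 0.7, 0.8]
hard_predictions = [1 if p >= 0.5 else 0 for p in probabilities]

print(area_under_roc(labels, probabilities))     # 1.0
print(area_under_roc(labels, hard_predictions))  # 0.75
```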


## Publish the model

In this section, the notebook uses the supplied Watson Machine Learning credentials to save the model (including the pipeline) to the WML instance. Previous versions of the model are removed so that the notebook can be run again, resetting all data for another demo.

In [22]:
from watson_machine_learning_client import WatsonMachineLearningAPIClient
import json

wml_client = WatsonMachineLearningAPIClient(WML_CREDENTIALS)

### Remove existing model and deployment

In [23]:
model_deployment_ids = wml_client.deployments.get_uids()
for deployment_id in model_deployment_ids:
    deployment = wml_client.deployments.get_details(deployment_id)
    model_id = deployment['entity']['deployable_asset']['guid']
    if deployment['entity']['name'] == DEPLOYMENT_NAME:
        print('Deleting deployment id', deployment_id)
        wml_client.deployments.delete(deployment_id)
        print('Deleting model id', model_id)
        wml_client.repository.delete(model_id)
wml_client.repository.list_models()

Deleting deployment id e93463f5-de55-4c80-ada0-dd5b8fe5b5e3
Deleting model id 55d2094f-604d-4675-a11d-590feeca79a3
------------------------------------  ------------------------------  ------------------------  -----------------
GUID                                  NAME                            CREATED                   FRAMEWORK
c3fbb6ba-1dcb-4aa4-9073-2c0f32d7451c  Scikit German Risk Model        2020-08-18T18:29:56.529Z  scikit-learn-0.20
91be60bd-c9af-4103-9732-4be6a336ae50  Income Classifier Binary Model  2020-07-23T17:58:45.824Z  mllib-2.3
------------------------------------  ------------------------------  ------------------------  -----------------


In [24]:
with io.capture_output() as captured:
    !wget https://raw.githubusercontent.com/pmservice/wml-sample-models/master/spark/credit-risk/meta/credit-risk-meta.json -O credit-risk-meta.json
!ls -lh credit-risk-meta.json

-rw-r----- 1 dsxuser dsxuser 14K Aug 19 16:55 credit-risk-meta.json


In [25]:
with open('credit-risk-meta.json') as f:
    [training_data_reference, *_] = json.load(f)['model_meta']['training_data_reference']

In [26]:
print(training_data_reference)

{'connection': {'db': 'BLUDB', 'host': 'dashdb-txn-sbox-yp-dal09-03.services.dal.bluemix.net', 'password': 'khhz72v+6mcwwkfv', 'username': 'cmb91569'}, 'name': 'German credit risk training data', 'source': {'tablename': 'CREDIT_RISK_TRAIN_DATA', 'type': 'db2'}}


In [27]:
model_props = {
    wml_client.repository.ModelMetaNames.NAME: "{}".format(MODEL_NAME),
    wml_client.repository.ModelMetaNames.EVALUATION_METHOD: "binary",
    wml_client.repository.ModelMetaNames.TRAINING_DATA_REFERENCE: training_data_reference,
    wml_client.repository.ModelMetaNames.EVALUATION_METRICS: [
        {
           "name": "areaUnderROC",
           "value": area_under_curve,
           "threshold": 0.7
        }
    ]
}

In [28]:
wml_models = wml_client.repository.get_details()
model_uid = None
for model_in in wml_models['models']['resources']:
    if MODEL_NAME == model_in['entity']['name']:
        model_uid = model_in['metadata']['guid']
        break

if model_uid is None:
    print("Storing model ...")

    published_model_details = wml_client.repository.store_model(model=model, meta_props=model_props, training_data=train_data, pipeline=pipeline)
    model_uid = wml_client.repository.get_model_uid(published_model_details)
    print("Done")

Storing model ...
Done


In [29]:
model_uid

'16462c0b-fa59-49f0-ba40-6b9c722e78d0'

## Deploy the model

The next section of the notebook deploys the model as a RESTful web service in Watson Machine Learning. The deployed model will have a scoring URL you can use to send data to the model for predictions.

In [30]:
wml_deployments = wml_client.deployments.get_details()
deployment_uid = None
for deployment in wml_deployments['resources']:
    if DEPLOYMENT_NAME == deployment['entity']['name']:
        deployment_uid = deployment['metadata']['guid']
        break

if deployment_uid is None:
    print("Deploying model...")

    deployment = wml_client.deployments.create(artifact_uid=model_uid, name=DEPLOYMENT_NAME, asynchronous=False)
    deployment_uid = wml_client.deployments.get_uid(deployment)
    
print("Model id: {}".format(model_uid))
print("Deployment id: {}".format(deployment_uid))

Deploying model...


#######################################################################################

Synchronous deployment creation for uid: '16462c0b-fa59-49f0-ba40-6b9c722e78d0' started

#######################################################################################


INITIALIZING
DEPLOY_IN_PROGRESS
DEPLOY_SUCCESS


------------------------------------------------------------------------------------------------
Successfully finished deployment creation, deployment_uid='f3208415-a6b3-450c-ae82-d42dd25d7ceb'
------------------------------------------------------------------------------------------------


Model id: 16462c0b-fa59-49f0-ba40-6b9c722e78d0
Deployment id: f3208415-a6b3-450c-ae82-d42dd25d7ceb


## Sample scoring

In [31]:
fields = ["CheckingStatus", "LoanDuration", "CreditHistory", "LoanPurpose", "LoanAmount", "ExistingSavings",
                  "EmploymentDuration", "InstallmentPercent", "Sex", "OthersOnLoan", "CurrentResidenceDuration",
                  "OwnsProperty", "Age", "InstallmentPlans", "Housing", "ExistingCreditsCount", "Job", "Dependents",
                  "Telephone", "ForeignWorker"]
values = [
            ["no_checking", 13, "credits_paid_to_date", "car_new", 1343, "100_to_500", "1_to_4", 2, "female", "none", 3,
             "savings_insurance", 46, "none", "own", 2, "skilled", 1, "none", "yes"],
            ["no_checking", 24, "prior_payments_delayed", "furniture", 4567, "500_to_1000", "1_to_4", 4, "male", "none",
             4, "savings_insurance", 36, "none", "free", 2, "management_self-employed", 1, "none", "yes"],
        ]

scoring_payload = {"fields": fields, "values": values}
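
The payload pairs one `fields` list with rows of positional `values`, and the scoring response uses the same layout. The sketch below turns such a structure into one dictionary per row, which is often easier to inspect; `rows_as_dicts` is a hypothetical helper and the example uses only a few of the fields.

```python
def rows_as_dicts(payload):
    """Convert a {"fields": [...], "values": [[...], ...]} structure to a list of dicts."""
    fields = payload["fields"]
    rows = []
    for values in payload["values"]:
        # Each row must line up positionally with the field names.
        if len(values) != len(fields):
            raise ValueError(
                "row has {} values but there are {} fields".format(len(values), len(fields))
            )
        rows.append(dict(zip(fields, values)))
    return rows


# Shortened example with a subset of the fields used above.
example_payload = {
    "fields": ["CheckingStatus", "LoanDuration", "Sex"],
    "values": [
        ["no_checking", 13, "female"],
        ["no_checking", 24, "male"],
    ],
}

for row in rows_as_dicts(example_payload):
    print(row)
```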

In [84]:
scoring_url = wml_client.deployments.get_scoring_url(deployment)

wml_client.deployments.score(scoring_url, scoring_payload)

{'fields': ['CheckingStatus',
  'LoanDuration',
  'CreditHistory',
  'LoanPurpose',
  'LoanAmount',
  'ExistingSavings',
  'EmploymentDuration',
  'InstallmentPercent',
  'Sex',
  'OthersOnLoan',
  'CurrentResidenceDuration',
  'OwnsProperty',
  'Age',
  'InstallmentPlans',
  'Housing',
  'ExistingCreditsCount',
  'Job',
  'Dependents',
  'Telephone',
  'ForeignWorker',
  'CheckingStatus_IX',
  'CreditHistory_IX',
  'LoanPurpose_IX',
  'ExistingSavings_IX',
  'EmploymentDuration_IX',
  'Sex_IX',
  'OthersOnLoan_IX',
  'OwnsProperty_IX',
  'InstallmentPlans_IX',
  'Housing_IX',
  'Job_IX',
  'Telephone_IX',
  'ForeignWorker_IX',
  'features',
  'rawPrediction',
  'probability',
  'prediction',
  'predictedLabel'],
 'values': [['no_checking',
   13,
   'credits_paid_to_date',
   'car_new',
   1343,
   '100_to_500',
   '1_to_4',
   2,
   'female',
   'none',
   3,
   'savings_insurance',
   46,
   'none',
   'own',
   2,
   'skilled',
   1,
   'none',
   'yes',
   0.0,
   1.0,
   0.0,
   

# Configure OpenScale <a name="openscale"></a>

The notebook will now import the necessary libraries and set up a Python OpenScale client.

In [85]:
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

from ibm_watson_openscale import *
from ibm_watson_openscale.supporting_classes.enums import *
from ibm_watson_openscale.supporting_classes import *

authenticator = IAMAuthenticator(apikey=CLOUD_API_KEY)
wos_client = APIClient(authenticator=authenticator, service_url="https://api.aiopenscale.test.cloud.ibm.com")
wos_client.version

'3.0.0.17'

### Get Watson OpenScale GUID

Each instance of OpenScale has a unique ID. We can get this value using the Cloud API key specified at the beginning of the notebook.
1. Please update the `url` in the below WOS_CREDENTIALS payload as per the environment that you are using.
2. Please update the `DASHBOARD_URL` in the below cell as per the environment that you are using.

In [86]:
from ibm_watson_openscale.utils import get_instance_guid
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

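# IAM_URL is not defined anywhere in this notebook, so the SDK's default IAM
# endpoint is used here, as in the client setup above. Pass url=... if your
# environment needs a different IAM endpoint.
authenticator = IAMAuthenticator(apikey=CLOUD_API_KEY)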
WOS_GUID = get_instance_guid(authenticator)

WOS_CREDENTIALS = {
    "instance_guid": WOS_GUID,
    "apikey": CLOUD_API_KEY,
    "url": "https://api.aiopenscale.cloud.ibm.com"
}
DASHBOARD_URL = "https://aiopenscale.cloud.ibm.com"


if WOS_GUID is None:
    print('Watson OpenScale GUID NOT FOUND')
else:
    print(WOS_GUID)

5a0b9076-fcf6-49e8-a824-9e3a6b4c2a56


If the Watson OpenScale GUID was not found, you need to assign the GUID to the variable `WOS_GUID` yourself.
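
When the GUID is pasted in by hand, a quick format check can catch copy errors early. The sketch below assumes the instance GUID is a canonical hyphenated UUID, as the value printed above is; `is_canonical_guid` is a hypothetical helper, and passing the check does not prove the instance exists.

```python
import uuid


def is_canonical_guid(value):
    """Return True if `value` is a UUID string in hyphenated 8-4-4-4-12 form."""
    try:
        return str(uuid.UUID(value)) == value.lower()
    except (ValueError, AttributeError, TypeError):
        # Not a string, or not parseable as a UUID.
        return False


print(is_canonical_guid("5a0b9076-fcf6-49e8-a824-9e3a6b4c2a56"))  # True
print(is_canonical_guid("not-a-guid"))                            # False
print(is_canonical_guid(None))                                    # False
```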

#### Get OpenScale `instance_guid` 

To install the IBM Cloud (Bluemix) CLI, follow these [instructions](https://console.bluemix.net/docs/cli/reference/ibmcloud/download_cli.html#install_use).

To get your OpenScale instance GUID:
- log in to IBM Cloud:
```
bx login --sso
```
- if your resource group is not `default`, switch to the resource group that contains the OpenScale instance:
```
bx target -g <myResourceGroup>
```
- get details of the instance
```
bx resource service-instance 'AI-OpenScale-instance_name'
```

## Create schema and datamart

### Set up datamart

Watson OpenScale uses a database to store payload logs and calculated metrics. If database credentials were **not** supplied above, the notebook will use the free, internal lite database. If database credentials were supplied, the datamart will be created there **unless** there is an existing datamart **and** the **KEEP_MY_INTERNAL_POSTGRES** variable is set to **True**. If an OpenScale datamart exists in Db2 or PostgreSQL, the existing datamart will be used and no data will be overwritten.
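
The rules in this paragraph can be read as a small decision table. The sketch below encodes one reading of them as a pure function; it makes no OpenScale calls, and the function name, parameters, and returned labels are all hypothetical.

```python
def datamart_action(existing, existing_is_internal, db_credentials, keep_internal):
    """Return which datamart step applies, under one reading of the rules above.

    existing             -- True if a datamart is already configured
    existing_is_internal -- True if that datamart uses the free internal database
    db_credentials       -- supplied database credentials, or None
    keep_internal        -- the KEEP_MY_INTERNAL_POSTGRES setting
    """
    if not existing:
        # Nothing configured yet: credentials decide where the datamart goes.
        return "create_external" if db_credentials else "create_internal"
    if existing_is_internal and db_credentials and not keep_internal:
        # Only an internal datamart is ever replaced, and only on request.
        return "replace_internal_with_external"
    # Existing Db2 or PostgreSQL datamarts are reused and not overwritten.
    return "use_existing"


print(datamart_action(False, False, None, True))             # create_internal
print(datamart_action(True, True, {"host": "..."}, True))    # use_existing
print(datamart_action(True, True, {"host": "..."}, False))   # replace_internal_with_external
```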

Prior instances of the German Credit model will be removed from OpenScale monitoring.

In [87]:
wos_client.data_marts.show()

0,1,2,3,4,5
WOS Data Mart,Data Mart created by WOS tutorial notebook,True,active,2020-07-22 22:17:14.701000+00:00,5a0b9076-fcf6-49e8-a824-9e3a6b4c2a56


In [88]:
data_marts = wos_client.data_marts.list().result.data_marts
if len(data_marts) == 0:
    if DB_CREDENTIALS is not None:
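        # SCHEMA_NAME is not set in any earlier cell; define it before running this branch.
        # Stop here instead of continuing with a missing schema name.
        if globals().get('SCHEMA_NAME') is None:
            raise ValueError("Please specify the SCHEMA_NAME and rerun the cell")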

        print('Setting up external datamart')
        added_data_mart_result = wos_client.data_marts.add(
                background_mode=False,
                name="WOS Data Mart",
                description="Data Mart created by WOS tutorial notebook",
                database_configuration=DatabaseConfigurationRequest(
                  database_type=DatabaseType.POSTGRESQL,
                    credentials=PrimaryStorageCredentialsLong(
                        hostname=DB_CREDENTIALS['connection']['postgres']['hosts'][0]['hostname'],
                        username=DB_CREDENTIALS['connection']['postgres']['authentication']['username'],
                        password=DB_CREDENTIALS['connection']['postgres']['authentication']['password'],
                        db=DB_CREDENTIALS['connection']['postgres']['database'],
                        port=DB_CREDENTIALS['connection']['postgres']['hosts'][0]['port'],
                        ssl=True,
                        sslmode=DB_CREDENTIALS['connection']['postgres']['query_options']['sslmode'],
                        certificate_base64=DB_CREDENTIALS['connection']['postgres']['certificate']['certificate_base64']
                    ),
                    location=LocationSchemaName(
                        schema_name= SCHEMA_NAME
                    )
                )
             ).result
    else:
        print('Setting up internal datamart')
        added_data_mart_result = wos_client.data_marts.add(
                background_mode=False,
                name="WOS Data Mart",
                description="Data Mart created by WOS tutorial notebook", 
                internal_database = True).result
        
    data_mart_id = added_data_mart_result.metadata.id
    
else:
    data_mart_id=data_marts[0].metadata.id
    print('Using existing datamart {}'.format(data_mart_id))
    

Using existing datamart 5a0b9076-fcf6-49e8-a824-9e3a6b4c2a56


### Remove existing service providers connected to the WML instance

Watson OpenScale allows multiple service providers for the same engine instance. To avoid accumulating duplicate service providers for the WML instance used in this tutorial, the following code deletes any existing service provider(s) for it before a new one is added.

In [89]:
service_providers = wos_client.service_providers.list().result.service_providers
for service_provider in service_providers:
    service_instance_id = service_provider.entity.instance_id
    if service_instance_id == WML_CREDENTIALS['instance_id']:
        wos_client.service_providers.delete(service_provider.metadata.id)
        print('Deleted existing service_provider for WML instance', service_instance_id)

Deleted existing service_provider for WML instance fa2ef988-b919-468b-a05b-2a3c9df845d5


## Add service provider

Watson OpenScale needs to be bound to the Watson Machine Learning instance to capture payload data into and out of the model.

**Note:** You can bind more than one engine instance if needed by calling the `wos_client.service_providers.add` method. You can then refer to a particular service provider by its `service_provider_id`.

In [90]:
added_service_provider_result = wos_client.service_providers.add(
        name="Watson Machine Learning service provider",
        description="Service Provider added by tutorial WOS notebook",
        service_type=ServiceTypes.WATSON_MACHINE_LEARNING,
        credentials=WMLCredentialsCloud(
            apikey=WML_CREDENTIALS['apikey'],
            url=WML_CREDENTIALS['url'],
            instance_id=WML_CREDENTIALS['instance_id']
        ),
        background_mode=False
    ).result
service_provider_id = added_service_provider_result.metadata.id




 Waiting for end of adding service provider 1c0ebe30-98a9-4d77-a3d6-64b241f2f876 




active

-----------------------------------------------
 Successfully finished adding service provider 
-----------------------------------------------




In [91]:
wos_client.service_providers.show()

0,1,2,3,4,5
fa2ef988-b919-468b-a05b-2a3c9df845d5,active,Watson Machine Learning service provider,watson_machine_learning,2020-08-19 17:07:16.541000+00:00,1c0ebe30-98a9-4d77-a3d6-64b241f2f876


In [92]:
asset_deployment_details = wos_client.service_providers.list_assets(data_mart_id=data_mart_id, service_provider_id=service_provider_id, deployment_id=deployment_uid).result['resources'][0]
print("Get asset details for {asset_name} with id: {id}".format(asset_name=asset_deployment_details['entity']['name'], id = asset_deployment_details['metadata']['guid']))

Get asset details for Spark German Risk Deployment - Final with id: f3208415-a6b3-450c-ae82-d42dd25d7ceb


## Subscriptions

### Remove existing credit risk subscriptions

This code removes previous subscriptions to the German Credit model to refresh the monitors with the new model and new data.

In [93]:
wos_client.subscriptions.show()

In [95]:
subscriptions = wos_client.subscriptions.list().result.subscriptions
for subscription in subscriptions:
    sub_model_id = subscription.entity.asset.asset_id
    if sub_model_id == model_uid:
        wos_client.subscriptions.delete(subscription.metadata.id)
        print('Deleted existing subscription for model', model_uid)

This code creates the model subscription in OpenScale using the Python client API. Note that we need to provide the model unique identifier, and some information about the model itself.

In [96]:
subscription_details = wos_client.subscriptions.add(
        data_mart_id=data_mart_id,
        service_provider_id=service_provider_id,
        asset=Asset(
            asset_id=published_model_details['metadata']['guid'],
            url=published_model_details['metadata']['url'],
            asset_type=AssetTypes.MODEL,
            input_data_type=InputDataType.STRUCTURED,
            problem_type=ProblemType.BINARY_CLASSIFICATION
        ),
        deployment=AssetDeploymentRequest(
            deployment_id=asset_deployment_details['metadata']['guid'],
            name=asset_deployment_details['entity']['name'],
            deployment_type= DeploymentTypes.ONLINE,
            url=asset_deployment_details['metadata']['url']
        ),
        asset_properties=AssetPropertiesRequest(
            label_column='Risk',
            probability_fields=['probability'],
            prediction_field='predictedLabel',
            feature_fields = ["CheckingStatus","LoanDuration","CreditHistory","LoanPurpose","LoanAmount","ExistingSavings","EmploymentDuration","InstallmentPercent","Sex","OthersOnLoan","CurrentResidenceDuration","OwnsProperty","Age","InstallmentPlans","Housing","ExistingCreditsCount","Job","Dependents","Telephone","ForeignWorker"],
            categorical_fields = ["CheckingStatus","CreditHistory","LoanPurpose","ExistingSavings","EmploymentDuration","Sex","OthersOnLoan","OwnsProperty","InstallmentPlans","Housing","Job","Telephone","ForeignWorker"],
            training_data_reference=TrainingDataReference(type='cos',
                                                          location=COSTrainingDataReferenceLocation(bucket = BUCKET_NAME,
                                                                                                    file_name = training_data_file_name),
                                                          connection=COSTrainingDataReferenceConnection.from_dict({
                                                                        "resource_instance_id": COS_RESOURCE_CRN,
                                                                        "url": COS_ENDPOINT,
                                                                        "api_key": COS_API_KEY_ID,
                                                                        "iam_url": "https://iam.bluemix.net/oidc/token"})),
            training_data_schema=SparkStruct.from_dict(asset_deployment_details['entity']['asset_properties']['training_data_schema'])
        )
    ).result
subscription_id = subscription_details.metadata.id
subscription_id

'5d8ac6bc-4337-49ab-ac13-85d23541eb40'
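
Because `feature_fields` and `categorical_fields` are typed out by hand in the subscription call, a quick consistency check before submitting can catch typos. The sketch below is a hypothetical helper that works on plain lists; it does not call OpenScale, and the rules it checks are assumptions about what a sensible configuration looks like.

```python
def check_fields(feature_fields, categorical_fields, label_column):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []

    # Every categorical field should also be declared as a feature.
    unknown = [f for f in categorical_fields if f not in feature_fields]
    if unknown:
        problems.append("categorical fields not in feature_fields: " + ", ".join(unknown))

    # The label must not leak into the feature list.
    if label_column in feature_fields:
        problems.append("label column listed as a feature: " + label_column)

    # Duplicate feature names usually indicate a copy-and-paste slip.
    duplicates = sorted({f for f in feature_fields if feature_fields.count(f) > 1})
    if duplicates:
        problems.append("duplicate feature fields: " + ", ".join(duplicates))

    return problems


# Shortened lists for illustration; the second call contains deliberate mistakes.
print(check_fields(["CheckingStatus", "LoanDuration", "Sex"], ["CheckingStatus", "Sex"], "Risk"))
print(check_fields(["CheckingStatus", "Risk"], ["Sex"], "Risk"))
```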

In [97]:
import time

time.sleep(5)
payload_data_sets = wos_client.data_sets.list(type=DataSetTypes.PAYLOAD_LOGGING, 
                                              target_target_id=subscription_id, 
                                              target_target_type=TargetTypes.SUBSCRIPTION).result.data_sets
# Guard against an empty list so the message below is reachable instead of an IndexError.
payload_data_set_id = payload_data_sets[0].metadata.id if payload_data_sets else None
if payload_data_set_id is None:
    print("Payload data set not found. Please check subscription status.")
else:
    print("Payload data set id: ", payload_data_set_id)

Payload data set id:  26a54806-dbf5-4fb8-b378-271694732053


In [98]:
wos_client.data_sets.show()

0,1,2,3,4,5,6
5a0b9076-fcf6-49e8-a824-9e3a6b4c2a56,active,5d8ac6bc-4337-49ab-ac13-85d23541eb40,subscription,manual_labeling,2020-08-19 17:07:43.933000+00:00,140a31e9-3bd7-40aa-a2e4-9c16e158f2cf
5a0b9076-fcf6-49e8-a824-9e3a6b4c2a56,active,5d8ac6bc-4337-49ab-ac13-85d23541eb40,subscription,payload_logging,2020-08-19 17:07:43.870000+00:00,26a54806-dbf5-4fb8-b378-271694732053
5a0b9076-fcf6-49e8-a824-9e3a6b4c2a56,active,5a0b9076-fcf6-49e8-a824-9e3a6b4c2a56,data_mart,explanations_whatif,2020-08-10 21:46:35.549000+00:00,60e2672b-d4c2-4479-8312-0433f9f738a5
5a0b9076-fcf6-49e8-a824-9e3a6b4c2a56,active,5a0b9076-fcf6-49e8-a824-9e3a6b4c2a56,data_mart,explanations,2020-08-10 21:46:35.104000+00:00,40e2510a-d4e8-411c-b1f7-4b58c79aa2b2


Get subscription list

In [99]:
wos_client.subscriptions.show()

0,1,2,3,4,5,6,7,8
16462c0b-fa59-49f0-ba40-6b9c722e78d0,,5a0b9076-fcf6-49e8-a824-9e3a6b4c2a56,f3208415-a6b3-450c-ae82-d42dd25d7ceb,Spark German Risk Deployment - Final,1c0ebe30-98a9-4d77-a3d6-64b241f2f876,active,2020-08-19 17:07:43.221000+00:00,5d8ac6bc-4337-49ab-ac13-85d23541eb40


### Score the model so we can configure monitors

Now that the WML service has been bound and the subscription has been created, we need to send a request to the model before we configure OpenScale. This allows OpenScale to create a payload log in the datamart with the correct schema, so it can capture data coming into and out of the model. First, the code gets the model deployment's endpoint URL, and then sends a few records for predictions.
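The scoring request used in the following cells is a dictionary with a `fields` list and a `values` list of rows in the same column order. As a minimal sketch, the hypothetical helper `build_scoring_payload` (not part of the WML client) shows how records held as dictionaries could be converted into that shape. The shortened field list and the two sample records are assumptions for illustration only.

```python
def build_scoring_payload(records, fields):
    """Convert dict records into the {"fields": ..., "values": ...} shape.

    Hypothetical helper for illustration; not part of any IBM client library.
    """
    values = []
    for record in records:
        missing = [name for name in fields if name not in record]
        if missing:
            raise KeyError("record is missing fields: {}".format(missing))
        # Keep each row in exactly the same order as `fields`.
        values.append([record[name] for name in fields])
    return {"fields": list(fields), "values": values}


# Shortened field list for illustration (the deployed model expects all 20 features).
example_fields = ["CheckingStatus", "LoanDuration", "Sex", "Age"]
example_records = [
    {"CheckingStatus": "no_checking", "LoanDuration": 13, "Sex": "female", "Age": 46},
    {"Age": 36, "Sex": "male", "LoanDuration": 24, "CheckingStatus": "no_checking"},
]

example_payload = build_scoring_payload(example_records, example_fields)
print(example_payload["values"])
```

Building rows from the field list, rather than relying on dictionary order, keeps every row aligned with `fields` even when the source records list their keys differently.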

In [100]:
credit_risk_scoring_endpoint = None
print(deployment_uid)

# Find the deployment whose GUID matches exactly and read its scoring URL.
for deployment in wml_client.deployments.get_details()['resources']:
    if deployment_uid == deployment['metadata']['guid']:
        credit_risk_scoring_endpoint = deployment['entity']['scoring_url']
        
print(credit_risk_scoring_endpoint)

f3208415-a6b3-450c-ae82-d42dd25d7ceb
https://us-south.ml.cloud.ibm.com/v3/wml_instances/fa2ef988-b919-468b-a05b-2a3c9df845d5/deployments/f3208415-a6b3-450c-ae82-d42dd25d7ceb/online


In [101]:
fields = ["CheckingStatus","LoanDuration","CreditHistory","LoanPurpose","LoanAmount","ExistingSavings","EmploymentDuration","InstallmentPercent","Sex","OthersOnLoan","CurrentResidenceDuration","OwnsProperty","Age","InstallmentPlans","Housing","ExistingCreditsCount","Job","Dependents","Telephone","ForeignWorker"]
values = [
  ["no_checking",13,"credits_paid_to_date","car_new",1343,"100_to_500","1_to_4",2,"female","none",3,"savings_insurance",46,"none","own",2,"skilled",1,"none","yes"],
  ["no_checking",24,"prior_payments_delayed","furniture",4567,"500_to_1000","1_to_4",4,"male","none",4,"savings_insurance",36,"none","free",2,"management_self-employed",1,"none","yes"],
  ["0_to_200",26,"all_credits_paid_back","car_new",863,"less_100","less_1",2,"female","co-applicant",2,"real_estate",38,"none","own",1,"skilled",1,"none","yes"],
  ["0_to_200",14,"no_credits","car_new",2368,"less_100","1_to_4",3,"female","none",3,"real_estate",29,"none","own",1,"skilled",1,"none","yes"],
  ["0_to_200",4,"no_credits","car_new",250,"less_100","unemployed",2,"female","none",3,"real_estate",23,"none","rent",1,"management_self-employed",1,"none","yes"],
  ["no_checking",17,"credits_paid_to_date","car_new",832,"100_to_500","1_to_4",2,"male","none",2,"real_estate",42,"none","own",1,"skilled",1,"none","yes"],
  ["no_checking",33,"outstanding_credit","appliances",5696,"unknown","greater_7",4,"male","co-applicant",4,"unknown",54,"none","free",2,"skilled",1,"yes","yes"],
  ["0_to_200",13,"prior_payments_delayed","retraining",1375,"100_to_500","4_to_7",3,"male","none",3,"real_estate",37,"none","own",2,"management_self-employed",1,"none","yes"]
]

payload_scoring = {"fields": fields,"values": values}
scoring_response = wml_client.deployments.score(credit_risk_scoring_endpoint, payload_scoring)

print('Single record scoring result:', '\n fields:', scoring_response['fields'], '\n values: ', scoring_response['values'][0])

Single record scoring result: 
 fields: ['CheckingStatus', 'LoanDuration', 'CreditHistory', 'LoanPurpose', 'LoanAmount', 'ExistingSavings', 'EmploymentDuration', 'InstallmentPercent', 'Sex', 'OthersOnLoan', 'CurrentResidenceDuration', 'OwnsProperty', 'Age', 'InstallmentPlans', 'Housing', 'ExistingCreditsCount', 'Job', 'Dependents', 'Telephone', 'ForeignWorker', 'CheckingStatus_IX', 'CreditHistory_IX', 'LoanPurpose_IX', 'ExistingSavings_IX', 'EmploymentDuration_IX', 'Sex_IX', 'OthersOnLoan_IX', 'OwnsProperty_IX', 'InstallmentPlans_IX', 'Housing_IX', 'Job_IX', 'Telephone_IX', 'ForeignWorker_IX', 'features', 'rawPrediction', 'probability', 'prediction', 'predictedLabel'] 
 values:  ['no_checking', 13, 'credits_paid_to_date', 'car_new', 1343, '100_to_500', '1_to_4', 2, 'female', 'none', 3, 'savings_insurance', 46, 'none', 'own', 2, 'skilled', 1, 'none', 'yes', 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, [20, [1, 3, 5, 13, 14, 15, 16, 17, 18, 19], [1.0, 1.0, 1.0, 13.0, 

In [102]:
time.sleep(5)
wos_client.data_sets.get_records_count(payload_data_set_id)

8

# Quality monitoring and feedback logging <a name="quality"></a>

## Enable quality monitoring

The code below waits ten seconds to allow the payload logging table to be set up before it begins enabling monitors. It then turns on the quality (accuracy) monitor. The cell does not pass explicit thresholds, so the limits in effect are the ones shown in the metrics output later in this section (0.8). OpenScale will show an alert on the dashboard if a quality measurement (for example, area under the ROC curve in the case of a binary classifier) falls below its threshold.

The `min_feedback_data_size` parameter specifies the minimum number of feedback records OpenScale needs before it calculates a new measurement. The quality monitor runs hourly, but the accuracy reading in the dashboard will not change until an additional 50 feedback records have been added, via the user interface, the Python client, or the supplied feedback endpoint.

In [55]:
import time

time.sleep(10)
target = Target(
        target_type=TargetTypes.SUBSCRIPTION,
        target_id=subscription_id
)
parameters = {
    "min_feedback_data_size": 50
}
quality_monitor_details = wos_client.monitor_instances.create(
    data_mart_id=data_mart_id,
    background_mode=False,
    monitor_definition_id=wos_client.monitor_definitions.MONITORS.QUALITY.ID,
    target=target,
    parameters=parameters
).result




 Waiting for end of monitor instance creation 6b235eeb-47d9-4f06-8c86-65b4889d9c37 




preparing
active

---------------------------------------
 Monitor instance successfully created 
---------------------------------------




In [56]:
quality_monitor_instance_id = quality_monitor_details.metadata.id

## Feedback logging

The code below downloads and stores enough feedback data to meet the minimum threshold so that OpenScale can calculate a new accuracy measurement. It then kicks off the accuracy monitor. The monitors run hourly, or can be initiated via the Python API, the REST API, or the graphical user interface.

In [137]:
!rm additional_feedback_data_v2.json
!wget https://raw.githubusercontent.com/pmservice/ai-openscale-tutorials/master/assets/historical_data/german_credit_risk/wml/additional_feedback_data_v2.json

--2020-08-18 17:56:17--  https://raw.githubusercontent.com/pmservice/ai-openscale-tutorials/master/assets/historical_data/german_credit_risk/wml/additional_feedback_data_v2.json
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.48.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.48.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 50890 (50K) [text/plain]
Saving to: ‘additional_feedback_data_v2.json’


2020-08-18 17:56:17 (24.4 MB/s) - ‘additional_feedback_data_v2.json’ saved [50890/50890]



### Get feedback logging dataset ID

In [138]:
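feedback_dataset_id = None
# Look up the feedback data set created when the quality monitor was enabled.
feedback_data_sets = wos_client.data_sets.list(type=DataSetTypes.FEEDBACK, 
                                               target_target_id=subscription_id, 
                                               target_target_type=TargetTypes.SUBSCRIPTION).result.data_sets
# Guard against an empty list so the message below is reachable instead of an IndexError.
if feedback_data_sets:
    feedback_dataset_id = feedback_data_sets[0].metadata.id
if feedback_dataset_id is None:
    print("Feedback data set not found. Please check quality monitor status.")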

In [139]:
with open('additional_feedback_data_v2.json') as feedback_file:
    additional_feedback_data = json.load(feedback_file)

In [140]:
wos_client.data_sets.store_records(feedback_dataset_id, request_body=additional_feedback_data, background_mode=False)




 Waiting for end of storing records with request id: d8cf5821-5a2e-410d-ab85-c2a01009dda2 




active

---------------------------------------
 Successfully finished storing records 
---------------------------------------




<ibm_cloud_sdk_core.detailed_response.DetailedResponse at 0x7f5494d546a0>

In [141]:
wos_client.data_sets.get_records_count(data_set_id=feedback_dataset_id)

98

In [142]:
run_details = wos_client.monitor_instances.run(monitor_instance_id=quality_monitor_instance_id, background_mode=False).result




 Waiting for end of monitoring run 8f27c6c2-3395-4962-afe4-788540c0b9cf 




running
finished

---------------------------
 Successfully finished run 
---------------------------




In [143]:
wos_client.monitor_instances.show_metrics(monitor_instance_id=quality_monitor_instance_id)

0,1,2,3,4,5,6,7,8,9,10,11
2020-08-18 17:56:30.130000+00:00,true_positive_rate,416722f4-3020-420e-a366-727d47253c55,0.3939393939393939,0.8,,['model_type:original'],quality,eb9905ab-f789-419d-b4c1-23d9eaf1b926,8f27c6c2-3395-4962-afe4-788540c0b9cf,subscription,5a7b5f9b-a5dd-4d20-97f4-bda8b9c55d4e
2020-08-18 17:56:30.130000+00:00,area_under_roc,416722f4-3020-420e-a366-727d47253c55,0.6662004662004662,0.8,,['model_type:original'],quality,eb9905ab-f789-419d-b4c1-23d9eaf1b926,8f27c6c2-3395-4962-afe4-788540c0b9cf,subscription,5a7b5f9b-a5dd-4d20-97f4-bda8b9c55d4e
2020-08-18 17:56:30.130000+00:00,precision,416722f4-3020-420e-a366-727d47253c55,0.7647058823529411,0.8,,['model_type:original'],quality,eb9905ab-f789-419d-b4c1-23d9eaf1b926,8f27c6c2-3395-4962-afe4-788540c0b9cf,subscription,5a7b5f9b-a5dd-4d20-97f4-bda8b9c55d4e
2020-08-18 17:56:30.130000+00:00,f1_measure,416722f4-3020-420e-a366-727d47253c55,0.5199999999999999,0.8,,['model_type:original'],quality,eb9905ab-f789-419d-b4c1-23d9eaf1b926,8f27c6c2-3395-4962-afe4-788540c0b9cf,subscription,5a7b5f9b-a5dd-4d20-97f4-bda8b9c55d4e
2020-08-18 17:56:30.130000+00:00,accuracy,416722f4-3020-420e-a366-727d47253c55,0.7551020408163265,0.8,,['model_type:original'],quality,eb9905ab-f789-419d-b4c1-23d9eaf1b926,8f27c6c2-3395-4962-afe4-788540c0b9cf,subscription,5a7b5f9b-a5dd-4d20-97f4-bda8b9c55d4e
2020-08-18 17:56:30.130000+00:00,log_loss,416722f4-3020-420e-a366-727d47253c55,0.4436118523630204,,0.8,['model_type:original'],quality,eb9905ab-f789-419d-b4c1-23d9eaf1b926,8f27c6c2-3395-4962-afe4-788540c0b9cf,subscription,5a7b5f9b-a5dd-4d20-97f4-bda8b9c55d4e
2020-08-18 17:56:30.130000+00:00,false_positive_rate,416722f4-3020-420e-a366-727d47253c55,0.0615384615384615,,0.8,['model_type:original'],quality,eb9905ab-f789-419d-b4c1-23d9eaf1b926,8f27c6c2-3395-4962-afe4-788540c0b9cf,subscription,5a7b5f9b-a5dd-4d20-97f4-bda8b9c55d4e
2020-08-18 17:56:30.130000+00:00,area_under_pr,416722f4-3020-420e-a366-727d47253c55,0.6350176434210048,0.8,,['model_type:original'],quality,eb9905ab-f789-419d-b4c1-23d9eaf1b926,8f27c6c2-3395-4962-afe4-788540c0b9cf,subscription,5a7b5f9b-a5dd-4d20-97f4-bda8b9c55d4e
2020-08-18 17:56:30.130000+00:00,recall,416722f4-3020-420e-a366-727d47253c55,0.3939393939393939,0.8,,['model_type:original'],quality,eb9905ab-f789-419d-b4c1-23d9eaf1b926,8f27c6c2-3395-4962-afe4-788540c0b9cf,subscription,5a7b5f9b-a5dd-4d20-97f4-bda8b9c55d4e
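
The classification metrics above all derive from one confusion matrix over the 98 feedback records. As a worked sketch, the counts below (13 true positives, 20 false negatives, 4 false positives, 61 true negatives) are an assumption: OpenScale does not report them here, but they total 98 and reproduce the reported values when the standard formulas are applied.

```python
# Assumed confusion-matrix counts (not reported by OpenScale); they sum to 98.
tp, fn, fp, tn = 13, 20, 4, 61

quality = {
    "accuracy": (tp + tn) / (tp + fn + fp + tn),
    "precision": tp / (tp + fp),
    "recall": tp / (tp + fn),  # identical to true_positive_rate
    "false_positive_rate": fp / (fp + tn),
    "f1_measure": 2 * tp / (2 * tp + fp + fn),
}

for name, value in quality.items():
    print("{:<20} {:.4f}".format(name, value))
```

Under these assumed counts the model finds only 13 of the 33 positive records, which is why recall sits well below the 0.8 limit even though accuracy is about 0.76.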


# Fairness, drift monitoring and explanations <a name="fairness"></a>

### Fairness configuration

The code below configures fairness monitoring for our model. It turns on monitoring for two features, Sex and Age. In each case, we must specify:

  * Which model feature to monitor
  * One or more **majority** groups, which are values of that feature that we expect to receive a higher percentage of favorable outcomes
  * One or more **minority** groups, which are values of that feature that we expect to receive a higher percentage of unfavorable outcomes
  * The threshold below which OpenScale should display an alert for the fairness measurement (in this case, 95%)

Additionally, we must specify which outcomes from the model are favourable outcomes, and which are unfavourable. We must also provide the number of records OpenScale will use to calculate the fairness score, through `min_records`. In this case, OpenScale's fairness monitor will run hourly, but will not calculate a new fairness rating until at least that many records have been added (the cell below sets `min_records` to 4). Finally, to calculate fairness, OpenScale must perform some calculations on the training data; the cell below does not pass a dataframe, and the training data is instead described by the `training_data_reference` supplied when the subscription was created.
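
To make the threshold concrete, the sketch below computes a disparate impact ratio: the favourable-outcome rate of the minority group divided by that of the majority group. This is a common way to define such a score, and treating it as equivalent to OpenScale's fairness value is an assumption here. The helper names and the scored records are hypothetical.

```python
def favourable_rate(records, group_values, favourable="No Risk"):
    """Share of records in the given group that received the favourable outcome."""
    group = [r for r in records if r["Sex"] in group_values]
    if not group:
        raise ValueError("no records for group {}".format(group_values))
    return sum(r["prediction"] == favourable for r in group) / len(group)


def disparate_impact(records, majority, minority):
    """Minority favourable rate divided by majority favourable rate."""
    return favourable_rate(records, minority) / favourable_rate(records, majority)


# Hypothetical scored records: 8 of 10 males and 7 of 10 females get "No Risk".
scored_records = (
    [{"Sex": "male", "prediction": "No Risk"}] * 8
    + [{"Sex": "male", "prediction": "Risk"}] * 2
    + [{"Sex": "female", "prediction": "No Risk"}] * 7
    + [{"Sex": "female", "prediction": "Risk"}] * 3
)

ratio = disparate_impact(scored_records, majority=["male"], minority=["female"])
print("fairness ratio: {:.3f}, alert: {}".format(ratio, ratio < 0.95))
```

With these made-up records the ratio is roughly 0.875, which is below the 0.95 threshold and would therefore count as an alert under this definition.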

In [144]:
wos_client.monitor_instances.show()

0,1,2,3,4,5,6
5a0b9076-fcf6-49e8-a824-9e3a6b4c2a56,active,5a7b5f9b-a5dd-4d20-97f4-bda8b9c55d4e,subscription,quality,2020-08-18 17:56:07.346000+00:00,eb9905ab-f789-419d-b4c1-23d9eaf1b926
5a0b9076-fcf6-49e8-a824-9e3a6b4c2a56,active,9afc14f1-2014-4fe2-95a9-fbb57f518e93,subscription,fairness,2020-08-10 20:45:13.815000+00:00,7b0e3d9a-1d77-4430-b824-cacca7a65216
5a0b9076-fcf6-49e8-a824-9e3a6b4c2a56,active,9afc14f1-2014-4fe2-95a9-fbb57f518e93,subscription,performance,2020-08-10 20:44:55.164000+00:00,920c5f4a-1bff-49aa-9a75-b07cdc946b9d


In [145]:
target = Target(
    target_type=TargetTypes.SUBSCRIPTION,
    target_id=subscription_id

)
parameters = {
    "features": [
        {"feature": "Sex",
         "majority": ['male'],
         "minority": ['female'],
         "threshold": 0.95
         },
        {"feature": "Age",
         "majority": [[26, 75]],
         "minority": [[18, 25]],
         "threshold": 0.95
         }
    ],
    "favourable_class": ["No Risk"],
    "unfavourable_class": ["Risk"],
    "min_records": 4
}

fairness_monitor_details = wos_client.monitor_instances.create(
    data_mart_id=data_mart_id,
    background_mode=False,
    monitor_definition_id=wos_client.monitor_definitions.MONITORS.FAIRNESS.ID,
    target=target,
    parameters=parameters).result
fairness_monitor_instance_id = fairness_monitor_details.metadata.id
fairness_monitor_instance_id




 Waiting for end of monitor instance creation 5633bdb9-e053-45bd-a6c9-6710a7e18802 




active

---------------------------------------
 Monitor instance successfully created 
---------------------------------------




'5633bdb9-e053-45bd-a6c9-6710a7e18802'

### Drift configuration

In [146]:
monitor_instances = wos_client.monitor_instances.list().result.monitor_instances
for monitor_instance in monitor_instances:
    monitor_def_id=monitor_instance.entity.monitor_definition_id
    if monitor_def_id == "drift" and monitor_instance.entity.target.target_id == subscription_id:
        wos_client.monitor_instances.delete(monitor_instance.metadata.id)
        print('Deleted existing drift monitor instance with id: ', monitor_instance.metadata.id)


target = Target(
    target_type=TargetTypes.SUBSCRIPTION,
    target_id=subscription_id

)
parameters = {
    "min_samples": 20,
    "train_drift_model": True,
    "enable_model_drift": False,
    "enable_data_drift": True
}

drift_monitor_details = wos_client.monitor_instances.create(
    data_mart_id=data_mart_id,
    background_mode=False,
    monitor_definition_id=wos_client.monitor_definitions.MONITORS.DRIFT.ID,
    target=target,
    parameters=parameters
).result

drift_monitor_instance_id = drift_monitor_details.metadata.id
drift_monitor_instance_id




 Waiting for end of monitor instance creation 7e3823b6-812d-49ab-b67c-76b686bbb95c 




preparing
active

---------------------------------------
 Monitor instance successfully created 
---------------------------------------




'7e3823b6-812d-49ab-b67c-76b686bbb95c'

## Score the model again now that monitoring is configured

This next section randomly selects 200 records from the data feed and sends those records to the model for predictions. This is enough to exceed the minimum threshold for records set in the previous section, which allows OpenScale to begin calculating fairness.
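
The sampling step can be sketched on its own. The version below draws with replacement, which is also what repeated `random.choice` calls do, and uses a seeded generator so that the draw is repeatable. The seed, the helper name `draw_records`, and the stand-in rows are assumptions for illustration.

```python
import random


def draw_records(rows, k, seed=None):
    """Draw k rows with replacement, like calling random.choice k times."""
    rng = random.Random(seed)
    return rng.choices(rows, k=k)


# Stand-in rows; the notebook draws from scoring_data['values'] instead.
stand_in_rows = [["no_checking", 13], ["0_to_200", 26], ["no_checking", 24]]

drawn = draw_records(stand_in_rows, k=200, seed=42)
print(len(drawn))
```

Because the draw is with replacement, the same record can be scored more than once; `random.sample` would be the alternative if each record should appear at most once.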

In [53]:
with io.capture_output() as captured:
    !wget https://raw.githubusercontent.com/pmservice/ai-openscale-tutorials/master/assets/historical_data/german_credit_risk/wml/german_credit_feed.json -O german_credit_feed.json
!ls -lh german_credit_feed.json

-rw-r----- 1 dsxuser dsxuser 3.0M Aug 19 17:03 german_credit_feed.json


Score 200 randomly chosen records

In [54]:
import random

with open('german_credit_feed.json', 'r') as scoring_file:
    scoring_data = json.load(scoring_file)

fields = scoring_data['fields']
values = []
for _ in range(200):
    values.append(random.choice(scoring_data['values']))
payload_scoring = {"fields": fields, "values": values}

scoring_response = wml_client.deployments.score(credit_risk_scoring_endpoint, payload_scoring)
time.sleep(5)

**Note:** The payload table should now contain 208 records in total.

In [56]:
print('Number of records in payload table: ', wos_client.data_sets.get_records_count(data_set_id=payload_data_set_id))

Number of records in payload table:  0


## Run fairness monitor

Kick off a fairness monitor run on current data. The monitor runs hourly, but can be manually initiated using the Python client, the REST API, or the graphical user interface.

In [150]:
run_details = wos_client.monitor_instances.run(monitor_instance_id=fairness_monitor_instance_id, background_mode=False)




 Waiting for end of monitoring run cb6cb1ff-7a34-49b3-851c-072acd703cdb 




running.
finished

---------------------------
 Successfully finished run 
---------------------------




In [151]:
time.sleep(10)

wos_client.monitor_instances.show_metrics(monitor_instance_id=fairness_monitor_instance_id)

0,1,2,3,4,5,6,7,8,9,10,11
2020-08-18 17:58:06.440004+00:00,fairness_value,cea5ca4f-08a5-4155-86f4-5a3f8f3a25d6,100.0,80.0,,"['feature:Age', 'fairness_metric_type:debiased_fairness', 'feature_value:18-25']",fairness,5633bdb9-e053-45bd-a6c9-6710a7e18802,cb6cb1ff-7a34-49b3-851c-072acd703cdb,subscription,5a7b5f9b-a5dd-4d20-97f4-bda8b9c55d4e
2020-08-18 17:58:06.440004+00:00,fairness_value,cea5ca4f-08a5-4155-86f4-5a3f8f3a25d6,100.0,80.0,,"['feature:Sex', 'fairness_metric_type:debiased_fairness', 'feature_value:female']",fairness,5633bdb9-e053-45bd-a6c9-6710a7e18802,cb6cb1ff-7a34-49b3-851c-072acd703cdb,subscription,5a7b5f9b-a5dd-4d20-97f4-bda8b9c55d4e
2020-08-18 17:58:06.440004+00:00,fairness_value,353499c3-e017-4851-9ffa-c134f704d05a,100.0,80.0,,"['feature:Sex', 'fairness_metric_type:fairness', 'feature_value:female']",fairness,5633bdb9-e053-45bd-a6c9-6710a7e18802,cb6cb1ff-7a34-49b3-851c-072acd703cdb,subscription,5a7b5f9b-a5dd-4d20-97f4-bda8b9c55d4e
2020-08-18 17:58:06.440004+00:00,fairness_value,353499c3-e017-4851-9ffa-c134f704d05a,100.0,80.0,,"['feature:Age', 'fairness_metric_type:fairness', 'feature_value:18-25']",fairness,5633bdb9-e053-45bd-a6c9-6710a7e18802,cb6cb1ff-7a34-49b3-851c-072acd703cdb,subscription,5a7b5f9b-a5dd-4d20-97f4-bda8b9c55d4e
2020-08-18 17:57:42.906980+00:00,fairness_value,502948b8-fd3f-468c-9e29-a39531676691,100.0,80.0,,"['feature:Sex', 'fairness_metric_type:debiased_fairness', 'feature_value:female']",fairness,5633bdb9-e053-45bd-a6c9-6710a7e18802,be27017a-1dc3-49d1-a7d0-a61c9c610c07,subscription,5a7b5f9b-a5dd-4d20-97f4-bda8b9c55d4e
2020-08-18 17:57:42.906980+00:00,fairness_value,502948b8-fd3f-468c-9e29-a39531676691,100.0,80.0,,"['feature:Age', 'fairness_metric_type:debiased_fairness', 'feature_value:18-25']",fairness,5633bdb9-e053-45bd-a6c9-6710a7e18802,be27017a-1dc3-49d1-a7d0-a61c9c610c07,subscription,5a7b5f9b-a5dd-4d20-97f4-bda8b9c55d4e
2020-08-18 17:57:42.906980+00:00,fairness_value,fb84a53e-7ea8-469b-9782-47736122b8b9,100.0,80.0,,"['feature:Sex', 'fairness_metric_type:fairness', 'feature_value:female']",fairness,5633bdb9-e053-45bd-a6c9-6710a7e18802,be27017a-1dc3-49d1-a7d0-a61c9c610c07,subscription,5a7b5f9b-a5dd-4d20-97f4-bda8b9c55d4e
2020-08-18 17:57:42.906980+00:00,fairness_value,fb84a53e-7ea8-469b-9782-47736122b8b9,100.0,80.0,,"['feature:Age', 'fairness_metric_type:fairness', 'feature_value:18-25']",fairness,5633bdb9-e053-45bd-a6c9-6710a7e18802,be27017a-1dc3-49d1-a7d0-a61c9c610c07,subscription,5a7b5f9b-a5dd-4d20-97f4-bda8b9c55d4e


## Run drift monitor


Kick off a drift monitor run on current data. The monitor runs every hour, but can be manually initiated using the Python client or the REST API.

In [152]:
drift_run_details = wos_client.monitor_instances.run(monitor_instance_id=drift_monitor_instance_id, background_mode=False)




 Waiting for end of monitoring run 859cf518-3627-45ee-b92a-69e9584bd03b 




finished

---------------------------
 Successfully finished run 
---------------------------




In [153]:
time.sleep(5)

wos_client.monitor_instances.show_metrics(monitor_instance_id=drift_monitor_instance_id)

0,1,2,3,4,5,6,7,8,9,10,11
2020-08-18 17:58:32.320160+00:00,data_drift_magnitude,738db6d2-7a8f-4d8f-8e8d-aeb8cc3c1ff4,0.0865384615384615,,,[],drift,7e3823b6-812d-49ab-b67c-76b686bbb95c,859cf518-3627-45ee-b92a-69e9584bd03b,subscription,5a7b5f9b-a5dd-4d20-97f4-bda8b9c55d4e


## Configure Explainability

Finally, we provide OpenScale with the training data to enable and configure the explainability features.

In [103]:
target = Target(
    target_type=TargetTypes.SUBSCRIPTION,
    target_id=subscription_id
)
parameters = {
    "enabled": True
}
explainability_details = wos_client.monitor_instances.create(
    data_mart_id=data_mart_id,
    background_mode=False,
    monitor_definition_id=wos_client.monitor_definitions.MONITORS.EXPLAINABILITY.ID,
    target=target,
    parameters=parameters
).result

explainability_monitor_id = explainability_details.metadata.id




 Waiting for end of monitor instance creation e1c85625-a73a-4240-a99f-6f23c5bb8a7d 




active

---------------------------------------
 Monitor instance successfully created 
---------------------------------------




## Run explanation for sample record

In [104]:
pl_records_resp = wos_client.data_sets.get_list_of_records(data_set_id=payload_data_set_id, limit=1, offset=0).result
scoring_ids = [pl_records_resp["records"][0]["entity"]["values"]["scoring_id"]]
print("Running explanations on scoring IDs: {}".format(scoring_ids))
explanation_types = ["lime", "contrastive"]
result = wos_client.monitor_instances.explanation_tasks(scoring_ids=scoring_ids, explanation_types=explanation_types).result
print(result)

Running explanations on scoring IDs: ['859bda8165ca31b4d5e9cfdb8e7e4eba-1']
{
  "metadata": {
    "explanation_task_ids": [
      "69608524-d497-4cde-895a-bf820f923f81"
    ],
    "created_by": "IBMid-310002F0G1",
    "created_at": "2020-08-19T17:08:35.594417Z"
  }
}


# Custom monitors and metrics <a name="custom"></a>

## Register custom monitor

In [85]:
def get_definition(monitor_name):
    monitor_definitions = wos_client.monitor_definitions.list().result.monitor_definitions
    
    for definition in monitor_definitions:
        if monitor_name == definition.entity.name:
            return definition
    
    return None

In [86]:
monitor_name = 'my model performance'
metrics = [MonitorMetricRequest(name='sensitivity',
                                thresholds=[MetricThreshold(type=MetricThresholdTypes.LOWER_LIMIT, default=0.8)]),
          MonitorMetricRequest(name='specificity',
                                thresholds=[MetricThreshold(type=MetricThresholdTypes.LOWER_LIMIT, default=0.75)])]
tags = [MonitorTagRequest(name='region', description='customer geographical region')]

existing_definition = get_definition(monitor_name)

if existing_definition is None:
    custom_monitor_details = wos_client.monitor_definitions.add(name=monitor_name, metrics=metrics, tags=tags, background_mode=False).result
else:
    custom_monitor_details = existing_definition
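
The two metrics registered above are standard confusion-matrix quantities: sensitivity is the true positive rate and specificity is the true negative rate. The following is a minimal sketch of how they could be computed before being published to the custom monitor. Treating `"Risk"` as the positive class, the helper name, and the sample labels are all assumptions.

```python
def sensitivity_specificity(actual, predicted, positive="Risk"):
    """Return (sensitivity, specificity) for paired actual/predicted labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    tn = sum(a != positive and p != positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical labels, for illustration only.
actual_labels = ["Risk", "Risk", "Risk", "No Risk", "No Risk", "No Risk", "No Risk", "Risk"]
predicted_labels = ["Risk", "No Risk", "Risk", "No Risk", "Risk", "No Risk", "No Risk", "Risk"]

sens, spec = sensitivity_specificity(actual_labels, predicted_labels)
print({"sensitivity": sens, "specificity": spec})
```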

### Show available monitor types

In [146]:
wos_client.monitor_definitions.show()

0,1,2
my_model_performance,my model performance,"['sensitivity', 'specificity']"
assurance,Assurance,"['Uncertainty', 'Confidence']"
fairness,Fairness,"['Fairness value', 'Average Odds Difference metric value', 'False Discovery Rate Difference metric value', 'Error Rate Difference metric value', 'False Negative Rate Difference metric value', 'False Omission Rate Difference metric value', 'False Positive Rate Difference metric value', 'True Positive Rate Difference metric value']"
performance,Performance,['Number of records']
explainability,Explainability,[]
mrm,Model risk management monitoring,"['Tests run', 'Tests passed', 'Tests failed', 'Tests skipped', 'Fairness score', 'Quality score', 'Drift score']"
correlations,Correlations,"['Maximum positive correlation coefficient', 'Maximum negative correlation coefficient', 'Mean absolute correlation coefficient', 'Significant correlation coefficients count']"
drift,Drift,"['Drop in accuracy', 'Predicted accuracy', 'Drop in data consistency']"
quality,Quality,"['Area under ROC', 'Area under PR', 'Proportion explained variance', 'Mean absolute error', 'Mean squared error', 'R squared', 'Root of mean squared error', 'Accuracy', 'Weighted True Positive Rate (wTPR)', 'True positive rate (TPR)', 'Weighted False Positive Rate (wFPR)', 'False positive rate (FPR)', 'Weighted recall', 'Recall', 'Weighted precision', 'Precision', 'Weighted F1-Measure', 'F1-Measure', 'Logarithmic loss']"


### Get monitor IDs and details

In [147]:
custom_monitor_id = custom_monitor_details.metadata.id

print(custom_monitor_id)

my_model_performance


In [148]:
custom_monitor_details = wos_client.monitor_definitions.get(monitor_definition_id=custom_monitor_id).result
print('Monitor definition details:', custom_monitor_details)

Monitor definition details: {
  "metadata": {
    "id": "my_model_performance",
    "crn": "crn:v1:bluemix:public:aiopenscale:us-south:a/e0b56432b1f1bd804706dc29b8a89ca1:5a0b9076-fcf6-49e8-a824-9e3a6b4c2a56:monitor_definition:my_model_performance",
    "url": "/v2/monitor_definitions/my_model_performance",
    "created_at": "2020-08-10T22:12:32.125000Z",
    "created_by": "IBMid-310002F0G1"
  },
  "entity": {
    "name": "my model performance",
    "metrics": [
      {
        "name": "sensitivity",
        "thresholds": [
          {
            "type": "lower_limit",
            "default": 0.8
          }
        ],
        "expected_direction": "increasing",
        "id": "sensitivity"
      },
      {
        "name": "specificity",
        "thresholds": [
          {
            "type": "lower_limit",
            "default": 0.75
          }
        ],
        "expected_direction": "increasing",
        "id": "specificity"
      }
    ],
    "tags": [
      {
        "name": "region

## Enable custom monitor for subscription

In [149]:
target = Target(
        target_type=TargetTypes.SUBSCRIPTION,
        target_id=subscription_id
    )

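# Example threshold override for the sensitivity metric. Note that it is not passed to
# create() below, so the monitor instance keeps the definition's default thresholds.
thresholds = [MetricThresholdOverride(metric_id='sensitivity', type=MetricThresholdTypes.LOWER_LIMIT, value=0.9)]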

custom_monitor_instance_details = wos_client.monitor_instances.create(
            data_mart_id=data_mart_id,
            background_mode=False,
            monitor_definition_id=custom_monitor_id,
            target=target
).result




 Waiting for end of monitor instance creation 076ccea1-9ef0-4e8d-8a25-af219abc2818 




active

---------------------------------------
 Monitor instance successfully created 
---------------------------------------




### Get monitor instance id and configuration details

In [150]:
custom_monitor_instance_id = custom_monitor_instance_details.metadata.id

In [151]:
custom_monitor_instance_details = wos_client.monitor_instances.get(custom_monitor_instance_id).result
print(custom_monitor_instance_details)

{
  "metadata": {
    "id": "076ccea1-9ef0-4e8d-8a25-af219abc2818",
    "crn": "crn:v1:bluemix:public:aiopenscale:us-south:a/e0b56432b1f1bd804706dc29b8a89ca1:5a0b9076-fcf6-49e8-a824-9e3a6b4c2a56:monitor_instance:076ccea1-9ef0-4e8d-8a25-af219abc2818",
    "url": "/v2/monitor_instances/076ccea1-9ef0-4e8d-8a25-af219abc2818",
    "created_at": "2020-08-10T22:15:30.716000Z",
    "created_by": "IBMid-310002F0G1"
  },
  "entity": {
    "data_mart_id": "5a0b9076-fcf6-49e8-a824-9e3a6b4c2a56",
    "monitor_definition_id": "my_model_performance",
    "target": {
      "target_type": "subscription",
      "target_id": "b65cc203-e95e-4c7e-a6c3-fd8f98f470ee"
    },
    "thresholds": [
      {
        "metric_id": "sensitivity",
        "type": "lower_limit",
        "value": 0.8
      },
      {
        "metric_id": "specificity",
        "type": "lower_limit",
        "value": 0.75
      }
    ],
    "schedule": {
      "repeat_interval": 60,
      "repeat_unit": "minute",
      "repeat_type": "min

## Storing custom metrics

In [152]:
from datetime import datetime, timezone, timedelta
from ibm_watson_openscale.base_classes.watson_open_scale_v2 import MonitorMeasurementRequest
custom_monitoring_run_id = "11122223333111abc"
measurement_request = [MonitorMeasurementRequest(timestamp=datetime.now(timezone.utc), 
                                                 metrics=[{"specificity": 0.78, "sensitivity": 0.67, "region": "us-south"}], run_id=custom_monitoring_run_id)]
print(measurement_request[0])

{
  "timestamp": "2020-08-10T22:18:11.880845Z",
  "run_id": "11122223333111abc",
  "metrics": [
    {
      "specificity": 0.78,
      "sensitivity": 0.67,
      "region": "us-south"
    }
  ]
}


In [153]:
published_measurement_response = wos_client.monitor_instances.measurements.add(
    monitor_instance_id=custom_monitor_instance_id,
    monitor_measurement_request=measurement_request).result
published_measurement_id = published_measurement_response[0]["measurement_id"]
print(published_measurement_response)

[{'measurement_id': 'aedd796a-0bf2-41fc-8302-f33d89e0e6bf', 'metrics': [{'region': 'us-south', 'sensitivity': 0.67, 'specificity': 0.78}], 'run_id': '11122223333111abc', 'timestamp': '2020-08-10T22:18:11.880845Z'}]
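
Both published metrics use `lower_limit` thresholds, so a value counts as a violation when it falls below its limit. The sketch below applies that rule to the values just stored, using the default limits from the monitor definition (0.8 and 0.75). The helper `count_violations` is hypothetical and only illustrates the comparison, not how OpenScale evaluates thresholds internally.

```python
def count_violations(metric_values, lower_limits):
    """Return the metric ids whose value is below the configured lower limit."""
    return [
        metric_id
        for metric_id, limit in lower_limits.items()
        if metric_id in metric_values and metric_values[metric_id] < limit
    ]


stored_values = {"specificity": 0.78, "sensitivity": 0.67}
definition_limits = {"sensitivity": 0.8, "specificity": 0.75}

violations = count_violations(stored_values, definition_limits)
print(violations)
```

Only sensitivity falls below its limit, which is consistent with the `issue_count` of 1 in the measurement retrieved in the next section.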


### List and get custom metrics

In [154]:
time.sleep(5)
published_measurement = wos_client.monitor_instances.measurements.get(monitor_instance_id=custom_monitor_instance_id, measurement_id=published_measurement_id).result
print(published_measurement)

{
  "metadata": {
    "id": "aedd796a-0bf2-41fc-8302-f33d89e0e6bf",
    "crn": "crn:v1:bluemix:public:aiopenscale:us-south:a/e0b56432b1f1bd804706dc29b8a89ca1:5a0b9076-fcf6-49e8-a824-9e3a6b4c2a56:measurement:aedd796a-0bf2-41fc-8302-f33d89e0e6bf",
    "url": "/v2/monitor_instances/076ccea1-9ef0-4e8d-8a25-af219abc2818/measurements/aedd796a-0bf2-41fc-8302-f33d89e0e6bf",
    "created_at": "2020-08-10T22:18:48.397000Z",
    "created_by": "IBMid-310002F0G1"
  },
  "entity": {
    "timestamp": "2020-08-10T22:18:11.880845Z",
    "run_id": "11122223333111abc",
    "values": [
      {
        "metrics": [
          {
            "id": "sensitivity",
            "value": 0.67,
            "lower_limit": 0.8
          },
          {
            "id": "specificity",
            "value": 0.78,
            "lower_limit": 0.75
          }
        ],
        "tags": [
          {
            "id": "region",
            "value": "us-south"
          }
        ]
      }
    ],
    "issue_count": 1,
    "t

# Historical data <a name="historical"></a>

In [95]:
historyDays = 7

## Insert historical fairness metrics

In [91]:
!rm history_fairness_v2.json
with io.capture_output() as captured:
    !wget https://raw.githubusercontent.com/pmservice/ai-openscale-tutorials/master/assets/historical_data/german_credit_risk/wml/history_fairness_v2.json
!ls -lh history_fairness_v2.json

rm: cannot remove ‘history_fairness_v2.json’: No such file or directory
-rw-r----- 1 dsxuser dsxuser 37K Aug 17 19:51 history_fairness_v2.json


In [94]:
from datetime import datetime, timedelta, timezone

with open('history_fairness_v2.json', 'r') as history_file:
    payloads = json.load(history_file)

for day in range(historyDays):
    print('Loading day', day + 1)
    daily_measurement_requests = []
    
    for hour in range(24):
        score_time = datetime.now(timezone.utc) + timedelta(hours=(-(24*day + hour + 1)))
        index = (day * 24 + hour) % len(payloads) # wrap around and reuse values if needed
 
        measurement_request = MonitorMeasurementRequest(
            timestamp=score_time,
            metrics=[payloads[index][0], payloads[index][1]])
        daily_measurement_requests.append(measurement_request)

    response = wos_client.monitor_instances.measurements.add(
        monitor_instance_id=fairness_monitor_instance_id,
        monitor_measurement_request=daily_measurement_requests).result
print('Finished')

[[{'fairness_metric_type': 'fairness',
   'feature': 'Sex',
   'fairness_value': 0.947,
   'feature_value': 'minority'},
  {'fairness_metric_type': 'fairness',
   'feature': 'Age',
   'fairness_value': 0.954,
   'feature_value': 'minority'}],
 [{'fairness_metric_type': 'fairness',
   'feature': 'Sex',
   'fairness_value': 0.947,
   'feature_value': 'minority'},
  {'fairness_metric_type': 'fairness',
   'feature': 'Age',
   'fairness_value': 0.954,
   'feature_value': 'minority'}],
 [{'fairness_metric_type': 'fairness',
   'feature': 'Sex',
   'fairness_value': 0.923,
   'feature_value': 'minority'},
  {'fairness_metric_type': 'fairness',
   'feature': 'Age',
   'fairness_value': 1.029,
   'feature_value': 'minority'}],
 [{'fairness_metric_type': 'fairness',
   'feature': 'Sex',
   'fairness_value': 0.949,
   'feature_value': 'minority'},
  {'fairness_metric_type': 'fairness',
   'feature': 'Age',
   'fairness_value': 0.992,
   'feature_value': 'minority'}],
 [{'fairness_metric_type': '

## Insert historical debias metrics

In [185]:
!rm history_debias_v2.json
with io.capture_output() as captured:
    !wget https://raw.githubusercontent.com/pmservice/ai-openscale-tutorials/master/assets/historical_data/german_credit_risk/wml/history_debias_v2.json
!ls -lh history_debias_v2.json

rm: cannot remove ‘history_debias_v2.json’: No such file or directory
-rw-r----- 1 dsxuser dsxuser 37K Aug 10 22:52 history_debias_v2.json


In [186]:
with open('history_debias_v2.json', 'r') as history_file:
    payloads = json.load(history_file)

for day in range(historyDays):
    print('Loading day', day + 1)
    daily_measurement_requests = []
    for hour in range(24):
        score_time = datetime.now(timezone.utc) + timedelta(hours=(-(24*day + hour + 1)))
        index = (day * 24 + hour) % len(payloads) # wrap around and reuse values if needed

        measurement_request = MonitorMeasurementRequest(
            timestamp=score_time,
            metrics=[payloads[index][0], payloads[index][1]])
        daily_measurement_requests.append(measurement_request)

    # Debias metrics are published to the fairness monitor instance.
    response = wos_client.monitor_instances.measurements.add(
        monitor_instance_id=fairness_monitor_instance_id,
        monitor_measurement_request=daily_measurement_requests).result

print('Finished')

Loading day 1
Loading day 2
Loading day 3
Loading day 4
Loading day 5
Loading day 6
Loading day 7
Finished
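
The fairness and debias cells share the same scheduling logic: one measurement per hour, stepping backwards from the current time, with the payload index wrapping around when the history file holds fewer entries than hours. A minimal sketch of that logic in isolation follows. `hourly_slots` is a hypothetical helper, and the payload count of 100 and the fixed `now` value are illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone


def hourly_slots(history_days, payload_count, now=None):
    """Yield (timestamp, payload_index) for each hour, newest first."""
    now = now or datetime.now(timezone.utc)
    for day in range(history_days):
        for hour in range(24):
            hours_back = 24 * day + hour + 1
            # Wrap around and reuse payloads if there are fewer than hours.
            index = (24 * day + hour) % payload_count
            yield now - timedelta(hours=hours_back), index


# Arbitrary fixed instant so the result is reproducible.
fixed_now = datetime(2020, 8, 17, 20, 0, tzinfo=timezone.utc)
slots = list(hourly_slots(7, 100, now=fixed_now))

print(len(slots))     # 168 hourly slots for 7 days
print(slots[-1][1])   # 67, because 167 % 100 == 67
```

The newest slot is one hour before `now` and the oldest is 168 hours before it, so the seven days are covered without gaps.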


## Insert historical quality metrics

In [96]:
measurements = [0.76, 0.78, 0.68, 0.72, 0.73, 0.77, 0.80]
for day in range(historyDays):
    quality_measurement_requests = []
    print('Loading day', day + 1)
    for hour in range(24):
        score_time = datetime.utcnow() + timedelta(hours=(-(24*day + hour + 1)))
        score_time = score_time.isoformat() + "Z"
        
        metric = {"area_under_roc": measurements[day]}
                
        measurement_request = MonitorMeasurementRequest(timestamp=score_time, metrics=[metric])
        quality_measurement_requests.append(measurement_request)

    response = wos_client.monitor_instances.measurements.add(
        monitor_instance_id=quality_monitor_instance_id,
        monitor_measurement_request=quality_measurement_requests).result
    
print('Finished')

Loading day 1
Loading day 2
Loading day 3
Loading day 4
Loading day 5
Loading day 6
Loading day 7
Finished
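
The quality cell builds its timestamps with `datetime.utcnow().isoformat() + "Z"`. That works on the Python 3.6 runtime used here, but `datetime.utcnow()` is deprecated as of Python 3.12. The sketch below produces the same `Z`-suffixed string from a timezone-aware value. `hours_ago_iso` is a hypothetical helper name, and the fixed `now` is only there to make the output reproducible.

```python
from datetime import datetime, timedelta, timezone


def hours_ago_iso(hours, now=None):
    """Return an ISO 8601 UTC timestamp ending in 'Z', `hours` before `now`."""
    now = now or datetime.now(timezone.utc)
    stamp = (now - timedelta(hours=hours)).astimezone(timezone.utc)
    # Drop tzinfo before formatting so isoformat() emits no '+00:00' offset.
    return stamp.replace(tzinfo=None).isoformat() + "Z"


fixed_now = datetime(2020, 8, 10, 22, 18, 11, 880845, tzinfo=timezone.utc)
print(hours_ago_iso(1, now=fixed_now))  # 2020-08-10T21:18:11.880845Z
```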


## Insert historical confusion matrices

In [188]:
!rm history_quality_metrics.json
with io.capture_output() as captured:
    !wget https://raw.githubusercontent.com/pmservice/ai-openscale-tutorials/master/assets/historical_data/german_credit_risk/wml/history_quality_metrics.json
!ls -lh history_quality_metrics.json

rm: cannot remove ‘history_quality_metrics.json’: No such file or directory
-rw-r----- 1 dsxuser dsxuser 79K Aug 10 22:52 history_quality_metrics.json


In [98]:
from ibm_watson_openscale.base_classes.watson_open_scale_v2 import Source

with open('history_quality_metrics.json') as json_file:
    records = json.load(json_file)
    
for day in range(historyDays):
    index = 0
    cm_measurement_requests = []
    print('Loading day', day + 1)
    
    for hour in range(24):
        score_time = datetime.utcnow() + timedelta(hours=(-(24*day + hour + 1)))
        score_time = score_time.isoformat() + "Z"

        metric = records[index]['metrics']
        source = records[index]['sources']

        
        measurement_request = {"timestamp": score_time, "metrics": [metric], "sources": [source]}
        cm_measurement_requests.append(measurement_request)

        index+=1

    response = wos_client.monitor_instances.measurements.add(monitor_instance_id=quality_monitor_instance_id, monitor_measurement_request=cm_measurement_requests).result    

print('Finished')

Loading day 1
Loading day 2
Loading day 3
Loading day 4
Loading day 5
Loading day 6
Loading day 7
Finished
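
The cell above resets `index` to 0 at the start of each day, so every day replays the first 24 entries of `history_quality_metrics.json`. The sketch below builds the same request dictionaries but takes the index modulo the record count, which would avoid an `IndexError` if the file held fewer than 24 entries. `build_requests` is a hypothetical helper, and the sample records and source ids are invented stand-ins for the real file contents.

```python
def build_requests(records, timestamps):
    """Pair each timestamp with a record, cycling if there are fewer records."""
    batch = []
    for position, stamp in enumerate(timestamps):
        record = records[position % len(records)]
        batch.append({
            "timestamp": stamp,
            "metrics": [record["metrics"]],
            "sources": [record["sources"]],
        })
    return batch


# Hypothetical stand-ins for entries of history_quality_metrics.json.
sample_records = [
    {"metrics": {"area_under_roc": 0.76}, "sources": {"id": "confusion_matrix_1"}},
    {"metrics": {"area_under_roc": 0.78}, "sources": {"id": "confusion_matrix_2"}},
]
sample_stamps = ["2020-08-10T21:00:00Z", "2020-08-10T20:00:00Z", "2020-08-10T19:00:00Z"]

batch = build_requests(sample_records, sample_stamps)
print(len(batch))            # 3
print(batch[2]["metrics"])   # [{'area_under_roc': 0.76}] (wrapped back to the first record)
```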


## Insert historical performance metrics

In [99]:
target = Target(
        target_type=TargetTypes.INSTANCE,
        target_id=payload_data_set_id
    )


performance_monitor_instance_details = wos_client.monitor_instances.create(
            data_mart_id=data_mart_id,
            background_mode=False,
            monitor_definition_id=wos_client.monitor_definitions.MONITORS.PERFORMANCE.ID,
            target=target
).result
performance_monitor_instance_id = performance_monitor_instance_details.metadata.id

Waiting for end of monitor instance creation 3e94ae81-eef7-48ac-ae2d-26a0d1e5f5c9

active

---------------------------------------
 Monitor instance successfully created 
---------------------------------------


In [100]:
for day in range(historyDays):
    performance_measurement_requests = []
    print('Loading day', day + 1)
    for hour in range(24):
        score_time = datetime.utcnow() + timedelta(hours=(-(24*day + hour + 1)))
        score_time = score_time.isoformat() + "Z"
        score_count = random.randint(60, 600)
        
        metric = {"record_count": score_count, "data_set_type": "scoring_payload"}
        
        measurement_request = {"timestamp": score_time, "metrics": [metric]}
        performance_measurement_requests.append(measurement_request)
        
    response = wos_client.monitor_instances.measurements.add(
                                            monitor_instance_id=performance_monitor_instance_id,
                                            monitor_measurement_request=performance_measurement_requests).result    

print('Finished')

Loading day 1
Loading day 2
Loading day 3
Loading day 4
Loading day 5
Loading day 6
Loading day 7
Finished
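
The record counts above come from `random.randint(60, 600)`, so the performance chart changes on every run. If a repeatable history is preferred, a seeded generator can be used, as in this sketch. `hourly_record_counts` is a hypothetical helper and the seed value is arbitrary.

```python
import random


def hourly_record_counts(hours=24, low=60, high=600, seed=None):
    """Return one simulated scoring-record count per hour."""
    rng = random.Random(seed)  # local generator; leaves the global random state alone
    return [rng.randint(low, high) for _ in range(hours)]


counts = hourly_record_counts(seed=42)
print(len(counts))                                  # 24
print(counts == hourly_record_counts(seed=42))      # True: same seed, same counts
```

Each count can then be wrapped as `{"record_count": count, "data_set_type": "scoring_payload"}`, matching the metric dictionary built in the cell above.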


## Insert historical drift measurements

In [101]:
with io.capture_output() as captured:
    !wget https://raw.githubusercontent.com/pmservice/ai-openscale-tutorials/master/assets/historical_data/german_credit_risk/wos/history_drift_measurement_0.json
    !wget https://raw.githubusercontent.com/pmservice/ai-openscale-tutorials/master/assets/historical_data/german_credit_risk/wos/history_drift_measurement_1.json
    !wget https://raw.githubusercontent.com/pmservice/ai-openscale-tutorials/master/assets/historical_data/german_credit_risk/wos/history_drift_measurement_2.json
    !wget https://raw.githubusercontent.com/pmservice/ai-openscale-tutorials/master/assets/historical_data/german_credit_risk/wos/history_drift_measurement_3.json
    !wget https://raw.githubusercontent.com/pmservice/ai-openscale-tutorials/master/assets/historical_data/german_credit_risk/wos/history_drift_measurement_4.json
    !wget https://raw.githubusercontent.com/pmservice/ai-openscale-tutorials/master/assets/historical_data/german_credit_risk/wos/history_drift_measurement_5.json
    !wget https://raw.githubusercontent.com/pmservice/ai-openscale-tutorials/master/assets/historical_data/german_credit_risk/wos/history_drift_measurement_6.json
!ls -lh history_drift_measurement_*.json

-rw-r----- 1 dsxuser dsxuser 832K Aug 17 20:01 history_drift_measurement_0.json
-rw-r----- 1 dsxuser dsxuser 868K Aug 17 20:01 history_drift_measurement_1.json
-rw-r----- 1 dsxuser dsxuser 870K Aug 17 20:01 history_drift_measurement_2.json
-rw-r----- 1 dsxuser dsxuser 910K Aug 17 20:01 history_drift_measurement_3.json
-rw-r----- 1 dsxuser dsxuser 841K Aug 17 20:01 history_drift_measurement_4.json
-rw-r----- 1 dsxuser dsxuser 836K Aug 17 20:01 history_drift_measurement_5.json
-rw-r----- 1 dsxuser dsxuser 840K Aug 17 20:01 history_drift_measurement_6.json


In [102]:
for day in range(historyDays):
    drift_measurements = []

    with open("history_drift_measurement_{}.json".format(day), 'r') as history_file:
        drift_daily_measurements = json.load(history_file)
    print('Loading day', day + 1)

    # The historical data contains 8 entries per day, each covering a 3-hour drift window.
    
    for nb_window, records in enumerate(drift_daily_measurements):
        for record in records:
            window_start =  datetime.utcnow() + timedelta(hours=(-(24 * day + (nb_window+1)*3 + 1))) # first_payload_record_timestamp_in_window (oldest)
            window_end = datetime.utcnow() + timedelta(hours=(-(24 * day + nb_window*3 + 1)))# last_payload_record_timestamp_in_window (most recent)
            #modify start and end time for each record
            record['sources'][0]['data']['start'] = window_start.isoformat() + "Z"
            record['sources'][0]['data']['end'] = window_end.isoformat() + "Z"
            
            
            metric = record['metrics'][0]
            source = record['sources'][0]

            measurement_request = {"timestamp": window_start.isoformat() + "Z", "metrics": [metric], "sources": [source]}
            
            drift_measurements.append(measurement_request)
        
    response = wos_client.monitor_instances.measurements.add(
                                            monitor_instance_id=drift_monitor_instance_id,
                                            monitor_measurement_request=drift_measurements).result    

    
    print("Daily loading finished.")

Loading day 1
Daily loading finished.
Loading day 2
Daily loading finished.
Loading day 3
Daily loading finished.
Loading day 4
Daily loading finished.
Loading day 5
Daily loading finished.
Loading day 6
Daily loading finished.
Loading day 7
Daily loading finished.
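
The window arithmetic in the drift cell can be checked on its own. The sketch below reproduces the `window_start` and `window_end` expressions for one day and confirms that the eight 3-hour windows are contiguous and span 24 hours. `drift_windows` is a hypothetical helper, and the fixed `now` is an arbitrary instant chosen for reproducibility.

```python
from datetime import datetime, timedelta, timezone


def drift_windows(day, now, windows_per_day=8, window_hours=3):
    """Return (start, end) pairs for one day's drift windows, most recent first."""
    windows = []
    for nb_window in range(windows_per_day):
        start = now - timedelta(hours=24 * day + (nb_window + 1) * window_hours + 1)
        end = now - timedelta(hours=24 * day + nb_window * window_hours + 1)
        windows.append((start, end))
    return windows


fixed_now = datetime(2020, 8, 17, 20, 0, tzinfo=timezone.utc)
day0 = drift_windows(0, fixed_now)

print(len(day0))                   # 8
print(day0[0][1] - day0[-1][0])    # 1 day, 0:00:00

# Each window starts exactly where the next (older) window ends.
contiguous = all(day0[i][0] == day0[i + 1][1] for i in range(len(day0) - 1))
print(contiguous)                  # True
```

The oldest window of day 0 starts 25 hours before `now`, which is also where the newest window of day 1 ends, so consecutive days join without overlap.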


## Additional data to help debugging

In [201]:
print('Datamart:', data_mart_id)
print('Model:', model_uid)
print('Deployment:', deployment_uid)
print('Scoring URL:', credit_risk_scoring_endpoint)

Datamart: 5a0b9076-fcf6-49e8-a824-9e3a6b4c2a56
Model: 514273e1-15ec-4cdd-8571-8da9fe5183f0
Deployment: 4a93dd1f-9ff2-43c5-87e5-6a1912e7ab54
Scoring URL: https://us-south.ml.cloud.ibm.com/v3/wml_instances/fa2ef988-b919-468b-a05b-2a3c9df845d5/deployments/4a93dd1f-9ff2-43c5-87e5-6a1912e7ab54/online


## Identify transactions for Explainability

Transaction IDs identified by the cell below can be copied and pasted into the Explainability tab of the OpenScale dashboard.

In [202]:
wos_client.data_sets.show_records(payload_data_set_id, limit=5)

0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45
036f9a64-dd9d-4bcb-b958-607001caee36,"[0.6465201352411825, 0.35347986475881765]",5148,442e9f7e274985f08873cf910ec7c9c2-1,0.0,1.0,1.0,2,500_to_1000,2020-08-10T21:42:17.518Z,0.6465201352411825,,"[1.0, 2.0, 1.0, 2.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 27.0, 5148.0, 2.0, 3.0, 38.0, 2.0, 2.0]",less_0,furniture,No Risk,1.0,0.0,skilled,stores,2.0,2.0,none,27,38,outstanding_credit,1.0,3,1.0,0.0,0.0,0.0,yes,own,0.0,2,4a93dd1f-9ff2-43c5-87e5-6a1912e7ab54,"[12.93040270482365, 7.069597295176353]",yes,0.0,male,2,car_other,1_to_4,No Risk,"[0.6465201352411825, 0.35347986475881765]"
036f9a64-dd9d-4bcb-b958-607001caee36,"[0.7279642825968492, 0.2720357174031508]",2320,442e9f7e274985f08873cf910ec7c9c2-10,1.0,0.0,1.0,3,less_100,2020-08-10T21:42:17.518Z,0.7279642825968492,,"[20, [0, 1, 5, 13, 14, 15, 16, 17, 18, 19], [1.0, 1.0, 1.0, 17.0, 2320.0, 3.0, 3.0, 41.0, 1.0, 1.0]]",less_0,car_new,No Risk,0.0,0.0,skilled,none,0.0,1.0,none,17,41,credits_paid_to_date,0.0,3,0.0,0.0,0.0,0.0,yes,own,0.0,1,4a93dd1f-9ff2-43c5-87e5-6a1912e7ab54,"[14.559285651936987, 5.440714348063017]",none,0.0,female,1,savings_insurance,1_to_4,No Risk,"[0.7279642825968492, 0.2720357174031508]"
036f9a64-dd9d-4bcb-b958-607001caee36,"[0.9693813316521208, 0.030618668347879192]",426,442e9f7e274985f08873cf910ec7c9c2-100,1.0,0.0,1.0,1,less_100,2020-08-10T21:42:17.518Z,0.9693813316521208,,"[1.0, 3.0, 0.0, 0.0, 3.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 12.0, 426.0, 1.0, 1.0, 19.0, 1.0, 1.0]",less_0,car_new,No Risk,0.0,0.0,skilled,stores,0.0,3.0,none,12,19,all_credits_paid_back,1.0,1,0.0,3.0,0.0,0.0,yes,rent,1.0,1,4a93dd1f-9ff2-43c5-87e5-6a1912e7ab54,"[19.387626633042416, 0.6123733669575838]",none,0.0,female,1,savings_insurance,less_1,No Risk,"[0.9693813316521208, 0.030618668347879192]"
036f9a64-dd9d-4bcb-b958-607001caee36,"[0.9121713291882478, 0.08782867081175208]",250,442e9f7e274985f08873cf910ec7c9c2-101,1.0,0.0,1.0,1,less_100,2020-08-10T21:42:17.518Z,0.9121713291882478,,"[20, [0, 1, 5, 8, 10, 13, 14, 15, 16, 17, 18, 19], [1.0, 1.0, 1.0, 1.0, 1.0, 15.0, 250.0, 1.0, 2.0, 22.0, 1.0, 1.0]]",less_0,car_new,No Risk,0.0,0.0,unskilled,stores,0.0,1.0,none,15,22,credits_paid_to_date,1.0,2,0.0,0.0,1.0,0.0,yes,own,0.0,1,4a93dd1f-9ff2-43c5-87e5-6a1912e7ab54,"[18.243426583764958, 1.7565734162350415]",none,0.0,female,1,savings_insurance,1_to_4,No Risk,"[0.9121713291882478, 0.08782867081175208]"
036f9a64-dd9d-4bcb-b958-607001caee36,"[0.6932962389637517, 0.3067037610362483]",3562,442e9f7e274985f08873cf910ec7c9c2-102,0.0,1.0,2.0,4,unknown,2020-08-10T21:42:17.518Z,0.6932962389637517,,"[2.0, 1.0, 1.0, 4.0, 2.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 44.0, 3562.0, 4.0, 2.0, 38.0, 2.0, 1.0]",0_to_200,furniture,No Risk,0.0,0.0,unskilled,stores,4.0,1.0,none,44,38,credits_paid_to_date,1.0,2,1.0,2.0,1.0,0.0,yes,own,0.0,2,4a93dd1f-9ff2-43c5-87e5-6a1912e7ab54,"[13.865924779275034, 6.1340752207249665]",yes,0.0,male,1,savings_insurance,greater_7,No Risk,"[0.6932962389637517, 0.3067037610362483]"


## Congratulations!

You have finished the hands-on lab for IBM Watson OpenScale. You can now view the [OpenScale Dashboard](https://aiopenscale.cloud.ibm.com/). Click on the tile for the German Credit model to see fairness, accuracy, and performance monitors. Click on the time series graph to get detailed information on transactions during a specific time window.

## Next steps

OpenScale shows model performance over time. You have two options to keep data flowing to your OpenScale graphs:
  * Download, configure and schedule the [model feed notebook](https://raw.githubusercontent.com/emartensibm/german-credit/master/german_credit_scoring_feed.ipynb). This notebook can be set up with your WML credentials, and scheduled to provide a consistent flow of scoring requests to your model, which will appear in your OpenScale monitors.
  * Re-run this notebook. Running this notebook from the beginning will delete and re-create the model and deployment, and re-create the historical data. Please note that the payload and measurement logs for the previous deployment will continue to be stored in your datamart, and can be deleted if necessary.