# OpenScale Mortgage Default Configuration

This notebook is an optional portion of the OpenScale Mortgage Default lab. It configures OpenScale monitoring for the mortgage default model using the Python client instead of the graphical user interface. Run it with Python 3.6 or higher in a Watson Studio project. It assumes that you have provisioned an instance of OpenScale in your IBM Cloud account and that you have set up the mortgage default model.

Only the first two cells need to be edited. In the first cell, paste the service credentials for your Watson Machine Learning instance. In the second cell, paste your Cloud API key and make sure the model and deployment names match the names used in earlier portions of the lab. The third cell is optional: use it only if you are setting up a new instance of OpenScale and want a paid database service as the datamart.

In [None]:
WML_CREDENTIALS = {
  "apikey": "xxxx",
  "iam_apikey_description": "Auto-generated for key 115c2a10-af74-4512-a5bd-3160b3aa7783",
  "iam_apikey_name": "xxx",
  "iam_role_crn": "crn:v1:bluemix:public:iam::::serviceRole:Writer",
  "iam_serviceid_crn": "xxx",
  "instance_id": "xxx",
  "url": "https://us-south.ml.cloud.ibm.com"
}

You can generate a Cloud API key [here](https://cloud.ibm.com/iam/apikeys).

In [None]:
CLOUD_API_KEY = "xxx"

MODEL_NAME = "Mortgage Default"
DEPLOYMENT_NAME = "Mortgage Default - Production"
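
Before moving on, it can be useful to confirm that no placeholder value was left in the cells above. The sketch below is only an illustration: `find_placeholders` is a hypothetical helper, not part of any IBM client, and it assumes placeholders look like the `xxx` / `xxxx` strings used in this notebook.

```python
def find_placeholders(settings, markers=("xxx", "xxxx")):
    """Return the keys whose values still look like unedited placeholders."""
    return sorted(
        key
        for key, value in settings.items()
        if isinstance(value, str) and value.strip().lower() in markers
    )


# Stand-in dictionary shaped like WML_CREDENTIALS (values are examples only)
example_settings = {
    "apikey": "xxxx",
    "instance_id": "xxx",
    "url": "https://us-south.ml.cloud.ibm.com",
}

print(find_placeholders(example_settings))
```

In the notebook itself, the same check could be applied to `WML_CREDENTIALS` and to `{"CLOUD_API_KEY": CLOUD_API_KEY}`.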

If you have already set up an OpenScale datamart, or if you would like to use the free internal PostgreSQL datamart, you can skip the following cell. If you are setting up a new instance of OpenScale and would like to use a paid database service, paste your Db2 or PostgreSQL credentials below.

In [None]:
DB_CREDENTIALS = None

You may now run the rest of the notebook.

In [None]:
!pip install --upgrade ibm-ai-openscale --no-cache | tail -n 1

In [None]:
from ibm_ai_openscale import APIClient
from ibm_ai_openscale.engines import *
from ibm_ai_openscale.utils import *
from ibm_ai_openscale.supporting_classes import PayloadRecord, Feature
from ibm_ai_openscale.supporting_classes.enums import *

Get the instance ID for Watson OpenScale.

In [None]:
import requests
from ibm_ai_openscale.utils import get_instance_guid

WOS_GUID = get_instance_guid(api_key=CLOUD_API_KEY)
WOS_CREDENTIALS = {
    "instance_guid": WOS_GUID,
    "apikey": CLOUD_API_KEY,
    "url": "https://api.aiopenscale.cloud.ibm.com"
}

if WOS_GUID is None:
    print('Watson OpenScale GUID NOT FOUND')
else:
    print(WOS_GUID)

Use the Cloud API key and WOS instance ID to create a new OpenScale client.

In [None]:
ai_client = APIClient(aios_credentials=WOS_CREDENTIALS)
ai_client.version

Set up the OpenScale datamart. First check for an existing datamart. If none is found, create one using the DB_CREDENTIALS if provided. If no credentials were provided, use the free internal datamart.

In [None]:
try:
    # An exception here is treated as "no datamart found"
    data_mart_details = ai_client.data_mart.get_details()
    if 'internal_database' in data_mart_details and data_mart_details['internal_database']:
        print('Using existing internal datamart')
    else:
        print('Using existing external datamart')
except Exception:
    if DB_CREDENTIALS is None:
        print('Setting up internal datamart')
        ai_client.data_mart.setup(internal_db=True)
    else:
        print('Setting up external datamart')
        try:
            ai_client.data_mart.setup(db_credentials=DB_CREDENTIALS)
        except Exception:
            # Retry with an explicit schema, which the Db2 setup needs
            print('Setup failed, trying Db2 setup')
            ai_client.data_mart.setup(db_credentials=DB_CREDENTIALS, schema=DB_CREDENTIALS['username'])

In [None]:
data_mart_details = ai_client.data_mart.get_details()
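
The datamart logic above boils down to a small decision table. The following sketch restates it as a pure function so the branches are easy to read and test; `choose_datamart_action` is a hypothetical helper written for illustration, and passing `None` for the existing details stands in for the case where looking up the datamart fails.

```python
def choose_datamart_action(existing_details, db_credentials):
    """Mirror the notebook's datamart decision without calling any service."""
    if existing_details is not None:
        # A datamart already exists: report which kind it is
        if existing_details.get('internal_database'):
            return 'use existing internal'
        return 'use existing external'
    # No datamart yet: pick internal unless database credentials were supplied
    if db_credentials is None:
        return 'set up internal'
    return 'set up external'


print(choose_datamart_action(None, None))
print(choose_datamart_action({'internal_database': True}, None))
```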

Create a WML client.

In [None]:
from watson_machine_learning_client import WatsonMachineLearningAPIClient

wml_client = WatsonMachineLearningAPIClient(WML_CREDENTIALS)
wml_instance_id = wml_client.service_instance.get_instance_id()

In [None]:
print(wml_instance_id)

Bind the OpenScale datamart to the WML instance. If the binding already exists, this will generate an error message, but will not affect the remainder of the notebook.

In [None]:
binding_uid = ai_client.data_mart.bindings.add('WML Binding', WatsonMachineLearningInstance(WML_CREDENTIALS))
bindings_details = ai_client.data_mart.bindings.get_details()

ai_client.data_mart.bindings.list()

In [None]:
print(binding_uid)

Get the model ID and scoring endpoint for the deployed model.

In [None]:
mortgage_deployment_id = None
mortgage_model_uid = None
model_deployment_ids = wml_client.deployments.get_uids()
for deployment_id in model_deployment_ids:
    deployment = wml_client.deployments.get_details(deployment_id)
    if deployment['entity']['name'] == DEPLOYMENT_NAME:
        # Record the model UID only for the matching deployment
        mortgage_deployment_id = deployment_id
        mortgage_model_uid = deployment['entity']['deployable_asset']['guid']
        break
if mortgage_deployment_id is None:
    raise ValueError('No deployment named "{}" was found'.format(DEPLOYMENT_NAME))
deployment_details = wml_client.deployments.get_details(mortgage_deployment_id)
scoring_endpoint = deployment_details['entity']['scoring_url']

In [None]:
print('Model UID:', mortgage_model_uid)
print('Scoring URL:', scoring_endpoint)
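
The lookup above can also be read as a search over deployment detail records. The sketch below shows the same idea on stand-in dictionaries, so it runs without a WML instance; `find_deployment` is a hypothetical helper, and the sample records only contain the keys this notebook reads (`name`, `deployable_asset.guid`, `scoring_url`) with made-up values.

```python
def find_deployment(deployments, name):
    """Return (model_uid, scoring_url) for the deployment with this name, or None."""
    for deployment in deployments:
        entity = deployment['entity']
        if entity['name'] == name:
            return entity['deployable_asset']['guid'], entity['scoring_url']
    return None


# Stand-in records; real ones come from wml_client.deployments.get_details(...)
sample_deployments = [
    {'entity': {'name': 'Other Model - Test',
                'deployable_asset': {'guid': 'model-aaa'},
                'scoring_url': 'https://example.invalid/score/aaa'}},
    {'entity': {'name': 'Mortgage Default - Production',
                'deployable_asset': {'guid': 'model-bbb'},
                'scoring_url': 'https://example.invalid/score/bbb'}},
]

print(find_deployment(sample_deployments, 'Mortgage Default - Production'))
print(find_deployment(sample_deployments, 'Missing Deployment'))
```

Returning `None` for a missing name makes a typo in `DEPLOYMENT_NAME` visible immediately instead of surfacing later as a confusing scoring error.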

List all the subscribed models.

In [None]:
subscriptions_uids = ai_client.data_mart.subscriptions.get_uids()
ai_client.data_mart.subscriptions.list()

The credentials below point to the training data for the model, in CSV format. OpenScale uses the training data to train the drift model and to generate distribution statistics for the explainability service and the fairness monitor. If you don't want to give OpenScale access to the training data, you can instead run a custom notebook to generate this information yourself.

In [None]:
cos_credentials = {
    "apikey": "yqcPbWZ0AQPHleHVerrR4Wx5e9pymBdMgydbEra5zCif",
    "api_key": "yqcPbWZ0AQPHleHVerrR4Wx5e9pymBdMgydbEra5zCif",
    "url": "https://s3.us.cloud-object-storage.appdomain.cloud",
    "iam_url": 'https://iam.bluemix.net/oidc/token',
    "cos_hmac_keys": {
        "access_key_id": "2d1be760f19241d695a534960da6eb80",
        "secret_access_key": "e1252b952f47a6b3f42305b8ffe6f9bd7d10e45f966b9a62"
    },
    "endpoints": "https://control.cloud-object-storage.cloud.ibm.com/v2/endpoints",
    "iam_apikey_description": "Auto-generated for key 2d1be760-f192-41d6-95a5-34960da6eb80",
    "iam_apikey_name": "FastStartLab",
    "iam_role_crn": "crn:v1:bluemix:public:iam::::serviceRole:Reader",
    "iam_serviceid_crn": "crn:v1:bluemix:public:iam-identity::a/7d8b3c34272c0980d973d3e40be9e9d2::serviceid:ServiceId-568ba191-a3bf-48f2-a30c-f3a4af7ec61d",
    "resource_instance_id": "crn:v1:bluemix:public:cloud-object-storage:global:a/7d8b3c34272c0980d973d3e40be9e9d2:2883ef10-23f1-4592-8582-2f2ef4973639::"
}

Create the subscription in OpenScale so we can monitor the model. Required information includes feature columns, categorical columns, problem types, input types, and output types.

In [None]:
subscription = ai_client.data_mart.subscriptions.add(WatsonMachineLearningAsset(
    mortgage_model_uid,
    problem_type=ProblemType.BINARY_CLASSIFICATION,
    input_data_type=InputDataType.STRUCTURED,
    label_column='MortgageDefault',
    prediction_column='prediction',
    probability_column='probability',
    transaction_id_column='ID',
    feature_columns=['AppliedOnline', 'Residence', 'Location', 'Income', 'Yrs_at_Current_Address',
                     'Yrs_with_Current_Employer', 'Number_of_Cards', 'Creditcard_Debt', 'Loan_Amount',
                     'Loans', 'SalePrice'],
    categorical_columns = ['AppliedOnline','Residence','Location'],
    training_data_reference = {
        'type': 'cos',
        'location': {
            'bucket': 'faststartlab-donotdelete-pr-nhfd4jnhlxgpc7',
            'file_name': 'Mortgage_Full_Records.csv',
            'firstlineheader': True,
            'file_format': 'csv'
        },
        'connection': cos_credentials,
        'name': 'training data reference'
    }
))

if subscription is None:
    print('Subscription already exists; get the existing one')
    subscriptions_uids = ai_client.data_mart.subscriptions.get_uids()
    for sub in subscriptions_uids:
        if ai_client.data_mart.subscriptions.get_details(sub)['entity']['asset']['name'] == MODEL_NAME:
            subscription = ai_client.data_mart.subscriptions.get(sub)

In [None]:
subscriptions_uids = ai_client.data_mart.subscriptions.get_uids()
ai_client.data_mart.subscriptions.list()

In [None]:
subscription_details = subscription.get_details()

In [None]:
!rm -f mortgage_feed.json
!wget https://raw.githubusercontent.com/emartensibm/mortgage-default/master/mortgage_feed.json

In [None]:
import json

with open('mortgage_feed.json', 'r') as scoring_file:
    data = json.load(scoring_file)

In [None]:
data['fields'][1:]

In [None]:
scoring_payload = {
    "fields": data['fields'][1:],
    "values": [],
    "meta":{
        "fields": ["ID"],
        "values": []
    }
}

In [None]:
import random
import string

letters = string.digits

for _ in range(0, 101):
    value_to_score = random.choice(data['values'])
    scoring_payload['values'].append(value_to_score[1:])
    scoring_payload['meta']['values'].append([int(''.join(random.choices(letters, k=8)))])
print(len(scoring_payload['values']))
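
The payload construction above can be wrapped in a function, which makes the shape of the request easier to see: the first column (`ID`) is dropped from both the field list and each row, and a random 8-digit transaction ID is sent in the `meta` section instead. This is a minimal sketch on made-up rows; `build_scoring_payload` and the `seed` parameter are hypothetical additions for reproducibility, not part of the lab.

```python
import random
import string


def build_scoring_payload(fields, rows, n_records, seed=None):
    """Sample rows at random and build a scoring payload with meta transaction IDs."""
    rng = random.Random(seed)
    payload = {
        "fields": fields[1:],  # drop the leading ID column
        "values": [],
        "meta": {"fields": ["ID"], "values": []},
    }
    for _ in range(n_records):
        row = rng.choice(rows)
        payload["values"].append(row[1:])
        # Random 8-digit transaction ID, converted to an integer
        transaction_id = int("".join(rng.choices(string.digits, k=8)))
        payload["meta"]["values"].append([transaction_id])
    return payload


# Made-up sample with the same layout as mortgage_feed.json (ID first)
sample_fields = ["ID", "Income", "Loan_Amount"]
sample_rows = [[1, 50000, 9000], [2, 72000, 12000]]

payload = build_scoring_payload(sample_fields, sample_rows, n_records=5, seed=0)
print(payload["fields"], len(payload["values"]))
```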

In [None]:
predictions = wml_client.deployments.score(scoring_endpoint, scoring_payload)
print(predictions['values'][0])

In [None]:
import time

# Give payload logging a few seconds to record the scoring requests
time.sleep(10)
subscription.payload_logging.get_records_count()

In [None]:
subscription.quality_monitoring.enable(threshold=0.7, min_records=100)

In [None]:
import pandas as pd

url = 'https://raw.githubusercontent.com/emartensibm/mortgage-default/master/Mortgage_Full_Records.csv'
df_raw = pd.read_csv(url)
pd_data = df_raw.drop('ID', axis=1)
pd_data.head()

In [None]:
subscription.fairness_monitoring.enable(
    features=[
        Feature("AppliedOnline", majority=['NO'], minority=['YES'], threshold=0.90)
    ],
    favourable_classes=['NO'],
    unfavourable_classes=['YES'],
    min_records=100
)
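
To build some intuition for the 0.90 threshold in the fairness configuration above, one common way to compare groups is the ratio of favourable-outcome rates between the minority and majority groups. The sketch below computes that ratio on made-up records. It assumes the monitor's score is comparable to this ratio; OpenScale's own calculation may differ in its details, and `favourable_rate` and `disparate_impact` are hypothetical helpers.

```python
def favourable_rate(records, group_value, favourable=('NO',)):
    """Share of records in a group whose prediction is a favourable class."""
    group = [r for r in records if r['AppliedOnline'] == group_value]
    if not group:
        return None
    return sum(r['prediction'] in favourable for r in group) / len(group)


def disparate_impact(records, minority='YES', majority='NO'):
    """Ratio of the minority group's favourable rate to the majority group's."""
    minority_rate = favourable_rate(records, minority)
    majority_rate = favourable_rate(records, majority)
    if minority_rate is None or not majority_rate:
        return None
    return minority_rate / majority_rate


# Made-up predictions: 'NO' (no default) is the favourable outcome
sample_records = (
    [{'AppliedOnline': 'NO', 'prediction': p} for p in ['NO', 'NO', 'NO', 'YES']]
    + [{'AppliedOnline': 'YES', 'prediction': p} for p in ['NO', 'YES', 'NO', 'YES']]
)

ratio = disparate_impact(sample_records)
print(round(ratio, 3), 'below threshold' if ratio < 0.90 else 'within threshold')
```

Here the majority group receives the favourable outcome 75% of the time and the minority group 50% of the time, so the ratio falls well below 0.90.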

In [None]:
subscription.drift_monitoring.enable(threshold=0.05, min_records=100)

In [None]:
drift_status = None
while drift_status != 'finished':
    drift_details = subscription.drift_monitoring.get_details()
    drift_status = drift_details['parameters']['config_status']['state']
    if drift_status != 'finished':
        print(drift_status)
        time.sleep(30)
print(drift_status)
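
The polling loop above waits indefinitely, which can hang the notebook if drift configuration never reaches `finished`. A minimal sketch of the same loop with a timeout follows. `wait_for_state` is a hypothetical helper, and the default interval and timeout are assumptions, not values from the OpenScale documentation.

```python
import time


def wait_for_state(get_state, target='finished', interval=30, timeout=1800, sleep=time.sleep):
    """Poll get_state() until it returns target, or raise TimeoutError."""
    deadline = time.monotonic() + timeout
    while True:
        state = get_state()
        if state == target:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError('Still in state {!r} after {} seconds'.format(state, timeout))
        print(state)
        sleep(interval)


# Stand-in state source; no sleeping so the example finishes immediately
states = iter(['preparing', 'training', 'finished'])
result = wait_for_state(lambda: next(states), interval=0, sleep=lambda seconds: None)
print(result)
```

In this notebook, `get_state` could be `lambda: subscription.drift_monitoring.get_details()['parameters']['config_status']['state']`.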

In [None]:
print(drift_details['parameters'])

In [None]:
fairness_run_details = subscription.fairness_monitoring.run(background_mode=False)

In [None]:
fairness_run_details

In [None]:
subscription.fairness_monitoring.show_table()

In [None]:
drift_run_details = subscription.drift_monitoring.run(background_mode=False)

In [None]:
from ibm_ai_openscale.supporting_classes import *

subscription.explainability.enable()

In [None]:
transaction_id = subscription.payload_logging.get_table_content(limit=1)['scoring_id'].values[0]

print(transaction_id)

In [None]:
explain_run = subscription.explainability.run(transaction_id=transaction_id, background_mode=False, cem=False)

In [None]:
explain_run