<img src="https://github.com/pmservice/ai-openscale-tutorials/raw/master/notebooks/images/banner.png" align="left" alt="banner">

This notebook should be run in a Watson Studio project, using the Default Spark Python runtime environment. If you are viewing this in Watson Studio and do not see Python 3.6 with Spark in the upper right corner of your screen, please update the runtime now.

## Provision services and configure credentials

If you have not already, provision an instance of IBM Watson OpenScale and two instances of IBM Watson Machine Learning using the Cloud catalog.

Your Cloud API key can be generated by going to the Users section of the Cloud console. From that page, click your name, scroll down to the API Keys section, and click Create an IBM Cloud API key. Give your key a name and click Create, then copy the created key and paste it below.

NOTE: You can also get the OpenScale API_KEY using the IBM Cloud CLI.

How to install the IBM Cloud (formerly Bluemix) CLI: [Instructions](https://console.bluemix.net/docs/cli/reference/ibmcloud/download_cli.html#install_use)

<b>How to get an API key using the CLI:</b>

1. `bx login --sso`
1. `bx iam api-key-create 'my_key'`

## Credentials for IBM Cloud services

### Retrieve your IBM Cloud API key

1.	From the IBM Cloud toolbar, click your Account name, such as &lt;Your user name&gt;’s Account.
1.	From the Manage menu, click Access (IAM).
1.	In the navigation bar, click IBM Cloud API keys.
1.	Click the Create an IBM Cloud API key button.
1.	Type a name and description and then click Save.
1.	Copy the newly created API key and paste it into your notebook in the following **CLOUD_API_KEY** code box, which is the first code box.

    Note: replace everything between the two sets of double quotation marks (").

In [None]:
CLOUD_API_KEY = "YOUR_CLOUD_API_KEY_HERE"

### Retrieve your Watson Machine Learning credentials

1.	Go to the IBM Cloud dashboard.
1.	In the Resource summary section, click Services.
1.	Click Machine Learning-Pre-Prod.
1.	In the navigation pane, click Service credentials.
1.	Click the New credential button.
1.	Copy your credentials by clicking the copy icon.
1.	Return to the notebook editor and update the credentials by replacing the sample credentials with your own in the second code box.
1.	Repeat the preceding steps for the prod instance in the third code box.

   **Note**: Replace the entire placeholder with your credentials, including the opening brace ({) and the closing brace (}).
   
   **IMPORTANT**: If you are reusing a WML instance that is already bound to Watson OpenScale, specify that instance's credentials in `PROD_WML_CREDENTIALS`.
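
As a quick sanity check before running the rest of the notebook, one could confirm that each credentials variable holds a dictionary rather than the untouched placeholder string. The sketch below is an illustration and not part of the original notebook: the helper name `check_wml_credentials` is hypothetical, and the list of required keys is an assumption taken from the sample credentials shown in the code boxes.

```python
# Hypothetical helper: verifies the shape of a WML credentials object.
# REQUIRED_KEYS is an assumption based on the sample credentials in this notebook.
REQUIRED_KEYS = ("apikey", "instance_id", "url")


def check_wml_credentials(credentials):
    """Return a list of problems found; an empty list means the object looks usable."""
    if not isinstance(credentials, dict):
        return ["credentials must be a dict, got " + type(credentials).__name__]
    problems = []
    for key in REQUIRED_KEYS:
        if not credentials.get(key):
            problems.append("missing or empty key: " + key)
    return problems


# An untouched placeholder string is reported as a problem
print(check_wml_credentials("YOUR_PRE_PROD_WML_CREDENTIALS_HERE"))
# A filled-in object (fake values) passes the check
print(check_wml_credentials({"apikey": "abc", "instance_id": "123", "url": "https://us-south.ml.cloud.ibm.com"}))
```

Calling the helper on both `PRE_PROD_WML_CREDENTIALS` and `PROD_WML_CREDENTIALS` right after the credential cells would surface a forgotten placeholder early, before any model is trained.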

In [None]:
PRE_PROD_WML_CREDENTIALS = "YOUR_PRE_PROD_WML_CREDENTIALS_HERE"

#####################################
# Example credentials look like the following
#####################################

# PRE_PROD_WML_CREDENTIALS = {
#  "apikey": "*******",
#  "iam_apikey_description": "*******",
#  "iam_apikey_name": "*******",
#  "iam_role_crn": "*******",
#  "iam_serviceid_crn": "*******",
#  "instance_id": "*******",
#  "url": "https://us-south.ml.cloud.ibm.com"
# }

In [None]:
PROD_WML_CREDENTIALS = "YOUR_PROD_WML_CREDENTIALS_HERE"

#####################################
# Example credentials look like the following
#####################################

# PROD_WML_CREDENTIALS = {
#  "apikey": "*******",
#  "iam_apikey_description": "*******",
#  "iam_apikey_name": "*******",
#  "iam_role_crn": "*******",
#  "iam_serviceid_crn": "*******",
#  "instance_id": "*******",
#  "url": "https://us-south.ml.cloud.ibm.com"
# }

In [None]:
DB_CREDENTIALS = None

In [None]:
KEEP_MY_INTERNAL_POSTGRES = True

In [None]:
IAM_URL="https://iam.ng.bluemix.net/oidc/token"

## Package installation
The following open source packages must be installed into this notebook instance so that they are available during processing.

In [None]:
!rm -rf /home/spark/shared/user-libs/python3.6*

!pip install pyspark==2.3 --no-cache | tail -n 1
!pip install numpy==1.15.4 --no-cache | tail -n 1
!pip install --upgrade ibm-ai-openscale --no-cache | tail -n 1
!pip install --upgrade watson-machine-learning-client | tail -n 1
!pip install --upgrade SciPy --no-cache | tail -n 1

## Load the training data from GitHub
So you don't have to manually generate training data, we've provided a sample and placed it in a publicly available GitHub repo.
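
Once the file is downloaded, it can be useful to look at how the label values are distributed. The following is a minimal sketch using only the standard library; the helper name `label_counts` is hypothetical, the `Risk` column name comes from the model code in this notebook, and the in-memory rows are illustrative stand-ins rather than real records from the data set.

```python
import csv
import io
from collections import Counter


def label_counts(csv_file, label_column="Risk"):
    """Count how often each value of label_column occurs in an open CSV file."""
    reader = csv.DictReader(csv_file)
    if label_column not in (reader.fieldnames or []):
        raise ValueError("column not found: " + label_column)
    return Counter(row[label_column] for row in reader)


# Tiny in-memory stand-in for the downloaded file (values are illustrative only)
sample = io.StringIO("Age,Risk\n46,No Risk\n36,Risk\n38,No Risk\n")
print(label_counts(sample))
```

With the real file, the same function could be called as `label_counts(open("german_credit_data_biased_training.csv", newline=""))`.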

In [None]:
!rm -f german_credit_data_biased_training.csv
!wget https://raw.githubusercontent.com/pmservice/ai-openscale-tutorials/master/assets/historical_data/german_credit_risk/wml/german_credit_data_biased_training.csv

## Deploy the Spark Credit Risk Model to Watson Machine Learning

The following cell defines a function that trains the Spark version of the German Credit Risk Model and deploys it to the Machine Learning instance identified by the supplied credentials. This version of the German Credit Risk model has an AUC-ROC score of around 71%.
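
For reference, the area under the ROC curve can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below computes it by direct pairwise comparison in plain Python; the function name `roc_auc` is hypothetical and this is an illustration of the metric, not the implementation Spark uses. Note that the notebook's evaluator is given the hard `prediction` column, so in that setting many pairs tie and count as one half.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as half).

    labels: list of 0/1 ground-truth values; scores: list of matching model scores.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# 3 of the 4 positive/negative pairs are ranked correctly -> 0.75
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```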

In [None]:
def deploy_credit_risk_spark_model(wml_credentials, model_name, deployment_name):

    import numpy 
    numpy.version.version

    import pandas as pd
    import json

    from pyspark import SparkContext, SQLContext
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import RandomForestClassifier,GBTClassifier
    from pyspark.ml.evaluation import BinaryClassificationEvaluator
    from pyspark.ml.feature import StringIndexer, VectorAssembler, IndexToString
    from pyspark.sql.types import StructType, DoubleType, StringType, ArrayType

    from pyspark.sql import SparkSession
    from pyspark import SparkFiles

    spark = SparkSession.builder.getOrCreate()
    pd_data = pd.read_csv("german_credit_data_biased_training.csv", sep=",", header=0)
    spark_df = spark.read.csv(path="german_credit_data_biased_training.csv", sep=",", header=True, inferSchema=True)
    spark_df.head()

    (train_data, test_data) = spark_df.randomSplit([0.9, 0.1], 24)
    print("Number of records for training: " + str(train_data.count()))
    print("Number of records for evaluation: " + str(test_data.count()))

    si_CheckingStatus = StringIndexer(inputCol='CheckingStatus', outputCol='CheckingStatus_IX')
    si_CreditHistory = StringIndexer(inputCol='CreditHistory', outputCol='CreditHistory_IX')
    si_LoanPurpose = StringIndexer(inputCol='LoanPurpose', outputCol='LoanPurpose_IX')
    si_ExistingSavings = StringIndexer(inputCol='ExistingSavings', outputCol='ExistingSavings_IX')
    si_EmploymentDuration = StringIndexer(inputCol='EmploymentDuration', outputCol='EmploymentDuration_IX')
    si_Sex = StringIndexer(inputCol='Sex', outputCol='Sex_IX')
    si_OthersOnLoan = StringIndexer(inputCol='OthersOnLoan', outputCol='OthersOnLoan_IX')
    si_OwnsProperty = StringIndexer(inputCol='OwnsProperty', outputCol='OwnsProperty_IX')
    si_InstallmentPlans = StringIndexer(inputCol='InstallmentPlans', outputCol='InstallmentPlans_IX')
    si_Housing = StringIndexer(inputCol='Housing', outputCol='Housing_IX')
    si_Job = StringIndexer(inputCol='Job', outputCol='Job_IX')
    si_Telephone = StringIndexer(inputCol='Telephone', outputCol='Telephone_IX')
    si_ForeignWorker = StringIndexer(inputCol='ForeignWorker', outputCol='ForeignWorker_IX')
    si_Label = StringIndexer(inputCol="Risk", outputCol="label").fit(spark_df)
    label_converter = IndexToString(inputCol="prediction", outputCol="predictedLabel", labels=si_Label.labels)

    # "LoanDuration" was listed twice in the original inputCols; it is included once here
    va_features = VectorAssembler(
        inputCols=["CheckingStatus_IX", "CreditHistory_IX", "LoanPurpose_IX", "ExistingSavings_IX",
                   "EmploymentDuration_IX", "Sex_IX", "OthersOnLoan_IX", "OwnsProperty_IX", "InstallmentPlans_IX",
                   "Housing_IX", "Job_IX", "Telephone_IX", "ForeignWorker_IX", "LoanDuration", "LoanAmount",
                   "InstallmentPercent", "CurrentResidenceDuration", "Age", "ExistingCreditsCount",
                   "Dependents"], outputCol="features")

    classifier=GBTClassifier(featuresCol="features")

    pipeline = Pipeline(
    stages=[si_CheckingStatus, si_CreditHistory, si_EmploymentDuration, si_ExistingSavings, si_ForeignWorker,
            si_Housing, si_InstallmentPlans, si_Job, si_LoanPurpose, si_OthersOnLoan,
            si_OwnsProperty, si_Sex, si_Telephone, si_Label, va_features, classifier, label_converter])

    model = pipeline.fit(train_data)
    predictions = model.transform(test_data)
    evaluator = BinaryClassificationEvaluator(rawPredictionCol="prediction")
    auc = evaluator.evaluate(predictions)

    # The evaluator's default metric is areaUnderROC, not accuracy
    print("areaUnderROC = %g" % auc)

    from watson_machine_learning_client import WatsonMachineLearningAPIClient

    wml_client = WatsonMachineLearningAPIClient(wml_credentials)
    print(wml_client.service_instance.get_url())


    # Remove existing model and deployment
    MODEL_NAME=model_name
    DEPLOYMENT_NAME=deployment_name

    model_deployment_ids = wml_client.deployments.get_uids()
    for deployment_id in model_deployment_ids:
        deployment = wml_client.deployments.get_details(deployment_id)
        model_id = deployment['entity']['deployable_asset']['guid']
        if deployment['entity']['name'] == DEPLOYMENT_NAME:
            print('Deleting deployment id', deployment_id)
            wml_client.deployments.delete(deployment_id)
            print('Deleting model id', model_id)
            wml_client.repository.delete(model_id)
    wml_client.repository.list_models()
    
    training_data_reference = {
        "connection": {
            "db": "BLUDB",
            "host": "dashdb-txn-sbox-yp-dal09-03.services.dal.bluemix.net",
            "password": "khhz72v+6mcwwkfv",
            "username": "cmb91569"
        },
        "name": "German credit risk training data",
        "source": {
            "tablename": "CREDIT_RISK_TRAIN_DATA",
            "type": "db2"
        }
    }

    # Save Model
    model_props_rf = {
        wml_client.repository.ModelMetaNames.NAME: MODEL_NAME,
        wml_client.repository.ModelMetaNames.DESCRIPTION: MODEL_NAME,
        wml_client.repository.ModelMetaNames.EVALUATION_METHOD: "binary",
        wml_client.repository.ModelMetaNames.FRAMEWORK_NAME: "mllib",
        wml_client.repository.ModelMetaNames.FRAMEWORK_VERSION: "2.3",
        wml_client.repository.ModelMetaNames.TRAINING_DATA_REFERENCE: training_data_reference,
        wml_client.repository.ModelMetaNames.EVALUATION_METRICS: [
            {
               "name": "areaUnderROC",
               "value": auc,
               "threshold": 0.7
            }
        ]
    }

    published_model_details = wml_client.repository.store_model(model=model, meta_props=model_props_rf, training_data=train_data, pipeline=pipeline)
    print(published_model_details)

    # List models in the repository
    wml_client.repository.list_models()

    # Get the model UID
    model_uid = wml_client.repository.get_model_uid(published_model_details)
    model_uid


    # Deploy model
    wml_deployments = wml_client.deployments.get_details()
    deployment_uid = None
    for deployment in wml_deployments['resources']:
        if DEPLOYMENT_NAME == deployment['entity']['name']:
            deployment_uid = deployment['metadata']['guid']
            break

    if deployment_uid is None:
        print("Deploying model...")

        deployment = wml_client.deployments.create(artifact_uid=model_uid, name=DEPLOYMENT_NAME, description=DEPLOYMENT_NAME, asynchronous=False)
        deployment_uid = wml_client.deployments.get_uid(deployment)

    print("Model id: {}".format(model_uid))
    print("Deployment id: {}".format(deployment_uid))

    deployment_uid=wml_client.deployments.get_uid(deployment)
    deployment_uid

    fields = ["CheckingStatus","LoanDuration","CreditHistory","LoanPurpose","LoanAmount","ExistingSavings","EmploymentDuration","InstallmentPercent","Sex","OthersOnLoan","CurrentResidenceDuration","OwnsProperty","Age","InstallmentPlans","Housing","ExistingCreditsCount","Job","Dependents","Telephone","ForeignWorker"]
    values = [
      ["no_checking",13,"credits_paid_to_date","car_new",1343,"100_to_500","1_to_4",2,"female","none",3,"savings_insurance",46,"none","own",2,"skilled",1,"none","yes"],
      ["no_checking",24,"prior_payments_delayed","furniture",4567,"500_to_1000","1_to_4",4,"male","none",4,"savings_insurance",36,"none","free",2,"management_self-employed",1,"none","yes"],
      ["0_to_200",26,"all_credits_paid_back","car_new",863,"less_100","less_1",2,"female","co-applicant",2,"real_estate",38,"none","own",1,"skilled",1,"none","yes"],
      ["0_to_200",14,"no_credits","car_new",2368,"less_100","1_to_4",3,"female","none",3,"real_estate",29,"none","own",1,"skilled",1,"none","yes"],
      ["0_to_200",4,"no_credits","car_new",250,"less_100","unemployed",2,"female","none",3,"real_estate",23,"none","rent",1,"management_self-employed",1,"none","yes"],
      ["no_checking",17,"credits_paid_to_date","car_new",832,"100_to_500","1_to_4",2,"male","none",2,"real_estate",42,"none","own",1,"skilled",1,"none","yes"],
      ["no_checking",33,"outstanding_credit","appliances",5696,"unknown","greater_7",4,"male","co-applicant",4,"unknown",54,"none","free",2,"skilled",1,"yes","yes"],
      ["0_to_200",13,"prior_payments_delayed","retraining",1375,"100_to_500","4_to_7",3,"male","none",3,"real_estate",37,"none","own",2,"management_self-employed",1,"none","yes"]
    ]

    scoring_payload = {"fields": fields, "values": values}
    print(scoring_payload)

    # Score the model deployment
    credit_risk_scoring_endpoint = None
    print(deployment_uid)

    for deployment in wml_client.deployments.get_details()['resources']:
        if deployment_uid == deployment['metadata']['guid']:
            credit_risk_scoring_endpoint = deployment['entity']['scoring_url']

    print(credit_risk_scoring_endpoint)

    scoring_response = wml_client.deployments.score(credit_risk_scoring_endpoint, scoring_payload)
    scoring_response
    
    return model_uid, deployment_uid


## Deploy the Scikit-Learn Credit Risk Model to Watson Machine Learning

The following cell defines a function that trains the Scikit-learn version of the German Credit Risk Model and deploys it to the Machine Learning instance identified by the supplied credentials. This version of the German Credit Risk model has an AUC-ROC score of around 85% and will be called the "Challenger."
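
The Challenger pipeline in this notebook ordinal-encodes its categorical columns before gradient boosting. As a rough illustration of what that encoding does, the sketch below maps each distinct category to its position in sorted order, which is the ordering scikit-learn's `OrdinalEncoder` uses with `categories='auto'`. The helper names are hypothetical, and the behavior for unseen categories (a `KeyError`) is a simplification of this sketch rather than the library's exact behavior.

```python
def fit_ordinal_encoder(values):
    """Map each distinct category to its index in sorted order."""
    return {category: float(index) for index, category in enumerate(sorted(set(values)))}


def transform(mapping, values):
    """Encode values with a fitted mapping; unseen categories raise KeyError."""
    return [mapping[v] for v in values]


housing = ["own", "free", "own", "rent"]
mapping = fit_ordinal_encoder(housing)
print(mapping)                      # {'free': 0.0, 'own': 1.0, 'rent': 2.0}
print(transform(mapping, housing))  # [1.0, 0.0, 1.0, 2.0]
```

The integer codes carry no real ordering between categories; tree-based models such as gradient boosting tend to tolerate this better than linear models do.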

In [None]:
def deploy_credit_risk_scikit_model(wml_credentials, model_name, deployment_name):

    import pandas as pd
    import json
    import sys
    import numpy
    import sklearn
    import sklearn.ensemble
    numpy.set_printoptions(threshold=sys.maxsize)
    from sklearn.utils.multiclass import type_of_target
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import Pipeline
    from sklearn.impute import SimpleImputer
    from sklearn.preprocessing import StandardScaler, OrdinalEncoder
    from sklearn.compose import ColumnTransformer
    from sklearn.model_selection import cross_validate
    from sklearn.metrics import get_scorer
    from sklearn.metrics import classification_report

    data_df=pd.read_csv ("german_credit_data_biased_training.csv")

    data_df.head()

    target_label_name = "Risk"
    feature_cols= data_df.drop(columns=[target_label_name])
    label= data_df[target_label_name]

    # Set model evaluation properties
    optimization_metric = 'roc_auc'
    random_state = 33
    cv_num_folds = 3
    holdout_fraction = 0.1

    if type_of_target(label.values) in ['multiclass', 'binary']:
        X_train, X_holdout, y_train, y_holdout = train_test_split(feature_cols, label, test_size=holdout_fraction, random_state=random_state, stratify=label.values)
    else:
        X_train, X_holdout, y_train, y_holdout = train_test_split(feature_cols, label, test_size=holdout_fraction, random_state=random_state)

    # Data preprocessing transformer generation

    numeric_transformer = Pipeline(steps=[
        ('imputer', SimpleImputer(strategy='median')),
        ('scaler', StandardScaler())])
    categorical_transformer = Pipeline(steps=[
        ('imputer', SimpleImputer(strategy='most_frequent')),
        ('OrdinalEncoder', OrdinalEncoder(categories='auto',dtype=numpy.float64 ))])

    numeric_features = feature_cols.select_dtypes(include=['int64', 'float64']).columns
    categorical_features = feature_cols.select_dtypes(include=['object']).columns

    preprocessor = ColumnTransformer(
        transformers=[
            ('num', numeric_transformer, numeric_features),
            ('cat', categorical_transformer, categorical_features)])

    # Initiate model and create pipeline
    model=sklearn.ensemble.GradientBoostingClassifier()
    gbt_pipeline = Pipeline(steps=[('preprocessor', preprocessor), ('classifier', model)])
    model_gbt=gbt_pipeline.fit(X_train, y_train)

    y_pred = model_gbt.predict(X_holdout)


    # Evaluate model performance on test data and Cross validation
    scorer = get_scorer(optimization_metric)
    scorer(model_gbt,X_holdout, y_holdout)

    # Cross validation with cv_num_folds (3) folds
    cv_results = cross_validate(model_gbt, X_train, y_train, cv=cv_num_folds, scoring={optimization_metric: scorer})
    numpy.mean(cv_results['test_' + optimization_metric])

    # classification_report expects the true labels first, then the predictions
    print(classification_report(y_holdout, y_pred))


    # Initiate WML
    from watson_machine_learning_client import WatsonMachineLearningAPIClient
    wml_client = WatsonMachineLearningAPIClient(wml_credentials)
    print(wml_client.service_instance.get_url())


    # Remove existing model and deployment
    MODEL_NAME=model_name
    DEPLOYMENT_NAME=deployment_name

    model_deployment_ids = wml_client.deployments.get_uids()
    for deployment_id in model_deployment_ids:
        deployment = wml_client.deployments.get_details(deployment_id)
        model_id = deployment['entity']['deployable_asset']['guid']
        if deployment['entity']['name'] == DEPLOYMENT_NAME:
            print('Deleting deployment id', deployment_id)
            wml_client.deployments.delete(deployment_id)
            print('Deleting model id', model_id)
            wml_client.repository.delete(model_id)
    wml_client.repository.list_models()

    # Store Model
    model_props_gbt = {
        wml_client.repository.ModelMetaNames.NAME: MODEL_NAME,
        wml_client.repository.ModelMetaNames.DESCRIPTION: MODEL_NAME,
        wml_client.repository.ModelMetaNames.FRAMEWORK_NAME: "scikit-learn",
        wml_client.repository.ModelMetaNames.FRAMEWORK_VERSION: "0.19",
        wml_client.repository.ModelMetaNames.RUNTIME_NAME: "python"
    }

    published_model_details = wml_client.repository.store_model(model=model_gbt, meta_props=model_props_gbt, training_data=feature_cols,training_target=label)
    print(published_model_details)

    # List models in the repository
    wml_client.repository.list_models()

    # Get the model UID
    model_uid = wml_client.repository.get_model_uid(published_model_details)
    model_uid


    # Deploy model
    wml_deployments = wml_client.deployments.get_details()
    deployment_uid = None
    for deployment in wml_deployments['resources']:
        if DEPLOYMENT_NAME == deployment['entity']['name']:
            deployment_uid = deployment['metadata']['guid']
            break

    if deployment_uid is None:
        print("Deploying model...")

        deployment = wml_client.deployments.create(artifact_uid=model_uid, name=DEPLOYMENT_NAME, description=DEPLOYMENT_NAME, asynchronous=False)
        deployment_uid = wml_client.deployments.get_uid(deployment)

    print("Model id: {}".format(model_uid))
    print("Deployment id: {}".format(deployment_uid))

    deployment_uid=wml_client.deployments.get_uid(deployment)
    deployment_uid


    # Sample scoring
    fields = ["CheckingStatus","LoanDuration","CreditHistory","LoanPurpose","LoanAmount","ExistingSavings","EmploymentDuration","InstallmentPercent","Sex","OthersOnLoan","CurrentResidenceDuration","OwnsProperty","Age","InstallmentPlans","Housing","ExistingCreditsCount","Job","Dependents","Telephone","ForeignWorker"]
    values = [
      ["no_checking",13,"credits_paid_to_date","car_new",1343,"100_to_500","1_to_4",2,"female","none",3,"savings_insurance",46,"none","own",2,"skilled",1,"none","yes"],
      ["no_checking",24,"prior_payments_delayed","furniture",4567,"500_to_1000","1_to_4",4,"male","none",4,"savings_insurance",36,"none","free",2,"management_self-employed",1,"none","yes"],
      ["0_to_200",26,"all_credits_paid_back","car_new",863,"less_100","less_1",2,"female","co-applicant",2,"real_estate",38,"none","own",1,"skilled",1,"none","yes"],
      ["0_to_200",14,"no_credits","car_new",2368,"less_100","1_to_4",3,"female","none",3,"real_estate",29,"none","own",1,"skilled",1,"none","yes"],
      ["0_to_200",4,"no_credits","car_new",250,"less_100","unemployed",2,"female","none",3,"real_estate",23,"none","rent",1,"management_self-employed",1,"none","yes"],
      ["no_checking",17,"credits_paid_to_date","car_new",832,"100_to_500","1_to_4",2,"male","none",2,"real_estate",42,"none","own",1,"skilled",1,"none","yes"],
      ["no_checking",33,"outstanding_credit","appliances",5696,"unknown","greater_7",4,"male","co-applicant",4,"unknown",54,"none","free",2,"skilled",1,"yes","yes"],
      ["0_to_200",13,"prior_payments_delayed","retraining",1375,"100_to_500","4_to_7",3,"male","none",3,"real_estate",37,"none","own",2,"management_self-employed",1,"none","yes"]
    ]

    payload_scoring = {"fields": fields,"values": values}
    print(payload_scoring)

    credit_risk_scoring_endpoint = None
    print(deployment_uid)

    for deployment in wml_client.deployments.get_details()['resources']:
        if deployment_uid == deployment['metadata']['guid']:
            credit_risk_scoring_endpoint = deployment['entity']['scoring_url']

    print(credit_risk_scoring_endpoint)

    scoring_response = wml_client.deployments.score(credit_risk_scoring_endpoint, payload_scoring)
    scoring_response

    return model_uid, deployment_uid


# Deploy the models

The following cells will deploy both the PreProd and Challenger models into the WML instance that is designated as Pre-Production.

In [None]:
PRE_PROD_MODEL_NAME="German Credit Risk Model - PreProd"
PRE_PROD_DEPLOYMENT_NAME="German Credit Risk Model - PreProd"

PRE_PROD_CHALLENGER_MODEL_NAME="German Credit Risk Model - Challenger"
PRE_PROD_CHALLENGER_DEPLOYMENT_NAME="German Credit Risk Model - Challenger"

In [None]:
pre_prod_model_uid, pre_prod_deployment_uid = deploy_credit_risk_spark_model(PRE_PROD_WML_CREDENTIALS, PRE_PROD_MODEL_NAME, PRE_PROD_DEPLOYMENT_NAME)

In [None]:
challenger_model_uid, challenger_deployment_uid = deploy_credit_risk_scikit_model(PRE_PROD_WML_CREDENTIALS, PRE_PROD_CHALLENGER_MODEL_NAME, PRE_PROD_CHALLENGER_DEPLOYMENT_NAME)

# Configure OpenScale 
The notebook will now import the necessary libraries and set up a Python OpenScale client.

In [None]:
from ibm_ai_openscale import APIClient
from ibm_ai_openscale.engines import *
from ibm_ai_openscale.utils import *
from ibm_ai_openscale.supporting_classes import PayloadRecord, Feature
from ibm_ai_openscale.supporting_classes.enums import *

### Get Watson OpenScale GUID
Each instance of OpenScale has a unique ID. We can get this value using the Cloud API key specified at the beginning of the notebook.

In [None]:
from ibm_ai_openscale.utils import get_instance_guid


WOS_GUID = get_instance_guid(api_key=CLOUD_API_KEY)
WOS_CREDENTIALS = {
    "instance_guid": WOS_GUID,
    "apikey": CLOUD_API_KEY,
    "url": "https://api.aiopenscale.cloud.ibm.com"
}

if WOS_GUID is None:
    print('Watson OpenScale GUID NOT FOUND')
else:
    print(WOS_GUID)

In [None]:
ai_client = APIClient(aios_credentials=WOS_CREDENTIALS)
ai_client.version

## Create schema and datamart

### Set up datamart
Watson OpenScale uses a database to store payload logs and calculated metrics. If database credentials were not supplied above, the notebook will use the free, internal lite database. If database credentials were supplied, the datamart will be created there unless there is an existing datamart and the KEEP_MY_INTERNAL_POSTGRES variable is set to True. If an OpenScale datamart exists in Db2 or PostgreSQL, the existing datamart will be used and no data will be overwritten.
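
The rules described above can be summarized as a small decision function. This is a sketch that mirrors the branching of the setup cell without calling any OpenScale API; the function name `datamart_action` and the returned strings are hypothetical and only serve to make the cases explicit.

```python
def datamart_action(existing, internal, keep_internal, db_credentials):
    """Return a description of the action the setup cell would take.

    existing: True if a datamart is already configured
    internal: True if that datamart uses the internal database
    keep_internal: value of KEEP_MY_INTERNAL_POSTGRES
    db_credentials: value of DB_CREDENTIALS (None if not supplied)
    """
    if existing:
        if not internal:
            return "use existing external datamart"
        if keep_internal or db_credentials is None:
            return "use existing internal datamart"
        return "delete internal datamart and set up external datamart"
    if db_credentials is None:
        return "set up internal datamart"
    return "set up external datamart"


# With the notebook defaults (DB_CREDENTIALS = None, KEEP_MY_INTERNAL_POSTGRES = True)
print(datamart_action(existing=False, internal=False, keep_internal=True, db_credentials=None))
```

Note that the only destructive case is an existing internal datamart combined with supplied credentials and `KEEP_MY_INTERNAL_POSTGRES` set to `False`.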

Prior instances of the German Credit model will be removed from OpenScale monitoring.

In [None]:
try:
    data_mart_details = ai_client.data_mart.get_details()
    if 'internal_database' in data_mart_details and data_mart_details['internal_database']:
        if KEEP_MY_INTERNAL_POSTGRES:
            print('Using existing internal datamart.')
        else:
            if DB_CREDENTIALS is None:
                print('No postgres credentials supplied. Using existing internal datamart')
            else:
                print('Switching to external datamart')
                ai_client.data_mart.delete(force=True)
                ai_client.data_mart.setup(db_credentials=DB_CREDENTIALS)
    else:
        print('Using existing external datamart')
except Exception:
    if DB_CREDENTIALS is None:
        print('Setting up internal datamart')
        ai_client.data_mart.setup(internal_db=True)
    else:
        print('Setting up external datamart')
        try:
            ai_client.data_mart.setup(db_credentials=DB_CREDENTIALS)
        except Exception:
            print('Setup failed, trying Db2 setup')
            ai_client.data_mart.setup(db_credentials=DB_CREDENTIALS, schema=DB_CREDENTIALS['username'])

In [None]:
data_mart_details = ai_client.data_mart.get_details()

In [None]:
data_mart_details

## Bind WML machine learning instance as Pre-Prod

Watson OpenScale needs to be bound to the Watson Machine Learning instance to capture payload data into and out of the model. If a binding for this WML instance already exists, this code reuses it; otherwise it creates a new binding named "WML Pre-Prod".
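
The lookup part of this step can be sketched on its own. The helper below is hypothetical and works on plain dictionaries shaped like the `service_bindings` entries used in this notebook; it assumes, as the notebook's code does, that a binding's `guid` equals the WML `instance_id`.

```python
def find_binding_uid(bindings, instance_id):
    """Return the guid of the binding whose guid equals instance_id, or None."""
    for binding in bindings:
        guid = binding["metadata"]["guid"]
        if guid == instance_id:
            return guid
    return None


# Fake binding records for illustration
bindings = [{"metadata": {"guid": "aaa"}}, {"metadata": {"guid": "bbb"}}]
print(find_binding_uid(bindings, "bbb"))  # bbb
print(find_binding_uid(bindings, "zzz"))  # None
```

Returning `None` for the no-match case keeps the result unambiguous, so a new binding only needs to be added when the lookup comes back empty.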

In [None]:
all_bindings = ai_client.data_mart.bindings.get_details()['service_bindings']
existing_binding = False
for binding in all_bindings:
    binding_uid = binding['metadata']['guid']
    if binding['metadata']['guid'] == PRE_PROD_WML_CREDENTIALS['instance_id']:
        existing_binding = True
        break

if not existing_binding:
    binding_uid = ai_client.data_mart.bindings.add('WML Pre-Prod', WatsonMachineLearningInstance(PRE_PROD_WML_CREDENTIALS))
    
bindings_details = ai_client.data_mart.bindings.get_details()
ai_client.data_mart.bindings.list()

In [None]:
print(binding_uid)

In [None]:
ai_client.data_mart.bindings.get_details(binding_uid)

In [None]:
ai_client.data_mart.bindings.list_assets()

## Generate an IAM token

The following function generates an IAM access token, which is used to interact with the Watson OpenScale REST APIs.
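
Because this notebook requests a token in more than one place, a small cache can avoid repeated calls to the IAM endpoint. The sketch below is an optional illustration: the class `TokenCache` is hypothetical, it wraps any zero-argument function that returns a token string, and the default lifetime of 3000 seconds is an assumption chosen to stay below the real token expiry, which should be checked against the IAM response.

```python
import time


class TokenCache:
    """Cache a token returned by `fetch` and refresh it once it is older than ttl_seconds."""

    def __init__(self, fetch, ttl_seconds=3000, clock=time.monotonic):
        self._fetch = fetch          # callable returning a fresh token string
        self._ttl = ttl_seconds      # assumed lifetime; keep it below the real expiry
        self._clock = clock          # injectable clock, handy for testing
        self._token = None
        self._fetched_at = None

    def get(self):
        now = self._clock()
        if self._token is None or now - self._fetched_at >= self._ttl:
            self._token = self._fetch()
            self._fetched_at = now
        return self._token


# Demonstration with a stand-in fetch function instead of a real IAM call
calls = []


def fake_fetch():
    calls.append(1)
    return "token-%d" % len(calls)


cache = TokenCache(fake_fetch, ttl_seconds=3000)
print(cache.get(), cache.get(), len(calls))  # token-1 token-1 1
```

In the notebook, `TokenCache(generate_access_token)` could be created once, with `cache.get()` used wherever a bearer token is needed.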

In [None]:
import json
import requests
import base64
from requests.auth import HTTPBasicAuth
import time

In [None]:
def generate_access_token():
    headers={}
    headers["Content-Type"] = "application/x-www-form-urlencoded"
    headers["Accept"] = "application/json"
    auth = HTTPBasicAuth("bx", "bx")
    data = {
        "grant_type": "urn:ibm:params:oauth:grant-type:apikey",
        "apikey": CLOUD_API_KEY
    }
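    response = requests.post(IAM_URL, data=data, headers=headers, auth=auth)
    # Fail fast with a clear HTTP error instead of a KeyError on 'access_token'
    response.raise_for_status()
    json_data = response.json()
    iam_access_token = json_data['access_token']
    return iam_access_token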

## Patch the binding as pre-production 

We patch the binding created previously as `pre_production`.
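
The request body used for this step follows the JSON Patch format (RFC 6902): a list of operations, each with an `op`, a `path`, and a `value`. To make the effect of a `replace` operation concrete, the sketch below applies such operations to a local dictionary. The function name `apply_replace_ops` is hypothetical, the sample binding document is made up for illustration, and only `replace` on object members is handled.

```python
import copy


def apply_replace_ops(document, operations):
    """Apply JSON Patch 'replace' operations to a nested dict and return a new dict.

    Only 'replace' on object members is handled; array indices and '~' escapes are not.
    """
    result = copy.deepcopy(document)
    for op in operations:
        if op["op"] != "replace":
            raise ValueError("unsupported op: " + op["op"])
        keys = op["path"].lstrip("/").split("/")
        target = result
        for key in keys[:-1]:
            target = target[key]
        if keys[-1] not in target:
            # RFC 6902: the target location must exist for 'replace'
            raise KeyError(op["path"])
        target[keys[-1]] = op["value"]
    return result


# Made-up binding document; the real one is held by the OpenScale service
binding = {"name": "WML Pre-Prod", "operational_space_id": "production"}
patch = [{"op": "replace", "path": "/operational_space_id", "value": "pre_production"}]
print(apply_replace_ops(binding, patch)["operational_space_id"])  # pre_production
```

The same reading applies to nested paths such as `/asset_properties/training_data_reference`, where each path segment names one level of the document.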

In [None]:
headers = {}
headers["Content-Type"] = "application/json"
headers["Authorization"] = "Bearer {}".format(generate_access_token())

payload = [
 {
   "op": "replace",
   "path": "/operational_space_id",
   "value": "pre_production"
 }
]

In [None]:
SERVICE_PROVIDER_URL = WOS_CREDENTIALS["url"] + "/openscale/{0}/v2/service_providers/{1}".format(WOS_GUID, binding_uid)

response = requests.patch(SERVICE_PROVIDER_URL, json=payload, headers=headers)
json_data = response.json()
print(json_data)

## Subscriptions
### Remove existing PreProd and Challenger credit risk subscriptions
This code removes previous subscriptions to the German Credit model to refresh the monitors with the new model and new data.

In [None]:
subscriptions_uids = ai_client.data_mart.subscriptions.get_uids()
for subscription in subscriptions_uids:
    sub_name = ai_client.data_mart.subscriptions.get_details(subscription)['entity']['asset']['name']
    if sub_name == PRE_PROD_MODEL_NAME or sub_name == PRE_PROD_CHALLENGER_MODEL_NAME:
        ai_client.data_mart.subscriptions.delete(subscription)
        print('Deleted existing subscription for', sub_name)

In [None]:
pre_prod_subscription = ai_client.data_mart.subscriptions.add(WatsonMachineLearningAsset(
    pre_prod_model_uid,
    binding_uid=binding_uid,
    problem_type=ProblemType.BINARY_CLASSIFICATION,
    input_data_type=InputDataType.STRUCTURED,
    label_column='Risk',
    prediction_column='predictedLabel',
    probability_column='probability',
    feature_columns = ["CheckingStatus","LoanDuration","CreditHistory","LoanPurpose","LoanAmount","ExistingSavings","EmploymentDuration","InstallmentPercent","Sex","OthersOnLoan","CurrentResidenceDuration","OwnsProperty","Age","InstallmentPlans","Housing","ExistingCreditsCount","Job","Dependents","Telephone","ForeignWorker"],
    categorical_columns = ["CheckingStatus","CreditHistory","LoanPurpose","ExistingSavings","EmploymentDuration","Sex","OthersOnLoan","OwnsProperty","InstallmentPlans","Housing","Job","Telephone","ForeignWorker"]
))

if pre_prod_subscription is None:
    print('Subscription already exists; get the existing one')
    subscriptions_uids = ai_client.data_mart.subscriptions.get_uids()
    for sub in subscriptions_uids:
        if ai_client.data_mart.subscriptions.get_details(sub)['entity']['asset']['name'] == PRE_PROD_MODEL_NAME:
            pre_prod_subscription = ai_client.data_mart.subscriptions.get(sub)

In [None]:
challenger_subscription = ai_client.data_mart.subscriptions.add(WatsonMachineLearningAsset(
    challenger_model_uid,
    binding_uid=binding_uid,
    problem_type=ProblemType.BINARY_CLASSIFICATION,
    input_data_type=InputDataType.STRUCTURED,
    label_column='Risk',
    prediction_column='prediction',
    probability_column='probability',
    feature_columns = ["CheckingStatus","LoanDuration","CreditHistory","LoanPurpose","LoanAmount","ExistingSavings","EmploymentDuration","InstallmentPercent","Sex","OthersOnLoan","CurrentResidenceDuration","OwnsProperty","Age","InstallmentPlans","Housing","ExistingCreditsCount","Job","Dependents","Telephone","ForeignWorker"],
    categorical_columns = ["CheckingStatus","CreditHistory","LoanPurpose","ExistingSavings","EmploymentDuration","Sex","OthersOnLoan","OwnsProperty","InstallmentPlans","Housing","Job","Telephone","ForeignWorker"]
))

if challenger_subscription is None:
    print('Subscription already exists; get the existing one')
    subscriptions_uids = ai_client.data_mart.subscriptions.get_uids()
    for sub in subscriptions_uids:
        if ai_client.data_mart.subscriptions.get_details(sub)['entity']['asset']['name'] == PRE_PROD_CHALLENGER_MODEL_NAME:
            challenger_subscription = ai_client.data_mart.subscriptions.get(sub)

In [None]:
ai_client.data_mart.subscriptions.list()

In [None]:
pre_prod_subscription.uid

In [None]:
challenger_subscription.uid

## Patch the training data reference to the challenger subscription

In [None]:
headers = {}
headers["Content-Type"] = "application/json"
headers["Authorization"] = "Bearer {}".format(generate_access_token())

training_data_reference = {
  "connection": {
    "connection_string": "jdbc:db2://dashdb-txn-sbox-yp-dal09-03.services.dal.bluemix.net:50000/BLUDB:retrieveMessagesFromServerOnGetMessage=true;",
    "database_name": "BLUDB",
    "hostname": "dashdb-txn-sbox-yp-dal09-03.services.dal.bluemix.net",
    "password": "khhz72v+6mcwwkfv",
    "username": "cmb91569"
  },
  "location": {
    "schema_name": "CMB91569",
    "table_name": "CREDIT_RISK_TRAIN_DATA"
  },
  "name": "German credit risk training data",
  "type": "db2"
}

payload = [
 {
   "op": "replace",
   "path": "/asset_properties/training_data_reference",
   "value": training_data_reference
 }
]

In [None]:
SUBSCRIPTION_URL = WOS_CREDENTIALS["url"] + "/v1/data_marts/{0}/service_bindings/{1}/subscriptions/{2}".format(WOS_GUID, binding_uid, challenger_subscription.uid)

response = requests.patch(SUBSCRIPTION_URL, json=payload, headers=headers)
json_data = response.json()
print(json_data)

### Score the model so we can configure monitors
Now that the WML service has been bound and the subscription has been created, we need to send a request to the model before we configure OpenScale. This allows OpenScale to create a payload log in the datamart with the correct schema, so it can capture data coming into and out of the model. First, the code gets the model deployment's endpoint URL, and then sends a few records for predictions.

In [None]:
fields = ["CheckingStatus","LoanDuration","CreditHistory","LoanPurpose","LoanAmount","ExistingSavings","EmploymentDuration","InstallmentPercent","Sex","OthersOnLoan","CurrentResidenceDuration","OwnsProperty","Age","InstallmentPlans","Housing","ExistingCreditsCount","Job","Dependents","Telephone","ForeignWorker"]
values = [
  ["no_checking",13,"credits_paid_to_date","car_new",1343,"100_to_500","1_to_4",2,"female","none",3,"savings_insurance",46,"none","own",2,"skilled",1,"none","yes"],
  ["no_checking",24,"prior_payments_delayed","furniture",4567,"500_to_1000","1_to_4",4,"male","none",4,"savings_insurance",36,"none","free",2,"management_self-employed",1,"none","yes"],
  ["0_to_200",26,"all_credits_paid_back","car_new",863,"less_100","less_1",2,"female","co-applicant",2,"real_estate",38,"none","own",1,"skilled",1,"none","yes"],
  ["0_to_200",14,"no_credits","car_new",2368,"less_100","1_to_4",3,"female","none",3,"real_estate",29,"none","own",1,"skilled",1,"none","yes"],
  ["0_to_200",4,"no_credits","car_new",250,"less_100","unemployed",2,"female","none",3,"real_estate",23,"none","rent",1,"management_self-employed",1,"none","yes"],
  ["no_checking",17,"credits_paid_to_date","car_new",832,"100_to_500","1_to_4",2,"male","none",2,"real_estate",42,"none","own",1,"skilled",1,"none","yes"],
  ["no_checking",33,"outstanding_credit","appliances",5696,"unknown","greater_7",4,"male","co-applicant",4,"unknown",54,"none","free",2,"skilled",1,"yes","yes"],
  ["0_to_200",13,"prior_payments_delayed","retraining",1375,"100_to_500","4_to_7",3,"male","none",3,"real_estate",37,"none","own",2,"management_self-employed",1,"none","yes"]
]

payload_scoring = {"fields": fields,"values": values}
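
The scoring payload pairs a single `fields` list with a list of `values` rows, so every row has to supply exactly one entry per field. A minimal sketch of a pre-flight check is shown here; `check_payload` is a hypothetical helper written for illustration and is not part of the WML client.

```python
def check_payload(payload):
    """Return the indexes of rows whose length differs from the field count."""
    expected = len(payload["fields"])
    return [i for i, row in enumerate(payload["values"]) if len(row) != expected]


# Small made-up payload (not the notebook's data)
sample = {
    "fields": ["CheckingStatus", "LoanDuration", "Age"],
    "values": [
        ["no_checking", 13, 46],
        ["0_to_200", 26],  # one entry missing on purpose
    ],
}
bad_rows = check_payload(sample)
print(bad_rows)
```

Running `check_payload(payload_scoring)` before scoring should return an empty list for a well-formed payload.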

In [None]:
from watson_machine_learning_client import WatsonMachineLearningAPIClient

wml_client = WatsonMachineLearningAPIClient(PRE_PROD_WML_CREDENTIALS)

In [None]:
pre_prod_credit_risk_scoring_endpoint = None
print(pre_prod_deployment_uid)

for deployment in wml_client.deployments.get_details()['resources']:
    if pre_prod_deployment_uid in deployment['metadata']['guid']:
        pre_prod_credit_risk_scoring_endpoint = deployment['entity']['scoring_url']
        
print(pre_prod_credit_risk_scoring_endpoint)
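
The same endpoint lookup is repeated for each deployment in this notebook. A possible refactoring is sketched here; `find_scoring_url` is a hypothetical helper, and it compares guids for equality rather than with the substring test (`in`) used in the cell above. The sample data is made up.

```python
def find_scoring_url(deployments, deployment_uid):
    """Return the scoring URL of the first deployment whose guid matches, else None."""
    for deployment in deployments:
        if deployment["metadata"]["guid"] == deployment_uid:
            return deployment["entity"]["scoring_url"]
    return None


# Made-up records shaped like the 'resources' entries used above
fake_resources = [
    {"metadata": {"guid": "aaa"}, "entity": {"scoring_url": "https://example.invalid/aaa"}},
    {"metadata": {"guid": "bbb"}, "entity": {"scoring_url": "https://example.invalid/bbb"}},
]
found = find_scoring_url(fake_resources, "bbb")
missing = find_scoring_url(fake_resources, "zzz")
print(found, missing)
```

With the client in this notebook, the call would look like `find_scoring_url(wml_client.deployments.get_details()['resources'], pre_prod_deployment_uid)`.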

In [None]:
scoring_response = wml_client.deployments.score(pre_prod_credit_risk_scoring_endpoint, payload_scoring)

print('Single record scoring result:', '\n fields:', scoring_response['fields'], '\n values: ', scoring_response['values'][0])

In [None]:
time.sleep(10)
pre_prod_subscription.payload_logging.get_records_count()

In [None]:
challenger_credit_risk_scoring_endpoint = None
print(challenger_deployment_uid)

for deployment in wml_client.deployments.get_details()['resources']:
    if challenger_deployment_uid in deployment['metadata']['guid']:
        challenger_credit_risk_scoring_endpoint = deployment['entity']['scoring_url']
        
print(challenger_credit_risk_scoring_endpoint)

In [None]:
scoring_response = wml_client.deployments.score(challenger_credit_risk_scoring_endpoint, payload_scoring)

print('Single record scoring result:', '\n fields:', scoring_response['fields'], '\n values: ', scoring_response['values'][0])

In [None]:
time.sleep(10)
challenger_subscription.payload_logging.get_records_count()

# Quality monitoring

## Enable quality monitoring
The code below waits ten seconds to allow the payload logging table to be set up before it begins enabling monitors. First, it turns on the quality (accuracy) monitor and sets an alert threshold of 80%. OpenScale will show an alert on the dashboard if the model accuracy measurement (area under the curve, in the case of a binary classifier) falls below this threshold.

The second parameter supplied, min_records, specifies the minimum number of feedback records OpenScale needs before it calculates a new measurement. The quality monitor runs hourly, but the accuracy reading in the dashboard will not change until an additional 100 feedback records (the min_records value used below) have been added, via the user interface, the Python client, or the supplied feedback endpoint.
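
To make the interplay of the two parameters concrete, here is a small illustration of the behaviour described above. It is only a sketch of the description, not OpenScale's implementation; `quality_alert` and its return values are hypothetical.

```python
def quality_alert(metric_value, feedback_count, threshold=0.8, min_records=100):
    """Classify one evaluation window as 'not_evaluated', 'alert', or 'ok'."""
    # Too few feedback records: no new measurement is calculated
    if feedback_count < min_records:
        return "not_evaluated"
    # A measurement below the threshold raises an alert
    return "alert" if metric_value < threshold else "ok"


print(quality_alert(0.75, 120))
print(quality_alert(0.90, 120))
print(quality_alert(0.50, 40))
```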

In [None]:
time.sleep(10)
pre_prod_subscription.quality_monitoring.enable(threshold=0.8, min_records=100)

In [None]:
time.sleep(10)
challenger_subscription.quality_monitoring.enable(threshold=0.8, min_records=100)

# Fairness, drift monitoring and explanations 

## Fairness configuration
The code below configures fairness monitoring for our model. It turns on monitoring for two features, Sex and Age. In each case, we must specify:

- Which model feature to monitor
- One or more majority groups, which are values of that feature that we expect to receive a higher percentage of favourable outcomes
- One or more minority groups, which are values of that feature that we expect to receive a higher percentage of unfavourable outcomes
- The threshold below which we would like OpenScale to display an alert (in this case, 80%)

Additionally, we must specify which outcomes from the model are favourable outcomes, and which are unfavourable. We must also provide the number of records OpenScale will use to calculate the fairness score. In this case, OpenScale's fairness monitor will run hourly, but will not calculate a new fairness rating until at least 100 records have been added. Finally, to calculate fairness, OpenScale must perform some calculations on the training data, so we provide the dataframe containing the data.
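
As an illustration of how majority groups, minority groups, and the threshold fit together, the sketch below assumes the fairness measurement is a disparate-impact style ratio: the favourable-outcome rate of the minority group divided by that of the majority group. That formula is an assumption made for this example; the helper names and the records are made up.

```python
def favourable_rate(records, feature, group_values, favourable_classes):
    """Share of records in the group that received a favourable prediction."""
    group = [r for r in records if r[feature] in group_values]
    if not group:
        return None
    hits = sum(1 for r in group if r["prediction"] in favourable_classes)
    return hits / len(group)


def fairness_ratio(records, feature, majority, minority, favourable_classes):
    """Minority favourable rate divided by majority favourable rate (assumed formula)."""
    majority_rate = favourable_rate(records, feature, majority, favourable_classes)
    minority_rate = favourable_rate(records, feature, minority, favourable_classes)
    if not majority_rate or minority_rate is None:
        return None  # undefined when a group is empty or the majority rate is zero
    return minority_rate / majority_rate


# Made-up predictions: 4 of 4 male and 3 of 4 female records are "No Risk"
records = (
    [{"Sex": "male", "prediction": "No Risk"}] * 4
    + [{"Sex": "female", "prediction": "No Risk"}] * 3
    + [{"Sex": "female", "prediction": "Risk"}]
)
ratio = fairness_ratio(records, "Sex", ["male"], ["female"], ["No Risk"])
print(ratio, ratio < 0.80)
```

Under this assumption, a ratio below the 0.80 threshold is what would raise an alert for the Sex feature.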

In [None]:
import pandas as pd
pd_data = pd.read_csv("german_credit_data_biased_training.csv", sep=",", header=0)

pre_prod_subscription.fairness_monitoring.enable(
            features=[
                Feature("Sex", majority=['male'], minority=['female'], threshold=0.80),
                Feature("Age", majority=[[26,74]], minority=[[19,25]], threshold=0.80)
            ],
            favourable_classes=['No Risk'],
            unfavourable_classes=['Risk'],
            min_records=100,
            training_data=pd_data
        )

In [None]:
challenger_subscription.fairness_monitoring.enable(
            features=[
                Feature("Sex", majority=['male'], minority=['female'], threshold=0.80),
                Feature("Age", majority=[[26,74]], minority=[[19,25]], threshold=0.80)
            ],
            favourable_classes=['No Risk'],
            unfavourable_classes=['Risk'],
            min_records=100,
            training_data=pd_data
        )

## Drift configuration

Enable drift monitoring for both subscriptions with a threshold of 10% and a minimum sample size of 100 records.

In [None]:
pre_prod_subscription.drift_monitoring.enable(threshold=0.10, min_records=100)

In [None]:
drift_status = None
while drift_status != 'finished':
    drift_details = pre_prod_subscription.drift_monitoring.get_details()
    drift_status = drift_details['parameters']['config_status']['state']
    if drift_status != 'finished':
        print(datetime.utcnow().strftime('%H:%M:%S'), drift_status)
        time.sleep(30)
print(drift_status)

In [None]:
challenger_subscription.drift_monitoring.enable(threshold=0.10, min_records=100)

In [None]:
drift_status = None
while drift_status != 'finished':
    drift_details = challenger_subscription.drift_monitoring.get_details()
    drift_status = drift_details['parameters']['config_status']['state']
    if drift_status != 'finished':
        print(datetime.utcnow().strftime('%H:%M:%S'), drift_status)
        time.sleep(30)
print(drift_status)
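
The polling loops above run until the state becomes `finished`, so a configuration that ends in another terminal state would keep them waiting. A bounded variant is sketched here; `wait_for_state`, its parameters, and the `error` state name are assumptions for illustration, and the state sequence is simulated.

```python
import time


def wait_for_state(get_state, done_states=("finished",), failed_states=("error",),
                   max_attempts=20, interval_seconds=30, sleep=time.sleep):
    """Poll get_state() until a terminal state is seen or the attempt budget is used up."""
    state = None
    for _ in range(max_attempts):
        state = get_state()
        if state in done_states or state in failed_states:
            break
        sleep(interval_seconds)
    return state


# Simulated state sequence; sleep is replaced so the demo returns immediately
states = iter(["preparing", "running", "finished"])
final_state = wait_for_state(lambda: next(states), sleep=lambda seconds: None)
stuck_state = wait_for_state(lambda: "running", max_attempts=3, sleep=lambda seconds: None)
print(final_state, stuck_state)
```

In this notebook, `get_state` could be `lambda: pre_prod_subscription.drift_monitoring.get_details()['parameters']['config_status']['state']`.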

## Configure Explainability
Finally, we provide OpenScale with the training data to enable and configure the explainability features.

In [None]:
from ibm_ai_openscale.supporting_classes import *

pre_prod_subscription.explainability.enable(training_data=pd_data)
challenger_subscription.explainability.enable(training_data=pd_data)

## Enable MRM 

We enable the MRM monitor for both subscriptions.

In [None]:
headers = {}
headers["Content-Type"] = "application/json"
headers["Authorization"] = "Bearer {}".format(generate_access_token())

payload = {
  "data_mart_id": WOS_GUID,
  "monitor_definition_id": "mrm",
  "target": {
    "target_id": pre_prod_subscription.uid,
    "target_type": "subscription"
  },
  "parameters": {
  },
  "managed_by": "user"
}

MONITOR_INSTANCES_URL = WOS_CREDENTIALS["url"] + "/openscale/{0}/v2/monitor_instances".format(WOS_GUID)

response = requests.post(MONITOR_INSTANCES_URL, json=payload, headers=headers)
json_data = response.json()
print(json_data)
if "metadata" in json_data and "id" in json_data["metadata"]:
    pre_prod_mrm_instance_id = json_data["metadata"]["id"]

In [None]:
headers = {}
headers["Content-Type"] = "application/json"
headers["Authorization"] = "Bearer {}".format(generate_access_token())

payload = {
  "data_mart_id": WOS_GUID,
  "monitor_definition_id": "mrm",
  "target": {
    "target_id": challenger_subscription.uid,
    "target_type": "subscription"
  },
  "parameters": {
  },
  "managed_by": "user"
}

MONITOR_INSTANCES_URL = WOS_CREDENTIALS["url"] + "/openscale/{0}/v2/monitor_instances".format(WOS_GUID)

response = requests.post(MONITOR_INSTANCES_URL, json=payload, headers=headers)
json_data = response.json()
print(json_data)
if "metadata" in json_data and "id" in json_data["metadata"]:
    challenger_mrm_instance_id = json_data["metadata"]["id"]
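
The two cells above differ only in the subscription they target. One way to avoid the duplication is a small payload builder; `build_mrm_payload` is a hypothetical helper, and the identifiers in the demo are placeholders.

```python
def build_mrm_payload(data_mart_id, subscription_id):
    """Build the monitor-instance request body used above for one subscription."""
    return {
        "data_mart_id": data_mart_id,
        "monitor_definition_id": "mrm",
        "target": {"target_id": subscription_id, "target_type": "subscription"},
        "parameters": {},
        "managed_by": "user",
    }


# Placeholder ids, for illustration only
demo_payload = build_mrm_payload("data-mart-1", "subscription-1")
print(demo_payload["target"])
```

In the notebook this would be called as `build_mrm_payload(WOS_GUID, pre_prod_subscription.uid)` and `build_mrm_payload(WOS_GUID, challenger_subscription.uid)`.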

## Create test data sets from the training data 

In [None]:
test_data_1 = pd_data[1:201]
test_data_1.to_csv("german_credit_risk_test_data_1.csv", encoding="utf-8", index=False)
test_data_2 = pd_data[201:401]
test_data_2.to_csv("german_credit_risk_test_data_2.csv", encoding="utf-8", index=False)
test_data_3 = pd_data[401:601]
test_data_3.to_csv("german_credit_risk_test_data_3.csv", encoding="utf-8", index=False)
test_data_4 = pd_data[601:801]
test_data_4.to_csv("german_credit_risk_test_data_4.csv", encoding="utf-8", index=False)
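
The four slices follow a regular pattern: each set holds 200 rows and starts where the previous one ends. The sketch below computes the same bounds; `slice_bounds` is a hypothetical helper.

```python
def slice_bounds(first_row, rows_per_set, set_count):
    """Return (start, stop) pairs for consecutive, equally sized row slices."""
    return [
        (first_row + i * rows_per_set, first_row + (i + 1) * rows_per_set)
        for i in range(set_count)
    ]


bounds = slice_bounds(1, 200, 4)
print(bounds)  # [(1, 201), (201, 401), (401, 601), (601, 801)]
```

Each pair can then drive a loop that writes `pd_data[start:stop]` to its own CSV file, as the cell above does by hand.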

## Function to upload, evaluate and check the status of the evaluation

This function uploads the test data CSV and triggers the risk evaluation. It then polls the status of the evaluation until it has finished or a fixed number of retries has been used up.

In [None]:
def upload_and_evaluate(file_name, mrm_instance_id):
    
    print("Running upload and evaluate for {}".format(file_name))
    import json
    import time
    from datetime import datetime

    status = None
    monitoring_run_id = None
    GET_UPLOAD_AND_EVALUATION_STATUS_RETRIES = 32
    GET_UPLOAD_AND_EVALUATION_STATUS_INTERVAL = 10
    
    if file_name is not None:
        
        headers = {}
        headers["Content-Type"] = "text/csv"
        headers["Authorization"] = "Bearer {}".format(generate_access_token())
        
        POST_EVALUATIONS_URL = WOS_CREDENTIALS["url"] + "/openscale/{0}/v2/monitoring_services/mrm/monitor_instances/{1}/risk_evaluations?test_data_set_name={2}".format(WOS_GUID, mrm_instance_id, file_name)

        # Read the CSV and encode it as UTF-8 bytes for the request body
        with open(file_name) as file:
            csv_bytes = file.read().encode('utf-8')

        response = requests.post(POST_EVALUATIONS_URL, data=csv_bytes, headers=headers)
        if not response.ok:
            print("Upload and evaluate for {0} failed with {1}: {2}".format(file_name, response.status_code, response.reason))
            # Keep the return shape consistent with the success path
            return status, monitoring_run_id
        
        headers = {}
        headers["Content-Type"] = "application/json"
        headers["Authorization"] = "Bearer {}".format(generate_access_token())

        GET_EVALUATIONS_URL = WOS_CREDENTIALS["url"] + "/openscale/{0}/v2/monitoring_services/mrm/monitor_instances/{1}/risk_evaluations".format(WOS_GUID, mrm_instance_id)
        
        for i in range(GET_UPLOAD_AND_EVALUATION_STATUS_RETRIES):
        
            response = requests.get(GET_EVALUATIONS_URL, headers=headers)
            if not response.ok:
                print("Getting status of upload and evaluate for {0} failed with {1}: {2}".format(file_name, response.status_code, response.reason))
                return status, monitoring_run_id

            response = json.loads(response.text)
            if "metadata" in response and "id" in response["metadata"]:
                monitoring_run_id = response["metadata"]["id"]
            if "entity" in response and "status" in response["entity"]:
                status = response["entity"]["status"]["state"]
            
            if status is not None:
                print(datetime.utcnow().strftime('%H:%M:%S'), status.lower())
                if status.lower() in ["finished", "completed"]:
                    break
                elif "error" in status.lower():
                    print(response)
                    break

            time.sleep(GET_UPLOAD_AND_EVALUATION_STATUS_INTERVAL)

    return status, monitoring_run_id
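
The function above places the file name directly into the query string, which works for the file names used here but would break for names containing spaces or reserved characters. A sketch of building the same URL with percent-encoding follows; `risk_evaluations_url` is a hypothetical helper, the host and ids are placeholders, and whether the service expects `%20` for spaces is an assumption.

```python
from urllib.parse import quote, urlencode


def risk_evaluations_url(base_url, data_mart_id, mrm_instance_id, test_data_set_name=None):
    """Build the risk_evaluations URL, percent-encoding the optional data set name."""
    url = "{0}/openscale/{1}/v2/monitoring_services/mrm/monitor_instances/{2}/risk_evaluations".format(
        base_url, data_mart_id, mrm_instance_id)
    if test_data_set_name is not None:
        # quote_via=quote encodes spaces as %20 instead of '+'
        url += "?" + urlencode({"test_data_set_name": test_data_set_name}, quote_via=quote)
    return url


post_url = risk_evaluations_url("https://example.invalid", "dm-1", "mrm-1", "test data 1.csv")
get_url = risk_evaluations_url("https://example.invalid", "dm-1", "mrm-1")
print(post_url)
print(get_url)
```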

## Perform Risk Evaluations

We now run risk evaluations with the smaller test data sets against both the PreProd and Challenger subscriptions.

In [None]:
upload_and_evaluate("german_credit_risk_test_data_1.csv", pre_prod_mrm_instance_id)

In [None]:
upload_and_evaluate("german_credit_risk_test_data_2.csv", pre_prod_mrm_instance_id)

In [None]:
upload_and_evaluate("german_credit_risk_test_data_3.csv", pre_prod_mrm_instance_id)

In [None]:
upload_and_evaluate("german_credit_risk_test_data_4.csv", pre_prod_mrm_instance_id)

In [None]:
upload_and_evaluate("german_credit_risk_test_data_1.csv", challenger_mrm_instance_id)

In [None]:
upload_and_evaluate("german_credit_risk_test_data_2.csv", challenger_mrm_instance_id)

In [None]:
upload_and_evaluate("german_credit_risk_test_data_3.csv", challenger_mrm_instance_id)

In [None]:
upload_and_evaluate("german_credit_risk_test_data_4.csv", challenger_mrm_instance_id)

## Explore the Model Risk Management UI

Here is a quick recap of what we have done so far.

1. We've deployed two Credit Risk Models to a WML instance that is designated as Pre-Production.
2. We've created subscriptions for these two model deployments in OpenScale.
3. We've configured all monitors supported by OpenScale for these subscriptions.
4. We've performed a few risk evaluations against both subscriptions with the same set of test data.

Now, please explore the Model Risk Management UI to visualize the results, compare the performance of the models, and download the evaluation report as a PDF.

Link to OpenScale : https://aiopenscale.cloud.ibm.com/aiopenscale/insights?mrm=true

# Promote pre-production model to production 

After you have reviewed the evaluation results of PreProd versus Challenger, and if you decide to promote the PreProd model to production, the first step is to deploy the model into a WML instance that is designated as the Production instance.

## Deploy model to production WML instance 

In [None]:
PROD_MODEL_NAME="German Credit Risk Model - Prod"
PROD_DEPLOYMENT_NAME="German Credit Risk Model - Prod"

In [None]:
prod_model_uid, prod_deployment_uid = deploy_credit_risk_spark_model(PROD_WML_CREDENTIALS, PROD_MODEL_NAME, PROD_DEPLOYMENT_NAME)

## Bind WML machine learning instance as Prod

Watson OpenScale needs to be bound to the Watson Machine Learning instance to capture payload data into and out of the model. If a binding for the production WML instance already exists, this code reuses it; otherwise, it creates a new binding named "WML Prod".

In [None]:
all_bindings = ai_client.data_mart.bindings.get_details()['service_bindings']
existing_binding = False
for binding in all_bindings:
    binding_uid = binding['metadata']['guid']
    if binding['metadata']['guid'] == PROD_WML_CREDENTIALS['instance_id']:
        existing_binding = True
        break

if not existing_binding:
    binding_uid = ai_client.data_mart.bindings.add('WML Prod', WatsonMachineLearningInstance(PROD_WML_CREDENTIALS))
    
bindings_details = ai_client.data_mart.bindings.get_details()
ai_client.data_mart.bindings.list()

In [None]:
print(binding_uid)
ai_client.data_mart.bindings.get_details(binding_uid)

In [None]:
ai_client.data_mart.bindings.list_assets()

## Patch binding as production

In [None]:
headers = {}
headers["Content-Type"] = "application/json"
headers["Authorization"] = "Bearer {}".format(generate_access_token())

payload = [
 {
   "op": "replace",
   "path": "/operational_space_id",
   "value": "production"
 }
]

In [None]:
SERVICE_PROVIDER_URL = WOS_CREDENTIALS["url"] + "/openscale/{0}/v2/service_providers/{1}".format(WOS_GUID, binding_uid)
response = requests.patch(SERVICE_PROVIDER_URL, json=payload, headers=headers)
json_data = response.json()
print(json_data)
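
The request body above is a list of JSON Patch operations (RFC 6902). To show what a `replace` operation does to the stored document, here is a minimal local sketch. `apply_replace` is a hypothetical helper that handles only `replace` on nested dicts (no arrays, no `~0`/`~1` escapes), and the sample document and its starting value are made up.

```python
import copy


def apply_replace(document, operation):
    """Apply one JSON Patch 'replace' operation to a nested dict and return a patched copy."""
    if operation["op"] != "replace":
        raise ValueError("only 'replace' is handled in this sketch")
    patched = copy.deepcopy(document)
    keys = operation["path"].strip("/").split("/")
    target = patched
    for key in keys[:-1]:
        target = target[key]
    if keys[-1] not in target:
        # RFC 6902 requires the target location to exist for 'replace'
        raise KeyError(keys[-1])
    target[keys[-1]] = operation["value"]
    return patched


# Made-up service provider document
provider = {"name": "WML Prod", "operational_space_id": "pre_production"}
operation = {"op": "replace", "path": "/operational_space_id", "value": "production"}
patched_provider = apply_replace(provider, operation)
print(patched_provider)
```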

## Remove existing prod subscription

This code removes any previous subscription whose name matches `German Credit Risk Model - Prod`, because that subscription is expected to be created only by this notebook.

In [None]:
subscriptions_uids = ai_client.data_mart.subscriptions.get_uids()
for subscription in subscriptions_uids:
    sub_name = ai_client.data_mart.subscriptions.get_details(subscription)['entity']['asset']['name']
    if sub_name == PROD_MODEL_NAME:
        ai_client.data_mart.subscriptions.delete(subscription)
        print('Deleted existing subscription for', sub_name)

# Import configuration settings from pre-prod model

MRM provides an important feature that lets you copy the configuration settings of your pre-production subscription to the production subscription. To try this out:

1. Navigate to Model Monitors view in Insights dashboard of OpenScale
2. Click on the Add to dashboard
3. Select the production model deployment from WML production machine learning provider and click on Configure
4. In Selections saved dialog, click on Configure monitors
5. Click on Import settings
6. In the Import configuration settings dialog, choose the `German Credit Risk Model - PreProd` as the subscription from which you want to import the settings and click Configure
7. In the Replace existing settings? dialog, click on Import

All the configuration settings are now copied into the production subscription.


<b>Note: The next set of cells should be executed only after finishing the import settings from the OpenScale dashboard</b>

## Score the production model so that we can trigger monitors

Now that the production subscription has been configured by copying the settings, each monitor has a schedule on which it runs. 
Quality, Fairness and MRM run hourly. Drift runs once every three hours.

For this demo, we will trigger the monitors on demand so that we can see the model summary dashboard without having to wait a full hour. 
To do that, let's first push some records into the Payload Logging table.

In [None]:
from watson_machine_learning_client import WatsonMachineLearningAPIClient

wml_client = WatsonMachineLearningAPIClient(PROD_WML_CREDENTIALS)

In [None]:
prod_credit_risk_scoring_endpoint = None
print(prod_deployment_uid)

for deployment in wml_client.deployments.get_details()['resources']:
    if prod_deployment_uid in deployment['metadata']['guid']:
        prod_credit_risk_scoring_endpoint = deployment['entity']['scoring_url']
        
print(prod_credit_risk_scoring_endpoint)

In [None]:
df = pd_data.sample(n=400)
df = df.drop(['Risk'], axis=1)
fields = df.columns.tolist()
values = df.values.tolist()

payload_scoring = {"fields": fields,"values": values}
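
The cell above sends all 400 sampled rows in a single request. If smaller requests are preferred (whether the deployment needs this is not stated here), the rows can be split into batches first. `chunk_rows` is a hypothetical helper and the batch size is arbitrary.

```python
def chunk_rows(rows, batch_size):
    """Yield consecutive slices of rows, each holding at most batch_size items."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


# Ten made-up rows split into batches of four
batches = list(chunk_rows(list(range(10)), 4))
print(batches)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

Each batch would then be wrapped as `{"fields": fields, "values": batch}` and scored separately.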

In [None]:
scoring_response = wml_client.deployments.score(prod_credit_risk_scoring_endpoint, payload_scoring)

print('Single record scoring result:', '\n fields:', scoring_response['fields'], '\n values: ', scoring_response['values'][0])

In [None]:
prod_subscription = ai_client.data_mart.subscriptions.get(name=PROD_DEPLOYMENT_NAME)
prod_subscription.uid

In [None]:
time.sleep(10)
prod_subscription.payload_logging.get_records_count()

## Fetch all monitor instances

In [None]:
headers = {}
headers["Content-Type"] = "application/json"
headers["Authorization"] = "Bearer {}".format(generate_access_token())

MONITOR_INSTANCES_URL = WOS_CREDENTIALS["url"] + "/openscale/{0}/v2/monitor_instances?target.target_id={1}&target.target_type=subscription".format(WOS_GUID, prod_subscription.uid)
print(MONITOR_INSTANCES_URL)

response = requests.get(MONITOR_INSTANCES_URL, headers=headers)
monitor_instances = response.json()["monitor_instances"]

drift_monitor_instance_id = None
quality_monitor_instance_id = None
mrm_monitor_instance_id = None

if monitor_instances is not None:
    for monitor_instance in monitor_instances:
        if "entity" in monitor_instance and "monitor_definition_id" in monitor_instance["entity"]:
            monitor_name = monitor_instance["entity"]["monitor_definition_id"]
            if "metadata" in monitor_instance and "id" in monitor_instance["metadata"]:
                # Use a descriptive name instead of shadowing the built-in id()
                instance_id = monitor_instance["metadata"]["id"]
                if monitor_name == "drift":
                    drift_monitor_instance_id = instance_id
                elif monitor_name == "quality":
                    quality_monitor_instance_id = instance_id
                elif monitor_name == "mrm":
                    mrm_monitor_instance_id = instance_id
                    
print("Quality monitor instance id - {0}".format(quality_monitor_instance_id))
print("Drift monitor instance id - {0}".format(drift_monitor_instance_id))
print("MRM monitor instance id - {0}".format(mrm_monitor_instance_id))
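
The lookup above can also be expressed as a mapping from monitor definition id to monitor instance id, which scales to further monitors without extra branches. `index_monitor_instances` is a hypothetical helper and the sample response entries are made up.

```python
def index_monitor_instances(monitor_instances):
    """Map each monitor_definition_id to its monitor instance id."""
    ids = {}
    for instance in monitor_instances or []:
        definition_id = instance.get("entity", {}).get("monitor_definition_id")
        instance_id = instance.get("metadata", {}).get("id")
        # Skip entries that lack either field
        if definition_id is not None and instance_id is not None:
            ids[definition_id] = instance_id
    return ids


# Made-up entries shaped like the monitor_instances list used above
fake_instances = [
    {"metadata": {"id": "q-1"}, "entity": {"monitor_definition_id": "quality"}},
    {"metadata": {"id": "d-1"}, "entity": {"monitor_definition_id": "drift"}},
    {"metadata": {}, "entity": {"monitor_definition_id": "mrm"}},  # no id: skipped
]
instance_ids = index_monitor_instances(fake_instances)
print(instance_ids.get("quality"), instance_ids.get("drift"), instance_ids.get("mrm"))
```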

## Function to get the monitoring run details

In [None]:
def get_monitoring_run_details(monitor_instance_id, monitoring_run_id):
    
    headers = {}
    headers["Content-Type"] = "application/json"
    headers["Authorization"] = "Bearer {}".format(generate_access_token())
    
    MONITORING_RUNS_URL = WOS_CREDENTIALS["url"] + "/openscale/{0}/v2/monitor_instances/{1}/runs/{2}".format(WOS_GUID, monitor_instance_id, monitoring_run_id)
    response = requests.get(MONITORING_RUNS_URL, headers=headers, verify=False)
    return response.json()

## Run on-demand Quality

In [None]:
headers = {}
headers["Content-Type"] = "application/json"
headers["Authorization"] = "Bearer {}".format(generate_access_token())

if quality_monitor_instance_id is not None:
    MONITOR_RUN_URL = WOS_CREDENTIALS["url"] + "/openscale/{0}/v2/monitor_instances/{1}/runs".format(WOS_GUID, quality_monitor_instance_id)
    payload = {
        "triggered_by": "user"
    }
    print("Triggering Quality computation with {}".format(MONITOR_RUN_URL))
    response = requests.post(MONITOR_RUN_URL, json=payload, headers=headers, verify=False)
    json_data = response.json()
    print()
    print(json_data)
    print()
    if "metadata" in json_data and "id" in json_data["metadata"]:
        quality_monitoring_run_id = json_data["metadata"]["id"]
    print("Done triggering Quality computation")

In [None]:
from datetime import datetime

quality_run_status = None
while quality_run_status != 'finished':
    monitoring_run_details = get_monitoring_run_details(quality_monitor_instance_id, quality_monitoring_run_id)
    quality_run_status = monitoring_run_details["entity"]["status"]["state"]
    if quality_run_status == "error":
        print(monitoring_run_details)
        break
    if quality_run_status != 'finished':
        print(datetime.utcnow().strftime('%H:%M:%S'), quality_run_status)
        time.sleep(10)
print(quality_run_status)

## Run on-demand Drift

In [None]:
headers = {}
headers["Content-Type"] = "application/json"
headers["Authorization"] = "Bearer {}".format(generate_access_token())

if drift_monitor_instance_id is not None:
    MONITOR_RUN_URL = WOS_CREDENTIALS["url"] + "/openscale/{0}/v2/monitor_instances/{1}/runs".format(WOS_GUID, drift_monitor_instance_id)
    payload = {
        "triggered_by": "user"
    }
    print("Triggering Drift computation with {}".format(MONITOR_RUN_URL))
    response = requests.post(MONITOR_RUN_URL, json=payload, headers=headers, verify=False)
    json_data = response.json()
    print()
    print(json_data)
    print()
    if "metadata" in json_data and "id" in json_data["metadata"]:
        drift_monitoring_run_id = json_data["metadata"]["id"]
    print("Done triggering Drift computation")

In [None]:
from datetime import datetime

drift_run_status = None
while drift_run_status != 'finished':
    monitoring_run_details = get_monitoring_run_details(drift_monitor_instance_id, drift_monitoring_run_id)
    drift_run_status = monitoring_run_details["entity"]["status"]["state"]
    if drift_run_status == "error":
        print(monitoring_run_details)
        break
    if drift_run_status != 'finished':
        print(datetime.utcnow().strftime('%H:%M:%S'), drift_run_status)
        time.sleep(10)
print(drift_run_status)

## Run on-demand Fairness

In [None]:
headers = {}
headers["Content-Type"] = "application/json"
headers["Authorization"] = "Bearer {}".format(generate_access_token())

FAIRNESS_RUNS_URL = WOS_CREDENTIALS["url"] + "/v1/fairness_monitoring/{0}/runs".format(prod_subscription.uid)
payload = {
  "binding_id": binding_uid,
  "subscription_id": prod_subscription.uid,
  "deployment_id": prod_deployment_uid,
  "data_mart_id": WOS_GUID
}

print("Triggering Fairness computation with {}".format(FAIRNESS_RUNS_URL))
response = requests.post(FAIRNESS_RUNS_URL, json=payload, headers=headers, verify=False)
json_data = response.json()
print()
print(json_data)
print()
print("Done triggering Fairness computation")

In [None]:
from datetime import datetime

fairness_run_status = None
time.sleep(5)
while fairness_run_status != 'FINISHED':
    fairness_monitoring_details = prod_subscription.fairness_monitoring.get_details()
    fairness_run_status = fairness_monitoring_details["parameters"]["run_status"][prod_deployment_uid]["run_status"]
    if fairness_run_status == "FINISHED WITH ERRORS":
        print(fairness_monitoring_details)
        break
    if fairness_run_status != 'FINISHED':
        print(datetime.utcnow().strftime('%H:%M:%S'), fairness_run_status)
        time.sleep(10)
print(fairness_run_status)

## Run on-demand MRM

In [None]:
headers = {}
headers["Content-Type"] = "application/json"
headers["Authorization"] = "Bearer {}".format(generate_access_token())

if mrm_monitor_instance_id is not None:
    MONITOR_RUN_URL = WOS_CREDENTIALS["url"] + "/openscale/{0}/v2/monitor_instances/{1}/runs".format(WOS_GUID, mrm_monitor_instance_id)
    payload = {
        "triggered_by": "user"
    }
    print("Triggering MRM computation with {}".format(MONITOR_RUN_URL))
    response = requests.post(MONITOR_RUN_URL, json=payload, headers=headers, verify=False)
    json_data = response.json()
    print()
    print(json_data)
    print()
    if "metadata" in json_data and "id" in json_data["metadata"]:
        mrm_monitoring_run_id = json_data["metadata"]["id"]
    print("Done triggering MRM computation")

In [None]:
from datetime import datetime

mrm_run_status = None
while mrm_run_status != 'finished':
    monitoring_run_details = get_monitoring_run_details(mrm_monitor_instance_id, mrm_monitoring_run_id)
    mrm_run_status = monitoring_run_details["entity"]["status"]["state"]
    if mrm_run_status == "error":
        print(monitoring_run_details)
        break
    if mrm_run_status != 'finished':
        print(datetime.utcnow().strftime('%H:%M:%S'), mrm_run_status)
        time.sleep(10)
print(mrm_run_status)

## Refresh the model summary of the production subscription in the OpenScale dashboard

This brings us to the end of this demo exercise. Thank you for trying it out.