# Configure credentials

For OpenScale credentials, open your OpenScale instance from the Cloud console and paste the resulting URL into the AIOS_URL variable below. The cell then extracts your instance ID (GUID) from that URL.

In [1]:
AIOS_URL = 'https://console.bluemix.net/services/aiopenscale/crn%3Av1%3Abluemix%3Apublic%3Aaiopenscale%3Aus-south%3Aa%2F7d8b3c34272c0980d973d3e40be9e9d2%3A371eb42f-c440-4295-a64e-1c907492d158%3A%3A?ace_config=%7B%22region%22%3A%22us-south%22%2C%22orgGuid%22%3A%22%22%2C%22redirect%22%3A%22https%3A%2F%2Fconsole.bluemix.net%2Fdashboard%2Fapps%2F%22%2C%22bluemixUIVersion%22%3A%22v6%22%2C%22crn%22%3A%22crn%3Av1%3Abluemix%3Apublic%3Aaiopenscale%3Aus-south%3Aa%2F7d8b3c34272c0980d973d3e40be9e9d2%3A371eb42f-c440-4295-a64e-1c907492d158%3A%3A%22%2C%22id%22%3A%222da019f3-0fd6-4c25-966d-f3952481a870%22%7D&env_id=ibm%3Ayp%3Aus-south'
AIOS_GUID = AIOS_URL.lower().split('?')[0].split('%2f')[1].split('%3a')[1]
print(AIOS_GUID)

371eb42f-c440-4295-a64e-1c907492d158
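
The chained split calls above depend on the exact position of the encoded `/` and `:` characters in the URL. As a rough alternative sketch, the CRN can be percent-decoded first and the GUID read from its colon-separated fields. The helper name `extract_instance_guid` is hypothetical, and the field position (the eighth field) is an assumption based on the sample URL above.

```python
from urllib.parse import urlsplit, unquote

def extract_instance_guid(console_url):
    """Return the service-instance GUID embedded in a Cloud console URL."""
    # The last path segment is the percent-encoded CRN of the instance.
    encoded_crn = urlsplit(console_url).path.rsplit('/', 1)[-1]
    crn_fields = unquote(encoded_crn).split(':')
    # Assumed layout, taken from the sample URL:
    # crn:v1:bluemix:public:aiopenscale:<region>:a/<account>:<instance guid>::
    return crn_fields[7]

sample_url = ('https://console.bluemix.net/services/aiopenscale/'
              'crn%3Av1%3Abluemix%3Apublic%3Aaiopenscale%3Aus-south%3A'
              'a%2F7d8b3c34272c0980d973d3e40be9e9d2%3A'
              '371eb42f-c440-4295-a64e-1c907492d158%3A%3A'
              '?env_id=ibm%3Ayp%3Aus-south')
print(extract_instance_guid(sample_url))
```

If the console URL format differs from the sample, the field index would need adjusting.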


Your Cloud API key can be generated by going to the Cloud console and clicking Manage->Account->Users, selecting "Platform API Keys" from the menu on the left, and then clicking the "Create" button.

In [None]:
CLOUD_API_KEY = "PASTE KEY HERE"

AIOS_CREDENTIALS = {
    "instance_guid": AIOS_GUID,
    "apikey": CLOUD_API_KEY,
    "url": "https://api.aiopenscale.cloud.ibm.com"
}

Copy and paste your WML credentials into the cell below.

In [None]:
WML_CREDENTIALS = {
  "PASTE HERE"
}

Copy and paste your PostgreSQL credentials in the cell below. If you do not have a paid PostgreSQL service, leave the variable set to 'None' and OpenScale will use the internal lite version of PostgreSQL. However, you will not be able to connect Watson Studio to your OpenScale feedback data for model evaluations and re-training.

If you have previously configured OpenScale, this notebook will use your existing datamart UNLESS:
1. You have configured OpenScale to use the internal lite version of PostgreSQL, AND:
2. You provide new PostgreSQL credentials.

In this case, the notebook will remove your existing datamart and create a new one with the supplied credentials.
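
The rules above can be summarized as a small decision function. This is only an illustrative sketch of the logic described in this section; the function name `datamart_action` and the returned strings are made up for the example and are not part of the OpenScale client.

```python
def datamart_action(has_datamart, uses_internal_db, has_postgres_credentials):
    """Describe what the notebook does with the datamart for a given setup."""
    if not has_datamart:
        # Nothing configured yet: create a datamart of the requested kind.
        if has_postgres_credentials:
            return 'create external'
        return 'create internal'
    if uses_internal_db and has_postgres_credentials:
        # The one case where an existing datamart is removed and recreated.
        return 'replace with external'
    return 'keep existing'

print(datamart_action(True, True, True))
print(datamart_action(True, False, True))
```

Only the combination of an existing internal datamart and newly supplied PostgreSQL credentials leads to a replacement.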

In [None]:
POSTGRES_CREDENTIALS = None

For Cloud Object Storage, choose a valid, unique name for your bucket that DOES NOT ALREADY EXIST in your COS instance; this notebook will create the bucket. If you use an existing bucket, the notebook will be unable to write the file. Then paste your COS credentials into the cell below.

In [None]:
COS_BUCKET_NAME = "your-name-here-german-credit-training"
COS_CREDENTIALS = {
  "PASTE HERE"
}
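
Later cells read the `apikey` and `resource_instance_id` entries from COS_CREDENTIALS, so it can help to check for them before continuing. The following is a minimal sketch; `missing_keys` is a hypothetical helper, not part of any IBM library. Note that the untouched placeholder `{"PASTE HERE"}` is a Python set rather than a dict, which the check treats as having every key missing.

```python
def missing_keys(credentials, required_keys):
    """Return the required keys that are absent or empty in credentials."""
    if not isinstance(credentials, dict):
        # The untouched placeholder is a set literal, not a dict.
        return sorted(required_keys)
    return sorted(key for key in required_keys if not credentials.get(key))

# Keys that this notebook reads from COS_CREDENTIALS in later cells.
COS_REQUIRED_KEYS = ('apikey', 'resource_instance_id')

placeholder = {"PASTE HERE"}
print(missing_keys(placeholder, COS_REQUIRED_KEYS))
```

An empty list means the credentials contain everything the later cells look up.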

Paste your Spark credentials in the cell below. If you do not wish to do the model evaluation and retraining portion of the lab, you may skip this part and leave the variable set to None.

In [None]:
SPARK_CREDENTIALS = None

# Package installation

In [None]:
!rm -rf $PIP_BUILD
!pip install --upgrade watson-machine-learning-client --no-cache | tail -n 1
!pip install --upgrade ibm-ai-openscale --no-cache | tail -n 1
!pip install psycopg2-binary | tail -n 1

Restart the kernel to ensure the new libraries are being used.

# Load and explore data

## Load the training data from GitHub

In [None]:
!rm -f credit_risk_training.csv
!wget https://raw.githubusercontent.com/emartensibm/german-credit/binary/credit_risk_training.csv

In [None]:
from pyspark.sql import SparkSession
import json

spark = SparkSession.builder.getOrCreate()
df_data = spark.read.csv(path="credit_risk_training.csv", sep=",", header=True, inferSchema=True)
df_data.head()

## Store the training data in COS

In [None]:
import ibm_boto3
from ibm_botocore.client import Config
import io, urllib

cos = ibm_boto3.resource('s3',
                      ibm_api_key_id=COS_CREDENTIALS['apikey'],
                      ibm_service_instance_id=COS_CREDENTIALS['resource_instance_id'],
                      ibm_auth_endpoint='https://iam.bluemix.net/oidc/token',
                      config=Config(signature_version='oauth'),
                      endpoint_url='https://s3-api.us-geo.objectstorage.softlayer.net')

buckets = []
for bucket in cos.buckets.all():
    buckets.append(bucket.name)

if COS_BUCKET_NAME not in buckets:
    cos.create_bucket(Bucket=COS_BUCKET_NAME)

cos.Bucket(COS_BUCKET_NAME).upload_file('credit_risk_training.csv', 'credit_risk_training.csv')

## Explore data

In [None]:
df_data.printSchema()

In [None]:
print("Number of records: " + str(df_data.count()))

# Create a model

In [None]:
spark_df = df_data
(train_data, test_data) = spark_df.randomSplit([0.8, 0.2], 24)

MODEL_NAME = "AIOS Spark German Risk Model - Final"
DEPLOYMENT_NAME = "AIOS Spark German Risk Deployment - Final"

print("Number of records for training: " + str(train_data.count()))
print("Number of records for evaluation: " + str(test_data.count()))

spark_df.printSchema()

In [None]:
from pyspark.ml.feature import OneHotEncoder, StringIndexer, IndexToString, VectorAssembler
from pyspark.ml.evaluation import BinaryClassificationEvaluator
from pyspark.ml import Pipeline, Model

si_CheckingStatus = StringIndexer(inputCol = 'CheckingStatus', outputCol = 'CheckingStatus_IX')
si_CreditHistory = StringIndexer(inputCol = 'CreditHistory', outputCol = 'CreditHistory_IX')
si_LoanPurpose = StringIndexer(inputCol = 'LoanPurpose', outputCol = 'LoanPurpose_IX')
si_ExistingSavings = StringIndexer(inputCol = 'ExistingSavings', outputCol = 'ExistingSavings_IX')
si_EmploymentDuration = StringIndexer(inputCol = 'EmploymentDuration', outputCol = 'EmploymentDuration_IX')
si_Sex = StringIndexer(inputCol = 'Sex', outputCol = 'Sex_IX')
si_OthersOnLoan = StringIndexer(inputCol = 'OthersOnLoan', outputCol = 'OthersOnLoan_IX')
si_OwnsProperty = StringIndexer(inputCol = 'OwnsProperty', outputCol = 'OwnsProperty_IX')
si_InstallmentPlans = StringIndexer(inputCol = 'InstallmentPlans', outputCol = 'InstallmentPlans_IX')
si_Housing = StringIndexer(inputCol = 'Housing', outputCol = 'Housing_IX')
si_Job = StringIndexer(inputCol = 'Job', outputCol = 'Job_IX')
si_Telephone = StringIndexer(inputCol = 'Telephone', outputCol = 'Telephone_IX')
si_ForeignWorker = StringIndexer(inputCol = 'ForeignWorker', outputCol = 'ForeignWorker_IX')

In [None]:
si_Label = StringIndexer(inputCol="Risk", outputCol="label").fit(spark_df)
label_converter = IndexToString(inputCol="prediction", outputCol="predictedLabel", labels=si_Label.labels)

In [None]:
# Each input column is listed once; the original list repeated "LoanDuration".
va_features = VectorAssembler(inputCols=["CheckingStatus_IX", "CreditHistory_IX", "LoanPurpose_IX", "ExistingSavings_IX", "EmploymentDuration_IX", "Sex_IX",
                                         "OthersOnLoan_IX", "OwnsProperty_IX", "InstallmentPlans_IX", "Housing_IX", "Job_IX", "Telephone_IX", "ForeignWorker_IX",
                                         "LoanDuration", "LoanAmount", "InstallmentPercent", "CurrentResidenceDuration", "Age", "ExistingCreditsCount",
                                         "Dependents"], outputCol="features")

In [None]:
from pyspark.ml.classification import RandomForestClassifier
classifier = RandomForestClassifier(featuresCol="features")

pipeline = Pipeline(stages=[si_CheckingStatus, si_CreditHistory, si_EmploymentDuration, si_ExistingSavings, si_ForeignWorker, si_Housing, si_InstallmentPlans, si_Job, si_LoanPurpose, si_OthersOnLoan,\
                               si_OwnsProperty, si_Sex, si_Telephone, si_Label, va_features, classifier, label_converter])
model = pipeline.fit(train_data)

In [None]:
predictions = model.transform(test_data)
evaluatorDT = BinaryClassificationEvaluator(rawPredictionCol="prediction")
area_under_curve = evaluatorDT.evaluate(predictions)

# The evaluator's default metric is areaUnderROC
print("areaUnderROC = %g" % area_under_curve)
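
For intuition about the metric printed above: the area under the ROC curve equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The plain-Python sketch below illustrates that definition on made-up data; it is not how Spark computes the value, and `area_under_roc` is a hypothetical helper.

```python
def area_under_roc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError('need at least one positive and one negative example')
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))

labels = [1, 1, 0, 0]
print(area_under_roc(labels, [0.9, 0.4, 0.6, 0.1]))  # graded scores
print(area_under_roc(labels, [1, 0, 1, 0]))          # hard 0/1 predictions
```

Hard 0/1 predictions produce many ties, so the resulting value is coarser than one computed from graded scores.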

# Save and deploy the model

In [None]:
from watson_machine_learning_client import WatsonMachineLearningAPIClient
import json

wml_client = WatsonMachineLearningAPIClient(WML_CREDENTIALS)

### Remove existing model and deployment

In [None]:
model_deployment_ids = wml_client.deployments.get_uids()
deleted_model_id = None
for deployment_id in model_deployment_ids:
    deployment = wml_client.deployments.get_details(deployment_id)
    model_id = deployment['entity']['deployable_asset']['guid']
    if deployment['entity']['name'] == DEPLOYMENT_NAME:
        print('Deleting deployment id', deployment_id)
        wml_client.deployments.delete(deployment_id)
        print('Deleting model id', model_id)
        wml_client.repository.delete(model_id)
        deleted_model_id = model_id
wml_client.repository.list_models()

In [None]:
training_data_reference = {
    'name': 'german credit training data',
    'connection': {
        "iam_url": "https://iam.ng.bluemix.net/oidc/token",
        "api_key": COS_CREDENTIALS["apikey"],
        "resource_instance_id": COS_CREDENTIALS["resource_instance_id"],
        "url": "https://s3-api.us-geo.objectstorage.softlayer.net/"
    },
    'source': {
        'bucket': COS_BUCKET_NAME,
        "firstlineheader": "true",
        "file_name": "credit_risk_training.csv",
        "infer_schema": "1",
        "type": "bluemixcloudobjectstorage",
        "file_format": "csv"
    }
}

In [None]:
OUTPUT_DATA_SCHEMA = {'fields': [{'metadata': {'measure': 'discrete',
      'modeling_role': 'feature'},
     'name': 'CheckingStatus',
     'nullable': True,
     'type': 'string'},
    {'metadata': {'modeling_role': 'feature'}, 'name': 'LoanDuration', 'nullable': True, 'type': 'integer'},
    {'metadata': {'measure': 'discrete', 'modeling_role': 'feature'}, 'name': 'CreditHistory', 'nullable': True, 'type': 'string'},
    {'metadata': {'measure': 'discrete', 'modeling_role': 'feature'}, 'name': 'LoanPurpose', 'nullable': True, 'type': 'string'},
    {'metadata': {'modeling_role': 'feature'}, 'name': 'LoanAmount', 'nullable': True, 'type': 'integer'},
    {'metadata': {'measure': 'discrete', 'modeling_role': 'feature'}, 'name': 'ExistingSavings', 'nullable': True, 'type': 'string'},
    {'metadata': {'measure': 'discrete', 'modeling_role': 'feature'}, 'name': 'EmploymentDuration', 'nullable': True, 'type': 'string'},
    {'metadata': {'modeling_role': 'feature'}, 'name': 'InstallmentPercent', 'nullable': True, 'type': 'integer'},
    {'metadata': {'measure': 'discrete', 'modeling_role': 'feature'}, 'name': 'Sex','nullable': True,'type': 'string'},
    {'metadata': {'measure': 'discrete', 'modeling_role': 'feature'},'name': 'OthersOnLoan','nullable': True,'type': 'string'},
    {'metadata': {'modeling_role': 'feature'},'name': 'CurrentResidenceDuration','nullable': True,'type': 'integer'},
    {'metadata': {'measure': 'discrete', 'modeling_role': 'feature'},'name': 'OwnsProperty','nullable': True,'type': 'string'},
    {'metadata': {'modeling_role': 'feature'},'name': 'Age','nullable': True,'type': 'integer'},
    {'metadata': {'measure': 'discrete', 'modeling_role': 'feature'},'name': 'InstallmentPlans','nullable': True,'type': 'string'},
    {'metadata': {'measure': 'discrete', 'modeling_role': 'feature'},'name': 'Housing','nullable': True,'type': 'string'},
    {'metadata': {'modeling_role': 'feature'},'name': 'ExistingCreditsCount','nullable': True,'type': 'integer'},
    {'metadata': {'measure': 'discrete', 'modeling_role': 'feature'},'name': 'Job','nullable': True,'type': 'string'},
    {'metadata': {'modeling_role': 'feature'},'name': 'Dependents','nullable': True,'type': 'integer'},
    {'metadata': {'measure': 'discrete', 'modeling_role': 'feature'},'name': 'Telephone','nullable': True,'type': 'string'},
    {'metadata': {'measure': 'discrete', 'modeling_role': 'feature'},'name': 'ForeignWorker','nullable': True,'type': 'string'},
    {'metadata': {'modeling_role': 'probability'},'name': 'probability','nullable': True,'type': {'containsNull': True, 'elementType': 'double', 'type': 'array'}},
    {'metadata': {'modeling_role': 'prediction'},'name': 'prediction','nullable': True,'type': 'double'},
    {'metadata': {'modeling_role': 'decoded-target'},'name': 'predictedLabel','nullable': True,'type': 'string'},
    {'metadata': {'modeling_role': 'debiased-prediction'},'name': 'debiased_prediction','nullable': True,'type': 'double'},
    {'metadata': {'modeling_role': 'debiased-probability'},'name': 'debiased_probability','nullable': True,'type': {'containsNull': True,'elementType': 'double','type': 'array'}}],
   'type': 'struct'}

In [None]:
model_props = {
    wml_client.repository.ModelMetaNames.NAME: "{}".format(MODEL_NAME),
    wml_client.repository.ModelMetaNames.TRAINING_DATA_REFERENCE: training_data_reference,
    wml_client.repository.ModelMetaNames.EVALUATION_METHOD: "binary",
    wml_client.repository.ModelMetaNames.EVALUATION_METRICS: [
        {
           "name": "areaUnderROC",
           "value": area_under_curve,
           "threshold": 0.7
        }
    ],
    wml_client.repository.ModelMetaNames.OUTPUT_DATA_SCHEMA: OUTPUT_DATA_SCHEMA
}

In [None]:
wml_models = wml_client.repository.get_details()
model_uid = None
for model_in in wml_models['models']['resources']:
    if MODEL_NAME == model_in['entity']['name']:
        model_uid = model_in['metadata']['guid']
        break

if model_uid is None:
    print("Storing model ...")

    published_model_details = wml_client.repository.store_model(model=model, meta_props=model_props, training_data=train_data, pipeline=pipeline)
    model_uid = wml_client.repository.get_model_uid(published_model_details)
    print("Done")

In [None]:
model_uid

In [None]:
wml_deployments = wml_client.deployments.get_details()
deployment_uid = None
for deployment in wml_deployments['resources']:
    if DEPLOYMENT_NAME == deployment['entity']['name']:
        deployment_uid = deployment['metadata']['guid']
        break

if deployment_uid is None:
    print("Deploying model...")

    deployment = wml_client.deployments.create(artifact_uid=model_uid, name=DEPLOYMENT_NAME, asynchronous=False)
    deployment_uid = wml_client.deployments.get_uid(deployment)
    
print("Model id: {}".format(model_uid))
print("Deployment id: {}".format(deployment_uid))

# Configure OpenScale

In [None]:
from ibm_ai_openscale import APIClient
from ibm_ai_openscale.engines import *
from ibm_ai_openscale.utils import *
from ibm_ai_openscale.supporting_classes import PayloadRecord, Feature
from ibm_ai_openscale.supporting_classes.enums import *

## Create schema and datamart

In [None]:
ai_client = APIClient(aios_credentials=AIOS_CREDENTIALS)
ai_client.version

### Set up datamart

In [None]:
SCHEMA_NAME = 'data_mart_credit_risk'

In [None]:
try:
    data_mart_details = ai_client.data_mart.get_details()
    if 'internal_database' in data_mart_details['database_configuration'] and data_mart_details['database_configuration']['internal_database']:
        if POSTGRES_CREDENTIALS is None:
            print('Using existing internal datamart')
        else:
            print('Switching to external datamart')
            ai_client.data_mart.delete(force=True)
            create_postgres_schema(postgres_credentials=POSTGRES_CREDENTIALS, schema_name=SCHEMA_NAME)
            ai_client.data_mart.setup(db_credentials=POSTGRES_CREDENTIALS, schema=SCHEMA_NAME)
    else:
        print('Using existing external datamart')
except Exception:
    # get_details() failed, so no datamart has been set up yet
    if POSTGRES_CREDENTIALS is None:
        print('Setting up internal datamart')
        ai_client.data_mart.setup(internal_db=True)
    else:
        print('Setting up external datamart')
        create_postgres_schema(postgres_credentials=POSTGRES_CREDENTIALS, schema_name=SCHEMA_NAME)
        ai_client.data_mart.setup(db_credentials=POSTGRES_CREDENTIALS, schema=SCHEMA_NAME)

In [None]:
data_mart_details = ai_client.data_mart.get_details()
data_mart_details

## Bind machine learning engines

In [None]:
binding_uid = ai_client.data_mart.bindings.add('WML instance', WatsonMachineLearningInstance(WML_CREDENTIALS))
if binding_uid is None:
    binding_uid = ai_client.data_mart.bindings.get_details()['service_bindings'][0]['metadata']['guid']
bindings_details = ai_client.data_mart.bindings.get_details()
ai_client.data_mart.bindings.list()

In [None]:
print(binding_uid)

In [None]:
ai_client.data_mart.bindings.list_assets()

## Subscriptions

### Remove existing credit risk subscriptions

In [None]:
subscriptions_uids = ai_client.data_mart.subscriptions.get_uids()
for subscription in subscriptions_uids:
    sub_name = ai_client.data_mart.subscriptions.get_details(subscription)['entity']['asset']['name']
    if sub_name == MODEL_NAME:
        ai_client.data_mart.subscriptions.delete(subscription)
        print('Deleted existing subscription for', MODEL_NAME)

In [None]:
# subscription = ai_client.data_mart.subscriptions.add(WatsonMachineLearningAsset(model_uid))
subscription = ai_client.data_mart.subscriptions.add(WatsonMachineLearningAsset(
    model_uid,
    label_column='Risk',
    prediction_column='predictedLabel',
    probability_column='probability'
))
if subscription is None:
    print('Exists already')
    # subscription already exists; get the existing one
    subscriptions_uids = ai_client.data_mart.subscriptions.get_uids()
    for sub in subscriptions_uids:
        if ai_client.data_mart.subscriptions.get_details(sub)['entity']['asset']['name'] == MODEL_NAME:
            subscription = ai_client.data_mart.subscriptions.get(sub)

List the current subscriptions.

In [None]:
subscriptions_uids = ai_client.data_mart.subscriptions.get_uids()
ai_client.data_mart.subscriptions.list()

In [None]:
subscription.get_details()

### Score the model

In [None]:
# wml_client = client.data_mart.bindings.get_native_engine_client(binding_uid=subscription.binding_uid)
credit_risk_scoring_endpoint = None
deployment_uid = subscription.get_deployment_uids()[0]

print(deployment_uid)

for deployment in wml_client.deployments.get_details()['resources']:
    if deployment_uid == deployment['metadata']['guid']:
        credit_risk_scoring_endpoint = deployment['entity']['scoring_url']
        
print(credit_risk_scoring_endpoint)

In [None]:
fields = ["CheckingStatus","LoanDuration","CreditHistory","LoanPurpose","LoanAmount","ExistingSavings","EmploymentDuration","InstallmentPercent","Sex","OthersOnLoan","CurrentResidenceDuration","OwnsProperty","Age","InstallmentPlans","Housing","ExistingCreditsCount","Job","Dependents","Telephone","ForeignWorker"]
values = [
  ["no_checking",13,"credits_paid_to_date","car_new",1343,"100_to_500","1_to_4",2,"female","none",3,"savings_insurance",46,"none","own",2,"skilled",1,"none","yes"],
  ["no_checking",24,"prior_payments_delayed","furniture",4567,"500_to_1000","1_to_4",4,"male","none",4,"savings_insurance",36,"none","free",2,"management_self-employed",1,"none","yes"],
  ["0_to_200",26,"all_credits_paid_back","car_new",863,"less_100","less_1",2,"female","co-applicant",2,"real_estate",38,"none","own",1,"skilled",1,"none","yes"],
  ["0_to_200",14,"no_credits","car_new",2368,"less_100","1_to_4",3,"female","none",3,"real_estate",29,"none","own",1,"skilled",1,"none","yes"],
  ["0_to_200",4,"no_credits","car_new",250,"less_100","unemployed",2,"female","none",3,"real_estate",23,"none","rent",1,"management_self-employed",1,"none","yes"],
  ["no_checking",17,"credits_paid_to_date","car_new",832,"100_to_500","1_to_4",2,"male","none",2,"real_estate",42,"none","own",1,"skilled",1,"none","yes"],
  ["no_checking",33,"outstanding_credit","appliances",5696,"unknown","greater_7",4,"male","co-applicant",4,"unknown",54,"none","free",2,"skilled",1,"yes","yes"],
  ["0_to_200",13,"prior_payments_delayed","retraining",1375,"100_to_500","4_to_7",3,"male","none",3,"real_estate",37,"none","own",2,"management_self-employed",1,"none","yes"]
]

payload_scoring = {"fields": fields,"values": values}
scoring_response = wml_client.deployments.score(credit_risk_scoring_endpoint, payload_scoring)

print(scoring_response)
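
The scoring payload pairs one list of field names with a list of value rows in the same order. To inspect it more easily, the rows can be turned into dictionaries keyed by field name. This is a small sketch with a shortened, made-up payload; `payload_to_records` is a hypothetical helper.

```python
def payload_to_records(payload):
    """Convert a {"fields": [...], "values": [[...], ...]} payload to dicts."""
    fields = payload['fields']
    records = []
    for row in payload['values']:
        if len(row) != len(fields):
            raise ValueError('row has {} values, expected {}'.format(len(row), len(fields)))
        records.append(dict(zip(fields, row)))
    return records

# Shortened example payload; the real one above has 20 fields per row.
sample_payload = {
    "fields": ["CheckingStatus", "LoanDuration", "Sex"],
    "values": [["no_checking", 13, "female"], ["0_to_200", 26, "male"]],
}
for record in payload_to_records(sample_payload):
    print(record)
```

The length check catches rows whose values have drifted out of step with the field list before they are sent for scoring.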

## Quality and feedback monitoring

### Enable quality monitoring

In [None]:
subscription.quality_monitoring.enable(problem_type=ProblemType.BINARY_CLASSIFICATION, threshold=0.7, min_records=5)

### Feedback logging

In [None]:
subscription.feedback_logging.store(
    [
        ["no_checking",28,"outstanding_credit","appliances",5990,"500_to_1000","greater_7",5,"male","co-applicant",3,"car_other",55,"none","free",2,"skilled",2,"yes","yes","Risk"],
        ["greater_200",22,"all_credits_paid_back","car_used",3376,"less_100","less_1",3,"female","none",2,"car_other",32,"none","own",1,"skilled",1,"none","yes","No Risk"],
        ["no_checking",39,"credits_paid_to_date","vacation",6434,"unknown","greater_7",5,"male","none",4,"car_other",39,"none","own",2,"skilled",2,"yes","yes","Risk"],
        ["0_to_200",20,"credits_paid_to_date","furniture",2442,"less_100","unemployed",3,"female","none",1,"real_estate",42,"none","own",1,"skilled",1,"none","yes","No Risk"],
        ["greater_200",4,"all_credits_paid_back","education",4206,"less_100","unemployed",1,"female","none",3,"savings_insurance",27,"none","own",1,"management_self-employed",1,"none","yes","No Risk"],
        ["greater_200",23,"credits_paid_to_date","car_used",2963,"greater_1000","greater_7",4,"male","none",4,"car_other",46,"none","own",2,"skilled",1,"none","yes","Risk"],
        ["no_checking",31,"prior_payments_delayed","vacation",2673,"500_to_1000","1_to_4",3,"male","none",2,"real_estate",35,"stores","rent",1,"skilled",2,"none","yes","Risk"],
        ["no_checking",37,"prior_payments_delayed","other",6971,"500_to_1000","1_to_4",3,"male","none",3,"savings_insurance",54,"none","own",2,"skilled",1,"yes","yes","Risk"],
        ["no_checking",14,"all_credits_paid_back","car_new",1525,"500_to_1000","4_to_7",3,"male","none",4,"real_estate",33,"none","own",1,"skilled",1,"none","yes","No Risk"],
        ["less_0",10,"prior_payments_delayed","furniture",4037,"less_100","4_to_7",3,"male","none",3,"savings_insurance",31,"none","rent",1,"skilled",1,"none","yes","Risk"],
        ["0_to_200",28,"credits_paid_to_date","retraining",1152,"less_100","less_1",2,"female","none",2,"savings_insurance",20,"stores","own",1,"skilled",1,"none","yes","No Risk"],
        ["less_0",17,"credits_paid_to_date","car_new",1880,"less_100","less_1",3,"female","co-applicant",2,"savings_insurance",41,"none","own",1,"skilled",1,"none","yes","No Risk"],
        ["0_to_200",39,"prior_payments_delayed","appliances",5685,"100_to_500","1_to_4",4,"female","none",2,"unknown",37,"none","own",2,"skilled",1,"yes","yes","Risk"],
        ["no_checking",32,"prior_payments_delayed","radio_tv",5105,"500_to_1000","1_to_4",4,"male","none",5,"savings_insurance",44,"none","own",2,"management_self-employed",1,"none","yes","Risk"],
        ["no_checking",38,"prior_payments_delayed","appliances",4990,"500_to_1000","greater_7",4,"male","none",4,"car_other",50,"bank","own",2,"unemployed",2,"yes","yes","Risk"],
        ["less_0",17,"credits_paid_to_date","furniture",1017,"less_100","less_1",2,"female","none",1,"car_other",30,"none","own",1,"skilled",1,"none","yes","No Risk"],
        ["less_0",33,"all_credits_paid_back","car_new",3618,"500_to_1000","4_to_7",2,"male","none",3,"unknown",31,"stores","own",2,"unskilled",1,"none","yes","No Risk"],
        ["less_0",12,"no_credits","car_new",3037,"less_100","less_1",1,"female","none",2,"car_other",31,"stores","own",1,"skilled",1,"none","yes","No Risk"],
        ["no_checking",23,"prior_payments_delayed","furniture",1440,"100_to_500","1_to_4",3,"female","none",3,"real_estate",39,"stores","own",1,"unskilled",1,"yes","yes","No Risk"],
        ["less_0",18,"prior_payments_delayed","retraining",4032,"less_100","1_to_4",2,"female","none",2,"car_other",36,"none","rent",1,"skilled",1,"none","yes","No Risk"],
        ["no_checking",11,"prior_payments_delayed","car_used",944,"greater_1000","1_to_4",3,"male","none",4,"real_estate",35,"none","own",1,"management_self-employed",1,"yes","yes","No Risk"],
        ["no_checking",36,"prior_payments_delayed","appliances",5927,"unknown","greater_7",4,"male","co-applicant",3,"savings_insurance",47,"none","own",2,"skilled",1,"none","yes","Risk"],
        ["no_checking",50,"outstanding_credit","other",4694,"unknown","greater_7",4,"male","none",4,"unknown",37,"none","own",1,"skilled",2,"yes","yes","Risk"],
        ["no_checking",32,"prior_payments_delayed","radio_tv",10584,"100_to_500","1_to_4",3,"male","co-applicant",3,"unknown",46,"stores","own",2,"unskilled",2,"yes","yes","No Risk"],
        ["no_checking",41,"prior_payments_delayed","furniture",8900,"500_to_1000","4_to_7",4,"male","co-applicant",3,"car_other",26,"none","free",2,"skilled",1,"yes","yes","Risk"],
        ["0_to_200",14,"credits_paid_to_date","car_used",1144,"100_to_500","less_1",2,"female","none",2,"real_estate",33,"none","rent",1,"skilled",1,"none","yes","No Risk"],
        ["no_checking",14,"outstanding_credit","appliances",1680,"100_to_500","greater_7",4,"male","none",3,"car_other",47,"none","own",1,"management_self-employed",1,"none","yes","No Risk"],
        ["0_to_200",23,"credits_paid_to_date","retraining",3387,"less_100","less_1",3,"female","none",3,"savings_insurance",28,"none","own",1,"skilled",1,"none","yes","No Risk"],
        ["no_checking",14,"credits_paid_to_date","furniture",1269,"500_to_1000","greater_7",2,"male","none",2,"savings_insurance",39,"none","own",1,"skilled",1,"none","yes","No Risk"],
        ["no_checking",36,"prior_payments_delayed","appliances",9570,"100_to_500","4_to_7",4,"male","co-applicant",3,"car_other",53,"none","free",2,"skilled",1,"yes","yes","No Risk"],
        ["less_0",16,"credits_paid_to_date","car_new",1428,"less_100","4_to_7",1,"male","none",1,"car_other",20,"bank","rent",1,"unemployed",1,"yes","yes","No Risk"],
        ["no_checking",24,"outstanding_credit","car_used",4620,"greater_1000","1_to_4",3,"male","none",4,"savings_insurance",40,"none","own",2,"skilled",1,"yes","yes","No Risk"],
        ["no_checking",34,"prior_payments_delayed","furniture",2196,"500_to_1000","greater_7",3,"male","none",4,"savings_insurance",27,"none","own",1,"skilled",1,"none","yes","No Risk"],
        ["no_checking",25,"prior_payments_delayed","car_used",8708,"100_to_500","1_to_4",4,"male","none",5,"car_other",43,"none","free",2,"management_self-employed",1,"none","yes","No Risk"],
        ["no_checking",37,"outstanding_credit","radio_tv",10550,"unknown","greater_7",5,"male","co-applicant",4,"unknown",48,"stores","own",2,"unemployed",2,"yes","yes","Risk"],
        ["no_checking",27,"prior_payments_delayed","radio_tv",4981,"500_to_1000","4_to_7",4,"male","none",4,"savings_insurance",47,"none","own",2,"management_self-employed",2,"yes","yes","No Risk"],
        ["less_0",13,"all_credits_paid_back","car_new",2436,"less_100","less_1",2,"female","none",1,"savings_insurance",19,"stores","own",1,"skilled",1,"none","yes","No Risk"],
        ["greater_200",25,"outstanding_credit","appliances",4136,"100_to_500","4_to_7",3,"male","none",2,"car_other",46,"bank","own",1,"unemployed",1,"yes","yes","No Risk"],
        ["no_checking",15,"credits_paid_to_date","retraining",4014,"less_100","1_to_4",4,"male","co-applicant",4,"savings_insurance",33,"none","own",1,"skilled",1,"yes","yes","Risk"],
        ["no_checking",28,"prior_payments_delayed","appliances",5440,"100_to_500","4_to_7",3,"male","none",2,"unknown",40,"none","own",2,"skilled",1,"yes","yes","Risk"],
        ["less_0",13,"prior_payments_delayed","appliances",250,"500_to_1000","4_to_7",2,"male","none",3,"car_other",28,"stores","own",1,"skilled",1,"none","yes","No Risk"],
        ["less_0",19,"credits_paid_to_date","furniture",2111,"less_100","4_to_7",3,"male","none",2,"savings_insurance",34,"bank","own",1,"unemployed",2,"none","yes","No Risk"],
        ["no_checking",27,"prior_payments_delayed","appliances",6455,"100_to_500","4_to_7",3,"male","none",4,"car_other",43,"none","own",1,"skilled",1,"none","yes","Risk"],
        ["less_0",17,"credits_paid_to_date","car_used",250,"less_100","4_to_7",3,"female","none",2,"real_estate",40,"none","free",2,"skilled",1,"none","yes","No Risk"],
        ["no_checking",27,"prior_payments_delayed","radio_tv",4521,"100_to_500","less_1",4,"male","none",4,"savings_insurance",28,"none","own",1,"management_self-employed",2,"yes","yes","No Risk"],
        ["no_checking",37,"prior_payments_delayed","other",7945,"500_to_1000","1_to_4",4,"male","none",4,"savings_insurance",39,"none","own",2,"management_self-employed",1,"none","yes","No Risk"],
        ["less_0",6,"all_credits_paid_back","car_used",250,"less_100","1_to_4",2,"male","none",2,"savings_insurance",28,"stores","rent",1,"skilled",1,"none","yes","Risk"],
        ["less_0",14,"all_credits_paid_back","appliances",1431,"less_100","unemployed",1,"female","none",1,"car_other",25,"stores","own",1,"skilled",1,"none","yes","Risk"],
        ["greater_200",5,"credits_paid_to_date","car_used",250,"less_100","4_to_7",3,"male","none",2,"savings_insurance",42,"none","rent",1,"skilled",1,"none","yes","No Risk"]
    ]
)

In [None]:
subscription.feedback_logging.show_table()

In [None]:
import time

run_details = subscription.quality_monitoring.run()
status = run_details['status']
run_uid = run_details['id']  # named run_uid to avoid shadowing the built-in id()
print(run_uid)

print("Run status: {}".format(status))

start_time = time.time()
elapsed_time = 0

# Poll every 10 seconds until the run completes or 60 seconds have passed
while status != 'completed' and elapsed_time < 60:
    time.sleep(10)
    run_details = subscription.quality_monitoring.get_run_details(run_uid=run_uid)
    status = run_details['status']
    elapsed_time = time.time() - start_time
    print("Run status: {}".format(status))
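
The polling pattern in the cell above (check a status, sleep, give up after a timeout) can be factored into a reusable function. The following is a generic sketch, not part of the OpenScale client; `wait_for_status` and its parameters are hypothetical names, and the status source is passed in as a callable so the example runs without any service.

```python
import time

def wait_for_status(fetch_status, target='completed', timeout=60, interval=10):
    """Poll fetch_status() until it returns target or timeout seconds pass."""
    start = time.monotonic()
    status = fetch_status()
    while status != target and time.monotonic() - start < timeout:
        time.sleep(interval)
        status = fetch_status()
    return status

# Stand-in for a real status call: yields a fixed sequence of statuses.
fake_statuses = iter(['initializing', 'running', 'completed'])
final_status = wait_for_status(lambda: next(fake_statuses), interval=0)
print(final_status)
```

The function returns the last status it saw, so the caller can tell a completed run from one that timed out.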

In [None]:
subscription.quality_monitoring.get_run_details()

In [None]:
subscription.quality_monitoring.show_table()

In [None]:
subscription.quality_monitoring._get_data_from_rest_api()

In [None]:
ai_client.data_mart.get_deployment_metrics()

## Fairness monitoring

In [None]:
subscription.fairness_monitoring.enable(
    features=[
        Feature("Sex", majority=['male'], minority=['female'], threshold=0.95),
        Feature("Age", majority=[[26,75]], minority=[[18,25]], threshold=0.95)
    ],
    # Use the model's output column, matching the subscription's prediction_column
    prediction_column='predictedLabel',
    favourable_classes=['No Risk'],
    unfavourable_classes=['Risk'],
    min_records=200
)
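
To give a feel for what a fairness threshold such as 0.95 can mean, one common group-fairness measure is the disparate impact ratio: the rate of favourable outcomes for the minority group divided by the rate for the majority group. Whether OpenScale computes exactly this quantity is an assumption here; the sketch below only illustrates the idea on made-up predictions, and both function names are hypothetical.

```python
def favourable_rate(predictions, favourable_classes):
    """Fraction of predictions that fall in the favourable classes."""
    if not predictions:
        raise ValueError('no predictions for this group')
    hits = sum(1 for p in predictions if p in favourable_classes)
    return hits / len(predictions)

def disparate_impact(minority_predictions, majority_predictions, favourable_classes):
    """Ratio of the minority favourable rate to the majority favourable rate."""
    majority_rate = favourable_rate(majority_predictions, favourable_classes)
    if majority_rate == 0:
        raise ValueError('majority group has no favourable outcomes')
    return favourable_rate(minority_predictions, favourable_classes) / majority_rate

# Made-up predictions for illustration only.
female = ['No Risk', 'Risk', 'Risk', 'No Risk']
male = ['No Risk', 'No Risk', 'No Risk', 'Risk']
ratio = disparate_impact(female, male, ['No Risk'])
print(ratio < 0.95)
```

Under this reading, a ratio below the threshold would be flagged as a potential bias against the minority group.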

## Score the model again now that monitoring is configured

In [None]:
!rm -f german_credit_feed.json
!wget https://raw.githubusercontent.com/emartensibm/german-credit/master/german_credit_feed.json

In [None]:
import random

with open('german_credit_feed.json', 'r') as scoring_file:
    scoring_data = json.load(scoring_file)

fields = scoring_data['fields']
values = []
for _ in range(200):
    values.append(random.choice(scoring_data['values']))
payload_scoring = {"fields": fields, "values": values}

scoring_response = wml_client.deployments.score(credit_risk_scoring_endpoint, payload_scoring)
print(scoring_response)
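
The cell above draws 200 rows with replacement, so the same row can appear more than once and each run sends a different batch. If a repeatable batch is preferred, a seeded generator can be used instead. This is a minimal sketch with made-up rows; `sample_rows` is a hypothetical helper.

```python
import random

def sample_rows(rows, count, seed=None):
    """Draw count rows with replacement, reproducibly when seed is given."""
    rng = random.Random(seed)
    return [rng.choice(rows) for _ in range(count)]

# Made-up rows standing in for scoring_data['values'].
example_rows = [["no_checking", 13], ["0_to_200", 26], ["less_0", 17]]
batch = sample_rows(example_rows, 200, seed=42)
print(len(batch))
```

Using a dedicated `random.Random` instance keeps the seed from affecting other code that relies on the global random state.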

## Run fairness monitoring

In [None]:
run_details = subscription.fairness_monitoring.run()

In [None]:
subscription.fairness_monitoring.show_table()

In [None]:
subscription.get_details()

# Create historical data

In [None]:
!rm -f payload_history*.json
!wget https://raw.githubusercontent.com/emartensibm/german-credit/binary/payload_history_1.json
!wget https://raw.githubusercontent.com/emartensibm/german-credit/binary/payload_history_2.json
!wget https://raw.githubusercontent.com/emartensibm/german-credit/binary/payload_history_3.json
!wget https://raw.githubusercontent.com/emartensibm/german-credit/binary/payload_history_4.json
!wget https://raw.githubusercontent.com/emartensibm/german-credit/binary/payload_history_5.json
!wget https://raw.githubusercontent.com/emartensibm/german-credit/binary/payload_history_6.json
!wget https://raw.githubusercontent.com/emartensibm/german-credit/binary/payload_history_7.json

In [None]:
historyDays = 7

In [None]:
from ibm_ai_openscale.supporting_classes import PayloadRecord, Feature
import datetime
import time

for day in range(historyDays):
    print('Loading day {}'.format(day + 1))
    history_file = 'payload_history_' + str(day + 1) + '.json'
    with open(history_file) as f:
        payloads = json.load(f)
        hourly_records = len(payloads) // 24
        index = 0
        for hour in range(24):
            recordsList = []
            for i in range(hourly_records):
                score_time = str(datetime.datetime.utcnow() + datetime.timedelta(hours=(-(24*day + hour + 1))))
                recordsList.append(PayloadRecord(request=payloads[index]['request'], response=payloads[index]['response'], scoring_timestamp=score_time))
                index += 1
            subscription.payload_logging.store(records=recordsList)
print('Finished')
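
The loop above backdates each batch of records by `24*day + hour + 1` hours, so the seven history files fill the 168 hourly slots before the current time. A small sketch of that offset arithmetic on its own follows; the helper name `backfill_timestamps` and the fixed reference time are assumptions made for illustration.

```python
import datetime


def backfill_timestamps(history_days, now):
    """Return one timestamp per hourly slot, newest first."""
    return [
        now - datetime.timedelta(hours=24 * day + hour + 1)
        for day in range(history_days)
        for hour in range(24)
    ]


# A fixed, arbitrary 'now' so the result does not depend on when this runs.
reference = datetime.datetime(2019, 1, 8, 12, 0, 0)
stamps = backfill_timestamps(7, reference)
print(stamps[0], stamps[-1])
```

The newest slot is one hour before the reference time and the oldest is exactly seven days before it, so no slot lands in the future.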

## Run historical fairness monitoring

In [None]:
data_mart_id = subscription.get_details()['metadata']['url'].split('/service_bindings')[0].split('marts/')[1]
print(data_mart_id)

In [None]:
token_data = {
    'grant_type': 'urn:ibm:params:oauth:grant-type:apikey',
    'response_type': 'cloud_iam',
    'apikey': AIOS_CREDENTIALS['apikey']
}

response = requests.post('https://iam.bluemix.net/identity/token', data=token_data)
iam_token = response.json()['access_token']
iam_headers = {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer %s' % iam_token
}

metrics_url = 'https://api.aiopenscale.cloud.ibm.com/v1/fairness_monitoring'
request_params = {"fairness_history_run": "true"}
fairness_history_payload = {
    "data_mart_id": data_mart_id,
    "asset_id": model_uid,
    "deployment_id": deployment_uid,
    "fairness_history_run": "true",
    "parameters": {
        "model_type": "binary_classification",
        "features": [
            {
                "feature": "Sex",
                "majority": ['male'],
                "minority": ['female'],
                "threshold": 0.95
            },
            {
                "feature": "Age",
                "majority": [[26,75]],
                "minority": [[18,25]],
                "threshold": 0.95
            }
        ],
        "class_label": "predictedLabel",
        "favourable_class": ["No Risk"],
        "unfavourable_class": ["Risk"],
        "min_records": 200
    }
}

response = requests.post(metrics_url, json=fairness_history_payload, headers=iam_headers, params=request_params)
print(response.text)
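
The IAM token request at the top of this cell is repeated verbatim in later cells. One way to avoid the repetition is a small helper, sketched below. The helper name `get_iam_headers` and its `post` parameter are assumptions introduced here; the stub classes exist only so the sketch runs without network access.

```python
def get_iam_headers(api_key, post, token_url='https://iam.bluemix.net/identity/token'):
    """Exchange an API key for a bearer token and return request headers.

    `post` is any callable with the same call shape as requests.post.
    """
    token_data = {
        'grant_type': 'urn:ibm:params:oauth:grant-type:apikey',
        'response_type': 'cloud_iam',
        'apikey': api_key,
    }
    response = post(token_url, data=token_data)
    return {
        'Content-Type': 'application/json',
        'Authorization': 'Bearer %s' % response.json()['access_token'],
    }


class StubResponse:
    """Stand-in for a requests.Response; returns a dummy token."""

    def json(self):
        return {'access_token': 'dummy-token'}


def stub_post(url, data=None):
    """Stand-in for requests.post that never touches the network."""
    return StubResponse()


headers = get_iam_headers('PASTE KEY HERE', post=stub_post)
print(headers['Authorization'])
```

In the notebook itself, `post=requests.post` and `AIOS_CREDENTIALS['apikey']` would take the place of the stub and the placeholder key.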

In [None]:
performance_metrics_url = 'https://api.aiopenscale.cloud.ibm.com' + subscription.get_details()['metadata']['url'].split('/service_bindings')[0] + '/metrics'
print(performance_metrics_url)
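
Both the data mart ID and the metrics endpoint are derived by splitting the subscription URL. The sketch below applies the same splits to a made-up URL. The URL shape is inferred from the split expressions in the cells above and is an assumption, as are the function name and the placeholder IDs.

```python
def split_subscription_url(subscription_url,
                           api_base='https://api.aiopenscale.cloud.ibm.com'):
    """Return (data_mart_id, metrics_endpoint) for a subscription URL path."""
    data_mart_path = subscription_url.split('/service_bindings')[0]
    data_mart_id = data_mart_path.split('marts/')[1]
    return data_mart_id, api_base + data_mart_path + '/metrics'


# Hypothetical path; real values come from subscription.get_details().
example_url = ('/v1/data_marts/00000000-0000-0000-0000-000000000000'
               '/service_bindings/example-binding/subscriptions/example-subscription')
mart_id, metrics_endpoint = split_subscription_url(example_url)
print(mart_id)
print(metrics_endpoint)
```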

In [None]:
# store performance monitor history in MeasurementFacts table
import random
token_data = {
    'grant_type': 'urn:ibm:params:oauth:grant-type:apikey',
    'response_type': 'cloud_iam',
    'apikey': AIOS_CREDENTIALS['apikey']
}

response = requests.post('https://iam.bluemix.net/identity/token', data=token_data)
iam_token = response.json()['access_token']
iam_headers = {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer %s' % iam_token
}

for day in range(historyDays):
    print('Day', day + 1)
    for hour in range(24):
        score_time = (datetime.datetime.utcnow() + datetime.timedelta(hours=(-(24*day + hour + 1)))).strftime('%Y-%m-%dT%H:%M:%SZ')
        score_count = random.randint(60, 600)
        score_resp = random.uniform(60, 300)

        performanceMetric = {
            'metric_type': 'performance',
            'binding_id': binding_uid,
            'timestamp': score_time,
            'subscription_id': model_uid,
            'asset_revision': model_uid,
            'deployment_id': deployment_uid,
            'value': {
                'response_time': score_resp,
                'records': score_count
            }
        }

        response = requests.post(performance_metrics_url, json=[performanceMetric], headers=iam_headers)
print('Finished')

## Load historical quality MeasurementFacts to AIOS

In [None]:
token_data = {
    'grant_type': 'urn:ibm:params:oauth:grant-type:apikey',
    'response_type': 'cloud_iam',
    'apikey': AIOS_CREDENTIALS['apikey']
}

response = requests.post('https://iam.bluemix.net/identity/token', data=token_data)
iam_token = response.json()['access_token']
iam_headers = {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer %s' % iam_token
}

measurements = [0.76, 0.78, 0.68, 0.72, 0.73, 0.77, 0.80]
for day in range(historyDays):
    print('Day', day + 1)
    for hour in range(24):
        score_time = (datetime.datetime.utcnow() + datetime.timedelta(hours=(-(24*day + hour + 1)))).strftime('%Y-%m-%dT%H:%M:%SZ')
        
        qualityMetric = {
            'metric_type': 'quality',
            'binding_id': binding_uid,
            'timestamp': score_time,
            'subscription_id': model_uid,
            'asset_revision': model_uid,
            'deployment_id': deployment_uid,
            'value': {
                'quality': measurements[day],
                'threshold': 0.7,
                'metrics': [
                    {
                        'name': 'auroc',
                        'value': measurements[day],
                        'threshold': 0.7
                    }
                ]
            }
        }

        response = requests.post(performance_metrics_url, json=[qualityMetric], headers=iam_headers)
print('Finished')
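
Each day's quality value is written once per hour, so a day whose value is below the 0.7 threshold (0.68 at index 2 of `measurements`) produces 24 below-threshold records. The sketch below builds one day's records as plain dictionaries so they can be inspected before posting. The function name and the placeholder identifiers are assumptions, and whether the metrics endpoint accepts more than one record per request is not confirmed by the notebook.

```python
import datetime


def build_quality_metrics(day, auroc, now, binding_uid, model_uid,
                          deployment_uid, threshold=0.7):
    """Build the 24 hourly quality records for one historical day."""
    records = []
    for hour in range(24):
        score_time = now - datetime.timedelta(hours=24 * day + hour + 1)
        records.append({
            'metric_type': 'quality',
            'binding_id': binding_uid,
            'timestamp': score_time.strftime('%Y-%m-%dT%H:%M:%SZ'),
            'subscription_id': model_uid,
            'asset_revision': model_uid,
            'deployment_id': deployment_uid,
            'value': {
                'quality': auroc,
                'threshold': threshold,
                'metrics': [{'name': 'auroc', 'value': auroc, 'threshold': threshold}],
            },
        })
    return records


# Placeholder identifiers and a fixed 'now', for illustration only.
day_records = build_quality_metrics(
    day=2, auroc=0.68, now=datetime.datetime(2019, 1, 8, 12, 0, 0),
    binding_uid='binding', model_uid='model', deployment_uid='deployment',
)
below_threshold = [r for r in day_records
                   if r['value']['quality'] < r['value']['threshold']]
print(len(day_records), len(below_threshold))
```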

## Explainability

In [None]:
from ibm_ai_openscale.supporting_classes import *
subscription.explainability.enable(
    problem_type=ProblemType.BINARY_CLASSIFICATION,
    input_data_type=InputDataType.STRUCTURED,
    feature_columns=["CheckingStatus","LoanDuration","CreditHistory","LoanPurpose","LoanAmount","ExistingSavings","EmploymentDuration","InstallmentPercent","Sex","OthersOnLoan","CurrentResidenceDuration","OwnsProperty","Age","InstallmentPlans","Housing","ExistingCreditsCount","Job","Dependents","Telephone","ForeignWorker"],
    categorical_columns=["CheckingStatus","CreditHistory","LoanPurpose","ExistingSavings","EmploymentDuration","Sex","OthersOnLoan","OwnsProperty","InstallmentPlans","Housing","Job","Telephone","ForeignWorker"],
    label_column='predictedLabel',
    training_data_reference=BluemixCloudObjectStorageReference(
        COS_CREDENTIALS,
        COS_BUCKET_NAME + '/credit_risk_training.csv',
        first_line_header=True
    )
)

In [None]:
subscription.explainability.get_details()

## Explain a transaction

In [None]:
# subscription.explainability.run('759509c1e6e85cd72605e57c181681f8-7')

## Additional data to help debugging

In [None]:
print('Datamart:', data_mart_id)
print('Model:', model_uid)
print('Deployment:', deployment_uid)
print('Binding:', binding_uid)
print('Scoring URL:', credit_risk_scoring_endpoint)

In [None]:
subscription.payload_logging.get_details()

In [None]:
subscription.payload_logging.print_table_schema()

In [None]:
if POSTGRES_CREDENTIALS is not None:
    print('Configure performance monitoring in Watson Studio. Feedback data can be found by creating a connection to:')
    uri = data_mart_details['database_configuration']['credentials']['uri']
    print('Username:', uri.split('//')[1].split(':')[0])
    print('Password:', uri.split(':')[2].split('@')[0])
    print('Port:', uri.split('@')[1].split(':')[1].split('/')[0])
    print('Hostname:', uri.split('@')[1].split(':')[0])
    print('Database:', uri.split('@')[1].split('/')[-1])
    print()
    print('Your feedback data is located in the', data_mart_details['database_configuration']['location']['schema'], 'schema,')
    print('in the Feedback_', model_uid, ' table.', sep='')
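
The cell above pulls the connection fields out of the PostgreSQL URI with chained `split` calls, which can misparse if, for example, the password contains a `:`. A sketch of the same extraction with the standard library's `urllib.parse.urlparse` follows; the function name and the example URI are made up for illustration.

```python
from urllib.parse import urlparse


def parse_postgres_uri(uri):
    """Split a postgres:// URI into its connection fields."""
    parsed = urlparse(uri)
    return {
        'username': parsed.username,
        'password': parsed.password,
        'hostname': parsed.hostname,
        'port': parsed.port,  # an int, or None if the URI has no port
        'database': parsed.path.lstrip('/'),
    }


# Hypothetical URI; the real one comes from data_mart_details.
example = parse_postgres_uri('postgres://admin:s3cret@db.example.com:5432/compose')
print(example)
```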


## Set up continuous learning

In [None]:
if POSTGRES_CREDENTIALS is not None:
    accuracy_details = subscription.quality_monitoring.get_details()
    feedback_data_reference = accuracy_details['parameters']['feedback_data_reference']
    feedback_data_reference['connection']['db'] = feedback_data_reference['connection']['uri'].split('/')[-1]
    feedback_data_reference['location']['tablename'] = feedback_data_reference['location']['table_name']
    feedback_data_reference['source'] = {
        "tablename": feedback_data_reference['location']['tablename'],
        "type": "postgresql"
    }
    print(feedback_data_reference)

In [None]:
if POSTGRES_CREDENTIALS is not None and SPARK_CREDENTIALS is not None:
    system_config = {
        wml_client.learning_system.ConfigurationMetaNames.FEEDBACK_DATA_REFERENCE: feedback_data_reference,
        wml_client.learning_system.ConfigurationMetaNames.MIN_FEEDBACK_DATA_SIZE: 50,
        wml_client.learning_system.ConfigurationMetaNames.SPARK_REFERENCE: SPARK_CREDENTIALS,
        wml_client.learning_system.ConfigurationMetaNames.AUTO_RETRAIN: "conditionally",
        wml_client.learning_system.ConfigurationMetaNames.AUTO_REDEPLOY: "always"
    }

    wml_client.learning_system.setup(model_uid=model_uid, meta_props=system_config)
    # Inside an if block the notebook does not echo the value, so print it explicitly.
    print(wml_client.learning_system.get_details(model_uid))

### Feed in 100 rows of feedback data via OpenScale API to simulate input from apps

In [None]:
!rm -f additional_feedback_data.json
!wget https://raw.githubusercontent.com/emartensibm/german-credit/master/additional_feedback_data.json

In [None]:
with open('additional_feedback_data.json') as feedback_file:
    additional_feedback_data = json.load(feedback_file)
subscription.feedback_logging.store(additional_feedback_data['data'])

The cell below starts a learning system run that evaluates, retrains, and redeploys the model. Uncomment the line and run it if you would rather not perform this portion of the demo in Watson Studio.

In [None]:
# wml_client.learning_system.run(model_uid, asynchronous=False)