<img src="https://github.com/pmservice/ai-openscale-tutorials/raw/master/notebooks/images/banner.png" align="left" alt="banner">

# Working with SPSS Collaboration and Deployment Services

This notebook shows how to log payload data for a model deployed on SPSS Collaboration and Deployment Services (C&DS) using the Watson OpenScale Python SDK.

Contents
 1. Setup
 2. Binding machine learning engine
 3. Subscriptions
 4. Performance monitor, scoring and payload logging
 5. Quality monitor and feedback logging
 6. Fairness monitoring and explanations

## 1. Setup

### Sample model creation using SPSS Modeler

- Download the training data set from [here](https://github.com/pmservice/ai-openscale-tutorials/blob/master/notebooks/data/credit_risk_training.csv)
- Download the SPSS Modeler stream from [here](https://github.com/pmservice/ai-openscale-tutorials/blob/master/notebooks/data/german_credit_risk_tutorial.str)
- Deploy the model as a web service using SPSS C&DS

### Installation and authentication

In [1]:
!pip install --upgrade ibm-ai-openscale | tail -n 1

Successfully installed ibm-ai-openscale-2.1.7


Import and initiate.

In [2]:
from ibm_ai_openscale import APIClient4ICP
from ibm_ai_openscale.supporting_classes import PayloadRecord
from ibm_ai_openscale.supporting_classes.enums import InputDataType, ProblemType
from ibm_ai_openscale.engines import *
from ibm_ai_openscale.utils import *
import urllib3
import os

urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

#### Let's define some constants required to set up the data mart:

- AIOS_CREDENTIALS (ICP)
- DATABASE_CREDENTIALS (DB2 on ICP)
- SCHEMA_NAME

In [3]:
AIOS_CREDENTIALS = {
    "url": "***",
    "username": "***",
    "password": "***"
}
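The placeholders above have to be filled in before the client can connect. One option, sketched here as an assumption rather than something this notebook does, is to read the values from environment variables so that they never appear in a shared notebook. The helper name `credentials_from_env` and the `AIOS_*` variable names are hypothetical.

```python
import os


def credentials_from_env(prefix, keys, environ=None):
    """Build a credentials dict from variables named <PREFIX>_<KEY>.

    Hypothetical helper: the naming scheme is an assumption, not part of
    the Watson OpenScale SDK.
    """
    environ = os.environ if environ is None else environ
    names = {key: "{}_{}".format(prefix, key.upper()) for key in keys}
    missing = [name for name in names.values() if name not in environ]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {key: environ[name] for key, name in names.items()}


# An in-memory mapping stands in for os.environ in this example.
example_env = {
    "AIOS_URL": "https://example.invalid",
    "AIOS_USERNAME": "admin",
    "AIOS_PASSWORD": "not-a-real-password",
}
example_credentials = credentials_from_env(
    "AIOS", ["url", "username", "password"], environ=example_env
)
```

Called without `environ`, the helper reads from the real process environment, so `AIOS_CREDENTIALS = credentials_from_env("AIOS", ["url", "username", "password"])` would produce a dict with the same three keys as the cell above.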

In [4]:
# The code was removed by Watson Studio for sharing.

In [5]:
DATABASE_CREDENTIALS = {
    "jdbcurl": "***",
    "hostname": "***",
    "username": "***",
    "password": "***",
    "port": 50000,
    "db": "***",
    "dsn": "***",
    "uri": "***"
}

In [6]:
# The code was removed by Watson Studio for sharing.

In [7]:
print(DATABASE_CREDENTIALS)

{'db_type': 'db2', 'hostname': 'https://9.30.43.150', 'password': 'C0wTiger', 'port': 50000, 'db': 'SAMPLE', 'username': 'db2inst1', 'uri': 'db2://9.30.43.150:50000/SAMPLE:user=db2inst1;password=C0wTiger;'}
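Printing the credentials dict, as in the cell above, writes the password into the notebook output. A minimal sketch of masking sensitive values before printing follows; the `redact` helper and the `SENSITIVE_KEYS` set are hypothetical, and the sample values are made up.

```python
import re

SENSITIVE_KEYS = {"password"}  # assumption: extend with other secret field names


def redact(credentials):
    """Return a copy of a flat credentials dict that is safer to print."""
    safe = {}
    for key, value in credentials.items():
        if key in SENSITIVE_KEYS:
            safe[key] = "***"
        elif isinstance(value, str):
            # Mask inline "password=...;" fragments such as those in a DB2 URI.
            safe[key] = re.sub(r"(password=)[^;]*", r"\1***", value)
        else:
            safe[key] = value
    return safe


sample_credentials = {
    "username": "db2inst1",
    "password": "not-a-real-password",
    "port": 50000,
    "uri": "db2://host:50000/SAMPLE:user=db2inst1;password=not-a-real-password;",
}
redacted = redact(sample_credentials)
```

With this in place, `print(redact(DATABASE_CREDENTIALS))` would show the connection details while leaving the original dict untouched.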


In [26]:
SCHEMA_NAME = 'SPSSTF01'

In [27]:
client = APIClient4ICP(AIOS_CREDENTIALS)

In [28]:
client.version

'2.1.7'

### DataMart setup

In [29]:
#client.data_mart.delete()

In [30]:
client.data_mart.setup(db_credentials=DATABASE_CREDENTIALS, schema=SCHEMA_NAME)

In [31]:
data_mart_details = client.data_mart.get_details()
print(data_mart_details)

{'database_configuration': {'database_type': 'db2', 'credentials': {'hostname': 'https://9.30.43.150', 'db_type': 'db2', 'username': 'db2inst1', 'uri': 'db2://9.30.43.150:50000/SAMPLE:user=db2inst1;password=C0wTiger;', 'db': 'SAMPLE', 'port': 50000, 'password': 'C0wTiger'}, 'location': {'schema': 'SPSSTF01'}, 'name': 'db2'}, 'status': {'state': 'active'}, 'service_instance_crn': 'N/A'}


<a id="binding"></a>
## 2. Bind machine learning engines

### Bind `SPSS C&DS` machine learning engine

Provide credentials using the following fields:
- `username`
- `password`
- `url`

In [32]:
SPSS_CDS_ENGINE_CREDENTIALS = {
        "username": "***",
        "password": "***",
        "url": "***",
    }

In [33]:
# The code was removed by Watson Studio for sharing.

In [34]:
BINDING_NAME = 'My SPSS C&DS engine'
binding_details = client.data_mart.bindings.get_details()
binding_uid = [binding["entity"]["instance_id"] for binding in binding_details['service_bindings'] if binding["entity"]["name"] == BINDING_NAME]

if len(binding_uid) > 0:
    [binding_uid] = binding_uid
else:
    binding_uid = client.data_mart.bindings.add(BINDING_NAME, SPSSMachineLearningInstance(SPSS_CDS_ENGINE_CREDENTIALS))
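The lookup above reuses an existing binding when one with the same name is found and only adds a new one otherwise. The same logic can be sketched as a small function that also reports duplicate names explicitly. The function name `find_binding_uid` is hypothetical, and the sample dict only mimics the fields that the cell above reads (`service_bindings`, `entity.name`, `entity.instance_id`).

```python
def find_binding_uid(binding_details, name):
    """Return the instance_id of the binding called `name`, or None if absent."""
    matches = [
        binding["entity"]["instance_id"]
        for binding in binding_details.get("service_bindings", [])
        if binding["entity"]["name"] == name
    ]
    if len(matches) > 1:
        # Several bindings share the name, so the caller has to disambiguate.
        raise ValueError('Found {} bindings named "{}".'.format(len(matches), name))
    return matches[0] if matches else None


# Made-up data with the same shape as the fields used in the notebook.
sample_details = {
    "service_bindings": [
        {"entity": {"name": "My SPSS C&DS engine", "instance_id": "uid-1"}},
        {"entity": {"name": "Other engine", "instance_id": "uid-2"}},
    ]
}
found_uid = find_binding_uid(sample_details, "My SPSS C&DS engine")
absent_uid = find_binding_uid(sample_details, "No such engine")
```

A `None` result corresponds to the `else` branch above, where the binding is created with `client.data_mart.bindings.add(...)`.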

<a id="subsciption"></a>
## 3. Subscriptions

### Add subscriptions

List available deployments.

**Note:** Depending on the number of assets, this may take some time.

In [35]:
client.data_mart.bindings.list_assets(binding_uid=binding_uid)

0,1,2,3,4,5,6
091e60170f73c4410000016a9a18aecd8265,ai_drug_mlp,0000-01-01T00:00:00.0Z,model,,1d9297e2-5047-4b77-80a3-2fbca8f4fc87,False
091e60170f73c4410000016a9a18aecd8272,ai_drug_svm,0000-01-01T00:00:00.0Z,model,,1d9297e2-5047-4b77-80a3-2fbca8f4fc87,False
091e60170f73c4410000016a9a18aecd8258,ai_drug_mlp_proba_BiasQA,0000-01-01T00:00:00.0Z,model,,1d9297e2-5047-4b77-80a3-2fbca8f4fc87,False
091e60170f73c4410000016a9a18aecd8242,german_credit_risk_tutorial_BiasQA,0000-01-01T00:00:00.0Z,model,,1d9297e2-5047-4b77-80a3-2fbca8f4fc87,False
091e60170f73c4410000016a9a18aecd8ad2,diamonds,0000-01-01T00:00:00.0Z,model,,1d9297e2-5047-4b77-80a3-2fbca8f4fc87,False
091e60170f73c4410000016a9a18aecd81f2,NewTelcoChurnScore,0000-01-01T00:00:00.0Z,model,,1d9297e2-5047-4b77-80a3-2fbca8f4fc87,False
091e60170f73c4410000016a9a18aecd8ada,employee,0000-01-01T00:00:00.0Z,model,,1d9297e2-5047-4b77-80a3-2fbca8f4fc87,False
091e60170f73c4410000016a9a18aecd8be2,walking_activity,0000-01-01T00:00:00.0Z,model,,1d9297e2-5047-4b77-80a3-2fbca8f4fc87,False
091e60170f73c4410000016a9a18aecd8ae2,ulcer-recurrence,0000-01-01T00:00:00.0Z,model,,1d9297e2-5047-4b77-80a3-2fbca8f4fc87,False
091e60170f73c4410000016a9a18aecd8ac9,customer_visits,0000-01-01T00:00:00.0Z,model,,1d9297e2-5047-4b77-80a3-2fbca8f4fc87,False


**Action:** Set `MODEL_NAME` below to the name of your credit risk model. Its `source_uid` is then looked up among the assets and assigned to the `source_uid` variable.

In [36]:
MODEL_NAME='german_credit_risk_tutorial_BiasQA'
asset_details = client.data_mart.bindings.get_asset_details(binding_uid=binding_uid)
source_uid = [asset["source_uid"] for asset in asset_details if asset["name"]==MODEL_NAME]

if len(source_uid)>0:
    [source_uid] = source_uid
else:
    raise ValueError('Model with name "{}" not found.'.format(MODEL_NAME))

In [37]:
subscription = client.data_mart.subscriptions.add(
    SPSSMachineLearningAsset(source_uid=source_uid,
                             binding_uid=binding_uid,
                             input_data_type=InputDataType.STRUCTURED,
                             problem_type=ProblemType.BINARY_CLASSIFICATION,
                             label_column="Risk",
                             prediction_column="$N-Risk",
                             class_probability_columns=["$NP-No Risk", "$NP-Risk"]))

#### Get subscriptions list

In [38]:
subscriptions = client.data_mart.subscriptions.get_details()

In [39]:
subscriptions_uids = client.data_mart.subscriptions.get_uids()
print(subscriptions_uids)

['091e60170f73c4410000016a9a18aecd8242']


#### List subscriptions

In [40]:
client.data_mart.subscriptions.list()

0,1,2,3,4
091e60170f73c4410000016a9a18aecd8242,german_credit_risk_tutorial_BiasQA,model,1d9297e2-5047-4b77-80a3-2fbca8f4fc87,2019-05-14T12:23:23.280Z


## 4. Performance monitor, scoring and payload logging

### Score the credit risk model and measure response time

In [41]:
import requests
from requests.auth import HTTPBasicAuth
import time
import json

subscription_details = subscription.get_details()
scoring_endpoint = subscription_details['entity']['deployments'][0]['scoring_endpoint']['url']
input_table_id = subscription_details['entity']['asset_properties']['input_data_schema']['id']
node_id = subscription_details['entity']['asset']['name']

scoring_payload = {'requestInputTable': [{'id': input_table_id, 'requestInputRow': [{'input': [
            {'name': 'CheckingStatus', 'value': '0_to_200'}, {'name': 'LoanDuration', 'value': '31'},
            {'name': 'CreditHistory', 'value': 'credits_paid_to_date'}, {'name': 'LoanPurpose', 'value': 'other'},
            {'name': 'LoanAmount', 'value': '1889'}, {'name': 'ExistingSavings', 'value': '100_to_500'},
            {'name': 'EmploymentDuration', 'value': 'less_1'}, {'name': 'InstallmentPercent', 'value': '3'},
            {'name': 'Sex', 'value': 'female'}, {'name': 'OthersOnLoan', 'value': 'none'},
            {'name': 'CurrentResidenceDuration', 'value': '3'}, {'name': 'OwnsProperty', 'value': 'savings_insurance'},
            {'name': 'Age', 'value': '32'}, {'name': 'InstallmentPlans', 'value': 'none'},
            {'name': 'Housing', 'value': 'own'}, {'name': 'ExistingCreditsCount', 'value': '1'},
            {'name': 'Job', 'value': 'skilled'}, {'name': 'Dependents', 'value': '1'},
            {'name': 'Telephone', 'value': 'none'}, {'name': 'ForeignWorker', 'value': 'yes'}]}]}], 'id': node_id}

start_time = time.time()
resp_score = requests.post(url=scoring_endpoint, json=scoring_payload, auth=HTTPBasicAuth(username=SPSS_CDS_ENGINE_CREDENTIALS['username'], password=SPSS_CDS_ENGINE_CREDENTIALS['password']))

response_time = int((time.time() - start_time)*1000)
result = resp_score.json()

print(result)

{'providedBy': 'german_credit_risk_tutorial_BiasQA', 'id': '1575a603-43b5-470d-a61d-bf8cbe2fbac9', 'columnNames': {'name': ['CheckingStatus', 'LoanDuration', 'CreditHistory', 'LoanPurpose', 'LoanAmount', 'ExistingSavings', 'EmploymentDuration', 'InstallmentPercent', 'Sex', 'OthersOnLoan', 'CurrentResidenceDuration', 'OwnsProperty', 'Age', 'InstallmentPlans', 'Housing', 'ExistingCreditsCount', 'Job', 'Dependents', 'Telephone', 'ForeignWorker', '$N-Risk', '$NC-Risk', '$NP-No Risk', '$NP-Risk']}, 'rowValues': [{'value': [{'value': '0_to_200'}, {'value': '31'}, {'value': 'credits_paid_to_date'}, {'value': 'other'}, {'value': '1889'}, {'value': '100_to_500'}, {'value': 'less_1'}, {'value': '3'}, {'value': 'female'}, {'value': 'none'}, {'value': '3'}, {'value': 'savings_insurance'}, {'value': '32'}, {'value': 'none'}, {'value': 'own'}, {'value': '1'}, {'value': 'skilled'}, {'value': '1'}, {'value': 'none'}, {'value': 'yes'}, {'value': 'No Risk'}, {'value': '0.8252855725848809'}, {'value': '0
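The request and response above follow a nested name/value layout. As a sketch based only on the structures visible in this notebook (not on the SPSS C&DS API reference), the two hypothetical helpers below build a request from a plain dict and flatten a response into one dict per row.

```python
def build_scoring_payload(input_table_id, node_id, row):
    """Wrap one row of feature values in the request layout used above."""
    # Values are sent as strings, as in the notebook's hand-written payload.
    inputs = [{"name": name, "value": str(value)} for name, value in row.items()]
    return {
        "requestInputTable": [
            {"id": input_table_id, "requestInputRow": [{"input": inputs}]}
        ],
        "id": node_id,
    }


def parse_scoring_response(result):
    """Turn columnNames/rowValues into a list of {column: value} dicts."""
    names = result["columnNames"]["name"]
    return [
        dict(zip(names, (cell["value"] for cell in row["value"])))
        for row in result["rowValues"]
    ]


# Shortened, made-up example with two columns instead of the full schema.
example_payload = build_scoring_payload("table-1", "my_model", {"Age": 32, "Sex": "female"})
example_result = {
    "columnNames": {"name": ["Age", "$N-Risk"]},
    "rowValues": [{"value": [{"value": "32"}, {"value": "No Risk"}]}],
}
example_rows = parse_scoring_response(example_result)
```

Here `table-1` and `my_model` are placeholders; in the notebook the corresponding values come from `input_table_id` and `node_id`.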

### Store the request and response in payload logging table

#### Store the payload using Python SDK

**Hint:** You can embed payload logging code into your application so it is logged automatically each time you score the model.

In [42]:
records_list = [PayloadRecord(request=scoring_payload, response=result, response_time=response_time), 
                PayloadRecord(request=scoring_payload, response=result, response_time=response_time)]

for i in range(1, 10):
    records_list.append(PayloadRecord(request=scoring_payload, response=result, response_time=response_time))

subscription.payload_logging.store(records=records_list)

print("Waiting 10 seconds for propagation...", end=" ")
time.sleep(10)
print('payload data should be propagated.')

Waiting 10 seconds for propagation... payload data should be propagated.
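Following the hint above, scoring and payload logging can be combined in one wrapper so that every call is logged together with its response time. This is a minimal sketch: `score_and_log` is a hypothetical name, and the scoring and logging functions are passed in as callables so that the example runs without a live deployment.

```python
import time


def score_and_log(score_fn, log_fn, payload):
    """Score `payload`, then pass request, response and elapsed ms to `log_fn`."""
    start = time.perf_counter()
    response = score_fn(payload)
    response_time_ms = int((time.perf_counter() - start) * 1000)
    log_fn(payload, response, response_time_ms)
    return response


# Stand-ins for the real scoring endpoint and payload logging call.
logged = []


def fake_score(payload):
    return {"echo": payload}


def fake_log(request, response, response_time):
    logged.append({"request": request, "response": response, "response_time": response_time})


wrapped_response = score_and_log(fake_score, fake_log, {"id": "demo"})

# In the notebook, log_fn could wrap the call shown in the cell above, e.g.
# subscription.payload_logging.store(
#     records=[PayloadRecord(request=req, response=resp, response_time=ms)])
```

Passing the callables in keeps the timing logic independent of the SDK and makes the wrapper easy to exercise with fakes.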


### Performance metrics of scoring requests

In [43]:
subscription.performance_monitoring.show_table()

0,1,2,3,4,5,6,7
2019-05-14 12:24:16.846000+00:00,23.0,1,1d9297e2-5047-4b77-80a3-2fbca8f4fc87,091e60170f73c4410000016a9a18aecd8242,german_credit_risk_tutorial_BiasQA,,
2019-05-14 12:24:16.846000+00:00,23.0,1,1d9297e2-5047-4b77-80a3-2fbca8f4fc87,091e60170f73c4410000016a9a18aecd8242,german_credit_risk_tutorial_BiasQA,,
2019-05-14 12:24:16.846000+00:00,23.0,1,1d9297e2-5047-4b77-80a3-2fbca8f4fc87,091e60170f73c4410000016a9a18aecd8242,german_credit_risk_tutorial_BiasQA,,
2019-05-14 12:24:16.846000+00:00,23.0,1,1d9297e2-5047-4b77-80a3-2fbca8f4fc87,091e60170f73c4410000016a9a18aecd8242,german_credit_risk_tutorial_BiasQA,,
2019-05-14 12:24:16.845000+00:00,23.0,1,1d9297e2-5047-4b77-80a3-2fbca8f4fc87,091e60170f73c4410000016a9a18aecd8242,german_credit_risk_tutorial_BiasQA,,
2019-05-14 12:24:16.845000+00:00,23.0,1,1d9297e2-5047-4b77-80a3-2fbca8f4fc87,091e60170f73c4410000016a9a18aecd8242,german_credit_risk_tutorial_BiasQA,,
2019-05-14 12:24:16.845000+00:00,23.0,1,1d9297e2-5047-4b77-80a3-2fbca8f4fc87,091e60170f73c4410000016a9a18aecd8242,german_credit_risk_tutorial_BiasQA,,
2019-05-14 12:24:16.845000+00:00,23.0,1,1d9297e2-5047-4b77-80a3-2fbca8f4fc87,091e60170f73c4410000016a9a18aecd8242,german_credit_risk_tutorial_BiasQA,,
2019-05-14 12:24:16.845000+00:00,23.0,1,1d9297e2-5047-4b77-80a3-2fbca8f4fc87,091e60170f73c4410000016a9a18aecd8242,german_credit_risk_tutorial_BiasQA,,
2019-05-14 12:24:16.845000+00:00,23.0,1,1d9297e2-5047-4b77-80a3-2fbca8f4fc87,091e60170f73c4410000016a9a18aecd8242,german_credit_risk_tutorial_BiasQA,,


## 5. Quality monitor and feedback logging

### Enable quality monitoring

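You need to provide the monitoring `threshold` and `min_records` (the minimum number of feedback records). In the metrics returned later in this notebook, the threshold appears as the `lower_limit` of the `area_under_roc` metric.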

In [44]:
subscription.quality_monitoring.enable(threshold=0.7, min_records=10)

### Feedback records logging

Feedback records are used to evaluate your model: the predicted values are compared with the real values supplied in the feedback records.

You can check the schema of the feedback table using the method below.

In [45]:
subscription.feedback_logging.print_table_schema()

0,1,2
CheckingStatus,string,True
LoanDuration,integer,True
CreditHistory,string,True
LoanPurpose,string,True
LoanAmount,integer,True
ExistingSavings,string,True
EmploymentDuration,string,True
InstallmentPercent,integer,True
Sex,string,True
OthersOnLoan,string,True


The feedback records can be sent to the feedback table using the code below.

#### Store feedback using CSV format from string

In [46]:
from ibm_ai_openscale.supporting_classes.enums import FeedbackFormat

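# One sample feedback row; the last value is the true label ("Risk").
feedback_records = "0_to_200,31,credits_paid_to_date,other,1889,100_to_500,less_1,3,female,none,3,savings_insurance,32,none,own,1,skilled,1,none,yes,Risk"

# Each pass doubles the string, so 10 passes yield 2**10 = 1024 identical rows.
for i in range(0, 10):
    feedback_records = feedback_records + '\n' + feedback_records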

subscription.feedback_logging.store(feedback_data=feedback_records, feedback_format=FeedbackFormat.CSV)
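The string concatenation above doubles the data on every pass, which is easy to misread as appending one row per pass. As an alternative sketch (not the notebook's method), the standard `csv` module can build a string with an explicit number of rows. The helper name `repeat_as_csv` and the shortened sample row are hypothetical.

```python
import csv
import io


def repeat_as_csv(row, copies):
    """Return `copies` identical CSV lines for `row`, joined by newlines."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for _ in range(copies):
        writer.writerow(row)
    return buffer.getvalue().rstrip("\n")


# Shortened, made-up row; a real feedback row needs every column plus the label.
sample_row = ["0_to_200", 31, "credits_paid_to_date", "other", 1889, "Risk"]
csv_text = repeat_as_csv(sample_row, 10)
csv_line_count = len(csv_text.split("\n"))

# For comparison: the doubling pattern grows to 2**10 lines in 10 passes.
doubled = "a,b"
for _ in range(10):
    doubled = doubled + "\n" + doubled
doubled_line_count = len(doubled.split("\n"))
```

The resulting string can be passed as `feedback_data` with `FeedbackFormat.CSV` in the same way as the string built above.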

#### Store feedback using CSV format from file

In [47]:
!wget https://raw.githubusercontent.com/pmservice/wml-sample-models/master/spss/credit-risk/data/credit_risk_feedback.csv

--2019-05-14 12:26:48--  https://raw.githubusercontent.com/pmservice/wml-sample-models/master/spss/credit-risk/data/credit_risk_feedback.csv
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.68.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.68.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 40768 (40K) [text/plain]
Saving to: ‘credit_risk_feedback.csv’


2019-05-14 12:26:48 (779 KB/s) - ‘credit_risk_feedback.csv’ saved [40768/40768]



In [48]:
with open('credit_risk_feedback.csv', 'rb') as csvFile:
    subscription.feedback_logging.store(feedback_data=csvFile, feedback_format=FeedbackFormat.CSV, data_header=True, data_delimiter=',')

In [49]:
subscription.feedback_logging.describe_table()

       LoanDuration    LoanAmount  InstallmentPercent  \
count   1315.000000   1315.000000         1315.000000   
mean      29.539163   2383.437262            3.058555   
std        5.494063   1391.763046            0.492324   
min        4.000000    250.000000            1.000000   
25%       31.000000   1889.000000            3.000000   
50%       31.000000   1889.000000            3.000000   
75%       31.000000   1889.000000            3.000000   
max       61.000000  10430.000000            5.000000   

       CurrentResidenceDuration          Age  ExistingCreditsCount  \
count               1315.000000  1315.000000           1315.000000   
mean                   3.034221    33.610646              1.128517   
std                    0.447775     5.316086              0.363143   
min                    1.000000    19.000000              1.000000   
25%                    3.000000    32.000000              1.000000   
50%                    3.000000    32.000000              1.000000

### Run quality monitoring on demand

By default, quality monitoring runs on an hourly schedule. You can also trigger it on demand using the code below.

In [50]:
run_details = subscription.quality_monitoring.run(background_mode=False)




 Waiting for end of quality monitoring run 48bba8d4-118e-45a1-bb52-ad2c094b0086 




initializing.
running
completed

---------------------------
 Successfully finished run 
---------------------------




### Show the quality metrics

In [51]:
subscription.quality_monitoring.show_table()

0,1,2,3,4,5,6,7,8,9
2019-05-14 12:27:22.739000+00:00,true_positive_rate,214c5480-7d37-4470-bcb6-f00136b10526,0.0367207514944491,,,model_type: original,1d9297e2-5047-4b77-80a3-2fbca8f4fc87,091e60170f73c4410000016a9a18aecd8242,german_credit_risk_tutorial_BiasQA
2019-05-14 12:27:22.739000+00:00,area_under_roc,214c5480-7d37-4470-bcb6-f00136b10526,0.3100270424138912,0.7,,model_type: original,1d9297e2-5047-4b77-80a3-2fbca8f4fc87,091e60170f73c4410000016a9a18aecd8242,german_credit_risk_tutorial_BiasQA
2019-05-14 12:27:22.739000+00:00,precision,214c5480-7d37-4470-bcb6-f00136b10526,0.4174757281553398,,,model_type: original,1d9297e2-5047-4b77-80a3-2fbca8f4fc87,091e60170f73c4410000016a9a18aecd8242,german_credit_risk_tutorial_BiasQA
2019-05-14 12:27:22.739000+00:00,f1_measure,214c5480-7d37-4470-bcb6-f00136b10526,0.0675039246467817,,,model_type: original,1d9297e2-5047-4b77-80a3-2fbca8f4fc87,091e60170f73c4410000016a9a18aecd8242,german_credit_risk_tutorial_BiasQA
2019-05-14 12:27:22.739000+00:00,accuracy,214c5480-7d37-4470-bcb6-f00136b10526,0.0965779467680608,,,model_type: original,1d9297e2-5047-4b77-80a3-2fbca8f4fc87,091e60170f73c4410000016a9a18aecd8242,german_credit_risk_tutorial_BiasQA
2019-05-14 12:27:22.739000+00:00,log_loss,214c5480-7d37-4470-bcb6-f00136b10526,1.56717570703848,,,model_type: original,1d9297e2-5047-4b77-80a3-2fbca8f4fc87,091e60170f73c4410000016a9a18aecd8242,german_credit_risk_tutorial_BiasQA
2019-05-14 12:27:22.739000+00:00,false_positive_rate,214c5480-7d37-4470-bcb6-f00136b10526,0.4166666666666667,,,model_type: original,1d9297e2-5047-4b77-80a3-2fbca8f4fc87,091e60170f73c4410000016a9a18aecd8242,german_credit_risk_tutorial_BiasQA
2019-05-14 12:27:22.739000+00:00,area_under_pr,214c5480-7d37-4470-bcb6-f00136b10526,0.6453002137149899,,,model_type: original,1d9297e2-5047-4b77-80a3-2fbca8f4fc87,091e60170f73c4410000016a9a18aecd8242,german_credit_risk_tutorial_BiasQA
2019-05-14 12:27:22.739000+00:00,recall,214c5480-7d37-4470-bcb6-f00136b10526,0.0367207514944491,,,model_type: original,1d9297e2-5047-4b77-80a3-2fbca8f4fc87,091e60170f73c4410000016a9a18aecd8242,german_credit_risk_tutorial_BiasQA


Get all calculated metrics.

In [52]:
deployment_uids = subscription.get_deployment_uids()

In [53]:
subscription.quality_monitoring.get_metrics(deployment_uid=deployment_uids[0])

[{'asset_id': '091e60170f73c4410000016a9a18aecd8242',
  'binding_id': '1d9297e2-5047-4b77-80a3-2fbca8f4fc87',
  'process': 'Accuracy_evaluation_48bba8d4-118e-45a1-bb52-ad2c094b0086',
  'tags': [{'id': 'model_type', 'value': 'original'}],
  'ts': '2019-05-14T12:27:22.739Z',
  'measurement_id': '214c5480-7d37-4470-bcb6-f00136b10526',
  'monitor_definition_id': 'quality',
  'subscription_id': '091e60170f73c4410000016a9a18aecd8242',
  'metrics': [{'id': 'true_positive_rate', 'value': 0.036720751494449186},
   {'lower_limit': 0.7, 'id': 'area_under_roc', 'value': 0.31002704241389123},
   {'id': 'precision', 'value': 0.4174757281553398},
   {'id': 'f1_measure', 'value': 0.06750392464678179},
   {'id': 'accuracy', 'value': 0.09657794676806084},
   {'id': 'log_loss', 'value': 1.56717570703848},
   {'id': 'false_positive_rate', 'value': 0.4166666666666667},
   {'id': 'area_under_pr', 'value': 0.6453002137149899},
   {'id': 'recall', 'value': 0.036720751494449186}]}]
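In the output above, only `area_under_roc` carries a `lower_limit`, and its value is below that limit. A short sketch of scanning such a result for metrics that fall below their limit follows. The function name `metrics_below_limit` is hypothetical, and the sample data copies two entries from the output above.

```python
def metrics_below_limit(measurements):
    """Collect (metric id, value, lower_limit) for metrics under their limit."""
    breaches = []
    for measurement in measurements:
        for metric in measurement.get("metrics", []):
            limit = metric.get("lower_limit")
            # Metrics without a lower_limit are not checked.
            if limit is not None and metric["value"] < limit:
                breaches.append((metric["id"], metric["value"], limit))
    return breaches


sample_measurements = [
    {
        "metrics": [
            {"id": "true_positive_rate", "value": 0.036720751494449186},
            {"lower_limit": 0.7, "id": "area_under_roc", "value": 0.31002704241389123},
        ]
    }
]
breaches = metrics_below_limit(sample_measurements)
```

Applied to the list returned by `get_metrics`, the function would return an empty list when every monitored metric meets its limit.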

## 6. Fairness monitoring and explanations

### Enable and run fairness monitoring

In [54]:
import pandas as pd
import numpy as np

In [55]:
training_data_df = pd.read_csv("https://raw.githubusercontent.com/pmservice/wml-sample-models/master/spss/credit-risk/data/credit_risk_training.csv",
                               dtype={'CheckingStatus': str, 'LoanDuration': int, 'CreditHistory': str, 'LoanPurpose': str,
                                      'LoanAmount': int, 'ExistingSavings': str, 'EmploymentDuration': str, 'InstallmentPercent': int,
                                      'Sex': str, 'OthersOnLoan': str, 'CurrentResidenceDuration': int, 'OwnsProperty': str,
                                      'Age': int, 'InstallmentPlans': str, 'Housing': str, 'ExistingCreditsCount': int,
                                      'Job': str, 'Dependents': int, 'Telephone': str, 'ForeignWorker': str, 'Risk': str})

In [56]:
feature_columns = [col for col in list(training_data_df.columns) if col!="Risk"]
print(feature_columns)

categorical_features = [col for col, col_type in training_data_df.dtypes.to_dict().items() if col_type == np.dtype('O') and col!="Risk"]
print(categorical_features)

['CheckingStatus', 'LoanDuration', 'CreditHistory', 'LoanPurpose', 'LoanAmount', 'ExistingSavings', 'EmploymentDuration', 'InstallmentPercent', 'Sex', 'OthersOnLoan', 'CurrentResidenceDuration', 'OwnsProperty', 'Age', 'InstallmentPlans', 'Housing', 'ExistingCreditsCount', 'Job', 'Dependents', 'Telephone', 'ForeignWorker']
['CheckingStatus', 'CreditHistory', 'LoanPurpose', 'ExistingSavings', 'EmploymentDuration', 'Sex', 'OthersOnLoan', 'OwnsProperty', 'InstallmentPlans', 'Housing', 'Job', 'Telephone', 'ForeignWorker']


In [57]:
subscription.update(feature_columns=feature_columns, categorical_columns=categorical_features)

In [58]:
from ibm_ai_openscale.supporting_classes import Feature

subscription.fairness_monitoring.enable(
            features=[
                Feature("Sex", majority=['male'], minority=['female'], threshold=0.95),
                Feature("Age", majority=[[26, 75]], minority=[[18, 25]], threshold=0.95)
            ],
            favourable_classes=['No Risk'],
            unfavourable_classes=['Risk'],
            min_records=4,
            training_data=training_data_df
        )

In [59]:
fairness_run = subscription.fairness_monitoring.run(background_mode=False)




 Counting bias for deployment_uid=german_credit_risk_tutorial_BiasQA 




RUNNING..
FINISHED

---------------------------
 Successfully finished run 
---------------------------




### Check fairness run results

In [60]:
subscription.fairness_monitoring.show_table()

0,1,2,3,4,5,6,7,8,9,10
2019-05-14 12:29:41.325992+00:00,Sex,female,False,1.0,100.0,1d9297e2-5047-4b77-80a3-2fbca8f4fc87,091e60170f73c4410000016a9a18aecd8242,091e60170f73c4410000016a9a18aecd8242,german_credit_risk_tutorial_BiasQA,
2019-05-14 12:29:41.325992+00:00,Age,"[18, 25]",False,1.0,100.0,1d9297e2-5047-4b77-80a3-2fbca8f4fc87,091e60170f73c4410000016a9a18aecd8242,091e60170f73c4410000016a9a18aecd8242,german_credit_risk_tutorial_BiasQA,
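The notebook does not say how the fairness value is calculated. One common measure for this kind of check is the disparate impact ratio: the rate of favourable outcomes in the minority group divided by the rate in the majority group, compared against the threshold (0.95 in the configuration above). The sketch below illustrates that general idea only; it is an assumption, not a description of the Watson OpenScale computation, and the predictions are made up.

```python
def favourable_rate(predictions, favourable_classes):
    """Share of predictions that belong to the favourable classes."""
    if not predictions:
        raise ValueError("No predictions for this group.")
    return sum(p in favourable_classes for p in predictions) / len(predictions)


def disparate_impact(minority_predictions, majority_predictions, favourable_classes):
    """Ratio of favourable rates: minority group over majority group."""
    majority_rate = favourable_rate(majority_predictions, favourable_classes)
    if majority_rate == 0:
        raise ValueError("Majority group has no favourable outcomes.")
    return favourable_rate(minority_predictions, favourable_classes) / majority_rate


# Made-up predictions for illustration only.
minority = ["No Risk", "No Risk", "Risk", "No Risk"]
majority = ["No Risk", "No Risk", "No Risk", "No Risk"]

ratio = disparate_impact(minority, majority, favourable_classes={"No Risk"})
is_fair = ratio >= 0.95  # same threshold value as in the Feature definitions
```

With these made-up groups the minority receives the favourable outcome three times out of four, so the ratio falls below the threshold.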


In [61]:
scoring_payload = {"fields": [v for v in training_data_df.columns.values.tolist() if v != 'Risk'], 
                   "values": training_data_df.iloc[0:1,0:-1].values.tolist()}
print(scoring_payload)

{'fields': ['CheckingStatus', 'LoanDuration', 'CreditHistory', 'LoanPurpose', 'LoanAmount', 'ExistingSavings', 'EmploymentDuration', 'InstallmentPercent', 'Sex', 'OthersOnLoan', 'CurrentResidenceDuration', 'OwnsProperty', 'Age', 'InstallmentPlans', 'Housing', 'ExistingCreditsCount', 'Job', 'Dependents', 'Telephone', 'ForeignWorker'], 'values': [['0_to_200', 31, 'credits_paid_to_date', 'other', 1889, '100_to_500', 'less_1', 3, 'female', 'none', 3, 'savings_insurance', 32, 'none', 'own', 1, 'skilled', 1, 'none', 'yes']]}


### Explainability configuration and run

#### Enable explainability

In [63]:
subscription.explainability.enable(training_data=training_data_df)

#### Get sample transaction_id from payload logging table (`scoring_id`)

In [64]:
transaction_id = subscription.payload_logging.get_table_content(limit=1)['scoring_id'].values[0]
print(transaction_id)

3fe9b98e-259a-456a-9be1-b1765030ae26-1


#### Run explanation for sample `transaction_id`

In [65]:
explain_run = subscription.explainability.run(transaction_id=transaction_id, background_mode=False)




 Looking for explanation for 3fe9b98e-259a-456a-9be1-b1765030ae26-1 




in_progress...................
finished

---------------------------
 Successfully finished run 
---------------------------




In [80]:
from tabulate import tabulate
from IPython.display import HTML

In [85]:
explain_result = pd.DataFrame.from_dict(explain_run['entity']['predictions'][0]['explanation_features'])
HTML(tabulate(explain_result[['feature_name', 'weight']], headers=['feature', 'importance'] , showindex=False, floatfmt=".2f", tablefmt='html'))

feature,importance
LoanDuration,-0.19
ForeignWorker,0.15
OthersOnLoan,0.14
EmploymentDuration,0.12
ExistingCreditsCount,0.11
InstallmentPlans,-0.08
CreditHistory,-0.06
Housing,0.06
Job,-0.06
Telephone,0.04
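The table above mixes positive and negative weights. A small sketch of ranking the features by absolute weight and separating them by sign is shown below, using the values from the table. The helper names are hypothetical, and the sketch makes no claim about which predicted class a given sign supports.

```python
def rank_by_importance(weights):
    """Sort (feature, weight) pairs by absolute weight, largest first."""
    return sorted(weights.items(), key=lambda item: abs(item[1]), reverse=True)


def split_by_sign(weights):
    """Return two dicts: features with positive and with negative weights."""
    positive = {name: w for name, w in weights.items() if w > 0}
    negative = {name: w for name, w in weights.items() if w < 0}
    return positive, negative


# Values copied from the explanation table above.
explanation_weights = {
    "LoanDuration": -0.19,
    "ForeignWorker": 0.15,
    "OthersOnLoan": 0.14,
    "EmploymentDuration": 0.12,
    "ExistingCreditsCount": 0.11,
    "InstallmentPlans": -0.08,
    "CreditHistory": -0.06,
    "Housing": 0.06,
    "Job": -0.06,
    "Telephone": 0.04,
}

ranked = rank_by_importance(explanation_weights)
positive_weights, negative_weights = split_by_sign(explanation_weights)
```

Because Python's sort is stable, features with equal absolute weight keep the order in which they appear in the input.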


---

### Authors
Lukasz Cmielowski, PhD, is an Automation Architect and Data Scientist at IBM with a track record of developing enterprise-level applications that substantially increase clients' ability to turn data into actionable knowledge.

Wojciech Sobala, Data Scientist at IBM