In [1]:
# The code was removed by Watson Studio for sharing.

<img src="https://github.com/pmservice/ai-openscale-tutorials/raw/master/notebooks/images/banner.png" align="left" alt="banner">

# Use Watson OpenScale to Monitor Models Deployed on SAP AI Core

You can use this notebook to programmatically set up and configure Watson OpenScale for the trained model that predicts employee promotion (`is_promoted`), referred to as EPP in this notebook. It configures an OpenScale data mart in OpenScale's free internal PostgreSQL database and a subscription for a Custom ML Provider deployment. It also configures the quality, explainability, and fairness monitors for the model, and injects records and measurements for viewing in the OpenScale Insights dashboard.

This notebook runs with the `IBM Runtime 22.2 on Python 3.10 XS` kernel in IBM Watson Studio, or with a standard Python 3.10 runtime.

### Prerequisites:

- An IBM Cloud account
- A Watson OpenScale instance
- An IBM Cloud Object Storage (COS) bucket

## Table of Contents
    
1. [Save training data to IBM Cloud Object Storage](#save_data_to_cos)
2. [Configure OpenScale](#openscale)<br>
    2.1 [Create a Watson OpenScale Python API Client](#create_wos_client)<br>
    2.2 [Set up data mart](#setup_data_mart)<br>
    2.3 [Create a service provider](#create_service_provider)<br>
    2.4 [Create a headless subscription](#create_subscription)<br>
3. [Payload and response](#payload)
4. [Quality monitor](#quality)<br>
    4.1 [Enable quality monitoring](#enable_quality_monitoring)<br>
    4.2 [Feedback logging](#feedback_logging)<br>
    4.3 [Run quality monitor](#run_quality_monitor)<br>
5. [Fairness monitor](#fairness)
6. [Explainability monitor](#explain)
7. [Summary](#summary)

<div class="alert alert-block alert-danger">
<b>Stop kernel of other notebooks.</b></div>

**Note:** If you have other notebooks currently running with the same `IBM Runtime 22.2 on Python 3.10 XS` environment, **stop their kernels** before running this notebook. All these notebooks share the same runtime environment, and if they are running in parallel, you may encounter memory issues. To stop the kernel of another notebook, open that notebook, and select **File > Stop Kernel**.

<div class="alert alert-block alert-warning">
<b>Set Project token.</b></div>

Before you can begin working on this notebook in Watson Studio in Cloud Pak for Data as a Service, you need to ensure that the project token is set so that you can access the project assets via the notebook.

When this notebook is added to the project, a project access token should be inserted at the top of the notebook in a code cell. If you do not see the cell above, add the token to the notebook by clicking **More > Insert project token** from the notebook action bar.  By running the inserted hidden code cell, a project object is created that you can use to access project resources.

![ws-project.mov](https://media.giphy.com/media/jSVxX2spqwWF9unYrs/giphy.gif)
<div class="alert alert-block alert-info">
<b>Tip:</b> Cell execution</div>

Note that you can step through the notebook cell by cell by pressing **Shift-Enter**, or you can execute the entire notebook by selecting **Cell -> Run All** from the menu.

<div class="alert alert-block alert-warning">
<b>Set some secrets and variables.</b></div>

Before you start running the notebook, you'll need to set some credentials and other variables as shown. To avoid sharing them accidentally, copy the following example code block into a new code cell, and make it a hidden cell by adding `# @hidden_cell` to the top.

```
# @hidden_cell
# Set your IBM Cloud API Key
ibmcloud_api_key = '<your_ibm_cloud_api_key>'

# Set your IBM Cloud Object Storage credentials
BUCKET_NAME = "<name_of_cos_bucket>" 
COS_RESOURCE_CRN = "<cos_service_credential_resource_instance_id>"
COS_API_KEY_ID = "<cos_service_credential_apikey>"
COS_ENDPOINT = "<cos_service_endpoint>"
IAM_URL = "https://iam.cloud.ibm.com/identity/token"

# Set Watson OpenScale service url
wos_service_url = "https://<region>.api.aiopenscale.cloud.ibm.com"

# Training data set
training_data_file_name = "<name_of_training_data_file>"
```

**NOTE:**

- Refer to IBM Cloud [documentation](https://cloud.ibm.com/docs/account?topic=account-userapikey) to create an IBM Cloud API Key if you haven't done so.
- Refer to IBM Cloud [documentation](https://cloud.ibm.com/docs/cloud-object-storage?topic=cloud-object-storage-service-credentials) to generate Cloud Object Storage credentials.
- `COS_API_KEY_ID` is the value found in the Service Credential as `apikey`.
- `COS_RESOURCE_CRN` is the value found in the Service Credential as `resource_instance_id`.
- `COS_ENDPOINT` is a service endpoint URL, inclusive of the https:// protocol. This value is not the endpoints value that is found in the Service Credential. For more information about endpoints, see [Endpoints and storage locations](https://cloud.ibm.com/docs/cloud-object-storage?topic=cloud-object-storage-endpoints).
- `wos_service_url` is `https://api.aiopenscale.cloud.ibm.com` for OpenScale instances in Dallas (us-south). For other regions, use the region-prefixed form `https://<region>.api.aiopenscale.cloud.ibm.com`.
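
As a quick sanity check before running the rest of the notebook, the sketch below flags any setting that still holds a `<...>` placeholder and builds the service URL from a region name. The helper names `find_unset` and `wos_url_for_region` are hypothetical and not part of any IBM SDK; the URL pattern is taken from the example cell and the notes above, and the example values are made up.

```python
import re

# Hypothetical helpers for illustration; not part of any IBM SDK.
PLACEHOLDER = re.compile(r"<[^<>]+>")

def find_unset(settings):
    """Return the names of settings that are empty or still contain a <placeholder>."""
    return sorted(
        name for name, value in settings.items()
        if not value or PLACEHOLDER.search(str(value))
    )

def wos_url_for_region(region="us-south"):
    """Build the OpenScale URL; Dallas (us-south) uses the un-prefixed host."""
    host = "api.aiopenscale.cloud.ibm.com"
    return f"https://{host}" if region == "us-south" else f"https://{region}.{host}"

example_settings = {
    "BUCKET_NAME": "my-training-bucket",       # assumed example value
    "COS_ENDPOINT": "<cos_service_endpoint>",  # still a placeholder
    "wos_service_url": wos_url_for_region("eu-de"),
}
print(find_unset(example_settings))
```

In the notebook, you could pass a dictionary of your own hidden-cell variables to `find_unset` and stop if the returned list is not empty.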


In [2]:
# The code was removed by Watson Studio for sharing.

## 1. Save training data to IBM Cloud Object Storage <a name="save_data_to_cos"></a>

Watson OpenScale needs access to the training data, stored in either Db2 or IBM Cloud Object Storage (COS), which is used later as the training data reference in the subscription. We use an existing COS bucket here, which is a different bucket from the one associated with the Watson Studio instance.

Assuming the training data included in the Git repository has already been uploaded into the Watson Studio project, you can download it to the local file system of the runtime environment with the following code cell:

In [3]:
# Download data asset from project storage and store it in the local file system
wslib.download_file("epp_train.csv", "epp_train.csv")

{'file_name': 'epp_train.csv', 'summary': ['loaded data', 'saved to file']}

In [4]:
import pandas as pd

# Read data from the CSV file into a DataFrame
df_raw = pd.read_csv("epp_train.csv")

# Change the order of the columns
df_raw = df_raw[['employee_id', 'department', 'region', 'education', 'gender',
       'recruitment_channel', 'no_of_trainings', 'age', 'previous_year_rating',
       'length_of_service', 'kpis_met_above_80_percent', 'any_awards_won',
       'avg_training_score', 'is_promoted']]

df_raw.head()

Unnamed: 0,employee_id,department,region,education,gender,recruitment_channel,no_of_trainings,age,previous_year_rating,length_of_service,kpis_met_above_80_percent,any_awards_won,avg_training_score,is_promoted
0,45709,Sales & Marketing,region_31,Bachelor's,f,other,1,29,,1,0,0,49,0
1,66874,Sales & Marketing,region_27,Bachelor's,f,other,1,30,,1,0,0,50,0
2,36904,Sales & Marketing,region_15,Bachelor's,m,other,1,29,3.0,2,0,0,51,0
3,32877,Sales & Marketing,region_2,Bachelor's,f,other,1,40,3.0,12,0,0,50,0
4,58415,Sales & Marketing,region_7,Bachelor's,m,other,1,45,4.0,5,0,0,50,0


In [5]:
df_raw.to_csv(training_data_file_name,index=False)

Create a client for the COS bucket:

In [6]:
import ibm_boto3
from ibm_botocore.client import Config, ClientError

cos_client = ibm_boto3.resource("s3",
    ibm_api_key_id=COS_API_KEY_ID,
    ibm_service_instance_id=COS_RESOURCE_CRN,
    ibm_auth_endpoint=IAM_URL,
    config=Config(signature_version="oauth"),
    endpoint_url=COS_ENDPOINT
)

Upload the training data to the COS bucket:

In [7]:
with open(training_data_file_name, "rb") as file_data:
    cos_client.Object(BUCKET_NAME, training_data_file_name).upload_fileobj(
        Fileobj=file_data
    )

Apart from these inputs, you can also update the following details while running the notebook:

1. The columns used in the model and the target column details, in the [Create a headless subscription](#create_subscription) section.
2. The favourable class, unfavourable class, features to monitor for bias, and the majority and minority groups for each feature, in the [Fairness monitor](#fairness) section.

## 2. Configure OpenScale <a name="openscale"></a>

These are the steps involved in configuring OpenScale:

1. Create a Watson OpenScale Python API client to start working with the client library.
2. Create or update a data mart using the API client.
3. Create a service provider.
4. List the existing service providers (bindings).
5. Create and manage subscriptions for machine learning model deployments.

### 2.1 Create a Watson OpenScale Python API Client <a name="create_wos_client"></a>

First, import the necessary libraries and set up a Python OpenScale client.

In [8]:
from ibm_watson_openscale import APIClient
from ibm_watson_openscale.utils import *
from ibm_watson_openscale.supporting_classes import *
from ibm_watson_openscale.supporting_classes.enums import *
from ibm_watson_openscale.base_classes.watson_open_scale_v2 import *
from ibm_cloud_sdk_core.authenticators import CloudPakForDataAuthenticator

Get an instance of the OpenScale SDK client:

In [9]:
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator,BearerTokenAuthenticator

authenticator = IAMAuthenticator(apikey=ibmcloud_api_key)
wos_client = APIClient(authenticator=authenticator, service_url=wos_service_url)
wos_client.version

'3.0.32'

### 2.2 Set up data mart <a name="setup_data_mart"></a>

Watson OpenScale uses a database to store payload logs and calculated metrics. The data mart will be created unless there is an existing data mart. If there is an existing data mart, it will be used throughout this notebook.

Show the current list of the data marts:

In [10]:
wos_client.data_marts.show()

0,1,2,3,4,5
,,False,active,2023-10-23 17:25:18.348000+00:00,bd282900-1e77-4a76-8a0e-5da926dfc7d5


In [11]:
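KEEP_MY_INTERNAL_POSTGRES = False
# To use an external PostgreSQL database, define DB_CREDENTIALS and SCHEMA_NAME
# in an earlier cell; if undefined, both default to None (internal database).
DB_CREDENTIALS = globals().get("DB_CREDENTIALS")
SCHEMA_NAME = globals().get("SCHEMA_NAME")

data_marts = wos_client.data_marts.list().result.data_marts
if len(data_marts) == 0:
    if DB_CREDENTIALS is not None:
        if SCHEMA_NAME is None:
            raise ValueError("Please specify the SCHEMA_NAME and rerun the cell")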

        print('Setting up external datamart')
        added_data_mart_result = wos_client.data_marts.add(
                background_mode=False,
                name="WOS Data Mart",
                description="Data Mart created by Industry Accelerator",
                database_configuration=DatabaseConfigurationRequest(
                  database_type=DatabaseType.POSTGRESQL,
                    credentials=PrimaryStorageCredentialsLong(
                        hostname=DB_CREDENTIALS['hostname'],
                        username=DB_CREDENTIALS['username'],
                        password=DB_CREDENTIALS['password'],
                        db=DB_CREDENTIALS['database'],
                        port=DB_CREDENTIALS['port'],
                        ssl=True,
                        sslmode=DB_CREDENTIALS['sslmode'],
                        certificate_base64=DB_CREDENTIALS['certificate_base64']
                    ),
                    location=LocationSchemaName(
                        schema_name= SCHEMA_NAME
                    )
                )
             ).result
    else:
        print('Setting up internal datamart')
        added_data_mart_result = wos_client.data_marts.add(
                background_mode=False,
                name="WOS Data Mart",
                description="Data Mart", 
                internal_database = True).result
        
    data_mart_id = added_data_mart_result.metadata.id
    
else:
    data_mart_id=data_marts[0].metadata.id
    print('Using existing datamart {}'.format(data_mart_id))

Using existing datamart bd282900-1e77-4a76-8a0e-5da926dfc7d5
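
The branching in the cell above can be summarized as a small pure function, which makes the three outcomes easier to see without calling the service. This is only a sketch: `choose_data_mart` and its return values are hypothetical names for illustration, and it treats a missing schema name as an error.

```python
def choose_data_mart(existing_ids, db_credentials=None, schema_name=None):
    """Return (action, detail) describing which data mart path would be taken."""
    if existing_ids:
        # An existing data mart always wins; nothing new is created.
        return ("reuse", existing_ids[0])
    if db_credentials is not None:
        if schema_name is None:
            raise ValueError("SCHEMA_NAME is required for an external database")
        return ("create_external", schema_name)
    return ("create_internal", None)

print(choose_data_mart(["bd282900-1e77-4a76-8a0e-5da926dfc7d5"]))
print(choose_data_mart([], db_credentials={"hostname": "db.example.com"}, schema_name="wos"))
print(choose_data_mart([]))
```

The host name and schema name in the second call are made-up example values.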


### 2.3 Create a service provider <a name="create_service_provider"></a>

Once the data mart is set up, you can create and manage service providers for model deployments.

In [12]:
SERVICE_PROVIDER_NAME = "EPP Demo - SAP AI CORE"
SERVICE_PROVIDER_DESCRIPTION = "Added by tutorial WOS notebook to showcase monitoring Fairness, Quality and Explainability against a Custom ML provider."

Delete existing service provider, if any, with the same name:

In [13]:
service_providers = wos_client.service_providers.list().result.service_providers
for service_provider in service_providers:
    service_instance_name = service_provider.entity.name
    if service_instance_name == SERVICE_PROVIDER_NAME:
        service_provider_id = service_provider.metadata.id
        wos_client.service_providers.delete(service_provider_id)
        print("Deleted existing service_provider for WML instance: {}".format(service_provider_id))

Deleted existing service_provider for WML instance: 9c3b46b6-e30b-4afa-a291-c5c72de25983


Create a new service provider:

In [14]:
MLCredentials = {}
added_service_provider_result = wos_client.service_providers.add(
    name=SERVICE_PROVIDER_NAME,
    description=SERVICE_PROVIDER_DESCRIPTION,
    service_type=ServiceTypes.CUSTOM_MACHINE_LEARNING,
    operational_space_id = "production",
    credentials=MLCredentials,
    background_mode=False
    ).result
service_provider_id = added_service_provider_result.metadata.id




 Waiting for end of adding service provider 300482e9-4945-4839-9c65-4d7015d8c06b 




active

-----------------------------------------------
 Successfully finished adding service provider 
-----------------------------------------------




List all the service providers for the WOS client.

In [15]:
wos_client.service_providers.show()

0,1,2,3,4,5
,active,EPP Demo - SAP AI CORE,custom_machine_learning,2023-11-10 18:30:30.826000+00:00,300482e9-4945-4839-9c65-4d7015d8c06b


### 2.4 Create a headless subscription <a name="create_subscription"></a>

Watson OpenScale monitors a model deployment with a subscription. There is a one-to-one mapping between deployment and subscription. A headless subscription is a subscription that has an empty deployment URL. This is useful when the model scoring endpoint cannot be exposed to external applications like OpenScale, for reasons such as security restrictions, air-gapped deployment constraints, or firewall restrictions. It is also useful for batch scoring models, where scoring happens asynchronously and OpenScale does not have access to the scoring endpoint.

In [16]:
SUBSCRIPTION_NAME = "Custom ML Provider for EPP"

Delete existing subscription, if any, with the same asset name.

In [17]:
subscriptions = wos_client.subscriptions.list().result.subscriptions
for subscription in subscriptions:
    if subscription.entity.asset.name == "[SAP AI CORE] " + SUBSCRIPTION_NAME:
        sub_model_id = subscription.metadata.id
        wos_client.subscriptions.delete(subscription.metadata.id)
        print('Deleted existing subscription for model', sub_model_id)

The following code cell creates the model subscription in OpenScale using the Python client API.

To create a new subscription, you need to provide the following details:

1. `Model details`: model ID, asset type, input data type (STRUCTURED or UNSTRUCTURED), and problem type (BINARY_CLASSIFICATION, REGRESSION, MULTI_CLASSIFICATION)
2. `Model deployment details`: deployment name, ID, type, and URL
3. `Training Data Reference`: the training data specified in the user inputs cell
4. `Asset Properties`: <br>
    a. `label_column`: name of the target (label) column<br>
    b. `feature_fields`: columns used in the training data set<br>
    c. `categorical_fields`: categorical columns in the training data set<br>

When a subscription is successfully created, you should be able to see it in the path `{HOST}/aiopenscale/insights`.

In [18]:
import uuid

asset_id = str(uuid.uuid4())
asset_name = '[SAP AI CORE] ' + SUBSCRIPTION_NAME
url = ''

asset_deployment_id = str(uuid.uuid4())
asset_deployment_name = asset_name

feature_columns = ['no_of_trainings', 'age', 'previous_year_rating', 'length_of_service', 'kpis_met_above_80_percent', 'any_awards_won', 
                   'avg_training_score', 'department_Finance', 'department_HR', 'department_Legal', 'department_Operations', 
                   'department_Procurement', 'department_R&D', 'department_Sales & Marketing', 'department_Technology', 
                   'education_Below Secondary', "education_Master's & above", 'gender_m']

cat_features = []

subscription_details = wos_client.subscriptions.add(data_mart_id,
    service_provider_id,
    asset=Asset(
        asset_id=asset_id,
        name=asset_name,
        url=url,
        asset_type=AssetTypes.MODEL,
        input_data_type=InputDataType.STRUCTURED,
        problem_type=ProblemType.BINARY_CLASSIFICATION
    ),
    deployment=AssetDeploymentRequest(
        deployment_id=asset_deployment_id,
        name=asset_deployment_name,
        deployment_type= DeploymentTypes.ONLINE
    ),
    asset_properties=AssetPropertiesRequest(
        label_column="is_promoted",
        probability_fields=["probability"],
        prediction_field="prediction",
        feature_fields = feature_columns,
        categorical_fields = cat_features,
        training_data_reference = TrainingDataReference(
            type="cos",
            location=COSTrainingDataReferenceLocation(
                bucket = BUCKET_NAME,
                file_name = training_data_file_name),
            connection=COSTrainingDataReferenceConnection.from_dict({
                "resource_instance_id": COS_RESOURCE_CRN,
                "url": COS_ENDPOINT,
                "api_key": COS_API_KEY_ID,
                "iam_url": IAM_URL})
        )
    )                                      
).result

subscription_id = subscription_details.metadata.id
print('Subscription ID: ' + subscription_id)

Subscription ID: bcc12080-e71a-4dc6-b943-00f1d22fef79
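
Before creating a subscription, it can help to sanity-check the asset properties locally. The sketch below is not part of the OpenScale SDK; `check_asset_properties` is a hypothetical helper, and the rules it applies (the label is not a feature, no duplicate features, categorical fields are a subset of the feature fields) are assumed consistency checks rather than documented service requirements.

```python
def check_asset_properties(label_column, feature_fields, categorical_fields):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    if label_column in feature_fields:
        problems.append(f"label column '{label_column}' is also listed as a feature")
    duplicates = sorted({f for f in feature_fields if feature_fields.count(f) > 1})
    if duplicates:
        problems.append(f"duplicate feature fields: {duplicates}")
    unknown = sorted(set(categorical_fields) - set(feature_fields))
    if unknown:
        problems.append(f"categorical fields not in feature fields: {unknown}")
    return problems

# A shortened feature list for illustration; the notebook uses 18 encoded columns.
example_features = ["no_of_trainings", "age", "gender_m"]
print(check_asset_properties("is_promoted", example_features, []))
print(check_asset_properties("is_promoted", example_features + ["is_promoted"], ["region"]))
```

With the notebook's own variables, the call would be `check_asset_properties("is_promoted", feature_columns, cat_features)`.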


In [19]:
wos_client.subscriptions.show()

0,1,2,3,4,5,6,7,8
d5847a4c-0084-44e1-a203-641dd2fdbf45,[SAP AI CORE] Custom ML Provider for EPP,bd282900-1e77-4a76-8a0e-5da926dfc7d5,9ae1414f-1574-446a-b816-84200ee231ff,[SAP AI CORE] Custom ML Provider for EPP,300482e9-4945-4839-9c65-4d7015d8c06b,preparing,2023-11-10 18:30:36.678000+00:00,bcc12080-e71a-4dc6-b943-00f1d22fef79


In [20]:
import time

time.sleep(5)
payload_data_set_id = None
payload_data_sets = wos_client.data_sets.list(
    type=DataSetTypes.PAYLOAD_LOGGING, 
    target_target_id=subscription_id, 
    target_target_type=TargetTypes.SUBSCRIPTION).result.data_sets
if payload_data_sets:
    payload_data_set_id = payload_data_sets[0].metadata.id
if payload_data_set_id is None:
    print("Payload data set not found. Please check subscription status.")
else:
    print("Payload data set id:", payload_data_set_id)

Payload data set id: cb80da44-cce0-4231-8712-7df01495cf01


In [21]:
wos_client.subscriptions.get(subscription_id).result.to_dict()

{'metadata': {'id': 'bcc12080-e71a-4dc6-b943-00f1d22fef79',
  'crn': 'crn:v1:bluemix:public:aiopenscale:us-south:a/b33b491bb5884090942d033f7542b552:bd282900-1e77-4a76-8a0e-5da926dfc7d5:subscription:bcc12080-e71a-4dc6-b943-00f1d22fef79',
  'url': '/v2/subscriptions/bcc12080-e71a-4dc6-b943-00f1d22fef79',
  'created_at': '2023-11-10T18:30:36.678000Z',
  'created_by': 'IBMid-27000724PY'},
 'entity': {'data_mart_id': 'bd282900-1e77-4a76-8a0e-5da926dfc7d5',
  'service_provider_id': '300482e9-4945-4839-9c65-4d7015d8c06b',
  'asset': {'asset_id': 'd5847a4c-0084-44e1-a203-641dd2fdbf45',
   'url': '',
   'name': '[SAP AI CORE] Custom ML Provider for EPP',
   'asset_type': 'model',
   'problem_type': 'binary',
   'input_data_type': 'structured'},
  'asset_properties': {'training_data_reference': {'secret_id': 'a19e555c-9161-45bd-9e39-6507ffcdd788'},
   'output_data_schema': {'type': 'struct',
    'fields': [{'metadata': {'columnInfo': {'columnLength': 128},
       'modeling_role': 'record-id',
  

## 3. Payload and response <a name="payload"></a>

Transform the data into the encoded feature columns that the model expects:

In [22]:
df_raw = df_raw.drop(columns=["employee_id", "recruitment_channel", "region"])

In [23]:
# Handle missing values (assign back instead of chained inplace fillna)
df_raw["education"] = df_raw["education"].fillna(df_raw["education"].mode()[0])
df_raw["previous_year_rating"] = df_raw["previous_year_rating"].fillna(1)

# Encode categorical columns as 0/1 indicator columns
categorical_columns = df_raw.select_dtypes(include=['object']).columns.tolist()
df_raw = pd.get_dummies(df_raw, columns=categorical_columns, drop_first=True, dtype=int)

In [24]:
df_raw

Unnamed: 0,no_of_trainings,age,previous_year_rating,length_of_service,kpis_met_above_80_percent,any_awards_won,avg_training_score,is_promoted,department_Finance,department_HR,department_Legal,department_Operations,department_Procurement,department_R&D,department_Sales & Marketing,department_Technology,education_Below Secondary,education_Master's & above,gender_m
0,1,29,1.0,1,0,0,49,0,0,0,0,0,0,0,1,0,0,0,0
1,1,30,1.0,1,0,0,50,0,0,0,0,0,0,0,1,0,0,0,0
2,1,29,3.0,2,0,0,51,0,0,0,0,0,0,0,1,0,0,0,1
3,1,40,3.0,12,0,0,50,0,0,0,0,0,0,0,1,0,0,0,0
4,1,45,4.0,5,0,0,50,0,0,0,0,0,0,0,1,0,0,0,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
54803,2,48,3.0,20,1,0,59,0,0,0,0,1,0,0,0,0,0,1,0
54804,3,49,5.0,19,1,0,69,0,0,0,0,0,1,0,0,0,0,0,1
54805,2,49,3.0,11,0,0,49,0,0,0,0,0,0,0,1,0,0,0,1
54806,2,48,3.0,7,0,0,52,0,0,0,0,0,0,0,1,0,0,0,0
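
The column names in the table above, such as `gender_m`, come from one-hot encoding with the first category dropped. The pure-Python sketch below mimics that behavior on a single column to show where the names come from; `one_hot_drop_first` is a hypothetical helper written for illustration, not a pandas function, and it assumes categories are ordered by sorting.

```python
def one_hot_drop_first(values, prefix):
    """One-hot encode `values`, dropping the first category in sorted order."""
    categories = sorted(set(values))
    return {
        f"{prefix}_{category}": [int(v == category) for v in values]
        for category in categories[1:]
    }

# Gender values of the first five training rows shown earlier in the notebook.
encoded = one_hot_drop_first(["f", "f", "m", "f", "m"], "gender")
print(encoded)  # {'gender_m': [0, 0, 1, 0, 1]}
```

Because `f` sorts before `m`, the `gender_f` column is dropped and a `gender_m` value of 0 stands for `f`.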


In the cell below, we take the first 30 records from the transformed data and create a sample scoring response that imitates the predictions coming from SAP AI Core endpoints. This allows OpenScale to create a payload log in the data mart with the correct schema, so it can capture data coming into and out of the model.

In [25]:
df_payload = df_raw.dropna().head(30)

payload_scoring_request={"fields":feature_columns,"values":df_payload[feature_columns].values.tolist()}
payload_scoring_request['meta']={'fields':['gender_m'], 'values':df_payload[['gender_m']].values.tolist()}
payload={"input_data":[payload_scoring_request]}
payload_scoring_values=df_payload[feature_columns+["is_promoted"]].values.tolist()

payload_scoring_response = {'predictions': [{'fields': ['prediction', 'probability'],
    'values': [[1,[0.12      , 0.88      ]],
        [0,[0.97      , 0.03      ]],
        [0,[1.        , 0.        ]],
        [0,[1.        , 0.        ]],
        [0,[0.93      , 0.07      ]],
        [0,[0.8       , 0.2       ]],
        [0,[1.        , 0.        ]],
        [0,[0.95      , 0.05      ]],
        [0,[0.75      , 0.25      ]],
        [1,[0.48      , 0.52      ]],
        [0,[0.96      , 0.04      ]],
        [1,[0.01      , 0.99      ]],
        [1,[0.34      , 0.66      ]],
        [0,[0.8       , 0.2       ]],
        [0,[0.84      , 0.16      ]],
        [0,[0.97      , 0.03      ]],
        [1,[0.34      , 0.66      ]],
        [0,[1.        , 0.        ]],
        [0,[0.575     , 0.425     ]],
        [0,[0.94      , 0.06      ]],
        [0,[0.86      , 0.14      ]],
        [0,[0.97      , 0.03      ]],
        [0,[0.98      , 0.02      ]],
        [0,[0.89      , 0.11      ]],
        [0,[0.68      , 0.32      ]],
        [1,[0.04666667, 0.95333333]],
        [0,[0.52      , 0.48      ]],
        [0,[1.        , 0.        ]],
        [1,[0.32      , 0.68      ]],
        [0,[1.        , 0.        ]]]}]}


#     'values': [[0, [0.99, 0.01]],
#         [0, [1.0, 0.0]],
#         [0, [0.97, 0.03]],       
#         [0, [1.0, 0.0]],
#         [0, [1.0, 0.0]],
#         [0, [0.98, 0.02]],
#         [0, [0.93, 0.07]],
#         [0, [1.0, 0.0]],
#         [1, [0.31, 0.69]],
#         [0, [0.98, 0.02]]]}]}
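
The request and response built in the cell above must line up row by row: one prediction per request row, and each request row as wide as the field list. The sketch below checks this locally before anything is sent. `check_payload` is a hypothetical helper, and the demo values are made up; the dictionary layout follows the request and response shown above.

```python
def check_payload(request, response):
    """Raise ValueError if request and response rows do not line up; return the row count."""
    n_fields = len(request["fields"])
    rows = request["values"]
    for i, row in enumerate(rows):
        if len(row) != n_fields:
            raise ValueError(f"request row {i} has {len(row)} values, expected {n_fields}")
    predictions = response["predictions"][0]["values"]
    if len(predictions) != len(rows):
        raise ValueError(f"{len(rows)} request rows but {len(predictions)} predictions")
    for i, (_label, probabilities) in enumerate(predictions):
        if abs(sum(probabilities) - 1.0) > 1e-6:
            raise ValueError(f"probabilities in row {i} do not sum to 1")
    return len(rows)

demo_request = {"fields": ["age", "gender_m"], "values": [[29, 0], [45, 1]]}
demo_response = {"predictions": [{"fields": ["prediction", "probability"],
                                  "values": [[0, [0.97, 0.03]], [1, [0.12, 0.88]]]}]}
print(check_payload(demo_request, demo_response))  # 2
```

In the notebook, the equivalent call would be `check_payload(payload_scoring_request, payload_scoring_response)`.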

In [26]:
from ibm_watson_openscale.supporting_classes.payload_record import PayloadRecord

records_list = []

pl_record = PayloadRecord(request=payload_scoring_request, response=payload_scoring_response)
records_list.append(pl_record)

wos_client.data_sets.store_records(data_set_id=payload_data_set_id, request_body=records_list)

pl_records_count = wos_client.data_sets.get_records_count(payload_data_set_id)
print("Number of records in the payload logging table: {}".format(pl_records_count))

Number of records in the payload logging table: 0


Scoring against a deployment normally stores the request and response as payload records, which is called payload logging. With a headless subscription, the records are stored manually with the `store_records()` method, as in the cell above. The count printed right after storing may still be 0 if the records have not finished loading, so you can verify that the records have reached the payload table with the following cell:

In [27]:
wos_client.data_sets.show_records(data_set_id=payload_data_set_id)
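
Since the record count printed right after storing was 0, one option is to poll the count until the records arrive instead of relying on a fixed sleep. The sketch below shows a generic polling loop; `wait_for_count` is a hypothetical helper, and the counter is replaced by a stand-in so the example runs without a service connection.

```python
import time

def wait_for_count(get_count, expected, timeout=60.0, interval=2.0, sleep=time.sleep):
    """Poll get_count() until it reaches `expected` or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        count = get_count()
        if count >= expected or time.monotonic() >= deadline:
            return count
        sleep(interval)

# Stand-in for the real counter: reports 0 twice, then 30.
_fake_counts = iter([0, 0, 30])
result = wait_for_count(lambda: next(_fake_counts), expected=30, interval=0.0)
print(result)  # 30
```

In the notebook, the callable could be `lambda: wos_client.data_sets.get_records_count(payload_data_set_id)`, which is the same call used earlier to print the count.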

## 4. Quality monitor <a name="quality"></a>

### 4.1 Enable quality monitoring <a name="enable_quality_monitoring"></a>

The code below waits 30 seconds to allow the payload logging table to be set up before it begins enabling monitors. It selects a target subscription and sets parameters required to create a quality monitor instance:

- parameters: `min_feedback_data_size` specifies the minimum number of feedback records OpenScale needs before it calculates a new measurement.
- thresholds: Sets an alert threshold of 80%. OpenScale will show an alert on the dashboard if the model accuracy measurement (area under the curve, in the case of a binary classifier) falls below this threshold.

The quality monitor runs hourly, but the accuracy reading in the dashboard will not change until enough feedback records have been added, via the user interface, the Python client, or the supplied feedback endpoint.

In [28]:
import time

time.sleep(30)
target = Target(
    target_type=TargetTypes.SUBSCRIPTION,
    target_id=subscription_id
)
parameters = {
    "min_feedback_data_size": 5
}
thresholds = [{
    "metric_id": "area_under_roc",
    "type": "lower_limit",
    "value": .80
}]
quality_monitor_details = wos_client.monitor_instances.create(
    data_mart_id=data_mart_id,
    background_mode=False,
    monitor_definition_id=wos_client.monitor_definitions.MONITORS.QUALITY.ID,
    target=target,
    parameters=parameters,
    thresholds=thresholds
).result




 Waiting for end of monitor instance creation 6198e05e-2b8e-4fca-8655-c8a2e899e254 




preparing
active

---------------------------------------
 Monitor instance successfully created 
---------------------------------------




In [29]:
quality_monitor_instance_id = quality_monitor_details.metadata.id
quality_monitor_instance_id

'6198e05e-2b8e-4fca-8655-c8a2e899e254'
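
To make the `area_under_roc` threshold more concrete, the sketch below computes the area under the ROC curve as the fraction of (positive, negative) pairs that the scores rank correctly, then compares it with a lower limit of 0.80. This is a textbook formulation written for illustration; OpenScale's own computation may differ in details, and the function names and sample data are made up.

```python
def area_under_roc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives for n in negatives
    )
    return wins / (len(positives) * len(negatives))

def breaches_lower_limit(value, limit):
    """True when a metric falls below its lower-limit threshold."""
    return value < limit

auc = area_under_roc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(auc, breaches_lower_limit(auc, 0.80))  # 0.75 True
```

Here three of the four positive-negative pairs are ordered correctly, so the value is 0.75 and would fall below the 0.80 limit.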

### 4.2 Feedback logging <a name="feedback_logging"></a>

There are two ways OpenScale APIs can be used to log the feedback records to the OpenScale data mart:

- Using OpenScale Python SDK
- Using OpenScale REST APIs

If one cannot use the SDK in the customer environment for any reason, the alternative is to use the REST APIs.

In [30]:
feedback_dataset_id = None
feedback_dataset = wos_client.data_sets.list(
    type=DataSetTypes.FEEDBACK, 
    target_target_id=subscription_id, 
    target_target_type=TargetTypes.SUBSCRIPTION).result
if feedback_dataset.data_sets:
    feedback_dataset_id = feedback_dataset.data_sets[0].metadata.id
if feedback_dataset_id is None:
    print("Feedback data set not found. Please check quality monitor status.")
feedback_dataset_id

'39495da7-4680-43c6-896c-6bf0a2e8da10'

In [31]:
feedback_log_req={}
feedback_log_req['fields']=feature_columns+["is_promoted"]+ ["_original_prediction","_original_probability","_debiased_prediction","_debiased_probability"]
feedback_log_req['values']=[]
for x in range(len(payload_scoring_response['predictions'][0]['values'])):
    feedback_log_req['values'].append(payload_scoring_values[x]+payload_scoring_response['predictions'][0]['values'][x]+payload_scoring_response['predictions'][0]['values'][x])
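
The loop above builds each feedback row from the feature values, the true label, and the prediction appended twice (once as the original output and once as the debiased output). The sketch below restates that assembly as a function so the row layout is easier to verify; `build_feedback` is a hypothetical helper and the sample rows are made up.

```python
def build_feedback(feature_fields, label_field, scored_rows, predictions):
    """Build a feedback request; each prediction is appended twice (original, debiased)."""
    fields = feature_fields + [label_field] + [
        "_original_prediction", "_original_probability",
        "_debiased_prediction", "_debiased_probability",
    ]
    # Each scored row already holds the feature values followed by the true label.
    values = [row + pred + pred for row, pred in zip(scored_rows, predictions)]
    return {"fields": fields, "values": values}

feedback = build_feedback(
    ["age", "gender_m"], "is_promoted",
    scored_rows=[[29, 0, 0], [45, 1, 1]],
    predictions=[[0, [0.97, 0.03]], [1, [0.12, 0.88]]],
)
# Every row should have exactly one value per field.
print(all(len(row) == len(feedback["fields"]) for row in feedback["values"]))  # True
```

A mismatch between the number of fields and the row length is an easy mistake to make here, so checking it before calling `store_records` can save a failed request.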

In [32]:
#payload_scorings =  [{"fields": payload_scoring['input_data'][0]['fields'], "values": payload_scoring['input_data'][0]['values']}]
wos_client.data_sets.store_records(feedback_dataset_id, request_body=[feedback_log_req], background_mode=False)




 Waiting for end of storing records with request id: 987c42dc-881a-432e-856d-c904f06efa6e 




active

---------------------------------------
 Successfully finished storing records 
---------------------------------------




<ibm_cloud_sdk_core.detailed_response.DetailedResponse at 0x7fcaeefd9270>

In [33]:
feedback_records_count = wos_client.data_sets.get_records_count(feedback_dataset_id)
print("Number of records in the feedback table: {}".format(feedback_records_count))

Number of records in the feedback table: 30


### 4.3 Run quality monitor <a name="run_quality_monitor"></a>

In [34]:
run_details = wos_client.monitor_instances.run(monitor_instance_id=quality_monitor_instance_id, background_mode=False).result




 Waiting for end of monitoring run 7ff1c1f7-edda-406e-8c25-5ce67bc60792 




finished

---------------------------
 Successfully finished run 
---------------------------




In [35]:
wos_client.monitor_instances.show_metrics(monitor_instance_id=quality_monitor_instance_id)

0,1,2,3,4,5,6,7,8,9,10,11
2023-11-10 18:31:34.251000+00:00,area_under_roc,79a0fd5e-7c0a-4d64-bc93-b8a5016af6fb,0.0,0.8,,['model_type:original'],quality,6198e05e-2b8e-4fca-8655-c8a2e899e254,7ff1c1f7-edda-406e-8c25-5ce67bc60792,subscription,bcc12080-e71a-4dc6-b943-00f1d22fef79
2023-11-10 18:31:34.251000+00:00,accuracy,79a0fd5e-7c0a-4d64-bc93-b8a5016af6fb,0.7666666666666667,,,['model_type:original'],quality,6198e05e-2b8e-4fca-8655-c8a2e899e254,7ff1c1f7-edda-406e-8c25-5ce67bc60792,subscription,bcc12080-e71a-4dc6-b943-00f1d22fef79
2023-11-10 18:31:34.251000+00:00,log_loss,79a0fd5e-7c0a-4d64-bc93-b8a5016af6fb,0.5642658915554771,,,['model_type:original'],quality,6198e05e-2b8e-4fca-8655-c8a2e899e254,7ff1c1f7-edda-406e-8c25-5ce67bc60792,subscription,bcc12080-e71a-4dc6-b943-00f1d22fef79
2023-11-10 18:31:34.251000+00:00,area_under_pr,79a0fd5e-7c0a-4d64-bc93-b8a5016af6fb,0.0,,,['model_type:original'],quality,6198e05e-2b8e-4fca-8655-c8a2e899e254,7ff1c1f7-edda-406e-8c25-5ce67bc60792,subscription,bcc12080-e71a-4dc6-b943-00f1d22fef79


## 5. Fairness monitor <a name="fairness"></a>

IBM Watson OpenScale helps detect bias at run time. It monitors the data sent to the model together with the model's predictions (the payload data) and identifies bias in them. Bias that OpenScale reports is typically something an enterprise wants to fix. Besides identifying fairness issues in the model at run time, Watson OpenScale also helps to automatically de-bias the model.

The code below configures fairness monitoring for our model. It turns on monitoring for the feature `gender_m`, for which you must specify:

-  Which model feature to monitor
-  One or more **majority** groups, which are values of that feature that you expect to receive a higher percentage of favorable outcomes
-  One or more **minority** groups, which are values of that feature that you expect to receive a higher percentage of unfavorable outcomes
-  The threshold below which OpenScale displays an alert for the fairness measurement (80% in this notebook)

Additionally, you must specify which outcomes from the model are favourable outcomes, and which are unfavourable. You must also provide the number of records OpenScale will use to calculate the fairness score. In this case, OpenScale's fairness monitor will run hourly, but will not calculate a new fairness rating until at least 30 records have been added. Finally, to calculate fairness, OpenScale must perform some calculations on the training data, which it accesses through the training data reference configured in the subscription.

In [36]:
target = Target(
    target_type=TargetTypes.SUBSCRIPTION,
    target_id=subscription_id

)
parameters = {
    "features": [
        {"type":"int",
         "feature": "gender_m",
         "majority": [[1,1]],
         "minority": [[0,0]],
         "threshold": 0.8
         }
    ],
    "favourable_class": [1],
    "unfavourable_class": [0],
    "min_records": 30
}

fairness_monitor_details = wos_client.monitor_instances.create(
    data_mart_id=data_mart_id,
    background_mode=False,
    monitor_definition_id=wos_client.monitor_definitions.MONITORS.FAIRNESS.ID,
    target=target,
    parameters=parameters).result
fairness_monitor_instance_id = fairness_monitor_details.metadata.id
fairness_monitor_instance_id




 Waiting for end of monitor instance creation fba95b3d-121a-4151-ba95-65ed8f2f8f8c 




active

---------------------------------------
 Monitor instance successfully created 
---------------------------------------




'fba95b3d-121a-4151-ba95-65ed8f2f8f8c'

In [37]:
time.sleep(20)
wos_client.monitor_instances.show_metrics(monitor_instance_id=fairness_monitor_instance_id)

0,1,2,3,4,5,6,7,8,9,10,11
2023-11-10 18:31:48.964131+00:00,fairness_value,75842e66-4922-4183-b14a-5b5e1f608389,38.889,80.0,,"['feature:gender_m', 'fairness_metric_type:fairness', 'feature_value:0-0']",fairness,fba95b3d-121a-4151-ba95-65ed8f2f8f8c,7005c642-260d-469e-8422-156461ec9fb8,subscription,bcc12080-e71a-4dc6-b943-00f1d22fef79
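
One common way to express a fairness score is disparate impact: the favourable-outcome rate of the minority group divided by that of the majority group, shown as a percentage. The sketch below computes it on made-up records to illustrate how a value can fall below an 80% threshold. It is an assumption that this resembles the reported `fairness_value`; OpenScale's actual computation involves additional steps and may not match this simple ratio. The function names and data are hypothetical.

```python
def favourable_rate(records, feature, group, favourable=1):
    """Share of records in `group` whose prediction equals the favourable class."""
    members = [r for r in records if r[feature] == group]
    if not members:
        raise ValueError(f"no records with {feature} == {group}")
    return sum(r["prediction"] == favourable for r in members) / len(members)

def disparate_impact(records, feature, minority, majority):
    """Minority favourable rate divided by majority favourable rate, as a percentage."""
    majority_rate = favourable_rate(records, feature, majority)
    if majority_rate == 0:
        raise ValueError("majority group has no favourable outcomes")
    return 100.0 * favourable_rate(records, feature, minority) / majority_rate

# Made-up records: 1 of 4 favourable for gender_m == 0, 2 of 4 for gender_m == 1.
sample = (
    [{"gender_m": 0, "prediction": p} for p in (1, 0, 0, 0)]
    + [{"gender_m": 1, "prediction": p} for p in (1, 1, 0, 0)]
)
score = disparate_impact(sample, "gender_m", minority=0, majority=1)
print(score, score < 80.0)  # 50.0 True
```

In this example the minority group receives favourable outcomes half as often as the majority group, so the score of 50.0 would trigger an alert at an 80% threshold.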


## 6. Explainability monitor <a name="explain"></a>

You provide OpenScale with the training data to enable and configure the explainability features; in this notebook, an explainability configuration archive is uploaded for the subscription. Watson OpenScale provides LIME-based and contrastive explanations for the specified transactions.

In [38]:
wslib.download_file('explainability.tar.gz')
with open('explainability.tar.gz', mode='rb') as explainability_tar:
    wos_client.monitor_instances.upload_explainability_archive(subscription_id=subscription_id, archive=explainability_tar)

print('Uploaded explainability configuration archive successfully.')

Uploaded explainability configuration archive successfully.
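
If the upload fails, a quick local check is to confirm that the file is a readable gzip-compressed tar archive and to list what it contains. The sketch below uses only the standard library. `list_archive` is a hypothetical helper, and the in-memory archive with its member name is a made-up stand-in, since the contents of the real `explainability.tar.gz` are not shown in this notebook.

```python
import io
import tarfile

def list_archive(fileobj):
    """Return the member names of a gzip-compressed tar archive read from a binary file object."""
    with tarfile.open(fileobj=fileobj, mode="r:gz") as archive:
        return archive.getnames()

# Build a tiny in-memory archive as a stand-in; the member name is made up.
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
    data = b"{}"
    info = tarfile.TarInfo(name="example.json")
    info.size = len(data)
    archive.addfile(info, io.BytesIO(data))
buffer.seek(0)
members = list_archive(buffer)
print(members)  # ['example.json']
```

With a real file, open it in binary mode, call `list_archive`, and rewind the file object with `seek(0)` before passing it to the upload call.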


In [39]:
target = Target(
    target_type=TargetTypes.SUBSCRIPTION,
    target_id=subscription_id
)
parameters = {
    "enabled": True
}
explainability_details = wos_client.monitor_instances.create(
    data_mart_id=data_mart_id,
    background_mode=False,
    monitor_definition_id=wos_client.monitor_definitions.MONITORS.EXPLAINABILITY.ID,
    target=target,
    parameters=parameters
).result




 Waiting for end of monitor instance creation aef46f18-1c56-4566-8eed-55d104cc3096 




preparing
active

---------------------------------------
 Monitor instance successfully created 
---------------------------------------




## 7. Summary <a name="summary"></a>

In this notebook, we've demonstrated how to programmatically set up and configure Watson OpenScale to monitor the trained employee promotion (EPP) model with a headless subscription, in which case OpenScale does not have access to the scoring endpoint of the model deployment because of security restrictions, air-gapped deployment constraints, firewall restrictions, and so on. With the quality, explainability, and fairness monitors configured for the model, you can now deploy the model in SAP AI Core and send payload data to OpenScale for model evaluation.

<hr>

Sample Materials, provided under <a href="https://github.com/IBM/Industry-Accelerators/blob/master/CPD%20SaaS/LICENSE" target="_blank" rel="noopener noreferrer">license</a>. <br>
Licensed Materials - Property of IBM. <br>
© Copyright IBM Corp. 2023. All Rights Reserved. <br>
US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp. <br>