<img src="https://github.com/pmservice/ai-openscale-tutorials/raw/master/notebooks/images/banner.png" align="left" alt="banner">

# IBM Watson OpenScale and Batch Processing:<br>Remote Spark

## Contents

* [1. Setup](#setup)
* [2. Configure Watson OpenScale](#openscale)
* [3. Set up a subscription](#subscription)
* [4. Quality monitoring](#quality)
* [5. Drift monitoring](#drift)
* [6. Fairness monitoring](#fairness)
* [7. Explainability monitoring](#explainability)

# 1. Setup <a name="setup"></a>

## Package installation

First, install the packages that this notebook uses. After the installation finishes, restart the kernel.



In [1]:
import warnings
warnings.filterwarnings('ignore')
%env PIP_DISABLE_PIP_VERSION_CHECK=1

env: PIP_DISABLE_PIP_VERSION_CHECK=1


In [None]:
!pip install --upgrade ibm-watson-openscale

In [2]:
!pip show ibm-watson-openscale

Name: ibm-watson-openscale
Version: 3.0.39
Summary: Client library for IBM Watson OpenScale
Home-page: https://github.ibm.com/watson-developer-cloud/openscale-python-sdk
Author: IBM Watson OpenScale
Author-email: kishore.patel@in.ibm.com
License: Apache 2.0
Location: /opt/conda/envs/Python-RT24.1-Premium/lib/python3.11/site-packages
Requires: ibm-cloud-sdk-core, pandas, python-dateutil, requests
Required-by: 


## Configure credentials

Provide your IBM Watson OpenScale credentials in the following cell:



In [3]:
WOS_CREDENTIALS = {
    "url": "", 
    "instance_id": "",
    "version": "",
    "username": "",
    "password": ""
}
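Empty strings left in `WOS_CREDENTIALS` tend to surface only later, as authentication errors. The following is a minimal sketch of an early check, assuming that every key in the dictionary is required in your environment. The helper name `find_missing_credentials` and the sample values are hypothetical and are not part of the SDK.

```python
def find_missing_credentials(credentials):
    """Return the sorted names of keys whose value is empty or None."""
    return sorted(key for key, value in credentials.items() if not value)


# Hypothetical sample values, used only to illustrate the check.
sample_credentials = {
    "url": "https://cpd.example.com",
    "instance_id": "",
    "version": "5.0",
    "username": "admin",
    "password": None,
}

missing = find_missing_credentials(sample_credentials)
print(missing)
```

In the notebook itself, you could call `find_missing_credentials(WOS_CREDENTIALS)` and stop if the returned list is not empty.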

## Specify model details

### Service provider and subscription metadata

In [4]:
# Service Provider

SERVICE_PROVIDER_NAME = ""
SERVICE_PROVIDER_DESCRIPTION = ""

# Subscription

SUBSCRIPTION_NAME = ""
SUBSCRIPTION_DESCRIPTION = ""

### Spark Cluster

Make sure that the Apache Spark manager on the Spark cluster is running, and then provide the following details:

- SPARK_ENGINE_ENDPOINT: _Endpoint URL where the Spark Manager Application is running_
- SPARK_ENGINE_ENDPOINT_USERNAME: _Username to connect to Spark Manager Application_
- SPARK_ENGINE_ENDPOINT_PASSWORD: _Password to connect to Spark Manager Application_
- SPARK_ENGINE_NAME: _Custom display name for the Spark Manager Application_
- SPARK_ENGINE_DESCRIPTION: _Custom description for the Spark Manager Application_

In [5]:
SPARK_ENGINE_NAME=""
SPARK_ENGINE_DESCRIPTION=""
SPARK_ENGINE_ENDPOINT=""
SPARK_ENGINE_ENDPOINT_USERNAME=""
SPARK_ENGINE_ENDPOINT_PASSWORD=""

#### Provide Spark Resource Settings

To configure how much of your Spark Cluster resources this job can consume, edit the following values:


- max_num_executors: _Maximum number of executors to launch for this session_
- min_num_executors: _Minimum number of executors to launch for this session_
- executor_cores: _Number of cores to use for each executor_  
- executor_memory: _Amount of memory (in GBs) to use per executor process_
- driver_cores: _Number of cores to use for the driver process_
- driver_memory: _Amount of memory (in GBs) to use for the driver process_

In [6]:
spark_parameters = {
    "max_num_executors": 2,
    "min_num_executors": 1,
    "executor_cores": 3,
    "executor_memory": 2,
    "driver_cores": 2,
    "driver_memory": 2
}
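To judge whether these settings fit your cluster, it can help to work out the largest footprint a job may request. The sketch below is a rough estimate that assumes the footprint is simply executors plus driver and that ignores any memory overhead Spark adds on top. The function name `spark_resource_ceiling` is hypothetical.

```python
def spark_resource_ceiling(params):
    """Return (total_cores, total_memory_gb) at the maximum executor count."""
    executors = params["max_num_executors"]
    total_cores = executors * params["executor_cores"] + params["driver_cores"]
    total_memory_gb = executors * params["executor_memory"] + params["driver_memory"]
    return total_cores, total_memory_gb


# Same values as the spark_parameters cell above.
example_parameters = {
    "max_num_executors": 2,
    "min_num_executors": 1,
    "executor_cores": 3,
    "executor_memory": 2,
    "driver_cores": 2,
    "driver_memory": 2,
}

cores, memory_gb = spark_resource_ceiling(example_parameters)
print(cores, memory_gb)  # 2*3 + 2 = 8 cores, 2*2 + 2 = 6 GB
```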

### Apache Hive

To connect to Apache Hive, you must provide the following details:

- HIVE_CONNECTION_NAME: _Custom display name for the Hive Connection_
- HIVE_CONNECTION_DESCRIPTION: _Custom description for the Hive connection_
- [Optional] HIVE_METASTORE_URI: _Thrift URI for Hive Metastore to connect to_<br>If the metastore URI is already configured in the `hive-site.xml` file in your Hadoop Ecosystem, you can leave the `HIVE_METASTORE_URI` as `None`.

In [11]:
HIVE_CONNECTION_NAME = ""
HIVE_CONNECTION_DESCRIPTION = ""

# [optional]
HIVE_METASTORE_URI = None
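A malformed metastore URI is easy to miss until the Spark job fails. The following sketch checks the value before it is used, assuming the URI has the form `thrift://host:port`. The helper name `is_valid_metastore_uri` and the host names are hypothetical.

```python
from urllib.parse import urlparse


def is_valid_metastore_uri(uri):
    """Accept None (rely on hive-site.xml) or a thrift://host:port URI."""
    if uri is None:
        return True
    parsed = urlparse(uri)
    try:
        port = parsed.port
    except ValueError:  # raised when the port part is not a valid number
        return False
    return parsed.scheme == "thrift" and bool(parsed.hostname) and port is not None


print(is_valid_metastore_uri("thrift://metastore.example.com:9083"))
print(is_valid_metastore_uri("http://metastore.example.com:9083"))
```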

### Feedback table metadata

The quality monitor stores metadata in the feedback table. To configure the quality monitor, you must provide the following details. To skip quality monitoring, run the following cell to initialize variables with the value of `None`.

- FEEDBACK_DATABASE_NAME: _Database name where feedback table is present_
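- FEEDBACK_SCHEMA_NAME: _Schema name where feedback table is present_<br>The cells in this notebook do not define this variable; the subscription cell passes `FEEDBACK_DATABASE_NAME` as the schema name.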
- FEEDBACK_TABLE_NAME: _Name of the feedback table_

In [8]:
#feedback

FEEDBACK_DATABASE_NAME = None
FEEDBACK_TABLE_NAME = None

### Payload and drift table metadata

The drift monitor stores metadata in the payload and drift tables. To configure the drift monitor, you must provide the following details. To skip drift monitoring, run the following cell to initialize variables with the value of `None`.

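- PAYLOAD_DATABASE_NAME: _Database name where payload logging table is present_
- PAYLOAD_SCHEMA_NAME: _Schema name where payload logging table is present_<br>The cells in this notebook do not define this variable; the subscription cell passes `PAYLOAD_DATABASE_NAME` as the schema name.
- PAYLOAD_TABLE_NAME: _Name of the payload logging table_
- DRIFT_DATABASE_NAME: _Database name where drifted transactions table is present_
- DRIFT_SCHEMA_NAME: _Schema name where drifted transactions table is present_<br>The cells in this notebook do not define this variable; the subscription cell passes `DRIFT_DATABASE_NAME` as the schema name.
- DRIFT_TABLE_NAME: _Name of the drifted transactions table_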


In [9]:
#payload logging

PAYLOAD_DATABASE_NAME = None
PAYLOAD_TABLE_NAME = None

#drift

DRIFT_DATABASE_NAME = None
DRIFT_TABLE_NAME = None

### Explainability table metadata

The explainability monitor requires the queue and result tables. The payload table can also be used as the queue table. To configure the explainability monitor, you must provide the following details. To skip explainability monitoring, run the following cell to initialize variables with the value of `None`.

- EXPLAINABILITY_DATABASE_NAME: _Database name where explanations queue, result tables are present_
- EXPLAINABILITY_QUEUE_TABLE_NAME: _Name of the explanations queue table_
- EXPLAINABILITY_RESULT_TABLE_NAME: _Name of the explanations result table_

In [10]:
#explainability

EXPLAINABILITY_DATABASE_NAME = None
EXPLAINABILITY_QUEUE_TABLE_NAME = None
EXPLAINABILITY_RESULT_TABLE_NAME = None
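The `None` convention in the cells above determines which tables are later registered as data sources. The following sketch summarizes which monitors have all of their table settings filled in, using the same variable names. The function name `enabled_monitors` and the sample values are hypothetical.

```python
def enabled_monitors(settings):
    """Map each monitor to True when all of its table settings are not None."""
    required = {
        "quality": ["FEEDBACK_DATABASE_NAME", "FEEDBACK_TABLE_NAME"],
        "drift": [
            "PAYLOAD_DATABASE_NAME", "PAYLOAD_TABLE_NAME",
            "DRIFT_DATABASE_NAME", "DRIFT_TABLE_NAME",
        ],
        "explainability": [
            "EXPLAINABILITY_DATABASE_NAME",
            "EXPLAINABILITY_QUEUE_TABLE_NAME",
            "EXPLAINABILITY_RESULT_TABLE_NAME",
        ],
    }
    return {
        monitor: all(settings.get(name) is not None for name in names)
        for monitor, names in required.items()
    }


# Hypothetical values: feedback and payload tables set, drift tables left as None.
example_settings = {
    "FEEDBACK_DATABASE_NAME": "monitoring_db",
    "FEEDBACK_TABLE_NAME": "feedback",
    "PAYLOAD_DATABASE_NAME": "monitoring_db",
    "PAYLOAD_TABLE_NAME": "payload",
    "DRIFT_DATABASE_NAME": None,
    "DRIFT_TABLE_NAME": None,
}

status = enabled_monitors(example_settings)
print(status)
```

Note that the drift entry requires both the payload and the drift tables, so setting only the payload table is not enough.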

# 2. Configure Watson OpenScale <a name="openscale"></a>

### Import the required libraries and set up the Watson OpenScale client

In [11]:
from ibm_cloud_sdk_core.authenticators import CloudPakForDataAuthenticator
from ibm_watson_openscale import *
from ibm_watson_openscale.supporting_classes.enums import *
from ibm_watson_openscale.supporting_classes import *
from ibm_watson_openscale.base_classes.watson_open_scale_v2 import *

authenticator = CloudPakForDataAuthenticator(
        url=WOS_CREDENTIALS["url"],
        username=WOS_CREDENTIALS["username"],
        password=WOS_CREDENTIALS["password"],
        disable_ssl_verification=True
    )

wos_client = APIClient(authenticator=authenticator, service_url=WOS_CREDENTIALS["url"], service_instance_id=WOS_CREDENTIALS["instance_id"])

### Display Watson OpenScale datamart details

In [12]:
wos_client.data_marts.show()
data_marts = wos_client.data_marts.list().result.data_marts
data_mart_id=data_marts[0].metadata.id

0,1,2,3,4,5
AIOSFASTPATHICP-00000000-0000-0000-0000-000000000000,Data Mart created by OpenScale ExpressPath,False,active,2024-06-04 05:19:03.698000+00:00,00000000-0000-0000-0000-000000000000


### Create a service provider

In [13]:
# Delete existing service provider with the same name as provided

service_providers = wos_client.service_providers.list().result.service_providers
for provider in service_providers:
    if provider.entity.name == SERVICE_PROVIDER_NAME:
        wos_client.service_providers.delete(service_provider_id=provider.metadata.id)
        break

In [14]:
# Add Service Provider

added_service_provider_result = wos_client.service_providers.add(
        name=SERVICE_PROVIDER_NAME,
        description=SERVICE_PROVIDER_DESCRIPTION,
        service_type=ServiceTypes.CUSTOM_MACHINE_LEARNING,
        credentials={},
        operational_space_id="production",
        background_mode=False
    ).result

service_provider_id = added_service_provider_result.metadata.id

wos_client.service_providers.show()




 Waiting for end of adding service provider 8b65752d-0dc9-4715-a2a2-96ee19fb7ded 




active

-----------------------------------------------
 Successfully finished adding service provider 
-----------------------------------------------




0,1,2,3,4,5
,active,WML_IAE2,custom_machine_learning,2024-06-27 13:36:27.009000+00:00,8b65752d-0dc9-4715-a2a2-96ee19fb7ded
99999999-9999-9999-9999-999999999999,active,shreya-space,watson_machine_learning,2024-06-26 04:34:04.685000+00:00,34668a1a-ee92-4328-8361-05ba1c3f72bf
,active,WML_IAE,custom_machine_learning,2024-06-23 10:16:49.805000+00:00,a6780fe7-b9aa-4b91-bf87-4ade19fd41ae
99999999-9999-9999-9999-999999999999,active,shreya,watson_machine_learning,2024-06-20 13:22:29.550000+00:00,db6e279f-6d0b-49db-a94f-4b14a12aaddc
00000000-0000-0000-0000-000000000000,active,wml_aie,watson_machine_learning,2024-06-14 06:11:43.041000+00:00,15921037-330f-479a-9b19-2e2285553bf0
00000000-0000-0000-0000-000000000000,active,wml_iae_jdbc,watson_machine_learning,2024-06-14 05:57:05.755000+00:00,33e21c9c-ae94-438a-b322-8827f221b5fe
,active,SDK_BATCH_SUB,custom_machine_learning,2024-06-14 05:37:53.874000+00:00,210bf9fa-bdb5-4cb8-8bb5-b2f938c1d5dc
,active,poojitha,custom_machine_learning,2024-06-14 05:34:56.721000+00:00,bdccd48d-6ccb-42bd-8662-dec574aa1b83
00000000-0000-0000-0000-000000000000,active,WML,watson_machine_learning,2024-06-14 04:32:02.681000+00:00,8b392383-39fc-44f5-a1c0-30bdea67f61c
,active,poojitha2,custom_machine_learning,2024-06-13 15:43:11.394000+00:00,47aeab55-43cc-4271-b86b-cd5b58b3867b


Note: First 10 records were displayed.


In [15]:
service_provider_details = wos_client.service_providers.get(service_provider_id=service_provider_id).result
print(service_provider_details)

{
  "metadata": {
    "id": "8b65752d-0dc9-4715-a2a2-96ee19fb7ded",
    "crn": "crn:v1:bluemix:public:aiopenscale:us-south:a/na:00000000-0000-0000-0000-000000000000:service_provider:8b65752d-0dc9-4715-a2a2-96ee19fb7ded",
    "url": "/v2/service_providers/8b65752d-0dc9-4715-a2a2-96ee19fb7ded",
    "created_at": "2024-06-27T13:36:27.009000Z",
    "created_by": "cpadmin"
  },
  "entity": {
    "name": "WML_IAE2",
    "service_type": "custom_machine_learning",
    "credentials": {
      "secret_id": "6e689219-e968-4e03-94c9-83e9b4d4f20f"
    },
    "operational_space_id": "production",
    "status": {
      "state": "active"
    }
  }
}


### Create integrated systems for Spark Engine and Hive

In [16]:
# Delete existing spark and hive integrated systems if present

integrated_systems = IntegratedSystems(wos_client).list().result.integrated_systems

for system in integrated_systems:
    if system.entity.name in (SPARK_ENGINE_NAME, HIVE_CONNECTION_NAME):
        print("Deleting integrated system {}".format(system.entity.name))
        IntegratedSystems(wos_client).delete(integrated_system_id=system.metadata.id)

Deleting integrated system Hive_WML_IAE_temp
Deleting integrated system WML_IAEKB_Spark


#### Spark Engine

In [17]:
spark_engine_details = IntegratedSystems(wos_client).add(
    name=SPARK_ENGINE_NAME,
    description=SPARK_ENGINE_DESCRIPTION,
    type="spark",
    credentials={
        "username": SPARK_ENGINE_ENDPOINT_USERNAME,
        "password": SPARK_ENGINE_ENDPOINT_PASSWORD
    },
    connection={
        "endpoint": SPARK_ENGINE_ENDPOINT,
        "location_type": "custom"
    }
).result

spark_engine_id = spark_engine_details.metadata.id
print(spark_engine_details)

{
  "metadata": {
    "id": "ad56fa10-15f3-4a35-b196-41b3feb8ab9b",
    "crn": "crn:v1:bluemix:public:aiopenscale:us-south:a/na:00000000-0000-0000-0000-000000000000:integrated_system:ad56fa10-15f3-4a35-b196-41b3feb8ab9b",
    "url": "/v2/integrated_systems/ad56fa10-15f3-4a35-b196-41b3feb8ab9b",
    "created_at": "2024-06-27T13:38:08.935000Z",
    "created_by": "cpadmin"
  },
  "entity": {
    "name": "WML_IAEKB_Spark",
    "type": "spark",
    "description": "WML_IAEKB_Spark",
    "credentials": {
      "secret_id": "5094869c-3dcc-4337-90aa-af64661ecad6"
    },
    "connection": {
      "display_name": "IAEBatchSpark",
      "endpoint": "https://cpd-cpd-instance.apps.wos415nfs2672.cp.fyre.ibm.com/v4/analytics_engines/7d2a4875-563e-42d0-994f-518ebf6e1e42/spark_applications",
      "location_type": "cpd_iae",
      "volume": "cpd-instance::IAEBatchTest"
    }
  }
}


#### Hive

In [18]:
hive_connection = {}
if HIVE_METASTORE_URI is not None:
    hive_connection["metastore_url"] = HIVE_METASTORE_URI
    hive_connection["location_type"] = "metastore"

hive_connection_details = IntegratedSystems(wos_client).add(
    name=HIVE_CONNECTION_NAME,
    description=HIVE_CONNECTION_DESCRIPTION,
    type="hive",
    credentials={
        
    },
    connection=hive_connection
).result

hive_connection_id=hive_connection_details.metadata.id
print(hive_connection_details)

{
  "metadata": {
    "id": "cba14401-4ab9-48a9-aac2-c0062525ba1f",
    "crn": "crn:v1:bluemix:public:aiopenscale:us-south:a/na:00000000-0000-0000-0000-000000000000:integrated_system:cba14401-4ab9-48a9-aac2-c0062525ba1f",
    "url": "/v2/integrated_systems/cba14401-4ab9-48a9-aac2-c0062525ba1f",
    "created_at": "2024-06-27T13:38:33.537000Z",
    "created_by": "cpadmin"
  },
  "entity": {
    "name": "Hive_WML_IAE_temp",
    "type": "hive",
    "description": "Hive_WML_IAE_temp",
    "credentials": {
      "secret_id": "4a0f8f0c-2510-47b7-b9a2-b6069aaa053a"
    },
    "connection": {
      "location_type": "metastore",
      "metastore_url": "thrift://shillong1.fyre.ibm.com:9083"
    }
  }
}


# 3. Set up a subscription <a name="subscription"></a>

In [19]:
# Delete an existing subscription with the provided name

subscriptions = wos_client.subscriptions.list().result.subscriptions
for sub in subscriptions:
    if sub.entity.deployment.name == SUBSCRIPTION_NAME:
        wos_client.subscriptions.delete(subscription_id=sub.metadata.id)
        break

# Display all subscriptions
wos_client.subscriptions.show()

0,1,2,3,4,5,6,7,8,9
d4a05e9b-6f76-43f3-8045-9276d97bc9cd,model,My SDK Batch Subscription-DB2,00000000-0000-0000-0000-000000000000,9bb3d394-3b1f-4ef4-a2d9-5bff6c9ecc00,My SDK Batch Subscription-DB2,4d2f2fb2-6b64-4d58-8f13-257166e468e9,active,2024-06-26 07:45:02.879000+00:00,9db7eff3-d92f-462c-85c0-dcfb4f7725a0
9a70ffbf-8bd7-4e6d-8520-1975832f6d31,model,shap438 - P10 XGB Classifier - Model,00000000-0000-0000-0000-000000000000,032538a7-4549-4f87-b3fc-ed72a5888849,438features,34668a1a-ee92-4328-8361-05ba1c3f72bf,active,2024-06-26 04:34:18.520000+00:00,3fc64ae7-b852-4951-bc65-4e450b473952
3ed9f2ab-665e-48e2-8c7a-58ac3583d33d,model,My SDK Batch Subscription-DB2,00000000-0000-0000-0000-000000000000,47313ae2-8dff-4ced-8c69-aa61f2a9a70b,My SDK Batch Subscription-DB2,4d2f2fb2-6b64-4d58-8f13-257166e468e9,active,2024-06-25 15:35:43.781000+00:00,947e3b52-efff-4b8c-bc75-9397fd682516
e307dda2-8653-4201-a2b3-aa7e89b6692b,model,My SDK Batch Subscription-DB2,00000000-0000-0000-0000-000000000000,6842d6e9-1aa6-4cd3-b0cd-185cea193b6e,My SDK Batch Subscription-DB2,4d2f2fb2-6b64-4d58-8f13-257166e468e9,active,2024-06-25 15:28:01.182000+00:00,d46696c9-ef7c-4159-887c-4bc3718d6e73
1995cae7-3390-4dad-a0bc-5d056fe3397f,model,WML_IAE,00000000-0000-0000-0000-000000000000,88e7e864-328f-4b3f-a683-5de9849d93e2,WML_IAE,a6780fe7-b9aa-4b91-bf87-4ade19fd41ae,active,2024-06-23 10:18:00.421000+00:00,9bb07cbc-82ca-4bc6-a22a-4bb36fcdd5e4
b330874076d0c41969a82aff7de5835f,model,neelima-admit-predict-linear-regression-2019-11-08-14-31-02,00000000-0000-0000-0000-000000000000,155b20136d20663b08010f148e279a8c,admit-predict-regression-endpoint-201911081437,41217095-4a55-4637-854f-70d6b90e2566,active,2024-06-23 13:06:49.875000+00:00,02deb4d3-99bb-4748-85b2-2c2ec911f2e5
76611220d2a21271860e391d01269235,model,walkingactivity.multiclass,00000000-0000-0000-0000-000000000000,dd6efb65261b6d31dc9819bf271684ac,walkingactivity.multiclass,58e42a79-15b3-42a0-8ebc-bf3bcb842512,active,2024-06-23 13:06:49.244000+00:00,cc16b260-a2f2-492f-86d4-41dc9f02915b
4b02f297-e125-4e49-8b33-b0f2db6e8c56,model,GCR Batch Binary,00000000-0000-0000-0000-000000000000,585ad7c4-3632-43c7-a79a-5ce34931a62c,GCR Batch Binary,db6e279f-6d0b-49db-a94f-4b14a12aaddc,error,2024-06-21 05:03:52.125000+00:00,5afc4862-a4ef-449b-aaf5-c4e0e065c5f6
0cb8cfb5-d6c3-4b00-8f5b-41f28ff4eb98,model,GCR Batch Subscription-GCR2,00000000-0000-0000-0000-000000000000,10ae0966-6849-449d-a883-48b1e46384a3,GCR Batch Subscription-GCR2,db6e279f-6d0b-49db-a94f-4b14a12aaddc,pending_delete,2024-06-20 14:21:29.556000+00:00,cc0a7062-56ef-4feb-91aa-33764217ad77
6edae2e7-f6c3-465d-8b71-aa0d2455acd7,model,GCR Batch Subscription-GCR2,00000000-0000-0000-0000-000000000000,7c5c1c58-17d7-4f63-bd6a-0219f07e0fec,GCR Batch Subscription-GCR2,db6e279f-6d0b-49db-a94f-4b14a12aaddc,pending_delete,2024-06-20 14:12:18.807000+00:00,89646855-3162-4b5a-aafe-0a1e8a2a3946


Note: First 10 records were displayed.
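The delete loop above stops at the first match because of the `break`, while the listing shows that several subscriptions can share one name. The following sketch collects every matching ID instead. It uses stand-in objects built with `SimpleNamespace` rather than real SDK responses, and the names `matching_subscription_ids` and `fake_subscription` are hypothetical.

```python
from types import SimpleNamespace


def matching_subscription_ids(subscriptions, name):
    """Return the IDs of every subscription whose deployment name equals `name`."""
    return [
        sub.metadata.id
        for sub in subscriptions
        if sub.entity.deployment.name == name
    ]


def fake_subscription(sub_id, deployment_name):
    """Build a stand-in with the attribute layout that the loop above reads."""
    return SimpleNamespace(
        metadata=SimpleNamespace(id=sub_id),
        entity=SimpleNamespace(deployment=SimpleNamespace(name=deployment_name)),
    )


subscriptions = [
    fake_subscription("id-1", "My Batch Subscription"),
    fake_subscription("id-2", "Other Subscription"),
    fake_subscription("id-3", "My Batch Subscription"),
]

ids = matching_subscription_ids(subscriptions, "My Batch Subscription")
print(ids)
```

Each returned ID could then be passed to `wos_client.subscriptions.delete(subscription_id=...)`, as in the cell above.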


### Set subscription metadata

In the following cell, type a path to the common configuration JSON file that you created by running the [common configuration notebook](https://github.com/IBM/watson-openscale-samples/blob/main/Cloud%20Pak%20for%20Data/Batch%20Support/Configuration%20generation%20for%20OpenScale%20batch%20subscription.ipynb). After you edit the path information, run the cell to set the asset details and properties, the deployment details, the analytics engine details, and to add the required tables as data sources.

In [20]:
import uuid
import json

with open("/path/to/dir/containing/configuration.json", "r") as fp:
    configuration_json = json.load(fp)

common_configuration = configuration_json.get("common_configuration")
if common_configuration is None:
    # Stop here: the statements below index into common_configuration
    raise ValueError("Please provide the correct path to the common configuration JSON")
    
# Set asset details
asset = Asset(
    asset_id=str(uuid.uuid4()),
    url="",
    name=SUBSCRIPTION_NAME,
    asset_type=AssetTypes.MODEL,
    input_data_type=InputDataType.STRUCTURED,
    problem_type=ProblemType.BINARY_CLASSIFICATION
)

# Set deployment details
asset_deployment = AssetDeploymentRequest(
    deployment_id=str(uuid.uuid4()),
    name=SUBSCRIPTION_NAME,
    description=SUBSCRIPTION_DESCRIPTION,
    deployment_type="batch"
)

# Set asset properties 
asset_properties_request = AssetPropertiesRequest(
    label_column=common_configuration["label_column"],
    probability_fields=[common_configuration["probability"]],
    prediction_field=common_configuration["prediction"],
    feature_fields=common_configuration["feature_columns"],
    categorical_fields=common_configuration["categorical_columns"]
)

# Set analytics engine details
analytics_engine = AnalyticsEngine(
    type="spark",
    integrated_system_id=spark_engine_id,
    parameters = spark_parameters
)

# Add selected tables as data sources
data_sources = []
if FEEDBACK_DATABASE_NAME is not None and FEEDBACK_TABLE_NAME is not None:
    feedback_data_source = DataSource(
        type="feedback", 
        database_name=FEEDBACK_DATABASE_NAME, 
        schema_name=FEEDBACK_DATABASE_NAME, 
        table_name=FEEDBACK_TABLE_NAME, 
        connection=DataSourceConnection(
            type="hive", 
            integrated_system_id=hive_connection_id
        ),
        parameters={
            "hive_storage_format": "" #supported values are "csv", "parquet", "orc"
        },
        auto_create=True, #set it to False if table already exists
        status=DataSourceStatus(state="new")
    )
    data_sources.append(feedback_data_source)
    
if PAYLOAD_DATABASE_NAME is not None and PAYLOAD_TABLE_NAME is not None \
    and DRIFT_DATABASE_NAME is not None and DRIFT_TABLE_NAME is not None:
    payload_logging_data_source = DataSource(
        type="payload", 
        database_name=PAYLOAD_DATABASE_NAME, 
        schema_name=PAYLOAD_DATABASE_NAME, 
        table_name=PAYLOAD_TABLE_NAME, 
        connection=DataSourceConnection(
            type="hive", 
            integrated_system_id=hive_connection_id
        ),
        parameters={
            "hive_storage_format": "" #supported values are "csv", "parquet", "orc"
        },
        auto_create=True, #set it to False if table already exists
        status=DataSourceStatus(state="new")
    )
    
    drifted_transactions_table_data_source = DataSource(
        type="drift", 
        database_name=DRIFT_DATABASE_NAME, 
        schema_name=DRIFT_DATABASE_NAME, 
        table_name=DRIFT_TABLE_NAME, 
        connection=DataSourceConnection(
            type="hive", 
            integrated_system_id=hive_connection_id
        ),
        auto_create=True, #set it to False if table already exists
        status=DataSourceStatus(state="new")
    )
    
    data_sources.append(payload_logging_data_source)
    data_sources.append(drifted_transactions_table_data_source)

if EXPLAINABILITY_DATABASE_NAME is not None and \
    EXPLAINABILITY_QUEUE_TABLE_NAME is not None and \
        EXPLAINABILITY_RESULT_TABLE_NAME is not None:

    explainability_queue_data_source = DataSource(
        type="explain_queue", 
        database_name=EXPLAINABILITY_DATABASE_NAME,
        schema_name=EXPLAINABILITY_DATABASE_NAME,
        table_name=EXPLAINABILITY_QUEUE_TABLE_NAME,
        connection=DataSourceConnection(
            type="hive", 
            integrated_system_id=hive_connection_id  # ID returned when the Hive integrated system was added
        ),
        parameters={
            "hive_storage_format": "" #supported values are "csv", "parquet", "orc"
        },
        auto_create=True, #set it to False if table already exists
        status=DataSourceStatus(state="new")
    )
    
    data_sources.append(explainability_queue_data_source)

    explainability_result_data_source = DataSource(
        type="explain_result", 
        database_name=EXPLAINABILITY_DATABASE_NAME,
        schema_name=EXPLAINABILITY_DATABASE_NAME,
        table_name=EXPLAINABILITY_RESULT_TABLE_NAME,
        connection=DataSourceConnection(
            type="hive", 
            integrated_system_id=hive_connection_id  # ID returned when the Hive integrated system was added
        ),
        parameters={},
        auto_create=True, #set it to False if table already exists
        status=DataSourceStatus(state="new")
    )
    
    data_sources.append(explainability_result_data_source)
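The cell above, together with the schema patch later in this section, reads a fixed set of keys from the common configuration. The following sketch lists which of those keys are missing before any of them is used. The key names come from this notebook; the helper name `missing_configuration_keys` and the sample content are hypothetical.

```python
REQUIRED_KEYS = (
    "label_column",
    "probability",
    "prediction",
    "feature_columns",
    "categorical_columns",
    "training_data_schema",
    "input_data_schema",
    "output_data_schema",
)


def missing_configuration_keys(configuration_json):
    """Return the required keys absent from the common_configuration section."""
    common = configuration_json.get("common_configuration")
    if common is None:
        return list(REQUIRED_KEYS)
    return [key for key in REQUIRED_KEYS if key not in common]


# Hypothetical, deliberately incomplete configuration.
sample_configuration = {
    "common_configuration": {
        "label_column": "Risk",
        "prediction": "prediction",
        "feature_columns": ["LoanDuration", "LoanAmount"],
        "categorical_columns": [],
    }
}

missing_keys = missing_configuration_keys(sample_configuration)
print(missing_keys)
```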

In [21]:
# Adding the subscription

subscription_details = Subscriptions(wos_client).add(
    data_mart_id=data_mart_id,
    service_provider_id=service_provider_id,
    asset=asset,
    deployment=asset_deployment,
    asset_properties=asset_properties_request,
    analytics_engine=analytics_engine,
    data_sources=data_sources).result

subscription_id = subscription_details.metadata.id
print(subscription_details)

{
  "metadata": {
    "id": "0f4c5577-ac33-4f23-820e-0a718b0a55c7",
    "crn": "crn:v1:bluemix:public:aiopenscale:us-south:a/na:00000000-0000-0000-0000-000000000000:subscription:0f4c5577-ac33-4f23-820e-0a718b0a55c7",
    "url": "/v2/subscriptions/0f4c5577-ac33-4f23-820e-0a718b0a55c7",
    "created_at": "2024-06-27T13:38:42.493000Z",
    "created_by": "cpadmin"
  },
  "entity": {
    "data_mart_id": "00000000-0000-0000-0000-000000000000",
    "service_provider_id": "8b65752d-0dc9-4715-a2a2-96ee19fb7ded",
    "asset": {
      "asset_id": "32108fcd-4407-4435-b057-b7043e2eb6bd",
      "url": "",
      "name": "WML_IAE2",
      "asset_type": "model",
      "problem_type": "binary",
      "input_data_type": "structured"
    },
    "asset_properties": {
      "label_column": "Risk",
      "prediction_field": "prediction",
      "feature_fields": [
        "CheckingStatus",
        "LoanDuration",
        "CreditHistory",
        "LoanPurpose",
        "LoanAmount",
        "ExistingSavings",


In [22]:
import time
# Checking subscription status

state = wos_client.subscriptions.get(subscription_id).result.entity.status.state
print(state)
while state not in ["active", "error"]:
    # Wait before polling again; no extra wait once a final state is reached
    time.sleep(15)
    state = wos_client.subscriptions.get(subscription_id).result.entity.status.state
    print(state)

preparing
preparing
preparing
preparing
active
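A polling loop like the one above runs indefinitely if the subscription never reaches `active` or `error`. The following is a minimal sketch of the same pattern with a timeout. The helper name `wait_for_state` and its default values are assumptions, and the example uses a fake state source instead of the OpenScale client.

```python
import time


def wait_for_state(get_state, done_states=("active", "error"), interval=15, timeout=600):
    """Poll get_state() until it returns a final state or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        state = get_state()
        if state in done_states:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError("state is still {!r} after {} seconds".format(state, timeout))
        time.sleep(interval)


# Fake state source that becomes active on the third poll.
states = iter(["preparing", "preparing", "active"])
final_state = wait_for_state(lambda: next(states), interval=0)
print(final_state)
```

In this notebook, the state source could be `lambda: wos_client.subscriptions.get(subscription_id).result.entity.status.state`.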


In [23]:
# Add training, output, and input data schemas to the subscription

training_data_schema_patch_document=[
    JsonPatchOperation(op=OperationTypes.REPLACE, path='/asset_properties/training_data_schema', value=common_configuration["training_data_schema"])
]

input_data_schema_patch_document=[
    JsonPatchOperation(op=OperationTypes.REPLACE, path='/asset_properties/input_data_schema', value=common_configuration["input_data_schema"])
]

output_data_schema_patch_document=[
    JsonPatchOperation(op=OperationTypes.REPLACE, path='/asset_properties/output_data_schema', value=common_configuration["output_data_schema"])
]

wos_client.subscriptions.update(subscription_id=subscription_id, patch_document=training_data_schema_patch_document)
wos_client.subscriptions.update(subscription_id=subscription_id, patch_document=input_data_schema_patch_document)
wos_client.subscriptions.update(subscription_id=subscription_id, patch_document=output_data_schema_patch_document)

<ibm_cloud_sdk_core.detailed_response.DetailedResponse at 0x7fc48050b190>
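Each patch document above is a list of JSON Patch operations. To illustrate what a `replace` operation does to the subscription document, the following sketch applies such operations to a plain dictionary. It handles only `replace` with simple `/a/b` paths and is not how the service itself is implemented. The helper name `apply_replace_operations` and the sample schema are hypothetical.

```python
import copy


def apply_replace_operations(document, operations):
    """Apply 'replace' operations with simple /a/b paths to a nested dict.

    Strict RFC 6902 requires the target to exist already; this sketch
    overwrites the key if present and sets it otherwise.
    """
    result = copy.deepcopy(document)
    for operation in operations:
        if operation["op"] != "replace":
            raise ValueError("only 'replace' is handled in this sketch")
        *parents, leaf = operation["path"].strip("/").split("/")
        target = result
        for key in parents:
            target = target[key]
        target[leaf] = operation["value"]
    return result


subscription = {"asset_properties": {"label_column": "Risk"}}
operations = [
    {
        "op": "replace",
        "path": "/asset_properties/training_data_schema",
        "value": {"type": "struct", "fields": []},
    }
]

patched = apply_replace_operations(subscription, operations)
print(patched["asset_properties"]["training_data_schema"])
```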

In [24]:
# Check subscription status

wos_client.subscriptions.get(subscription_id).result.entity.status.state

'active'

# 4. Quality monitoring <a name="quality"></a>

### Enable the quality monitor

In the following code cell, default values are set for the quality monitor. To change them, update the optional `min_feedback_data_size` attribute in the `parameters` dict and set the quality threshold in the `thresholds` list.

In [25]:
import time

target = Target(
    target_type=TargetTypes.SUBSCRIPTION,
    target_id=subscription_id
)

parameters = {
    "min_feedback_data_size": 1000
}

thresholds = [{
        "metric_id": "area_under_roc",
        "type": "lower_limit",
        "value": 0.8
}]

quality_monitor_details = wos_client.monitor_instances.create(
    data_mart_id=data_mart_id,
    monitor_definition_id=wos_client.monitor_definitions.MONITORS.QUALITY.ID,
    target=target,
    parameters=parameters,
    thresholds=thresholds
).result

quality_monitor_instance_id = quality_monitor_details.metadata.id
print(quality_monitor_details)

{
  "metadata": {
    "id": "77cd536b-9492-4de8-9a38-284f167a68fe",
    "crn": "crn:v1:bluemix:public:aiopenscale:us-south:a/na:00000000-0000-0000-0000-000000000000:monitor_instance:77cd536b-9492-4de8-9a38-284f167a68fe",
    "url": "/v2/monitor_instances/77cd536b-9492-4de8-9a38-284f167a68fe",
    "created_at": "2024-06-27T13:41:36.305000Z",
    "created_by": "cpadmin"
  },
  "entity": {
    "data_mart_id": "00000000-0000-0000-0000-000000000000",
    "monitor_definition_id": "quality",
    "target": {
      "target_type": "subscription",
      "target_id": "0f4c5577-ac33-4f23-820e-0a718b0a55c7"
    },
    "parameters": {
      "min_feedback_data_size": 1000
    },
    "thresholds": [
      {
        "metric_id": "area_under_roc",
        "type": "lower_limit",
        "value": 0.8
      }
    ],
    "schedule": {
      "repeat_interval": 1,
      "repeat_unit": "week",
      "start_time": {
        "type": "relative",
        "delay_unit": "minute",
        "delay": 10
      },
      "r

### Check monitor instance status

In [26]:
quality_status = None
from datetime import datetime

while quality_status not in ("active", "error"):
    monitor_instance_details = wos_client.monitor_instances.get(monitor_instance_id=quality_monitor_instance_id).result
    quality_status = monitor_instance_details.entity.status.state
    if quality_status not in ("active", "error"):
        print(datetime.utcnow().strftime('%H:%M:%S'), quality_status)
        time.sleep(30)
        
print(datetime.utcnow().strftime('%H:%M:%S'), quality_status)

13:41:44 active


In [27]:
monitor_instance_details = wos_client.monitor_instances.get(monitor_instance_id=quality_monitor_instance_id).result
print(monitor_instance_details)

{
  "metadata": {
    "id": "77cd536b-9492-4de8-9a38-284f167a68fe",
    "crn": "crn:v1:bluemix:public:aiopenscale:us-south:a/na:00000000-0000-0000-0000-000000000000:monitor_instance:77cd536b-9492-4de8-9a38-284f167a68fe",
    "url": "/v2/monitor_instances/77cd536b-9492-4de8-9a38-284f167a68fe",
    "created_at": "2024-06-27T13:41:36.305000Z",
    "created_by": "cpadmin",
    "modified_at": "2024-06-27T13:41:36.725000Z",
    "modified_by": "internal-service"
  },
  "entity": {
    "data_mart_id": "00000000-0000-0000-0000-000000000000",
    "monitor_definition_id": "quality",
    "target": {
      "target_type": "subscription",
      "target_id": "0f4c5577-ac33-4f23-820e-0a718b0a55c7"
    },
    "parameters": {
      "min_feedback_data_size": 1000
    },
    "thresholds": [
      {
        "metric_id": "area_under_roc",
        "type": "lower_limit",
        "value": 0.8
      }
    ],
    "schedule": {
      "repeat_interval": 1,
      "repeat_unit": "week",
      "start_time": {
        

### Run an on-demand evaluation

In [28]:
# Check Quality monitor instance details

monitor_instance_details = wos_client.monitor_instances.get(monitor_instance_id=quality_monitor_instance_id).result
print(monitor_instance_details)

{
  "metadata": {
    "id": "77cd536b-9492-4de8-9a38-284f167a68fe",
    "crn": "crn:v1:bluemix:public:aiopenscale:us-south:a/na:00000000-0000-0000-0000-000000000000:monitor_instance:77cd536b-9492-4de8-9a38-284f167a68fe",
    "url": "/v2/monitor_instances/77cd536b-9492-4de8-9a38-284f167a68fe",
    "created_at": "2024-06-27T13:41:36.305000Z",
    "created_by": "cpadmin",
    "modified_at": "2024-06-27T13:57:58.320000Z",
    "modified_by": "internal-service"
  },
  "entity": {
    "data_mart_id": "00000000-0000-0000-0000-000000000000",
    "monitor_definition_id": "quality",
    "target": {
      "target_type": "subscription",
      "target_id": "0f4c5577-ac33-4f23-820e-0a718b0a55c7"
    },
    "parameters": {
      "min_feedback_data_size": 10
    },
    "thresholds": [
      {
        "metric_id": "area_under_roc",
        "type": "lower_limit",
        "value": 0.8
      },
      {
        "metric_id": "area_under_pr",
        "type": "lower_limit"
      },
      {
        "metric_id":

In [29]:
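# Optional: point the following cells at an existing monitor instance and subscription.
# Running this cell overrides the IDs created earlier in this notebook; skip it to keep them.
quality_monitor_instance_id = "30171a44-7a4e-4ebc-afda-3d0de5625cf5"
subscription_id = "e34b9b87-b6e1-4c53-b92e-cb80dea042be"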

In [30]:
# Trigger on-demand run

monitoring_run_details = wos_client.monitor_instances.run(monitor_instance_id=quality_monitor_instance_id).result
monitoring_run_id=monitoring_run_details.metadata.id

print(monitoring_run_details)

{
  "metadata": {
    "id": "1278a1bb-d471-482a-96c7-f458b03130a9",
    "crn": "crn:v1:bluemix:public:aiopenscale:us-south:a/na:00000000-0000-0000-0000-000000000000:run:1278a1bb-d471-482a-96c7-f458b03130a9",
    "url": "/v2/monitor_instances/30171a44-7a4e-4ebc-afda-3d0de5625cf5/runs/1278a1bb-d471-482a-96c7-f458b03130a9",
    "created_at": "2024-07-05T08:57:18.081000Z",
    "created_by": "cpadmin"
  },
  "entity": {
    "parameters": {
      "last_processed_time": "2024-07-05T08:36:43.817463Z",
      "min_feedback_data_size": 10
    },
    "status": {
      "state": "running",
      "queued_at": "2024-07-05T08:57:18.070000Z",
      "started_at": "2024-07-05T08:57:18.081000Z",
      "operators": []
    }
  }
}


In [31]:
# Check run status
import time
quality_run_status = None
while quality_run_status not in ("finished", "error"):
    monitoring_run_details = wos_client.monitor_instances.get_run_details(monitor_instance_id=quality_monitor_instance_id, monitoring_run_id=monitoring_run_id).result
    quality_run_status = monitoring_run_details.entity.status.state
    if quality_run_status not in ("finished", "error"):
        print(datetime.utcnow().strftime("%H:%M:%S"), quality_run_status)
        time.sleep(30)
    else:
        print(monitoring_run_details.entity)
print(datetime.utcnow().strftime("%H:%M:%S"), quality_run_status)

08:57:30 running
08:58:00 running
08:58:30 running
08:59:00 running
08:59:30 running
09:00:00 running
09:00:30 running
09:01:00 running
09:01:30 running
09:02:00 finished


### Display quality metrics

In [32]:
wos_client.monitor_instances.show_metrics(monitor_instance_id=quality_monitor_instance_id)
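To read the displayed metrics against the configured thresholds, the following sketch shows one way such a comparison can work. It assumes that a `lower_limit` is violated when the metric falls below the value and an `upper_limit` when it rises above, and it skips thresholds that carry no `value`. The function name `violated_thresholds` and the metric values are hypothetical, and this is not the service's own evaluation logic.

```python
def violated_thresholds(metrics, thresholds):
    """Return the metric IDs whose value breaks a configured limit."""
    violations = []
    for threshold in thresholds:
        value = metrics.get(threshold["metric_id"])
        if value is None or "value" not in threshold:
            continue  # nothing to compare against
        limit = threshold["value"]
        if threshold["type"] == "lower_limit" and value < limit:
            violations.append(threshold["metric_id"])
        elif threshold["type"] == "upper_limit" and value > limit:
            violations.append(threshold["metric_id"])
    return violations


# Hypothetical metric values; thresholds shaped like the ones in this notebook.
example_metrics = {"area_under_roc": 0.76, "area_under_pr": 0.91}
example_thresholds = [
    {"metric_id": "area_under_roc", "type": "lower_limit", "value": 0.8},
    {"metric_id": "area_under_pr", "type": "lower_limit"},
]

violations = violated_thresholds(example_metrics, example_thresholds)
print(violations)
```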

# 5. Drift monitoring <a name="drift"></a>

### Enable the drift monitor

In the following code cell, enter the path to the drift configuration tarball.

In [33]:
wos_client.monitor_instances.upload_drift_model(
    model_path="/path/to/dir/containing/drift.tar.gz",
    data_mart_id=data_mart_id,
    subscription_id=subscription_id
).result

{'data_constraints': {'id': 'd8f2155e-81fc-4dea-ae2e-c3b5e42d2dc9',
  'version': '0.02_batch',
  'columns': [{'name': 'Age',
    'dtype': 'numeric_discrete',
    'count': 1000,
    'approx_count_distinct': 0,
    'sparse': False,
    'skip_learning': False},
   {'name': 'CheckingStatus',
    'dtype': 'categorical',
    'count': 1000,
    'approx_count_distinct': 0,
    'sparse': False,
    'skip_learning': False},
   {'name': 'CreditHistory',
    'dtype': 'categorical',
    'count': 1000,
    'approx_count_distinct': 0,
    'sparse': False,
    'skip_learning': False},
   {'name': 'CurrentResidenceDuration',
    'dtype': 'numeric_discrete',
    'count': 1000,
    'approx_count_distinct': 0,
    'sparse': False,
    'skip_learning': False},
   {'name': 'Dependents',
    'dtype': 'numeric_discrete',
    'count': 1000,
    'approx_count_distinct': 0,
    'sparse': False,
    'skip_learning': False},
   {'name': 'EmploymentDuration',
    'dtype': 'categorical',
    'count': 1000,
    'appr

The following code cell sets default values for the drift monitor. To change them, update the values in the `parameters` dictionary. The `min_samples` parameter sets the minimum number of records required for the drift monitor to run. The `drift_threshold` parameter sets the drift percentage, expressed as a decimal, above which an alert is triggered. The `train_drift_model` parameter controls whether to re-train the model based on the drift analysis.
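
Before creating the monitor, it can help to sanity-check the three parameters described above. The following is a minimal sketch; `build_drift_parameters` is a hypothetical helper, and the accepted ranges (a positive integer for `min_samples`, a fraction between 0 and 1 for `drift_threshold`) are assumptions rather than limits stated by this notebook.

```python
def build_drift_parameters(min_samples=1000, drift_threshold=0.05, train_drift_model=False):
    """Return a drift-monitor parameters dictionary after basic sanity checks."""
    # Assumed constraint: the record count must be a positive integer.
    if not isinstance(min_samples, int) or min_samples <= 0:
        raise ValueError("min_samples must be a positive integer")
    # Assumed constraint: the threshold is a decimal fraction, so 5% is 0.05.
    if not 0 <= drift_threshold <= 1:
        raise ValueError("drift_threshold must be a decimal fraction between 0 and 1")
    return {
        "min_samples": min_samples,
        "drift_threshold": drift_threshold,
        "train_drift_model": bool(train_drift_model),
    }


drift_parameters = build_drift_parameters(drift_threshold=0.05)
print(drift_parameters)
```

The returned dictionary has the same shape as the `parameters` dictionary passed to `wos_client.monitor_instances.create` in the next cell.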



In [34]:
import time

target = Target(
    target_type=TargetTypes.SUBSCRIPTION,
    target_id=subscription_id
)

parameters = {
    "min_samples": 1000,
    "drift_threshold": 0.05,
    "train_drift_model": False
}

drift_monitor_details = wos_client.monitor_instances.create(
    data_mart_id=data_mart_id,
    monitor_definition_id=wos_client.monitor_definitions.MONITORS.DRIFT.ID,
    target=target,
    parameters=parameters
).result

drift_monitor_instance_id = drift_monitor_details.metadata.id
print(drift_monitor_details)

{
  "metadata": {
    "id": "36fc8948-7f76-4f52-b359-e96a8ca9b4ac",
    "crn": "crn:v1:bluemix:public:aiopenscale:us-south:a/na:00000000-0000-0000-0000-000000000000:monitor_instance:36fc8948-7f76-4f52-b359-e96a8ca9b4ac",
    "url": "/v2/monitor_instances/36fc8948-7f76-4f52-b359-e96a8ca9b4ac",
    "created_at": "2024-06-27T14:44:38.516000Z",
    "created_by": "cpadmin"
  },
  "entity": {
    "data_mart_id": "00000000-0000-0000-0000-000000000000",
    "monitor_definition_id": "drift",
    "target": {
      "target_type": "subscription",
      "target_id": "0f4c5577-ac33-4f23-820e-0a718b0a55c7"
    },
    "parameters": {
      "drift_threshold": 0.05,
      "min_samples": 10,
      "train_drift_model": false
    },
    "thresholds": [
      {
        "metric_id": "drift_magnitude",
        "type": "upper_limit",
        "value": 0.5
      },
      {
        "metric_id": "predicted_accuracy",
        "type": "upper_limit",
        "value": 0.8
      },
      {
        "metric_id": "data_dr

### Check monitor instance status

In [35]:
drift_status = None

while drift_status not in ("active", "error"):
    monitor_instance_details = wos_client.monitor_instances.get(monitor_instance_id=drift_monitor_instance_id).result
    drift_status = monitor_instance_details.entity.status.state
    if drift_status not in ("active", "error"):
        print(datetime.utcnow().strftime('%H:%M:%S'), drift_status)
        time.sleep(30)

print(datetime.utcnow().strftime('%H:%M:%S'), drift_status)

14:45:38 preparing
14:46:08 preparing
14:46:38 preparing
14:47:08 active


### Run an on-demand evaluation

In [36]:
# Check Drift monitor instance details

monitor_instance_details = wos_client.monitor_instances.get(monitor_instance_id=drift_monitor_instance_id).result
print(monitor_instance_details)

{
  "metadata": {
    "id": "36fc8948-7f76-4f52-b359-e96a8ca9b4ac",
    "crn": "crn:v1:bluemix:public:aiopenscale:us-south:a/na:00000000-0000-0000-0000-000000000000:monitor_instance:36fc8948-7f76-4f52-b359-e96a8ca9b4ac",
    "url": "/v2/monitor_instances/36fc8948-7f76-4f52-b359-e96a8ca9b4ac",
    "created_at": "2024-06-27T14:44:38.516000Z",
    "created_by": "cpadmin",
    "modified_at": "2024-06-27T14:46:42.394000Z",
    "modified_by": "internal-service"
  },
  "entity": {
    "data_mart_id": "00000000-0000-0000-0000-000000000000",
    "monitor_definition_id": "drift",
    "target": {
      "target_type": "subscription",
      "target_id": "0f4c5577-ac33-4f23-820e-0a718b0a55c7"
    },
    "parameters": {
      "config_status": {
        "model_name": null,
        "state": "finished"
      },
      "data_drift_enabled": true,
      "data_drift_threshold": 0.1,
      "drift_buffer_range": [
        -4.5,
        4.5
      ],
      "drift_model_version": null,
      "drift_threshold": 0

In [37]:
# Trigger on-demand run

monitoring_run_details = wos_client.monitor_instances.run(monitor_instance_id=drift_monitor_instance_id).result
monitoring_run_id = monitoring_run_details.metadata.id

print(monitoring_run_details)

{
  "metadata": {
    "id": "38186d88-c5e2-4f8e-bc52-983b4fb22588",
    "crn": "crn:v1:bluemix:public:aiopenscale:us-south:a/na:00000000-0000-0000-0000-000000000000:run:38186d88-c5e2-4f8e-bc52-983b4fb22588",
    "url": "/v2/monitor_instances/36fc8948-7f76-4f52-b359-e96a8ca9b4ac/runs/38186d88-c5e2-4f8e-bc52-983b4fb22588",
    "created_at": "2024-06-27T14:47:11.180000Z",
    "created_by": "cpadmin"
  },
  "entity": {
    "parameters": {
      "config_status": {
        "model_name": null,
        "state": "finished"
      },
      "data_drift_enabled": true,
      "data_drift_threshold": 0.1,
      "drift_buffer_range": [
        -4.5,
        4.5
      ],
      "drift_model_version": null,
      "drift_threshold": 0.05,
      "min_samples": 10,
      "model_drift_enabled": true,
      "table_schema": {
        "fields": [
          {
            "length": 64,
            "metadata": {},
            "name": "scoring_id",
            "nullable": false,
            "type": "string",
      

In [38]:
# Check run status

drift_run_status = None
while drift_run_status not in ("finished", "error"):
    monitoring_run_details = wos_client.monitor_instances.get_run_details(monitor_instance_id=drift_monitor_instance_id, monitoring_run_id=monitoring_run_id).result
    drift_run_status = monitoring_run_details.entity.status.state
    if drift_run_status not in ("finished", "error"):
        print(datetime.utcnow().strftime("%H:%M:%S"), drift_run_status)
        time.sleep(30)
        
print(datetime.utcnow().strftime("%H:%M:%S"), drift_run_status)

14:50:11 finished


### Display drift metrics

In [39]:
wos_client.monitor_instances.show_metrics(monitor_instance_id=drift_monitor_instance_id)

0,1,2,3,4,5,6,7,8,9,10,11
2024-06-27 14:47:12.950102+00:00,data_drift_magnitude,71ad6b8c-74b4-46df-acec-88496e6f71b4,0.0,,0.1,[],drift,36fc8948-7f76-4f52-b359-e96a8ca9b4ac,38186d88-c5e2-4f8e-bc52-983b4fb22588,subscription,0f4c5577-ac33-4f23-820e-0a718b0a55c7
2024-06-27 14:47:12.950102+00:00,drift_magnitude,71ad6b8c-74b4-46df-acec-88496e6f71b4,0.0,,0.05,[],drift,36fc8948-7f76-4f52-b359-e96a8ca9b4ac,38186d88-c5e2-4f8e-bc52-983b4fb22588,subscription,0f4c5577-ac33-4f23-820e-0a718b0a55c7
2024-06-27 14:47:12.950102+00:00,predicted_accuracy,71ad6b8c-74b4-46df-acec-88496e6f71b4,0.8181818181818182,,,[],drift,36fc8948-7f76-4f52-b359-e96a8ca9b4ac,38186d88-c5e2-4f8e-bc52-983b4fb22588,subscription,0f4c5577-ac33-4f23-820e-0a718b0a55c7


# 6. Fairness monitoring <a name="fairness"></a>

### Enable the fairness monitor

The following code cell enables the fairness monitor.

In [40]:
target = Target(
    target_type=TargetTypes.SUBSCRIPTION,
    target_id=subscription_id
)

parameters = configuration_json["fairness_configuration"]["parameters"]
thresholds = configuration_json["fairness_configuration"]["thresholds"]

fairness_monitor_details = wos_client.monitor_instances.create(
    data_mart_id=data_mart_id,
    monitor_definition_id=wos_client.monitor_definitions.MONITORS.FAIRNESS.ID,
    target=target,
    parameters=parameters,
    thresholds=thresholds
).result

fairness_monitor_instance_id = fairness_monitor_details.metadata.id
print(fairness_monitor_details)

{
  "metadata": {
    "id": "eee6c677-7f68-4b33-94ca-5b079bccb1b5",
    "crn": "crn:v1:bluemix:public:aiopenscale:us-south:a/na:00000000-0000-0000-0000-000000000000:monitor_instance:eee6c677-7f68-4b33-94ca-5b079bccb1b5",
    "url": "/v2/monitor_instances/eee6c677-7f68-4b33-94ca-5b079bccb1b5",
    "created_at": "2024-06-27T14:50:39.749000Z",
    "created_by": "cpadmin"
  },
  "entity": {
    "data_mart_id": "00000000-0000-0000-0000-000000000000",
    "monitor_definition_id": "fairness",
    "target": {
      "target_type": "subscription",
      "target_id": "0f4c5577-ac33-4f23-820e-0a718b0a55c7"
    },
    "parameters": {
      "training_data_last_processed_time": "2024-06-25T09:31:16.569860Z",
      "features": [
        {
          "minority": [
            "female"
          ],
          "feature": "Sex",
          "metric_ids": [
            "statistical_parity_difference",
            "fairness_value"
          ],
          "majority": [
            "male"
          ],
          "t

### Check monitor instance status

In [41]:
fairness_state = fairness_monitor_details.entity.status.state

while fairness_state not in ("active", "error"):
    print(datetime.utcnow().strftime('%H:%M:%S'), fairness_state)
    monitor_instance_details = wos_client.monitor_instances.get(monitor_instance_id=fairness_monitor_instance_id).result
    fairness_state = monitor_instance_details.entity.status.state
    time.sleep(30)

print(datetime.utcnow().strftime('%H:%M:%S'), fairness_state)

14:50:40 preparing
14:51:10 active


### Run an on-demand evaluation

In [42]:
# Trigger on-demand run

monitoring_run_details = wos_client.monitor_instances.run(monitor_instance_id=fairness_monitor_instance_id).result
monitoring_run_id = monitoring_run_details.metadata.id

print(monitoring_run_details)

{
  "metadata": {
    "id": "ed47a9b0-417a-457a-b5a1-c87cdbe8590d",
    "crn": "crn:v1:bluemix:public:aiopenscale:us-south:a/na:00000000-0000-0000-0000-000000000000:run:ed47a9b0-417a-457a-b5a1-c87cdbe8590d",
    "url": "/v2/monitor_instances/eee6c677-7f68-4b33-94ca-5b079bccb1b5/runs/ed47a9b0-417a-457a-b5a1-c87cdbe8590d",
    "created_at": "2024-06-27T14:51:10.927000Z",
    "created_by": "cpadmin"
  },
  "entity": {
    "parameters": {
      "favourable_class": [
        "No Risk"
      ],
      "features": [
        {
          "feature": "Sex",
          "majority": [
            "male"
          ],
          "metric_ids": [
            "statistical_parity_difference",
            "fairness_value"
          ],
          "minority": [
            "female"
          ],
          "threshold": 0.95
        },
        {
          "feature": "Age",
          "majority": [
            [
              26,
              75
            ]
          ],
          "metric_ids": [
            "stati

In [43]:
# Check run status

fairness_run_status = monitoring_run_details.entity.status.state
while fairness_run_status not in ("finished", "error"):
    print(datetime.utcnow().strftime("%H:%M:%S"), fairness_run_status)
    monitoring_run_details = wos_client.monitor_instances.get_run_details(monitor_instance_id=fairness_monitor_instance_id, monitoring_run_id=monitoring_run_id).result
    fairness_run_status = monitoring_run_details.entity.status.state
    time.sleep(30)
        
print(datetime.utcnow().strftime("%H:%M:%S"), fairness_run_status)

14:51:11 running
14:51:41 running
14:52:11 running
14:52:41 running
14:53:11 running
14:53:41 running
14:54:11 running
14:54:41 running
14:55:11 running
14:55:41 running
14:56:12 running
14:56:42 running
14:57:12 finished


### Display fairness metrics

In [44]:
wos_client.monitor_instances.show_metrics(monitor_instance_id=fairness_monitor_instance_id)

0,1,2,3,4,5,6,7,8,9,10,11
2024-06-27 14:56:13.738831+00:00,fairness_value,88b03992-241e-4d9c-9221-0d44e02243b5,175.00000000000003,95.0,125.0,"['feature:Sex', 'fairness_metric_type:fairness', 'feature_value:female']",fairness,eee6c677-7f68-4b33-94ca-5b079bccb1b5,ed47a9b0-417a-457a-b5a1-c87cdbe8590d,subscription,0f4c5577-ac33-4f23-820e-0a718b0a55c7
2024-06-27 14:56:13.738831+00:00,statistical_parity_difference,88b03992-241e-4d9c-9221-0d44e02243b5,0.429,-0.15,0.15,"['feature:Sex', 'fairness_metric_type:fairness', 'feature_value:female']",fairness,eee6c677-7f68-4b33-94ca-5b079bccb1b5,ed47a9b0-417a-457a-b5a1-c87cdbe8590d,subscription,0f4c5577-ac33-4f23-820e-0a718b0a55c7
2024-06-27 14:56:13.738831+00:00,fairness_value,88b03992-241e-4d9c-9221-0d44e02243b5,150.00000000000003,95.0,125.0,"['feature:Age', 'fairness_metric_type:fairness', 'feature_value:18-25']",fairness,eee6c677-7f68-4b33-94ca-5b079bccb1b5,ed47a9b0-417a-457a-b5a1-c87cdbe8590d,subscription,0f4c5577-ac33-4f23-820e-0a718b0a55c7
2024-06-27 14:56:13.738831+00:00,statistical_parity_difference,88b03992-241e-4d9c-9221-0d44e02243b5,0.333,-0.15,0.15,"['feature:Age', 'fairness_metric_type:fairness', 'feature_value:18-25']",fairness,eee6c677-7f68-4b33-94ca-5b079bccb1b5,ed47a9b0-417a-457a-b5a1-c87cdbe8590d,subscription,0f4c5577-ac33-4f23-820e-0a718b0a55c7
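
To read the rows above, it helps to compare each metric value with its limits. The following is a minimal sketch that assumes columns 3, 4, and 5 of the output hold the metric value, the lower limit, and the upper limit; the output itself does not label the columns, so treat that mapping as an assumption. `check_limits` is a hypothetical helper, and the values are copied from the output above, with the fairness values rounded.

```python
def check_limits(value, lower=None, upper=None):
    """Return 'ok', 'below lower limit', or 'above upper limit' for one metric value."""
    if lower is not None and value < lower:
        return "below lower limit"
    if upper is not None and value > upper:
        return "above upper limit"
    return "ok"


# (feature, metric, value, lower limit, upper limit), copied from the output above.
rows = [
    ("Sex", "fairness_value", 175.0, 95.0, 125.0),
    ("Sex", "statistical_parity_difference", 0.429, -0.15, 0.15),
    ("Age", "fairness_value", 150.0, 95.0, 125.0),
    ("Age", "statistical_parity_difference", 0.333, -0.15, 0.15),
]

results = {
    (feature, metric): check_limits(value, lower, upper)
    for feature, metric, value, lower, upper in rows
}
for key, verdict in results.items():
    print(key, verdict)
```

Under that column assumption, all four values in this sample run fall above their upper limits.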


# 7. Explainability monitoring <a name="explainability"></a>

### Enable the explainability monitor

#### Upload explainability configuration archive
In the following code cell, enter the path to the explainability configuration archive tarball.

In [45]:
with open("/path/to/dir/containing/explainability.tar.gz", mode="rb") as explainability_tar:
    wos_client.monitor_instances.upload_explainability_archive(subscription_id=subscription_id, archive=explainability_tar)

print("Uploaded explainability configuration archive successfully.")

Uploaded explainability configuration archive successfully.
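
Both archive uploads in this notebook assume that the path points to a readable tarball. The following is a minimal sketch of a local check that could run before an upload, using only the Python standard library. `check_archive` is a hypothetical helper; it confirms only that the file exists and opens as a tar archive, not that its contents are a valid OpenScale configuration.

```python
import tarfile
from pathlib import Path


def check_archive(path):
    """Return the member names of a tar archive, or raise if the path is unusable."""
    archive_path = Path(path)
    if not archive_path.is_file():
        raise FileNotFoundError(f"No such archive: {archive_path}")
    if not tarfile.is_tarfile(str(archive_path)):
        raise ValueError(f"Not a readable tar archive: {archive_path}")
    # "r:*" lets tarfile detect the compression (gzip, bz2, xz, or none).
    with tarfile.open(str(archive_path), mode="r:*") as archive:
        return archive.getnames()


# Example call, left commented out because the path is a placeholder:
# print(check_archive("/path/to/dir/containing/explainability.tar.gz"))
```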


In [46]:
import time

target = Target(
    target_type=TargetTypes.SUBSCRIPTION,
    target_id=subscription_id
)

parameters = {
# Uncomment the lines below to enable LIME global explanation (available from Cloud Pak for Data version 4.6.4 onwards).
#    "global_explanation": {
#        "enabled": True,  # Flag to enable global explanation 
#        "explanation_method": "lime",
#        "sample_size": 1000, # [Optional] The sample size of records to be used for generating payload data global explanation. If not specified entire data in the payload window is used.
#    }
}

explainability_monitor_details = wos_client.monitor_instances.create(
    data_mart_id=data_mart_id,
    monitor_definition_id=wos_client.monitor_definitions.MONITORS.EXPLAINABILITY.ID,
    target=target,
    parameters=parameters
).result

explainability_monitor_instance_id = explainability_monitor_details.metadata.id
print(explainability_monitor_details)

{
  "metadata": {
    "id": "8d218fff-97b8-4858-b45b-130337c1be72",
    "crn": "crn:v1:bluemix:public:aiopenscale:us-south:a/na:00000000-0000-0000-0000-000000000000:monitor_instance:8d218fff-97b8-4858-b45b-130337c1be72",
    "url": "/v2/monitor_instances/8d218fff-97b8-4858-b45b-130337c1be72",
    "created_at": "2024-06-27T14:58:09.835000Z",
    "created_by": "cpadmin"
  },
  "entity": {
    "data_mart_id": "00000000-0000-0000-0000-000000000000",
    "monitor_definition_id": "explainability",
    "target": {
      "target_type": "subscription",
      "target_id": "0f4c5577-ac33-4f23-820e-0a718b0a55c7"
    },
    "parameters": {},
    "thresholds": [
      {
        "metric_id": "global_explanation_stability",
        "type": "lower_limit",
        "value": 0.8
      }
    ],
    "schedule": {
      "repeat_interval": 1,
      "repeat_unit": "week",
      "start_time": {
        "type": "relative",
        "delay_unit": "minute",
        "delay": 10
      },
      "repeat_type": "week"
 

### Check monitor instance status

In [47]:
explainability_status = None

while explainability_status not in ("active", "error"):
    monitor_instance_details = wos_client.monitor_instances.get(monitor_instance_id=explainability_monitor_instance_id).result
    explainability_status = monitor_instance_details.entity.status.state
    if explainability_status not in ("active", "error"):
        print(datetime.utcnow().strftime('%H:%M:%S'), explainability_status)
        time.sleep(30)

print(datetime.utcnow().strftime('%H:%M:%S'), explainability_status)

15:09:04 active


### Run an on-demand evaluation

In [48]:
# Check Explainability monitor instance details

monitor_instance_details = wos_client.monitor_instances.get(monitor_instance_id=explainability_monitor_instance_id).result
print(monitor_instance_details)

{
  "metadata": {
    "id": "8d218fff-97b8-4858-b45b-130337c1be72",
    "crn": "crn:v1:bluemix:public:aiopenscale:us-south:a/na:00000000-0000-0000-0000-000000000000:monitor_instance:8d218fff-97b8-4858-b45b-130337c1be72",
    "url": "/v2/monitor_instances/8d218fff-97b8-4858-b45b-130337c1be72",
    "created_at": "2024-06-27T14:58:09.835000Z",
    "created_by": "cpadmin",
    "modified_at": "2024-06-27T15:08:51.102000Z",
    "modified_by": "internal-service"
  },
  "entity": {
    "data_mart_id": "00000000-0000-0000-0000-000000000000",
    "monitor_definition_id": "explainability",
    "target": {
      "target_type": "subscription",
      "target_id": "0f4c5577-ac33-4f23-820e-0a718b0a55c7"
    },
    "parameters": {
      "config_modified_at": "2024-06-27T15:08:51.033218Z",
      "config_package_file": "explainability (2).tar.gz",
      "controllable_features": [],
      "explanations_count": {
        "failed": 0,
        "total": 0
      },
      "lime": {
        "perturbations_count"

In [49]:
# Trigger on-demand run

monitoring_run_details = wos_client.monitor_instances.run(monitor_instance_id=explainability_monitor_instance_id).result
monitoring_run_id = monitoring_run_details.metadata.id

print(monitoring_run_details)

{
  "metadata": {
    "id": "228ba673-8edb-4a82-8ccf-4ef5d64e138c",
    "crn": "crn:v1:bluemix:public:aiopenscale:us-south:a/na:00000000-0000-0000-0000-000000000000:run:228ba673-8edb-4a82-8ccf-4ef5d64e138c",
    "url": "/v2/monitor_instances/8d218fff-97b8-4858-b45b-130337c1be72/runs/228ba673-8edb-4a82-8ccf-4ef5d64e138c",
    "created_at": "2024-06-27T15:09:06.109000Z",
    "created_by": "cpadmin"
  },
  "entity": {
    "parameters": {
      "config_modified_at": "2024-06-27T15:08:51.033218Z",
      "config_package_file": "explainability (2).tar.gz",
      "controllable_features": [],
      "explanations_count": {
        "failed": 0,
        "total": 0
      },
      "lime": {
        "perturbations_count": 208
      },
      "validate_table_job_app_id": null,
      "validate_table_job_id": "d31bd960-d06d-4723-b8cd-ccfa5d4788bf",
      "validate_table_job_output_path": "explainability_configuration/d2cd55b4-d1e8-484b-a491-a2a7be9aad41/output/c489084b-42eb-4a96-a5c5-b0f1755618c2",
     

In [50]:
# Check run status

explainability_run_status = None
while explainability_run_status not in ("finished", "error"):
    monitoring_run_details = wos_client.monitor_instances.get_run_details(monitor_instance_id=explainability_monitor_instance_id, monitoring_run_id=monitoring_run_id).result
    explainability_run_status = monitoring_run_details.entity.status.state
    if explainability_run_status not in ("finished", "error"):
        print(datetime.utcnow().strftime("%H:%M:%S"), explainability_run_status)
        time.sleep(60)
        
print(datetime.utcnow().strftime("%H:%M:%S"), explainability_run_status)

15:09:07 running
15:10:07 running
15:11:07 running
15:12:07 running
15:13:07 running
15:14:07 running
15:15:07 finished


In [51]:
# View the global explanation stability metric. When LIME global explanation is enabled, the monitor run computes the global explanation and publishes the global_explanation_stability metric.
# wos_client.monitor_instances.show_metrics(monitor_instance_id=explainability_monitor_instance_id)

### Display sample explanations

In [52]:
explanations = wos_client.monitor_instances.get_all_explaination_tasks(subscription_id=subscription_id).result
print(explanations)

{
  "total_count": 11,
  "limit": 50,
  "offset": 0,
  "explanation_fields": [
    "explanation_task_id",
    "scoring_id",
    "created_at",
    "finished_at",
    "status",
    "prediction",
    "subscription_id",
    "deployment_id",
    "asset_name",
    "deployment_name",
    "probability",
    "explanation_type"
  ],
  "explanation_values": [
    [
      "ec582ee8-8b64-42df-91b1-f85680a7eef0",
      "sc1",
      "2024-06-27T15:10:10.928028Z",
      "2024-06-27T15:10:10.928372Z",
      "finished",
      "No Risk",
      "0f4c5577-ac33-4f23-820e-0a718b0a55c7",
      "ee50f3df-0c4d-4d50-b17c-09c123b5252e",
      "WML_IAE2",
      "WML_IAE2",
      0.8429199,
      "lime"
    ],
    [
      "1c2fa7f7-2328-49dc-8756-8d2dab52b929",
      "sc2",
      "2024-06-27T15:10:10.935234Z",
      "2024-06-27T15:10:10.935448Z",
      "finished",
      "No Risk",
      "0f4c5577-ac33-4f23-820e-0a718b0a55c7",
      "ee50f3df-0c4d-4d50-b17c-09c123b5252e",
      "WML_IAE2",
      "WML_IAE2",
      0.
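
The printed response stores column names in `explanation_fields` and one list per explanation in `explanation_values`. The following is a minimal sketch that pairs them into dictionaries for easier lookup. It assumes the response has already been converted to a plain Python dictionary with the same keys as the printed output; `to_records` is a hypothetical helper, and only the first row shown above is reproduced here.

```python
def to_records(response):
    """Pair each row of explanation_values with the names in explanation_fields."""
    fields = response["explanation_fields"]
    return [dict(zip(fields, row)) for row in response["explanation_values"]]


# First row copied from the printed output above; the remaining rows are omitted.
sample_response = {
    "explanation_fields": [
        "explanation_task_id", "scoring_id", "created_at", "finished_at",
        "status", "prediction", "subscription_id", "deployment_id",
        "asset_name", "deployment_name", "probability", "explanation_type",
    ],
    "explanation_values": [
        [
            "ec582ee8-8b64-42df-91b1-f85680a7eef0", "sc1",
            "2024-06-27T15:10:10.928028Z", "2024-06-27T15:10:10.928372Z",
            "finished", "No Risk",
            "0f4c5577-ac33-4f23-820e-0a718b0a55c7",
            "ee50f3df-0c4d-4d50-b17c-09c123b5252e",
            "WML_IAE2", "WML_IAE2", 0.8429199, "lime",
        ],
    ],
}

records = to_records(sample_response)
print(records[0]["scoring_id"], records[0]["prediction"], records[0]["probability"])
```

Each record can then be filtered by key, for example to keep only the tasks whose `status` is `finished`.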

## Congratulations!

You have finished the batch demo for IBM Watson OpenScale using remote Apache Spark. You can now view the [Watson OpenScale Dashboard](https://url-to-your-cp4d-cluster/aiopenscale). Click the tile for the **German Credit model** to see the quality, drift, and fairness monitors. Click the time series graph to get detailed information on transactions during a specific time window.