<img src="https://github.com/pmservice/ai-openscale-tutorials/raw/master/notebooks/images/banner.png" align="left" alt="banner">

# Tutorial on generating an explanation for a text-based model on Watson OpenScale

This notebook walks through creating a text-based Watson Machine Learning model, creating a subscription, configuring explainability, and generating an explanation for a transaction.

### Contents
- [1. Setup](#setup)
- [2. Creating and deploying a text-based model](#deploy)
- [3. Subscriptions](#subscription)
- [4. Explainability](#explainability)

**Note**: This notebook works with a `Python 3.10.x` kernel and pyspark 3.3.x.

<a id="setup"></a>
## 1. Setup

### 1.1 Install Watson OpenScale and WML packages

In [1]:
!pip install --upgrade ibm-watson-openscale --no-cache --user| tail -n 1

Successfully installed ibm-watson-openscale-3.0.32


In [2]:
!pip install --upgrade ibm-watson-machine-learning --no-cache --user| tail -n 1

Successfully installed ibm-watson-machine-learning-1.0.308


**Note**: Restart the kernel to ensure that the newly installed libraries are used.

### 1.2 Configure credentials

Your Cloud API key can be generated by going to the [**Users** section of the Cloud console](https://cloud.ibm.com/iam#/users). From that page, click your name, scroll down to the **API Keys** section, and click **Create an IBM Cloud API key**. Give your key a name and click **Create**, then copy the created key and paste it below.

**Note**: You can also create the `API_KEY` with the IBM Cloud CLI.

To install the IBM Cloud (formerly Bluemix) CLI, see the [installation instructions](https://console.bluemix.net/docs/cli/reference/ibmcloud/download_cli.html#install_use).

To create an API key with the CLI:
```
bx login --sso
bx iam api-key-create 'my_key'
```

In [1]:
CLOUD_API_KEY = "<your-cloud-api-key>"  # paste your IBM Cloud API key here; do not share a notebook that contains a real key
IAM_URL="https://iam.us-south.bluemix.net/oidc/token"
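
Pasting the key into a cell makes it easy to leak when the notebook is shared. A minimal sketch of one alternative, assuming the key has been exported as an environment variable before the kernel starts. The variable name `CLOUD_API_KEY` and the helper `load_api_key` are illustrative choices, not part of either SDK:

```python
import os


def load_api_key(env_var="CLOUD_API_KEY"):
    """Return the API key stored in `env_var`, or raise if it is unset or empty."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            "Set the {} environment variable before running this notebook.".format(env_var)
        )
    return key


# Usage in the notebook (commented out so the sketch has no side effects):
# CLOUD_API_KEY = load_api_key()
```

Failing early with a clear message is usually easier to debug than an authentication error raised later by the client.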

In [2]:
WML_CREDENTIALS = {
                   "url": "https://us-south.ml.cloud.ibm.com",
                   "apikey": CLOUD_API_KEY
}

In [3]:
import json
from ibm_watson_machine_learning import APIClient

wml_client = APIClient(WML_CREDENTIALS)
wml_client.version

'1.0.297'

In [4]:
wml_client.spaces.list(limit=10)

------------------------------------  -----------------------  ------------------------
ID                                    NAME                     CREATED
bdba7d20-65ed-4e46-8de5-fcbd38f27633  Deployment space Dallas  2023-06-07T13:47:07.341Z
------------------------------------  -----------------------  ------------------------



In [5]:
WML_SPACE_ID='bdba7d20-65ed-4e46-8de5-fcbd38f27633' # use space id here
wml_client.set.default_space(WML_SPACE_ID)

'SUCCESS'

<a id="subscription"></a>
## 3. Subscriptions

### 3.1 Configure Watson OpenScale

In [12]:
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator,BearerTokenAuthenticator

from ibm_watson_openscale import *
from ibm_watson_openscale.supporting_classes.enums import *
from ibm_watson_openscale.supporting_classes import *


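authenticator = IAMAuthenticator(apikey=CLOUD_API_KEY)
#authenticator = BearerTokenAuthenticator(bearer_token=IAM_TOKEN) ## uncomment this line if using IAM token to authenticate
# APIClient now refers to the OpenScale client: the wildcard import above shadows the WML APIClient imported earlier.
wos_client = APIClient(authenticator=authenticator)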
wos_client.version

'3.0.31'

**Note**: If the cell above fails on the first run, re-run it.

In [7]:
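# To use an external PostgreSQL database, fill in DB_CREDENTIALS and SCHEMA_NAME.
# Leave both as None to use the internal database.
#DB_CREDENTIALS= {"hostname":"","username":"","password":"","database":"","port":"","ssl":True,"sslmode":"","certificate_base64":""}
DB_CREDENTIALS = None
SCHEMA_NAME = None
KEEP_MY_INTERNAL_POSTGRES = True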

In [16]:
data_marts = wos_client.data_marts.list().result.data_marts
if len(data_marts) == 0:
    if DB_CREDENTIALS is not None:
        if SCHEMA_NAME is None: 
            # Stop here instead of creating a data mart with a missing schema name.
            raise ValueError("Please specify the SCHEMA_NAME and rerun the cell")

        print('Setting up external datamart')
        added_data_mart_result = wos_client.data_marts.add(
                background_mode=False,
                name="WOS Data Mart",
                description="Data Mart created by WOS tutorial notebook",
                database_configuration=DatabaseConfigurationRequest(
                  database_type=DatabaseType.POSTGRESQL,
                    credentials=PrimaryStorageCredentialsLong(
                        hostname=DB_CREDENTIALS['hostname'],
                        username=DB_CREDENTIALS['username'],
                        password=DB_CREDENTIALS['password'],
                        db=DB_CREDENTIALS['database'],
                        port=DB_CREDENTIALS['port'],
                        ssl=True,
                        sslmode=DB_CREDENTIALS['sslmode'],
                        certificate_base64=DB_CREDENTIALS['certificate_base64']
                    ),
                    location=LocationSchemaName(
                        schema_name= SCHEMA_NAME
                    )
                )
             ).result
    else:
        print('Setting up internal datamart')
        added_data_mart_result = wos_client.data_marts.add(
                background_mode=False,
                name="WOS Data Mart",
                description="Data Mart created by WOS tutorial notebook", 
                internal_database = True).result
        
    data_mart_id = added_data_mart_result.metadata.id
    
else:
    data_mart_id=data_marts[0].metadata.id
    print('Using existing datamart {}'.format(data_mart_id))

Using existing datamart e25969b8-316a-4515-b5ea-5895bbbd2c55


In [19]:
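SERVICE_PROVIDER_NAME = "Service"  # name used to find and replace an existing service provider
SERVICE_PROVIDER_DESCRIPTION = "Give a name here"  # free-text description; replace with your own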

In [20]:
service_providers = wos_client.service_providers.list().result.service_providers
for service_provider in service_providers:
    service_instance_name = service_provider.entity.name
    if service_instance_name == SERVICE_PROVIDER_NAME:
        service_provider_id = service_provider.metadata.id
        wos_client.service_providers.delete(service_provider_id)
        print("Deleted existing service_provider for WML instance: {}".format(service_provider_id))

Deleted existing service_provider for WML instance: 6bf5e4d9-5325-4884-8b0a-89563beb2706


In [21]:
added_service_provider_result = wos_client.service_providers.add(
        name=SERVICE_PROVIDER_NAME,
        description=SERVICE_PROVIDER_DESCRIPTION,
        service_type=ServiceTypes.WATSON_MACHINE_LEARNING,
        deployment_space_id = WML_SPACE_ID,
        operational_space_id = "production",
        credentials=WMLCredentialsCloud(
            apikey=CLOUD_API_KEY,      ## use `apikey=IAM_TOKEN` if using IAM_TOKEN to initiate client
            url=WML_CREDENTIALS["url"],
            instance_id=None
        ),
        background_mode=False
    ).result
service_provider_id = added_service_provider_result.metadata.id

 Waiting for end of adding service provider 14a2e726-75f2-4e50-9898-c9619a190c7d 

active

-----------------------------------------------
 Successfully finished adding service provider 
-----------------------------------------------

In [23]:
asset_deployment_details_list = wos_client.service_providers.list_assets(
    data_mart_id=data_mart_id,
    service_provider_id=service_provider_id,
    deployment_space_id=WML_SPACE_ID
).result['resources']
DEPLOYMENT_NAME = 'IMDB Classification Model on parsed data -DSB deployment'  # use your deployment name here
asset_deployment_details = [asset for asset in asset_deployment_details_list if asset['entity']['name'] == DEPLOYMENT_NAME]

if len(asset_deployment_details)>0:
    [asset_deployment_details] = asset_deployment_details
else:
    raise ValueError('deployment with name "{}" not found.'.format(DEPLOYMENT_NAME))
asset_deployment_details

{'metadata': {'guid': 'f0a360d3-e85b-4a34-9ff3-b33f1a84736d',
  'url': 'https://us-south.ml.cloud.ibm.com/ml/v4/deployments/f0a360d3-e85b-4a34-9ff3-b33f1a84736d?space_id=bdba7d20-65ed-4e46-8de5-fcbd38f27633',
  'created_at': '2023-06-11T11:54:10.282Z',
  'modified_at': '2023-06-11T11:54:10.282Z'},
 'entity': {'name': 'IMDB Classification Model on parsed data -DSB deployment',
  'type': 'online',
  'scoring_endpoint': {'url': 'https://us-south.ml.cloud.ibm.com/ml/v4/deployments/f0a360d3-e85b-4a34-9ff3-b33f1a84736d/predictions'},
  'asset': {},
  'asset_properties': {}}}

In [24]:
# Read the deployment ID from the asset found above instead of hard-coding it.
deployment_uid = asset_deployment_details['metadata']['guid']
model_asset_details_from_deployment = wos_client.service_providers.get_deployment_asset(
    data_mart_id=data_mart_id,
    service_provider_id=service_provider_id,
    deployment_id=deployment_uid,
    deployment_space_id=WML_SPACE_ID
)
model_asset_details_from_deployment

{'metadata': {'guid': 'f0a360d3-e85b-4a34-9ff3-b33f1a84736d',
  'url': 'https://us-south.ml.cloud.ibm.com/ml/v4/deployments/f0a360d3-e85b-4a34-9ff3-b33f1a84736d?space_id=bdba7d20-65ed-4e46-8de5-fcbd38f27633',
  'created_at': '2023-06-11T11:54:10.282Z',
  'modified_at': '2023-06-11T11:54:10.282Z'},
 'entity': {'name': 'IMDB Classification Model on parsed data -DSB deployment',
  'type': 'online',
  'scoring_endpoint': {'url': 'https://us-south.ml.cloud.ibm.com/ml/v4/deployments/f0a360d3-e85b-4a34-9ff3-b33f1a84736d/predictions'},
  'asset': {'asset_id': '92110fb9-18ae-48ad-8ce2-23ba06d7fb4d',
   'url': 'https://us-south.ml.cloud.ibm.com/ml/v4/models/92110fb9-18ae-48ad-8ce2-23ba06d7fb4d?space_id=bdba7d20-65ed-4e46-8de5-fcbd38f27633&version=2020-06-12',
   'name': 'IMDB Classification Model on parsed data -DSB',
   'asset_type': 'model',
   'created_at': '2023-06-11T11:53:44.563Z',
   'modified_at': '2023-06-11T11:53:45.244Z',
   'problem_type': 'binary',
   'input_data_type': 'structured'
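
The deployment ID, scoring URL, and model ID that the following cells rely on all appear in this response. A minimal sketch of reading them out instead of copying them by hand, assuming the response is a plain dict shaped like the output above. The helper name `extract_deployment_ids` and the sample values are hypothetical:

```python
def extract_deployment_ids(details):
    """Pull the deployment ID, scoring URL, and model ID out of a deployment-asset dict."""
    entity = details["entity"]
    return {
        "deployment_id": details["metadata"]["guid"],
        "scoring_url": entity["scoring_endpoint"]["url"],
        # The asset block can be empty (as in the list_assets output), so use .get().
        "model_id": entity.get("asset", {}).get("asset_id"),
    }


# Sample shaped like the output above, with placeholder values.
sample = {
    "metadata": {"guid": "dep-123"},
    "entity": {
        "scoring_endpoint": {"url": "https://example.com/ml/v4/deployments/dep-123/predictions"},
        "asset": {"asset_id": "model-456"},
    },
}
ids = extract_deployment_ids(sample)
print(ids["deployment_id"], ids["model_id"])
```

Deriving all three values from one response keeps them consistent with each other, which matters because the subscription ties the model, the deployment, and the scoring endpoint together.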

### 3.2 Subscribe the asset

In [25]:
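# Read the model ID from the deployment asset details instead of hard-coding it.
model_uid = model_asset_details_from_deployment["entity"]["asset"]["asset_id"]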
subscriptions = wos_client.subscriptions.list().result.subscriptions
for subscription in subscriptions:
    sub_model_id = subscription.entity.asset.asset_id
    if sub_model_id == model_uid:
        wos_client.subscriptions.delete(subscription.metadata.id)
        print('Deleted existing subscription for model', sub_model_id)

In [26]:
from ibm_watson_openscale.base_classes.watson_open_scale_v2 import ScoringEndpointRequest

In [27]:
# Use the scoring endpoint of the deployment being subscribed, so the URL and the deployment ID stay consistent.
scoring_url = asset_deployment_details['entity']['scoring_endpoint']['url']
subscription_details = wos_client.subscriptions.add(
        data_mart_id=data_mart_id,
        service_provider_id=service_provider_id,
        asset=Asset(
            asset_id=model_asset_details_from_deployment["entity"]["asset"]["asset_id"],
            name=model_asset_details_from_deployment["entity"]["asset"]["name"],
            url=model_asset_details_from_deployment["entity"]["asset"]["url"],
            asset_type=AssetTypes.MODEL,
            input_data_type=InputDataType.UNSTRUCTURED_TEXT,
            problem_type=ProblemType.BINARY_CLASSIFICATION
        ),
        deployment=AssetDeploymentRequest(
            deployment_id=asset_deployment_details['metadata']['guid'],
            name=asset_deployment_details['entity']['name'],
            deployment_type= DeploymentTypes.ONLINE,
            url=asset_deployment_details['metadata']['url'],
            scoring_endpoint=ScoringEndpointRequest(url=scoring_url) # scoring model without shadow deployment
        ),
        asset_properties=AssetPropertiesRequest(
            label_column='label',
            probability_fields=['probability'],
            prediction_field='predictionLabel',
            feature_fields = ["texts_norm"],
            categorical_fields = ["texts_norm"],
            training_data_schema=SparkStruct.from_dict(model_asset_details_from_deployment["entity"]["asset_properties"]["training_data_schema"])
        )
    ).result
subscription_id = subscription_details.metadata.id
subscription_id

'ab642a86-31c5-4df1-8650-e024344c8a1a'

In [28]:
import time

time.sleep(5)  # give OpenScale a moment to create the payload logging data set
payload_data_sets = wos_client.data_sets.list(type=DataSetTypes.PAYLOAD_LOGGING, 
                                              target_target_id=subscription_id, 
                                              target_target_type=TargetTypes.SUBSCRIPTION).result.data_sets
# Guard against an empty list; indexing [0] directly would raise an IndexError.
payload_data_set_id = payload_data_sets[0].metadata.id if payload_data_sets else None
if payload_data_set_id is None:
    print("Payload data set not found. Please check subscription status.")
else:
    print("Payload data set id: ", payload_data_set_id)

Payload data set id:  164af271-2310-4561-89f1-31a0695271aa


### 3.3 Get subscription

In [18]:
wos_client.subscriptions.show()

0,1,2,3,4,5,6,7,8
92110fb9-18ae-48ad-8ce2-23ba06d7fb4d,IMDB Classification Model on parsed data -DSB,e25969b8-316a-4515-b5ea-5895bbbd2c55,f0a360d3-e85b-4a34-9ff3-b33f1a84736d,IMDB Classification Model on parsed data -DSB deployment,6bf5e4d9-5325-4884-8b0a-89563beb2706,active,2023-06-11 11:58:53.967000+00:00,5bd5f165-068a-4f41-964f-e0cc31f8e24f
ac6ac79d-4ad0-4058-ada1-7e941c343b69,P16,e25969b8-316a-4515-b5ea-5895bbbd2c55,7ec4e10b-b84f-4fe0-a8dc-f7b56c5d4ed0,H7 Prediction model,c8cfedcd-09f3-4d2e-ab59-2803138229f9,active,2023-06-09 20:31:23.542000+00:00,13c2cf68-6b36-4e0a-961e-7584c8f819cd
ec884afa-24c1-4aed-9967-2d53ab5af161,DKaluzaSpamDetector,e25969b8-316a-4515-b5ea-5895bbbd2c55,ce4c59b4-6877-44f4-94dc-5b6b8c961e87,DKaluzaSpamDetector deployment,6f8a1561-19e0-4cfb-81c5-aff23195f126,active,2023-05-24 08:36:42.214000+00:00,2756a69d-a3f4-4474-9f53-2c9130c40bbb
8678081b-79a4-4b22-94c8-04c80dce9245,Text Binary Classifier,e25969b8-316a-4515-b5ea-5895bbbd2c55,34e05d06-f82a-4ba0-9dda-07aef2ecc53a,Text Binary Classifier deployment,7c1db92e-dd74-4c2f-a33b-5f3ac59c799c,active,2023-05-05 08:35:17.385000+00:00,02152349-cb27-4c4d-8ae3-5b3e06d71fe6


In [29]:
wos_client.subscriptions.get(subscription_id).result.to_dict()

{'metadata': {'id': 'ab642a86-31c5-4df1-8650-e024344c8a1a',
  'crn': 'crn:v1:bluemix:public:aiopenscale:us-south:a/9f662e92df19cc1abadb5782b2f5c041:e25969b8-316a-4515-b5ea-5895bbbd2c55:subscription:ab642a86-31c5-4df1-8650-e024344c8a1a',
  'url': '/v2/subscriptions/ab642a86-31c5-4df1-8650-e024344c8a1a',
  'created_at': '2023-06-13T18:52:05.938000Z',
  'created_by': 'IBMid-664004GJS4'},
 'entity': {'data_mart_id': 'e25969b8-316a-4515-b5ea-5895bbbd2c55',
  'service_provider_id': '14a2e726-75f2-4e50-9898-c9619a190c7d',
  'asset': {'asset_id': '92110fb9-18ae-48ad-8ce2-23ba06d7fb4d',
   'url': 'https://us-south.ml.cloud.ibm.com/ml/v4/models/92110fb9-18ae-48ad-8ce2-23ba06d7fb4d?space_id=bdba7d20-65ed-4e46-8de5-fcbd38f27633&version=2020-06-12',
   'name': 'IMDB Classification Model on parsed data -DSB',
   'asset_type': 'model',
   'problem_type': 'binary',
   'input_data_type': 'unstructured_text'},
  'asset_properties': {'training_data_schema': {'type': 'struct',
    'fields': [{'metadata': 

### 3.4 Score the model and get the transaction ID

In [20]:
wml_client.set.default_space(WML_SPACE_ID)

'SUCCESS'

In [30]:
text = "as in amelie, recent french films seem to be taking a stereotypical male-female relationship slant, centered on a female finding her one true love. in this case, desperation leads to a convict, which leads to her evolution into a mob prototype. clever and surprising story in many ways, except that the female is there to support the male.<br /><br />for those of us that don't speak french, the subtitles are a little quick, but not unreasonable.<br /><br />the soundtrack, as seems to be increasingly the case with european films, is great and in perfect sync with the film's variations. nothing seems forced. visually, it reminds me of various urban horror movies. there's a wes craven in chicago feel to it."
payload = {"input_data": [{"fields": ['texts_norm'], "values": [[text]]}]}

response = wml_client.deployments.score(deployment_uid, payload)
print(response)

{'predictions': [{'fields': ['prediction', 'probability'], 'values': [[1, [0.1532694852830592, 0.8467305147169407]]]}]}
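
The scoring response nests field names and values in parallel lists. A minimal sketch of flattening it into one dict per input row, using the response printed above. The helper name `parse_predictions` is hypothetical, and reading the confidence as `probability[label]` assumes the probability list is ordered by class index:

```python
def parse_predictions(response):
    """Convert a scoring response into a list of {field: value} dicts, one per input row."""
    rows = []
    for block in response.get("predictions", []):
        fields = block["fields"]
        for values in block["values"]:
            rows.append(dict(zip(fields, values)))
    return rows


# Response copied from the cell output above.
response = {'predictions': [{'fields': ['prediction', 'probability'],
                             'values': [[1, [0.1532694852830592, 0.8467305147169407]]]}]}
for row in parse_predictions(response):
    label = row["prediction"]
    # Assumption: the probability list is indexed by class label.
    confidence = row["probability"][label]
    print("label={} confidence={:.3f}".format(label, confidence))
```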


In [31]:
wos_client.data_sets.get_records_count(payload_data_set_id)

0
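
A count of 0 right after scoring suggests that the payload record has not been logged yet, and the explanation step needs at least one record. A minimal sketch of a polling helper that waits for a condition to become true. The helper name `wait_until` and its default timings are illustrative assumptions, not part of the OpenScale SDK:

```python
import time


def wait_until(predicate, timeout=60.0, interval=5.0, sleep=time.sleep):
    """Call `predicate` every `interval` seconds until it returns True or `timeout` elapses.

    Returns True if the predicate succeeded, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)


# Usage in the notebook (commented out; needs a live OpenScale client):
# wait_until(lambda: wos_client.data_sets.get_records_count(payload_data_set_id) > 0)
```

The `sleep` parameter exists only so the helper can be exercised without real delays.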

<a id="explainability"></a>
## 4. Explainability

### 4.1 Configure Explainability

In [34]:
target = Target(
    target_type=TargetTypes.SUBSCRIPTION,
    target_id=subscription_id
)
parameters = {
    "enabled": True
}
explainability_details = wos_client.monitor_instances.create(
    data_mart_id=data_mart_id,
    background_mode=False,
    monitor_definition_id=wos_client.monitor_definitions.MONITORS.EXPLAINABILITY.ID,
    target=target,
    parameters=parameters
).result

explainability_monitor_id = explainability_details.metadata.id

 Waiting for end of monitor instance creation 52c03538-a10f-41cb-8038-3d65943cb57e 

active

---------------------------------------
 Monitor instance successfully created 
---------------------------------------

### 4.2 Get explanation for the transaction

In [35]:
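pl_records_resp = wos_client.data_sets.get_list_of_records(data_set_id=payload_data_set_id, limit=1, offset=0).result
# The payload table can still be empty shortly after scoring; fail with a clear message in that case.
if not pl_records_resp["records"]:
    raise ValueError("No payload records found yet. Wait for payload logging to complete and rerun this cell.")
scoring_ids = [pl_records_resp["records"][0]["entity"]["values"]["scoring_id"]]
print("Running explanations on scoring IDs: {}".format(scoring_ids))
explanation_types = ["lime"]
result = wos_client.monitor_instances.explanation_tasks(
    scoring_ids=scoring_ids,
    explanation_types=explanation_types,
    subscription_id=subscription_id
).result
print(result)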

Running explanations on scoring IDs: ['b529edcb37fde99552b0155049e378e7-1']
{
  "metadata": {
    "explanation_task_ids": [
      "a94887c9-4c1d-41b9-9293-b8226d7a3682"
    ],
    "created_by": "IBMid-664004GJS4",
    "created_at": "2023-06-13T18:53:14.848899Z"
  }
}


In [36]:
explanation_task_id = result.to_dict()['metadata']['explanation_task_ids'][0]
# The task runs asynchronously; rerun this cell until the state is no longer 'in_progress'.
wos_client.monitor_instances.get_explanation_tasks(
    explanation_task_id=explanation_task_id,
    subscription_id=subscription_id
).result.to_dict()

{'metadata': {'explanation_task_id': 'a94887c9-4c1d-41b9-9293-b8226d7a3682',
  'created_by': 'IBMid-664004GJS4',
  'created_at': '2023-06-13T18:53:14.848899Z'},
 'entity': {'status': {'state': 'in_progress'},
  'asset': {'id': '92110fb9-18ae-48ad-8ce2-23ba06d7fb4d',
   'name': 'IMDB Classification Model on parsed data -DSB',
   'input_data_type': 'unstructured_text',
   'problem_type': 'binary',
   'deployment': {'id': 'f0a360d3-e85b-4a34-9ff3-b33f1a84736d',
    'name': 'IMDB Classification Model on parsed data -DSB deployment'}},
  'scoring_id': 'b529edcb37fde99552b0155049e378e7-1'}}
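
The task above is still `in_progress`, so the explanation itself is not in the response yet. A minimal sketch of polling until the state changes, assuming the task dict keeps the shape shown above (`entity` → `status` → `state`). The helper name `wait_for_explanation` and its timings are hypothetical, and treating every state other than `in_progress` as terminal is an assumption:

```python
import time


def wait_for_explanation(fetch_task, timeout=300.0, interval=10.0, sleep=time.sleep):
    """Poll `fetch_task()` until the task leaves the 'in_progress' state or the timeout expires.

    `fetch_task` must return a dict with the state stored under entity -> status -> state.
    """
    deadline = time.monotonic() + timeout
    while True:
        task = fetch_task()
        state = task["entity"]["status"]["state"]
        # Assumption: any state other than 'in_progress' means the task has ended.
        if state != "in_progress":
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError("Explanation task still in progress after {}s".format(timeout))
        sleep(interval)


# Usage in the notebook (commented out; needs a live OpenScale client):
# explanation = wait_for_explanation(
#     lambda: wos_client.monitor_instances.get_explanation_tasks(
#         explanation_task_id=explanation_task_id,
#         subscription_id=subscription_id
#     ).result.to_dict()
# )
```

Check the returned state before reading the explanation, since the task may also have ended in a failure state.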