<img src="https://github.com/pmservice/ai-openscale-tutorials/raw/master/notebooks/images/banner.png" align="left" alt="banner">

# Tutorial on generating an explanation for a text-based model on Watson OpenScale

This notebook walks through creating and deploying a text-based Watson Machine Learning model, subscribing it to Watson OpenScale, configuring explainability, and generating an explanation for a scored transaction.

### Contents
- [1. Setup](#setup)
- [2. Creating and deploying a text-based model](#deploy)
- [3. Subscriptions](#subscription)
- [4. Explainability](#explainability)

**Note**: This notebook works correctly with kernel `Python 3.10.x` with pyspark 3.3.x.

<a id="setup"></a>
## 1. Setup

### 1.1 Install Watson OpenScale and WML packages

In [None]:
!pip install --upgrade ibm-watson-openscale --no-cache --user| tail -n 1

In [None]:
!pip install --upgrade ibm-watson-machine-learning --no-cache | tail -n 1

**Note**: Restart the kernel to ensure the newly installed libraries are used.

### 1.2 Configure credentials

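- WOS_CREDENTIALS (CP4D)
- WML_CREDENTIALS (CP4D)
- DB_CREDENTIALS (optional; credentials for an external data mart database, set in section 3.1. Leave it as `None` to use the internal database)
- SCHEMA_NAME (needed only when DB_CREDENTIALS is set)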

In [1]:
WOS_CREDENTIALS = {
    "url": "***",
    "username": "***",
    "password": "***"
}

In [2]:
WML_CREDENTIALS = {
                   "url": "***",
                   "username": "***",
                   "password" : "***",
                   "instance_id": "wml_local",
                   "version" : "4.6" #If your env is CP4D 4.x.x then specify "4.x.x" instead of "4.6"
                  }
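
Hard-coding passwords in a notebook makes it easy to leak them when the notebook is shared. As a minimal sketch, the credential dictionaries above could be built from environment variables instead. The helper name `load_credentials` and the variable names (`WOS_URL`, `WOS_USERNAME`, `WOS_PASSWORD`) are assumptions made for this example, not something the SDKs require.

```python
import os


def load_credentials(prefix, env=None):
    """Build a credentials dict from <PREFIX>_URL, <PREFIX>_USERNAME and <PREFIX>_PASSWORD.

    The variable naming scheme is hypothetical; adapt it to your environment.
    """
    env = os.environ if env is None else env
    creds = {}
    missing = []
    for key in ("url", "username", "password"):
        name = "{}_{}".format(prefix, key.upper())
        value = env.get(name)
        if value:
            creds[key] = value
        else:
            missing.append(name)
    if missing:
        raise KeyError("Missing environment variables: {}".format(", ".join(missing)))
    return creds


# Example with an explicit mapping instead of the real environment
example_env = {
    "WOS_URL": "https://cpd.example.com",
    "WOS_USERNAME": "admin",
    "WOS_PASSWORD": "secret",
}
print(load_credentials("WOS", env=example_env))
```

With real environment variables set, `WOS_CREDENTIALS = load_credentials("WOS")` would replace the literal dictionary. `WML_CREDENTIALS` additionally needs the `instance_id` and `version` keys shown above.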

## 2. Creating and deploying a text-based model <a id="deploy"></a>

The dataset used is the UCI-ML SMS Spam Collection Dataset which can be found here: https://archive.ics.uci.edu/ml/machine-learning-databases/00228/. It is a binary classification dataset with the labels being 'ham' and 'spam'.

### 2.1 Loading the training data

In [3]:
!rm -rf SMSSpam.csv
!wget 'https://raw.githubusercontent.com/IBM/watson-openscale-samples/main/IBM%20Cloud/WML/assets/data/spam_detection/SMSSpam.csv'

--2024-08-07 12:20:05--  https://raw.githubusercontent.com/IBM/watson-openscale-samples/main/IBM%20Cloud/WML/assets/data/spam_detection/SMSSpam.csv
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.109.133, 185.199.108.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 480803 (470K) [text/plain]
Saving to: ‘SMSSpam.csv’


2024-08-07 12:20:05 (7.60 MB/s) - ‘SMSSpam.csv’ saved [480803/480803]



In [4]:
# Alternative: build 'SMSSpam.csv' from the original UCI archive instead of the public link used above.

# !pip install pandas
# !rm -f smsspamcollection.zip
# !wget https://archive.ics.uci.edu/ml/machine-learning-databases/00228/smsspamcollection.zip
# !unzip smsspamcollection.zip
# import pandas as pd
# The archive contains two files, so read the extracted 'SMSSpamCollection' file rather than the zip itself.
# pd.read_csv("SMSSpamCollection", sep="\t", header=None, encoding="utf-8").to_csv("SMSSpam.csv", header=["label", "text"], sep=",", index=False)

# !rm SMSSpamCollection
# !rm readme
# !rm smsspamcollection.zip
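
The commented-out cell above converts the tab-separated UCI file into a CSV with a `label,text` header. The same conversion can be sketched with the standard library only, which also shows why Spark is later told to use `escape='"'`: messages containing quotes or commas are written with doubled quotes. The function name `tsv_to_csv` and the sample lines are made up for illustration.

```python
import csv
import io


def tsv_to_csv(tsv_lines, out_file):
    """Write 'label<TAB>text' lines as CSV rows under a 'label,text' header."""
    writer = csv.writer(out_file, lineterminator="\n")
    writer.writerow(["label", "text"])
    rows = 0
    for line in tsv_lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip empty lines
        label, text = line.split("\t", 1)
        writer.writerow([label, text])
        rows += 1
    return rows


# Illustrative input, not taken from the real data set
sample = ["ham\tOk lar... Joking wif u oni...\n", 'spam\tWin "cash", now\n']
buffer = io.StringIO()
print(tsv_to_csv(sample, buffer))
print(buffer.getvalue())
```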

### 2.2 Creating a model

**Note**: Skip the pyspark install step below if you are using a Spark kernel on Watson Studio.

In [None]:
!pip install --upgrade pyspark==3.3.0

**Note**: When running this notebook locally, if the `SparkSession` import below fails, set the `SPARK_HOME` environment variable to the path of your `pyspark` installation.

In [5]:
import pandas as pd
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.read.csv(path="SMSSpam.csv", header=True, multiLine=True, escape='"')
df.show(5, truncate = False)

24/08/07 12:20:12 WARN Utils: Your hostname, Nelwins-MacBook-Pro.local resolves to a loopback address: 127.0.0.1; using 192.168.0.103 instead (on interface en0)
24/08/07 12:20:12 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
24/08/07 12:20:13 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
24/08/07 12:20:13 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.


+-----+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
|label|text                                                                                                                                                       |
+-----+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
|ham  |Go until jurong point, crazy.. Available only in bugis n great world la e buffet... Cine there got amore wat...                                            |
|ham  |Ok lar... Joking wif u oni...                                                                                                                              |
|spam |Free entry in 2 a wkly comp to win FA Cup final tkts 21st May 2005. Text FA to 87121 to receive entry question(std txt rate)T&C's apply 08452810075over18's|
|ham  |U dun say

In [6]:
train_df, test_df = df.randomSplit([0.8, 0.2], seed=12345)
print("Total count of data set: {}".format(df.count()))
print("Total count of training data set: {}".format(train_df.count()))
print("Total count of test data set: {}".format(test_df.count()))

Total count of data set: 5572
Total count of training data set: 4454
Total count of test data set: 1118
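
The counts above are close to, but not exactly, 80% and 20% of 5572. That is expected, because the split is made by random sampling rather than by cutting the data at a fixed position. The sketch below illustrates the idea in plain Python; it is a simplified stand-in, not Spark's implementation, and `bernoulli_split` is a name invented for this example.

```python
import random


def bernoulli_split(rows, train_fraction, seed):
    """Assign each row to train or test independently with the given probability."""
    rng = random.Random(seed)
    train, test = [], []
    for row in rows:
        if rng.random() < train_fraction:
            train.append(row)
        else:
            test.append(row)
    return train, test


rows = list(range(5572))
train, test = bernoulli_split(rows, 0.8, seed=12345)
# The sizes are approximately, not exactly, 80% / 20%
print(len(train), len(test))
```

Fixing the seed makes the split reproducible, which is why the notebook passes `seed=12345`.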


In [7]:
!pip install nltk
from pyspark.ml.feature import StringIndexer, IndexToString, CountVectorizer, Tokenizer, IDF, StopWordsRemover
from pyspark.ml.classification import GBTClassifier
from pyspark.ml.evaluation import BinaryClassificationEvaluator
from pyspark.ml import Pipeline, Model
import nltk
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords

nltk.download('punkt')
nltk.download('stopwords')
stop_words = list(set(stopwords.words('english')))

stringIndexer_label = StringIndexer(inputCol="label", outputCol="label_ix").fit(df)
tokenizer = Tokenizer(inputCol="text", outputCol="words")
stopword_remover = StopWordsRemover(inputCol="words", outputCol="filtered_words").setStopWords(stop_words)
count = CountVectorizer(inputCol="filtered_words", outputCol="rawFeatures")
idf = IDF(inputCol="rawFeatures", outputCol="features")
nb = GBTClassifier(labelCol="label_ix")
labelConverter = IndexToString(inputCol="prediction", outputCol="predictionLabel", labels=stringIndexer_label.labels)

[nltk_data] Downloading package punkt to /Users/nelwin/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to
[nltk_data]     /Users/nelwin/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


24/08/07 12:20:24 WARN GarbageCollectionMetrics: To enable non-built-in garbage collector(s) List(G1 Concurrent GC), users should configure it(them) to spark.eventLog.gcMetrics.youngGenerationGarbageCollectors or spark.eventLog.gcMetrics.oldGenerationGarbageCollectors
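
The feature stages defined above turn each message into a weighted bag of words: lowercase and split on whitespace, drop stop words, count terms, then scale the counts by inverse document frequency. The following is a rough plain-Python sketch of that chain, assuming the smoothed IDF formula `log((m + 1) / (df + 1))` described in the Spark MLlib documentation. The function names are invented here, and the code only approximates the Spark stages (for example, it ignores vocabulary-size limits).

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace; punctuation stays attached to words."""
    return text.lower().split()


def remove_stop_words(tokens, stop_words):
    """Drop tokens that appear in the stop-word set."""
    return [t for t in tokens if t not in stop_words]


def fit_idf(docs):
    """Compute log((m + 1) / (df + 1)) for every term, where m is the number of documents."""
    m = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each term once per document
    return {term: math.log((m + 1) / (count + 1)) for term, count in df.items()}


def tf_idf(tokens, idf):
    """Scale raw term counts by the fitted IDF weights; unseen terms are skipped."""
    tf = Counter(tokens)
    return {term: count * idf[term] for term, count in tf.items() if term in idf}


stop_words = {"to", "and", "from"}
docs = [remove_stop_words(tokenize(t), stop_words)
        for t in ["SIX chances to win CASH!", "Ok lar... Joking wif u oni...", "win a prize"]]
idf = fit_idf(docs)
print(tf_idf(docs[0], idf))
```

Terms that occur in many messages get a weight close to zero, so rarer words dominate the feature vector that the classifier sees.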


In [8]:
pipeline = Pipeline(stages=[stringIndexer_label, tokenizer, stopword_remover, count, idf, nb, labelConverter])
model = pipeline.fit(train_df)
predictions = model.transform(test_df)
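# Note: rawPredictionCol="prediction" feeds the hard 0/1 predictions to the evaluator, so the AUC reflects a single threshold.
# Pass rawPredictionCol="rawPrediction" instead to evaluate the ranking produced by the raw scores.
evaluator = BinaryClassificationEvaluator(labelCol="label_ix", rawPredictionCol="prediction", metricName="areaUnderROC")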
auc = evaluator.evaluate(predictions)

print("Area under ROC curve = %g" % auc)

                                                                                

Area under ROC curve = 0.856224


24/08/07 12:20:45 WARN InstanceBuilder: Failed to load implementation from:dev.ludovic.netlib.blas.JNIBLAS
24/08/07 12:20:45 WARN InstanceBuilder: Failed to load implementation from:dev.ludovic.netlib.blas.VectorBLAS
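
The evaluator cell passes the `prediction` column (hard 0/1 labels) as `rawPredictionCol`. The area under the ROC curve is then computed from a single operating point, which generally differs from the AUC computed on continuous scores. The sketch below shows the difference on made-up data, using the pairwise (Mann-Whitney) definition of AUC with ties counted as one half. The function name and the numbers are illustrative only.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]                 # continuous model scores
hard = [1 if s >= 0.5 else 0 for s in scores]  # thresholded predictions

print(auc(labels, scores))  # 0.75
print(auc(labels, hard))    # 0.5
```

With hard predictions the value reduces to `(TPR + TNR) / 2`, so it says nothing about how well the model ranks examples on either side of the threshold.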


In [9]:
import json
from ibm_watson_machine_learning import APIClient

wml_client = APIClient(WML_CREDENTIALS)
wml_client.version

'1.0.360'

In [10]:
wml_client.spaces.list(limit=10)

Unnamed: 0,ID,NAME,CREATED
0,7e5a8be6-9103-4c22-9c43-b66f3d8364de,poojitha_notebooks_space,2024-06-29T13:53:46.645Z
1,16ccd855-46bd-43ed-8219-5f00ac565d08,shreya-space,2024-06-26T04:29:17.302Z
2,bc3b9797-c509-4fb4-a424-f67b1e2ed4be,QUALITY_WMLV4_PREPROD,2024-06-23T12:23:04.790Z
3,e396e187-2977-47b4-ade3-1539f9f10adc,QUALITY_WMLV4_PROD,2024-06-23T12:22:54.422Z
4,40c4d032-0339-4da6-bfec-4bdb096c9650,shreya,2024-06-20T10:54:20.088Z
5,088c142e-f35e-4e48-a30c-ad55a6edeecc,notebooks 5.0,2024-06-13T04:42:07.336Z
6,b9b3d3b4-6e26-4e16-807d-e8bf5e7d6984,MRM_WMLV4_PREPROD,2024-06-12T15:49:26.571Z
7,d22e2b6b-917c-4427-a40c-1a439352a742,MRM_WMLV4_PROD,2024-06-12T15:49:16.185Z
8,ce15e0f6-be30-4349-af47-35ae15983bf1,openscale-express-path-preprod-00000000-0000-0...,2024-06-04T05:18:51.988Z
9,6264dc0e-087a-4dea-bcbc-6bd872b510fb,openscale-express-path-00000000-0000-0000-0000...,2024-06-04T05:18:30.811Z


In [11]:
WML_SPACE_ID='***' # use space id here
wml_client.set.default_space(WML_SPACE_ID)

'SUCCESS'

In [12]:
MODEL_NAME = "Text Binary Classifier"

In [13]:
software_spec_uid = wml_client.software_specifications.get_id_by_name("spark-mllib_3.3")
print("Software Specification ID: {}".format(software_spec_uid))
model_props = {
        # Use the public ModelMetaNames accessor rather than the private `_models` attribute
        wml_client.repository.ModelMetaNames.NAME: "{}".format(MODEL_NAME),
        wml_client.repository.ModelMetaNames.TYPE: "mllib_3.3",
        wml_client.repository.ModelMetaNames.SOFTWARE_SPEC_UID: software_spec_uid,
        wml_client.repository.ModelMetaNames.LABEL_FIELD: "label",
    }

Software Specification ID: d11f2434-4fc7-58b7-8a62-755da64fdaf8


In [14]:
print("Storing model ...")
published_model_details = wml_client.repository.store_model(
    model=model, 
    meta_props=model_props, 
    training_data=train_df, 
    pipeline=pipeline)

model_uid = wml_client.repository.get_model_id(published_model_details)
print("Done")
print("Model ID: {}".format(model_uid))

Storing model ...


                                                                                

Done
Model ID: 20516963-37ff-4769-a41f-b509827f3076


### 2.3 Deploying the model

In [15]:
deployment_details = wml_client.deployments.create(
    model_uid, 
    meta_props={
        wml_client.deployments.ConfigurationMetaNames.NAME: "{}".format(MODEL_NAME + " deployment"),
        wml_client.deployments.ConfigurationMetaNames.ONLINE: {}
    }
)
scoring_url = wml_client.deployments.get_scoring_href(deployment_details)
deployment_uid=wml_client.deployments.get_id(deployment_details)

print("Scoring URL:" + scoring_url)
print("Model id: {}".format(model_uid))
print("Deployment id: {}".format(deployment_uid))



#######################################################################################

Synchronous deployment creation for uid: '20516963-37ff-4769-a41f-b509827f3076' started

#######################################################################################


initializing
Note: Software specification spark-mllib_3.3 is deprecated. Use spark-mllib_3.4 software specification instead when saving a spark model. For details, see https://www.ibm.com/support/producthub/icpdata/docs/content/SSQNUZ_latest/wsj/wmls/wmls-deploy-python-types.html.

ready


------------------------------------------------------------------------------------------------
Successfully finished deployment creation, deployment_uid='18c8f8d4-cfbf-4fbb-a51a-5ac59d2e9e20'
------------------------------------------------------------------------------------------------


Scoring URL:https://cpd-cpd-instance.apps.wos415nfs2672.cp.fyre.ibm.com/ml/v4/deployments/18c8f8d4-cfbf-4fbb-a51a-5ac59d2e9e20/predictions
Model i

## 3. Subscriptions <a id="subscription"></a>

### 3.1 Configuring OpenScale

In [16]:
from ibm_cloud_sdk_core.authenticators import CloudPakForDataAuthenticator
from ibm_watson_openscale import APIClient

from ibm_watson_openscale import *
from ibm_watson_openscale.supporting_classes.enums import *
from ibm_watson_openscale.supporting_classes import *


authenticator = CloudPakForDataAuthenticator(
        url=WOS_CREDENTIALS['url'],
        username=WOS_CREDENTIALS['username'],
        password=WOS_CREDENTIALS['password'],
        disable_ssl_verification=True
    )

wos_client = APIClient(service_url=WOS_CREDENTIALS['url'],authenticator=authenticator)
wos_client.version

'3.0.39'

**Note**: Please re-run the above cell if it doesn't work the first time.

In [17]:
#DB_CREDENTIALS= {"hostname":"","username":"","password":"","database":"","port":"","ssl":True,"sslmode":"","certificate_base64":""}
DB_CREDENTIALS = None
SCHEMA_NAME = None  # required only when DB_CREDENTIALS is set
KEEP_MY_INTERNAL_POSTGRES = True

In [18]:
data_marts = wos_client.data_marts.list().result.data_marts
if len(data_marts) == 0:
    if DB_CREDENTIALS is not None:
        if SCHEMA_NAME is None:
            # Stop here instead of continuing with an undefined schema
            raise ValueError("Please specify the SCHEMA_NAME and rerun the cell")

        print('Setting up external datamart')
        added_data_mart_result = wos_client.data_marts.add(
                background_mode=False,
                name="WOS Data Mart",
                description="Data Mart created by WOS tutorial notebook",
                database_configuration=DatabaseConfigurationRequest(
                  database_type=DatabaseType.POSTGRESQL,
                    credentials=PrimaryStorageCredentialsLong(
                        hostname=DB_CREDENTIALS['hostname'],
                        username=DB_CREDENTIALS['username'],
                        password=DB_CREDENTIALS['password'],
                        db=DB_CREDENTIALS['database'],
                        port=DB_CREDENTIALS['port'],
                        ssl=True,
                        sslmode=DB_CREDENTIALS['sslmode'],
                        certificate_base64=DB_CREDENTIALS['certificate_base64']
                    ),
                    location=LocationSchemaName(
                        schema_name= SCHEMA_NAME
                    )
                )
             ).result
    else:
        print('Setting up internal datamart')
        added_data_mart_result = wos_client.data_marts.add(
                background_mode=False,
                name="WOS Data Mart",
                description="Data Mart created by WOS tutorial notebook", 
                internal_database = True).result
        
    data_mart_id = added_data_mart_result.metadata.id
    
else:
    data_mart_id=data_marts[0].metadata.id
    print('Using existing datamart {}'.format(data_mart_id))

Using existing datamart 00000000-0000-0000-0000-000000000000


In [19]:
SERVICE_PROVIDER_NAME = "Watson Machine Learning V2_test"
SERVICE_PROVIDER_DESCRIPTION = "Added by tutorial WOS notebook."

In [20]:
service_providers = wos_client.service_providers.list().result.service_providers
for service_provider in service_providers:
    service_instance_name = service_provider.entity.name
    if service_instance_name == SERVICE_PROVIDER_NAME:
        service_provider_id = service_provider.metadata.id
        wos_client.service_providers.delete(service_provider_id)
        print("Deleted existing service_provider for WML instance: {}".format(service_provider_id))

In [21]:
added_service_provider_result = wos_client.service_providers.add(
        name=SERVICE_PROVIDER_NAME,
        description=SERVICE_PROVIDER_DESCRIPTION,
        service_type=ServiceTypes.WATSON_MACHINE_LEARNING,
        deployment_space_id = WML_SPACE_ID,
        operational_space_id = "production",
        credentials=WMLCredentialsCP4D(),
        background_mode=False
    ).result
service_provider_id = added_service_provider_result.metadata.id




 Waiting for end of adding service provider b9fd3cb5-c7a4-4b5c-a90c-684676276352 




active

-----------------------------------------------
 Successfully finished adding service provider 
-----------------------------------------------




In [22]:
asset_deployment_details_list = wos_client.service_providers.list_assets(
    data_mart_id=data_mart_id,
    service_provider_id=service_provider_id,
    deployment_space_id=WML_SPACE_ID
).result['resources']
DEPLOYMENT_NAME = 'Text Binary Classifier deployment'  # use the deployment name (not the model name) here
asset_deployment_details = [asset for asset in asset_deployment_details_list if asset['entity']["name"] == DEPLOYMENT_NAME]

if len(asset_deployment_details)>0:
    [asset_deployment_details] = asset_deployment_details
else:
    raise ValueError('deployment with name "{}" not found.'.format(DEPLOYMENT_NAME))
asset_deployment_details

{'metadata': {'guid': '18c8f8d4-cfbf-4fbb-a51a-5ac59d2e9e20',
  'created_at': '2024-08-07T06:51:15.201Z',
  'modified_at': '2024-08-07T06:51:15.201Z'},
 'entity': {'name': 'Text Binary Classifier deployment',
  'type': 'online',
  'scoring_endpoint': {'url': 'https://internal-nginx-svc:12443/ml/v4/deployments/18c8f8d4-cfbf-4fbb-a51a-5ac59d2e9e20/predictions'},
  'asset': {},
  'asset_properties': {}}}
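
The lookup above assumes exactly one deployment carries the given name. If several deployments in the space share that name, the single-element unpacking fails with a generic error. A small helper can make both failure modes explicit. This is only a sketch: `find_unique_by_name` is a hypothetical name, and the sample dictionaries copy just the `metadata.guid` and `entity.name` keys seen in the output above.

```python
def find_unique_by_name(assets, name):
    """Return the single asset whose entity name matches, or raise a descriptive error."""
    matches = [a for a in assets if a["entity"]["name"] == name]
    if not matches:
        raise ValueError('deployment with name "{}" not found.'.format(name))
    if len(matches) > 1:
        guids = [a["metadata"]["guid"] for a in matches]
        raise ValueError('multiple deployments named "{}": {}'.format(name, guids))
    return matches[0]


# Illustrative data with made-up GUIDs
assets = [
    {"metadata": {"guid": "aaa"}, "entity": {"name": "Text Binary Classifier deployment"}},
    {"metadata": {"guid": "bbb"}, "entity": {"name": "Some other deployment"}},
]
print(find_unique_by_name(assets, "Text Binary Classifier deployment")["metadata"]["guid"])
```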

In [23]:
model_asset_details_from_deployment = wos_client.service_providers.get_deployment_asset(
    data_mart_id=data_mart_id,
    service_provider_id=service_provider_id,
    deployment_id=deployment_uid,
    deployment_space_id=WML_SPACE_ID
)
model_asset_details_from_deployment

{'metadata': {'guid': '18c8f8d4-cfbf-4fbb-a51a-5ac59d2e9e20',
  'created_at': '2024-08-07T06:51:15.201Z',
  'modified_at': '2024-08-07T06:51:15.201Z'},
 'entity': {'name': 'Text Binary Classifier deployment',
  'type': 'online',
  'scoring_endpoint': {'url': 'https://internal-nginx-svc:12443/ml/v4/deployments/18c8f8d4-cfbf-4fbb-a51a-5ac59d2e9e20/predictions'},
  'asset': {'asset_id': '20516963-37ff-4769-a41f-b509827f3076',
   'url': 'https://internal-nginx-svc:12443/ml/v4/models/20516963-37ff-4769-a41f-b509827f3076?space_id=7e5a8be6-9103-4c22-9c43-b66f3d8364de&version=2020-06-12',
   'name': 'Text Binary Classifier',
   'asset_type': 'model',
   'created_at': '2024-08-07T06:51:06.899Z',
   'modified_at': '2024-08-07T06:51:12.207Z'},
  'asset_properties': {'model_type': 'mllib_3.3',
   'runtime_environment': 'spark-3.3.0',
   'label_column': 'label',
   'input_data_schema': {'type': 'struct',
    'id': '1',
    'fields': [{'name': 'text',
      'type': 'string',
      'nullable': True,


### 3.2 Subscribe the asset

In [24]:
subscriptions = wos_client.subscriptions.list().result.subscriptions
for subscription in subscriptions:
    sub_model_id = subscription.entity.asset.asset_id
    if sub_model_id == model_uid:
        wos_client.subscriptions.delete(subscription.metadata.id)
        print('Deleted existing subscription for model', sub_model_id)

In [25]:
from ibm_watson_openscale.base_classes.watson_open_scale_v2 import ScoringEndpointRequest

In [26]:
subscription_details = wos_client.subscriptions.add(
        data_mart_id=data_mart_id,
        service_provider_id=service_provider_id,
        asset=Asset(
            asset_id=model_asset_details_from_deployment["entity"]["asset"]["asset_id"],
            name=model_asset_details_from_deployment["entity"]["asset"]["name"],
            url=model_asset_details_from_deployment["entity"]["asset"]["url"],
            asset_type=AssetTypes.MODEL,
            input_data_type=InputDataType.UNSTRUCTURED_TEXT,
            problem_type=ProblemType.BINARY_CLASSIFICATION
        ),
        deployment=AssetDeploymentRequest(
            deployment_id=asset_deployment_details['metadata']['guid'],
            name=asset_deployment_details['entity']['name'],
            deployment_type= DeploymentTypes.ONLINE,
            url=model_asset_details_from_deployment['entity']['asset']['url'],
            scoring_endpoint=ScoringEndpointRequest(url=scoring_url) # scoring model without shadow deployment
        ),
        asset_properties=AssetPropertiesRequest(
            label_column='label',
            probability_fields=['probability'],
            prediction_field='predictionLabel',
            feature_fields = ["text"],
            categorical_fields = ["text"],
            training_data_schema=SparkStruct.from_dict(model_asset_details_from_deployment["entity"]["asset_properties"]["training_data_schema"])
        )
    ).result
subscription_id = subscription_details.metadata.id
subscription_id


'd673e127-dd69-4b05-a204-5b90332d7e72'
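
The `prediction_field` and `probability_fields` configured in the subscription must name columns that the deployment actually returns when it is scored. A quick consistency check can catch typos before monitors are configured. The sketch below is plain Python with a hypothetical helper name; the field list is copied from the scoring output shown in section 3.4.

```python
def check_output_fields(scoring_fields, prediction_field, probability_fields):
    """Return the configured output fields that are missing from the scoring response."""
    expected = [prediction_field] + list(probability_fields)
    return [name for name in expected if name not in scoring_fields]


scoring_fields = ["text", "label_ix", "words", "filtered_words", "rawFeatures",
                  "features", "rawPrediction", "probability", "prediction", "predictionLabel"]

missing = check_output_fields(scoring_fields, "predictionLabel", ["probability"])
print("Missing fields:", missing)
```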

In [27]:
import time

time.sleep(5)  # give OpenScale a moment to create the payload logging data set
payload_data_sets = wos_client.data_sets.list(type=DataSetTypes.PAYLOAD_LOGGING, 
                                              target_target_id=subscription_id, 
                                              target_target_type=TargetTypes.SUBSCRIPTION).result.data_sets
# Guard against an empty list so the "not found" branch below is reachable
payload_data_set_id = payload_data_sets[0].metadata.id if payload_data_sets else None
if payload_data_set_id is None:
    print("Payload data set not found. Please check subscription status.")
else:
    print("Payload data set id: ", payload_data_set_id)

Payload data set id:  506eb78a-57a1-4e15-96a3-c83b4e6584d3


### 3.3 Get subscription

In [28]:
wos_client.subscriptions.show()

0,1,2,3,4,5,6,7,8,9
20516963-37ff-4769-a41f-b509827f3076,model,Text Binary Classifier,00000000-0000-0000-0000-000000000000,18c8f8d4-cfbf-4fbb-a51a-5ac59d2e9e20,Text Binary Classifier deployment,b9fd3cb5-c7a4-4b5c-a90c-684676276352,active,2024-08-07 06:51:41.900000+00:00,d673e127-dd69-4b05-a204-5b90332d7e72
67e11064-81b2-4ea8-addf-b39c218299b4,model,German Credit Risk Model - Challenger,00000000-0000-0000-0000-000000000000,978e26e1-509c-4e03-b20a-baacfbbe222b,German Credit Risk Model - Challenger,81655d73-6ea3-44ae-9876-86df1fa6390d,active,2024-08-07 06:07:38.473000+00:00,c0174a6c-d8e6-458b-8549-cd03d7df486e
f1b4fe35-1a75-4dca-a9b3-bc5dd7ed43db,model,German Credit Risk Model - PreProd,00000000-0000-0000-0000-000000000000,42b13344-baa1-45a0-89ca-ef662387f691,German Credit Risk Model - PreProd,81655d73-6ea3-44ae-9876-86df1fa6390d,active,2024-08-07 06:07:24.153000+00:00,61003e50-20d6-4ce8-a379-013a03c44229
a4a1a71d-2a44-43f8-b92a-ba3c80dc36fd,model,Credit Risk python Fn Model,00000000-0000-0000-0000-000000000000,90b19695-de03-41b9-b820-2d3e11c2f662,dep_Credit Risk python Fn Model,e413a76d-802c-44f3-8c55-300c5aa2659b,active,2024-08-06 13:32:53.985000+00:00,77c99979-8408-40b7-ab3a-e3c0a8a80aab
eed8b225-19e5-4460-9b6f-7271dc1e3ff2,model,Adult Census Income Classifier Model,00000000-0000-0000-0000-000000000000,038b8c68-6615-4c0d-8d46-15783d6ab9c8,Adult Census Income Classifier Deployment,c3601bc6-22c9-45a1-b016-c29cf0bf64b4,active,2024-08-06 13:26:09.549000+00:00,c274e76c-6bb9-43ce-ba64-127a3db95ede
438ca544-9bd1-48c2-8e8d-3de4ef4ca79b,model,WML_IAE4,00000000-0000-0000-0000-000000000000,78a0af9e-1014-4fb1-b22a-5e11f4fd70e7,WML_IAE4,d90c6bf2-49c6-4179-9876-8b85b0247d95,active,2024-07-02 07:03:31.504000+00:00,e34b9b87-b6e1-4c53-b92e-cb80dea042be
592b902d-3dc9-4e56-8bcb-86cbf1a6d8a9,model,gcr - P2 XGB Classifier - Model,00000000-0000-0000-0000-000000000000,2b976af0-e4ab-4859-af7d-2f2287d864ad,gcr model,4d2f2fb2-6b64-4d58-8f13-257166e468e9,active,2024-07-17 07:11:44.727000+00:00,e2df4ec7-6c75-416f-a444-8d21389f7513
e3ac9fc3-bccf-4a4e-b37b-490bfb93dd81,model,GCR AutoAI - P2 XGB Classifier - Model,00000000-0000-0000-0000-000000000000,6399e6e8-df5a-4370-9af4-34b2f2e76bc6,GCR Auto AI,a7ca157a-de07-457a-8c4c-b1a2e998699c,active,2024-07-03 10:37:20.487000+00:00,ce36911c-75f6-4f99-b4d2-b71e5d55a802
592b902d-3dc9-4e56-8bcb-86cbf1a6d8a9,model,gcr - P2 XGB Classifier - Model,00000000-0000-0000-0000-000000000000,2b976af0-e4ab-4859-af7d-2f2287d864ad,gcr model,4d2f2fb2-6b64-4d58-8f13-257166e468e9,active,2024-07-02 15:33:17.454000+00:00,e96278a6-7190-48f3-b8bd-945fa48cfe50
327d8aea-ecfc-4990-9bb9-601a1695094d,model,GCR AutoAI - P2 XGB Classifier - Model,00000000-0000-0000-0000-000000000000,755c3e75-24b5-4839-8a8f-3f85c07a40c9,GCR demo,a7ca157a-de07-457a-8c4c-b1a2e998699c,active,2024-07-02 12:03:14.224000+00:00,a0f86241-8bfc-4322-895a-2597512e1653


Note: First 10 records were displayed.


In [29]:
wos_client.subscriptions.get(subscription_id).result.to_dict()

{'metadata': {'id': 'd673e127-dd69-4b05-a204-5b90332d7e72',
  'crn': 'crn:v1:bluemix:public:aiopenscale:us-south:a/na:00000000-0000-0000-0000-000000000000:subscription:d673e127-dd69-4b05-a204-5b90332d7e72',
  'url': '/v2/subscriptions/d673e127-dd69-4b05-a204-5b90332d7e72',
  'created_at': '2024-08-07T06:51:41.900000Z',
  'created_by': 'cpadmin',
  'modified_at': '2024-08-07T06:51:43.905000Z',
  'modified_by': 'cpadmin'},
 'entity': {'data_mart_id': '00000000-0000-0000-0000-000000000000',
  'service_provider_id': 'b9fd3cb5-c7a4-4b5c-a90c-684676276352',
  'asset': {'asset_id': '20516963-37ff-4769-a41f-b509827f3076',
   'url': 'https://internal-nginx-svc:12443/ml/v4/models/20516963-37ff-4769-a41f-b509827f3076?space_id=7e5a8be6-9103-4c22-9c43-b66f3d8364de&version=2020-06-12',
   'name': 'Text Binary Classifier',
   'asset_type': 'model',
   'problem_type': 'binary',
   'input_data_type': 'unstructured_text'},
  'asset_properties': {'training_data_schema': {'type': 'struct',
    'fields': [

### 3.4 Score the model and get transaction-id

In [30]:
text = "SIX chances to win CASH! From 100 to 20,000 pounds txt> CSH11 and send to 87575. Cost 150p/day, 6days, 16+ TsandCs apply Reply HL 4 info"
payload = {"input_data": [{"fields": ["text"], "values": [[text]]}]}

response = wml_client.deployments.score(deployment_uid,payload)
print(response)

{'predictions': [{'fields': ['text', 'label_ix', 'words', 'filtered_words', 'rawFeatures', 'features', 'rawPrediction', 'probability', 'prediction', 'predictionLabel'], 'values': [['SIX chances to win CASH! From 100 to 20,000 pounds txt> CSH11 and send to 87575. Cost 150p/day, 6days, 16+ TsandCs apply Reply HL 4 info', 0.0, ['six', 'chances', 'to', 'win', 'cash!', 'from', '100', 'to', '20,000', 'pounds', 'txt>', 'csh11', 'and', 'send', 'to', '87575.', 'cost', '150p/day,', '6days,', '16+', 'tsandcs', 'apply', 'reply', 'hl', '4', 'info'], ['six', 'chances', 'win', 'cash!', '100', '20,000', 'pounds', 'txt>', 'csh11', 'send', '87575.', 'cost', '150p/day,', '6days,', '16+', 'tsandcs', 'apply', 'reply', 'hl', '4', 'info'], [11799, [8, 18, 40, 102, 313, 400, 408, 510, 527, 1032, 1444, 1682, 1894, 2833, 2844, 2864, 3487, 3543, 4111, 4338, 4522], [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]], [11799, [8, 18, 40, 102, 313, 400, 408, 51
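
The scoring response pairs one `fields` list with one row of `values` per input text, which is hard to read in raw form. A minimal sketch for building a payload for several texts and turning the response into one dictionary per row follows. The helper names are invented, and `sample_response` is a shortened, made-up response that only mimics the shape of the real output.

```python
def build_payload(texts):
    """Wrap each text in its own row, matching the payload format used above."""
    return {"input_data": [{"fields": ["text"], "values": [[t] for t in texts]}]}


def rows_as_dicts(response):
    """Zip the field names with every row of values in the scoring response."""
    rows = []
    for prediction in response["predictions"]:
        fields = prediction["fields"]
        for values in prediction["values"]:
            rows.append(dict(zip(fields, values)))
    return rows


print(build_payload(["first message", "second message"]))

# Hypothetical response with illustrative values, not real model output
sample_response = {
    "predictions": [{
        "fields": ["text", "probability", "predictionLabel"],
        "values": [["free prize", [0.1, 0.9], "spam"]],
    }]
}
for row in rows_as_dicts(sample_response):
    print(row["predictionLabel"], row["probability"])
```

With the real response from `wml_client.deployments.score`, `rows_as_dicts(response)[0]["predictionLabel"]` would give the predicted label for the first text.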

In [31]:
wos_client.data_sets.get_records_count(payload_data_set_id)

1

## 4. Explainability <a id="explainability"></a>

### 4.1 Configure Explainability

In [32]:
target = Target(
    target_type=TargetTypes.SUBSCRIPTION,
    target_id=subscription_id
)
parameters = {
    "enabled": True
}
explainability_details = wos_client.monitor_instances.create(
    data_mart_id=data_mart_id,
    background_mode=False,
    monitor_definition_id=wos_client.monitor_definitions.MONITORS.EXPLAINABILITY.ID,
    target=target,
    parameters=parameters
).result

explainability_monitor_id = explainability_details.metadata.id




 Waiting for end of monitor instance creation 3a4799d7-29cf-45e2-885e-0f74be401631 




preparing
active

---------------------------------------
 Monitor instance successfully created 
---------------------------------------




### 4.2 Get explanation for the transaction

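Request an explanation for the scored transaction. The cell below asks for a LIME explanation; set `explanation_types` to `["shap"]` to request a SHAP explanation instead.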
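
Perturbation-based explainers such as LIME estimate how much each word contributes to a prediction by scoring modified versions of the text. The sketch below shows a much simpler relative of that idea, leave-one-word-out ablation, against a toy scoring function. It is not the algorithm OpenScale runs, and `toy_spam_score` with its word weights is invented purely for illustration.

```python
def word_importance(text, score):
    """Rank words by how much the score drops when each one is removed."""
    tokens = text.split()
    base = score(" ".join(tokens))
    importance = []
    for i, token in enumerate(tokens):
        reduced = " ".join(tokens[:i] + tokens[i + 1:])
        importance.append((token, base - score(reduced)))
    return sorted(importance, key=lambda item: item[1], reverse=True)


# Toy stand-in for a deployed model: sums made-up per-word weights
SPAM_WORDS = {"win": 0.4, "cash": 0.3, "free": 0.2}


def toy_spam_score(text):
    return min(1.0, sum(SPAM_WORDS.get(w.lower(), 0.0) for w in text.split()))


for word, delta in word_importance("Win cash today", toy_spam_score):
    print(word, round(delta, 2))
```

In a real setting the `score` callable would wrap a call to the deployed model, which is why explanation tasks issue many scoring requests for a single transaction.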

In [33]:
pl_records_resp = wos_client.data_sets.get_list_of_records(data_set_id=payload_data_set_id, limit=1, offset=0).result
scoring_ids = [pl_records_resp["records"][0]["entity"]["values"]["scoring_id"]]
print("Running explanations on scoring IDs: {}".format(scoring_ids))
explanation_types = ["lime"] # Specify ["shap"] to generate SHAP explanation
result = wos_client.monitor_instances.explanation_tasks(scoring_ids=scoring_ids, explanation_types=explanation_types, subscription_id=subscription_id).result
print(result)

Running explanations on scoring IDs: ['8e130860e92435e295d538a4a2980d86-1']
{
  "metadata": {
    "explanation_task_ids": [
      "e6e1a981-1d64-42da-a94e-5aaa3d7e0d3b"
    ],
    "created_by": "1000331001",
    "created_at": "2024-08-07T06:52:23.539044Z"
  }
}


In [34]:
explanation_task_id=result.to_dict()['metadata']['explanation_task_ids'][0]
wos_client.monitor_instances.get_explanation_tasks(explanation_task_id=explanation_task_id, subscription_id=subscription_id).result.to_dict()

{'metadata': {'explanation_task_id': 'e6e1a981-1d64-42da-a94e-5aaa3d7e0d3b',
  'created_by': '1000331001',
  'created_at': '2024-08-07T06:52:23.539044Z'},
 'entity': {'status': {'state': 'in_progress'},
  'asset': {'id': '20516963-37ff-4769-a41f-b509827f3076',
   'name': 'Text Binary Classifier',
   'input_data_type': 'unstructured_text',
   'problem_type': 'binary',
   'deployment': {'id': '18c8f8d4-cfbf-4fbb-a51a-5ac59d2e9e20',
    'name': 'Text Binary Classifier deployment'}},
  'scoring_id': '8e130860e92435e295d538a4a2980d86-1'}}
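
The task above is still `in_progress`, so the explanation itself is not yet part of the response. One way to wait for it is to poll the task until it reaches a terminal state. The helper below is a generic sketch: `wait_for_state` is a hypothetical name, and the terminal state names `finished` and `error` are assumptions to verify against your OpenScale version.

```python
import time


def wait_for_state(fetch_state, done_states=("finished", "error"),
                   timeout=300, interval=5, sleep=time.sleep):
    """Call fetch_state() until it returns a terminal state or the timeout elapses."""
    waited = 0
    while True:
        state = fetch_state()
        if state in done_states:
            return state
        if waited >= timeout:
            raise TimeoutError("Still '{}' after {} seconds".format(state, waited))
        sleep(interval)
        waited += interval


# Stand-in for the service: reports in_progress twice, then finished
fake_states = iter(["in_progress", "in_progress", "finished"])
print(wait_for_state(lambda: next(fake_states), sleep=lambda seconds: None))

# Against the real client, fetch_state could wrap the call used above, for example:
# fetch_state = lambda: wos_client.monitor_instances.get_explanation_tasks(
#     explanation_task_id=explanation_task_id, subscription_id=subscription_id
# ).result.to_dict()["entity"]["status"]["state"]
```

Once the task reports a terminal state, re-running the `get_explanation_tasks` cell returns the full task details.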