<img src="https://github.com/pmservice/ai-openscale-tutorials/raw/master/notebooks/images/banner.png" align="left" alt="banner">

# Tutorial on generating an explanation for a text-based model on Watson OpenScale

This notebook includes steps for creating a text-based watson-machine-learning model, creating a subscription, configuring explainability, and finally generating an explanation for a transaction.

### Contents
- [1. Setup](#setup)
- [2. Creating and deploying a text-based model](#deploy)
- [3. Subscriptions](#subscription)
- [4. Explainability](#explainability)

**Note**: This notebook works correctly with kernel `Python 3.10.x` with pyspark 3.3.x.

<a id="setup"></a>
## 1. Setup

### 1.1 Install Watson OpenScale and WML packages

In [None]:
!pip install --upgrade "ibm-watson-openscale~=3.0.34" --no-cache --user| tail -n 1

In [None]:
!pip install --upgrade ibm-watson-machine-learning --no-cache | tail -n 1

Note: Restart the kernel to assure the new libraries are being used.

### 1.2 Configure credentials

Your Cloud API key can be generated by going to the [**Users** section of the Cloud console](https://cloud.ibm.com/iam#/users). From that page, click your name, scroll down to the **API Keys** section, and click **Create an IBM Cloud API key**. Give your key a name and click **Create**, then copy the created key and paste it below.

**NOTE:** You can also get OpenScale `API_KEY` using IBM CLOUD CLI.

How to install IBM Cloud (bluemix) console: [instruction](https://console.bluemix.net/docs/cli/reference/ibmcloud/download_cli.html#install_use)

How to get api key using console:
```
bx login --sso
bx iam api-key-create 'my_key'
```

In [1]:
CLOUD_API_KEY = "***"
IAM_URL="https://iam.ng.bluemix.net/oidc/token"

In [2]:
WML_CREDENTIALS = {
                   "url": "https://us-south.ml.cloud.ibm.com",
                   "apikey": CLOUD_API_KEY
}

## 2. Creating and deploying a text-based model <a id="deploy"></a>

The dataset used is the UCI-ML SMS Spam Collection Dataset which can be found here: https://archive.ics.uci.edu/ml/machine-learning-databases/00228/. It is a binary classification dataset with the labels being 'ham' and 'spam'.

### 2.1 Loading the training data

In [3]:
!rm -rf SMSSpam.csv
!wget 'https://raw.githubusercontent.com/IBM/watson-openscale-samples/main/IBM%20Cloud/WML/assets/data/spam_detection/SMSSpam.csv'

--2024-08-06 11:17:56--  https://raw.githubusercontent.com/IBM/watson-openscale-samples/main/IBM%20Cloud/WML/assets/data/spam_detection/SMSSpam.csv
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.110.133, 185.199.109.133, 185.199.111.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.110.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 480803 (470K) [text/plain]
Saving to: ‘SMSSpam.csv’


2024-08-06 11:17:57 (7.70 MB/s) - ‘SMSSpam.csv’ saved [480803/480803]



In [4]:
# The training data is downloaded and saved as 'SMSSpam.csv' in this step from public link

# !pip install pandas
# !rm smsspamcollection.zip
# !wget https://archive.ics.uci.edu/ml/machine-learning-databases/00228/smsspamcollection.zip
# !unzip smsspamcollection.zip
#pd.read_csv("smsspamcollection.zip",sep="\t",header=None, encoding="utf-8").to_csv("SMSSpam.csv", header=["label", "text"], sep=",", index=False)

# !rm SMSSpamCollection
# !rm readme
# !rm smsspamcollection.zip

### 2.2 Creating a model

**Note**: Skip the pyspark install step below if you are using a Spark kernel on Watson Studio.

In [5]:
!pip install --upgrade pyspark==3.2.0

Collecting pyspark==3.2.0
  Downloading pyspark-3.2.0.tar.gz (281.3 MB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m281.3/281.3 MB[0m [31m10.2 MB/s[0m eta [36m0:00:00[0m00:01[0m00:01[0m
[?25h  Installing build dependencies ... [?25ldone
[?25h  Getting requirements to build wheel ... [?25ldone
[?25h  Preparing metadata (pyproject.toml) ... [?25ldone
[?25hCollecting py4j==0.10.9.2 (from pyspark==3.2.0)
  Downloading py4j-0.10.9.2-py2.py3-none-any.whl.metadata (1.3 kB)
Downloading py4j-0.10.9.2-py2.py3-none-any.whl (198 kB)
Building wheels for collected packages: pyspark
  Building wheel for pyspark (pyproject.toml) ... [?25ldone
[?25h  Created wheel for pyspark: filename=pyspark-3.2.0-py2.py3-none-any.whl size=281805893 sha256=3c1b5d3bbe21901d7cfea503ca0b7d7b7f8270fa7930581b391dd92ef597afdb
  Stored in directory: /Users/nelwin/Library/Caches/pip/wheels/2f/f8/95/2ad14a4614b4a9f645ee928fbbd057b1b254c67adb494c9a58
Successfully built pyspark
Installing colle

**Note**: When running this notebook locally, If the `SparkSession` import fails below, set 'SPARK_HOME' environment variable with the path to `pyspark` installation.

In [6]:
import pandas as pd
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.read.csv(path="SMSSpam.csv", header=True, multiLine=True, escape='"')
df.show(5, truncate = False)

24/08/06 11:18:47 WARN Utils: Your hostname, Nelwins-MacBook-Pro.local resolves to a loopback address: 127.0.0.1; using 192.168.0.103 instead (on interface en0)
24/08/06 11:18:47 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
24/08/06 11:18:47 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable


+-----+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
|label|text                                                                                                                                                       |
+-----+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
|ham  |Go until jurong point, crazy.. Available only in bugis n great world la e buffet... Cine there got amore wat...                                            |
|ham  |Ok lar... Joking wif u oni...                                                                                                                              |
|spam |Free entry in 2 a wkly comp to win FA Cup final tkts 21st May 2005. Text FA to 87121 to receive entry question(std txt rate)T&C's apply 08452810075over18's|
|ham  |U dun say

In [7]:
train_df, test_df = df.randomSplit([0.8, 0.2], seed=12345)
print("Total count of data set: {}".format(df.count()))
print("Total count of training data set: {}".format(train_df.count()))
print("Total count of test data set: {}".format(test_df.count()))

Total count of data set: 5572
Total count of training data set: 4454
Total count of test data set: 1118


In [8]:
!pip install nltk
from pyspark.ml.feature import StringIndexer, IndexToString, CountVectorizer, Tokenizer, IDF, StopWordsRemover
from pyspark.ml.classification import GBTClassifier
from pyspark.ml.evaluation import BinaryClassificationEvaluator
from pyspark.ml import Pipeline, Model
import nltk
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords

nltk.download('punkt')
nltk.download('stopwords')
stop_words = list(set(stopwords.words('english')))

stringIndexer_label = StringIndexer(inputCol="label", outputCol="label_ix").fit(df)
tokenizer = Tokenizer(inputCol="text", outputCol="words")
stopword_remover = StopWordsRemover(inputCol="words", outputCol="filtered_words").setStopWords(stop_words)
count = CountVectorizer(inputCol="filtered_words", outputCol="rawFeatures")
idf = IDF(inputCol="rawFeatures", outputCol="features")
nb = GBTClassifier(labelCol="label_ix")
labelConverter = IndexToString(inputCol="prediction", outputCol="predictionLabel", labels=stringIndexer_label.labels)



[nltk_data] Downloading package punkt to /Users/nelwin/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to
[nltk_data]     /Users/nelwin/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


In [9]:
pipeline = Pipeline(stages=[stringIndexer_label, tokenizer, stopword_remover, count, idf, nb, labelConverter])
model = pipeline.fit(train_df)
predictions = model.transform(test_df)
evaluator = BinaryClassificationEvaluator(labelCol="label_ix", rawPredictionCol="prediction", metricName="areaUnderROC")
auc = evaluator.evaluate(predictions)

print("Area under ROC curve = %g" % auc)

24/08/06 11:19:07 WARN GarbageCollectionMetrics: To enable non-built-in garbage collector(s) List(G1 Concurrent GC), users should configure it(them) to spark.eventLog.gcMetrics.youngGenerationGarbageCollectors or spark.eventLog.gcMetrics.oldGenerationGarbageCollectors


Area under ROC curve = 0.856224


24/08/06 11:19:14 WARN InstanceBuilder: Failed to load implementation from:dev.ludovic.netlib.blas.JNIBLAS
24/08/06 11:19:14 WARN InstanceBuilder: Failed to load implementation from:dev.ludovic.netlib.blas.VectorBLAS


In [10]:
import json
from ibm_watson_machine_learning import APIClient

wml_client = APIClient(WML_CREDENTIALS)
wml_client.version

'1.0.360'

In [11]:
wml_client.spaces.list(limit=10)

------------------------------------  -------------------------------------------------------------------  ------------------------
ID                                    NAME                                                                 CREATED
4021f1d9-c203-4e9f-97f6-4766dd48155b  prod-space                                                           2024-08-05T04:42:04.665Z
be45ab4c-1fb7-440c-9b03-2909067e45e0  Automotive Demo - Quality report summarization                       2024-05-27T13:13:27.233Z
e0ee6250-7ef6-42c3-8ffa-350d9b0df578  pre-prod-space                                                       2024-02-28T07:35:30.368Z
63c5982f-7160-41c4-86f7-1310a8ab32cb  prompt-space                                                         2024-01-15T19:29:21.535Z
f04e0e73-a1b7-4ae9-a08d-7e16add4fe08  llm. space                                                           2023-11-24T13:57:05.167Z
d1afbea3-e899-4ed3-b9a6-0686751508c3  wml                                                    

Unnamed: 0,ID,NAME,CREATED
0,4021f1d9-c203-4e9f-97f6-4766dd48155b,prod-space,2024-08-05T04:42:04.665Z
1,be45ab4c-1fb7-440c-9b03-2909067e45e0,Automotive Demo - Quality report summarization,2024-05-27T13:13:27.233Z
2,e0ee6250-7ef6-42c3-8ffa-350d9b0df578,pre-prod-space,2024-02-28T07:35:30.368Z
3,63c5982f-7160-41c4-86f7-1310a8ab32cb,prompt-space,2024-01-15T19:29:21.535Z
4,f04e0e73-a1b7-4ae9-a08d-7e16add4fe08,llm. space,2023-11-24T13:57:05.167Z
5,d1afbea3-e899-4ed3-b9a6-0686751508c3,wml,2023-09-21T07:06:42.577Z
6,6f7c3969-6d3f-4f9a-b97a-b534f4e4fef3,AutoAIDemo,2023-08-25T02:50:27.113Z
7,0b7992c2-3991-4145-a5ba-d5b428261171,openscale-express-path-preprod-80e6093f-5acf-4...,2023-08-16T06:59:18.538Z
8,3226c381-5ae0-4bc4-b306-fc638c785e47,openscale-express-path-80e6093f-5acf-4eb7-9da6...,2023-08-16T06:58:57.332Z


In [12]:
WML_SPACE_ID='***' # use space id here
wml_client.set.default_space(WML_SPACE_ID)

'SUCCESS'

In [13]:
MODEL_NAME = "Text Binary Classifier"

In [14]:
software_spec_uid = wml_client.software_specifications.get_id_by_name("spark-mllib_3.3")
print("Software Specification ID: {}".format(software_spec_uid))
model_props = {
        wml_client._models.ConfigurationMetaNames.NAME:"{}".format(MODEL_NAME),
        wml_client._models.ConfigurationMetaNames.TYPE: "mllib_3.3",
        wml_client._models.ConfigurationMetaNames.SOFTWARE_SPEC_UID: software_spec_uid,
        wml_client._models.ConfigurationMetaNames.LABEL_FIELD: "label",
    }

Software Specification ID: d11f2434-4fc7-58b7-8a62-755da64fdaf8


In [15]:
print("Storing model ...")
published_model_details = wml_client.repository.store_model(
    model=model, 
    meta_props=model_props, 
    training_data=train_df, 
    pipeline=pipeline)

model_uid = wml_client.repository.get_model_id(published_model_details)
print("Done")
print("Model ID: {}".format(model_uid))

Storing model ...
Done
Model ID: 3f86844d-af75-41fa-89bf-597e81f08a8d


### 2.3 Deploying the model

In [16]:
deployment_details = wml_client.deployments.create(
    model_uid, 
    meta_props={
        wml_client.deployments.ConfigurationMetaNames.NAME: "{}".format(MODEL_NAME + " deployment"),
        wml_client.deployments.ConfigurationMetaNames.ONLINE: {}
    }
)
scoring_url = wml_client.deployments.get_scoring_href(deployment_details)
deployment_uid=wml_client.deployments.get_id(deployment_details)

print("Scoring URL:" + scoring_url)
print("Model id: {}".format(model_uid))
print("Deployment id: {}".format(deployment_uid))



#######################################################################################

Synchronous deployment creation for uid: '3f86844d-af75-41fa-89bf-597e81f08a8d' started

#######################################################################################


initializing
Note: online_url and serving_urls are deprecated and will be removed in a future release. Use inference instead.

ready


------------------------------------------------------------------------------------------------
Successfully finished deployment creation, deployment_uid='26af6dfb-8745-4ba8-a00c-403cd2d77cb4'
------------------------------------------------------------------------------------------------


Scoring URL:https://us-south.ml.cloud.ibm.com/ml/v4/deployments/26af6dfb-8745-4ba8-a00c-403cd2d77cb4/predictions
Model id: 3f86844d-af75-41fa-89bf-597e81f08a8d
Deployment id: 26af6dfb-8745-4ba8-a00c-403cd2d77cb4


## 3. Subscriptions <a id="subscription"></a>

### 3.1 Configuring OS

In [17]:
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator,BearerTokenAuthenticator

from ibm_watson_openscale import *
from ibm_watson_openscale.supporting_classes.enums import *
from ibm_watson_openscale.supporting_classes import *


authenticator = IAMAuthenticator(apikey=CLOUD_API_KEY)
#authenticator = BearerTokenAuthenticator(bearer_token=IAM_TOKEN) ## uncomment this line if using IAM token to authenticate
wos_client = APIClient(authenticator=authenticator)
wos_client.version

'3.0.39'

**Note**: Please re-run the above cell if it doesn't work the first time.

In [18]:
#DB_CREDENTIALS= {"hostname":"","username":"","password":"","database":"","port":"","ssl":True,"sslmode":"","certificate_base64":""}
DB_CREDENTIALS = None
KEEP_MY_INTERNAL_POSTGRES = True

In [19]:
data_marts = wos_client.data_marts.list().result.data_marts
if len(data_marts) == 0:
    if DB_CREDENTIALS is not None:
        if SCHEMA_NAME is None: 
            print("Please specify the SCHEMA_NAME and rerun the cell")

        print('Setting up external datamart')
        added_data_mart_result = wos_client.data_marts.add(
                background_mode=False,
                name="WOS Data Mart",
                description="Data Mart created by WOS tutorial notebook",
                database_configuration=DatabaseConfigurationRequest(
                  database_type=DatabaseType.POSTGRESQL,
                    credentials=PrimaryStorageCredentialsLong(
                        hostname=DB_CREDENTIALS['hostname'],
                        username=DB_CREDENTIALS['username'],
                        password=DB_CREDENTIALS['password'],
                        db=DB_CREDENTIALS['database'],
                        port=DB_CREDENTIALS['port'],
                        ssl=True,
                        sslmode=DB_CREDENTIALS['sslmode'],
                        certificate_base64=DB_CREDENTIALS['certificate_base64']
                    ),
                    location=LocationSchemaName(
                        schema_name= SCHEMA_NAME
                    )
                )
             ).result
    else:
        print('Setting up internal datamart')
        added_data_mart_result = wos_client.data_marts.add(
                background_mode=False,
                name="WOS Data Mart",
                description="Data Mart created by WOS tutorial notebook", 
                internal_database = True).result
        
    data_mart_id = added_data_mart_result.metadata.id
    
else:
    data_mart_id=data_marts[0].metadata.id
    print('Using existing datamart {}'.format(data_mart_id))

Using existing datamart 80e6093f-5acf-4eb7-9da6-7ba9bf56a929


In [20]:
SERVICE_PROVIDER_NAME = "Watson Machine Learning V2_test"
SERVICE_PROVIDER_DESCRIPTION = "Added by tutorial WOS notebook."

In [21]:
service_providers = wos_client.service_providers.list().result.service_providers
for service_provider in service_providers:
    service_instance_name = service_provider.entity.name
    # if service_instance_name == SERVICE_PROVIDER_NAME:
    service_provider_id = service_provider.metadata.id
    wos_client.service_providers.delete(service_provider_id)
    print("Deleted existing service_provider for WML instance: {}".format(service_provider_id))

Deleted existing service_provider for WML instance: 57e9af4f-cc42-4a7e-9e7a-65c230c768ea


In [22]:
added_service_provider_result = wos_client.service_providers.add(
        name=SERVICE_PROVIDER_NAME,
        description=SERVICE_PROVIDER_DESCRIPTION,
        service_type=ServiceTypes.WATSON_MACHINE_LEARNING,
        deployment_space_id = WML_SPACE_ID,
        operational_space_id = "production",
        credentials=WMLCredentialsCloud(
            apikey=CLOUD_API_KEY,      ## use `apikey=IAM_TOKEN` if using IAM_TOKEN to initiate client
            url=WML_CREDENTIALS["url"],
            instance_id=None
        ),
        background_mode=False
    ).result
service_provider_id = added_service_provider_result.metadata.id




 Waiting for end of adding service provider 30846fc8-5b33-43aa-b684-84cc70ef2241 




active

-----------------------------------------------
 Successfully finished adding service provider 
-----------------------------------------------




In [23]:
asset_deployment_details_list = wos_client.service_providers.list_assets(data_mart_id=data_mart_id, service_provider_id=service_provider_id, deployment_space_id = WML_SPACE_ID).result['resources']
DEPLOYMENT_NAME='Text Binary Classifier deployment' # use the model name here 
asset_deployment_details = [asset for asset in asset_deployment_details_list if asset['entity']["name"]==DEPLOYMENT_NAME]
if len(asset_deployment_details)>0:
    [asset_deployment_details] = asset_deployment_details[0]
else:
    raise ValueError('deployment with name "{}" not found.'.format(DEPLOYMENT_NAME))
asset_deployment_details

[{'metadata': {'guid': '26af6dfb-8745-4ba8-a00c-403cd2d77cb4', 'url': 'https://us-south.ml.cloud.ibm.com/ml/v4/deployments/26af6dfb-8745-4ba8-a00c-403cd2d77cb4?space_id=f04e0e73-a1b7-4ae9-a08d-7e16add4fe08', 'created_at': '2024-08-06T05:49:35.567Z', 'modified_at': '2024-08-06T05:49:35.567Z'}, 'entity': {'name': 'Text Binary Classifier deployment', 'type': 'online', 'scoring_endpoint': {'url': 'https://us-south.ml.cloud.ibm.com/ml/v4/deployments/26af6dfb-8745-4ba8-a00c-403cd2d77cb4/predictions'}, 'asset': {}, 'asset_properties': {}}}, {'metadata': {'guid': '82270e4a-2e1d-43c1-9ddd-dda3ad7faea6', 'url': 'https://us-south.ml.cloud.ibm.com/ml/v4/deployments/82270e4a-2e1d-43c1-9ddd-dda3ad7faea6?space_id=f04e0e73-a1b7-4ae9-a08d-7e16add4fe08', 'created_at': '2024-08-06T05:40:13.218Z', 'modified_at': '2024-08-06T05:40:13.218Z'}, 'entity': {'name': 'Text Binary Classifier deployment', 'type': 'online', 'scoring_endpoint': {'url': 'https://us-south.ml.cloud.ibm.com/ml/v4/deployments/82270e4a-2e1

In [24]:
model_asset_details_from_deployment=wos_client.service_providers.get_deployment_asset(data_mart_id=data_mart_id,service_provider_id=service_provider_id,deployment_id=deployment_uid,deployment_space_id=WML_SPACE_ID)
model_asset_details_from_deployment

{'metadata': {'guid': '26af6dfb-8745-4ba8-a00c-403cd2d77cb4',
  'url': 'https://us-south.ml.cloud.ibm.com/ml/v4/deployments/26af6dfb-8745-4ba8-a00c-403cd2d77cb4?space_id=f04e0e73-a1b7-4ae9-a08d-7e16add4fe08',
  'created_at': '2024-08-06T05:49:35.567Z',
  'modified_at': '2024-08-06T05:49:35.567Z'},
 'entity': {'name': 'Text Binary Classifier deployment',
  'type': 'online',
  'scoring_endpoint': {'url': 'https://us-south.ml.cloud.ibm.com/ml/v4/deployments/26af6dfb-8745-4ba8-a00c-403cd2d77cb4/predictions'},
  'asset': {'asset_id': '3f86844d-af75-41fa-89bf-597e81f08a8d',
   'url': 'https://us-south.ml.cloud.ibm.com/ml/v4/models/3f86844d-af75-41fa-89bf-597e81f08a8d?space_id=f04e0e73-a1b7-4ae9-a08d-7e16add4fe08&version=2020-06-12',
   'name': 'Text Binary Classifier',
   'asset_type': 'model',
   'created_at': '2024-08-06T05:49:27.766Z',
   'modified_at': '2024-08-06T05:49:36.244Z'},
  'asset_properties': {'model_type': 'mllib_3.3',
   'runtime_environment': 'spark-3.3.0',
   'label_column'

### 3.2 Subscribe the asset

In [25]:
subscriptions = wos_client.subscriptions.list().result.subscriptions
for subscription in subscriptions:
    sub_model_id = subscription.entity.asset.asset_id
    if sub_model_id == model_uid:
        wos_client.subscriptions.delete(subscription.metadata.id)
        print('Deleted existing subscription for model', sub_model_id)

In [26]:
from ibm_watson_openscale.base_classes.watson_open_scale_v2 import ScoringEndpointRequest

In [27]:
subscription_details = wos_client.subscriptions.add(
        data_mart_id=data_mart_id,
        service_provider_id=service_provider_id,
        asset=Asset(
            asset_id=model_asset_details_from_deployment["entity"]["asset"]["asset_id"],
            name=model_asset_details_from_deployment["entity"]["asset"]["name"],
            url=model_asset_details_from_deployment["entity"]["asset"]["url"],
            asset_type=AssetTypes.MODEL,
            input_data_type=InputDataType.UNSTRUCTURED_TEXT,
            problem_type=ProblemType.BINARY_CLASSIFICATION
        ),
        deployment=AssetDeploymentRequest(
            deployment_id=asset_deployment_details['metadata']['guid'],
            name=asset_deployment_details['entity']['name'],
            deployment_type= DeploymentTypes.ONLINE,
            url=asset_deployment_details['metadata']['url'],
            scoring_endpoint=ScoringEndpointRequest(url=scoring_url) # scoring model without shadow deployment
        ),
        asset_properties=AssetPropertiesRequest(
            label_column='label',
            probability_fields=['probability'],
            prediction_field='predictionLabel',
            feature_fields = ["text"],
            categorical_fields = ["text"],
            training_data_schema=SparkStruct.from_dict(model_asset_details_from_deployment["entity"]["asset_properties"]["training_data_schema"])
        )
    ).result
subscription_id = subscription_details.metadata.id
subscription_id

'bd86b5e4-bac8-4163-80d8-c6740a025b63'

In [28]:
import time

time.sleep(5)
payload_data_set_id = None
payload_data_set_id = wos_client.data_sets.list(type=DataSetTypes.PAYLOAD_LOGGING, 
                                                target_target_id=subscription_id, 
                                                target_target_type=TargetTypes.SUBSCRIPTION).result.data_sets[0].metadata.id
if payload_data_set_id is None:
    print("Payload data set not found. Please check subscription status.")
else:
    print("Payload data set id: ", payload_data_set_id)

Payload data set id:  779801b8-7147-4550-b08c-96b8409eae63


### 3.3 Get subscription

In [29]:
wos_client.subscriptions.show()

0,1,2,3,4,5,6,7,8,9
3f86844d-af75-41fa-89bf-597e81f08a8d,model,Text Binary Classifier,80e6093f-5acf-4eb7-9da6-7ba9bf56a929,26af6dfb-8745-4ba8-a00c-403cd2d77cb4,Text Binary Classifier deployment,30846fc8-5b33-43aa-b684-84cc70ef2241,active,2024-08-06 05:52:27.130000+00:00,bd86b5e4-bac8-4163-80d8-c6740a025b63


In [30]:
wos_client.subscriptions.get(subscription_id).result.to_dict()

{'metadata': {'id': 'bd86b5e4-bac8-4163-80d8-c6740a025b63',
  'crn': 'crn:v1:bluemix:public:aiopenscale:us-south:a/0a3c25959fab4ecea2768fa6b8d61595:80e6093f-5acf-4eb7-9da6-7ba9bf56a929:subscription:bd86b5e4-bac8-4163-80d8-c6740a025b63',
  'url': '/v2/subscriptions/bd86b5e4-bac8-4163-80d8-c6740a025b63',
  'created_at': '2024-08-06T05:52:27.130000Z',
  'created_by': 'IBMid-662005298W',
  'modified_at': '2024-08-06T05:52:29.457000Z',
  'modified_by': 'IBMid-662005298W'},
 'entity': {'data_mart_id': '80e6093f-5acf-4eb7-9da6-7ba9bf56a929',
  'service_provider_id': '30846fc8-5b33-43aa-b684-84cc70ef2241',
  'asset': {'asset_id': '3f86844d-af75-41fa-89bf-597e81f08a8d',
   'url': 'https://us-south.ml.cloud.ibm.com/ml/v4/models/3f86844d-af75-41fa-89bf-597e81f08a8d?space_id=f04e0e73-a1b7-4ae9-a08d-7e16add4fe08&version=2020-06-12',
   'name': 'Text Binary Classifier',
   'asset_type': 'model',
   'problem_type': 'binary',
   'input_data_type': 'unstructured_text'},
  'asset_properties': {'training

### 3.4 Score the model and get transaction-id

In [31]:
text = "SIX chances to win CASH! From 100 to 20,000 pounds txt> CSH11 and send to 87575. Cost 150p/day, 6days, 16+ TsandCs apply Reply HL 4 info"
payload = {"input_data": [{"fields": ["text"], "values": [[text]]}]}

response = wml_client.deployments.score(deployment_uid,payload)
print(response)

{'predictions': [{'fields': ['text', 'label_ix', 'words', 'filtered_words', 'rawFeatures', 'features', 'rawPrediction', 'probability', 'prediction', 'predictionLabel'], 'values': [['SIX chances to win CASH! From 100 to 20,000 pounds txt> CSH11 and send to 87575. Cost 150p/day, 6days, 16+ TsandCs apply Reply HL 4 info', 0.0, ['six', 'chances', 'to', 'win', 'cash!', 'from', '100', 'to', '20,000', 'pounds', 'txt>', 'csh11', 'and', 'send', 'to', '87575.', 'cost', '150p/day,', '6days,', '16+', 'tsandcs', 'apply', 'reply', 'hl', '4', 'info'], ['six', 'chances', 'win', 'cash!', '100', '20,000', 'pounds', 'txt>', 'csh11', 'send', '87575.', 'cost', '150p/day,', '6days,', '16+', 'tsandcs', 'apply', 'reply', 'hl', '4', 'info'], [11799, [8, 18, 40, 102, 313, 400, 408, 510, 527, 1032, 1444, 1682, 1894, 2833, 2844, 2864, 3487, 3543, 4111, 4338, 4522], [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]], [11799, [8, 18, 40, 102, 313, 400, 408, 51

In [32]:
wos_client.data_sets.get_records_count(payload_data_set_id)

1

## 4. Explainability

### 4.1 Configure Explainability

In [33]:
target = Target(
    target_type=TargetTypes.SUBSCRIPTION,
    target_id=subscription_id
)
parameters = {
    "enabled": True
}
explainability_details = wos_client.monitor_instances.create(
    data_mart_id=data_mart_id,
    background_mode=False,
    monitor_definition_id=wos_client.monitor_definitions.MONITORS.EXPLAINABILITY.ID,
    target=target,
    parameters=parameters
).result

explainability_monitor_id = explainability_details.metadata.id

### 4.2 Get explanation for the transaction

Generate LIME or SHAP explanation as needed.

In [34]:
pl_records_resp = wos_client.data_sets.get_list_of_records(data_set_id=payload_data_set_id, limit=1, offset=0).result
scoring_ids = [pl_records_resp["records"][0]["entity"]["values"]["scoring_id"]]
print("Running explanations on scoring IDs: {}".format(scoring_ids))
explanation_types = ["lime"] # Specify ["shap"] to generate SHAP explanation
result = wos_client.monitor_instances.explanation_tasks(scoring_ids=scoring_ids, explanation_types=explanation_types, subscription_id=subscription_id).result
print(result)

Running explanations on scoring IDs: ['0dce0fd6937d87ad885e1f82ab4e4ce2-1']
{
  "metadata": {
    "explanation_task_ids": [
      "0eecd086-c5b1-455b-9621-6c10a73e3443"
    ],
    "created_by": "IBMid-662005298W",
    "created_at": "2024-08-06T05:56:27.621436Z"
  }
}


In [35]:
explanation_task_id=result.to_dict()['metadata']['explanation_task_ids'][0]
wos_client.monitor_instances.get_explanation_tasks(explanation_task_id=explanation_task_id, subscription_id=subscription_id).result.to_dict()

{'metadata': {'explanation_task_id': '0eecd086-c5b1-455b-9621-6c10a73e3443',
  'created_by': 'IBMid-662005298W',
  'created_at': '2024-08-06T05:56:27.621436Z'},
 'entity': {'status': {'state': 'in_progress'},
  'asset': {'id': '3f86844d-af75-41fa-89bf-597e81f08a8d',
   'name': 'Text Binary Classifier',
   'input_data_type': 'unstructured_text',
   'problem_type': 'binary',
   'deployment': {'id': '26af6dfb-8745-4ba8-a00c-403cd2d77cb4',
    'name': 'Text Binary Classifier deployment'}},
  'scoring_id': '0dce0fd6937d87ad885e1f82ab4e4ce2-1'}}

## 5. Quality

## 5.1 Configure Quality

In [36]:
target = Target(
    target_type=TargetTypes.SUBSCRIPTION,
    target_id=subscription_id
)
parameters = {
    "min_feedback_data_size": 20,
    "threshold": 0.8
}
quality_monitor_details = wos_client.monitor_instances.create(
    data_mart_id=data_mart_id,
    background_mode=False,
    monitor_definition_id=wos_client.monitor_definitions.MONITORS.QUALITY.ID,
    target=target,
    parameters=parameters
).result

quality_monitor_id = quality_monitor_details.metadata.id

### 5.2 Store feedback record

In [37]:
feedback_dataset_id = None
feedback_dataset = wos_client.data_sets.list(type=DataSetTypes.FEEDBACK, 
                                                target_target_id=subscription_id, 
                                                target_target_type=TargetTypes.SUBSCRIPTION).result
print(feedback_dataset)
feedback_dataset_id = feedback_dataset.data_sets[0].metadata.id
if feedback_dataset_id is None:
    print("Feedback data set not found. Please check quality monitor status.")

{
  "data_sets": [
    {
      "metadata": {
        "id": "0921f82f-6948-4c27-b62a-47d31724d1f5",
        "crn": "crn:v1:bluemix:public:aiopenscale:us-south:a/181ed6cc388f47bd9d862fe066f9cfce:80e6093f-5acf-4eb7-9da6-7ba9bf56a929:data_set:0921f82f-6948-4c27-b62a-47d31724d1f5",
        "url": "/v2/data_sets/0921f82f-6948-4c27-b62a-47d31724d1f5",
        "created_at": "2024-08-06T05:53:03.899000Z",
        "created_by": "iam-ServiceId-2e5c9fda-38bf-4279-9712-cdb3b6f3a7ad",
        "modified_at": "2024-08-06T05:53:04.370000Z",
        "modified_by": "iam-ServiceId-2e5c9fda-38bf-4279-9712-cdb3b6f3a7ad"
      },
      "entity": {
        "data_mart_id": "80e6093f-5acf-4eb7-9da6-7ba9bf56a929",
        "name": "bd86b5e4-bac8-4163-80d8-c6740a025b63_feedback",
        "description": "bd86b5e4-bac8-4163-80d8-c6740a025b63_feedback",
        "type": "feedback",
        "target": {
          "target_type": "subscription",
          "target_id": "bd86b5e4-bac8-4163-80d8-c6740a025b63"
        },
    

In [38]:
import csv

# Open the CSV file
with open('SMSSpam.csv', 'r') as file:
    csv_data = csv.DictReader(file)
    data = [row for row in csv_data]

# Convert the data to JSON
json_data = json.dumps(data)
# Load the JSON data as a Python object
loaded_data = json.loads(json_data)
# read first 20 records as feedback data
feedback_data = loaded_data[:20]

wos_client.data_sets.store_records(feedback_dataset_id, request_body=feedback_data, background_mode=False).result




 Waiting for end of storing records with request id: 7a62e26f-1e79-4253-bf9c-857e1b54d00b 




active

---------------------------------------
 Successfully finished storing records 
---------------------------------------




<ibm_cloud_sdk_core.detailed_response.DetailedResponse at 0x1481bbf10>

In [39]:
time.sleep(5)
feedback_records_count = wos_client.data_sets.get_records_count(data_set_id=feedback_dataset_id)
print('Number of records in the feedback table: ', feedback_records_count)

Number of records in the feedback table:  20


### 5.3 Run Quality monitor

In [40]:
run_details = wos_client.monitor_instances.run(monitor_instance_id=quality_monitor_id, background_mode=False).result




 Waiting for end of monitoring run 71790afc-05d2-4f4b-be08-f0afc85da02d 




finished

---------------------------
 Successfully finished run 
---------------------------




In [41]:
wos_client.monitor_instances.show_metrics(monitor_instance_id=quality_monitor_id)

0,1,2,3,4,5,6,7,8,9,10,11
2024-07-16 06:13:19.969000+00:00,true_positive_rate,1267b942-5fe0-423c-8263-b32816c30099,0.3636363636363636,,,['model_type:original'],quality,f16dff88-1305-4584-8c7d-3056728e79a4,92b130ed-fc31-43c2-8b0e-a5f8d3a91345,subscription,f1a960ef-b474-43f9-8621-d0b1f2a4ef68
2024-07-16 06:13:19.969000+00:00,area_under_roc,1267b942-5fe0-423c-8263-b32816c30099,0.6587412587412588,0.8,,['model_type:original'],quality,f16dff88-1305-4584-8c7d-3056728e79a4,92b130ed-fc31-43c2-8b0e-a5f8d3a91345,subscription,f1a960ef-b474-43f9-8621-d0b1f2a4ef68
2024-07-16 06:13:19.969000+00:00,precision,1267b942-5fe0-423c-8263-b32816c30099,0.8,,,['model_type:original'],quality,f16dff88-1305-4584-8c7d-3056728e79a4,92b130ed-fc31-43c2-8b0e-a5f8d3a91345,subscription,f1a960ef-b474-43f9-8621-d0b1f2a4ef68
2024-07-16 06:13:19.969000+00:00,matthews_correlation_coefficient,1267b942-5fe0-423c-8263-b32816c30099,0.4167242637192667,,,['model_type:original'],quality,f16dff88-1305-4584-8c7d-3056728e79a4,92b130ed-fc31-43c2-8b0e-a5f8d3a91345,subscription,f1a960ef-b474-43f9-8621-d0b1f2a4ef68
2024-07-16 06:13:19.969000+00:00,f1_measure,1267b942-5fe0-423c-8263-b32816c30099,0.5000000000000001,,,['model_type:original'],quality,f16dff88-1305-4584-8c7d-3056728e79a4,92b130ed-fc31-43c2-8b0e-a5f8d3a91345,subscription,f1a960ef-b474-43f9-8621-d0b1f2a4ef68
2024-07-16 06:13:19.969000+00:00,accuracy,1267b942-5fe0-423c-8263-b32816c30099,0.7551020408163265,,,['model_type:original'],quality,f16dff88-1305-4584-8c7d-3056728e79a4,92b130ed-fc31-43c2-8b0e-a5f8d3a91345,subscription,f1a960ef-b474-43f9-8621-d0b1f2a4ef68
2024-07-16 06:13:19.969000+00:00,label_skew,1267b942-5fe0-423c-8263-b32816c30099,0.6909336273400493,,,['model_type:original'],quality,f16dff88-1305-4584-8c7d-3056728e79a4,92b130ed-fc31-43c2-8b0e-a5f8d3a91345,subscription,f1a960ef-b474-43f9-8621-d0b1f2a4ef68
2024-07-16 06:13:19.969000+00:00,gini_coefficient,1267b942-5fe0-423c-8263-b32816c30099,0.3174825174825175,,,['model_type:original'],quality,f16dff88-1305-4584-8c7d-3056728e79a4,92b130ed-fc31-43c2-8b0e-a5f8d3a91345,subscription,f1a960ef-b474-43f9-8621-d0b1f2a4ef68
2024-07-16 06:13:19.969000+00:00,log_loss,1267b942-5fe0-423c-8263-b32816c30099,0.4493805793027406,,,['model_type:original'],quality,f16dff88-1305-4584-8c7d-3056728e79a4,92b130ed-fc31-43c2-8b0e-a5f8d3a91345,subscription,f1a960ef-b474-43f9-8621-d0b1f2a4ef68
2024-07-16 06:13:19.969000+00:00,false_positive_rate,1267b942-5fe0-423c-8263-b32816c30099,0.0461538461538461,,,['model_type:original'],quality,f16dff88-1305-4584-8c7d-3056728e79a4,92b130ed-fc31-43c2-8b0e-a5f8d3a91345,subscription,f1a960ef-b474-43f9-8621-d0b1f2a4ef68


Note: First 10 records were displayed.
