<img src="https://github.com/pmservice/ai-openscale-tutorials/raw/master/notebooks/images/banner.png" align="left" alt="banner">

# Tutorial on generating an explanation for an image-based binary classifier model on Watson OpenScale

## Contents:
- [1. Setup](#setup)
- [2. Creating and deploying an image-based model](#deployment)
- [3. Subscriptions](#subscription)
- [4. Explainability](#explainability)
- [5. Quality](#Quality)


**Note:** This notebook works with the `IBM Runtime 22.1 on Python 3.9` or `IBM Runtime 22.2 on Python 3.10` kernel if you are using IBM Watson Studio, or with `Python 3.9.x / Python 3.10.x` if you are using any other tool.

<a id="setup"></a>
## 1. Setup

### 1.1 Install the OpenScale and WML packages

In [None]:
!pip install --upgrade ibm-watson-openscale --no-cache | tail -n 1

In [None]:
!pip install --upgrade ibm-watson-machine-learning --no-cache | tail -n 1

Note: Restart the kernel to ensure the new libraries are used.

### 1.2 Configure credentials

- WOS_CREDENTIALS (CP4D)
- WML_CREDENTIALS (CP4D)
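- DATABASE_CREDENTIALS (DB2 on CP4D or Cloud Object Storage (COS)); optional, assigned to `DB_CREDENTIALS` in section 3.1
- SCHEMA_NAME (required only when `DB_CREDENTIALS` is set)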

In [None]:
WOS_CREDENTIALS = {
    "url": "***",
    "username": "***",
    "password": "***"
}

In [None]:
WML_CREDENTIALS = {
    "url": "***",
    "username": "***",
    "password": "***",
    "instance_id": "wml_local",
    "version": "4.6"  # If your environment is CP4D 4.x.x, specify "4.x.x" instead of "4.6"
}
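
If you prefer not to keep credentials in the notebook itself, they can be read from environment variables. The following is a minimal sketch, not part of the OpenScale or WML SDKs: the helper `load_credentials` and the variable names `WOS_URL`, `WOS_USERNAME`, and `WOS_PASSWORD` are assumptions chosen for illustration.

```python
import os

def load_credentials(prefix, env=None):
    """Build a credentials dict from <PREFIX>_URL, <PREFIX>_USERNAME and <PREFIX>_PASSWORD."""
    env = os.environ if env is None else env
    keys = ("url", "username", "password")
    # Collect the names of all variables that are unset or empty
    missing = [f"{prefix}_{k.upper()}" for k in keys if not env.get(f"{prefix}_{k.upper()}")]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {k: env[f"{prefix}_{k.upper()}"] for k in keys}

# An in-memory mapping stands in for the real environment in this example
example_env = {
    "WOS_URL": "https://cpd.example.com",
    "WOS_USERNAME": "admin",
    "WOS_PASSWORD": "secret",
}
creds = load_credentials("WOS", env=example_env)
print(sorted(creds))
```

Calling `load_credentials("WOS")` without the `env` argument reads from `os.environ`; the returned dict has the same three keys as `WOS_CREDENTIALS` above.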

<a id="deployment"></a>
## 2. Creating and deploying an image-based model

We will create a binary classifier that labels an image as a dog or a cat (probability: 1 = dog, 0 = cat). The dataset can be downloaded from Kaggle: https://www.kaggle.com/c/dogs-vs-cats-redux-kernels-edition/data. A copy is also available here: https://ibm.box.com/shared/static/itl0el289mz06py2e6aemehx6lge1rou.zip

Create a folder named `data` with two subdirectories, `train` and `validation`. Inside each, create folders named `dogs` and `cats` (as shown below). Put 1024 dog images and 1024 cat images in `train`, and 416 dog images and 416 cat images in `validation`. After unzipping the downloaded file, take the images from the `train` folder produced by unzipping `train.zip`.

```text
data/
    train/
        dogs/ # 1024 pictures
            dog.1.jpg
            dog.2.jpg
            ...
        cats/ # 1024 pictures
            cat.1.jpg
            cat.2.jpg
            ...
    validation/
        dogs/ # 416 pictures
            dog.1025.jpg
            dog.1026.jpg
            ...
        cats/ # 416 pictures
            cat.1025.jpg
            cat.1026.jpg
            ...
```
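
The layout above can also be created with a short script instead of copying files by hand. This is a sketch under stated assumptions: the helper `build_dataset` is hypothetical, the source folder is assumed to contain files named `cat.<i>.jpg` and `dog.<i>.jpg`, and numbering is assumed to start at 1 so that validation images start at 1025 as in the tree above.

```python
import shutil
from pathlib import Path

def build_dataset(src_dir, dst_dir, n_train=1024, n_val=416, first_index=1):
    """Copy <animal>.<i>.jpg files from src_dir into the train/validation layout."""
    src, dst = Path(src_dir), Path(dst_dir)
    # Consecutive index ranges: training images first, validation images right after
    splits = {
        "train": range(first_index, first_index + n_train),
        "validation": range(first_index + n_train, first_index + n_train + n_val),
    }
    for animal in ("cat", "dog"):
        for split, indices in splits.items():
            target = dst / split / f"{animal}s"
            target.mkdir(parents=True, exist_ok=True)
            for i in indices:
                name = f"{animal}.{i}.jpg"
                shutil.copy(src / name, target / name)

# Example call, assuming "train" is the folder extracted from train.zip:
# build_dataset("train", "data")
```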

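Note: This notebook uses the Keras API bundled with TensorFlow (`tensorflow.keras`). In section 2.2 the model is stored with the `tensorflow_2.9` model type and the `tensorflow_rt22.2-py3.10` software specification, so train it with a matching TensorFlow version.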

In [None]:
# Note: Uncomment the code below only if you are not running this in IBM Watson Studio

# !pip install tensorflow
# !pip install keras

In [None]:

!pip install keras_sequential_ascii
!pip install pillow

In [None]:
#import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Activation, Dropout, Flatten, Dense
from tensorflow.keras.layers import Convolution2D, MaxPooling2D, ZeroPadding2D
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from keras_sequential_ascii import sequential_model_to_ascii_printout
from tensorflow.keras import backend as keras_backend
from tensorflow.keras import optimizers
from tensorflow.keras import applications
from tensorflow.keras.models import Model
import numpy as np
#print(keras.__version__)

Note: If you are running this in IBM Watson Studio, upload the zipped `data` folder to your Watson Studio project, then uncomment and run the cell below.

In [None]:
# import zipfile
# from ibm_watson_studio_lib import access_project_or_space
# wslib = access_project_or_space()
# wslib.download_file("data.zip")

# z = zipfile.ZipFile("data.zip")
# z.extractall()

### 2.1 Creating a model

In [None]:
!ls

In [None]:
# Dimension of the images
img_width, img_height = 90, 90

train_data_dir = 'data/train/'
validation_data_dir = 'data/validation/'

Note: Adjust the paths above to match the location of your data.

In [None]:
# Preprocessing

#used to rescale the pixel values from [0, 255] to [0, 1] interval
datagen = ImageDataGenerator(rescale=1./255)
batch_size = 32

# automagically retrieve images and their classes for train and validation sets
train_generator = datagen.flow_from_directory(
        train_data_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode='categorical')

validation_generator = datagen.flow_from_directory(
        validation_data_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode='categorical')
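
`flow_from_directory` assigns class labels from the subdirectory names in alphanumeric order, which is why `cats` becomes class 0 and `dogs` becomes class 1. With `class_mode='categorical'` each label is then one-hot encoded. The sketch below reproduces that mapping in plain Python for illustration; `class_indices` and `one_hot` are hypothetical helpers, not Keras functions.

```python
def class_indices(class_dirs):
    """Map class folder names to integer labels in sorted (alphanumeric) order."""
    return {name: index for index, name in enumerate(sorted(class_dirs))}

def one_hot(index, num_classes):
    """Encode an integer label as a one-hot vector."""
    return [1.0 if i == index else 0.0 for i in range(num_classes)]

indices = class_indices(["dogs", "cats"])
print(indices)                      # {'cats': 0, 'dogs': 1}
print(one_hot(indices["dogs"], 2))  # [0.0, 1.0]
```

The actual mapping used by the generators can be inspected with `train_generator.class_indices`.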

In [None]:
# Define Model

def base_model():
    model = Sequential()
    model.add(Convolution2D(32, (3, 3), input_shape=(img_width, img_height, 3)))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))

    model.add(Convolution2D(32, (3, 3)))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))

    model.add(Convolution2D(64, (3, 3)))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))

    model.add(Dropout(0.5))
    model.add(Flatten())
    model.add(Dense(64, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(2, activation='softmax'))

    model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
    
    return model
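
To see how the layers above shrink the 90x90 input, the output sizes and parameter counts can be worked out by hand. The sketch below assumes the Keras defaults used in `base_model` (3x3 convolutions with `valid` padding and stride 1, 2x2 max pooling); the totals should agree with what `cnn_n.summary()` reports.

```python
def conv_valid(size, kernel=3):
    """Spatial size after a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1

def max_pool(size, pool=2):
    """Spatial size after non-overlapping max pooling (floor division)."""
    return size // pool

size, channels = 90, 3
total_params = 0
for filters in (32, 32, 64):
    total_params += 3 * 3 * channels * filters + filters  # kernel weights + biases
    size = max_pool(conv_valid(size))
    channels = filters
    print("after conv + pool with {} filters: {}x{}x{}".format(filters, size, size, channels))

flattened = size * size * channels
total_params += flattened * 64 + 64  # Dense(64)
total_params += 64 * 2 + 2           # Dense(2)
print(flattened, total_params)       # 5184 360610
```

Most of the parameters sit in the first dense layer, which is why the two `Dropout(0.5)` layers are placed around it.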

In [None]:
epochs = 10 # One can increase the no. of epochs to get better accuracy
train_samples = 2048
validation_samples = 832
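
These sample counts are converted to step counts with floor division (`samples // batch_size`) when training. A small sketch of that arithmetic, using the batch size of 32 set earlier; `steps_for` is a hypothetical helper.

```python
def steps_for(samples, batch_size):
    """Return (full batches per pass, samples left over after floor division)."""
    return samples // batch_size, samples % batch_size

batch_size = 32
print(steps_for(2048, batch_size))  # (64, 0)
print(steps_for(832, batch_size))   # (26, 0)
```

Both counts divide evenly by 32, so no images are skipped in an epoch. If you change the dataset size or batch size, a non-zero remainder means some images are left out of each pass.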

In [None]:
cnn_n = base_model()
cnn_n.summary()

In [None]:
# Visualizing model structure
sequential_model_to_ascii_printout(cnn_n)

In [None]:
# Train the model
# In TensorFlow 2.x, fit() accepts generators directly; fit_generator() is deprecated
cnn_n.fit(
    train_generator,
    steps_per_epoch=train_samples // batch_size,
    epochs=epochs,
    validation_steps=validation_samples // batch_size,
    validation_data=validation_generator
)

In [None]:
!ls

In [None]:
cnn_n.save('dog_cat_cnn.h5')
!rm -f dog_cat_cnn.tar*
!tar -czvf dog_cat_cnn.tar.gz dog_cat_cnn.h5
!rm dog_cat_cnn.h5
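
The shell commands above assume a Unix-like environment. As a portable alternative, the archive can be created with Python's standard `tarfile` module. This is a sketch; `archive_model` is a hypothetical helper, and it returns the archive's member names so you can confirm the `.h5` file is stored at the top level.

```python
import tarfile
from pathlib import Path

def archive_model(model_path, archive_path):
    """Write model_path into a gzip-compressed tar archive and return its member names."""
    model_path = Path(model_path)
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname stores the file under its bare name, without parent directories
        tar.add(model_path, arcname=model_path.name)
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()

# Example call after cnn_n.save('dog_cat_cnn.h5'):
# archive_model("dog_cat_cnn.h5", "dog_cat_cnn.tar.gz")
```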

In [None]:
# The second argument is a number of batches (steps), not a number of samples
scores = cnn_n.evaluate(validation_generator, steps=validation_samples // batch_size)
print(scores)
print("Accuracy: %.2f%%" % (scores[1]*100))

### 2.2 Storing the model

In [None]:
import json
from ibm_watson_machine_learning import APIClient

wml_client = APIClient(WML_CREDENTIALS)
wml_client.version

In [None]:
wml_client.spaces.list(limit=10)

In [None]:
WML_SPACE_ID='***' # use space id here
wml_client.set.default_space(WML_SPACE_ID)

In [None]:
MODEL_NAME = "Dog-Cat binary"

In [None]:
software_spec_uid = wml_client.software_specifications.get_uid_by_name("tensorflow_rt22.2-py3.10")
print("Software Specification ID: {}".format(software_spec_uid))
model_props = {
        wml_client.repository.ModelMetaNames.NAME:"{}".format(MODEL_NAME),
        wml_client.repository.ModelMetaNames.TYPE: 'tensorflow_2.9',
        wml_client.repository.ModelMetaNames.SOFTWARE_SPEC_UID: software_spec_uid,
    }

In [None]:
print("Storing model ...")
published_model_details = wml_client.repository.store_model(
    model='dog_cat_cnn.tar.gz', 
    meta_props=model_props, 
)

model_uid = wml_client.repository.get_model_id(published_model_details)
print("Done")
print("Model ID: {}".format(model_uid))

### 2.3 Deploying the model

In [None]:
deployment_details = wml_client.deployments.create(
    model_uid, 
    meta_props={
        wml_client.deployments.ConfigurationMetaNames.NAME: "{}".format(MODEL_NAME + " deployment"),
        wml_client.deployments.ConfigurationMetaNames.ONLINE: {}
    }
)
scoring_url = wml_client.deployments.get_scoring_href(deployment_details)
deployment_uid=wml_client.deployments.get_id(deployment_details)

print("Scoring URL: {}".format(scoring_url))
print("Model id: {}".format(model_uid))
print("Deployment id: {}".format(deployment_uid))

<a id="subscription"></a>
## 3. Subscriptions

### 3.1 Configuring OpenScale

In [None]:
!pip install matplotlib
!pip install scikit-image

import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline 

import matplotlib.image as mpimg
from skimage.transform import resize

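# Load one training image (file names follow the directory layout in section 2)
img = mpimg.imread("data/train/dogs/dog.1.jpg")
# resize also converts the pixel values to floats in [0, 1], matching the training rescale
img = resize(img, (90, 90))
print(img.shape)
plt.imshow(img)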
plt.show()

In [None]:
from ibm_cloud_sdk_core.authenticators import CloudPakForDataAuthenticator
from ibm_watson_openscale import APIClient

from ibm_watson_openscale import *
from ibm_watson_openscale.supporting_classes.enums import *
from ibm_watson_openscale.supporting_classes import *


authenticator = CloudPakForDataAuthenticator(
        url=WOS_CREDENTIALS['url'],
        username=WOS_CREDENTIALS['username'],
        password=WOS_CREDENTIALS['password'],
        disable_ssl_verification=True
    )

wos_client = APIClient(service_url=WOS_CREDENTIALS['url'],authenticator=authenticator)
wos_client.version

In [None]:
#DB_CREDENTIALS= {"hostname":"","username":"","password":"","database":"","port":"","ssl":True,"sslmode":"","certificate_base64":""}
DB_CREDENTIALS = None
# Schema for an external data mart; set this when DB_CREDENTIALS is provided
SCHEMA_NAME = None
KEEP_MY_INTERNAL_POSTGRES = True

In [None]:
data_marts = wos_client.data_marts.list().result.data_marts
if len(data_marts) == 0:
    if DB_CREDENTIALS is not None:
        if SCHEMA_NAME is None: 
            raise ValueError("Please specify the SCHEMA_NAME and rerun the cell")

        print('Setting up external datamart')
        added_data_mart_result = wos_client.data_marts.add(
                background_mode=False,
                name="WOS Data Mart",
                description="Data Mart created by WOS tutorial notebook",
                database_configuration=DatabaseConfigurationRequest(
                  database_type=DatabaseType.POSTGRESQL,
                    credentials=PrimaryStorageCredentialsLong(
                        hostname=DB_CREDENTIALS['hostname'],
                        username=DB_CREDENTIALS['username'],
                        password=DB_CREDENTIALS['password'],
                        db=DB_CREDENTIALS['database'],
                        port=DB_CREDENTIALS['port'],
                        ssl=True,
                        sslmode=DB_CREDENTIALS['sslmode'],
                        certificate_base64=DB_CREDENTIALS['certificate_base64']
                    ),
                    location=LocationSchemaName(
                        schema_name= SCHEMA_NAME
                    )
                )
             ).result
    else:
        print('Setting up internal datamart')
        added_data_mart_result = wos_client.data_marts.add(
                background_mode=False,
                name="WOS Data Mart",
                description="Data Mart created by WOS tutorial notebook", 
                internal_database = True).result
        
    data_mart_id = added_data_mart_result.metadata.id
    
else:
    data_mart_id=data_marts[0].metadata.id
    print('Using existing datamart {}'.format(data_mart_id))

In [None]:
SERVICE_PROVIDER_NAME = "Image Binary WML V2_test"
SERVICE_PROVIDER_DESCRIPTION = "Added by tutorial WOS notebook."

In [None]:
service_providers = wos_client.service_providers.list().result.service_providers
for service_provider in service_providers:
    service_instance_name = service_provider.entity.name
    if service_instance_name == SERVICE_PROVIDER_NAME:
        service_provider_id = service_provider.metadata.id
        wos_client.service_providers.delete(service_provider_id)
        print("Deleted existing service_provider for WML instance: {}".format(service_provider_id))

In [None]:
added_service_provider_result = wos_client.service_providers.add(
        name=SERVICE_PROVIDER_NAME,
        description=SERVICE_PROVIDER_DESCRIPTION,
        service_type=ServiceTypes.WATSON_MACHINE_LEARNING,
        deployment_space_id = WML_SPACE_ID,
        operational_space_id = "production",
        credentials=WMLCredentialsCP4D(
            url=WML_CREDENTIALS["url"],
            username=WML_CREDENTIALS["username"],
            password=WML_CREDENTIALS["password"],
            instance_id=None
        ),
        background_mode=False
    ).result
service_provider_id = added_service_provider_result.metadata.id

In [None]:
asset_deployment_details = wos_client.service_providers.list_assets(
    data_mart_id=data_mart_id,
    service_provider_id=service_provider_id,
    deployment_id=deployment_uid,
    deployment_space_id=WML_SPACE_ID
).result['resources'][0]


In [None]:
model_asset_details_from_deployment = wos_client.service_providers.get_deployment_asset(
    data_mart_id=data_mart_id,
    service_provider_id=service_provider_id,
    deployment_id=deployment_uid,
    deployment_space_id=WML_SPACE_ID
)
model_asset_details_from_deployment

### 3.2 Subscribe the asset

In [None]:
subscriptions = wos_client.subscriptions.list().result.subscriptions
for subscription in subscriptions:
    sub_model_id = subscription.entity.asset.asset_id
    if sub_model_id == model_uid:
        wos_client.subscriptions.delete(subscription.metadata.id)
        print('Deleted existing subscription for model', sub_model_id)

In [None]:
from ibm_watson_openscale.base_classes.watson_open_scale_v2 import ScoringEndpointRequest

In [None]:
subscription_details = wos_client.subscriptions.add(
        data_mart_id=data_mart_id,
        service_provider_id=service_provider_id,
        asset=Asset(
            asset_id=model_asset_details_from_deployment["entity"]["asset"]["asset_id"],
            name=model_asset_details_from_deployment["entity"]["asset"]["name"],
            url=model_asset_details_from_deployment["entity"]["asset"]["url"],
            asset_type=AssetTypes.MODEL,
            input_data_type=InputDataType.UNSTRUCTURED_IMAGE,
            problem_type=ProblemType.BINARY_CLASSIFICATION
        ),
        deployment=AssetDeploymentRequest(
            deployment_id=asset_deployment_details['metadata']['guid'],
            name=asset_deployment_details['entity']['name'],
            deployment_type= DeploymentTypes.ONLINE,
            url=model_asset_details_from_deployment['entity']['asset']['url'],
            scoring_endpoint=ScoringEndpointRequest(url=scoring_url) # scoring model without shadow deployment
        ),
        asset_properties=AssetPropertiesRequest(
            probability_fields=['probability']
            )
    ).result
subscription_id = subscription_details.metadata.id
subscription_id

In [None]:
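import time

time.sleep(5)  # give OpenScale a moment to create the payload logging data set
payload_data_set_id = None
payload_data_sets = wos_client.data_sets.list(type=DataSetTypes.PAYLOAD_LOGGING, 
                                              target_target_id=subscription_id, 
                                              target_target_type=TargetTypes.SUBSCRIPTION).result.data_sets
# Check for an empty list first; indexing it directly would raise an IndexError
if payload_data_sets:
    payload_data_set_id = payload_data_sets[0].metadata.id
if payload_data_set_id is None:
    print("Payload data set not found. Please check subscription status.")
else:
    print("Payload data set id: ", payload_data_set_id)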

### 3.3 Score the model and get transaction-id

In [None]:
scoring_data={"input_data": [{"values": [img.tolist()]}]}
predictions = wml_client.deployments.score(deployment_uid, scoring_data)
print(predictions)
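
The scoring payload wraps a batch of images, so `values` has the shape (batch, height, width, channels). The sketch below checks that shape with a dummy all-zero image and no external libraries; `nested_shape` and `build_scoring_payload` are hypothetical helpers written for illustration.

```python
def nested_shape(values):
    """Return the dimensions of a regular, non-empty nested list."""
    shape = []
    while isinstance(values, list):
        shape.append(len(values))
        values = values[0]
    return tuple(shape)

def build_scoring_payload(images):
    """Wrap a list of images (nested lists) in the payload structure used above."""
    return {"input_data": [{"values": images}]}

# A dummy 90x90 RGB image made of zeros stands in for img.tolist()
dummy_image = [[[0.0, 0.0, 0.0] for _ in range(90)] for _ in range(90)]
payload = build_scoring_payload([dummy_image])
print(nested_shape(payload["input_data"][0]["values"]))  # (1, 90, 90, 3)
```

A shape that does not end in `(90, 90, 3)` usually means the image was not resized to the model's input size before scoring.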

In [None]:
wos_client.data_sets.get_records_count(payload_data_set_id)

<a id="explainability"></a>
## 4. Explainability

### 4.1 Configure Explainability

In [None]:
target = Target(
    target_type=TargetTypes.SUBSCRIPTION,
    target_id=subscription_id
)
parameters = {
    "enabled": True
}
explainability_details = wos_client.monitor_instances.create(
    data_mart_id=data_mart_id,
    background_mode=False,
    monitor_definition_id=wos_client.monitor_definitions.MONITORS.EXPLAINABILITY.ID,
    target=target,
    parameters=parameters
).result

explainability_monitor_id = explainability_details.metadata.id

### 4.2 Get explanation for the transaction

In [None]:
pl_records_resp = wos_client.data_sets.get_list_of_records(data_set_id=payload_data_set_id, limit=1, offset=0).result
scoring_ids = [pl_records_resp["records"][0]["entity"]["values"]["scoring_id"]]
print("Running explanations on scoring IDs: {}".format(scoring_ids))
explanation_types = ["lime", "contrastive"]
result = wos_client.monitor_instances.explanation_tasks(scoring_ids=scoring_ids, explanation_types=explanation_types, subscription_id=subscription_id).result
print(result)

In [None]:
explanation_task_id=result.to_dict()['metadata']['explanation_task_ids'][0]
explanation=wos_client.monitor_instances.get_explanation_tasks(explanation_task_id=explanation_task_id, subscription_id=subscription_id).result.to_dict()
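
Explanation tasks run asynchronously, so the result fetched right after submission may be incomplete. Instead of a fixed wait, the task can be polled until it finishes. The helper below is a generic sketch, not an SDK function; the commented usage assumes the response exposes its state under `entity.status.state`, which you should verify against your own output.

```python
import time

def wait_until(fetch, is_done, timeout=120, interval=5, sleep=time.sleep):
    """Call fetch() every `interval` seconds until is_done(result) is true or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        result = fetch()
        if is_done(result):
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("Task did not finish within {} seconds".format(timeout))
        sleep(interval)

# Hypothetical usage; the "state" field and its values are assumptions about the response:
# explanation = wait_until(
#     fetch=lambda: wos_client.monitor_instances.get_explanation_tasks(
#         explanation_task_id=explanation_task_id,
#         subscription_id=subscription_id).result.to_dict(),
#     is_done=lambda e: e["entity"]["status"]["state"] in ("finished", "error"),
# )
```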

### 4.3 View the explanation images

In [None]:
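!pip install Pillow
from PIL import Image
import base64
import io

time.sleep(45)  # wait 45 seconds so the explanation task can finish
# Fetch the task again; the copy retrieved before the wait may still be incomplete
explanation = wos_client.monitor_instances.get_explanation_tasks(explanation_task_id=explanation_task_id, subscription_id=subscription_id).result.to_dict()
pred = explanation["entity"]['explanations'][0]['predictions'][0]
print("Explanation for {} region:".format(pred["value"]))

# The explanation image is returned as a base64-encoded string
img = pred["explanation"][0]["full_image"]
img_data = base64.b64decode(img)
Image.open(io.BytesIO(img_data))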

In [None]:
pred = explanation["entity"]['explanations'][0]['predictions'][1]
print("Explanation for {} region:".format(pred["value"]))

img = pred["explanation"][0]["full_image"]
img_data = base64.b64decode(img)
Image.open(io.BytesIO(img_data))
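
To keep an explanation image outside the notebook, the decoded bytes can be written straight to disk. This is a minimal sketch: `save_base64_image` is a hypothetical helper, and the `.png` extension in the usage comment is an assumption about the image format.

```python
import base64
from pathlib import Path

def save_base64_image(encoded, path):
    """Decode a base64 string, write the raw bytes to `path`, and return the byte count."""
    data = base64.b64decode(encoded)
    Path(path).write_bytes(data)
    return len(data)

# Hypothetical usage with the `pred` dict from the cell above:
# save_base64_image(pred["explanation"][0]["full_image"],
#                   "explanation_{}.png".format(pred["value"]))
```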

<a id="Quality"></a>
## 5. Quality

### 5.1 Configure Quality

In [None]:
target = Target(
    target_type=TargetTypes.SUBSCRIPTION,
    target_id=subscription_id
)
parameters = {
    "min_feedback_data_size": 20
}
quality_details = wos_client.monitor_instances.create(
    data_mart_id=data_mart_id,
    background_mode=False,
    monitor_definition_id=wos_client.monitor_definitions.MONITORS.QUALITY.ID,
    target=target,
    parameters=parameters
).result

quality_monitor_id = quality_details.metadata.id

### 5.2 Store feedback record

In [None]:
feedback_dataset_id = None
feedback_dataset = wos_client.data_sets.list(type=DataSetTypes.FEEDBACK, 
                                                target_target_id=subscription_id, 
                                                target_target_type=TargetTypes.SUBSCRIPTION).result
print(feedback_dataset)
# Check for an empty list first; indexing it directly would raise an IndexError
if feedback_dataset.data_sets:
    feedback_dataset_id = feedback_dataset.data_sets[0].metadata.id
if feedback_dataset_id is None:
    print("Feedback data set not found. Please check quality monitor status.")

In [None]:
feedback = []
# Validation images are numbered from 1025 (see the directory layout in section 2)
for i in range(10):
    index = 1025 + i
    catimagePath = "data/validation/cats/cat." + str(index) + ".jpg"
    dogimagePath = "data/validation/dogs/dog." + str(index) + ".jpg"
    catimg = mpimg.imread(catimagePath)
    catimg = resize(catimg, (90, 90))
    dogimg = mpimg.imread(dogimagePath)
    dogimg = resize(dogimg, (90, 90))
    # Targets use the same encoding as the model: 0 = cat, 1 = dog
    catelement = {"scoring_input":catimg.tolist(), "target":0}
    dogelement = {"scoring_input":dogimg.tolist(), "target":1}
    feedback.append(catelement)
    feedback.append(dogelement)
wos_client.data_sets.store_records(feedback_dataset_id, request_body=feedback, background_mode=False)
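
The loop above stores 10 cat and 10 dog records, which exactly meets the `min_feedback_data_size` of 20 configured for the quality monitor. The sketch below shows the record structure and that count check with tiny placeholder images; `make_feedback_record` and `build_feedback` are hypothetical helpers.

```python
LABELS = {"cat": 0, "dog": 1}  # same encoding as the model: 0 = cat, 1 = dog

def make_feedback_record(pixels, animal):
    """Pair one image (nested lists) with its integer target."""
    return {"scoring_input": pixels, "target": LABELS[animal]}

def build_feedback(images_by_animal):
    """images_by_animal maps 'cat' or 'dog' to a list of images given as nested lists."""
    return [make_feedback_record(pixels, animal)
            for animal, images in images_by_animal.items()
            for pixels in images]

tiny = [[[0.0, 0.0, 0.0]]]  # placeholder for a 90x90x3 image
feedback_records = build_feedback({"cat": [tiny] * 10, "dog": [tiny] * 10})
min_feedback_data_size = 20
print(len(feedback_records) >= min_feedback_data_size)  # True
```

If you lower the number of images per class, keep the total at or above the configured minimum before running the monitor.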

### 5.3 Run Quality monitor

In [None]:
run_details = wos_client.monitor_instances.run(monitor_instance_id=quality_monitor_id, background_mode=False).result

In [None]:
wos_client.monitor_instances.show_metrics(monitor_instance_id=quality_monitor_id)