# Amazon SageMaker model bias monitor: Fairness and explainability with SageMaker Clarify

Amazon SageMaker Clarify helps improve your machine learning models by detecting potential bias and helping explain how these models make predictions. The fairness and explainability functionality provided by Clarify takes a step towards helping you to build trustworthy and understandable machine learning models. The product comes with the tools to help you with the following tasks:

- Measure biases that can occur during each stage of the machine learning (ML) lifecycle (data collection, model training and tuning, and monitoring of ML models deployed for inference).
- Generate model governance reports targeting risk and compliance teams and external regulators.
- Provide explanations of the data, models, and monitoring used to assess predictions.

In this lab exercise, the production model and SageMaker endpoint are already deployed. You will use SageMaker Clarify to analyze the training and validation datasets, measuring the pre-training bias of a dataset and the post-training bias of a model. Finally, you will generate and review an explainability report on the importance of the various input features to the model’s decision.


![MLOPS Amazon Clarify architecture](images/ClarifySDK.png)



## Imports and configurations

Run the following cell to import the required Python modules.

In [2]:
import os
import copy
import json
import random
import time
import boto3
import pandas as pd

# Import the datetime class directly; a separate "import datetime" would be shadowed by it.
from datetime import datetime, timedelta

from sagemaker import get_execution_role, image_uris, Session
from sagemaker.s3 import S3Downloader, S3Uploader
from sagemaker.predictor import Predictor

from sagemaker.model_monitor import (
    BiasAnalysisConfig,
    CronExpressionGenerator,
    DataCaptureConfig,
    EndpointInput,
    ExplainabilityAnalysisConfig,
    ModelBiasMonitor,
    ModelExplainabilityMonitor,
)

from sagemaker.clarify import (
    BiasConfig,
    DataConfig,
    ModelConfig,
    ModelPredictedLabelConfig,
    SHAPConfig,
)

Run the following code to determine the full name of the modelArtifactBucket for use later.

In [3]:
bucket = ''
s3 = boto3.resource('s3')
for s3_bucket in s3.buckets.all():
    # Match the lab's model artifact bucket by a fragment of its name.
    if 'modelartifactbucket' in s3_bucket.name:
        bucket = s3_bucket.name

print(f"Bucket Name: {bucket}")

Bucket Name: qls-5142924-fc6beeb7d2b314f5-modelartifactbucket-1f8d8cpcntqjm


Run the following code to retrieve the SageMaker production endpoint name and production model name, and to wait until the endpoint is in service.

In [4]:
# Get production endpoint name
endpoint_name = boto3.Session().client('sagemaker').list_endpoints(SortBy='CreationTime')['Endpoints'][0]['EndpointName']

# Get production model name
model_name = boto3.Session().client('sagemaker').list_models(SortBy='CreationTime')['Models'][0]['ModelName']

# Check if the endpoint is in service
client = boto3.client('sagemaker')
result = False
while result is False:
    response = client.describe_endpoint(EndpointName=endpoint_name)
    if (response['EndpointStatus'] == 'InService'):
        print(f"Production Endpoint Name:{endpoint_name}")
        print(f"Production Model Name:{model_name}")
        result = True
    else:
        print("Waiting for endpoint to be active")
        time.sleep(5)
        result = False

Production Endpoint Name:ProductionEndpoint-hjzMextUiqLd
Production Model Name:ProductionModel-5xNxXxd85t7t
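The polling loop above keeps running until the endpoint reports `InService`, so it never returns if the endpoint ends up in a failed state. The following is a minimal sketch of a bounded variant, not part of the lab code. The helper name `wait_for_status` and its parameters are hypothetical, and `get_status` stands in for any zero-argument callable that returns the current status string.

```python
import time


def wait_for_status(get_status, target="InService", failed=("Failed",),
                    timeout_s=600.0, poll_s=5.0, sleep=time.sleep):
    """Poll get_status() until it returns target, a failed state, or the timeout passes.

    Hypothetical helper: get_status is any zero-argument callable that
    returns the current status string.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status == target:
            return status
        if status in failed:
            raise RuntimeError(f"Endpoint entered terminal state: {status}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Still {status!r} after {timeout_s} seconds")
        sleep(poll_s)
```

In the notebook, the callable could be `lambda: client.describe_endpoint(EndpointName=endpoint_name)['EndpointStatus']`. boto3 also provides a built-in `endpoint_in_service` waiter through `client.get_waiter`, which may be simpler when custom handling is not needed.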


Run the following code to get the required IAM Role and AWS Region.

In [5]:
role = get_execution_role()
print(f"RoleArn: {role}")

sagemaker_session = Session()
sagemaker_client = sagemaker_session.sagemaker_client
sagemaker_runtime_client = sagemaker_session.sagemaker_runtime_client

region = sagemaker_session.boto_region_name
print(f"AWS region: {region}")

RoleArn: arn:aws:iam::704275265636:role/qls-5142924-fc6beeb7d2b314f5-SageMakerStudioRole-T3DP9PS80MKD
AWS region: us-west-2


Run the following code to create an Amazon S3 path for storing captured data and reports.

In [6]:
prefix = 'sagemaker'
s3_key = f"s3://{bucket}/{prefix}"
print(f"S3 key: {s3_key}")

code_prefix = '{}/code'.format(prefix)
data_capture_prefix = f'{prefix}/datacapture'
s3_capture_upload_path = f'{s3_key}/datacapture'
s3_report_path = f'{s3_key}/reports'

print(f"Capture path: {s3_capture_upload_path}")
print(f"Report path: {s3_report_path}")


S3 key: s3://qls-5142924-fc6beeb7d2b314f5-modelartifactbucket-1f8d8cpcntqjm/sagemaker
Capture path: s3://qls-5142924-fc6beeb7d2b314f5-modelartifactbucket-1f8d8cpcntqjm/sagemaker/datacapture
Report path: s3://qls-5142924-fc6beeb7d2b314f5-modelartifactbucket-1f8d8cpcntqjm/sagemaker/reports


Run the following code to set the paths of the train and validate data files and to read the column headers from the training file. The first column is the label.

In [7]:
test_dataset = "data/validate.csv"
train_dataset = "data/train.csv"
dataset_type = "text/csv"

with open(train_dataset) as f:
    headers_line = f.readline().rstrip()
all_headers = headers_line.split(",")
label_header = all_headers[0]

## Testing endpoint with sample traffic

Run the following code to extract a subset of samples from the validation dataset and write the sample data to local CSV files. You will have two files:

- data-test.csv contains only the feature data. 
- data-test-label.csv contains the label and feature data.

In [8]:
# Select a fixed block of rows (positions 10 through 39) from the validation dataset.
import itertools

shape = pd.read_csv(test_dataset)

a = [10*i for i in range(3)]
b = [10+i for i in range(10)]
indices = [i+j for i,j in itertools.product(a,b)]

test_data = shape.drop(shape.columns[[0]],axis=1)
test_data = test_data.iloc[indices]
test_data_with_label = shape.iloc[indices]

In [9]:
test_data.to_csv("data/data-test.csv",index=False,header=False)
test_data_with_label.to_csv("data/data-test-label.csv",index=False,header=False)
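The index arithmetic in the cell above is deterministic. As a quick check, the following sketch repeats the same computation without pandas and shows that it produces the 30 consecutive row positions from 10 through 39.

```python
import itertools

# Same index arithmetic as the cell above.
a = [10 * i for i in range(3)]    # [0, 10, 20]
b = [10 + i for i in range(10)]   # [10, 11, ..., 19]
indices = [i + j for i, j in itertools.product(a, b)]

# The result is the contiguous block of row positions 10 through 39.
same_as_range = indices == list(range(10, 40))
print(indices[0], indices[-1], len(indices), same_as_range)
```

If a random subset is preferred, one option is pandas' `DataFrame.sample` with a fixed `random_state`, which keeps the selection reproducible.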

Run the following code to perform the predictions. You will use the sample dataset you created in the previous steps as the payload, sending one row per request. At the end of the run, you will see each prediction as "0" or "1".

In [10]:
# Invoke Endpoint with test data for a quick test
print(f"Sending test traffic to the endpoint {endpoint_name}. \nPlease wait...")
predictions = ''

with open('data/data-test.csv', 'r') as f:
    for row in f:
        print(".", end="", flush=True)
        payload = row.rstrip('\n')
        response = sagemaker_runtime_client.invoke_endpoint(EndpointName=endpoint_name,
                                   ContentType="text/csv",
                                   Body=payload)
        predictions = ','.join([predictions, response['Body'].read().decode('utf-8')])
        time.sleep(0.5)

predictions = predictions.replace('\n','')
predictions = predictions.split(",")
predictions.pop(0)
print("="*20)
print(predictions)
print("Done!") 

Sending test traffic to the endpoint ProductionEndpoint-hjzMextUiqLd. 
Please wait...
['0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '1', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0', '0']
Done!
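The cell above accumulates predictions by joining strings and then splitting them again. As an alternative, here is a minimal sketch that appends each response to a list. The names `collect_predictions` and `fake_invoke` are hypothetical; `fake_invoke` is a stand-in for a function that wraps `invoke_endpoint` and returns the decoded response body, so the sketch runs without an endpoint.

```python
def collect_predictions(rows, invoke):
    """Send each CSV row through `invoke` and return the stripped responses.

    `invoke` is a hypothetical stand-in for a function that wraps
    sagemaker_runtime_client.invoke_endpoint and returns the decoded body.
    """
    predictions = []
    for row in rows:
        payload = row.rstrip("\n")
        if not payload:
            continue  # skip blank lines
        predictions.append(invoke(payload).strip())
    return predictions


# Fake endpoint for illustration only: returns "1" when the first feature exceeds 100.
def fake_invoke(payload):
    first_feature = float(payload.split(",")[0])
    return "1\n" if first_feature > 100 else "0\n"


result = collect_predictions(["105,0,193.7\n", "42,1,80.0\n"], fake_invoke)
print(result)  # ['1', '0']
```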


## View captured data

Run the following code to list the data capture files stored in S3. You should expect to see different files from different time periods organized based on the hour in which the invocation occurred. Rerun the cell if you see failure logs in the output.


In [11]:
print("Waiting 2 minutes for file to load...")
time.sleep(120)
s3_client = boto3.Session().client('s3')
current_endpoint_capture_prefix = '{}/{}'.format(data_capture_prefix, endpoint_name)
result = s3_client.list_objects(Bucket=bucket, Prefix=current_endpoint_capture_prefix)
capture_files = [capture_file.get("Key") for capture_file in result.get('Contents')]
print("Found Capture Files:")
print("\ns3://" + bucket + "/" + str(capture_files[0]))

Waiting 2 minutes for file to load...
Found Capture Files:

s3://qls-5142924-fc6beeb7d2b314f5-modelartifactbucket-1f8d8cpcntqjm/sagemaker/datacapture/ProductionEndpoint-hjzMextUiqLd/ProductionModel-5xNxXxd85t7t/2021/12/08/06/34-31-704-0a2f5311-f320-41fd-8803-ad15a36b7a3d.jsonl
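The capture key in the output above ends with `yyyy/mm/dd/hh/<file>.jsonl`. Assuming every capture key follows that layout, the sketch below groups keys by the hour in which the invocation occurred. The helper names `capture_hour` and `group_by_hour` are hypothetical.

```python
from collections import defaultdict


def capture_hour(key):
    """Return 'YYYY-MM-DD HH' parsed from the trailing yyyy/mm/dd/hh/file parts of a key."""
    year, month, day, hour, _filename = key.split("/")[-5:]
    return f"{year}-{month}-{day} {hour}"


def group_by_hour(keys):
    """Map each capture hour to the list of keys written in that hour."""
    groups = defaultdict(list)
    for key in keys:
        groups[capture_hour(key)].append(key)
    return dict(groups)


# Key copied from the cell output above.
example_key = (
    "sagemaker/datacapture/ProductionEndpoint-hjzMextUiqLd/"
    "ProductionModel-5xNxXxd85t7t/2021/12/08/06/"
    "34-31-704-0a2f5311-f320-41fd-8803-ad15a36b7a3d.jsonl"
)
print(group_by_hour([example_key]))
```

In the notebook, `group_by_hour(capture_files)` would apply the same grouping to the keys returned by `list_objects`.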


Run the following code to view the contents of a single capture file. The file is in JSON Lines format, with one SageMaker-specific JSON record per captured request. The cell prints the first 2,000 characters of the file.

In [12]:
def get_obj_body(obj_key):
    return s3_client.get_object(Bucket=bucket, Key=obj_key).get('Body').read().decode("utf-8")

capture_file = get_obj_body(capture_files[-1])
print(capture_file[:2000])

{"captureData":{"endpointInput":{"observedContentType":"text/csv","mode":"INPUT","data":"105,0,193.7,108,183.2,124,293.7,72,10.8,5,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,1,0","encoding":"CSV"},"endpointOutput":{"observedContentType":"text/csv; charset=utf-8","mode":"OUTPUT","data":"0\n","encoding":"CSV"}},"eventMetadata":{"eventId":"a51e65b5-ee06-4084-8d43-132030dd68f0","inferenceTime":"2021-12-08T06:34:31Z"},"eventVersion":"0"}
{"captureData":{"endpointInput":{"observedContentType":"text/csv","mode":"INPUT","data":"114,4,141.3,96,230.4,88,223.7,85,9.4,3,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,1","encoding":"CSV"},"endpointOutput":{"observedContentType":"text/csv; charset=utf-8","mode":"OUTPUT","data":"0\n","encoding":"CSV"}},"eventMetadata":{"eventId":"a379a935-8b7f-479d-8311-2e04c88f1daf","inferenceTime":"2021-12-08T06:34:32Z"},"eventV

Run the following code to pretty-print the first record of the capture file as formatted JSON.

In [13]:
import json
print(json.dumps(json.loads(capture_file.split('\n')[0]), indent=2))

{
  "captureData": {
    "endpointInput": {
      "observedContentType": "text/csv",
      "mode": "INPUT",
      "data": "105,0,193.7,108,183.2,124,293.7,72,10.8,5,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,1,0",
      "encoding": "CSV"
    },
    "endpointOutput": {
      "observedContentType": "text/csv; charset=utf-8",
      "mode": "OUTPUT",
      "data": "0\n",
      "encoding": "CSV"
    }
  },
  "eventMetadata": {
    "eventId": "a51e65b5-ee06-4084-8d43-132030dd68f0",
    "inferenceTime": "2021-12-08T06:34:31Z"
  },
  "eventVersion": "0"
}
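Each captured record pairs the request payload with the model response. The sketch below pulls those pieces out of one JSON line. It is illustrative only: `parse_capture_line` is a hypothetical helper, and the sample record copies the structure shown above with the feature string shortened to three values.

```python
import json


def parse_capture_line(line):
    """Split one data-capture JSON line into feature strings, prediction, and timestamp."""
    record = json.loads(line)
    capture = record["captureData"]
    features = capture["endpointInput"]["data"].split(",")
    prediction = capture["endpointOutput"]["data"].strip()
    inference_time = record["eventMetadata"]["inferenceTime"]
    return features, prediction, inference_time


# Illustrative record: same structure as the output above, but with the
# feature string shortened to three values for readability.
sample_line = json.dumps({
    "captureData": {
        "endpointInput": {"observedContentType": "text/csv", "mode": "INPUT",
                          "data": "105,0,193.7", "encoding": "CSV"},
        "endpointOutput": {"observedContentType": "text/csv; charset=utf-8",
                           "mode": "OUTPUT", "data": "0\n", "encoding": "CSV"},
    },
    "eventMetadata": {"eventId": "a51e65b5-ee06-4084-8d43-132030dd68f0",
                      "inferenceTime": "2021-12-08T06:34:31Z"},
    "eventVersion": "0",
})

features, prediction, inference_time = parse_capture_line(sample_line)
print(features, prediction, inference_time)
```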


## Amazon SageMaker Clarify setup

**Note:** The following code samples are adapted from [sample code](https://github.com/aws/amazon-sagemaker-examples/blob/master/sagemaker_processing/fairness_and_explainability/fairness_and_explainability.ipynb) provided by the Amazon SageMaker team. Some of the code has been updated to meet the lab requirements.

Run the following code to use the SageMaker Clarify processor.

The Clarify processor sets up the infrastructure that runs the bias detection on a distributed cluster. In this example, there is only one instance in the cluster. To learn more about Clarify configuration, see [Configure an Amazon SageMaker Clarify Processing Jobs for Fairness and Explainability](https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-configure-processing-jobs.html).

In [14]:
from sagemaker import clarify
clarify_processor = clarify.SageMakerClarifyProcessor(role=role,
                                                      instance_count=1,
                                                      instance_type='ml.m5.xlarge',
                                                      sagemaker_session=sagemaker_session)

Run the following code to upload the train and validation datasets to S3. The _train_uri_ and _test_uri_ values are the S3 locations of the uploaded files. Clarify reads its input data directly from S3 when you define the DataConfig in the next steps.

In [15]:
from sagemaker.s3 import S3Uploader
from sagemaker.inputs import TrainingInput

train_uri = S3Uploader.upload('data/train.csv', 's3://{}/{}'.format(bucket, prefix))
train_input = TrainingInput(train_uri, content_type='csv')
test_uri = S3Uploader.upload('data/validate.csv', 's3://{}/{}'.format(bucket, prefix))

## Detecting bias

#### SageMaker Clarify helps you detect possible pre- and post-training biases using a variety of metrics.

### Writing DataConfig and ModelConfig

A **DataConfig** object communicates some basic information about data I/O to Clarify. You will specify where to find the input dataset, where to store the output, the target column (label), the header names, and the dataset type.



Run the following code to define DataConfig.

In [16]:
bias_report_output_path = 's3://{}/{}/clarify-bias'.format(bucket, prefix)
bias_data_config = clarify.DataConfig(s3_data_input_path=train_uri,
                                      s3_output_path=bias_report_output_path,
                                      label='Churn',
                                      headers=all_headers,
                                      dataset_type='text/csv')

A **ModelConfig** object communicates information about your trained model. To avoid additional traffic to your production models, Clarify sets up and tears down a dedicated endpoint when processing.

- **instance_type** and **instance_count** specify the preferred instance type and instance count used to run your model while processing in Clarify. The testing dataset is small, so a single standard instance is enough to run this example. If you have a large and complex dataset, you might want to use a larger instance type to speed up processing, or add more instances to enable Spark parallelization.

- **accept_type** denotes the endpoint response payload format, and **content_type** denotes the payload format of request to the endpoint.

Run the following code to define ModelConfig.

In [17]:
model_config = clarify.ModelConfig(model_name=model_name,
                                   instance_type='ml.m5.xlarge',
                                   instance_count=1,
                                   accept_type='text/csv',
                                   content_type='text/csv')

A **ModelPredictedLabelConfig** provides information on the format of your predictions. This is a binary classification problem. The random forest model returns a probability for each sample, so Clarify invokes the endpoint and then uses probability_threshold to convert each probability to a binary label for bias analysis. A prediction above the threshold is interpreted as label value 1; a prediction below or equal to the threshold is interpreted as label value 0.

Run the following code to define ModelPredictedLabelConfig for Clarify.

In [18]:
predictions_config = clarify.ModelPredictedLabelConfig(probability_threshold=0.8)
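To make the threshold rule concrete, the sketch below applies it to a few made-up probabilities. It mirrors the rule as described in the text and is not Clarify's implementation; `to_label` is a hypothetical helper.

```python
def to_label(probability, threshold=0.8):
    """Map a probability to a binary label: 1 if strictly above the threshold, else 0."""
    return 1 if probability > threshold else 0


# Made-up probabilities for illustration.
scores = [0.10, 0.80, 0.81, 0.95]
labels = [to_label(p) for p in scores]
print(labels)  # [0, 0, 1, 1]
```

Note that a score exactly equal to the threshold (0.80 here) maps to label 0.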

### Writing BiasConfig
Clarify also needs to know which columns are sensitive (facets), which values or threshold identify the sensitive group (facet_values_or_threshold), and which outcomes are desirable (label_values_or_threshold). Clarify can handle both categorical and continuous data for facet_values_or_threshold and for label_values_or_threshold. In this case, the label (Churn) is categorical, and the facet (Account Length) is numeric, so the value 100 acts as a threshold.

This information is specified in the BiasConfig API. Here, the positive outcome is Churn = 0, _Account Length_ is the sensitive facet, the threshold of 100 on _Account Length_ defines the sensitive group, and _Day Calls_ (passed as group_name) is used to form subgroups for the measurement.

In [19]:
bias_config = clarify.BiasConfig(label_values_or_threshold=[0],
                                facet_name='Account Length',
                                facet_values_or_threshold=[100],
                                group_name='Day Calls')

#### **Pre-training bias**

Bias can be present in your data before any model training occurs. Inspecting your data for bias before training begins can help detect any data collection gaps, inform your feature engineering, and help you understand what societal biases the data may reflect.

Computing pre-training bias metrics does not require a trained model.

#### **Post-training bias**

Computing post-training bias metrics does require a trained model.

Unbiased training data (as determined by concepts of fairness measured by bias metrics) may still result in biased model predictions after training. Whether this occurs depends on several factors, including hyperparameter choices.

You can run these options separately with run_pre_training_bias() and run_post_training_bias() or at the same time with run_bias() as shown in the following run.

Run the following cell to start the Clarify processor for evaluation. It typically takes 10 minutes to complete the Clarify run. Clarify creates an ephemeral SageMaker endpoint for compute and deletes the resource once the calculation is complete.

In [20]:
clarify_processor.run_bias(data_config=bias_data_config,
                           bias_config=bias_config,
                           model_config=model_config,
                           model_predicted_label_config=predictions_config,
                           pre_training_methods='all',
                           post_training_methods='all')


Job Name:  Clarify-Bias-2021-12-08-06-39-28-369
Inputs:  [{'InputName': 'dataset', 'AppManaged': False, 'S3Input': {'S3Uri': 's3://qls-5142924-fc6beeb7d2b314f5-modelartifactbucket-1f8d8cpcntqjm/sagemaker/train.csv', 'LocalPath': '/opt/ml/processing/input/data', 'S3DataType': 'S3Prefix', 'S3InputMode': 'File', 'S3DataDistributionType': 'FullyReplicated', 'S3CompressionType': 'None'}}, {'InputName': 'analysis_config', 'AppManaged': False, 'S3Input': {'S3Uri': 's3://qls-5142924-fc6beeb7d2b314f5-modelartifactbucket-1f8d8cpcntqjm/sagemaker/clarify-bias/analysis_config.json', 'LocalPath': '/opt/ml/processing/input/config', 'S3DataType': 'S3Prefix', 'S3InputMode': 'File', 'S3DataDistributionType': 'FullyReplicated', 'S3CompressionType': 'None'}}]
Outputs:  [{'OutputName': 'analysis_result', 'AppManaged': False, 'S3Output': {'S3Uri': 's3://qls-5142924-fc6beeb7d2b314f5-modelartifactbucket-1f8d8cpcntqjm/sagemaker/clarify-bias', 'LocalPath': '/opt/ml/processing/output', 'S3UploadMode': 'EndOfJob

Now that the run is complete, let's review the bias report.

#### **Viewing the bias report**

In Studio, you can view the results under the **Experiments** tab.

1. In the left pane, choose **SageMaker components and registries**.

1. Select **Experiments and trials** from the drop-down menu.

1. Open (double-click) **Unassigned trial components**.

1. Open (double-click) the trial with a name that contains **clarify-bias**.

   The bias report will open in a new tab. After you have reviewed it, you will need to return to this tab (the tab labeled *model_bias_clarify.ipynb*) to view the remaining instructions.

1. In the new tab that opens, select **bias report** to review the report.

<img src="./recordings/bias_report.gif">

Each bias metric has detailed explanations with examples that you can explore. Here are a few metrics to review.

- **Class Imbalance (CI):** Detects if the advantaged group is represented in the dataset at a substantially higher rate than the disadvantaged group, or vice versa.
- **Total Variation Distance (TVD):** This measure of distance in label distributions is half the L1 distance (the sum of absolute differences) between the probability distribution of labels of the advantaged class and the probability distribution of labels of the disadvantaged class.
- **Conditional Demographic Disparity in Predicted Labels (CDDPL):** The metric examines whether the model predicted a bigger proportion of rejected outcomes for the disadvantaged class than the proportion of accepted outcomes for the same class.
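As a rough illustration of the first two metrics, the sketch below computes them on made-up labels. It assumes the usual definitions: CI as (n_a - n_d) / (n_a + n_d) for facet sizes n_a and n_d, and TVD as half the sum of absolute differences between the two label distributions. The function names are hypothetical, and this is not the code Clarify runs.

```python
from collections import Counter


def class_imbalance(n_advantaged, n_disadvantaged):
    """CI = (n_a - n_d) / (n_a + n_d); ranges from -1 to 1, and 0 means balanced."""
    return (n_advantaged - n_disadvantaged) / (n_advantaged + n_disadvantaged)


def label_distribution(labels):
    """Return the observed probability of each label value."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


def total_variation_distance(labels_a, labels_d):
    """TVD = 0.5 * sum over label values of |P_a(y) - P_d(y)|."""
    p_a = label_distribution(labels_a)
    p_d = label_distribution(labels_d)
    outcomes = set(p_a) | set(p_d)
    return 0.5 * sum(abs(p_a.get(y, 0.0) - p_d.get(y, 0.0)) for y in outcomes)


# Made-up toy labels (not from the lab dataset): 1 = churn, 0 = no churn.
labels_a = [0, 0, 0, 1, 0, 0, 0, 1]   # 8 rows in the advantaged facet
labels_d = [0, 1, 1, 0]               # 4 rows in the disadvantaged facet

ci = class_imbalance(len(labels_a), len(labels_d))
tvd = total_variation_distance(labels_a, labels_d)
print(f"CI = {ci:.3f}, TVD = {tvd:.3f}")  # CI = 0.333, TVD = 0.250
```

Here the advantaged facet has twice as many rows, which gives CI = 1/3, and the churn rate differs by 0.25 between the facets, which gives TVD = 0.25.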

<img src="./recordings/bias_detail.gif">

You could also summarize the results in a handy table. Choose the **table** icon on the right side of the page.

<img src="./recordings/bias_report_chart.gif">



You can view the bias report (formatted as PDF, HTML, and ipynb files) in the following S3 location.

In [21]:
bias_report_output_path

's3://qls-5142924-fc6beeb7d2b314f5-modelartifactbucket-1f8d8cpcntqjm/sagemaker/clarify-bias'

## Explaining predictions

There are expanding business needs and legislative regulations that require explanations of why a model made the decision it did. SageMaker Clarify uses SHapley Additive exPlanations (SHAP) to explain the contribution that each input feature makes to the final decision. SHAP analyzes, for each data instance, the individual contribution of feature values to the predicted output, and it represents them as a positive or negative value.

The Kernel SHAP algorithm requires a baseline (also known as a background dataset). The baseline dataset type must be the same as the dataset_type of DataConfig, and baseline samples must include only features. The baseline can be either an S3 URI to a baseline dataset file or an in-place list of samples. In this case, you use the latter and put the first sample of the test dataset in an in-place list.
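To show what a baseline means for SHAP values, the sketch below computes exact Shapley values for a tiny made-up model by averaging over all feature orderings. Features that are not yet "switched on" take their baseline value. This is for intuition only: Kernel SHAP approximates these values by sampling, and the names `shapley_values` and `toy_model` are hypothetical.

```python
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Exact Shapley values of `instance` relative to `baseline`.

    Features absent from a coalition take their baseline value. Averaging
    over all orderings is exact but only feasible for a few features.
    """
    n = len(instance)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = predict(current)
        for feature in order:
            # Switch this feature from its baseline value to the instance value.
            current[feature] = instance[feature]
            value = predict(current)
            totals[feature] += value - previous
            previous = value
    return [total / len(orderings) for total in totals]


def toy_model(x):
    # Hypothetical scoring function with one interaction term.
    return 2.0 * x[0] + 1.0 * x[1] + 0.5 * x[0] * x[2]


baseline = [0.0, 0.0, 0.0]
instance = [1.0, 3.0, 2.0]
phi = shapley_values(toy_model, instance, baseline)
print(phi)  # [2.5, 3.0, 0.5]
```

The values sum to the difference between the prediction for the instance (6.0) and the prediction for the baseline (0.0), and the interaction term is split evenly between the two features involved.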

Run the following code to add SHAP configuration.

In [22]:
shap_config = clarify.SHAPConfig(baseline=[test_data.iloc[0].values.tolist()],
                                 num_samples=15,
                                 agg_method='mean_abs',
                                 save_local_shap_values=False)

explainability_output_path = 's3://{}/{}/clarify-explainability'.format(bucket, prefix)
explainability_data_config = clarify.DataConfig(s3_data_input_path=train_uri,
                                s3_output_path=explainability_output_path,
                                label='Churn',
                                headers=all_headers,
                                dataset_type='text/csv')
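The `agg_method='mean_abs'` setting controls how per-instance (local) SHAP values are combined into one global importance per feature. A minimal sketch of that aggregation, assuming it is the mean of absolute local values, with made-up numbers and a hypothetical helper name:

```python
def aggregate_mean_abs(local_values):
    """Global importance per feature: mean of absolute local SHAP values."""
    n_rows = len(local_values)
    n_features = len(local_values[0])
    return [
        sum(abs(row[j]) for row in local_values) / n_rows
        for j in range(n_features)
    ]


# Made-up local SHAP values: one row per instance, one column per feature.
local_values = [
    [0.5, -0.25, 0.0],
    [-0.5, 0.75, 0.25],
]
global_importance = aggregate_mean_abs(local_values)
print(global_importance)  # [0.5, 0.5, 0.125]
```

Taking absolute values first prevents positive and negative contributions from cancelling out: the first feature would average to zero without it.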

Run the following cell to start Clarify and evaluate explainability. It typically takes 10 minutes to complete the task.

While you are waiting for the process to complete, you can learn more about [Clarify fairness and explainability](https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-fairness-and-explainability.html) here.

In [23]:
clarify_processor.run_explainability(data_config=explainability_data_config,
                                     model_config=model_config,
                                     explainability_config=shap_config)


Job Name:  Clarify-Explainability-2021-12-08-06-53-36-869
Inputs:  [{'InputName': 'dataset', 'AppManaged': False, 'S3Input': {'S3Uri': 's3://qls-5142924-fc6beeb7d2b314f5-modelartifactbucket-1f8d8cpcntqjm/sagemaker/train.csv', 'LocalPath': '/opt/ml/processing/input/data', 'S3DataType': 'S3Prefix', 'S3InputMode': 'File', 'S3DataDistributionType': 'FullyReplicated', 'S3CompressionType': 'None'}}, {'InputName': 'analysis_config', 'AppManaged': False, 'S3Input': {'S3Uri': 's3://qls-5142924-fc6beeb7d2b314f5-modelartifactbucket-1f8d8cpcntqjm/sagemaker/clarify-explainability/analysis_config.json', 'LocalPath': '/opt/ml/processing/input/config', 'S3DataType': 'S3Prefix', 'S3InputMode': 'File', 'S3DataDistributionType': 'FullyReplicated', 'S3CompressionType': 'None'}}]
Outputs:  [{'OutputName': 'analysis_result', 'AppManaged': False, 'S3Output': {'S3Uri': 's3://qls-5142924-fc6beeb7d2b314f5-modelartifactbucket-1f8d8cpcntqjm/sagemaker/clarify-explainability', 'LocalPath': '/opt/ml/processing/outp

Now that the run is complete, let's review the explainability report.

#### Viewing the explainability report
As with the bias report, you can view the explainability report in SageMaker Studio under the **Experiments** tab.

1. In the left pane, choose **SageMaker components and registries**.

1. Select **Experiments and trials** from the drop-down menu.

1. Open (double-click) **Unassigned trial components**.

1. Open (double-click) the trial with a name that contains **clarify-explainability**, and then select **model explainability** to review the report.


<img src="./recordings/explainability_detail.gif">

#### **Question:** Based on the report, which feature has the highest importance in determining the churn prediction?

You can view the explainability report (formatted as PDF, HTML, or ipynb) in the following S3 location.

In [24]:
explainability_output_path

's3://qls-5142924-fc6beeb7d2b314f5-modelartifactbucket-1f8d8cpcntqjm/sagemaker/clarify-explainability'

### Clean up

Finally, clean up the resources you set up and used for this lab. The following cell deletes the production model.

In [25]:
sagemaker_session.delete_model(model_name)