# Exercises

This notebook contains the exercises for Feature Store, Model Monitor, and Clarify. The exercises were tested on a __2 vCPU + 4 GiB notebook instance with the Python 3 (TensorFlow 2.1 Python 3.6 CPU Optimized) kernel__.

## Staging

We'll begin by initializing some variables. Code samples in the AWS documentation often assume these are already defined.

In [1]:
import sagemaker
from sagemaker.session import Session
from sagemaker import get_execution_role

role = get_execution_role()
session = sagemaker.Session()
region = session.boto_region_name
bucket = session.default_bucket()
prefix = 'MLmonitoring'

sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /home/sagemaker-user/.config/sagemaker/config.yaml


## Feature Store
---

Feature Store is a purpose-built repository that gives ML systems a consistent flow of data across training and inference workloads. It can ingest data in batches (for training) and serve input features to models with low latency for real-time prediction.

For this exercise we'll work with a wine quality dataset: https://archive.ics.uci.edu/ml/datasets/wine+quality/

```
P. Cortez, A. Cerdeira, F. Almeida, T. Matos and J. Reis.
Modeling wine preferences by data mining from physicochemical properties. In Decision Support Systems, Elsevier, 47(4):547-553, 2009.
```

In [2]:
import pandas as pd
from sklearn import datasets
#import uuid

data = datasets.load_wine()
df = pd.DataFrame(data['data'])
df.columns = data['feature_names']

If we leave the column names as-is, Feature Store won't be able to handle the `/` in `od280/od315_of_diluted_wines`, because `/` is a delimiter Feature Store uses to manage how features are organized.

In [3]:
df.rename(columns = {'od280/od315_of_diluted_wines':'od280_od315_of_diluted_wines'}, inplace=True)
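
The rename above handles the one offending column by hand. A more general version is sketched below, assuming Feature Store accepts only letters, digits, hyphens, and underscores in feature names and requires the first character to be alphanumeric. `sanitize_feature_name` is a hypothetical helper, not part of the SageMaker SDK.

```python
import re

# Assumption: only letters, digits, hyphens and underscores are allowed,
# and a name has to start with a letter or digit.
_INVALID_CHARS = re.compile(r"[^A-Za-z0-9_-]")


def sanitize_feature_name(name: str) -> str:
    """Replace characters Feature Store is assumed to reject with underscores."""
    cleaned = _INVALID_CHARS.sub("_", name)
    # Strip leading separators so the name starts with an alphanumeric character.
    return cleaned.lstrip("_-")


columns = ["alcohol", "od280/od315_of_diluted_wines", "total phenols"]
print([sanitize_feature_name(column) for column in columns])
```

With a DataFrame, this could be applied as `df.columns = [sanitize_feature_name(c) for c in df.columns]`.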

### Create FeatureGroup
Once we have our data, we can create a feature group. Remember to attach event time and ID columns - Feature Store needs them.

In [6]:
from sagemaker.feature_store.feature_group import FeatureGroup
import time

# Add event time and ID columns
df["EventTime"] = time.time()
df["id"] = range(len(df))

# Instantiate a FeatureGroup
feature_group = FeatureGroup(name='my_group1',
                             sagemaker_session=session)

# Add feature definitions
feature_group.load_feature_definitions(data_frame=df)



[FeatureDefinition(feature_name='alcohol', feature_type=<FeatureTypeEnum.FRACTIONAL: 'Fractional'>, collection_type=None),
 FeatureDefinition(feature_name='malic_acid', feature_type=<FeatureTypeEnum.FRACTIONAL: 'Fractional'>, collection_type=None),
 FeatureDefinition(feature_name='ash', feature_type=<FeatureTypeEnum.FRACTIONAL: 'Fractional'>, collection_type=None),
 FeatureDefinition(feature_name='alcalinity_of_ash', feature_type=<FeatureTypeEnum.FRACTIONAL: 'Fractional'>, collection_type=None),
 FeatureDefinition(feature_name='magnesium', feature_type=<FeatureTypeEnum.FRACTIONAL: 'Fractional'>, collection_type=None),
 FeatureDefinition(feature_name='total_phenols', feature_type=<FeatureTypeEnum.FRACTIONAL: 'Fractional'>, collection_type=None),
 FeatureDefinition(feature_name='flavanoids', feature_type=<FeatureTypeEnum.FRACTIONAL: 'Fractional'>, collection_type=None),
 FeatureDefinition(feature_name='nonflavanoid_phenols', feature_type=<FeatureTypeEnum.FRACTIONAL: 'Fractional'>, collec
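
Every wine feature in the output above is typed `Fractional` because the underlying columns are floats. The sketch below illustrates the kind of dtype-to-feature-type lookup that `load_feature_definitions` is assumed to perform. The table is a hypothetical mirror written for illustration, and the SDK's own mapping may cover more dtypes.

```python
# Hypothetical mirror of the dtype -> feature type lookup; not the SDK's own table.
DTYPE_TO_FEATURE_TYPE = {
    "float64": "Fractional",
    "int64": "Integral",
    "object": "String",
    "string": "String",
}


def infer_feature_type(dtype_name: str) -> str:
    """Return the Feature Store type assumed for a pandas dtype name."""
    try:
        return DTYPE_TO_FEATURE_TYPE[dtype_name]
    except KeyError:
        raise ValueError(f"No feature type known for dtype {dtype_name!r}") from None


# dtypes the columns in this notebook are expected to have
column_dtypes = {"alcohol": "float64", "EventTime": "float64", "id": "int64"}
for column, dtype_name in column_dtypes.items():
    print(column, "->", infer_feature_type(dtype_name))
```

Under this assumption, `EventTime` (a float from `time.time()`) would be registered as `Fractional` and `id` as `Integral`.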

The feature group is not created until we call the `create` method. Let's do that now:

In [8]:
# Create the feature group:
feature_group.create(s3_uri=f's3://{bucket}/{prefix}',
                     record_identifier_name='id',
                     event_time_feature_name='EventTime',
                     role_arn=role)

{'FeatureGroupArn': 'arn:aws:sagemaker:us-east-1:533701990481:feature-group/my_group1',
 'ResponseMetadata': {'RequestId': '63967523-e800-4e16-8280-8e7f5ddcaf59',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': '63967523-e800-4e16-8280-8e7f5ddcaf59',
   'content-type': 'application/x-amz-json-1.1',
   'content-length': '86',
   'date': 'Wed, 10 Jul 2024 18:22:23 GMT'},
  'RetryAttempts': 0}}

### Check the status of the FeatureGroup

In [11]:
import boto3

sagemaker_client = boto3.client('sagemaker')

response = sagemaker_client.describe_feature_group(FeatureGroupName='my_group1')

print(response['FeatureGroupStatus'])


Created
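
Feature group creation is asynchronous, so a single status check can come back as `Creating`. Below is a minimal polling sketch that waits for `Created` before ingestion. `wait_for_feature_group` is a hypothetical helper, and the fake `describe` function stands in for `sagemaker_client.describe_feature_group` so the example runs without AWS access.

```python
import time


def wait_for_feature_group(describe, name, poll_seconds=5.0,
                           timeout_seconds=300.0, sleep=time.sleep):
    """Poll until the feature group reports 'Created'; raise on failure or timeout.

    `describe` is any callable accepting FeatureGroupName=... and returning a
    dict with a 'FeatureGroupStatus' key.
    """
    waited = 0.0
    while True:
        status = describe(FeatureGroupName=name)["FeatureGroupStatus"]
        if status == "Created":
            return status
        if status.endswith("Failed"):
            raise RuntimeError(f"Feature group {name} ended in status {status}")
        if waited >= timeout_seconds:
            raise TimeoutError(f"Feature group {name} still {status} after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds


# Stand-in for the real client call, for illustration only.
_fake_statuses = iter(["Creating", "Creating", "Created"])


def fake_describe(FeatureGroupName):
    return {"FeatureGroupStatus": next(_fake_statuses)}


final_status = wait_for_feature_group(fake_describe, "my_group1", sleep=lambda s: None)
print(final_status)
```

In the notebook, the call would look like `wait_for_feature_group(sagemaker_client.describe_feature_group, 'my_group1')`.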


Lastly, ingest some data into your feature group:

In [12]:
feature_group.ingest(data_frame=df,
                     max_workers=3,
                     wait=True)

IngestionManagerPandas(feature_group_name='my_group1', feature_definitions={'alcohol': {'FeatureName': 'alcohol', 'FeatureType': 'Fractional'}, 'malic_acid': {'FeatureName': 'malic_acid', 'FeatureType': 'Fractional'}, 'ash': {'FeatureName': 'ash', 'FeatureType': 'Fractional'}, 'alcalinity_of_ash': {'FeatureName': 'alcalinity_of_ash', 'FeatureType': 'Fractional'}, 'magnesium': {'FeatureName': 'magnesium', 'FeatureType': 'Fractional'}, 'total_phenols': {'FeatureName': 'total_phenols', 'FeatureType': 'Fractional'}, 'flavanoids': {'FeatureName': 'flavanoids', 'FeatureType': 'Fractional'}, 'nonflavanoid_phenols': {'FeatureName': 'nonflavanoid_phenols', 'FeatureType': 'Fractional'}, 'proanthocyanins': {'FeatureName': 'proanthocyanins', 'FeatureType': 'Fractional'}, 'color_intensity': {'FeatureName': 'color_intensity', 'FeatureType': 'Fractional'}, 'hue': {'FeatureName': 'hue', 'FeatureType': 'Fractional'}, 'od280_od315_of_diluted_wines': {'FeatureName': 'od280_od315_of_diluted_wines', 'Featu

Great job! You've demonstrated your understanding of creating feature groups and ingesting data into them using Feature Store. Next up we'll cover Model Monitor!

## Model Monitor

In this exercise we'll create a monitoring schedule for a deployed model. We provide code to help you deploy a model so that you can focus on Model Monitor. __Remember to clean up your model before you end a work session__; we provide code at the end to help with that. The exercise follows these steps:

1. Instantiate and train the model (an `Estimator`)
2. Define a `DataCaptureConfig`
3. Deploy the model (which returns a predictor) and pass it the `data_capture_config`
4. Instantiate the Model Monitor object
5. Suggest a baseline with `my_monitor.suggest_baseline()`
6. Create a monitoring schedule using a cron expression with `my_monitor.create_monitoring_schedule()`

We'll begin by reloading our data from the previous exercise.


In [13]:
# Reload data

data = datasets.load_wine()
df = pd.DataFrame(data['data'])
df.columns = data['feature_names']
# Rename column to avoid issues with Model Monitor '/'
df.rename(columns = {'od280/od315_of_diluted_wines':'od280_od315_of_diluted_wines'}, inplace=True)

We also need to put the target variable in the first column per the docs for our chosen algorithm: https://docs.aws.amazon.com/sagemaker/latest/dg/xgboost.html

In [14]:
df["TARGET"] = data['target']
df.set_index(df.pop('TARGET'), inplace=True)
df.reset_index(inplace=True)
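
To make the expected layout concrete, here is a small standard-library sketch of the file shape the training step relies on: no header row and the label in column 0. `to_label_first_csv` is a hypothetical helper, and the two sample rows are made up for illustration.

```python
import csv
import io


def to_label_first_csv(features, labels):
    """Serialise rows as header-less CSV with the label in the first column."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for label, row in zip(labels, features):
        writer.writerow([label, *row])
    return buffer.getvalue()


# Two made-up rows, for illustration only.
sample_features = [[14.23, 1.71], [13.2, 1.78]]
sample_labels = [0, 0]
print(to_label_first_csv(sample_features, sample_labels))
```

The `to_csv(..., header=False, index=False)` calls in the next cell produce the same shape from the reordered DataFrame.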

Now we'll upload the data to S3 as train and validation data:

In [16]:
delimiter = int(len(df)/2)
train, test = df.iloc[delimiter:], df.iloc[:delimiter]

# Save the train and validation datasets locally
train.to_csv("train.csv", header=False, index=False)
test.to_csv("validation.csv", header=False, index=False)

# Upload data to default bucket
train_location = session.upload_data('./train.csv', key_prefix=f'{prefix}/data')
val_location = session.upload_data('./validation.csv', key_prefix=f"{prefix}/data")

# Create SageMaker training inputs
s3_input_train = sagemaker.inputs.TrainingInput(s3_data=train_location, content_type='csv')
s3_input_validation = sagemaker.inputs.TrainingInput(s3_data=val_location, content_type='csv')

print(train_location)
print(val_location)

s3://sagemaker-us-east-1-533701990481/MLmonitoring/data/train.csv
s3://sagemaker-us-east-1-533701990481/MLmonitoring/data/validation.csv
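
The split above cuts the table at the midpoint without shuffling. If the rows are ordered by class, as they appear to be in this dataset, the two halves end up with different class mixes. A shuffled split is sketched below using only the standard library. `shuffled_split` is a hypothetical helper, and the row count of 178 is inferred from the 89-row training file reported in the logs.

```python
import random


def shuffled_split(n_rows, train_fraction=0.5, seed=0):
    """Return (train_indices, validation_indices) after a seeded shuffle."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    cut = int(n_rows * train_fraction)
    return indices[:cut], indices[cut:]


train_idx, val_idx = shuffled_split(178)
print(len(train_idx), len(val_idx))
```

The index lists could then be used as `train, test = df.iloc[train_idx], df.iloc[val_idx]`. The outputs recorded in this notebook come from the unshuffled split, so results would differ.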


### Instantiate the model (an Estimator object)

In [17]:
# Retrieve the container image URI for the built-in XGBoost algorithm
algo_image = sagemaker.image_uris.retrieve("xgboost", region, version='latest')
s3_model_output = f"s3://{bucket}/{prefix}/models/wine_model"

# Create the Estimator
model=sagemaker.estimator.Estimator(
    image_uri=algo_image,
    role=role,
    instance_count=1,
    instance_type='ml.m4.xlarge',
    volume_size=5,
    output_path=s3_model_output,
    sagemaker_session=session
)

model.set_hyperparameters(max_depth=5,
                        eta=0.2,
                        gamma=4,
                        min_child_weight=6,
                        subsample=0.8,
                        objective='reg:linear',
                        early_stopping_rounds=10,
                        num_round=200)

# Train the model
model.fit({'train': s3_input_train, 'validation': s3_input_validation})

INFO:sagemaker:Creating training-job with name: xgboost-2024-07-10-18-34-13-913


2024-07-10 18:34:14 Starting - Starting the training job...
2024-07-10 18:34:32 Starting - Preparing the instances for training...
2024-07-10 18:35:00 Downloading - Downloading input data...
2024-07-10 18:35:30 Downloading - Downloading the training image......
2024-07-10 18:36:26 Training - Training image download completed. Training in progress.
Arguments: train
[2024-07-10:18:36:37:INFO] Running standalone xgboost training.
[2024-07-10:18:36:37:INFO] File size need to be processed in the node: 0.01mb. Available memory size in the node: 8456.26mb
[2024-07-10:18:36:37:INFO] Determined delimiter of CSV input is ','
[18:36:37] S3DistributionType set as FullyReplicated
[18:36:37] 89x13 matrix with 1157 entries loaded from /opt/ml/input/data/train?format=csv&label_column=0&delimiter=,
[2024-07-10:18:36:37:INFO] Determined delimiter of CSV input is ','
[18:36:37] S3DistributionType set as FullyReplicated
[18:36:37]

Now that your training job has finished, you can perform the first task in this exercise: creating a data capture config. Configure your model to sample `34%` of inferences:

In [18]:
from sagemaker.model_monitor import DataCaptureConfig

capture_uri = f's3://{bucket}/{prefix}/data-capture'

data_capture_config = DataCaptureConfig(enable_capture=True,
                                        sampling_percentage=34,
                                        destination_s3_uri=capture_uri)
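
Once an endpoint is deployed with this config, sampled requests and responses are written under `capture_uri` as JSON Lines files. The sketch below parses one such line. The record layout (`captureData`, `endpointInput`, `endpointOutput`, `eventMetadata`) is an assumption about the capture format, and the sample line is made up rather than taken from a real capture.

```python
import json


def parse_capture_line(line):
    """Extract the request payload, response payload and timestamp from one record."""
    record = json.loads(line)
    capture = record["captureData"]
    return {
        "input": capture["endpointInput"]["data"],
        "output": capture["endpointOutput"]["data"],
        "time": record["eventMetadata"]["inferenceTime"],
    }


# Made-up record following the assumed layout, for illustration only.
sample_line = json.dumps({
    "captureData": {
        "endpointInput": {"observedContentType": "text/csv", "mode": "INPUT",
                          "data": "13.2,1.78", "encoding": "CSV"},
        "endpointOutput": {"observedContentType": "text/csv", "mode": "OUTPUT",
                           "data": "0.603", "encoding": "CSV"},
    },
    "eventMetadata": {"eventId": "example-id", "inferenceTime": "2024-07-10T18:50:00Z"},
    "eventVersion": "0",
})

parsed = parse_capture_line(sample_line)
print(parsed)
```

Reading real capture files this way is a quick check that the sampling configured above is producing data.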

Great! We'll use your config to deploy a model below:

In [19]:
xgb_predictor = model.deploy(initial_instance_count=1, 
                             instance_type='ml.m4.xlarge',
                             data_capture_config=data_capture_config)

INFO:sagemaker:Creating model with name: xgboost-2024-07-10-18-43-18-752
INFO:sagemaker:Creating endpoint-config with name xgboost-2024-07-10-18-43-18-752
INFO:sagemaker:Creating endpoint with name xgboost-2024-07-10-18-43-18-752


--------!

Great! You should see an indicator like this when the deployment finishes:

```
-----------------!
```
We can test your deployment like so:

In [20]:
xgb_predictor.serializer = sagemaker.serializers.CSVSerializer()
test_sample = test.copy()
# Drop the target variable
test_sample = test_sample.drop(columns=test_sample.columns[0])
y_pred = xgb_predictor.predict(test_sample.sample(5).values).decode('utf-8')
y_pred

'0.6030303239822388,0.6030303239822388,0.6030303239822388,0.6030303239822388,0.7861111164093018'
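
The endpoint returns one comma-separated string, so the predictions need to be parsed before use. The sketch below turns the response shown above into floats. Rounding the scores to class indices is an assumption made for illustration, since the model was trained with a regression objective. `parse_predictions` is a hypothetical helper.

```python
def parse_predictions(payload):
    """Convert a comma-separated response (str or bytes) into a list of floats."""
    text = payload.decode("utf-8") if isinstance(payload, bytes) else payload
    return [float(value) for value in text.strip().split(",") if value]


y_pred = ("0.6030303239822388,0.6030303239822388,0.6030303239822388,"
          "0.6030303239822388,0.7861111164093018")
scores = parse_predictions(y_pred)
# Assumption: the nearest integer is taken as the predicted class index.
labels = [round(score) for score in scores]
print(scores)
print(labels)
```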

All systems go! To finish up the exercise, we're going to provide you with a DefaultModelMonitor and a suggested baseline. Combine the `xgb_predictor` and the provided `my_monitor` to configure the monitoring schedule for _hourly_ monitoring.

In [21]:
from sagemaker.model_monitor import DefaultModelMonitor
from sagemaker.model_monitor.dataset_format import DatasetFormat

my_monitor = DefaultModelMonitor(
    role=role,
    instance_count=1,
    instance_type='ml.m5.xlarge',
    volume_size_in_gb=20,
    max_runtime_in_seconds=3600,
)

INFO:sagemaker.image_uris:Defaulting to the only supported framework/algorithm version: .
INFO:sagemaker.image_uris:Ignoring unnecessary instance type: None.


In [22]:
my_monitor.suggest_baseline(
    baseline_dataset=train_location,
    dataset_format=DatasetFormat.csv(header=False))

INFO:sagemaker:Creating processing-job with name baseline-suggestion-job-2024-07-10-18-51-16-526


.............
2024-07-10 18:53:21.473159: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2024-07-10 18:53:21.473190: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2024-07-10 18:53:23.116250: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2024-07-10 18:53:23.116283: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
2024-07-10 18:53:23.116307: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (ip-10-2-82-146.ec2.internal): /proc/driver/nvidia/version does not e

<sagemaker.processing.ProcessingJob at 0x7f1027d604c0>

Below, provide the monitoring schedule:

In [23]:
from sagemaker.model_monitor import CronExpressionGenerator

my_monitor.create_monitoring_schedule(monitor_schedule_name='schedule01',
                                      endpoint_input=xgb_predictor.endpoint_name,
                                      statistics=my_monitor.baseline_statistics(),
                                      constraints=my_monitor.suggested_constraints(),
                                      schedule_cron_expression=CronExpressionGenerator.hourly())

INFO:sagemaker.model_monitor.model_monitoring:Creating Monitoring Schedule with name: schedule01
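
The schedule is ultimately a cron string. The sketch below shows the kind of expression `CronExpressionGenerator` is assumed to produce, with fields for minutes, hours, day of month, month, day of week, and year. `hourly_cron` and `daily_cron` are hypothetical stand-ins written for illustration, and the exact strings the SDK emits may differ.

```python
def hourly_cron():
    """Run at minute 0 of every hour (assumed six-field cron format)."""
    return "cron(0 * ? * * *)"


def daily_cron(hour=0):
    """Run once a day at minute 0 of the given UTC hour."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    return f"cron(0 {hour} ? * * *)"


print(hourly_cron())
print(daily_cron(hour=6))
```

Seeing the raw string can help when reading a schedule's configuration in the console.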


Great job! You can check that your schedule was created by selecting the `SageMaker components and registries` tab on the far left.

In this exercise you configured Model Monitor to watch a simple model. Next, we'll monitor the same deployment for explainability.

__REMINDER:__ Don't leave your model deployed overnight. If you aren't going to follow up with the Clarify exercise within a few hours, use the code below to delete the monitoring schedule and the endpoint:

In [25]:
monitors = xgb_predictor.list_monitors()
for monitor in monitors:
    monitor.delete_monitoring_schedule()

INFO:sagemaker.image_uris:Defaulting to the only supported framework/algorithm version: .
INFO:sagemaker.image_uris:Ignoring unnecessary instance type: None.
INFO:sagemaker:Deleting Monitoring Schedule with name: schedule01
INFO:sagemaker.model_monitor.model_monitoring:Deleting Data Quality Job Definition with name: data-quality-job-definition-2024-07-10-18-58-00-299


In [None]:
xgb_predictor.delete_endpoint()
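
The two cleanup cells can be wrapped into one function so the order (schedules first, then the endpoint) is never mixed up. This is a minimal sketch. `clean_up` is a hypothetical helper, and the stub classes below stand in for a real predictor so the example runs without AWS access.

```python
def clean_up(predictor):
    """Delete every monitoring schedule on the endpoint, then the endpoint itself."""
    removed = 0
    for monitor in predictor.list_monitors():
        monitor.delete_monitoring_schedule()
        removed += 1
    predictor.delete_endpoint()
    return removed


# Stubs standing in for the SDK objects, for illustration only.
class _StubMonitor:
    def __init__(self, log):
        self.log = log

    def delete_monitoring_schedule(self):
        self.log.append("schedule")


class _StubPredictor:
    def __init__(self):
        self.log = []

    def list_monitors(self):
        return [_StubMonitor(self.log)]

    def delete_endpoint(self):
        self.log.append("endpoint")


stub = _StubPredictor()
removed = clean_up(stub)
print(removed, stub.log)
```

In the notebook, the call would be `clean_up(xgb_predictor)`.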

## Clarify

For the last exercise we'll deploy an explainability monitor using Clarify. We're going to use the model that you deployed in the last exercise, but if you cleaned up your deployments from the previous exercise, that's ok! You can rerun the deployment from the previous exercise up to the point where we deployed our model. It'll look like this:

```python
xgb_predictor = model.deploy(
    initial_instance_count=1, instance_type='ml.m4.xlarge',
    data_capture_config=data_capture_config
)
```

Once your model is deployed, you can come back here. _REMINDER_: you need to clean up your deployment, so don't leave it running overnight. We'll provide some code at the end to delete your deployment.

### Prep

We'll begin by reloading our data from the previous exercise.

In [88]:
data = datasets.load_wine()
df = pd.DataFrame(data['data'])
df.columns = data['feature_names']
df.rename(columns = {'od280/od315_of_diluted_wines':'od280_od315_of_diluted_wines'}, inplace=True)

We also need to put the target variable in the first column per the docs for our chosen algorithm: https://docs.aws.amazon.com/sagemaker/latest/dg/xgboost.html

In [89]:
df["TARGET"] = data['target']
df.set_index(df.pop('TARGET'), inplace=True)
df.reset_index(inplace=True)

Now we'll upload the data to S3 as train and validation data:

In [90]:
delimiter = int(len(df)/2)
train, test = df.iloc[delimiter:], df.iloc[:delimiter]

# Save datasets locally
train.to_csv("train.csv", header=False, index=False)
test.to_csv("validation.csv", header=False, index=False)

# Upload datasets to S3
train_location = session.upload_data('./train.csv', key_prefix=f"{prefix}/data")
val_location = session.upload_data('./validation.csv', key_prefix=f"{prefix}/data")

# Create SageMaker inputs
s3_input_train = sagemaker.inputs.TrainingInput(s3_data=train_location, content_type='csv')
s3_input_validation = sagemaker.inputs.TrainingInput(s3_data=val_location, content_type='csv')

Great! Our data is staged and our model is deployed, so let's monitor it for explainability. We need three config objects: a `SHAPConfig`, a `ModelConfig`, and an `ExplainabilityAnalysisConfig` that wraps the first two and is passed to the scheduling method. All three are provided below.

In [26]:
# Instantiate the SHAP analysis config
shap_config = sagemaker.clarify.SHAPConfig(
    baseline=[train.mean().astype(int).to_list()[1:]],
    num_samples=int(train.size),
    agg_method="mean_abs",
    save_local_shap_values=False)

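# Instantiate the model config
# NOTE: model_name must match the model behind your endpoint. The name below
# comes from an earlier run; replace it with the one shown in your deploy logs.
model_config = sagemaker.clarify.ModelConfig(
    model_name="xgboost-2024-08-25-15-19-33-499",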
    instance_count=1,
    instance_type='ml.m4.xlarge',
    content_type="text/csv",
    accept_type="text/csv")

# Pass the SHAP config and the model config to the explainability analysis config
analysis_config = sagemaker.model_monitor.ExplainabilityAnalysisConfig(
        explainability_config=shap_config,
        model_config=model_config,
        headers=train.columns.to_list()[1:])
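
The `model_name` passed to `ModelConfig` has to match the model behind the endpoint, and copying it from logs is error-prone. Below is a sketch of looking it up from the endpoint name through the `describe_endpoint` and `describe_endpoint_config` calls of the SageMaker client. `model_name_for_endpoint` is a hypothetical helper, it assumes a single production variant, and the stub client stands in for `boto3.client('sagemaker')` so the example runs offline.

```python
def model_name_for_endpoint(client, endpoint_name):
    """Return the model name of the endpoint's first production variant."""
    endpoint = client.describe_endpoint(EndpointName=endpoint_name)
    config = client.describe_endpoint_config(
        EndpointConfigName=endpoint["EndpointConfigName"])
    # Assumption: the endpoint serves a single variant.
    return config["ProductionVariants"][0]["ModelName"]


# Stub client returning made-up names, for illustration only.
class _StubClient:
    def describe_endpoint(self, EndpointName):
        return {"EndpointConfigName": f"{EndpointName}-config"}

    def describe_endpoint_config(self, EndpointConfigName):
        return {"ProductionVariants": [{"ModelName": "example-model"}]}


looked_up = model_name_for_endpoint(_StubClient(), "example-endpoint")
print(looked_up)
```

In the notebook, the call would be `model_name_for_endpoint(boto3.client('sagemaker'), xgb_predictor.endpoint_name)`.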

Before we apply our config, we need to create the monitor object. This is what we'll apply all our config to.

In [27]:
# Instance the explainability monitor
model_explainability_monitor = sagemaker.model_monitor.ModelExplainabilityMonitor(
    role=role,
    sagemaker_session=session,
    max_runtime_in_seconds=1800)

INFO:sagemaker.image_uris:Defaulting to the only supported framework/algorithm version: 1.0.
INFO:sagemaker.image_uris:Ignoring unnecessary instance type: None.


Everything's ready! Below, create a monitoring schedule using the configs we created. Set the schedule to run _daily_.

In [28]:
from sagemaker.model_monitor import CronExpressionGenerator

# Path to save explainability results
explainability_uri = f's3://{bucket}/{prefix}/explainability'

# Create the monitoring schedule from the explainability monitor
model_explainability_monitor.create_monitoring_schedule(
    output_s3_uri=explainability_uri,
    analysis_config=analysis_config, #ExplainabilityAnalysisConfig object
    endpoint_input=xgb_predictor.endpoint_name,
    schedule_cron_expression=CronExpressionGenerator.daily())

INFO:sagemaker.model_monitor.clarify_model_monitoring:Uploading analysis config to {s3_uri}.
INFO:sagemaker.model_monitor.model_monitoring:Creating Monitoring Schedule with name: monitoring-schedule-2024-07-10-19-12-51-203


Way to go! You can check that your schedule was created by selecting the `SageMaker components and registries` tab on the far left.

In this exercise you deployed a monitor for explainability to your SageMaker endpoint. This is the last exercise; you'll apply what you've learned again in the project at the end of the course.



__REMINDER:__ Don't leave your model deployed overnight. Use the code below to delete the monitoring schedule and the endpoint:

In [29]:
monitors = xgb_predictor.list_monitors()
for monitor in monitors:
    monitor.delete_monitoring_schedule()

INFO:sagemaker.image_uris:Defaulting to the only supported framework/algorithm version: 1.0.
INFO:sagemaker.image_uris:Ignoring unnecessary instance type: None.
INFO:sagemaker:Deleting Monitoring Schedule with name: monitoring-schedule-2024-07-10-19-12-51-203
INFO:sagemaker.model_monitor.clarify_model_monitoring:Deleting Model Explainability Job Definition with name: model-explainability-job-definition-2024-07-10-19-12-51-203


In [30]:
xgb_predictor.delete_endpoint()

INFO:sagemaker:Deleting endpoint configuration with name: xgboost-2024-07-10-18-43-18-752
INFO:sagemaker:Deleting endpoint with name: xgboost-2024-07-10-18-43-18-752
