# Time Series Explainability DeepAR notebook

---

This notebook's CI test result for us-west-2 is as follows. CI test results in other regions can be found at the end of the notebook. 

![This us-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/us-west-2/sagemaker-clarify|time_series_deepar|time_series_deepar.ipynb)

---

## Runtime

This notebook takes approximately 30 minutes to run.

### Contents

1. [Summary](#Summary)

2. [Prerequisites](#Prerequisites)    

    a. [Install Mercury](#Install-Mercury)

    b. [Install Development SDK](#Install-Development-SDK)
    
    c. [Loading the data: Everlane Dataset](#Loading-the-data:-Everlane-Dataset)
    
    d. [Train TimeSeries Model](#Train-TimeSeries-Model)
    
    e. [Create an Endpoint from Training Job](#Create-an-Endpoint-from-Training-Job)

3. [Time Series Explainability](#Time-Series-Explainability)

    a. [Create `TimeSeriesDataConfig`](#Create-TimeSeriesDataConfig)

    b. [Create `TimeSeriesModelConfig`](#Create-TimeSeriesModelConfig)

    c. [Create `AsymmetricShapleyValueConfig`](#Create-AsymmetricShapleyValueConfig)

    d. [Create `DataConfig`](#Create-DataConfig)

    e. [Create `ModelConfig`](#Create-ModelConfig)

    f. [Setup Processor](#Setup-Processor)

    g. [Run Explainability Call](#Run-Explainability-Call)

4. [Analysis Config](#Analysis-Config)

    a. [Retrieve Config From s3](#Retrieve-Config-From-s3)

    b. [Display Config](#Display-Config)

5. [Explainability Results](#Explainability-Results)

    a. [Retrieve Results From s3](#Retrieve-Results-From-s3)

    b. [Display Results](#Display-Results)

6. [Clean Up](#Clean-Up)

## Summary

This notebook demonstrates how to invoke a SageMaker Clarify explainability job for a time series DeepAR forecasting model. Given a real model and a training dataset, a processing job will be created to analyze the SHAP score for each feature attribute.

## Prerequisites

### Install Mercury

If not already installed, the following cell will install the `mercury` package in order to display the `analysis_config.json` and explainability job output within the notebook.

In [1]:
!pip install mercury -q

[0m[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
awscli 1.31.7 requires botocore==1.33.7, but you have botocore 1.29.165 which is incompatible.
awscli 1.31.7 requires s3transfer<0.9.0,>=0.8.0, but you have s3transfer 0.6.2 which is incompatible.
sagemaker 2.214.3 requires boto3<2.0,>=1.33.3, but you have boto3 1.26.83 which is incompatible.[0m[31m
[0m

### Install Development SDK

As of 13-12-23, time series explainability (TSX) is in closed beta. The changes to the SDK to support it are not merged into the `sagemaker-python-sdk` repository. Instead, we install a version of the PySDK that does support these changes directly via a wheel file. 

In [2]:
!pip uninstall sagemaker -y -q
!pip install sagemaker
!pip show sagemaker

Collecting sagemaker
  Using cached sagemaker-2.214.3-py3-none-any.whl.metadata (14 kB)
Collecting boto3<2.0,>=1.33.3 (from sagemaker)
  Using cached boto3-1.34.80-py3-none-any.whl.metadata (6.6 kB)
Collecting botocore<1.35.0,>=1.34.80 (from boto3<2.0,>=1.33.3->sagemaker)
  Using cached botocore-1.34.80-py3-none-any.whl.metadata (5.7 kB)
Collecting s3transfer<0.11.0,>=0.10.0 (from boto3<2.0,>=1.33.3->sagemaker)
  Using cached s3transfer-0.10.1-py3-none-any.whl.metadata (1.7 kB)
Using cached sagemaker-2.214.3-py3-none-any.whl (1.4 MB)
Using cached boto3-1.34.80-py3-none-any.whl (139 kB)
Using cached botocore-1.34.80-py3-none-any.whl (12.1 MB)
Using cached s3transfer-0.10.1-py3-none-any.whl (82 kB)
Installing collected packages: botocore, s3transfer, boto3, sagemaker
  Attempting uninstall: botocore
    Found existing installation: botocore 1.29.165
    Uninstalling botocore-1.29.165:
      Successfully uninstalled botocore-1.29.165
  Attempting uninstall: s3transfer
    Found ex

### Import Libraries

The model used in this example notebook is the DeepAR forecasting model in AWS. For more information, see https://docs.aws.amazon.com/sagemaker/latest/dg/deepar.html

A separate notebook will show how to bring your own time series model to TSX.

In [3]:
import yaml
import pandas as pd
import numpy as np
import sagemaker
import boto3
import json
import mercury
import pprint

from sagemaker import get_execution_role

session = boto3.Session()
s3_client = session.client("s3")
sagemaker_session = sagemaker.Session()
sm_client = boto3.client("sagemaker")
region = session.region_name
role = get_execution_role()
bucket = sagemaker.Session().default_bucket()
image_uri = sagemaker.image_uris.retrieve("forecasting-deepar", region)
training_job_name = "DeepArTest"

sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /home/zicanl/.config/sagemaker/config.yaml


### Loading the data: Everlane Dataset

The example dataset uses the following column names for the time series model attributes. The same column names are used in `time_series_mock_data.json`.

- `item_id`: Unique identifier of the object to be analyzed.
- `target_value`: The univariate time series for which the residuals will be computed, also called `series` in some open source models.
- `dynamic_feature_x`: Past-observed or future-known covariate time series, also called `past_covariates` or `future_covariates`.
- `static_feature`: Characteristics of a time series that do not change over time, also called `static_covariates`.

In [4]:
data_tts = pd.read_json("training_dataset_lines.json", orient="records", lines=True)
data_tts.head()

| Unnamed: 0 | item_id | timestamp | target_value | dynamic_feature_1 | dynamic_feature_2 | dynamic_feature_3 | static_feature_1 | static_feature_2 |
|---|---|---|---|---|---|---|---|---|
| 0 | mosfets | 2019-09-11 | 47650.3 | 0.4576 | 0.2164 | 0.1906 | 1 | 1 |
| 1 | mosfets | 2019-09-12 | 47380.3 | 0.4839 | 0.2274 | 0.1889 | 1 | 1 |
| 2 | mosfets | 2019-09-13 | 50905.9 | 0.5047 | 0.2391 | 0.1877 | 1 | 1 |
| 3 | mosfets | 2019-09-14 | 52401.3 | 0.5189 | 0.2516 | 0.1871 | 1 | 1 |
| 4 | mosfets | 2019-09-15 | 68734.1 | 0.5258 | 0.2647 | 0.187 | 1 | 1 |


Group the data into the format accepted by the DeepAR model.

In [5]:
grouped_data = data_tts.groupby("item_id")
prediction_length = 14
dynamic_feat_list = ["dynamic_feature_1", "dynamic_feature_2", "dynamic_feature_3"]
deepar_training = []
deepar_test = []

for item_id, group in grouped_data:
    deepar_training.append(
        {
            "start": str(group["timestamp"].min()),
            "target": group["target_value"].fillna("NaN").tolist()[:-prediction_length],
            "dynamic_feat": [
                group[feature_name].tolist()[:-prediction_length]
                for feature_name in dynamic_feat_list
            ],
            "cat": [int(group.iloc[0]["static_feature_1"]), int(group.iloc[0]["static_feature_2"])],
        }
    )

    deepar_test.append(
        {
            "start": str(group["timestamp"].min()),
            "target": group["target_value"].fillna("NaN").tolist()[:-prediction_length],
            "dynamic_feat": [
                group[feature_name].tolist()[:-prediction_length]
                for feature_name in dynamic_feat_list
            ],
            "cat": [int(group.iloc[0]["static_feature_1"]), int(group.iloc[0]["static_feature_2"])],
        }
    )
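Before writing the records to disk, it can help to check that each one is internally consistent. The following is a minimal sketch, not part of the original notebook: `validate_deepar_record` is a hypothetical helper that checks the structural constraints the cell above relies on, namely that each `dynamic_feat` series has the same length as `target` and that `cat` holds integers.

```python
def validate_deepar_record(record):
    """Return a list of problems found in one DeepAR training record.

    Hypothetical helper for illustration; an empty list means no problem was found.
    """
    problems = []
    target = record.get("target", [])

    if "start" not in record:
        problems.append("missing 'start'")
    if not target:
        problems.append("empty or missing 'target'")

    # For training data, each dynamic feature must align with the target step for step.
    for i, feature in enumerate(record.get("dynamic_feat", [])):
        if len(feature) != len(target):
            problems.append(
                f"dynamic_feat[{i}] has {len(feature)} values, target has {len(target)}"
            )

    # Categorical features are passed as integer codes.
    for i, value in enumerate(record.get("cat", [])):
        if not isinstance(value, int) or isinstance(value, bool):
            problems.append(f"cat[{i}] is not an integer: {value!r}")

    return problems


example_record = {
    "start": "2019-09-11 00:00:00",
    "target": [47650.3, 47380.3, "NaN"],
    "dynamic_feat": [[0.4576, 0.4839, 0.5047]],
    "cat": [1, 1],
}
print(validate_deepar_record(example_record))
```

In the notebook, the same check could be applied to every entry of `deepar_training` and `deepar_test` before uploading.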

Write the training and test files locally, then upload them to S3.

In [6]:
def write_dicts_to_json(path, data):
    with open(path, "wb") as file_path:
        for ts in data:
            file_path.write(json.dumps(ts).encode("utf-8"))
            file_path.write("\n".encode("utf-8"))


deepar_training_path = "train.json"
deepar_test_path = "test.json"
write_dicts_to_json(deepar_training_path, deepar_training)
write_dicts_to_json(deepar_test_path, deepar_test)

# upload the training and test file to S3
deepar_s3_training_path = "sagemaker/endpoint_test/train.json"
deepar_s3_test_path = "sagemaker/endpoint_test/test.json"

s3_client.upload_file(deepar_training_path, bucket, deepar_s3_training_path)
s3_client.upload_file(deepar_test_path, bucket, deepar_s3_test_path)

### Train TimeSeries Model

The example time series model is the built-in [DeepAR forecasting model](https://docs.aws.amazon.com/sagemaker/latest/dg/algorithms-time-series.html) in SageMaker.

In [7]:
estimator = sagemaker.estimator.Estimator(
    sagemaker_session=sagemaker_session,
    image_uri=image_uri,
    role=role,
    instance_count=1,
    instance_type="ml.c5.2xlarge",
    base_job_name=training_job_name,
    use_spot_instances=False,
    output_path=f"s3://{bucket}/deepar/endpoint_test/model_output",  # specify a model output path in S3
)

hyperparameters = {
    "epochs": "10",
    "time_freq": "H",  # granularity of the time series; "H" means hourly
    "prediction_length": prediction_length,
    "context_length": prediction_length,
}

estimator.set_hyperparameters(**hyperparameters)

estimator.fit(
    inputs={
        "train": "s3://{}/{}".format(bucket, deepar_s3_training_path),
        "test": "s3://{}/{}".format(bucket, deepar_s3_test_path),
    }
)

INFO:sagemaker:Creating training-job with name: DeepArTest-2024-04-09-17-56-24-856


2024-04-09 17:56:24 Starting - Starting the training job...
2024-04-09 17:56:41 Starting - Preparing the instances for training...
2024-04-09 17:57:14 Downloading - Downloading input data...
2024-04-09 17:57:29 Downloading - Downloading the training image...............
2024-04-09 18:00:24 Training - Training image download completed. Training in progress....Docker entrypoint called with argument(s): train
Running default environment configuration script
Running custom environment configuration script
  if num_device is 1 and 'dist' not in kvstore:
[04/09/2024 18:00:44 INFO 139861130520384] Reading default configuration from /opt/amazon/lib/python3.8/site-packages/algorithm/resources/default-input.json: {'_kvstore': 'auto', '_num_gpus': 'auto', '_num_kv_servers': 'auto', '_tuning_objective_metric': '', 'cardinality': 'auto', 'dropout_rate': '0.10', 'early_stopping_patience': '', 'embedding_dimension': '10', 'learning_rate': '0.001', 'likelihood': 'student-t', 'mini_batch_size': '128', 

[04/09/2024 18:00:46 INFO 139861130520384] processed a total of 1313 examples
#metrics {"StartTime": 1712685645.7165518, "EndTime": 1712685646.928788, "Dimensions": {"Algorithm": "AWS/DeepAR", "Host": "algo-1", "Operation": "training"}, "Metrics": {"update.time": {"sum": 1211.9834423065186, "count": 1, "min": 1211.9834423065186, "max": 1211.9834423065186}}}
[04/09/2024 18:00:46 INFO 139861130520384] #throughput_metric: host=algo-1, train throughput=1083.283137663433 records/second
[04/09/2024 18:00:46 INFO 139861130520384] #progress_metric: host=algo-1, completed 20.0 % of epochs
[04/09/2024 18:00:46 INFO 139861130520384] #quality_metric: host=algo-1, epoch=1, train loss <loss>=11.426260167902166
[04/09/2024 18:00:47 INFO 139861130520384] Epoch[2] Batch[0] avg_epoch_loss=11.216604
[04/09/2024 18:00:47 INFO 139861130520384] #quality_metric: host=algo-1, epoch=2, batch=0 train loss <loss>=11.216604232788086
[04/09/2024 18:00:47 INFO 139861130520384] Epoch[2] Batch[5] avg_epoch_loss=11.12

[04/09/2024 18:00:53 INFO 139861130520384] #quality_metric: host=algo-1, epoch=7, batch=5 train loss <loss>=10.710832436879477
[04/09/2024 18:00:53 INFO 139861130520384] Epoch[7] Batch [5]#011Speed: 1553.05 samples/sec#011loss=10.710832
[04/09/2024 18:00:54 INFO 139861130520384] processed a total of 1238 examples
#metrics {"StartTime": 1712685652.9992828, "EndTime": 1712685654.128487, "Dimensions": {"Algorithm": "AWS/DeepAR", "Host": "algo-1", "Operation": "training"}, "Metrics": {"update.time": {"sum": 1128.94868850708, "count": 1, "min": 1128.94868850708, "max": 1128.94868850708}}}
[04/09/2024 18:00:54 INFO 139861130520384] #throughput_metric: host=algo-1, train throughput=1096.5100057586037 records/second
[04/09/2024 18:00:54 INFO 139861130520384] #progress_metric: host=algo-1, completed 80.0 % of epochs
[04/09/2024 18:00:54 INFO 139861130520384] #quality_metric: host=algo-1, epoch=7, train loss <loss>=10.693192481994629
[04/09/2024 18:00:54 INFO 139861130520384] Epoch[8] Batch[0] a

2024-04-09 18:01:16 Completed - Training job completed
Training seconds: 242
Billable seconds: 242


### Create an Endpoint from Training Job

In [8]:
job_name = estimator.latest_training_job.name

# Deploy the endpoint from a training job
endpoint_name = sagemaker_session.endpoint_from_job(
    job_name=job_name,
    initial_instance_count=1,
    instance_type="ml.c5.large",
    image_uri=image_uri,
    role=role,
)
print(endpoint_name)

INFO:sagemaker:Creating model with name: DeepArTest-2024-04-09-17-56-24-856
INFO:sagemaker:Creating endpoint-config with name DeepArTest-2024-04-09-17-56-24-856
INFO:sagemaker:Creating endpoint with name DeepArTest-2024-04-09-17-56-24-856


---------------!DeepArTest-2024-04-09-17-56-24-856


In [9]:
boto3.client("sagemaker").describe_endpoint(EndpointName=endpoint_name)

{'EndpointName': 'DeepArTest-2024-04-09-17-56-24-856',
 'EndpointArn': 'arn:aws:sagemaker:us-west-2:678264136642:endpoint/DeepArTest-2024-04-09-17-56-24-856',
 'EndpointConfigName': 'DeepArTest-2024-04-09-17-56-24-856',
 'ProductionVariants': [{'VariantName': 'AllTraffic',
   'DeployedImages': [{'SpecifiedImage': '156387875391.dkr.ecr.us-west-2.amazonaws.com/forecasting-deepar:1',
     'ResolvedImage': '156387875391.dkr.ecr.us-west-2.amazonaws.com/forecasting-deepar@sha256:9c75ecae00a74a32c9a2471b2a171bceabf8f09adb1188068a66e3541f0cdefe',
     'ResolutionTime': datetime.datetime(2024, 4, 9, 18, 1, 40, 59000, tzinfo=tzlocal())}],
   'CurrentWeight': 1.0,
   'DesiredWeight': 1.0,
   'CurrentInstanceCount': 1,
   'DesiredInstanceCount': 1}],
 'EndpointStatus': 'InService',
 'CreationTime': datetime.datetime(2024, 4, 9, 18, 1, 39, 188000, tzinfo=tzlocal()),
 'LastModifiedTime': datetime.datetime(2024, 4, 9, 18, 9, 23, 704000, tzinfo=tzlocal()),
 'ResponseMetadata': {'RequestId': 'f119680c-

### Verify the endpoint

In [10]:
request = {
    "target": [33.1595, 30.1788, 28.5022, 27.5708, 27.7571],
    "start": "2014-05-30 01:00:00",
    "dynamic_feat": [
        [
            1.0,
            2.0,
            3.0,
            4.0,
            5.0,
            1.0,
            2.0,
            3.0,
            4.0,
            5.0,
            1.0,
            2.0,
            3.0,
            4.0,
            1.0,
            2.0,
            3.0,
            4.0,
            5.0,
        ],
        [
            1.0,
            2.0,
            3.0,
            4.0,
            5.0,
            1.0,
            2.0,
            3.0,
            4.0,
            5.0,
            1.0,
            2.0,
            3.0,
            4.0,
            1.0,
            2.0,
            3.0,
            4.0,
            5.0,
        ],
        [
            1.0,
            2.0,
            3.0,
            4.0,
            5.0,
            1.0,
            2.0,
            3.0,
            4.0,
            5.0,
            1.0,
            2.0,
            3.0,
            4.0,
            1.0,
            2.0,
            3.0,
            4.0,
            5.0,
        ],
    ],
    "cat": [0, 1],
}

input_instances = [request]
predictor_input = {
    "instances": input_instances,
}

In [11]:
from sagemaker.serializers import JSONSerializer

predictor = sagemaker.predictor.Predictor(
    endpoint_name=endpoint_name, sagemaker_session=sagemaker_session, serializer=JSONSerializer()
)
prediction = predictor.predict(predictor_input)
print(json.loads(prediction))

{'predictions': [{'mean': [28.5425510406, 24.647687912, 22.2436542511, 18.3880290985, 15.5038976669, 22.9397029877, 20.5245018005, 17.0487174988, 15.3083105087, 21.5579490662, 19.5475158691, 15.5444173813, 13.7278432846, 12.3198976517]}]}
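Note the lengths in the request above: `target` has 5 values, while each `dynamic_feat` series has 19, which is 5 plus the `prediction_length` of 14. At inference time the dynamic features must also cover the forecast horizon. The sketch below makes that rule explicit; `build_deepar_request` is a hypothetical helper, not an SDK function.

```python
def build_deepar_request(start, target, dynamic_feat, cat, prediction_length):
    """Build a DeepAR inference payload and check dynamic feature lengths.

    Hypothetical helper: each dynamic feature must span the observed target
    plus the forecast horizon.
    """
    expected = len(target) + prediction_length
    for i, feature in enumerate(dynamic_feat):
        if len(feature) != expected:
            raise ValueError(
                f"dynamic_feat[{i}] has {len(feature)} values, expected {expected}"
            )
    instance = {"start": start, "target": target, "dynamic_feat": dynamic_feat, "cat": cat}
    return {"instances": [instance]}


target = [33.1595, 30.1788, 28.5022, 27.5708, 27.7571]
# Illustrative feature values; three series of length 5 + 14 = 19.
features = [[float(i % 5 + 1) for i in range(19)] for _ in range(3)]
payload = build_deepar_request("2014-05-30 01:00:00", target, features, [0, 1], 14)
print(len(payload["instances"][0]["dynamic_feat"][0]))
```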


## Time Series Explainability

### Import Components

Import the components needed to make a TSX call.

In [12]:
from sagemaker.clarify import (
    AsymmetricShapleyValueConfig,  # config for the explainability algorithm
    DataConfig,  # general-purpose DataConfig; the time series-specific data config object is passed to it
    ModelConfig,  # general-purpose ModelConfig; the time series-specific model config object is passed to it
    SageMakerClarifyProcessor,  # processor object, the job call is made via this
    TimeSeriesDataConfig,  # time series-specific data config object
    TimeSeriesModelConfig,  # time series-specific predictor config object
    TimeSeriesJSONDatasetFormat,  # time series-specific dataset format
)

### Set Configurations

In [13]:
dataset_name = "time_series_mock_data.json"

# Content template
c_template = '{"instances": $records}'
# Record template
r_template = '{"start": $start_time, "target": $target_time_series, "dynamic_feat": $related_time_series, "cat": $static_covariates}'

s3 = boto3.client("s3")  # s3 client
bucket_name = sagemaker.Session().default_bucket()
bucket_uri = "s3://" + bucket_name + "/deepar/"

s3_client.upload_file(dataset_name, bucket_name, f"deepar/data/{dataset_name}")
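The content and record templates describe how a request payload is assembled from placeholders such as `$records` and `$start_time`. The notebook does not show how Clarify renders them, so the following is only a rough sketch of the idea using Python's `string.Template`; the helper names and sample values are assumptions made for illustration.

```python
import json
from string import Template

c_template = '{"instances": $records}'
r_template = (
    '{"start": $start_time, "target": $target_time_series, '
    '"dynamic_feat": $related_time_series, "cat": $static_covariates}'
)


def render_record(start_time, target, related, static):
    """Fill the record template; each value is JSON-encoded before substitution."""
    return Template(r_template).substitute(
        start_time=json.dumps(start_time),
        target_time_series=json.dumps(target),
        related_time_series=json.dumps(related),
        static_covariates=json.dumps(static),
    )


def render_content(rendered_records):
    """Wrap already-rendered records in the content template."""
    return Template(c_template).substitute(records="[" + ", ".join(rendered_records) + "]")


# Illustrative values only.
record = render_record("2019-09-11 00:00:00", [1.0, 2.0], [[0.1, 0.2]], [1, 1])
rendered_payload = json.loads(render_content([record]))
print(rendered_payload)
```

The rendered payload has the same shape as the `predictor_input` used to verify the endpoint earlier, which is why these two templates fit the DeepAR container.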

### Asymmetric Shapley value

Our time series forecasting explainability algorithm hinges on the application of asymmetric Shapley values (ASV) from the theory of cooperative games. The ASV is a modification of the well-known Shapley value (e.g. SHAP) that discards the symmetry axiom but retains the efficiency axiom (i.e. attributions sum up to the predictions). Coalitions of features are generated based on a given probability distribution over feature *permutations* (rather than over *subsets*, as in the case of the Shapley value). In the case of time series, the distribution we use puts zero probability on permutations of features that do not respect the temporal dependencies, i.e. that have "holes".

**References:**
- Our main scientific reference is https://arxiv.org/abs/1910.06358. We extend the approach of the paper to also include static covariates and related time series, and implement a stochastic estimator for efficiency.
- A very useful math reference is [Probabilistic values by RJ Weber](http://www.library.fa.ru/files/Roth2.pdf#page=109); specifically, section 8 about random-order values (these are the same mathematical construction as ASV).
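To make the idea of order-dependent attributions concrete, here is a toy sketch. It is not Clarify's implementation: `toy_forecast` is a made-up stand-in for a model, the baseline is zero, and all probability mass is placed on a single ordering of the timesteps. How Clarify maps its `direction` values to orderings is not shown here. The sketch only illustrates that the attributions satisfy efficiency for any ordering, while the credit for an interaction moves depending on the order in which timesteps are revealed.

```python
def toy_forecast(series):
    """Made-up model: weighted sum plus an interaction between the last two steps."""
    weights = [0.1, 0.2, 0.3, 0.4]
    linear = sum(w * x for w, x in zip(weights, series))
    return linear + 0.5 * series[-1] * series[-2]


def masked(series, visible, baseline=0.0):
    """Replace every timestep that is not yet revealed with the baseline value."""
    return [x if t in visible else baseline for t, x in enumerate(series)]


def ordered_attributions(series, order, model=toy_forecast):
    """Marginal contribution of each timestep when revealed one by one in `order`."""
    visible = set()
    previous = model(masked(series, visible))
    attributions = [0.0] * len(series)
    for t in order:
        visible.add(t)
        current = model(masked(series, visible))
        attributions[t] = current - previous
        previous = current
    return attributions


series = [2.0, 1.0, 3.0, 4.0]
forward = ordered_attributions(series, order=range(len(series)))
reverse = ordered_attributions(series, order=reversed(range(len(series))))
total = toy_forecast(series) - toy_forecast([0.0] * len(series))

print("oldest step revealed first:", forward)
print("newest step revealed first:", reverse)
print("prediction minus baseline :", total)
```

Both attribution vectors sum to the difference between the prediction and the baseline prediction, but the interaction term is credited to a different timestep in each case. This is the asymmetry that the symmetric Shapley value would average away.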

### Create `AsymmetricShapleyValueConfig`

An `AsymmetricShapleyValueConfig` is used to configure the algorithm Clarify uses for time series explainability. It takes the following arguments:

- `direction`: Direction of explanation to be used. Available directions are `"chronological"`, `"anti_chronological"`, and `"bidirectional"`. The chronological direction highlights the effect of older timesteps over more recent ones, while the anti-chronological direction highlights the effect of timesteps closer to the forecast. Bidirectional is a combination of the previous two modes.
- `granularity`: Granularity of explanation to be used. Available granularities are `"timewise"` and `"fine_grained"`. The timewise granularity is fast and computes the attribution of individual timesteps toward the forecast, without distinguishing between related time series. The fine-grained mode is slower, but computes an attribution for every timestep and every dynamic feature, distinguishing between related and target time series.
- `num_samples`: Number of samples to be used in the Asymmetric Shapley Value forecasting algorithm. Only applicable when using `"fine_grained"` explanations. This represents the number of permutations sampled for computing the ASV.

The notebook sets `direction` and `granularity` as variables for later reference.

In [14]:
direction = "chronological"
granularity = "fine_grained"

Next, the notebook creates the `AsymmetricShapleyValueConfig` object.

In [15]:
asym_shap_val_config = AsymmetricShapleyValueConfig(
    direction=direction,
    granularity=granularity,
    num_samples=9,  # (dimension of target_time_series + dimension of related_time_series) ^ 2
    baseline={
        "target_time_series": "zero",
        "related_time_series": "zero",
        "static_covariates": {
            "mosfets": [0, 1],
            "interpol": [1, 1],
        },
    },
)

### Create `TimeSeriesDataConfig`

A `TimeSeriesDataConfig` object is used to configure data I/O settings specific to TSX. It takes the following arguments:

- `target_time_series`: A string or a zero-based integer index. Used to locate the target time series in the shared input dataset. If this parameter is a string, then all other parameters must also be strings or lists of strings. If this parameter is an int, then all others must be ints or lists of ints.
- `item_id`: A string or a zero-based integer index. Used to locate item id in the shared input dataset.
- `timestamp`: A string or a zero-based integer index. Used to locate timestamp in the shared input dataset.
- `related_time_series`: Optional. An array of strings or array of zero-based integer indices. Used to locate all related time series in the shared input dataset (if present).
- `static_covariates`: Optional. An array of strings or array of zero-based integer indices. Used to locate all item metadata fields in the shared input dataset (if present).
- `dataset_format`: Optional. A string which describes the format of the data files provided for analysis. Should only be provided when dataset is in JSON format. Currently, we support `columns` and `timestamp_records` where example mock data files `ts_cols.json` and `time_series_mock_data.json` are provided respectively.

This `TimeSeriesDataConfig` helps the container parse the data needed for the analysis. Any data column without a corresponding JMESPath expression to locate it is excluded.

In [16]:
ts_data_config = TimeSeriesDataConfig(
    target_time_series="[].target_value",
    item_id="[].item_id",
    timestamp="[].timestamp",
    related_time_series=[f"[].dynamic_feature_{x+1}" for x in range(3)],
    static_covariates=["[].static_feature_1", "[].static_feature_2"],
    dataset_format=TimeSeriesJSONDatasetFormat.TIMESTAMP_RECORDS,
)
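Each expression above has the form `[].<field>`, which selects one field from every record in a top-level JSON array. The sketch below shows the equivalent selection in plain Python on a few made-up records. It assumes, based on those expressions, that the dataset is a flat list with one object per item and timestamp. The helper names are hypothetical.

```python
from collections import defaultdict

# Illustrative records only; the real data lives in `time_series_mock_data.json`.
records = [
    {"item_id": "mosfets", "timestamp": "2019-09-11", "target_value": 47650.3},
    {"item_id": "mosfets", "timestamp": "2019-09-12", "target_value": 47380.3},
    {"item_id": "interpol", "timestamp": "2019-09-11", "target_value": 10.0},
]


def project(rows, field):
    """Plain-Python counterpart of the JMESPath expression `[].<field>`."""
    return [row[field] for row in rows if row.get(field) is not None]


def series_by_item(rows, field):
    """Group one field into a per-item series, ordered by timestamp."""
    grouped = defaultdict(list)
    for row in sorted(rows, key=lambda r: r["timestamp"]):
        grouped[row["item_id"]].append(row[field])
    return dict(grouped)


print(project(records, "target_value"))
print(series_by_item(records, "target_value"))
```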

### Create `TimeSeriesModelConfig`

A `TimeSeriesModelConfig` is used to configure model settings specific to TSX. At the moment it has only one argument:

- `forecast`: JMESPath expression to extract the forecast result.

In [17]:
ts_model_config = TimeSeriesModelConfig(
    forecast="predictions[*].mean",
)
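The expression `predictions[*].mean` picks the `mean` list out of every entry of `predictions` in the endpoint response. A plain-Python equivalent is sketched below on a shortened, illustrative response; `extract_forecasts` is a hypothetical helper.

```python
def extract_forecasts(response):
    """Plain-Python counterpart of the JMESPath expression `predictions[*].mean`."""
    return [
        prediction["mean"]
        for prediction in response.get("predictions", [])
        if prediction.get("mean") is not None
    ]


# Shortened, illustrative response with the same shape as the endpoint output above.
sample_response = {"predictions": [{"mean": [28.54, 24.65, 22.24]}, {"mean": [1.0, 2.0, 3.0]}]}
print(extract_forecasts(sample_response))
```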

### Create DataConfig

General information about the dataset the TimeSeries model uses is provided to `DataConfig`. Here, we are providing where to retrieve the dataset, where to output the explainability job results, what format the dataset is in, and our TSX specific data settings.

In [18]:
input_uri = bucket_uri + "data/" + dataset_name
output_path = bucket_uri + "output"

data_config = DataConfig(
    s3_data_input_path=input_uri,
    s3_output_path=output_path,
    dataset_type="application/json",
    time_series_data_config=ts_data_config,
    headers=[
        "item_id",
        "timestamp",
        "target_value",
        "dynamic_feature_1",
        "dynamic_feature_2",
        "dynamic_feature_3",
        "static_feature_1",
        "static_feature_2",
    ],
)

### Create `ModelConfig`

`ModelConfig` tells Clarify how to reach the model. Because `endpoint_name` is provided here, Clarify sends its requests to the existing endpoint created above instead of deploying a new one.

In [19]:
model_config = ModelConfig(
    endpoint_name=endpoint_name,
    content_type="application/json",
    accept_type="application/json",
    content_template=c_template,
    record_template=r_template,
    time_series_model_config=ts_model_config,
)

It is also possible to have Clarify deploy the model to a new endpoint, with the following modifications to the `ModelConfig` call:

1. Omit `endpoint_name`.
2. Provide `model_name`, `instance_count`, and `instance_type` (and, optionally, `endpoint_name_prefix`).

### Setup Processor

Create the `Processor` object that will set up the explainability job.

In [20]:
instance_count = 1
instance_type = "ml.c5.2xlarge"

clarify_processor = SageMakerClarifyProcessor(
    role=role,
    sagemaker_session=sagemaker_session,
    instance_count=instance_count,
    instance_type=instance_type,
    job_name_prefix="clarify-tsx-job-demo",
)

INFO:sagemaker.image_uris:Defaulting to the only supported framework/algorithm version: 1.0.
INFO:sagemaker.image_uris:Ignoring unnecessary instance type: None.


### Run Explainability Call

In [21]:
clarify_processor.run_explainability(
    data_config=data_config,
    model_config=model_config,
    explainability_config=asym_shap_val_config,
)

INFO:sagemaker.clarify:Analysis Config: {'dataset_type': 'application/json', 'headers': ['item_id', 'timestamp', 'target_value', 'dynamic_feature_1', 'dynamic_feature_2', 'dynamic_feature_3', 'static_feature_1', 'static_feature_2'], 'time_series_data_config': {'target_time_series': '[].target_value', 'item_id': '[].item_id', 'timestamp': '[].timestamp', 'related_time_series': ['[].dynamic_feature_1', '[].dynamic_feature_2', '[].dynamic_feature_3'], 'static_covariates': ['[].static_feature_1', '[].static_feature_2'], 'dataset_format': 'timestamp_records'}, 'predictor': {'endpoint_name': 'DeepArTest-2024-04-09-17-56-24-856', 'accept_type': 'application/json', 'content_type': 'application/json', 'content_template': '{"instances": $records}', 'record_template': '{"start": $start_time, "target": $target_time_series, "dynamic_feat": $related_time_series, "cat": $static_covariates}', 'time_series_predictor_config': {'forecast': 'predictions[*].mean'}}, 'methods': {'report': {'name': 'report',

INFO:sagemaker-clarify-processing:Starting SageMaker Clarify Processing job
INFO:analyzer.data_loading.data_loader_util:Analysis config path: /opt/ml/processing/input/config/analysis_config.json
INFO:analyzer.data_loading.data_loader_util:Analysis result path: /opt/ml/processing/output
INFO:analyzer.data_loading.data_loader_util:This host is algo-1.
INFO:analyzer.data_loading.data_loader_util:This host is the leader.
INFO:analyzer.data_loading.data_loader_util:Number of hosts in the cluster is 1.
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
24/04/09 18:15:10 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
INFO:analyzer.predictor.managed_endpoint:Checking endpoint status:
Legend:
(OutOfService: x, Creating: -, Updating: -, InService: !, RollingBack: <, Deleting: o, Failed: *)
INFO:analyzer.predictor.managed_endpoint:Endpoint is

## Analysis Config

### Retrieve Config From s3

In [22]:
s3.download_file(bucket_name, "deepar/output/analysis_config.json", "analysis_config.json")

### Display Config

In [23]:
with open("./analysis_config.json", "r") as analysis_config_file:
    analysis_config = json.load(analysis_config_file)
    # mercury.JSON(analysis_config, level = 3)
    config_printer = pprint.PrettyPrinter(width=200, compact=False)
    config_printer.pprint(analysis_config)

{'dataset_type': 'application/json',
 'headers': ['item_id', 'timestamp', 'target_value', 'dynamic_feature_1', 'dynamic_feature_2', 'dynamic_feature_3', 'static_feature_1', 'static_feature_2'],
 'methods': {'asymmetric_shapley_value': {'baseline': {'related_time_series': 'zero', 'static_covariates': {'interpol': [1, 1], 'mosfets': [0, 1]}, 'target_time_series': 'zero'},
                                          'direction': 'chronological',
                                          'granularity': 'fine_grained',
                                          'num_samples': 9},
             'report': {'name': 'report', 'title': 'Analysis Report'}},
 'predictor': {'accept_type': 'application/json',
               'content_template': '{"instances": $records}',
               'content_type': 'application/json',
               'endpoint_name': 'DeepArTest-2024-04-09-17-56-24-856',
               'record_template': '{"start": $start_time, "target": $target_time_series, "dynamic_feat": $related_ti

## Explainability Results

### Retrieve Results From s3

In [24]:
full_result_path = f"deepar/output/asymmetric_shapley_value/{granularity}_{direction}/out.jsonl"

s3.download_file(bucket_name, full_result_path, "results.jsonl")

### Display Results

In [25]:
with open("./results.jsonl", "r") as results_file:
    results_lines = results_file.readlines()
    explainability_results = [json.loads(jsonline) for jsonline in results_lines]
    mercury.JSON(explainability_results, level=5)
    # results_printer = pprint.PrettyPrinter(width = 200, depth = 5, compact = False)
    # results_printer.pprint(explainability_results)

## Clean Up

Remove downloaded/installed files and deployed resources as necessary.

In [26]:
# remove the model, endpoint_config and endpoint from sagemaker
boto3.client("sagemaker").delete_model(ModelName=endpoint_name)
boto3.client("sagemaker").delete_endpoint_config(EndpointConfigName=endpoint_name)
boto3.client("sagemaker").delete_endpoint(EndpointName=endpoint_name)

{'ResponseMetadata': {'RequestId': '1408e236-27a2-4e5f-916f-b80c51e1d522',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': '1408e236-27a2-4e5f-916f-b80c51e1d522',
   'content-type': 'application/x-amz-json-1.1',
   'date': 'Tue, 09 Apr 2024 18:16:35 GMT',
   'content-length': '0'},
  'RetryAttempts': 0}}

In [27]:
# remove the results and analysis config files
# !rm -r ./sagemaker-python-sdk -f
# !rm analysis_config.json
# !rm results.jsonl

## Notebook CI Test Results

This notebook was tested in multiple regions. The test results are as follows, except for us-west-2 which is shown at the top of the notebook.

![This us-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/us-east-1/sagemaker-clarify|time_series_deepar|time_series_deepar.ipynb)

![This us-east-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/us-east-2/sagemaker-clarify|time_series_deepar|time_series_deepar.ipynb)

![This us-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/us-west-1/sagemaker-clarify|time_series_deepar|time_series_deepar.ipynb)

![This ca-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ca-central-1/sagemaker-clarify|time_series_deepar|time_series_deepar.ipynb)

![This sa-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/sa-east-1/sagemaker-clarify|time_series_deepar|time_series_deepar.ipynb)

![This eu-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-west-1/sagemaker-clarify|time_series_deepar|time_series_deepar.ipynb)

![This eu-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-west-2/sagemaker-clarify|time_series_deepar|time_series_deepar.ipynb)

![This eu-west-3 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-west-3/sagemaker-clarify|time_series_deepar|time_series_deepar.ipynb)

![This eu-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-central-1/sagemaker-clarify|time_series_deepar|time_series_deepar.ipynb)

![This eu-north-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-north-1/sagemaker-clarify|time_series_deepar|time_series_deepar.ipynb)

![This ap-southeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-southeast-1/sagemaker-clarify|time_series_deepar|time_series_deepar.ipynb)

![This ap-southeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-southeast-2/sagemaker-clarify|time_series_deepar|time_series_deepar.ipynb)

![This ap-northeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-northeast-1/sagemaker-clarify|fairness_and_explainability|fairness_and_explainability.ipynb)

![This ap-northeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-northeast-2/sagemaker-clarify|fairness_and_explainability|fairness_and_explainability.ipynb)

![This ap-south-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-south-1/sagemaker-clarify|fairness_and_explainability|fairness_and_explainability.ipynb)
