# Evaluate Model with Amazon SageMaker Processing Jobs and Scikit-Learn

Data processing libraries such as Scikit-Learn are often used to pre-process data sets in order to prepare them for training. The same managed environment can also run a script that evaluates a trained model.

In this notebook we'll use Amazon SageMaker Processing with a managed Scikit-Learn container to evaluate the model trained in the previous section.

# NOTE:  THIS NOTEBOOK WILL TAKE 5-10 MINUTES TO COMPLETE.

# PLEASE BE PATIENT.

![](img/prepare_dataset_bert.png)

![](img/processing.jpg)

## Contents

1. Setup Environment
1. Run the Processing Job using Amazon SageMaker
1. Monitor the Processing Job
1. Inspect the Processed Output Data
1. Pass Variables to the Next Notebook(s)
1. Show the Experiment Tracking Lineage
1. Release Resources

# Setup Environment

Let's start by specifying:
* The S3 bucket and prefixes that you use for training and model data. Use the default bucket specified by the Amazon SageMaker session.
* The IAM role ARN used to give processing and training access to the dataset.

In [31]:
import sagemaker
import boto3

sess = sagemaker.Session()
role = sagemaker.get_execution_role()
bucket = sess.default_bucket()
region = boto3.Session().region_name

sm = boto3.Session().client(service_name="sagemaker", region_name=region)

In [32]:
%store -r training_job_name

In [33]:
try:
    training_job_name
    print("[OK]")
except NameError:
    print("+++++++++++++++++++++++++++++++")
    print("[ERROR] Please run the notebooks in the previous TRAIN section before you continue.")
    print("+++++++++++++++++++++++++++++++")

[OK]


In [34]:
print(training_job_name)

tensorflow-training-2024-02-18-22-26-04-112


In [35]:
%store -r raw_input_data_s3_uri

In [36]:
try:
    raw_input_data_s3_uri
except NameError:
    print("++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++")
    print("[ERROR] Please run the notebooks in the PREPARE section before you continue.")
    print("++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++")

In [37]:
print(raw_input_data_s3_uri)

s3://sagemaker-eu-west-2-123137613716/amazon-reviews-pds/tsv/


In [38]:
%store -r max_seq_length

In [39]:
try:
    max_seq_length
    print("[OK]")
except NameError:
    print("+++++++++++++++++++++++++++++++")
    print("[ERROR] Please run the notebooks in the previous TRAIN section before you continue.")
    print("+++++++++++++++++++++++++++++++")

[OK]


In [40]:
print(max_seq_length)

64


In [41]:
%store -r experiment_name

In [42]:
try:
    experiment_name
    print("[OK]")
except NameError:
    print("+++++++++++++++++++++++++++++++")
    print("[ERROR] Please run the notebooks in the previous TRAIN section before you continue.")
    print("+++++++++++++++++++++++++++++++")

[OK]


In [43]:
print(experiment_name)

Amazon-Customer-Reviews-BERT-Experiment-1708210947


In [44]:
%store -r trial_name

In [45]:
try:
    trial_name
    print("[OK]")
except NameError:
    print("+++++++++++++++++++++++++++++++")
    print("[ERROR] Please run the notebooks in the previous TRAIN section before you continue.")
    print("+++++++++++++++++++++++++++++++")

[OK]


In [46]:
print(trial_name)

trial-1708210947


In [47]:
print(training_job_name)

tensorflow-training-2024-02-18-22-26-04-112
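The cells above repeat the same `try`/`except NameError` pattern for each restored variable. As a minimal sketch, the check could be collapsed into one helper. The function name `find_missing` and the example namespace are hypothetical; in the notebook you would pass `globals()` instead.

```python
def find_missing(names, namespace):
    """Return the names from `names` that are not defined in `namespace`."""
    return [name for name in names if name not in namespace]


# Hypothetical stand-in for the notebook's globals() after running `%store -r`.
example_namespace = {
    "training_job_name": "tensorflow-training-example",
    "max_seq_length": 64,
}

required = ["training_job_name", "raw_input_data_s3_uri", "max_seq_length"]
missing = find_missing(required, example_namespace)

if missing:
    print("[ERROR] Missing stored variables: {}".format(", ".join(missing)))
else:
    print("[OK]")
```

Reporting all missing names at once makes it easier to see which earlier section (PREPARE or TRAIN) still needs to be run.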


In [48]:
from sagemaker.tensorflow.estimator import TensorFlow

describe_training_job_response = sm.describe_training_job(TrainingJobName=training_job_name)
print(describe_training_job_response)

{'TrainingJobName': 'tensorflow-training-2024-02-18-22-26-04-112', 'TrainingJobArn': 'arn:aws:sagemaker:eu-west-2:123137613716:training-job/tensorflow-training-2024-02-18-22-26-04-112', 'ModelArtifacts': {'S3ModelArtifacts': 's3://sagemaker-eu-west-2-123137613716/tensorflow-training-2024-02-18-22-26-04-112/output/model.tar.gz'}, 'TrainingJobStatus': 'Completed', 'SecondaryStatus': 'Completed', 'HyperParameters': {'enable_checkpointing': 'false', 'enable_sagemaker_debugger': 'true', 'enable_tensorboard': 'true', 'epochs': '1', 'epsilon': '1e-08', 'freeze_bert_layer': 'false', 'learning_rate': '0.001', 'max_seq_length': '64', 'model_dir': '"s3://sagemaker-eu-west-2-123137613716/tensorflow-training-2024-02-18-22-26-04-112/model"', 'run_sample_predictions': 'true', 'run_test': 'true', 'run_validation': 'true', 'sagemaker_container_log_level': '20', 'sagemaker_job_name': '"tensorflow-training-2024-02-18-22-26-04-112"', 'sagemaker_program': '"tf_bert_reviews.py"', 'sagemaker_region': '"eu-we

In [49]:
model_dir_s3_uri = describe_training_job_response["ModelArtifacts"]["S3ModelArtifacts"].replace("model.tar.gz", "")
model_dir_s3_uri

's3://sagemaker-eu-west-2-123137613716/tensorflow-training-2024-02-18-22-26-04-112/output/'
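The `replace("model.tar.gz", "")` call works because the artifact file has that exact name. A sketch of an alternative that does not depend on the file name is shown below; the helper name `artifact_directory` and the example URI are assumptions for illustration only.

```python
def artifact_directory(s3_model_artifacts_uri):
    """Return the S3 prefix (with a trailing slash) that contains the artifact."""
    # rpartition splits on the last "/" so only the file name is dropped.
    prefix, _, _filename = s3_model_artifacts_uri.rpartition("/")
    return prefix + "/"


# Hypothetical URI with the same layout as the training job output above.
example_uri = "s3://example-bucket/example-training-job/output/model.tar.gz"
print(artifact_directory(example_uri))
```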

# Run the Processing Job using Amazon SageMaker

Next, use the Amazon SageMaker Python SDK to submit a processing job that runs our custom Python script.

# Create the `Experiment Config`

In [50]:
experiment_config = {
    "ExperimentName": experiment_name,
    "TrialName": trial_name,
    "TrialComponentDisplayName": "evaluate",
}

# Set the Processing Job Hyper-Parameters 

In [51]:
processing_instance_type = "ml.m5.xlarge"
processing_instance_count = 1

# Choosing a `max_seq_length` for BERT
Since a smaller `max_seq_length` leads to faster training and lower resource utilization, we want to find the smallest review length that captures `80%` of our reviews.

Remember our distribution of review lengths from a previous section?

```
mean         51.683405
std         107.030844
min           1.000000
10%           2.000000
20%           7.000000
30%          19.000000
40%          22.000000
50%          26.000000
60%          32.000000
70%          43.000000
80%          63.000000
90%         110.000000
100%       5347.000000
max        5347.000000
```

![](img/review_word_count_distribution.png)

Review length `63` represents the `80th` percentile for this dataset.  However, it's best to stick with powers-of-2 when using BERT.  So let's choose `64` as this is the smallest power-of-2 greater than `63`.  Reviews with length > `64` will be truncated to `64`.
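The rounding and truncation described above can be sketched in a few lines. This is a simplified illustration: the helper names are hypothetical, and real BERT tokenization works on sub-word tokens and reserves positions for special tokens, which this sketch ignores.

```python
def smallest_power_of_two_at_least(n):
    """Return the smallest power of 2 that is greater than or equal to n."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    power = 1
    while power < n:
        power *= 2
    return power


def truncate_tokens(tokens, max_seq_length):
    """Keep at most max_seq_length tokens, dropping the rest."""
    return tokens[:max_seq_length]


percentile_80 = 63  # 80th-percentile review length from the table above
chosen_max_seq_length = smallest_power_of_two_at_least(percentile_80)
print(chosen_max_seq_length)

# A hypothetical review of 110 tokens (the 90th-percentile length) is cut down.
long_review = ["token"] * 110
print(len(truncate_tokens(long_review, chosen_max_seq_length)))
```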

In [52]:
from sagemaker.sklearn.processing import SKLearnProcessor

processor = SKLearnProcessor(
    framework_version="0.23-1",
    role=role,
    instance_type=processing_instance_type,
    instance_count=processing_instance_count,
    max_runtime_in_seconds=1800,
)

# Reduced the max run time to one quarter of the original value ===> 7200 * 0.25 = 1800 seconds

In [53]:
from sagemaker.processing import ProcessingInput, ProcessingOutput

processor.run(
    code="evaluate_model_metrics.py",
    inputs=[
        ProcessingInput(
            input_name="model-tar-s3-uri", source=model_dir_s3_uri, destination="/opt/ml/processing/input/model/"
        ),
        ProcessingInput(
            input_name="evaluation-data-s3-uri",
            source=raw_input_data_s3_uri,
            destination="/opt/ml/processing/input/data/",
        ),
    ],
    outputs=[
        ProcessingOutput(s3_upload_mode="EndOfJob", output_name="metrics", source="/opt/ml/processing/output/metrics"),
    ],
    arguments=["--max-seq-length", str(max_seq_length)],
    experiment_config=experiment_config,
    logs=True,
    wait=False,
)


Job Name:  sagemaker-scikit-learn-2024-02-18-22-54-40-811
Inputs:  [{'InputName': 'model-tar-s3-uri', 'AppManaged': False, 'S3Input': {'S3Uri': 's3://sagemaker-eu-west-2-123137613716/tensorflow-training-2024-02-18-22-26-04-112/output/', 'LocalPath': '/opt/ml/processing/input/model/', 'S3DataType': 'S3Prefix', 'S3InputMode': 'File', 'S3DataDistributionType': 'FullyReplicated', 'S3CompressionType': 'None'}}, {'InputName': 'evaluation-data-s3-uri', 'AppManaged': False, 'S3Input': {'S3Uri': 's3://sagemaker-eu-west-2-123137613716/amazon-reviews-pds/tsv/', 'LocalPath': '/opt/ml/processing/input/data/', 'S3DataType': 'S3Prefix', 'S3InputMode': 'File', 'S3DataDistributionType': 'FullyReplicated', 'S3CompressionType': 'None'}}, {'InputName': 'code', 'AppManaged': False, 'S3Input': {'S3Uri': 's3://sagemaker-eu-west-2-123137613716/sagemaker-scikit-learn-2024-02-18-22-54-40-811/input/code/evaluate_model_metrics.py', 'LocalPath': '/opt/ml/processing/input/code', 'S3DataType': 'S3Prefix', 'S3InputM
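The `arguments` list passed to `processor.run()` reaches `evaluate_model_metrics.py` as command-line flags. That script is not shown in this notebook, so the following is only a sketch of how such a script might read the flag with `argparse`. The function name `parse_args` and the default value are assumptions.

```python
import argparse


def parse_args(argv=None):
    """Parse the flags that the processing job passes to the script."""
    parser = argparse.ArgumentParser(description="Evaluate a trained model (sketch).")
    # Matches arguments=["--max-seq-length", str(max_seq_length)] in processor.run().
    # The default of 64 is an assumption for this sketch.
    parser.add_argument("--max-seq-length", type=int, default=64)
    return parser.parse_args(argv)


# Simulate the command line that the processing container would receive.
args = parse_args(["--max-seq-length", "64"])
print(args.max_seq_length)
```

Note that `argparse` converts the dashes in `--max-seq-length` to underscores, so the value is read as `args.max_seq_length`.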

In [54]:
scikit_processing_job_name = processor.jobs[-1].describe()["ProcessingJobName"]
print(scikit_processing_job_name)

sagemaker-scikit-learn-2024-02-18-22-54-40-811


In [55]:
from IPython.display import display, HTML

display(
    HTML(
        '<b>Review <a target="_blank" href="https://console.aws.amazon.com/sagemaker/home?region={}#/processing-jobs/{}">Processing Job</a></b>'.format(
            region, scikit_processing_job_name
        )
    )
)

In [56]:
from IPython.display import display, HTML

display(
    HTML(
        '<b>Review <a target="_blank" href="https://console.aws.amazon.com/cloudwatch/home?region={}#logStream:group=/aws/sagemaker/ProcessingJobs;prefix={};streamFilter=typeLogStreamPrefix">CloudWatch Logs</a> After About 5 Minutes</b>'.format(
            region, scikit_processing_job_name
        )
    )
)

In [57]:
from IPython.display import display, HTML

display(
    HTML(
        '<b>Review <a target="_blank" href="https://s3.console.aws.amazon.com/s3/buckets/{}/{}/?region={}&tab=overview">S3 Output Data</a> After The Processing Job Has Completed</b>'.format(
            bucket, scikit_processing_job_name, region
        )
    )
)

# Monitor the Processing Job

In [58]:
running_processor = sagemaker.processing.ProcessingJob.from_processing_name(
    processing_job_name=scikit_processing_job_name, sagemaker_session=sess
)

processing_job_description = running_processor.describe()

print(processing_job_description)

{'ProcessingInputs': [{'InputName': 'model-tar-s3-uri', 'AppManaged': False, 'S3Input': {'S3Uri': 's3://sagemaker-eu-west-2-123137613716/tensorflow-training-2024-02-18-22-26-04-112/output/', 'LocalPath': '/opt/ml/processing/input/model/', 'S3DataType': 'S3Prefix', 'S3InputMode': 'File', 'S3DataDistributionType': 'FullyReplicated', 'S3CompressionType': 'None'}}, {'InputName': 'evaluation-data-s3-uri', 'AppManaged': False, 'S3Input': {'S3Uri': 's3://sagemaker-eu-west-2-123137613716/amazon-reviews-pds/tsv/', 'LocalPath': '/opt/ml/processing/input/data/', 'S3DataType': 'S3Prefix', 'S3InputMode': 'File', 'S3DataDistributionType': 'FullyReplicated', 'S3CompressionType': 'None'}}, {'InputName': 'code', 'AppManaged': False, 'S3Input': {'S3Uri': 's3://sagemaker-eu-west-2-123137613716/sagemaker-scikit-learn-2024-02-18-22-54-40-811/input/code/evaluate_model_metrics.py', 'LocalPath': '/opt/ml/processing/input/code', 'S3DataType': 'S3Prefix', 'S3InputMode': 'File', 'S3DataDistributionType': 'FullyR

In [59]:
processing_evaluation_metrics_job_name = processing_job_description["ProcessingJobName"]
print(processing_evaluation_metrics_job_name)

sagemaker-scikit-learn-2024-02-18-22-54-40-811


In [60]:
%%time

running_processor.wait(logs=False)

.

................................................................................................................................................................!CPU times: user 646 ms, sys: 61.8 ms, total: 708 ms
Wall time: 13min 31s


# _Please Wait Until the ^^ Processing Job ^^ Completes Above._
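`running_processor.wait()` blocks until the job finishes. If you would rather poll the status yourself, for example to stop after a fixed number of attempts, a minimal sketch might look like this. The helper name, its parameters, and the stubbed responses are assumptions; the sketch relies only on the `ProcessingJobStatus` field of the job description.

```python
import time

# Statuses after which a processing job will not change state again.
TERMINAL_STATUSES = {"Completed", "Failed", "Stopped"}


def wait_for_processing_job(describe_fn, poll_seconds=30, max_polls=120, sleep_fn=time.sleep):
    """Call describe_fn() until the job reaches a terminal status, then return it."""
    for _ in range(max_polls):
        status = describe_fn()["ProcessingJobStatus"]
        if status in TERMINAL_STATUSES:
            return status
        sleep_fn(poll_seconds)
    raise TimeoutError("Job did not finish within {} polls".format(max_polls))


# Stubbed descriptions so the sketch runs without AWS access.
_fake_statuses = iter(["InProgress", "InProgress", "Completed"])
final_status = wait_for_processing_job(
    lambda: {"ProcessingJobStatus": next(_fake_statuses)},
    poll_seconds=0,
)
print(final_status)
```

In this notebook, `describe_fn` could be `running_processor.describe`, which returns the same job description dictionary printed earlier.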

# Inspect the Processed Output Data

Locate the evaluation metrics written by the processing job to make sure the processing was successful.

In [61]:
processing_job_description = running_processor.describe()

output_config = processing_job_description["ProcessingOutputConfig"]
for output in output_config["Outputs"]:
    if output["OutputName"] == "metrics":
        processed_metrics_s3_uri = output["S3Output"]["S3Uri"]

print(processed_metrics_s3_uri)

s3://sagemaker-eu-west-2-123137613716/sagemaker-scikit-learn-2024-02-18-22-54-40-811/output/metrics
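The loop above leaves `processed_metrics_s3_uri` undefined if no output is named `metrics`. One possible way to make that failure explicit is a small lookup helper, sketched below. The helper name and the example description are hypothetical; only the dictionary keys come from the job description used above.

```python
def find_output_s3_uri(job_description, output_name):
    """Return the S3 URI of the named output, or raise KeyError if it is absent."""
    outputs = job_description.get("ProcessingOutputConfig", {}).get("Outputs", [])
    for output in outputs:
        if output.get("OutputName") == output_name:
            return output["S3Output"]["S3Uri"]
    raise KeyError("No processing output named '{}'".format(output_name))


# Hypothetical description with the same structure as running_processor.describe().
example_description = {
    "ProcessingOutputConfig": {
        "Outputs": [
            {
                "OutputName": "metrics",
                "S3Output": {"S3Uri": "s3://example-bucket/example-job/output/metrics"},
            }
        ]
    }
}

print(find_output_s3_uri(example_description, "metrics"))
```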


In [62]:
!aws s3 ls $processed_metrics_s3_uri/

2024-02-18 23:08:04         42 evaluation.json


## Show the test accuracy

In [63]:
import json
from pprint import pprint

evaluation_json = sagemaker.s3.S3Downloader.read_file("{}/evaluation.json".format(processed_metrics_s3_uri))

pprint(json.loads(evaluation_json))

{'metrics': {'accuracy': {'value': 0.12}}}
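The evaluation script itself is not shown here. As a rough sketch under that caveat, a script could compute accuracy and write a report with the same JSON shape as the output above. The function names and the sample labels are made up for illustration, and a temporary directory stands in for the job's `/opt/ml/processing/output/metrics` path.

```python
import json
import os
import tempfile


def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def write_evaluation_report(accuracy_value, output_dir):
    """Write evaluation.json in the {'metrics': {'accuracy': {'value': ...}}} shape."""
    os.makedirs(output_dir, exist_ok=True)
    report = {"metrics": {"accuracy": {"value": accuracy_value}}}
    path = os.path.join(output_dir, "evaluation.json")
    with open(path, "w") as f:
        json.dump(report, f)
    return path


# Made-up star ratings, for illustration only.
y_true = [5, 4, 1, 3]
y_pred = [5, 2, 1, 1]

# In the processing job this would be /opt/ml/processing/output/metrics.
output_dir = tempfile.mkdtemp()
report_path = write_evaluation_report(accuracy(y_true, y_pred), output_dir)

with open(report_path) as f:
    loaded_report = json.load(f)
print(loaded_report)
```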


In [64]:
# !aws s3 cp $processed_metrics_s3_uri/confusion_matrix.png ./model_evaluation/

# import time

# time.sleep(10)  # Slight delay for our notebook to recognize the newly-downloaded file

In [65]:
# %%html

# <img src='./model_evaluation/confusion_matrix.png'>

# Pass Variables to the Next Notebook(s)

In [66]:
%store processing_evaluation_metrics_job_name

Stored 'processing_evaluation_metrics_job_name' (str)


In [67]:
%store processed_metrics_s3_uri

Stored 'processed_metrics_s3_uri' (str)


In [68]:
%store

Stored variables and their in-db values:
auto_ml_job_name                                      -> 'automl-dm-15-23-18-19'
autopilot_endpoint_arn                                -> 'arn:aws:sagemaker:eu-west-2:123137613716:endpoint
autopilot_endpoint_name                               -> 'automl-dm-ep-16-02-16-45'
autopilot_model_arn                                   -> 'arn:aws:sagemaker:eu-west-2:123137613716:model/au
autopilot_model_name                                  -> 'automl-dm-model-16-02-16-44'
autopilot_train_s3_uri                                -> 's3://sagemaker-eu-west-2-123137613716/data/amazon
balance_dataset                                       -> True
balanced_bias_data_jsonlines_s3_uri                   -> 's3://sagemaker-eu-west-2-123137613716/bias-detect
balanced_bias_data_s3_uri                             -> 's3://sagemaker-eu-west-2-123137613716/bias-detect
bias_data_s3_uri                                      -> 's3://sagemaker-eu-west-2-123137613716/bias-dete

# Show the Experiment Tracking Lineage

In [69]:
from sagemaker.analytics import ExperimentAnalytics

import pandas as pd

pd.set_option("max_colwidth", 500)

experiment_analytics = ExperimentAnalytics(
    sagemaker_session=sess, experiment_name=experiment_name, sort_by="CreationTime", sort_order="Descending"
)

experiment_analytics_df = experiment_analytics.dataframe()
experiment_analytics_df

Unnamed: 0,TrialComponentName,DisplayName,SourceArn,SageMaker.InstanceCount,SageMaker.InstanceType,SageMaker.VolumeSizeInGB,SageMaker.ImageUri - MediaType,SageMaker.ImageUri - Value,code - MediaType,code - Value,...,SageMaker.ModelArtifact - Value,AWS_DEFAULT_REGION,raw-input-data - MediaType,raw-input-data - Value,bert-test - MediaType,bert-test - Value,bert-train - MediaType,bert-train - Value,bert-validation - MediaType,bert-validation - Value
0,sagemaker-scikit-learn-2024-02-18-22-54-40-811-aws-processing-job,evaluate,arn:aws:sagemaker:eu-west-2:123137613716:processing-job/sagemaker-scikit-learn-2024-02-18-22-54-40-811,1.0,ml.m5.xlarge,30.0,,764974769150.dkr.ecr.eu-west-2.amazonaws.com/sagemaker-scikit-learn:0.23-1-cpu-py3,,s3://sagemaker-eu-west-2-123137613716/sagemaker-scikit-learn-2024-02-18-22-54-40-811/input/code/evaluate_model_metrics.py,...,,,,,,,,,,
1,sagemaker-scikit-learn-2024-02-18-22-51-33-980-aws-processing-job,evaluate,arn:aws:sagemaker:eu-west-2:123137613716:processing-job/sagemaker-scikit-learn-2024-02-18-22-51-33-980,1.0,ml.m5.xlarge,30.0,,764974769150.dkr.ecr.eu-west-2.amazonaws.com/sagemaker-scikit-learn:0.23-1-cpu-py3,,s3://sagemaker-eu-west-2-123137613716/sagemaker-scikit-learn-2024-02-18-22-51-33-980/input/code/evaluate_model_metrics.py,...,,,,,,,,,,
2,tensorflow-training-2024-02-18-22-26-04-112-aws-training-job,train,arn:aws:sagemaker:eu-west-2:123137613716:training-job/tensorflow-training-2024-02-18-22-26-04-112,1.0,ml.c5.9xlarge,1024.0,,,,,...,s3://sagemaker-eu-west-2-123137613716/tensorflow-training-2024-02-18-22-26-04-112/output/model.tar.gz,,,,,,,,,
3,tensorflow-training-2024-02-18-22-07-23-321-aws-training-job,train,arn:aws:sagemaker:eu-west-2:123137613716:training-job/tensorflow-training-2024-02-18-22-07-23-321,1.0,ml.c5.9xlarge,1024.0,,,,,...,s3://sagemaker-eu-west-2-123137613716/tensorflow-training-2024-02-18-22-07-23-321/output/model.tar.gz,,,,,,,,,
4,sagemaker-scikit-learn-2024-02-17-23-02-28-552-aws-processing-job,prepare,arn:aws:sagemaker:eu-west-2:123137613716:processing-job/sagemaker-scikit-learn-2024-02-17-23-02-28-552,2.0,ml.c5.2xlarge,30.0,,764974769150.dkr.ecr.eu-west-2.amazonaws.com/sagemaker-scikit-learn:0.23-1-cpu-py3,,s3://sagemaker-eu-west-2-123137613716/sagemaker-scikit-learn-2024-02-17-23-02-28-552/input/code/preprocess-scikit-text-to-bert-feature-store.py,...,,eu-west-2,,s3://sagemaker-eu-west-2-123137613716/amazon-reviews-pds/tsv/,,s3://sagemaker-eu-west-2-123137613716/sagemaker-scikit-learn-2024-02-17-23-02-28-552/output/bert-test,,s3://sagemaker-eu-west-2-123137613716/sagemaker-scikit-learn-2024-02-17-23-02-28-552/output/bert-train,,s3://sagemaker-eu-west-2-123137613716/sagemaker-scikit-learn-2024-02-17-23-02-28-552/output/bert-validation


In [70]:
trial_component_name = experiment_analytics_df.TrialComponentName[0]
print(trial_component_name)

sagemaker-scikit-learn-2024-02-18-22-54-40-811-aws-processing-job


In [71]:
trial_component_description = sm.describe_trial_component(TrialComponentName=trial_component_name)
trial_component_description

{'TrialComponentName': 'sagemaker-scikit-learn-2024-02-18-22-54-40-811-aws-processing-job',
 'TrialComponentArn': 'arn:aws:sagemaker:eu-west-2:123137613716:experiment-trial-component/sagemaker-scikit-learn-2024-02-18-22-54-40-811-aws-processing-job',
 'DisplayName': 'evaluate',
 'Source': {'SourceArn': 'arn:aws:sagemaker:eu-west-2:123137613716:processing-job/sagemaker-scikit-learn-2024-02-18-22-54-40-811',
  'SourceType': 'SageMakerProcessingJob'},
 'Status': {'PrimaryStatus': 'Completed',
  'Message': 'Status: Completed, exit message: null, failure reason: null'},
 'StartTime': datetime.datetime(2024, 2, 18, 22, 59, 2, tzinfo=tzlocal()),
 'EndTime': datetime.datetime(2024, 2, 18, 23, 8, 9, tzinfo=tzlocal()),
 'CreationTime': datetime.datetime(2024, 2, 18, 22, 54, 41, 714000, tzinfo=tzlocal()),
 'CreatedBy': {'UserProfileArn': 'arn:aws:sagemaker:eu-west-2:123137613716:user-profile/d-bftxvdkrbngl/default-20240214t015029',
  'UserProfileName': 'default-20240214t015029',
  'DomainId': 'd-

In [72]:
from sagemaker.lineage.visualizer import LineageTableVisualizer

lineage_table_viz = LineageTableVisualizer(sess)
lineage_table_viz_df = lineage_table_viz.show(processing_job_name=processing_evaluation_metrics_job_name)
lineage_table_viz_df

| | Name/Source | Direction | Type | Association Type | Lineage Type |
|---|---|---|---|---|---|
| 0 | s3://...811/input/code/evaluate_model_metrics.py | Input | DataSet | ContributedTo | artifact |
| 1 | s3://...t-2-123137613716/amazon-reviews-pds/tsv/ | Input | DataSet | ContributedTo | artifact |
| 2 | s3://...training-2024-02-18-22-26-04-112/output/ | Input | DataSet | ContributedTo | artifact |
| 3 | 76497...om/sagemaker-scikit-learn:0.23-1-cpu-py3 | Input | Image | ContributedTo | artifact |
| 4 | s3://...n-2024-02-18-22-54-40-811/output/metrics | Output | DataSet | Produced | artifact |


# Release Resources

In [73]:
%%html

<p><b>Shutting down your kernel for this notebook to release resources.</b></p>
<button class="sm-command-button" data-commandlinker-command="kernelmenu:shutdown" style="display:none;">Shutdown Kernel</button>
        
<script>
try {
    els = document.getElementsByClassName("sm-command-button");
    els[0].click();
}
catch(err) {
    // NoOp
}    
</script>

In [74]:
%%javascript

try {
    Jupyter.notebook.save_checkpoint();
    Jupyter.notebook.session.delete();
}
catch(err) {
    // NoOp
}

<IPython.core.display.Javascript object>