## Model Evaluation using SageMaker Processing Job

1. [Introduction](#Introduction)
2. [Prerequisites](#Prerequisites)
3. [Setup](#Setup)
4. [Dataset](#Dataset)
5. [Build a SageMaker Processing Job](#Build-a-SageMaker-Processing-Job)
    1. [Prepare the Script and Docker File](#Prepare-the-Script-and-Docker-File)
    2. [Configure a ScriptProcessor](#Configure-a-ScriptProcessor)
6. [Review Outputs](#Review-Outputs)

## Introduction

Post-processing and model evaluation are important steps for vetting a model before deployment.

In this lab you will use the ScriptProcessor from SageMaker Processing to build a post-processing step that evaluates the model's performance after training.

To set up the ScriptProcessor, we will build a custom container for a model evaluation script that loads the TensorFlow model, loads the test dataset and annotations (from the previous module), runs predictions, and generates the confusion matrix.

**Note: This notebook was tested on the Data Science kernel in SageMaker Studio.**


## Prerequisites

Download the notebook into your environment and run it by executing each cell in order. To understand what's happening, you'll need:

- Access to the SageMaker default S3 bucket.
- Familiarity with Python and NumPy.
- Basic familiarity with Amazon S3.
- Basic understanding of Amazon SageMaker.
- Basic familiarity with the AWS Command Line Interface (CLI). Ideally, you should have it set up with credentials to access the AWS account you're running this notebook from.
- SageMaker Studio, which is preferred for the full UI integration.

## Setup

Set up the environment, load the libraries, and define the parameters used throughout the notebook.

In [3]:
import sagemaker
from sagemaker import get_execution_role
import boto3
import json

role = get_execution_role()
sess = sagemaker.Session()

account = sess.account_id()
region = sess.boto_region_name
bucket = sess.default_bucket() # or use your own custom bucket name
prefix = 'postprocessing-model-evaluation'

## Dataset
The dataset we are using is from [Caltech Birds (CUB 200 2011)](http://www.vision.caltech.edu/visipedia/CUB-200-2011.html). 

Here we use the artifacts from previous labs, so you need to update the S3 locations below for your test images, test data annotation file, and model:

- S3 path for test image data
- S3 path for test data annotation file
- S3 path for the bird classification model

In [4]:
s3_images = f's3://{bucket}/preprocess/outputs/1667290054/test/'
s3_manifest = f's3://{bucket}/preprocess/outputs/1667290054/manifest'
s3_model = f's3://{bucket}/{prefix}/postprocessing-model-evaluation-2022-11-01-09-17-05-235/output'
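Because these three URIs are edited by hand, a quick sanity check before launching the job can save a failed run. The sketch below is an illustration rather than part of the original lab: `split_s3_uri` is a hypothetical helper and `example-bucket` is a placeholder name. It only splits a URI into its bucket and key using the standard library.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/some/key' into (bucket, key). Hypothetical helper."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not a valid S3 URI: {uri!r}")
    # Drop the leading slash so the key can be used directly with S3 APIs
    return parsed.netloc, parsed.path.lstrip("/")


# 'example-bucket' is a placeholder, not a real bucket from this lab
example_bucket, example_key = split_s3_uri(
    "s3://example-bucket/preprocess/outputs/1667290054/test/"
)
print(example_bucket, example_key)
```

The resulting bucket and key prefix could then be passed to a boto3 call such as `list_objects_v2` to confirm that the prefix is not empty.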

## Build a SageMaker Processing Job

### Prepare the Script and Docker File
With SageMaker, you can run data processing jobs using the SKLearnProcessor, processors for popular ML frameworks, Apache Spark, or your own container (BYOC). To learn more, see [SageMaker Processing](https://docs.aws.amazon.com/sagemaker/latest/dg/processing-job.html).

For this example we will practice using ScriptProcessor with Bring Our Own Container (BYOC). ScriptProcessor requires a container image URI from ECR and a custom script to run.

#### Preparing the script

Please inspect the provided [evaluation.py](evaluation.py) script.

Here is what the script does:
1. Loads the TensorFlow model.
2. Loops through the annotation file to run inference predictions.
3. Tallies the results using scikit-learn and generates the confusion matrix.
4. Saves the metrics to an evaluation.json report as output.
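To make steps 3 and 4 concrete, here is a minimal sketch of tallying predictions into a report. It is not the code from evaluation.py: `build_report` is a hypothetical function, the labels and predictions are made-up toy data, and plain Python stands in for the scikit-learn calls so that the sketch runs without extra packages. The report layout mirrors the `multiclass_classification_metrics` structure printed in the Review Outputs section, and it assumes the outer confusion matrix key is the actual label and the inner key is the predicted label.

```python
import json
from collections import Counter


def build_report(y_true, y_pred, labels):
    """Tally (actual, predicted) pairs into a nested-dict report. Hypothetical helper."""
    counts = Counter(zip(y_true, y_pred))
    # Assumption: outer key = actual label, inner key = predicted label
    confusion = {
        actual: {predicted: counts[(actual, predicted)] for predicted in labels}
        for actual in labels
    }
    correct = sum(counts[(label, label)] for label in labels)
    accuracy = correct / len(y_true) if y_true else 0.0
    return {
        "multiclass_classification_metrics": {
            "accuracy": {"value": accuracy, "standard_deviation": "NaN"},
            "confusion_matrix": confusion,
        }
    }


# Toy data for illustration only
toy_labels = ["013.Bobolink", "017.Cardinal"]
toy_true = ["013.Bobolink", "013.Bobolink", "017.Cardinal", "017.Cardinal"]
toy_pred = ["013.Bobolink", "017.Cardinal", "017.Cardinal", "017.Cardinal"]

report = build_report(toy_true, toy_pred, toy_labels)
print(json.dumps(report, indent=2))
```

In a processing job, a report like this would be written under the container's output directory so that SageMaker uploads it to S3 when the job finishes.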

#### Bring Our Own Container (BYOC)
Below we build a custom Docker container and push it to Amazon Elastic Container Registry (ECR).

You can use the standard TensorFlow container, but ScriptProcessor currently does not support `source_dir` for a custom requirements.txt and multiple Python files. That is on the roadmap; please follow this [thread](https://github.com/aws/sagemaker-python-sdk/issues/1248) for updates.

In [5]:
!mkdir -p docker


In [35]:
%%writefile docker/requirements.txt
# This is the set of Python packages that get pip installed
# when the Docker image for the processing job is built.
Pillow
scikit-learn
pandas
numpy
tensorflow==2.10
boto3==1.18.4
sagemaker-experiments
matplotlib==3.4.2
seaborn

Overwriting docker/requirements.txt


In [48]:
%%writefile docker/Dockerfile

FROM public.ecr.aws/docker/library/python:3.7
    
ADD requirements.txt /

RUN pip3 install -r requirements.txt

ENV PYTHONUNBUFFERED=TRUE 
ENV TF_CPP_MIN_LOG_LEVEL="2"

ENTRYPOINT ["python3"]

Overwriting docker/Dockerfile


The easiest way to build a container image and push it to ECR is to use the SageMaker Studio Image Build CLI. This requires certain permissions on your SageMaker execution role, which are already provided in this setup.

If you need to update your role policy, see this [blog](https://aws.amazon.com/blogs/machine-learning/using-the-amazon-sagemaker-studio-image-build-cli-to-build-container-images-from-your-studio-notebooks/) for more information on using the Amazon SageMaker Studio Image Build CLI to build container images from your Studio notebooks.

In [49]:
!pip install sagemaker-studio-image-build

[notice] A new release of pip available: 22.2.2 -> 22.3
[notice] To update, run: pip install --upgrade pip


In [50]:
container_name = "sagemaker-tf-container"
container_version = "2.0"
!cd docker && sm-docker build . --file Dockerfile --repository $container_name:$container_version
    
ecr_image = "{}.dkr.ecr.{}.amazonaws.com/{}:{}".format(account, region, container_name, container_version)

....[Container] 2022/11/01 11:20:45 going inside waitForAgent

[Container] 2022/11/01 11:20:45 Waiting for agent ping
[Container] 2022/11/01 11:20:46 Waiting for DOWNLOAD_SOURCE
[Container] 2022/11/01 11:20:48 Phase is DOWNLOAD_SOURCE
[Container] 2022/11/01 11:20:48 finished waitForAgent
[Container] 2022/11/01 11:20:48 inside CopySrc
[Container] 2022/11/01 11:20:48 CODEBUILD_SRC_DIR=/codebuild/output/src948061306/src
[Container] 2022/11/01 11:20:48 finished CopySrc
[Container] 2022/11/01 11:20:48 YAML location is /codebuild/output/src948061306/src/buildspec.yml
[Container] 2022/11/01 11:20:48 Setting HTTP client timeout to higher timeout for S3 source
[Container] 2022/11/01 11:20:48 Processing environment variables
[Container] 2022/11/01 11:20:48 No runtime version selected in buildspec.
[Container] 2022/11/01 11:20:48 Moving to directory /codebuild/output/src948061306/src
[Container] 2022/11/01 11:20:49 Configuring ssm agent with target id: codebuild:29026d2d-6c5d-4265-ab78-ed2015436f

### Configure a ScriptProcessor

1) Copy the ECR URI from the step above.

2) Initialize the processor (instance count, instance type, etc.).

3) Run the processing job (define the script path, input arguments, and input and output file locations).

Note: we are not using a GPU, so you can ignore the CUDA warning message. You can add the corresponding libraries to your Dockerfile if you want to use GPU acceleration.

In [61]:
from sagemaker.processing import ScriptProcessor, ProcessingInput, ProcessingOutput

# Image built and pushed to ECR in the previous step
image_uri = ecr_image

# S3 prefix where the job uploads its evaluation outputs
s3_evaluation_output = f's3://{bucket}/{prefix}/outputs/evaluation'

# CPU-only instance; role, bucket, and prefix come from the Setup cell
script_processor = ScriptProcessor(base_job_name=prefix,
                                   command=['python3'],
                                   image_uri=image_uri,
                                   role=role,
                                   instance_count=1,
                                   instance_type='ml.m5.xlarge')

In [62]:
script_processor.run(
                        code='evaluation.py',
                        arguments=["--model-file", "model.tar.gz"],
                        inputs=[ProcessingInput(source=s3_images, 
                                                destination="/opt/ml/processing/input/test"),
                                ProcessingInput(source=s3_manifest, 
                                                destination="/opt/ml/processing/input/manifest"),
                                ProcessingInput(source=s3_model, 
                                                destination="/opt/ml/processing/model"),
                               ],
                        outputs=[
                            ProcessingOutput(output_name="evaluation", source="/opt/ml/processing/evaluation", 
                                             destination=s3_evaluation_output),
                        ]
                    )


Job Name:  postprocessing-model-evaluation-2022-11-01-12-22-48-558
Inputs:  [{'InputName': 'input-1', 'AppManaged': False, 'S3Input': {'S3Uri': 's3://sagemaker-us-east-1-628084464172/preprocess/outputs/1667290054/test/', 'LocalPath': '/opt/ml/processing/input/test', 'S3DataType': 'S3Prefix', 'S3InputMode': 'File', 'S3DataDistributionType': 'FullyReplicated', 'S3CompressionType': 'None'}}, {'InputName': 'input-2', 'AppManaged': False, 'S3Input': {'S3Uri': 's3://sagemaker-us-east-1-628084464172/preprocess/outputs/1667290054/manifest', 'LocalPath': '/opt/ml/processing/input/manifest', 'S3DataType': 'S3Prefix', 'S3InputMode': 'File', 'S3DataDistributionType': 'FullyReplicated', 'S3CompressionType': 'None'}}, {'InputName': 'input-3', 'AppManaged': False, 'S3Input': {'S3Uri': 's3://sagemaker-us-east-1-628084464172/postprocessing-model-evaluation/postprocessing-model-evaluation-2022-11-01-09-17-05-235/output', 'LocalPath': '/opt/ml/processing/model', 'S3DataType': 'S3Prefix', 'S3InputMode': 
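The `arguments` and `destination` values passed to `run` determine what the script sees inside the container. As a rough sketch of how a script might consume them (the actual evaluation.py may differ), the code below parses `--model-file`, joins it with the model input path from the job definition, and unpacks the archive. `parse_args` and `extract_model` are hypothetical names, and the default value for `--model-file` is an assumption.

```python
import argparse
import os
import tarfile

# Container path that matches the model ProcessingInput destination above
MODEL_DIR = "/opt/ml/processing/model"


def parse_args(argv=None):
    """Parse the arguments that ScriptProcessor.run passes to the script."""
    parser = argparse.ArgumentParser()
    # The default is an assumption; the job above passes the value explicitly
    parser.add_argument("--model-file", type=str, default="model.tar.gz")
    return parser.parse_args(argv)


def extract_model(archive_path, target_dir):
    """Unpack a gzip-compressed model archive into target_dir."""
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(path=target_dir)
    return target_dir


# Same argument list as in the run() call above
args = parse_args(["--model-file", "model.tar.gz"])
archive_path = os.path.join(MODEL_DIR, args.model_file)
print(archive_path)
```

After extraction, a script of this kind would typically load the saved model with TensorFlow; that step is left out so the sketch runs without TensorFlow installed.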

## Review Outputs

At the end of the lab, the processing job will have generated a JSON file containing the performance metrics (accuracy, precision, recall, F1, and confusion matrix) for your test dataset. Run the cell below to review the output.

In [63]:
import pprint as pp
s3 = boto3.resource('s3')
eval_matrix_key = f'{prefix}/outputs/evaluation/evaluation.json'
content_object = s3.Object(bucket, eval_matrix_key)
file_content = content_object.get()['Body'].read().decode('utf-8')
json_content = json.loads(file_content)

pp.pprint(json_content['multiclass_classification_metrics'])

{'accuracy': {'standard_deviation': 'NaN', 'value': 0.9895833333333334},
 'confusion_matrix': {'013.Bobolink': {'013.Bobolink': 11,
                                       '017.Cardinal': 0,
                                       '035.Purple_Finch': 1,
                                       '036.Northern_Flicker': 0,
                                       '047.American_Goldfinch': 0,
                                       '068.Ruby_throated_Hummingbird': 0,
                                       '073.Blue_Jay': 0,
                                       '087.Mallard': 0},
                      '017.Cardinal': {'013.Bobolink': 0,
                                       '017.Cardinal': 12,
                                       '035.Purple_Finch': 0,
                                       '036.Northern_Flicker': 0,
                                       '047.American_Goldfinch': 0,
                                       '068.Ruby_throated_Hummingbird': 0,
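The nested confusion matrix can also be used to derive per-class numbers by hand. The following is a minimal sketch, assuming the outer key is the actual class and the inner key is the predicted class (the output above does not state the orientation). `per_class_metrics` is a hypothetical helper, and the two-class matrix is toy data rather than the real job output.

```python
def per_class_metrics(confusion):
    """Compute precision and recall per class from a nested-dict confusion matrix."""
    labels = list(confusion)
    metrics = {}
    for label in labels:
        true_positive = confusion[label].get(label, 0)
        # Row sum: all samples whose actual class is `label`
        actual_total = sum(confusion[label].values())
        # Column sum: all samples predicted as `label`
        predicted_total = sum(confusion[other].get(label, 0) for other in labels)
        metrics[label] = {
            "precision": true_positive / predicted_total if predicted_total else 0.0,
            "recall": true_positive / actual_total if actual_total else 0.0,
        }
    return metrics


# Toy matrix for illustration only
toy_confusion = {
    "013.Bobolink": {"013.Bobolink": 11, "017.Cardinal": 1},
    "017.Cardinal": {"013.Bobolink": 0, "017.Cardinal": 12},
}

for name, values in per_class_metrics(toy_confusion).items():
    print(name, values)
```

With the real report, `json_content['multiclass_classification_metrics']['confusion_matrix']` could be passed in place of the toy matrix.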
                                   

You can also check the confusion matrix output by running the cell below.

In [64]:
cf_matrix_file = f's3://{bucket}/{prefix}/outputs/evaluation/confusion_matrix.png'
!aws s3 cp $cf_matrix_file .

download: s3://sagemaker-us-east-1-628084464172/postprocessing-model-evaluation/outputs/evaluation/confusion_matrix.png to ./confusion_matrix.png


![confusion_matrix.png](confusion_matrix.png)