# Amazon SageMaker Workshop
### _**Evaluation**_

---
In this part of the workshop, we will take the model we trained earlier to predict mobile customer departure (churn) and evaluate its performance on a test dataset.

---

## Contents

1. [Background](#Background) - Getting the model trained in the previous lab.
2. [Evaluate](#Evaluate)
    * Creating a script to evaluate the model
    * Using [SageMaker Processing](https://docs.aws.amazon.com/sagemaker/latest/dg/processing-job.html) jobs to automate evaluation of models

---

## Background

In the previous [Modeling](../2-Modeling/modeling.ipynb) lab, we trained models by creating multiple SageMaker training jobs.

Install and import some packages we'll need for this lab:

In [None]:
import sys
!{sys.executable} -m pip install sagemaker==2.42.0 -U
!{sys.executable} -m pip install xgboost==1.2.1

In [None]:
import boto3
import sagemaker
from sagemaker.s3 import S3Uploader, S3Downloader

In [None]:
region = boto3.Session().region_name
sm_sess = sagemaker.session.Session()
role = sagemaker.get_execution_role()

Get the variables from initial setup:

In [None]:
%store -r bucket

In [None]:
%store -r prefix

In [None]:
bucket, prefix

### - if you skipped the previous lab

Use the pre-trained model in the config directory (`config/model.tar.gz`).

In [None]:
## Uncomment if you skipped previous lab
# !cp config/model.tar.gz ./

In [None]:
## Uncomment if you skipped previous lab
# model_s3_uri = S3Uploader.upload("model.tar.gz", f"s3://{bucket}/{prefix}/model")

### - if you have done the previous lab

Download the model from S3:

In [None]:
# Get name of training job and other variables
%store -r training_job_name

In [None]:
training_job_name

In [None]:
estimator = sagemaker.estimator.Estimator.attach(training_job_name)
model_s3_uri = estimator.model_data
print("\nmodel_s3_uri =",model_s3_uri)

In [None]:
S3Downloader.download(model_s3_uri, ".")

---

## Evaluate

Let's create a simple evaluation with two scikit-learn metrics: [Area Under the ROC Curve (AUC)](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_auc_score.html) and [Accuracy](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.accuracy_score.html).

In [None]:
import json
import os
import tarfile
import logging
import pickle

import pandas as pd
import xgboost

from sklearn.metrics import classification_report, roc_auc_score, accuracy_score


model_path = "model.tar.gz"
with tarfile.open(model_path) as tar:
    tar.extractall(path=".")

print("Loading xgboost model.")
model = pickle.load(open("xgboost-model", "rb"))
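
Before extracting, it can help to look at what the archive actually contains. The following is a minimal sketch using only the standard library. The archive it builds is a stand-in with a dummy `xgboost-model` entry, not the real model artifact, and `list_archive_members` is a hypothetical helper name.

```python
import io
import os
import tarfile
import tempfile


def list_archive_members(archive_path):
    """Return the names of all entries in a tar archive without extracting it."""
    # tarfile.open() in its default read mode detects gzip compression itself.
    with tarfile.open(archive_path) as tar:
        return tar.getnames()


# Build a small stand-in archive so the sketch runs without the real model.
workdir = tempfile.mkdtemp()
demo_archive = os.path.join(workdir, "model.tar.gz")
payload = b"placeholder bytes, not a real model"
with tarfile.open(demo_archive, "w:gz") as tar:
    info = tarfile.TarInfo(name="xgboost-model")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))

members = list_archive_members(demo_archive)
print(members)
```

Pointing `list_archive_members` at the real `model.tar.gz` should show which file name to pass to `pickle.load`.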

In [None]:
print("Loading test input data")
test_path = "config/test-dataset.csv"
df = pd.read_csv(test_path, header=None)
df

In [None]:
print("Reading test data.")
y_test = df.iloc[:, 0].to_numpy()
df.drop(df.columns[0], axis=1, inplace=True)
X_test = xgboost.DMatrix(df.values)
X_test

In [None]:
print("Performing predictions against test data.")
predictions = model.predict(X_test)

print("Creating classification evaluation report")
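# Accuracy needs hard 0/1 labels, so round the predicted scores.
acc = accuracy_score(y_test, predictions.round())
# ROC AUC is a ranking metric: pass the raw scores rather than rounded labels
# (this assumes the model outputs probabilities, as the rounding above implies).
auc = roc_auc_score(y_test, predictions)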

print("Accuracy =", acc)
print("AUC =", auc)
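
To make the two metrics concrete, here is a small sketch that computes them by hand on made-up labels and scores. It deliberately uses plain Python rather than scikit-learn, and the helper names and data are hypothetical. It also illustrates that ROC AUC depends on how the scores rank positives against negatives, so rounding the scores to 0/1 first discards information.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of examples whose thresholded score matches the label."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for y, p in zip(labels, predicted) if y == p)
    return correct / len(labels)


def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is quadratic in the number
    of examples, which is fine for a toy illustration.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up data for illustration only.
labels = [0, 0, 1, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.6, 0.9]
hard_labels = [1.0 if s >= 0.5 else 0.0 for s in scores]

acc_value = accuracy(labels, scores)
auc_raw = roc_auc(labels, scores)
auc_rounded = roc_auc(labels, hard_labels)

print("accuracy:", acc_value)
print("AUC on raw scores:", auc_raw)
print("AUC on rounded scores:", auc_rounded)
```

On this toy data the AUC computed from rounded scores is lower than the AUC computed from raw scores, because rounding creates ties between examples the model had ranked correctly.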

### Creating a classification report

Now, let's save the results in a JSON file, following the structure defined in SageMaker docs:
https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-model-quality-metrics.html

We'll use this logic later in [Lab 6-Pipelines](../6-Pipelines/pipelines.ipynb):

In [None]:
import pprint
# The metrics reported can change based on the model used - check the link for the documentation 
report_dict = {
    "binary_classification_metrics": {
        "accuracy": {
            "value": acc,
            "standard_deviation": "NaN",
        },
        "auc": {"value": auc, "standard_deviation": "NaN"},
    },
}

print("Classification report:")
pprint.pprint(report_dict)

In [None]:
evaluation_output_path = os.path.join(
    ".", "evaluation.json"
)
print("Saving classification report to {}".format(evaluation_output_path))

with open(evaluation_output_path, "w") as f:
    f.write(json.dumps(report_dict))
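
Because the same structure is needed again in the evaluation script, it could be wrapped in a small helper. This is a sketch only: `build_report` is a hypothetical name and the metric values are placeholders. The `float()` casts are a precaution, since some NumPy scalar types (for example `numpy.float32`) are not JSON serializable.

```python
import json


def build_report(accuracy, auc):
    """Assemble the metrics dictionary in the layout used above."""
    return {
        "binary_classification_metrics": {
            "accuracy": {"value": float(accuracy), "standard_deviation": "NaN"},
            "auc": {"value": float(auc), "standard_deviation": "NaN"},
        },
    }


# Placeholder values, not results from the workshop model.
report = build_report(accuracy=0.95, auc=0.9)

# Round-trip through JSON to confirm the report serializes cleanly.
serialized = json.dumps(report)
restored = json.loads(serialized)
print(json.dumps(restored, indent=2))
```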

---

## OK, now we have working code. Let's put it in a Python script

In [None]:
%%writefile evaluate.py
"""Evaluation script for measuring model accuracy."""

import json
import os
import tarfile
import logging
import pickle

import pandas as pd
import xgboost

logger = logging.getLogger()
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler())

# May need to import additional metrics depending on what you are measuring.
# See https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-model-quality-metrics.html
from sklearn.metrics import classification_report, roc_auc_score, accuracy_score


if __name__ == "__main__":
    model_path = "/opt/ml/processing/model/model.tar.gz"
    with tarfile.open(model_path) as tar:
        # Extract into the working directory, where the model file is opened below.
        tar.extractall(path=".")

    # The logger level is INFO, so use info() to make these messages visible.
    logger.info("Loading xgboost model.")
    with open("xgboost-model", "rb") as model_file:
        model = pickle.load(model_file)

    logger.info("Loading test input data.")
    test_path = "/opt/ml/processing/test/test-dataset.csv"
    df = pd.read_csv(test_path, header=None)

    logger.info("Reading test data.")
    y_test = df.iloc[:, 0].to_numpy()
    df.drop(df.columns[0], axis=1, inplace=True)
    X_test = xgboost.DMatrix(df.values)

    logger.info("Performing predictions against test data.")
    predictions = model.predict(X_test)

    print("Creating classification evaluation report")
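    # Accuracy needs hard 0/1 labels, so round the predicted scores.
    acc = accuracy_score(y_test, predictions.round())
    # ROC AUC is a ranking metric: pass the raw scores rather than rounded labels
    # (this assumes the model outputs probabilities, as the rounding above implies).
    auc = roc_auc_score(y_test, predictions)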

    # The metrics reported can change based on the model used, but it must be a specific name per (https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-model-quality-metrics.html)
    report_dict = {
        "binary_classification_metrics": {
            "accuracy": {
                "value": acc,
                "standard_deviation": "NaN",
            },
            "auc": {"value": auc, "standard_deviation": "NaN"},
        },
    }

    print("Classification report:\n{}".format(report_dict))

    evaluation_output_path = os.path.join(
        "/opt/ml/processing/evaluation", "evaluation.json"
    )
    print("Saving classification report to {}".format(evaluation_output_path))

    with open(evaluation_output_path, "w") as f:
        f.write(json.dumps(report_dict))
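
The script hard-codes the container paths, so it cannot be run unchanged outside a Processing job. One possible variation, sketched here on the assumption that you also want to run it locally, is to accept the paths as command-line arguments with the container locations as defaults. The flag names are hypothetical and are not part of the workshop script.

```python
import argparse


def parse_args(argv=None):
    """Parse input/output locations, defaulting to the Processing container paths."""
    parser = argparse.ArgumentParser(description="Evaluate a trained model.")
    parser.add_argument(
        "--model-path", default="/opt/ml/processing/model/model.tar.gz"
    )
    parser.add_argument(
        "--test-path", default="/opt/ml/processing/test/test-dataset.csv"
    )
    parser.add_argument("--output-dir", default="/opt/ml/processing/evaluation")
    return parser.parse_args(argv)


# No arguments: behaves like the hard-coded script inside the container.
container_args = parse_args([])

# Explicit arguments: point the same code at local files.
local_args = parse_args(
    [
        "--model-path", "model.tar.gz",
        "--test-path", "config/test-dataset.csv",
        "--output-dir", ".",
    ]
)

print(container_args.model_path)
print(local_args.test_path)
```

With this design the job would keep working without extra configuration. Any overrides could be passed through the `arguments` parameter of `processor.run`.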


---

## OK, now let's finally run this script with a simple call to SageMaker Processing!

In [None]:
framework_version = '1.2-2'

docker_image_name = sagemaker.image_uris.retrieve(framework='xgboost', region=region, version=framework_version)
docker_image_name

In [None]:
from sagemaker.processing import (
    ProcessingInput,
    ProcessingOutput,
    ScriptProcessor,
)

In [None]:
# Processing step for evaluation
processor = sagemaker.processing.ScriptProcessor(
    image_uri=docker_image_name,
    command=["python3"],
    instance_type="ml.m5.xlarge",
    instance_count=1,
    base_job_name="CustomerChurn-eval-script",  # job names allow only letters, digits, and hyphens
    sagemaker_session=sm_sess,
    role=role,
)

In [None]:
entrypoint = "evaluate.py"

In [None]:
# Upload test dataset to S3
test_s3_uri = S3Uploader.upload("config/test-dataset.csv", f"s3://{bucket}/{prefix}/test")
test_s3_uri

In [None]:
processor.run(
    code=entrypoint,
    inputs=[
        sagemaker.processing.ProcessingInput(
            source=model_s3_uri,
            destination="/opt/ml/processing/model",
        ),
        sagemaker.processing.ProcessingInput(
            source=test_s3_uri,
            destination="/opt/ml/processing/test",
        ),
    ],
    outputs=[
        sagemaker.processing.ProcessingOutput(
            output_name="evaluation", source="/opt/ml/processing/evaluation"
        ),
    ],
    job_name="CustomerChurnEval"
)
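
Processing job names have to be unique within an account and Region, so re-running the cell above with the same fixed `job_name` is likely to fail. Below is a minimal sketch of one way around this, using a hypothetical helper `unique_job_name` that appends a UTC timestamp. The 63-character limit and the allowed character set reflect my understanding of the SageMaker naming rules and should be checked against the API reference.

```python
import re
from datetime import datetime, timezone


def unique_job_name(base, now=None):
    """Return `base` plus a UTC timestamp, sanitized for use as a job name."""
    now = now or datetime.now(timezone.utc)
    suffix = now.strftime("%Y-%m-%d-%H-%M-%S")
    # Replace anything other than letters, digits, and hyphens with a hyphen.
    safe_base = re.sub(r"[^a-zA-Z0-9-]+", "-", base).strip("-")
    # Leave room for the separator and the timestamp within 63 characters.
    max_base_len = 63 - len(suffix) - 1
    safe_base = safe_base[:max_base_len].rstrip("-")
    return f"{safe_base}-{suffix}"


# A fixed time is used here only to make the example reproducible.
fixed_time = datetime(2021, 6, 1, 12, 30, 0, tzinfo=timezone.utc)
example_name = unique_job_name("CustomerChurnEval", now=fixed_time)
print(example_name)
```

The SageMaker Python SDK also provides `sagemaker.utils.name_from_base` for a similar purpose. Alternatively, omitting `job_name` lets the processor derive a name from `base_job_name`.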

If everything went well, the SageMaker Processing job should have created the JSON file with our model's evaluation report and saved it in S3.

### Let's check the evaluation report from S3!

In [None]:
out_s3_report_uri = processor.latest_job.outputs[0].destination
out_s3_report_uri

In [None]:
reports_list = S3Downloader.list(out_s3_report_uri)
reports_list

In [None]:
report = S3Downloader.read_file(reports_list[0])

print("=====Model Report====")
print(json.dumps(json.loads(report.split('\n')[0]), indent=2))
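
Once the report is parsed, individual metrics can be pulled out and compared against a minimum bar, which is the kind of check a later pipeline step might perform. In this sketch a hard-coded report string stands in for the downloaded file. The threshold values and the helper `meets_thresholds` are hypothetical.

```python
import json


def meets_thresholds(report_text, thresholds):
    """Return True if every listed metric is at or above its minimum value."""
    metrics = json.loads(report_text)["binary_classification_metrics"]
    return all(
        metrics[name]["value"] >= minimum for name, minimum in thresholds.items()
    )


# Stand-in for the text returned by S3Downloader.read_file; values are made up.
sample_report = json.dumps(
    {
        "binary_classification_metrics": {
            "accuracy": {"value": 0.95, "standard_deviation": "NaN"},
            "auc": {"value": 0.85, "standard_deviation": "NaN"},
        }
    }
)

passes_loose_bar = meets_thresholds(sample_report, {"accuracy": 0.9, "auc": 0.8})
passes_strict_bar = meets_thresholds(sample_report, {"auc": 0.9})
print(passes_loose_bar, passes_strict_bar)
```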