<h1>Model Deployment</h1>

Once you have built and trained the models for feature engineering (using Amazon SageMaker Processing and SKLearn) and binary classification (using the XGBoost open-source container for Amazon SageMaker), you can deploy them by hosting them in a serial [inference pipeline](https://docs.aws.amazon.com/sagemaker/latest/dg/inference-pipelines.html) behind one endpoint.


This notebook demonstrates how to create an inference pipeline composed of the SKLearn model for feature engineering and the XGBoost model for binary classification.

Define the variables.

In [None]:
import sagemaker
import boto3

role = sagemaker.get_execution_role()
region = boto3.Session().region_name
sagemaker_session = sagemaker.Session()
bucket_name = sagemaker_session.default_bucket()
prefix = 'end-to-end-ml'

print(region)
print(role)
print(bucket_name)

## Retrieve model artifacts

First, you need to create two Amazon SageMaker **Model** objects, which associate the serialized training artifacts with the Docker container used for inference. To do that, you need to provide the paths to the serialized models in Amazon S3:
<ul>
    <li>For the SKLearn transform model, in Step 02 (Feature Engineering), you defined the path where the model artifacts are saved.</li>
    <li>For the XGBoost model, you need to find the path using Amazon SageMaker's naming convention, so you use a utility function to get the model artifacts of the last training job matching a specific base job name.</li>
</ul>

In [None]:
from notebook_utilities import get_latest_training_job_name, get_training_job_s3_model_artifacts

# SKLearn model artifacts path.
sklearn_model_path = 's3://{0}/{1}/output/sklearn/model.tar.gz'.format(bucket_name, prefix)

# XGBoost model artifacts path.
training_base_job_name = 'end-to-end-ml-sm-xgb'
latest_training_job_name = get_latest_training_job_name(training_base_job_name)
xgboost_model_path = get_training_job_s3_model_artifacts(latest_training_job_name)

print('SKLearn model path: ' + sklearn_model_path)
print('XGBoost model path: ' + xgboost_model_path)
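
The two helpers imported from `notebook_utilities` are not shown in this notebook. As a rough idea of what they might do, the sketch below looks up the most recent training job whose name contains the base job name and reads its model artifact location. The optional `sm_client` parameter and the function bodies are assumptions made for illustration, not the workshop's actual implementation; `list_training_jobs` and `describe_training_job` are standard boto3 SageMaker client calls.

```python
def get_latest_training_job_name(base_job_name, sm_client=None):
    """Return the name of the newest training job whose name contains base_job_name.

    sm_client is assumed to be a boto3 SageMaker client (hypothetical parameter).
    """
    if sm_client is None:
        import boto3  # imported lazily so the sketch can be exercised with a stub client
        sm_client = boto3.client('sagemaker')
    response = sm_client.list_training_jobs(
        NameContains=base_job_name,
        SortBy='CreationTime',
        SortOrder='Descending',
        MaxResults=1,
    )
    summaries = response['TrainingJobSummaries']
    if not summaries:
        raise ValueError('No training job found matching ' + base_job_name)
    return summaries[0]['TrainingJobName']


def get_training_job_s3_model_artifacts(job_name, sm_client=None):
    """Return the S3 URI of the model archive produced by the given training job."""
    if sm_client is None:
        import boto3
        sm_client = boto3.client('sagemaker')
    description = sm_client.describe_training_job(TrainingJobName=job_name)
    return description['ModelArtifacts']['S3ModelArtifacts']
```

This sketch does not filter by job status, so a job that is still running or has failed could be returned; adding `StatusEquals='Completed'` to the `list_training_jobs` call is one way to restrict the search.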

## SKLearn Featurizer Model

Let's build the model object for the SKLearn model. When building this model object, you provide a custom inference script that processes the inputs and outputs and executes the transform.

The custom inference script, `sklearn_source_dir/inference.py`, defines:

- a custom `input_fn` for pre-processing inference requests. The input function accepts CSV input, loads the input in a Pandas dataframe, and assigns feature column names to the dataframe
- a custom `predict_fn` for running the transform over the inputs
- a custom `output_fn` for returning either JSON or CSV
- a custom `model_fn` for deserializing the model

In [None]:
!pygmentize sklearn_source_dir/inference.py
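
To make the division of labor between the four functions concrete, here is a simplified, dependency-free sketch of how they chain together for one request. It is not the contents of `sklearn_source_dir/inference.py`: the real script works with a Pandas dataframe and a fitted SKLearn transformer, whereas this sketch uses plain lists and a hypothetical `StandInTransformer`. The column names come from the inference example at the end of this notebook; the JSON output shape and the return types are assumptions.

```python
import csv
import io
import json

# Feature column names, taken from the inference example in this notebook.
FEATURE_COLUMNS = ['Type', 'Air temperature [K]', 'Process temperature [K]',
                   'Rotational speed [rpm]', 'Torque [Nm]', 'Tool wear [min]']


class StandInTransformer:
    """Hypothetical stand-in for the fitted SKLearn featurizer."""

    def transform(self, rows):
        # Drop the categorical 'Type' column and cast the rest to float.
        return [[float(row[name]) for name in FEATURE_COLUMNS[1:]] for row in rows]


def model_fn(model_dir):
    # The real script would deserialize the fitted model from model_dir.
    return StandInTransformer()


def input_fn(request_body, content_type):
    # Accept CSV only and attach the feature column names to each row.
    if content_type != 'text/csv':
        raise ValueError('Unsupported content type: ' + content_type)
    reader = csv.reader(io.StringIO(request_body))
    return [dict(zip(FEATURE_COLUMNS, row)) for row in reader if row]


def predict_fn(input_data, model):
    # Run the transform over the parsed input.
    return model.transform(input_data)


def output_fn(prediction, accept):
    # Return the transformed features as JSON or CSV, depending on the accept type.
    if accept == 'application/json':
        return json.dumps({'instances': [{'features': row} for row in prediction]})
    if accept == 'text/csv':
        buffer = io.StringIO()
        csv.writer(buffer, lineterminator='\n').writerows(prediction)
        return buffer.getvalue()
    raise ValueError('Unsupported accept type: ' + accept)


# model_fn runs once when the container starts; the other three run per request.
model = model_fn('/opt/ml/model')
features = predict_fn(input_fn('L,298.4,308.2,1582,70.7,216', 'text/csv'), model)
print(output_fn(features, 'application/json'))
```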

Now, let's create the `SKLearnModel` object by providing the custom script and the path to S3 model artifacts as input.

In [None]:
import time
from sagemaker.sklearn import SKLearnModel

code_location = 's3://{0}/{1}/code'.format(bucket_name, prefix)

sklearn_model = SKLearnModel(name='end-to-end-ml-sm-skl-model-{0}'.format(str(int(time.time()))),
                             model_data=sklearn_model_path,
                             entry_point='inference.py',
                             source_dir='sklearn_source_dir/',
                             code_location=code_location,
                             role=role,
                             sagemaker_session=sagemaker_session,
                             framework_version='0.20.0',
                             py_version='py3')

## XGBoost Model

As in the previous step, create an `XGBoostModel` object and provide a custom inference script.

The inference script, `xgboost_source_dir/inference.py`, defines:

- a custom `input_fn` for pre-processing inference requests. This input function handles JSON requests in addition to all content types supported by the default XGBoost container (see the container's [encoder module](https://github.com/aws/sagemaker-xgboost-container/blob/master/src/sagemaker_xgboost_container/encoder.py) for details). JSON support is needed because JSON is the default content type for container-to-container requests in an inference pipeline.
- a custom `model_fn` for deserializing the model

In [None]:
!pygmentize xgboost_source_dir/inference.py
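
The JSON branch of such an `input_fn` can be pictured as follows. This is a sketch only, not the contents of `xgboost_source_dir/inference.py`: the payload shape `{"instances": [{"features": [...]}]}` is an assumption about what the upstream container sends, and `parse_json_features` is a hypothetical helper. Where a real script would build the input structure expected by XGBoost and delegate other content types to the container's default decoder, this sketch returns a plain list of rows and raises an error.

```python
import json


def parse_json_features(request_body):
    """Turn an assumed {"instances": [{"features": [...]}]} payload into rows of floats."""
    payload = json.loads(request_body)
    return [[float(value) for value in instance['features']]
            for instance in payload['instances']]


def input_fn(request_body, content_type):
    if content_type == 'application/json':
        # Extra branch: the upstream container in the pipeline sends JSON.
        return parse_json_features(request_body)
    # A real script would hand every other content type to the XGBoost
    # container's default decoder instead of failing here.
    raise ValueError('Content type not handled in this sketch: ' + content_type)
```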

Now, let's create the `XGBoostModel` object by providing the custom script and the path to the S3 model artifacts as input.

In [None]:
import time
from sagemaker.xgboost import XGBoostModel

code_location = 's3://{0}/{1}/code'.format(bucket_name, prefix)

xgboost_model = XGBoostModel(name='end-to-end-ml-sm-xgb-model-{0}'.format(str(int(time.time()))),
                             model_data=xgboost_model_path,
                             entry_point='inference.py',
                             source_dir='xgboost_source_dir/',
                             code_location=code_location,
                             framework_version='0.90-2',
                             py_version='py3',
                             role=role, 
                             sagemaker_session=sagemaker_session)

## Pipeline Model

After creating the model objects for the two models, you deploy them in a pipeline by building a `PipelineModel` object and calling the `deploy()` method.

In [None]:
import sagemaker
import time
from sagemaker.pipeline import PipelineModel

pipeline_model_name = 'end-to-end-ml-sm-xgb-skl-pipeline-{0}'.format(str(int(time.time())))

pipeline_model = PipelineModel(
    name=pipeline_model_name, 
    role=role,
    models=[
        sklearn_model, 
        xgboost_model],
    sagemaker_session=sagemaker_session)

endpoint_name = 'end-to-end-ml-sm-pipeline-endpoint-{0}'.format(str(int(time.time())))
print(endpoint_name)

pipeline_model.deploy(initial_instance_count=1, 
                      instance_type='ml.m5.xlarge', 
                      endpoint_name=endpoint_name)
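
If you need to check on the endpoint from outside this notebook, for example before starting the next module, you can poll its status. The helper below is a minimal sketch: `wait_for_endpoint` and its parameters are hypothetical names, and `sm_client` is assumed to be a boto3 SageMaker client such as `boto3.client('sagemaker')`. `describe_endpoint` and the `EndpointStatus` field are part of the standard boto3 API.

```python
import time


def wait_for_endpoint(endpoint_name, sm_client, poll_seconds=30, max_polls=60):
    """Poll the endpoint until it reports InService, or raise if it fails or times out."""
    for _ in range(max_polls):
        description = sm_client.describe_endpoint(EndpointName=endpoint_name)
        status = description['EndpointStatus']
        if status == 'InService':
            return status
        if status == 'Failed':
            reason = description.get('FailureReason', 'unknown reason')
            raise RuntimeError('Endpoint creation failed: ' + reason)
        time.sleep(poll_seconds)
    raise TimeoutError('Endpoint did not reach InService in time: ' + endpoint_name)
```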

<span style="color: red; font-weight:bold">Please take note of the endpoint name, since it will be used in the next workshop module.</span>

## Inference

You can now invoke the pipeline to perform inference on example input values:

In [None]:
from sagemaker.serializers import CSVSerializer
from sagemaker.deserializers import JSONDeserializer
from sagemaker.predictor import Predictor

predictor = Predictor(
    endpoint_name=endpoint_name,
    sagemaker_session=sagemaker_session,
    serializer=CSVSerializer(),
    deserializer=JSONDeserializer())

# Feature order: 'Type', 'Air temperature [K]', 'Process temperature [K]', 'Rotational speed [rpm]', 'Torque [Nm]', 'Tool wear [min]'
payload = "L,298.4,308.2,1582,70.7,216"
print(predictor.predict(payload))

payload = "M,298.4,308.2,1582,30.2,214"
print(predictor.predict(payload))

payload = "L,298.4,308.2,30,70.7,216"
print(predictor.predict(payload))
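
The same request can be sent without the SageMaker Python SDK, which is useful when the caller only has boto3 available. The sketch below is an illustration under stated assumptions: `invoke_pipeline` is a hypothetical helper, and `runtime_client` is assumed to be a boto3 client created with `boto3.client('sagemaker-runtime')`. It sends CSV and asks for JSON, mirroring the serializer and deserializer configured above.

```python
import json


def invoke_pipeline(endpoint_name, csv_payload, runtime_client):
    """Send one CSV record to the endpoint and return the decoded JSON response."""
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType='text/csv',
        Accept='application/json',
        Body=csv_payload,
    )
    # The response body is a stream of bytes; read and decode it before parsing.
    return json.loads(response['Body'].read().decode('utf-8'))
```

For example, `invoke_pipeline(endpoint_name, "L,298.4,308.2,1582,70.7,216", boto3.client('sagemaker-runtime'))` would send the first payload from the cell above.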

In [None]:
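# Uncomment to delete the endpoint once you no longer need it.
# The next workshop module uses the endpoint, so keep it running until then.
#predictor.delete_endpoint()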

After testing the endpoint, you can move to the next workshop module. Please access the module <a href="https://github.com/aws-samples/amazon-sagemaker-build-train-deploy/tree/master/06_API_Gateway_and_Lambda" target="_blank">06_API_Gateway_and_Lambda</a> on GitHub to continue.