# Deploy our ML Model

**SageMaker Studio Kernel**: Data Science

In this exercise you will:
 - Retrieve the latest approved model version from the Amazon SageMaker Model Registry
 - Deploy the model to an Amazon SageMaker real-time endpoint
 - Test the endpoint with a sample request

***

## Part 1/4 - Setup
Here we'll import some libraries and define some variables.

### Import required modules

In [None]:
import boto3
from botocore.exceptions import ClientError
from datetime import datetime
import logging
from sagemaker.model_monitor import DataCaptureConfig
import sagemaker.session
from sagemaker.tensorflow import TensorFlowModel
import traceback

In [None]:
s3_client = boto3.client("s3")
sagemaker_client = boto3.client("sagemaker")

In [None]:
logging.basicConfig(level=logging.INFO)
LOGGER = logging.getLogger(__name__)

***

## Part 2/4 - Model Package Definition
In this step, we retrieve model information from the Amazon SageMaker Model Registry.

### Get Approved Model Packages

This cell returns the latest approved model package from the specified model package group.

In [None]:
model_package_group = "ml-end-to-end-group"

In [None]:
try:
    # Get the latest approved model package
    response = sagemaker_client.list_model_packages(
        ModelPackageGroupName=model_package_group,
        ModelApprovalStatus="Approved",
        SortBy="CreationTime",
        SortOrder="Descending",
        MaxResults=1,
    )
    approved_packages = response["ModelPackageSummaryList"]

    # Return error if no packages found
    if len(approved_packages) == 0:
        error_message = ("No approved ModelPackage found for ModelPackageGroup: {}".format(model_package_group))
        LOGGER.error("{}".format(error_message))

        raise Exception(error_message)

    model_package = approved_packages[0]
    LOGGER.info("Identified the latest approved model package: {}".format(model_package))
except ClientError as e:
    stacktrace = traceback.format_exc()
    error_message = e.response["Error"]["Message"]
    LOGGER.error("{}".format(stacktrace))

    raise Exception(error_message)
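
In the cell above, the filtering and sorting are done by the service. As a rough illustration of the same selection logic, the sketch below applies it to a hand-written list shaped like `ModelPackageSummaryList`. The `latest_approved` helper, the sample ARNs, and the dates are hypothetical and are not part of the notebook or of the SageMaker API.

```python
from datetime import datetime


def latest_approved(summaries):
    """Return the most recently created approved summary, or None if there is none."""
    approved = [s for s in summaries if s.get("ModelApprovalStatus") == "Approved"]
    if not approved:
        return None
    return max(approved, key=lambda s: s["CreationTime"])


# Hypothetical data shaped like response["ModelPackageSummaryList"]
arn_prefix = "arn:aws:sagemaker:eu-west-1:111122223333:model-package/ml-end-to-end-group"
sample_summaries = [
    {"ModelPackageArn": arn_prefix + "/1",
     "ModelApprovalStatus": "Approved",
     "CreationTime": datetime(2023, 1, 10)},
    {"ModelPackageArn": arn_prefix + "/2",
     "ModelApprovalStatus": "Approved",
     "CreationTime": datetime(2023, 2, 5)},
    {"ModelPackageArn": arn_prefix + "/3",
     "ModelApprovalStatus": "PendingManualApproval",
     "CreationTime": datetime(2023, 3, 1)},
]

selected = latest_approved(sample_summaries)
print(selected["ModelPackageArn"])
```

Version 3 is newer but not approved, so the sketch selects version 2, which mirrors what `ModelApprovalStatus="Approved"` combined with a descending sort and `MaxResults=1` is meant to achieve.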

### Describe the Model Package

This cell retrieves the full description of the selected model package, including the location of its model artifact.

In [None]:
model_package_arn = model_package["ModelPackageArn"]

In [None]:
try:
    model_package = sagemaker_client.describe_model_package(
        ModelPackageName=model_package_arn
    )

    LOGGER.info("{}".format(model_package))

    if len(model_package) == 0:
        error_message = ("No ModelPackage found for: {}".format(model_package_arn))
        LOGGER.error("{}".format(error_message))

        raise Exception(error_message)
except ClientError as e:
    stacktrace = traceback.format_exc()
    error_message = e.response["Error"]["Message"]
    LOGGER.error("{}".format(stacktrace))

    raise Exception(error_message)
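
Later cells read the model artifact location from `model_package["InferenceSpecification"]["Containers"][0]["ModelDataUrl"]`. A minimal sketch of a defensive accessor follows, assuming a dictionary shaped like the `describe_model_package` response. The `get_model_data_url` helper and the sample bucket name are hypothetical.

```python
def get_model_data_url(package_description):
    """Return the model artifact S3 URI from a describe_model_package-style dict."""
    containers = package_description.get("InferenceSpecification", {}).get("Containers", [])
    if not containers:
        raise ValueError("Model package has no inference containers")
    url = containers[0].get("ModelDataUrl")
    if not url:
        raise ValueError("First container has no ModelDataUrl")
    return url


# Hypothetical description containing only the fields used here
sample_description = {
    "InferenceSpecification": {
        "Containers": [{"ModelDataUrl": "s3://example-bucket/models/model.tar.gz"}]
    }
}

print(get_model_data_url(sample_description))
```

Raising a `ValueError` with a specific message makes a malformed package easier to diagnose than the `KeyError` or `IndexError` that direct indexing would produce.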

***

## Part 3/4 - Deploy an Amazon SageMaker Endpoint
Here we deploy an Amazon SageMaker endpoint using the ML model retrieved from the Model Registry.

In [None]:
region = boto3.session.Session().region_name

# Look up the account ID once and reuse it for the role and KMS ARNs
account_id = boto3.client("sts").get_caller_identity()["Account"]

role_name = "mlops-sagemaker-execution-role"
role = "arn:aws:iam::{}:role/{}".format(account_id, role_name)

kms_account_id = account_id

kms_alias = "ml-kms"

# Fill in the names of your S3 buckets before running the cells below
bucket_artifacts = ""
bucket_inference = ""

inference_artifact_path = "artifact/inference"
inference_artifact_name = "sourcedir.tar.gz"
inference_instance_count = 1
inference_instance_type = "ml.m5.xlarge"

model_package_group = "ml-end-to-end-group"


monitoring_output_path = "data/monitoring/captured"

# Keep the version as a string: a float such as 2.10 would be rendered as "2.1"
training_framework_version = "2.4"

In [None]:
kms_key = "arn:aws:kms:{}:{}:alias/{}".format(region, kms_account_id, kms_alias)
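
The role and KMS key ARNs above follow the general layout `arn:aws:<service>:<region>:<account>:<resource>`, with IAM ARNs leaving the region empty. A small sketch of a helper that assembles both is shown below. The `build_arn` function, the account ID, and the region are placeholders for illustration, and the partition is assumed to be `aws`.

```python
def build_arn(service, resource, account_id, region=""):
    """Assemble an ARN in the standard 'aws' partition (assumed here)."""
    return "arn:aws:{}:{}:{}:{}".format(service, region, account_id, resource)


example_account = "111122223333"  # placeholder account ID

# IAM is a global service, so the region field stays empty
example_role = build_arn("iam", "role/mlops-sagemaker-execution-role", example_account)

# KMS aliases are regional
example_kms = build_arn("kms", "alias/ml-kms", example_account, region="eu-west-1")

print(example_role)
print(example_kms)
```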

In [None]:
boto_session = boto3.Session(region_name=region)

sagemaker_client = boto_session.client("sagemaker")
runtime_client = boto_session.client("sagemaker-runtime")

sagemaker_session = sagemaker.session.Session(
    boto_session=boto_session,
    sagemaker_client=sagemaker_client,
    sagemaker_runtime_client=runtime_client,
    default_bucket=bucket_inference
)

### Compress the source code for installing additional Python modules

In [None]:
! ./../algorithms/buildspec.sh inference $bucket_artifacts

In [None]:
inference_source_dir = "s3://{}/{}/{}".format(
    bucket_inference,
    inference_artifact_path,
    inference_artifact_name
)

print(inference_source_dir)
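
Because the bucket variables start out as empty strings, the formatted value can silently become `s3:///artifact/inference/sourcedir.tar.gz`. One way to catch this early is a small URI builder that rejects an empty bucket. This is only a sketch: `s3_uri` is a hypothetical helper and `example-bucket` is a placeholder name.

```python
def s3_uri(bucket, *key_parts):
    """Join a bucket and key parts into an s3:// URI, rejecting an empty bucket."""
    if not bucket:
        raise ValueError("Bucket name is empty; set it before building S3 URIs")
    # Drop empty parts and stray slashes so the parts join cleanly
    key = "/".join(part.strip("/") for part in key_parts if part)
    return "s3://{}/{}".format(bucket, key) if key else "s3://{}".format(bucket)


example_uri = s3_uri("example-bucket", "artifact/inference", "sourcedir.tar.gz")
print(example_uri)
```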

### Create SageMaker model

This cell creates a SageMaker model from the model artifact referenced in the model package.

In [None]:
try:
    model = TensorFlowModel(
        entry_point="inference.py",
        framework_version=str(training_framework_version),
        source_dir=inference_source_dir,
        model_data=model_package["InferenceSpecification"]["Containers"][0]["ModelDataUrl"],
        model_kms_key=kms_key,
        role=role,
        sagemaker_session=sagemaker_session
    )
except Exception as e:
    stacktrace = traceback.format_exc()
    LOGGER.error("{}".format(stacktrace))

    raise e

### Deploy a SageMaker Endpoint

Let's deploy the endpoint. To update an existing endpoint, we have to create a new endpoint configuration, as done in the `update_model` function below.

In [None]:
def get_deployed_model():
    try:
        response = sagemaker_client.list_models(
            SortBy="CreationTime",
            SortOrder="Descending",
            MaxResults=1
        )

        model_name = None

        if "Models" in response and len(response["Models"]) > 0:
            model_name = response["Models"][0]["ModelName"]

        return model_name
    except Exception as e:
        stacktrace = traceback.format_exc()
        LOGGER.error("{}".format(stacktrace))

        raise e

In [None]:
def update_model(
        bucket_inference,
        model_name,
        model_package_group_name,
        env,
        inference_instance_count,
        inference_instance_type,
        kms_key,
        monitoring_output_path):
    try:
        config_name = "{}-{}-{}".format(model_package_group_name, env, datetime.today().strftime('%Y-%m-%d-%H-%M-%S'))

        LOGGER.info("Creating endpoint configuration {}".format(config_name))

        response_endpoint_config = sagemaker_client.create_endpoint_config(
            EndpointConfigName=config_name,
            ProductionVariants=[
                {
                    "VariantName": "AllTraffic",
                    "ModelName": model_name,
                    "InitialInstanceCount": inference_instance_count,
                    "InstanceType": inference_instance_type,
                    "InitialVariantWeight": 1.0
                }
            ],
            DataCaptureConfig={
                'EnableCapture': True,
                'InitialSamplingPercentage': 100,
                'DestinationS3Uri': "s3://{}/{}".format(bucket_inference, monitoring_output_path),
                'KmsKeyId': kms_key,
                'CaptureOptions': [
                    {
                        'CaptureMode': 'Input'
                    },
                    {
                        'CaptureMode': 'Output'
                    }
                ],
                'CaptureContentTypeHeader': {
                    'CsvContentTypes': [
                        "text/csv",
                        "CSV/Text"
                    ],
                    'JsonContentTypes': [
                        'application/jsonlines',
                    ]
                }
            }
        )

        LOGGER.info(response_endpoint_config)

        response = sagemaker_client.update_endpoint(
            EndpointName=model_package_group_name + "-" + env,
            EndpointConfigName=config_name
        )

        LOGGER.info("Update endpoint {}-{}".format(model_package_group_name, env))
        LOGGER.info(response)

    except Exception as e:
        stacktrace = traceback.format_exc()
        LOGGER.error("{}".format(stacktrace))

        raise e
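
`update_model` derives the endpoint configuration name from the group name, the environment, and a timestamp. The sketch below isolates that step and checks the result before any API call is made. It assumes that configuration names are limited to 63 characters and to alphanumerics separated by hyphens; verify this against the `CreateEndpointConfig` API reference. `make_config_name` and `NAME_PATTERN` are hypothetical names.

```python
import re
from datetime import datetime

# Assumed constraint: alphanumerics separated by hyphens, at most 63 characters
NAME_PATTERN = re.compile(r"^[a-zA-Z0-9](-*[a-zA-Z0-9])*$")
MAX_NAME_LENGTH = 63


def make_config_name(group_name, env, now=None):
    """Build '<group>-<env>-<timestamp>' and validate it against the assumed rules."""
    now = now or datetime.today()
    name = "{}-{}-{}".format(group_name, env, now.strftime("%Y-%m-%d-%H-%M-%S"))
    if len(name) > MAX_NAME_LENGTH or not NAME_PATTERN.match(name):
        raise ValueError("Invalid endpoint configuration name: {}".format(name))
    return name


example_name = make_config_name("ml-end-to-end-group", "dev", datetime(2024, 1, 2, 3, 4, 5))
print(example_name)
```

Passing `now` explicitly keeps the function deterministic, which makes it easy to test.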

In [None]:
model_package_group = "ml-end-to-end-group"

In [None]:
try:
    model.deploy(
        endpoint_name=model_package_group + "-dev",
        initial_instance_count=inference_instance_count,
        instance_type=inference_instance_type,
        update_endpoint=True,
        data_capture_config=DataCaptureConfig(
                enable_capture=True,
                sampling_percentage=100,
                destination_s3_uri="s3://{}/{}".format(bucket_inference, monitoring_output_path))
    )
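except ClientError as e:
    # deploy() failed (for example because the endpoint already exists):
    # log the error and fall back to updating the endpoint with the
    # most recently created model
    stacktrace = traceback.format_exc()
    LOGGER.info("{}".format(stacktrace))

    model_name = get_deployed_model()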

    update_model(
            bucket_inference,
            model_name,
            model_package_group,
            "dev",
            inference_instance_count,
            inference_instance_type,
            kms_key,
            monitoring_output_path)

### Test the SageMaker Endpoint

* Negative - 0
* Neutral - 1
* Positive - 2
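
The mapping above can be written as a lookup table. The notebook does not show the format of the endpoint response, so the sketch below assumes that each row contains one score per class, in class order, and picks the highest one. `SENTIMENT_LABELS`, `label_from_scores`, and the example scores are hypothetical.

```python
# Class index to label, as listed in this section
SENTIMENT_LABELS = {0: "Negative", 1: "Neutral", 2: "Positive"}


def label_from_scores(scores):
    """Map one row of per-class scores (assumed format) to a label via argmax."""
    values = [float(v) for v in scores]  # CSV values may arrive as strings
    if len(values) != len(SENTIMENT_LABELS):
        raise ValueError("Expected {} scores, got {}".format(len(SENTIMENT_LABELS), len(values)))
    best_index = max(range(len(values)), key=values.__getitem__)
    return SENTIMENT_LABELS[best_index]


example_row = ["0.05", "0.15", "0.80"]  # hypothetical scores, not real model output
print(label_from_scores(example_row))
```

If the endpoint instead returns a single class index per row, the lookup `SENTIMENT_LABELS[int(index)]` is sufficient.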

In [None]:
model_package_group = "ml-end-to-end-group"

In [None]:
from sagemaker.deserializers import CSVDeserializer
from sagemaker.serializers import CSVSerializer
from sagemaker.tensorflow.model import TensorFlowPredictor

predictor = TensorFlowPredictor(
    endpoint_name=model_package_group + "-dev",
    model_name="saved_model",
    model_version=1,
    accept_type="text/csv",
    serializer=CSVSerializer(),
    deserializer=CSVDeserializer()
)

In [None]:
inputs = ["ti imploro di guardare questo documentario. molto spaventoso e informativo. uno dei motivi esatti che sto eliminando fb entir"]

result = predictor.predict(inputs)

LOGGER.info("{}".format(result))

We have just seen how to deploy an ML model using Amazon SageMaker Hosting Services and real-time endpoints.
To create monitoring jobs that check the quality of the deployed model, we can run the following optional lab.

 > [Model-Monitor](./01-Model-Monitor.ipynb)

Now we are ready to run our end-to-end workflow using a Pipeline.

 > [Pipeline](./02-Pipeline-Deployment.ipynb)