# Build and Deploy an ML Application to SageMaker Real-Time Endpoints

- We build a fully custom ML application that encapsulates the following:
  1. A [featurizer](./featurizer/) model (data pre-processing container) built with a scikit-learn column transformer
     - The model transforms raw CSV input data into features and returns the transformed data as output
  1. A [predictor](./predictor/) XGBoost model trained on the UCI Abalone dataset that accepts the transformed features (generated by the featurizer model) and returns predictions in JSON format.

![Abalone Predictor Pipeline](./images/serial-inference-pipeline.png)
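
To make the data flow concrete, here is a minimal pure-Python sketch of what a serial inference pipeline does: the response of the first container becomes the request body of the second. The functions `featurize`, `predict`, and `invoke_pipeline` are hypothetical stand-ins, not the actual models; the one-hot encoding of the first column and the placeholder scoring rule are assumptions made for illustration only.

```python
import json


def featurize(raw_csv: str) -> str:
    """Hypothetical stand-in for the featurizer container: raw CSV in, feature CSV out."""
    sex, *numeric = raw_csv.strip().split(",")
    # Assumed encoding: one-hot encode the categorical first column (M/F/I)
    one_hot = [1.0 if sex == level else 0.0 for level in ("M", "F", "I")]
    features = one_hot + [float(v) for v in numeric]
    return ",".join(str(v) for v in features)


def predict(feature_csv: str) -> str:
    """Hypothetical stand-in for the predictor container: feature CSV in, JSON out."""
    features = [float(v) for v in feature_csv.split(",")]
    # Placeholder score, NOT a trained XGBoost model
    score = sum(features)
    return json.dumps({"result": [score]})


def invoke_pipeline(raw_csv: str) -> str:
    """Run the containers in order; each response is the next request body."""
    payload = raw_csv
    for container in (featurize, predict):
        payload = container(payload)
    return payload


demo_response = json.loads(invoke_pipeline("M,0.5,0.25,0.125"))
print(demo_response)
```

The client only ever sees the first container's input format (CSV) and the last container's output format (JSON), which is why the test cell later in this notebook pairs a CSV serializer with a JSON deserializer.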

## Building Custom inference containers

1. **Step 1:** Build inference container with featurizer model - Refer to [`featurizer.ipynb`](./featurizer/featurizer.ipynb) Notebook
1. **Step 2:** Build inference container with trained XGBoost model - Refer to [`predictor.ipynb`](./predictor/predictor.ipynb) Notebook


## Prerequisite

Ensure both [featurizer.ipynb](./featurizer/featurizer.ipynb) and [predictor.ipynb](./predictor/predictor.ipynb) are completed before running this notebook.

### Upload Models to S3

Upload the model archives produced in **Step 1** ([`featurizer.ipynb`](./featurizer/featurizer.ipynb)) and **Step 2** ([`predictor.ipynb`](./predictor/predictor.ipynb)) to S3.



In [None]:
import boto3
from sagemaker import get_execution_role, session
from sagemaker.s3 import S3Downloader, S3Uploader, s3_path_join

sm_session = session.Session()
region = sm_session.boto_region_name  # public attribute; avoids the private _region_name
role = get_execution_role()
bucket = sm_session.default_bucket()

prefix = "sagemaker/abalone/models/byoc"

sm_client = boto3.client("sagemaker")

featurizer_model_data = s3_path_join(f"s3://{bucket}/{prefix}", "featurizer")
predictor_model_data = s3_path_join(f"s3://{bucket}/{prefix}", "predictor")

print(f"Uploading featurizer model to {featurizer_model_data}")
S3Uploader.upload(
    local_path="./featurizer/models/model.tar.gz",
    desired_s3_uri=featurizer_model_data,
    sagemaker_session=sm_session,
)

print(f"Uploading predictor model to {predictor_model_data}")
S3Uploader.upload(
    local_path="./predictor/models/model.tar.gz",
    desired_s3_uri=predictor_model_data,
    sagemaker_session=sm_session,
)

# Verify model files after upload
print(f"Listing files under s3://{bucket}/{prefix}")
print(S3Downloader.list(featurizer_model_data))
print(S3Downloader.list(predictor_model_data))
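
Before uploading, it can help to confirm that each archive is a readable gzip tarball and to see which files it contains. The following is a minimal sketch using only the standard library; the helper name `list_archive_members` is hypothetical, and the demo archive with its `model.joblib` member is created here purely for illustration and does not reflect the actual contents of the notebook's artifacts.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def list_archive_members(archive_path: str) -> list[str]:
    """Return the sorted names of regular files inside a .tar.gz archive."""
    with tarfile.open(archive_path, mode="r:gz") as tar:
        return sorted(m.name for m in tar.getmembers() if m.isfile())


# Demo: build a throwaway archive so the helper can be exercised offline
with tempfile.TemporaryDirectory() as tmp:
    demo_archive = Path(tmp) / "model.tar.gz"
    placeholder = b"placeholder bytes"
    with tarfile.open(demo_archive, mode="w:gz") as tar:
        info = tarfile.TarInfo(name="model.joblib")  # illustrative file name
        info.size = len(placeholder)
        tar.addfile(info, io.BytesIO(placeholder))
    demo_members = list_archive_members(str(demo_archive))

print(demo_members)
```

Against the real artifacts, the same helper could be called as `list_archive_members("./featurizer/models/model.tar.gz")`.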

### Create Models and Pipeline Model

Next, we create two `Model` objects, which are later combined into a Pipeline Model.

In [None]:
from datetime import datetime
from uuid import uuid4
from sagemaker.model import Model

suffix = f"{str(uuid4())[:5]}-{datetime.now().strftime('%d%b%Y')}"
region = boto3.Session().region_name
account_id = boto3.client("sts").get_caller_identity().get("Account")

# Featurizer Model (SKLearn Model)
image_name = "abalone/featurizer"
sklearn_image_uri = f"{account_id}.dkr.ecr.{region}.amazonaws.com/{image_name}:latest"

featurizer_model_name = f"AbaloneXGB-featurizer-{suffix}"
print(f"Creating Featurizer model: {featurizer_model_name}")
sklearn_model = Model(
    image_uri=sklearn_image_uri,
    name=featurizer_model_name,
    model_data=f"{featurizer_model_data}/model.tar.gz",
    role=role,
)

# Predictor Model (XGBoost Model)
image_name = "abalone/predictor"
xgboost_image_uri = f"{account_id}.dkr.ecr.{region}.amazonaws.com/{image_name}:latest"

predictor_model_name = f"AbaloneXGB-Predictor-{suffix}"
print(f"Creating Predictor model: {predictor_model_name}")
xgboost_model = Model(
    image_uri=xgboost_image_uri,
    name=predictor_model_name,
    model_data=f"{predictor_model_data}/model.tar.gz",
    role=role,
)
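
The generated names combine a fixed base with a random suffix and a date. A small check before calling SageMaker can catch malformed names early. This sketch assumes the naming constraint is "letters, digits, and hyphens, no leading or trailing hyphen, at most 63 characters"; treat that as an assumption to verify against the API reference. The helper `make_resource_name` is hypothetical.

```python
import re
from datetime import datetime
from uuid import uuid4

# Assumed constraint: alphanumerics and hyphens, no leading/trailing hyphen, max 63 chars
_NAME_PATTERN = re.compile(r"^[a-zA-Z0-9]([a-zA-Z0-9-]*[a-zA-Z0-9])?$")
_MAX_NAME_LENGTH = 63


def make_resource_name(base: str, suffix: str) -> str:
    """Join base and suffix with a hyphen and validate the result."""
    name = f"{base}-{suffix}"
    if len(name) > _MAX_NAME_LENGTH or not _NAME_PATTERN.match(name):
        raise ValueError(f"Invalid resource name: {name!r}")
    return name


# Same suffix recipe as the cell above: 5 hex characters plus a date stamp
demo_suffix = f"{str(uuid4())[:5]}-{datetime.now().strftime('%d%b%Y')}"
demo_name = make_resource_name("AbaloneXGB-featurizer", demo_suffix)
print(demo_name)
```

Note that an ECR repository name such as `abalone/featurizer` contains a slash and would be rejected by this check, so it is kept separate from the model name.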

### Create Pipeline Model

1. Create a Pipeline Model with `sklearn_model` and `xgboost_model` to act as a serial inference pipeline.
1. Deploy the Pipeline Model

In [None]:
from sagemaker.pipeline import PipelineModel

pipeline_model_name = f"Abalone-pipeline-{suffix}"

pipeline_model = PipelineModel(
    name=pipeline_model_name,
    role=role,
    models=[sklearn_model, xgboost_model],
    sagemaker_session=sm_session,
)

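print(f"Deploying pipeline model {pipeline_model_name}...")
# Name the endpoint explicitly so later cells can refer to it by the same name
predictor = pipeline_model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",
    endpoint_name=pipeline_model_name,
)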

### Test inference on Endpoint with Pipeline Model

- Instantiate a `Predictor` from the `sagemaker.predictor` module
- Use `CSVSerializer` to serialize the payload
- Use `JSONDeserializer` to deserialize the JSON output from the XGBoost model

In [None]:
from itertools import islice
from time import sleep

from sagemaker.deserializers import JSONDeserializer
from sagemaker.predictor import Predictor
from sagemaker.serializers import CSVSerializer

# The endpoint was deployed under the pipeline model's name
endpoint_name = pipeline_model_name

# Let's use the test dataset in featurizer/data directory
LOCALDIR = "./data"
local_test_dataset = f"{LOCALDIR}/abalone_test.csv"

# Number of test rows to send to the endpoint
limit = 15

# Create the predictor once and reuse it for every request
predictor = Predictor(
    endpoint_name=endpoint_name,
    sagemaker_session=sm_session,
    serializer=CSVSerializer(),
    deserializer=JSONDeserializer(),
)

with open(local_test_dataset, "r") as _f:
    next(_f, None)  # Skip the header row
    for row in islice(_f, limit):
        splits = row.rstrip("\n").split(",")
        # Remove the target column (last column)
        label = splits.pop(-1)
        input_cols = ",".join(splits)
        try:
            response = predictor.predict(input_cols)
            # response = {'result' : [predicted_value]}
            print(f"True: {label} | Predicted: {response['result'][0]}")
            sleep(0.15)
        except Exception as e:
            print(f"Prediction error: {e}")
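
The row handling in the loop above can be sketched as two small helpers that are easy to test without an endpoint. The names `split_features_and_label` and `extract_prediction` are hypothetical; the sample row is illustrative, and the response shape `{"result": [value]}` follows the comment in the cell above.

```python
import csv
import io


def split_features_and_label(row: str) -> tuple[str, str]:
    """Split one CSV record into (feature payload, label); the label is the last column."""
    fields = next(csv.reader(io.StringIO(row.strip())))
    *features, label = fields
    # Re-joining with commas assumes no field needs quoting
    return ",".join(features), label


def extract_prediction(response: dict) -> float:
    """Pull the first predicted value out of a {'result': [value]} response."""
    return float(response["result"][0])


demo_payload, demo_label = split_features_and_label(
    "M,0.455,0.365,0.095,0.514,0.2245,0.101,0.15,15\n"
)
print(demo_payload, demo_label)
```

Keeping the parsing separate from the network call makes it possible to check the payload format locally before any request is billed against the endpoint.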

### (Optional) Verify Logs emitted by the endpoint in CloudWatch

In [None]:
from datetime import datetime, timedelta, timezone

logs_client = boto3.client("logs")
# Use timezone-aware datetimes so that .timestamp() yields correct epoch values
end_time = datetime.now(timezone.utc)
start_time = end_time - timedelta(minutes=15)

log_group_name = f"/aws/sagemaker/Endpoints/{endpoint_name}"
# Pick the most recently active log stream
log_streams = logs_client.describe_log_streams(
    logGroupName=log_group_name,
    orderBy="LastEventTime",
    descending=True,
)
log_stream_name = log_streams["logStreams"][0]["logStreamName"]

# Retrieve the logs
logs = logs_client.get_log_events(
    logGroupName=log_group_name,
    logStreamName=log_stream_name,
    startTime=int(start_time.timestamp() * 1000),
    endTime=int(end_time.timestamp() * 1000),
)

# Print the logs with UTC timestamps
for event in logs["events"]:
    event_time = datetime.fromtimestamp(event["timestamp"] / 1000, tz=timezone.utc)
    print(f"{event_time}: {event['message']}")
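
CloudWatch Logs expects `startTime` and `endTime` as milliseconds since the Unix epoch. Calling `.timestamp()` on a naive datetime interprets it as local time, so timezone-aware values are the safer choice when building the query window. The helpers below are a hypothetical sketch of that conversion using only the standard library.

```python
from datetime import datetime, timedelta, timezone


def to_epoch_millis(dt: datetime) -> int:
    """Convert a timezone-aware datetime to milliseconds since the Unix epoch."""
    if dt.tzinfo is None:
        raise ValueError("Expected a timezone-aware datetime")
    return int(dt.timestamp() * 1000)


def from_epoch_millis(ms: int) -> datetime:
    """Convert milliseconds since the Unix epoch to a UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


# Fixed example window: the 15 minutes before noon UTC on 1 January 2024
demo_end = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
demo_start = demo_end - timedelta(minutes=15)
print(to_epoch_millis(demo_start), to_epoch_millis(demo_end))
```

Rejecting naive datetimes outright is a deliberate choice here: it turns a silent timezone offset into an immediate error.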

## Cleanup

In [None]:
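# Delete the endpoint, its endpoint configuration, and the pipeline model
try:
    print(f"Deleting endpoint: {endpoint_name}")
    sm_client.delete_endpoint(EndpointName=endpoint_name)
except Exception as e:
    print(f"Error deleting EP: {endpoint_name}\n{e}")

# Assumes the endpoint configuration shares the endpoint's name
try:
    print(f"Deleting endpoint config: {endpoint_name}")
    sm_client.delete_endpoint_config(EndpointConfigName=endpoint_name)
except Exception as e:
    print(f"Error deleting endpoint config: {endpoint_name}\n{e}")

try:
    print(f"Deleting model: {pipeline_model_name}")
    sm_client.delete_model(ModelName=pipeline_model_name)
except Exception as e:
    print(f"Error deleting model: {pipeline_model_name}\n{e}")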