# Build and deploy a serial inference application to SageMaker real-time endpoints

---

This notebook's CI test result for us-west-2 is as follows. CI test results in other regions can be found at the end of the notebook.

![This us-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/us-west-2/inference|structured|realtime|byoc|byoc-nginx-python|serial-inference-pipeline.ipynb)

---

- We build a fully custom ML serial inference application that chains two containers:
  1. A ["featurizer"](./featurizer/) model (data pre-processing container) built with an `SKLearn` column transformer
     - The model transforms raw CSV input into features and returns the transformed data as output
  1. A ["predictor"](./predictor/) `XGBoost` model trained on the UCI Abalone dataset that accepts the transformed features (generated by the "featurizer" model) and returns predictions in JSON format

![ Abalone Predictor Pipeline ](./images/serial-inference-pipeline.png)
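
To make the data flow concrete, here is a minimal, pure-Python sketch of how a serial inference pipeline chains its steps: the response body of one step becomes the request body of the next. The `toy_featurizer` and `toy_predictor` functions are hypothetical stand-ins, not the real containers, and the arithmetic inside them is made up purely for illustration.

```python
import json
from typing import Callable, List


def toy_featurizer(body: str) -> str:
    """Stand-in for the featurizer: raw CSV row in, numeric CSV row out."""
    sex, *measurements = body.strip().split(",")
    # One-hot encode the categorical column (illustrative category order).
    one_hot = [1.0 if sex == category else 0.0 for category in ("F", "I", "M")]
    features = [float(value) for value in measurements] + one_hot
    return ",".join(str(value) for value in features)


def toy_predictor(body: str) -> str:
    """Stand-in for the predictor: feature CSV in, JSON out."""
    features = [float(value) for value in body.split(",")]
    # Placeholder "model": NOT XGBoost, just a sum so the output is deterministic.
    return json.dumps({"result": [sum(features)]})


def invoke_serial_pipeline(steps: List[Callable[[str], str]], payload: str) -> str:
    """Pass the payload through each step in order, like containers in a pipeline."""
    for step in steps:
        payload = step(payload)
    return payload


raw_row = "M,0.5,0.25,0.25"  # made-up, shortened record for illustration
response = json.loads(invoke_serial_pipeline([toy_featurizer, toy_predictor], raw_row))
print(response)
```

The caller only ever sees the raw CSV request and the final JSON response; the intermediate feature vector stays inside the pipeline.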

## Building Custom inference containers

1. To build, test, and host the "featurizer" container locally, refer to the [`featurizer.ipynb`](./featurizer/featurizer.ipynb) notebook
1. To build, test, and host the "predictor" container locally, refer to the [`predictor.ipynb`](./predictor/predictor.ipynb) notebook


## Prerequisite

**NOTE:** Ensure both [featurizer.ipynb](./featurizer/featurizer.ipynb) and [predictor.ipynb](./predictor/predictor.ipynb) are completed before running this notebook.

In [None]:
!pip install -U awscli boto3 sagemaker watermark scikit-learn tqdm --quiet

%load_ext watermark
%watermark -p awscli,boto3,sagemaker,scikit-learn,tqdm

In [None]:
import os
import boto3
from pathlib import Path
from sagemaker import session, get_execution_role
from sagemaker.s3 import S3Downloader, S3Uploader, s3_path_join

# account id for constructing ECR repo uri
account_id = boto3.client("sts").get_caller_identity().get("Account")

sm_session = session.Session()
region = sm_session.boto_region_name
role = get_execution_role()
bucket = sm_session.default_bucket()

prefix = "sagemaker/abalone/models/byoc"

current_dir = os.getcwd()

abalone_s3uri = (
    f"s3://sagemaker-example-files-prod-{region}/datasets/tabular/uci_abalone/abalone.csv"
)

pretrained_xgboost_model_s3uri = (
    f"s3://sagemaker-example-files-prod-{region}/models/xgb-abalone/xgboost-model"
)


base_dir = Path("./data")
featurizer_dir = Path("./featurizer").absolute()
predictor_dir = Path("./predictor").absolute()

S3Downloader.download(s3_uri=abalone_s3uri, local_path=base_dir, sagemaker_session=sm_session)

### Build "featurizer" model

In [None]:
os.chdir(featurizer_dir)
print(os.getcwd())

In [None]:
import joblib
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

featurizer_model_dir = featurizer_dir.joinpath("models")

DATA_DIR = Path("../data").resolve()
DATA_FILE = DATA_DIR.joinpath("abalone.csv")

if not DATA_FILE.exists():
    raise ValueError(f"{DATA_FILE} doesn't exist")

if not featurizer_model_dir.exists():
    featurizer_model_dir.mkdir(parents=True)

# As we get a headerless CSV file, we specify the column names here.
feature_columns_names = [
    "sex",
    "length",
    "diameter",
    "height",
    "whole_weight",
    "shucked_weight",
    "viscera_weight",
    "shell_weight",
]
label_column = "rings"

feature_columns_dtype = {
    "sex": str,
    "length": np.float64,
    "diameter": np.float64,
    "height": np.float64,
    "whole_weight": np.float64,
    "shucked_weight": np.float64,
    "viscera_weight": np.float64,
    "shell_weight": np.float64,
}
label_column_dtype = {"rings": np.float64}


def merge_two_dicts(x, y):
    z = x.copy()
    z.update(y)
    return z


df = pd.read_csv(
    DATA_FILE,
    header=None,
    names=feature_columns_names + [label_column],
    dtype=merge_two_dicts(feature_columns_dtype, label_column_dtype),
)

print("Splitting raw dataset into train and test datasets...")

(df_train_val, df_test) = train_test_split(df, random_state=42, test_size=0.1)


df_test.to_csv(f"{DATA_DIR.joinpath('abalone_test.csv')}", index=False)

print(f"Test dataset written to {str(DATA_DIR.resolve())}/abalone_test.csv")


numeric_features = list(feature_columns_names)
numeric_features.remove("sex")
numeric_transformer = Pipeline(
    steps=[
        ("imputer", SimpleImputer(strategy="median")),
        ("scaler", StandardScaler()),
    ]
)

categorical_features = ["sex"]
categorical_transformer = Pipeline(
    steps=[
        ("imputer", SimpleImputer(strategy="constant", fill_value="missing")),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]
)

preprocess = ColumnTransformer(
    transformers=[
        ("num", numeric_transformer, numeric_features),
        ("cat", categorical_transformer, categorical_features),
    ]
)

# Fit all transformers on the training split; columns not listed above (such as "rings") are dropped
preprocessor = preprocess.fit(df_train_val)

# Save the fitted preprocessor to the featurizer/models directory
joblib.dump(preprocessor, featurizer_model_dir.joinpath("preprocess.joblib"))
print(f"Saved preprocessor model to {featurizer_model_dir}")
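
For intuition, the following pure-Python sketch approximates what the two transformer branches compute, using a tiny made-up sample: median imputation followed by standardization for a numeric column, and one-hot encoding for `sex`. The helper names are hypothetical, and this is an illustration rather than a replacement for the fitted `ColumnTransformer`.

```python
from statistics import mean, median, pstdev
from typing import List, Optional


def impute_and_scale(values: List[Optional[float]]) -> List[float]:
    """Fill missing values with the median, then standardize to zero mean, unit variance."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    filled = [fill if v is None else v for v in values]
    mu = mean(filled)
    sigma = pstdev(filled)  # population std (ddof=0), as StandardScaler uses
    return [(v - mu) / sigma for v in filled]


def one_hot(values: List[str], categories: List[str]) -> List[List[float]]:
    """Encode each value as an indicator vector; unknown categories become all zeros."""
    return [[1.0 if v == c else 0.0 for c in categories] for v in values]


lengths = [1.0, None, 3.0, 2.0]  # made-up values with one gap
scaled = impute_and_scale(lengths)
encoded = one_hot(["M", "F", "X"], categories=["F", "I", "M"])
print(scaled)
print(encoded)
```

The all-zero row for the unseen category `"X"` mirrors the effect of `handle_unknown="ignore"` in the `OneHotEncoder` above.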

In [None]:
import subprocess

os.chdir(featurizer_model_dir.absolute())

featurizer_model_path = featurizer_model_dir.absolute().joinpath("model.tar.gz")

if featurizer_model_path.exists():
    featurizer_model_path.unlink()

tar_cmd = "tar -czvf model.tar.gz preprocess.joblib ../code/"
result = subprocess.run(tar_cmd, shell=True, capture_output=True)

if result.returncode == 0:
    print(f"{featurizer_model_path} archive created successfully!")
    os.chdir(featurizer_dir)
else:
    os.chdir(featurizer_dir)
    print("An error occurred:", result.stderr)
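
The same kind of archive could also be built without shelling out, using Python's standard `tarfile` module. The sketch below runs against a throwaway temporary directory with placeholder files (the `inference.py` name is hypothetical), so it is safe to execute anywhere. The assumed layout is the model artifact at the archive root plus a `code/` directory.

```python
import tarfile
import tempfile
from pathlib import Path


def make_model_archive(model_file: Path, code_dir: Path, archive_path: Path) -> Path:
    """Bundle a model artifact and a code directory into a gzip-compressed tarball."""
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname controls the path stored inside the archive.
        tar.add(model_file, arcname=model_file.name)
        tar.add(code_dir, arcname="code")
    return archive_path


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    models_dir = root / "models"
    code_dir = root / "code"
    models_dir.mkdir()
    code_dir.mkdir()
    # Placeholder files standing in for the real artifact and inference code.
    (models_dir / "preprocess.joblib").write_bytes(b"placeholder")
    (code_dir / "inference.py").write_text("# hypothetical inference script\n")

    archive = make_model_archive(
        models_dir / "preprocess.joblib", code_dir, models_dir / "model.tar.gz"
    )
    with tarfile.open(archive, "r:gz") as tar:
        members = sorted(tar.getnames())

print(members)
```

Setting `arcname` explicitly avoids any dependence on the current working directory, which is why this variant needs no `os.chdir` calls.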

### Build and push "featurizer" docker image to private ECR repo

In [None]:
featurizer_image_name = "abalone/featurizer"

# build featurizer image
!docker build -t $featurizer_image_name .

# change file permissions
!chmod +x build_n_push.sh

# push image to ecr repo
!./build_n_push.sh $featurizer_image_name

In [None]:
# Full name of the ECR repository
featurizer_ecr_repo_uri = f"{account_id}.dkr.ecr.{region}.amazonaws.com/{featurizer_image_name}"

print(featurizer_ecr_repo_uri)

# Upload featurizer model to S3
featurizer_s3_uri = s3_path_join(f"s3://{bucket}/{prefix}", "featurizer")

# Fail early: the upload result is needed when creating the model later
if not featurizer_model_path.exists():
    raise FileNotFoundError(f"{featurizer_model_path} not found!")

featurizer_model_data = S3Uploader.upload(
    local_path=str(featurizer_model_path),
    desired_s3_uri=featurizer_s3_uri,
    sagemaker_session=sm_session,
)

print(f"Featurizer model uploaded to {featurizer_model_data}")

### Build predictor model

We download and use the pre-trained `xgboost` model from S3.

In [None]:
# Step out of featurizer directory
os.chdir(current_dir)
print(os.getcwd())

In [None]:
predictor_model_dir = predictor_dir.joinpath("models").absolute()
if not predictor_model_dir.exists():
    predictor_model_dir.mkdir(exist_ok=True)

In [None]:
os.chdir(predictor_dir)
os.getcwd()

In [None]:
!aws s3 cp $pretrained_xgboost_model_s3uri $predictor_model_dir

In [None]:
os.chdir(predictor_model_dir)
predictor_model_path = predictor_model_dir.joinpath("model.tar.gz")

if predictor_model_path.exists():
    predictor_model_path.unlink()

tar_cmd = "tar -czvf model.tar.gz xgboost-model ../code/"
result = subprocess.run(tar_cmd, shell=True, capture_output=True)

# Return to the predictor directory regardless of the outcome
os.chdir(predictor_dir)

if result.returncode == 0:
    print("Tar archive created successfully!")
    print(predictor_model_path)
else:
    print("An error occurred:", result.stderr.decode())

### Build and push "predictor" docker image to private ECR repo

In [None]:
predictor_image_name = "abalone/predictor"

!docker build -t $predictor_image_name .

!chmod +x build_n_push.sh

!./build_n_push.sh $predictor_image_name

In [None]:
# Full name of the ECR repository
predictor_ecr_repo_uri = f"{account_id}.dkr.ecr.{region}.amazonaws.com/{predictor_image_name}"

print(predictor_ecr_repo_uri)

# Upload predictor model to S3
predictor_s3_uri = s3_path_join(f"s3://{bucket}/{prefix}", "predictor")

if predictor_model_path.exists():
    print(f"Uploading predictor model to {predictor_s3_uri}")
    predictor_model_data = S3Uploader.upload(
        local_path=str(predictor_model_path),
        desired_s3_uri=predictor_s3_uri,
        sagemaker_session=sm_session,
    )
else:
    print(f"{predictor_model_path} not found!")

os.chdir(current_dir)

### Create Models and Pipeline Model

Now we create two model objects, which are later combined into a Pipeline Model.

In [None]:
from datetime import datetime
from uuid import uuid4
from sagemaker.model import Model

suffix = f"{str(uuid4())[:5]}-{datetime.now().strftime('%d%b%Y')}"

# Featurizer Model (SKLearn Model), using the ECR image URI built earlier

featurizer_model_name = f"AbaloneXGB-featurizer-{suffix}"
print(f"Creating Featurizer model: {featurizer_model_name}")
sklearn_model = Model(
    image_uri=featurizer_ecr_repo_uri,
    name=featurizer_model_name,
    model_data=featurizer_model_data,
    role=role,
)

# Predictor Model (XGBoost Model)
predictor_model_name = f"AbaloneXGB-Predictor-{suffix}"
print(f"Creating Predictor model: {predictor_model_name}")
xgboost_model = Model(
    image_uri=predictor_ecr_repo_uri,
    name=predictor_model_name,
    model_data=predictor_model_data,
    role=role,
)
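
The suffix keeps resource names unique across repeated runs. As a rough sketch, the naming scheme can be wrapped in a helper that also checks the result before any API call is made. The helper name `make_resource_name` is hypothetical, and the validation rule (alphanumerics and hyphens only, at most 63 characters) is an assumption about SageMaker naming constraints that should be checked against the service documentation.

```python
import re
from datetime import datetime
from uuid import uuid4

# Assumed constraint: alphanumerics and hyphens only, no leading or trailing hyphen, at most 63 chars.
NAME_PATTERN = re.compile(r"^[a-zA-Z0-9](-*[a-zA-Z0-9])*$")
MAX_NAME_LENGTH = 63


def make_resource_name(base: str) -> str:
    """Append a short random id and today's date to `base`, then validate the result."""
    suffix = f"{str(uuid4())[:5]}-{datetime.now().strftime('%d%b%Y')}"
    name = f"{base}-{suffix}"
    if len(name) > MAX_NAME_LENGTH or not NAME_PATTERN.match(name):
        raise ValueError(f"Invalid resource name: {name}")
    return name


example_name = make_resource_name("AbaloneXGB-featurizer")
print(example_name)
```

Note that `%b` is locale-dependent, so on a non-English locale the month abbreviation could contain characters that fail this check.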

### Create Pipeline Model

1. Create a Pipeline model with `sklearn_model` and `xgboost_model` to act as a serial inference pipeline.
1. Deploy Pipeline Model

In [None]:
from sagemaker.pipeline import PipelineModel

pipeline_model_name = f"Abalone-pipeline-{suffix}"

pipeline_model = PipelineModel(
    name=pipeline_model_name,
    role=role,
    models=[sklearn_model, xgboost_model],
    sagemaker_session=sm_session,
)

print(f"Deploying pipeline model {pipeline_model_name}...")
predictor = pipeline_model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",
)

### Test inference on Endpoint with Pipeline Model

- Instantiate a `Predictor` class from the `sagemaker.predictor` module
- Use `CSVSerializer` to serialize the payload
- Use `JSONDeserializer` to deserialize the JSON output from the XGBoost model
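
As a rough illustration of the request and response shapes involved, the sketch below builds the CSV payload from one test record and parses a JSON body. The sample row and the response string are illustrative only; the `result` key mirrors what the notebook's test loop reads from the endpoint's response.

```python
import json
from typing import Tuple


def split_features_and_label(row: str) -> Tuple[str, str]:
    """Separate the trailing label column from a CSV record."""
    *features, label = row.rstrip("\n").split(",")
    return ",".join(features), label


# Illustrative record in the column order used by the test file (label last).
sample_row = "M,0.455,0.365,0.095,0.514,0.2245,0.101,0.15,15.0\n"
payload, label = split_features_and_label(sample_row)

# Hypothetical response body, shaped like the JSON the predictor returns.
fake_response_body = '{"result": [9.7]}'
prediction = json.loads(fake_response_body)["result"][0]

print(f"Payload: {payload}")
print(f"True value: {label} | Predicted: {prediction}")
```

Splitting on commas is enough here because none of the Abalone fields contain quoted or embedded commas.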

In [None]:
from time import sleep
from sagemaker.predictor import Predictor
from sagemaker.serializers import CSVSerializer
from sagemaker.deserializers import JSONDeserializer

# No endpoint_name was passed to deploy(), so the endpoint takes the pipeline model's name
endpoint_name = pipeline_model_name

# Use the test dataset written earlier to the data directory
local_test_dataset = DATA_DIR.joinpath("abalone_test.csv").resolve()

predictor = Predictor(
    endpoint_name=endpoint_name,
    sagemaker_session=sm_session,
    serializer=CSVSerializer(),
    deserializer=JSONDeserializer(),
)


limit = 15
i = 0

with open(local_test_dataset, "r") as _f:
    for row in _f:
        # Skip headers row
        if i == 0:
            i += 1
        elif i <= limit:
            row = row.rstrip("\n")
            splits = row.split(",")
            # Remove the target column (last column)
            label = splits.pop(-1)
            input_cols = ",".join(splits)
            try:
                response = predictor.predict(input_cols)
                print(f"True value: {label} | Predicted: {response['result'][0]}")
                i += 1
                sleep(0.15)
            except Exception as e:
                print(f"Prediction error: {e}")

### (Optional) Verify Logs emitted by the endpoint in CloudWatch

In [None]:
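from datetime import timedelta, timezone

logs_client = boto3.client("logs")
# Use a timezone-aware timestamp: timestamp() would treat a naive utcnow() value as local time
end_time = datetime.now(timezone.utc)
start_time = end_time - timedelta(minutes=15)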

log_group_name = f"/aws/sagemaker/Endpoints/{endpoint_name}"
log_streams = logs_client.describe_log_streams(logGroupName=log_group_name)
log_stream_name = log_streams["logStreams"][0]["logStreamName"]

# Retrieve the logs
logs = logs_client.get_log_events(
    logGroupName=log_group_name,
    logStreamName=log_stream_name,
    startTime=int(start_time.timestamp() * 1000),
    endTime=int(end_time.timestamp() * 1000),
)

# Print the logs
for event in logs["events"]:
    print(f"{datetime.fromtimestamp(event['timestamp'] // 1000)}: {event['message']}")
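
CloudWatch Logs expects `startTime` and `endTime` as milliseconds since the Unix epoch. A small helper, sketched here with the hypothetical names `to_epoch_millis` and `lookback_window`, makes that conversion explicit and insists on timezone-aware datetimes so the result does not depend on the machine's local timezone.

```python
from datetime import datetime, timedelta, timezone


def to_epoch_millis(moment: datetime) -> int:
    """Convert a timezone-aware datetime to milliseconds since the Unix epoch."""
    if moment.tzinfo is None:
        raise ValueError("Expected a timezone-aware datetime")
    return int(moment.timestamp() * 1000)


def lookback_window(minutes: int, now: datetime) -> tuple:
    """Return (start_ms, end_ms) covering the last `minutes` minutes before `now`."""
    return to_epoch_millis(now - timedelta(minutes=minutes)), to_epoch_millis(now)


reference = datetime(2024, 1, 1, 0, 15, tzinfo=timezone.utc)  # arbitrary fixed instant
start_ms, end_ms = lookback_window(15, reference)
print(start_ms, end_ms)
```

In the notebook, `reference` would be `datetime.now(timezone.utc)`; a fixed instant is used here only to keep the output reproducible.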

## Cleanup

In [None]:
# Delete the model, then the endpoint
try:
    print(f"Deleting model: {pipeline_model_name}")
    predictor.delete_model()
except Exception as e:
    print(f"Error deleting model: {pipeline_model_name}\n{e}")

try:
    print(f"Deleting endpoint: {endpoint_name}")
    predictor.delete_endpoint()
except Exception as e:
    print(f"Error deleting endpoint: {endpoint_name}\n{e}")

## Notebook CI Test Results

This notebook was tested in multiple regions. The test results are as follows, except for us-west-2 which is shown at the top of the notebook.


![This us-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/us-east-1/inference|structured|realtime|byoc|byoc-nginx-python|serial-inference-pipeline.ipynb)

![This us-east-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/us-east-2/inference|structured|realtime|byoc|byoc-nginx-python|serial-inference-pipeline.ipynb)

![This us-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/us-west-1/inference|structured|realtime|byoc|byoc-nginx-python|serial-inference-pipeline.ipynb)

![This ca-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ca-central-1/inference|structured|realtime|byoc|byoc-nginx-python|serial-inference-pipeline.ipynb)

![This sa-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/sa-east-1/inference|structured|realtime|byoc|byoc-nginx-python|serial-inference-pipeline.ipynb)

![This eu-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-west-1/inference|structured|realtime|byoc|byoc-nginx-python|serial-inference-pipeline.ipynb)

![This eu-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-west-2/inference|structured|realtime|byoc|byoc-nginx-python|serial-inference-pipeline.ipynb)

![This eu-west-3 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-west-3/inference|structured|realtime|byoc|byoc-nginx-python|serial-inference-pipeline.ipynb)

![This eu-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-central-1/inference|structured|realtime|byoc|byoc-nginx-python|serial-inference-pipeline.ipynb)

![This eu-north-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/eu-north-1/inference|structured|realtime|byoc|byoc-nginx-python|serial-inference-pipeline.ipynb)

![This ap-southeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-southeast-1/inference|structured|realtime|byoc|byoc-nginx-python|serial-inference-pipeline.ipynb)

![This ap-southeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-southeast-2/inference|structured|realtime|byoc|byoc-nginx-python|serial-inference-pipeline.ipynb)

![This ap-northeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-northeast-1/inference|structured|realtime|byoc|byoc-nginx-python|serial-inference-pipeline.ipynb)

![This ap-northeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-northeast-2/inference|structured|realtime|byoc|byoc-nginx-python|serial-inference-pipeline.ipynb)

![This ap-south-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://prod.us-west-2.tcx-beacon.docs.aws.dev/sagemaker-nb/ap-south-1/inference|structured|realtime|byoc|byoc-nginx-python|serial-inference-pipeline.ipynb)
