# Build Predictor (XGBoost) Model

We demonstrate building an ML application that predicts the number of rings of an abalone.

After the model is hosted for inference, the payload is sent as a raw (untransformed) CSV string to a real-time endpoint.
The preprocessor container receives the raw payload first, transforms it (feature engineering), and returns the transformed record (float values) as a CSV string.

The transformed record is then passed to the predictor container (XGBoost model). The predictor converts the transformed record into [`XGBoost DMatrix`](https://xgboost.readthedocs.io/en/latest/python/python_api.html#xgboost.DMatrix) format, loads the model, calls `booster.predict(input_data)`, and returns the predictions (Rings) in JSON format.

![Abalone Predictor](../images/byoc-predictor.png)
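
The predictor flow described above (CSV in, model call, JSON out) can be sketched roughly as follows. This is a minimal illustration and not the code in `code/inference.py`: the helper names `parse_csv_payload` and `predict_as_json` and the `{"predictions": [...]}` response shape are assumptions, and the model call is injected as a function so the sketch runs without XGBoost installed.

```python
import json
from typing import Callable, List, Sequence


def parse_csv_payload(payload: str) -> List[List[float]]:
    """Turn a CSV payload (one record per line) into rows of floats."""
    rows = []
    for line in payload.strip().splitlines():
        if line.strip():  # skip blank lines
            rows.append([float(value) for value in line.split(",")])
    return rows


def predict_as_json(
    payload: str,
    predict_fn: Callable[[List[List[float]]], Sequence[float]],
) -> str:
    """Parse the payload, run the model, and serialize predictions as JSON.

    In the real container, predict_fn would wrap something like
    booster.predict(xgboost.DMatrix(rows)); here the caller supplies it.
    """
    rows = parse_csv_payload(payload)
    predictions = [float(p) for p in predict_fn(rows)]
    # Hypothetical response shape; the actual inference script may differ.
    return json.dumps({"predictions": predictions})


if __name__ == "__main__":
    # Stand-in "model" that returns the number of features in each row.
    fake_model = lambda rows: [len(row) for row in rows]
    print(predict_as_json("0.1,0.2,0.3\n1.0,2.0,3.0", fake_model))
```

Passing the model call in as a function keeps the parsing and serialization logic easy to test on its own.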

We use [nginx](https://nginx.org/) as the reverse proxy, [gunicorn](https://gunicorn.org/#deployment) as the WSGI (web server gateway interface) HTTP server, and a Python [Flask](https://flask.palletsprojects.com/en/2.3.x/tutorial/factory/) app for the inference code.

## Dataset and model

For this example, we use an [XGBoost](https://xgboost.readthedocs.io) model pre-trained on the [UCI Abalone dataset](https://archive.ics.uci.edu/ml/datasets/abalone).
The trained [xgboost-model](./models/xgboost-model) accepts input in `text/csv` format and returns prediction results as `application/json`.

## Prerequisite

Ensure [`featurizer.ipynb`](../featurizer/featurizer.ipynb) is run first.
This notebook uses the `abalone_test_predictions.csv` file that [`featurizer.ipynb`](../featurizer/featurizer.ipynb) generates.

### Inference script for a predictor (XGBoost) model

- In this example, we use an XGBoost model trained on the UCI Abalone dataset. The trained model, named `xgboost-model`, is available in the `./models` directory.
- The inference code is implemented in [`code/inference.py`](./code/inference.py). The [Flask](https://flask.palletsprojects.com/) app does the following:
  - Implements routes for `/ping` and `/invocations`
  - Implements functions to handle preprocessing, model loading, and prediction
  - Returns predictions from the `/invocations` route
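
To make the two-route contract concrete, here is a minimal sketch of `/ping` and `/invocations` handlers. It deliberately uses a plain WSGI callable instead of Flask so that it has no dependencies; it is not the notebook's `inference.py`, and the placeholder response (`n_features`) is an assumption standing in for a real model call.

```python
import json
from typing import Callable, Iterable, List, Tuple

StartResponse = Callable[[str, List[Tuple[str, str]]], None]


def app(environ: dict, start_response: StartResponse) -> Iterable[bytes]:
    """Minimal WSGI app exposing the two routes SageMaker calls."""
    method = environ.get("REQUEST_METHOD", "GET")
    path = environ.get("PATH_INFO", "/")

    if method == "GET" and path == "/ping":
        # Health check: a 200 response tells SageMaker the container is up.
        start_response("200 OK", [("Content-Type", "application/json")])
        return [b""]

    if method == "POST" and path == "/invocations":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        payload = environ["wsgi.input"].read(length).decode("utf-8")
        # Placeholder "prediction": count the CSV fields instead of calling a model.
        body = json.dumps({"n_features": len(payload.split(","))}).encode("utf-8")
        start_response("200 OK", [("Content-Type", "application/json")])
        return [body]

    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found"]
```

A Flask application object is itself a WSGI callable, which is why gunicorn can serve either form in the same way.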

In [None]:
!pygmentize ./code/inference.py

### Build and test custom inference image locally

 - [Dockerfile](./Dockerfile) implementation
   - Sets the required label `LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true`
   - Installs the required software and Python packages ([nginx](https://nginx.org/), [gunicorn](https://gunicorn.org/#deployment), [flask](https://flask.palletsprojects.com/en/2.3.x/tutorial/factory/), [xgboost](https://xgboost.readthedocs.io/), etc.)
   - Copies all files under the `code` directory to `/opt/program`
   - Sets `ENTRYPOINT` to `["python"]`
   - Sets `CMD` to `["serve"]` (a Python script that launches nginx and gunicorn in the background)
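
As a rough idea of what a `serve` script of this kind does, the sketch below only assembles the two command lines. It is not the script shipped in `code/`: the config path `/opt/program/nginx.conf`, the socket path, and the `wsgi:app` module name are all assumptions chosen for illustration.

```python
import multiprocessing
from typing import List, Tuple


def build_server_commands(workers: int, timeout_seconds: int) -> Tuple[List[str], List[str]]:
    """Return the (nginx, gunicorn) command lines a serve script might launch."""
    # Assumed config path; nginx's -c flag selects the configuration file.
    nginx_cmd = ["nginx", "-c", "/opt/program/nginx.conf"]
    gunicorn_cmd = [
        "gunicorn",
        "--timeout", str(timeout_seconds),
        "--bind", "unix:/tmp/gunicorn.sock",  # assumed socket that nginx proxies to
        "--workers", str(workers),
        "wsgi:app",  # hypothetical module:callable holding the Flask app
    ]
    return nginx_cmd, gunicorn_cmd


if __name__ == "__main__":
    nginx_cmd, gunicorn_cmd = build_server_commands(
        workers=multiprocessing.cpu_count(), timeout_seconds=60
    )
    # A real script would start both with subprocess.Popen and exit when either stops.
    print(" ".join(nginx_cmd))
    print(" ".join(gunicorn_cmd))
```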

In [None]:
!pygmentize Dockerfile

In [None]:
# build image locally
!docker build -t abalone/predictor .

### Launch and test custom Inference container locally

Open a new terminal and run the following Docker command:

```bash
docker run --rm -v $(pwd)/models:/opt/ml/model -p 8080:8080 abalone/predictor
```

- This command mounts the [models](./models) directory at `/opt/ml/model` inside the container and maps container port `8080` to host port `8080`.

In [None]:
# Open a terminal and cd into the predictor directory (where predictor.ipynb is located),
# then run this command to launch the container locally:
# docker run --rm -v $(pwd)/models:/opt/ml/model -p 8080:8080 abalone/predictor

#### Check container health by invoking `/ping`

If the `/ping` request succeeds, the terminal running the container shows a log line similar to `"GET /ping HTTP/1.1" 200 1`.

In [None]:
# Ping local inference endpoint
!curl http://localhost:8080/ping
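
Because the container takes a moment to start, a single `curl` can fail with a refused connection. A small polling helper, sketched below with only the standard library, retries until `/ping` answers with HTTP 200. The function names and the injectable `fetch_status` parameter are illustrative assumptions, not part of the notebook.

```python
import time
import urllib.request
from typing import Callable


def http_status(url: str) -> int:
    """Return the HTTP status code of a GET request to url."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return response.status


def wait_for_ping(
    url: str,
    attempts: int = 10,
    delay_seconds: float = 1.0,
    fetch_status: Callable[[str], int] = http_status,
) -> bool:
    """Poll a health-check URL until it returns HTTP 200 or attempts run out."""
    for attempt in range(attempts):
        try:
            if fetch_status(url) == 200:
                return True
        except OSError:
            # Covers refused connections, timeouts, and urllib's URLError/HTTPError.
            pass
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    return False
```

With the container running, `wait_for_ping("http://localhost:8080/ping")` should return `True` once the server is ready.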

### Verify container logs locally (using docker logs)

- To inspect a running container and view its config values or IP address, use `docker inspect <CONTAINER_ID_OR_NAME>`
- To view and tail logs generated in the container, use `docker logs --follow --tail <NUM_OF_LINES> <CONTAINER_ID_OR_NAME>`
- SageMaker publishes container logs to CloudWatch. Logs for a given endpoint go to the log group `/aws/sagemaker/Endpoints/ENDPOINT_NAME`, in log streams whose names start with `VARIANT_NAME`

**NOTE:**
1. Run this command in a terminal; running it inside a notebook cell would hang execution.
1. The command below assumes there is only one running container. If you have more, pass the container name explicitly: `docker logs --follow --tail 50 <CONTAINER_ID_OR_NAME>`

In [None]:
# RUN THE BELOW IN A SEPARATE NEW TERMINAL
# docker ps --format "{{.Names}}" | xargs -I{} docker logs --follow --tail 50 {}

### Troubleshooting container locally (using logs)

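If the container fails to start or a request fails, print its logs using the container ID or name shown by `docker ps` (not the image name):

`!docker logs <CONTAINER_ID_OR_NAME>`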

#### Test records for inference

Grab a test record from [abalone_test_predictor.csv](../featurizer/data/abalone_test_predictor.csv), generated by [`featurizer.ipynb`](../featurizer/featurizer.ipynb), format it as a CSV record, and send it as raw data to the `/invocations` path of the local endpoint (`http://localhost:8080/invocations`).

In [None]:
# Send test records to /invocations on the endpoint
!curl --data-raw '-1.3317586042173168,-1.1425409076053987,-1.0579488602777858,-1.177706547272754,-1.130662184748842,-1.1493955859050584,-1.139968767909096,0.0,1.0,0.0' \
-H 'Content-Type: text/csv; charset=utf-8' \
-v http://localhost:8080/invocations

In [None]:
# Send test records to /invocations on the endpoint
!curl --data-raw '0.7995425613971686,0.877965470587042,1.326659055767273,1.398563012556441,0.9896192483949702,1.509166873607132,2.01650402614155,0.0,0.0,1.0' \
-H 'Content-Type: text/csv; charset=utf-8' \
-v http://localhost:8080/invocations
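
The same request can be made from Python instead of `curl`. The sketch below, using only the standard library, builds the POST request with the `text/csv` content type and sends it to the local container. The helper names are illustrative assumptions.

```python
import urllib.request
from typing import Sequence


def build_invocation_request(
    features: Sequence[float],
    url: str = "http://localhost:8080/invocations",
) -> urllib.request.Request:
    """Build a POST request carrying one record as a text/csv body."""
    body = ",".join(str(value) for value in features).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "text/csv; charset=utf-8"},
        method="POST",
    )


def invoke_local(features: Sequence[float]) -> str:
    """Send the record to the local container and return the response text."""
    request = build_invocation_request(features)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")
```

With the container running, `invoke_local([...])` called with the ten transformed feature values from a test record returns the raw response body as a string.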

### Tag and push the local image to private ECR

- Tag the local `abalone/predictor` image in the `{account_id}.dkr.ecr.{region}.amazonaws.com/{imagename}:{tag}` format
- Run the [./build_n_push.sh](./build_n_push.sh) shell script with the image name `abalone/predictor` as its parameter


In [None]:
!chmod +x ./build_n_push.sh
!./build_n_push.sh abalone/predictor

### Optional: Test predictor inference image by deploying to a real-time endpoint

- **Step 1:** Initialize the SageMaker session, compress the model in `./models/xgboost-model` to `model.tar.gz`, and upload it to S3
- **Step 2:** Create a Model object with your custom inference image
- **Step 3:** Deploy the model to an endpoint and wait for it to be `InService`
- **Step 4:** Send test inference requests to the deployed endpoint
- **Cleanup:** Delete the endpoint and the model

#### **Step 1:** Initialize Session and upload model artifacts to S3

In [None]:
import boto3
import sagemaker
import os
import tarfile
from sagemaker import get_execution_role, session
from sagemaker.s3 import S3Downloader, S3Uploader, s3_path_join

sm_session = session.Session()
region = sm_session.boto_region_name  # public property; avoids the private _region_name
role = get_execution_role()
bucket = sm_session.default_bucket()

prefix = "sagemaker/abalone/models/byoc"

sm_client = boto3.client("sagemaker")
account_id = boto3.client("sts").get_caller_identity().get("Account")
model_s3uri = s3_path_join(f"s3://{bucket}/{prefix}", "predictor")

print(f"SageMaker Role: {role}")

print(f"Listing files under: {model_s3uri}")
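# Print the listing explicitly; a bare expression mid-cell is not displayed
print(S3Downloader.list(model_s3uri, sagemaker_session=sm_session))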

model_path = os.path.join("./models", "xgboost-model")
model_output_path = os.path.join("./models", "model.tar.gz")

# SageMaker expects model artifacts to be compressed to `model.tar.gz`
if not os.path.exists(model_output_path):
    print(f"Compressing model to {model_output_path}")
    # The context manager closes the archive even if adding the file fails
    with tarfile.open(model_output_path, "w:gz") as tar:
        tar.add(model_path, arcname="xgboost-model")
else:
    print(f"Model file exists: {model_output_path}")

# Upload compressed model artifact to S3 using S3Uploader utility class
model_data_url = S3Uploader.upload(
    local_path=model_output_path,
    desired_s3_uri=model_s3uri,
    sagemaker_session=sm_session,
)
print(f"Uploaded predictor model.tar.gz to {model_data_url}")
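
SageMaker extracts `model.tar.gz` into `/opt/ml/model`, the same path the local `docker run` command mounts, so the model file should sit at the root of the archive. The sketch below shows one way to confirm the archive layout; the helper name `make_model_archive` is an assumption and the dummy file merely stands in for a real model.

```python
import os
import tarfile
import tempfile
from typing import List


def make_model_archive(model_file: str, archive_path: str) -> List[str]:
    """Write model_file into a gzip tarball and return the archive's member names."""
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname drops the local directory so the file sits at the archive root.
        tar.add(model_file, arcname=os.path.basename(model_file))
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        # Dummy bytes stand in for a real serialized model.
        model_file = os.path.join(workdir, "xgboost-model")
        with open(model_file, "wb") as handle:
            handle.write(b"placeholder")
        print(make_model_archive(model_file, os.path.join(workdir, "model.tar.gz")))
```

If the listing showed a nested path such as `models/xgboost-model`, the file would be extracted into a subdirectory of `/opt/ml/model` instead of the directory itself.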

#### **Step 2:** Create model object with custom inference image

In [None]:
from datetime import datetime
from uuid import uuid4
from sagemaker.model import Model

image_name = "abalone/predictor"
ecr_image = f"{account_id}.dkr.ecr.{region}.amazonaws.com/{image_name}:latest"
print(f"model_image_uri: {ecr_image}")

suffix = f"{str(uuid4())[:5]}-{datetime.now().strftime('%d%b%Y')}"
model_name = f"AbaloneXGB-predictor-{suffix}"

print(f"Creating model : {model_name} with custom image:\n{ecr_image}")
predictor_model = Model(
    image_uri=ecr_image,
    name=model_name,
    model_data=model_data_url,
    role=role,
    sagemaker_session=sm_session,
)

#### **Step 3:** Deploy model to endpoint

In [None]:
endpoint_name = f"Abalone-nginx-ep-{suffix}"

print(f"Deploying model to endpoint: {endpoint_name}")
predictor = predictor_model.deploy(
    endpoint_name=endpoint_name,
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",
    wait=False,
)

#### Wait for the endpoint to be `InService`

Get a waiter for the endpoint and block until the endpoint status is `InService`.

In [None]:
# Get endpoint status using describe endpoint
status = sm_client.describe_endpoint(EndpointName=endpoint_name)["EndpointStatus"]
print(f"Endpoint {endpoint_name} - Status: {status}")

# Get waiter object
waiter = sm_client.get_waiter("endpoint_in_service")
# Apply waiter on the endpoint
waiter.wait(EndpointName=endpoint_name)

# Get endpoint status using describe endpoint
status = sm_client.describe_endpoint(EndpointName=endpoint_name)["EndpointStatus"]
print(f"Endpoint {endpoint_name} - Status: {status}")
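
The waiter hides the polling loop. For readers who want to see the logic, or to fail fast with a custom message, an explicit version might look like the sketch below. The function name and defaults are assumptions; the status lookup is passed in as a callable so the loop itself does not depend on boto3.

```python
import time
from typing import Callable


def wait_until_in_service(
    get_status: Callable[[], str],
    poll_seconds: float = 30.0,
    max_attempts: int = 60,
) -> str:
    """Poll an endpoint status until it is InService, fails, or attempts run out."""
    for attempt in range(max_attempts):
        status = get_status()
        if status == "InService":
            return status
        if status == "Failed":
            raise RuntimeError("Endpoint creation failed; check the CloudWatch logs")
        if attempt < max_attempts - 1:
            time.sleep(poll_seconds)
    raise TimeoutError(f"Endpoint still not InService after {max_attempts} checks")
```

In this notebook it could be called as `wait_until_in_service(lambda: sm_client.describe_endpoint(EndpointName=endpoint_name)["EndpointStatus"])`.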

#### **Step 4:** Send test inference requests to deployed endpoint

- Open [abalone_test_predictions.csv](../data/abalone_test_predictions.csv), read each line, and do the following:
  - Ignore the header row
  - Use each remaining row as a CSV record to form the payload
  - Send the payload to the endpoint by calling `invoke_endpoint` with the SageMaker runtime client

In [None]:
from itertools import islice
from time import sleep

runtime_sm_client = boto3.client("sagemaker-runtime")

LOCALDIR = "../data"
local_test_dataset = f"{LOCALDIR}/abalone_test_predictions.csv"

limit = 100  # maximum number of records to send

with open(local_test_dataset, "r") as _f:
    next(_f, None)  # skip the header row
    # islice stops reading after `limit` rows instead of scanning the whole file
    for row in islice(_f, limit):
        payload = row.strip()
        if not payload:
            continue  # ignore blank lines
        try:
            prediction = runtime_sm_client.invoke_endpoint(
                EndpointName=endpoint_name,
                ContentType="text/csv; charset=utf-8",
                Body=payload,
            )
            print(prediction["Body"].read().decode("utf-8"))
            sleep(0.15)  # brief pause between requests
        except Exception as e:
            print(f"Prediction error: {e}")

### View logs emitted by the endpoint in CloudWatch

In [None]:
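from datetime import datetime, timedelta, timezone

logs_client = boto3.client("logs")
# Use a timezone-aware UTC datetime; calling .timestamp() on a naive utcnow() value
# would be interpreted as local time and shift the query window
end_time = datetime.now(timezone.utc)
start_time = end_time - timedelta(minutes=15)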

log_group_name = f"/aws/sagemaker/Endpoints/{endpoint_name}"
log_streams = logs_client.describe_log_streams(logGroupName=log_group_name)
log_stream_name = log_streams["logStreams"][0]["logStreamName"]

# Retrieve the logs
logs = logs_client.get_log_events(
    logGroupName=log_group_name,
    logStreamName=log_stream_name,
    startTime=int(start_time.timestamp() * 1000),
    endTime=int(end_time.timestamp() * 1000),
)

# Print the logs
for event in logs["events"]:
    print(f"{datetime.fromtimestamp(event['timestamp'] // 1000)}: {event['message']}")
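
A single `get_log_events` call returns only one page of events. To read the whole time window, the call can be repeated with `nextForwardToken` until the token stops changing, as in the sketch below. The helper name `iter_log_events` is an assumption, and the client is passed in as a parameter so the function works with the `logs_client` created above.

```python
from typing import Any, Dict, Iterator


def iter_log_events(
    logs_client: Any,
    log_group_name: str,
    log_stream_name: str,
    start_ms: int,
    end_ms: int,
) -> Iterator[Dict[str, Any]]:
    """Yield every event in the time window, following nextForwardToken."""
    kwargs = {
        "logGroupName": log_group_name,
        "logStreamName": log_stream_name,
        "startTime": start_ms,
        "endTime": end_ms,
        "startFromHead": True,  # read oldest-first so forward tokens advance
    }
    previous_token = None
    while True:
        page = logs_client.get_log_events(**kwargs)
        yield from page.get("events", [])
        token = page.get("nextForwardToken")
        # CloudWatch returns the same token again once the end of the stream is reached.
        if token is None or token == previous_token:
            break
        previous_token = token
        kwargs["nextToken"] = token
```

The start and end values are epoch milliseconds, matching the `startTime` and `endTime` arguments used in the cell above.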

### Cleanup

Clean up resources: delete the endpoint and the model.

In [None]:
# Delete endpoint
try:
    print(f"Deleting endpoint: {endpoint_name}")
    sm_client.delete_endpoint(EndpointName=endpoint_name)
except Exception as e:
    print(f"Error deleting EP: {endpoint_name}\n{e}")

# Delete model
try:
    print(f"Deleting model: {model_name}")
    sm_client.delete_model(ModelName=model_name)
except Exception as e:
    print(f"Error deleting Model: {model_name}\n{e}")