# Hosting a Live Endpoint for Real-Time Inference with SageMaker for Customer Churn

## Environment Setup

- Image: Data Science
- Kernel: Python 3
- Instance type: ml.t3.medium

## Background

This notebook builds on previous notebooks in which we trained a model to predict customer churn (i.e., when a company loses a customer). In this iteration, we deploy the trained model to a live endpoint and then pass in test data to see how well it predicts.

To keep things simple, Experiments have been removed from this version of the notebook.

This notebook has been adapted from the [SageMaker examples](https://github.com/aws/amazon-sagemaker-examples/blob/main/introduction_to_applying_machine_learning/xgboost_customer_churn/xgboost_customer_churn.ipynb).

## Initialize Environment and Variables

In [4]:
# Import libraries
import boto3
import re
import pandas as pd
import numpy as np
import os
import time
import io

import sagemaker
from sagemaker import get_execution_role
from sagemaker.serializers import CSVSerializer  # documented location in SageMaker Python SDK v2
from sagemaker.inputs import TrainingInput

# Get the SageMaker session and the execution role from the SageMaker domain
sess = sagemaker.Session()
role = get_execution_role()

bucket = 'test-sagemaker-examples-1357942113492' # Update with the name of a bucket that is already created in S3
prefix = 'live-inference-demo' # The name of the folder that will be created in the S3 bucket

## Data

For this lesson, data has already been cleaned and split into two local CSV files: **train.csv** (used to train the model) and **validation.csv** (used to validate how well the model does).

We'll take these local files and upload them to S3 so SageMaker can use them.

In [5]:
# Upload the local training and validation files to S3 under the chosen prefix
s3 = boto3.Session().resource('s3')
s3.Bucket(bucket).Object(f'{prefix}/train/train.csv').upload_file('train.csv')
s3.Bucket(bucket).Object(f'{prefix}/validation/validation.csv').upload_file('validation.csv')

## Train

We trained the model in previous lessons, but to make it easier to follow along with this notebook, we'll do that again here.

In this section, we need to specify three things: where our training data is, the path to the algorithm container stored in the Elastic Container Registry, and the algorithm to use (along with hyperparameters).

The training job (the Estimator) takes in several hyperparameters. More information can be found in the [XGBoost hyperparameters documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/xgboost_hyperparameters.html).

In [6]:
# The location of our training and validation data in S3
s3_input_train = TrainingInput(
    s3_data='s3://{}/{}/train'.format(bucket, prefix), content_type='csv'
)
s3_input_validation = TrainingInput(
    s3_data='s3://{}/{}/validation/'.format(bucket, prefix), content_type='csv'
)

In [7]:
# The location of the XGBoost container version 1.5-1 (an AWS-managed container)
container = sagemaker.image_uris.retrieve('xgboost', sess.boto_region_name, '1.5-1')

In [8]:
# Initialize hyperparameters
hyperparameters = {
                    'max_depth':'5',
                    'eta':'0.2',
                    'gamma':'4',
                    'min_child_weight':'6',
                    'subsample':'0.8',
                    'objective':'binary:logistic',
                    'eval_metric':'error',
                    'num_round':'100'}

# Output path where the trained model will be saved
output_path = 's3://{}/{}/output'.format(bucket, prefix)

# Set up the Estimator, which defines the training job
xgb = sagemaker.estimator.Estimator(image_uri=container, 
                                    hyperparameters=hyperparameters,
                                    role=role,
                                    instance_count=1, 
                                    instance_type='ml.m4.xlarge', 
                                    output_path=output_path,
                                    sagemaker_session=sess)
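
The hyperparameters above set 'eval_metric' to 'error', which XGBoost defines as the binary classification error rate: the share of rows whose predicted probability, cut at 0.5, disagrees with the true label. The following is a minimal sketch of that calculation in plain Python. The function name and the sample scores and labels are hypothetical and only illustrate the arithmetic; they are not taken from the churn data.

```python
def classification_error(probabilities, labels, threshold=0.5):
    """Fraction of rows whose thresholded prediction disagrees with the label."""
    if len(probabilities) != len(labels):
        raise ValueError("probabilities and labels must have the same length")
    wrong = sum(
        1
        for p, y in zip(probabilities, labels)
        # A probability strictly above the threshold counts as the positive class
        if (1 if p > threshold else 0) != y
    )
    return wrong / len(labels)


# Hypothetical scores and labels, for illustration only
example_probs = [0.9, 0.2, 0.6, 0.1]
example_labels = [1, 0, 0, 0]

example_error = classification_error(example_probs, example_labels)
print(example_error)  # 0.25 (one of the four rows is misclassified)
```

A lower value is better: 0.0 means every row was classified correctly at the 0.5 cutoff.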

In [9]:
# "fit" executes the training job
xgb.fit({'train': s3_input_train, 'validation': s3_input_validation}) 

INFO:sagemaker:Creating training-job with name: sagemaker-xgboost-2023-05-15-09-02-17-906


2023-05-15 09:02:18 Starting - Starting the training job...
2023-05-15 09:02:43 Starting - Preparing the instances for training......
2023-05-15 09:03:51 Downloading - Downloading input data...
2023-05-15 09:04:21 Training - Downloading the training image......
2023-05-15 09:05:27 Uploading - Uploading generated training model
[2023-05-15 09:05:18.013 ip-10-0-179-110.eu-west-1.compute.internal:7 INFO utils.py:28] RULE_JOB_STOP_SIGNAL_FILENAME: None
[2023-05-15 09:05:18.096 ip-10-0-179-110.eu-west-1.compute.internal:7 INFO profiler_config_parser.py:111] User has disabled profiler.
[2023-05-15:09:05:18:INFO] Imported framework sagemaker_xgboost_container.training
[2023-05-15:09:05:18:INFO] Failed to parse hyperparameter eval_metric value error to Json.
Returning the value itself
[2023-05-15:09:05:18:INFO] Failed to parse hyperparameter objective value binary:logistic to Json.
Returning the value itself
[2023-05-15:09:05:1

## Host

Now that we've trained the model, let's deploy it to an endpoint so we can send data to it for live predictions. We can do that with a single line: the call to "deploy" creates the model, the endpoint configuration, and the endpoint in one step.

In [10]:
# Deploy the trained model to a real-time endpoint backed by one ml.m4.xlarge instance
xgb_predictor = xgb.deploy(initial_instance_count=1, instance_type='ml.m4.xlarge')

INFO:sagemaker:Creating model with name: sagemaker-xgboost-2023-05-15-09-06-35-229
INFO:sagemaker:Creating endpoint-config with name sagemaker-xgboost-2023-05-15-09-06-35-229
INFO:sagemaker:Creating endpoint with name sagemaker-xgboost-2023-05-15-09-06-35-229


------!
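
The log lines above show three resources being created under the same name: a model, an endpoint configuration, and an endpoint. As a rough sketch of what that single call bundles, the function below issues the three corresponding SageMaker API calls explicitly. The function name, its parameters, and the "AllTraffic" variant name are assumptions made for illustration; `sm_client` is assumed to behave like `boto3.client('sagemaker')`, and this is not necessarily how the SDK implements "deploy" internally.

```python
def deploy_step_by_step(sm_client, name, image_uri, model_data_url, role_arn,
                        instance_type="ml.m4.xlarge", instance_count=1):
    """Create a model, an endpoint configuration, and an endpoint, in that order.

    sm_client is assumed to expose the same methods as boto3.client("sagemaker").
    """
    # 1. Model: pairs the inference container image with the trained artifact in S3
    sm_client.create_model(
        ModelName=name,
        PrimaryContainer={"Image": image_uri, "ModelDataUrl": model_data_url},
        ExecutionRoleArn=role_arn,
    )
    # 2. Endpoint configuration: which model to host, on what hardware, how many instances
    sm_client.create_endpoint_config(
        EndpointConfigName=name,
        ProductionVariants=[{
            "VariantName": "AllTraffic",  # assumed variant name, for illustration
            "ModelName": name,
            "InitialInstanceCount": instance_count,
            "InstanceType": instance_type,
        }],
    )
    # 3. Endpoint: provisions the instances described by the configuration
    sm_client.create_endpoint(EndpointName=name, EndpointConfigName=name)
    return name


# Usage (needs AWS credentials; the endpoint name is a placeholder):
# import boto3
# deploy_step_by_step(boto3.client("sagemaker"), "churn-demo",
#                     container, xgb.model_data, role)
```

Endpoint creation is asynchronous, so a caller taking this route would still need to wait for the endpoint to reach the InService state before sending requests.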

## Make Predictions

Now that our endpoint is live, let's pass in test data to get predictions.
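
For CSV requests, the endpoint is sent plain text with one row per line and no header row or index column. The sketch below shows what such a payload looks like for a tiny, made-up DataFrame. The column names and values are hypothetical; the real test.csv columns are not shown in this notebook.

```python
import io

import pandas as pd

# Hypothetical feature rows, for illustration only
example_df = pd.DataFrame({"feature_a": [1.5, 2.0], "feature_b": [0, 1]})

# Serialize to CSV text in memory, dropping the header row and the index column
buffer = io.StringIO()
example_df.to_csv(buffer, header=False, index=False)
example_payload = buffer.getvalue()

print(example_payload)
# 1.5,0
# 2.0,1
```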

In [11]:
# Read test data into a dataframe and transform it into a CSV that can be passed to the endpoint
payload = pd.read_csv('test.csv')
payload_file = io.StringIO()
payload.to_csv(payload_file, header=False, index=False)
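
Because the model was trained with the 'binary:logistic' objective, the endpoint call that follows returns one probability per input row rather than a class label. A minimal sketch of turning that text into churn / no-churn labels is shown here. The helper name and the 0.5 cutoff are assumptions; the three sample values are taken from this notebook's printed response.

```python
import re


def parse_predictions(body_text, threshold=0.5):
    """Turn the endpoint's text response into (probability, label) pairs."""
    # Accept values separated by newlines, spaces, or commas
    tokens = [t for t in re.split(r"[,\s]+", body_text.strip()) if t]
    return [(float(t), int(float(t) > threshold)) for t in tokens]


# Three probabilities from the notebook's printed response
sample_body = "0.1684480607509613\n0.6651358604431152\n0.9036388993263245\n"

for prob, label in parse_predictions(sample_body):
    print(f"{prob:.3f} -> {label}")
# 0.168 -> 0
# 0.665 -> 1
# 0.904 -> 1
```

The cutoff is a choice rather than a fixed rule: lowering it flags more customers as likely to churn, at the cost of more false alarms.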

In [12]:
# Create a low-level client for the SageMaker Runtime
sagemaker_runtime = boto3.client('sagemaker-runtime')

# Client applications use this API to get inferences from the hosted model, by calling invoke_endpoint
# We pass in the test data/payload file we read in earlier
response = sagemaker_runtime.invoke_endpoint(
                            EndpointName=xgb_predictor.endpoint_name, 
                            Body=payload_file.getvalue(),
                            ContentType = 'text/csv'
)

# Print the response body, which contains the predictions
print(response['Body'].read().decode('utf-8'))

0.1684480607509613
0.21427156031131744
0.06330718100070953
0.02791607193648815
0.014169521629810333
0.005713687278330326
0.10534518957138062
0.025899196043610573
0.6651358604431152
0.03172963857650757
0.09675747901201248
0.35873568058013916
0.023574918508529663
0.006310072727501392
0.01659534126520157
0.9036388993263245
0.5863341093063354
0.40476715564727783
0.02699803188443184
0.051476720720529556
0.8226211071014404
0.050476640462875366
0.6812863945960999
0.014689727686345577
0.010139471851289272
0.016720956191420555
0.00886827427893877
0.013370179571211338
0.8598929047584534
0.024568557739257812
0.022816648706793785
0.011015648953616619
0.15493033826351166
0.21677029132843018
0.007811678573489189
0.008974204771220684
0.8475439548492432
0.03913778066635132
0.03130830079317093
0.014615676365792751
0.020784741267561913
0.022972693666815758
0.014104612171649933
0.015749145299196243
0.008738397620618343
0.65254807472229
0.4511687755584717
0.9566231369972229
0.024895407259464264
0.04089356