# Running Real-Time Predictions with a SageMaker Hosted Model Endpoint
This notebook uses the Linear Learner algorithm on the MNIST dataset to predict whether a handwritten digit is a 0 or not.

This example is based on the following AWS sample notebook:
Linear Learner MNIST Example


## Introduction

The MNIST dataset comprises images of handwritten digits ranging from zero to nine. Each 28 x 28 grayscale image is represented by individual pixel values, which will be used to predict a binary label: whether the digit is a 0 or any other digit (1, 2, 3, ... 9).

We will use the Linear Learner algorithm to perform binary classification. The predicted_label will be either 1 or 0: 1 indicates the image is predicted to be a 0, while 0 indicates the image is predicted to be a digit other than 0.
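
As a minimal sketch of this labeling convention, the snippet below maps a handful of digit labels to the binary target. The `digits` array is made up for illustration and stands in for the real MNIST labels.

```python
import numpy as np

# Hypothetical digit labels, standing in for the MNIST label array
digits = np.array([0, 7, 3, 0, 9])

# 1 where the digit is a 0, otherwise 0
binary_labels = np.where(digits == 0, 1, 0)

print(binary_labels.tolist())  # [1, 0, 0, 1, 0]
```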

## Prerequisites and Preprocessing

This notebook is designed to be used with SageMaker Studio's JupyterLab.

Before proceeding, make sure to specify the following:

- The S3 bucket and prefix where the training and model data will be stored.
- The IAM role ARN that grants permissions for training and hosting to access your data.

In [None]:
import boto3
import sagemaker
from sagemaker import get_execution_role

# S3 bucket and key prefix for the training data and model artifacts
bucket = sagemaker.Session().default_bucket()
prefix = "sagemaker/DEMO-linear-mnist"

# IAM role that SageMaker assumes for training and hosting
role = get_execution_role()

### Data Ingestion
Download the dataset from the AWS-hosted SageMaker example files bucket in S3, save it locally, and load it into memory for preprocessing before training. Since the dataset is small, it can be handled entirely in memory.

In [None]:
%%time
import gzip
import pickle

fobj = (
    boto3.client("s3")
    .get_object(
        Bucket=f"sagemaker-example-files-prod-{boto3.session.Session().region_name}",
        Key="datasets/image/MNIST/mnist.pkl.gz",
    )["Body"]
    .read()
)

with open("mnist.pkl.gz", "wb") as f:
    f.write(fobj)

# Load the dataset
with gzip.open("mnist.pkl.gz", "rb") as f:
    train_set, valid_set, test_set = pickle.load(f, encoding="latin1")
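
The archive is expected to unpickle into three `(features, labels)` pairs. The sketch below checks that structure on a tiny synthetic stand-in rather than the real file; the helper name `describe_split` and the array sizes are made up for illustration.

```python
import gzip
import io
import pickle

import numpy as np


def describe_split(split):
    """Return (n_examples, n_features) for one (features, labels) pair."""
    features, labels = split
    assert len(features) == len(labels), "features and labels must align"
    return features.shape


# Synthetic stand-in for mnist.pkl.gz: three tiny (features, labels) pairs
rng = np.random.default_rng(0)
fake_splits = tuple(
    (rng.random((n, 784), dtype=np.float32), rng.integers(0, 10, size=n))
    for n in (5, 2, 2)
)

# Round-trip through gzip + pickle in memory, mirroring the load step above
buffer = io.BytesIO()
with gzip.GzipFile(fileobj=buffer, mode="wb") as f:
    pickle.dump(fake_splits, f)
buffer.seek(0)
with gzip.GzipFile(fileobj=buffer, mode="rb") as f:
    train, valid, test = pickle.load(f)

print(describe_split(train), describe_split(valid), describe_split(test))
```

On the real data, calling `describe_split(train_set)` would report how many images the training split holds and confirm that each one has 784 features.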

### Inspecting the Data
After importing the dataset, we can examine one of the digits included in the dataset.

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt

plt.rcParams["figure.figsize"] = (2, 10)


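def show_digit(img, caption="", subplot=None):
    """Display one flattened 784-pixel image as a 28 x 28 grayscale picture."""
    if subplot is None:
        _, subplot = plt.subplots(1, 1)
    imgr = img.reshape((28, 28))
    subplot.axis("off")
    subplot.imshow(imgr, cmap="gray")
    # Set the title on the axis that was drawn on, not on the current axis
    subplot.set_title(caption)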


show_digit(train_set[0][30], "This is a {}".format(train_set[1][30]))

### Convert the Data to RecordIO-Wrapped Protobuf Format
Amazon SageMaker's Linear Learner algorithm supports data in either RecordIO-wrapped protobuf or CSV format. Therefore, we need to transform the data into a supported format for the algorithm to process it.

The code below converts the NumPy feature and label arrays into the RecordIO-wrapped protobuf format.

In [None]:
import io
import numpy as np
import sagemaker.amazon.common as smac

# Features as float32; labels become 1 for the digit 0 and 0 for every other digit
vectors = train_set[0].astype("float32")
labels = np.where(train_set[1] == 0, 1, 0).astype("float32")

buf = io.BytesIO()
smac.write_numpy_to_dense_tensor(buf, vectors, labels)
buf.seek(0)
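
For comparison, here is a sketch of the CSV alternative mentioned above. It assumes the Linear Learner convention for CSV training data, with no header row and the label in the first column. The tiny arrays are synthetic, and `to_training_csv` is a hypothetical helper, not part of the SageMaker SDK.

```python
import io

import numpy as np


def to_training_csv(features, labels):
    """Serialize features and labels as header-less CSV, label first."""
    table = np.column_stack([labels, features])
    out = io.StringIO()
    np.savetxt(out, table, delimiter=",", fmt="%g")
    return out.getvalue()


# Synthetic example: two records with three features each
features = np.array([[0.0, 0.5, 1.0], [0.25, 0.0, 0.75]], dtype="float32")
labels = np.array([1, 0], dtype="float32")

csv_text = to_training_csv(features, labels)
print(csv_text)
```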

## Upload Training Data
With the RecordIO-wrapped protobuf now created, the next step is to upload it to S3, allowing Amazon SageMaker to access and use it for training.

In [None]:
import boto3
import os

key = "recordio-pb-data"
boto3.resource("s3").Bucket(bucket).Object(os.path.join(prefix, "train", key)).upload_fileobj(buf)
s3_train_data = "s3://{}/{}/train/{}".format(bucket, prefix, key)
print("uploaded training data location: {}".format(s3_train_data))

Set up an S3 output location for the model artifact produced by training.

In [None]:
output_location = "s3://{}/{}/output".format(bucket, prefix)
print("training artifacts will be uploaded to: {}".format(output_location))

In [None]:
from sagemaker.image_uris import retrieve

# Look up the Linear Learner container image for the current region
container = retrieve("linear-learner", boto3.Session().region_name)

Start the training job with the following hyperparameters:

- `feature_dim` is set to 784, the number of pixels in each 28 x 28 image.
- `predictor_type` is set to `binary_classifier`, since the objective is to predict whether the image depicts a 0 or not.
- `mini_batch_size` is set to 200.

In [None]:
import boto3

sess = sagemaker.Session()

linear = sagemaker.estimator.Estimator(
    container,
    role,
    instance_count=1,
    instance_type="ml.m5.large",
    output_path=output_location,
    sagemaker_session=sess,
)
linear.set_hyperparameters(feature_dim=784, predictor_type="binary_classifier", mini_batch_size=200)

linear.fit({"train": s3_train_data})
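
The arithmetic behind the `feature_dim` and `mini_batch_size` values can be checked directly. The training-set size of 50,000 used below is an assumption about this copy of MNIST, not something stated in the notebook; substitute `len(train_set[0])` for the real count.

```python
import math

image_height, image_width = 28, 28
feature_dim = image_height * image_width  # one feature per pixel

mini_batch_size = 200
num_train_examples = 50_000  # assumed size; use len(train_set[0]) in practice

# Number of mini-batches needed to pass over the training set once
batches_per_epoch = math.ceil(num_train_examples / mini_batch_size)

print(feature_dim, batches_per_epoch)  # 784 250
```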

## Set Up a Model Endpoint
After completing the training, we can deploy the model to a SageMaker real-time hosted endpoint, allowing for dynamic generation of predictions (inference).

With the deploy API, we can define key parameters such as the initial number of instances, the instance type, and how requests are serialized and responses deserialized. In this configuration, requests are serialized as CSV and the model's responses are deserialized from JSON.

In [None]:
from sagemaker.serializers import CSVSerializer
from sagemaker.deserializers import JSONDeserializer

linear_predictor = linear.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.large",
    serializer=CSVSerializer(),
    deserializer=JSONDeserializer(),
)
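
To make the request format concrete, the sketch below builds a CSV payload by hand for one record. It only approximates what a CSV serializer sends (the SDK's exact number formatting may differ), and `to_csv_payload` is a hypothetical helper, not part of the SageMaker SDK.

```python
import numpy as np


def to_csv_payload(records):
    """Join each row's values with commas, one line per record."""
    rows = np.atleast_2d(records)
    return "\n".join(",".join(str(float(v)) for v in row) for row in rows)


# Synthetic stand-in for one flattened image (4 pixels instead of 784)
record = np.array([[0.0, 0.5, 1.0, 0.25]], dtype="float32")

payload = to_csv_payload(record)
print(payload)  # 0.0,0.5,1.0,0.25
```

Unlike CSV training data, an inference payload carries only the features, with no label column.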

## Validate the Model for Usage
At this stage, we can validate the model and test its functionality. By sending HTTP POST requests to the endpoint, we can receive predictions. To simplify this process, we will use the Amazon SageMaker Python SDK, which allows us to define how to serialize the input requests and deserialize the output responses specific to the algorithm.

Now let's try getting a prediction for a single record.

In [None]:
result = linear_predictor.predict(train_set[0][30:31])
print(result)

If everything works, the endpoint returns a `predicted_label` of either `0` or `1`: `1` means the image is predicted to be a 0, while `0` means it is predicted to be some other digit.

It also returns a `score`, a single floating point number indicating how strongly the algorithm believes the label should be `1`.
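
A sketch of reading these fields from a response follows. The dictionary mimics the JSON shape that Linear Learner returns for a binary classifier (a `predictions` list with one entry per record); the scores here are made-up values, not real model output.

```python
# Made-up response in the shape returned for binary classification
sample_result = {
    "predictions": [
        {"score": 0.98, "predicted_label": 1},
        {"score": 0.03, "predicted_label": 0},
    ]
}

for i, prediction in enumerate(sample_result["predictions"]):
    label = int(prediction["predicted_label"])
    verdict = "a 0" if label == 1 else "not a 0"
    print(f"record {i}: {verdict} (score={prediction['score']:.2f})")

# Collect just the labels, e.g. for comparing against known answers
predicted_labels = [int(p["predicted_label"]) for p in sample_result["predictions"]]
```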

### Clean Up - Remove the Endpoint
The `delete_endpoint` call in the code below deletes the hosted endpoint so that it stops incurring charges.
Additionally, consider deleting the training data and model artifacts that this notebook wrote to S3 once they are no longer needed.

In [None]:
sagemaker.Session().delete_endpoint(linear_predictor.endpoint_name)
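
For the S3 side of the cleanup, one option is to delete only the objects under this notebook's prefix rather than a whole bucket, since the default bucket may also hold data from other projects. The sketch below assumes a boto3 `Bucket` resource; `delete_prefix` is a hypothetical helper, and the call is left commented out because it is destructive.

```python
def delete_prefix(bucket_resource, prefix):
    """Delete every object in the bucket whose key starts with prefix."""
    # objects.filter(...).delete() removes all matching objects
    return bucket_resource.objects.filter(Prefix=prefix).delete()


# Usage with the names defined earlier in this notebook (destructive):
# import boto3
# delete_prefix(boto3.resource("s3").Bucket(bucket), prefix)
```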