## Deploy Scikit-Learn average calculation logic on SageMaker Endpoint 

In [None]:
import datetime
import time
import tarfile
from pathlib import Path
import os
import boto3
import pandas as pd
import numpy as np
from sagemaker import get_execution_role
import sagemaker
from sagemaker.sklearn import SKLearn, SKLearnModel


sm_boto3 = boto3.client("sagemaker")

sess = sagemaker.Session()

region = sess.boto_session.region_name

role = get_execution_role()

bucket = sess.default_bucket()  # this could also be a hard-coded bucket name
prefix = "scikit_learn_average_calc"

print("Using bucket " + bucket)

## Create dummy model file

SageMaker expects a `model.tar.gz` archive with a model inside, for example a `pickle` file. In our case, we will just put an empty file in it.

In [None]:
dummy_model_file = Path("dummy.model")
dummy_model_file.touch()

with tarfile.open("model.tar.gz", "w:gz") as tar:
    tar.add(dummy_model_file.as_posix())
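
For comparison, here is a minimal sketch of how a real pickled object could be packaged the same way. The file name `model.pkl`, the helper `package_model`, and the dictionary standing in for a model are all hypothetical; this notebook itself only needs the empty placeholder file.

```python
import pickle
import tarfile
import tempfile
from pathlib import Path


def package_model(model_obj, work_dir):
    """Pickle model_obj and wrap it in a model.tar.gz archive inside work_dir."""
    work_dir = Path(work_dir)
    model_path = work_dir / "model.pkl"  # hypothetical file name
    with open(model_path, "wb") as f:
        pickle.dump(model_obj, f)

    archive_path = work_dir / "model.tar.gz"
    with tarfile.open(str(archive_path), "w:gz") as tar:
        # arcname keeps the file at the archive root instead of embedding the temp path
        tar.add(str(model_path), arcname=model_path.name)
    return archive_path


# A plain dict stands in for a trained model (assumption for illustration only)
with tempfile.TemporaryDirectory() as tmp:
    archive = package_model({"weights": [0.5, 0.5]}, tmp)
    with tarfile.open(str(archive), "r:gz") as tar:
        members = tar.getnames()

print(members)
```

Listing the archive members is a quick way to confirm the file sits at the root of the archive before uploading it.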

## Upload Model file to S3

SageMaker expects to read `model.tar.gz` from S3, so we will upload it there.

In [None]:
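# Build the S3 key with "/" explicitly; os.path.join would use "\" on Windows
key = f"{prefix}/model.tar.gz"

# Open the archive in a context manager so the file handle is closed after the upload
with open("model.tar.gz", "rb") as f_obj:
    boto3.Session().resource("s3").Bucket(bucket).Object(key).upload_fileobj(f_obj)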

## Set up hosting for the model

This involves creating a SageMaker model from the dummy model file previously uploaded to S3.

In [None]:
model_url = "s3://{}/{}".format(bucket, key)
model_url

### Entry Point for the Inference Image

`SKLearnModel` pulls the model artifacts that `model_data` points to, then decompresses and saves them in
the Docker image it defines.

Because the deployed endpoint is accessed through RESTful API calls, you also need to tell it how to parse an
incoming request before passing it to your model.

These two instructions need to be defined as two functions in the Python file that `entry_point` points to.

By convention, we name this entry point file `inference.py` and put it in the `code` directory.

To tell the inference image how to load the model checkpoint, you need to implement a function called
`model_fn`. This function takes one positional argument, `model_dir`.

### Predicting Functions

* model_fn(model_dir) - loads your model.
* input_fn(serialized_input_data, content_type) - deserializes the request payload and passes the result to predict_fn.
* output_fn(prediction_output, accept) - serializes predictions from predict_fn.
* predict_fn(input_data, model) - calls a model on data deserialized in input_fn.

`model_fn()` is the only function that doesn't have a default implementation, so you must provide it to use Scikit-learn on SageMaker.
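
To make the roles of these functions concrete, here is a minimal sketch of what an averaging entry point could look like. It is not the actual `code/inference.py` of this notebook: the function bodies and the returned shape are assumptions, and `output_fn` is deliberately left to the container's default implementation.

```python
import json
import os
from io import BytesIO

import numpy as np


def model_fn(model_dir):
    """Load the model. The artifact is an empty placeholder, so only its path is returned."""
    return os.path.join(model_dir, "dummy.model")


def input_fn(serialized_input_data, content_type):
    """Deserialize the request payload into a NumPy array."""
    if content_type == "application/x-npy":
        return np.load(BytesIO(serialized_input_data), allow_pickle=False)
    if content_type == "application/json":
        return np.asarray(json.loads(serialized_input_data), dtype=float)
    raise ValueError(f"Unsupported content type: {content_type}")


def predict_fn(input_data, model):
    """Ignore the placeholder model and return the mean as a one-element array."""
    return np.array([np.mean(input_data)])


# Local check without any SageMaker infrastructure
buffer = BytesIO()
np.save(buffer, np.array([1, 2, 3, 4, 5]))
data = input_fn(buffer.getvalue(), "application/x-npy")
result = predict_fn(data, model_fn("/tmp"))
print(result)
```

Because the functions are plain Python, they can be exercised locally like this before deploying, which is usually faster than debugging through endpoint logs.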


### Construct a script for inference
Here is the full code that does model inference.

In [None]:
!pygmentize code/inference.py

## Define the SKLearnModel Object

`SKLearnModel` is a Scikit-learn SageMaker Model that can be deployed to a SageMaker Endpoint.

In [None]:
model = SKLearnModel(
        role=role,
        model_data=model_url,
        framework_version='0.23-1',
        py_version='py3',
        source_dir='code',
        entry_point='inference.py'
    )

## Deploy to SageMaker Endpoint

In [None]:
predictor = model.deploy(
        initial_instance_count=1,
        instance_type='ml.m5.large',
    )

## Invoke SageMaker Endpoint

### Invoke with the Python SDK

In [None]:
inputs = np.array([1, 2, 3, 4, 5])

In [None]:
predictions = predictor.predict(inputs)

In [None]:
print(predictions)

### Alternative: invoke with `boto3`

In [None]:
runtime = boto3.client("sagemaker-runtime")

In [None]:
# npy serialization
from io import BytesIO


# Serialise numpy ndarray as bytes
buffer = BytesIO()
# inputs is the NumPy array defined above
np.save(buffer, inputs)

response = runtime.invoke_endpoint(
    EndpointName=predictor.endpoint_name, Body=buffer.getvalue(), ContentType="application/x-npy"
)

print(response["Body"].read().decode())
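
The `application/x-npy` payload used above is simply the output of `np.save` written to an in-memory buffer. The sketch below shows that round trip locally; the helper names `to_npy_bytes` and `from_npy_bytes` are hypothetical. Decoding a response this way only applies if the endpoint actually returns npy bytes, which depends on the `Accept` header and on the `output_fn` in use.

```python
from io import BytesIO

import numpy as np


def to_npy_bytes(array):
    """Serialize a NumPy array to npy-format bytes."""
    buffer = BytesIO()
    np.save(buffer, array)
    return buffer.getvalue()


def from_npy_bytes(payload):
    """Deserialize npy-format bytes back into a NumPy array."""
    return np.load(BytesIO(payload), allow_pickle=False)


payload = to_npy_bytes(np.array([1, 2, 3, 4, 5]))
restored = from_npy_bytes(payload)
print(restored.tolist())
```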

## Clean up

Endpoints should be deleted when no longer in use, since (per the [SageMaker pricing page](https://aws.amazon.com/sagemaker/pricing/)) they're billed by time deployed.

In [None]:
predictor.delete_endpoint()