## Deploy the Utilities Demand Response scikit-learn model on SageMaker
The Utilities Demand Response model predicts how likely a customer is to enroll in the Demand Response Program. The model is built from customer demographics, retail details, and historical energy usage. We trained it with scikit-learn in a Jupyter notebook on IBM Cloud Pak for Data.

This notebook shows how to use a pre-trained scikit-learn model with the Amazon SageMaker scikit-learn container to quickly create a hosted endpoint for that model.

To run or test this notebook, the libraries below are required. Uncomment the cell below if they are not already installed.

In [2]:
#!pip install sagemaker
#!pip install pandas

In [3]:
import os
import boto3
import re
import json
import pandas as pd
import numpy as np
import sagemaker
from sagemaker import get_execution_role
import time


import subprocess
from time import gmtime, strftime

import warnings
warnings.filterwarnings('ignore')
warnings.simplefilter('ignore')

#### Set up the SageMaker environment
We can set up the SageMaker environment using `boto3` and the `sagemaker` SDK. The cell below sets the AWS region, the IAM role ARN that grants training and hosting access to the data, and the S3 bucket to use for training and model data.

In [4]:


region = boto3.Session().region_name

role = get_execution_role()

bucket = sagemaker.Session().default_bucket()
prefix = "sagemaker/DEMO-AutoAIsklearn-byo-model"
s3 = boto3.Session().resource('s3')

client = boto3.client(service_name="sagemaker")

print(f"bucket: {bucket}")

bucket: sagemaker-us-east-1-849589503910


In [5]:
## Uncomment this cell to test the model locally
#df_score=pd.read_csv('DemandResponseHoldout.csv')

#!pip install scikit-learn==1.0.1
#
#import sklearn
#import joblib
#

#scikit_model=joblib.load(os.path.join("scikit_model.pkl"))

# Fetch the columns from the transformer

#cols=[]
#for i in range(0,3):
#    columns=scikit_model.named_steps['preprocessor'].transformers_[i][2]
#    cols = cols+columns
#
#scikit_model.predict_proba(df_score[cols].head(5))

## Package pre-trained model contents and upload the file to S3

The cell below packages the model contents into a tar file and uploads it to the S3 bucket. The model contents include the following files.

1. **scikit_model.pkl**: scikit-learn model pickle file exported from the Watson Studio project.<br>
2. **inference.py**: An entry point script that contains the following functions. <br>
`input_fn()` - Takes the request data and deserializes it into an object for prediction.  <br>
`output_fn()` - Takes the prediction result and serializes it according to the response content type.  <br>
`predict_fn()` - Takes the deserialized request object and performs inference against the loaded model.  <br>
`model_fn()` - Loads the model.  <br>
3. **requirements.txt**: Lists the additional Python dependencies required to run the inference script.

Additional information on model loading and model serving for scikit-learn on SageMaker can be found in the [SageMaker Scikit-learn Model Server documentation](https://sagemaker.readthedocs.io/en/stable/frameworks/sklearn/using_sklearn.html#deploy-a-scikit-learn-model).
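
As a rough illustration of how the four functions fit together, here is a minimal sketch of an `inference.py`. The actual script is not shown in this notebook, so the details are assumptions: the `Input` and `Headers` request keys and the `Output` response key are inferred from the scoring cell at the end of the notebook, and `predict_proba` is assumed because the endpoint returns class probabilities. The function signatures follow the SageMaker scikit-learn serving conventions described in the linked documentation, which is also the place to check how the container expects responses to be returned.

```python
import json
import os


def model_fn(model_dir):
    """Load the pickled scikit-learn pipeline from the extracted model archive."""
    import joblib  # imported lazily; only needed when the container loads the model

    return joblib.load(os.path.join(model_dir, "scikit_model.pkl"))


def input_fn(request_body, request_content_type):
    """Deserialize a JSON request into a DataFrame (payload layout is assumed)."""
    import pandas as pd  # imported lazily; only needed when a request arrives

    if request_content_type != "application/json":
        raise ValueError(f"Unsupported content type: {request_content_type}")
    data = json.loads(request_body)
    # Column names let the pipeline's preprocessor select features by name
    return pd.DataFrame(data["Input"], columns=data["Headers"])


def predict_fn(input_data, model):
    """Return class probabilities as plain Python lists."""
    return model.predict_proba(input_data).tolist()


def output_fn(prediction, accept):
    """Serialize the prediction as JSON under the assumed 'Output' key."""
    return json.dumps({"Output": prediction})
```

With this layout, a request body of `{"Input": [[...]], "Headers": [...]}` yields a response of the form `{"Output": [[p0, p1], ...]}`.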

In [6]:

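# Package the model, the inference script, and the requirements file into one archive
bashCommand = "tar -cvpzf WS_scikit_model.tar.gz scikit_model.pkl inference.py requirements.txt"
# check=True raises CalledProcessError if tar fails (for example, when a file is missing)
subprocess.run(bashCommand.split(), check=True, stdout=subprocess.PIPE)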

#Upload tar.gz to bucket
model_artifacts = f"s3://{bucket}/WS_scikit_model.tar.gz"
response = s3.meta.client.upload_file('WS_scikit_model.tar.gz', bucket, 'WS_scikit_model.tar.gz')
response
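
If the `tar` command is not available (for example on Windows), a similar archive can be built with Python's standard `tarfile` module. This is a sketch of an alternative, not what the cell above runs; the helper name `build_model_archive` is hypothetical.

```python
import os
import tarfile


def build_model_archive(archive_path, files):
    """Create a gzip-compressed tar archive with each file at the archive root."""
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in files:
            # arcname drops any directory prefix so the files sit at the top level
            tar.add(path, arcname=os.path.basename(path))
    return archive_path
```

For this notebook the call would be `build_model_archive("WS_scikit_model.tar.gz", ["scikit_model.pkl", "inference.py", "requirements.txt"])`.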

## Deploy with Python SDK
Once the model is packaged and uploaded to the S3 bucket, we follow these steps to deploy it.
1. Retrieve the scikit-learn inference image from SageMaker using the SageMaker SDK.
2. Create a SageMaker model from the uploaded model artifacts and the image.
3. Create an endpoint configuration, specifying the instance type and instance count for the endpoint.
4. Create an endpoint from the configuration created in step 3. This step takes a few minutes to complete.


The cell below retrieves the scikit-learn inference image from SageMaker using the SageMaker SDK.

In [7]:
#Step 1: Retrieve the inference image
image_uri = sagemaker.image_uris.retrieve(
    framework="sklearn",
    region=region,
    version="1.0-1",
    py_version="py3",
    instance_type="ml.m5.xlarge",
)

The cell below creates the SageMaker model from the uploaded model artifacts and the image. The model name can be changed through the `model_name` variable.

In [8]:

#Step 2: Model Creation
model_name = "SKLearn-demandresponse" + strftime("%d-%H-%M-%S", gmtime())
print("Model name: " + model_name)
create_model_response = client.create_model(
    ModelName=model_name,
    Containers=[
        {
            "Image": image_uri,
            "Mode": "SingleModel",
            "ModelDataUrl": model_artifacts,
            "Environment": {'SAGEMAKER_SUBMIT_DIRECTORY': model_artifacts,
                           'SAGEMAKER_PROGRAM': 'inference.py'} 
        }
    ],
    ExecutionRoleArn=role,
)
print("Model Arn: " + create_model_response["ModelArn"])

Model name: SKLearn-demandresponse09-11-23-51
Model Arn: arn:aws:sagemaker:us-east-1:849589503910:model/sklearn-demandresponse09-11-23-51


The cell below creates the endpoint configuration, specifying the instance type and instance count for the endpoint.

In [9]:

#Step 3: EPC Creation
sklearn_epc_name = "SKLearn-demandresponse" + strftime("%d-%H-%M-%S", gmtime())
endpoint_config_response = client.create_endpoint_config(
    EndpointConfigName=sklearn_epc_name,
    ProductionVariants=[
        {
            "VariantName": "sklearnvariant",
            "ModelName": model_name,
            "InstanceType": "ml.c5.large",
            "InitialInstanceCount": 1
        },
    ],
)
print("Endpoint Configuration Arn: " + endpoint_config_response["EndpointConfigArn"])

Endpoint Configuration Arn: arn:aws:sagemaker:us-east-1:849589503910:endpoint-config/sklearn-demandresponse09-11-23-54


The cell below uses the configuration created in step 3 to create an endpoint. The endpoint name can be changed through the `endpoint_name` variable. This step takes a few minutes to complete.

In [10]:
#Step 4: EP Creation
endpoint_name = "SKLearn-demandresponse-ep" + strftime("%d-%H-%M-%S", gmtime())
create_endpoint_response = client.create_endpoint(
    EndpointName=endpoint_name,
    EndpointConfigName=sklearn_epc_name,
)
print("Endpoint Arn: " + create_endpoint_response["EndpointArn"])


#Monitor creation
describe_endpoint_response = client.describe_endpoint(EndpointName=endpoint_name)
while describe_endpoint_response["EndpointStatus"] == "Creating":
    describe_endpoint_response = client.describe_endpoint(EndpointName=endpoint_name)
    print(describe_endpoint_response["EndpointStatus"])
    time.sleep(15)
print(describe_endpoint_response)

Endpoint Arn: arn:aws:sagemaker:us-east-1:849589503910:endpoint/sklearn-demandresponse-ep09-11-23-56
Creating
Creating
Creating
Creating
Creating
Creating
Creating
Creating
Creating
InService
{'EndpointName': 'SKLearn-demandresponse-ep09-11-23-56', 'EndpointArn': 'arn:aws:sagemaker:us-east-1:849589503910:endpoint/sklearn-demandresponse-ep09-11-23-56', 'EndpointConfigName': 'SKLearn-demandresponse09-11-23-54', 'ProductionVariants': [{'VariantName': 'sklearnvariant', 'DeployedImages': [{'SpecifiedImage': '683313688378.dkr.ecr.us-east-1.amazonaws.com/sagemaker-scikit-learn:1.0-1-cpu-py3', 'ResolvedImage': '683313688378.dkr.ecr.us-east-1.amazonaws.com/sagemaker-scikit-learn@sha256:e7887de688db0abeb4ae7f0467d18c69c613d3795bc23c425ec0574b19b81350', 'ResolutionTime': datetime.datetime(2022, 11, 9, 11, 23, 57, 295000, tzinfo=tzlocal())}], 'CurrentWeight': 1.0, 'DesiredWeight': 1.0, 'CurrentInstanceCount': 1, 'DesiredInstanceCount': 1}], 'EndpointStatus': 'InService', 'CreationTime': datetime.d
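
The monitoring loop above polls until the status leaves `Creating`, with no upper bound on the wait. A sketch of a bounded variant follows; `wait_for_endpoint` and its parameters are hypothetical names, and `describe` stands for any callable shaped like `client.describe_endpoint`. boto3 also provides a built-in waiter, `client.get_waiter("endpoint_in_service")`, that serves the same purpose.

```python
import time


def wait_for_endpoint(describe, endpoint_name, poll_seconds=15, timeout_seconds=1800):
    """Poll until the endpoint leaves 'Creating' or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = describe(EndpointName=endpoint_name)["EndpointStatus"]
        if status != "Creating":
            return status  # typically 'InService' or 'Failed'
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"{endpoint_name} is still 'Creating' after {timeout_seconds}s"
            )
        time.sleep(poll_seconds)
```

In this notebook it could be called as `status = wait_for_endpoint(client.describe_endpoint, endpoint_name)`.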

### Test the endpoint with sample scoring data

If the endpoint was created successfully, we can test it with sample scoring data. The sample file `DemandResponseHoldout.csv` is provided in the folder for this purpose.

In [11]:
df_score=pd.read_csv('DemandResponseHoldout.csv')

runtime_client = boto3.client('sagemaker-runtime')
content_type = "application/json"
# Send the first five rows along with the column names
payload = json.dumps({
    "Input": df_score.head(5).values.tolist(),
    "Headers": df_score.columns.tolist(),
})



response = runtime_client.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType=content_type,
    Body=payload)
result = json.loads(response['Body'].read().decode())['Output']
print(result)

[[0.9273475228581084, 0.07265247714189171], [0.7857858278439955, 0.21421417215600452], [0.9212587131678012, 0.07874128683219879], [0.9482537505719867, 0.05174624942801344], [0.9032694912568143, 0.0967305087431857]]
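
Each row of the output holds two probabilities that sum to 1, which is consistent with the `predict_proba` call in the local test cell. The sketch below turns such rows into yes/no predictions. It assumes the second column is the probability of enrolling (the column order follows the model's `classes_` attribute, which is not shown here) and uses a 0.5 threshold; both are assumptions to verify against the trained model. The name `label_predictions` is hypothetical.

```python
def label_predictions(probabilities, threshold=0.5):
    """Return (probability_of_second_class, predicted_flag) for each row."""
    return [(row[1], row[1] >= threshold) for row in probabilities]


# First two rows of the endpoint response printed above
result = [
    [0.9273475228581084, 0.07265247714189171],
    [0.7857858278439955, 0.21421417215600452],
]
for prob, flag in label_predictions(result):
    print(f"{prob:.3f} -> {'enroll' if flag else 'no enroll'}")
```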


Uncomment the cell below to delete the endpoint.

In [83]:
#endpoint_name="SKLearn-demandresponse-ep09-10-58-43"
#response = client.delete_endpoint(
#    EndpointName=endpoint_name
#)
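
Deleting the endpoint does not remove the endpoint configuration or the model created in steps 2 and 3. A sketch of a fuller cleanup is below; `delete_deployment` is a hypothetical helper, while `delete_endpoint`, `delete_endpoint_config`, and `delete_model` are methods of the boto3 SageMaker client.

```python
def delete_deployment(client, endpoint_name, endpoint_config_name, model_name):
    """Delete the endpoint, then its configuration, then the model."""
    client.delete_endpoint(EndpointName=endpoint_name)
    client.delete_endpoint_config(EndpointConfigName=endpoint_config_name)
    client.delete_model(ModelName=model_name)
```

With the variables defined earlier in this notebook, the call would be `delete_deployment(client, endpoint_name, sklearn_epc_name, model_name)`.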