# Containerize Auto-AI Image
The command below builds the Docker image and pushes it to Amazon ECR using SageMaker's `sm-docker` command. If `sm-docker` is not installed, uncomment the `pip install` line. When the image has been built successfully, the last line of the output gives the ECR URI of the image.

In [77]:
#!pip install sagemaker-studio-image-build
! sm-docker build -t autoaipython39docker .


https://docs.docker.com/engine/reference/commandline/login/#credentials-store

Login Succeeded

[Container] 2022/11/02 12:25:22 Running command $(aws ecr get-login --no-include-email --region $AWS_DEFAULT_REGION --registry-ids 462105765813)
https://docs.docker.com/engine/reference/commandline/login/#credentials-store

Login Succeeded

[Container] 2022/11/02 12:25:22 Phase complete: PRE_BUILD State: SUCCEEDED
[Container] 2022/11/02 12:25:22 Phase context status code:  Message:
[Container] 2022/11/02 12:25:23 Entering phase BUILD
[Container] 2022/11/02 12:25:23 Running command echo Build started on `date`
Build started on Wed Nov 2 12:25:23 UTC 2022

[Container] 2022/11/02 12:25:23 Running command echo Building the Docker image...
Building the Docker image...

[Container] 2022/11/02 12:25:23 Running command docker build -t $IMAGE_REPO_NAME:$IMAGE_TAG -t autoaipython39docker .
Sending build context to Docker daemon  906.2kB
Step 1/16 : FROM ubuntu:18.04
18.04: Pulling from library/ubuntu

Copy the image URI into a new variable.

In [78]:
image="849589503910.dkr.ecr.us-east-1.amazonaws.com/sagemaker-studio-d-dwojoyu93giy:default-1667319908824"
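The URI follows the standard ECR layout `<account>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>`. As a minimal sketch, the pieces can be pulled apart with a regular expression, which is a quick way to check that the image lives in the account and region you expect. The helper name `parse_ecr_uri` is hypothetical and not part of any SDK, and the pattern assumes a tagged image in a standard commercial region.

```python
import re

# Hypothetical helper (not part of the SageMaker SDK): split an ECR image URI
# of the form <account>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>.
ECR_URI_PATTERN = re.compile(
    r"^(?P<account>\d{12})\.dkr\.ecr\.(?P<region>[a-z0-9-]+)\.amazonaws\.com/"
    r"(?P<repository>[^:]+):(?P<tag>.+)$"
)


def parse_ecr_uri(uri):
    """Return a dict with the account, region, repository and tag of the URI."""
    match = ECR_URI_PATTERN.match(uri)
    if match is None:
        raise ValueError("Not a tagged ECR image URI: {}".format(uri))
    return match.groupdict()


parts = parse_ecr_uri(
    "849589503910.dkr.ecr.us-east-1.amazonaws.com/"
    "sagemaker-studio-d-dwojoyu93giy:default-1667319908824"
)
print(parts["region"], parts["repository"], parts["tag"])
```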

# Deploy Auto-AI pipeline on SageMaker

Once the image URI has been generated, we can prepare to deploy the image on SageMaker and create endpoints. The cell below imports the required libraries.

In [79]:
import datetime
import tarfile

import boto3 # AWS SDK for python. Provides low-level access to AWS services
from sagemaker import get_execution_role
import sagemaker
import re

import os
import numpy as np
import pandas as pd

The cell below creates the AWS SageMaker client and sets up the environment.

In [80]:

m_boto3 = boto3.client('sagemaker') 

sess = sagemaker.Session()

region = sess.boto_session.region_name

bucket = sess.default_bucket()  #  Bucket is a logical unit of storage in AWS S3
role = get_execution_role()


print('Using bucket ' + bucket)

Using bucket sagemaker-us-east-1-849589503910


### Environment

In [81]:

# S3 prefix
prefix = "DemandResponse-autoai"

WORK_DIRECTORY = "/root/Sagemaker-AutoAI/autoai-container/data"

data_location = sess.upload_data(WORK_DIRECTORY, key_prefix=prefix)
account = sess.boto_session.client("sts").get_caller_identity()["Account"]


To use SageMaker to fit our algorithm, we create an Estimator that defines how to use the container for training. It holds the configuration needed to invoke SageMaker training:

- **The container image**. This is the ECR image URI stored in `image` above.<br>
- **The role**. As defined above.<br>
- The **instance count**, which is the number of machines to use for training.<br>
- The **instance type**, which is the type of machine to use for training.<br>
- The **output path**, which determines where the model artifact will be written.<br>
- The **session**, which is the SageMaker session object that we defined above.<br>


We then call `fit()` on the estimator to train on the data that we uploaded above.

In [82]:

#image = "{}.dkr.ecr.{}.amazonaws.com/autoai-deploy-sagemaker:latest".format(account, region)
#image = "849589503910.dkr.ecr.us-east-1.amazonaws.com/sagemaker-studio-d-5nsd2ufwkmwe:default-1662989009075"
model = sagemaker.estimator.Estimator(
    image_uri=image,
    role=role,
    instance_count=1,
    instance_type="ml.c4.2xlarge",
    output_path="s3://{}/output".format(sess.default_bucket()),
    sagemaker_session=sess,
)

model.fit(data_location)

2022-11-02 12:27:47 Starting - Starting the training job...
2022-11-02 12:28:10 Starting - Preparing the instances for trainingProfilerReport-1667392066: InProgress
.........
2022-11-02 12:29:33 Downloading - Downloading input data..
2022-11-02 12:30:16 Training - Training image download completed. Training in progress.
2022-11-02 12:30:16 Uploading - Uploading generated training model
2022-11-02 12:30:16 Completed - Training job completed
Training seconds: 42
Billable seconds: 42


## Hosting your model
We can use a trained model to get real-time predictions through an HTTP endpoint. The following steps walk through the process.

### Deploy the model
Deploying the model to SageMaker hosting requires only a `deploy` call on the fitted model. The call takes an instance count, an instance type, and optionally a serializer and a deserializer, which are used by the predictor created for the endpoint.

In [83]:
from sagemaker.predictor import csv_serializer

predictor = model.deploy(1, "ml.m4.xlarge", serializer=csv_serializer)

--------!
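The `csv_serializer` object imported above is deprecated in version 2 of the SageMaker Python SDK, which provides the `sagemaker.serializers.CSVSerializer` class in its place. The following is a sketch of the equivalent call, assuming SDK v2 is installed. The wrapper name `deploy_csv_endpoint` is hypothetical and exists only to keep the example self-contained.

```python
def deploy_csv_endpoint(estimator, instance_type="ml.m4.xlarge", instance_count=1):
    """Deploy a fitted estimator behind an endpoint that accepts CSV input.

    Hypothetical wrapper for illustration; assumes SageMaker Python SDK v2.
    """
    # Imported lazily so the helper can be defined without the SDK installed.
    from sagemaker.serializers import CSVSerializer

    return estimator.deploy(
        initial_instance_count=instance_count,
        instance_type=instance_type,
        serializer=CSVSerializer(),
    )
```

With the fitted estimator from the training cell, the call would be `predictor = deploy_csv_endpoint(model)`.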

### Choose some data and use it for a prediction
In order to do some predictions, we'll extract some of the data we used for training and do predictions against it. This is, of course, bad statistical practice, but a good way to see how the mechanism works.

In [84]:

df_raw=pd.read_csv('demandresponseAutoAiHoldout.csv')
#features=['CUSTOMER_ID','CSTFNM','Customer Last Name','Telephone Number','Email Address','AGE','CITY','MARITAL_STATUS','GENDER','EDUCATION','EMPLOYMENT','TENURE','SEGMENT','HOME_SIZE','ENERGY_USAGE_PER_MONTH','ENERGY_EFFICIENCY','IS_REGISTERED_FOR_ALERTS','OWNS_HOME','COMPLAINTS','EST_INCOME','CLTV','HAS_THERMOSTAT','HAS_HOME_AUTOMATION','PHOTOVOLTAIC_ZONING','WIND_ZONING','SMART_METER_COMMENTS','IS_CAR_OWNER','HAS_EV','HAS_PHOTOVOLTAIC','HAS_WIND','EBILL','IN_WARRANTY','STD_YRLY_USAGE','MISSED_PAYMENT','YEARLY_USAGE_PREDICTED']

#df_score=df_raw[features]



Scoring on sample data

In [85]:
df_raw.tail(1).values

array([[4, 'Nicolas', 'Baumbach', '507-490-8532',
        'Nicolas.Baumbach@amber.biz', 37, 'Santa Clara', 'M', 'male',
        "Bachelor's degree or more", 'Employed full-time', 33,
        'Budget Payment Plan Members', 1450, 4330, 0.335, True, True,
        False, 64792, 257, False, False, False, True, 'Positive', True,
        True, False, False, False, True, 52098, 0, 51577.02]],
      dtype=object)

Score the row by calling `predict()` on the predictor. The serializer converts the input into the CSV format required for scoring.

In [86]:
print(predictor.predict(df_raw.tail(1).values))


The csv_serializer has been renamed in sagemaker>=2.
See: https://sagemaker.readthedocs.io/en/stable/v2.html for details.


b'{"result":"Customer_ID,Probability_Yes,Probability_No\\n4,0.33880077586910073,0.6611992241308993\\n","status":200}\n'
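The reply format is specific to this container: a JSON object whose `result` field carries CSV text with a header row. Below is a minimal sketch of turning that reply into Python values, assuming the field and column names seen in the output above. `parse_scoring_response` is a hypothetical helper, not an SDK function.

```python
import csv
import io
import json


def parse_scoring_response(raw):
    """Decode the container's JSON reply and return (status, rows)."""
    text = raw.decode("utf-8") if isinstance(raw, bytes) else raw
    payload = json.loads(text)
    # The "result" field holds a small CSV document with a header row.
    reader = csv.DictReader(io.StringIO(payload["result"]))
    rows = [
        {
            "Customer_ID": row["Customer_ID"],
            "Probability_Yes": float(row["Probability_Yes"]),
            "Probability_No": float(row["Probability_No"]),
        }
        for row in reader
    ]
    return payload["status"], rows


# The bytes returned by predictor.predict() in the cell above.
raw = b'{"result":"Customer_ID,Probability_Yes,Probability_No\\n4,0.33880077586910073,0.6611992241308993\\n","status":200}\n'
status, rows = parse_scoring_response(raw)
print(status, rows[0]["Customer_ID"], rows[0]["Probability_Yes"])
```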


## Run Batch Transform Job
We can use a trained model to get inference on large data sets by using Amazon SageMaker Batch Transform. A batch transform job takes your input data S3 location and outputs the predictions to the specified S3 output folder. Similar to hosting, we can extract inferences for training data to test batch transform.

### Create a Transform Job
We create a `Transformer` that defines how to use the container to get inference results on a data set. It holds the configuration needed to invoke SageMaker batch transform:

- The **instance count**, which is the number of machines to use to extract inferences.<br>
- The **instance type**, which is the type of machine to use to extract inferences.<br>
- The **output path**, which determines where the inference results will be written.<br>

In [60]:
transform_output_folder = "batch-transform-output"
output_path = "s3://{}/{}".format(sess.default_bucket(), transform_output_folder)

transformer = model.transformer(
    instance_count=1,
    instance_type="ml.m4.xlarge",
    output_path=output_path,
    assemble_with="Line",
    accept="text/csv",
)

In [61]:
filename="demandresponseAutoAiHoldout.csv"
filepath=data_location+"/"+filename


We call `transform()` on the transformer to get inference results for the data that we uploaded. We can pass these options when invoking the transformer:

- The **filepath**, which is the S3 location of the input data.<br>
- The **content_type**, which is the content type set on the HTTP request sent to the container to get predictions.<br>


In [62]:
transformer.transform(
    filepath, content_type="text/csv"
)
transformer.wait()

.............................


### View Output
We can read the results of the transform job above from the files in S3 and print the output.

In [63]:
s3_client = sess.boto_session.client("s3")
s3_client.download_file(
    sess.default_bucket(), "{}/demandresponseAutoAiHoldout.csv.out".format(transform_output_folder), "/tmp/demandresponseAutoAiHoldout.csv.out"
)
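Batch Transform names each output object after its input object with `.out` appended, which is why the key downloaded above ends in `demandresponseAutoAiHoldout.csv.out`. As a small sketch, the key can be derived from the input location rather than hard-coded. `transform_output_key` is a hypothetical helper, and the input URI shown is an assumption pieced together from the bucket name and prefix printed earlier.

```python
import posixpath


def transform_output_key(input_s3_uri, output_prefix):
    """Return the S3 key Batch Transform writes for a single input object."""
    # The output object reuses the input file name and appends ".out".
    filename = posixpath.basename(input_s3_uri)
    return posixpath.join(output_prefix, filename + ".out")


# Assumed input location: s3://<default bucket>/<prefix>/<file name>.
key = transform_output_key(
    "s3://sagemaker-us-east-1-849589503910/DemandResponse-autoai/"
    "demandresponseAutoAiHoldout.csv",
    "batch-transform-output",
)
print(key)
```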


In [64]:
with open("/tmp/demandresponseAutoAiHoldout.csv.out") as f:
    results = f.readlines()
print("Transform results: \n{}".format("".join(results)))

Transform results: 
{"result":"Probability_Yes,Probability_No,Customer_ID\n0.9171045640872786,0.08289543591272146,3\n0.33880077586910073,0.6611992241308993,4\n","status":200}
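Note that the column order in the batch output differs from the endpoint reply (`Customer_ID` comes last here), so it is safer to read the columns by name than by position. The sketch below walks the output file line by line, assuming each line is one JSON document in the format shown above. `read_transform_output` and the 0.5 cut-off are illustrative choices, not part of the original pipeline.

```python
import csv
import io
import json


def read_transform_output(lines):
    """Return (customer_id, probability_yes) pairs from transform output lines."""
    records = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines between documents
        payload = json.loads(line)
        # Columns are looked up by header name, so their order does not matter.
        for row in csv.DictReader(io.StringIO(payload["result"])):
            records.append((row["Customer_ID"], float(row["Probability_Yes"])))
    return records


# One line of the output file, as printed above.
sample = [
    '{"result":"Probability_Yes,Probability_No,Customer_ID\\n'
    '0.9171045640872786,0.08289543591272146,3\\n'
    '0.33880077586910073,0.6611992241308993,4\\n","status":200}\n'
]
records = read_transform_output(sample)
# Illustrative threshold: customers more likely than not to respond.
likely = [customer_id for customer_id, p_yes in records if p_yes >= 0.5]
print(likely)
```

In the notebook, the `results` list read from `/tmp/demandresponseAutoAiHoldout.csv.out` could be passed in place of `sample`.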

