# Immersion Day Lab - Bring your own Algorithm Container


With Amazon SageMaker, you can package your own algorithms that can be trained and deployed in the SageMaker environment. This notebook will guide you through an example that shows you how to use a Docker container for training and inference.


[Prerequisite - Permissions](#Prerequisite---Permissions)<br>
[About the Dataset](#About-the-Dataset)

1. [Part 1: Training and Hosting your Algorithm in Amazon SageMaker](#Part-1:-Training-and-Hosting-your-Algorithm-in-Amazon-SageMaker)
  1. [Set up the environment](#Set-up-the-environment)
  1. [Create Session](#Create-Session)
  1. [Create Variables](#Create-Variables)
  1. [Upload the data for training](#Upload-the-data-for-training)
  1. [Create an estimator](#Create-an-estimator)
  1. [Fit the model](#Fit-the-model)
  1. [Deploy the Model](#Deploy-the-Model)
  1. [Perform prediction](#Perform-prediction)
  1. [Optional cleanup](#Optional-cleanup)<br>
  <br>

1. [Part 2: Granular Endpoint Deployment](#Part-3:-Granular-Endpoint-Deployment)

  1. [Create Model](#Create-Model)
  1. [Define Endpoint Configuration - Production Variant](#Define-Endpoint-Configuration---Production-Variant)
  1. [Deploy Endpoint](#Deploy-Endpoint)
  1. [Prediction](#Prediction)
  1. [Optional Delete Endpoint](#Optional-Delete-Endpoint)  
<br>
1. [Part 3: Miscellaneous](#Part-4:-Miscellaneous)
    1. [Other ways to perform prediction to the deployed endpoint](#Other-ways-to-perform-prediction-to-the-deployed-endpoint)
    1. [Update an existing Endpoint's weight](#Update-an-existing-Endpoint's-weight)
    1. [Check an existing Endpoint's status](#Check-an-existing-Endpoint's-status)
    1. [Update an existing Endpoint](#Update-an-existing-Endpoint)

## Prerequisite - Permissions

Running this notebook requires permissions in addition to the normal SageMakerFullAccess permissions, because we'll be creating new repositories in Amazon ECR. The easiest way to add these permissions is to attach the managed policy **AmazonEC2ContainerRegistryFullAccess** to the role that you used to start your notebook instance. There's no need to restart your notebook instance afterwards; the new permissions are available immediately.

## About the Dataset

We will be building a publicly available TensorFlow model [example](https://github.com/tensorflow/models/tree/master/official/wide_deep/) in our containers and deploying it to an endpoint for inference. In this tutorial, we will use the tf.estimator API in TensorFlow to solve a binary classification problem: given census data about a person such as age, education, marital status, and occupation (the features), we will try to predict whether or not the person earns more than 50,000 dollars a year (the target label).

The point of this exercise is to show the surrounding structure that you'll want to add to your own code so you can train and host it in Amazon SageMaker. The ideas shown here will work in any language or environment.

# Part 1: Training and Hosting your Algorithm in Amazon SageMaker

Once your container is packaged and pushed to the ECR repository, you can use it to train and serve models, which is what we cover in this section.

## Set up the environment

### Create Session
We begin by setting up the role that the SageMaker SDK will use for its calls. We are using the role that the notebook instance was launched with.
The session object we create remembers our connection parameters to SageMaker. We'll use it to perform all of our SageMaker operations.

In [None]:
from sagemaker import get_execution_role
from sagemaker import session

role = get_execution_role()
sess = session.Session()
print(role)

## Create Variables 
Next, we declare the ECR repository that we pushed to earlier. You will also specify the bucket locations where your training data is uploaded and where you would like your output to be stored. The repository and the buckets should be in the same region as this notebook. We will also name our training job, model, and endpoint resources.<br><br>

We will reference all the declared variables and pass them in as arguments to the required API calls.

In [None]:
import datetime
current_time = datetime.datetime.now()
current_time = current_time.strftime('%Y-%m-%d-%H-%M')

#ECR repo name created in the earlier lab
ecr_repository_name='YOUR_CONTAINER_NAME'

#fetch account number and region name that the notebook instance is running in.
account_num = sess.boto_session.client('sts').get_caller_identity()['Account']
region_name = sess.boto_session.region_name

#construct ECR URI
training_image = '{}.dkr.ecr.{}.amazonaws.com/{}'.format(account_num, region_name, ecr_repository_name)
serving_image = '{}.dkr.ecr.{}.amazonaws.com/{}'.format(account_num, region_name, ecr_repository_name)

#training and model buckets
training_data_location_s3 = 's3://YOUR_BUCKET/KEY_PREFIX'
model_output_location_s3 = 's3://YOUR_BUCKET/KEY_PREFIX'

#Resources Name
training_job_name = "training-{}-{}".format(ecr_repository_name.replace("_", "-"), current_time)
model_name = "model-{}-{}".format(ecr_repository_name.replace("_", "-"), current_time)
endpoint_name = "endpoint-{}-{}".format(ecr_repository_name.replace("_", "-"), current_time)

print (training_job_name, model_name, endpoint_name)
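
The names built above are passed straight to the SageMaker APIs, so it can help to validate them before starting a job. The sketch below is one way to do that. It assumes that resource names may contain only alphanumerics and hyphens, must start and end with an alphanumeric character, and are limited to 63 characters. `make_resource_name` is a hypothetical helper, not part of the SageMaker SDK.

```python
import re

# Assumed naming rule for SageMaker resources: alphanumerics and hyphens only,
# starting and ending with an alphanumeric, at most 63 characters.
_NAME_PATTERN = re.compile(r"^[a-zA-Z0-9](-*[a-zA-Z0-9])*$")
_MAX_NAME_LENGTH = 63


def make_resource_name(prefix, repository_name, timestamp):
    """Build a name like 'training-<repo>-<timestamp>' and validate it."""
    # Replace every character that is not alphanumeric or a hyphen
    # (ECR repository names may contain '_', '.' and '/').
    safe_repo = re.sub(r"[^a-zA-Z0-9-]", "-", repository_name)
    name = "{}-{}-{}".format(prefix, safe_repo, timestamp)
    if len(name) > _MAX_NAME_LENGTH or not _NAME_PATTERN.match(name):
        raise ValueError("invalid resource name: {!r}".format(name))
    return name


print(make_resource_name("training", "my_repo/census", "2019-01-01-12-00"))
```

Unlike the plain `replace("_", "-")` used in the cell above, this version also handles repository names that contain a namespace separator (`/`) or a period.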

## Create an estimator

In order to use SageMaker to fit our algorithm, we'll create an `Estimator` that defines how to use the container to train. Estimators encapsulate training on SageMaker.

In [None]:
from sagemaker import estimator
census_estimator = estimator.Estimator(image_name = training_image,
                                       role = role, 
                                       train_instance_count = 1, 
                                       train_instance_type = 'ml.m4.xlarge',
                                       output_path = model_output_location_s3, 
                                       sagemaker_session = sess
                                      )

We also set the hyperparameters, which SageMaker writes inside the container to `/opt/ml/input/config/hyperparameters.json`. The `train` code reads these parameters during training.

In [None]:
#set hyperparameters to be passed to the training job
census_estimator.set_hyperparameters(test_data="/opt/ml/input/data/training/test_data.csv",
                                     train_data="/opt/ml/input/data/training/train_data.csv",
                                     model_type= "wide",
                                     train_epochs=10,
                                     epochs_per_eval=2,
                                     batch_size=40
                                    )

#check to see if the hyperparameters are set as expected
census_estimator.hyperparameters()
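
On the container side, the `train` program has to read that JSON file itself. SageMaker passes hyperparameter values as strings, so numeric values need to be cast back. The following is a minimal sketch of how a training script might do this. The `_CASTS` table and `load_hyperparameters` are hypothetical names, and a temporary file stands in for the real path so the example runs outside a container.

```python
import json
import os
import tempfile

# Location named in the text above; only exists inside the training container.
HYPERPARAMETERS_PATH = "/opt/ml/input/config/hyperparameters.json"

# Hypothetical casting table for the numeric hyperparameters set in this notebook.
_CASTS = {"train_epochs": int, "epochs_per_eval": int, "batch_size": int}


def load_hyperparameters(path=HYPERPARAMETERS_PATH):
    """Read the hyperparameters file and cast known numeric values from str."""
    with open(path) as handle:
        raw = json.load(handle)
    return {key: _CASTS.get(key, str)(value) for key, value in raw.items()}


# Local demonstration: a temporary file plays the role of the container path.
with tempfile.TemporaryDirectory() as workdir:
    sample_path = os.path.join(workdir, "hyperparameters.json")
    with open(sample_path, "w") as handle:
        json.dump({"model_type": "wide", "train_epochs": "10", "batch_size": "40"}, handle)
    demo_hyperparameters = load_hyperparameters(sample_path)

print(demo_hyperparameters)
```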

## Fit the model

We call `fit()` on the estimator to start a training job with the data location and job name that we declared above.

In [None]:
%%time
census_estimator.fit(training_data_location_s3, job_name=training_job_name)
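
The hyperparameters set earlier point at files under `/opt/ml/input/data/training/`. That directory corresponds to the input channel: assuming the SDK's default behavior of naming the channel `training` when `fit()` receives a single S3 location, the objects under that location appear in the container under the channel directory. The helper below is only an illustration of that mapping; `channel_file_path` is a hypothetical name.

```python
import posixpath

# Base directory where input channels are made available inside the training container.
INPUT_DATA_DIR = "/opt/ml/input/data"


def channel_file_path(filename, channel="training"):
    """Return the in-container path of a file delivered through an input channel."""
    return posixpath.join(INPUT_DATA_DIR, channel, filename)


print(channel_file_path("train_data.csv"))
print(channel_file_path("test_data.csv"))
```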


## Deploy the Model

Deploying the model to SageMaker as an endpoint just requires a `deploy` call on the fitted estimator. Behind the scenes, this single call creates three SageMaker resources: the 'Model', the 'Endpoint Configuration', and the 'Endpoint' itself.

In [None]:
%%time
predictor = census_estimator.deploy(instance_type='ml.m4.xlarge',
                                    initial_instance_count=1,
                                    endpoint_name=endpoint_name,
                                    image=serving_image,
                                    name=model_name
                                   )


## Perform prediction

To run some predictions, we'll use a single CSV record picked from our test data. Prediction is as easy as calling `predict` on the predictor we got back from `deploy`, passing the data we want predictions for.

In [None]:
prediction_input = ('38,Private,89814,HS-grad,9,Married-civ-spouse,Farming-fishing,Husband,White,Male,0,0,50,United-States,unknown')
print(predictor.predict(prediction_input))

In [None]:
prediction_input = ('40,Private,85019,Doctorate,16,Married-civ-spouse,Prof-specialty,Husband,Asian-Pac-Islander,Male,0,0,45,?,unknown')
print(predictor.predict(prediction_input))
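
Because the endpoint receives a bare comma-separated record, a misplaced or missing field is easy to overlook. The sketch below labels each value before it is sent. The column names are an assumption based on the public census income dataset used by the TensorFlow wide and deep example, and `label_record` is a hypothetical helper.

```python
import csv
import io

# Assumed column order of the census income records (15 fields per record).
CENSUS_COLUMNS = [
    "age", "workclass", "fnlwgt", "education", "education_num",
    "marital_status", "occupation", "relationship", "race", "gender",
    "capital_gain", "capital_loss", "hours_per_week", "native_country",
    "income_bracket",
]


def label_record(record):
    """Map one CSV record to {column: value}, failing on a wrong field count."""
    rows = list(csv.reader(io.StringIO(record)))
    if len(rows) != 1:
        raise ValueError("expected exactly one CSV record")
    values = rows[0]
    if len(values) != len(CENSUS_COLUMNS):
        raise ValueError(
            "expected {} fields, got {}".format(len(CENSUS_COLUMNS), len(values))
        )
    return dict(zip(CENSUS_COLUMNS, values))


sample = ("38,Private,89814,HS-grad,9,Married-civ-spouse,Farming-fishing,"
          "Husband,White,Male,0,0,50,United-States,unknown")
for column, value in label_record(sample).items():
    print("{:>15}: {}".format(column, value))
```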

## Optional cleanup

Unlike training job instances, which are terminated automatically when the job completes, endpoints deployed for prediction/serving keep running, and incurring costs, until you delete them. The following deletes the endpoint and its backend instances.

In [None]:
sess.delete_endpoint(predictor.endpoint)

--------------------------------------------------------------------------------------------------------

# Part 3: Granular Endpoint Deployment 
## Creating Model, Endpoint Configuration and Endpoint

Although extremely convenient, the [deploy](#Deploy-the-Model) step we performed earlier created three resources (Model, EndpointConfiguration, and Endpoint) in a single call. In this section we will use the `session` module of the SageMaker SDK to create a new endpoint with more options: two production variants, each with its own weight and instance count.

In [None]:
import datetime
current_time = datetime.datetime.now()
current_time = current_time.strftime('%Y-%m-%d-%H-%M')


new_model_name = "model-" + current_time
new_endpoint_name = "endpoint-" + current_time

## Create Model
To save time, we will reuse the training job that we ran earlier to create a new Model:

In [None]:
sess.create_model_from_job(name = new_model_name,
                           role = role,
                           primary_container_image = serving_image,
                           training_job_name = training_job_name
                          )
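
Creating a model from a training job works because the job has already stored its artifacts in S3. As a rough guide to where they end up, the sketch below builds the expected location, assuming the usual layout of `<output_path>/<training job name>/output/model.tar.gz`. `model_artifact_uri` is a hypothetical helper and the bucket name is the same placeholder used earlier.

```python
def model_artifact_uri(output_path, training_job_name):
    """Return the assumed S3 location of the artifact written by a training job."""
    # Strip a trailing slash so the joined URI never contains '//' after the prefix.
    return "{}/{}/output/model.tar.gz".format(output_path.rstrip("/"), training_job_name)


print(model_artifact_uri("s3://YOUR_BUCKET/KEY_PREFIX", "training-census-2019-01-01-12-00"))
```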

## Define Endpoint Configuration - Production Variant

### A/B Testing

We create the Production Variants that make up our endpoint configuration.   
<br>The Variant Weight you specify defines how much traffic you want to allocate to each model. For example, suppose that you want to host two models, A and B, and you assign traffic weight 2 for model A and 1 for model B. Amazon SageMaker distributes two-thirds of the traffic to model A, and one-third to model B.

In [None]:
variant_A = session.production_variant(variant_name = 'Current-Production-Variant',
                                          model_name = model_name, #model we created using estimator earlier
                                          instance_type = 'ml.m4.xlarge', 
                                          initial_instance_count = 1, 
                                          initial_weight = 1
                                      )
variant_B = session.production_variant(variant_name = 'To-be-Production-Variant',
                                          model_name = new_model_name, #new model that we have built for testing
                                          instance_type = 'ml.m4.xlarge', 
                                          initial_instance_count = 1, 
                                          initial_weight = 2  #sending more traffic towards this model 
                                         )
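
The traffic split described above is simply each variant's weight divided by the sum of all weights. The short sketch below works through that arithmetic for the two variants defined in this notebook; `traffic_shares` is a hypothetical helper for illustration only.

```python
def traffic_shares(weights):
    """Convert variant weights into the fraction of traffic each variant receives."""
    if any(weight < 0 for weight in weights.values()):
        raise ValueError("variant weights must be non-negative")
    total = float(sum(weights.values()))
    if total == 0:
        raise ValueError("at least one variant needs a positive weight")
    return {name: weight / total for name, weight in weights.items()}


shares = traffic_shares({"Current-Production-Variant": 1, "To-be-Production-Variant": 2})
for name, share in shares.items():
    print("{}: {:.1%}".format(name, share))
```

With the weights used in the cell above, roughly one-third of requests go to the current variant and two-thirds to the new one.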

## Deploy Endpoint 

This step finally deploys the endpoint with our desired configuration defined in the previous step.

In [None]:
sess.endpoint_from_production_variants(name = new_endpoint_name,
                                       production_variants= [variant_A, variant_B],
                                       wait = True
                                      )

## Prediction

Because this endpoint was not deployed through the Estimator class, the predictor from our previous predictions is not available. Instead, we will test the deployed endpoint directly using the low-level AWS CLI. The [Miscellaneous](#Part-4:-Miscellaneous) section of this notebook shows a few sample commands to do so.

# Part 4: Miscellaneous

## Other ways to perform prediction to the deployed endpoint

The [InvokeEndpoint](https://docs.aws.amazon.com/sagemaker/latest/dg/API_runtime_InvokeEndpoint.html) API requires AWS [SigV4](https://docs.aws.amazon.com/general/latest/gr/signature-version-4.html) authentication for model access security. The AWS CLI needs to be configured on your local machine with at least access to the InvokeEndpoint API. Some sample commands that can be run from your local machine to test:

- **Using aws cli** (the trailing OUTPUT_FILE is the required path where the response body is saved)<br>
aws sagemaker-runtime invoke-endpoint --endpoint-name Census-TensorFlow-Production --body "$ENTER PATH TO TEXT FILE THAT HOLDS SINGLE TEST DATA(INSIDE QUOTES)" --content-type "application/text" OUTPUT_FILE


- **Using [awscurl](https://github.com/okigan/awscurl)**<br>
awscurl --service sagemaker --region us-east-1 -X POST -d @$TEXT FILE THAT HOLDS THE SINGLE TEST DATA -H "Accept: application/json" -H "Content-Type: application/text" https://runtime.sagemaker.us-east-1.amazonaws.com/endpoints/Census-TensorFlow-Production/invocations
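
The same call can also be made from Python with the boto3 `sagemaker-runtime` client, which signs the request for you. The sketch below wraps `invoke_endpoint` in a small function; `invoke_csv` is a hypothetical helper, and the `application/text` content type is taken from the commands above on the assumption that the container accepts it.

```python
def invoke_csv(runtime_client, endpoint_name, csv_row, content_type="application/text"):
    """Send one CSV record to an endpoint and return the response body as text.

    runtime_client is expected to be boto3.client('sagemaker-runtime').
    """
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        Body=csv_row.encode("utf-8"),
        ContentType=content_type,
        Accept="application/json",
    )
    # 'Body' is a streaming object; read() returns the raw bytes of the reply.
    return response["Body"].read().decode("utf-8")
```

In this notebook it could be called as `invoke_csv(boto3.client('sagemaker-runtime'), new_endpoint_name, prediction_input)`, provided the role has access to the InvokeEndpoint API.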


## Check an existing Endpoint's status

Returns the description of an endpoint. Use this to check the status of updates being made to the endpoint (`Updating` | `InService`).

In [None]:
import pprint
import boto3
client = boto3.client('sagemaker')

response = client.describe_endpoint(
    EndpointName=new_endpoint_name
)
pprint.pprint(response)
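
When an update is in progress, it is often more convenient to poll the status than to re-run the cell by hand. The following is a minimal polling sketch built on `describe_endpoint` and its `EndpointStatus` field. `wait_until_in_service` is a hypothetical helper; the describe function is passed in as an argument so the loop can be tried without an AWS connection.

```python
import time


def wait_until_in_service(describe_endpoint, endpoint_name,
                          poll_seconds=30, max_polls=60, sleep=time.sleep):
    """Poll an endpoint until it reports InService.

    describe_endpoint is expected to behave like
    boto3.client('sagemaker').describe_endpoint.
    """
    for _ in range(max_polls):
        status = describe_endpoint(EndpointName=endpoint_name)["EndpointStatus"]
        if status == "InService":
            return status
        if status == "Failed":
            raise RuntimeError("endpoint {} is in status Failed".format(endpoint_name))
        # Any other status (for example Creating or Updating) means: keep waiting.
        sleep(poll_seconds)
    raise TimeoutError("endpoint {} did not reach InService".format(endpoint_name))
```

In this notebook it could be called as `wait_until_in_service(client.describe_endpoint, new_endpoint_name)`.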

## Update an existing Endpoint's weight

Updates the variant weight of one or more variants associated with an existing endpoint, or the capacity of one variant associated with an existing endpoint. When it receives the request, Amazon SageMaker sets the endpoint status to `Updating`. After updating the endpoint, it sets the status to `InService`.

In [None]:
import pprint
import boto3
client = boto3.client('sagemaker')

response = client.update_endpoint_weights_and_capacities(
    EndpointName=new_endpoint_name,
    DesiredWeightsAndCapacities=[
        {'VariantName': 'Current-Production-Variant',
         'DesiredWeight': 1.0
        },
        {'VariantName': 'To-be-Production-Variant',
         'DesiredWeight': 1.0         
        }
    ]
)
pprint.pprint(response)
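
Weights do not have to change in a single jump. One possible approach, sketched below, is to prepare a series of `DesiredWeightsAndCapacities` payloads that move traffic from the current variant to the new one in equal steps. `shift_schedule` is a hypothetical helper and the step count is an arbitrary choice; each payload would be passed to `update_endpoint_weights_and_capacities` once the endpoint is back to `InService`.

```python
def shift_schedule(current_variant, candidate_variant, steps):
    """Return one DesiredWeightsAndCapacities payload per traffic-shift step."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    schedule = []
    for step in range(1, steps + 1):
        candidate_weight = step / steps
        schedule.append([
            {"VariantName": current_variant, "DesiredWeight": 1.0 - candidate_weight},
            {"VariantName": candidate_variant, "DesiredWeight": candidate_weight},
        ])
    return schedule


for payload in shift_schedule("Current-Production-Variant", "To-be-Production-Variant", 4):
    print(payload)
```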

## Update an existing Endpoint

After testing, when you have decided to take an endpoint configuration into production, you can update the endpoint to use the newly created EndpointConfig. SageMaker then deletes the resources provisioned for the endpoint under the previous EndpointConfig, with no loss of availability.


In [None]:
import pprint
import boto3
client = boto3.client('sagemaker')

response = client.update_endpoint(
    EndpointName=new_endpoint_name,
    EndpointConfigName=''  # placeholder: set this to the name of the new endpoint configuration
)

pprint.pprint(response)
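
The update above needs an endpoint configuration that already exists. As a sketch of the full sequence, the function below first creates a single-variant configuration with `create_endpoint_config` and then points the endpoint at it with `update_endpoint`. `promote_variant`, the variant name `AllTraffic`, and the `config_name` argument are hypothetical choices for illustration; the client is passed in so the flow can be exercised without AWS access.

```python
def promote_variant(client, endpoint_name, config_name, model_name,
                    instance_type="ml.m4.xlarge", instance_count=1):
    """Create a one-variant endpoint configuration and switch the endpoint to it.

    client is expected to be boto3.client('sagemaker').
    """
    client.create_endpoint_config(
        EndpointConfigName=config_name,
        ProductionVariants=[{
            "VariantName": "AllTraffic",  # hypothetical variant name
            "ModelName": model_name,
            "InitialInstanceCount": instance_count,
            "InstanceType": instance_type,
            "InitialVariantWeight": 1.0,
        }],
    )
    # The endpoint goes to Updating and returns to InService when the switch is done.
    return client.update_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=config_name,
    )
```

In this notebook it could be called as `promote_variant(client, new_endpoint_name, 'YOUR_NEW_CONFIG_NAME', new_model_name)`.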


## Optional Delete Endpoint

The following deletes the endpoint and the backend instances. 

In [None]:
sess.delete_endpoint(endpoint_name=new_endpoint_name)