### Bring Your Own Container: Recommend top 5 books for new users using R

1. [Introduction](#Introduction)
2. [Create R Docker Image](#Create-R-Docker-Image)   
3. [Model Training](#Model-Training)
4. [Model Deployment](#Model-Deployment)
5. [Model Evaluation](#Model-Evaluation)

## Introduction
The goal of this notebook is to illustrate how you can train and host an R model in Amazon SageMaker by bringing your own Docker container. Rather than re-implementing existing work with SageMaker's built-in algorithms, data scientists and machine learning engineers can reuse the models they have already written in R.

In [1]:
import pandas as pd
import boto3
import os
import time
import json
from sagemaker import get_execution_role

### Prepare Dataset
We begin by preparing the dataset. The ClndBookRatings.csv file is generated by the notebook object2vec_bookratings_reco.ipynb. We refine this dataset further by keeping only users who have rated at least 135 books.

In [2]:
ip_fn = '../ClndBookRatings.csv'  # input file: outliers (books with zero ratings) already removed
op_fn = 'train_test_bkratings_r.csv'  # output file name
bkratings = pd.read_csv(ip_fn)

In [3]:
bkratings.head()

Unnamed: 0,book_id,user_id,rating,title,book_ind,user_ind
0,1159,32773,5,Stones from the River,80,6320
1,1159,47984,4,Stones from the River,80,9777
2,1159,29097,3,Stones from the River,80,5444
3,1159,5657,3,Stones from the River,80,10959
4,1159,19404,4,Stones from the River,80,2756


In [19]:
# short list users who have rated at least 135 books
grp_bkratings = bkratings.groupby('user_ind')
fil_bkratings = grp_bkratings.filter(lambda x: x['book_ind'].count() >=135)

In [21]:
fil_bkratings.head()

Unnamed: 0,book_id,user_id,rating,title,book_ind,user_ind
11,1159,31760,4,Stones from the River,80,6052
36,1159,51460,5,Stones from the River,80,10507
39,1159,28767,3,Stones from the River,80,5346
61,1159,3087,3,Stones from the River,80,5862
62,1159,18361,3,Stones from the River,80,2471


In [20]:
fil_bkratings.shape

(40142, 6)
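
The `groupby(...).filter(...)` call above runs a Python lambda once per user, which can be slow on large rating tables. A minimal sketch of an equivalent vectorised form follows. The `toy` frame and the threshold of 2 are illustrative assumptions, not part of the dataset.

```python
import pandas as pd

# Toy ratings: user 1 rated three books, user 2 rated one, user 3 rated two.
toy = pd.DataFrame({
    "user_ind": [1, 1, 1, 2, 3, 3],
    "book_ind": [10, 11, 12, 10, 11, 13],
    "rating":   [5, 4, 3, 2, 5, 4],
})
min_ratings = 2  # stand-in for the notebook's threshold of 135

# Same approach as the notebook: keep only groups with enough rows.
via_filter = toy.groupby("user_ind").filter(
    lambda g: g["book_ind"].count() >= min_ratings
)

# Vectorised alternative: broadcast each user's rating count to every row,
# then keep the rows whose user meets the threshold.
counts = toy.groupby("user_ind")["book_ind"].transform("count")
via_transform = toy[counts >= min_ratings]

print(via_transform["user_ind"].unique().tolist())
```

Both forms keep the original row order and index, so they can be swapped without affecting the later cells.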

In [23]:
sel_bkratings = fil_bkratings[['user_ind', 'book_ind', 'rating']]
sel_bkratings.to_csv(op_fn, header=True, index=False)

## Create R Docker Image
We will now create a Docker image that contains both the training and the inference logic.

### Setup Permissions for Publishing Image to ECR

In [24]:
bucket = 'ai-in-aws'
prefix = 'Chapter7/byoc-r'
 
# Define IAM role
role = get_execution_role()

In [26]:
role

'arn:aws:iam::413491515223:role/service-role/AmazonSageMaker-ExecutionRole-20190822T170423'

The AmazonSageMaker-ExecutionRole-20190822T170423 IAM role needs both the AmazonSageMakerFullAccess and AmazonEC2ContainerRegistryFullAccess policies. Navigate to the IAM service, select Roles in the left navigation pane, and search for AmazonSageMaker-ExecutionRole-20190822T170423. Then attach the AmazonEC2ContainerRegistryFullAccess policy to the role.
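
The same check can be done programmatically rather than in the console. The sketch below is one way to do it, assuming the caller is allowed to call `iam:ListAttachedRolePolicies`. The helper names are hypothetical, and only the first page of results is read, which is enough for a role with a handful of policies.

```python
# AWS managed policy names for the two permissions described above.
REQUIRED_POLICIES = (
    "AmazonSageMakerFullAccess",
    "AmazonEC2ContainerRegistryFullAccess",
)


def role_name_from_arn(role_arn):
    """Return the bare role name: the last path segment of the role ARN."""
    return role_arn.split("/")[-1]


def missing_policies(attached_policy_names, required=REQUIRED_POLICIES):
    """Return the required policy names that are not in the attached list."""
    return sorted(set(required) - set(attached_policy_names))


def attached_policy_names(iam_client, role_arn):
    """List managed policy names attached to the role (first page only)."""
    response = iam_client.list_attached_role_policies(
        RoleName=role_name_from_arn(role_arn)
    )
    return [p["PolicyName"] for p in response["AttachedPolicies"]]


# Usage in the notebook, after `role = get_execution_role()`:
#   iam = boto3.client("iam")
#   print(missing_policies(attached_policy_names(iam, role)))
```

An empty list means both policies are already attached.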

### Publishing Docker Image to ECR

We will build the Docker image locally on the notebook instance and then publish it to ECR.

In [42]:
%%sh

# The name of our algorithm
algorithm_name=cosinesimilarity

#Get current account 

account=$(aws sts get-caller-identity --query Account --output text)

# Get the region defined in the current configuration. Default is us-east-1
region=$(aws configure get region)
region=${region:-us-east-1}

fullname="${account}.dkr.ecr.${region}.amazonaws.com/${algorithm_name}:latest"

# If the repository doesn't exist in ECR, create it.

aws ecr describe-repositories --repository-names "${algorithm_name}" > /dev/null 2>&1

if [ $? -ne 0 ]
then
    aws ecr create-repository --repository-name "${algorithm_name}" > /dev/null
fi

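# Get the login command from ECR and execute it directly.
# Note: `get-login` exists only in AWS CLI v1. With AWS CLI v2, use instead:
#   aws ecr get-login-password --region ${region} | docker login --username AWS --password-stdin ${account}.dkr.ecr.${region}.amazonaws.com
$(aws ecr get-login --region ${region} --no-include-email)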

# Build the docker image locally with the image name and then push it to ECR
# with the full name.
# Dockerfile is defined in the current directory
docker build  -t ${algorithm_name} .
docker tag ${algorithm_name} ${fullname}

docker push ${fullname}

Login Succeeded

Step 1/8 : FROM ubuntu:16.04
 ---> 5e13f8dd4c1a
Step 2/8 : RUN echo "deb http://cloud.r-project.org/bin/linux/ubuntu xenial/" >> /etc/apt/sources.list
 ---> Using cache
 ---> 6922a54142cb
Step 3/8 : RUN apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9
 ---> Using cache
 ---> fa0811fa33eb
Step 4/8 : RUN apt-get -y update --allow-unauthenticated && apt-get install -y --no-install-recommends     wget     r-base     r-base-dev     ca-certificates
 ---> Using cache
 ---> 9344ac3e6af6
Step 5/8 : RUN R -e "install.packages(c('reshape2', 'recommenderlab', 'plumber', 'dplyr', 'jsonlite'), quiet = TRUE)"
 ---> Using cache
 ---> 7606a35d8b57
Step 6/8 : COPY Recommender.R /opt/ml/Recommender.R
 ---> Using cache
 ---> 24602ecccf22
Step 7/8 : COPY plumber.R /opt/ml/plumber.R
 ---> 1adae2680f86
Step 8/8 : ENTRYPOINT ["/usr/bin/Rscript", "/opt/ml/Recommender.R", "--no-save"]
 ---> Running in 1fea231fe601
Removing intermediate container




## Model Training
Let us train the Recommender model, from the recommenderlab package, on the user book ratings.
We will start by pushing the processed user book ratings to an S3 bucket.

In [30]:
boto3.Session().resource('s3').Bucket(bucket).Object(os.path.join(prefix, 'train', op_fn)).upload_file(op_fn)
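
One caveat on the upload above: `os.path.join` uses the separator of the local operating system, while S3 keys always use `/`. On the Linux notebook instance the two coincide, but a small helper built on `posixpath` keeps the key correct anywhere. The helper names below are hypothetical.

```python
import posixpath


def s3_key(*parts):
    """Join key components with '/', regardless of the local OS."""
    return posixpath.join(*parts)


def s3_uri(bucket, *parts):
    """Build an s3:// URI from a bucket name and key components."""
    return "s3://{}/{}".format(bucket, s3_key(*parts))


# Values taken from the notebook's bucket, prefix and output file name.
print(s3_key("Chapter7/byoc-r", "train", "train_test_bkratings_r.csv"))
print(s3_uri("ai-in-aws", "Chapter7/byoc-r", "train"))
```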

In [31]:
region = boto3.Session().region_name
account = boto3.client('sts').get_caller_identity().get('Account')
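
The account and region are needed to address the image pushed to ECR, and the same image URI is used for both training and hosting. A small helper, sketched below under the assumption of the standard `amazonaws.com` domain, avoids formatting the string twice. The function name and the placeholder account ID are illustrative.

```python
def ecr_image_uri(account, region, repository, tag="latest"):
    """Return the registry URI of an image stored in Amazon ECR."""
    return "{}.dkr.ecr.{}.amazonaws.com/{}:{}".format(
        account, region, repository, tag
    )


# Placeholder account ID; in the notebook, pass `account` and `region`.
image = ecr_image_uri("123456789012", "us-east-1", "cosinesimilarity")
print(image)
```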

#### Create Training Parameters

In [36]:
r_job = 'BYOC-r' + time.strftime("%Y-%m-%d-%H-%M-%S", time.gmtime())

print("Training job", r_job)

r_training_params = {
    "RoleArn": role,
    "TrainingJobName": r_job,
    "AlgorithmSpecification": {
        "TrainingImage": '{}.dkr.ecr.{}.amazonaws.com/cosinesimilarity:latest'.format(account, region),
        "TrainingInputMode": "File"
    },
    "ResourceConfig": {
        "InstanceCount": 1,
        "InstanceType": "ml.m4.xlarge",
        "VolumeSizeInGB": 10
    },
    "InputDataConfig": [
        {
            "ChannelName": "train",
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "S3Prefix",
                    "S3Uri": "s3://{}/{}/train".format(bucket, prefix),
                    "S3DataDistributionType": "FullyReplicated"
                }
            },
            "CompressionType": "None",
            "RecordWrapperType": "None"
        }
    ],
    "OutputDataConfig": {
        "S3OutputPath": "s3://{}/{}/output".format(bucket, prefix)
    },
    "HyperParameters": {
        "method": "Cosine",
        "nn": "10",
        "n_users": "270"
    },
    "StoppingCondition": {
        "MaxRuntimeInSeconds": 60 * 60
    }
}

Training job BYOC-r2019-08-24-16-38-51
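
Note that the `HyperParameters` values are quoted: the SageMaker API accepts only strings there. Inside the training container they are made available as JSON in `/opt/ml/input/config/hyperparameters.json`, and the `train` channel is mounted under `/opt/ml/input/data/train`, so the container code has to cast numeric values itself. The notebook's container does this in R, which is not shown here. The Python sketch below only illustrates the idea, and its function name and default values are assumptions.

```python
import json
import os
import tempfile

# Path where SageMaker writes the job's hyperparameters inside the container.
HYPERPARAMETERS_PATH = "/opt/ml/input/config/hyperparameters.json"


def load_hyperparameters(path=HYPERPARAMETERS_PATH):
    """Read the hyperparameter file and cast numeric values from strings."""
    with open(path) as f:
        raw = json.load(f)  # every value arrives as a string
    return {
        "method": raw.get("method", "Cosine"),      # hypothetical default
        "nn": int(raw.get("nn", "10")),             # hypothetical default
        "n_users": int(raw.get("n_users", "270")),  # hypothetical default
    }


# Local demonstration: a temporary file stands in for the container path.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "hyperparameters.json")
    with open(demo_path, "w") as f:
        json.dump({"method": "Cosine", "nn": "10", "n_users": "270"}, f)
    params = load_hyperparameters(demo_path)

print(params)
```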


#### Create Training Job

In [37]:
%%time

sm = boto3.client('sagemaker')
sm.create_training_job(**r_training_params)

status = sm.describe_training_job(TrainingJobName=r_job)['TrainingJobStatus']
print(status)
sm.get_waiter('training_job_completed_or_stopped').wait(TrainingJobName=r_job)
status = sm.describe_training_job(TrainingJobName=r_job)['TrainingJobStatus']
print("Training job ended with status: " + status)
if status == 'Failed':
    message = sm.describe_training_job(TrainingJobName=r_job)['FailureReason']
    print('Training failed with the following error: {}'.format(message))
    raise Exception('Training job failed')

InProgress
Training job ended with status: Completed
CPU times: user 61 ms, sys: 4.32 ms, total: 65.4 ms
Wall time: 4min
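
The waiter used above blocks without printing anything until the job finishes. If intermediate status changes are useful, a polling loop is one alternative. The sketch below is a generic helper with a hypothetical name. It takes any zero-argument callable that returns the current status, so it can be tried locally without a real training job.

```python
import time


def wait_for_terminal_status(get_status,
                             terminal=("Completed", "Failed", "Stopped"),
                             delay_seconds=30,
                             max_attempts=120):
    """Poll get_status() until it returns a terminal value, then return it."""
    last = None
    for _ in range(max_attempts):
        status = get_status()
        if status != last:  # print only when the status changes
            print("Status:", status)
            last = status
        if status in terminal:
            return status
        time.sleep(delay_seconds)
    raise TimeoutError("job did not reach a terminal status in time")


# Usage with the notebook's client and job name:
#   wait_for_terminal_status(
#       lambda: sm.describe_training_job(TrainingJobName=r_job)["TrainingJobStatus"])

# Local demonstration with canned statuses instead of a real training job.
_canned = iter(["InProgress", "InProgress", "Completed"])
final = wait_for_terminal_status(lambda: next(_canned), delay_seconds=0)
print(final)
```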


## Model Deployment

#### Create Model
Let's create a model from the training job, pointing to the Docker image in ECR and to the model artifacts produced by the training job.

In [45]:
r_hosting_container = {
    'Image': '{}.dkr.ecr.{}.amazonaws.com/cosinesimilarity:latest'.format(account, region),
    'ModelDataUrl': sm.describe_training_job(TrainingJobName=r_job)['ModelArtifacts']['S3ModelArtifacts']
}

create_model_response = sm.create_model(
    ModelName=r_job,
    ExecutionRoleArn=role,
    PrimaryContainer=r_hosting_container)

print(create_model_response['ModelArn'])

arn:aws:sagemaker:us-east-1:413491515223:model/byoc-r2019-08-24-16-38-51
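
For reference, `ModelDataUrl` points at the archive that SageMaker builds from whatever the training container wrote to `/opt/ml/model`. It is stored as `model.tar.gz` under `<S3OutputPath>/<TrainingJobName>/output/`. The helper below is a hypothetical convenience for locating that file by hand. The value returned by `describe_training_job` remains the authoritative one.

```python
def expected_model_artifact_uri(s3_output_path, training_job_name):
    """Return the S3 location where a training job's model archive is stored."""
    return "{}/{}/output/model.tar.gz".format(
        s3_output_path.rstrip("/"), training_job_name
    )


# Values taken from the training parameters and job name shown earlier.
artifact = expected_model_artifact_uri(
    "s3://ai-in-aws/Chapter7/byoc-r/output", "BYOC-r2019-08-24-16-38-51"
)
print(artifact)
```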


In [None]:
r_hosting_container

### Create Model Endpoint
Define the type of infrastructure that needs to be spun up and the model that needs to be hosted on it. 

In [47]:
r_endpoint_config = 'BYOC-r-config-' + time.strftime("%Y-%m-%d-%H-%M-%S", time.gmtime())
print(r_endpoint_config)
create_endpoint_config_response = sm.create_endpoint_config(
    EndpointConfigName=r_endpoint_config,
    ProductionVariants=[{
        'InstanceType': 'ml.m4.xlarge',
        'InitialInstanceCount': 1,
        'ModelName': r_job,
        'VariantName': 'AllTraffic'}])

print("Endpoint Config Arn: " + create_endpoint_config_response['EndpointConfigArn'])

BYOC-r-config-2019-08-24-17-11-34
Endpoint Config Arn: arn:aws:sagemaker:us-east-1:413491515223:endpoint-config/byoc-r-config-2019-08-24-17-11-34


In [48]:
%%time

r_endpoint = 'BYOC-r-endpoint-' + time.strftime("%Y%m%d%H%M", time.gmtime())
print(r_endpoint)
create_endpoint_response = sm.create_endpoint(
    EndpointName=r_endpoint,
    EndpointConfigName=r_endpoint_config)
print(create_endpoint_response['EndpointArn'])

resp = sm.describe_endpoint(EndpointName=r_endpoint)
status = resp['EndpointStatus']
print("Status: " + status)

try:
    sm.get_waiter('endpoint_in_service').wait(EndpointName=r_endpoint)
finally:
    resp = sm.describe_endpoint(EndpointName=r_endpoint)
    status = resp['EndpointStatus']
    print("Arn: " + resp['EndpointArn'])
    print("Status: " + status)

    if status != 'InService':
        raise Exception('Endpoint creation did not succeed')

BYOC-r-endpoint-201908241711
arn:aws:sagemaker:us-east-1:413491515223:endpoint/byoc-r-endpoint-201908241711
Status: Creating
Arn: arn:aws:sagemaker:us-east-1:413491515223:endpoint/byoc-r-endpoint-201908241711
Status: InService
CPU times: user 207 ms, sys: 20.9 ms, total: 228 ms
Wall time: 7min 31s


## Model Evaluation

Here we invoke the endpoint created earlier to get the top 5 recommendations for user 272.

In [49]:
ratings = pd.read_csv(op_fn)

runtime = boto3.Session().client('sagemaker-runtime')

# Get the top 5 book recommendations for user 272
# (remember, the model was trained on the first n_users = 270 users)
payload = ratings.to_csv(index=False)
response = runtime.invoke_endpoint(EndpointName=r_endpoint,
                                   ContentType='text/csv',
                                   Body=payload)

result = json.loads(response['Body'].read().decode())
result 

[['212', '173', '492', '955', '289']]

In [50]:
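# Retrieve book titles from book indexes.
# The endpoint returns the indexes as strings, so cast them to int to match book_ind.
top_books = [int(i) for i in result[0]]
bkratings[bkratings.book_ind.isin(top_books)]['title'].drop_duplicates()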

258886                   The Dark Tower (The Dark Tower, #7)
869592                              A Tree Grows in Brooklyn
966261     The Return of the King (The Lord of the Rings,...
994995                                            Fight Club
1031959                                           Life of Pi
Name: title, dtype: object
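
The `isin` lookup above lists the titles in the order in which they appear in the ratings table, not in the order returned by the endpoint. If the endpoint returns the indexes best-first (an assumption about `Recommender.R`, which is not shown), a lookup keyed on `book_ind` preserves that ranking. The helper name and the toy frame below are illustrative.

```python
import pandas as pd


def titles_in_rank_order(result, ratings):
    """Map returned book indexes to titles, keeping the endpoint's order."""
    book_inds = [int(i) for i in result[0]]  # indexes arrive as strings
    lookup = (ratings.drop_duplicates("book_ind")
                     .set_index("book_ind")["title"])
    # Skip any index that has no title in the ratings table.
    return [lookup[i] for i in book_inds if i in lookup.index]


# Toy stand-in for bkratings; only these two columns are needed.
toy = pd.DataFrame({
    "book_ind": [173, 212, 212, 492],
    "title": ["Book B", "Book A", "Book A", "Book C"],
})
ranked = titles_in_rank_order([["212", "173", "492"]], toy)
print(ranked)
```

In the notebook this would be called as `titles_in_rank_order(result, bkratings)`.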