# SageBuild Tutorial

This notebook walks you through using SageBuild to build and deploy custom models on demand or in response to events. We will reuse the code from the "scikit_bring_your_own" example notebook.

## Helpful Links
* [Blog Post]() for details of how SageBuild works.
* [See here](/notebooks/sample-notebooks/advanced_functionality/scikit_bring_your_own/scikit_bring_your_own.ipynb) for details of how to write Dockerfiles for your own algorithms.

## Table of Contents
1. [Setup](#SetUp)
2. [Deploy](#Deploy)
3. [Wait](#Wait)
4. [Use](#Use)
5. [Conclusion](#Conclusion)

## SetUp <a name="SetUp"></a>
The following sets up the packages and variables we need. Note that the `region` and `StackName` variables are read from `../config.json`, which the CloudFormation template has filled in for you.

In [None]:
import boto3
import json
from subprocess import check_output as run
from subprocess import STDOUT
from botocore.exceptions import ClientError
from time import sleep
import numpy as np
import pandas as pd
from io import StringIO

cf = boto3.client('cloudformation')
sns = boto3.client('sns')
step = boto3.client('stepfunctions')
s3 = boto3.resource('s3')
ssm = boto3.client('ssm')
sagemaker = boto3.client('sagemaker-runtime')
Lambda=boto3.client('lambda')

with open('../config.json') as json_file:  
    data = json.load(json_file)
    
region=data['Region']
StackName=data['StackName']
data='iris.csv'

#Get outputs from build stack
result=cf.describe_stacks(
    StackName=StackName
)
#Put Outputs in a dict for easy use
outputs={}
for output in result['Stacks'][0]['Outputs']:
    outputs[output['OutputKey']]=output['OutputValue']
print("Stack Outputs")
print(json.dumps(outputs,indent=4))

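We need to make sure the SageBuild template is configured correctly for this example, which brings its own Docker images. The following code sets the stack's `ConfigFramework` parameter to `BYOD` and waits for the stack update to finish.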

In [None]:
params=result["Stacks"][0]["Parameters"]
#set the framework parameter; all other parameters keep their current values
for param in params:
    if param["ParameterKey"]=="ConfigFramework":
        param["ParameterValue"]="BYOD"

try:
    cf.update_stack(
        StackName=StackName,
        UsePreviousTemplate=True,
        Parameters=params,
        Capabilities=[
            'CAPABILITY_NAMED_IAM',
        ]
    )
    waiter = cf.get_waiter('stack_update_complete')
    print("Waiting for stack update")
    waiter.wait(
        StackName=StackName,
        WaiterConfig={
            'Delay':10,
            'MaxAttempts':600
        }
    )

except ClientError as e:
    if(e.response["Error"]["Message"]=="No updates are to be performed."):
        pass
    else:
        raise e
print("stack ready!")
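
The loop above does nothing if the stack has no `ConfigFramework` parameter, so a typo in the key would go unnoticed. A minimal sketch of a stricter variant follows. The helper name `with_parameter` is hypothetical (it is not part of SageBuild or boto3); it copies the parameter list, sets one value, and raises if the key is missing.

```python
def with_parameter(params, key, value):
    """Return a copy of a CloudFormation parameter list with one value changed.

    `params` is a list of dicts shaped like the "Parameters" entry returned by
    describe_stacks. Hypothetical helper, shown for illustration only.
    """
    updated = [dict(p) for p in params]  # shallow copy so the input is not mutated
    found = False
    for p in updated:
        if p["ParameterKey"] == key:
            p["ParameterValue"] = value
            found = True
    if not found:
        # Fail loudly instead of sending an unchanged parameter list
        raise KeyError(key)
    return updated
```

With the variables from the cell above, this could be called as `params = with_parameter(result["Stacks"][0]["Parameters"], "ConfigFramework", "BYOD")` before `cf.update_stack`.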

## Configuration

Both the training job and the endpoint have various configuration parameters. The build generates those parameters by calling two Lambda functions with the current build state and the parameters stored in SSM Parameter Store. In the following cell we change the parameters in Parameter Store to match our build.


In [None]:
store=outputs["ParameterStore"]
result=ssm.get_parameter(Name=store)

params=json.loads(result["Parameter"]["Value"])
params["dockerfile_path_Training"]="example/train"
params["dockerfile_path_Inference"]="example/inference"
params["hyperparameters"]={}
params["channels"]={
    "training":{
        "path":"training/iris"
    }
}

ssm.put_parameter(
    Name=store,
    Type="String",
    Overwrite=True,
    Value=json.dumps(params)
)
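
The cell above is a read-modify-write: it loads the stored JSON, overrides a few top-level keys, and writes the whole document back. A minimal sketch of that merge as a pure function follows; the name `merge_build_params` is hypothetical and only illustrates the pattern.

```python
import json

def merge_build_params(current_json, overrides):
    """Merge override keys into the JSON document stored in Parameter Store.

    The merge is shallow: a nested value such as "channels" is replaced as a
    whole, which matches the assignments in the notebook cell. Keys that are
    not overridden are preserved.
    """
    params = json.loads(current_json)
    params.update(overrides)
    return json.dumps(params)
```

Under these assumptions, the result could be passed as `Value=` to `ssm.put_parameter`, for example `merge_build_params(result["Parameter"]["Value"], {"hyperparameters": {}})`.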

The following shell commands configure git to access AWS CodeCommit and clone the example repo.

In [None]:
#configure git to access CodeCommit; this uses the SageMaker instance's role for permissions
!git config --global credential.helper '!aws codecommit credential-helper $@'
!git config --global credential.UseHttpPath true

#clone down our example code
!git clone https://github.com/aws-samples/aws-sagemaker-build.git


## Deploy! <a name="Deploy"></a>
The following will:
- add the CodeCommit repo created by the CloudFormation template as a remote named `deploy`
- push the example code to that repo (this triggers a build)
- upload our data to the DataBucket created by the CloudFormation template (this also triggers a build)

Once a build has started, no new build can start until the running one finishes.
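
Because only one build runs at a time, it can be useful to check for a running execution before triggering another. The following is a minimal sketch, assuming the `step` client and the `StateMachine` stack output from the Setup section; the helper name `build_in_progress` is hypothetical.

```python
def build_in_progress(step_client, state_machine_arn):
    """Return True if the state machine has at least one RUNNING execution.

    `step_client` is expected to be a boto3 Step Functions client.
    """
    response = step_client.list_executions(
        stateMachineArn=state_machine_arn,
        statusFilter="RUNNING",
        maxResults=1,  # one result is enough to answer the question
    )
    return len(response["executions"]) > 0
```

For example, `build_in_progress(step, outputs['StateMachine'])` could guard the push and upload in the next cell.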

In [None]:
#push our Dockerfile code to the "deploy" CodeCommit repo
run("cd aws-sagemaker-build && git remote add deploy {0}; git push deploy master".format(outputs['RepoUrl']),
    stderr=STDOUT,
    shell=True) 
print("code Pushed")

#upload the data to the DataBucket
#use a name that does not shadow the built-in `object`
data_object = s3.Object(outputs["DataBucket"],'training/iris/data.csv')
data_object.upload_file(data) 
print("data uploaded")

You can also trigger a build by publishing to the launch topic directly:

In [None]:
result=sns.publish(
    TopicArn=outputs['LaunchTopic'],
    Message="{}" #message is not important, just publishing to topic starts build
)
print("message published")

## Wait <a name="Wait"></a>


You can use the following code to subscribe a phone number to the training status topic and receive SMS notifications about the build:

In [None]:
result=sns.subscribe(
    TopicArn=outputs['TrainStatusTopic'],
    Protocol="sms",
    Endpoint="#-###-###-####" #put your phone number here
)
print("subscribed to topic")

We can get the status of the StateMachine as it builds and deploys our custom model. The following code finds the running execution and polls it until the build completes.

In [None]:
%%time 
#list all executions for our StateMachine to get our current running one
result=step.list_executions(
    stateMachineArn=outputs['StateMachine'],
    statusFilter="RUNNING"
)["executions"]
print(result)
if len(result) > 0:
    response = step.describe_execution(
        executionArn=result[0]['executionArn']
    )
    status=response['status']
    print(status,response['name'])
    #poll status till execution finishes
    while status == "RUNNING":
        print('.',end="")
        sleep(5)
        status=step.describe_execution(executionArn=result[0]['executionArn'])['status']
    print()
    print(status)
else:
    print("no running tasks")
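
The polling loop above runs until the execution leaves the `RUNNING` state, however long that takes. A minimal sketch of the same loop with an upper bound on the wait follows. The function name `wait_for_execution` and its default values are assumptions chosen for illustration, not SageBuild settings.

```python
import time

def wait_for_execution(step_client, execution_arn,
                       poll_seconds=5, timeout_seconds=3600, sleep=time.sleep):
    """Poll a Step Functions execution until it is no longer RUNNING.

    Returns the final status string. Raises TimeoutError if the execution is
    still RUNNING after roughly `timeout_seconds` of waiting. The `sleep`
    argument exists only so the delay can be replaced in tests.
    """
    waited = 0
    while True:
        status = step_client.describe_execution(executionArn=execution_arn)["status"]
        if status != "RUNNING":
            return status
        if waited >= timeout_seconds:
            raise TimeoutError("execution still RUNNING after %d seconds" % waited)
        sleep(poll_seconds)
        waited += poll_seconds
```

With the variables from the cell above, a call might look like `wait_for_execution(step, result[0]['executionArn'])`.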


## Use <a name="Use"></a>
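Next we sample some rows from our data and send them to our newly deployed endpoint. The first column of `iris.csv` holds the label, so only the remaining columns are sent as features.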

In [None]:
%%time 
test_data=pd.read_csv(data, header=None).sample(10)
test_X=test_data.iloc[:,1:]
test_y=test_data.iloc[:,0]

#convert test_X to csv
Body=str.encode(test_X.to_csv(header=False,index=False))

result=sagemaker.invoke_endpoint(
    EndpointName=outputs["SageMakerEndpoint"],
    Body=Body,    
    ContentType="text/csv",
    Accept="text/csv"
)

print(pd.read_csv(StringIO(result['Body'].read().decode('utf-8')),header=None))
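
The cell above keeps the true labels in `test_y` but does not compare them with the endpoint's answer. A minimal sketch of that comparison follows, assuming the endpoint returns CSV with one predicted label per line, in the same order as the rows sent. The helper names `parse_predictions` and `accuracy` are hypothetical.

```python
import csv
from io import StringIO

def parse_predictions(csv_text):
    """Return the first column of every non-empty row in a CSV response body."""
    return [row[0].strip() for row in csv.reader(StringIO(csv_text)) if row]

def accuracy(predicted, expected):
    """Fraction of positions where the predicted label equals the expected one."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("no labels to compare")
    matches = sum(1 for p, e in zip(predicted, expected) if p == str(e).strip())
    return matches / len(expected)
```

The response body is a stream that can be read only once, so to use these helpers the decoded text would need to be stored in a variable first, for example `text = result['Body'].read().decode('utf-8')`, and then passed along as `accuracy(parse_predictions(text), list(test_y))`.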

## Conclusion <a name="Conclusion"></a>

We hope SageBuild helps you develop and deploy custom SageMaker models faster and more easily. If you run into any problems, please let us know through our GitHub issues [here](https://github.com/aws-samples/aws-sagemaker-build/issues). Feel free to send us a pull request too!