## End-to-end ML pipeline using AWS SageMaker and Lambda

**This sample is provided for demonstration purposes only. Test it thoroughly before adapting the code to your own use cases.**

### Step 0: Get Admin Setup Results
Load the resources created during admin setup: bucket names, CodeCommit repo, Docker image, IAM roles, ...

To keep things organized, we will save our `Source Code` (data processing, model training/serving scripts), `datasets`, trained `model binaries`, and their `test-performance metrics` on S3, **versioned by the date/time of each update.**

In [None]:
import sagemaker
from sagemaker.s3 import S3Uploader
import boto3
import zipfile
import json
from time import gmtime, strftime

# Grab admin resources (S3 Bucket name, IAM Roles and Docker Image for Training)
with open('admin_setup.txt', 'r') as filehandle:
    admin_setup = json.load(filehandle)

SOURCE_DATA = admin_setup["raw_data_path"]
BUCKET = admin_setup["project_bucket"]
REPO_NAME = admin_setup["repo_name"]
TRAINING_IMAGE = admin_setup["docker_image"]
WORKFLOW_EXECUTION_ROLE = admin_setup["workflow_execution_role"]

# MLOps Hygiene
WORKFLOW_NAME = "my-project"
WORKFLOW_DATE_TIME = strftime("%Y-%m-%d-%H-%M-%S", gmtime())
TRAINING_JOB_NAME = "{}-{}".format(WORKFLOW_NAME, WORKFLOW_DATE_TIME)

SOURCE_CODE_PREFIX = "{}/{}".format(WORKFLOW_DATE_TIME, "source-code")
SOURCE_CODE = "s3://{}/{}/{}".format(BUCKET, SOURCE_CODE_PREFIX, "sourcedir.tar.gz")

OUTPUT_ARTIFACTS_PREFIX = "{}/{}".format(WORKFLOW_DATE_TIME, "model-artifacts")
OUTPUT_ARTIFACTS_PATH = 's3://{}/{}'.format(BUCKET, WORKFLOW_DATE_TIME + '/model-artifacts/')

TRAINING_DATA_PATH = "s3://{}/{}/data/train/train.csv".format(BUCKET, WORKFLOW_DATE_TIME)
VALIDATION_DATA_PATH = "s3://{}/{}/data/validation/validation.csv".format(BUCKET, WORKFLOW_DATE_TIME)
TESTING_DATA_PATH = "s3://{}/{}/data/test/test.csv".format(BUCKET, WORKFLOW_DATE_TIME)



# The following method will be used throughout this notebook to create Lambda functions without going to the console
session = sagemaker.Session()
lambda_client = boto3.client('lambda')

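def create_lambda_function(zip_name, lambda_source_code, function_name, description):
    # Package the single source file at the root of the zip archive
    with zipfile.ZipFile(zip_name, mode='w') as zf:
        zf.write(lambda_source_code, arcname=lambda_source_code.split('/')[-1])

    # Stage the archive on S3 so that Lambda can pull it from there
    S3Uploader.upload(local_path=zip_name, 
                      desired_s3_uri="s3://{}/{}".format(BUCKET, SOURCE_CODE_PREFIX),
                      # SageMaker SDK v2 names this argument `sagemaker_session`
                      session=session
                     )

    response = lambda_client.create_function(
        FunctionName=function_name,
        # Lambda no longer accepts python3.6 for new functions; use a supported runtime
        Runtime='python3.12',
        Role=WORKFLOW_EXECUTION_ROLE,
        # Handler is '<module>.<function>', so the module name must match the zipped file
        Handler=zip_name.split('.')[0]+'.lambda_handler',
        Code={
            'S3Bucket': BUCKET,
            'S3Key': '{}/{}'.format(SOURCE_CODE_PREFIX, zip_name)
        },
        Description=description,
        Timeout=180,
        MemorySize=256
    )
    return response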
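
The helper above relies on two Lambda conventions: the handler string has the form `<module>.<function>`, and the module file must sit at the root of the zip archive. The following is a minimal, standard-library-only sketch of both conventions. The names `handler_name` and `build_zip_bytes`, as well as the `example.py` file, are hypothetical and are not part of this notebook.

```python
import io
import os
import zipfile


def handler_name(zip_name, function="lambda_handler"):
    """Derive the Lambda handler string '<module>.<function>' from a zip file name."""
    module = os.path.splitext(os.path.basename(zip_name))[0]
    return "{}.{}".format(module, function)


def build_zip_bytes(file_name, source_text):
    """Build an in-memory zip that holds a single file at the archive root."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, mode="w") as zf:
        # Drop any directory component so the module is importable by Lambda
        zf.writestr(os.path.basename(file_name), source_text)
    return buffer.getvalue()


# Hypothetical source file, used only to show the resulting archive layout
archive = build_zip_bytes(
    "./workflow-orchestration-src/example.py",
    "def lambda_handler(event, context):\n    return event\n",
)
with zipfile.ZipFile(io.BytesIO(archive)) as zf:
    print(zf.namelist())
print(handler_name("codecommit_to_s3.zip"))
```

Keeping the zip name and the script name identical, as this notebook does, is what makes the derived handler resolve correctly.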

### Step 1: Move Code from CodeCommit to S3
The first step in training a model on SageMaker is to copy our source code to S3. The SageMaker SDK does this for you automatically; since we are working with `boto3` here, we take care of it ourselves.

In [None]:
!pygmentize workflow-orchestration-src/codecommit_to_s3.py

Let's run the above script from a Lambda function; this will help us automate the task later.

First create the Lambda function:

In [None]:
create_lambda_function(zip_name="codecommit_to_s3.zip",
                       lambda_source_code="./workflow-orchestration-src/codecommit_to_s3.py",
                       function_name=WORKFLOW_NAME + '-codecommit-to-s3',
                       description="Copy code files from CodeCommit to a tarball on S3"
                      )

Run it:

In [None]:
codecommit_to_s3_event = {
    "s3BucketName":BUCKET,
    "s3BucketKey":"{}/{}".format(WORKFLOW_DATE_TIME, "source-code"),
    "repository": REPO_NAME,
    "branch": "master",
    "codecommitRegion":"us-east-1",
    "repository_sagemaker_key": "sagemaker-train-serve-src",
    "repository_sm_processing_key": "sagemaker-processing-src"
}
response = lambda_client.invoke(
    FunctionName=WORKFLOW_NAME + '-codecommit-to-s3',
    Payload=json.dumps(codecommit_to_s3_event).encode()
)
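
A synchronous `invoke` call returns status code 200 even when the function itself raised an exception; in that case the response carries a `FunctionError` key and the payload holds the error details. Below is a minimal sketch of a helper that decodes the payload and surfaces such failures. The name `read_invoke_response` is hypothetical, and the fake response only stands in for what `lambda_client.invoke` returns so that the sketch runs without AWS credentials.

```python
import io
import json


def read_invoke_response(response):
    """Return the decoded JSON payload of a Lambda invoke() response.

    Raises RuntimeError when Lambda reports that the function itself failed.
    """
    raw = response["Payload"].read()
    payload = json.loads(raw.decode("utf-8")) if raw else None
    if "FunctionError" in response:
        raise RuntimeError("Lambda function failed: {}".format(payload))
    return payload


# Stand-in for a real response; the real 'Payload' is a stream with a read() method too
fake_response = {
    "StatusCode": 200,
    "Payload": io.BytesIO(json.dumps({"status": "ok"}).encode("utf-8")),
}
print(read_invoke_response(fake_response))
```

In this notebook it could be applied to any `response` returned by `lambda_client.invoke(...)`. Note that the payload stream can only be read once.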

### Step 2: Run SageMaker Processing Job with `boto3`

The `boto3` client for SageMaker is more verbose than the SageMaker SDK, yet it gives more visibility into the low-level details of Amazon SageMaker.

Let's look at the python script for our data processing:

In [None]:
!pygmentize sagemaker-processing-src/processing.py

To run the above script, we will use [boto3.client('sagemaker').create_processing_job()](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_processing_job) inside a Lambda function.

Here is the code for the lambda function:

In [None]:
!pygmentize workflow-orchestration-src/create_sagemaker_prcoessing_job.py
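
The Lambda source is not reproduced in this notebook, so as a rough idea of what such a function has to assemble, here is a sketch that maps an event like `processing_job_event` onto the keyword arguments of `create_processing_job`. The function name `build_processing_request`, the container paths under `/opt/ml/processing`, and the output location are assumptions made for illustration; the actual Lambda may differ.

```python
def build_processing_request(event):
    """Translate the notebook-style event into create_processing_job keyword arguments."""
    # Assumption: the processing script lives under the source-code prefix on S3
    code_uri = "s3://{}/{}".format(event["BUCKET"], event["SOURCE_CODE_PREFIX"])
    # Assumption: processed data is written under <date-time>/data
    output_uri = "s3://{}/{}/data".format(event["BUCKET"], event["WORKFLOW_DATE_TIME"])

    def s3_input(name, s3_uri, local_path):
        return {
            "InputName": name,
            "S3Input": {
                "S3Uri": s3_uri,
                "LocalPath": local_path,
                "S3DataType": "S3Prefix",
                "S3InputMode": "File",
            },
        }

    return {
        "ProcessingJobName": event["JOB_NAME"],
        "RoleArn": event["ROLE_ARN"],
        "AppSpecification": {
            "ImageUri": event["TRAINING_IMAGE"],
            "ContainerEntrypoint": [
                "python3",
                "/opt/ml/processing/input/code/" + event["ENTRY_POINT_SCRIPT"],
            ],
        },
        "ProcessingResources": {
            "ClusterConfig": {
                "InstanceType": event["INSTANCE_TYPE"],
                "InstanceCount": event["INSTANCE_COUNT"],
                "VolumeSizeInGB": event["VOLUME_SIZE_GB"],
            }
        },
        "ProcessingInputs": [
            s3_input("code", code_uri, "/opt/ml/processing/input/code"),
            s3_input("data", event["DATA_SOURCE"], "/opt/ml/processing/input/data"),
        ],
        "ProcessingOutputConfig": {
            "Outputs": [
                {
                    "OutputName": "data",
                    "S3Output": {
                        "S3Uri": output_uri,
                        "LocalPath": "/opt/ml/processing/output",
                        "S3UploadMode": "EndOfJob",
                    },
                }
            ]
        },
    }


# Placeholder values only; none of these resources exist
sample_event = {
    "JOB_NAME": "my-project-2020-01-01-00-00-00",
    "BUCKET": "example-bucket",
    "WORKFLOW_DATE_TIME": "2020-01-01-00-00-00",
    "SOURCE_CODE_PREFIX": "2020-01-01-00-00-00/source-code",
    "ENTRY_POINT_SCRIPT": "processing.py",
    "TRAINING_IMAGE": "example-image-uri",
    "ROLE_ARN": "example-role-arn",
    "INSTANCE_TYPE": "ml.c5.xlarge",
    "INSTANCE_COUNT": 1,
    "VOLUME_SIZE_GB": 10,
    "DATA_SOURCE": "s3://example-bucket/raw/",
}
request = build_processing_request(sample_event)
print(sorted(request))
```

Inside a Lambda handler, the result would be passed along as `boto3.client('sagemaker').create_processing_job(**request)`.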

Let's build it:

In [None]:
create_lambda_function(zip_name="create_sagemaker_prcoessing_job.zip",
                       lambda_source_code="./workflow-orchestration-src/create_sagemaker_prcoessing_job.py",
                       function_name=WORKFLOW_NAME + '-create-sagemaker-prcoessing-job',
                       description="Creates Sagemaker Processing Job"
                      )

Run it:

In [None]:
processing_job_event = {
    "JOB_NAME":"{}-{}".format(WORKFLOW_NAME, WORKFLOW_DATE_TIME),
    "BUCKET": BUCKET,
    "WORKFLOW_DATE_TIME":WORKFLOW_DATE_TIME,
    "SOURCE_CODE_PREFIX":"{}/{}".format(WORKFLOW_DATE_TIME, "source-code"),
    "ENTRY_POINT_SCRIPT":"processing.py",
    "TRAINING_IMAGE":TRAINING_IMAGE,
    "ROLE_ARN":WORKFLOW_EXECUTION_ROLE,
    "INSTANCE_TYPE":"ml.c5.xlarge",
    "INSTANCE_COUNT":1,
    "VOLUME_SIZE_GB":10,
    "DATA_SOURCE": SOURCE_DATA
}
response = lambda_client.invoke(
    FunctionName=WORKFLOW_NAME + '-create-sagemaker-prcoessing-job',
    Payload=json.dumps(processing_job_event).encode()
)

Let's build a mechanism to check on the processing job status... again using a Lambda!

In [None]:
create_lambda_function(zip_name='query_data_processing_status.zip',
                       lambda_source_code='./workflow-orchestration-src/query_data_processing_status.py',
                       function_name=WORKFLOW_NAME + '-query-data-processing-status',
                       description='Get Status of SageMaker Processing Job'
                      )

processing_status_event = {
    'JOB_NAME':"{}-{}".format(WORKFLOW_NAME, WORKFLOW_DATE_TIME)
}

#### Make sure the SageMaker processing job is done (status = Completed) before launching the training step

In [None]:
%cd workflow-orchestration-src
import query_data_processing_status as qs
print(qs.lambda_handler(processing_status_event, ""))
%cd ..
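
Instead of re-running the cell above by hand, the status check can be wrapped in a small polling loop. This is a minimal sketch, not part of the repository: `wait_for_job` is a hypothetical helper that takes any zero-argument callable returning a status string, which keeps it independent of AWS and easy to try out with fake data.

```python
import time

# Statuses after which a SageMaker job will not change any more
TERMINAL_STATUSES = {"Completed", "Failed", "Stopped"}


def wait_for_job(get_status, poll_seconds=30, max_polls=120, sleep=time.sleep):
    """Call get_status() until it returns a terminal status, then return that status."""
    for _ in range(max_polls):
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)
    raise TimeoutError("job did not finish after {} polls".format(max_polls))


# Fake status sequence so the sketch runs instantly and without AWS access
fake_statuses = iter(["InProgress", "InProgress", "Completed"])
print(wait_for_job(lambda: next(fake_statuses), sleep=lambda seconds: None))
```

Against a real job, `get_status` could be `lambda: boto3.client('sagemaker').describe_processing_job(ProcessingJobName=name)['ProcessingJobStatus']`, where `name` is the job name used above. The caller should still check that the returned status is `Completed` rather than `Failed` or `Stopped` before moving on.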

### Step 3: Create SageMaker Training Job Using `boto3`

When using `boto3` to launch a training job, we must explicitly point it to our source code on S3 and docker image in addition to what SageMaker estimators expect.

Let's look at the code for the `create_sagemaker_training_job` lambda function.

In [None]:
!pygmentize workflow-orchestration-src/create_sagemaker_training_job.py
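
To make the "explicitly point it to our source code and Docker image" part concrete, here is a sketch of how an event like the one used below could be mapped onto the keyword arguments of `create_training_job`. SageMaker framework containers read the script name and the code tarball location from the `sagemaker_program` and `sagemaker_submit_directory` hyperparameters. The function name `build_training_request`, the channel names `train` and `test`, and the runtime limit are assumptions; the actual Lambda may be written differently.

```python
import json


def build_training_request(event, max_runtime_seconds=3600):
    """Translate the notebook-style event into create_training_job keyword arguments."""

    def channel(name, s3_uri):
        return {
            "ChannelName": name,
            "ContentType": "text/csv",
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "S3Prefix",
                    "S3Uri": s3_uri,
                    "S3DataDistributionType": "FullyReplicated",
                }
            },
        }

    return {
        "TrainingJobName": event["TRAINING_JOB_NAME"],
        "RoleArn": event["ROLE_ARN"],
        "AlgorithmSpecification": {
            "TrainingImage": event["TRAINING_IMAGE"],
            "TrainingInputMode": "File",
        },
        # Hyperparameter values must be strings; the SageMaker SDK JSON-encodes them
        "HyperParameters": {
            "sagemaker_program": json.dumps(event["ENTRY_POINT_SCRIPT"]),
            "sagemaker_submit_directory": json.dumps(event["SOURCE_CODE"]),
        },
        "InputDataConfig": [
            channel("train", event["TRAINING_DATA"]),
            channel("test", event["TESTING_DATA"]),
        ],
        "OutputDataConfig": {"S3OutputPath": event["OUTPUT_ARTIFACTS_PATH"]},
        "ResourceConfig": {
            "InstanceType": event["INSTANCE_TYPE"],
            "InstanceCount": event["INSTANCE_COUNT"],
            "VolumeSizeInGB": event["VOLUME_SIZE_GB"],
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": max_runtime_seconds},
    }


# Placeholder values only; none of these resources exist
sample_training_event = {
    "TRAINING_JOB_NAME": "my-project-2020-01-01-00-00-00",
    "TRAINING_DATA": "s3://example-bucket/2020-01-01-00-00-00/data/train/train.csv",
    "TESTING_DATA": "s3://example-bucket/2020-01-01-00-00-00/data/validation/validation.csv",
    "SOURCE_CODE": "s3://example-bucket/2020-01-01-00-00-00/source-code/sourcedir.tar.gz",
    "ENTRY_POINT_SCRIPT": "train.py",
    "TRAINING_IMAGE": "example-image-uri",
    "ROLE_ARN": "example-role-arn",
    "OUTPUT_ARTIFACTS_PATH": "s3://example-bucket/2020-01-01-00-00-00/model-artifacts/",
    "INSTANCE_TYPE": "ml.c5.xlarge",
    "INSTANCE_COUNT": 1,
    "VOLUME_SIZE_GB": 10,
}
training_request = build_training_request(sample_training_event)
print(training_request["HyperParameters"])
```

The resulting dictionary would be passed as `boto3.client('sagemaker').create_training_job(**training_request)`.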

Let's create it:

In [None]:
create_lambda_function(zip_name='create_sagemaker_training_job.zip',
                       lambda_source_code='./workflow-orchestration-src/create_sagemaker_training_job.py',
                       function_name=WORKFLOW_NAME + '-create-sagemaker-training-job',
                       description='Creates SageMaker Training Job'
                      )

Run the training job once the processing job is done:

In [None]:
training_job_event = {
    "TRAINING_JOB_NAME":"{}-{}".format(WORKFLOW_NAME, WORKFLOW_DATE_TIME),
    "TRAINING_DATA":TRAINING_DATA_PATH,
    "TESTING_DATA":VALIDATION_DATA_PATH,
    "SOURCE_CODE":"s3://{}/{}/{}".format(BUCKET, WORKFLOW_DATE_TIME, "source-code/sourcedir.tar.gz"),
    "ENTRY_POINT_SCRIPT":"train.py",
    "TRAINING_IMAGE":TRAINING_IMAGE,
    "ROLE_ARN":WORKFLOW_EXECUTION_ROLE,
    "OUTPUT_ARTIFACTS_PATH":"s3://{}/{}/{}/".format(BUCKET, WORKFLOW_DATE_TIME, "model-artifacts"),
    "INSTANCE_TYPE":"ml.c5.xlarge",
    "INSTANCE_COUNT":1,
    "VOLUME_SIZE_GB":10,
    "PROCESSING_JOB_NAME":"{}-{}".format(WORKFLOW_NAME, WORKFLOW_DATE_TIME)
}

response = lambda_client.invoke(
    FunctionName=WORKFLOW_NAME + '-create-sagemaker-training-job',
    Payload=json.dumps(training_job_event).encode()
)

Let's check on its status:

In [None]:
create_lambda_function(zip_name='query_training_status.zip',
                       lambda_source_code='./workflow-orchestration-src/query_training_status.py',
                       function_name=WORKFLOW_NAME + '-query-training-status',
                       description='Get Status of SageMaker Training Job'
                      )

training_status_event = {
    'JOB_NAME':"{}-{}".format(WORKFLOW_NAME, WORKFLOW_DATE_TIME)
}

### Step 4: Deploy the model on SageMaker from the model artifacts on S3 using `boto3`

#### Make sure the SageMaker training job is done (status = Completed) before deploying the model

In [None]:
%cd workflow-orchestration-src
import query_training_status as qs
print(qs.lambda_handler(training_status_event, ""))
%cd ..

#### If training is done, then check model accuracy before deploying

In [None]:
create_lambda_function(zip_name='query_model_accuracy.zip',
                       lambda_source_code='./workflow-orchestration-src/query_model_accuracy.py',
                       function_name=WORKFLOW_NAME + '-query-model-accuracy',
                       description='Get Model Accuracy from SageMaker Training Job'
                      )

model_accuracy_event = {
    'TrainingJobName':"{}-{}".format(WORKFLOW_NAME, WORKFLOW_DATE_TIME)
}

In [None]:
%cd workflow-orchestration-src
import query_model_accuracy as qs
print(qs.lambda_handler(model_accuracy_event, ""))
%cd ..
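
The check above can be turned into an explicit deployment gate. A `describe_training_job` response lists the last reported value of each metric under `FinalMetricDataList`; the sketch below reads one metric from such a response and compares it with a threshold. The helper names, the metric name `validation:accuracy`, and the threshold are hypothetical and only illustrate the idea.

```python
def final_metric(description, metric_name):
    """Return the final value of metric_name from a describe_training_job response."""
    for metric in description.get("FinalMetricDataList", []):
        if metric["MetricName"] == metric_name:
            return metric["Value"]
    raise KeyError("metric {!r} not reported by the training job".format(metric_name))


def should_deploy(description, metric_name, threshold, higher_is_better=True):
    """Decide whether the trained model is good enough to deploy."""
    value = final_metric(description, metric_name)
    return value >= threshold if higher_is_better else value <= threshold


# Trimmed-down stand-in for a real describe_training_job response
fake_description = {
    "TrainingJobStatus": "Completed",
    "FinalMetricDataList": [
        {"MetricName": "validation:accuracy", "Value": 0.91},
    ],
}
print(should_deploy(fake_description, "validation:accuracy", threshold=0.85))
```

For an error metric such as RMSE, where smaller values are better, pass `higher_is_better=False`.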

Let's look at the code for the `deploy_sagemaker_model` Lambda function. This function is in charge of creating a SageMaker endpoint for our trained model. If the endpoint already exists, the function updates it with the newly retrained model.

In [None]:
!pygmentize workflow-orchestration-src/deploy_sagemaker_model.py
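
The create-or-update decision described above can be sketched as follows. This is not the repository's implementation: `create_or_update_endpoint` is a hypothetical helper that receives the SageMaker client as an argument and uses its `list_endpoints`, `create_endpoint`, and `update_endpoint` calls. The `FakeSageMakerClient` class exists only so the sketch can run without AWS access.

```python
def create_or_update_endpoint(sm_client, endpoint_name, endpoint_config_name):
    """Create the endpoint if it is missing, otherwise switch it to the new config."""
    # NameContains does substring matching, so compare names exactly afterwards.
    # Simplification: only the first page of results is inspected.
    listing = sm_client.list_endpoints(NameContains=endpoint_name)
    exists = any(e["EndpointName"] == endpoint_name for e in listing["Endpoints"])
    if exists:
        sm_client.update_endpoint(
            EndpointName=endpoint_name, EndpointConfigName=endpoint_config_name
        )
        return "updated"
    sm_client.create_endpoint(
        EndpointName=endpoint_name, EndpointConfigName=endpoint_config_name
    )
    return "created"


class FakeSageMakerClient:
    """In-memory stand-in that mimics the three client calls used above."""

    def __init__(self, existing):
        self.existing = list(existing)
        self.calls = []

    def list_endpoints(self, NameContains):
        matches = [n for n in self.existing if NameContains in n]
        return {"Endpoints": [{"EndpointName": n} for n in matches]}

    def create_endpoint(self, EndpointName, EndpointConfigName):
        self.existing.append(EndpointName)
        self.calls.append(("create", EndpointName, EndpointConfigName))

    def update_endpoint(self, EndpointName, EndpointConfigName):
        self.calls.append(("update", EndpointName, EndpointConfigName))


client = FakeSageMakerClient(existing=["my-project-old"])
print(create_or_update_endpoint(client, "my-project", "my-project-config-1"))
print(create_or_update_endpoint(client, "my-project", "my-project-config-2"))
```

With a real client, `sm_client` would be `boto3.client('sagemaker')`, and the endpoint configuration named by `endpoint_config_name` must already exist.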

Again, let's put this function in a Lambda:

In [None]:
create_lambda_function(zip_name='deploy_sagemaker_model.zip',
                       lambda_source_code='./workflow-orchestration-src/deploy_sagemaker_model.py',
                       function_name=WORKFLOW_NAME + '-deploy-sagemaker-model-job',
                       description='Creates and Deploys SageMaker Model From Training Artifacts'
                      )

And run it:

In [None]:
deploy_event = {
    "EndPointConfigName":"{}-{}".format(WORKFLOW_NAME, WORKFLOW_DATE_TIME),
    "EndPointName":WORKFLOW_NAME,
    "ModelURL":"s3://{}/{}/{}".format(BUCKET, WORKFLOW_DATE_TIME, "model-artifacts/"),
    "Directory":"s3://{}/{}/{}".format(BUCKET, WORKFLOW_DATE_TIME, "source-code/sourcedir.tar.gz"),
    "Program":"train.py",
    "Region":"us-east-1",
    "TrainingImage":TRAINING_IMAGE,
    "ROLE_ARN":WORKFLOW_EXECUTION_ROLE,
    "OUTPUT_ARTIFACTS_PATH":"s3://{}/{}/{}/{}".format(BUCKET, WORKFLOW_DATE_TIME, "model-artifacts", "{}-{}".format(WORKFLOW_NAME, WORKFLOW_DATE_TIME)+"/output/model.tar.gz"),
    "DeploymentInstanceType":"ml.c5.xlarge",
    "DeploymentInstanceCount":1
}


response = lambda_client.invoke(
    FunctionName=WORKFLOW_NAME + '-deploy-sagemaker-model-job',
    Payload=json.dumps(deploy_event).encode()
)

In [None]:
boto3.client("sagemaker").describe_endpoint(EndpointName=WORKFLOW_NAME)["EndpointStatus"]

### Test Endpoint

In [None]:
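import boto3
import pandas as pd
# Note: load_boston was removed in scikit-learn 1.2, so this cell needs an older version
from sklearn.datasets import load_boston
import json

# Load the Boston housing data into a DataFrame of features plus the target column
data = load_boston()

df = pd.DataFrame(data.data, columns=data.feature_names)
df['PRICE'] = data.target
print(df.shape)

# Send the feature columns as CSV (no header, no index) to the deployed endpoint
sagemaker_runtime = boto3.client('sagemaker-runtime')
response = sagemaker_runtime.invoke_endpoint(
    EndpointName="my-project",  # same value as WORKFLOW_NAME above
    Body=df[data.feature_names].to_csv(header=False, index=False).encode('utf-8'),
    ContentType='text/csv')

# Decode the JSON response and show the first ten predictions
decoded_response = json.loads(response['Body'].read().decode("utf-8"))
print(decoded_response[0:10])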
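
Printing the first predictions confirms that the endpoint responds, but it says little about quality. As a small follow-up, the sketch below computes the root-mean-square error between predictions and known targets. The `rmse` helper is hypothetical, the numbers are made up for illustration, and the sketch assumes the endpoint returns a flat list of numbers.

```python
import math


def rmse(predictions, targets):
    """Root-mean-square error between two equally long sequences of numbers."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    squared_errors = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up values, only to show the call
print(rmse([24.0, 21.5, 34.0], [24.0, 21.6, 34.7]))
```

With the variables from the cell above, this would be called as `rmse(decoded_response, df['PRICE'].tolist())`.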