Sample Collection (Lab 1)

Our goal in this section is to gather data from our Fulfillment Center (FC) and store it as a dataset that can later be used to train machine learning models.

Create an S3 bucket to store the samples in

Let's work backwards by first creating a place to store our data. We will use S3 as our datastore because it is well-equipped to handle very large datasets. Later in the workshop we will discover that SageMaker is well integrated with S3, making it easy to train ML models.

Instructions
  1. Navigate to S3 from the AWS home page.
  2. Select the + Create bucket button in the top-left.
  3. Type in aim368-samples-bucket-<FIRSTNAME>-<LASTNAME> (e.g. aim368-samples-bucket-mike-calder)

⚠️ Please follow all naming instructions: Failure to do so could prevent parts of the lab from working!

  4. Click the Create button in the bottom-left of the pane.
  5. Verify the bucket was created by finding it in the list.
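
The console steps above are all you need for the lab. For reference, the same bucket could also be created programmatically; a minimal boto3 sketch, where the bucket name and region are placeholders:

    import boto3
    
    s3 = boto3.client('s3')
    
    # Bucket name is a placeholder - substitute your own first and last name
    s3.create_bucket(
        Bucket='aim368-samples-bucket-<FIRSTNAME>-<LASTNAME>',
        # LocationConstraint is required outside us-east-1; omit it there
        CreateBucketConfiguration={'LocationConstraint': 'us-west-2'}
    )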

Create a Kinesis Firehose to automatically batch the samples

We need to get our data from the FC to our new S3 bucket, but the FC is publishing the data on a sample-by-sample basis. It would not make sense to create a new S3 file (or update a file) for every single sample, so we need an aggregation step. To accomplish this, we can create a Kinesis Firehose that automatically aggregates our data into batches and only writes to S3 after the batch size is reached or after a set amount of time.

Instructions
  1. Navigate to Kinesis from the AWS home page.
  2. Select the Get started button in the top-middle area.
  3. In the Kinesis Firehose box, click Create delivery stream.
  4. Enter SampleCollectionFirehose for the delivery stream name.
  5. Click the Next button in the bottom-right.
  6. Click the Next button in the bottom-right again.
  7. Under S3 destination, select your bucket and click Next again.
  8. Under Permissions click Create new or choose.
  9. Select SampleCollectionFirehoseRole for the IAM Role.
  10. Select SampleCollectionFirehosePolicy for the Policy Name.
  11. Click the Allow button in the bottom-right.
  12. Click the Next button in the bottom-right again.
  13. Select Create delivery stream in the bottom-right.
  14. Verify the Kinesis Firehose has started creating.
  15. Verify the Kinesis Firehose was created.
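
For reference, the delivery stream the console wizard builds corresponds roughly to the boto3 call below. This is just a sketch; the account ID, role ARN, and buffering values are illustrative, and the console steps above already create the role and policy for you.

    import boto3
    
    firehose = boto3.client('firehose')
    
    firehose.create_delivery_stream(
        DeliveryStreamName='SampleCollectionFirehose',
        DeliveryStreamType='DirectPut',
        ExtendedS3DestinationConfiguration={
            # Placeholder ARNs - the lab's pre-created role and your bucket
            'RoleARN': 'arn:aws:iam::123456789012:role/SampleCollectionFirehoseRole',
            'BucketARN': 'arn:aws:s3:::aim368-samples-bucket-<FIRSTNAME>-<LASTNAME>',
            # Flush to S3 once 5 MB accumulate or 300 seconds pass, whichever
            # comes first - this is the aggregation step described above
            'BufferingHints': {'SizeInMBs': 5, 'IntervalInSeconds': 300}
        }
    )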

Create a Lambda function to connect the SNS topic to the Firehose

Finally, we need to add a serverless compute step to read the data notifications from the FC and write the data to the Kinesis Firehose. We will use a Lambda with an SNS trigger to accomplish this.

Instructions
  1. Navigate to Lambda from the AWS home page.
  2. Select the Create function button in the top-right.
  3. Enter SampleCollectionLambdaFunction for the function name.
  4. Select Python 3.8 in the dropdown for the runtime.
  5. Click Choose or create an execution role.
  6. Click the Use an existing role radio button.
  7. Select SampleCollectionLambdaRole in the dropdown.
  8. Click the Create function button in the bottom-right.
  9. Verify the function has been created.
  10. Copy-paste the following code into the Function code box, replacing the sample code:
    import boto3
    
    firehose = boto3.client('firehose')
    
    def lambda_handler(event, context):
        # Pull the raw sample out of the SNS notification that triggered this function
        sample = event['Records'][0]['Sns']['Message']
        # Forward it to the Firehose delivery stream, which batches samples into S3
        firehose.put_record(
            DeliveryStreamName = 'SampleCollectionFirehose',
            Record = {'Data': sample.encode('UTF-8')}
        )
  11. Click the Save button in the top-right corner.
  12. In the top-left, click the + Add trigger button.
  13. Select SNS in the trigger configuration dropdown.
  14. Click the Add button in the bottom-right.
  15. Verify the trigger has been added.
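
For context, the handler above digs the sample out of the standard SNS event envelope that Lambda receives. A simplified sketch of such an event is shown below; the topic name and sample payload are illustrative, not the lab's actual values.

    # Simplified shape of the event passed to lambda_handler (illustrative values)
    sample_event = {
        'Records': [{
            'Sns': {
                'TopicArn': 'arn:aws:sns:us-west-2:123456789012:SampleCollectionTopic',  # hypothetical topic
                'Message': '12.5,3,0.7,42'  # one FC sample, e.g. published as a CSV string
            }
        }]
    }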

Model Training (Lab 2)

In this module, we will start by loading our data into a SageMaker Notebook for analysis. Then we will train ML models, choose the best model, and finally deploy it, all using SageMaker managed hardware.

Create a SageMaker Notebook

SageMaker Notebooks provide familiar tooling via Jupyter notebooks running on cloud hardware. Let's first set up a notebook so we can import our data and train some ML models.

Instructions
  1. Navigate to Amazon SageMaker from the AWS home page.
  2. Select Notebook Instances in the left panel under Notebooks.
  3. Click the Create notebook instance button in the top-right.
  4. Enter ModelTrainingNotebook for the notebook instance name.
  5. Select ml.c5.2xlarge for the notebook instance type.
  6. Click the Create notebook instance button in the bottom-right.
  7. Verify the Amazon SageMaker notebook has started creating.
  8. Wait for the notebook to have the InService status. (3-5 minutes)

Train models in the SageMaker Notebook

Now that the SageMaker Notebook is set up, we can import our data and leverage built-in SageMaker training algorithms to generate ML models. No Python or data science experience is necessary; just follow the steps to import our notebook files. These files will walk you through some basic data analytics and help you train ML models with three different algorithms.

Instructions
  1. Click Open JupyterLab in the right-most column.
  2. Select the git clone logo in the top-right of the middle pane.
  3. Enter https://github.com/mike-calder/AIM368-Notebooks.git and hit CLONE.

⚠️ There will be no visual indication that the clone is progressing, but it should complete within a minute.

  4. Double-click the AIM368-Notebooks folder in the top-left.
  5. Double-click the Data-Analysis.ipynb file in the top-left.
  6. Walk through the notebook by pressing Shift+Enter on each individual cell.
  7. Repeat this process for each of the three model training notebooks:
    • K-Nearest-Neighbors.ipynb
    • Linear-Learner.ipynb
    • XGBoost.ipynb
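
For a sense of what those notebooks do under the hood, training a built-in SageMaker algorithm generally looks something like the sketch below. This is not the workshop code; the algorithm version, hyperparameters, and S3 paths are placeholders.

    import sagemaker
    from sagemaker import image_uris
    from sagemaker.estimator import Estimator
    from sagemaker.inputs import TrainingInput
    
    session = sagemaker.Session()
    role = sagemaker.get_execution_role()  # resolves the notebook's IAM role
    
    # Resolve the built-in XGBoost container image for the current region
    container = image_uris.retrieve('xgboost', session.boto_region_name, version='1.5-1')
    
    model = Estimator(image_uri=container,
                      role=role,
                      instance_count=1,
                      instance_type='ml.c5.2xlarge',
                      output_path='s3://aim368-samples-bucket-<FIRSTNAME>-<LASTNAME>/models',  # placeholder
                      sagemaker_session=session)
    model.set_hyperparameters(objective='reg:squarederror', num_round=100)  # placeholder values
    
    # Train on the CSV samples the Firehose wrote to S3 in Lab 1 (placeholder prefix)
    model.fit({'train': TrainingInput('s3://aim368-samples-bucket-<FIRSTNAME>-<LASTNAME>/train',
                                      content_type='text/csv')})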

Deploy a model from the SageMaker Notebook

Now that we have trained some models, how do we use them to predict the duration of future human-robot interactions? For that, we need to deploy a model. Luckily, all it takes is one line of code to deploy our model to a SageMaker managed cloud instance that's ready to receive inference requests.

Instructions
  1. Of the three models you trained, choose one to deploy for the accuracy competition.

  2. Below that model's training output, create a new notebook cell with the following code:

    model.deploy(initial_instance_count = 1,
                 instance_type = 'ml.c5.2xlarge',
                 endpoint_name = 'LiveInferenceEndpoint')
  3. Deploy the model by pressing Shift+Enter in the new cell.
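
If you'd like to sanity-check the endpoint from the notebook before wiring it up in Lab 3, a call like the following should return an estimate (the feature values here are placeholders; use a row that matches your training data):

    import boto3
    
    runtime = boto3.client('sagemaker-runtime')
    
    # Placeholder CSV row - substitute feature values matching your dataset
    response = runtime.invoke_endpoint(EndpointName='LiveInferenceEndpoint',
                                       ContentType='text/csv',
                                       Body='1.0,2.0,3.0')
    print(response['Body'].read().decode())  # estimated interaction duration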

Live Inference (Lab 3)

In this module we will create a live inference API that routes requests to our SageMaker endpoint. This will allow our FC edge compute resources to call the ML model and receive duration estimates for upcoming human-robot interactions.

Create a Lambda function to handle API Gateway requests

Once again let's work backwards by first creating a Lambda to call our SageMaker endpoint. In a production situation, this would allow us to test our model integration before exposing it to traffic from the FC.

Instructions
  1. Navigate to Lambda from the AWS home page.
  2. Select the Create function button in the top-right.
  3. Enter LiveInferenceLambdaFunction for the function name.
  4. Select Python 3.8 in the dropdown for the runtime.
  5. Click Choose or create an execution role.
  6. Click the Use an existing role radio button.
  7. Select LiveInferenceLambdaRole in the dropdown.
  8. Click the Create function button in the bottom-right.
  9. Verify the function has been created.
  10. Copy-paste the following code into the Function code box, replacing the sample code:
    import boto3
    
    sagemaker = boto3.client('sagemaker-runtime')
    
    def lambda_handler(event, context):
        # Read the CSV feature string from the API Gateway query string parameter
        features = event['queryStringParameters']['features']
        # Forward the features to the deployed SageMaker endpoint for inference
        response = sagemaker.invoke_endpoint(EndpointName = 'LiveInferenceEndpoint',
                                             ContentType = 'text/csv',
                                             Body = features)
        
        inference = response['Body'].read().decode()
        
        # Return the model's estimate to the caller in Lambda proxy format
        return {
            'statusCode': 200,
            'body': inference
        }
  11. Click the Save button in the top-right corner.
  12. Verify that the function updated successfully.

Create an API Gateway endpoint that can receive inference requests

Now that we have a Lambda capable of calling our ML model, we need to provide a way for the FC to trigger the Lambda with new data. API Gateway will allow us to set up an external API that receives data and routes it to the Lambda we created. Then the Lambda can call the ML model and return the estimated duration back to the caller. A sketch of a test request appears after the steps below.

Instructions
  1. Navigate to API Gateway from the AWS home page.
  2. Click Get Started in the top-middle, then Ok to clear the popup.
  3. Choose REST for the protocol type and select the New API radio button.
  4. Type LiveInferenceAPI for the name, then click Create API in the bottom-right.
  5. At the top of the page, click the Actions drop down and select Create Resource.
  6. Type liveinference as the resource name, then click Create Resource.
  7. At the top of the page, click the Actions drop down and select Create Method.
  8. Click the drop down that appears, select GET, then click the check mark to confirm.
  9. On the page that appears, check the checkbox to enable Use Lambda Proxy integration.
  10. In the Lambda Function field, type LiveInferenceLambdaFunction and click Save.
  11. Click Ok to confirm the new permissions for your Lambda function.
  12. After a few moments you should see a diagram of your GET - Method Execution; click Method Request.
  13. Click on URL Query String Parameters to expand the section and then + Add query string.
  14. Type features in the box that says myQueryString, then click the check mark.
  15. At the top of the page, click the Actions drop down and select Deploy API.
  16. For Deployment Stage select [New Stage] to expand the section.
  17. Type in <FIRSTNAME>-<LASTNAME> (e.g. mike-calder) for the stage name and click Deploy.
  18. In the middle pane, click the arrow to expand your stage view.
  19. Under your liveinference resource, click GET to open the method page.
  20. In the right pane, copy the Invoke URL at the top. You will need this for the next section.

⚠️ Make sure this URL ends in liveinference; if it does not, please ask for help!
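
Before handing the URL over to the FC, you can test the API yourself. A minimal sketch in Python; the invoke URL and feature values below are placeholders, so substitute the URL you just copied:

    import urllib.parse
    import urllib.request
    
    # Placeholder values - use your own Invoke URL and a realistic feature row
    invoke_url = 'https://abc123.execute-api.us-east-1.amazonaws.com/mike-calder/liveinference'
    features = '1.0,2.0,3.0'
    
    query = urllib.parse.urlencode({'features': features})
    with urllib.request.urlopen(invoke_url + '?' + query) as resp:
        print(resp.read().decode())  # estimated duration returned by the Lambda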

Update the FC edge compute Lambda to call your endpoint

Next, we need to provide your API Gateway endpoint to the FC edge compute so it can begin to send inference requests. In this lab, the FC edge compute is simulated by a Lambda function. Follow the steps below to point that Lambda function at your API Gateway. Once completed, the simulated FC will start requesting inferences from your deployed model about 5 times per second.

Instructions
  1. Navigate to Lambda from the AWS home page.
  2. Select EdgeComputeLambdaFunction.
  3. Scroll down to Environment Variables.
  4. For the LIVE_INFERENCE_API_GATEWAY_URL, paste your URL from the previous section.
  5. Click the Save button in the top-right.
  6. Verify the function updated successfully. You should appear on the leaderboard within a couple of minutes.

Automated Retraining

This section will be used as a starting point for an automated retraining discussion. There are many different strategies in this space, but almost all of them begin with some form of performance monitoring.

Monitor performance with CloudWatch dashboards

When facing real world problems, it's important that the resulting solutions include a feedback loop via metrics and alarms. This will allow you to monitor your solution's performance and react to any errors or regressions. In this lab, we have set up a very simple dashboard to illustrate how CloudWatch can help fill this role.

Instructions
  1. Navigate to CloudWatch from the AWS home page.
  2. Select Dashboards in the top of the left pane.
  3. Click on LiveInferenceDashboard in the list.
  4. Monitor the performance of your ML application.
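
The dashboard in this lab is pre-built, but as a starting point for your own feedback loop, custom metrics can be published with a call like the one below. The namespace, metric name, and value are illustrative, not part of the lab.

    import boto3
    
    cloudwatch = boto3.client('cloudwatch')
    
    # Illustrative custom metric, e.g. absolute error between predicted and actual duration
    cloudwatch.put_metric_data(
        Namespace='LiveInference',  # hypothetical namespace
        MetricData=[{
            'MetricName': 'PredictionAbsoluteError',
            'Value': 1.7,           # placeholder value
            'Unit': 'Seconds'
        }]
    )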
