# Greengrass Machine Learning Inference (MLI) with pre-trained model uploaded to S3

This Jupyter notebook describes the steps for setting up Greengrass Machine Learning Inference using a pre-trained model uploaded to S3.

## Design Pattern

![Greengrass MLI pretrained model](https://github.com/awslabs/aws-iot-greengrass-accelerators/blob/master/accelerators/machine_learning_inference/assets/mli-s3_models.png?raw=true)

The common design pattern for using a pre-trained model in an S3 bucket is:

1. When the Greengrass configuration is deployed, the Greengrass Core downloads the model from the S3 bucket configured in the Machine Learning Resources to the local disk, and extracts the files from the compressed `.tar.gz` or `.zip` archive.
2. **Data acquisition** - This function periodically acquires raw data inputs from an image source. In this example, static images simulate the image source.
3. **Data preprocessor** - This function preprocesses each image by resizing it to the size of the images used to train the model.
4. **Estimator** - This function runs prediction on the input data using the model loaded in the MXNet runtime.
5. The MXNet runtime loads the model from the local path.
6. The process handles the prediction result, which contains the detected object and the confidence level.
7. The result can be used to trigger an action, or sent back to the cloud for further processing.
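
Step 6 can be sketched in plain Python, independent of MXNet. This is only an illustration, not the accelerator's code: `top_predictions` is a hypothetical helper, and it assumes the model output has already been flattened to one probability per class, with labels taken line by line from `synset.txt`. The labels `label_a` and `label_b` are made up for the example.

```python
from typing import Dict, List, Sequence


def top_predictions(probabilities: Sequence[float],
                    labels: Sequence[str],
                    k: int = 1) -> List[Dict[str, str]]:
    """Return the k most likely labels together with their confidence levels."""
    if len(probabilities) != len(labels):
        raise ValueError("probabilities and labels must have the same length")
    # Rank class indices from most to least probable.
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: probabilities[i],
                    reverse=True)
    return [
        {"confidence": str(probabilities[i]), "prediction": labels[i]}
        for i in ranked[:k]
    ]


# Toy input: three classes with made-up probabilities.
sample_labels = ["label_a", "n03983396 pop bottle, soda bottle", "label_b"]
sample_probabilities = [0.1, 0.7, 0.2]
print(top_predictions(sample_probabilities, sample_labels, k=2))
```

The returned list of dictionaries can be serialized with `json.dumps` before it is used to trigger an action or sent to the cloud (step 7).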

## Installing dependencies

In [None]:
!python3.7 -m pip install boto3 awscli;

## Parameters configuration

In [None]:
import boto3
import os

AWS_ACCOUNT = boto3.client('sts').get_caller_identity().get('Account')

AWS_REGION = "us-east-1"

# Configuration for the IoT Thing: Certificate file name, and AWS IoT Greengrass core name
CERT_PEM_OUTFILE="mli.cert.pem"
PUBLIC_KEY_OUTFILE="mli.public.key" 
PRIVATE_KEY_OUTFILE="mli.private.key" 
CORENAME="greengrass_ml_{}".format(AWS_REGION)

# Temporary workspace
WORKSPACE_FOLDER="./work"
GREENGRASS_WORK_FOLDER=os.path.join(WORKSPACE_FOLDER,"greengrass")

# S3 Bucket for the ML model
ML_S3_BUCKET="{}-greengrass-{}".format(AWS_ACCOUNT,AWS_REGION)

# S3 Bucket for the Cloudformation functions
CFN_S3_BUCKET="{}-cloudformation-{}".format(AWS_ACCOUNT,AWS_REGION)

# CloudFormation stack name for the AWS IoT Greengrass group
CFN_STACK_NAME="greengrass-mli-accelerator"

In [None]:
# Setting up the temporary work folder

![ -d {WORKSPACE_FOLDER} ] && rm -r {WORKSPACE_FOLDER} && echo "old {WORKSPACE_FOLDER} removed"
!mkdir -p {WORKSPACE_FOLDER} && echo "{WORKSPACE_FOLDER} created"
!mkdir -p {GREENGRASS_WORK_FOLDER} \
&& mkdir -p {GREENGRASS_WORK_FOLDER}/certs \
&& mkdir -p {GREENGRASS_WORK_FOLDER}/config \
&& echo "{GREENGRASS_WORK_FOLDER} created"
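
If the shell magics above are not available (for example, when running the steps as a plain Python script), the same folder setup could be done with the standard library. This is a minimal sketch; `reset_workspace` is a hypothetical helper name, and the subfolder layout is copied from the shell commands above.

```python
import os
import shutil


def reset_workspace(workspace: str,
                    subfolders=("greengrass/certs", "greengrass/config")) -> None:
    """Remove any previous workspace and recreate the folder tree."""
    # ignore_errors=True makes this a no-op when the folder does not exist yet.
    shutil.rmtree(workspace, ignore_errors=True)
    for subfolder in subfolders:
        # makedirs also creates the intermediate "greengrass" folder.
        os.makedirs(os.path.join(workspace, subfolder), exist_ok=True)
```

In the notebook this would be called as `reset_workspace(WORKSPACE_FOLDER)`.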

## Prepare and upload the pre-trained model to an S3 bucket

We will download a pre-trained model from the model zoo and upload it to an S3 bucket, from which AWS IoT Greengrass can download it to the Greengrass Core.

In [None]:
!wget -nv http://data.mxnet.io/models/imagenet/inception-bn/Inception-BN-symbol.json --directory-prefix {WORKSPACE_FOLDER}
!wget -nv http://data.mxnet.io/mxnet/models/imagenet/synset.txt --directory-prefix {WORKSPACE_FOLDER}
!wget -nv http://data.mxnet.io/models/imagenet/inception-bn/Inception-BN-0126.params -O {WORKSPACE_FOLDER}/Inception-BN-0000.params

In [None]:
from zipfile import ZipFile
import os

# Create the archive; the context manager closes it even if a write fails
with ZipFile(os.path.join(WORKSPACE_FOLDER, "inception-bn.zip"), 'w') as zipObj:
    # Add the model files to the zip, stored at the archive root
    for file in ["Inception-BN-symbol.json","synset.txt","Inception-BN-0000.params"]:
        zipObj.write(os.path.join(WORKSPACE_FOLDER,file), file)

In [None]:
# Upload to S3 bucket
import boto3
import os
s3 = boto3.resource('s3')
if not s3.Bucket(ML_S3_BUCKET) in s3.buckets.all():
    s3.create_bucket(Bucket=ML_S3_BUCKET)

s3_client = boto3.client('s3')
s3_client.upload_file( os.path.join(WORKSPACE_FOLDER, "inception-bn.zip"), ML_S3_BUCKET, "inception-bn.zip")

ML_S3_BUCKET_URI="s3://{}/{}".format(ML_S3_BUCKET, "inception-bn.zip")
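
The `create_bucket` call above works as written because `AWS_REGION` is `us-east-1`. For any other region, S3 expects the region to be passed as a `LocationConstraint`. The sketch below builds the keyword arguments for either case; `create_bucket_kwargs` is a hypothetical helper, and the account ID `123456789012` is a placeholder.

```python
from typing import Any, Dict


def create_bucket_kwargs(bucket: str, region: str) -> Dict[str, Any]:
    """Build keyword arguments for the S3 create_bucket call in a given region."""
    kwargs: Dict[str, Any] = {"Bucket": bucket}
    # us-east-1 is the default location and must not be passed as a constraint.
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


print(create_bucket_kwargs("123456789012-greengrass-us-east-1", "us-east-1"))
print(create_bucket_kwargs("123456789012-greengrass-eu-west-1", "eu-west-1"))
```

It could then be used as `s3.create_bucket(**create_bucket_kwargs(ML_S3_BUCKET, AWS_REGION))`, ideally with the S3 resource itself created for the same region.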

## Create the credentials for the AWS IoT Thing

We will create the device certificate and key pair for the AWS IoT Thing that will act as the Greengrass Core.

In [None]:
import boto3
from botocore.exceptions import ClientError
iotClient = boto3.client('iot', region_name=AWS_REGION)
try:
    response = iotClient.create_keys_and_certificate(
        setAsActive=True
    )
except ClientError as e:
    if(e.response["Error"]["Code"]=="AccessDeniedException"):
        print("Missing permission. Please add \niot:CreateKeysAndCertificate on resource: *\n to the instance IAM role\n")
    raise e

In [None]:
import os
CERTIFICATE_ID=response.get("certificateId")
CERT_FOLDER = os.path.join(GREENGRASS_WORK_FOLDER, "certs/")
if not os.path.exists(CERT_FOLDER):
    os.makedirs(CERT_FOLDER)                    

try:
    with open(os.path.join(CERT_FOLDER, CERT_PEM_OUTFILE), 'w') as the_file:
        the_file.write(response.get("certificatePem"))
    with open(os.path.join(CERT_FOLDER, PUBLIC_KEY_OUTFILE), 'w') as the_file:
        the_file.write(response.get("keyPair").get("PublicKey"))
    with open(os.path.join(CERT_FOLDER, PRIVATE_KEY_OUTFILE), 'w') as the_file:
        the_file.write(response.get("keyPair").get("PrivateKey"))
except IOError as e:
    print("Error creating certificate files")
    raise e
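
The cell above writes the private key with the default file permissions. On a POSIX system, one way to limit the key file to its owner is sketched below. `write_private_file` is a hypothetical helper and not part of the accelerator.

```python
import os


def write_private_file(path: str, content: str) -> None:
    """Write text to `path` so that only the owner can read or write it."""
    # The 0o600 mode applies when the file is created; O_TRUNC overwrites old content.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(content)
    # Tighten the mode as well in case the file already existed with wider permissions.
    os.chmod(path, 0o600)
```

In the notebook, this could replace the `open(...)` call used for `PRIVATE_KEY_OUTFILE`.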

## Create the Greengrass Group with Cloudformation template

We will use the [Cloudformation template](https://github.com/awslabs/aws-iot-greengrass-accelerators/blob/master/accelerators/machine_learning_inference/cfn/mli_accelerator_s3_models-INPUT.cfn.yaml) from the **AWS IoT Greengrass Accelerator** to create the following resources:
* The Lambda function which will be pinned and keep running in the Greengrass Core
* An alias which points to the latest version of the Lambda function
* A Greengrass Group, consists of the following:
   * The Lambda function
   * ML resource from the pretrained model uploaded in the S3 bucket
   * AWS IoT Thing as the Greengrass Core

In [None]:
import boto3
s3 = boto3.resource('s3')
if not s3.Bucket(CFN_S3_BUCKET) in s3.buckets.all():
    s3.create_bucket(Bucket=CFN_S3_BUCKET)

In [None]:
# Check IAM permission
import boto3
from botocore.exceptions import ClientError

try:
    cloudformation = boto3.resource('cloudformation', region_name=AWS_REGION)
    stack = cloudformation.Stack(CFN_STACK_NAME)
    stack.description
except ClientError as e:
    if(e.response["Error"]["Code"]=="ValidationError"):
        print("Stack name {} does not exist, continue".format(CFN_STACK_NAME))
    elif(e.response["Error"]["Code"]=="AccessDenied"):
        # Look up the calling IAM principal so the message names the role to update
        caller_arn = boto3.client('sts').get_caller_identity().get('Arn')
        print("Missing permission. Please add the following IAM Policy\n\n \
              resource: arn:aws:cloudformation:{}:{}:stack/{}/*\n\n \
              cloudformation:DescribeStacks \n \
              cloudformation:CreateChangeSet \n \
              to IAM role:\n{}\n".format(AWS_REGION,AWS_ACCOUNT,CFN_STACK_NAME,caller_arn))
        raise e
    else:
        raise e

In [None]:
def cloudformation_package_deploy():
    ![ -e {WORKSPACE_FOLDER}/*-OUTPUT.yaml ] && rm {WORKSPACE_FOLDER}/*-OUTPUT.yaml 
    !aws cloudformation package \
    --region {AWS_REGION} \
    --template-file aws-iot-greengrass-accelerators/accelerators/machine_learning_inference/cfn/mli_accelerator_s3_models-INPUT.cfn.yaml \
    --s3-bucket {CFN_S3_BUCKET} \
    --output-template-file {WORKSPACE_FOLDER}/mli_accelerator_s3_models-OUTPUT.yaml \
    && \
    aws cloudformation deploy \
      --region {AWS_REGION} \
      --stack-name {CFN_STACK_NAME} \
      --template-file {WORKSPACE_FOLDER}/mli_accelerator_s3_models-OUTPUT.yaml \
      --capabilities CAPABILITY_NAMED_IAM \
      --parameter-overrides \
        CoreName={CORENAME} \
        CertIdParam={CERTIFICATE_ID} \
        ModelS3Uri={ML_S3_BUCKET_URI} 

### Create/Update the Greengrass Cloud configuration with CloudFormation

In [None]:
cloudformation_package_deploy()

In [None]:
print("...waiting for stack {} to be ready...".format(CFN_STACK_NAME))
client = boto3.client('cloudformation', region_name=AWS_REGION)
waiter = client.get_waiter('stack_create_complete')
waiter.wait(StackName=CFN_STACK_NAME)

At this point, all resources have been created, and an initial Greengrass deployment has been prepared and is ready to be sent to the device.

## Package the Greengrass config and credential files

We will create a zip file `greengrass-setup.zip` that contains the following file and folder structure, which mirrors the layout of the `greengrass/` work folder:

* greengrass/
   * config/
      * config.json
   * certs/
      * AmazonRootCA1.pem
      * mli.cert.pem
      * mli.private.key
      * mli.public.key

The file `greengrass-setup.zip` will then be uploaded to the Greengrass Core and extracted into the folder where AWS IoT Greengrass is installed, typically `/greengrass`.

In [None]:
import boto3
import json
from zipfile import ZipFile
import os

def get_cloudformation_output(stack_name, output_keyname):
    """
    'stack_name' is the name of the Cloudformation stack to get the output value
    'output_keyname' is name of the key to retrieve the value
    """    
    cfn_client = boto3.client('cloudformation', region_name=AWS_REGION)
    response = cfn_client.describe_stacks(StackName=stack_name)
    stacks = response.get("Stacks")

    value={}
    for stack in stacks:
        if stack["Outputs"]:
            for output in stack["Outputs"]:
                if output["OutputKey"] == output_keyname:
                    value = output["OutputValue"]
    
    return value

def archive_all_subfolder(zipfile, folder):
    """
    'zipfile' is the file name for the archive with a full path
    'folder' is root of the folder which all files to be added to the zipfile
    """
    with ZipFile(zipfile, 'w') as zip:    
        for root, dirs, files in os.walk(folder):
            if(files):
                for file in files:
                    f = os.path.join(root,file)
                    print ("archiving file {} --> {}".format(f,os.path.relpath(f, folder)))
                    zip.write(f, os.path.relpath(f, folder))
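
To make the lookup in `get_cloudformation_output` easier to follow, here is a small offline sketch of the same search against a hand-written dictionary shaped like a `describe_stacks` response. `find_output` is a hypothetical variant that returns `None` when the key is absent, and the value `example-group-id` is invented for the example.

```python
from typing import Any, Dict, Optional


def find_output(describe_stacks_response: Dict[str, Any], key: str) -> Optional[str]:
    """Return the value of the stack output named `key`, or None if it is absent."""
    for stack in describe_stacks_response.get("Stacks", []):
        # Stacks without outputs may omit the "Outputs" key entirely.
        for output in stack.get("Outputs", []):
            if output.get("OutputKey") == key:
                return output.get("OutputValue")
    return None


sample_response = {
    "Stacks": [{
        "StackName": "greengrass-mli-accelerator",
        "Outputs": [
            {"OutputKey": "GreengrassGroupID", "OutputValue": "example-group-id"},
        ],
    }]
}
print(find_output(sample_response, "GreengrassGroupID"))
print(find_output(sample_response, "NoSuchKey"))
```

Returning `None` instead of an empty dictionary makes a missing output easier to detect before it is passed on, for example to `json.loads`.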

In [None]:
greengrass_config = get_cloudformation_output(CFN_STACK_NAME,"GreengrassConfig")

greengrass_config = json.loads(greengrass_config)
    
# Update the credential file name
greengrass_config["crypto"]["principals"]["IoTCertificate"]["privateKeyPath"] = "file:////greengrass/certs/{}".format(PRIVATE_KEY_OUTFILE)
greengrass_config["crypto"]["principals"]["IoTCertificate"]["certificatePath"] = "file:////greengrass/certs/{}".format(CERT_PEM_OUTFILE)

try:
    GREENGRASS_CONFIG_FOLDER = os.path.join(GREENGRASS_WORK_FOLDER, "config/")
    if not os.path.exists(GREENGRASS_CONFIG_FOLDER):
        os.makedirs(GREENGRASS_CONFIG_FOLDER)                    
    with open(os.path.join(GREENGRASS_CONFIG_FOLDER, "config.json"), 'w') as the_file:
        the_file.write(json.dumps(greengrass_config, indent=4))
except IOError as e:
    print("Error creating Greengrass config file")
    raise e

In [None]:
# Download the Amazon ROOT CA cert into the folder
!wget -O {GREENGRASS_WORK_FOLDER}/certs/AmazonRootCA1.pem https://www.amazontrust.com/repository/AmazonRootCA1.pem

In [None]:
archive_all_subfolder(os.path.join(WORKSPACE_FOLDER, "greengrass-setup.zip"), GREENGRASS_WORK_FOLDER)
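
Before copying the archive to the device, it can be worth confirming that it contains the expected entries. Because `archive_all_subfolder` stores paths relative to the `greengrass` work folder, the entries should start with `config/` and `certs/`. The following is a sketch under that assumption; `missing_entries` and `EXPECTED_ENTRIES` are hypothetical names, and the file names are taken from the parameters defined earlier in this notebook.

```python
import os
from typing import List, Sequence
from zipfile import ZipFile

# Entries expected in greengrass-setup.zip, relative to the archive root.
EXPECTED_ENTRIES = [
    "config/config.json",
    "certs/AmazonRootCA1.pem",
    "certs/mli.cert.pem",
    "certs/mli.private.key",
    "certs/mli.public.key",
]


def missing_entries(zip_path: str, expected: Sequence[str]) -> List[str]:
    """Return the expected archive entries that are not present in the zip file."""
    with ZipFile(zip_path) as archive:
        names = set(archive.namelist())
    return [name for name in expected if name not in names]
```

Calling `missing_entries(os.path.join(WORKSPACE_FOLDER, "greengrass-setup.zip"), EXPECTED_ENTRIES)` should return an empty list when the package is complete.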

## Setting up the AWS IoT Greengrass

To download `greengrass-setup.zip` to your local computer, right-click the file in the `File Browser` in the left sidebar and select `Download`.

Once `greengrass-setup.zip` is on your local computer, upload it to the Greengrass Core.

### Optional: Create a AWS IoT Greengrass on an AWS EC2

You can create an EC2 instance running AWS IoT Greengrass with the CloudFormation template found in the [GitHub location of the AWS IoT Greengrass Accelerator for MLI](https://github.com/awslabs/aws-iot-greengrass-accelerators/blob/master/accelerators/machine_learning_inference/S3_MODELS.md#aws-greengrass-core-on-aws-ec2).
You will need the following parameter to create the EC2 Greengrass instance from the template.

In [None]:
# Parameters you will need to create the AWS IoT Greengrass on an AWS EC2
print("CORENAME: {}".format(CORENAME))

Once the stack has been created, upload `greengrass-setup.zip` to the EC2 instance.

1. Download `greengrass-setup.zip` to your local computer.
2. `scp` `greengrass-setup.zip` to the Greengrass EC2 instance. The `scp` command can be found in the `SCPCommand` output of the stack.
3. Once `greengrass-setup.zip` has been copied to the EC2 instance, `ssh` to it. The `ssh` command can be found in the `ConnectCommand` output of the stack.
4. On the EC2 instance, extract `greengrass-setup.zip` into the `/greengrass` folder using the command:

```bash
sudo unzip -o greengrass-setup.zip -d /greengrass
```

5. Quickly check that all files are in place, for example by making sure that `AmazonRootCA1.pem` has the expected content:

```bash
$ sudo cat /greengrass/certs/AmazonRootCA1.pem
-----BEGIN CERTIFICATE-----
```

6. Start Greengrass with the command `sudo systemctl restart greengrass`

7. Ensure that Greengrass started successfully:

```bash
$ sudo systemctl status greengrass
greengrass.service - greengrass daemon
   Loaded: loaded (/etc/systemd/system/greengrass.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2019-11-11 05:05:55 UTC; 20min ago
```

In [None]:
from IPython.display import display, Markdown
display(Markdown('Note: you can generate an EC2 instance with Greengrass running, using the template in section **[Optional: Create a AWS IoT Greengrass on an AWS EC2](#Optional:-Create-a-AWS-IoT-Greengrass-on-an-AWS-EC2)**'))
key_pressed = input('Press ENTER to continue, once your AWS IoT Greengrass Core has been successfully started: ')

## Greengrass Deployment

With the Greengrass software on the Greengrass Core hardware connected to AWS IoT using the configuration file and certificates, we can proceed to deploy the cloud configuration to the AWS IoT Greengrass Core device.

The deployment requires the Greengrass group ID and the ID of the group version to deploy. We will deploy the latest group version.

In [None]:
import boto3

greengrass_client = boto3.client('greengrass', region_name = AWS_REGION)

group_id = get_cloudformation_output(CFN_STACK_NAME, "GreengrassGroupID")

versions_response = greengrass_client.list_group_versions(
    GroupId=group_id
)

versions = versions_response.get("Versions")
sorted_versions = sorted(versions, key=lambda version: version.get("CreationTimestamp"), reverse=True)
latest_version = sorted_versions[0]

deployment_response = greengrass_client.create_deployment(
    DeploymentType='NewDeployment',
    GroupId=group_id,
    GroupVersionId=latest_version.get("Version")
)
deployment_response
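
The version selection in the cell above can be tried offline with sample data. This sketch assumes that `CreationTimestamp` values are ISO 8601 strings in a uniform format, so that comparing them as strings matches chronological order. `latest_group_version` is a hypothetical helper, and the version IDs are invented.

```python
from typing import Dict, List


def latest_group_version(versions: List[Dict[str, str]]) -> Dict[str, str]:
    """Return the most recently created group version."""
    if not versions:
        raise ValueError("the group has no versions to deploy")
    # Uniform ISO 8601 timestamps sort chronologically when compared as strings.
    return max(versions, key=lambda version: version["CreationTimestamp"])


sample_versions = [
    {"Version": "version-a", "CreationTimestamp": "2019-11-10T08:00:00.000Z"},
    {"Version": "version-b", "CreationTimestamp": "2019-11-11T05:05:55.000Z"},
]
print(latest_group_version(sample_versions)["Version"])
```

Unlike indexing the sorted list with `[0]`, this variant reports an empty version list with an explicit error message.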

In [None]:
import time

# wait for 1 min before checking the deployment status
time.sleep(60)

deployment_status_response = greengrass_client.get_deployment_status(DeploymentId=deployment_response.get("DeploymentId"),GroupId=group_id)
status = deployment_status_response.get("DeploymentStatus")
print("Deployment status was: {}".format(status))
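
A fixed one-minute wait may be too short or longer than needed. An alternative is to poll until the deployment reaches a final state. The sketch below assumes the status strings `Success` and `Failure` mark the end of a deployment; `wait_for_deployment` is a hypothetical helper that takes any zero-argument callable returning the current status, so it can be tried without AWS access.

```python
import time
from typing import Callable


def wait_for_deployment(get_status: Callable[[], str],
                        timeout_seconds: float = 300.0,
                        poll_seconds: float = 5.0) -> str:
    """Poll `get_status` until it reports a final state or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        # Assumed final states; any other value is treated as still running.
        if status in ("Success", "Failure"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "deployment still '{}' after {} seconds".format(status, timeout_seconds))
        time.sleep(poll_seconds)


# Possible use in this notebook (requires the variables from the cells above):
# status = wait_for_deployment(
#     lambda: greengrass_client.get_deployment_status(
#         DeploymentId=deployment_response.get("DeploymentId"),
#         GroupId=group_id).get("DeploymentStatus"))
```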

## IoT Test Client

The estimator Lambda function loops through the local images and forwards each one to the MXNet predictor. The Lambda function then publishes the prediction results as MQTT messages on the topic `mli/predictions/{CORENAME}`, both locally and to the cloud.

We will use the AWS IoT Python SDK to subscribe to the topic.

Once the MQTT client is subscribed to the topic, you should see messages in the output, such as:

```
Received a new message: 
b'[{"confidence": "0.6718504", "prediction": "n03983396 pop bottle, soda bottle"}]'
from topic: 
mli/predictions/greengrass_ml_us-east-1
--------------
```
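
The payload shown above is a JSON array encoded as bytes, with the confidence stored as a string. A callback could decode it roughly as follows. `parse_predictions` is a hypothetical helper, and the sample payload is copied from the message above.

```python
import json
from typing import List, Tuple


def parse_predictions(payload: bytes) -> List[Tuple[str, float]]:
    """Decode an MQTT payload into (label, confidence) pairs."""
    predictions = json.loads(payload.decode("utf-8"))
    # The confidence arrives as a string, so convert it to a float for comparisons.
    return [(item["prediction"], float(item["confidence"])) for item in predictions]


sample_payload = b'[{"confidence": "0.6718504", "prediction": "n03983396 pop bottle, soda bottle"}]'
for label, confidence in parse_predictions(sample_payload):
    print("{:.1%}  {}".format(confidence, label))
```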

In [None]:
!python3.7 -m pip install AWSIoTPythonSDK

In [None]:
from AWSIoTPythonSDK.MQTTLib import AWSIoTMQTTClient
import boto3
import os  # needed for the certificate paths below

client = boto3.client('iot', region_name=AWS_REGION)
endpoint = client.describe_endpoint(
    endpointType='iot:Data-ATS'
)

# Custom MQTT message callback
def subscriptionCallback(client, userdata, message):
    print("Received a new message: ")
    print(message.payload)
    print("from topic: ")
    print(message.topic)
    print("--------------\n\n")
    
awsIoTMqttClient = AWSIoTMQTTClient("myClientID")
awsIoTMqttClient.configureEndpoint(endpoint.get('endpointAddress'), 8883)
awsIoTMqttClient.configureCredentials( 
    os.path.join(GREENGRASS_WORK_FOLDER, "certs", "AmazonRootCA1.pem"), 
    os.path.join(GREENGRASS_WORK_FOLDER, "certs", PRIVATE_KEY_OUTFILE),
    os.path.join(GREENGRASS_WORK_FOLDER, "certs", CERT_PEM_OUTFILE)
)
awsIoTMqttClient.connect()
awsIoTMqttClient.subscribe("mli/predictions/{}".format(CORENAME), 1, subscriptionCallback)
key_pressed = input('Press ENTER to stop subscription')
awsIoTMqttClient.disconnect()

## Optional: Update the Lambda function and redeploy

Update either the Lambda function or the ML model, and then:
* Step 1: [update the Cloudformation stack](#Create/Update-the-Greengrass-Cloud-configuration-with-CloudFormation)
* Step 2: [redeploy](#Greengrass-Deployment)

### Optional 1: Add image file to the MQTT prediction message

1. Open the Lambda file `greengrass_long_run.py` in `/aws-greengrass-mxnet-inception/cfn/aws-iot-greengrass-accelerators/accelerators/machine_learning_inference/cfn/lambda_functions/s3_models/greengrass_long_run.py`
2. 