## Edge Deployment to ARM Devices Demonstration

This tutorial demonstrates:

* Publishing a Wallaroo pipeline to an OCI (Open Container Initiative) registry with the target architecture set to ARM.
* Deploying the pipeline on an Edge device running on the ARM architecture.

The original sample ARM device for this demonstration was an M1 Mac using Docker to deploy the container registry.  Wallaroo pipelines can be deployed on both x86 and ARM edge devices that comply with OCI standards.  For more information, see [Wallaroo SDK Essentials Guide: Pipeline Edge Publication](https://docs.wallaroo.ai/20230300/wallaroo-developer-guides/wallaroo-sdk-guides/wallaroo-sdk-essentials-guide/wallaroo-sdk-essentials-pipelines/wallaroo-sdk-essentials-pipeline-publication/).

This demonstration uses a Resnet50 computer vision model for deployment.

### Prerequisites

If using this demonstration on your own Wallaroo instance, verify that the OCI Registry used to store published pipelines is enabled.  See [Edge Deployment Registry Guide](https://docs.wallaroo.ai/20230300/wallaroo-operations-guide/wallaroo-configuration/wallaroo-edge-deployment/) for full details.

To use the Wallaroo Server engine on ARM edge devices, contact your Wallaroo support representative for access to the ARM version of the Wallaroo engine.

## Import Packages

First we import the Python packages needed, primarily [`wallaroo`](https://pypi.org/project/wallaroo/) version 2023.3.0 and above.

In [1]:
import wallaroo
from wallaroo.object import EntityNotFoundError

import pyarrow as pa
import pandas as pd


# used to display dataframe information without truncating
from IPython.display import display
pd.set_option('display.max_colwidth', None)


### Connect to the Wallaroo Instance through the User Interface

The next step is to connect to Wallaroo through the Wallaroo client.  The Python library is included in the Wallaroo install and available through the JupyterHub interface provided with your Wallaroo environment.

This is accomplished using the `wallaroo.Client()` command, which provides a URL to grant the SDK permission to your specific Wallaroo environment.  When displayed, enter the URL into a browser and confirm permissions.  Store the connection into a variable that can be referenced later.

If logging into the Wallaroo instance through the internal JupyterHub service, use `wl = wallaroo.Client()`.  For more information on Wallaroo Client settings, see the [Client Connection guide](https://docs.wallaroo.ai/wallaroo-developer-guides/wallaroo-sdk-guides/wallaroo-sdk-essentials-guide/wallaroo-sdk-essentials-client/).

In [4]:
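# Use only one of the two connection options below.

# Option 1: connecting from the internal JupyterHub service needs no arguments.
wl = wallaroo.Client()

# Option 2: connecting from outside the Wallaroo instance.
# Replace the prefix and suffix with the values for your own instance.
wallarooPrefix = "doc-test."
wallarooSuffix = "wallaroocommunity.ninja"

wl = wallaroo.Client(api_endpoint=f"https://{wallarooPrefix}api.{wallarooSuffix}", 
                    auth_endpoint=f"https://{wallarooPrefix}keycloak.{wallarooSuffix}", 
                    auth_type="sso")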

## Sample Workspace and Pipeline

The following variables will set the workspace and pipeline for this demonstration.  The use of the `suffix` variable is to ensure a unique workspace name in the Wallaroo environment - change this variable as needed.
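
One way to keep the workspace name unique is to generate the suffix instead of typing it by hand. The following is a minimal sketch using only the Python standard library; the helper name `make_suffix` and the four-letter length are assumptions made for illustration and are not part of the Wallaroo SDK.

```python
import random
import string


def make_suffix(length=4, rng=None):
    """Return a hyphen-prefixed suffix of random lowercase letters."""
    # make_suffix is a hypothetical helper, not a Wallaroo SDK function.
    rng = rng or random.Random()
    letters = "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
    return f"-{letters}"


suffix = make_suffix()
workspace_name = f"cv-arm-edge-demo{suffix}"
print(workspace_name)
```

A random suffix creates a new workspace on every run, so use a fixed value such as the one in this tutorial when you want to return to the same workspace later.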

In [5]:
def get_workspace(name, client):
    # Return the workspace with this name, creating it if it does not exist.
    workspace = None
    for ws in client.list_workspaces():
        if ws.name() == name:
            workspace = ws
    if workspace is None:
        workspace = client.create_workspace(name)
    return workspace

def get_pipeline(name, client):
    # Return the first pipeline found with this name, or build a new one.
    try:
        pipeline = client.pipelines_by_name(name)[0]
    except EntityNotFoundError:
        pipeline = client.build_pipeline(name)
    return pipeline

In [6]:
suffix='-jch'
workspace_name = f"cv-arm-edge-demo{suffix}"
pipeline_name = 'cv-arm-edge'

workspace = get_workspace(workspace_name, wl)
pipeline = get_pipeline(pipeline_name, wl)

display(workspace)
display(pipeline)

wl.set_current_workspace(workspace)

{'name': 'cv-arm-edge-demo-jch', 'id': 5, 'archived': False, 'created_by': 'b030ff9c-41eb-49b4-afdf-2ccbecb6be5d', 'created_at': '2023-09-27T15:17:44.195906+00:00', 'models': [], 'pipelines': []}

| Field | Value |
|---|---|
| name | cv-arm-edge |
| created | 2023-09-27 15:17:45.593032+00:00 |
| last_updated | 2023-09-27 15:17:45.593032+00:00 |
| deployed | (none) |
| tags | |
| versions | 97a92779-0a5d-4c2b-bcf1-7afd60ac83d5 |
| steps | |
| published | False |


{'name': 'cv-arm-edge-demo-jch', 'id': 5, 'archived': False, 'created_by': 'b030ff9c-41eb-49b4-afdf-2ccbecb6be5d', 'created_at': '2023-09-27T15:17:44.195906+00:00', 'models': [], 'pipelines': []}

## Upload and Package the Model

When a model is uploaded to a Wallaroo cluster, it is optimized and packaged to make it ready to run as part of a pipeline.

We will upload the model and set its target architecture to ARM.  For native Wallaroo runtimes, the model can be deployed to any architecture, while containerized models must be run on the same architecture as the target architecture.  For more information, see [Wallaroo SDK Essentials Guide: Model Uploads and Registrations](https://docs.wallaroo.ai/20230300/wallaroo-developer-guides/wallaroo-sdk-guides/wallaroo-sdk-essentials-guide/wallaroo-sdk-model-uploads/).

In [7]:
from wallaroo.engine_config import Architecture

model = wl.upload_model('resnet-50', 
                        "./models/resnet50_v1.onnx", 
                        framework = wallaroo.framework.Framework.ONNX,
                        arch = Architecture.ARM)
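
Because the target architecture has to match the device that eventually runs the pipeline, it can help to confirm what the edge device reports before deploying. The following is a minimal sketch using Python's standard `platform` module; the helper name `host_architecture` and the sets of machine strings are assumptions for illustration and cover only the common 64-bit values.

```python
import platform

# Common values returned by platform.machine() on 64-bit hosts.
# These sets are illustrative and may not cover every device.
ARM_MACHINES = {"arm64", "aarch64"}
X86_MACHINES = {"x86_64", "amd64"}


def host_architecture(machine=None):
    """Classify a machine string as 'arm', 'x86', or 'unknown'."""
    # host_architecture is a hypothetical helper, not a Wallaroo SDK function.
    machine = (machine or platform.machine()).lower()
    if machine in ARM_MACHINES:
        return "arm"
    if machine in X86_MACHINES:
        return "x86"
    return "unknown"


print(host_architecture())
```

Run this on the edge device itself, not in the Wallaroo notebook environment, since the two may use different architectures.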


### Reserve resources needed for this pipeline 

Before deploying an inference engine, we need to tell Wallaroo what resources it will need.
To do this, we use the Wallaroo `DeploymentConfigBuilder()` and set the options listed below to define the properties of our inference engine.

We are testing this deployment for an edge scenario, so the resource specifications are kept small: the minimum needed to meet the expected load on the planned hardware.

- cpus - 4 => allow the engine to use 4 CPU cores when running the neural net
- memory - 0.5Gi => each inference engine will have 512 MiB of memory, which is plenty for processing a single image at a time.
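
The memory setting uses a binary unit suffix, so `0.5Gi` and `512Mi` describe the same amount. The sketch below works through that arithmetic; it assumes the value follows the Kubernetes-style quantity convention (`Ki`, `Mi`, `Gi` as powers of two), and `quantity_to_bytes` is a hypothetical helper written for this example, not a Wallaroo function.

```python
# Binary (power-of-two) unit suffixes, assumed to follow the
# Kubernetes-style quantity convention.
BINARY_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def quantity_to_bytes(quantity):
    """Convert a string such as '0.5Gi' to a whole number of bytes."""
    for unit, factor in BINARY_UNITS.items():
        if quantity.endswith(unit):
            return int(float(quantity[: -len(unit)]) * factor)
    raise ValueError(f"unsupported quantity: {quantity!r}")


half_gib = quantity_to_bytes("0.5Gi")
print(half_gib)                               # 536870912
print(half_gib == quantity_to_bytes("512Mi"))  # True
```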



In [8]:
from wallaroo.engine_config import Architecture

deployment_config = wallaroo.DeploymentConfigBuilder() \
    .cpus(4)\
    .memory("0.5Gi")\
    .arch(wallaroo.engine_config.Architecture.ARM)\
    .build()

### Set up the Pipeline Steps

Here we set up the pipeline steps with our sample model.  We will save the pipeline with its current steps in our SDK session into the Wallaroo database.  This allows it to be deployed or published.

In [9]:
pipeline_cv_version = pipeline.add_model_step(model).create_version()
pipeline_cv_version

| Field | Value |
|---|---|
| name | cv-arm-edge |
| version | 86dd133a-c12f-478b-af9a-30a7e4850fc4 |
| creation_time | 2023-27-Sep 15:20:15 |
| last_updated_time | 2023-27-Sep 15:20:15 |
| deployed | False |
| tags | |
| steps | resnet-50 |


## Publish the pipeline for edge deployment

We will now publish the pipeline.  This wraps the Wallaroo pipeline into an OCI-compliant container and pushes it to the OCI registry along with the engine.  For this tutorial, the x86 engine is pushed to the Wallaroo instance.

In [12]:
pub = pipeline_cv_version.publish(deployment_config)
pub

Waiting for pipeline publish... It may take up to 600 sec.
Pipeline is Publishing...Published.


| Field | Value |
|---|---|
| ID | 3 |
| Pipeline Version | 86dd133a-c12f-478b-af9a-30a7e4850fc4 |
| Status | Published |
| Engine URL | ghcr.io/wallaroolabs/doc-samples/engines/proxy/wallaroo/ghcr.io/wallaroolabs/standalone-mini:v2023.3.0-3854 |
| Pipeline URL | ghcr.io/wallaroolabs/doc-samples/pipelines/cv-arm-edge:86dd133a-c12f-478b-af9a-30a7e4850fc4 |
| Helm Chart URL | ghcr.io/wallaroolabs/doc-samples/charts/cv-arm-edge |
| Helm Chart Reference | ghcr.io/wallaroolabs/doc-samples/charts@sha256:3f764b221289d015ed7f8347ca4d0877b132c81a056e63ed37938b57d70f523d |
| Helm Chart Version | 0.0.1-86dd133a-c12f-478b-af9a-30a7e4850fc4 |
| Engine Config | `{'engine': {'resources': {'limits': {'cpu': 1.0, 'memory': '512Mi'}, 'requests': {'cpu': 1.0, 'memory': '512Mi'}}}, 'engineAux': {'images': {}}, 'enginelb': {'resources': {'limits': {'cpu': 1.0, 'memory': '512Mi'}, 'requests': {'cpu': 1.0, 'memory': '512Mi'}}}}` |
| User Images | [] |
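
The Pipeline URL in the publish output is an image reference made of a registry host, a repository path, and a tag. The sketch below splits the reference shown above into those parts, which can help when working out the registry value for deployment. `split_image_reference` is a hypothetical helper written for this example; it handles only the `registry/path:tag` form and does not handle digest references such as `@sha256:...`.

```python
def split_image_reference(reference):
    """Split 'registry/path:tag' into (registry, repository, tag)."""
    # A tag separator is a ':' that appears after the last '/'.
    last_slash = reference.rfind("/")
    colon = reference.rfind(":")
    if colon > last_slash:
        name, tag = reference[:colon], reference[colon + 1:]
    else:
        name, tag = reference, None
    registry, _, repository = name.partition("/")
    return registry, repository, tag


pipeline_url = (
    "ghcr.io/wallaroolabs/doc-samples/pipelines/cv-arm-edge"
    ":86dd133a-c12f-478b-af9a-30a7e4850fc4"
)
registry, repository, tag = split_image_reference(pipeline_url)
print(registry)
print(repository)
print(tag)
```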


## Deploy on Edge Device

With our pipeline published, we will deploy it onto an Edge device via Docker.  For this sample demonstration we are using `ghcr.io` for our sample registry.    The following code will generate the docker deployment command for port 8080.  The following must be provided by the user:

* OCI_REGISTRY: The URL to the OCI registry.  This sample uses the command line variable `$REGISTRYURL`.
* OCI_USERNAME: The OCI username to authenticate to the OCI registry.  This sample uses the command line variable `$REGISTRYUSERNAME`.
* OCI_PASSWORD:  The password to the OCI user authenticating to the OCI registry.  This sample uses the command line variable `$REGISTRYPASSWORD`.
* PIPELINE_URL:  The following command uses the `pipeline_url` from the recently published pipeline.
* Engine URL:  The Wallaroo Server engine image and tag are specified with the environment variables `$ENGINEURL` and `$ENGINEURLTAG`.  The Wallaroo Server x86 engine is published by default and listed in the published pipeline.  For access to the ARM, x86+GPU, or ARM+GPU containers of the Wallaroo engine, contact your Wallaroo support representative.

Set these variables with the registry and login information for your environment.

In [13]:
docker_deploy = f'''
docker run -p 8080:8080 \\
    -e DEBUG=true  \\
    -e OCI_REGISTRY=$REGISTRYURL \\
    -e CONFIG_CPUS=4 \\
    -e OCI_USERNAME=$REGISTRYUSERNAME \\
    -e OCI_PASSWORD=$REGISTRYPASSWORD \\
    -e PIPELINE_URL={pub.pipeline_url} \\
    $ENGINEURL:$ENGINEURLTAG
'''

print(docker_deploy)


docker run -p 8080:8080 \
    -e DEBUG=true  \
    -e OCI_REGISTRY=$REGISTRYURL \
    -e CONFIG_CPUS=4 \
    -e OCI_USERNAME=$REGISTRYUSERNAME \
    -e OCI_PASSWORD=$REGISTRYPASSWORD \
    -e PIPELINE_URL=ghcr.io/wallaroolabs/doc-samples/pipelines/cv-arm-edge:86dd133a-c12f-478b-af9a-30a7e4850fc4 \
    $ENGINEURL:$ENGINEURLTAG
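
The generated command leaves the `$REGISTRY...` and `$ENGINE...` variables for the shell to expand, so an unset variable only shows up when Docker fails. The following is a minimal sketch of a pre-flight check using the standard library; `missing_variables` is a hypothetical helper, and the check is only meaningful when it runs in the same environment as the shell that will execute `docker run`.

```python
import os

# Shell variables referenced by the docker run command above.
REQUIRED_VARS = [
    "REGISTRYURL",
    "REGISTRYUSERNAME",
    "REGISTRYPASSWORD",
    "ENGINEURL",
    "ENGINEURLTAG",
]


def missing_variables(environ=None):
    """Return the required variables that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not environ.get(name)]


missing = missing_variables()
if missing:
    print("Set these variables before running docker:", ", ".join(missing))
else:
    print("All registry and engine variables are set.")
```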



### Edge Inference

Now we can perform an inference through the edge device.  This assumes the edge device is on the host `localhost`.  Modify as required.
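
The curl call used in this section can also be expressed with Python's standard library. The sketch below builds the same POST request and reads the `Int64` output from a response shaped like the one returned by this pipeline. The helper names `build_inference_request` and `int64_outputs` are assumptions for illustration, and treating the `Int64` value as the predicted class index is likewise an assumption about this model's output.

```python
import json
import urllib.request


def build_inference_request(arrow_bytes, host="localhost", port=8080,
                            pipeline="cv-arm-edge"):
    """Build a POST request that mirrors the curl call in this section."""
    url = f"http://{host}:{port}/pipelines/{pipeline}"
    return urllib.request.Request(
        url,
        data=arrow_bytes,
        headers={"Content-Type": "application/vnd.apache.arrow.file"},
        method="POST",
    )


def int64_outputs(response_body):
    """Collect the values of every Int64 output in the JSON response."""
    values = []
    for result in json.loads(response_body):
        for output in result["outputs"]:
            if "Int64" in output:
                values.extend(output["Int64"]["data"])
    return values


# Sending the request needs the edge container running and the image file:
# with open("image_224x224.arrow", "rb") as f:
#     request = build_inference_request(f.read())
# with urllib.request.urlopen(request) as response:
#     print(int64_outputs(response.read()))
```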

In [14]:
!curl -X POST localhost:8080/pipelines/cv-arm-edge \
    -H "Content-Type: application/vnd.apache.arrow.file" \
    --data-binary @image_224x224.arrow

[{"check_failures":[],"elapsed":[1917625,392070250],"model_name":"resnet-50","model_version":"e868bf7b-7ad8-4c21-8f76-5fbae1789d74","original_data":null,"outputs":[{"Int64":{"data":[535],"dim":[1],"v":1}},{"Float":{"data":[0.00009498585131950676,0.00009141524060396478,0.00046068374649621546,0.00007667177851544693,0.00008047104347497225,0.00006355856021400541,0.00017580816347617656,0.000014166347682476044,0.00004344095941632986,0.000042251358536304906,0.00025400498998351395,0.005299815908074379,0.0001666695170570165,0.00019031290139537305,0.0002084688749164343,0.00014618523709941655,0.00034408163628540933,0.0008281365735456347,0.00011978298425674438,0.0002062775456579402,0.00014886555436532944,0.00026070952299050987,0.0009008666384033859,0.001475491444580257,0.0008267512894235551,0.0003027648199349642,0.00019366369815543294,0.0005283929058350623,0.00014922766422387213,0.00024121809110511094,0.00041593576315790415,0.000036156357964500785,0.0002411209134152159,0.00016002357006072998,0.000