This tutorial is available on the [Wallaroo Tutorials repository](https://github.com/WallarooLabs/Wallaroo_Tutorials/blob/2023.3.0-updates/pipeline-architecture/wallaroo-arm-cv-demonstration).

This tutorial demonstrates how to use the Wallaroo combined with ARM processors to perform inferences with pre-trained computer vision ML models.  This demonstration assumes that:

* A Wallaroo version 2023.3 or above instance is installed.
* A nodepools with ARM architecture virtual machines are part of the Kubernetes cluster.  For example, Azure supports Ampere® Altra® Arm-based processor included with the following virtual machines:
  * [Dpsv5 and Dpdsv5-series](https://learn.microsoft.com/en-us/azure/virtual-machines/dpsv5-dpdsv5-series)
  * [Epsv5 and Epdsv5-series](https://learn.microsoft.com/en-us/azure/virtual-machines/epsv5-epdsv5-series)
* The applications and modules specified in the notebook `arm-computer-vision-preparation.ipynb` are complete.

### Tutorial Goals

For our example, we will perform the following:

* Create a workspace for our work.
* Upload the the resnet computer vision model model.
* Create a pipeline using the default architecture that can ingest our submitted data, submit it to the model, and export the results while tracking how long the inference took.
* Redeploy the same pipeline on the ARM architecture, then perform the same inference on the same data and model and track how long the inference took.
* Compare the inference timing through the default architecture versus the ARM architecture.

## Steps

### Import Libraries

The first step will be to import our libraries.

In [16]:
import torch
import pickle
import wallaroo
from wallaroo.object import EntityNotFoundError
from wallaroo.framework import Framework

import numpy as np
import json
import requests
import time
import pandas as pd

# used to display dataframe information without truncating
from IPython.display import display
import pandas as pd
pd.set_option('display.max_colwidth', None)

# used for unique connection names

import string
import random
suffix= ''.join(random.choice(string.ascii_lowercase) for i in range(4))

### Connect to the Wallaroo Instance

The first step is to connect to Wallaroo through the Wallaroo client.  The Python library is included in the Wallaroo install and available through the Jupyter Hub interface provided with your Wallaroo environment.

This is accomplished using the `wallaroo.Client()` command, which provides a URL to grant the SDK permission to your specific Wallaroo environment.  When displayed, enter the URL into a browser and confirm permissions.  Store the connection into a variable that can be referenced later.

If logging into the Wallaroo instance through the internal JupyterHub service, use `wl = wallaroo.Client()`.  For more information on Wallaroo Client settings, see the [Client Connection guide](https://docs.wallaroo.ai/wallaroo-developer-guides/wallaroo-sdk-guides/wallaroo-sdk-essentials-guide/wallaroo-sdk-essentials-client/).

In [17]:
# Login through local service

wl = wallaroo.Client()

wl = wallaroo.Client()

wallarooPrefix = "product-uat-ee."
wallarooSuffix = "wallaroocommunity.ninja"

wl = wallaroo.Client(api_endpoint=f"https://{wallarooPrefix}api.{wallarooSuffix}", 
                    auth_endpoint=f"https://{wallarooPrefix}keycloak.{wallarooSuffix}", 
                    auth_type="sso")

### Set Variables

The following variables and methods are used later to create or connect to an existing workspace, pipeline, and model.

The `suffix` is used to ensure unique workspace names across the Wallaroo instance.  Set this to '' if not required.

In [18]:
suffix=''

workspace_name = f'cv-arm-example{suffix}'
pipeline_name = 'cv-sample'

resnet_model_name = 'resnet50'
resnet_model_file_name = 'models/resnet50_v1.onnx'

In [19]:
def get_workspace(name):
    workspace = None
    for ws in wl.list_workspaces():
        if ws.name() == name:
            workspace= ws
    if(workspace == None):
        workspace = wl.create_workspace(name)
    return workspace

def get_pipeline(name):
    try:
        pipeline = wl.pipelines_by_name(name)[0]
    except EntityNotFoundError:
        pipeline = wl.build_pipeline(name)
    return pipeline

### Create Workspace

The workspace will be created or connected to, and set as the default workspace for this session.  Once that is done, then all models and pipelines will be set in that workspace.

In [20]:
workspace = get_workspace(workspace_name)
wl.set_current_workspace(workspace)
wl.get_current_workspace()

{'name': 'cv-arm-example', 'id': 80, 'archived': False, 'created_by': '6d75da2e-3913-4acd-b1bb-06dd1eb3d0df', 'created_at': '2023-09-07T15:41:47.832442+00:00', 'models': [{'name': 'resnet50', 'versions': 1, 'owner_id': '""', 'last_update_time': datetime.datetime(2023, 9, 7, 15, 41, 59, 385816, tzinfo=tzutc()), 'created_at': datetime.datetime(2023, 9, 7, 15, 41, 59, 385816, tzinfo=tzutc())}], 'pipelines': [{'name': 'cv-sample', 'create_time': datetime.datetime(2023, 9, 7, 15, 41, 49, 429383, tzinfo=tzutc()), 'definition': '[]'}]}

### Create Pipeline and Upload Model

We will now create or connect to our pipeline and add our CV model as a pipeline step.

In [21]:
pipeline = get_pipeline(pipeline_name)

resnet_model = wl.upload_model(resnet_model_name, resnet_model_file_name, framework=Framework.ONNX)

### Deploy Pipeline

With the model uploaded, we can add it is as a step in the pipeline, then deploy it.

For this deployment we will be using the default deployment which uses the x86 architecture.


Once deployed, resources from the Wallaroo instance will be reserved and the pipeline will be ready to use the model to perform inference requests. 

In [29]:
x86_deployment_config = (wallaroo.deployment_config
                            .DeploymentConfigBuilder()
                            .cpus(2)
                            .memory('2Gi')
                            .build()
                        )
# clear previous steps
pipeline.clear()
pipeline.add_model_step(resnet_model)

pipeline.deploy(deployment_config = x86_deployment_config)

0,1
name,cv-sample
created,2023-09-07 15:41:49.429383+00:00
last_updated,2023-09-07 15:48:05.234992+00:00
deployed,True
tags,
versions,"9c4adfd4-ac05-4e4c-990c-9187a441aaed, 0d0d2880-5b62-4923-ba63-ed9f04244741, c7129597-3add-40c3-98e9-337ab09d32f5, 41d08c07-923c-4a68-983c-5053611708e5, 246a23c6-c667-49b3-9726-684b46cc8114, 496b517f-d1c9-4a26-8a76-8315621d5b1f, 84512dae-ed10-4ace-b0a6-a024a478b9d8, 916718f4-e367-4fb6-a9d5-ada948c0d0a7"
steps,resnet50
published,False


In [30]:
pipeline.status()

{'status': 'Running',
 'details': [],
 'engines': [{'ip': '10.244.8.5',
   'name': 'engine-766b9f76f-9jbjm',
   'status': 'Running',
   'reason': None,
   'details': [],
   'pipeline_statuses': {'pipelines': [{'id': 'cv-sample',
      'status': 'Running'}]},
   'model_statuses': {'models': [{'name': 'resnet50',
      'version': '3dfb2793-fe98-4ea8-9366-5cbab60bb1f7',
      'sha': 'c6c8869645962e7711132a7e17aced2ac0f60dcdc2c7faa79b2de73847a87984',
      'status': 'Running'}]}}],
 'engine_lbs': [{'ip': '10.244.8.4',
   'name': 'engine-lb-584f54c899-9k748',
   'status': 'Running',
   'reason': None,
   'details': []}],
 'sidekicks': []}

### Run Inference

With that done, we can have the model detect the objects on the image by running an inference through the pipeline.  For this example, we will use a prepared Apache Arrow table `./data/image_224x224.arrow`

In [31]:
startTime = time.time()
# pass the table in 
results = pipeline.infer_from_file('./data/image_224x224.arrow')
endTime = time.time()
x86_time = endTime-startTime

### Deploy with ARM

We have demonstrated performing our sample inference using a standard pipeline deployment.  Now we will redeploy the same pipeline with the ARM architecture with the Wallaroo deployment setting `wallaroo.engine_config.Architecture.ARM` setting and applying it to the deployment configurations `arch` parameter.

In [32]:
from wallaroo.engine_config import Architecture
arm_deployment_config = (wallaroo.deployment_config
                            .DeploymentConfigBuilder()
                            .cpus(2)
                            .memory('2Gi')
                            .arch(Architecture.ARM)
                            .build()
                        )

pipeline.undeploy()
pipeline.deploy(deployment_config=arm_deployment_config)
pipeline.deploy()

0,1
name,cv-sample
created,2023-09-07 15:41:49.429383+00:00
last_updated,2023-09-07 15:49:23.096003+00:00
deployed,True
tags,
versions,"47bfbb13-41c2-4ad5-90aa-db2ab9a39fa4, cb17541b-8d71-4579-be07-f9e5cf2f15c7, 9c4adfd4-ac05-4e4c-990c-9187a441aaed, 0d0d2880-5b62-4923-ba63-ed9f04244741, c7129597-3add-40c3-98e9-337ab09d32f5, 41d08c07-923c-4a68-983c-5053611708e5, 246a23c6-c667-49b3-9726-684b46cc8114, 496b517f-d1c9-4a26-8a76-8315621d5b1f, 84512dae-ed10-4ace-b0a6-a024a478b9d8, 916718f4-e367-4fb6-a9d5-ada948c0d0a7"
steps,resnet50
published,False


In [36]:
pipeline.status()

{'status': 'Running',
 'details': [],
 'engines': [{'ip': '10.244.2.35',
   'name': 'engine-766dfc9974-pr4gk',
   'status': 'Running',
   'reason': None,
   'details': [],
   'pipeline_statuses': {'pipelines': [{'id': 'cv-sample',
      'status': 'Running'}]},
   'model_statuses': {'models': [{'name': 'resnet50',
      'version': '3dfb2793-fe98-4ea8-9366-5cbab60bb1f7',
      'sha': 'c6c8869645962e7711132a7e17aced2ac0f60dcdc2c7faa79b2de73847a87984',
      'status': 'Running'}]}},
  {'ip': '10.244.8.7',
   'name': 'engine-7cc8fbdcbb-jt2nq',
   'status': 'Running',
   'reason': None,
   'details': [],
   'pipeline_statuses': {'pipelines': [{'id': 'cv-sample',
      'status': 'Running'}]},
   'model_statuses': {'models': [{'name': 'resnet50',
      'version': '3dfb2793-fe98-4ea8-9366-5cbab60bb1f7',
      'sha': 'c6c8869645962e7711132a7e17aced2ac0f60dcdc2c7faa79b2de73847a87984',
      'status': 'Running'}]}}],
 'engine_lbs': [{'ip': '10.244.8.6',
   'name': 'engine-lb-584f54c899-jtw9h',
   

### ARM Inference

We will now perform the same inference we did with the standard deployment architecture, only this time through the ARM virtual machines.

In [37]:
startTime = time.time()
# pass the table in 
results = pipeline.infer_from_file('./data/image_224x224.arrow')
endTime = time.time()
arm_time = endTime-startTime

### Compare Standard against Arm

With the two inferences complete, we'll compare the standard deployment architecture against the ARM architecture.

In [38]:
display(f"Standard architecture: {x86_time}")
display(f"ARM architecture: {arm_time}")

'Standard architecture: 1.881904125213623'

'ARM architecture: 0.8074312210083008'

### Undeploy the Pipeline

With the inference complete, we can undeploy the pipeline and return the resources back to the Wallaroo instance.

In [39]:
pipeline.undeploy()

0,1
name,cv-sample
created,2023-09-07 15:41:49.429383+00:00
last_updated,2023-09-07 15:49:23.096003+00:00
deployed,False
tags,
versions,"47bfbb13-41c2-4ad5-90aa-db2ab9a39fa4, cb17541b-8d71-4579-be07-f9e5cf2f15c7, 9c4adfd4-ac05-4e4c-990c-9187a441aaed, 0d0d2880-5b62-4923-ba63-ed9f04244741, c7129597-3add-40c3-98e9-337ab09d32f5, 41d08c07-923c-4a68-983c-5053611708e5, 246a23c6-c667-49b3-9726-684b46cc8114, 496b517f-d1c9-4a26-8a76-8315621d5b1f, 84512dae-ed10-4ace-b0a6-a024a478b9d8, 916718f4-e367-4fb6-a9d5-ada948c0d0a7"
steps,resnet50
published,False
