This tutorial is available on the [Wallaroo Tutorials repository](https://github.com/WallarooLabs/Wallaroo_Tutorials/blob/2023.3.0-updates/pipeline-architecture/wallaroo-arm-classification-finserv).

## Arbitrary Python with Arm Architecture

This tutorial demonstrates how to use Wallaroo combined with ARM processors to perform inferences with an Arbitrary Python model that uses a pre-trained `VGG16` model on `cifar10` as a feature extractor.  The Python entry points used for Wallaroo deployment are also described.

This demonstration assumes that:

* A Wallaroo version 2023.3 or above instance is installed.
* A nodepool with ARM architecture virtual machines is part of the Kubernetes cluster.  For example, Azure supports the Ampere® Altra® Arm-based processor, which is included with the following virtual machines:
  * [Dpsv5 and Dpdsv5-series](https://learn.microsoft.com/en-us/azure/virtual-machines/dpsv5-dpdsv5-series)
  * [Epsv5 and Epdsv5-series](https://learn.microsoft.com/en-us/azure/virtual-machines/epsv5-epdsv5-series)

In this notebook we will use the example `vgg16-clustering` model with randomly generated 32x32 RGB images as sample data.

### Tutorial Goals

For our example, we will perform the following:

* Create a workspace for our work.
* Upload the Arbitrary Python model.
* Create a pipeline using the default architecture that can ingest our submitted data, submit it to the model, and export the results while tracking how long the inference took.
* Redeploy the same pipeline on the ARM architecture, then perform the same inference on the same data and model and track how long the inference took.
* Compare the inference timing through the default architecture versus the ARM architecture.

### Arbitrary Python Script Requirements

The entry point of the Arbitrary Python model is a Python script that **must** include the following.

* `class ImageClustering(Inference)`:  The default inference class.  This is used to perform the actual inferences.  Wallaroo uses the `_predict` method to receive the inference data and call the appropriate functions for the inference.
  * `def __init__`:  Used to initialize this class and load in any other classes or other required settings.
  * `def expected_model_types`: Used by Wallaroo to anticipate what model types are used by the script.
  * `def model(self, model)`: Defines the model used for the inference.  Accepts the model instance used in the inference.
    * `self._raise_error_if_model_is_wrong_type(model)`: Raises an error if the wrong model type is used.  This verifies that only the anticipated model type is used for the inference.
    * `self._model = model`: Sets the submitted model as the model for this class, provided `_raise_error_if_model_is_wrong_type` does not raise an error.
  * `def _predict(self, input_data: InferenceData)`:  This is the entry point for Wallaroo to perform the inference.  It receives the inference data, performs whatever steps the inference requires, and returns a dictionary of numpy arrays.
* `class ImageClusteringBuilder(InferenceBuilder)`: Loads the model and prepares it for inferencing.
  * `def inference(self) -> ImageClustering`: Sets the inference class being used for the inferences.
  * `def create(self, config: CustomInferenceConfig) -> ImageClustering`: Creates an inference subclass, assigning the model and any attributes required for it to function.

All other methods used for the functioning of these classes are optional, as long as they meet the requirements listed above.
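
The requirements above can be sketched as a skeleton script. This is a minimal illustration, not the tutorial's actual model script: the `Inference` and `InferenceBuilder` base classes here are hypothetical stand-ins for the ones Wallaroo supplies, `DummyClusterModel` is an invented placeholder for the real VGG16-based model, and using properties and a setter for `expected_model_types`, `model` and `inference` is an assumption of this sketch.

```python
# Hypothetical stand-ins for the base classes that Wallaroo supplies
# to Arbitrary Python scripts; they exist only to make the sketch runnable.
class Inference:
    def _raise_error_if_model_is_wrong_type(self, model):
        if not isinstance(model, tuple(self.expected_model_types)):
            raise TypeError(f"Unexpected model type: {type(model).__name__}")


class InferenceBuilder:
    pass


class DummyClusterModel:
    """Invented placeholder model: cluster 0 or 1 based on the mean value."""

    def predict(self, images):
        return [int(sum(img) / len(img) > 127) for img in images]


class ImageClustering(Inference):
    def __init__(self):
        self._model = None

    @property
    def expected_model_types(self):
        # Model types this inference class is prepared to work with.
        return {DummyClusterModel}

    @property
    def model(self):
        return self._model

    @model.setter
    def model(self, model):
        # Reject anything that is not one of the expected model types.
        self._raise_error_if_model_is_wrong_type(model)
        self._model = model

    def _predict(self, input_data):
        # The real entry point returns a dictionary of numpy arrays;
        # plain lists are used here to keep the sketch dependency-free.
        return {"predictions": self._model.predict(input_data["images"])}


class ImageClusteringBuilder(InferenceBuilder):
    @property
    def inference(self):
        return ImageClustering()

    def create(self, config=None):
        # Build the inference object and attach the model to it.
        inference = self.inference
        inference.model = DummyClusterModel()
        return inference


clustering = ImageClusteringBuilder().create()
print(clustering._predict({"images": [[0, 0, 0], [255, 255, 255]]}))
```

In the real script, `create` would load the model from the files packaged with the upload instead of constructing a placeholder.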

### Tutorial Prerequisites

* A Wallaroo version 2023.3 or above instance.

### References

* [Wallaroo SDK Essentials Guide: Model Uploads and Registrations: Arbitrary Python](https://docs.wallaroo.ai/wallaroo-developer-guides/wallaroo-sdk-guides/wallaroo-sdk-essentials-guide/wallaroo-sdk-model-uploads/wallaroo-sdk-model-arbitrary-python/)

In [15]:
import wallaroo
from wallaroo.object import EntityNotFoundError

import numpy as np
import pandas as pd
import pyarrow as pa

from wallaroo.framework import Framework

# used to display dataframe information without truncating
from IPython.display import display
pd.set_option('display.max_colwidth', None)
import time

In [2]:
wallaroo.__version__

'2023.3.0b0'

### Connect to the Wallaroo Instance through the User Interface

The next step is to connect to Wallaroo through the Wallaroo client.  The Python library is included in the Wallaroo install and available through the Jupyter Hub interface provided with your Wallaroo environment.

This is accomplished using the `wallaroo.Client()` command, which provides a URL to grant the SDK permission to your specific Wallaroo environment.  When displayed, enter the URL into a browser and confirm permissions.  Store the connection into a variable that can be referenced later.

If logging into the Wallaroo instance through the internal JupyterHub service, use `wl = wallaroo.Client()`.  For more information on Wallaroo Client settings, see the [Client Connection guide](https://docs.wallaroo.ai/wallaroo-developer-guides/wallaroo-sdk-guides/wallaroo-sdk-essentials-guide/wallaroo-sdk-essentials-client/).

In [3]:
# Login through local Wallaroo instance

wl = wallaroo.Client()

# Login from outside the Wallaroo instance:
# replace the prefix and suffix with the values for your environment

wallarooPrefix = "product-uat-ee."
wallarooSuffix = "wallaroocommunity.ninja"

wl = wallaroo.Client(api_endpoint=f"https://{wallarooPrefix}api.{wallarooSuffix}", 
                    auth_endpoint=f"https://{wallarooPrefix}keycloak.{wallarooSuffix}", 
                    auth_type="sso")

Please log into the following URL in a web browser:

	https://product-uat-ee.keycloak.wallaroocommunity.ninja/auth/realms/master/device?user_code=ZNFY-ZZLN

Login successful!


### Create Workspace

Now we'll use the SDK below to create our workspace, assign it as our **current workspace**, then display all of the workspaces we have at the moment.  We'll also set the names for our models and pipelines here, so there is one spot to change them to whatever fits your organization's standards best.

To allow this tutorial to be run multiple times or by multiple users in the same Wallaroo instance, a random 4 character suffix will be added to the workspace name.  Feel free to change this `suffix` variable to `''` if not required.

When we create our new workspace, we'll save it in the Python variable `workspace` so we can refer to it as needed.

For more information, see the [Wallaroo SDK Essentials Guide: Workspace Management](https://docs.wallaroo.ai/wallaroo-developer-guides/wallaroo-sdk-guides/wallaroo-sdk-essentials-guide/wallaroo-sdk-essentials-workspace/).

In [4]:
import string
import random

# make a random 4 character suffix
suffix = ''.join(random.choice(string.ascii_lowercase) for i in range(4))

# fixed suffix used for this run of the tutorial; it overrides the random one
suffix = 'john'

workspace_name = f'arm-arbitrary-python{suffix}'
pipeline_name = 'arm-arbitrary-python-example'
model_name = 'vgg16-clustering'
model_file_name = './models/model-auto-conversion-BYOP-vgg16-clustering.zip'

In [5]:
def get_workspace(name):
    # return the workspace with this name, creating it if it does not exist
    workspace = None
    for ws in wl.list_workspaces():
        if ws.name() == name:
            workspace = ws
    if workspace is None:
        workspace = wl.create_workspace(name)
    return workspace

def get_pipeline(name):
    try:
        pipeline = wl.pipelines_by_name(name)[0]
    except EntityNotFoundError:
        pipeline = wl.build_pipeline(name)
    return pipeline

In [6]:
workspace = get_workspace(workspace_name)

wl.set_current_workspace(workspace)

{'name': 'arm-arbitrary-pythonjohn', 'id': 50, 'archived': False, 'created_by': '6d75da2e-3913-4acd-b1bb-06dd1eb3d0df', 'created_at': '2023-08-29T16:03:29.808122+00:00', 'models': [], 'pipelines': []}

### Upload Arbitrary Python Model

Arbitrary Python models are uploaded to Wallaroo through the Wallaroo Client `upload_model` method.

#### Upload Arbitrary Python Model Parameters

The following parameters are required for Arbitrary Python models.  Note that while some fields are **optional** for the `upload_model` method, they are required for properly uploading an Arbitrary Python model to Wallaroo.

| Parameter | Type | Description |
|---|---|---|
|`name` | `string` (*Required*) | The name of the model.  Model names are unique per workspace.  Models that are uploaded with the same name are assigned as a new **version** of the model. |
|`path` | `string` (*Required*) | The path to the model file being uploaded. |
|`framework` |`string` (*Upload Method Optional, Arbitrary Python model Required*) | Set as `Framework.CUSTOM`. |
|`input_schema` | `pyarrow.lib.Schema` (*Upload Method Optional, Arbitrary Python model Required*) | The input schema in Apache Arrow schema format. |
|`output_schema` | `pyarrow.lib.Schema` (*Upload Method Optional, Arbitrary Python model Required*) | The output schema in Apache Arrow schema format. |
| `convert_wait` | `bool` (*Upload Method Optional, Arbitrary Python model Optional*) (*Default: True*) | <ul><li>**True**: Waits in the script for the model conversion to complete.</li><li>**False**:  Proceeds with the script without waiting for the model conversion process to complete.</li></ul> |

Once the upload process starts, the model is containerized by the Wallaroo instance.  This process may take up to 10 minutes.

#### Upload Arbitrary Python Model Return

The following is returned with a successful model upload and conversion.

| Field | Type | Description |
|---|---|---|
| `name` | string | The name of the model. |
| `version` | string | The model version as a unique UUID. |
| `file_name` | string | The file name of the model as stored in Wallaroo. |
| `image_path` | string | The image used to deploy the model in the Wallaroo engine. |
| `last_update_time` | DateTime | When the model was last updated. |

For our example, we'll start by setting the `input_schema` and `output_schema` that are expected by our `ImageClustering._predict()` method.

In [7]:
input_schema = pa.schema([
    pa.field('images', pa.list_(
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=32
        ),
        list_size=32
    )),
])

output_schema = pa.schema([
    pa.field('predictions', pa.int64()),
])
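
The fixed-size lists in `input_schema` describe a 32x32 image with 3 integer values per pixel. The following is a minimal sketch, using plain Python lists, of checking a record against that shape before submitting it. `has_shape` is a hypothetical helper written for this illustration and is not part of Wallaroo or pyarrow.

```python
def has_shape(value, shape):
    """Return True if `value` is a nested list matching `shape` at every level."""
    if not shape:
        # Leaf level: expect a plain integer (bool is excluded on purpose).
        return isinstance(value, int) and not isinstance(value, bool)
    if not isinstance(value, list) or len(value) != shape[0]:
        return False
    return all(has_shape(item, shape[1:]) for item in value)


# A blank 32x32 image with 3 channel values per pixel.
image = [[[0, 0, 0] for _ in range(32)] for _ in range(32)]

print(has_shape(image, (32, 32, 3)))       # matches the schema's list sizes
print(has_shape(image[:31], (32, 32, 3)))  # one row short
```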

### Upload Model

Now we'll upload our model.  The framework is `Framework.CUSTOM` for arbitrary Python models, and we'll specify the input and output schemas for the upload.

In [9]:
model = wl.upload_model(model_name, 
                        model_file_name, 
                        framework=Framework.CUSTOM, 
                        input_schema=input_schema, 
                        output_schema=output_schema, 
                        convert_wait=True)
model

Waiting for model loading - this will take up to 10.0min.
Model is pending loading to a container runtime...
Model is attempting loading to a container runtime.....................successful

Ready


| | |
|---|---|
| Name | vgg16-clustering |
| Version | 45e99bd0-565b-415e-b602-a541798e44d3 |
| File Name | model-auto-conversion-BYOP-vgg16-clustering.zip |
| SHA | 7bb3362b1768c92ea7e593451b2b8913d3b7616c19fd8d25b73fb6990f9283e0 |
| Status | ready |
| Image Path | proxy.replicated.com/proxy/wallaroo/ghcr.io/wallaroolabs/mlflow-deploy:v2023.3.0-main-3731 |
| Updated At | 2023-29-Aug 16:06:09 |


We can verify that our model was uploaded by listing the models uploaded to our Wallaroo instance with the `list_models()` command.  Note that since we uploaded this model before, we now have different versions of it we can use for our testing.

### Create a Pipeline

With our model uploaded, time to create our pipeline and deploy it.

* **NOTE**:  Pipeline names must be unique within the workspace.  If two pipelines are assigned the same name, the new pipeline is created as a new **version** of the pipeline.

For more information, see the [Wallaroo SDK Essentials Guide: Pipeline Management](https://docs.wallaroo.ai/wallaroo-developer-guides/wallaroo-sdk-guides/wallaroo-sdk-essentials-guide/wallaroo-sdk-essentials-pipelines/wallaroo-sdk-essentials-pipeline/)

In [10]:
pipeline = get_pipeline(pipeline_name)
pipeline.clear()

| | |
|---|---|
| name | arm-arbitrary-python-example |
| created | 2023-08-29 16:06:12.420072+00:00 |
| last_updated | 2023-08-29 16:06:12.420072+00:00 |
| deployed | (none) |
| tags | |
| versions | 024a04f4-fc51-4b52-9006-6bb8c372a996 |
| steps | |
| published | False |


Now our pipeline is set.  Let's add a single **step** to it - in this case, the `vgg16-clustering` model stored in `model` that we uploaded to our workspace.

In [11]:
pipeline.add_model_step(model)

| | |
|---|---|
| name | arm-arbitrary-python-example |
| created | 2023-08-29 16:06:12.420072+00:00 |
| last_updated | 2023-08-29 16:06:12.420072+00:00 |
| deployed | (none) |
| tags | |
| versions | 024a04f4-fc51-4b52-9006-6bb8c372a996 |
| steps | |
| published | False |


### Set Standard Pipeline Deployment Architecture

Now we can set our pipeline deployment architecture with the `arch(wallaroo.engine_config.Architecture)` parameter.  By default, the deployment configuration architecture is `wallaroo.engine_config.Architecture.X86`.

For this example, we will create a deployment configuration and leave the `arch` out so it will default to `X86`.

This deployment configuration will be applied to the pipeline deployment.

For more information, see the [Wallaroo SDK Essentials Guide: Pipeline Deployment Configuration](https://docs.wallaroo.ai/wallaroo-developer-guides/wallaroo-sdk-guides/wallaroo-sdk-essentials-guide/wallaroo-sdk-essentials-pipelines/wallaroo-sdk-essentials-pipeline-deployment-config/).

In [38]:

deployment_config = (wallaroo.deployment_config
                     .DeploymentConfigBuilder()
                     .cpus(0.25)
                     .memory('1Gi')
                     .sidekick_cpus(model, 1)
                     .sidekick_memory(model, '4Gi')
                     .build()
                     )
deployment_config

{'engine': {'cpu': 0.25,
  'resources': {'limits': {'cpu': 0.25, 'memory': '1Gi'},
   'requests': {'cpu': 0.25, 'memory': '1Gi'}}},
 'enginelb': {},
 'engineAux': {'images': {'vgg16-clustering-13': {'resources': {'limits': {'cpu': 1,
      'memory': '4Gi'},
     'requests': {'cpu': 1, 'memory': '4Gi'}}}}},
 'node_selector': {}}

In [41]:
pipeline.deploy(deployment_config=deployment_config)

| | |
|---|---|
| name | arm-arbitrary-python-example |
| created | 2023-08-29 16:06:12.420072+00:00 |
| last_updated | 2023-08-29 16:22:39.677269+00:00 |
| deployed | True |
| tags | |
| versions | c4fa348b-3be0-4c6e-b660-f261686d9bff, 4e3aa3b9-9a07-4290-8f27-24626b77eef3, 632b94ae-d35b-462e-8788-9aef683df034, 71769fd7-ff90-4193-ac95-5387cf16c003, 61aef8d3-9cb2-4519-88ea-fcfbdab051be, 5280795c-172b-4614-a166-5b8eac70888c, 024a04f4-fc51-4b52-9006-6bb8c372a996 |
| steps | vgg16-clustering |
| published | False |


### Inference on Standard Architecture

We will now perform an inference with a randomly generated dataframe, and store that for use later in our testing.

For more information, see the [Wallaroo SDK Essentials Guide: Inference Management](https://docs.wallaroo.ai/wallaroo-developer-guides/wallaroo-sdk-guides/wallaroo-sdk-essentials-guide/wallaroo-sdk-essentials-inferences/)

In [42]:
input_data = {
        "images": [np.random.randint(0, 256, (32, 32, 3), dtype=np.uint8)] * 2,
}
dataframe = pd.DataFrame(input_data)

In [43]:
start_time = time.time()
outputs = pipeline.infer(dataframe)
end_time = time.time()
x86_time = end_time - start_time

display(outputs)

| | time | in.images | out.predictions | check_failures |
|---|---|---|---|---|
| 0 | 2023-08-29 16:23:08.136 | [91, 219, 82, 126, 100, 125, 82, 116, 63, 108, 73, 180, 65, 36, 47, 157, 226, 41, 230, 159, 212, 9, 31, 183, 234, 227, 148, 86, 98, 187, 147, 21, 41, 171, 251, 86, 57, 223, 205, 96, 132, 84, 248, 41, 65, 36, 128, 60, 229, 218, 150, 96, 138, 145, 230, 83, 231, 240, 176, 131, 174, 251, 49, 100, 20, 112, 63, 161, 56, 145, 155, 201, 4, 182, 211, 151, 169, 62, 77, 127, 188, 232, 188, 115, 219, 241, 31, 191, 29, 58, 233, 28, 241, 161, 174, 134, 21, 3, 18, 6, ...] | 1 | 0 |
| 1 | 2023-08-29 16:23:08.136 | [91, 219, 82, 126, 100, 125, 82, 116, 63, 108, 73, 180, 65, 36, 47, 157, 226, 41, 230, 159, 212, 9, 31, 183, 234, 227, 148, 86, 98, 187, 147, 21, 41, 171, 251, 86, 57, 223, 205, 96, 132, 84, 248, 41, 65, 36, 128, 60, 229, 218, 150, 96, 138, 145, 230, 83, 231, 240, 176, 131, 174, 251, 49, 100, 20, 112, 63, 161, 56, 145, 155, 201, 4, 182, 211, 151, 169, 62, 77, 127, 188, 232, 188, 115, 219, 241, 31, 191, 29, 58, 233, 28, 241, 161, 174, 134, 21, 3, 18, 6, ...] | 1 | 0 |
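
The cell above times a single call with `time.time()`. A single measurement can be noisy, so one option is to repeat the call several times and look at the median. The following is a minimal sketch using only the standard library. `time_calls` is a hypothetical helper, and `fake_infer` is a stand-in for `pipeline.infer(dataframe)` so the example runs without a deployed pipeline.

```python
import statistics
import time


def time_calls(fn, repeats=5):
    """Call fn() `repeats` times and return the elapsed seconds of each call."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def fake_infer():
    # Hypothetical stand-in for `pipeline.infer(dataframe)`.
    time.sleep(0.01)


durations = time_calls(fake_infer, repeats=5)
print(f"median: {statistics.median(durations):.4f}s")
print(f"min:    {min(durations):.4f}s")
```

`time.perf_counter()` is used instead of `time.time()` because it is a monotonic clock intended for measuring short durations.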


### Set Arm Pipeline Deployment Architecture

Now we can set our pipeline deployment architecture to `wallaroo.engine_config.Architecture.ARM`, then redeploy our pipeline.  The deployment configuration below sets the architecture for the containerized model with `sidekick_arch(model, Architecture.ARM)`.  This will allocate ARM-based virtual machines with the same number of CPUs and amount of RAM as earlier; the only difference will be the underlying architecture.

In [34]:
from wallaroo.engine_config import Architecture
deployment_config_arm = (wallaroo.deployment_config
                         .DeploymentConfigBuilder()
                         .cpus(0.25)
                         .memory('1Gi')
                         .sidekick_cpus(model, 1)
                         .sidekick_memory(model, '4Gi')
                         .sidekick_arch(model, Architecture.ARM)
                         .build()
                         )
deployment_config_arm

{'engine': {'cpu': 0.25,
  'resources': {'limits': {'cpu': 0.25, 'memory': '1Gi'},
   'requests': {'cpu': 0.25, 'memory': '1Gi'}}},
 'enginelb': {},
 'engineAux': {'images': {'vgg16-clustering-13': {'resources': {'limits': {'cpu': 1,
      'memory': '4Gi'},
     'requests': {'cpu': 1, 'memory': '4Gi'}},
    'arch': 'arm'}}},
 'node_selector': {}}
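
To confirm that the two deployment configurations differ only in architecture, the dictionaries can be compared key by key. The following is a minimal sketch using abbreviated copies of the two outputs shown in this tutorial. `dict_diff` and the `MISSING` marker are hypothetical helpers written for this illustration, not part of the Wallaroo SDK.

```python
MISSING = "<missing>"


def dict_diff(left, right, path=""):
    """Return (path, left_value, right_value) for every differing entry."""
    diffs = []
    for key in sorted(set(left) | set(right)):
        here = f"{path}.{key}" if path else str(key)
        lval = left.get(key, MISSING)
        rval = right.get(key, MISSING)
        if isinstance(lval, dict) and isinstance(rval, dict):
            # Recurse into nested dictionaries.
            diffs.extend(dict_diff(lval, rval, here))
        elif lval != rval:
            diffs.append((here, lval, rval))
    return diffs


# Abbreviated copies of the 'engineAux' sections displayed in this tutorial.
sidekick = {"resources": {"limits": {"cpu": 1, "memory": "4Gi"},
                          "requests": {"cpu": 1, "memory": "4Gi"}}}
x86_config = {"engineAux": {"images": {"vgg16-clustering-13": sidekick}}}
arm_config = {"engineAux": {"images": {"vgg16-clustering-13": {**sidekick, "arch": "arm"}}}}

for path, before, after in dict_diff(x86_config, arm_config):
    print(f"{path}: {before} -> {after}")
```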

In [45]:
pipeline.undeploy()
pipeline.deploy(deployment_config=deployment_config_arm)

WaitForDeployError: Deployment failed. See status for details.
Status: {'status': 'Error', 'details': [], 'engines': [{'ip': '10.244.3.13', 'name': 'engine-79d4985cf5-kjdkv', 'status': 'Running', 'reason': None, 'details': [], 'pipeline_statuses': {'pipelines': [{'id': 'arm-arbitrary-python-example', 'status': 'Running'}]}, 'model_statuses': {'models': [{'name': 'vgg16-clustering', 'version': '45e99bd0-565b-415e-b602-a541798e44d3', 'sha': '7bb3362b1768c92ea7e593451b2b8913d3b7616c19fd8d25b73fb6990f9283e0', 'status': 'Running'}]}}], 'engine_lbs': [{'ip': '10.244.3.12', 'name': 'engine-lb-584f54c899-7gxlv', 'status': 'Running', 'reason': None, 'details': []}], 'sidekicks': [{'ip': '10.244.2.9', 'name': 'engine-sidekick-vgg16-clustering-13-74ccbbcc8c-gjmxl', 'status': 'Failed', 'reason': 'CrashLoopBackOff', 'details': ['containers with unready status: [engine-sidekick-vgg16-clustering-13]', 'containers with unready status: [engine-sidekick-vgg16-clustering-13]'], 'statuses': None}]}

In [47]:
pipeline.status()

{'status': 'Error',
 'details': [],
 'engines': [{'ip': '10.244.3.13',
   'name': 'engine-79d4985cf5-kjdkv',
   'status': 'Running',
   'reason': None,
   'details': [],
   'pipeline_statuses': {'pipelines': [{'id': 'arm-arbitrary-python-example',
      'status': 'Running'}]},
   'model_statuses': {'models': [{'name': 'vgg16-clustering',
      'version': '45e99bd0-565b-415e-b602-a541798e44d3',
      'sha': '7bb3362b1768c92ea7e593451b2b8913d3b7616c19fd8d25b73fb6990f9283e0',
      'status': 'Running'}]}}],
 'engine_lbs': [{'ip': '10.244.3.12',
   'name': 'engine-lb-584f54c899-7gxlv',
   'status': 'Running',
   'reason': None,
   'details': []}],
 'sidekicks': [{'ip': '10.244.2.9',
   'name': 'engine-sidekick-vgg16-clustering-13-74ccbbcc8c-gjmxl',
   'status': 'Failed',
   'reason': 'CrashLoopBackOff',
   'details': ['containers with unready status: [engine-sidekick-vgg16-clustering-13]',
    'containers with unready status: [engine-sidekick-vgg16-clustering-13]'],
   'statuses': None}]}
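
In the status above, the engine and the load balancer are `Running`, while the sidekick container is `Failed` with the reason `CrashLoopBackOff`. The following is a minimal sketch of pulling the failing components out of such a status dictionary. It assumes the dictionary has the shape shown in this output, and `failing_components` is a hypothetical helper, not part of the Wallaroo SDK.

```python
def failing_components(status):
    """List (group, name, status, reason) for components that are not Running."""
    failures = []
    for group in ("engines", "engine_lbs", "sidekicks"):
        for component in status.get(group, []):
            if component.get("status") != "Running":
                failures.append((group,
                                 component.get("name"),
                                 component.get("status"),
                                 component.get("reason")))
    return failures


# Abbreviated copy of the status displayed above.
status = {
    "status": "Error",
    "engines": [{"name": "engine-79d4985cf5-kjdkv",
                 "status": "Running", "reason": None}],
    "engine_lbs": [{"name": "engine-lb-584f54c899-7gxlv",
                    "status": "Running", "reason": None}],
    "sidekicks": [{"name": "engine-sidekick-vgg16-clustering-13-74ccbbcc8c-gjmxl",
                   "status": "Failed", "reason": "CrashLoopBackOff"}],
}

for group, name, state, reason in failing_components(status):
    print(f"{group}: {name} is {state} ({reason})")
```

With a live pipeline, the same helper could be applied to the dictionary returned by `pipeline.status()` before attempting an inference.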

### Inference with ARM

We'll perform the exact same inference with the exact same data and display the results, storing how long the inference takes on the ARM processors.

In [None]:
start_time = time.time()
outputs = pipeline.infer(dataframe)
end_time = time.time()
arm_time = end_time - start_time

display(outputs)

| | time | in.images | out.predictions | check_failures |
|---|---|---|---|---|
| 0 | 2023-08-29 16:09:55.944 | [117, 100, 246, 212, 24, 191, 40, 173, 123, 180, 4, 221, 68, 51, 28, 201, 90, 117, 244, 74, 163, 202, 47, 223, 38, 143, 149, 208, 43, 111, 251, 168, 67, 211, 187, 18, 118, 164, 86, 115, 115, 144, 37, 34, 108, 84, 252, 102, 126, 125, 70, 65, 120, 193, 196, 33, 11, 25, 30, 126, 48, 150, 196, 212, 197, 106, 29, 161, 104, 43, 150, 110, 18, 170, 54, 72, 17, 236, 155, 185, 212, 102, 240, 12, 146, 45, 129, 55, 139, 33, 67, 242, 236, 18, 180, 211, 245, 171, 36, 182, ...] | 1 | 0 |
| 1 | 2023-08-29 16:09:55.944 | [117, 100, 246, 212, 24, 191, 40, 173, 123, 180, 4, 221, 68, 51, 28, 201, 90, 117, 244, 74, 163, 202, 47, 223, 38, 143, 149, 208, 43, 111, 251, 168, 67, 211, 187, 18, 118, 164, 86, 115, 115, 144, 37, 34, 108, 84, 252, 102, 126, 125, 70, 65, 120, 193, 196, 33, 11, 25, 30, 126, 48, 150, 196, 212, 197, 106, 29, 161, 104, 43, 150, 110, 18, 170, 54, 72, 17, 236, 155, 185, 212, 102, 240, 12, 146, 45, 129, 55, 139, 33, 67, 242, 236, 18, 180, 211, 245, 171, 36, 182, ...] | 1 | 0 |


### Compare Differences

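We will now compare the results on the standard architecture versus ARM.  Typically, ARM delivers a 15% improvement on inference times while requiring less power and lower cost.  Note that each timing here covers a single inference request, so individual results can vary from run to run.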

In [None]:
display(f"Standard architecture: {x86_time}")
display(f"ARM architecture: {arm_time}")

'Standard architecture: 0.377669095993042'

'ARM architecture: 0.5871620178222656'
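
The two timings can also be expressed as a relative change. The following is a minimal sketch that uses the values displayed above. `relative_change` is a hypothetical helper; a negative result means the ARM run was faster than the standard run, and a positive result means it was slower.

```python
def relative_change(baseline, candidate):
    """Fractional change of `candidate` relative to `baseline`."""
    return (candidate - baseline) / baseline


# Timings copied from the output displayed above.
x86_time = 0.377669095993042
arm_time = 0.5871620178222656

change = relative_change(x86_time, arm_time)
print(f"ARM relative to standard architecture: {change:+.1%}")
```

With the timings recorded in this run the result is positive, so the recorded ARM inference took longer than the standard one. Because each value comes from a single request, repeating the measurement would give a more reliable comparison.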

With our work in the pipeline done, we'll undeploy it to get back our resources from the Kubernetes cluster.  If we keep the same settings we can redeploy the pipeline with the same configuration in the future.

In [None]:
pipeline.undeploy()