# Edge Demo

This notebook will walk through building a computer vision (CV) pipeline in Wallaroo, deploying it to the local cluster for testing, and then publishing it for edge deployment.

# Package imports
Here we import the libraries needed for this notebook.

In [20]:
import wallaroo
import os
import pandas as pd
import sys
 
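# Helper utilities for this demo
from wallaroo_demo_utils import WallarooDemoUtils

w_demo = WallarooDemoUtils()
# Work from the demo directory so the relative model and data paths used below resolve
os.chdir("/home/jovyan/edge-cv-demo")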

## Securely connect this notebook to the Wallaroo cluster
We can now use the wallaroo library to set up a connection to the Wallaroo cluster.

In [21]:
wl = wallaroo.Client()
#w_demo.set_workspace('edge-cv-demo', wl)

Please log into the following URL in a web browser:

	https://keycloak.autoscale-uat-ee.wallaroo.dev/auth/realms/master/device?user_code=LRWS-FKTK

Login successful!


# Upload and package the model

When a model is uploaded to a Wallaroo cluster, it is optimized and packaged to make it ready to run as part of a pipeline. In many cases, the Wallaroo Server can natively run a model without any Python overhead. In other cases, such as a Python script, a custom Python environment will be automatically generated. This is comparable to the process of "containerizing" a model by adding a small HTTP server and other wrapping around it.

In [22]:
model = wl.upload_model('resnet-50', "./models/resnet50_v1.onnx", 
                        framework = wallaroo.framework.Framework.ONNX)


### Reserve resources needed for this pipeline 

Before deploying an inference engine, we need to tell Wallaroo what resources it will need.
To do this, we use the Wallaroo DeploymentConfigBuilder() and set the options listed below, which determine the properties of our inference engine.

We are testing this deployment for an edge scenario, so the resource specifications should match the planned hardware: the minimum needed to meet the expected load.

- cpus - 15 => allow the engine to use 15 CPU cores when running the neural net
- memory - 4Gi => each inference engine will have 4 GiB of memory, which is plenty for processing a single image at a time.
- arch - X86 => build the engine for the x86 architecture
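Memory values such as "4Gi" in the deployment configuration use binary suffixes, where 1 Gi is 1024 Mi. The following is a minimal sketch of that arithmetic, assuming Kubernetes-style quantity notation. The helper `memory_to_bytes` is hypothetical and is not part of the Wallaroo SDK.

```python
# Binary (power-of-two) suffixes, assumed to follow Kubernetes-style quantities.
_BINARY_UNITS = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4}


def memory_to_bytes(quantity: str) -> int:
    """Convert a quantity such as "0.5Gi" or "4Gi" to a whole number of bytes."""
    for suffix, factor in _BINARY_UNITS.items():
        if quantity.endswith(suffix):
            return int(float(quantity[: -len(suffix)]) * factor)
    # No recognized suffix: treat the value as a plain byte count.
    return int(quantity)


for q in ("0.5Gi", "4Gi"):
    print(f"{q} = {memory_to_bytes(q) // 1024**2} MiB")
```

Comparing the result against the memory actually available on the target edge device is a quick sanity check before deploying.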



In [39]:
deployment_config = wallaroo.DeploymentConfigBuilder() \
    .cpus(15)\
    .memory("4Gi")\
    .arch(wallaroo.engine_config.Architecture.X86)\
    .build()

# Simulated edge deployment
Deploy the uploaded model into the current Kubernetes environment using the specified resource constraints. This is a "simulated edge" deployment in that we try to mimic the edge hardware as closely as possible, for example by selecting an appropriate VM size from the cloud provider when possible.


In [40]:
pipeline = wl.build_pipeline('pipelinecvx86')
pipeline.add_model_step(model)

pipeline.deploy(deployment_config = deployment_config)

Waiting for deployment - this will take up to 45s ............. ok


| Field | Value |
|---|---|
| name | pipelinecvx86 |
| created | 2023-08-18 18:46:48.663415+00:00 |
| last_updated | 2023-08-21 15:27:26.484014+00:00 |
| deployed | True |
| tags | |
| versions | 8921bb78-d941-4f6d-bef9-38056dfbb5e9, 0505b432-a400-4132-b313-729fe38177ac, a2b6b538-7715-47f3-8f25-cffd3ec15512, 26d959f7-cd12-4d16-bc99-fac603539046, f1bc2d0d-9561-4ef7-ac5e-c492590dcbdf, 6a93fd35-b4c1-41a0-911d-999d78d2abda, 00b2b42b-351d-48ee-9cf3-5387bd55d1eb, adef1abe-d141-43b1-8b35-27426a69459f, e65e3b19-4a62-450e-a497-7fe71fd26698, 20f3b67e-abdb-4a52-be79-54e2c502b3bb, f9c24491-a6f6-4d49-9007-fea62e734f5e, e818c81c-3343-4a1c-80d6-de9b181b9908 |
| steps | resnet-50 |
| published | True |


In [41]:
pipeline.status()

{'status': 'Running',
 'details': [],
 'engines': [{'ip': '10.244.3.137',
   'name': 'engine-769556958-fm7gz',
   'status': 'Running',
   'reason': None,
   'details': [],
   'pipeline_statuses': {'pipelines': [{'id': 'pipelinecvx86',
      'status': 'Running'}]},
   'model_statuses': {'models': [{'name': 'resnet-50',
      'version': '069eb4f0-dab0-4915-9ce5-cdbd9c9bf6e8',
      'sha': 'c6c8869645962e7711132a7e17aced2ac0f60dcdc2c7faa79b2de73847a87984',
      'status': 'Running'}]}}],
 'engine_lbs': [{'ip': '10.244.2.99',
   'name': 'engine-lb-584f54c899-nvhg2',
   'status': 'Running',
   'reason': None,
   'details': []}],
 'sidekicks': []}
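The status dictionary can also be checked programmatically before sending traffic. The sketch below walks a dictionary shaped like the output above and reports whether every listed component is 'Running'. The function `all_running` is a hypothetical helper, and the key names are taken only from the output shown here; other deployments may return additional fields.

```python
def all_running(status: dict) -> bool:
    """Return True if the deployment and every listed component report 'Running'."""
    if status.get("status") != "Running":
        return False
    for engine in status.get("engines", []):
        if engine.get("status") != "Running":
            return False
        # Nested pipeline and model statuses reported by each engine.
        pipelines = (engine.get("pipeline_statuses") or {}).get("pipelines", [])
        models = (engine.get("model_statuses") or {}).get("models", [])
        if any(item.get("status") != "Running" for item in pipelines + models):
            return False
    # Load balancers in front of the engines.
    return all(lb.get("status") == "Running" for lb in status.get("engine_lbs", []))


# Trimmed copy of the status output shown above.
example_status = {
    "status": "Running",
    "engines": [{
        "status": "Running",
        "pipeline_statuses": {"pipelines": [{"id": "pipelinecvx86", "status": "Running"}]},
        "model_statuses": {"models": [{"name": "resnet-50", "status": "Running"}]},
    }],
    "engine_lbs": [{"status": "Running"}],
    "sidekicks": [],
}
print(all_running(example_status))
```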


### Run inference for a single image

A single image, encoded using the Apache Arrow format, is sent to the deployed pipeline. Arrow is used here because, as a binary format, it incurs far lower network and compute overhead than JSON. The Wallaroo Server engine accepts both JSON and Apache Arrow formats.
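To give a rough sense of the size difference, the sketch below compares a JSON encoding of image-sized data with a raw binary float32 encoding. This is not Arrow itself: it uses only the Python standard library, the pixel values are synthetic, and the 224x224x3 shape is an assumption about the input tensor.

```python
import json
from array import array

# A 224x224 image with 3 channels, flattened to floats in [0, 1] (synthetic values).
num_values = 224 * 224 * 3
pixels = [(i % 256) / 255.0 for i in range(num_values)]

# Text encoding: every float is written out as decimal digits plus separators.
json_bytes = len(json.dumps(pixels).encode("utf-8"))

# Binary encoding: array('f') stores each value as a 4-byte float32.
binary_bytes = len(array("f", pixels).tobytes())

print(f"JSON:   {json_bytes} bytes")
print(f"Binary: {binary_bytes} bytes")
print(f"Ratio:  {json_bytes / binary_bytes:.1f}x")
```

Beyond payload size, a binary format also avoids parsing decimal text back into floats on the receiving side.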

In [37]:
import pyarrow as pa

with pa.ipc.open_file("image_224x224.arrow") as f:
    image = f.read_all()

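# Warm-up runs (results discarded) so one-time startup costs do not skew the timing
for _ in range(10):
    results = pipeline.infer(image, dataset=["*", "metadata.elapsed"])

# Timed runs
n_iters = 3
elapsed = 0
for _ in range(n_iters):
    results = pipeline.infer(image, dataset=["*", "metadata.elapsed"])
    # Elapsed values are in nanoseconds; dividing by 1e6 converts to milliseconds
    elapsed += results['metadata.elapsed'][0].as_py()[1] / 1000000.0

print(f"Average elapsed: {elapsed/n_iters} ms")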

Average elapsed: 33.99299166666666 ms
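The averaging step can be separated from the inference calls so it is easy to reuse and test. The sketch below assumes the timings are reported in nanoseconds, as the division by 1,000,000 in the cell above implies. The function name `average_elapsed_ms` and the sample timings are hypothetical.

```python
from statistics import mean


def average_elapsed_ms(elapsed_ns, warmup=0):
    """Average nanosecond timings in milliseconds, skipping the first `warmup` samples."""
    samples = list(elapsed_ns)[warmup:]
    if not samples:
        raise ValueError("no samples left after discarding warm-up runs")
    return mean(samples) / 1_000_000.0


# Hypothetical nanosecond timings: one slow warm-up call, then three steady calls.
timings_ns = [120_000_000, 34_000_000, 33_000_000, 35_000_000]
print(f"Average elapsed: {average_elapsed_ms(timings_ns, warmup=1)} ms")
```

With only a few timed runs the average is sensitive to outliers, so a larger sample gives a more stable estimate.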


## Publish the pipeline for edge deployment

It worked! For a demo, we'll take working once as "testing". So now that we've tested our pipeline, we are ready to publish it for edge deployment. Publishing it means assembling all of the configuration files and model assets and pushing them to an OCI (aka Docker) repository. This same repository can hold the Wallaroo Server container image, any other container images needed for the edge system, plus any of the model pipelines.

The Wallaroo Server is available on the same repo at:
   oci.wallaroo.io/wallaroo-dev/wallaroo-server:latest
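Image references like the one above follow the usual registry/repository:tag layout. The sketch below splits such a reference into its parts, which can help when scripting an edge rollout. It is a simplified parser that assumes the first path component is the registry host; `parse_image_ref` and `ImageRef` are hypothetical names, not part of the Wallaroo SDK.

```python
from typing import NamedTuple, Optional


class ImageRef(NamedTuple):
    registry: str
    repository: str
    tag: Optional[str]
    digest: Optional[str]


def parse_image_ref(ref: str) -> ImageRef:
    """Split 'registry/repository[:tag][@digest]' into its parts (simplified)."""
    # A digest, if present, follows '@'.
    name, _, digest = ref.partition("@")
    # Assumption: the first path component is the registry host.
    registry, _, remainder = name.partition("/")
    # A tag is whatever follows the last ':' in the final path component.
    repository, sep, tag = remainder.rpartition(":")
    if not sep or "/" in tag:
        repository, tag = remainder, None
    return ImageRef(registry, repository, tag, digest or None)


print(parse_image_ref("oci.wallaroo.io/wallaroo-dev/wallaroo-server:latest"))
```

For a published pipeline, the tag portion of the pipeline URL carries the pipeline version identifier, so the same split can recover which version an edge device is running.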

In [17]:
res = pipeline.publish(deployment_config)
res

Waiting for pipeline publish... It may take up to 60 sec.
Pipeline is Publishing.....Published.


| Field | Value |
|---|---|
| ID | 4 |
| Pipeline Version ID | 6 |
| Status | Published |
| Engine URL | us-central1-docker.pkg.dev/wallaroo-dev-253816/uat/engine:v2023.3.0-main-3707 |
| Pipeline URL | us-central1-docker.pkg.dev/wallaroo-dev-253816/uat/pipelines/pipelinecvx86:00b2b42b-351d-48ee-9cf3-5387bd55d1eb |
| Helm Chart URL | us-central1-docker.pkg.dev/wallaroo-dev-253816/uat/charts/pipelinecvx86 |
| Helm Chart Reference | us-central1-docker.pkg.dev/wallaroo-dev-253816/uat/charts@sha256:ee9828d70df3775b0fec57503510a52982d26af0ee7ce610a246fd420446a9f0 |
| Helm Chart Version | 0.0.1-00b2b42b-351d-48ee-9cf3-5387bd55d1eb |
| Engine Config | {'engine': {'resources': {'limits': {'cpu': 15.0, 'memory': '4Gi'}, 'requests': {'cpu': 15.0, 'memory': '4Gi'}}}, 'engineAux': {'images': {}}, 'enginelb': {}} |
| Created By | cfa00d99-831f-434b-8617-97f2081d7fdf |


In [18]:
wl.list_pipelines()

| name | created | last_updated | deployed | tags | versions | steps | published |
|---|---|---|---|---|---|---|---|
| pipelinecvx86 | 2023-18-Aug 18:46:48 | 2023-19-Aug 00:42:52 | True | | 00b2b42b-351d-48ee-9cf3-5387bd55d1eb, adef1abe-d141-43b1-8b35-27426a69459f, e65e3b19-4a62-450e-a497-7fe71fd26698, 20f3b67e-abdb-4a52-be79-54e2c502b3bb, f9c24491-a6f6-4d49-9007-fea62e734f5e, e818c81c-3343-4a1c-80d6-de9b181b9908 | resnet-50 | True |


In [19]:
pipeline.publishes()

| id | pipeline_version_id | engine_url | pipeline_url | created_by | created_at | updated_at |
|---|---|---|---|---|---|---|
| 1 | 3 | | | cfa00d99-831f-434b-8617-97f2081d7fdf | 2023-18-Aug 18:47:25 | 2023-18-Aug 18:47:25 |
| 2 | 4 | | | cfa00d99-831f-434b-8617-97f2081d7fdf | 2023-18-Aug 19:08:02 | 2023-18-Aug 19:08:02 |
| 3 | 5 | us-central1-docker.pkg.dev/wallaroo-dev-253816/uat/engine:v2023.3.0-main-3707 | us-central1-docker.pkg.dev/wallaroo-dev-253816/uat/pipelines/pipelinecvx86:adef1abe-d141-43b1-8b35-27426a69459f | cfa00d99-831f-434b-8617-97f2081d7fdf | 2023-18-Aug 21:32:37 | 2023-18-Aug 21:32:37 |
| 4 | 6 | us-central1-docker.pkg.dev/wallaroo-dev-253816/uat/engine:v2023.3.0-main-3707 | us-central1-docker.pkg.dev/wallaroo-dev-253816/uat/pipelines/pipelinecvx86:00b2b42b-351d-48ee-9cf3-5387bd55d1eb | cfa00d99-831f-434b-8617-97f2081d7fdf | 2023-19-Aug 00:42:52 | 2023-19-Aug 00:42:52 |
