This tutorial and the assets can be downloaded as part of the [Wallaroo Tutorials repository](https://github.com/WallarooLabs/Wallaroo_Tutorials/tree/main/wallaroo-model-cookbooks/hf-clip-vit-base).

## CLIP ViT-B/32 Transformer Demonstration with Wallaroo

The following tutorial demonstrates deploying and performing sample inferences with the Hugging Face CLIP ViT-B/32 Transformer model.

### Prerequisites

This tutorial is written for Wallaroo version 2023.2.1 and above.  The model file `clip-vit-base-patch-32.zip` must be downloaded and placed in the `./models` directory.  It is available from the following URL:

[https://storage.googleapis.com/wallaroo-public-data/hf-clip-vit-b32/clip-vit-base-patch-32.zip](https://storage.googleapis.com/wallaroo-public-data/hf-clip-vit-b32/clip-vit-base-patch-32.zip)
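
The download can also be scripted. The following is a minimal sketch using only the Python standard library; the helper name `download_if_missing` is an assumption made for illustration and is not part of the Wallaroo SDK. It skips the download when the file is already present and creates the `./models` directory if needed.

```python
import os
import shutil
import urllib.request

# URL taken from the prerequisites above.
MODEL_URL = (
    "https://storage.googleapis.com/wallaroo-public-data/"
    "hf-clip-vit-b32/clip-vit-base-patch-32.zip"
)


def download_if_missing(url, dest_path):
    """Download url to dest_path unless the file already exists.

    Hypothetical helper for this tutorial; returns dest_path.
    """
    if os.path.exists(dest_path):
        return dest_path
    # Create the target directory (for example ./models) when it is missing.
    os.makedirs(os.path.dirname(dest_path) or ".", exist_ok=True)
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out_file:
        shutil.copyfileobj(response, out_file)
    return dest_path


# Usage (performs a network request, so it is left commented out):
# download_if_missing(MODEL_URL, "./models/clip-vit-base-patch-32.zip")
```

Because existing files are left untouched, the cell can be re-run without downloading the archive again.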

If performing this tutorial from outside the Wallaroo JupyterHub environment, install the [Wallaroo SDK](https://pypi.org/project/wallaroo/).

## Steps

### Imports

The first step is to import the libraries used for the example.

In [1]:
import json
import os
import requests

import wallaroo
from wallaroo.pipeline import Pipeline
from wallaroo.deployment_config import DeploymentConfigBuilder
from wallaroo.framework import Framework
from wallaroo.object import EntityNotFoundError

import pyarrow as pa
import numpy as np
import pandas as pd

from PIL import Image

### Connect to the Wallaroo Instance

The next step is to connect to Wallaroo through the Wallaroo client.  The Python library is included in the Wallaroo install and available through the JupyterHub interface provided with your Wallaroo environment.

This is accomplished using the `wallaroo.Client()` command, which provides a URL to grant the SDK permission to your specific Wallaroo environment.  When displayed, enter the URL into a browser and confirm permissions.  Store the connection into a variable that can be referenced later.

If logging into the Wallaroo instance through the internal JupyterHub service, use `wl = wallaroo.Client()`.  For more information on Wallaroo Client settings, see the [Client Connection guide](https://docs.wallaroo.ai/wallaroo-developer-guides/wallaroo-sdk-guides/wallaroo-sdk-essentials-guide/wallaroo-sdk-essentials-client/).

In [2]:
wl = wallaroo.Client()

### Set Workspace and Pipeline

The next step is to create the Wallaroo workspace and pipeline used for the inference requests.

* References
  * [Wallaroo SDK Essentials Guide: Workspace Management](https://docs.wallaroo.ai/wallaroo-developer-guides/wallaroo-sdk-guides/wallaroo-sdk-essentials-guide/wallaroo-sdk-essentials-workspace/)
  * [Wallaroo SDK Essentials Guide: Pipeline Management](https://docs.wallaroo.ai/wallaroo-developer-guides/wallaroo-sdk-guides/wallaroo-sdk-essentials-guide/wallaroo-sdk-essentials-pipelines/wallaroo-sdk-essentials-pipeline/)

In [3]:
# Return the workspace called <name>, creating it through the Wallaroo client if it does not exist.

def get_workspace(name, client):
    for ws in client.list_workspaces():
        if ws.name() == name:
            return ws
    # no matching workspace was found, so create one with the same client
    return client.create_workspace(name)

In [4]:
# create the workspace and pipeline

workspace_name = 'clip-demo'
pipeline_name = 'clip-demo'


workspace = get_workspace(workspace_name, wl)

wl.set_current_workspace(workspace)
display(wl.get_current_workspace())

pipeline = wl.build_pipeline(pipeline_name)
pipeline

{'name': 'clip-demo', 'id': 19, 'archived': False, 'created_by': '92b0e5e2-b5de-46af-baa6-0a86c702cfb4', 'created_at': '2024-02-15T21:17:56.301345+00:00', 'models': [{'name': 'clip-vit', 'versions': 3, 'owner_id': '""', 'last_update_time': datetime.datetime(2024, 2, 15, 21, 47, 56, 478712, tzinfo=tzutc()), 'created_at': datetime.datetime(2024, 2, 15, 21, 18, 14, 886516, tzinfo=tzutc())}], 'pipelines': [{'name': 'clip-demo', 'create_time': datetime.datetime(2024, 2, 15, 21, 17, 56, 593808, tzinfo=tzutc()), 'definition': '[]'}]}

| | |
|---|---|
| name | clip-demo |
| created | 2024-02-15 21:17:56.593808+00:00 |
| last_updated | 2024-02-15 22:12:16.419873+00:00 |
| deployed | (none) |
| arch | |
| tags | |
| versions | 52b8dd4d-572e-4754-8087-3de896b1d1f9, 44134c94-2f06-4123-8d48-6c60115c42ca |
| steps | |
| published | False |


### Configure and Upload Model

The 🤗 Hugging Face model is uploaded to Wallaroo by defining the input and output schema, and specifying the model's framework as `wallaroo.framework.Framework.HUGGING_FACE_ZERO_SHOT_IMAGE_CLASSIFICATION`.

The data schemas are defined in Apache PyArrow Schema format.

The model is converted to the Wallaroo Containerized runtime after the upload is complete.

* References
  * [Wallaroo SDK Essentials Guide: Model Uploads and Registrations: Hugging Face](https://docs.wallaroo.ai/wallaroo-developer-guides/wallaroo-sdk-guides/wallaroo-sdk-essentials-guide/wallaroo-sdk-model-uploads/wallaroo-sdk-model-upload-hugging-face/)

In [5]:
input_schema = pa.schema([
    pa.field('inputs',  # required, fixed image dimensions
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=640
            ),
            list_size=480
        )
    ),
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=4)),  # required, equivalent to `options` in the provided demo
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64(), list_size=4)), # has to be same as number of candidate labels
    pa.field('label', pa.list_(pa.string(), list_size=4)), # has to be same as number of candidate labels
])

### Upload Model

In [6]:
model = wl.upload_model('clip-vit', './models/clip-vit-base-patch-32.zip', 
                        framework=Framework.HUGGING_FACE_ZERO_SHOT_IMAGE_CLASSIFICATION, 
                        input_schema=input_schema, 
                        output_schema=output_schema)
model

Waiting for model loading - this will take up to 10.0min.
Model is pending loading to a container runtime..
Model is attempting loading to a container runtime..............................................successful

Ready


| | |
|---|---|
| Name | clip-vit |
| Version | d6a5a1c2-0584-40df-b94e-99c93f2b2832 |
| File Name | clip-vit-base-patch-32.zip |
| SHA | 4efc24685a14e1682301cc0085b9db931aeb5f3f8247854bedc6863275ed0646 |
| Status | ready |
| Image Path | proxy.replicated.com/proxy/wallaroo/ghcr.io/wallaroolabs/mlflow-deploy:v2023.4.1-4514 |
| Architecture | |
| Updated At | 2024-15-Feb 22:16:37 |


### Deploy Pipeline

With the model uploaded and prepared, we add the model as a pipeline step and deploy it.  For this example, we will allocate 4 Gi of RAM and 1 CPU to the model's use through the pipeline deployment configuration.

* References
  * [Wallaroo SDK Essentials Guide: Pipeline Deployment Configuration](https://docs.wallaroo.ai/wallaroo-developer-guides/wallaroo-sdk-guides/wallaroo-sdk-essentials-guide/wallaroo-sdk-essentials-pipelines/wallaroo-sdk-essentials-pipeline-deployment-config/)

In [9]:
deployment_config = wallaroo.DeploymentConfigBuilder() \
    .cpus(.25).memory('1Gi') \
    .sidekick_memory(model, '4Gi') \
    .sidekick_cpus(model, 1.0) \
    .build()

The pipeline is deployed with the specified deployment configuration.

Because the model runs in the Wallaroo Containerized Runtime, the deployment step may time out with the `status` still showing `Starting`.  If this occurs, wait an additional 60 seconds, then run the `pipeline.status()` cell.  Once the status is `Running`, the rest of the tutorial can proceed.
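
Instead of re-running the status cell by hand, the wait can be automated. The sketch below is one possible approach and not a Wallaroo SDK feature: `wait_until_running` and its parameters are hypothetical names, and the only assumption about the SDK is that `pipeline.status()` returns a dictionary with a top-level `status` key, as in the output shown later in this tutorial.

```python
import time


def wait_until_running(get_status, timeout_s=300, poll_interval_s=10, sleep=time.sleep):
    """Poll get_status() until it reports 'Running' or timeout_s has elapsed.

    get_status is any callable returning a dict with a 'status' key,
    for example the bound method pipeline.status (an assumption based on
    the status output in this tutorial). Returns the final status dict.
    """
    waited = 0
    while True:
        status = get_status()
        if status.get("status") == "Running":
            return status
        if waited >= timeout_s:
            raise TimeoutError(
                f"pipeline still {status.get('status')!r} after {waited}s"
            )
        sleep(poll_interval_s)
        waited += poll_interval_s


# Usage, after pipeline.deploy(...) has returned or timed out:
# wait_until_running(pipeline.status)
```

The `sleep` parameter exists only so the function can be exercised without real delays; in a notebook the default `time.sleep` is sufficient.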

In [10]:
pipeline.clear()
pipeline.add_model_step(model)
pipeline.deploy(deployment_config=deployment_config)

 ok


| | |
|---|---|
| name | clip-demo |
| created | 2024-02-15 21:17:56.593808+00:00 |
| last_updated | 2024-02-15 22:21:06.549681+00:00 |
| deployed | True |
| arch | |
| tags | |
| versions | 5d9fcea3-49af-4bec-8e78-7ac9aeaabb5f, a270a215-2fe6-457e-8d6a-ae3c73ce78a1, 52b8dd4d-572e-4754-8087-3de896b1d1f9, 44134c94-2f06-4123-8d48-6c60115c42ca |
| steps | clip-vit |
| published | False |


In [17]:
pipeline.status()

{'status': 'Running',
 'details': [],
 'engines': [{'ip': '10.100.1.169',
   'name': 'engine-6bffc9f549-7ngg6',
   'status': 'Running',
   'reason': None,
   'details': [],
   'pipeline_statuses': {'pipelines': [{'id': 'clip-demo',
      'status': 'Running'}]},
   'model_statuses': {'models': [{'name': 'clip-vit',
      'version': 'd6a5a1c2-0584-40df-b94e-99c93f2b2832',
      'sha': '4efc24685a14e1682301cc0085b9db931aeb5f3f8247854bedc6863275ed0646',
      'status': 'Running'}]}}],
 'engine_lbs': [{'ip': '10.100.0.69',
   'name': 'engine-lb-dcd9c8cd7-qkl6c',
   'status': 'Running',
   'reason': None,
   'details': []}],
 'sidekicks': [{'ip': '10.100.2.231',
   'name': 'engine-sidekick-clip-vit-24-54bdd55ff5-rld5r',
   'status': 'Running',
   'reason': None,
   'details': [],
   'statuses': '\n'}]}

### Run Inference

Before running inferences, confirm that the `pipeline.status()` output above shows the pipeline as `Running`.

The sample images in the `./data` directory are converted into numpy arrays and paired with the candidate labels.  Both are stored as columns of a pandas DataFrame: the `inputs` field holds the image values, and `candidate_labels` holds the labels.

In [12]:
image_paths = [
    "./data/bear-in-tree.jpg",
    "./data/elephant-and-zebras.jpg",
    "./data/horse-and-dogs.jpg",
    "./data/kittens.jpg",
    "./data/remote-monitor.jpg"
]
images = []

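for image_path in image_paths:
    image = Image.open(image_path)
    # PIL's resize takes (width, height); for a 3-channel RGB image the resulting
    # array has shape (480, 640, 3), matching the fixed dimensions in the input schema
    image = image.resize((640, 480))
    images.append(np.array(image))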

dataframe = pd.DataFrame({"images": images})

In [13]:
input_data = {
        "inputs": images,
        "candidate_labels": [["cat", "dog", "horse", "elephant"]] * 5,
}
dataframe = pd.DataFrame(input_data)
dataframe

| | inputs | candidate_labels |
|---|---|---|
| 0 | [[[60, 62, 61], [62, 64, 63], [67, 69, 68], [7... | [cat, dog, horse, elephant] |
| 1 | [[[228, 235, 241], [229, 236, 242], [230, 237,... | [cat, dog, horse, elephant] |
| 2 | [[[177, 177, 177], [177, 177, 177], [177, 177,... | [cat, dog, horse, elephant] |
| 3 | [[[140, 25, 56], [144, 25, 67], [146, 24, 73],... | [cat, dog, horse, elephant] |
| 4 | [[[24, 20, 11], [22, 18, 9], [18, 14, 5], [21,... | [cat, dog, horse, elephant] |
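
Since the input schema fixes both the image dimensions and the number of candidate labels, it can help to check the rows before submitting them. The following is a minimal sketch of such a check; `image_shape` and `find_schema_mismatches` are hypothetical helper names, and the default sizes are copied from the schema defined earlier. Grayscale or RGBA source images, for instance, would produce a different channel count and be reported here.

```python
# Defaults copied from the input schema: height 480, width 640, 3 channels, 4 labels.
EXPECTED_IMAGE_SHAPE = (480, 640, 3)
EXPECTED_LABEL_COUNT = 4


def image_shape(image):
    """Return the shape of an array-like image as a tuple.

    Objects exposing a .shape attribute (such as numpy arrays) are used
    directly; nested lists are measured along their first elements only.
    """
    if hasattr(image, "shape"):
        return tuple(image.shape)
    shape = []
    while isinstance(image, (list, tuple)):
        shape.append(len(image))
        if not image:
            break
        image = image[0]
    return tuple(shape)


def find_schema_mismatches(images, candidate_labels,
                           expected_shape=EXPECTED_IMAGE_SHAPE,
                           expected_labels=EXPECTED_LABEL_COUNT):
    """Return a list of problem descriptions; an empty list means every row matches."""
    problems = []
    for index, (image, labels) in enumerate(zip(images, candidate_labels)):
        shape = image_shape(image)
        if shape != tuple(expected_shape):
            problems.append(f"row {index}: image shape {shape}, expected {tuple(expected_shape)}")
        if len(labels) != expected_labels:
            problems.append(f"row {index}: {len(labels)} labels, expected {expected_labels}")
    return problems


# Usage with the variables from the cells above:
# find_schema_mismatches(input_data["inputs"], input_data["candidate_labels"])
```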


### Inference Outputs

The inference request is submitted, and for each image the labels and their confidence scores are returned in `out.label` and `out.score`.  In the sample output below, the labels for each image are listed in descending order of score.

In [21]:
results = pipeline.infer(dataframe, timeout=600)
pd.set_option('display.max_colwidth', None)
display(results)

Unnamed: 0,time,in.candidate_labels,in.inputs,out.label,out.score,anomaly.count
0,2024-02-15 22:24:56.801,"[cat, dog, horse, elephant]","[60, 62, 61, 62, 64, 63, 67, 69, 68, 72, 74, 73, 76, 78, 77, 77, 79, 78, 76, 78, 77, 74, 76, 75, 73, 75, 74, 75, 77, 76, 79, 81, 80, 83, 85, 84, 83, 85, 84, 83, 85, 84, 85, 87, 86, 87, 89, 88, 88, 90, 89, 88, 90, 89, 88, 89, 89, 88, 88, 88, 88, 88, 88, 88, 88, 88, 90, 90, 90, 91, 91, 91, 93, 93, 93, 94, 94, 94, 95, 95, 95, 94, 94, 94, 96, 96, 96, 97, 97, 97, 100, 100, 100, 102, 102, 102, 105, 105, 105, 106, ...]","[elephant, dog, horse, cat]","[0.4146825075149536, 0.34838539361953735, 0.1285744309425354, 0.10835769772529602]",0
1,2024-02-15 22:24:56.801,"[cat, dog, horse, elephant]","[228, 235, 241, 229, 236, 242, 230, 237, 243, 230, 237, 243, 231, 238, 244, 232, 237, 243, 232, 237, 243, 231, 236, 242, 232, 237, 243, 233, 238, 244, 233, 238, 244, 233, 238, 244, 233, 238, 244, 234, 239, 245, 234, 239, 245, 234, 239, 245, 235, 240, 246, 235, 240, 246, 235, 240, 246, 235, 240, 246, 235, 240, 246, 235, 240, 244, 235, 240, 244, 235, 240, 244, 236, 241, 245, 236, 241, 245, 236, 241, 245, 236, 241, 245, 236, 241, 245, 236, 241, 245, 236, 241, 245, 236, 241, 245, 235, 240, 244, 235, ...]","[elephant, horse, dog, cat]","[0.9981434345245361, 0.001765842898748815, 6.823761941632256e-05, 2.2441257897298783e-05]",0
2,2024-02-15 22:24:56.801,"[cat, dog, horse, elephant]","[177, 177, 177, 177, 177, 177, 177, 177, 177, 177, 177, 177, 177, 177, 177, 177, 177, 177, 177, 177, 177, 177, 177, 177, 177, 177, 177, 177, 177, 177, 176, 176, 176, 176, 176, 176, 176, 176, 176, 176, 176, 176, 177, 177, 177, 178, 178, 178, 175, 177, 176, 175, 177, 176, 175, 177, 176, 175, 177, 176, 175, 177, 176, 175, 177, 176, 175, 176, 178, 175, 176, 178, 174, 175, 177, 174, 175, 177, 174, 175, 177, 174, 175, 177, 174, 175, 177, 174, 175, 177, 174, 175, 177, 174, 175, 177, 175, 176, 180, 175, ...]","[horse, dog, elephant, cat]","[0.7596790790557861, 0.2171126902103424, 0.020392922684550285, 0.0028152712620794773]",0
3,2024-02-15 22:24:56.801,"[cat, dog, horse, elephant]","[140, 25, 56, 144, 25, 67, 146, 24, 73, 142, 19, 65, 144, 18, 66, 154, 25, 81, 157, 28, 82, 145, 18, 63, 139, 13, 37, 155, 26, 64, 157, 30, 75, 159, 32, 77, 151, 22, 78, 152, 23, 88, 158, 30, 89, 142, 17, 59, 162, 31, 73, 162, 35, 65, 154, 29, 61, 151, 26, 70, 154, 29, 73, 146, 22, 56, 153, 26, 71, 160, 27, 93, 144, 22, 73, 156, 31, 87, 153, 22, 88, 174, 41, 96, 167, 32, 73, 163, 23, 86, 161, 20, 89, 171, 34, 76, 165, 34, 78, 157, ...]","[cat, dog, elephant, horse]","[0.9870226979255676, 0.006646943278610706, 0.0032716323621571064, 0.003058752976357937]",0
4,2024-02-15 22:24:56.801,"[cat, dog, horse, elephant]","[24, 20, 11, 22, 18, 9, 18, 14, 5, 21, 17, 8, 22, 18, 9, 21, 17, 8, 24, 19, 13, 16, 11, 5, 18, 15, 8, 17, 14, 7, 15, 12, 5, 18, 15, 8, 21, 18, 11, 18, 15, 8, 15, 12, 7, 20, 17, 12, 18, 15, 8, 15, 12, 5, 17, 14, 7, 18, 15, 8, 17, 14, 5, 18, 15, 6, 18, 15, 6, 14, 11, 2, 16, 13, 4, 16, 13, 4, 21, 18, 11, 16, 13, 6, 14, 11, 6, 16, 13, 8, 14, 11, 6, 19, 16, 11, 17, 14, 9, 14, ...]","[dog, horse, cat, elephant]","[0.5713969469070435, 0.17229345440864563, 0.15523967146873474, 0.10106996446847916]",0
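
To reduce each row to a single prediction, the label with the highest score can be selected. The sketch below assumes only that `out.label` and `out.score` hold equal-length lists per row, as in the table above; `top_prediction` is a hypothetical helper name, and the sample values are copied from the first two rows of that output.

```python
def top_prediction(labels, scores):
    """Return the (label, score) pair with the highest score."""
    if not labels or len(labels) != len(scores):
        raise ValueError("labels and scores must be non-empty and the same length")
    return max(zip(labels, scores), key=lambda pair: pair[1])


# Values copied from rows 0 and 1 of the inference output above.
sample_rows = [
    (["elephant", "dog", "horse", "cat"],
     [0.4146825075149536, 0.34838539361953735, 0.1285744309425354, 0.10835769772529602]),
    (["elephant", "horse", "dog", "cat"],
     [0.9981434345245361, 0.001765842898748815, 6.823761941632256e-05, 2.2441257897298783e-05]),
]

for labels, scores in sample_rows:
    label, score = top_prediction(labels, scores)
    print(f"{label}: {score:.4f}")

# With the DataFrame returned by pipeline.infer, the same helper could be applied as:
# for labels, scores in zip(results["out.label"], results["out.score"]):
#     print(top_prediction(list(labels), list(scores)))
```

Using `max` rather than taking the first element avoids relying on the ordering of the returned lists.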


### Undeploy Pipelines

With the tutorial complete, the pipeline is undeployed and its resources are returned to the cluster.

In [22]:
pipeline.undeploy()

Waiting for undeployment - this will take up to 45s .................................... ok


| | |
|---|---|
| name | clip-demo |
| created | 2024-02-15 21:17:56.593808+00:00 |
| last_updated | 2024-02-15 22:21:06.549681+00:00 |
| deployed | False |
| arch | |
| tags | |
| versions | 5d9fcea3-49af-4bec-8e78-7ac9aeaabb5f, a270a215-2fe6-457e-8d6a-ae3c73ce78a1, 52b8dd4d-572e-4754-8087-3de896b1d1f9, 44134c94-2f06-4123-8d48-6c60115c42ca |
| steps | clip-vit |
| published | False |
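
When the deploy, infer, and undeploy steps are run from a script rather than cell by cell, an error during inference can leave the pipeline deployed and holding resources. One way to guard against that is a small context manager, sketched below. The name `deployed` is hypothetical and not part of the Wallaroo SDK; it relies only on the `pipeline.deploy(deployment_config=...)` and `pipeline.undeploy()` calls used in this tutorial.

```python
from contextlib import contextmanager


@contextmanager
def deployed(pipeline, deployment_config):
    """Deploy the pipeline on entry and always undeploy it on exit.

    Hypothetical convenience wrapper. If deploy() itself raises,
    undeploy() is not called.
    """
    pipeline.deploy(deployment_config=deployment_config)
    try:
        yield pipeline
    finally:
        # Runs whether the body succeeded or raised, returning resources to the cluster.
        pipeline.undeploy()


# Usage with the objects defined in this tutorial:
# with deployed(pipeline, deployment_config) as live_pipeline:
#     results = live_pipeline.infer(dataframe, timeout=600)
```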
