# Testing the Model Deployment

After deploying the model with RHOAI Model Serving, we want to test the deployment by sending images to the model server for real-time inference.

In this notebook we'll review how to consume the model through the RHOAI Model Server.

We'll start by importing the libraries we need: NumPy for preparing the input data, `post` from `requests` for calling the model server, and PyTorch.

In [1]:
import numpy as np

from requests import post
import torch

Let's set the inference URL of the deployed model. Replace it with the inference endpoint of your own deployment.

In [36]:
prediction_url = 'https://spleen2-user1.apps.cluster-c8kwf.c8kwf.sandbox2825.opentlc.com/v2/models/spleen2/infer'
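
The path of `prediction_url` follows the v2 REST inference convention `/v2/models/<model name>/infer`. As a minimal sketch, the URL can be assembled from the route of the deployment and the model name. The helper `build_infer_url` and the host below are hypothetical placeholders, not part of any RHOAI library.

```python
def build_infer_url(route, model_name):
    """Join an inference route and a model name into a v2 REST inference URL.

    Hypothetical helper for illustration; it is not part of any RHOAI library.
    """
    return f"{route.rstrip('/')}/v2/models/{model_name}/infer"


# Placeholder host; use the inference endpoint shown for your own deployment.
example_url = build_infer_url('https://my-model.apps.example.com/', 'spleen2')
print(example_url)
```

Building the URL this way keeps the model name in one place if you redeploy the model under a different name.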

Next, we load a sample volume from `data.npy` and crop it to its first 96×96×96 voxels, so that it matches the input shape `[1, 1, 96, 96, 96]` that we send to the model.

In [50]:
filename = 'data.npy'
with open(filename, 'rb') as inputfile:
    mydata = np.load(inputfile)
mydata = mydata[:, :, 0:96, 0:96, 0:96]
mydata.shape

(1, 1, 96, 96, 96)

In [45]:
import numpy as np

# Create a large array of size 160x160x160 with example data
large_array = np.random.rand(160, 160, 160)  # Example with random values

# Extract the subarray of the first 96x96x96 elements
sub_array = large_array[0:96, 0:96, 0:96]

print(sub_array.shape)

(96, 96, 96)


In [51]:
# Flatten the volume to a plain Python list and round each value to two decimals.
image = [round(element, 2) for element in mydata.flatten().tolist()]
image

[0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 0.15,
 0.03,
 0.0,
 0.0,
 0.0,
 0.05,
 0.16,
 0.26,
 0.35,
 0.45,
 0.58,
 0.62,
 0.63,
 0.79,
 1.0,
 1.0,
 1.0,
 1.0,
 0.84,
 0.72,
 0.63,
 0.58,
 0.6,
 0.62,
 0.63,
 0.63,
 0.8,
 0.93,
 1.0,
 1.0,
 1.0,
 1.0,
 1.0,
 0.89,
 0.78,
 0.93,
 1.0,
 1.0,
 1.0,
 1.0,
 0.92,
 0.79,
 0.72,
 0.68,
 0.64,
 0.68,
 0.66,
 0.72,
 0.9,
 1.0,
 1.0,
 0.91,
 0.74,
 0.72,
 0.83,
 0.97,
 0.95,
 0.91,
 0.9,
 0.91,
 0.86,
 0.81,
 0.79,
 0.73,
 0.64,
 0.62,
 0.72,
 0.87,
 1.0,
 1.0,
 0.84,
 0.63,
 0.65,
 0.76,
 0.84,
 0.78,
 0.72,
 0.72,
 0.72,
 0.71,
 0.73,
 0.75,
 0.7,
 0.66,
 0.68,
 0.71,
 0.82,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 0.0,
 0.77,
 0.49,
 0.31,
 0.22,
 0.2,
 0.19,
 0.27,
 0.46,
 0.67,
 0.83,
 0.87,
 0.9,
 0.86,
 0.91,
 1.0,
 1.0,
 1.0,
 1.0,
 0.88,
 0.74,
 0.62,
 0.62,
 0.64,
 0.71,
 0.82,
 0.96,
 1.0,
 1.0,
 1.0,
 1.0,
 1.0,
 1.0,
 0.99,
 0.79,
 0.7,
 0.82,
 1.0,
 1.0,
 0.97,
 0.83,
 0.75,
 0.8,
 0.8,
 0.73,
 0.64,
 0.61,
 ...

In [40]:
def serialize(image):
    """Wrap a flat list of voxel values in a v2 inference request body."""
    payload = {
        'inputs': [
            {
                'name': 'x',
                'shape': [1, 1, 96, 96, 96],
                'datatype': 'FP32',
                'data': image,
            }
        ]
    }
    return payload
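
The request body declares a `shape`, and the flat `data` list has to contain exactly as many values as that shape implies (96 × 96 × 96 = 884,736 for our input). The following is a minimal sketch of a variant that checks this before sending. `serialize_checked` is a hypothetical name, and its defaults simply mirror the values used in `serialize`.

```python
import math


def serialize_checked(data, shape, name='x', datatype='FP32'):
    """Build a v2 inference payload after checking data length against shape.

    Hypothetical variant of serialize(); the defaults mirror the values used there.
    """
    expected = math.prod(shape)
    if len(data) != expected:
        raise ValueError(f'shape {shape} needs {expected} values, got {len(data)}')
    return {
        'inputs': [
            {'name': name, 'shape': list(shape), 'datatype': datatype, 'data': data}
        ]
    }


# Tiny stand-in input so the example runs quickly; the real input is [1, 1, 96, 96, 96].
small_payload = serialize_checked([0.0] * 8, [1, 1, 2, 2, 2])
print(small_payload['inputs'][0]['shape'])
```

A length mismatch then fails locally with a readable message instead of surfacing as an error from the model server.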

In [52]:
payload = serialize(image)
payload

{'inputs': [{'name': 'x',
   'shape': [1, 1, 96, 96, 96],
   'datatype': 'FP32',
   'data': [0.0,
    0.0,
    0.0,
    0.0,
    0.0,
    0.0,
    0.0,
    0.0,
    0.0,
    0.15,
    0.03,
    0.0,
    0.0,
    0.0,
    0.05,
    0.16,
    0.26,
    0.35,
    0.45,
    0.58,
    0.62,
    0.63,
    0.79,
    1.0,
    1.0,
    1.0,
    1.0,
    0.84,
    0.72,
    0.63,
    0.58,
    0.6,
    0.62,
    0.63,
    0.63,
    0.8,
    0.93,
    1.0,
    1.0,
    1.0,
    1.0,
    1.0,
    0.89,
    0.78,
    0.93,
    1.0,
    1.0,
    1.0,
    1.0,
    0.92,
    0.79,
    0.72,
    0.68,
    0.64,
    0.68,
    0.66,
    0.72,
    0.9,
    1.0,
    1.0,
    0.91,
    0.74,
    0.72,
    0.83,
    0.97,
    0.95,
    0.91,
    0.9,
    0.91,
    0.86,
    0.81,
    0.79,
    0.73,
    0.64,
    0.62,
    0.72,
    0.87,
    1.0,
    1.0,
    0.84,
    0.63,
    0.65,
    0.76,
    0.84,
    0.78,
    0.72,
    0.72,
    0.72,
    0.71,
    0.73,
    0.75,
    0.7,
    0.66,
    0.68,
    ...

Let's now send the serialized image to the model server. The inference results are returned in a generic JSON structure, from which we extract the `outputs` list.

In [12]:
def get_model_response(payload, prediction_url, classes_count):
    # classes_count is not used here; it is kept so existing calls keep working.
    raw_response = post(prediction_url, json=payload)
    try:
        response = raw_response.json()
    except ValueError:
        print(f'Failed to deserialize service response.\n'
              f'Status code: {raw_response.status_code}\n'
              f'Response body: {raw_response.text}')
        raise  # Without a parsed response there is nothing to return.
    try:
        model_output = response['outputs']
    except (KeyError, TypeError):
        print(f'Failed to extract model output from service response.\n'
              f'Service response: {response}')
        raise
    return model_output

In [54]:
response = get_model_response(payload, prediction_url, 0)
response

[{'name': '218',
  'datatype': 'FP32',
  'shape': [1, 2, 96, 96, 96],
  'data': [-0.21949908,
   -0.061409637,
   -0.20525311,
   0.24671774,
   -0.24719861,
   -0.24798244,
   -0.10165331,
   0.6012609,
   -0.31004,
   0.1591478,
   -0.26741943,
   -0.16333982,
   -0.49930036,
   0.38839546,
   -0.27751467,
   0.24893792,
   -0.56179446,
   0.14083053,
   -0.41749385,
   0.041212186,
   -0.6365657,
   -0.008019067,
   -0.36787978,
   0.25588167,
   -0.69086045,
   -0.09167359,
   -0.5337379,
   -0.005084454,
   -0.61604553,
   0.04386577,
   -0.29922897,
   0.07106397,
   -0.5434085,
   -0.113418505,
   -0.39243126,
   -0.02382714,
   -0.5212448,
   -0.066877626,
   -0.31796172,
   0.16778171,
   -0.6717726,
   -0.04396847,
   -0.4753362,
   0.057640202,
   -0.6302313,
   0.25339353,
   -0.32642645,
   0.045834266,
   -0.63754004,
   -0.12814571,
   -0.44248468,
   0.08001108,
   -0.5820776,
   0.0036538392,
   -0.3191681,
   0.30288416,
   -0.58053154,
   0.13717988,
   -0.41014487,


Let's now visualize the result as in the previous notebook.
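
Before plotting, the flat `data` list has to be turned back into an array. The sketch below reshapes one output entry to its declared `shape` and takes the highest-scoring channel per voxel. It assumes that the two channels of the `[1, 2, 96, 96, 96]` output are per-class scores, which the response itself does not confirm. `output_to_mask` is a hypothetical helper, and the toy values are made up to show the mechanics.

```python
import numpy as np


def output_to_mask(output):
    """Turn one v2 output entry into a 3D label mask.

    Assumes the channel axis (axis 1) holds one score per class.
    """
    scores = np.asarray(output['data'], dtype=np.float32).reshape(output['shape'])
    # Pick the highest-scoring class per voxel and drop the batch axis.
    return scores.argmax(axis=1)[0]


# Tiny made-up output with shape [1, 2, 1, 1, 2] to show the mechanics.
toy_output = {'name': 'toy', 'datatype': 'FP32', 'shape': [1, 2, 1, 1, 2],
              'data': [0.9, -0.3, 0.1, 0.4]}
toy_mask = output_to_mask(toy_output)
print(toy_mask.shape)
```

Under the same assumption, `output_to_mask(response[0])` would yield a `(96, 96, 96)` array, and a single slice such as `mask[:, :, 48]` could then be displayed as a 2D image.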

The model server returned a prediction for our sample, so we can consume the deployed model as expected.

You can now move on to deploying the object detection application, which consumes this model in real time.

After that, we'll explore offline scoring in the next notebook.