# Inference with RHODS Model Serving

This notebook shows how to consume a machine learning model deployed with Red Hat OpenShift Data Science (RHODS) Model Serving. The example uses [Ultraface](https://github.com/Linzaer/Ultra-Light-Fast-Generic-Face-Detector-1MB), a lightweight face detection model designed for edge computing platforms.

## Prerequisites

### Dependencies

This notebook requires the following libraries:
- numpy
- requests
- matplotlib
- opencv

To run this notebook in RHODS, import a custom notebook image that contains these dependencies, for instance:  
`quay.io/mmurakam/face-recognition-notebook:face-recognition-notebook-v1.0.2`

### Model

You need access to a running inference server that serves the target model. Follow these steps to deploy the model:
1. Download the [Ultraface model ONNX file](https://github.com/mamurak/onnx-models/blob/main/vision/body_analysis/ultraface/models/version-RFB-640.onnx).
2. Upload the model to an S3 bucket that RHODS can access. Any object storage that implements the S3 interface can be used, such as AWS S3, Ceph, or MinIO.
3. If not already present, create a Data Science Project in RHODS and configure a model server.
4. Select `Deploy model`. Enter a name, select the `onnx` framework, and choose or create a data connection that contains the S3 credentials and settings required for accessing your model bucket. Click `Deploy`.
5. Wait until the model has been deployed (`Deployed models` -> `Status` should be green). Note the `Inference endpoint` URL and, if you selected token authentication, the `Token secret` (under `Tokens`). You will need to paste these values into a cell further down in this notebook.
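
The upload in step 2 can also be scripted. The following is a minimal sketch, assuming the `boto3` package is available (it is not among the dependencies listed above). `upload_model` is a hypothetical helper, and the bucket name and object key you pass in are your own choices.

```python
from pathlib import Path


def upload_model(s3_client, model_path, bucket, key):
    """Upload a local model file to an S3-compatible bucket.

    `s3_client` is assumed to be a boto3 S3 client, or any object that
    offers a compatible `upload_file(Filename, Bucket, Key)` method.
    """
    path = Path(model_path)
    # Fail early with a clear message if the download step was skipped.
    if not path.is_file():
        raise FileNotFoundError(f"Model file not found: {path}")
    s3_client.upload_file(str(path), bucket, key)
    return f"s3://{bucket}/{key}"
```

A client for a non-AWS object store can be created with `boto3.client("s3", endpoint_url=..., aws_access_key_id=..., aws_secret_access_key=...)`, where the three values come from your storage provider.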

You can now run the following cells.

## Import dependencies

Image preprocessing and rendering are implemented in the custom `image_utils` module, and communication with the inference service is implemented in the custom `face_detection` module. See these modules for the implementation details.

In [None]:
from face_detection import detect_faces
from image_utils import load_and_preprocess, draw_image_and_faces
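
For orientation, the request that a helper like `detect_faces` sends could look roughly as follows. This is a sketch, not the actual `face_detection` implementation. It assumes that the endpoint follows the KServe v2 REST inference protocol, that the model's input tensor is named `input` with shape `[1, 3, 480, 640]`, and that a token is passed as a bearer token. `build_payload`, `build_headers`, and `infer` are hypothetical names.

```python
def build_payload(flat_pixels, shape=(1, 3, 480, 640), input_name="input"):
    """Build a KServe-v2-style JSON body from a flat list of float values.

    The default shape and input name are assumptions about the deployed model.
    """
    expected = 1
    for dim in shape:
        expected *= dim
    if len(flat_pixels) != expected:
        raise ValueError(f"expected {expected} values, got {len(flat_pixels)}")
    return {
        "inputs": [
            {
                "name": input_name,
                "shape": list(shape),
                "datatype": "FP32",
                "data": list(flat_pixels),
            }
        ]
    }


def build_headers(token=""):
    """Return request headers; add a bearer token only if one is set."""
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers


def infer(prediction_url, payload, token="", timeout=30):
    """POST the payload to the endpoint and return the decoded JSON response."""
    # Imported here so that the two helpers above need no extra packages.
    import requests

    response = requests.post(
        prediction_url, json=payload, headers=build_headers(token), timeout=timeout
    )
    response.raise_for_status()
    return response.json()
```

If the preprocessed image is a NumPy array, `build_payload(image.flatten().tolist())` would produce the request body. Under the same protocol assumption, the response carries the model results in an `outputs` list.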

Copy and paste the `Inference endpoint` URL of your deployed model into the `prediction_url` definition. If you have configured token authentication in the model server, copy and paste the corresponding `Token secret` into the `token` definition below.

`image_path` points to an image in your local filesystem. You can test the inference server with the images provided in the `sample-images` folder or with your own images that you upload to the local filesystem.

In [None]:
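prediction_url = ''  # enter your Inference endpoint URL here
token = ''  # enter your Token secret here, if available
image_path = 'sample-images/1.jpg'

# Load the image and convert it into the model's input format
original_image, preprocessed_image = load_and_preprocess(image_path)
# Send the preprocessed image to the inference endpoint
faces = detect_faces(preprocessed_image, prediction_url, token)
# Render the original image together with the detection results
draw_image_and_faces(original_image, *faces)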
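
If `prediction_url` is left empty or the image path is wrong, the calls above will most likely fail with an error that is hard to interpret. A small pre-flight check along these lines can catch such mistakes earlier. `check_settings` is a hypothetical helper and not part of the provided modules.

```python
from pathlib import Path
from urllib.parse import urlparse


def check_settings(prediction_url, image_path):
    """Return a list of problems found in the settings (empty if none)."""
    problems = []
    parsed = urlparse(prediction_url)
    # The endpoint must be a complete URL such as https://host/path.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        problems.append("prediction_url must be a full http(s) URL")
    if not Path(image_path).is_file():
        problems.append(f"image not found: {image_path}")
    return problems
```

Calling `check_settings(prediction_url, image_path)` before the inference step returns an empty list when both values look usable.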

### Load test the model service

In [None]:
# Send 100 sequential requests with the same preprocessed image
for _ in range(100):
    detect_faces(preprocessed_image, prediction_url, token)
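
The loop above only generates load. To also see how the service responds, the per-request latency can be recorded. The following is a minimal sketch using only the standard library; `run_load_test` is a hypothetical helper that accepts any zero-argument callable.

```python
import statistics
import time


def run_load_test(call, n_requests=100):
    """Call `call()` n_requests times and return latency statistics in seconds."""
    if n_requests < 1:
        raise ValueError("n_requests must be at least 1")
    latencies = []
    for _ in range(n_requests):
        start = time.perf_counter()
        call()
        latencies.append(time.perf_counter() - start)
    latencies.sort()
    # Simple index-based approximation of the 95th percentile.
    p95_index = min(len(latencies) - 1, int(0.95 * len(latencies)))
    return {
        "requests": n_requests,
        "mean": statistics.mean(latencies),
        "median": statistics.median(latencies),
        "p95": latencies[p95_index],
        "max": latencies[-1],
    }
```

In this notebook it could be used as `run_load_test(lambda: detect_faces(preprocessed_image, prediction_url, token))`. The requests are sent one after another, so the result reflects single-client latency rather than throughput under concurrent load.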