# Deploying a project for offline inference

In this notebook, we will show how to create a deployment for a project that can be used to run inference locally, using OpenVINO.

In [None]:
# As usual we will connect to the platform first, using the server details from the .env file

from geti_sdk import Geti
from geti_sdk.utils import get_server_details_from_env

geti_server_configuration = get_server_details_from_env()

geti = Geti(server_config=geti_server_configuration)

### Selecting a project for deployment
Let's list all projects in the workspace and select one to create a deployment for.

In [None]:
from geti_sdk.rest_clients import ProjectClient

project_client = ProjectClient(session=geti.session, workspace_id=geti.workspace_id)
projects = project_client.list_projects()

## Deploying the project
Let's go with the project we created in notebook [004](004_create_pipeline_project_from_dataset.ipynb): `COCO multitask animal demo`. To create a deployment, we can use the `geti.deploy_project` convenience method. This will download the active (OpenVINO) models for all tasks in the project to our local machine, so that we can use them to run inference locally.

> **NOTE**: Downloading the model data may take some time. Models for anomaly tasks in particular are on the order of 100 MB in size, so please be prepared to wait a bit.

In [None]:
PROJECT_NAME = "COCO multitask animal demo"

Before deploying, we need to make sure that the project is trained. Otherwise it will not contain any models to deploy, and the deployment will fail.

In [None]:
from geti_sdk.demos import ensure_trained_example_project

ensure_trained_example_project(geti=geti, project_name=PROJECT_NAME);

Once we are sure that the project has trained models for each task, we can create the deployment in the cell below.

Note the `enable_explainable_ai` argument. If set to `True`, the deployment will include the necessary artifacts to run the Explainable AI (XAI) service. This will allow us to generate explanations for the predictions made by the models in the deployment.

In [None]:
deployment = geti.deploy_project(project_name=PROJECT_NAME, enable_explainable_ai=True)

### Preparing the models for inference
Now that the `deployment` is created and the models are saved to the local disk, we can load the models into memory to prepare them for inference. 

In [None]:
deployment.load_inference_models(device="CPU")

## Running inference on an image locally
Now, we can load an image as a numpy array (for instance using OpenCV) and use the `deployment.infer` method to generate a prediction for it.
The SDK contains an example image that we use for this. The path to the image is in the `EXAMPLE_IMAGE_PATH` constant, from the `geti_sdk.demos` module.

In [None]:
import time

import cv2

from geti_sdk.demos import EXAMPLE_IMAGE_PATH

numpy_image = cv2.imread(EXAMPLE_IMAGE_PATH)

# Convert to RGB channel order. All deployed models expect the image in RGB format
numpy_rgb = cv2.cvtColor(numpy_image, cv2.COLOR_BGR2RGB)

t_start = time.time()
prediction = deployment.infer(numpy_rgb)
t_elapsed = time.time() - t_start

print(f"Running local inference on image took {t_elapsed*1000:.2f} milliseconds")
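
The channel-order conversion above amounts to reversing the last axis of the image array. The NumPy-only sketch below illustrates this on a tiny synthetic array instead of the example image. `bgr_to_rgb` is a hypothetical helper name used for illustration; it is not part of the SDK or of OpenCV.

```python
import numpy as np


def bgr_to_rgb(image: np.ndarray) -> np.ndarray:
    """Return a copy of an H x W x 3 image with the channel axis reversed."""
    if image.ndim != 3 or image.shape[2] != 3:
        raise ValueError("expected an image of shape (height, width, 3)")
    # Reversing the last axis swaps the blue and red channels; green stays put.
    # ascontiguousarray returns a real copy instead of a negative-stride view.
    return np.ascontiguousarray(image[:, :, ::-1])


# A single pure-blue pixel in BGR order becomes (0, 0, 255) in RGB order
bgr_pixel = np.array([[[255, 0, 0]]], dtype=np.uint8)
rgb_pixel = bgr_to_rgb(bgr_pixel)
print(rgb_pixel[0, 0].tolist())  # [0, 0, 255]
```

Skipping this conversion does not raise an error, since the array shape stays the same, so a wrong channel order tends to show up only as degraded predictions.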

### Inspecting the result
The `Prediction` object generated by `deployment.infer` has the same structure as the predictions sent by the platform. Let's have a closer look at it. We can do so in two ways:

1. Visualise it using the `Visualizer` utility class
2. Inspect its properties via the `prediction.overview` property

Let's show it on the image first.

In [None]:
from geti_sdk import Visualizer

visualizer = Visualizer()

result = visualizer.draw(numpy_rgb, prediction)
visualizer.show_in_notebook(result)

Printing the prediction overview lets us look at the structure and properties of the prediction object.

In [None]:
print(prediction.overview)

### Explaining the prediction

If the deployment was created with the `enable_explainable_ai` argument set to `True`, we can also generate an explanation for the prediction. This can be done using the `deployment.explain` method, which runs inference in the same way as the `deployment.infer` method, but also generates saliency maps and adds them to the `Prediction` object.

Let's generate an explanation for the prediction and visualise it using the `Visualizer` utility class' `explain_label` method.

In [None]:
t_start = time.time()
prediction_with_saliency_map = deployment.explain(numpy_rgb)
t_elapsed = time.time() - t_start

print(
    f"Running local inference with XAI on image took {t_elapsed*1000:.2f} milliseconds"
)

result = visualizer.explain_label(
    numpy_rgb, prediction_with_saliency_map, label_name="dog"
)
visualizer.show_in_notebook(result)
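
To give an intuition for what a saliency-map visualisation conveys, the sketch below blends a 2D saliency map onto an RGB image so that highly salient pixels are tinted red. This is an illustrative NumPy example under stated assumptions, not the way `Visualizer.explain_label` is implemented; `overlay_saliency`, its `alpha` parameter and the synthetic input arrays are all hypothetical.

```python
import numpy as np


def overlay_saliency(
    image: np.ndarray, saliency: np.ndarray, alpha: float = 0.5
) -> np.ndarray:
    """Blend a 2D saliency map onto an RGB image as a red heat overlay."""
    if image.shape[:2] != saliency.shape:
        raise ValueError("saliency map must match the image height and width")
    saliency = saliency.astype(np.float32)
    value_range = float(saliency.max() - saliency.min())
    # Scale to [0, 1]; a constant map carries no information and becomes zeros
    if value_range > 0:
        weights = (saliency - saliency.min()) / value_range
    else:
        weights = np.zeros_like(saliency)
    heat = np.zeros(image.shape, dtype=np.float32)
    heat[..., 0] = 255.0  # pure red in RGB channel order
    # Per-pixel blend: salient pixels move towards red, others keep their colour
    blend = (alpha * weights)[..., np.newaxis]
    blended = (1.0 - blend) * image.astype(np.float32) + blend * heat
    return blended.round().astype(np.uint8)


# Synthetic 2 x 2 grey image with one fully salient pixel
image = np.full((2, 2, 3), 100, dtype=np.uint8)
saliency = np.array([[0.0, 1.0], [0.0, 0.0]])
overlay = overlay_saliency(image, saliency, alpha=1.0)
print(overlay[0, 1].tolist(), overlay[0, 0].tolist())
```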

## Saving the deployment
When we create the deployment, the model data is saved to a temporary folder. To re-use the deployment offline later on, we can save it, which copies the model data from the temporary folder to the path we specify. If we want to run inference locally again, we can simply reload the deployment from the saved folder, without having to connect to the platform again.

In [None]:
import os

PATH_TO_DEPLOYMENT_FOLDER = os.path.join("deployments", PROJECT_NAME)

deployment.save(path_to_folder=PATH_TO_DEPLOYMENT_FOLDER)

## Loading a saved deployment
Loading a deployment that was previously saved to disk is easy and can be done without establishing a connection to the platform (or without even connecting to the internet, for that matter).

In [None]:
from geti_sdk.deployment import Deployment

offline_deployment = Deployment.from_folder(PATH_TO_DEPLOYMENT_FOLDER)
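
A common pattern that builds on this is to contact the platform only when no saved deployment exists yet. The sketch below shows the idea with stand-in callables so that it runs on its own; `load_or_create` is a hypothetical helper, not an SDK function.

```python
import os
import tempfile
from typing import Callable, TypeVar

T = TypeVar("T")


def load_or_create(
    folder: str, load: Callable[[str], T], create: Callable[[], T]
) -> T:
    """Use `load` when `folder` exists on disk, otherwise fall back to `create`."""
    if os.path.isdir(folder):
        return load(folder)
    return create()


# Self-contained demonstration with stand-in callables
with tempfile.TemporaryDirectory() as existing_folder:
    from_disk = load_or_create(
        existing_folder, load=lambda path: "loaded", create=lambda: "created"
    )
    missing_folder = os.path.join(existing_folder, "does_not_exist")
    fresh = load_or_create(
        missing_folder, load=lambda path: "loaded", create=lambda: "created"
    )

print(from_disk, fresh)  # loaded created
```

In this notebook, `load` could be `Deployment.from_folder` and `create` could be a small function that calls `geti.deploy_project` and then `deployment.save`. Note that the check only tests whether the folder exists, so it assumes that an existing folder holds a complete deployment.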

Again, to prepare the deployment for inference, make sure to load the models onto the CPU (or whichever device you want to use).

In [None]:
offline_deployment.load_inference_models(device="CPU")

That's all there is to it! The `offline_deployment` can now be used to run inference.

## Comparing local inference and inference on the platform
As a final step, we can make a comparison between the local inference results and the predictions sent back from the platform. We will have a look at the time required for both methods, and compare the output.

In [None]:
from geti_sdk.rest_clients import ImageClient, PredictionClient

project = project_client.get_project_by_name(PROJECT_NAME)

image_client = ImageClient(
    session=geti.session, workspace_id=geti.workspace_id, project=project
)
prediction_client = PredictionClient(
    session=geti.session, workspace_id=geti.workspace_id, project=project
)

To prepare for platform inference, we have to upload the image to the platform first.

In [None]:
geti_image = image_client.upload_image(numpy_image)
# Load the pixel data to visualize the image later on
geti_image.get_data(geti.session);

### Comparing inference times
Now, we can run inference locally and on the platform, and time both. We will set the prediction client to `ONLINE` mode, which means it will always generate a new prediction for the image, rather than returning cached predictions. You can also set the mode to `AUTO` (which will return cached predictions if available) and re-run the cell to see the difference.

In [None]:
from geti_sdk.data_models.enums import PredictionMode

prediction_client.mode = PredictionMode.ONLINE

# Get platform prediction, and measure time required
t_start_platform = time.time()
platform_prediction = prediction_client.get_image_prediction(geti_image)
t_elapsed_platform = time.time() - t_start_platform

# Get local prediction, and measure time required
t_start_local = time.time()
local_prediction = offline_deployment.infer(numpy_rgb)
t_elapsed_local = time.time() - t_start_local

print(f"Platform prediction completed in {t_elapsed_platform*1000:.1f} milliseconds")
print(f"Local prediction completed in {t_elapsed_local*1000:.1f} milliseconds")
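
A single timed call can be noisy, and the first call may include one-off start-up costs. For a steadier comparison, one option is to repeat each measurement a few times and report the median. The sketch below does this with the standard library only; `time_calls` and its `warmup` and `repeats` parameters are hypothetical names, and the workload is a stand-in so that the example runs without a deployment.

```python
import statistics
import time
from typing import Callable, List


def time_calls(
    fn: Callable[[], object], warmup: int = 1, repeats: int = 5
) -> List[float]:
    """Call `fn` several times and return the per-call durations in milliseconds."""
    for _ in range(warmup):
        fn()  # warm-up calls are executed but not timed
    durations_ms = []
    for _ in range(repeats):
        t_start = time.perf_counter()
        fn()
        durations_ms.append((time.perf_counter() - t_start) * 1000)
    return durations_ms


# Stand-in workload; in the notebook this could be
# `lambda: offline_deployment.infer(numpy_rgb)`
durations = time_calls(lambda: sum(range(10_000)), warmup=1, repeats=5)
print(f"median: {statistics.median(durations):.3f} ms over {len(durations)} runs")
```

`time.perf_counter` is used here instead of `time.time` because it is a monotonic, high-resolution clock intended for measuring short intervals.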

### Comparing inference results
The cell below will show the results from the platform prediction (top) and local prediction (bottom). The two predictions should be equal.

In [None]:
geti_image_rgb = cv2.cvtColor(geti_image.numpy, cv2.COLOR_BGR2RGB)
platform_result = visualizer.draw(geti_image_rgb, platform_prediction)
visualizer.show_in_notebook(platform_result)

local_result = visualizer.draw(numpy_rgb, local_prediction)
# Show the local result (not the earlier `result` from the XAI cell)
visualizer.show_in_notebook(local_result)
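
Besides the visual check, the two predictions can be compared programmatically. The sketch below assumes that each prediction has already been reduced to a list of `(label_name, probability)` pairs; how to extract those pairs from a `Prediction` object is not shown here. `predictions_match`, the `tolerance` parameter and the example values are made up for illustration. A small tolerance is used because the scores from the two inference paths may differ slightly.

```python
from typing import List, Tuple

LabelScore = Tuple[str, float]


def predictions_match(
    first: List[LabelScore], second: List[LabelScore], tolerance: float = 0.01
) -> bool:
    """Check that two predictions hold the same labels with similar scores."""
    if len(first) != len(second):
        return False
    # Sort so that the order in which labels are reported does not matter
    for (name_a, score_a), (name_b, score_b) in zip(sorted(first), sorted(second)):
        if name_a != name_b or abs(score_a - score_b) > tolerance:
            return False
    return True


# Made-up values, for illustration only
platform_labels = [("dog", 0.97), ("animal", 0.99)]
local_labels = [("animal", 0.985), ("dog", 0.972)]
print(predictions_match(platform_labels, local_labels))  # True
```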

### Cleaning up
To clean up, we will delete the `geti_image` from the project again.

In [None]:
image_client.delete_images([geti_image])