# Deploying a project for offline inference

> **WARNING**: The deployment-related features of this package are in an experimental
> stage, and subject to change.

In this notebook we'll show how to create a deployment for a project that can be used to run inference locally, using OpenVINO.

> **NOTE**: Before running this notebook, please make sure that you've installed the required packages for local deployment. If you haven't already done so, you can install them by taking the following steps:
> 1. Open a terminal
> 2. Navigate to the root folder of the sc-api-tools repo, i.e. one level up from the /notebooks folder
> 3. Activate the Python environment you are using to run sc-api-tools
> 4. Run the command `pip install -r requirements-deployment.txt` to install the required packages

In [None]:
# As usual, we'll connect to the platform first, using the credentials from the .env file.

from dotenv import dotenv_values
from sc_api_tools import SCRESTClient

env_variables = dotenv_values(dotenv_path=".env")

if not env_variables:
    print("Unable to load login details from .env file, please make sure the file exists at the root of the notebooks directory.")

client = SCRESTClient(
    host=env_variables.get('HOST'),
    username=env_variables.get('USERNAME'),
    password=env_variables.get('PASSWORD')
)
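
Because `env_variables.get(...)` returns `None` for a missing key, an incomplete .env file only surfaces as an error once the client tries to connect. The cell below is a minimal sketch of an up-front check. `find_missing_keys` is a hypothetical helper, not part of sc-api-tools, and it assumes that the three keys used above are the only ones required.

```python
from typing import List, Mapping, Optional, Sequence

# The keys read from the .env file in the cell above
REQUIRED_KEYS = ("HOST", "USERNAME", "PASSWORD")


def find_missing_keys(
    env: Mapping[str, Optional[str]], required: Sequence[str] = REQUIRED_KEYS
) -> List[str]:
    """Return the required keys that are absent or empty in `env`."""
    return [key for key in required if not env.get(key)]


# An in-memory mapping standing in for dotenv_values(dotenv_path=".env")
example_env = {"HOST": "https://my-server.example", "USERNAME": "user", "PASSWORD": ""}
missing = find_missing_keys(example_env)
print(missing)  # prints ['PASSWORD']
```

In the notebook, the same check could be applied to `env_variables` before creating the client.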

### Selecting a project for deployment
Let's list all projects in the workspace and select one for which to create a deployment.

In [None]:
from sc_api_tools.rest_managers import ProjectManager

project_manager = ProjectManager(session=client.session, workspace_id=client.workspace_id)
projects = project_manager.list_projects()

## Deploying the project
Let's go with the project we created in notebook [004](004_create_pipeline_project_from_dataset.ipynb): `COCO anomalous animal demo`. To create a deployment, we can use the `client.deploy_project` convenience method. This will download the active (OpenVINO) models for all tasks in the project to our local machine, so that we can use them to run inference locally.

> **NOTE**: Downloading the model data may take some time. Models for anomaly tasks in particular are on the order of 100 MB in size, so please be prepared to wait a bit.

In [None]:
PROJECT_NAME = 'COCO anomalous animal demo'

deployment = client.deploy_project(project_name=PROJECT_NAME)

### Preparing the models for inference
Now that the `deployment` is created and the models are saved to the local disk, we can load the models into memory to prepare them for inference. 

In [None]:
deployment.load_inference_models(device='CPU')

## Running inference on an image locally
Now, we can load an image as a numpy array (for instance using OpenCV) and use the `deployment.infer` method to generate a prediction for it.

In [None]:
import cv2
import time

numpy_image = cv2.imread('data/dogs.png')

# Convert to RGB channel order. All deployed models expect the image in RGB format
numpy_rgb = cv2.cvtColor(numpy_image, cv2.COLOR_BGR2RGB)

t_start = time.time()
prediction = deployment.infer(numpy_rgb)
t_elapsed = time.time() - t_start

print(f"Running local inference on image took {t_elapsed*1000:.2f} milliseconds")

### Inspecting the result
The `Prediction` object generated by `deployment.infer` has the same structure as the predictions sent by the platform. Let's have a closer look at it. We can do so in two ways:

1. Visualise it using the `show_image_with_annotation_scene` utility function
2. Inspect its properties via the `prediction.overview` property

Let's show it on the image first.

In [None]:
from sc_api_tools.utils import show_image_with_annotation_scene

show_image_with_annotation_scene(numpy_image, prediction)

In [None]:
print(prediction.overview)

## Saving the deployment
When we create the deployment, the model data is saved to a temporary folder. To keep the deployment for offline re-use later on, we save it: this copies the model data from the temporary folder to the path we specify. If we want to run inference locally again, we can simply reload the deployment from the saved folder, without having to connect to the platform again.

In [None]:
import os

PATH_TO_DEPLOYMENT_FOLDER = os.path.join('deployments', PROJECT_NAME)

deployment.save(path_to_folder=PATH_TO_DEPLOYMENT_FOLDER)
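
Project names are free text, so a name can contain characters that are awkward or invalid in a folder name on some operating systems. The sketch below shows one way to derive a safe folder name before calling `deployment.save`. `to_folder_name` is a hypothetical helper, and the set of replaced characters is an assumption rather than something the package requires.

```python
import os
import re


def to_folder_name(project_name: str) -> str:
    """Replace characters that are problematic in folder names with underscores."""
    # Characters that are reserved on Windows or that act as path separators
    cleaned = re.sub(r'[<>:"/\\|?*]', "_", project_name)
    # Remove surrounding whitespace and any trailing dots or spaces
    cleaned = cleaned.strip().rstrip(". ")
    # Fall back to a placeholder if nothing is left
    return cleaned or "unnamed_project"


print(to_folder_name("COCO anomalous animal demo"))
print(os.path.join("deployments", to_folder_name("cats/dogs: v2")))
```

The project name used in this notebook passes through unchanged, so the path above stays the same.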

## Loading a saved deployment
Loading a deployment that was previously saved to disk is easy and can be done without establishing a connection to the platform (or without even connecting to the internet, for that matter).

In [None]:
from sc_api_tools.deployment import Deployment

offline_deployment = Deployment.from_folder(PATH_TO_DEPLOYMENT_FOLDER)
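
If several deployments have been saved, it can help to see which folders are available before calling `Deployment.from_folder`. The cell below is a minimal sketch that uses only the standard library. `list_saved_deployments` is a hypothetical helper, and it assumes that every deployment was saved as a direct subfolder of one root folder, like the `deployments` folder used above. It does not check that a folder actually contains valid deployment data.

```python
from pathlib import Path
from typing import List


def list_saved_deployments(root: str = "deployments") -> List[str]:
    """Return the names of all subfolders of `root`, sorted alphabetically."""
    root_path = Path(root)
    if not root_path.is_dir():
        # Nothing has been saved yet, or the root folder is somewhere else
        return []
    return sorted(entry.name for entry in root_path.iterdir() if entry.is_dir())


print(list_saved_deployments())
```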

Again, to prepare the deployment for inference, make sure to load the models onto the CPU (or whichever device you want to use).

In [None]:
offline_deployment.load_inference_models(device='CPU')

That's all there is to it! The `offline_deployment` can now be used to run inference.

## Comparing local inference and inference on the platform
As a final step, we can make a comparison between the local inference results and the predictions sent back from the platform. We'll have a look at the time required for both methods, and compare the output.

In [None]:
from sc_api_tools.rest_managers import ImageManager, PredictionManager

project = project_manager.get_project_by_name(PROJECT_NAME)

image_manager = ImageManager(session=client.session, workspace_id=client.workspace_id, project=project)
prediction_manager = PredictionManager(session=client.session, workspace_id=client.workspace_id, project=project)

To prepare for platform inference, we have to upload the image to the platform first.

In [None]:
sc_image = image_manager.upload_image(numpy_image)
# Load the pixel data to visualize the image later on
sc_image.get_data(client.session)

### Comparing inference times
Now, we can run inference locally and on the platform, and time both. We'll set the prediction manager to `ONLINE` mode, which means it will always generate a new prediction for the image, rather than returning cached predictions. Alternatively, you can set the mode to `AUTO` (which will return cached predictions if available) and re-run the cell to see the difference.

In [None]:
from sc_api_tools.data_models.enums import PredictionMode

prediction_manager.mode = PredictionMode.ONLINE

# Get platform prediction, and measure time required
t_start_platform = time.time()
platform_prediction = prediction_manager.get_image_prediction(sc_image)
t_elapsed_platform = time.time() - t_start_platform

# Get local prediction, and measure time required
t_start_local = time.time()
local_prediction = offline_deployment.infer(numpy_rgb)
t_elapsed_local = time.time() - t_start_local

print(f'Platform prediction completed in {t_elapsed_platform*1000:.1f} milliseconds')
print(f'Local prediction completed in {t_elapsed_local*1000:.1f} milliseconds')
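
A single timed call is a noisy measurement: the first call can include one-off warm-up costs, and the time for a request to the platform varies from call to call. A minimal sketch of a more robust comparison is to repeat each call a few times and report the median. `median_runtime_ms` is a hypothetical helper, and the `time.sleep` call is only a stand-in for `offline_deployment.infer(numpy_rgb)` or `prediction_manager.get_image_prediction(sc_image)`.

```python
import statistics
import time
from typing import Callable


def median_runtime_ms(
    func: Callable[[], object], repeats: int = 5, warmup: int = 1
) -> float:
    """Call `func` repeatedly and return the median runtime in milliseconds."""
    # Warm-up calls are executed but not timed
    for _ in range(warmup):
        func()
    timings = []
    for _ in range(repeats):
        t_start = time.perf_counter()
        func()
        timings.append((time.perf_counter() - t_start) * 1000)
    return statistics.median(timings)


def fake_inference() -> None:
    """Stand-in for a real inference call."""
    time.sleep(0.01)


print(f"Median runtime: {median_runtime_ms(fake_inference):.1f} milliseconds")
```

In the notebook this could be called as `median_runtime_ms(lambda: offline_deployment.infer(numpy_rgb))`.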

### Comparing inference results
The cell below will first show a pop-up window with the predictions from the platform, and after that a window showing the local predictions. The two predictions should be equal.

In [None]:
show_image_with_annotation_scene(sc_image, platform_prediction)
show_image_with_annotation_scene(numpy_image, local_prediction)
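
Comparing the two windows by eye works for a single image. For a programmatic check, the sketch below compares two sets of detections with a small tolerance on the coordinates, in case the two sources do not agree to the last pixel. It assumes that each prediction has first been reduced to a list of `(label, x, y, width, height)` tuples. That representation and the `detections_match` helper are hypothetical; they are not the structure of the package's `Prediction` object.

```python
from typing import List, Tuple

# Hypothetical plain-data form of one detection: (label, x, y, width, height)
Detection = Tuple[str, float, float, float, float]


def detections_match(
    first: List[Detection], second: List[Detection], tolerance: float = 2.0
) -> bool:
    """Greedily pair detections with equal labels and boxes within `tolerance` pixels."""
    if len(first) != len(second):
        return False
    unmatched = list(second)
    for label, *box in first:
        for candidate in unmatched:
            same_label = candidate[0] == label
            close = all(abs(a - b) <= tolerance for a, b in zip(box, candidate[1:]))
            if same_label and close:
                # Each detection in `second` may be paired only once
                unmatched.remove(candidate)
                break
        else:
            # No partner found for this detection
            return False
    return True


platform_boxes = [("dog", 10.0, 20.0, 100.0, 80.0), ("cat", 200.0, 40.0, 60.0, 50.0)]
local_boxes = [("cat", 201.0, 40.0, 60.0, 49.0), ("dog", 10.0, 21.0, 100.0, 80.0)]
print(detections_match(platform_boxes, local_boxes))  # prints True
```

The matching is greedy and order-independent, which is enough for a quick sanity check but not a full evaluation metric.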

### Cleaning up
To clean up, we'll delete the `sc_image` from the project again.

In [None]:
image_manager.delete_images([sc_image])