# Post inference hooks for model monitoring
In this notebook we will look at how to set up post inference hooks for your inference models. The Geti SDK provides several basic triggers and actions that can be used to construct pipelines for tasks such as data collection, alerting, or any other action that needs to take place based on inference results.

These pipelines are referred to as `post inference hooks` and can be added to any `Deployment` for any project. In this notebook we will create a post inference hook that implements the following:

For every inferred frame or image, check if the prediction contains any objects. If not, send the image to the Geti server. The image will be stored in a new dataset called `Inferred images`, within the original project.

In [None]:
# As usual we will connect to the platform first, using the server details from the .env file

from geti_sdk import Geti
from geti_sdk.utils import get_server_details_from_env

geti_server_configuration = get_server_details_from_env()

geti = Geti(server_config=geti_server_configuration)

## Selecting a project

We'll use the `COCO animal detection demo` project that we created in [notebook 002](002_create_project_from_dataset.ipynb).

In [None]:
PROJECT_NAME = "COCO animal detection demo"
project = geti.get_project(PROJECT_NAME)

## Create deployment for the project

In [None]:
deployment = geti.deploy_project(PROJECT_NAME)

## Checking deployment output
Let's quickly load the inference models and check the inference output on a sample image.

In [None]:
deployment.load_inference_models()

In [None]:
import cv2

from geti_sdk.demos import EXAMPLE_IMAGE_PATH
from geti_sdk.utils import show_image_with_annotation_scene

numpy_image = cv2.imread(EXAMPLE_IMAGE_PATH)
numpy_rgb = cv2.cvtColor(numpy_image, cv2.COLOR_BGR2RGB)

prediction = deployment.infer(numpy_rgb)

show_image_with_annotation_scene(numpy_rgb, prediction, show_in_notebook=True);

## Configuring a post inference hook to send image data to Geti

With the deployment all set up and ready, let's go ahead and add a post inference hook! We will configure it to behave as follows:

For each inferred image or frame:

- Check if the prediction contains any objects
- If and only if the prediction is empty, then:
  - Send the image to the Geti project, to a dedicated dataset named `Inferred images`

This behaviour can be separated into two parts: a **Trigger** and an **Action**. The first part, in which we check if the prediction is empty, is the Trigger. If the trigger activates, the Action is carried out: sending the data to the Intel Geti server.

The reasoning here is that an empty prediction may mean the model failed to find an object, so we would like to look at the image ourselves, possibly annotate it, and add it to the training set so that it is included in the next training round. Of course, many other triggers can be defined. For example, the `LabelTrigger` will activate when a certain label (for instance, `dog`) is present in the prediction, and the `ConfidenceTrigger` will activate when the probability for any of the predictions is below a certain threshold.

The cell below shows how to define the hook outlined above. 

In [None]:
from geti_sdk.post_inference_hooks import (
    EmptyLabelTrigger,
    GetiDataCollection,
    PostInferenceHook,
)

trigger = (
    EmptyLabelTrigger()
)  # Trigger will activate whenever a prediction does not contain any objects
action = GetiDataCollection(  # Action will send data to a new `Inferred images` dataset in the Geti project
    session=geti.session,
    workspace_id=geti.workspace_id,
    project=project,
    dataset="Inferred images",
    log_level="info",
)
hook = PostInferenceHook(
    trigger=trigger, action=action
)  # The hook attaches the action to the trigger
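
To make the trigger/action split concrete, here is a minimal plain-Python sketch of the idea. It does not use the Geti SDK and is not how the SDK implements hooks; `ToyHook`, `is_empty` and `collect` are hypothetical names chosen for illustration, and predictions are modelled as simple lists of label names.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ToyHook:
    """Hypothetical stand-in for a post inference hook (not the geti_sdk class)."""

    trigger: Callable[[list], bool]  # decides whether the action should run
    action: Callable[[str, list], None]  # what to do when the trigger activates

    def run(self, image: str, prediction: list) -> None:
        # The action only runs when the trigger activates for this prediction
        if self.trigger(prediction):
            self.action(image, prediction)


def is_empty(prediction: list) -> bool:
    """Trigger: activates when the prediction contains no objects."""
    return len(prediction) == 0


collected: List[str] = []


def collect(image: str, prediction: list) -> None:
    """Action: remember the image name instead of uploading it anywhere."""
    collected.append(image)


toy_hook = ToyHook(trigger=is_empty, action=collect)
toy_hook.run("frame_1.jpg", ["dog"])  # prediction has an object: nothing happens
toy_hook.run("frame_2.jpg", [])  # empty prediction: the action runs
print(collected)  # → ['frame_2.jpg']
```

Only the frame with the empty prediction is collected, which mirrors the behaviour configured for the real hook above.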

Now, we just need to add the hook to the deployment.

In [None]:
deployment.add_post_inference_hook(hook)

Once added, whenever we run inference on an image or video frame, the hook will execute automatically.

In [None]:
prediction = deployment.infer(numpy_rgb)

## Benchmarking inference hook runtime overhead
Naturally, performing additional actions on inference may introduce computational overhead, which might reduce the inference throughput. To measure the extent of the effect, let's do some benchmarking.

First, we measure the time it takes to do 100 inference iterations for the deployment, without any post inference hooks. The cell below removes the hook we added previously.


In [None]:
deployment.clear_inference_hooks()

Now, we run inference and measure the execution time.

In [None]:
import time

from tqdm import tqdm

t_start = time.time()
for i in tqdm(range(100)):
    deployment.infer(numpy_rgb)
t_elapsed = time.time() - t_start
print(f"100 inference iterations without hooks completed in {t_elapsed:.2f} seconds.")
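
Since the same timing loop is repeated for each hook configuration, it could be wrapped in a small helper. The sketch below is one possible way to do that; `time_iterations` is a hypothetical helper name, it uses `time.perf_counter` (a monotonic clock intended for measuring intervals), and a dummy workload stands in for `deployment.infer(numpy_rgb)` so that the snippet runs on its own.

```python
import time
from typing import Callable


def time_iterations(fn: Callable[[], object], n_iterations: int = 100) -> float:
    """Call `fn` n_iterations times and return the elapsed wall-clock seconds."""
    t_start = time.perf_counter()
    for _ in range(n_iterations):
        fn()
    return time.perf_counter() - t_start


# Stand-in workload; in this notebook it would be `lambda: deployment.infer(numpy_rgb)`
elapsed = time_iterations(lambda: sum(range(1000)), n_iterations=100)
print(f"100 iterations completed in {elapsed:.4f} seconds.")
```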

Next, we add the Geti hook again, using the same trigger and action as before. Note that the log level for the action is now set to `log_level="debug"`, to avoid spamming the log with output.

Also note the additional parameter `max_threads=5` for the `PostInferenceHook` constructor: This parameter determines whether the hook will run in the main thread (`max_threads=0`) or whether it will be executed in parallel (`max_threads>=1`). For the benchmark, we will use parallel execution with 5 threads.
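
To illustrate why parallel execution helps for I/O-bound actions, the sketch below compares a loop that waits for a slow action on every frame with a loop that hands the action to a thread pool. It uses only the Python standard library and is an illustration of the general technique, not of the SDK's internals; `slow_action` is a hypothetical stand-in for an upload.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_action(frame_id: int) -> int:
    """Stand-in for an I/O-bound action such as an upload."""
    time.sleep(0.05)
    return frame_id


n_frames = 10

# Main-thread execution: every frame waits for the action to finish
t_start = time.perf_counter()
for frame_id in range(n_frames):
    slow_action(frame_id)
t_blocking = time.perf_counter() - t_start

# Parallel execution: actions are submitted to a pool and the loop continues immediately
t_start = time.perf_counter()
with ThreadPoolExecutor(max_workers=5) as pool:
    futures = [pool.submit(slow_action, frame_id) for frame_id in range(n_frames)]
    t_loop = time.perf_counter() - t_start  # time the submitting loop was occupied
    results = [f.result() for f in futures]  # wait for pending actions to finish

print(f"Blocking loop: {t_blocking:.2f} s, loop with thread pool: {t_loop:.2f} s")
```

The submitting loop returns almost immediately, while the blocking loop pays the full cost of every action. The actions still take time to complete in the background.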

In [None]:
trigger = EmptyLabelTrigger()
action = GetiDataCollection(
    session=geti.session,
    workspace_id=geti.workspace_id,
    project=project,
    dataset="Inferred images",
    log_level="debug",
)
geti_hook = PostInferenceHook(trigger=trigger, action=action, max_threads=5)

deployment.clear_inference_hooks()
deployment.add_post_inference_hook(geti_hook)

Let's run the benchmark again.

In [None]:
t_start = time.time()
for i in tqdm(range(100)):
    deployment.infer(numpy_rgb)
t_elapsed = time.time() - t_start
print(
    f"100 inference iterations with Geti I/O hook completed in {t_elapsed:.2f} seconds."
)

If you now check the project in the Geti UI, you should see that the `Inferred images` dataset contains 101 (duplicate) images: one from the single inference call we made right after adding the hook, and 100 from the benchmark. These have all been collected by the post inference hook.

## Adding another hook

We can add as many hooks as we like, each with different triggers and actions. In the next cell, we'll create a hook that:

- Checks if the prediction contains fewer than 2 objects. In the count, we only consider objects that have the label `dog`.
- If so, then:
  - Saves the image, the prediction and the score that triggered the action to a folder on disk. In this case, the score is the number of predicted objects

Let's suppose that we are running inference on a video stream. In that case, we might get many sequential frames with fewer than two dogs, and frames that appear shortly after one another may look very similar. To avoid filling up our data collection folder with such near-duplicate frames, we can choose to limit the rate at which the action is allowed to run. This can be configured in the `PostInferenceHook` constructor, using the `limit_action_rate` and `max_frames_per_second` parameters.
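
The effect of such a rate limit can be sketched in a few lines of plain Python. This is only an illustration of the general idea under simple assumptions, not the SDK's implementation: `RateLimiter` and `allow` are hypothetical names, and the frame timestamps are simulated rather than taken from a real stream.

```python
class RateLimiter:
    """Hypothetical helper: allows at most `max_per_second` actions per second."""

    def __init__(self, max_per_second: float):
        self.min_interval = 1.0 / max_per_second
        self.last_allowed = None  # timestamp of the last action that was allowed

    def allow(self, now: float) -> bool:
        # Allow the first action, and any action at least `min_interval` after the last
        if self.last_allowed is None or now - self.last_allowed >= self.min_interval:
            self.last_allowed = now
            return True
        return False


limiter = RateLimiter(max_per_second=1)

# Simulate a 5 second stream at 10 frames per second in which every frame triggers
timestamps = [i / 10 for i in range(50)]
saved = [t for t in timestamps if limiter.allow(t)]
print(saved)  # → [0.0, 1.0, 2.0, 3.0, 4.0]
```

Out of 50 triggering frames, only 5 are kept: one per second of the simulated stream.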

The cell below shows how to create this hook, and limit the rate of saving frames to a maximum of 1 frame per second.

In [None]:
from geti_sdk.post_inference_hooks import FileSystemDataCollection, ObjectCountTrigger

counting_trigger = ObjectCountTrigger(
    threshold=2, label_names=["dog"], mode="lower"
)  # Trigger will activate whenever a prediction contains fewer than two objects labelled 'dog'

file_system_action = FileSystemDataCollection(
    target_folder="hook_data",
    save_predictions=True,
    save_scores=True,
    log_level="debug",
)  # Action will store the image, prediction data and trigger score to the `hook_data` folder on disk

file_hook = PostInferenceHook(
    trigger=counting_trigger,
    action=file_system_action,
    max_threads=5,
    limit_action_rate=True,
    max_frames_per_second=1,
)

deployment.add_post_inference_hook(file_hook)

Now that the hook is created and added to the deployment, we can run the inference benchmark again.

In [None]:
t_start = time.time()
for i in tqdm(range(100)):
    deployment.infer(numpy_rgb)
t_elapsed = time.time() - t_start
print(
    f"100 inference iterations with 2 I/O hooks completed in {t_elapsed:.2f} seconds."
)

You should now see a folder `hook_data` in your working directory. This folder contains three subfolders: `images`, `predictions` and `scores`. The contents of these folders are the following:
- `hook_data/images` contains the image files which triggered the hook
- `hook_data/predictions` contains the prediction data in .json format
- `hook_data/scores` contains txt files with the score for each image that caused the hook to trigger

The file names are consistent across the three folders, i.e. the prediction for a certain image can be found in the .json file with the same name, in the `predictions` folder.

If you take a look in those folders now, you'll find that they are populated with images, predictions and score files. The number of images in the folder should correspond to a rate of 1 image/second multiplied by the execution time of the benchmark. For example, if it took 20 seconds to run the 100 inference iterations, about 20 images should have been saved to disk.
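
One way to check this without opening the folders by hand is sketched below. It relies only on the folder layout described above (`images`, `predictions` and `scores` subfolders with matching file names); `summarize_hook_data` is a hypothetical helper, not part of the SDK, and it compares file names without their extension.

```python
from pathlib import Path


def summarize_hook_data(target_folder: str) -> dict:
    """Count files per subfolder and check that file names match across them."""
    root = Path(target_folder)
    stems = {}
    for name in ("images", "predictions", "scores"):
        folder = root / name
        # Collect file names without extension; a missing folder counts as empty
        stems[name] = {p.stem for p in folder.iterdir()} if folder.is_dir() else set()
    return {
        "counts": {name: len(s) for name, s in stems.items()},
        "names_match": stems["images"] == stems["predictions"] == stems["scores"],
    }


print(summarize_hook_data("hook_data"))
```

The `counts` entry can then be compared against the benchmark duration printed earlier.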