From dcab64f3f5ac1dccf4c6573e5d78cc30ddf610ca Mon Sep 17 00:00:00 2001
From: Giovanni Giacometti
Date: Mon, 14 Oct 2024 16:47:59 +0200
Subject: [PATCH 01/22] Start monitoring chapter

---
 md-docs/user_guide/monitoring.md | 11 +++++++++++
 mkdocs.yml                       |  1 +
 2 files changed, 12 insertions(+)
 create mode 100644 md-docs/user_guide/monitoring.md

diff --git a/md-docs/user_guide/monitoring.md b/md-docs/user_guide/monitoring.md
new file mode 100644
index 0000000..c062286
--- /dev/null
+++ b/md-docs/user_guide/monitoring.md
@@ -0,0 +1,11 @@
+# Monitoring
+
+The monitoring module is a key feature of the ML cube Platform.
+It enables continuous tracking of your AI models performance over time, helping to identify potential issues.
+Additionally, it allows the monitoring of production data to preemptively detect distribution changes, ensuring
+that the model continues to perform as expected and aligns with business requirements.
+
+## Monitoring Targets and Monitoring Metrics
+
+Before delving into the details of how monitoring is performed, it is necessary to
+introduce the concepts of *Monitoring Targets* and *Monitoring Metrics*.
diff --git a/mkdocs.yml b/mkdocs.yml
index ad1fa9d..a837ddc 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -108,6 +108,7 @@ nav:
     - user_guide/index.md
     - user_guide/company.md
     - user_guide/project.md
+    - user_guide/monitoring.md
 
   - Modules:
     - user_guide/modules/index.md

From 2a99b41777ae636bcbbd29b99a21221b26a473e9 Mon Sep 17 00:00:00 2001
From: Giovanni Giacometti
Date: Mon, 14 Oct 2024 17:43:28 +0200
Subject: [PATCH 02/22] Why we need monitoring, start target and metrics

---
 md-docs/user_guide/monitoring.md | 27 ++++++++++++++++++++++++---
 md-docs/user_guide/task.md       |  1 +
 2 files changed, 25 insertions(+), 3 deletions(-)
 create mode 100644 md-docs/user_guide/task.md

diff --git a/md-docs/user_guide/monitoring.md b/md-docs/user_guide/monitoring.md
index c062286..8f7adea 100644
--- a/md-docs/user_guide/monitoring.md
+++ b/md-docs/user_guide/monitoring.md
@@ -5,7 +5,28 @@ It enables continuous tracking of your AI models performance over time, helping
 Additionally, it allows the monitoring of production data to preemptively detect distribution changes, ensuring
 that the model continues to perform as expected and aligns with business requirements.
 
-## Monitoring Targets and Monitoring Metrics
+## Why do we need Monitoring?
 
-Before delving into the details of how monitoring is performed, it is necessary to
-introduce the concepts of *Monitoring Targets* and *Monitoring Metrics*.
+Machine Learning algorithms are based on the assumption that the distribution of the data used for training is the same as the one from which
+production data are drawn. This assumption never holds in practice, as the real-world is characterized by dynamic and ever-changing conditions.
+These distributional changes, if not addressed properly, can cause a drop in the model's performance, leading to poor estimates or predictions, which
+in turn can have a negative impact on the business.
+
+Monitoring, more commonly known as __Drift Detection__ in the literature, refers to the process of continuously tracking the performance of a model
+and the distribution of the data it is operating on.
+Simply put, monitoring employs statistical techniques to compare the distribution characterizing the reference data (for instance, those used for training)
+with the one characterizing the production data. 
If a significant difference is detected, the system raises an alarm, signaling that the monitored entity
+is drifting away from the expected behavior and that corrective actions should be taken.
+
+The MLCube Platform performs different types of monitoring, which will be explained in the following sections.
+
+## Targets and Metrics
+
+After going through the reasons why monitoring is so important in modern AI systems, and explaining how monitoring is performed in the ML cube Platform,
+we can introduce the concepts of Monitoring Targets and Monitoring Metrics. They both represent quantities that the MLCube Platform monitors, but they differ in their nature.
+
+### Monitoring Targets
+
+A Monitoring Target is a relevant entity involved in a [Task].
+
+[Task]: task.md
\ No newline at end of file
diff --git a/md-docs/user_guide/task.md b/md-docs/user_guide/task.md
new file mode 100644
index 0000000..14aa2b8
--- /dev/null
+++ b/md-docs/user_guide/task.md
@@ -0,0 +1 @@
+# Task
\ No newline at end of file

From de1256cdc1adde73dd00a7314b2e43414498d4ee Mon Sep 17 00:00:00 2001
From: Giovanni Giacometti
Date: Wed, 16 Oct 2024 17:30:26 +0200
Subject: [PATCH 03/22] Monitoring targets

---
 md-docs/user_guide/monitoring.md | 70 +++++++++++++++++++++++++++-----
 1 file changed, 60 insertions(+), 10 deletions(-)

diff --git a/md-docs/user_guide/monitoring.md b/md-docs/user_guide/monitoring.md
index 8f7adea..81ef4d7 100644
--- a/md-docs/user_guide/monitoring.md
+++ b/md-docs/user_guide/monitoring.md
@@ -8,25 +8,75 @@ that the model continues to perform as expected and aligns with business require
 ## Why do we need Monitoring?
 
 Machine Learning algorithms are based on the assumption that the distribution of the data used for training is the same as the one from which
-production data are drawn. This assumption never holds in practice, as the real-world is characterized by dynamic and ever-changing conditions.
+production data are drawn. This assumption never holds in practice, as the real world is characterized by dynamic and ever-changing conditions.
 These distributional changes, if not addressed properly, can cause a drop in the model's performance, leading to poor estimates or predictions, which
 in turn can have a negative impact on the business.
 
-Monitoring, more commonly known as __Drift Detection__ in the literature, refers to the process of continuously tracking the performance of a model
+Monitoring, also known as __Drift Detection__ in the literature, refers to the process of continuously tracking the performance of a model
 and the distribution of the data it is operating on.
-Simply put, monitoring employs statistical techniques to compare the distribution characterizing the reference data (for instance, those used for training)
-with the one characterizing the production data. If a significant difference is detected, the system raises an alarm, signaling that the monitored entity
+
+## How does the MLCube Platform perform Monitoring?
+
+The MLCube platform performs monitoring by employing statistical techniques to compare a certain reference (for instance, data used for training or the performance of a model
+on the test set) to incoming production data. If a significant difference is detected, an alarm is raised, signaling that the monitored entity
 is drifting away from the expected behavior and that corrective actions should be taken.
 
-The MLCube Platform performs different types of monitoring, which will be explained in the following sections.
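+
+!!! example
+    To give a concrete idea of the kind of comparison involved, the following snippet sketches a simple two-sample test
+    between a reference sample and a production sample. It is only a toy illustration with made-up data, not the actual
+    detection algorithm used by the ML cube Platform.
+
+    ```py
+    import numpy as np
+    from scipy.stats import ks_2samp
+
+    rng = np.random.default_rng(seed=0)
+
+    # Reference sample, e.g. a feature observed at training time
+    reference = rng.normal(loc=0.0, scale=1.0, size=1000)
+
+    # Production sample drawn from a shifted distribution
+    production = rng.normal(loc=0.7, scale=1.0, size=1000)
+
+    # Two-sample Kolmogorov-Smirnov test: a small p-value signals that
+    # the two samples likely come from different distributions
+    statistic, p_value = ks_2samp(reference, production)
+
+    if p_value < 0.01:
+        print(f'Drift detected: statistic={statistic:.3f}, p={p_value:.3g}')
+    ```
+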
+In more practical terms, the [set_model_reference] method can be used to specify the time period where the reference of a given model should be placed. As a consequence, +all algorithms associated with the specified model (not just those monitoring the performance, but also those operating on the data used by the model) will +be initialized on the specified reference. Of course, you should send the data you want to use as a reference to the platform before calling this method, for instance using the +[add_historical_data] method. + +After setting the reference, the [add_production_data] method can be used to send production data to the platform. This data will be analyzed by the monitoring algorithms +and, if a significant difference is detected, an alarm will be raised, in the form of a [DetectionEvent]. We will go into more detail about detection events and +how you can set up automatic actions upon their reception in the [Detection Event] section. -## Targets and Metrics +The MLCube Platform monitors different entities, which will be explored in the following section. -After going through the reasons why monitoring is so important in modern AI systems, and explaining how monitoring is performed in the ML cube Platform, +### Targets and Metrics + +After going through the reasons why monitoring is so important in modern AI systems and explaining how monitoring is performed in the ML cube Platform, we can introduce the concepts of Monitoring Targets and Monitoring Metrics. They both represent quantities that the MLCube Platform monitors, but they differ in their nature. -### Monitoring Targets +#### Monitoring Targets + +A Monitoring Target is a relevant entity involved in a [Task]. They represent the main quantities monitored by the platform, those whose +variation can have a significant impact on the AI task success. + +The MLCube platform supports the following monitoring targets: + +- `INPUT`: the input distribution, $P(X)$. +- `CONCEPT`: the joint distribution of input and target, $P(X, Y)$. +- `PREDICTION`: the prediction of the model, $P(\hat{Y})$. +- `INPUT_PREDICTION`: the joint distribution of input and prediction, $P(X, \hat{Y})$. +- `ERROR`: the error of the model, whose computation depends on the task type. +- `USER_INPUT`: the input provided by the user, usually in the form of a query. This target is only available in tasks of type RAG. +- `USER_INPUT_RETRIEVED_CONTEXT`: the similarity between the user input and the context retrieved by the RAG system. This target is only available in tasks of type RAG. +- `USER_INPUT_MODEL_OUTPUT`: the similarity between the user input and the response of the Large Language Model. This target is only available in tasks of type RAG. +- `MODEL_OUTPUT_RETRIEVED_CONTEXT`: the similarity between the response of the Large Language Model and the context retrieved by the RAG system. This target is only available in tasks of type RAG. + +As mentioned, some targets are available only for specific task types. The following table shows all the available monitoring targets in relation with the task type. +Notice that while some targets were specifically designed for a certain task type, others are more general and can be used in different contexts. +Nonetheless, the platform might not support yet all possible combinations. 
+ +| | **REGRESSION** | **CLASSIFICATION_BINARY** | **CLASSIFICATION_MULTICLASS** | **CLASSIFICATION_MULTILABEL** | **OBJECT_DETECTION** | **RAG** | +|--------------------------------|:------------------:|:-------------------------:|:-----------------------------:|:-----------------------------:|:--------------------:|:------------------:| +| INPUT | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | | +| CONCEPT | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | | | +| PREDICTION | :white_check_mark: | :white_check_mark: | :white_check_mark: | | | | +| INPUT_PREDICTION | :white_check_mark: | :white_check_mark: | :white_check_mark: | | | | +| ERROR | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | | | +| USER_INPUT | | | | | | :white_check_mark: | +| USER_INPUT_RETRIEVED_CONTEXT | | | | | | :white_check_mark: | +| USER_INPUT_MODEL_OUTPUT | | | | | | :white_check_mark: | +| MODEL_OUTPUT_RETRIEVED_CONTEXT | | | | | | :white_check_mark: | + + +#### Monitoring Metrics -A Monitoring Target is a relevant entity involved in a [Task]. +A Monitoring Metric is a generic quantity that can be computed on a Monitoring Target. Its purpose -[Task]: task.md \ No newline at end of file +[Task]: task.md +[set_model_reference]: ../../api/python/client#set_model_reference +[add_production_data]: ../../api/python/client#add_production_data +[add_historical_data]: ../../api/python/client#add_historical_data +[DetectionEvent]: ../../api/python/models#detectionevent From 10ccaa2330694271cf0a8ae21732b66166e58b02 Mon Sep 17 00:00:00 2001 From: Giovanni Giacometti Date: Thu, 17 Oct 2024 17:43:32 +0200 Subject: [PATCH 04/22] Finished base monitoring page + detection event and explainability detection event rule page is already there --- md-docs/user_guide/detection_event.md | 7 ++++++ md-docs/user_guide/detection_event_rules.md | 15 +++++++----- md-docs/user_guide/drift_explainability.md | 7 ++++++ md-docs/user_guide/monitoring.md | 26 ++++++++++++++++++--- 4 files changed, 46 insertions(+), 9 deletions(-) create mode 100644 md-docs/user_guide/detection_event.md create mode 100644 md-docs/user_guide/drift_explainability.md diff --git a/md-docs/user_guide/detection_event.md b/md-docs/user_guide/detection_event.md new file mode 100644 index 0000000..dcdb457 --- /dev/null +++ b/md-docs/user_guide/detection_event.md @@ -0,0 +1,7 @@ +# Detection Event + +[Task]: task.md +[set_model_reference]: ../../api/python/client#set_model_reference +[add_production_data]: ../../api/python/client#add_production_data +[add_historical_data]: ../../api/python/client#add_historical_data +[DetectionEvent]: ../../api/python/models#detectionevent diff --git a/md-docs/user_guide/detection_event_rules.md b/md-docs/user_guide/detection_event_rules.md index 93a2de8..52c84e7 100644 --- a/md-docs/user_guide/detection_event_rules.md +++ b/md-docs/user_guide/detection_event_rules.md @@ -1,16 +1,19 @@ # Detection Event Rules -This section provides an overview of how you can setup automation rules after a detection event occurs in order to receive notifications or to start retraining. +This section provides an overview of how you can set up automation rules after a detection event occurs in +order to receive notifications or to start retraining. + +When a detection event occurs, the platform scans all the detection event rules you specified. +If a rule matches the event, it will be triggered. 
-When a detection event occurs, the platform evaluates your set detection event rules. -If a rule matches the event, the specified actions will be triggered. These rules are specific to a task and require the following parameters for configuration: -- `name`: A descriptive label for your rule, helping you understand its purpose quickly. -- `task_id`: The unique identifier of the task to which the rule is applicable. +- `name`: A descriptive label for your rule. +- `task_id`: The unique identifier of the task to which the rule belongs. - `severity`: Indicates the severity level of the event - it can be `HIGH`, `MEDIUM`, or `LOW`. - `detection_event_type`: Currently, only `DRIFT` events are available for detection. -- `monitoring_target`: Specifies what is being monitored, which can be `MODEL`, `INPUT`, or `CONCEPT`. If the value is `MODEL`, you need to provide a corresponding `model_name`. + - `monitoring_target`: Specifies what is being monitored, which can be `MODEL`, `INPUT`, or `CONCEPT`. + If the value is `MODEL`, you need to provide a corresponding `model_name`. - `actions`: A sequential list of actions to be executed when the rule is triggered. ## Supported Actions diff --git a/md-docs/user_guide/drift_explainability.md b/md-docs/user_guide/drift_explainability.md new file mode 100644 index 0000000..e451922 --- /dev/null +++ b/md-docs/user_guide/drift_explainability.md @@ -0,0 +1,7 @@ +# Drift Explainability + +[Task]: task.md +[set_model_reference]: ../../api/python/client#set_model_reference +[add_production_data]: ../../api/python/client#add_production_data +[add_historical_data]: ../../api/python/client#add_historical_data +[DetectionEvent]: ../../api/python/models#detectionevent diff --git a/md-docs/user_guide/monitoring.md b/md-docs/user_guide/monitoring.md index 81ef4d7..e4df1f9 100644 --- a/md-docs/user_guide/monitoring.md +++ b/md-docs/user_guide/monitoring.md @@ -56,9 +56,9 @@ The MLCube platform supports the following monitoring targets: As mentioned, some targets are available only for specific task types. The following table shows all the available monitoring targets in relation with the task type. Notice that while some targets were specifically designed for a certain task type, others are more general and can be used in different contexts. -Nonetheless, the platform might not support yet all possible combinations. +Nonetheless, the platform might not support yet all possible combinations. The table will be updated as new targets are added to the product. -| | **REGRESSION** | **CLASSIFICATION_BINARY** | **CLASSIFICATION_MULTICLASS** | **CLASSIFICATION_MULTILABEL** | **OBJECT_DETECTION** | **RAG** | +| **Monitoring Target** | **REGRESSION** | **CLASSIFICATION_BINARY** | **CLASSIFICATION_MULTICLASS** | **CLASSIFICATION_MULTILABEL** | **OBJECT_DETECTION** | **RAG** | |--------------------------------|:------------------:|:-------------------------:|:-----------------------------:|:-----------------------------:|:--------------------:|:------------------:| | INPUT | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | | | CONCEPT | :white_check_mark: | :white_check_mark: | :white_check_mark: | :white_check_mark: | | | @@ -73,7 +73,27 @@ Nonetheless, the platform might not support yet all possible combinations. #### Monitoring Metrics -A Monitoring Metric is a generic quantity that can be computed on a Monitoring Target. Its purpose +A Monitoring Metric is a generic quantity that can be computed on a Monitoring Target. 
They enable the monitoring of specific
+aspects of a target, which might help in identifying the root cause of a drift, as well as defining the corrective actions to be taken.
+
+The following table displays the monitoring metrics supported, along with their monitoring target and the conditions
+under which they are actually monitored. Notice that also this table is subject to changes, as new metrics are added.
+
+| **Monitoring Metric** | **Monitoring Target** | **Conditions** |
+|:---------------------:|:------------------------------------------------:|:--------------------------------------:|
+| TEXT_TOXICITY | INPUT, USER_INPUT, PREDICTION | When the data structure is text |
+| TEXT_EMOTION | INPUT, USER_INPUT | When the data structure is text |
+| TEXT_SENTIMENT | INPUT, USER_INPUT | When the data structure is text |
+| TEXT_LENGTH | INPUT, USER_INPUT, RETRIEVED_CONTEXT, PREDICTION | When the data structure is text |
+| MODEL_PERPLEXITY | PREDICTION | When the task type is RAG |
+| IMAGE_BRIGHTNESS | INPUT | When the data structure is image |
+| IMAGE_CONTRAST | INPUT | When the data structure is image |
+| BBOXES_AREA | PREDICTION | When the task type is Object Detection |
+| BBOXES_QUANTITY | PREDICTION | When the task type is Object Detection |
+
+
+
 
 [Task]: task.md
 [set_model_reference]: ../../api/python/client#set_model_reference
 [add_production_data]: ../../api/python/client#add_production_data
 [add_historical_data]: ../../api/python/client#add_historical_data
 [DetectionEvent]: ../../api/python/models#detectionevent

From 54f4b5e84115679929d4b83adf59a6f3790eb9b2 Mon Sep 17 00:00:00 2001
From: Giovanni Giacometti
Date: Fri, 18 Oct 2024 09:26:02 +0200
Subject: [PATCH 05/22] Add metric description

---
 md-docs/user_guide/monitoring.md | 26 +++++++++++---------------
 mkdocs.yml                       |  5 ++++-
 2 files changed, 15 insertions(+), 16 deletions(-)

diff --git a/md-docs/user_guide/monitoring.md b/md-docs/user_guide/monitoring.md
index e4df1f9..bff22c5 100644
--- a/md-docs/user_guide/monitoring.md
+++ b/md-docs/user_guide/monitoring.md
@@ -70,7 +70,6 @@ Nonetheless, the platform might not support yet all possible combinations. The t
 | USER_INPUT_MODEL_OUTPUT        |                    |                           |                               |                               |                      | :white_check_mark: |
 | MODEL_OUTPUT_RETRIEVED_CONTEXT |                    |                           |                               |                               |                      | :white_check_mark: |
 
-
 #### Monitoring Metrics
 
 A Monitoring Metric is a generic quantity that can be computed on a Monitoring Target. They enable the monitoring of specific
@@ -79,20 +78,17 @@ aspects of a target, which might help in identifying the root cause of a drift,
 
 The following table displays the monitoring metrics supported, along with their monitoring target and the conditions
 under which they are actually monitored. Notice that also this table is subject to changes, as new metrics are added.
 
-| **Monitoring Metric** | **Monitoring Target** | **Conditions** | -|:---------------------:|:------------------------------------------------:|:--------------------------------------:| -| TEXT_TOXICITY | INPUT, USER_INPUT, PREDICTION | When the data structure is text | -| TEXT_EMOTION | INPUT, USER_INPUT | When the data structure is text | -| TEXT_SENTIMENT | INPUT, USER_INPUT | When the data structure is text | -| TEXT_LENGTH | INPUT, USER_INPUT, RETRIEVED_CONTEXT, PREDICTION | When the data structure is text | -| MODEL_PERPLEXITY | PREDICTION | When the task type is RAG | -| IMAGE_BRIGHTNESS | INPUT | When the data structure is image | -| IMAGE_CONTRAST | INPUT | When the data structure is image | -| BBOXES_AREA | PREDICTION | When the task type is Object Detection | -| BBOXES_QUANTITY | PREDICTION | When the task type is Object Detection | - - - +| **Monitoring Metric** | Description | **Monitoring Target** | **Conditions** | +|:---------------------:|----------------------------------------------------------|:------------------------------------------------:|:--------------------------------------:| +| TEXT_TOXICITY | The toxicity of the text | INPUT, USER_INPUT, PREDICTION | When the data structure is text | +| TEXT_EMOTION | The emotion of the text | INPUT, USER_INPUT | When the data structure is text | +| TEXT_SENTIMENT | The sentiment of the text | INPUT, USER_INPUT | When the data structure is text | +| TEXT_LENGTH | The length of the text | INPUT, USER_INPUT, RETRIEVED_CONTEXT, PREDICTION | When the data structure is text | +| MODEL_PERPLEXITY | The uncertainty of the LLM | PREDICTION | When the task type is RAG | +| IMAGE_BRIGHTNESS | The brightness of the image | INPUT | When the data structure is image | +| IMAGE_CONTRAST | The contrast of the image | INPUT | When the data structure is image | +| BBOXES_AREA | The average area of the predicted bounding boxes | PREDICTION | When the task type is Object Detection | +| BBOXES_QUANTITY | The average number of predicted bounding boxes per image | PREDICTION | When the task type is Object Detection | [Task]: task.md diff --git a/mkdocs.yml b/mkdocs.yml index a837ddc..566276b 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -109,6 +109,9 @@ nav: - user_guide/company.md - user_guide/project.md - user_guide/monitoring.md + - user_guide/detection_event_rules.md + - user_guide/drift_explainability.md + - user_guide/detection_event.md - Modules: - user_guide/modules/index.md @@ -122,7 +125,7 @@ nav: - user_guide/integrations/retrain_triggers.md - Other resources: - user_guide/rbac.md - - user_guide/detection_event_rules.md +# - user_guide/detection_event_rules.md - user_guide/data_schema.md - user_guide/glossary.md - API: From 64243150db237102eb67c17a852c02ac96c3f7b3 Mon Sep 17 00:00:00 2001 From: Giovanni Giacometti Date: Mon, 21 Oct 2024 18:22:53 +0200 Subject: [PATCH 06/22] Rearrange docs --- md-docs/user_guide/detection_event.md | 7 --- md-docs/user_guide/detection_event_rules.md | 55 ----------------- .../user_guide/monitoring/detection_event.md | 26 ++++++++ .../monitoring/detection_event_rules.md | 61 +++++++++++++++++++ .../{ => monitoring}/drift_explainability.md | 0 .../user_guide/{ => monitoring}/monitoring.md | 15 ++--- mkdocs.yml | 10 +-- 7 files changed, 101 insertions(+), 73 deletions(-) delete mode 100644 md-docs/user_guide/detection_event.md delete mode 100644 md-docs/user_guide/detection_event_rules.md create mode 100644 md-docs/user_guide/monitoring/detection_event.md create mode 100644 
md-docs/user_guide/monitoring/detection_event_rules.md rename md-docs/user_guide/{ => monitoring}/drift_explainability.md (100%) rename md-docs/user_guide/{ => monitoring}/monitoring.md (93%) diff --git a/md-docs/user_guide/detection_event.md b/md-docs/user_guide/detection_event.md deleted file mode 100644 index dcdb457..0000000 --- a/md-docs/user_guide/detection_event.md +++ /dev/null @@ -1,7 +0,0 @@ -# Detection Event - -[Task]: task.md -[set_model_reference]: ../../api/python/client#set_model_reference -[add_production_data]: ../../api/python/client#add_production_data -[add_historical_data]: ../../api/python/client#add_historical_data -[DetectionEvent]: ../../api/python/models#detectionevent diff --git a/md-docs/user_guide/detection_event_rules.md b/md-docs/user_guide/detection_event_rules.md deleted file mode 100644 index 52c84e7..0000000 --- a/md-docs/user_guide/detection_event_rules.md +++ /dev/null @@ -1,55 +0,0 @@ -# Detection Event Rules - -This section provides an overview of how you can set up automation rules after a detection event occurs in -order to receive notifications or to start retraining. - -When a detection event occurs, the platform scans all the detection event rules you specified. -If a rule matches the event, it will be triggered. - -These rules are specific to a task and require the following parameters for configuration: - -- `name`: A descriptive label for your rule. -- `task_id`: The unique identifier of the task to which the rule belongs. -- `severity`: Indicates the severity level of the event - it can be `HIGH`, `MEDIUM`, or `LOW`. -- `detection_event_type`: Currently, only `DRIFT` events are available for detection. - - `monitoring_target`: Specifies what is being monitored, which can be `MODEL`, `INPUT`, or `CONCEPT`. - If the value is `MODEL`, you need to provide a corresponding `model_name`. -- `actions`: A sequential list of actions to be executed when the rule is triggered. - -## Supported Actions -Two types of actions are currently supported: notification and retrain. - - -### Notifications -- `SlackNotificationAction`: sends a notification to a Slack channel via webhook. -- `DiscordNotificationAction`: sends a notification to a Discord channel via webhook. -- `EmailNotificationAction`: sends an email to the provided email address. -- `TeamsNotificationAction`: sends a notification to Microsoft Teams via webhook. -- `MqttNotificationAction`: sends a notification to an MQTT broker. - -### Retrain Actions - -Retrain actions let you retrain your model, therefore, they are only available when the rule monitoring target is a model. -The retrain action does not need any parameter because it is automatically inferred from the `model_name` attribute of the rule. -Of course, it is mandatory that the model has a retrain trigger associated in order to add a retrain action to the rule. - -!!! example - The following code snippet demonstrates how to create a rule that matches high severity drift events for a specific model. When triggered, it sends a notification to the `ml3-platform-notifications` channel on your Slack workspace using the provided webhook URL and then start the retraining of the model. 
- - ```py - rule_id = client.create_detection_event_rule( - name='Retrain model with notification', - task_id='my-task-id, - model_name='my-model', - severity=DetectionEventSeverity.HIGH, - detection_event_type=DetectionEventType.DRIFT, - monitoring_target=MonitoringTarget.MODEL, - actions=[ - SlackNotificationAction( - webhook='https://hooks.slack.com/services/...', - channel='ml3-platform-notifications' - ), - RetrainAction() - ], - ) - ``` \ No newline at end of file diff --git a/md-docs/user_guide/monitoring/detection_event.md b/md-docs/user_guide/monitoring/detection_event.md new file mode 100644 index 0000000..60474a0 --- /dev/null +++ b/md-docs/user_guide/monitoring/detection_event.md @@ -0,0 +1,26 @@ +# Detection Event + +A [Detection Event] is raised by the MLCube Platform when a significant change is detected in one of the entities being monitored. + +An event is characterized by the following attributes: + +- **event_id**: the unique identifier of the event. +- **event_type**: the [DetectionEventType] of the event. +- **severity**: the [DetectionEventSeverity] of the event. The severity should be provided only for drift events and it indicates +the criticality of the detected drift. +- **monitoring_target**: the [MonitoringTarget] being monitored. +- **monitoring_metric**: the [MonitoringMetric] that triggered the event, if the event is related to a metric. +- **model_name**: the name of the model that raised the event. +- **model_version**: the version of the model that raised the event. +- **insert_datetime**: the time when the event was raised. +- **sample_timestamp**: the timestamp of the sample that triggered the event. +- **user_feedback**: the feedback provided by the user on whether the event was expected or not. + + + + +[Detection Event]: ../../api/python/models#detectionevent +[DetectionEventType]: ../../api/python/enums#detectioneventtype +[DetectionEventSeverity]: ../../api/python/enums#detectioneventseverity +[MonitoringTarget]: ../../api/python/enums#monitoringtarget +[MonitoringMetric]: ../../api/python/enums#monitoringmetric \ No newline at end of file diff --git a/md-docs/user_guide/monitoring/detection_event_rules.md b/md-docs/user_guide/monitoring/detection_event_rules.md new file mode 100644 index 0000000..2b4730f --- /dev/null +++ b/md-docs/user_guide/monitoring/detection_event_rules.md @@ -0,0 +1,61 @@ +# Detection Event Rules + +This section outlines how to configure automation to receive notifications or start retraining after a [Detection Event] occurs. + +When a detection event is produced, the MLCube Platform reviews all the detection event rules you have set +and triggers those matching the event. + +Rules are specific to a task and require the following parameters: + +- `name`: a descriptive label of the rule. +- `task_id`: the unique identifier of the task to which the rule belongs. +- `detection_event_type`: the [DetectionEventType] that the rule matches. +- `severity`: the [DetectionEventSeverity] of the event. +- `monitoring_target`: the monitoring target whose event should trigger the rule. +- `model_name`: the name of the model to which the rule applies. This is only required when the monitoring target is related to a model + (such as `ERROR` or `PREDICTION`). +- `actions`: A sequential list of actions to be executed when the rule is triggered. + +## Detection Event Actions +Two types of actions are currently supported: notification and retrain. 
+ +### Notifications +- `SlackNotificationAction`: sends a notification to a Slack channel via webhook. +- `DiscordNotificationAction`: sends a notification to a Discord channel via webhook. +- `EmailNotificationAction`: sends an email to the provided email address. +- `TeamsNotificationAction`: sends a notification to Microsoft Teams via webhook. +- `MqttNotificationAction`: sends a notification to an MQTT broker. + +### Retrain Action + +Retrain action lets you retrain your model. Therefore, it is only available when the monitoring target of the rule is related to a model. +The retrain action does not need any parameter because it is automatically inferred from the `model_name` attribute of the rule. +Of course, the model must already have a retrain trigger associated before adding a retrain action to a rule. + +!!! example + The following code snippet demonstrates how to create a rule that matches high severity drift events on the error of a model. + When triggered, it first sends a notification to the `ml3-platform-notifications` channel on your Slack workspace, using the + provided webhook URL, and then starts the retraining of the model. + + ```py + rule_id = client.create_detection_event_rule( + name='Retrain model with notification', + task_id='my-task-id', + model_name='my-model', + severity=DetectionEventSeverity.HIGH, + detection_event_type=DetectionEventType.DRIFT_ON, + monitoring_target=MonitoringTarget.ERROR, + actions=[ + SlackNotificationAction( + webhook='https://hooks.slack.com/services/...', + channel='ml3-platform-notifications' + ), + RetrainAction() + ], + ) + ``` + +[add_historical_data]: ../../api/python/client#add_historical_data +[Detection Event]: detection_event.md +[DetectionEventType]: ../../api/python/enums#detectioneventtype +[DetectionEventSeverity]: ../../api/python/enums#detectioneventseverity \ No newline at end of file diff --git a/md-docs/user_guide/drift_explainability.md b/md-docs/user_guide/monitoring/drift_explainability.md similarity index 100% rename from md-docs/user_guide/drift_explainability.md rename to md-docs/user_guide/monitoring/drift_explainability.md diff --git a/md-docs/user_guide/monitoring.md b/md-docs/user_guide/monitoring/monitoring.md similarity index 93% rename from md-docs/user_guide/monitoring.md rename to md-docs/user_guide/monitoring/monitoring.md index bff22c5..88fe01d 100644 --- a/md-docs/user_guide/monitoring.md +++ b/md-docs/user_guide/monitoring/monitoring.md @@ -5,7 +5,7 @@ It enables continuous tracking of your AI models performance over time, helping Additionally, it allows the monitoring of production data to preemptively detect distribution changes, ensuring that the model continues to perform as expected and aligns with business requirements. -## Why do we need Monitoring? +## Why do you need Monitoring? Machine Learning algorithms are based on the assumption that the distribution of the data used for training is the same as the one from which production data are drawn from. This assumption never holds in practice, as the real world is characterized by dynamic and ever-changing conditions. @@ -23,18 +23,17 @@ is drifting away from the expected behavior and that corrective actions should b In more practical terms, the [set_model_reference] method can be used to specify the time period where the reference of a given model should be placed. 
As a consequence, all algorithms associated with the specified model (not just those monitoring the performance, but also those operating on the data used by the model) will -be initialized on the specified reference. Of course, you should send the data you want to use as a reference to the platform before calling this method, for instance using the +be initialized on the specified reference. Of course, you should provide the data you want to use as a reference to the platform before calling this method, for instance using the [add_historical_data] method. After setting the reference, the [add_production_data] method can be used to send production data to the platform. This data will be analyzed by the monitoring algorithms -and, if a significant difference is detected, an alarm will be raised, in the form of a [DetectionEvent]. We will go into more detail about detection events and -how you can set up automatic actions upon their reception in the [Detection Event] section. - -The MLCube Platform monitors different entities, which will be explored in the following section. +and, if a significant difference is detected, an alarm will be raised, in the form of a [DetectionEvent]. +You can explore more about detection events and how you can set up automatic actions upon their reception in the [Detection Event] +and the [Detection Event Rule] sections respectively. ### Targets and Metrics -After going through the reasons why monitoring is so important in modern AI systems and explaining how monitoring is performed in the ML cube Platform, +After explaining why monitoring is so important in modern AI systems and detailing how it is performed in the ML cube Platform, we can introduce the concepts of Monitoring Targets and Monitoring Metrics. They both represent quantities that the MLCube Platform monitors, but they differ in their nature. #### Monitoring Targets @@ -96,3 +95,5 @@ under which they are actually monitored. 
Notice that also this table is subject [add_production_data]: ../../api/python/client#add_production_data [add_historical_data]: ../../api/python/client#add_historical_data [DetectionEvent]: ../../api/python/models#detectionevent +[Detection Event Rule]: detection_event_rules.md +[Detection Event]: detection_event.md diff --git a/mkdocs.yml b/mkdocs.yml index 566276b..1789aca 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -108,10 +108,12 @@ nav: - user_guide/index.md - user_guide/company.md - user_guide/project.md - - user_guide/monitoring.md - - user_guide/detection_event_rules.md - - user_guide/drift_explainability.md - - user_guide/detection_event.md + + - Monitoring: + - user_guide/monitoring/monitoring.md + - user_guide/monitoring/detection_event.md + - user_guide/monitoring/detection_event_rules.md + - user_guide/monitoring/drift_explainability.md - Modules: - user_guide/modules/index.md From 73fd31330d37b227bdac73d411766bf4250c5ce1 Mon Sep 17 00:00:00 2001 From: Giovanni Giacometti Date: Mon, 21 Oct 2024 18:36:17 +0200 Subject: [PATCH 07/22] Complete detection event --- .../user_guide/monitoring/detection_event.md | 29 +++++++++++++++---- .../monitoring/detection_event_rules.md | 6 ++-- .../monitoring/drift_explainability.md | 5 ---- md-docs/user_guide/monitoring/monitoring.md | 8 ++--- 4 files changed, 31 insertions(+), 17 deletions(-) diff --git a/md-docs/user_guide/monitoring/detection_event.md b/md-docs/user_guide/monitoring/detection_event.md index 60474a0..a6ee697 100644 --- a/md-docs/user_guide/monitoring/detection_event.md +++ b/md-docs/user_guide/monitoring/detection_event.md @@ -16,11 +16,30 @@ the criticality of the detected drift. - **sample_timestamp**: the timestamp of the sample that triggered the event. - **user_feedback**: the feedback provided by the user on whether the event was expected or not. +## Retrieve Detection Events +You can access the detection events generated by the Platform in two ways: +- **SDK**: the [get_detection_events] method can be used to retrieve all detection events for a specific task programmatically. +- **WebApp**: navigate to the **`Detection Events`** section located in the task page's sidebar. Here, all detection events are displayed in a table, + with multiple filtering options available for better event management. -[Detection Event]: ../../api/python/models#detectionevent -[DetectionEventType]: ../../api/python/enums#detectioneventtype -[DetectionEventSeverity]: ../../api/python/enums#detectioneventseverity -[MonitoringTarget]: ../../api/python/enums#monitoringtarget -[MonitoringMetric]: ../../api/python/enums#monitoringmetric \ No newline at end of file +## User Feedback + +When a detection event is raised, you can provide feedback on whether the event was expected or not. This feedback is then used +to tune the monitoring algorithms and improve their performance. The feedback can be provided through the WebApp, in the +**`Detection Events`** section of the task page, or through the SDK, using the [set_detection_event_user_feedback] method. + +## Detection Event Rules + +To automate actions upon the reception of a detection event, you can set up detection event rules. +You can learn more about how to configure these rules in the [Detection Event Rules] section. 
+ +[Detection Event]: ../../../api/python/models#detectionevent +[DetectionEventType]: ../../../api/python/enums#detectioneventtype +[DetectionEventSeverity]: ../../../api/python/enums#detectioneventseverity +[MonitoringTarget]: ../../../api/python/enums#monitoringtarget +[MonitoringMetric]: ../../../api/python/enums#monitoringmetric +[get_detection_events]: ../../../api/python/client/#get_detection_events +[set_detection_event_user_feedback]: ../../../api/python/client/#set_detection_event_user_feedback +[Detection Event Rules]: detection_event_rules.md \ No newline at end of file diff --git a/md-docs/user_guide/monitoring/detection_event_rules.md b/md-docs/user_guide/monitoring/detection_event_rules.md index 2b4730f..025fdbc 100644 --- a/md-docs/user_guide/monitoring/detection_event_rules.md +++ b/md-docs/user_guide/monitoring/detection_event_rules.md @@ -55,7 +55,7 @@ Of course, the model must already have a retrain trigger associated before addin ) ``` -[add_historical_data]: ../../api/python/client#add_historical_data +[add_historical_data]: ../../../api/python/client#add_historical_data [Detection Event]: detection_event.md -[DetectionEventType]: ../../api/python/enums#detectioneventtype -[DetectionEventSeverity]: ../../api/python/enums#detectioneventseverity \ No newline at end of file +[DetectionEventType]: ../../../api/python/enums#detectioneventtype +[DetectionEventSeverity]: ../../../api/python/enums#detectioneventseverity \ No newline at end of file diff --git a/md-docs/user_guide/monitoring/drift_explainability.md b/md-docs/user_guide/monitoring/drift_explainability.md index e451922..842ee68 100644 --- a/md-docs/user_guide/monitoring/drift_explainability.md +++ b/md-docs/user_guide/monitoring/drift_explainability.md @@ -1,7 +1,2 @@ # Drift Explainability -[Task]: task.md -[set_model_reference]: ../../api/python/client#set_model_reference -[add_production_data]: ../../api/python/client#add_production_data -[add_historical_data]: ../../api/python/client#add_historical_data -[DetectionEvent]: ../../api/python/models#detectionevent diff --git a/md-docs/user_guide/monitoring/monitoring.md b/md-docs/user_guide/monitoring/monitoring.md index 88fe01d..a2b4520 100644 --- a/md-docs/user_guide/monitoring/monitoring.md +++ b/md-docs/user_guide/monitoring/monitoring.md @@ -91,9 +91,9 @@ under which they are actually monitored. 
Notice that also this table is subject [Task]: task.md -[set_model_reference]: ../../api/python/client#set_model_reference -[add_production_data]: ../../api/python/client#add_production_data -[add_historical_data]: ../../api/python/client#add_historical_data -[DetectionEvent]: ../../api/python/models#detectionevent +[set_model_reference]: ../../../api/python/client#set_model_reference +[add_production_data]: ../../../api/python/client#add_production_data +[add_historical_data]: ../../../api/python/client#add_historical_data +[DetectionEvent]: ../../../api/python/models#detectionevent [Detection Event Rule]: detection_event_rules.md [Detection Event]: detection_event.md From 5895c659ec73bd1f855d38fbe177c29684ed2747 Mon Sep 17 00:00:00 2001 From: Giovanni Giacometti Date: Tue, 22 Oct 2024 18:31:12 +0200 Subject: [PATCH 08/22] Rename monitoring.md to index.md, drift explainability report --- .../user_guide/monitoring/detection_event.md | 27 +++++++-------- .../monitoring/drift_explainability.md | 33 +++++++++++++++++++ .../monitoring/{monitoring.md => index.md} | 0 mkdocs.yml | 2 +- 4 files changed, 48 insertions(+), 14 deletions(-) rename md-docs/user_guide/monitoring/{monitoring.md => index.md} (100%) diff --git a/md-docs/user_guide/monitoring/detection_event.md b/md-docs/user_guide/monitoring/detection_event.md index a6ee697..430103c 100644 --- a/md-docs/user_guide/monitoring/detection_event.md +++ b/md-docs/user_guide/monitoring/detection_event.md @@ -4,36 +4,37 @@ A [Detection Event] is raised by the MLCube Platform when a significant change i An event is characterized by the following attributes: -- **event_id**: the unique identifier of the event. -- **event_type**: the [DetectionEventType] of the event. -- **severity**: the [DetectionEventSeverity] of the event. The severity should be provided only for drift events and it indicates +- `event_id`: the unique identifier of the event. +- `event_type`: the [DetectionEventType] of the event. +- `severity`: the [DetectionEventSeverity] of the event. The severity should be provided only for drift events and it indicates the criticality of the detected drift. -- **monitoring_target**: the [MonitoringTarget] being monitored. -- **monitoring_metric**: the [MonitoringMetric] that triggered the event, if the event is related to a metric. -- **model_name**: the name of the model that raised the event. -- **model_version**: the version of the model that raised the event. -- **insert_datetime**: the time when the event was raised. -- **sample_timestamp**: the timestamp of the sample that triggered the event. -- **user_feedback**: the feedback provided by the user on whether the event was expected or not. +- `monitoring_target`: the [MonitoringTarget] being monitored. +- `monitoring_metric`: the [MonitoringMetric] that triggered the event, if the event is related to a metric. +- `model_name`: the name of the model that raised the event. +- `model_version`: the version of the model that raised the event. +- `insert_datetime`: the time when the event was raised. +- `sample_timestamp`: the timestamp of the sample that triggered the event. +- `user_feedback`: the feedback provided by the user on whether the event was expected or not. ## Retrieve Detection Events You can access the detection events generated by the Platform in two ways: - **SDK**: the [get_detection_events] method can be used to retrieve all detection events for a specific task programmatically. 
-- **WebApp**: navigate to the **`Detection Events`** section located in the task page's sidebar. Here, all detection events are displayed in a table,
+- **WebApp**: navigate to the **`Detection`** section located in the task page's sidebar. Here, all detection events are displayed in a table,
   with multiple filtering options available for better event management.
 
 ## User Feedback
 
 When a detection event is raised, you can provide feedback on whether the event was expected or not. This feedback is then used
 to tune the monitoring algorithms and improve their performance. The feedback can be provided through the WebApp, in the
-**`Detection Events`** section of the task page, or through the SDK, using the [set_detection_event_user_feedback] method.
+**`Detection`** section of the task page, or through the SDK, using the [set_detection_event_user_feedback] method.
+
 
 ## Detection Event Rules
 
 To automate actions upon the reception of a detection event, you can set up detection event rules.
-You can learn more about how to configure these rules in the [Detection Event Rules] section.
+You can learn more about how to configure them in the [Detection Event Rules] section.

diff --git a/md-docs/user_guide/monitoring/drift_explainability.md b/md-docs/user_guide/monitoring/drift_explainability.md
index 842ee68..5ab5592 100644
--- a/md-docs/user_guide/monitoring/drift_explainability.md
+++ b/md-docs/user_guide/monitoring/drift_explainability.md
@@ -1,2 +1,35 @@
 # Drift Explainability
 
+[Monitoring] is a crucial aspect of the machine learning lifecycle, as it enables tracking the model's performance and its data over time,
+ensuring the model continues to function as expected. However, monitoring alone is not enough when it comes to the adaptation phase.
+
+In order to make the right decisions, you need to understand the main factors that led to the drift in the first place, so that
+the correct actions can be taken to mitigate it.
+
+The MLCube Platform supports this process by offering what we refer to as "**Drift Explainability Reports**",
+automatically generated upon the detection of a drift and containing several elements that should help you diagnose the root causes
+of the change that occurred.
+
+You can access the reports by navigating to the `Drift Explainability` tab in the sidebar of the task page.
+
+## Content
+
+A Drift Explainability Report consists in comparing the reference data and the portion of production data that caused the drift, hence
+belonging to the new concept. Notice that these reports are generated after a sufficient amount of samples has been collected after the drift.
+If a drift event is soon followed by a drift off event, the report might not be generated.
+
+Each report is composed of several entities, each one providing a different view of the data and the drift. They might be in the form of
+a table, a plot, or a textual explanation.
+Observed and analyzed together, they should provide a comprehensive understanding of the drift and its causes. Some entities are
+only available for specific task types and data structure.
+These are the entities currently available:
+
+- `Feature Importance`: it's a barplot that illustrates how the significance of each feature differs between the reference
+  and the production datasets. 
Variations in a feature's values might suggest that its contribution to the model's predictions
+  has changed over time. This entity is available only for tasks with tabular data.
+- `Variable discriminative power`: it's also a bar plot that displays the influence of each feature, as well as the target,
+  in differentiating between the reference and the production datasets.
+  The values represent how strongly a given feature helps distinguish the datasets, with higher values representing stronger
+  separating power. This entity is available only for tasks with tabular data.
+
+[Monitoring]: monitoring.md
\ No newline at end of file
diff --git a/md-docs/user_guide/monitoring/monitoring.md b/md-docs/user_guide/monitoring/index.md
similarity index 100%
rename from md-docs/user_guide/monitoring/monitoring.md
rename to md-docs/user_guide/monitoring/index.md
diff --git a/mkdocs.yml b/mkdocs.yml
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -110,7 +110,7 @@ nav:
 
     - Monitoring:
-      - user_guide/monitoring/monitoring.md
+      - user_guide/monitoring/index.md
       - user_guide/monitoring/detection_event.md
       - user_guide/monitoring/detection_event_rules.md
       - user_guide/monitoring/drift_explainability.md

From abb2c78183ae0cedc3e7f37a781828d62e7548ba Mon Sep 17 00:00:00 2001
From: Giovanni Giacometti
Date: Wed, 23 Oct 2024 09:37:09 +0200
Subject: [PATCH 09/22] Modification

---
 .../user_guide/monitoring/detection_event.md |  2 +-
 .../monitoring/detection_event_rules.md      |  2 +-
 .../monitoring/drift_explainability.md       | 20 ++++++++++---------
 md-docs/user_guide/monitoring/index.md       |  6 +++---
 4 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/md-docs/user_guide/monitoring/detection_event.md b/md-docs/user_guide/monitoring/detection_event.md
index 430103c..3a6108a 100644
--- a/md-docs/user_guide/monitoring/detection_event.md
+++ b/md-docs/user_guide/monitoring/detection_event.md
@@ -22,7 +22,7 @@ You can access the detection events generated by the Platform in two ways:
 
 - **SDK**: the [get_detection_events] method can be used to retrieve all detection events for a specific task programmatically.
 - **WebApp**: navigate to the **`Detection`** section located in the task page's sidebar. Here, all detection events are displayed in a table,
-  with multiple filtering options available for better event management.
+  with multiple filtering options available for useful event management.
 
 ## User Feedback
 
diff --git a/md-docs/user_guide/monitoring/detection_event_rules.md b/md-docs/user_guide/monitoring/detection_event_rules.md
index 025fdbc..0178ffa 100644
--- a/md-docs/user_guide/monitoring/detection_event_rules.md
+++ b/md-docs/user_guide/monitoring/detection_event_rules.md
@@ -30,7 +30,7 @@ Two types of actions are currently supported: notification and retrain.
 
 Retrain action lets you retrain your model. Therefore, it is only available when the monitoring target of the rule is related to a model.
 The retrain action does not need any parameter because it is automatically inferred from the `model_name` attribute of the rule.
-Of course, the model must already have a retrain trigger associated before adding a retrain action to a rule.
+Of course, the model must already have a retrain trigger associated before setting up this action.
 
 !!! example
     The following code snippet demonstrates how to create a rule that matches high severity drift events on the error of a model.
diff --git a/md-docs/user_guide/monitoring/drift_explainability.md b/md-docs/user_guide/monitoring/drift_explainability.md index 5ab5592..fe4bfed 100644 --- a/md-docs/user_guide/monitoring/drift_explainability.md +++ b/md-docs/user_guide/monitoring/drift_explainability.md @@ -12,16 +12,17 @@ of the change occurred. You can access the reports by navigating to the `Drift Explainability` tab in the sidebar of the task page. -## Content +## Structure -A Drift Explainability Report consists in comparing the reference data and the portion of production data that caused the drift, hence -belonging to the new concept. Notice that these reports are generated after a sufficient amount of samples has been collected after the drift. -If a drift event is soon followed by a drift off event, the report might not be generated. +A Drift Explainability Report consists in comparing the reference data and the portion of production data where the drift was identified, hence +those belonging to the new concept. Notice that these reports are generated after a sufficient amount of samples has been collected after the drift. +If the distribution moves back to the reference before enough samples are collected, the report might not be generated. -Each report is composed of several entities, each one providing a different view of the data and the drift. They might be in the form of -a table, a plot, or a textual explanation. -Observed and analyzed together, they should provide a comprehensive understanding of the drift and its causes. Some entities are -only available for specific task types and data structure. +Each report is composed of several entities, each providing a different perspective on the data and the drift occurred. +Most of them are specific to a certain [Data Structure], so they might not be available for all tasks. + +These entities can take the form of tables, plots, or textual explanations. +Observed and analyzed together, they should provide a comprehensive understanding of the drift and its underlying causes. These are the entities currently available: - `Feature Importance`: it's a barplot that illustrates how the significance of each feature differs between the reference @@ -32,4 +33,5 @@ These are the entities currently available: The values represent how strongly a given feature helps distinguishing the datasets, with higher values representing stronger separating power. This entity is available only for tasks with tabular data. -[Monitoring]: monitoring.md \ No newline at end of file +[Monitoring]: monitoring.md +[Data Structure]: ../../../api/python/enums#datastructure \ No newline at end of file diff --git a/md-docs/user_guide/monitoring/index.md b/md-docs/user_guide/monitoring/index.md index a2b4520..9defffa 100644 --- a/md-docs/user_guide/monitoring/index.md +++ b/md-docs/user_guide/monitoring/index.md @@ -1,7 +1,7 @@ # Monitoring The monitoring module is a key feature of the ML cube Platform. -It enables continuous tracking of your AI models performance over time, helping to identify potential issues. +It enables continuous tracking of AI models performance over time, helping to identify potential issues. Additionally, it allows the monitoring of production data to preemptively detect distribution changes, ensuring that the model continues to perform as expected and aligns with business requirements. 
@@ -23,7 +23,7 @@ is drifting away from the expected behavior and that corrective actions should b In more practical terms, the [set_model_reference] method can be used to specify the time period where the reference of a given model should be placed. As a consequence, all algorithms associated with the specified model (not just those monitoring the performance, but also those operating on the data used by the model) will -be initialized on the specified reference. Of course, you should provide the data you want to use as a reference to the platform before calling this method, for instance using the +be initialized on the specified reference. Of course, you should provide to the Platform the data you want to use as a reference before calling this method, for instance using the [add_historical_data] method. After setting the reference, the [add_production_data] method can be used to send production data to the platform. This data will be analyzed by the monitoring algorithms @@ -75,7 +75,7 @@ A Monitoring Metric is a generic quantity that can be computed on a Monitoring T aspects of a target, which might help in identifying the root cause of a drift, as well as defining the corrective actions to be taken. The following table display the monitoring metrics supported, along with their monitoring target and the conditions -under which they are actually monitored. Notice that also this table is subject to changes, as new metrics are added. +under which they are actually monitored. Notice that also this table is subject to changes, as new metrics will be added. | **Monitoring Metric** | Description | **Monitoring Target** | **Conditions** | |:---------------------:|----------------------------------------------------------|:------------------------------------------------:|:--------------------------------------:| From abb2c78183ae0cedc3e7f37a781828d62e7548ba Mon Sep 17 00:00:00 2001 From: Giovanni Giacometti Date: Wed, 23 Oct 2024 09:51:24 +0200 Subject: [PATCH 10/22] Fix links and small impr --- md-docs/user_guide/index.md | 4 ++-- md-docs/user_guide/modules/retraining.md | 2 +- .../user_guide/monitoring/detection_event.md | 14 ++++++------ .../monitoring/detection_event_rules.md | 6 ++--- .../monitoring/drift_explainability.md | 4 ++-- md-docs/user_guide/monitoring/index.md | 10 ++++----- requirements.txt | 22 +++++++++++-------- 7 files changed, 33 insertions(+), 29 deletions(-) diff --git a/md-docs/user_guide/index.md b/md-docs/user_guide/index.md index 1f55726..0c3d7a5 100644 --- a/md-docs/user_guide/index.md +++ b/md-docs/user_guide/index.md @@ -53,7 +53,7 @@ A **Task** is specified by several attributes, the most important are: - `type`: regression, classification, object detection ... - `data structure`: tabular data, image data, ... -- `optional target`: if the target is not always available. This happen when input samples are labeled and the most part of production data do not have a label +- `optional target`: if the target is not always available. This happens when input samples are labeled and the most part of production data do not have a label - `data schema`: specifies the inputs and the target of the task, see [Data Schema](data_schema.md) section for more details - `cost info`: information about the economic costs of the error on the target @@ -110,7 +110,7 @@ Now that you have clear the basic concepts we invite you to explore the other ML Discover how to setup automation rules to increase your reactivity. 
-   [:octicons-arrow-right-24: More info](detection_event_rules.md)
+   [:octicons-arrow-right-24: More info](monitoring/detection_event_rules.md)
 
 - :material-lock:{ .lg .middle } **Roles and access**
 
diff --git a/md-docs/user_guide/modules/retraining.md b/md-docs/user_guide/modules/retraining.md
index ba0777d..4c48d35 100644
--- a/md-docs/user_guide/modules/retraining.md
+++ b/md-docs/user_guide/modules/retraining.md
@@ -7,7 +7,7 @@ Even if, the data has changed you can extract useful information from the past.
 ML cube Platform leverages all the available data belonging to the three categories: historical, reference and production, computing an Importance Score for every data sample you have.
 These Importance Scores will be used during the training phase of your model as weights in the loss function.
 
-A **Retraining Report** is generated whenever you request it or as a consequence of a [Detection automation rules](../detection_event_rules.md).
+A **Retraining Report** is generated whenever you request it or as a consequence of a [Detection automation rule](../monitoring/detection_event_rules.md).
 It contains several pieces of information that help you decide if it is the right time to retrain your model.
 Its sections are:
 
diff --git a/md-docs/user_guide/monitoring/detection_event.md b/md-docs/user_guide/monitoring/detection_event.md
index 3a6108a..18294ec 100644
--- a/md-docs/user_guide/monitoring/detection_event.md
+++ b/md-docs/user_guide/monitoring/detection_event.md
@@ -36,11 +36,11 @@ to tune the monitoring algorithms and improve their performance. The feedback ca
-[Detection Event]: ../../../api/python/models#detectionevent -[DetectionEventType]: ../../../api/python/enums#detectioneventtype -[DetectionEventSeverity]: ../../../api/python/enums#detectioneventseverity -[MonitoringTarget]: ../../../api/python/enums#monitoringtarget -[MonitoringMetric]: ../../../api/python/enums#monitoringmetric -[get_detection_events]: ../../../api/python/client/#get_detection_events -[set_detection_event_user_feedback]: ../../../api/python/client/#set_detection_event_user_feedback +[Detection Event]: ../../api/python/models.md#detectionevent +[DetectionEventType]: ../../api/python/enums.md#detectioneventtype +[DetectionEventSeverity]: ../../api/python/enums.md#detectioneventseverity +[MonitoringTarget]: ../../api/python/enums.md#monitoringtarget +[MonitoringMetric]: ../../api/python/enums.md#monitoringmetric +[get_detection_events]: ../../api/python/client.md#get_detection_events +[set_detection_event_user_feedback]: ../../api/python/client.md#set_detection_event_user_feedback [Detection Event Rules]: detection_event_rules.md \ No newline at end of file diff --git a/md-docs/user_guide/monitoring/detection_event_rules.md b/md-docs/user_guide/monitoring/detection_event_rules.md index 0178ffa..289988a 100644 --- a/md-docs/user_guide/monitoring/detection_event_rules.md +++ b/md-docs/user_guide/monitoring/detection_event_rules.md @@ -55,7 +55,7 @@ Of course, the model must already have a retrain trigger associated before setti ) ``` -[add_historical_data]: ../../../api/python/client#add_historical_data +[add_historical_data]: ../../api/python/client.md#add_historical_data [Detection Event]: detection_event.md -[DetectionEventType]: ../../../api/python/enums#detectioneventtype -[DetectionEventSeverity]: ../../../api/python/enums#detectioneventseverity \ No newline at end of file +[DetectionEventType]: ../../api/python/enums.md#detectioneventtype +[DetectionEventSeverity]: ../../api/python/enums.md#detectioneventseverity \ No newline at end of file diff --git a/md-docs/user_guide/monitoring/drift_explainability.md b/md-docs/user_guide/monitoring/drift_explainability.md index fe4bfed..f270fce 100644 --- a/md-docs/user_guide/monitoring/drift_explainability.md +++ b/md-docs/user_guide/monitoring/drift_explainability.md @@ -33,5 +33,5 @@ These are the entities currently available: The values represent how strongly a given feature helps distinguishing the datasets, with higher values representing stronger separating power. This entity is available only for tasks with tabular data. -[Monitoring]: monitoring.md -[Data Structure]: ../../../api/python/enums#datastructure \ No newline at end of file +[Monitoring]: index.md +[Data Structure]: ../../api/python/enums.md#datastructure \ No newline at end of file diff --git a/md-docs/user_guide/monitoring/index.md b/md-docs/user_guide/monitoring/index.md index 9defffa..ebadac8 100644 --- a/md-docs/user_guide/monitoring/index.md +++ b/md-docs/user_guide/monitoring/index.md @@ -90,10 +90,10 @@ under which they are actually monitored. 
Notice that also this table is subject | BBOXES_QUANTITY | The average number of predicted bounding boxes per image | PREDICTION | When the task type is Object Detection | -[Task]: task.md -[set_model_reference]: ../../../api/python/client#set_model_reference -[add_production_data]: ../../../api/python/client#add_production_data -[add_historical_data]: ../../../api/python/client#add_historical_data -[DetectionEvent]: ../../../api/python/models#detectionevent +[Task]: ../task.md +[set_model_reference]: ../../api/python/client.md#set_model_reference +[add_production_data]: ../../api/python/client.md#add_production_data +[add_historical_data]: ../../api/python/client.md#add_historical_data +[DetectionEvent]: ../../api/python/models.md#detectionevent [Detection Event Rule]: detection_event_rules.md [Detection Event]: detection_event.md diff --git a/requirements.txt b/requirements.txt index 873975b..dea859f 100644 --- a/requirements.txt +++ b/requirements.txt @@ -14,8 +14,6 @@ anyio==4.4.0 # via # httpx # jupyter-server -appnope==0.1.4 - # via ipykernel argon2-cffi==23.1.0 # via jupyter-server argon2-cffi-bindings==21.2.0 @@ -51,7 +49,12 @@ charset-normalizer==3.3.2 click==8.1.7 # via mkdocs colorama==0.4.6 - # via mkdocs-material + # via + # click + # ipython + # mkdocs + # mkdocs-material + # tqdm comm==0.2.2 # via # ipykernel @@ -337,8 +340,6 @@ pathspec==0.12.1 # via # mkdocs # mkdocs-macros-plugin -pexpect==4.9.0 - # via ipython pillow==10.3.0 # via # ml3-platform-docs (pyproject.toml) @@ -360,10 +361,6 @@ psutil==5.9.8 # via # accelerate # ipykernel -ptyprocess==0.7.0 - # via - # pexpect - # terminado pure-eval==0.2.2 # via stack-data pyarrow==16.1.0 @@ -399,6 +396,13 @@ python-json-logger==2.0.7 # via jupyter-events pytz==2024.1 # via pandas +pywin32==308 + # via jupyter-core +pywinpty==2.0.14 + # via + # jupyter-server + # jupyter-server-terminals + # terminado pyyaml==6.0.1 # via # accelerate From a70c0028147580456b8d1745e3019912468cb3c9 Mon Sep 17 00:00:00 2001 From: Giovanni Giacometti Date: Wed, 23 Oct 2024 13:12:49 +0200 Subject: [PATCH 11/22] Add customer id in detection event page (will be added soon) --- md-docs/user_guide/monitoring/detection_event.md | 1 + 1 file changed, 1 insertion(+) diff --git a/md-docs/user_guide/monitoring/detection_event.md b/md-docs/user_guide/monitoring/detection_event.md index 18294ec..1ec5d95 100644 --- a/md-docs/user_guide/monitoring/detection_event.md +++ b/md-docs/user_guide/monitoring/detection_event.md @@ -14,6 +14,7 @@ the criticality of the detected drift. - `model_version`: the version of the model that raised the event. - `insert_datetime`: the time when the event was raised. - `sample_timestamp`: the timestamp of the sample that triggered the event. +- 'sample_customer_id': the id of the customer that triggered the event. - `user_feedback`: the feedback provided by the user on whether the event was expected or not. 
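+
+For illustration, once events have been retrieved through the SDK (see the next section), their attributes can be inspected programmatically. The following is a minimal sketch, assuming an authenticated `client`; the exact method signature and parameter names are assumptions to adapt to your setup:
+
+``` py
+# Minimal sketch: list the detection events of a task and read the
+# attributes described above. Method and parameter names are assumptions;
+# adapt them to the actual SDK client you are using.
+events = client.get_detection_events(task_id="my_task_id")
+
+for event in events:
+    print(event.event_type, event.severity, event.monitoring_target)
+    if event.model_name is not None:
+        # Only populated when the event was raised by a model algorithm.
+        print(event.model_name, event.model_version)
+```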
## Retrieve Detection Events From 36fc8aa52a5ea75a275810d3354b5e1513d26236 Mon Sep 17 00:00:00 2001 From: Giovanni Giacometti Date: Wed, 23 Oct 2024 18:10:26 +0200 Subject: [PATCH 12/22] Add section in monitoring index.md to explain how to access drift status --- md-docs/user_guide/monitoring/index.md | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/md-docs/user_guide/monitoring/index.md b/md-docs/user_guide/monitoring/index.md index ebadac8..985c126 100644 --- a/md-docs/user_guide/monitoring/index.md +++ b/md-docs/user_guide/monitoring/index.md @@ -90,6 +90,25 @@ under which they are actually monitored. Notice that also this table is subject | BBOXES_QUANTITY | The average number of predicted bounding boxes per image | PREDICTION | When the task type is Object Detection | +### Monitoring Status + +All the entities being monitored are associated with a status, which is defined according to the enumeration [MonitoringStatus]. + +The status can be one of the following: + +- `OK`: the entity is behaving as expected. +- `WARNING`: the entity has shown signs of drifts, but it is still within the acceptable range. +- `DRIFT`: the entity has experienced a significant change and corrective actions should be taken. + +You can check the status of the monitored entities in two ways: + +- **WebApp**: The homepage of the task displays the status of both monitoring targets and metrics. +- **SDK**: The [get_monitoring_status] method can be used to retrieve the status of the monitored entities programmatically. + This method returns a [MonitoringQuantityStatus], a BaseModel holding the status of the monitoring entity requested. + Otherwise, you can use the [get_task] method, which returns a BaseModel with all the information related to a task, including + the list of [MonitoringQuantityStatus] for all the entities monitored in the task. + + [Task]: ../task.md [set_model_reference]: ../../api/python/client.md#set_model_reference [add_production_data]: ../../api/python/client.md#add_production_data @@ -97,3 +116,7 @@ under which they are actually monitored. Notice that also this table is subject [DetectionEvent]: ../../api/python/models.md#detectionevent [Detection Event Rule]: detection_event_rules.md [Detection Event]: detection_event.md +[MonitoringStatus]: ../../api/python/enums.md#monitoringstatus +[get_monitoring_status]: ../../api/python/client.md#get_monitoring_status +[MonitoringQuantityStatus]: ../../api/python/models.md#monitoringquantitystatus +[get_task]: ../../api/python/client.md#get_task \ No newline at end of file From d48f1e98f978ab56cdee5baeb7a1777c5a98a8a7 Mon Sep 17 00:00:00 2001 From: Giovanni Giacometti Date: Thu, 31 Oct 2024 13:21:12 +0100 Subject: [PATCH 13/22] Quick fixes post pr --- md-docs/user_guide/monitoring/detection_event.md | 9 +++++---- .../monitoring/detection_event_rules.md | 2 +- .../user_guide/monitoring/drift_explainability.md | 3 ++- md-docs/user_guide/monitoring/index.md | 15 ++++++++------- 4 files changed, 16 insertions(+), 13 deletions(-) diff --git a/md-docs/user_guide/monitoring/detection_event.md b/md-docs/user_guide/monitoring/detection_event.md index 1ec5d95..ac0bfc7 100644 --- a/md-docs/user_guide/monitoring/detection_event.md +++ b/md-docs/user_guide/monitoring/detection_event.md @@ -1,6 +1,6 @@ # Detection Event -A [Detection Event] is raised by the MLCube Platform when a significant change is detected in one of the entities being monitored. 
+A [Detection Event] is raised by the ML cube Platform when a significant change is detected in one of the entities being monitored. An event is characterized by the following attributes: @@ -10,8 +10,8 @@ An event is characterized by the following attributes: the criticality of the detected drift. - `monitoring_target`: the [MonitoringTarget] being monitored. - `monitoring_metric`: the [MonitoringMetric] that triggered the event, if the event is related to a metric. -- `model_name`: the name of the model that raised the event. -- `model_version`: the version of the model that raised the event. +- `model_name`: the name of the model that raised the event. It's present only if the event is related to a model. +- `model_version`: the version of the model that raised the event. It's present only if the event is related to a model. - `insert_datetime`: the time when the event was raised. - `sample_timestamp`: the timestamp of the sample that triggered the event. - 'sample_customer_id': the id of the customer that triggered the event. @@ -23,7 +23,8 @@ You can access the detection events generated by the Platform in two ways: - **SDK**: the [get_detection_events] method can be used to retrieve all detection events for a specific task programmatically. - **WebApp**: navigate to the **`Detection `** section located in the task page's sidebar. Here, all detection events are displayed in a table, - with multiple filtering options available for useful event management. + with multiple filtering options available for useful event management. Additionally, the latest detection events identified are shown in the Task homepage, + in the section named "Latest Detection Events". ## User Feedback diff --git a/md-docs/user_guide/monitoring/detection_event_rules.md b/md-docs/user_guide/monitoring/detection_event_rules.md index 289988a..ee1b826 100644 --- a/md-docs/user_guide/monitoring/detection_event_rules.md +++ b/md-docs/user_guide/monitoring/detection_event_rules.md @@ -2,7 +2,7 @@ This section outlines how to configure automation to receive notifications or start retraining after a [Detection Event] occurs. -When a detection event is produced, the MLCube Platform reviews all the detection event rules you have set +When a detection event is produced, the ML cube Platform reviews all the detection event rules you have set and triggers those matching the event. Rules are specific to a task and require the following parameters: diff --git a/md-docs/user_guide/monitoring/drift_explainability.md b/md-docs/user_guide/monitoring/drift_explainability.md index f270fce..74c5603 100644 --- a/md-docs/user_guide/monitoring/drift_explainability.md +++ b/md-docs/user_guide/monitoring/drift_explainability.md @@ -6,7 +6,7 @@ ensuring the model continues to function as expected. However, monitoring only i In order to make the right decisions, you need to understand what were the main factors that led to the drift in the first place, so that the correct actions can be taken to mitigate it. -The MLCube Platform supports this process by offering what we refer to as "**Drift Explainability Reports**", +The ML cube Platform supports this process by offering what we refer to as "**Drift Explainability Reports**", automatically generated upon the detection of a drift and containing several elements that should help you diagnose the root causes of the change occurred. 
@@ -16,6 +16,7 @@ You can access the reports by navigating to the `Drift Explainability` tab in th A Drift Explainability Report consists in comparing the reference data and the portion of production data where the drift was identified, hence those belonging to the new concept. Notice that these reports are generated after a sufficient amount of samples has been collected after the drift. +This is because the elements of the report needs a significant number of samples to guarantee statistical reliability of the results. If the distribution moves back to the reference before enough samples are collected, the report might not be generated. Each report is composed of several entities, each providing a different perspective on the data and the drift occurred. diff --git a/md-docs/user_guide/monitoring/index.md b/md-docs/user_guide/monitoring/index.md index 985c126..b438027 100644 --- a/md-docs/user_guide/monitoring/index.md +++ b/md-docs/user_guide/monitoring/index.md @@ -15,9 +15,9 @@ in turn can have a negative impact on the business. Monitoring, also known as __Drift Detection__ in the literature, refers the process of continuously tracking the performance of a model and the distribution of the data it is operating on. -## How does the MLCube Platform perform Monitoring? +## How does the ML cube Platform perform Monitoring? -The MLCube platform performs monitoring by employing statistical techniques to compare a certain reference (for instance, data used for training or the performance of a model +The ML cube platform performs monitoring by employing statistical techniques to compare a certain reference (for instance, data used for training or the performance of a model on the test set) to incoming production data. If a significant difference is detected, an alarm is raised, signaling that the monitored entity is drifting away from the expected behavior and that corrective actions should be taken. @@ -34,14 +34,15 @@ and the [Detection Event Rule] sections respectively. ### Targets and Metrics After explaining why monitoring is so important in modern AI systems and detailing how it is performed in the ML cube Platform, -we can introduce the concepts of Monitoring Targets and Monitoring Metrics. They both represent quantities that the MLCube Platform monitors, but they differ in their nature. +we can introduce the concepts of Monitoring Targets and Monitoring Metrics. They both represent quantities that the ML cube Platform monitors, but they differ in their nature. +They are both automatically defined by the ML cube platform based on the [Task] attributes, such as the Task type and the data structure. #### Monitoring Targets A Monitoring Target is a relevant entity involved in a [Task]. They represent the main quantities monitored by the platform, those whose variation can have a significant impact on the AI task success. -The MLCube platform supports the following monitoring targets: +The ML cube platform supports the following monitoring targets: - `INPUT`: the input distribution, $P(X)$. - `CONCEPT`: the joint distribution of input and target, $P(X, Y)$. @@ -72,9 +73,9 @@ Nonetheless, the platform might not support yet all possible combinations. The t #### Monitoring Metrics A Monitoring Metric is a generic quantity that can be computed on a Monitoring Target. They enable the monitoring of specific -aspects of a target, which might help in identifying the root cause of a drift, as well as defining the corrective actions to be taken. 
+aspects of an entity, which might help in identifying the root cause of a drift, as well as defining the corrective actions to be taken. -The following table display the monitoring metrics supported, along with their monitoring target and the conditions +The following table displays the monitoring metrics supported, along with their monitoring target and the conditions under which they are actually monitored. Notice that also this table is subject to changes, as new metrics will be added. | **Monitoring Metric** | Description | **Monitoring Target** | **Conditions** | @@ -83,7 +84,7 @@ under which they are actually monitored. Notice that also this table is subject | TEXT_EMOTION | The emotion of the text | INPUT, USER_INPUT | When the data structure is text | | TEXT_SENTIMENT | The sentiment of the text | INPUT, USER_INPUT | When the data structure is text | | TEXT_LENGTH | The length of the text | INPUT, USER_INPUT, RETRIEVED_CONTEXT, PREDICTION | When the data structure is text | -| MODEL_PERPLEXITY | The uncertainty of the LLM | PREDICTION | When the task type is RAG | +| MODEL_PERPLEXITY | A measure of how well the LLM predicts the next words | PREDICTION | When the task type is RAG | | IMAGE_BRIGHTNESS | The brightness of the image | INPUT | When the data structure is image | | IMAGE_CONTRAST | The contrast of the image | INPUT | When the data structure is image | | BBOXES_AREA | The average area of the predicted bounding boxes | PREDICTION | When the task type is Object Detection | From ee83085824f97ebe66eb6706330819ddc934e140 Mon Sep 17 00:00:00 2001 From: Giovanni Giacometti Date: Thu, 31 Oct 2024 14:45:42 +0100 Subject: [PATCH 14/22] Second set of corrections post pr Plot configurations, more general description of monitoring --- .../monitoring/detection_event_rules.md | 10 +++++- .../monitoring/drift_explainability.md | 2 +- md-docs/user_guide/monitoring/index.md | 35 +++++++++++-------- 3 files changed, 31 insertions(+), 16 deletions(-) diff --git a/md-docs/user_guide/monitoring/detection_event_rules.md b/md-docs/user_guide/monitoring/detection_event_rules.md index ee1b826..50e5ab9 100644 --- a/md-docs/user_guide/monitoring/detection_event_rules.md +++ b/md-docs/user_guide/monitoring/detection_event_rules.md @@ -17,15 +17,23 @@ Rules are specific to a task and require the following parameters: - `actions`: A sequential list of actions to be executed when the rule is triggered. ## Detection Event Actions -Two types of actions are currently supported: notification and retrain. +Three types of actions are currently supported: notification, plot configuration and retrain. ### Notifications + +These actions send notifications to external services when a detection event is triggered. The following notification actions are available: + - `SlackNotificationAction`: sends a notification to a Slack channel via webhook. - `DiscordNotificationAction`: sends a notification to a Discord channel via webhook. - `EmailNotificationAction`: sends an email to the provided email address. - `TeamsNotificationAction`: sends a notification to Microsoft Teams via webhook. - `MqttNotificationAction`: sends a notification to an MQTT broker. +### Plot Configuration + +This action consists in creating two plot configurations when a detection event is triggered: the first one includes +data preceding the event, while the second one includes data following the event. + ### Retrain Action Retrain action lets you retrain your model. 
Therefore, it is only available when the monitoring target of the rule is related to a model. diff --git a/md-docs/user_guide/monitoring/drift_explainability.md b/md-docs/user_guide/monitoring/drift_explainability.md index 74c5603..6e10129 100644 --- a/md-docs/user_guide/monitoring/drift_explainability.md +++ b/md-docs/user_guide/monitoring/drift_explainability.md @@ -15,7 +15,7 @@ You can access the reports by navigating to the `Drift Explainability` tab in th ## Structure A Drift Explainability Report consists in comparing the reference data and the portion of production data where the drift was identified, hence -those belonging to the new concept. Notice that these reports are generated after a sufficient amount of samples has been collected after the drift. +those belonging to the new data distribution. Notice that these reports are generated after a sufficient amount of samples has been collected after the drift. This is because the elements of the report needs a significant number of samples to guarantee statistical reliability of the results. If the distribution moves back to the reference before enough samples are collected, the report might not be generated. diff --git a/md-docs/user_guide/monitoring/index.md b/md-docs/user_guide/monitoring/index.md index b438027..5709411 100644 --- a/md-docs/user_guide/monitoring/index.md +++ b/md-docs/user_guide/monitoring/index.md @@ -17,25 +17,36 @@ and the distribution of the data it is operating on. ## How does the ML cube Platform perform Monitoring? -The ML cube platform performs monitoring by employing statistical techniques to compare a certain reference (for instance, data used for training or the performance of a model -on the test set) to incoming production data. If a significant difference is detected, an alarm is raised, signaling that the monitored entity +The ML cube platform performs monitoring by employing statistical techniques to compare a certain reference (for instance, data used for training or the +performance of a model +on the test set) to incoming production data. + +These statistical techniques, also known as _monitoring algorithms_, are tailored to the type of data +being observed; for instance, univariate data requires different monitoring techniques than multivariate data. However, you don't need to worry about +the specifics of these algorithms, as the ML cube Platform takes care of selecting the most appropriate ones for your task. + +If a significant difference between reference and production data is detected, an alarm is raised, signaling that the monitored entity is drifting away from the expected behavior and that corrective actions should be taken. -In more practical terms, the [set_model_reference] method can be used to specify the time period where the reference of a given model should be placed. As a consequence, -all algorithms associated with the specified model (not just those monitoring the performance, but also those operating on the data used by the model) will -be initialized on the specified reference. Of course, you should provide to the Platform the data you want to use as a reference before calling this method, for instance using the -[add_historical_data] method. +In practical terms, you can use the SDK to specify the time period where the reference of a given model should be placed. 
+As a consequence, all algorithms associated with the specified model (not just those monitoring the performance, but also those operating +on the data used by the model) will +be initialized on the specified reference. Of course, you should provide to the +Platform the data you want to use as a reference before setting the reference itself. This can be done through the SDK as well. -After setting the reference, the [add_production_data] method can be used to send production data to the platform. This data will be analyzed by the monitoring algorithms -and, if a significant difference is detected, an alarm will be raised, in the form of a [DetectionEvent]. +After setting the reference, you can send production data to the platform, still using the SDK. This data will be analyzed by the monitoring algorithms +and, if a significant difference is detected, an alarm will be raised, in the form of a [Detection Event]. You can explore more about detection events and how you can set up automatic actions upon their reception in the [Detection Event] and the [Detection Event Rule] sections respectively. ### Targets and Metrics After explaining why monitoring is so important in modern AI systems and detailing how it is performed in the ML cube Platform, -we can introduce the concepts of Monitoring Targets and Monitoring Metrics. They both represent quantities that the ML cube Platform monitors, but they differ in their nature. -They are both automatically defined by the ML cube platform based on the [Task] attributes, such as the Task type and the data structure. +we can introduce the concepts of Monitoring Targets and Monitoring Metrics. They both represent quantities that the ML cube Platform monitors, +but they differ in their nature. +They are both automatically defined by the ML cube platform based on the [Task] attributes, such as the Task type and the data structure, + + #### Monitoring Targets @@ -111,10 +122,6 @@ You can check the status of the monitored entities in two ways: [Task]: ../task.md -[set_model_reference]: ../../api/python/client.md#set_model_reference -[add_production_data]: ../../api/python/client.md#add_production_data -[add_historical_data]: ../../api/python/client.md#add_historical_data -[DetectionEvent]: ../../api/python/models.md#detectionevent [Detection Event Rule]: detection_event_rules.md [Detection Event]: detection_event.md [MonitoringStatus]: ../../api/python/enums.md#monitoringstatus From ecd4b2683ea6116711b19d1b0fe78b428f018834 Mon Sep 17 00:00:00 2001 From: Giovanni Giacometti Date: Thu, 31 Oct 2024 17:54:10 +0100 Subject: [PATCH 15/22] Third set of corrections post pr General Detection Event description, typo fix, possible metric, values, state diagram for monitoring status --- .../user_guide/monitoring/detection_event.md | 41 +++++------ .../monitoring/drift_explainability.md | 2 +- md-docs/user_guide/monitoring/index.md | 70 ++++++++++++------- 3 files changed, 65 insertions(+), 48 deletions(-) diff --git a/md-docs/user_guide/monitoring/detection_event.md b/md-docs/user_guide/monitoring/detection_event.md index ac0bfc7..9a8657b 100644 --- a/md-docs/user_guide/monitoring/detection_event.md +++ b/md-docs/user_guide/monitoring/detection_event.md @@ -1,27 +1,29 @@ # Detection Event -A [Detection Event] is raised by the ML cube Platform when a significant change is detected in one of the entities being monitored. +A Detection Event is raised by the ML cube Platform when a significant change is detected in one of the entities being monitored. 
An event is characterized by the following attributes: -- `event_id`: the unique identifier of the event. -- `event_type`: the [DetectionEventType] of the event. -- `severity`: the [DetectionEventSeverity] of the event. The severity should be provided only for drift events and it indicates -the criticality of the detected drift. -- `monitoring_target`: the [MonitoringTarget] being monitored. -- `monitoring_metric`: the [MonitoringMetric] that triggered the event, if the event is related to a metric. -- `model_name`: the name of the model that raised the event. It's present only if the event is related to a model. -- `model_version`: the version of the model that raised the event. It's present only if the event is related to a model. -- `insert_datetime`: the time when the event was raised. -- `sample_timestamp`: the timestamp of the sample that triggered the event. -- 'sample_customer_id': the id of the customer that triggered the event. -- `user_feedback`: the feedback provided by the user on whether the event was expected or not. +- `Event Type`: the type of the event. It's possible values are: + - `Warning On`: the monitoring entity is experiencing slight changes that might lead to a drift. + - `Warning Off`: the monitoring entity has returned to the reference distribution. + - `Drift On`: the monitoring entity has drifted from the reference distribution. + - `Drift Off`: the monitoring entity has returned to the reference distribution. +- `Severity`: the severity of the event. It's provided only for drift events and it can be `Low`, `Medium`, or `High`. +- `Monitoring Target`: the Monitoring Target being monitored (see the [Monitoring] page for more details on what a Monitoring Target is). +- `Monitoring Metric`: the Monitoring Metric being monitored (see the [Monitoring] page for more details on what a Monitoring Metric is). +- `Model Name`: the name of the model that raised the event. It's present only if the event is related to a model. +- `Model Version`: the version of the model that raised the event. It's present only if the event is related to a model. +- `Insert datetime`: the time when the event was raised. +- `Sample timestamp`: the timestamp of the sample that triggered the event. +- `Sample customer ID`: the id of the customer that triggered the event. +- `User feedback`: the feedback provided by the user on whether the event was expected or not. ## Retrieve Detection Events You can access the detection events generated by the Platform in two ways: -- **SDK**: the [get_detection_events] method can be used to retrieve all detection events for a specific task programmatically. +- **SDK**: it can be used to retrieve all detection events for a specific task programmatically. - **WebApp**: navigate to the **`Detection `** section located in the task page's sidebar. Here, all detection events are displayed in a table, with multiple filtering options available for useful event management. Additionally, the latest detection events identified are shown in the Task homepage, in the section named "Latest Detection Events". @@ -30,7 +32,7 @@ You can access the detection events generated by the Platform in two ways: When a detection event is raised, you can provide feedback on whether the event was expected or not. This feedback is then used to tune the monitoring algorithms and improve their performance. The feedback can be provided through the WebApp, in the -**`Detection `** section of the task page, or through the SDK, using the [set_detection_event_user_feedback] method. 
+**`Detection `** section of the task page, or through the SDK. ## Detection Event Rules @@ -38,11 +40,6 @@ to tune the monitoring algorithms and improve their performance. The feedback ca To automate actions upon the reception of a detection event, you can set up detection event rules. You can learn more about how to configure them in the [Detection Event Rules] section. -[Detection Event]: ../../api/python/models.md#detectionevent -[DetectionEventType]: ../../api/python/enums.md#detectioneventtype -[DetectionEventSeverity]: ../../api/python/enums.md#detectioneventseverity -[MonitoringTarget]: ../../api/python/enums.md#monitoringtarget -[MonitoringMetric]: ../../api/python/enums.md#monitoringmetric -[get_detection_events]: ../../api/python/client.md#get_detection_events -[set_detection_event_user_feedback]: ../../api/python/client.md#set_detection_event_user_feedback + +[Monitoring]: index.md [Detection Event Rules]: detection_event_rules.md \ No newline at end of file diff --git a/md-docs/user_guide/monitoring/drift_explainability.md b/md-docs/user_guide/monitoring/drift_explainability.md index 6e10129..7ba320c 100644 --- a/md-docs/user_guide/monitoring/drift_explainability.md +++ b/md-docs/user_guide/monitoring/drift_explainability.md @@ -31,7 +31,7 @@ These are the entities currently available: has changed over time. This entity is available only for tasks with tabular data. - `Variable discriminative power`: it's also a bar plot displays the influence of each feature, as well as the target, in differentiating between the reference and the production datasets. - The values represent how strongly a given feature helps distinguishing the datasets, with higher values representing stronger + The values represent how strongly a given feature helps to distinguish the datasets, with higher values representing stronger separating power. This entity is available only for tasks with tabular data. [Monitoring]: index.md diff --git a/md-docs/user_guide/monitoring/index.md b/md-docs/user_guide/monitoring/index.md index 5709411..b1a767c 100644 --- a/md-docs/user_guide/monitoring/index.md +++ b/md-docs/user_guide/monitoring/index.md @@ -44,9 +44,11 @@ and the [Detection Event Rule] sections respectively. After explaining why monitoring is so important in modern AI systems and detailing how it is performed in the ML cube Platform, we can introduce the concepts of Monitoring Targets and Monitoring Metrics. They both represent quantities that the ML cube Platform monitors, but they differ in their nature. -They are both automatically defined by the ML cube platform based on the [Task] attributes, such as the Task type and the data structure, - +Targets and Metrics are defined by the ML cube platform based on the [Task] attributes, such as the Task type and the data structure, and their monitoring +is automatically enabled upon the task creation. The idea underlying defining many entities to monitor, rather than monitoring +only the model error, is to provide a comprehensive view of the model's +performance and the data distribution, easing the identification of the root causes of a drift and thus facilitating the corrective actions. #### Monitoring Targets @@ -87,44 +89,62 @@ A Monitoring Metric is a generic quantity that can be computed on a Monitoring T aspects of an entity, which might help in identifying the root cause of a drift, as well as defining the corrective actions to be taken. 
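+
+To build intuition about what a metric adds, the sketch below compares a toy text-length statistic between reference and production samples. This is an illustration only, not the Platform's actual detection logic, which relies on automatically selected statistical techniques:
+
+``` py
+# Illustrative only: a crude mean-shift check on text length,
+# the idea behind the TEXT_LENGTH metric listed in the table below.
+reference_texts = ["the quick brown fox", "lorem ipsum dolor sit amet"]
+production_texts = ["ok", "fine", "yes"]
+
+def mean_length(texts):
+    return sum(len(t) for t in texts) / len(texts)
+
+ref_mean = mean_length(reference_texts)
+prod_mean = mean_length(production_texts)
+
+# Flag a suspicious change when the mean length shifts by more than 50%.
+if abs(prod_mean - ref_mean) / ref_mean > 0.5:
+    print("The text length distribution has likely changed")
+```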
The following table displays the monitoring metrics supported, along with their monitoring target and the conditions -under which they are actually monitored. Notice that also this table is subject to changes, as new metrics will be added. - -| **Monitoring Metric** | Description | **Monitoring Target** | **Conditions** | -|:---------------------:|----------------------------------------------------------|:------------------------------------------------:|:--------------------------------------:| -| TEXT_TOXICITY | The toxicity of the text | INPUT, USER_INPUT, PREDICTION | When the data structure is text | -| TEXT_EMOTION | The emotion of the text | INPUT, USER_INPUT | When the data structure is text | -| TEXT_SENTIMENT | The sentiment of the text | INPUT, USER_INPUT | When the data structure is text | -| TEXT_LENGTH | The length of the text | INPUT, USER_INPUT, RETRIEVED_CONTEXT, PREDICTION | When the data structure is text | -| MODEL_PERPLEXITY | A measure of how well the LLM predicts the next words | PREDICTION | When the task type is RAG | -| IMAGE_BRIGHTNESS | The brightness of the image | INPUT | When the data structure is image | -| IMAGE_CONTRAST | The contrast of the image | INPUT | When the data structure is image | -| BBOXES_AREA | The average area of the predicted bounding boxes | PREDICTION | When the task type is Object Detection | -| BBOXES_QUANTITY | The average number of predicted bounding boxes per image | PREDICTION | When the task type is Object Detection | +under which they are actually monitored. The possible values that each metric can assume are also provided. +Notice that also this table is subject to changes, as new metrics will be added. + +| **Monitoring Metric** | Description | **Monitoring Target** | **Conditions** | **Possible values** | +|:---------------------:|----------------------------------------------------------|:------------------------------------------------:|:--------------------------------------:|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| TEXT_TOXICITY | The toxicity of the text | INPUT, USER_INPUT, PREDICTION | When the data structure is text | Either _neutral_ or _toxic_. | +| TEXT_EMOTION | The emotion of the text | INPUT, USER_INPUT | When the data structure is text | If the Task text language is Italian, one between these: _anger_, _joy_, _sadness_, _fear_.

Otherwise, _one_ between these: _admiration_, _amusement_, _anger_, _annoyance_, _approval_, _caring_, _confusion_, _curiosity_, _desire_, _disappointment_, _disapproval_, _disgust_, _embarrassment_, _excitement_, _fear_, _gratitude_, _grief_, _joy_, _love_, _nervousness_, _optimism_, _pride_, _realization_, _relief_, _remorse_, _sadness_, _surprise_, _neutral_. | +| TEXT_SENTIMENT | The sentiment of the text | INPUT, USER_INPUT | When the data structure is text | If the Task text language is Italian, one between these: _POSITIVE_, _NEGATIVE_. Otherwise, one between these: _negative_, _neutral_, _positive_ | +| TEXT_LENGTH | The length of the text | INPUT, USER_INPUT, RETRIEVED_CONTEXT, PREDICTION | When the data structure is text | An integer value. | +| MODEL_PERPLEXITY | A measure of how well the LLM predicts the next words | PREDICTION | When the task type is RAG | A floating point value. | +| IMAGE_BRIGHTNESS | The brightness of the image | INPUT | When the data structure is image | A floating point value. | +| IMAGE_CONTRAST | The contrast of the image | INPUT | When the data structure is image | A floating point value. | +| BBOXES_AREA | The average area of the predicted bounding boxes | PREDICTION | When the task type is Object Detection | A floating point value. | +| BBOXES_QUANTITY | The average number of predicted bounding boxes per image | PREDICTION | When the task type is Object Detection | An integer value. | ### Monitoring Status -All the entities being monitored are associated with a status, which is defined according to the enumeration [MonitoringStatus]. - -The status can be one of the following: +All the entities being monitored are associated with a status, which can be one of the following: - `OK`: the entity is behaving as expected. - `WARNING`: the entity has shown signs of drifts, but it is still within the acceptable range. - `DRIFT`: the entity has experienced a significant change and corrective actions should be taken. +The following diagram illustrates the possible transitions between the statuses. +Each transition is triggered by a [Detection Event] and the status of the entity is updated accordingly. + +
+ +```mermaid +stateDiagram-v2 + direction LR + + [*] --> OK : Initial State + + OK --> WARNING : Warning On + WARNING --> OK : Warning Off + + WARNING --> DRIFT : Drift On + DRIFT --> WARNING : Drift Off + + DRIFT --> OK : Drift Off +``` +
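+
+To make the transitions concrete, here is a small illustrative lookup that mirrors the diagram above; statuses and event types are plain strings here, not SDK types:
+
+``` py
+# Illustrative sketch of the status transitions shown in the diagram.
+def next_status(current, event_type):
+    transitions = {
+        ("OK", "Warning On"): "WARNING",
+        ("WARNING", "Warning Off"): "OK",
+        ("WARNING", "Drift On"): "DRIFT",
+        # As noted below, a drift off event may lead back to either
+        # WARNING or OK; this sketch conservatively returns WARNING.
+        ("DRIFT", "Drift Off"): "WARNING",
+    }
+    return transitions.get((current, event_type), current)
+
+assert next_status("OK", "Warning On") == "WARNING"
+assert next_status("WARNING", "Drift On") == "DRIFT"
+```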
+ +Notice that a drift off event can either bring the entity back to the `OK` status or to the `WARNING` status, +depending on the velocity of the change and the monitoring algorithm's sensitivity. + You can check the status of the monitored entities in two ways: - **WebApp**: The homepage of the task displays the status of both monitoring targets and metrics. -- **SDK**: The [get_monitoring_status] method can be used to retrieve the status of the monitored entities programmatically. - This method returns a [MonitoringQuantityStatus], a BaseModel holding the status of the monitoring entity requested. - Otherwise, you can use the [get_task] method, which returns a BaseModel with all the information related to a task, including - the list of [MonitoringQuantityStatus] for all the entities monitored in the task. +- **SDK**: there are a couple of methods to retrieve the status of the monitored entities programmatically. You can either get the status of a specific entity + or retrieve the status of all the entities associated with a task. [Task]: ../task.md [Detection Event Rule]: detection_event_rules.md [Detection Event]: detection_event.md -[MonitoringStatus]: ../../api/python/enums.md#monitoringstatus -[get_monitoring_status]: ../../api/python/client.md#get_monitoring_status -[MonitoringQuantityStatus]: ../../api/python/models.md#monitoringquantitystatus [get_task]: ../../api/python/client.md#get_task \ No newline at end of file From a638bc321cfbbd2014ffbb843a11b996bc47cd11 Mon Sep 17 00:00:00 2001 From: Giovanni Giacometti Date: Thu, 31 Oct 2024 18:02:58 +0100 Subject: [PATCH 16/22] Detection event rules page --- .../monitoring/detection_event_rules.md | 33 +++++++++---------- 1 file changed, 15 insertions(+), 18 deletions(-) diff --git a/md-docs/user_guide/monitoring/detection_event_rules.md b/md-docs/user_guide/monitoring/detection_event_rules.md index 50e5ab9..75f9ad0 100644 --- a/md-docs/user_guide/monitoring/detection_event_rules.md +++ b/md-docs/user_guide/monitoring/detection_event_rules.md @@ -5,16 +5,16 @@ This section outlines how to configure automation to receive notifications or st When a detection event is produced, the ML cube Platform reviews all the detection event rules you have set and triggers those matching the event. -Rules are specific to a task and require the following parameters: +Rules are specific to a task and are characterized by the following attributes: -- `name`: a descriptive label of the rule. -- `task_id`: the unique identifier of the task to which the rule belongs. -- `detection_event_type`: the [DetectionEventType] that the rule matches. -- `severity`: the [DetectionEventSeverity] of the event. -- `monitoring_target`: the monitoring target whose event should trigger the rule. -- `model_name`: the name of the model to which the rule applies. This is only required when the monitoring target is related to a model +- `Name`: a descriptive label of the rule. +- `Detection Event Type`: the type of event that triggers the rule. +- `Severity`: the severity of the event that triggers the rule. If not specified, the rule will be triggered by events of any severity. +- `Monitoring Target`: the monitoring target whose event should trigger the rule. +- `Monitoring Metric`: the monitoring metric whose event should trigger the rule. +- `Model name`: the name of the model to which the rule applies. This is only required when the monitoring target is related to a model (such as `ERROR` or `PREDICTION`). 
-- `actions`: A sequential list of actions to be executed when the rule is triggered. +- `Actions`: A list of actions to be executed sequentially when the rule is triggered. ## Detection Event Actions Three types of actions are currently supported: notification, plot configuration and retrain. @@ -23,11 +23,11 @@ Three types of actions are currently supported: notification, plot configuration These actions send notifications to external services when a detection event is triggered. The following notification actions are available: -- `SlackNotificationAction`: sends a notification to a Slack channel via webhook. -- `DiscordNotificationAction`: sends a notification to a Discord channel via webhook. -- `EmailNotificationAction`: sends an email to the provided email address. -- `TeamsNotificationAction`: sends a notification to Microsoft Teams via webhook. -- `MqttNotificationAction`: sends a notification to an MQTT broker. +- `Slack Notification`: sends a notification to a Slack channel via webhook. +- `Discord Notification`: sends a notification to a Discord channel via webhook. +- `Email Notification`: sends an email to the provided email address. +- `Teams Notification`: sends a notification to Microsoft Teams via webhook. +- `Mqtt Notification`: sends a notification to an MQTT broker. ### Plot Configuration @@ -37,7 +37,7 @@ data preceding the event, while the second one includes data following the event ### Retrain Action Retrain action lets you retrain your model. Therefore, it is only available when the monitoring target of the rule is related to a model. -The retrain action does not need any parameter because it is automatically inferred from the `model_name` attribute of the rule. +The retrain action does not need any parameter because it is automatically inferred from the `Model Name` attribute of the rule. Of course, the model must already have a retrain trigger associated before setting up this action. !!! 
example @@ -63,7 +63,4 @@ Of course, the model must already have a retrain trigger associated before setti ) ``` -[add_historical_data]: ../../api/python/client.md#add_historical_data -[Detection Event]: detection_event.md -[DetectionEventType]: ../../api/python/enums.md#detectioneventtype -[DetectionEventSeverity]: ../../api/python/enums.md#detectioneventseverity \ No newline at end of file +[Detection Event]: detection_event.md \ No newline at end of file From b0dda3cc80e7613ffc7a79496d0cba2520aa2831 Mon Sep 17 00:00:00 2001 From: Giovanni Giacometti Date: Mon, 4 Nov 2024 10:03:04 +0100 Subject: [PATCH 17/22] Adapted new style; small rewrites --- md-docs/stylesheets/extra.css | 4 +++ md-docs/user_guide/data.md | 4 +-- md-docs/user_guide/modules/index.md | 4 +-- .../user_guide/monitoring/detection_event.md | 16 +++++---- .../monitoring/detection_event_rules.md | 6 ++-- .../monitoring/drift_explainability.md | 5 ++- md-docs/user_guide/monitoring/index.md | 34 ++++++++++--------- mkdocs.yml | 3 +- 8 files changed, 42 insertions(+), 34 deletions(-) diff --git a/md-docs/stylesheets/extra.css b/md-docs/stylesheets/extra.css index a78dfad..1677134 100644 --- a/md-docs/stylesheets/extra.css +++ b/md-docs/stylesheets/extra.css @@ -30,4 +30,8 @@ background-color: rgb(43, 155, 70); -webkit-mask-image: var(--md-admonition-icon--code-block); mask-image: var(--md-admonition-icon--code-block); +} + +.nice-list ul{ + list-style-type: circle; } \ No newline at end of file diff --git a/md-docs/user_guide/data.md b/md-docs/user_guide/data.md index a113c94..10f85cf 100644 --- a/md-docs/user_guide/data.md +++ b/md-docs/user_guide/data.md @@ -19,7 +19,7 @@ Available categories are: The [Data Schema] created for the [Task] contains a list of Column objects, each of which has a _Role_. Naturally, there is a relationship between the Column's Role and the Data Category. In fact, each Data Category comprises a set of Column objects with certain Roles. -So that, when you upload samples belonging to a Data Category, they must contains all the Columns objects declared on the Data Schema to be considered valid. +When you upload samples belonging to a Data Category, they must contain all the Columns objects declared on the Data Schema to be considered valid. The following table shows these relationships: @@ -130,7 +130,7 @@ For RAG Tasks, reference data can be used to indicate the type of data expected You can set reference data as follow: ``` py - job_id = job_id = client.set_model_reference( + job_id = client.set_model_reference( model_id=model_id, from_timestamp=from_timestamp, to_timestamp=to_timestamp, diff --git a/md-docs/user_guide/modules/index.md b/md-docs/user_guide/modules/index.md index 879ac58..375976c 100644 --- a/md-docs/user_guide/modules/index.md +++ b/md-docs/user_guide/modules/index.md @@ -13,7 +13,7 @@ Modules can be always active or on-demand: Monitoring module and Drift Explainab Data drift detection over data. - [:octicons-arrow-right-24: More info](user_guide/company.md) + [:octicons-arrow-right-24: More info](../monitoring/index.md) - :material-compare:{ .lg .middle } **Drift Explainability** @@ -21,7 +21,7 @@ Modules can be always active or on-demand: Monitoring module and Drift Explainab Understand the nature of detected drift. 
- [:octicons-arrow-right-24: More info](user_guide/modules/index.md) + [:octicons-arrow-right-24: More info](../monitoring/drift_explainability.md) - :material-speedometer:{ .lg .middle } **Retraining** diff --git a/md-docs/user_guide/monitoring/detection_event.md b/md-docs/user_guide/monitoring/detection_event.md index 9a8657b..27d2ff6 100644 --- a/md-docs/user_guide/monitoring/detection_event.md +++ b/md-docs/user_guide/monitoring/detection_event.md @@ -5,13 +5,17 @@ A Detection Event is raised by the ML cube Platform when a significant change is An event is characterized by the following attributes: - `Event Type`: the type of the event. It's possible values are: - - `Warning On`: the monitoring entity is experiencing slight changes that might lead to a drift. - - `Warning Off`: the monitoring entity has returned to the reference distribution. - - `Drift On`: the monitoring entity has drifted from the reference distribution. - - `Drift Off`: the monitoring entity has returned to the reference distribution. +
    - `Warning On`: the monitoring entity is experiencing slight changes that might lead to a drift.
    - `Warning Off`: the monitoring entity has returned to the reference distribution.
    - `Drift On`: the monitoring entity has drifted from the reference distribution.
    - `Drift Off`: the monitoring entity has returned to the reference distribution.
- `Severity`: the severity of the event. It's provided only for drift events and it can be `Low`, `Medium`, or `High`. -- `Monitoring Target`: the Monitoring Target being monitored (see the [Monitoring] page for more details on what a Monitoring Target is). -- `Monitoring Metric`: the Monitoring Metric being monitored (see the [Monitoring] page for more details on what a Monitoring Metric is). +- `Monitoring Target`: the [Monitoring Target](index.md#monitoring-metrics) being monitored. +- `Monitoring Metric`: the [Monitoring Metric](index.md#monitoring-metrics) being monitored. - `Model Name`: the name of the model that raised the event. It's present only if the event is related to a model. - `Model Version`: the version of the model that raised the event. It's present only if the event is related to a model. - `Insert datetime`: the time when the event was raised. diff --git a/md-docs/user_guide/monitoring/detection_event_rules.md b/md-docs/user_guide/monitoring/detection_event_rules.md index 75f9ad0..f20fcb7 100644 --- a/md-docs/user_guide/monitoring/detection_event_rules.md +++ b/md-docs/user_guide/monitoring/detection_event_rules.md @@ -9,9 +9,9 @@ Rules are specific to a task and are characterized by the following attributes: - `Name`: a descriptive label of the rule. - `Detection Event Type`: the type of event that triggers the rule. -- `Severity`: the severity of the event that triggers the rule. If not specified, the rule will be triggered by events of any severity. -- `Monitoring Target`: the monitoring target whose event should trigger the rule. -- `Monitoring Metric`: the monitoring metric whose event should trigger the rule. +- `Severity`: the severity of the event that triggers the rule. It is only applicable to drift events. If not specified, the rule will be triggered by drift events of any severity. +- `Monitoring Target`: the [Monitoring Target](index.md#monitoring-targets) whose event should trigger the rule. +- `Monitoring Metric`: the [Monitoring Metric](index.md#monitoring-metrics) whose event should trigger the rule. - `Model name`: the name of the model to which the rule applies. This is only required when the monitoring target is related to a model (such as `ERROR` or `PREDICTION`). - `Actions`: A list of actions to be executed sequentially when the rule is triggered. diff --git a/md-docs/user_guide/monitoring/drift_explainability.md b/md-docs/user_guide/monitoring/drift_explainability.md index 7ba320c..d13f41d 100644 --- a/md-docs/user_guide/monitoring/drift_explainability.md +++ b/md-docs/user_guide/monitoring/drift_explainability.md @@ -6,7 +6,7 @@ ensuring the model continues to function as expected. However, monitoring only i In order to make the right decisions, you need to understand what were the main factors that led to the drift in the first place, so that the correct actions can be taken to mitigate it. -The ML cube Platform supports this process by offering what we refer to as "**Drift Explainability Reports**", +The ML cube Platform supports this process by offering what we refer to as **Drift Explainability Reports**, automatically generated upon the detection of a drift and containing several elements that should help you diagnose the root causes of the change occurred. @@ -20,7 +20,7 @@ This is because the elements of the report needs a significant number of samples If the distribution moves back to the reference before enough samples are collected, the report might not be generated. 
Each report is composed of several entities, each providing a different perspective on the data and the drift occurred. -Most of them are specific to a certain [Data Structure], so they might not be available for all tasks. +Most of them are specific to a certain `Data Structure`, so they might not be available for all tasks. These entities can take the form of tables, plots, or textual explanations. Observed and analyzed together, they should provide a comprehensive understanding of the drift and its underlying causes. @@ -35,4 +35,3 @@ These are the entities currently available: separating power. This entity is available only for tasks with tabular data. [Monitoring]: index.md -[Data Structure]: ../../api/python/enums.md#datastructure \ No newline at end of file diff --git a/md-docs/user_guide/monitoring/index.md b/md-docs/user_guide/monitoring/index.md index b1a767c..17ffaad 100644 --- a/md-docs/user_guide/monitoring/index.md +++ b/md-docs/user_guide/monitoring/index.md @@ -2,7 +2,7 @@ The monitoring module is a key feature of the ML cube Platform. It enables continuous tracking of AI models performance over time, helping to identify potential issues. -Additionally, it allows the monitoring of production data to preemptively detect distribution changes, ensuring +It also implements the monitoring of production data to preemptively detect distribution changes, ensuring that the model continues to perform as expected and aligns with business requirements. ## Why do you need Monitoring? @@ -13,16 +13,16 @@ These distributional changes, if not addressed properly, can cause a drop in the in turn can have a negative impact on the business. Monitoring, also known as __Drift Detection__ in the literature, refers the process of continuously tracking the performance of a model -and the distribution of the data it is operating on. +and the distribution of the data it is operating on to identify significant changes. ## How does the ML cube Platform perform Monitoring? The ML cube platform performs monitoring by employing statistical techniques to compare a certain reference (for instance, data used for training or the -performance of a model -on the test set) to incoming production data. +performance of a model on the test set) to incoming production data. These statistical techniques, also known as _monitoring algorithms_, are tailored to the type of data -being observed; for instance, univariate data requires different monitoring techniques than multivariate data. However, you don't need to worry about +being observed; for instance, univariate data requires different monitoring techniques than multivariate data. +However, you don't need to worry about the specifics of these algorithms, as the ML cube Platform takes care of selecting the most appropriate ones for your task. If a significant difference between reference and production data is detected, an alarm is raised, signaling that the monitored entity @@ -50,6 +50,8 @@ is automatically enabled upon the task creation. The idea underlying defining ma only the model error, is to provide a comprehensive view of the model's performance and the data distribution, easing the identification of the root causes of a drift and thus facilitating the corrective actions. +![Monitoring Targets and Metrics overview](../../imgs/monitoring-overview.svg) + #### Monitoring Targets A Monitoring Target is a relevant entity involved in a [Task]. 
@@ -50,6 +50,8 @@ is automatically enabled upon the task creation. The idea underlying defining ma
 only the model error, is to provide a comprehensive view of the model's performance and the data distribution, easing
 the identification of the root causes of a drift and thus facilitating the corrective actions.
 
+![Monitoring Targets and Metrics overview](../../imgs/monitoring-overview.svg)
+
 #### Monitoring Targets
 
 A Monitoring Target is a relevant entity involved in a [Task].
 They represent the main quantities monitored by the platform, those whose
@@ -92,17 +94,17 @@ The following table displays the monitoring metrics supported, along with their
 under which they are actually monitored. The possible values that each metric can assume are also provided.
 Notice that this table is also subject to change, as new metrics will be added.
 
-| **Monitoring Metric** | Description | **Monitoring Target** | **Conditions** | **Possible values** |
-|:---:|---|:---:|:---:|:---|
-| TEXT_TOXICITY | The toxicity of the text | INPUT, USER_INPUT, PREDICTION | When the data structure is text | Either _neutral_ or _toxic_. |
-| TEXT_EMOTION | The emotion of the text | INPUT, USER_INPUT | When the data structure is text | If the Task text language is Italian, one between these: _anger_, _joy_, _sadness_, _fear_. Otherwise, _one_ between these: _admiration_, _amusement_, _anger_, _annoyance_, _approval_, _caring_, _confusion_, _curiosity_, _desire_, _disappointment_, _disapproval_, _disgust_, _embarrassment_, _excitement_, _fear_, _gratitude_, _grief_, _joy_, _love_, _nervousness_, _optimism_, _pride_, _realization_, _relief_, _remorse_, _sadness_, _surprise_, _neutral_. |
-| TEXT_SENTIMENT | The sentiment of the text | INPUT, USER_INPUT | When the data structure is text | If the Task text language is Italian, one between these: _POSITIVE_, _NEGATIVE_. Otherwise, one between these: _negative_, _neutral_, _positive_ |
-| TEXT_LENGTH | The length of the text | INPUT, USER_INPUT, RETRIEVED_CONTEXT, PREDICTION | When the data structure is text | An integer value. |
-| MODEL_PERPLEXITY | A measure of how well the LLM predicts the next words | PREDICTION | When the task type is RAG | A floating point value. |
-| IMAGE_BRIGHTNESS | The brightness of the image | INPUT | When the data structure is image | A floating point value. |
-| IMAGE_CONTRAST | The contrast of the image | INPUT | When the data structure is image | A floating point value. |
-| BBOXES_AREA | The average area of the predicted bounding boxes | PREDICTION | When the task type is Object Detection | A floating point value. |
-| BBOXES_QUANTITY | The average number of predicted bounding boxes per image | PREDICTION | When the task type is Object Detection | An integer value. |
+| **Monitoring Metric** | Description | **Monitoring Target** | **Conditions** | **Possible values** |
+|:---:|---|:---:|:---:|:---|
+| TEXT_TOXICITY | The toxicity of the text | INPUT, USER_INPUT, PREDICTION | When the data structure is text | Either _neutral_ or _toxic_. |
+| TEXT_EMOTION | The emotion of the text | INPUT, USER_INPUT | When the data structure is text | If the Task text language is Italian, one of: _anger_, _joy_, _sadness_, _fear_. <br> Otherwise, one of: _admiration_, _amusement_, _anger_, _annoyance_, _approval_, _caring_, _confusion_, _curiosity_, _desire_, _disappointment_, _disapproval_, _disgust_, _embarrassment_, _excitement_, _fear_, _gratitude_, _grief_, _joy_, _love_, _nervousness_, _optimism_, _pride_, _realization_, _relief_, _remorse_, _sadness_, _surprise_, _neutral_. |
+| TEXT_SENTIMENT | The sentiment of the text | INPUT, USER_INPUT | When the data structure is text | If the Task text language is Italian, one of: _POSITIVE_, _NEGATIVE_. Otherwise, one of: _negative_, _neutral_, _positive_. |
+| TEXT_LENGTH | The length of the text | INPUT, USER_INPUT, RETRIEVED_CONTEXT, PREDICTION | When the data structure is text | An integer value. |
+| MODEL_PERPLEXITY | A measure of the uncertainty of an LLM when predicting the next words | PREDICTION | When the task type is RAG | A floating point value. |
+| IMAGE_BRIGHTNESS | The brightness of the image | INPUT | When the data structure is image | A floating point value. |
+| IMAGE_CONTRAST | The contrast of the image | INPUT | When the data structure is image | A floating point value. |
+| BBOXES_AREA | The average area of the predicted bounding boxes | PREDICTION | When the task type is Object Detection | A floating point value. |
+| BBOXES_QUANTITY | The average number of predicted bounding boxes per image | PREDICTION | When the task type is Object Detection | An integer value. |
 
 ### Monitoring Status
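
The metrics in the table above are per-sample summary statistics that the platform then tracks
over time with the same drift detection machinery. As a rough illustration of the kind of
quantity involved, and not the platform's exact definitions, metrics such as TEXT_LENGTH,
IMAGE_BRIGHTNESS, and IMAGE_CONTRAST could be computed along these lines:

```python
# Illustrative only: plausible per-sample statistics behind some of the
# monitoring metrics listed above. The platform's definitions may differ.
import numpy as np

def text_length(text: str) -> int:
    """A TEXT_LENGTH-style metric: number of characters in the text."""
    return len(text)

def image_brightness(image: np.ndarray) -> float:
    """An IMAGE_BRIGHTNESS-style metric: mean intensity of a grayscale image."""
    return float(image.mean())

def image_contrast(image: np.ndarray) -> float:
    """An IMAGE_CONTRAST-style metric: standard deviation of the intensities."""
    return float(image.std())

grayscale = np.random.default_rng(0).integers(0, 256, size=(64, 64))
print(text_length("What is the refund policy?"))
print(image_brightness(grayscale), image_contrast(grayscale))
```
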
diff --git a/mkdocs.yml b/mkdocs.yml
index 2b92f29..2d9cae5 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -138,8 +138,7 @@ nav:
 - Modules:
   - user_guide/modules/index.md
-  - user_guide/modules/monitoring.md
-  - user_guide/detection_event_rules.md
+#  - user_guide/modules/monitoring.md
   - user_guide/modules/retraining.md
   - user_guide/modules/topic_modeling.md
   - user_guide/modules/rag_evaluation.md

From ed2152f24bacbc3a437c266961c32e8397755ecf Mon Sep 17 00:00:00 2001
From: Giovanni Giacometti
Date: Mon, 4 Nov 2024 11:22:12 +0100
Subject: [PATCH 18/22] Drift Explainability Report Images

---
 .../drift-explainability/concept-fi.svg       |  1 +
 .../monitoring/drift-explainability/fi.svg    | 53 +++++++++++++++++++
 .../monitoring/drift-explainability/score.svg | 53 +++++++++++++++++++
 .../monitoring/drift_explainability.md        | 35 ++++++++++--
 md-docs/user_guide/monitoring/index.md        |  5 +-
 5 files changed, 142 insertions(+), 5 deletions(-)
 create mode 100644 md-docs/imgs/monitoring/drift-explainability/concept-fi.svg
 create mode 100644 md-docs/imgs/monitoring/drift-explainability/fi.svg
 create mode 100644 md-docs/imgs/monitoring/drift-explainability/score.svg

diff --git a/md-docs/imgs/monitoring/drift-explainability/concept-fi.svg b/md-docs/imgs/monitoring/drift-explainability/concept-fi.svg
new file mode 100644
index 0000000..4e46236
--- /dev/null
+++ b/md-docs/imgs/monitoring/drift-explainability/concept-fi.svg
@@ -0,0 +1 @@
+[SVG markup omitted: bar chart of the discriminative power of each variable (X_0 to X_10 and the target y)]
\ No newline at end of file
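
The `concept-fi.svg` chart above displays the discriminative power of each variable. One common
way to obtain such values, shown here purely as an illustration and not necessarily the
platform's method, is to train a domain classifier that separates reference from production
samples and read its feature importances: the features that help the classifier succeed are the
ones whose distribution changed.

```python
# Illustrative domain-classifier sketch of "variable discriminative power".
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_samples, n_features = 1_000, 5
reference = rng.normal(0.0, 1.0, size=(n_samples, n_features))
production = rng.normal(0.0, 1.0, size=(n_samples, n_features))
production[:, 2] += 1.0  # only feature X_2 drifts in this toy example

X = np.vstack([reference, production])
y = np.concatenate([np.zeros(n_samples), np.ones(n_samples)])  # 0=ref, 1=prod

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
for i, importance in enumerate(clf.feature_importances_):
    print(f"X_{i}: discriminative power ~ {importance:.3f}")
```
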
diff --git a/md-docs/imgs/monitoring/drift-explainability/fi.svg b/md-docs/imgs/monitoring/drift-explainability/fi.svg
new file mode 100644
index 0000000..777e0e9
--- /dev/null
+++ b/md-docs/imgs/monitoring/drift-explainability/fi.svg
@@ -0,0 +1,53 @@
+[SVG markup omitted: grouped bar chart comparing the Feature Importance of each feature (X_0 to X_10) between the Reference and Production datasets]
\ No newline at end of file
diff --git a/md-docs/imgs/monitoring/drift-explainability/score.svg b/md-docs/imgs/monitoring/drift-explainability/score.svg
new file mode 100644
index 0000000..ed98365
--- /dev/null
+++ b/md-docs/imgs/monitoring/drift-explainability/score.svg
@@ -0,0 +1,53 @@
+[SVG markup omitted: line plot of the drift score over time (07 Jul to 27 Jul) showing the test statistic, the threshold, and Warning On / Drift On (Low, Medium, High) event markers]
\ No newline at end of file
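
The `score.svg` figure above plots a drift score over time against a threshold. Conceptually,
such a score is a statistical distance between a sliding window of production data and the
reference, as the `Drift Score` entity documented in the next file explains. A toy version,
assuming a Kolmogorov-Smirnov statistic as the distance (the platform's actual score, windowing,
and event-offset postprocessing are not shown), could look like this:

```python
# Illustrative sliding-window drift score (not the platform's algorithm).
import numpy as np
from scipy.stats import ks_2samp

def drift_scores(reference, production, window=200):
    """KS statistic of each consecutive production window vs the reference."""
    return [
        ks_2samp(reference, production[start:start + window]).statistic
        for start in range(0, len(production) - window + 1, window)
    ]

rng = np.random.default_rng(42)
reference = rng.normal(0.0, 1.0, size=5_000)
# A production stream whose mean slowly drifts away from the reference.
production = rng.normal(np.linspace(0.0, 1.5, 2_000), 1.0)

THRESHOLD = 0.15  # the value the score must exceed to raise a drift alarm
for i, score in enumerate(drift_scores(reference, production)):
    print(f"window {i}: score={score:.3f}", "DRIFT ON" if score > THRESHOLD else "ok")
```
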
diff --git a/md-docs/user_guide/monitoring/drift_explainability.md b/md-docs/user_guide/monitoring/drift_explainability.md
index d13f41d..bbd082c 100644
--- a/md-docs/user_guide/monitoring/drift_explainability.md
+++ b/md-docs/user_guide/monitoring/drift_explainability.md
@@ -10,14 +10,14 @@ The ML cube Platform supports this process by offering what we refer to as **Dri
 automatically generated upon the detection of a drift and containing several elements that should help you
 diagnose the root causes of the change that occurred.
 
-You can access the reports by navigating to the `Drift Explainability` tab in the sidebar of the task page.
+You can access the reports in the WebApp by navigating to the `Drift Explainability` tab in the sidebar of the Task page.
 
 ## Structure
 
 A Drift Explainability Report consists of comparing the reference data and the portion of production data where the drift was identified, hence
-those belonging to the new data distribution. Notice that these reports are generated after a sufficient amount of samples has been collected after the drift.
-This is because the elements of the report needs a significant number of samples to guarantee statistical reliability of the results.
-If the distribution moves back to the reference before enough samples are collected, the report might not be generated.
+those belonging to the new data distribution. Notice that these reports are generated after a sufficient number of samples has been collected
+after the drift, in order to ensure the statistical reliability of the results.
+If the data distribution moves back to the reference before enough samples are collected, the report might not be generated.
 
 Each report is composed of several entities, each providing a different perspective on the data and the drift that occurred.
 Most of them are specific to a certain `Data Structure`, so they might not be available for all tasks.
@@ -29,9 +29,36 @@ These are the entities currently available:
 
 - `Feature Importance`: it's a bar plot that illustrates how the significance of each feature differs between the reference
   and the production datasets. Variations in a feature's importance might suggest that its contribution to the model's
   predictions has changed over time. This entity is available only for tasks with tabular data.
+
+<figure markdown>
+    ![Feature Importance](../../imgs/monitoring/drift-explainability/fi.svg)
+    <figcaption>Example of a feature importance plot.</figcaption>
+</figure>
+
 - `Variable discriminative power`: it's also a bar plot, displaying the influence of each feature, as well as the target,
   in differentiating between the reference and the production datasets. The values represent how strongly a given feature helps
   to distinguish the datasets, with higher values representing stronger
   separating power. This entity is available only for tasks with tabular data.
+
+<figure markdown>
+    ![Variable discriminative power](../../imgs/monitoring/drift-explainability/concept-fi.svg)
+    <figcaption>Example of a variable discriminative power plot.</figcaption>
+</figure>
+
+- `Drift Score`: it's a line plot that shows the evolution of the drift score over time. The drift score is a
+  measure of the statistical distance between a sliding window of the production data and the reference data. It also shows the threshold,
+  which is the value that the drift score must exceed to raise a drift alarm, and all the [Detection Events] that were triggered in
+  the time frame of the report. This plot helps in understanding how the drift evolved over time and the moments in which the difference
+  between the two datasets was higher. Notice that some postprocessing is applied to the events to account for the functioning of the drift detection algorithms.
+  Specifically, we shift back the drift on events by a certain offset, aiming to point at the precise time when the drift actually started. As a result,
+  drift on events might be shown before the threshold is exceeded. This entity is available for all tasks.
+
+<figure markdown>
+    ![Drift score](../../imgs/monitoring/drift-explainability/score.svg)
+    <figcaption>Example of a drift score plot with detection events of increasing severity displayed.</figcaption>
+</figure>
+
 [Monitoring]: index.md
+[Detection Events]: detection_event.md
\ No newline at end of file
diff --git a/md-docs/user_guide/monitoring/index.md b/md-docs/user_guide/monitoring/index.md
index 17ffaad..c3c7208 100644
--- a/md-docs/user_guide/monitoring/index.md
+++ b/md-docs/user_guide/monitoring/index.md
@@ -50,7 +50,10 @@ is automatically enabled upon the task creation. The idea underlying defining ma
 only the model error, is to provide a comprehensive view of the model's performance and the data distribution, easing
 the identification of the root causes of a drift and thus facilitating the corrective actions.
 
-![Monitoring Targets and Metrics overview](../../imgs/monitoring-overview.svg)
+<figure markdown>
+    ![Monitoring Overview](../../imgs/monitoring-overview.svg)
+    <figcaption>Monitoring Targets and Metrics overview</figcaption>
+</figure>
 
 #### Monitoring Targets

From db4c759351e1fc3d2ca48a32e440ceb09f0c84af Mon Sep 17 00:00:00 2001
From: Giovanni Giacometti
Date: Mon, 4 Nov 2024 12:18:57 +0100
Subject: [PATCH 19/22] Monitoring overview image

---
 md-docs/imgs/monitoring/overview.svg   | 1 +
 md-docs/user_guide/monitoring/index.md | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)
 create mode 100644 md-docs/imgs/monitoring/overview.svg

diff --git a/md-docs/imgs/monitoring/overview.svg b/md-docs/imgs/monitoring/overview.svg
new file mode 100644
index 0000000..668eb81
--- /dev/null
+++ b/md-docs/imgs/monitoring/overview.svg
@@ -0,0 +1 @@
+[SVG markup omitted: diagram of the monitored quantities (Input, Prediction, True value, Concept, Error), each with its associated Metrics]
\ No newline at end of file
diff --git a/md-docs/user_guide/monitoring/index.md b/md-docs/user_guide/monitoring/index.md
index c3c7208..4aa51f8 100644
--- a/md-docs/user_guide/monitoring/index.md
+++ b/md-docs/user_guide/monitoring/index.md
@@ -51,7 +51,7 @@ only the model error, is to provide a comprehensive view of the model's performance and the data distribution, easing
 the identification of the root causes of a drift and thus facilitating the corrective actions.
 
 <figure markdown>
-    ![Monitoring Overview](../../imgs/monitoring-overview.svg)
+    ![Monitoring Overview](../../imgs/monitoring/overview.svg)
     <figcaption>Monitoring Targets and Metrics overview</figcaption>
 </figure>

From 31ff43c5f59df8ffad85bd99e8963873b6c6e7af Mon Sep 17 00:00:00 2001
From: Giovanni Giacometti
Date: Mon, 4 Nov 2024 12:26:46 +0100
Subject: [PATCH 20/22] Last minor fixes

---
 md-docs/user_guide/monitoring/detection_event_rules.md | 4 ++--
 md-docs/user_guide/monitoring/drift_explainability.md  | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/md-docs/user_guide/monitoring/detection_event_rules.md b/md-docs/user_guide/monitoring/detection_event_rules.md
index f20fcb7..69c4036 100644
--- a/md-docs/user_guide/monitoring/detection_event_rules.md
+++ b/md-docs/user_guide/monitoring/detection_event_rules.md
@@ -34,9 +34,9 @@ These actions send notifications to external services when a detection event is
 This action consists of creating two plot configurations when a detection event is triggered:
 the first one includes data preceding the event, while the second one includes data following the event.
 
-### Retrain Action
+### Retrain
 
-Retrain action lets you retrain your model. Therefore, it is only available when the monitoring target of the rule is related to a model.
+The Retrain action enables the automatic retraining of your model. Therefore, it is only available when the target of the rule is related to a model.
 The retrain action does not need any parameter, because the model to retrain is automatically inferred from the `Model Name` attribute of the rule.
 Of course, the model must already have a retrain trigger associated with it before setting up this action.
diff --git a/md-docs/user_guide/monitoring/drift_explainability.md b/md-docs/user_guide/monitoring/drift_explainability.md
index bbd082c..dd0575e 100644
--- a/md-docs/user_guide/monitoring/drift_explainability.md
+++ b/md-docs/user_guide/monitoring/drift_explainability.md
@@ -52,7 +52,7 @@ These are the entities currently available:
   between the two datasets was higher. Notice that some postprocessing is applied to the events to account for the functioning of the drift detection algorithms.
   Specifically, we shift back the drift on events by a certain offset, aiming to point at the precise time when the drift actually started. As a result,
-  drift on events might be shown before the threshold is exceeded. This entity is available for all tasks.
+  drift on events might be shown before the threshold is exceeded. This explainability entity is available for all tasks.
From fa3a04f336c17b9acf6475a8f39926066190088d Mon Sep 17 00:00:00 2001 From: Giovanni Giacometti Date: Mon, 4 Nov 2024 13:21:29 +0100 Subject: [PATCH 21/22] New monitoring status state diagram image --- md-docs/imgs/monitoring/states.svg | 1 + md-docs/user_guide/monitoring/index.md | 31 +++++++++----------------- 2 files changed, 12 insertions(+), 20 deletions(-) create mode 100644 md-docs/imgs/monitoring/states.svg diff --git a/md-docs/imgs/monitoring/states.svg b/md-docs/imgs/monitoring/states.svg new file mode 100644 index 0000000..764f8c6 --- /dev/null +++ b/md-docs/imgs/monitoring/states.svg @@ -0,0 +1 @@ +.InitialstateOKWARNINGDRIFTDrift ONWarningONDrift OFFWarning OFFSet newreferenceDrift ONSetnewreferenceDrift OFF \ No newline at end of file diff --git a/md-docs/user_guide/monitoring/index.md b/md-docs/user_guide/monitoring/index.md index 4aa51f8..df43f04 100644 --- a/md-docs/user_guide/monitoring/index.md +++ b/md-docs/user_guide/monitoring/index.md @@ -121,26 +121,17 @@ All the entities being monitored are associated with a status, which can be one The following diagram illustrates the possible transitions between the statuses. Each transition is triggered by a [Detection Event] and the status of the entity is updated accordingly. -
- -```mermaid -stateDiagram-v2 - direction LR - - [*] --> OK : Initial State - - OK --> WARNING : Warning On - WARNING --> OK : Warning Off - - WARNING --> DRIFT : Drift On - DRIFT --> WARNING : Drift Off - - DRIFT --> OK : Drift Off -``` -
- -Notice that a drift off event can either bring the entity back to the `OK` status or to the `WARNING` status, -depending on the velocity of the change and the monitoring algorithm's sensitivity. +
+
+<figure markdown>
+    ![Monitoring status](../../imgs/monitoring/states.svg)
+    <figcaption>Monitoring status state diagram</figcaption>
+</figure>
+
+Notice that a Drift OFF event can either bring the entity back to the `OK` status or to the `WARNING` status,
+depending on the velocity of the change and the monitoring algorithm's sensitivity. The same applies
+to Drift ON events, which can occur both when the entity is in the `WARNING` status and when it is in the `OK` status.
+
+The only transition that is not due to a detection event is the one caused by the specification of a new reference. In this case, the status of every
+monitored entity is reset to `OK`, as all the monitoring algorithms are reinitialized on the new reference.
 
 You can check the status of the monitored entities in two ways:

From 413f74f860c9ffe2c7ddcc3446095a9675de61d0 Mon Sep 17 00:00:00 2001
From: Giovanni Giacometti
Date: Mon, 4 Nov 2024 15:22:20 +0100
Subject: [PATCH 22/22] New state diagram for monitoring status

---
 md-docs/stylesheets/extra.css          |  6 +++++-
 md-docs/user_guide/model.md            | 21 ++++++++++++++++++++-
 md-docs/user_guide/monitoring/index.md | 21 +++++++++++++++++----
 3 files changed, 42 insertions(+), 6 deletions(-)

diff --git a/md-docs/stylesheets/extra.css b/md-docs/stylesheets/extra.css
index 1677134..49b3f25 100644
--- a/md-docs/stylesheets/extra.css
+++ b/md-docs/stylesheets/extra.css
@@ -34,4 +34,8 @@
 .nice-list ul{
     list-style-type: circle;
-}
+}
+
+.mermaid {
+    text-align: center;
+}
\ No newline at end of file
diff --git a/md-docs/user_guide/model.md b/md-docs/user_guide/model.md
index ace4641..ba492e2 100644
--- a/md-docs/user_guide/model.md
+++ b/md-docs/user_guide/model.md
@@ -1 +1,20 @@
-# Model
\ No newline at end of file
+# Model
+
+[//]: # (What is additional probabilistic output?)
+
+[//]: # (What is metric?)
+
+[//]: # (What is suggestion type?)
+
+[//]: # (What is retraining cost?)
+
+[//]: # (What is retraining trigger?)
diff --git a/md-docs/user_guide/monitoring/index.md b/md-docs/user_guide/monitoring/index.md
index df43f04..1cfd14d 100644
--- a/md-docs/user_guide/monitoring/index.md
+++ b/md-docs/user_guide/monitoring/index.md
@@ -121,10 +121,23 @@ All the entities being monitored are associated with a status, which can be one
 The following diagram illustrates the possible transitions between the statuses. Each transition
 is triggered by a [Detection Event] and the status of the entity is updated accordingly.
 
-<figure markdown>
-    ![Monitoring status](../../imgs/monitoring/states.svg)
-    <figcaption>Monitoring status state diagram</figcaption>
-</figure>
+```mermaid
+stateDiagram-v2
+
+    [*] --> OK : Initial State
+
+    OK --> WARNING : Warning On
+    WARNING --> OK : Set new reference
+    WARNING --> OK : Warning Off
+
+    WARNING --> DRIFT : Drift On
+    DRIFT --> WARNING : Drift Off
+
+    DRIFT --> OK : Set new reference
+    DRIFT --> OK : Drift Off
+```
+
 Notice that a Drift OFF event can either bring the entity back to the `OK` status or to the `WARNING` status,
 depending on the velocity of the change and the monitoring algorithm's sensitivity. The same applies