# **Amazon Lookout for Equipment**
*Part 4 - Model evaluation* (Optional)

### Notebook configuration update
Let's make sure that we have access to the latest version of the AWS Python packages. If you see a `pip` dependency error, check the `boto3` version: if it is 1.17.48 or later (the first version that includes the `lookoutequipment` API), you can ignore this error and move on to the next cell:

In [None]:
import boto3
print(f'boto3 version: {boto3.__version__} (should be >= 1.17.48 to include Lookout for Equipment API)')

# Restart the current notebook to ensure we take into account the previous updates:
from IPython.core.display import HTML
HTML("<script>Jupyter.notebook.kernel.restart()</script>")
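The version check described above can also be done programmatically rather than by eye. The sketch below is a minimal, standard-library-only illustration; the helper names `parse_version` and `is_at_least` are hypothetical and not part of `boto3` or of the notebook utilities. It compares version numbers as integer tuples, because comparing the raw strings gives the wrong answer as soon as a component has a different number of digits.

```python
import re

def parse_version(version: str) -> tuple:
    """Turn a dotted version string such as '1.17.48' into (1, 17, 48)."""
    numbers = []
    for piece in version.split('.'):
        match = re.match(r'\d+', piece)
        if match is None:
            break  # stop at the first non-numeric component
        numbers.append(int(match.group()))
    return tuple(numbers)

def is_at_least(installed: str, minimum: str = '1.17.48') -> bool:
    """Return True when `installed` is the same as or newer than `minimum`."""
    return parse_version(installed) >= parse_version(minimum)

print(is_at_least('1.17.48'))  # exactly the minimum version
print(is_at_least('1.26.0'))   # a newer minor version
print(is_at_least('1.9.0'))    # an older minor version
print('1.9.0' >= '1.17.48')    # plain string comparison, shown for contrast
```

In the notebook, you would pass `boto3.__version__` as the first argument. The last line shows why the tuple conversion matters: the plain string comparison accepts `1.9.0`, which is older than `1.17.48`.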

### Imports

In [None]:
import config
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick
import numpy as np
import os
import pandas as pd
import sys

# Helper functions for managing Lookout for Equipment API calls:
sys.path.append('../utils')
import lookout_equipment_utils as lookout

### Parameters

In [None]:
MODEL_NAME     = config.MODEL_NAME
ASSET_ID       = config.ASSET_ID
TMP_DATA       = os.path.join('..', 'data', 'interim', ASSET_ID)
PROCESSED_DATA = os.path.join('..', 'data', 'processed', ASSET_ID)
LABEL_DATA     = os.path.join(PROCESSED_DATA, 'label-data')
TRAIN_DATA     = os.path.join(PROCESSED_DATA, 'training-data', 'centrifugal-pump')
REGION_NAME    = boto3.session.Session().region_name

%matplotlib inline
plt.style.use('Solarize_Light2')
plt.rcParams['lines.linewidth'] = 0.5

Based on the label time ranges, we will use the following time ranges:

* **Train set:** 1st January 2019 - 31st July 2019: Lookout for Equipment needs at least 180 days of training data and this period contains a few labelled ranges with some anomalies.
* **Evaluation set:** 1st August 2019 - 27th October 2019 *(this test set includes both normal and abnormal data to evaluate our model on)*

In [None]:
# Configuring time ranges:
training_start   = pd.to_datetime('2019-01-01 00:00:00')
training_end     = pd.to_datetime('2019-07-31 00:00:00')
evaluation_start = pd.to_datetime('2019-08-01 00:00:00')
evaluation_end   = pd.to_datetime('2019-10-27 00:00:00')

print(f'  Training period | from {training_start} to {training_end}')
print(f'Evaluation period | from {evaluation_start} to {evaluation_end}')
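As a quick sanity check on the ranges above, the following sketch computes the length of each period and verifies that the training period meets the 180-day minimum quoted earlier and that the two periods do not overlap. It uses only the standard library; the constant name `MIN_TRAINING_DAYS` is an assumption introduced for illustration.

```python
from datetime import datetime

# Minimum training duration quoted in the text above (name is illustrative):
MIN_TRAINING_DAYS = 180

training_start   = datetime(2019, 1, 1)
training_end     = datetime(2019, 7, 31)
evaluation_start = datetime(2019, 8, 1)
evaluation_end   = datetime(2019, 10, 27)

# Subtracting two datetimes gives a timedelta, whose .days is the span in days:
training_days   = (training_end - training_start).days
evaluation_days = (evaluation_end - evaluation_start).days

print(f'  Training period | {training_days} days')
print(f'Evaluation period | {evaluation_days} days')

enough_training_data = training_days >= MIN_TRAINING_DAYS
periods_are_disjoint = evaluation_start > training_end
print(f'Enough training data: {enough_training_data}')
print(f'Periods are disjoint: {periods_are_disjoint}')
```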

### Loading original datasets for visualization purposes

In [None]:
# Let's load all our original signals (they will be useful later on):
all_tags_fname = os.path.join(TRAIN_DATA, 'sensors.csv')
all_tags_df = pd.read_csv(all_tags_fname)
all_tags_df['Timestamp'] = pd.to_datetime(all_tags_df['Timestamp'])
all_tags_df = all_tags_df.set_index('Timestamp')

## Model evaluation
---

The [**DescribeModel**](https://docs.aws.amazon.com/lookout-for-equipment/latest/ug/API_DescribeModel.html) API can be used to extract, among other things, the metrics associated with the trained model. Here are the different fields available when calling this API:

In [None]:
lookout_client = lookout.get_client(region_name=REGION_NAME)
describe_model_response = lookout_client.describe_model(ModelName=MODEL_NAME)
list(describe_model_response.keys())

The `ModelMetrics` field above contains a dictionary (serialized as a JSON string) that follows this format:

```json
{
    'labeled_ranges': [
        {'start': '2019-08-08T00:00:00.000000', 'end': '2019-08-09T00:00:00.000000'},
        {'start': '2019-08-18T00:00:00.000000', 'end': '2019-08-19T00:00:00.000000'},
        {'start': '2019-08-28T00:00:00.000000', 'end': '2019-08-29T00:00:00.000000'},
        {'start': '2019-09-07T00:00:00.000000', 'end': '2019-09-08T00:00:00.000000'},
        {'start': '2019-09-17T00:00:00.000000', 'end': '2019-09-18T00:00:00.000000'},
        {'start': '2019-09-27T00:00:00.000000', 'end': '2019-09-28T00:00:00.000000'},
        {'start': '2019-10-07T00:00:00.000000', 'end': '2019-10-08T00:00:00.000000'},
        {'start': '2019-10-17T00:00:00.000000', 'end': '2019-10-18T00:00:00.000000'}
    ],
    'labeled_event_metrics': {
        'num_labeled': 8,
        'num_identified': 8,
        'total_warning_time_in_seconds': 668040.0
    },
    'predicted_ranges': [
        {
            'start': '2019-08-08T00:42:00.000000',
            'end': '2019-08-08T01:48:00.000000',
            'diagnostics': [
                {'name': 'centrifugal-pump\\Sensor0', 'value': 0.05218326564181105},
                {'name': 'centrifugal-pump\\Sensor1', 'value': 0.023636079094576},
                {'name': 'centrifugal-pump\\Sensor2', 'value': 0.03825258734479793},
                {'name': 'centrifugal-pump\\Sensor3', 'value': 0.023349531399873558},
                
                ...
                
                {'name': 'centrifugal-pump\\Sensor20', 'value': 0.04989340342761552},
                {'name': 'centrifugal-pump\\Sensor21', 'value': 0.033976174168938014},
                {'name': 'centrifugal-pump\\Sensor22', 'value': 0.046622167459421035},
                {'name': 'centrifugal-pump\\Sensor23', 'value': 0.044698573526762944}
            ]
        },
        
        ...
        
    ],
    'unknown_event_metrics': {
        'num_identified': 8,
        'total_duration_in_seconds': 4200.0
    }
}
```
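The summary counters in the structure above can be turned into a few headline figures. The sketch below is a minimal example that works on a hand-written JSON string reproducing only the two `*_event_metrics` blocks shown above. It assumes that `total_warning_time_in_seconds` and `total_duration_in_seconds` are sums over all identified events, which the sample does not state explicitly, so treat the averages as indicative only.

```python
import json

# Hand-written excerpt of the metrics shown above. In the notebook, the string
# would presumably come from describe_model_response['ModelMetrics'].
model_metrics_json = '''
{
    "labeled_event_metrics": {
        "num_labeled": 8,
        "num_identified": 8,
        "total_warning_time_in_seconds": 668040.0
    },
    "unknown_event_metrics": {
        "num_identified": 8,
        "total_duration_in_seconds": 4200.0
    }
}
'''

metrics = json.loads(model_metrics_json)
labeled = metrics['labeled_event_metrics']
unknown = metrics['unknown_event_metrics']

# Share of the labeled events that the model identified:
detection_rate = labeled['num_identified'] / labeled['num_labeled']

# Average warning time per identified event (assumes the total is a sum):
avg_warning_seconds = (
    labeled['total_warning_time_in_seconds'] / labeled['num_identified']
    if labeled['num_identified'] else 0.0
)

# Average duration of the events detected outside of any label (same assumption):
avg_unknown_seconds = (
    unknown['total_duration_in_seconds'] / unknown['num_identified']
    if unknown['num_identified'] else 0.0
)

print(f'Detection rate: {detection_rate:.0%}')
print(f'Average warning time: {avg_warning_seconds / 3600:.1f} hours')
print(f'Average unknown event duration: {avg_unknown_seconds / 60:.2f} minutes')
```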

`labeled_ranges` contains the labels provided as input, while `predicted_ranges` contains all the ranges where Lookout for Equipment detected an anomaly. Each predicted range includes a `diagnostics` field with a percentage associated with each sensor available in the dataset. During training, Lookout for Equipment learns the relationships between the sensors that denote normal behavior. When these normal relationships are broken, the service considers that it has detected an anomalous event. It then calculates which sensors indicate that the asset is no longer operating normally. You can read this diagnostic as a feature importance output of the model: the percentage associated with a given sensor corresponds to the magnitude of the impact (*importance*) this sensor has on a given anomaly.
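To make the feature importance reading concrete, here is a small sketch that ranks the sensors of a single predicted range by contribution. It only uses the eight diagnostics entries visible in the sample above (the elided sensors are left out), so the ranking is illustrative and not the full picture for that event.

```python
# The eight diagnostics entries visible in the sample (elided sensors omitted):
diagnostics = [
    {'name': 'centrifugal-pump\\Sensor0',  'value': 0.05218326564181105},
    {'name': 'centrifugal-pump\\Sensor1',  'value': 0.023636079094576},
    {'name': 'centrifugal-pump\\Sensor2',  'value': 0.03825258734479793},
    {'name': 'centrifugal-pump\\Sensor3',  'value': 0.023349531399873558},
    {'name': 'centrifugal-pump\\Sensor20', 'value': 0.04989340342761552},
    {'name': 'centrifugal-pump\\Sensor21', 'value': 0.033976174168938014},
    {'name': 'centrifugal-pump\\Sensor22', 'value': 0.046622167459421035},
    {'name': 'centrifugal-pump\\Sensor23', 'value': 0.044698573526762944},
]

# Sort from the largest to the smallest contribution:
ranked = sorted(diagnostics, key=lambda item: item['value'], reverse=True)

# Keep the three largest contributors, dropping the asset prefix from the name:
top_3 = [(item['name'].split('\\')[-1], item['value']) for item in ranked[:3]]

for sensor, value in top_3:
    print(f'{sensor:<10} {value:.2%}')
```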

Let's use the following utility function to get these results into two dataframes (labeled and predicted):

In [None]:
LookoutDiagnostics = lookout.LookoutEquipmentAnalysis(model_name=MODEL_NAME, tags_df=all_tags_df, region_name=REGION_NAME)
LookoutDiagnostics.set_time_periods(evaluation_start, evaluation_end, training_start, training_end)
predicted_ranges = LookoutDiagnostics.get_predictions()
labels_fname = os.path.join(LABEL_DATA, 'labels.csv')
labeled_range = LookoutDiagnostics.get_labels(labels_fname)

**Note:** the labeled ranges returned by the DescribeModel API only include the labels falling within the evaluation range. We use the original label data to get all of them.

Let's now display one of the original signals and map both the labeled and the predicted ranges on the same plot:
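The difference between the two label sources comes down to an interval filter. The sketch below illustrates it with plain `datetime` objects: the helper `overlaps` is hypothetical, the two training-period labels are invented for illustration, and the two evaluation-period labels are taken from the sample metrics above.

```python
from datetime import datetime

def overlaps(range_start, range_end, window_start, window_end):
    """Return True when [range_start, range_end] intersects the window."""
    return range_start <= window_end and range_end >= window_start

evaluation_start = datetime(2019, 8, 1)
evaluation_end   = datetime(2019, 10, 27)

all_labels = [
    (datetime(2019, 3, 10),  datetime(2019, 3, 11)),   # hypothetical training-period label
    (datetime(2019, 6, 2),   datetime(2019, 6, 3)),    # hypothetical training-period label
    (datetime(2019, 8, 8),   datetime(2019, 8, 9)),    # from the sample metrics
    (datetime(2019, 10, 17), datetime(2019, 10, 18)),  # from the sample metrics
]

# Keep only the labels that intersect the evaluation window:
evaluation_labels = [
    label for label in all_labels
    if overlaps(*label, evaluation_start, evaluation_end)
]

print(f'{len(all_labels)} labels in total, '
      f'{len(evaluation_labels)} within the evaluation range')
```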

In [None]:
# We load one of the original signals we looked at in the data preparation step:
tag = 'Sensor0'
tag_df = all_tags_df.loc[training_start:evaluation_end, [tag]]
tag_df.columns = ['Value']

# Plot all of that:
fig, axes = lookout.plot_timeseries(
    timeseries_df=tag_df, 
    tag_name=tag,
    fig_width=20, 
    tag_split=evaluation_start, 
    labels_df=labeled_range,
    predictions=predicted_ranges,
    custom_grid=False
)

## Unpacking event details
---
### Single event overview
Each detected event has detailed diagnostics stored in JSON format. Let's unpack the event details for the first large event and plot a bar chart similar to the one the console provides:

![Event details](assets/model-diagnostics.png)

In [None]:
# Let's get the event details and sort them in ascending order, so
# that the largest contributor ends up at the top of the bar chart:
first_event_details = predicted_ranges.loc[1, 'diagnostics']
first_event_details = pd.DataFrame(first_event_details).sort_values(by='value').reset_index(drop=True)

# We can then plot a horizontal bar chart:
y_pos = np.arange(first_event_details.shape[0])
values = list(first_event_details['value'])

fig = plt.figure(figsize=(12,10))
ax = plt.subplot(1,1,1)
ax.barh(y_pos, first_event_details['value'], align='center')
ax.set_yticks(y_pos)
ax.set_yticklabels(first_event_details['name'])
ax.xaxis.set_major_formatter(mtick.PercentFormatter(1.0))

# Add the values in each bar:
for i, v in enumerate(values):
    ax.text(0.0005, i, f'{v*100:.2f}%', color='#FFFFFF', fontweight='bold', verticalalignment='center')
    
plt.show()

### Grouping sensors by component
The above bar chart is already a great help to pinpoint what might be going wrong with your asset. Let's load the initial tags description file we prepared in the first notebook and match the sensors with our initial components:

In [None]:
tags_description_fname = os.path.join(TMP_DATA, 'tags_description.csv')
tags_description_df = pd.read_csv(tags_description_fname)
first_event_details[['asset', 'sensor']] = first_event_details['name'].str.split('\\', expand=True)
component_diagnostics = pd.merge(first_event_details, tags_description_df, how='inner', left_on='sensor', right_on='Tag')[['name', 'value', 'Component']]
component_diagnostics.sort_values(by='value', ascending=False).head(10)
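The cell above relies on two steps: splitting each `asset\sensor` name on the backslash, then an inner join against the tags description. The sketch below reproduces that logic in plain Python to show what happens to a sensor that has no entry in the description. The component assignments and the `Sensor99` entry are hypothetical; only the first three names and values come from the sample above.

```python
# Hypothetical tag-to-component mapping (stands in for tags_description.csv):
tags_description = {
    'Sensor0': 'volute',
    'Sensor1': 'impeller',
    'Sensor2': 'volute',
}

diagnostics = [
    {'name': 'centrifugal-pump\\Sensor0',  'value': 0.05218326564181105},
    {'name': 'centrifugal-pump\\Sensor1',  'value': 0.023636079094576},
    {'name': 'centrifugal-pump\\Sensor2',  'value': 0.03825258734479793},
    {'name': 'centrifugal-pump\\Sensor99', 'value': 0.01},  # hypothetical, not described
]

component_rows = []
for item in diagnostics:
    # '\\' is a single backslash character in a Python string literal:
    asset, sensor = item['name'].split('\\', 1)
    if sensor not in tags_description:
        continue  # inner join: sensors without a description are dropped
    component_rows.append({
        'name': item['name'],
        'value': item['value'],
        'Component': tags_description[sensor],
    })

for row in component_rows:
    print(row)
```

Because the join is an inner one, any sensor missing from the description file silently disappears from the result, so it is worth comparing the row counts before and after the merge.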

If we group the contributions of all sensors by component, we see that the volute component contributes 30% to this particular event, while the other components range from 16% to 19%: **time to give the volute a visit?**

In [None]:
# Sum only the numeric 'value' column (the 'name' column holds strings):
event_diagnostics = component_diagnostics.groupby(by='Component')[['value']].sum().sort_values(by='value')
event_diagnostics
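The aggregation above is a sum of sensor contributions per component. The following sketch shows the same computation without pandas, on made-up numbers: the component names other than the volute and all the values are hypothetical, chosen only so that the totals resemble the figures quoted in the text.

```python
from collections import defaultdict

# Hypothetical (component, contribution) pairs, one per sensor:
sensor_contributions = [
    ('volute',   0.12),
    ('volute',   0.18),
    ('impeller', 0.09),
    ('impeller', 0.10),
    ('shaft',    0.16),
]

# Sum the contributions of all the sensors belonging to each component:
totals = defaultdict(float)
for component, value in sensor_contributions:
    totals[component] += value

# Sort in ascending order, as the notebook does before plotting:
event_totals = sorted(totals.items(), key=lambda pair: pair[1])

for component, value in event_totals:
    print(f'{component:<10} {value:.0%}')
```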

In [None]:
# We can then plot a horizontal bar chart:
y_pos = np.arange(event_diagnostics.shape[0])
values = list(event_diagnostics['value'])

fig = plt.figure(figsize=(12,5))
ax = plt.subplot(1,1,1)
ax.barh(y_pos, event_diagnostics['value'], align='center')
ax.set_yticks(y_pos)
ax.set_yticklabels(list(event_diagnostics.index))
ax.xaxis.set_major_formatter(mtick.PercentFormatter(1.0))

# Add the values in each bar:
for i, v in enumerate(values):
    ax.text(0.005, i, f'{v*100:.2f}%', color='#FFFFFF', fontweight='bold', verticalalignment='center')
    
plt.show()

## Conclusion
---

In [None]:
# Needed to render Markdown programmatically:
from IPython.display import display, Markdown

display(Markdown(
'''
<span style="color:green"><span style="font-size:50px">**Success!**</span></span>
<br/>
In this notebook, we used the model created in part 3 of this notebook series and performed a few visualizations and diagnostics on the results obtained. You can now move on to the next step, the **inference scheduling notebook**, where we will start the model, feed it some new data and catch the results.
'''))