# Getting Started With Amazon Lookout for Metrics

Amazon Lookout for Metrics can help you identify anomalies within your data regardless of its origin. By following this notebook you will build out a solution using Amazon Lookout for Metrics to capture incoming data and to detect anomalies within it.

This guide assumes you have completed all the work in `README.md`. If you have not, complete that first and then return.

## Amazon Lookout for Metrics Workflow

1. Prepare your source data and create an AWS Identity and Access Management (IAM) role that can access the data.
2. Create a detector and configure its detection properties.
3. Create a metric set:
    1. Provide the location of your source data and the IAM permissions needed to access it. 
    1. Define the metrics that you want to investigate.
    1. Attach the dataset to your detector.
4. Activate the detector.
5. (Optional) Set up alerts to get notified when Amazon Lookout for Metrics (L4M) detects important outliers.
6. Inspect the detected outliers to figure out their root causes.
7. Provide feedback on the outliers to improve detector accuracy.


The first step is to update your boto3 installation to the latest version, which includes the new L4M API (uncomment the line below to run the upgrade):

In [None]:
# !pip install boto3 --upgrade

Now, let's import boto3 and establish a connection to AWS and the Amazon Lookout for Metrics service:

In [None]:
import boto3
region = "us-west-2"
session = boto3.Session(region_name = region)

# FIXME : Beta endpoint
L4M = session.client( "lookoutmetrics", endpoint_url='https://lookoutmetrics-beta.us-west-2.amazonaws.com/' )

Next, validate that the API call below returns a response. If you get a 200 status code, your account is allowlisted and you are connecting to the service successfully.

In [None]:
L4M.list_metric_sets()
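
boto3 returns each response as a dictionary whose `ResponseMetadata` entry carries the HTTP status code, so the 200 check can be made explicit. The sketch below is only an illustration: the helper name `is_successful` is hypothetical, and `sample_response` stands in for the dictionary that `L4M.list_metric_sets()` would return.

```python
def is_successful(response):
    """Return True when a boto3 response dictionary reports HTTP 200."""
    metadata = response.get("ResponseMetadata", {})
    return metadata.get("HTTPStatusCode") == 200


# Stand-in for a real response; replace with L4M.list_metric_sets()
sample_response = {"ResponseMetadata": {"HTTPStatusCode": 200}}
print(is_successful(sample_response))
```

If the call is rejected instead, boto3 normally raises an exception rather than returning a non-200 dictionary, so an error at this cell also indicates a connection or permission problem.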

## Creating A Detector

Now that the basic external resources are ready, it is time to get started with Amazon Lookout for Metrics. That starts with creating a `Detector`.

### Detectors

To detect outliers, Amazon Lookout for Metrics builds a machine learning model that is trained with your source data. This model, called a `detector`, is automatically trained with the machine learning algorithm that best fits your data and use case. You can either provide your historical data for training, if you have any, or get started with real-time data, and Amazon Lookout for Metrics will learn on-the-go. 

You specify the Amazon S3 location that Amazon Lookout for Metrics should continuously monitor for new data, and your detector analyzes your data and returns information about the outliers that it detected. When you create a `detector`, you also specify a `detecting domain` and an `anomaly detection frequency`.

The `anomaly detection frequency` specifies how often the detector should wake up, look for new data, run its analysis, and alert you to any interesting findings.


Execute the cells below to create a new anomaly detector with a frequency of 1 hour.

**NOTE** If you have not yet created an S3 bucket for your data, create one first and then come back to these cells.

In [None]:
project = "initial-poirot-testing-cf" # just a string used to name resources such as MetricSet, Detector, etc.

frequency = "PT1H" # one of 'P1D', 'PT1H', 'PT10M' and 'PT5M'
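
The frequency strings above are ISO 8601 durations. As a rough illustration of what each one means, the sketch below maps the four values listed in the comment to Python `timedelta` objects and counts how many detection intervals fit in a day. The names `FREQUENCY_TO_INTERVAL` and `intervals_per_day` are hypothetical and are not part of the L4M API.

```python
from datetime import timedelta

# The four frequency strings listed in this notebook, read as ISO 8601 durations
FREQUENCY_TO_INTERVAL = {
    "P1D": timedelta(days=1),
    "PT1H": timedelta(hours=1),
    "PT10M": timedelta(minutes=10),
    "PT5M": timedelta(minutes=5),
}


def intervals_per_day(frequency):
    """Number of detection intervals in 24 hours for a given frequency string."""
    return timedelta(days=1) // FREQUENCY_TO_INTERVAL[frequency]


print(intervals_per_day("PT1H"))
```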

In [None]:
response = L4M.create_anomaly_detector( 
    AnomalyDetectorName = project + "-detector-1",
    # AnomalyDetectorDomain = "ADS",
    AnomalyDetectorDescription = "My Detector",
    AnomalyDetectorConfig = {
        "AnomalyDetectorFrequency" : frequency,
    },
)

anomaly_detector_arn = response["AnomalyDetectorArn"]
anomaly_detector_arn

View the details of the created detector and its status (it should be `INACTIVE` at this point):

In [None]:
L4M.describe_anomaly_detector(AnomalyDetectorArn=anomaly_detector_arn)

## Define Metrics

### Measures and Dimensions

`Measures` are variables or key performance indicators on which customers want to detect outliers, and `Dimensions` are metadata that represent categorical information about the measures.

In this e-commerce example, views and revenue are our measures, and platform and marketplace are our dimensions. Customers may want to monitor their data for anomalies in the number of views or revenue for every platform, marketplace, and combination of both. You can designate up to five measures and five dimensions per dataset.

### Metrics 


After you create a detector and map your measures and dimensions, Amazon Lookout for Metrics analyzes each combination of these measures and dimensions. In the example above, there are 7 unique values (us, jp, de, etc.) for marketplace and 3 unique values (mobile web, mobile app, pc web) for platform, for a total of 21 unique dimension combinations. Each unique combination of a measure with the dimension values (e.g. us/mobile app/revenue) is a time series `metric`. In this case, we have 21 dimension combinations and 2 measures, for a total of 42 time series `metrics`.
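
The arithmetic can be reproduced with a few lines of Python. This is only a counting sketch: the text names just three of the seven marketplaces (us, jp, de), so the remaining four values below are placeholders chosen for illustration.

```python
from itertools import product

measures = ["views", "revenue"]
platforms = ["mobile web", "mobile app", "pc web"]
# Only us, jp and de are named in the text; the other four are placeholders
marketplaces = ["us", "jp", "de", "mkt4", "mkt5", "mkt6", "mkt7"]

# Every (marketplace, platform) pair is one dimension combination
dimension_combinations = list(product(marketplaces, platforms))

# Every dimension combination paired with a measure is one time series metric
metrics = [
    (marketplace, platform, measure)
    for (marketplace, platform) in dimension_combinations
    for measure in measures
]

print(len(dimension_combinations), len(metrics))
```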

Amazon Lookout for Metrics detects anomalies at the most granular level so you are able to pin-point any unexpected behavior in your data.

### Datasets

Measures, dimensions and metrics map to `datasets`, which also contain the Amazon S3 locations of your source data, an IAM role that has both read and write permissions to those Amazon S3 locations, and the rate at which data should be ingested from the source location (the upload frequency and data ingestion delay).


First, let's create a role that can work with the Amazon Lookout for Metrics service:


In [None]:
import utility as pu
role_name = "L4MTestRole"
role_arn = pu.get_or_create_iam_role(role_name)

Now, let's create a metric set for our detector that points to the live data in S3:

In [None]:
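# NOTE: s3_bucket is not defined earlier in this notebook; set it to the name of your S3 bucket before running this cell
s3_path_format = 's3://' + s3_bucket + '/ecommerce/live/{{yyyyMMdd}}/{{HHmm}}'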
s3_path_format
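
The `{{yyyyMMdd}}` and `{{HHmm}}` placeholders stay as literal text in the Python string. To see which folder a given interval would map to, the sketch below fills them in locally, on the assumption that the service substitutes each placeholder with the date and time of the interval being read. The function `render_path` and the bucket name `example-bucket` are hypothetical.

```python
from datetime import datetime


def render_path(template, ts):
    """Fill the date and time placeholders of a templated S3 path for one timestamp."""
    return (
        template
        .replace("{{yyyyMMdd}}", ts.strftime("%Y%m%d"))
        .replace("{{HHmm}}", ts.strftime("%H%M"))
    )


# Hypothetical bucket name, same layout as s3_path_format above
example_template = "s3://example-bucket/ecommerce/live/{{yyyyMMdd}}/{{HHmm}}"
example_path = render_path(example_template, datetime(2021, 1, 15, 13, 0))
print(example_path)
```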

In [None]:
params = {
    "AnomalyDetectorArn": anomaly_detector_arn,
    "MetricSetName" : project + '-metric-set-1',
    "MetricList" : [
        {
            "MetricName" : "views",
            "AggregationFunction" : "AVG",
        },
        {
            "MetricName" : "revenue",
            "AggregationFunction" : "SUM",
        },
    ],

    "DimensionList" : [ "platform", "marketplace" ],

    "TimestampColumn" : {
        "ColumnName" : "timestamp",
        "ColumnFormat" : "yyyy-MM-dd HH:mm:ss",
    },

   #"Delay" : 120, # seconds the detector will wait before attempting to read latest data per current time and detection frequency below
    "MetricSetFrequency" : frequency,

    "MetricSource" : {
        "S3SourceConfig": {
            "RoleArn" : role_arn,
#            "HistoricalDataPathList": [
#                s3_path_training_prefix,
#            ],
            "TemplatedPathList": [
                s3_path_format,
            ],

            "FileFormatDescriptor" : {
                "CsvFormatDescriptor" : {
                    "FileCompression" : "NONE",
                    "Charset" : "UTF-8",
                    "ContainsHeader" : True,
                    "Delimiter" : ",",
#                    "HeaderList" : [
#                        "platform",
#                        "marketplace",
#                        "timestamp",
#                        "views",
#                        "revenue"
#                    ],
                    "QuoteSymbol" : '"'
                },
            }
        }
    },
}

params

In [None]:
response = L4M.create_metric_set( ** params )

metric_set_arn = response["MetricSetArn"]
metric_set_arn

To see that the metric set was created correctly:

In [None]:
L4M.describe_metric_set(MetricSetArn=metric_set_arn)
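
For reference, a data file that matches the metric set configuration above (comma delimiter, double-quote quote symbol, header row, and `yyyy-MM-dd HH:mm:ss` timestamps) could be produced as sketched below. The column order follows the commented-out `HeaderList`; the row values are made up for illustration, and `build_csv` is a hypothetical helper rather than part of the notebook's `utility` module.

```python
import csv
import io
from datetime import datetime

# Python strftime equivalent of the "yyyy-MM-dd HH:mm:ss" ColumnFormat
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"
HEADER = ["platform", "marketplace", "timestamp", "views", "revenue"]


def build_csv(rows):
    """Serialize rows (dicts keyed by HEADER names) into CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=",", quotechar='"', lineterminator="\n")
    writer.writerow(HEADER)
    for row in rows:
        writer.writerow([
            row["platform"],
            row["marketplace"],
            row["timestamp"].strftime(TIMESTAMP_FORMAT),
            row["views"],
            row["revenue"],
        ])
    return buffer.getvalue()


# Made-up sample row, for illustration only
sample_csv = build_csv([
    {
        "platform": "mobile app",
        "marketplace": "us",
        "timestamp": datetime(2021, 1, 15, 13, 0, 0),
        "views": 120,
        "revenue": 350.5,
    }
])
print(sample_csv)
```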

## Activate the Detector

Now that the MetricSet has been specified, we are ready to start training the detector. That is done by activating it.

In [None]:
L4M.activate_anomaly_detector(AnomalyDetectorArn = anomaly_detector_arn)

In [None]:
pu.wait_anomaly_detector( L4M, anomaly_detector_arn )
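
`pu.wait_anomaly_detector` comes from the local `utility` module, whose source is not shown here. A helper of this kind typically polls the detector status until it leaves a transitional state. The sketch below shows that pattern under stated assumptions: `wait_for_status` is a hypothetical name, the `ACTIVATING` and `ACTIVE` status strings are assumed, and the status is supplied by a callable so that the sketch does not depend on the exact shape of the `describe_anomaly_detector` response.

```python
import time


def wait_for_status(get_status, pending=("ACTIVATING",), poll_seconds=30, max_polls=120):
    """Call get_status() repeatedly until it returns a status outside `pending`."""
    for _ in range(max_polls):
        status = get_status()
        if status not in pending:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError("detector did not leave a pending state in time")


# Demonstration with a fake status source instead of a real API call
fake_statuses = iter(["ACTIVATING", "ACTIVATING", "ACTIVE"])
final_status = wait_for_status(lambda: next(fake_statuses), poll_seconds=0)
print(final_status)
```

With the real client, `get_status` might wrap `L4M.describe_anomaly_detector(AnomalyDetectorArn=anomaly_detector_arn)` and pull the status field out of the response; check the actual response to confirm where that field lives.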

## Create Anomaly Alerts

Once your detector is active, you can attach alerts to it. `Alerts` are customized notifications available via the Amazon Simple Notification Service (SNS), configurable directly via the Amazon Lookout for Metrics console and SDK. These alerts notify you whenever an anomaly of a specified severity level is detected. Severity levels are a measure of the urgency or criticality of detected outliers. Alerts are meant to guide you towards relative prioritization of the outliers. Amazon Lookout for Metrics supports Critical, High, Medium, and Low thresholds. For example, you can set an alert on your detector to notify you whenever an outlier with a Medium severity level or greater is detected.


Before we create an alert, two additional things are needed:

1. A role giving Amazon Lookout for Metrics access to SNS
2. An SNS topic to deliver the alerts to

The cells below guide you through creating both, after which you can create the alert.


In [None]:
import json
import time

In [None]:
iam = boto3.client("iam")
role_name = "L4M-SNSFullAccessCF"
assume_role_policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "Service": "lookoutmetrics.amazonaws.com"
          },
          "Action": "sts:AssumeRole"
        }
    ]
}

create_role_response = iam.create_role(
    RoleName = role_name,
    AssumeRolePolicyDocument = json.dumps(assume_role_policy_document)
)

# Now add SNS support
iam.attach_role_policy(
    PolicyArn='arn:aws:iam::aws:policy/AmazonSNSFullAccess',
    RoleName=role_name
)
time.sleep(60) # wait for a minute to allow IAM role policy attachment to propagate

role_arn = create_role_response["Role"]["Arn"]
print(role_arn)

Now create the SNS topic for the alerts:

**UPDATE THE PHONE NUMBER BELOW TO YOUR OWN MOBILE NUMBER!**

In [None]:
sns_client = boto3.client("sns")
topic = sns_client.create_topic(Name="anomalyalertsCF")
topic_arn = topic['TopicArn']

number = "+15555555555"  # Change to your own mobile phone number

sns_client.subscribe(
        TopicArn=topic_arn,
        Protocol='sms',
        Endpoint=number  
)
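
SNS expects SMS endpoints in E.164 format: a leading `+`, the country code, then the subscriber number, with no spaces or dashes and at most 15 digits. A quick local check before subscribing can catch formatting mistakes. This is a minimal sketch; `is_e164` is a hypothetical helper, and it validates the format only, not whether the number exists.

```python
import re

# "+" followed by 2 to 15 digits, the first of which is not zero
E164_PATTERN = re.compile(r"^\+[1-9]\d{1,14}$")


def is_e164(number):
    """Return True when the string looks like an E.164 phone number."""
    return bool(E164_PATTERN.match(number))


print(is_e164("+15555555555"))
print(is_e164("555-555-5555"))
```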

Lastly, execute the cell below to configure the alert to notify your topic.

In [None]:
response = L4M.create_alert(
    Action = {
      "SNSConfiguration": { 
         "RoleArn": role_arn,
         "SnsTopicArn": topic_arn
      }
    },
    AlertDescription = "Test Alert 1",
    AlertName = project + "-alert-1",
    AnomalyDetectorArn = anomaly_detector_arn,
    AlertSensitivityThreshold = 50
)

alert_arn = response["AlertArn"]
alert_arn

To make things easier, we use IPython's `%store` magic to save a few variables for later notebooks.

Once you have executed the cell below, move on to `2.BackTestingWithHistorical.ipynb`.

In [None]:
%store project
%store s3_bucket
%store frequency
%store role_name