# Run Backtesting on Historical Data for Amazon Lookout for Metrics

Amazon Lookout for Metrics also supports backtesting against historical data, and this notebook demonstrates that functionality on the same e-commerce data. Once the backtesting job has completed, you can see all of the anomalies that Amazon Lookout for Metrics detected in the last 20% of your historical data. From there you can begin to unpack the kinds of results you will see from Amazon Lookout for Metrics once you start streaming in new data. **NOTE: You must create a new detector to work with real-time data. Backtesting is only for exploration.**

This notebook assumes that you have already run the live data example in the previous notebook. If you have not, go back and complete it first, then give backtesting a whirl while you wait for the real-time anomalies to alert you!

First, restore the variables saved by the previous notebook with `%store -r`, then import the libraries needed.

In [1]:
%store -r

Just as in the last notebook, connect to AWS through the SDK.

In [2]:
import boto3
region = "us-west-2"
session = boto3.Session(region_name = region)

# FIXME : Beta endpoint
L4M = session.client( "lookoutmetrics", endpoint_url='https://lookoutmetrics-beta.us-west-2.amazonaws.com/' )

## Creating A Detector

This step is identical to the one in the previous notebook. Here, we're simply creating a different `detector` that we will use for backtesting.

In [4]:
response = L4M.create_anomaly_detector( 
    AnomalyDetectorName = project + "-detector-2",
    # AnomalyDetectorDomain = "ADS",
    AnomalyDetectorDescription = "My Detector",
    AnomalyDetectorConfig = {
        "AnomalyDetectorFrequency" : frequency,
    },
)

anomaly_detector_arn = response["AnomalyDetectorArn"]
anomaly_detector_arn

ServiceQuotaExceededException: An error occurred (ServiceQuotaExceededException) when calling the CreateAnomalyDetector operation: Request would cause a service quota to be exceeded.
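
The `ServiceQuotaExceededException` above most likely means the account already holds as many detectors as its quota allows, possibly including one left over from an earlier run. The following is a minimal sketch, not part of the original notebook, of how you might look up an existing detector by name so that you can reuse or delete it. It assumes the `list_anomaly_detectors` call of the boto3 `lookoutmetrics` client and its `AnomalyDetectorSummaryList` / `NextToken` response fields; `find_detector_arn` is a hypothetical helper name.

```python
def find_detector_arn(client, detector_name):
    """Return the ARN of the detector called `detector_name`, or None.

    `client` is expected to behave like a boto3 "lookoutmetrics" client
    (assumption: it exposes list_anomaly_detectors with NextToken paging).
    """
    kwargs = {}
    while True:
        page = client.list_anomaly_detectors(**kwargs)
        for summary in page.get("AnomalyDetectorSummaryList", []):
            if summary.get("AnomalyDetectorName") == detector_name:
                return summary["AnomalyDetectorArn"]
        token = page.get("NextToken")
        if not token:
            # No more pages and no match found.
            return None
        kwargs["NextToken"] = token


# Example usage in this notebook (requires AWS credentials):
# existing_arn = find_detector_arn(L4M, project + "-detector-2")
# if existing_arn is not None:
#     anomaly_detector_arn = existing_arn
```

If the lookup returns an ARN, you can assign it to `anomaly_detector_arn` and continue with the remaining cells instead of creating another detector.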

## Define Metrics

At present, Amazon Lookout for Metrics supports 5,000 historical data points for backtesting, counted from the moment you create your detector.
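
To get a feel for what 5,000 data points means in calendar time, here is a small sketch that converts the limit into a time span for a given detector frequency. It assumes `frequency` holds one of the ISO 8601 interval strings `PT5M`, `PT10M`, `PT1H`, or `P1D`, and it reuses the "last 20%" figure from the introduction; `backtest_window` and `STEP` are hypothetical names introduced only for illustration.

```python
from datetime import timedelta

# Assumed mapping from frequency strings to interval lengths.
STEP = {
    "PT5M": timedelta(minutes=5),
    "PT10M": timedelta(minutes=10),
    "PT1H": timedelta(hours=1),
    "P1D": timedelta(days=1),
}


def backtest_window(frequency, max_points=5000, evaluated_fraction=0.20):
    """Return (total_span, evaluated_span) covered by the backtest data.

    total_span is the time covered by `max_points` intervals;
    evaluated_span is the share of it in which anomalies are reported.
    """
    step = STEP[frequency]
    total_span = step * max_points
    evaluated_span = step * round(max_points * evaluated_fraction)
    return total_span, evaluated_span


total, evaluated = backtest_window("PT1H")
print(total, evaluated)
```

With an hourly frequency, 5,000 points cover 5,000 hours (a little over 208 days), of which the final 1,000 hours would be evaluated for anomalies.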

After creating a detector, we need to point it to the S3 path for our backtest data. This process is also similar to the one from the previous notebook.

First, let's create a role that can work with the Amazon Lookout for Metrics service:

In [5]:
import utility as pu
role_arn = pu.get_or_create_iam_role(role_name)

Role L4MTestRole already existed
Attaching policies
Waiting for a minute to allow IAM role policy attachment to propagate
arn:aws:iam::357984623133:role/L4MTestRole


Now, let's create a metric set for our detector that points to the backtest data in S3:

In [None]:
s3_path_backtest = 's3://'+ s3_bucket + '/ecommerce/backtest/'
s3_path_backtest

In [None]:
params = {
    "AnomalyDetectorArn": anomaly_detector_arn,
    "MetricSetName" : project + '-metric-set-1',
    "MetricList" : [
        {
            "MetricName" : "views",
            "AggregationFunction" : "AVG",
        },
        {
            "MetricName" : "revenue",
            "AggregationFunction" : "SUM",
        },
    ],

    "DimensionList" : [ "platform", "marketplace" ],

    "TimestampColumn" : {
        "ColumnName" : "timestamp",
        "ColumnFormat" : "yyyy-MM-dd HH:mm:ss",
    },

   #"Delay" : 120, # seconds the detector will wait before attempting to read latest data per current time and detection frequency below
    "MetricSetFrequency" : frequency,

    "MetricSource" : {
        "S3SourceConfig": {
            "RoleArn" : role_arn,
            "HistoricalDataPathList": [
                s3_path_backtest,
            ],
#            "TemplatedPathList": [
#                s3_path_format,
#            ],

            "FileFormatDescriptor" : {
                "CsvFormatDescriptor" : {
                    "FileCompression" : "NONE",
                    "Charset" : "UTF-8",
                    "ContainsHeader" : True,
                    "Delimiter" : ",",
#                    "HeaderList" : [
#                        "platform",
#                        "marketplace",
#                        "timestamp",
#                        "views",
#                        "revenue"
#                    ],
                    "QuoteSymbol" : '"'
                },
            }
        }
    },
}

params

In [None]:
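# Cleanup only: this deletes the detector created above. Skip this cell if you
# intend to continue, because create_metric_set below needs the detector to exist.
L4M.delete_anomaly_detector(AnomalyDetectorArn=anomaly_detector_arn)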

In [None]:
response = L4M.create_metric_set( ** params )

metric_set_arn = response["MetricSetArn"]
metric_set_arn

## Activate the Detector

Now that the MetricSet has been specified, we are ready to start backtesting. That is done by activating the backtest anomaly detector.

In [11]:
anomaly_detector_arn = 'arn:aws:lookoutmetrics:us-west-2:357984623133:AnomalyDetector:initial-poirot-testing-cf-detector-2'

In [None]:
L4M.back_test_anomaly_detector(AnomalyDetectorArn = anomaly_detector_arn)

In [None]:
pu.wait_anomaly_detector( L4M, anomaly_detector_arn )

In [None]:
pu.wait_anomaly_detector( L4M, 'arn:aws:lookoutmetrics:us-west-2:357984623133:AnomalyDetector:initial-poirot-testing-cf-detector-2' )
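
The `pu.wait_anomaly_detector` helper lives in the `utility` module and its implementation is not shown here. As a rough illustration of what such a wait might involve, the sketch below polls `describe_anomaly_detector` until the detector reaches a terminal state. The status names `BACK_TEST_COMPLETE` and `FAILED`, the polling interval, and the function name `wait_for_backtest` are assumptions, not the helper's actual code.

```python
import time

# Assumed terminal states for a backtest detector.
TERMINAL_STATUSES = {"BACK_TEST_COMPLETE", "FAILED"}


def wait_for_backtest(client, detector_arn, poll_seconds=30, max_polls=240,
                      sleep=time.sleep):
    """Poll the detector status until it is terminal, then return it.

    Raises TimeoutError if no terminal status is seen within `max_polls`.
    `sleep` is injectable so the loop can be exercised without waiting.
    """
    status = None
    for _ in range(max_polls):
        response = client.describe_anomaly_detector(
            AnomalyDetectorArn=detector_arn
        )
        status = response["Status"]
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"Detector still in status {status!r} after polling")


# Example usage in this notebook (requires AWS credentials):
# final_status = wait_for_backtest(L4M, anomaly_detector_arn)
```

Bounding the number of polls keeps a stuck job from blocking the notebook indefinitely.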

## Validate results

After backtesting is finished, you can visually validate the historical anomalies via the console or inspect the results by running the following commands:

In [19]:
response = L4M.list_anomaly_group_summaries(AnomalyDetectorArn = anomaly_detector_arn,
                                 SensitivityThreshold=50,
                                 MaxResults=10,
                                )
response

{'ResponseMetadata': {'RequestId': 'e25e4c05-5581-480d-85cc-8bf773ff59db',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'date': 'Sat, 05 Dec 2020 23:06:18 GMT',
   'content-type': 'application/x-amz-json-1.1',
   'content-length': '2734',
   'connection': 'keep-alive',
   'x-amzn-requestid': 'e25e4c05-5581-480d-85cc-8bf773ff59db',
   'x-amz-apigw-id': 'XGfQrFTIvHcFRnw=',
   'x-amzn-trace-id': 'Root=1-5fcc126a-186e67e872f66ecb617b1c9c'},
  'RetryAttempts': 0},
 'AnomalyGroupSummaryList': [{'StartTime': '2020-12-05T01:00Z[UTC]',
   'EndTime': '2020-12-05T01:00Z[UTC]',
   'AnomalyGroupId': '2389a0ae-1d2b-44d0-99d8-0dbd189cbb1c',
   'AnomalyGroupScore': 86.78,
   'PrimaryMetricName': 'views'},
  {'StartTime': '2020-12-02T21:00Z[UTC]',
   'EndTime': '2020-12-02T21:00Z[UTC]',
   'AnomalyGroupId': '45ab34f8-f1e0-4706-86c3-0bb25a6b1e66',
   'AnomalyGroupScore': 86.78,
   'PrimaryMetricName': 'views'},
  {'StartTime': '2020-12-02T21:00Z[UTC]',
   'EndTime': '2020-12-02T21:00Z[UTC]',
   'AnomalyGr
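
The raw response is verbose, so it can help to reduce it to a short ranked list. The sketch below, which is an illustration rather than part of the original notebook, sorts the entries of `AnomalyGroupSummaryList` by score and parses timestamps shaped like `2020-12-05T01:00Z[UTC]`. The helper names `parse_l4m_time` and `rank_anomaly_groups` are hypothetical, and the timestamp format is assumed from the output shown above.

```python
from datetime import datetime, timezone


def parse_l4m_time(value):
    """Parse a timestamp such as '2020-12-05T01:00Z[UTC]' into a UTC datetime."""
    cleaned = value.replace("[UTC]", "")
    return datetime.strptime(cleaned, "%Y-%m-%dT%H:%MZ").replace(
        tzinfo=timezone.utc
    )


def rank_anomaly_groups(response, min_score=0.0):
    """Return (score, start, metric, group_id) tuples, highest score first.

    Groups with equal scores are ordered by start time, earliest first.
    """
    rows = [
        (
            group["AnomalyGroupScore"],
            parse_l4m_time(group["StartTime"]),
            group["PrimaryMetricName"],
            group["AnomalyGroupId"],
        )
        for group in response.get("AnomalyGroupSummaryList", [])
        if group["AnomalyGroupScore"] >= min_score
    ]
    return sorted(rows, key=lambda row: (-row[0], row[1]))


# Example usage with the response from the cell above:
# for score, start, metric, group_id in rank_anomaly_groups(response, min_score=80):
#     print(f"{score:6.2f}  {start:%Y-%m-%d %H:%M}  {metric}  {group_id}")
```

The `group_id` values in this list are what you would pass as `AnomalyGroupId` when drilling into a single group.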

To dive even deeper into a specific anomaly group, simply choose your anomaly group of interest and drill down to its time series.

In [20]:
response = L4M.list_anomaly_group_time_series(AnomalyDetectorArn = anomaly_detector_arn,
                                   AnomalyGroupId='c1db0cc9-ec45-4467-ba4d-2459eec86d68',
                                   MetricName='views',
                                   MaxResults=10,
                                  )
response

{'ResponseMetadata': {'RequestId': '49c36236-b3c7-4067-b79c-ff7cfc585737',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'date': 'Sat, 05 Dec 2020 23:06:36 GMT',
   'content-type': 'application/x-amz-json-1.1',
   'content-length': '20722',
   'connection': 'keep-alive',
   'x-amzn-requestid': '49c36236-b3c7-4067-b79c-ff7cfc585737',
   'x-amz-apigw-id': 'XGfTeFV2vHcFgTg=',
   'x-amzn-trace-id': 'Root=1-5fcc127c-316376671191e68247b277ee'},
  'RetryAttempts': 0},
 'AnomalyGroupId': 'c1db0cc9-ec45-4467-ba4d-2459eec86d68',
 'MetricName': 'views',
 'TimestampList': ['2020-11-04T20:00Z[UTC]',
  '2020-11-04T21:00Z[UTC]',
  '2020-11-04T22:00Z[UTC]',
  '2020-11-04T23:00Z[UTC]',
  '2020-11-05T00:00Z[UTC]',
  '2020-11-05T01:00Z[UTC]',
  '2020-11-05T02:00Z[UTC]',
  '2020-11-05T03:00Z[UTC]',
  '2020-11-05T04:00Z[UTC]',
  '2020-11-05T05:00Z[UTC]',
  '2020-11-05T06:00Z[UTC]',
  '2020-11-05T07:00Z[UTC]',
  '2020-11-05T08:00Z[UTC]',
  '2020-11-05T09:00Z[UTC]',
  '2020-11-05T10:00Z[UTC]',
  '2020-11-05T11: