# Backtesting on Historical Data for Amazon Lookout for Metrics

Amazon Lookout for Metrics also supports backtesting against your historical data, and this notebook demonstrates that functionality on the same e-commerce data. Once the backtesting job has completed, you can see all of the anomalies that Amazon Lookout for Metrics detected in the last 20% of your historical data. From there, you can begin to explore the kinds of results you can expect from Amazon Lookout for Metrics once you start streaming in new data. **NOTE: YOU MUST CREATE A NEW DETECTOR TO LEVERAGE REAL-TIME DATA. BACKTESTING IS ONLY FOR EXPLORATION.**

This notebook assumes that you have already completed the prerequisites in `1.PrereqSetupPackages.ipynb`, `2.PrereqSetupData.ipynb`, and `3.GettingStartedWithLiveData.ipynb`. If you have not, go back and complete those first, then give backtesting a whirl while you wait for the real-time anomalies to alert you!

First restore the variables from the previous notebook and then import the libraries needed.

In [None]:
%store -r

Just as in the last notebook, connect to AWS through the SDK.

In [None]:
import boto3
import utility

In [None]:
L4M = boto3.client( "lookoutmetrics", region_name="us-west-2" )

## Creating A Detector

This step is identical to the one in the previous notebook. Here, we're simply creating a different `detector` that we will use for backtesting.

In [None]:
project = "initial-lookoutmetrics-backtesting-test"

frequency = "PT1H" # one of 'P1D', 'PT1H', 'PT10M' and 'PT5M'
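
The frequency values are ISO 8601 durations. As a minimal sketch of what each one means in practice, the lookup below maps the four values listed in the comment above to interval lengths. The names `FREQUENCY_TO_INTERVAL` and `intervals_per_day` are hypothetical helpers introduced here for illustration; they are not part of the SDK or of `utility`.

```python
from datetime import timedelta

# Interval length for each of the four frequencies listed in the comment above.
FREQUENCY_TO_INTERVAL = {
    "P1D": timedelta(days=1),
    "PT1H": timedelta(hours=1),
    "PT10M": timedelta(minutes=10),
    "PT5M": timedelta(minutes=5),
}


def intervals_per_day(frequency):
    """Return how many detection intervals fit in one day (hypothetical helper)."""
    return timedelta(days=1) // FREQUENCY_TO_INTERVAL[frequency]


print(intervals_per_day("PT1H"))
```

A finer frequency means more intervals per day, so the same historical date range contains more data points to learn from and to evaluate.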

In [None]:
response = L4M.create_anomaly_detector( 
    AnomalyDetectorName = project + "-detector",
    AnomalyDetectorDescription = "My Detector",
    AnomalyDetectorConfig = {
        # Reuse the frequency defined above so the detector and metric set stay in sync.
        "AnomalyDetectorFrequency" : frequency,
    },
)

anomaly_detector_arn = response["AnomalyDetectorArn"]
anomaly_detector_arn

## Define Metrics

After creating a detector, we need to point it to the S3 path for our backtest data. This process is also similar to the one from the previous notebook.

First, let's create a role that can work with the Amazon Lookout for Metrics service:

In [None]:
role_name = "L4MTestRole"
role_arn = utility.get_or_create_iam_role(role_name)

Now, let's create a metric set for our detector that points to the backtest data in S3:

In [None]:
s3_path_backtest = 's3://'+ s3_bucket + '/ecommerce/backtest/'
s3_path_backtest
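
Before handing the path to the service, it can help to confirm that it is a well-formed S3 URI. The following is a minimal sketch using only the standard library. `split_s3_uri` is a hypothetical helper, and `example-bucket` is a placeholder rather than the bucket created in the earlier notebooks.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/prefix/' into (bucket, prefix); hypothetical helper."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {uri!r}")
    # Drop the leading slash so the prefix matches how S3 keys are written.
    return parsed.netloc, parsed.path.lstrip("/")


# Placeholder bucket name, used only for illustration.
bucket, prefix = split_s3_uri("s3://example-bucket/ecommerce/backtest/")
print(bucket, prefix)
```

In the notebook you could call `split_s3_uri(s3_path_backtest)` instead, and use the resulting bucket and prefix to list the objects under the backtest path and confirm the data was uploaded.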

In [None]:
params = {
    "AnomalyDetectorArn": anomaly_detector_arn,
    "MetricSetName" : project + '-metric-set-1',
    "MetricList" : [
        {
            "MetricName" : "views",
            "AggregationFunction" : "AVG",
        },
        {
            "MetricName" : "revenue",
            "AggregationFunction" : "SUM",
        },
    ],

    "DimensionList" : [ "platform", "marketplace" ],

    "TimestampColumn" : {
        "ColumnName" : "timestamp",
        "ColumnFormat" : "yyyy-MM-dd HH:mm:ss",
    },

    #"Delay" : 120, # seconds the detector will wait before attempting to read latest data per current time and detection frequency below
    "MetricSetFrequency" : frequency,

    "MetricSource" : {
        "S3SourceConfig": {
            "RoleArn" : role_arn,
            "HistoricalDataPathList": [
                s3_path_backtest,
            ],
#            "TemplatedPathList": [
#                s3_path_format,
#            ],

            "FileFormatDescriptor" : {
                "CsvFormatDescriptor" : {
                    "FileCompression" : "NONE",
                    "Charset" : "UTF-8",
                    "ContainsHeader" : True,
                    "Delimiter" : ",",
#                    "HeaderList" : [
#                        "platform",
#                        "marketplace",
#                        "timestamp",
#                        "views",
#                        "revenue"
#                    ],
                    "QuoteSymbol" : '"'
                },
            }
        }
    },
}

params
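
A metric set refers to CSV columns by name, so a mismatch between `params` and the file header is an easy mistake to make. The sketch below checks a CSV sample against the metric, dimension, and timestamp names. `check_csv_against_params` is a hypothetical helper, the sample rows are made up, and the Python format string is an assumed equivalent of the `yyyy-MM-dd HH:mm:ss` pattern used above.

```python
import csv
import io
from datetime import datetime

# Assumed Python equivalent of the "yyyy-MM-dd HH:mm:ss" ColumnFormat.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def check_csv_against_params(csv_text, params):
    """Return a list of problems found in a CSV sample (hypothetical helper)."""
    timestamp_column = params["TimestampColumn"]["ColumnName"]
    required = [m["MetricName"] for m in params["MetricList"]]
    required += params["DimensionList"]
    required.append(timestamp_column)

    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    problems = [f"missing column: {name}" for name in required if name not in header]

    # Only check timestamp values when the timestamp column is present.
    if timestamp_column in header:
        for row in reader:
            try:
                datetime.strptime(row[timestamp_column], TIMESTAMP_FORMAT)
            except (TypeError, ValueError):
                problems.append(f"bad timestamp: {row[timestamp_column]!r}")
    return problems


# Trimmed copy of the params above and made-up rows, for illustration only.
sample_params = {
    "MetricList": [{"MetricName": "views"}, {"MetricName": "revenue"}],
    "DimensionList": ["platform", "marketplace"],
    "TimestampColumn": {"ColumnName": "timestamp"},
}
sample_csv = (
    "platform,marketplace,timestamp,views,revenue\n"
    "pc_web,us,2021-01-01 00:00:00,100,25.5\n"
)
print(check_csv_against_params(sample_csv, sample_params))
```

An empty list means the sample passed every check. Running this on the first few lines of one of your own files before creating the metric set can save a failed activation later.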

In [None]:
response = L4M.create_metric_set( **params )

metric_set_arn = response["MetricSetArn"]
metric_set_arn

## Activate the Detector and Execute Backtesting

Now that the metric set has been specified, we are ready to start backtesting, which is done by activating the detector in backtest mode. The backtesting process can take 25 minutes or so, so feel free to take a break, grab a snack, and catch up on any articles you have saved. Note that when the status reads `BACK_TEST_ACTIVE`, the service has trained a model and is now evaluating the holdout period.

In [None]:
L4M.back_test_anomaly_detector(AnomalyDetectorArn = anomaly_detector_arn)

In [None]:
utility.wait_anomaly_detector( L4M, anomaly_detector_arn )
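
The implementation of `utility.wait_anomaly_detector` is not shown in this notebook. As a rough sketch of what such a wait could look like, the function below polls `describe_anomaly_detector` until the status reaches `BACK_TEST_COMPLETE` or `FAILED`. `wait_for_backtest` and its `poll_seconds` and `sleep` parameters are assumptions made for illustration, not the helper's actual code.

```python
import time


def wait_for_backtest(client, detector_arn, poll_seconds=60, sleep=time.sleep):
    """Poll the detector status until backtesting completes or fails (sketch)."""
    while True:
        response = client.describe_anomaly_detector(AnomalyDetectorArn=detector_arn)
        status = response["Status"]
        print(status)
        if status == "BACK_TEST_COMPLETE":
            return status
        if status == "FAILED":
            # FailureReason may be absent, so fall back to a generic message.
            raise RuntimeError(response.get("FailureReason", "detector failed"))
        sleep(poll_seconds)


# Usage in this notebook would look like:
# wait_for_backtest(L4M, anomaly_detector_arn)
```

Passing `sleep` as a parameter keeps the loop easy to exercise with a stand-in client, without waiting on a real detector.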

## Validate results

After backtesting finishes, you can validate the historical anomalies visually in the console or inspect the results by running the commands below. We recommend starting your exploration in the console, however: it will be your tool for viewing and understanding alerts in online mode later, so this is a good way to get familiar with the process.

In [None]:
anomaly_groups = []
next_token = None
first_response = None

while True:
    params = {
        "AnomalyDetectorArn" : anomaly_detector_arn,
        "SensitivityThreshold" : 50,
        "MaxResults" : 100,
    }
    
    if next_token:
        params["NextToken"] = next_token
    
    response = L4M.list_anomaly_group_summaries( **params )
    if first_response is None:
        first_response = response
    
    anomaly_groups += response["AnomalyGroupSummaryList"]
    
    if "NextToken" in response:
        next_token = response["NextToken"]
        continue
    break

first_response
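
The loop above follows the usual `NextToken` pagination pattern: call the API, collect one page, and repeat while the response contains a token. Because the same pattern appears again later in this notebook, it could be factored into a small generator. `paginate` is a hypothetical helper sketched here, not something provided by boto3 or `utility`.

```python
def paginate(call, list_key, **params):
    """Yield every item from a NextToken-paginated API call (hypothetical helper)."""
    next_token = None
    while True:
        if next_token:
            params["NextToken"] = next_token
        response = call(**params)
        yield from response[list_key]
        # Stop once the service no longer returns a continuation token.
        next_token = response.get("NextToken")
        if not next_token:
            break


# Usage in this notebook would look like:
# anomaly_groups = list(paginate(
#     L4M.list_anomaly_group_summaries,
#     "AnomalyGroupSummaryList",
#     AnomalyDetectorArn=anomaly_detector_arn,
#     SensitivityThreshold=50,
#     MaxResults=100,
# ))
```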

In [None]:
anomaly_groups
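
The raw list can be long, so it may help to look at the highest-scoring groups first. The sketch below sorts summaries by `AnomalyGroupScore`. `top_anomaly_groups` is a hypothetical helper, and the sample summaries are made up and contain only the fields used here.

```python
def top_anomaly_groups(groups, limit=5):
    """Return the highest-scoring anomaly group summaries first (hypothetical helper)."""
    ranked = sorted(groups, key=lambda g: g["AnomalyGroupScore"], reverse=True)
    return ranked[:limit]


# Made-up summaries for illustration; real values come from the service.
sample_groups = [
    {"AnomalyGroupId": "group-a", "AnomalyGroupScore": 42.0, "PrimaryMetricName": "views"},
    {"AnomalyGroupId": "group-b", "AnomalyGroupScore": 87.5, "PrimaryMetricName": "revenue"},
    {"AnomalyGroupId": "group-c", "AnomalyGroupScore": 63.1, "PrimaryMetricName": "views"},
]

for group in top_anomaly_groups(sample_groups, limit=2):
    print(group["AnomalyGroupId"], group["PrimaryMetricName"], group["AnomalyGroupScore"])
```

In the notebook, `top_anomaly_groups(anomaly_groups)` would apply the same ranking to the summaries collected above.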

To dive even deeper into a specific anomaly group, simply choose the anomaly group you are interested in and drill down to its time series. Here we will use the first anomaly group in the list.

In [None]:
anomaly_group_id = anomaly_groups[0]["AnomalyGroupId"]

anomaly_group_time_series = []
next_token = None
first_response = None

while True:
    params = {
        "AnomalyDetectorArn" : anomaly_detector_arn,
        "AnomalyGroupId" : anomaly_group_id,
        "MetricName" : "views",
        "MaxResults" : 10,
    }
    
    if next_token:
        params["NextToken"] = next_token

    response = L4M.list_anomaly_group_time_series( **params )
    if first_response is None:
        first_response = response
    
    anomaly_group_time_series += response["TimeSeriesList"]
    
    if "NextToken" in response:
        next_token = response["NextToken"]
        continue
    break

first_response

In [None]:
anomaly_group_time_series
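
Each time series comes back as a list of dimension values plus a list of metric values, while the timestamps are reported once for the whole response. The sketch below pairs them up to make the output easier to read. It assumes the response shape shown in the made-up sample (`TimestampList` at the top level, with `DimensionList` and `MetricValueList` in each entry), and `describe_time_series` is a hypothetical helper.

```python
def describe_time_series(response):
    """Map each series' dimensions to (timestamp, value) pairs (hypothetical helper)."""
    timestamps = response["TimestampList"]
    result = {}
    for series in response["TimeSeriesList"]:
        label = ", ".join(
            f'{d["DimensionName"]}={d["DimensionValue"]}'
            for d in series["DimensionList"]
        )
        result[label] = list(zip(timestamps, series["MetricValueList"]))
    return result


# Made-up response for illustration; real values come from the service.
sample_response = {
    "TimestampList": ["t0", "t1"],
    "TimeSeriesList": [
        {
            "DimensionList": [
                {"DimensionName": "platform", "DimensionValue": "pc_web"},
                {"DimensionName": "marketplace", "DimensionValue": "us"},
            ],
            "MetricValueList": [100.0, 250.0],
        },
    ],
}

for label, points in describe_time_series(sample_response).items():
    print(label, points)
```

In the notebook, `describe_time_series(first_response)` would summarize the first page of results from the loop above.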

## Clean up resources

Once we have completed backtesting, we can start to clean up the resources we created. Before cleaning up, you can visit the "Anomalies" page of the Amazon Lookout for Metrics console and visually check the detected anomalies.

Note that the cell below erases all of the resources that have been created, so wait to run it until you are sure you wish to delete everything.

In [None]:
answer = input("Delete resources? (y/n) ")
# Only an explicit "y" (any case, surrounding whitespace ignored) triggers deletion.
delete_resources = answer.strip().lower() == "y"

if delete_resources:
    L4M.delete_anomaly_detector( AnomalyDetectorArn = anomaly_detector_arn )
    utility.wait_delete_anomaly_detector( L4M, anomaly_detector_arn )
    utility.delete_iam_role(role_name)
else:
    print("Not deleting resources.")