# Creating and Evaluating Predictors: Part 2 - Related Time Series

This notebook builds on all the earlier work and requires that at least the target time series and related time series data have been imported. If you have not completed those steps yet, go back and do so before continuing.

At this point you have a target time series dataset and a related time series dataset loaded into a single Dataset Group, which is what Amazon Forecast requires in order to use the models that support related data. If your data includes item-level metadata, it could be added to the dataset group as well, and would benefit only the algorithms that support it (e.g. CNN-QR and DeepAR+, but **not** Prophet).

To continue the work, start with the imports, determine your region, establish your API connections, and load all previously stored values.

In [None]:
%load_ext autoreload
%autoreload 2

# Python Built-Ins:
import json
from types import SimpleNamespace

# External Dependencies:
import boto3
from IPython.display import Markdown
import pandas as pd
from pprint import pprint as prettyprint

# Local Dependencies:
import util

In [None]:
%store -r

In [None]:
session = boto3.Session(region_name=region)

forecast = session.client("forecast")
forecast_query = session.client("forecastquery")

s3 = session.resource("s3")
export_bucket = s3.Bucket(export_bucket_name)

## Creating and Training Predictors
 
Given that our data is hourly and we want to generate forecasts on the hour, Forecast limits us to a horizon of 500 time steps at that frequency, which means we could predict at most about 20 days into the future. Here we use a horizon of 240 hours (10 days).
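
The arithmetic behind those figures can be checked quickly. This is only a sketch: the 500-step limit is the one quoted above, and 240 is the value assigned to `forecast_horizon` in this notebook; the other variable names are made up for illustration.

```python
# Convert horizons expressed in hourly time steps into days.
HOURS_PER_DAY = 24

max_horizon_steps = 500     # limit quoted in the text above
chosen_horizon_steps = 240  # value used for forecast_horizon in this notebook

max_horizon_days = max_horizon_steps / HOURS_PER_DAY
chosen_horizon_days = chosen_horizon_steps / HOURS_PER_DAY

print(f"Maximum horizon: {max_horizon_days:.1f} days")
print(f"Chosen horizon: {chosen_horizon_days:.1f} days")
```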

The cells below will define a few variables to be used with all of our models. We'll then re-use these to create each `Predictor` we investigate.


In [None]:
forecast_horizon = 240
num_backtest_windows = 1
backtest_window_offset = 240
forecast_frequency = "H"
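evaluation_parameters = {
    # Re-use the variables defined above rather than repeating the literals
    "NumberOfBacktestWindows": num_backtest_windows,
    "BackTestWindowOffset": backtest_window_offset,
}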
input_data_config = {
    "DatasetGroupArn": datasetGroupArn,
    "SupplementaryFeatures": [
        { "Name": "holiday", "Value": "US" },
    ],
}

In [None]:
prophet_algorithm_arn = "arn:aws:forecast:::algorithm/Prophet"
deeparp_algorithm_arn = "arn:aws:forecast:::algorithm/Deep_AR_Plus"
cnnqr_algorithm_arn = "arn:aws:forecast:::algorithm/CNN-QR"

### Prophet

In [None]:
prophet_create_predictor_response = forecast.create_predictor(
    PredictorName=f"{project}_prophet_rel_algo_1",
    AlgorithmArn=prophet_algorithm_arn,
    ForecastHorizon=forecast_horizon,
    PerformAutoML=False,
    PerformHPO=False,
    EvaluationParameters=evaluation_parameters,
    InputDataConfig=input_data_config,
    FeaturizationConfig={
        "ForecastFrequency": forecast_frequency,
        "Featurizations": [
            {
                "AttributeName": "target_value",
                "FeaturizationPipeline": [
                    {
                        "FeaturizationMethodName": "filling",
                        "FeaturizationMethodParameters": {
                            "frontfill": "none",
                            "middlefill": "zero",
                            "backfill": "zero",
                        },
                    },
                ],
            },
        ],
    },
)
results["Prophet with RTS"] = SimpleNamespace(predictor_arn=prophet_create_predictor_response["PredictorArn"])

### DeepAR+

In [None]:
deeparp_create_predictor_response = forecast.create_predictor(
    PredictorName=f"{project}_deeparp_rel_algo_1",
    AlgorithmArn=deeparp_algorithm_arn,
    ForecastHorizon=forecast_horizon,
    PerformAutoML=False,
    PerformHPO=False,
    EvaluationParameters=evaluation_parameters,
    InputDataConfig=input_data_config,
    FeaturizationConfig={
        "ForecastFrequency": forecast_frequency,
        "Featurizations": [
            {
                "AttributeName": "target_value",
                "FeaturizationPipeline": [
                    {
                        "FeaturizationMethodName": "filling",
                        "FeaturizationMethodParameters": {
                            "frontfill": "none",
                            "middlefill": "zero",
                            "backfill": "zero",
                        },
                    },
                ],
            },
        ],
    },
)
results["DeepAR+ with RTS"] = SimpleNamespace(predictor_arn=deeparp_create_predictor_response["PredictorArn"])

### CNN-QR

In [None]:
# cnnqr_create_predictor_response = forecast.create_predictor(
#     PredictorName=f"{project}_cnnqr_rel_algo_1",
#     AlgorithmArn=cnnqr_algorithm_arn,
#     ForecastHorizon=forecast_horizon,
#     PerformAutoML=False,
#     PerformHPO=False,
#     EvaluationParameters=evaluation_parameters,
#     InputDataConfig=input_data_config,
#     FeaturizationConfig={
#         "ForecastFrequency": forecast_frequency,
#         "Featurizations": [
#             {
#                 "AttributeName": "target_value",
#                 "FeaturizationPipeline": [
#                     {
#                         "FeaturizationMethodName": "filling",
#                         "FeaturizationMethodParameters": {
#                             "frontfill": "none",
#                             "middlefill": "zero",
#                             "backfill": "zero",
#                         },
#                     },
#                 ],
#             },
#         ],
#     },
# )
# results["CNN-QR with RTS"] = SimpleNamespace(predictor_arn=cnnqr_create_predictor_response["PredictorArn"])

Training takes a while, so the cell below polls the status of each predictor until every one of them has either become active or failed.

If you would like to follow along in the console as well, your previous tab from opening this Jupyter Lab session should still be open. From there navigate to the Amazon Forecast service page, select your dataset group, then click `Predictors` to see the creation in progress. Once all the predictors are active you are ready to continue.

In [None]:
in_progress_predictors = [results[r].predictor_arn for r in results]
failed_predictors = []

def check_status():
    """Check and update in_progress_predictors"""
    just_stopped = []  # Can't edit the in_progress list directly inside the loop!
    for arn in in_progress_predictors:
        predictor_desc = forecast.describe_predictor(PredictorArn=arn)
        status = predictor_desc["Status"]
        if status == "ACTIVE":
            print(f"\nBuild succeeded for {arn}")
            just_stopped.append(arn)
        elif "FAILED" in status:
            print(f"\nBuild failed for {arn}")
            just_stopped.append(arn)
            failed_predictors.append(arn)
    for arn in just_stopped:
        in_progress_predictors.remove(arn)
    return in_progress_predictors

util.progress.polling_spinner(
    fn_poll_result=check_status,
    fn_is_finished=lambda l: len(l) == 0,
    fn_stringify_result=lambda l: f"{len(l)} predictor builds in progress",
    poll_secs=60,  # Poll every minute
    timeout_secs=3*60*60,  # Max 3 hours
)

if len(failed_predictors):
    raise RuntimeError(f"The following predictors failed to train:\n{failed_predictors}")
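
The local `util.progress.polling_spinner` helper is not shown in this notebook. For readers without it, a minimal stand-in might look like the sketch below. The name `poll_until` and the fake status function are hypothetical; the sketch only mirrors the arguments passed above and leaves out the spinner and result display.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def poll_until(
    fn_poll_result: Callable[[], T],
    fn_is_finished: Callable[[T], bool],
    poll_secs: float = 60,
    timeout_secs: float = 3 * 60 * 60,
) -> T:
    """Call fn_poll_result repeatedly until fn_is_finished accepts its result."""
    deadline = time.monotonic() + timeout_secs
    while True:
        result = fn_poll_result()
        if fn_is_finished(result):
            return result
        # Give up if the next poll would land after the deadline
        if time.monotonic() + poll_secs > deadline:
            raise TimeoutError(f"Still not finished after {timeout_secs} seconds")
        time.sleep(poll_secs)


# Hypothetical stand-in for check_status: one fake build finishes per poll.
pending = ["predictor-a", "predictor-b"]


def fake_check_status():
    if pending:
        pending.pop()
    return pending


remaining = poll_until(
    fake_check_status,
    lambda l: len(l) == 0,
    poll_secs=0.01,  # Short interval so the demo finishes quickly
    timeout_secs=5,
)
print(f"{len(remaining)} predictor builds in progress")
```

With the real `check_status` function you would keep the one-minute interval, since each poll makes a `describe_predictor` call per predictor.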

## Examining the Predictors

Once each of the Predictors is in an `ACTIVE` state you can get metrics about it to better understand its accuracy and behavior. These are computed based on the hold-out periods we defined when building the Predictor. The metrics are meant to guide our decisions when we use a particular Predictor to generate a forecast.
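
To make the two kinds of metric concrete, here is a minimal sketch of RMSE and weighted quantile loss (wQL), assuming the definitions given in the Amazon Forecast documentation. The function names and the toy numbers are invented for illustration; the service computes these over the backtest window for you.

```python
import math


def rmse(actuals, predictions):
    """Root mean squared error between actual and predicted values."""
    squared_errors = [(y - p) ** 2 for y, p in zip(actuals, predictions)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def weighted_quantile_loss(actuals, quantile_predictions, tau):
    """wQL at quantile tau: under-prediction is weighted by tau and
    over-prediction by (1 - tau), normalized by the sum of absolute actuals."""
    loss = sum(
        tau * max(y - q, 0.0) + (1 - tau) * max(q - y, 0.0)
        for y, q in zip(actuals, quantile_predictions)
    )
    return 2 * loss / sum(abs(y) for y in actuals)


# Toy data, not taken from any real predictor
actuals = [10.0, 20.0, 30.0]
predictions = [12.0, 18.0, 33.0]

print("RMSE:", rmse(actuals, predictions))
for tau in (0.1, 0.5, 0.9):
    print(f"wQL[{tau}]:", weighted_quantile_loss(actuals, predictions, tau))
```

Because these toy predictions mostly overshoot, the same values score worse when treated as a 10% quantile than as a 90% quantile: over-prediction is penalized more heavily at low quantiles.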

In [None]:
def evaluate_trial_metrics(trial_name=None) -> pd.DataFrame:
    """Utility to fetch the accuracy metrics for a predictor and output the leaderboard so far"""
    if trial_name:
        # Print the raw API response:
        metrics_response = forecast.get_accuracy_metrics(PredictorArn=results[trial_name].predictor_arn)
        print(f"Raw metrics for {trial_name}:")
        prettyprint(metrics_response)

        # Save the payload section to results:
        evaluation_results = metrics_response["PredictorEvaluationResults"]
        results[trial_name].evaluation_results = evaluation_results

        # Construct simplified version for our comparison:
        try:
            summary_metrics = next(
                w for w in evaluation_results[0]["TestWindows"] if w["EvaluationType"] == "SUMMARY"
            )["Metrics"]
        except StopIteration:
            raise ValueError("Couldn't find SUMMARY metrics in Forecast API response")
        results[trial_name].summary_metrics = {
            "RMSE": summary_metrics["RMSE"],
            "10% wQL": next(
                l["LossValue"] for l in summary_metrics["WeightedQuantileLosses"] if l["Quantile"] == 0.1
            ),
            "50% wQL (MAPE)": next(
                l["LossValue"] for l in summary_metrics["WeightedQuantileLosses"] if l["Quantile"] == 0.5
            ),
            "90% wQL": next(
                l["LossValue"] for l in summary_metrics["WeightedQuantileLosses"] if l["Quantile"] == 0.9
            ),
        }
    # Render the leaderboard:
    return pd.DataFrame([
        { "Predictor": name, **results[name].summary_metrics } for name in results
        if "summary_metrics" in results[name].__dict__
    ]).set_index("Predictor")
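
The function above navigates a nested response, so it may help to see the same extraction against a hand-written stand-in. The dictionary below is hypothetical: it contains only the fields the function reads, the numbers are placeholders rather than real results, and a real response carries more fields.

```python
# Hypothetical stand-in for a get_accuracy_metrics response (placeholder values)
mock_response = {
    "PredictorEvaluationResults": [
        {
            "TestWindows": [
                {
                    "EvaluationType": "COMPUTED",
                    "Metrics": {"RMSE": 1.5, "WeightedQuantileLosses": []},
                },
                {
                    "EvaluationType": "SUMMARY",
                    "Metrics": {
                        "RMSE": 1.0,
                        "WeightedQuantileLosses": [
                            {"Quantile": 0.9, "LossValue": 0.3},
                            {"Quantile": 0.5, "LossValue": 0.2},
                            {"Quantile": 0.1, "LossValue": 0.1},
                        ],
                    },
                },
            ]
        }
    ]
}

# Pick the SUMMARY window from the first evaluation result
test_windows = mock_response["PredictorEvaluationResults"][0]["TestWindows"]
summary = next(w for w in test_windows if w["EvaluationType"] == "SUMMARY")["Metrics"]

# Index the quantile losses by quantile for easy lookup
loss_by_quantile = {
    l["Quantile"]: l["LossValue"] for l in summary["WeightedQuantileLosses"]
}

print("RMSE:", summary["RMSE"])
print("wQL at 0.1, 0.5, 0.9:", [loss_by_quantile[q] for q in (0.1, 0.5, 0.9)])
```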

### Prophet

As in the earlier sessions, we'll look at the metrics for this Predictor. The related data metrics are added to the table from the previous notebook as well.

In [None]:
evaluate_trial_metrics("Prophet with RTS")

### DeepAR+

As with Prophet, now look at the metrics for the DeepAR+ Predictor.

In [None]:
evaluate_trial_metrics("DeepAR+ with RTS")

### CNN-QR

In [None]:
# evaluate_trial_metrics("CNN-QR with RTS")