# Creating and Evaluating Predictors: Part 2 - Related Time Series

This notebook builds on all the earlier work and requires that at least the import of the target time series and related time series data be complete. If you have not performed those steps yet, go back and do so, then continue.

At this point you have a target time series dataset and a related time series dataset loaded into a single Dataset Group, which is what Amazon Forecast requires in order to use the models that support related data. If your data includes item-level metadata, it could be added to the dataset group as well, although only DeepAR+ would benefit from it.

To continue the work, start with the imports, determine your region, establish your API connections, and load all previously stored values.

In [None]:
import boto3
from time import sleep
import subprocess
import pandas as pd
import json
import time
import pprint
import numpy as np
import util
import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter
import matplotlib.dates as mdates
from IPython.display import Markdown

In [None]:
with open('/opt/ml/metadata/resource-metadata.json') as notebook_info:
    data = json.load(notebook_info)
    resource_arn = data['ResourceArn']
    region = resource_arn.split(':')[3]
print(region)

In [None]:
session = boto3.Session(region_name=region)
forecast = session.client(service_name='forecast')
forecast_query = session.client(service_name='forecastquery')

In [None]:
%store -r
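
Later cells read `project`, `datasetGroupArn`, and `markdown_results`, all of which are expected to come back from `%store -r`. The check below is a minimal sketch for confirming they were restored; the helper name `missing_stored_values` is hypothetical and not part of this workshop's `util` module.

```python
# Names this notebook reads later; they are expected to be restored by %store -r.
REQUIRED_NAMES = ("project", "datasetGroupArn", "markdown_results")


def missing_stored_values(namespace, required=REQUIRED_NAMES):
    """Return the required names that are absent from the given namespace."""
    return [name for name in required if name not in namespace]


# In the notebook, run this right after %store -r.
missing = missing_stored_values(globals())
if missing:
    print("Missing stored values, re-run the earlier notebooks:", missing)
```

If anything is reported as missing, the predictor cells further down will fail with a `NameError`, so it is cheaper to catch it here.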

## Creating and Training Predictors
 
Given that our data is hourly and we want to generate a forecast on the hour, Forecast limits us to a horizon of 500 time steps at whatever frequency we choose. At an hourly frequency, this means we can predict about 20 days into the future.
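
The arithmetic behind that horizon can be sketched as follows. The 500-step limit is the figure quoted above; the helper names are hypothetical and exist only for this illustration.

```python
MAX_HORIZON_STEPS = 500  # limit quoted in the text above
HOURS_PER_DAY = 24       # hourly data, so one step is one hour


def horizon_in_days(steps, steps_per_day=HOURS_PER_DAY):
    """Convert a horizon expressed in time steps into days."""
    return steps / steps_per_day


def largest_whole_day_horizon(max_steps, steps_per_day=HOURS_PER_DAY):
    """Largest horizon, in steps, that fits under the limit and ends on a full day."""
    return (max_steps // steps_per_day) * steps_per_day


print(horizon_in_days(MAX_HORIZON_STEPS))            # a little under 21 days
print(largest_whole_day_horizon(MAX_HORIZON_STEPS))  # steps in the largest whole-day horizon
```

Rounding down to whole days gives 480 steps, which is the `forecastHorizon` value used in the cells that follow.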

The cells below define a few variables to be used with all of our models. An API call then creates each `Predictor`, based on Prophet and DeepAR+ respectively.


In [None]:
forecastHorizon = 480
NumberOfBacktestWindows = 1
BackTestWindowOffset = 480
ForecastFrequency = "H"

In [None]:
arima_algorithmArn = 'arn:aws:forecast:::algorithm/ARIMA'
prophet_algorithmArn = 'arn:aws:forecast:::algorithm/Prophet'
deepAR_Plus_algorithmArn = 'arn:aws:forecast:::algorithm/Deep_AR_Plus'

### Prophet

In [None]:
# Prophet Specifics
# Note the REL to indicate related time series data
prophet_predictorName= project+'_prophet_rel_algo_1'

In [None]:
# Build Prophet:
prophet_create_predictor_response=forecast.create_predictor(PredictorName=prophet_predictorName, 
                                                  AlgorithmArn=prophet_algorithmArn,
                                                  ForecastHorizon=forecastHorizon,
                                                  PerformAutoML= False,
                                                  PerformHPO=False,
                                                  EvaluationParameters= {"NumberOfBacktestWindows": NumberOfBacktestWindows, 
                                                                         "BackTestWindowOffset": BackTestWindowOffset}, 
                                                  InputDataConfig= {"DatasetGroupArn": datasetGroupArn, "SupplementaryFeatures": [ 
                                                                     { 
                                                                        "Name": "holiday",
                                                                        "Value": "US"
                                                                     }
                                                                  ]},
                                                  FeaturizationConfig= {"ForecastFrequency": ForecastFrequency, 
                                                                        "Featurizations": 
                                                                        [
                                                                          {"AttributeName": "target_value", 
                                                                           "FeaturizationPipeline": 
                                                                            [
                                                                              {"FeaturizationMethodName": "filling", 
                                                                               "FeaturizationMethodParameters": 
                                                                                {"frontfill": "none", 
                                                                                 "middlefill": "zero", 
                                                                                 "backfill": "zero"}
                                                                              }
                                                                            ]
                                                                          }
                                                                        ]
                                                                       }
                                                 )




### DeepAR+

In [None]:
# DeepAR+ Specifics
# Use a DeepAR+ specific variable rather than reusing the Prophet name
deeparp_predictorName= project+'_deeparp_rel_algo_1'

In [None]:
# Build DeepAR+:
deeparp_create_predictor_response=forecast.create_predictor(PredictorName=deeparp_predictorName, 
                                                  AlgorithmArn=deepAR_Plus_algorithmArn,
                                                  ForecastHorizon=forecastHorizon,
                                                  PerformAutoML= False,
                                                  PerformHPO=False,
                                                  EvaluationParameters= {"NumberOfBacktestWindows": NumberOfBacktestWindows, 
                                                                         "BackTestWindowOffset": BackTestWindowOffset}, 
                                                  InputDataConfig= {"DatasetGroupArn": datasetGroupArn, "SupplementaryFeatures": [ 
                                                                     { 
                                                                        "Name": "holiday",
                                                                        "Value": "US"
                                                                     }
                                                                  ]},
                                                  FeaturizationConfig= {"ForecastFrequency": ForecastFrequency, 
                                                                        "Featurizations": 
                                                                        [
                                                                          {"AttributeName": "target_value", 
                                                                           "FeaturizationPipeline": 
                                                                            [
                                                                              {"FeaturizationMethodName": "filling", 
                                                                               "FeaturizationMethodParameters": 
                                                                                {"frontfill": "none", 
                                                                                 "middlefill": "zero", 
                                                                                 "backfill": "zero"}
                                                                              }
                                                                            ]
                                                                          }
                                                                        ]
                                                                       }
                                                 )





Normally in our notebooks we would have a while loop that polls each of these to determine the status of the models in training. For simplicity's sake, here we rely on you opening a new browser tab and following along in the console until a predictor has been created for each algorithm. 

Your previous tab from opening this session of Jupyter Lab should still be open. From there, navigate to the Amazon Forecast service page, then select your dataset group. Lastly, click `Predictors` and you should see the creation in progress. Once they are active, you are ready to continue.
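
If you would rather stay in the notebook, the polling loop mentioned above could be sketched roughly as follows. It assumes the boto3 `describe_predictor` call, which returns a `Status` field; the helper name `wait_for_predictor` and its default intervals are hypothetical choices, not something this workshop provides.

```python
import time


def wait_for_predictor(describe_predictor, predictor_arn,
                       poll_seconds=60, max_polls=240, sleep=time.sleep):
    """Poll until the predictor is ACTIVE or has failed, then return its status.

    describe_predictor is passed in (for example forecast.describe_predictor)
    so the loop can be exercised without calling AWS.
    """
    for _ in range(max_polls):
        status = describe_predictor(PredictorArn=predictor_arn)["Status"]
        if status == "ACTIVE" or status.endswith("FAILED"):
            return status
        sleep(poll_seconds)
    raise TimeoutError("Predictor did not finish within the polling budget")


# Example call inside the notebook (commented out because it needs AWS access):
# wait_for_predictor(forecast.describe_predictor,
#                    prophet_create_predictor_response['PredictorArn'])
```

Training can take a long time, so a generous `max_polls` is deliberate; adjust both values to taste.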

## Examining the Predictors

Once each of the Predictors is in an `Active` state, you can get metrics about it to better understand its accuracy and behavior. These are computed based on the holdout periods we defined when building the Predictor. The metrics are meant to guide our decisions when we use a particular Predictor to generate a forecast.
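
To make the reported numbers more concrete, the sketch below computes an RMSE and a weighted quantile loss by hand on made-up values. The formulas reflect our understanding of how these metrics are commonly defined for Amazon Forecast; treat them as an illustration rather than the service's exact implementation. The function names are hypothetical.

```python
import math


def rmse(actuals, predictions):
    """Root mean squared error between actual values and point predictions."""
    squared = [(a - p) ** 2 for a, p in zip(actuals, predictions)]
    return math.sqrt(sum(squared) / len(squared))


def weighted_quantile_loss(actuals, quantile_predictions, tau):
    """Quantile loss at quantile tau, scaled by the total absolute demand.

    Under-predictions are weighted by tau and over-predictions by (1 - tau),
    so a high quantile such as 0.9 penalises forecasting too low more heavily.
    """
    total = sum(
        tau * max(a - q, 0.0) + (1.0 - tau) * max(q - a, 0.0)
        for a, q in zip(actuals, quantile_predictions)
    )
    return 2.0 * total / sum(abs(a) for a in actuals)


# Made-up values, purely for illustration.
actuals = [10, 20, 30]
predicted = [12, 18, 27]
print(rmse(actuals, predicted))
print(weighted_quantile_loss(actuals, predicted, tau=0.9))
print(weighted_quantile_loss(actuals, predicted, tau=0.1))
```

The same predictions score differently at 0.9 and 0.1 because they mostly fall below the actuals, which is why the tables in this notebook report each quantile separately.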

### Prophet

As in the earlier sessions, we will look at the metrics from this Predictor. We will also add the related data metrics to the table from the previous notebook.

In [None]:
# Prophet Metrics
prophet_arn = prophet_create_predictor_response['PredictorArn']
prophet_rd_metrics = forecast.get_accuracy_metrics(PredictorArn=prophet_arn)
pp = pprint.PrettyPrinter()
pp.pprint(prophet_rd_metrics)
prophet_rd_RMSEs= util.extract_json_values(prophet_rd_metrics, 'RMSE')
markdown_results.append(prophet_rd_RMSEs[0])
prophet_rd_lossValues = util.extract_json_values(prophet_rd_metrics, 'LossValue')
markdown_results = markdown_results+prophet_rd_lossValues[::-1][:3]
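
The cell above relies on `util.extract_json_values` to pull the numbers out of the response. As an alternative view, the sketch below reads the same fields directly, assuming the response has the `PredictorEvaluationResults` / `TestWindows` / `Metrics` layout printed by `pp.pprint`. The helper `summarize_accuracy` and the numbers in `example_response` are made up for illustration.

```python
def summarize_accuracy(response):
    """Return (rmse, {quantile: loss}) from a get_accuracy_metrics-style response."""
    windows = response["PredictorEvaluationResults"][0]["TestWindows"]
    # Prefer the SUMMARY window when present, otherwise fall back to the first one.
    window = next(
        (w for w in windows if w.get("EvaluationType") == "SUMMARY"),
        windows[0],
    )
    metrics = window["Metrics"]
    losses = {
        entry["Quantile"]: entry["LossValue"]
        for entry in metrics["WeightedQuantileLosses"]
    }
    # Sort by quantile so the order does not depend on how the service lists them.
    return metrics["RMSE"], dict(sorted(losses.items()))


# Made-up response with the assumed layout; real values come from the API.
example_response = {
    "PredictorEvaluationResults": [{
        "AlgorithmArn": "arn:aws:forecast:::algorithm/Prophet",
        "TestWindows": [{
            "EvaluationType": "SUMMARY",
            "Metrics": {
                "RMSE": 12.5,
                "WeightedQuantileLosses": [
                    {"Quantile": 0.9, "LossValue": 0.11},
                    {"Quantile": 0.5, "LossValue": 0.22},
                    {"Quantile": 0.1, "LossValue": 0.09},
                ],
            },
        }],
    }]
}

print(summarize_accuracy(example_response))
```

Keying the losses by quantile avoids depending on list order, which the `[::-1][:3]` slice above has to assume.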

In [None]:
Markdown("""
Here we see an RMSE of {0[12]}, which is no better than the original Prophet
RMSE of {0[4]}, indicating that we may not be best served using related data for this algorithm.

| Predictor | RMSE               | 10%                 | 50%                 | 90%                |
|-----------|--------------------|---------------------|---------------------|--------------------|
| ARIMA     | {0[0]}             | {0[1]}              | {0[2]}              | {0[3]}             |
| Prophet   | {0[4]}             | {0[5]}              | {0[6]}              | {0[7]}             |
| Prophet + Related Data| {0[12]}| {0[13]}             | {0[14]}             | {0[15]}            |
| DeepAR+   | {0[8]}             | {0[9]}              | {0[10]}             | {0[11]}            |

Digging into the metrics, we did not see a single improvement to Prophet. Next, let us see how DeepAR+ performed.
""".format(markdown_results))

### DeepAR+

As with Prophet, look at the metrics from this Predictor.

In [None]:
# DeepAR+ Metrics
deeparp_arn = deeparp_create_predictor_response['PredictorArn']
deeparp_rd_metrics = forecast.get_accuracy_metrics(PredictorArn=deeparp_arn)
pp = pprint.PrettyPrinter()
pp.pprint(deeparp_rd_metrics)
deeparp_rd_RMSEs= util.extract_json_values(deeparp_rd_metrics, 'RMSE')
markdown_results.append(deeparp_rd_RMSEs[0])
deeparp_rd_lossValues = util.extract_json_values(deeparp_rd_metrics, 'LossValue')
markdown_results = markdown_results+deeparp_rd_lossValues[::-1][:3]
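
With all twenty values now in `markdown_results`, the leader for each metric can also be picked out programmatically. The sketch below assumes the layout implied by the table indices in this notebook: groups of four values (RMSE, 10%, 50%, 90%) in the order ARIMA, Prophet, DeepAR+, Prophet + Related Data, DeepAR+ & Related Data. The helper `best_per_metric` and the demo numbers are hypothetical.

```python
PREDICTORS = [
    "ARIMA",
    "Prophet",
    "DeepAR+",
    "Prophet + Related Data",
    "DeepAR+ & Related Data",
]
METRICS = ["RMSE", "10%", "50%", "90%"]


def best_per_metric(flat_results, predictors=PREDICTORS, metrics=METRICS):
    """Map each metric to the predictor with the lowest (best) value."""
    width = len(metrics)
    # Slice the flat list into one row of metric values per predictor.
    rows = {
        name: [float(v) for v in flat_results[i * width:(i + 1) * width]]
        for i, name in enumerate(predictors)
    }
    return {
        metric: min(rows, key=lambda name: rows[name][col])
        for col, metric in enumerate(metrics)
    }


# Made-up numbers in the assumed layout; in the notebook pass markdown_results.
demo_results = [
    5.0, 0.50, 0.90, 0.60,   # ARIMA
    4.0, 0.40, 0.80, 0.50,   # Prophet
    3.0, 0.30, 0.50, 0.40,   # DeepAR+
    4.5, 0.45, 0.85, 0.55,   # Prophet + Related Data
    2.5, 0.20, 0.60, 0.30,   # DeepAR+ & Related Data
]
print(best_per_metric(demo_results))
```

Lower is better for every column here, so a simple minimum per column is enough.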

In [None]:
Markdown("""
Now, after training with DeepAR+, we can see an RMSE of {0[16]}, which is still not ideal. The full breakdown is:

| Predictor | RMSE               | 10%                 | 50%                 | 90%                |
|-----------|--------------------|---------------------|---------------------|--------------------|
| ARIMA     | {0[0]}             | {0[1]}              | {0[2]}              | {0[3]}             |
| Prophet   | {0[4]}             | {0[5]}              | {0[6]}              | {0[7]}             |
| Prophet + Related Data| {0[12]}| {0[13]}             | {0[14]}             | {0[15]}            |
| DeepAR+   | {0[8]}             | {0[9]}              | {0[10]}             | {0[11]}            |
| DeepAR+ & Related Data | {0[16]}| {0[17]}            | {0[18]}             | {0[19]}            |

From this table we can see that DeepAR+ with the related data is the leader for the 10% and 90% quantiles. If you are forecasting at either of those quantiles, it is the clear choice. However, if 50% is the target, then DeepAR+ without related data is the leader for now.

Additional work would need to be kicked off from here to determine the specific impact of these figures and how they compare to the existing Forecasting approaches performed by your customer.
""".format(markdown_results))