# Building Your Predictor

The next step after preparing and importing your data via `Getting_Data_Ready.ipynb` is to build your first model.

The overall process for this is:

* Setup
* Create a Predictor
* Create a Forecast
* Obtain a Forecast

To get started, simply execute the cells below:


## Setup


Import the standard Python Libraries that are used in this lesson.

In [1]:
import boto3
from time import sleep
import subprocess
import pandas as pd
import json
import time

The last part of the setup process is to connect to Amazon Forecast. The cell below creates the `forecast` and `forecastquery` clients used in the rest of this notebook.

In [2]:
session = boto3.Session(region_name='us-east-1') 
forecast = session.client(service_name='forecast') 
forecastquery = session.client(service_name='forecastquery')
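
Creating a client does not send a request, so a bad credential or missing permission only surfaces on the first API call. The sketch below is one way to check connectivity early. It assumes your credentials allow the read-only `list_dataset_groups` call; the helper name `can_reach_forecast` is hypothetical and not part of boto3.

```python
def can_reach_forecast(client):
    """Return True if a cheap read-only Amazon Forecast call succeeds.

    `client` is expected to behave like the boto3 `forecast` client.
    `can_reach_forecast` is a hypothetical helper, not a boto3 API.
    """
    try:
        response = client.list_dataset_groups()
    except Exception as error:  # e.g. missing credentials or permissions
        print(f"Could not reach Amazon Forecast: {error}")
        return False
    count = len(response.get("DatasetGroups", []))
    print(f"Connected; found {count} dataset group(s).")
    return True
```

In this notebook it would be called as `can_reach_forecast(forecast)`.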

## Create a Predictor

In the previous notebook, your data was imported into Amazon Forecast. Here you define your dataset information once more and then start building your model, which Forecast calls a predictor.

The forecast horizon is the number of time points to predict into the future. For weekly data, a value of 12 means 12 weeks. Our example uses yearly data and we want to forecast the next year, so we set it to 1.
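
As a rough illustration, the horizon is counted in units of the forecast frequency. The sketch below maps a few frequency codes to readable units; the `FREQUENCY_UNITS` table and the `describe_horizon` helper are hypothetical names introduced here, and the table covers only a subset of the codes Forecast accepts.

```python
# Subset of forecast frequency codes mapped to a readable unit
# (hypothetical helper table, for illustration only).
FREQUENCY_UNITS = {"Y": "year", "M": "month", "W": "week", "D": "day", "H": "hour"}


def describe_horizon(horizon, frequency):
    """Describe how far ahead a horizon reaches for a given frequency code."""
    unit = FREQUENCY_UNITS[frequency]
    suffix = "" if horizon == 1 else "s"
    return f"{horizon} {unit}{suffix}"


print(describe_horizon(12, "W"))  # weekly data, horizon 12
print(describe_horizon(1, "Y"))   # yearly data, horizon 1 (this notebook)
```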

In [3]:
project = 'cof_revenue_forecastdemo' # This should be the same as the project defined in your previous notebook

In [4]:
predictorName= project+'_autoML'

In [8]:
forecastHorizon = 1

In [9]:
algorithmArn = 'arn:aws:forecast:::algorithm/autoML'

In [10]:
datasetGroupArn = "arn:aws:forecast:us-east-1:457927431838:dataset-group/cof_revenue_forecastdemo_dsg" 
# Fill in the quotes with the DatasetGroupArn from the output of the previous notebook.

In [11]:
create_predictor_response = forecast.create_predictor(
    PredictorName=predictorName,
    # AlgorithmArn=algorithmArn,  # omitted: AutoML selects the algorithm
    ForecastHorizon=forecastHorizon,
    PerformAutoML=True,
    PerformHPO=False,
    EvaluationParameters={"NumberOfBacktestWindows": 3,
                          "BackTestWindowOffset": 3},
    InputDataConfig={"DatasetGroupArn": datasetGroupArn},
    FeaturizationConfig={"ForecastFrequency": "Y"},
)

In [12]:
predictorArn=create_predictor_response['PredictorArn']

Check the status of the predictor. When the status changes from **CREATE_IN_PROGRESS** to **ACTIVE**, we can continue to the next steps. Depending on data size, model selection, and hyperparameters, it can take from 10 minutes to more than an hour to become **ACTIVE**.

In [13]:
while True:
    predictorStatus = forecast.describe_predictor(PredictorArn=predictorArn)['Status']
    print(predictorStatus)
    if predictorStatus != 'ACTIVE' and predictorStatus != 'CREATE_FAILED':
        sleep(30)
    else:
        break

CREATE_IN_PROGRESS
...
CREATE_IN_PR
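
The polling loop above has no timeout, so it keeps running if the status never reaches one of its two exit values. A minimal sketch of a reusable wait helper with a deadline follows. The name `wait_for_status`, its default values, and the choice to treat any status ending in `_FAILED` as terminal are assumptions made for illustration.

```python
import time


def wait_for_status(get_status, poll_seconds=30, timeout_seconds=3 * 60 * 60):
    """Poll `get_status()` until it returns ACTIVE or a status ending in _FAILED.

    Returns the terminal status. Raises TimeoutError if no terminal status
    is seen before `timeout_seconds` have elapsed.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        print(status)
        if status == "ACTIVE" or status.endswith("_FAILED"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Still {status} after {timeout_seconds} seconds")
        time.sleep(poll_seconds)
```

With the clients from this notebook, it could be used as `wait_for_status(lambda: forecast.describe_predictor(PredictorArn=predictorArn)['Status'])`. Passing a callable keeps the helper independent of which resource is being polled.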

### Get Error Metrics

In [14]:
forecast.get_accuracy_metrics(PredictorArn=predictorArn)

{'PredictorEvaluationResults': [{'TestWindows': [{'EvaluationType': 'SUMMARY',
     'Metrics': {'RMSE': 2042440816.412962,
      'WeightedQuantileLosses': [{'Quantile': 0.9,
        'LossValue': 0.095468952116404},
       {'Quantile': 0.5, 'LossValue': 0.11524933289397565},
       {'Quantile': 0.1, 'LossValue': 0.05878505428922506}]}},
    {'TestWindowStart': datetime.datetime(2009, 12, 31, 0, 0, tzinfo=tzlocal()),
     'TestWindowEnd': datetime.datetime(2010, 12, 31, 0, 0, tzinfo=tzlocal()),
     'ItemCount': 1,
     'EvaluationType': 'COMPUTED',
     'Metrics': {'RMSE': 1966888807.9952602,
      'WeightedQuantileLosses': [{'Quantile': 0.9,
        'LossValue': 0.04854085499173587},
       {'Quantile': 0.5, 'LossValue': 0.15149413533552536},
       {'Quantile': 0.1, 'LossValue': 0.10851119228226838}]}},
    {'TestWindowStart': datetime.datetime(2015, 12, 31, 0, 0, tzinfo=tzlocal()),
     'TestWindowEnd': datetime.datetime(2016, 12, 31, 0, 0, tzinfo=tzlocal()),
     'ItemCount': 1,
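
The response nests the metrics several levels deep. A minimal sketch for pulling out the `SUMMARY` rows is shown here. The helper name `summarize_metrics` and the `wQL[...]` column labels are hypothetical, and the sample input copies only the `SUMMARY` window from the output above.

```python
def summarize_metrics(response):
    """Return RMSE and weighted quantile losses for each SUMMARY test window."""
    rows = []
    for result in response["PredictorEvaluationResults"]:
        for window in result["TestWindows"]:
            if window["EvaluationType"] != "SUMMARY":
                continue  # skip the per-window COMPUTED entries
            metrics = window["Metrics"]
            row = {"RMSE": metrics["RMSE"]}
            for loss in metrics["WeightedQuantileLosses"]:
                row[f"wQL[{loss['Quantile']}]"] = loss["LossValue"]
            rows.append(row)
    return rows


# Sample input: the SUMMARY window from the output above.
sample_metrics = {
    "PredictorEvaluationResults": [{
        "TestWindows": [{
            "EvaluationType": "SUMMARY",
            "Metrics": {
                "RMSE": 2042440816.412962,
                "WeightedQuantileLosses": [
                    {"Quantile": 0.9, "LossValue": 0.095468952116404},
                    {"Quantile": 0.5, "LossValue": 0.11524933289397565},
                    {"Quantile": 0.1, "LossValue": 0.05878505428922506},
                ],
            },
        }],
    }],
}

for row in summarize_metrics(sample_metrics):
    print(row)
```

In the notebook, the same helper could be applied to the value returned by `forecast.get_accuracy_metrics(PredictorArn=predictorArn)`.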
   

## Create a Forecast

Now create a forecast using the trained model.

In [15]:
forecastName= project+'_autoML_forecast'

In [16]:
create_forecast_response=forecast.create_forecast(ForecastName=forecastName,
                                                  PredictorArn=predictorArn)
forecastArn = create_forecast_response['ForecastArn']

Check the status of the forecast process. When the status changes from **CREATE_IN_PROGRESS** to **ACTIVE**, we can continue to the next steps. Depending on data size, model selection, and hyperparameters, it can take from 10 minutes to more than an hour to become **ACTIVE**.

In [17]:
while True:
    forecastStatus = forecast.describe_forecast(ForecastArn=forecastArn)['Status']
    print(forecastStatus)
    if forecastStatus != 'ACTIVE' and forecastStatus != 'CREATE_FAILED':
        sleep(30)
    else:
        break

CREATE_IN_PROGRESS
...
ACTIVE


### Get Forecast

Once created, the forecast results are ready for you to view.

In [18]:
print(forecastArn)
print()
forecastResponse = forecastquery.query_forecast(
    ForecastArn=forecastArn,
    Filters={"metric_name":"Revenue"}
)
print(forecastResponse)

arn:aws:forecast:us-east-1:457927431838:forecast/cof_revenue_forecastdemo_autoML_forecast

{'Forecast': {'Predictions': {'p10': [{'Timestamp': '2018-01-01T00:00:00', 'Value': 26715791360.0}], 'p50': [{'Timestamp': '2018-01-01T00:00:00', 'Value': 28476710912.0}], 'p90': [{'Timestamp': '2018-01-01T00:00:00', 'Value': 30237632512.0}]}}, 'ResponseMetadata': {'RequestId': '35e05459-d53a-4b6e-883b-c4cb11cde6b9', 'HTTPStatusCode': 200, 'HTTPHeaders': {'content-type': 'application/x-amz-json-1.1', 'date': 'Wed, 09 Oct 2019 07:16:33 GMT', 'x-amzn-requestid': '35e05459-d53a-4b6e-883b-c4cb11cde6b9', 'content-length': '233', 'connection': 'keep-alive'}, 'RetryAttempts': 0}}
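
The raw response groups predictions by quantile, which is hard to read once there is more than one time point. The sketch below pivots it so each timestamp maps to its p10, p50, and p90 values. The helper name `predictions_by_timestamp` is hypothetical, and the sample input copies the `Forecast` part of the output above.

```python
def predictions_by_timestamp(response):
    """Pivot {'p10': [...], 'p50': [...]} into {timestamp: {'p10': value, ...}}."""
    table = {}
    for quantile, points in response["Forecast"]["Predictions"].items():
        for point in points:
            table.setdefault(point["Timestamp"], {})[quantile] = point["Value"]
    return table


# Sample input: the 'Forecast' part of the response printed above.
sample_forecast = {"Forecast": {"Predictions": {
    "p10": [{"Timestamp": "2018-01-01T00:00:00", "Value": 26715791360.0}],
    "p50": [{"Timestamp": "2018-01-01T00:00:00", "Value": 28476710912.0}],
    "p90": [{"Timestamp": "2018-01-01T00:00:00", "Value": 30237632512.0}],
}}}

for timestamp, quantiles in predictions_by_timestamp(sample_forecast).items():
    print(timestamp, quantiles)
```

In the notebook, `predictions_by_timestamp(forecastResponse)` would give the same view of the live response.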


## Next Steps

Now that your forecast has been created, use the ARN printed above to evaluate it in `Evaluating_Your_Predictor.ipynb`.