## Predict with Amazon Forecast

In this notebook we train a model with Amazon Forecast on the data set prepared earlier.



In [1]:
import boto3
from time import sleep
import subprocess

session = boto3.Session(region_name='us-east-1') # Check supported regions

forecast = session.client(service_name='forecast')
forecastquery = session.client(service_name='forecastquery')

In [2]:
# Check available algorithms
forecast.list_recipes()

{'RecipeNames': ['forecast_ARIMA',
  'forecast_DEEP_AR',
  'forecast_DEEP_AR_PLUS',
  'forecast_ETS',
  'forecast_MDN',
  'forecast_MQRNN',
  'forecast_NPTS',
  'forecast_PROPHET',
  'forecast_SQF'],
 'ResponseMetadata': {'RequestId': 'a2c52cfb-1c5c-48ec-8c02-a35c9467b03c',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'content-type': 'application/x-amz-json-1.1',
   'date': 'Mon, 11 Mar 2019 15:20:11 GMT',
   'x-amzn-requestid': 'a2c52cfb-1c5c-48ec-8c02-a35c9467b03c',
   'content-length': '174',
   'connection': 'keep-alive'},
  'RetryAttempts': 0}}
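
The response is an ordinary Python dictionary, so the recipe names can be separated from the HTTP metadata with plain indexing. A minimal sketch, using a trimmed literal copy of the response above instead of a live call (the `short_names` variable is only illustrative):

```python
# Trimmed copy of the list_recipes() response shown above (metadata shortened).
response = {
    'RecipeNames': ['forecast_ARIMA', 'forecast_DEEP_AR', 'forecast_DEEP_AR_PLUS',
                    'forecast_ETS', 'forecast_MDN', 'forecast_MQRNN',
                    'forecast_NPTS', 'forecast_PROPHET', 'forecast_SQF'],
    'ResponseMetadata': {'HTTPStatusCode': 200, 'RetryAttempts': 0},
}

# Keep only the recipe names and strip the common 'forecast_' prefix for display.
recipe_names = response['RecipeNames']
short_names = [name[len('forecast_'):] for name in recipe_names]

print(short_names)
# Confirm that the recipe used later in this notebook is offered.
print('forecast_MQRNN' in recipe_names)
```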

### 1. Train/Test split

To evaluate forecast quality later, we hold out the last 15 days of data (everything from 2017-08-01 onward) and train on the rest.

In [3]:
import pandas as pd
# Read every column as a string; the file has no header row
df = pd.read_csv("target_time_series.csv", dtype=object, header=None)
df.head(3)

       0           1    2
0  96995  2013-02-04  1.0
1  96995  2013-02-05  0.0
2  96995  2013-02-06  0.0


In [4]:
df[1] = pd.to_datetime(df[1])

In [5]:
df = df[df[1] < '2017-08-01']  # keep only the rows before the hold-out period

In [6]:
df.to_csv('target_time_series_train.csv',index=None,header=None)
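
The cells above keep only the training rows. A possible variation is to keep the held-out rows as well, so they are available when the forecast is evaluated. The sketch below runs on a small synthetic frame rather than the real file; the frame `df_demo`, its date range, and the assumption that the data ends on 2017-08-15 are illustrative only.

```python
import pandas as pd

# Synthetic stand-in for target_time_series.csv: one item with daily demand.
dates = pd.date_range('2017-07-25', '2017-08-15', freq='D')
df_demo = pd.DataFrame({0: '96995', 1: dates, 2: 1.0})

cutoff = pd.Timestamp('2017-08-01')  # same cutoff as the cell above

train = df_demo[df_demo[1] < cutoff]   # rows sent to Amazon Forecast
test = df_demo[df_demo[1] >= cutoff]   # held-out rows for later evaluation

print(len(train), len(test))
print(test[1].min().date(), test[1].max().date())
```

The `test` frame could be written to its own CSV file with `to_csv`, in the same way as the training file.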

### 2. Upload data to S3

In [7]:
s3 = session.client('s3')

# Use the same session (and region) as the other clients
accountId = session.client('sts').get_caller_identity().get('Account')

bucketName = 'amazon-forecast-chrisking-data-mg'  # Update to your bucket name
key = "favorita/target_time_series_train.csv"

s3.upload_file(Filename="target_time_series_train.csv", Bucket=bucketName, Key=key)

roleArn = 'arn:aws:iam::%s:role/amazonforecast'%accountId

### 3. Create Dataset

Now we are going to create the data set schema in Amazon Forecast.

In [8]:
DATASET_FREQUENCY = "D" 
TIMESTAMP_FORMAT = "yyyy-MM-dd"

In [23]:
project = 'favorita_forecast3' # Replace with a unique name; the full name must be shorter than 30 characters.
datasetName= project+'_ds3'
datasetGroupName= project +'_gp3'
s3DataPath = "s3://"+bucketName+"/"+key
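
Since the names are built by string concatenation, it can help to check the length limit before calling the service. The helpers below are a hypothetical sketch: `build_names`, `s3_uri`, and `MAX_NAME_LENGTH` are not part of boto3, and the limit of 30 is taken from the comment in the cell above rather than from the service documentation.

```python
MAX_NAME_LENGTH = 30  # limit taken from the comment in the cell above

def build_names(project, suffix):
    """Return (dataset name, dataset group name); raise if either is too long."""
    names = (f"{project}_ds{suffix}", f"{project}_gp{suffix}")
    for name in names:
        if len(name) >= MAX_NAME_LENGTH:
            raise ValueError(
                f"{name!r} has {len(name)} characters; must be < {MAX_NAME_LENGTH}"
            )
    return names

def s3_uri(bucket, key):
    """Join a bucket and key into the s3:// form used for S3Uri."""
    return f"s3://{bucket}/{key}"

print(build_names('favorita_forecast3', 3))
print(s3_uri('my-bucket', 'favorita/target_time_series_train.csv'))
```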

In [12]:
# Specify the schema of your dataset here. Make sure the order of columns matches the raw data files.
schema ={
   "Attributes":[
      {
         "AttributeName":"item_id",
         "AttributeType":"string"
      },
      {
         "AttributeName":"timestamp",
         "AttributeType":"timestamp"
      },
      {
         "AttributeName":"demand",
         "AttributeType":"float"
      }
   ]
}

response=forecast.create_dataset(
                    Domain="RETAIL",
                    DatasetType='TARGET_TIME_SERIES',
                    DataFormat='CSV',
                    DatasetName=datasetName,
                    DataFrequency=DATASET_FREQUENCY, 
                    TimeStampFormat=TIMESTAMP_FORMAT,
                    Schema = schema
                   )
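
Because the schema must list the columns in the same order as the CSV file, a quick local check before uploading can catch mismatches early. This is a minimal sketch using only the standard library; `PARSERS` and `check_rows` are hypothetical helpers, and `"%Y-%m-%d"` is assumed to be the Python equivalent of the `yyyy-MM-dd` timestamp format.

```python
import csv
import io
from datetime import datetime

# Same attribute order as the schema above.
schema = {"Attributes": [
    {"AttributeName": "item_id", "AttributeType": "string"},
    {"AttributeName": "timestamp", "AttributeType": "timestamp"},
    {"AttributeName": "demand", "AttributeType": "float"},
]}

# Hypothetical parsers, one per attribute type.
PARSERS = {
    "string": str,
    "timestamp": lambda value: datetime.strptime(value, "%Y-%m-%d"),
    "float": float,
}

def check_rows(lines, schema):
    """Return the number of rows checked; raise ValueError on the first mismatch."""
    attributes = schema["Attributes"]
    count = 0
    for row in csv.reader(lines):
        if len(row) != len(attributes):
            raise ValueError(f"expected {len(attributes)} columns, got {len(row)}: {row}")
        for value, attribute in zip(row, attributes):
            PARSERS[attribute["AttributeType"]](value)  # raises ValueError if malformed
        count += 1
    return count

sample = io.StringIO("96995,2013-02-04,1.0\n96995,2013-02-05,0.0\n")
print(check_rows(sample, schema))
```

To check the real file, an open file handle for `target_time_series_train.csv` can be passed in place of `sample`.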

In [13]:
forecast.create_dataset_group(DatasetGroupName=datasetGroupName,RoleArn=roleArn,DatasetNames=[datasetName])

{'DatasetGroupName': 'favorita_forecast3_gp3',
 'DatasetGroupArn': 'arn:aws:forecast:us-east-1:452432741922:dsgroup/favorita_forecast3_gp3',
 'ResponseMetadata': {'RequestId': 'd275a882-32fc-4a26-a637-c0babd6d7cef',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'content-type': 'application/x-amz-json-1.1',
   'date': 'Mon, 11 Mar 2019 15:24:08 GMT',
   'x-amzn-requestid': 'd275a882-32fc-4a26-a637-c0babd6d7cef',
   'content-length': '136',
   'connection': 'keep-alive'},
  'RetryAttempts': 0}}

### 4. Create Data Import Job
The import job loads the raw data from S3 into Amazon Forecast so that it is ready to be used for forecasting.

In [31]:
forecast.list_dataset_groups()

{'DatasetGroupNames': ['acindar_forecast2_gp2',
  'acindar_gp2',
  'acindar_gpv2',
  'favorita_forecast2_gp2',
  'favorita_forecast3_gp3',
  'favorita_forecastdemo_gpsimple'],
 'ResponseMetadata': {'RequestId': '4d984544-6272-489d-b834-b30cb060d9ed',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'content-type': 'application/x-amz-json-1.1',
   'date': 'Mon, 11 Mar 2019 18:08:04 GMT',
   'x-amzn-requestid': '4d984544-6272-489d-b834-b30cb060d9ed',
   'content-length': '159',
   'connection': 'keep-alive'},
  'RetryAttempts': 0}}

In [29]:
ds_import_job_response = forecast.create_dataset_import_job(
    DatasetName=datasetName,
    Delimiter=',',
    DatasetGroupName=datasetGroupName,
    S3Uri=s3DataPath
)

In [15]:
ds_versionId=ds_import_job_response['VersionId']
print(ds_versionId)

66acc2bb


In [16]:
while True:
    dataImportStatus = forecast.describe_dataset_import_job(DatasetName=datasetName,VersionId=ds_versionId)['Status']
    print(dataImportStatus)
    if dataImportStatus != 'ACTIVE' and dataImportStatus != 'FAILED':
        sleep(30)
    else:
        break

QUEUED
CREATING
ACTIVE
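
The polling loop above runs until the job reaches a terminal state, however long that takes. One possible refinement is a reusable helper with a timeout that prints the status only when it changes. The sketch below is an assumption-laden illustration: `wait_for_status` is a hypothetical helper, not a boto3 function, and a fake status source stands in for the describe call so that it runs without AWS access.

```python
from time import sleep, monotonic

def wait_for_status(get_status, poll_seconds=30, timeout_seconds=3600,
                    done=('ACTIVE', 'FAILED')):
    """Poll get_status() until it returns a terminal state or the timeout expires."""
    deadline = monotonic() + timeout_seconds
    last = None
    while True:
        status = get_status()
        if status != last:  # print only when the status changes
            print(status)
            last = status
        if status in done:
            return status
        if monotonic() >= deadline:
            raise TimeoutError(f"still {status} after {timeout_seconds} seconds")
        sleep(poll_seconds)

# Stand-in for the describe call so the sketch runs without AWS access.
fake_statuses = iter(['QUEUED', 'CREATING', 'CREATING', 'ACTIVE'])
final = wait_for_status(lambda: next(fake_statuses), poll_seconds=0)
print(final)
```

With the client from this notebook, `get_status` could be `lambda: forecast.describe_dataset_import_job(DatasetName=datasetName, VersionId=ds_versionId)['Status']`.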


### 5. Create Solution with your own forecast horizon

We use a forecast horizon of 30 days, even though we will only evaluate predictions on the 15 days we left out of training.

In [17]:
predictorName= project+'_mqrnn3'

In [18]:
forecastHorizon = 30

In [19]:
forecast.list_recipes()

{'RecipeNames': ['forecast_ARIMA',
  'forecast_DEEP_AR',
  'forecast_DEEP_AR_PLUS',
  'forecast_ETS',
  'forecast_MDN',
  'forecast_MQRNN',
  'forecast_NPTS',
  'forecast_PROPHET',
  'forecast_SQF'],
 'ResponseMetadata': {'RequestId': 'd491385f-96d2-4eae-9285-a41285bc04d7',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'content-type': 'application/x-amz-json-1.1',
   'date': 'Mon, 11 Mar 2019 16:05:12 GMT',
   'x-amzn-requestid': 'd491385f-96d2-4eae-9285-a41285bc04d7',
   'content-length': '174',
   'connection': 'keep-alive'},
  'RetryAttempts': 0}}

In [20]:
createPredictorResponse = forecast.create_predictor(
    RecipeName='forecast_MQRNN',
    DatasetGroupName=datasetGroupName,
    PredictorName=predictorName,
    ForecastHorizon=forecastHorizon
)

In [21]:
predictorVerionId=createPredictorResponse['VersionId']

In [24]:
predictorVerionId

'4b6a3da5'

In [None]:
while True:
    predictorStatus = forecast.describe_predictor(PredictorName=predictorName,VersionId=predictorVerionId)['Status']
    print(predictorStatus)
    if predictorStatus != 'ACTIVE' and predictorStatus != 'FAILED':
        sleep(30)
    else:
        break

CREATING

In [33]:
forecast.describe_predictor(PredictorName=predictorName,VersionId=predictorVerionId)

{'PredictorName': 'favorita_forecast3_mqrnn3',
 'VersionId': '4b6a3da5',
 'Status': 'ACTIVE',
 'LastModificationTime': 1552334150.267,
 'PredictorArn': 'arn:aws:forecast:us-east-1:452432741922:predictor/favorita_forecast3_mqrnn3',
 'RecipeName': 'forecast_MQRNN',
 'DatasetGroup': 'favorita_forecast3_gp3',
 'RecipeParameters': {},
 'ResponseMetadata': {'RequestId': '1f30012b-9e07-4212-ab20-d77bc94a1b27',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'content-type': 'application/x-amz-json-1.1',
   'date': 'Mon, 11 Mar 2019 19:56:13 GMT',
   'x-amzn-requestid': '1f30012b-9e07-4212-ab20-d77bc94a1b27',
   'content-length': '385',
   'connection': 'keep-alive'},
  'RetryAttempts': 0}}

### 6. Get Accuracy Metrics

In [34]:
project = 'favorita_forecast3' # Replace with a unique name; the full name must be shorter than 30 characters.
predictorName= project+'_mqrnn3'

In [35]:
forecastquery.get_accuracy_metrics(PredictorName=predictorName)

{'ModelMetrics': {'MQRNN': {'Metrics': {'p10': '0.1899730645663291',
    'p50': '0.6563960518373003',
    'p90': '0.46699407174441915',
    'rmse': '5.51639276471261'},
   'MetricsByBucket': []}},
 'ResponseMetadata': {'RequestId': '6b725d19-d4fe-4aec-9493-552f050557e9',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'content-type': 'application/x-amz-json-1.1',
   'date': 'Mon, 11 Mar 2019 19:58:38 GMT',
   'x-amzn-requestid': '6b725d19-d4fe-4aec-9493-552f050557e9',
   'content-length': '169',
   'connection': 'keep-alive'},
  'RetryAttempts': 0}}
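
The response does not say how these numbers are computed. Assuming that `p10`, `p50` and `p90` are weighted quantile losses at the 0.1, 0.5 and 0.9 quantiles, and that `rmse` is the root-mean-square error of a point forecast, they could be reproduced locally along the following lines. The functions and the sample values are illustrative, not the service's implementation.

```python
import math

def weighted_quantile_loss(actuals, quantile_forecasts, tau):
    """Twice the summed pinball loss at quantile tau, divided by the sum of |actuals|."""
    pinball = sum(
        tau * max(y - q, 0.0) + (1.0 - tau) * max(q - y, 0.0)
        for y, q in zip(actuals, quantile_forecasts)
    )
    return 2.0 * pinball / sum(abs(y) for y in actuals)

def rmse(actuals, point_forecasts):
    """Root-mean-square error between actual values and point forecasts."""
    squared = sum((y - f) ** 2 for y, f in zip(actuals, point_forecasts))
    return math.sqrt(squared / len(actuals))

actuals = [1.0, 0.0, 3.0, 2.0]  # made-up demand values
p50 = [1.0, 1.0, 2.0, 2.0]      # made-up median forecasts

print(weighted_quantile_loss(actuals, p50, 0.5))
print(rmse(actuals, p50))
```

Under this definition, lower values are better for every metric.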

### 7. Deploy Predictor

In [37]:
forecast.deploy_predictor(PredictorName=predictorName)

{'PredictorName': 'favorita_forecast3_mqrnn3',
 'VersionId': '4b6a3da5',
 'PredictorArn': 'arn:aws:forecast:us-east-1:452432741922:predictor/favorita_forecast3_mqrnn3',
 'ResponseMetadata': {'RequestId': '4ec9f40b-4522-48bb-b577-b38b1aa7f6cb',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'content-type': 'application/x-amz-json-1.1',
   'date': 'Mon, 11 Mar 2019 19:59:22 GMT',
   'x-amzn-requestid': '4ec9f40b-4522-48bb-b577-b38b1aa7f6cb',
   'content-length': '161',
   'connection': 'keep-alive'},
  'RetryAttempts': 0}}

In [38]:
deployedPredictorsResponse=forecast.list_deployed_predictors()
print(deployedPredictorsResponse)

{'PredictorNames': ['acindar_arima', 'acindar_deeparplus', 'favorita_forecast3_mqrnn3'], 'ResponseMetadata': {'RequestId': '5cd445e6-ef9f-44f5-bd13-f663166cd565', 'HTTPStatusCode': 200, 'HTTPHeaders': {'content-type': 'application/x-amz-json-1.1', 'date': 'Mon, 11 Mar 2019 19:59:30 GMT', 'x-amzn-requestid': '5cd445e6-ef9f-44f5-bd13-f663166cd565', 'content-length': '85', 'connection': 'keep-alive'}, 'RetryAttempts': 0}}


In [None]:
while True:
    deployedPredictorStatus = forecast.describe_deployed_predictor(PredictorName=predictorName)['Status']
    print(deployedPredictorStatus)
    if deployedPredictorStatus != 'ACTIVE' and deployedPredictorStatus != 'FAILED':
        sleep(30)
    else:
        break
print(deployedPredictorStatus)

CREATING

In the <a href='4.InferWithDeployedPredictor.ipynb'>next notebook</a> we use the deployed predictor and compare its results with the ground truth from our test set.