# How to use Amazon Forecast

This notebook helps advanced users get started with Amazon Forecast quickly. It runs through a typical end-to-end use case for a simple time series forecasting scenario.

Prerequisites: 
[AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/installing.html).

For more information about the APIs, see the [documentation](https://docs.aws.amazon.com/forecast/latest/dg/what-is-forecast.html).

## Table Of Contents
* [Setting up](#setup)
* [Test Setup - Running first API](#hello)
* [Forecasting Example with Amazon Forecast](#forecastingExample)

## Set up Preview SDK<a class="anchor" id="setup"></a>

In [1]:
# Registers the Amazon Forecast preview service models with your AWS CLI so it recognizes the new service
!aws configure add-model --service-model file://../sdk/forecastquery-2018-06-26.normal.json --service-name forecastquery
!aws configure add-model --service-model file://../sdk/forecast-2018-06-26.normal.json --service-name forecast

In [2]:
# Prerequisites : 1 time install only
#!pip install boto3
#!pip install pandas

In [3]:
import boto3
from time import sleep
import subprocess

In [4]:
session = boto3.Session(region_name='us-west-2') #us-east-1 is also supported

forecast = session.client(service_name='forecast')
forecastquery = session.client(service_name='forecastquery')

## Test Setup <a class="anchor" id="hello"></a>
Let's say hi to Amazon Forecast by calling a simple API, ListRecipes. The API returns a list of the global recipes that Forecast offers, which you could use as part of your forecasting solution.

In [5]:
forecast.list_recipes()

{'RecipeNames': ['forecast_ARIMA',
  'forecast_DEEP_AR',
  'forecast_DEEP_AR_PLUS',
  'forecast_ETS',
  'forecast_MDN',
  'forecast_MQRNN',
  'forecast_NPTS',
  'forecast_PROPHET',
  'forecast_SQF'],
 'ResponseMetadata': {'RequestId': '9694a460-039a-4ebb-8c2b-b8042fae940f',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'content-type': 'application/x-amz-json-1.1',
   'date': 'Fri, 04 Jan 2019 16:44:43 GMT',
   'x-amzn-requestid': '9694a460-039a-4ebb-8c2b-b8042fae940f',
   'content-length': '174',
   'connection': 'keep-alive'},
  'RetryAttempts': 0}}

If this ran successfully, kudos! If you hit any errors running `list_recipes` at this point, please contact us at the [AWS support forum](https://forums.aws.amazon.com/forum.jspa?forumID=327).

## Forecasting with Amazon Forecast<a class="anchor" id="forecastingExample"></a>
### Preparing your Data

In Amazon Forecast, a dataset is a collection of one or more files that contain data relevant to a forecasting task. A dataset must conform to a schema provided by Amazon Forecast.

For this exercise, we use the individual household electric power consumption dataset. (Dua, D. and Karra Taniskidou, E. (2017). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.) We aggregate the usage data hourly. 
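The hourly aggregation itself is not shown in this notebook. Here is a minimal sketch of one way to do it with the standard library, assuming the raw readings arrive as `(timestamp, value, item_id)` tuples at sub-hourly resolution and that readings within an hour are averaged. Whether the original preprocessing averaged or summed is not stated, and `aggregate_hourly` is a hypothetical helper, not part of any SDK.

```python
from collections import defaultdict
from datetime import datetime

def aggregate_hourly(readings):
    """Average readings per (item_id, hour). Assumes 'YYYY-MM-DD HH:MM:SS' timestamps."""
    buckets = defaultdict(list)
    for ts, value, item_id in readings:
        # Truncate each timestamp to the start of its hour.
        hour = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").replace(minute=0, second=0)
        buckets[(item_id, hour)].append(float(value))
    return [
        (hour.strftime("%Y-%m-%d %H:%M:%S"), sum(vals) / len(vals), item_id)
        for (item_id, hour), vals in sorted(buckets.items())
    ]

# Illustrative readings only, not taken from the real dataset.
sample = [
    ("2014-01-01 01:00:00", 10.0, "client_12"),
    ("2014-01-01 01:15:00", 20.0, "client_12"),
    ("2014-01-01 02:00:00", 30.0, "client_12"),
]
hourly = aggregate_hourly(sample)
```

The output rows keep the `timestamp, target_value, item_id` column order used by the CSV file in this exercise.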

### Data Type

Amazon Forecast can import data from Amazon S3. We first explore the data locally to see the fields.

In [6]:
import pandas as pd
df = pd.read_csv("../data/item-demand-time.csv", dtype = object)
df.head(3)

Unnamed: 0,2014-01-01 01:00:00,38.34991708126038,client_12
0,2014-01-01 02:00:00,33.5820895522388,client_12
1,2014-01-01 03:00:00,34.41127694859037,client_12
2,2014-01-01 04:00:00,39.800995024875625,client_12


Now upload the data to S3.

In [9]:
s3 = session.client('s3')

In [10]:
accountId = boto3.client('sts').get_caller_identity().get('Account')

In [11]:
#bucketName = 'amazon-forecast-%s-data'%accountId # use your own bucket
bucketName = 'chrisking-s3-sagemaker-test-forecast'
key="elec_data/item-demand-time.csv"

In [12]:
s3.upload_file(Filename="../data/item-demand-time.csv", Bucket=bucketName, Key=key)

In [13]:
bucketName

'chrisking-s3-sagemaker-test-forecast'

In [14]:
# One-time setup only: this runs the helper script that creates the role to provide to Amazon Forecast.
# Save the generated role for all future calls that import or export data.

cmd = 'python ../setup_forecast_permissions.py '+bucketName
p = subprocess.Popen(cmd.split(' '), stdout=subprocess.PIPE, stderr=subprocess.PIPE)
stdout, stderr = p.communicate()  # wait for the script to finish before using the role

In [15]:
roleArn = 'arn:aws:iam::%s:role/amazonforecast'%accountId

### CreateDataset

More details about `Domain` and dataset type can be found in the [documentation](https://docs.aws.amazon.com/forecast/latest/dg/howitworks-domains-ds-types.html). For this example, we use the [CUSTOM](https://docs.aws.amazon.com/forecast/latest/dg/custom-domain.html) domain with its 3 required attributes: `timestamp`, `target_value` and `item_id`.
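Before uploading, it can help to check locally that each CSV row fits the three attributes in that order. The following is a minimal sketch using only the standard library. `SCHEMA` and `check_row` are hypothetical names, the file is assumed to have no header row, and the Python format string `%Y-%m-%d %H:%M:%S` is an assumed equivalent of the timestamp layout seen in the data.

```python
import csv
import io
from datetime import datetime

# Mirrors the three CUSTOM-domain attributes, in file column order (assumption).
SCHEMA = [("timestamp", "timestamp"), ("target_value", "float"), ("item_id", "string")]

def check_row(row, schema=SCHEMA):
    """Return a dict of parsed values, or raise ValueError if the row does not fit."""
    if len(row) != len(schema):
        raise ValueError(f"expected {len(schema)} columns, got {len(row)}")
    parsed = {}
    for raw, (name, kind) in zip(row, schema):
        if kind == "timestamp":
            parsed[name] = datetime.strptime(raw, "%Y-%m-%d %H:%M:%S")
        elif kind == "float":
            parsed[name] = float(raw)
        else:
            parsed[name] = raw
    return parsed

# One row copied from the data preview in this notebook.
csv_text = "2014-01-01 01:00:00,38.34991708126038,client_12\n"
rows = [check_row(r) for r in csv.reader(io.StringIO(csv_text))]
```

A row with a missing column or a non-numeric target raises `ValueError`, which surfaces ordering mistakes before the import job is started.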

In [16]:
DATASET_FREQUENCY = "H" 
TIMESTAMP_FORMAT = "yyyy-MM-dd hh:mm:ss"

In [24]:
project = 'workshopsagemakerclean'
datasetName= project+'_ds'
datasetGroupName= project +'_gp'
s3DataPath = "s3://"+bucketName+"/"+key

In [25]:
datasetName

'workshopsagemakerclean_ds'

In [26]:
# Specify the schema of your dataset here. Make sure the order of columns matches the raw data files.
schema ={
   "Attributes":[
      {
         "AttributeName":"timestamp",
         "AttributeType":"timestamp"
      },
      {
         "AttributeName":"target_value",
         "AttributeType":"float"
      },
      {
         "AttributeName":"item_id",
         "AttributeType":"string"
      }
   ]
}

response=forecast.create_dataset(
                    Domain="CUSTOM",
                    DatasetType='TARGET_TIME_SERIES',
                    DataFormat='CSV',
                    DatasetName=datasetName,
                    DataFrequency=DATASET_FREQUENCY, 
                    TimeStampFormat=TIMESTAMP_FORMAT,
                    Schema = schema
                   )

ResourceAlreadyExistsException: An error occurred (ResourceAlreadyExistsException) when calling the CreateDataset operation: Failed to create Dataset
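The `ResourceAlreadyExistsException` above is what a re-run of the cell produces once the dataset exists. One way to make such cells safe to re-run is to treat that error code as success, sketched below. boto3 raises `botocore.exceptions.ClientError`, which carries the service error code in `err.response['Error']['Code']`; to stay runnable without AWS access, the sketch inspects that attribute generically and exercises it with a stand-in exception. `create_if_absent`, `FakeClientError` and `fake_create_dataset` are hypothetical names.

```python
def create_if_absent(create_fn, **kwargs):
    """Call create_fn(**kwargs); return None if the resource already exists."""
    try:
        return create_fn(**kwargs)
    except Exception as err:
        # botocore's ClientError exposes the service error code in err.response.
        code = getattr(err, "response", {}).get("Error", {}).get("Code")
        if code == "ResourceAlreadyExistsException":
            return None
        raise  # any other failure is still surfaced

# Stand-in for a boto3 failure, used only to exercise the helper locally.
class FakeClientError(Exception):
    def __init__(self, code):
        super().__init__(code)
        self.response = {"Error": {"Code": code}}

def fake_create_dataset(**kwargs):
    raise FakeClientError("ResourceAlreadyExistsException")

result = create_if_absent(fake_create_dataset, DatasetName="workshopsagemakerclean_ds")
```

In the notebook, the same helper could wrap the real call, for example `create_if_absent(forecast.create_dataset, Domain="CUSTOM", ...)` with the arguments used in the cell above.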

In [27]:
forecast.describe_dataset(DatasetName=datasetName)

{'DatasetName': 'workshopsagemakerclean_ds',
 'DatasetType': 'TARGET_TIME_SERIES',
 'DataFormat': 'CSV',
 'Domain': 'CUSTOM',
 'ScheduleExpression': 'none',
 'DatasetArn': 'arn:aws:forecast:us-west-2:059124553121:ds/workshopsagemakerclean_ds',
 'Status': 'ACTIVE',
 'ResponseMetadata': {'RequestId': '03d2e384-9e4a-41c4-8f24-81c3fe6aef87',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'content-type': 'application/x-amz-json-1.1',
   'date': 'Fri, 04 Jan 2019 16:46:49 GMT',
   'x-amzn-requestid': '03d2e384-9e4a-41c4-8f24-81c3fe6aef87',
   'content-length': '245',
   'connection': 'keep-alive'},
  'RetryAttempts': 0}}

In [28]:
forecast.create_dataset_group(DatasetGroupName=datasetGroupName,RoleArn=roleArn,DatasetNames=[datasetName])

ResourceAlreadyExistsException: An error occurred (ResourceAlreadyExistsException) when calling the CreateDatasetGroup operation: Resource already exists

If you have an existing dataset group, you can update it.

In [29]:
forecast.describe_dataset_group(DatasetGroupName=datasetGroupName)

{'DatasetGroupName': 'workshopsagemakerclean_gp',
 'DatasetGroupArn': 'arn:aws:forecast:us-west-2:059124553121:dsgroup/workshopsagemakerclean_gp',
 'Datasets': ['workshopsagemakerclean_ds'],
 'RoleArn': 'arn:aws:iam::059124553121:role/amazonforecast',
 'ResponseMetadata': {'RequestId': 'ca488e9b-34f7-4fb8-a201-79f45f04aeed',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'content-type': 'application/x-amz-json-1.1',
   'date': 'Fri, 04 Jan 2019 16:46:59 GMT',
   'x-amzn-requestid': 'ca488e9b-34f7-4fb8-a201-79f45f04aeed',
   'content-length': '241',
   'connection': 'keep-alive'},
  'RetryAttempts': 0}}

### Create Data Import Job
This step imports the raw data into Amazon Forecast so that it is ready for forecasting.

In [30]:
ds_import_job_response=forecast.create_dataset_import_job(DatasetName=datasetName,Delimiter=',', DatasetGroupName =datasetGroupName ,S3Uri= s3DataPath)


In [31]:
ds_versionId=ds_import_job_response['VersionId']
print(ds_versionId)

c29b4bbe


Check the status of the dataset import. When the status changes from **CREATING** to **ACTIVE**, we can continue to the next steps. Depending on the data size, it can take 10 minutes to become **ACTIVE**.

In [32]:
while True:
    dataImportStatus = forecast.describe_dataset_import_job(DatasetName=datasetName,VersionId=ds_versionId)['Status']
    print(dataImportStatus)
    if dataImportStatus != 'ACTIVE' and dataImportStatus != 'FAILED':
        sleep(30)
    else:
        break

CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
ACTIVE
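The polling loop above runs until the job reaches **ACTIVE** or **FAILED**, with no upper bound on the wait. A reusable variant with a timeout might look like the sketch below. `wait_for_status` is a hypothetical helper, and the default interval and timeout are arbitrary choices; the status source is passed in as a callable so the helper can be tried without calling AWS.

```python
import time

def wait_for_status(get_status, terminal=("ACTIVE", "FAILED"),
                    interval=30, timeout=3600, sleep=time.sleep):
    """Poll get_status() until it returns a terminal status or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status in terminal:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {status} after {timeout} seconds")
        sleep(interval)

# Local demonstration with a fake status source and no real sleeping.
statuses = iter(["CREATING", "CREATING", "ACTIVE"])
final = wait_for_status(lambda: next(statuses), sleep=lambda seconds: None)
```

With the clients from this notebook, the callable would be something like `lambda: forecast.describe_dataset_import_job(DatasetName=datasetName, VersionId=ds_versionId)['Status']`.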


In [33]:
forecast.describe_dataset_import_job(DatasetName=datasetName,VersionId=ds_versionId)

{'DatasetArn': 'arn:aws:forecast:us-west-2:059124553121:ds/workshopsagemakerclean_ds',
 'DatasetName': 'workshopsagemakerclean_ds',
 'VersionId': 'c29b4bbe',
 'Status': 'ACTIVE',
 'FieldStatistics': {'date': {'Count': 26280,
   'CountDistinct': 8396,
   'CountNull': 0,
   'Min': '2014-01-01T00:00:00Z',
   'Max': '2015-01-01T00:00:00Z'},
  'item': {'Count': 26280, 'CountDistinct': 3, 'CountNull': 0},
  'target': {'Count': 26280,
   'CountDistinct': 5059,
   'CountNull': 0,
   'Min': '0.0',
   'Max': '212.27197346600326',
   'Avg': 50.82350576202014,
   'Stddev': 37.9125549309785}},
 'ResponseMetadata': {'RequestId': 'f99d00ad-c3c8-444a-86f9-c65c0d131164',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'content-type': 'application/x-amz-json-1.1',
   'date': 'Fri, 04 Jan 2019 17:31:26 GMT',
   'x-amzn-requestid': 'f99d00ad-c3c8-444a-86f9-c65c0d131164',
   'content-length': '508',
   'connection': 'keep-alive'},
  'RetryAttempts': 0}}

### Recipe

In [34]:
recipesResponse=forecast.list_recipes()
recipesResponse

{'RecipeNames': ['forecast_ARIMA',
  'forecast_DEEP_AR',
  'forecast_DEEP_AR_PLUS',
  'forecast_ETS',
  'forecast_MDN',
  'forecast_MQRNN',
  'forecast_NPTS',
  'forecast_PROPHET',
  'forecast_SQF'],
 'ResponseMetadata': {'RequestId': '1655bc3f-51b6-4d91-b47b-b8b703e43349',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'content-type': 'application/x-amz-json-1.1',
   'date': 'Fri, 04 Jan 2019 17:31:29 GMT',
   'x-amzn-requestid': '1655bc3f-51b6-4d91-b47b-b8b703e43349',
   'content-length': '174',
   'connection': 'keep-alive'},
  'RetryAttempts': 0}}

Get details about each recipe.

In [35]:
forecast.describe_recipe(RecipeName='forecast_MQRNN')

{'Recipe': {'Name': 'forecast_MQRNN',
  'Train': [{'TrainingInfo': {'TrainedModelName': 'algorithm_MQRNN',
     'AlgorithmName': 'MQRNN',
     'TrainingParameters': {'epochs': '60',
      'learning_rate': '3E-3',
      'mini_batch_size': '32',
      'quantiles': '[0.1,0.5,0.9]'}},
    'BackTestWindowCount': 1,
    'MetricsBuckets': []}]},
 'ResponseMetadata': {'RequestId': '4b28b02b-4685-44c7-8dfe-4fe7e91029ca',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'content-type': 'application/x-amz-json-1.1',
   'date': 'Fri, 04 Jan 2019 17:31:31 GMT',
   'x-amzn-requestid': '4b28b02b-4685-44c7-8dfe-4fe7e91029ca',
   'content-length': '281',
   'connection': 'keep-alive'},
  'RetryAttempts': 0}}

### Create Solution with custom forecast horizon

The forecast horizon is how far into the future the forecast should predict, counted in steps of the data frequency. For weekly data, a value of 12 means 12 weeks. Our example uses hourly data and we want to forecast the next day, so we set it to 24.
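The horizon is a count of steps at the dataset frequency, so the covered time span is the step length multiplied by the horizon. A small worked sketch follows; the `STEP` mapping and `horizon_span` are hypothetical, and only the `"H"` code actually appears in this notebook.

```python
from datetime import timedelta

# Assumed mapping from frequency codes to step lengths.
STEP = {"H": timedelta(hours=1), "D": timedelta(days=1), "W": timedelta(weeks=1)}

def horizon_span(frequency, horizon):
    """Length of time covered by `horizon` steps at the given frequency."""
    return STEP[frequency] * horizon

span_hourly = horizon_span("H", 24)  # 24 hourly steps cover one day
span_weekly = horizon_span("W", 12)  # 12 weekly steps cover twelve weeks
```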

In [36]:
predictorName= project+'_mqrnn'

In [37]:
forecastHorizon = 24

In [38]:
createPredictorResponse=forecast.create_predictor(RecipeName='forecast_MQRNN',DatasetGroupName= datasetGroupName ,PredictorName=predictorName, 
  ForecastHorizon = forecastHorizon)

In [39]:
predictorVerionId=createPredictorResponse['VersionId']

In [40]:
forecast.list_predictor_versions(PredictorName=predictorName)

{'PredictorVersions': [{'PredictorName': 'workshopsagemakerclean_mqrnn',
   'VersionId': '4baafc67'}],
 'ResponseMetadata': {'RequestId': 'f472312c-5c12-4a6e-b169-43417c1772dc',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'content-type': 'application/x-amz-json-1.1',
   'date': 'Fri, 04 Jan 2019 17:31:46 GMT',
   'x-amzn-requestid': 'f472312c-5c12-4a6e-b169-43417c1772dc',
   'content-length': '95',
   'connection': 'keep-alive'},
  'RetryAttempts': 0}}

Check the status of the solution. When the status changes from **CREATING** to **ACTIVE**, we can continue to the next steps. Depending on data size, model selection and hyperparameters, it can take from 10 minutes to more than one hour to become **ACTIVE**.

In [41]:
while True:
    predictorStatus = forecast.describe_predictor(PredictorName=predictorName,VersionId=predictorVerionId)['Status']
    print(predictorStatus)
    if predictorStatus != 'ACTIVE' and predictorStatus != 'FAILED':
        sleep(30)
    else:
        break

CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
ACTIVE


### Get Error Metrics

In [42]:
forecastquery.get_accuracy_metrics(PredictorName=predictorName)

{'ModelMetrics': {'MQRNN': {'Metrics': {'p10': '0.1502397047434946',
    'p50': '0.3194854165247812',
    'p90': '0.28586567076359515',
    'rmse': '20.818490508747182'},
   'MetricsByBucket': []}},
 'ResponseMetadata': {'RequestId': 'a33146bf-5c48-4eed-be32-c8b4bec8951e',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'content-type': 'application/x-amz-json-1.1',
   'date': 'Fri, 04 Jan 2019 18:01:17 GMT',
   'x-amzn-requestid': 'a33146bf-5c48-4eed-be32-c8b4bec8951e',
   'content-length': '171',
   'connection': 'keep-alive'},
  'RetryAttempts': 0}}
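The response does not say how `p10`, `p50`, `p90` and `rmse` are computed. A common reading is a quantile (pinball) loss normalized by total absolute demand for the three quantile levels, alongside root mean squared error. The sketch below implements those two textbook definitions on made-up numbers; treating them as exactly what the service computes is an assumption.

```python
import math

def weighted_quantile_loss(actuals, quantile_forecasts, tau):
    """2 * sum(pinball loss) / sum(|actual|) for quantile level tau."""
    loss = sum(
        tau * max(y - q, 0.0) + (1.0 - tau) * max(q - y, 0.0)
        for y, q in zip(actuals, quantile_forecasts)
    )
    return 2.0 * loss / sum(abs(y) for y in actuals)

def rmse(actuals, forecasts):
    """Root mean squared error."""
    return math.sqrt(sum((y - f) ** 2 for y, f in zip(actuals, forecasts)) / len(actuals))

# Illustrative values only, not taken from the notebook's dataset.
actuals = [10.0, 20.0, 30.0]
p50_forecasts = [12.0, 18.0, 30.0]
wql_50 = weighted_quantile_loss(actuals, p50_forecasts, 0.5)
error = rmse(actuals, p50_forecasts)
```

Under this reading, lower values are better for all four metrics, and the quantile losses are unitless while `rmse` is in the units of the target.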

### Deploy Predictor

In [43]:
forecast.deploy_predictor(PredictorName=predictorName)

{'PredictorName': 'workshopsagemakerclean_mqrnn',
 'VersionId': '4baafc67',
 'PredictorArn': 'arn:aws:forecast:us-west-2:059124553121:predictor/workshopsagemakerclean_mqrnn',
 'ResponseMetadata': {'RequestId': 'd814d2a5-24fe-4341-aab6-f9c594608497',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'content-type': 'application/x-amz-json-1.1',
   'date': 'Fri, 04 Jan 2019 18:01:19 GMT',
   'x-amzn-requestid': 'd814d2a5-24fe-4341-aab6-f9c594608497',
   'content-length': '167',
   'connection': 'keep-alive'},
  'RetryAttempts': 0}}

In [44]:
deployedPredictorsResponse=forecast.list_deployed_predictors()
print(deployedPredictorsResponse)

{'PredictorNames': ['workshop_mqrnn', 'workshop_mqrnn_pn', 'workshopsagemaker_mqrnn', 'workshopsagemakerclean_mqrnn'], 'ResponseMetadata': {'RequestId': 'db7d5731-f647-441b-8e13-5a32dfed94be', 'HTTPStatusCode': 200, 'HTTPHeaders': {'content-type': 'application/x-amz-json-1.1', 'date': 'Fri, 04 Jan 2019 18:01:20 GMT', 'x-amzn-requestid': 'db7d5731-f647-441b-8e13-5a32dfed94be', 'content-length': '114', 'connection': 'keep-alive'}, 'RetryAttempts': 0}}


In [45]:
while True:
    deployedPredictorStatus = forecast.describe_deployed_predictor(PredictorName=predictorName)['Status']
    if deployedPredictorStatus != 'ACTIVE' and deployedPredictorStatus != 'FAILED':
        sleep(30)
    else:
        break
print(deployedPredictorStatus)

ACTIVE


### Get Forecast

When the solution is deployed and forecast results are ready, you can view them. 

In [46]:
forecastResponse = forecastquery.get_forecast(
    PredictorName=predictorName,
    Interval="hour",
    Filters={"item_id":"client_12"}
)
print(forecastResponse)

{'Forecast': {'ForecastId': '1546625987_250087c0', 'Predictions': {'mean': [{'Date': '2015-01-01T01:00:00', 'Val': 40.59370040893555}, {'Date': '2015-01-01T02:00:00', 'Val': 40.5814323425293}, {'Date': '2015-01-01T03:00:00', 'Val': 41.37274932861328}, {'Date': '2015-01-01T04:00:00', 'Val': 40.503475189208984}, {'Date': '2015-01-01T05:00:00', 'Val': 40.5191535949707}, {'Date': '2015-01-01T06:00:00', 'Val': 40.99903869628906}, {'Date': '2015-01-01T07:00:00', 'Val': 40.87933349609375}, {'Date': '2015-01-01T08:00:00', 'Val': 41.12674331665039}, {'Date': '2015-01-01T09:00:00', 'Val': 40.76581573486328}, {'Date': '2015-01-01T10:00:00', 'Val': 40.91756057739258}, {'Date': '2015-01-01T11:00:00', 'Val': 40.81345748901367}, {'Date': '2015-01-01T12:00:00', 'Val': 40.73418045043945}, {'Date': '2015-01-01T13:00:00', 'Val': 40.40203857421875}, {'Date': '2015-01-01T14:00:00', 'Val': 40.635501861572266}, {'Date': '2015-01-01T15:00:00', 'Val': 40.39127731323242}, {'Date': '2015-01-01T16:00:00', 'Val': 
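The printed response is hard to read as a single line. Based on the structure visible above (`Forecast` → `Predictions` → `mean`, each entry with `Date` and `Val`), the sketch below turns one prediction list into `(datetime, value)` pairs. `prediction_series` is a hypothetical helper, and `sample_response` copies only the first three `mean` entries shown in the output; other keys under `Predictions` are not visible in the truncated output, so the `key` parameter is an assumption.

```python
from datetime import datetime

# Shape and values copied from the first three 'mean' entries printed above.
sample_response = {
    "Forecast": {
        "ForecastId": "1546625987_250087c0",
        "Predictions": {
            "mean": [
                {"Date": "2015-01-01T01:00:00", "Val": 40.59370040893555},
                {"Date": "2015-01-01T02:00:00", "Val": 40.5814323425293},
                {"Date": "2015-01-01T03:00:00", "Val": 41.37274932861328},
            ]
        },
    }
}

def prediction_series(response, key="mean"):
    """Return [(datetime, value), ...] for one prediction list in the response."""
    points = response["Forecast"]["Predictions"][key]
    return [(datetime.strptime(p["Date"], "%Y-%m-%dT%H:%M:%S"), p["Val"]) for p in points]

series = prediction_series(sample_response)
```

The resulting list can be passed to `pd.DataFrame(series, columns=["timestamp", "mean"])` for plotting or comparison with actuals.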

### Export Forecast

You can batch export the forecast to an S3 bucket. To do so, a role with S3 put access is needed.

In [47]:
forecastInfoList= forecast.list_forecasts(PredictorName=predictorName)['ForecastInfoList']
forecastId= forecastInfoList[0]['ForecastId']

In [48]:
outputPath="s3://"+bucketName+"/output"

In [49]:
forecastExportResponse = forecast.create_forecast_export_job(ForecastId=forecastId, OutputPath={"S3Uri": outputPath,"RoleArn":roleArn})

In [50]:
forecastExportJobId = forecastExportResponse['ForecastExportJobId']

In [51]:
while True:
    forecastExportStatus = forecast.describe_forecast_export_job(ForecastExportJobId=forecastExportJobId)['Status']
    print(forecastExportStatus)
    if forecastExportStatus != 'ACTIVE' and forecastExportStatus != 'FAILED':
        sleep(30)
    else:
        break

CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
CREATING
ACTIVE


Check the S3 bucket for the results.

In [52]:
s3.list_objects(Bucket=bucketName,Prefix="output")

{'ResponseMetadata': {'RequestId': 'AFEB2EC36A6CD044',
  'HostId': '2w5NLkt+TlBMV8goowmpH+nzB2wXuE5MMRdXe6QPLOTR5m+k2fLUK9UmrAIIQs9ktSy9/IijJEE=',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amz-id-2': '2w5NLkt+TlBMV8goowmpH+nzB2wXuE5MMRdXe6QPLOTR5m+k2fLUK9UmrAIIQs9ktSy9/IijJEE=',
   'x-amz-request-id': 'AFEB2EC36A6CD044',
   'date': 'Fri, 04 Jan 2019 18:54:21 GMT',
   'x-amz-bucket-region': 'us-west-2',
   'content-type': 'application/xml',
   'transfer-encoding': 'chunked',
   'server': 'AmazonS3'},
  'RetryAttempts': 0},
 'IsTruncated': False,
 'Marker': '',
 'Contents': [{'Key': 'output/_SUCCESS',
   'LastModified': datetime.datetime(2019, 1, 4, 18, 24, 34, tzinfo=tzlocal()),
   'ETag': '"d41d8cd98f00b204e9800998ecf8427e"',
   'Size': 0,
   'StorageClass': 'STANDARD',
   'Owner': {'DisplayName': 'chrskn',
    'ID': 'e1882fabbcc65e3b5a2283bd582e21e5eb99e24345ea64fc81872f37d8173502'}},
  {'Key': 'output/part-00000-d625a568-2632-408d-9088-a30e4651afa4-c000.csv',
   'LastModified': d