# Amazon Forecast Sample Inventory Planning

Sample INVENTORY_PLANNING for bike rentals. The bike rental data comes from the [Kaggle Bike Share dataset](https://www.kaggle.com/c/bike-sharing-demand).

Prerequisites: [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/installing.html).

For more information about the APIs, see the [documentation](https://docs.aws.amazon.com/forecast/latest/dg/what-is-forecast.html).

## Table Of Contents
* [Setting up](#setup)
* [Test Setup - Running first API](#hello)
* [Forecasting Example with Amazon Forecast](#forecastingExample)

**Read Every Cell FULLY before executing it**


## Set up Preview SDK<a class="anchor" id="setup"></a>

In [None]:
# Configures your AWS CLI for Amazon Forecast
!aws configure add-model --service-model file://../sdk/forecastquery-2019-07-22.normal.json --service-name forecastquery
!aws configure add-model --service-model file://../sdk/forecast-2019-07-22.normal.json --service-name forecast

In [None]:
# Prerequisites: one-time install only; uncomment the lines below to run them.
#!pip install boto3
#!pip install pandas

In [None]:
import boto3
from time import sleep
import subprocess
import matplotlib.pyplot as plt
%matplotlib inline


In [None]:
session = boto3.Session(region_name='us-west-2') 

forecast = session.client(service_name='forecast') 
forecastquery = session.client(service_name='forecastquery')

## Forecasting with Amazon Forecast<a class="anchor" id="forecastingExample"></a>
### Preparing your Data

In Amazon Forecast, a dataset is a collection of one or more files containing data that is relevant for a forecasting task. A dataset must conform to a schema provided by Amazon Forecast.

For this exercise, we use a sample bike share rental dataset.

# Data Pre-processing
Amazon Forecast can import data from Amazon S3. We first explore the data locally to see the attributes. We use all the attributes, so you can modify them to see how the results change.
If you remove any attributes, those changes must be reflected in the schema defined later in this notebook.

In [None]:
import pandas as pd
trf = pd.read_csv("../data/train.csv", dtype = object)
trf['count'] = trf['count'].astype(float) 
trf['itemname'] = 'bike_12'
trf.to_csv("../data/bike.csv", index= False, header = False)
trf.head(3)

In [None]:
# bike.csv was written without a header row, so read it with header=None
df = pd.read_csv("../data/bike.csv", dtype = object, header = None)
df.head(3)
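
Because `bike.csv` is written without a header row, column position is the only link between the file and the schema. The sketch below pairs one row with the schema attribute names by position. The column names in `csv_columns` are an assumption about the Kaggle `train.csv` layout, and the sample row is illustrative rather than taken from the data.

```python
import csv
import io

# Assumed column order of the Kaggle train.csv plus the added `itemname` column.
csv_columns = ["datetime", "season", "holiday", "workingday", "weather", "temp",
               "atemp", "humidity", "windspeed", "casual", "registered",
               "count", "itemname"]

# Attribute names in the order the schema in this notebook declares them.
schema_columns = ["timestamp", "season", "holiday", "workday", "weather",
                  "temperature", "atemp", "humidity", "windspeed", "casual",
                  "registered", "demand", "item_id"]

# One illustrative row in the header-less layout written by to_csv above.
sample = "2011-01-01 00:00:00,1,0,0,1,9.84,14.395,81,0.0,3,13,16.0,bike_12\n"
row = next(csv.reader(io.StringIO(sample)))

# Pair every value with the schema attribute at the same position.
record = dict(zip(schema_columns, row))
column_count_matches = len(row) == len(schema_columns) == len(csv_columns)
print(column_count_matches, record["timestamp"], record["demand"], record["item_id"])
```

If a column is dropped from the data, the same position in `schema_columns` has to be dropped as well, otherwise every later attribute shifts by one.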

Now upload the data to S3. Before doing that, go to the AWS Console, select the S3 service, and create a new bucket in the `Oregon` (`us-west-2`) region. Use the bucket name convention `amazon-forecast-unique-value-data`; the cell below defaults to `amazon-forecast-data-` followed by your account ID. The name must be unique. If you get an error, adjust the name until it works, then update the `bucketName` cell below.

In [None]:
s3 = session.client('s3')

In [None]:
accountId = boto3.client('sts').get_caller_identity().get('Account')

In [None]:
bucketName = "amazon-forecast-data-{0}".format(accountId)  # Update the unique-value part here.
key="bikedata/bike.csv"
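
Before creating the bucket, it can help to check a candidate name locally. The helper below is a minimal sketch covering only the basic S3 naming rules (3 to 63 characters; lowercase letters, digits, dots and hyphens; starting and ending with a letter or digit). It does not check uniqueness or every S3 rule. The function name and the placeholder account ID are hypothetical.

```python
import re

# Basic shape of an S3 bucket name: 3-63 characters, lowercase letters,
# digits, dots and hyphens, starting and ending with a letter or digit.
_BUCKET_RE = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")

def looks_like_valid_bucket_name(name):
    """Return True if `name` passes the basic naming rules checked here."""
    return bool(_BUCKET_RE.fullmatch(name)) and ".." not in name

example_account_id = "123456789012"  # placeholder, not a real account
candidate = "amazon-forecast-data-{0}".format(example_account_id)
print(candidate, looks_like_valid_bucket_name(candidate))
print(looks_like_valid_bucket_name("Amazon_Forecast_Data"))
```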

In [None]:
s3.upload_file(Filename="../data/bike.csv", Bucket=bucketName, Key=key)

In [None]:
bucketName

In [None]:
# One-time setup only: run this cell to create the role to provide to Amazon Forecast.
# Save the generated role for all future calls to use for importing or exporting data.

cmd = ['python', '../setup_forecast_permissions.py', bucketName]
p = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
# Show the script's output so that failures are visible.
print(p.stdout.decode(), p.stderr.decode())

In [None]:
roleArn = 'arn:aws:iam::%s:role/amazonforecast'%accountId

### CreateDataset

More details about `Domain` and dataset types can be found in the [documentation](https://docs.aws.amazon.com/forecast/latest/dg/howitworks-domains-ds-types.html). For this example, we use the [RETAIL](https://docs.aws.amazon.com/forecast/latest/dg/retail-domain.html) domain with 3 required attributes: `timestamp`, `demand` and `item_id`. Also update the project name to reflect your name, in lowercase.

In [None]:
DATASET_FREQUENCY = "H" 
TIMESTAMP_FORMAT = "yyyy-MM-dd HH:mm:ss"  # HH is the 24-hour clock; the data contains hours 00-23
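
The pattern letters in the timestamp format appear to follow the Java-style convention, where `HH` is a 24-hour field and `hh` a 12-hour field. The sketch below uses Python's closest equivalents (`%H` and `%I`) to show why hourly data with hours above 12 needs the 24-hour form. It illustrates the distinction only; how Amazon Forecast itself treats each pattern is not verified here, and the sample timestamp is made up.

```python
from datetime import datetime

def parses(value, fmt):
    """Return True if `value` can be parsed with the strptime pattern `fmt`."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False

afternoon = "2011-01-01 13:00:00"  # illustrative hourly timestamp

# %H is Python's 24-hour field (counterpart of HH),
# %I is its 12-hour field (counterpart of hh).
print(parses(afternoon, "%Y-%m-%d %H:%M:%S"))
print(parses(afternoon, "%Y-%m-%d %I:%M:%S"))
```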

In [None]:
project = 'bike_forecastdemo' # Replace this with a unique name here, make sure the entire name is < 30 characters.
datasetName= project+'_ds'
datasetGroupName= project +'_gp'
s3DataPath = "s3://"+bucketName+"/"+key
s3DataPath

In [None]:
datasetName

### Schema Definition
We define the attributes for the model.

In [None]:
# Specify the schema of your dataset here. Make sure the order of columns matches the raw data files.
schema ={
   "Attributes":[    
      { "AttributeName":"timestamp",     "AttributeType":"timestamp"    },      
      { "AttributeName":"season",        "AttributeType":"integer"      },      
      { "AttributeName":"holiday",       "AttributeType":"integer"      },      
      { "AttributeName":"workday",       "AttributeType":"integer"      },      
      { "AttributeName":"weather",       "AttributeType":"integer"      },     
      { "AttributeName":"temperature",   "AttributeType":"float"        },       
      { "AttributeName":"atemp",         "AttributeType":"float"        }, 
      { "AttributeName":"humidity",      "AttributeType":"integer"      },       
      { "AttributeName":"windspeed",     "AttributeType":"float"      },       
      { "AttributeName":"casual",        "AttributeType":"integer"      },      
      { "AttributeName":"registered",    "AttributeType":"integer"      },         
      { "AttributeName":"demand",         "AttributeType":"float"       },      
      { "AttributeName":"item_id",       "AttributeType":"string"       }       
  ]
}

response=forecast.create_dataset(
                    Domain="RETAIL",
                    DatasetType='TARGET_TIME_SERIES',
                    DatasetName=datasetName,
                    DataFrequency=DATASET_FREQUENCY, 
                    Schema = schema
                   )
datasetArn = response['DatasetArn']

In [None]:
forecast.describe_dataset(DatasetArn=datasetArn) 

In [None]:
create_dataset_group_response = forecast.create_dataset_group(DatasetGroupName=datasetGroupName,
                                                              DatasetArns= [datasetArn]
                                                             )
datasetGroupArn = create_dataset_group_response['DatasetGroupArn']

If you have an existing dataset group, you can update it using **update_dataset_group**.

In [None]:
forecast.describe_dataset_group(DatasetGroupArn=datasetGroupArn)

### Create Data Import Job
This brings the raw data into the Amazon Forecast system, ready for forecasting.

In [None]:
datasetImportJobName = 'DSIMPORT_JOB_TARGET'
ds_import_job_response=forecast.create_dataset_import_job(DatasetImportJobName=datasetImportJobName,
                                                          DatasetArn=datasetArn,
                                                          DataSource= {
                                                              "S3Config" : {
                                                                 "Path":s3DataPath,
                                                                 "RoleArn": roleArn
                                                              } 
                                                          },
                                                          TimestampFormat=TIMESTAMP_FORMAT
                                                         )

In [None]:
ds_import_job_arn=ds_import_job_response['DatasetImportJobArn']
print(ds_import_job_arn)

Check the status of the dataset import job. When the status changes from **CREATE_IN_PROGRESS** to **ACTIVE**, we can continue to the next steps. Depending on the data size, this can take 5 to 10 minutes.

In [None]:
while True:
    dataImportStatus = forecast.describe_dataset_import_job(DatasetImportJobArn=ds_import_job_arn)['Status']
    print(dataImportStatus)
    if dataImportStatus != 'ACTIVE' and dataImportStatus != 'CREATE_FAILED':
        sleep(30)
    else:
        break
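
The same polling pattern recurs for the predictor, forecast, and export job. A minimal sketch of a reusable helper is shown below. The function name `wait_for_status`, its parameters, and the timeout behaviour are assumptions added for illustration; the fake status sequence stands in for a `describe_*` call so the sketch runs without AWS access.

```python
import time

def wait_for_status(get_status, poll_seconds=30, timeout_seconds=3600,
                    done=("ACTIVE", "CREATE_FAILED"), sleep_fn=time.sleep):
    """Poll `get_status()` until it returns a terminal status or time runs out."""
    waited = 0
    while True:
        status = get_status()
        print(status)
        if status in done:
            return status
        if waited >= timeout_seconds:
            raise TimeoutError("still %s after %s seconds" % (status, waited))
        sleep_fn(poll_seconds)
        waited += poll_seconds

# Stand-in for a describe_* call; the sleep is replaced by a no-op for the demo.
fake_statuses = iter(["CREATE_PENDING", "CREATE_IN_PROGRESS", "ACTIVE"])
final_status = wait_for_status(lambda: next(fake_statuses), sleep_fn=lambda s: None)
```

With a real client, the callable could be `lambda: forecast.describe_dataset_import_job(DatasetImportJobArn=ds_import_job_arn)['Status']`. Unlike the bare loop, the helper stops with an error if the job never reaches a terminal status.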

In [None]:
forecast.describe_dataset_import_job(DatasetImportJobArn=ds_import_job_arn)

### Create Predictor 

Forecast horizon is the number of time points to predict in the future. For weekly data, a value of 12 means 12 weeks. Our example uses hourly data and we forecast the next day, so we set it to 24.
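
The arithmetic can be sketched as follows: the horizon is a count of steps, and the wall-clock span it covers depends on the data frequency. The `STEP` table and the `horizon_span` helper are hypothetical names; only the hourly case is used in this notebook.

```python
from datetime import timedelta

# Length of one step for a few frequency codes. Only "H" is used in this
# notebook; the other entries are included for comparison.
STEP = {"H": timedelta(hours=1), "D": timedelta(days=1), "W": timedelta(weeks=1)}

def horizon_span(frequency, horizon):
    """Return the wall-clock span covered by `horizon` steps at `frequency`."""
    return STEP[frequency] * horizon

print(horizon_span("H", 24))  # 24 hourly steps
print(horizon_span("W", 12))  # 12 weekly steps
```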

In [None]:
predictorName= project+'_deeparp_algo'

In [None]:
forecastHorizon = 24

In [None]:
algorithmArn = 'arn:aws:forecast:::algorithm/Deep_AR_Plus'

In [None]:
create_predictor_response=forecast.create_predictor(PredictorName=predictorName, 
                                                  AlgorithmArn=algorithmArn,
                                                  ForecastHorizon=forecastHorizon,
                                                  PerformAutoML= False,
                                                  PerformHPO= False,
                                                  EvaluationParameters= {"NumberOfBacktestWindows": 1, 
                                                                         "BackTestWindowOffset": 24}, 
                                                  InputDataConfig= {"DatasetGroupArn": datasetGroupArn},
                                                  FeaturizationConfig= {"ForecastFrequency": "H", 
                                                                        "TrainingSubsampleRatio": 1, 
                                                                        "Featurizations": 
                                                                        [
                                                                          {"AttributeName": "demand", 
                                                                           "FeaturizationPipeline": 
                                                                            [
                                                                              {"FeaturizationMethodName": "filling", 
                                                                               "FeaturizationMethodParameters": 
                                                                                {"frontfill": "none", 
                                                                                 "middlefill": "zero", 
                                                                                 "backfill": "zero"}
                                                                              }
                                                                            ]
                                                                          }
                                                                        ]
                                                                       }
                                                 )

In [None]:
predictorArn=create_predictor_response['PredictorArn']

Check the status of the predictor. When the status changes from **CREATE_IN_PROGRESS** to **ACTIVE**, we can continue to the next steps. Depending on data size, model selection and hyperparameters, it can take from 10 minutes to more than one hour to become **ACTIVE**.

In [None]:
while True:
    predictorStatus = forecast.describe_predictor(PredictorArn=predictorArn)['Status']
    print(predictorStatus)
    if predictorStatus != 'ACTIVE' and predictorStatus != 'CREATE_FAILED':
        sleep(30)
    else:
        break

### Get Error Metrics
This provides metrics on the accuracy of the model that was trained with the CreatePredictor operation.
Use the metrics to see how well the model performed and to decide whether to use the predictor to generate forecasts.

In [None]:
forecast.get_accuracy_metrics(PredictorArn=predictorArn)
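
To give a feel for what accuracy metrics of this kind measure, the sketch below computes a root mean squared error and a weighted quantile loss on toy numbers. The formulas are one common definition of these metrics and are an assumption here; the exact computation Amazon Forecast performs should be checked against its documentation. All values are invented.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def weighted_quantile_loss(actual, quantile_forecast, tau):
    """Quantile (pinball) loss at level `tau`, scaled by the total absolute demand."""
    loss = sum(tau * max(a - q, 0) + (1 - tau) * max(q - a, 0)
               for a, q in zip(actual, quantile_forecast))
    return 2 * loss / sum(abs(a) for a in actual)

actual = [10, 20, 30]        # invented observed demand
p90_forecast = [12, 18, 33]  # invented 0.9-quantile forecast

print(round(rmse(actual, p90_forecast), 4))
print(round(weighted_quantile_loss(actual, p90_forecast, 0.9), 4))
```

At a high quantile such as 0.9, under-forecasting is penalised more heavily than over-forecasting, which is why the single under-forecast point dominates the loss in this example.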

### Create Forecast

Now create a forecast using the model that was trained.

In [None]:
forecastName= project+'_deeparp_algo_forecast'

In [None]:
create_forecast_response=forecast.create_forecast(ForecastName=forecastName,
                                                  PredictorArn=predictorArn)
forecastArn = create_forecast_response['ForecastArn']

Check the status of the forecast process. When the status changes from **CREATE_IN_PROGRESS** to **ACTIVE**, we can continue to the next steps. Depending on data size, model selection and hyperparameters, it can take from 10 minutes to more than one hour to become **ACTIVE**.

In [None]:
while True:
    forecastStatus = forecast.describe_forecast(ForecastArn=forecastArn)['Status']
    print(forecastStatus)
    if forecastStatus != 'ACTIVE' and forecastStatus != 'CREATE_FAILED':
        sleep(30)
    else:
        break

### Get Forecast

Once created, the forecast results are ready for you to view.
The result set shows the predictions and their confidence at the p10 and p50 quantiles.

In [None]:
forecastResponse = forecastquery.query_forecast(
    ForecastArn=forecastArn,
    Filters={"item_id": "bike_12"}
)
print(forecastResponse)

In [None]:
## Generate Plot of the mean forecast
d = pd.DataFrame.from_dict(forecastResponse['Forecast'])
df = pd.DataFrame.from_dict(d.loc['mean']['Predictions']).dropna().rename(columns = {'Val':'mean'})
df.plot()

In [None]:
## Generate Plot of the p90 series (assumes the response contains a 'p90' entry)
d = pd.DataFrame.from_dict(forecastResponse['Forecast'])
df = pd.DataFrame.from_dict(d.loc['p90']['Predictions']).dropna().rename(columns = {'Val':'p90'})
df.plot()
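
The plotting cells index into a nested response. The sketch below flattens a hand-made response of the same shape into one record per timestamp, which can be easier to inspect. The response layout (`Forecast`, `Predictions`, the `Timestamp` and `Val` keys) is inferred from the access pattern used in this notebook and is an assumption; all values are invented.

```python
# Hand-made response in the shape the plotting cells assume; values are invented.
fake_response = {
    "Forecast": {
        "Predictions": {
            "mean": [{"Timestamp": "2012-12-20T00:00:00", "Val": 30.0},
                     {"Timestamp": "2012-12-20T01:00:00", "Val": 18.5}],
            "p10": [{"Timestamp": "2012-12-20T00:00:00", "Val": 21.0},
                    {"Timestamp": "2012-12-20T01:00:00", "Val": 11.0}],
            "p90": [{"Timestamp": "2012-12-20T00:00:00", "Val": 41.0},
                    {"Timestamp": "2012-12-20T01:00:00", "Val": 27.5}],
        }
    }
}

def predictions_by_timestamp(response, value_key="Val"):
    """Return {timestamp: {series name: value}} from a query response."""
    table = {}
    for series_name, points in response["Forecast"]["Predictions"].items():
        for point in points:
            table.setdefault(point["Timestamp"], {})[series_name] = point[value_key]
    return table

table = predictions_by_timestamp(fake_response)
for timestamp in sorted(table):
    print(timestamp, table[timestamp])
```

If the value key in a real response differs from `Val`, pass it through `value_key`.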

# Export Forecast

You can export the forecast to an S3 bucket. This requires a role with S3 put access, which has already been created.

In [None]:
forecastExportName= project+'_deeparp_algo_forecast_export'

In [None]:
outputPath="s3://"+bucketName+"/output"

In [None]:
forecast_export_response = forecast.create_forecast_export_job(
                                                                ForecastExportJobName = forecastExportName,
                                                                ForecastArn=forecastArn, 
                                                                Destination = {
                                                                   "S3Config" : {
                                                                       "Path":outputPath,
                                                                       "RoleArn": roleArn
                                                                   } 
                                                                }
                                                              )

In [None]:
forecastExportJobArn = forecast_export_response['ForecastExportJobArn']

In [None]:
while True:
    forecastExportStatus = forecast.describe_forecast_export_job(ForecastExportJobArn=forecastExportJobArn)['Status']
    print(forecastExportStatus)
    if forecastExportStatus != 'ACTIVE' and forecastExportStatus != 'CREATE_FAILED':
        sleep(30)
    else:
        break

Check the S3 bucket for results.

In [None]:
s3.list_objects(Bucket=bucketName,Prefix="output")
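
The listing call returns a dictionary rather than a plain list of files. A minimal sketch of pulling out the object keys is shown below, using a hand-made response. The file names are hypothetical, and the helper name `object_keys` is an assumption; the `Contents` entry is absent when no object matches the prefix, which the helper handles.

```python
def object_keys(list_objects_response):
    """Return the keys in a list_objects response, or [] if nothing matched."""
    return [obj["Key"] for obj in list_objects_response.get("Contents", [])]

# Hand-made listing; the keys and sizes are hypothetical.
fake_listing = {
    "Contents": [
        {"Key": "output/part-0.csv", "Size": 1024},
        {"Key": "output/part-1.csv", "Size": 2048},
    ]
}
empty_listing = {}  # shape of a response when the prefix matches nothing

print(object_keys(fake_listing))
print(object_keys(empty_listing))
```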

# Cleanup

While Forecast is in preview there are no charges for using it, but to future-proof this work, the instructions below clean up your workspace.

In [None]:
# Delete Import job
forecast.delete_dataset_import_job(DatasetImportJobArn=ds_import_job_arn)

In [None]:
# Delete Dataset Group
forecast.delete_dataset_group(DatasetGroupArn=datasetGroupArn)
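
The cells above remove only the import job and the dataset group. A fuller cleanup would likely delete dependent resources first. The sketch below shows one plausible ordering, children before parents, using a recording stand-in instead of a real client so it runs without AWS access. The method and parameter names are assumed from the current boto3 Forecast client and may differ in the preview SDK; the ARNs are placeholders. Real deletions are asynchronous, so each may need to finish before the next call, which this sketch does not wait for.

```python
class RecordingClient:
    """Stand-in for the Forecast client that only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        def record(**kwargs):
            self.calls.append((name, kwargs))
        return record

# (method name, parameter name) pairs, children before parents.
DELETE_ORDER = [
    ("delete_forecast_export_job", "ForecastExportJobArn"),
    ("delete_forecast", "ForecastArn"),
    ("delete_predictor", "PredictorArn"),
    ("delete_dataset_import_job", "DatasetImportJobArn"),
    ("delete_dataset", "DatasetArn"),
    ("delete_dataset_group", "DatasetGroupArn"),
]

def delete_all(client, arns):
    """Call the delete methods in DELETE_ORDER for every ARN present in `arns`."""
    for method_name, param in DELETE_ORDER:
        if param in arns:
            getattr(client, method_name)(**{param: arns[param]})

client = RecordingClient()
delete_all(client, {
    "PredictorArn": "arn:example:predictor",  # placeholder ARNs
    "DatasetArn": "arn:example:dataset",
    "ForecastArn": "arn:example:forecast",
})
print([name for name, _ in client.calls])
```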