# Amazon Simple Forecast Accelerator

## AWS Resource Recommendations

By default, the dashboard can process datasets of up to 5,000 timeseries (number of unique SKUs/item IDs x number of unique channels)
and is hosted on the default `ml.t2.medium` SageMaker notebook instance.

A limit increase request is required to process larger datasets. It can be made in one of two ways:

1. Self-service (~24-48 hours)
   - Request an instance type limit increase by following the instructions here:
       - https://aws.amazon.com/premiumsupport/knowledge-center/resourcelimitexceeded-sagemaker/
   - Refer to our recommended instance types below.

2. Contacting your AWS Account Manager (instant)

Instance type recommendations:

- 5,000 to 10,000 timeseries (`ml.t3.xlarge`)
- 10,000 to 50,000 timeseries (`ml.t3.2xlarge`)
- 50,000 to 100,000 timeseries (`ml.m4.4xlarge`)

For datasets containing more than 100,000 timeseries, we recommend using an instance type with at least 48 GiB of memory. See the available "On Demand Notebook Instances" and pricing for your region here:
- https://aws.amazon.com/sagemaker/pricing/
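
The thresholds above can be expressed as a small lookup. This is only a sketch: the function name `recommend_instance_type` is hypothetical, and treating each upper bound as inclusive is an assumption, since the ranges above share their boundary values.

```python
# Upper bound on the timeseries count (assumed inclusive) -> instance type,
# taken from the recommendations above. The first entry is the default instance.
INSTANCE_TIERS = [
    (5_000, "ml.t2.medium"),
    (10_000, "ml.t3.xlarge"),
    (50_000, "ml.t3.2xlarge"),
    (100_000, "ml.m4.4xlarge"),
]


def recommend_instance_type(n_timeseries):
    """Return the recommended instance type, or None above 100,000 timeseries."""
    if n_timeseries < 0:
        raise ValueError("n_timeseries must be non-negative")
    for upper_bound, instance_type in INSTANCE_TIERS:
        if n_timeseries <= upper_bound:
            return instance_type
    # Above 100,000: pick any instance type with at least 48 GiB of memory.
    return None


# Example: 400 unique item IDs sold on 30 channels gives 12,000 timeseries.
print(recommend_instance_type(400 * 30))  # ml.t3.2xlarge
```

A return value of `None` signals that the dataset is beyond the listed tiers and the instance type should be chosen from the pricing page instead.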

## Instructions

Please follow the steps below to get started using the dashboard.

### Step 1 – Prepare your dataset

Your historical demand dataset must be a single CSV (`.csv`) or gzipped CSV (`.csv.gz`) file
with the following columns:

- `timestamp` – String, date of the demand, in the format `YYYY-MM-DD` (e.g. "2020-12-25")
- `channel` – String, the originating store or platform of the demand (e.g. Website, Store-22)
- `family` – String, the category of the item (e.g. Shirts)
- `item_id` – String, the unique item identifier/SKU code (e.g. SKU29292)
- `demand` – Numeric, the demand amount of the item, which must be >= 0 (e.g. 413) 

Each _timeseries_ in a dataset is delineated by its `channel`, `family`, and `item_id` values.

Here is an example of what the file should look like:
```
timestamp,channel,family,item_id,demand
2018-07-02,Website,Tops,SKU29292,2
2018-07-02,Store-22,Footwear,SKU29293,4
...
...
```

**Please ensure that the values in each column are correctly formatted before generating a forecast.**
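
The column rules above can be checked locally before uploading. The following is a minimal sketch using only the Python standard library; the function names `open_dataset` and `validate_dataset` are hypothetical, UTF-8 encoding is an assumption, and the dashboard may apply further checks not covered here.

```python
import csv
import gzip
import io
import re
from datetime import datetime

REQUIRED_COLUMNS = ["timestamp", "channel", "family", "item_id", "demand"]
DATE_PATTERN = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def open_dataset(path):
    """Open a .csv or .csv.gz file as a text stream (UTF-8 is assumed)."""
    if str(path).endswith(".gz"):
        return gzip.open(path, mode="rt", newline="", encoding="utf-8")
    return open(path, mode="r", newline="", encoding="utf-8")


def is_valid_date(text):
    """True if text is a real calendar date written as four-two-two digits."""
    if not DATE_PATTERN.fullmatch(text):
        return False
    try:
        datetime.strptime(text, "%Y-%m-%d")
    except ValueError:
        return False
    return True


def validate_dataset(stream):
    """Return (errors, n_timeseries) for a CSV text stream."""
    reader = csv.DictReader(stream)
    header = reader.fieldnames or []
    missing = [name for name in REQUIRED_COLUMNS if name not in header]
    if missing:
        return ["missing columns: " + ", ".join(missing)], 0

    errors = []
    timeseries = set()
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        if not is_valid_date(row["timestamp"] or ""):
            errors.append(f"line {line_no}: invalid timestamp {row['timestamp']!r}")
        try:
            if float(row["demand"] or "") < 0:
                errors.append(f"line {line_no}: demand must be >= 0")
        except ValueError:
            errors.append(f"line {line_no}: demand is not numeric")
        # A timeseries is identified by its channel, family, and item_id.
        timeseries.add((row["channel"], row["family"], row["item_id"]))
    return errors, len(timeseries)


sample = io.StringIO(
    "timestamp,channel,family,item_id,demand\n"
    "2018-07-02,Website,Tops,SKU29292,2\n"
    "2018-07-02,Store-22,Footwear,SKU29293,4\n"
)
print(validate_dataset(sample))  # ([], 2)
```

To check a real file, pass `open_dataset("your-file.csv.gz")` in place of the in-memory sample. The returned timeseries count can also be compared against the instance type recommendations.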

### Step 2 – Upload your dataset

Drag-and-drop your file(s) from your OS's file explorer into the SageMaker Notebook file browser to the left of this page, as shown below:

![upload-file-example.png](images/upload-file-example.png)

There will be an upload progress bar at the bottom of the browser window:

![upload-progress-bar](images/upload-progress-bar.png)


#### Troubleshooting
If you encounter issues when uploading your files or reports, refreshing the page in your browser and trying again can sometimes resolve them.

### Step 3 – Accessing the Dashboard

Visit the dashboard below to generate your forecasts:

- **INSERT_URL_HERE**

### Step 4 – Exporting Forecasts and Results

You can export the forecasts and results generated from the dashboard by clicking the
"Export" buttons.

The dashboard is capable of exporting three types of files:

- **top-performers** (`*-top-performers.csv.gz`), which can be used to analyse the forecast accuracy achieved by SKU/Channel/Family for a specific time period.

- **forecasts** (`*-forecasts.csv.gz`), containing your new forecast, which uses the best-performing model(s) selected
  for each timeseries. You can use this file to:

  - Benchmark against an existing forecast for the same period, e.g. this forecast vs. actuals compared with the existing forecast vs. actuals.
  - Benchmark against actuals.


- **backtests** (`*-backtests.csv.gz`), which contains the historical forecast accuracy obtained during the model training and selection process for each timeseries.
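
All three exports are gzipped CSV files, so they can be inspected without decompressing them first. The sketch below reads only the header and the first few rows, because the column names of each export are not listed in this guide; the function name `preview_export`, the UTF-8 encoding, and the file name in the usage comment are assumptions.

```python
import csv
import gzip
from itertools import islice


def preview_export(path, n_rows=5):
    """Return (column_names, first n_rows rows) of a gzipped CSV export."""
    with gzip.open(path, mode="rt", newline="", encoding="utf-8") as stream:
        reader = csv.DictReader(stream)
        rows = list(islice(reader, n_rows))
        return reader.fieldnames, rows


# Example usage with a hypothetical file name:
#   columns, rows = preview_export("my-report-forecasts.csv.gz")
#   print(columns)
```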

### Step 5 – Saving a Report

Reports can be saved for later use via the "Save Report" panel in the dashboard, as shown below:

![save-a-report.png](./images/save-a-report.png)

### Step 6 – Loading a Previously Saved Report

Reports can be loaded via the "Load Report" panel in the dashboard, as shown below:

![load-report.png](./images/load-a-report.png)

## AWS Recommendations

### Shut down the Notebook Instance when not in use to minimise cost

We recommend that you shut down the notebook instance:

- while you are waiting for the Machine Learning (ML) forecasts to complete, which may take several hours depending on the dataset size;
- after you have exported and downloaded all forecasts, results, and reports.

The notebook instance can be shut down via the AWS Console, as follows:

1. visit https://console.aws.amazon.com/sagemaker/home?#/notebook-instances
2. select the "SfsStack-NotebookInstance" (via the radio button)
3. select "Actions" > "Stop"

![shutdown-example.png](images/shutdown-example.png)
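
The console steps above can also be scripted. This is a hedged sketch, not part of the deployment itself: it assumes `boto3` is installed with credentials for the right account and region, the function name `stop_notebook_if_running` is hypothetical, and the exact notebook instance name in your account should be confirmed in the console.

```python
def stop_notebook_if_running(sagemaker_client, instance_name):
    """Stop the notebook instance if it is in service; return the status seen."""
    response = sagemaker_client.describe_notebook_instance(
        NotebookInstanceName=instance_name
    )
    status = response["NotebookInstanceStatus"]
    # Only an instance that is "InService" can be stopped.
    if status == "InService":
        sagemaker_client.stop_notebook_instance(NotebookInstanceName=instance_name)
    return status


# Example usage (requires boto3 and AWS credentials; the name is an assumption):
#   import boto3
#   client = boto3.client("sagemaker")
#   stop_notebook_if_running(client, "SfsStack-NotebookInstance")
```

Passing the client in as an argument keeps the function easy to exercise with a stub before pointing it at a real account.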

### How to permanently delete Amazon Simple Forecast Accelerator from your AWS account

The AWS resources for this deployment can be deleted via the AWS Console, as follows:

1. visit https://console.aws.amazon.com/cloudformation/home?#/stacks
2. one at a time, select each of the stacks shown below and click the "Delete" button

![delete-stacks.png](./images/delete-stacks.png)
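
For reference, the same one-at-a-time deletion could be scripted. This is a sketch under stated assumptions: `boto3` is installed with suitable credentials, the function name `delete_stacks` is hypothetical, and the stack names must be taken from your own CloudFormation console, since they are not listed in this guide. Deletion is permanent, so check the names before running anything like this.

```python
def delete_stacks(cloudformation_client, stack_names):
    """Delete stacks one at a time, waiting for each deletion to finish."""
    waiter = cloudformation_client.get_waiter("stack_delete_complete")
    deleted = []
    for name in stack_names:
        cloudformation_client.delete_stack(StackName=name)
        # Block until CloudFormation reports that this stack is gone.
        waiter.wait(StackName=name)
        deleted.append(name)
    return deleted


# Example usage (requires boto3 and AWS credentials; names are placeholders):
#   import boto3
#   client = boto3.client("cloudformation")
#   delete_stacks(client, ["<first-stack-name>", "<second-stack-name>"])
```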