The first step is to run this notebook and prepare a dataset for input into Amazon Forecast.

# 1.Download dataset
We use the Online Retail II dataset from the UCI Machine Learning Repository, which records sales transactions of an online retailer.  
https://archive.ics.uci.edu/ml/datasets/Online+Retail+II

In [None]:
! wget https://archive.ics.uci.edu/ml/machine-learning-databases/00502/online_retail_II.xlsx -P ./input

# 2.Load dataset
Load the `Year 2009-2010` sheet of the downloaded workbook and add a `sales` column (unit price times quantity).

In [None]:
import pandas as pd

In [None]:
df = pd.read_excel('./input/online_retail_II.xlsx', sheet_name='Year 2009-2010')

In [None]:
df['sales'] = df['Price'] * df['Quantity']
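
A minimal sketch of what the `sales` column looks like, using a made-up two-row frame instead of the real sheet. The values are invented for illustration; the second row assumes that cancelled invoices carry a negative `Quantity`, which yields negative sales that you may want to inspect before training.

```python
import pandas as pd

# Toy rows standing in for the real sheet (values are made up for illustration).
toy = pd.DataFrame({
    "Invoice": ["489434", "C489449"],  # assumption: a leading "C" marks a cancellation
    "Quantity": [12, -2],
    "Price": [2.5, 4.0],
})

# Same computation as in the notebook: unit price times quantity.
toy["sales"] = toy["Price"] * toy["Quantity"]

# Count rows whose sales are negative (returns/cancellations in this toy data).
negative_rows = int((toy["sales"] < 0).sum())
print(toy)
print("rows with negative sales:", negative_rows)
```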

# 3.Build dataset
From the dataset, build two sets: one for the initial training run, and one that extends it by a week for automatic retraining through the pipeline.

train: 2009/12/01 - 2010/12/02  
train_added: 2009/12/01 - 2010/12/09

In [None]:
df2 = df[['Country', 'InvoiceDate', 'sales']]

In [None]:
df2 = df2.query('Country == "United Kingdom"')

In [None]:
df2.head()

In [None]:
# Create the output directory before writing any CSV files into it.
!mkdir -p output

In [None]:
df2.to_csv('./output/tr_target_add_20091201_20101209.csv', header=False, index=False)

In [None]:
# Keep rows up to and including 2010-12-02 (strictly before midnight on 2010-12-03).
tr1 = df2.query('InvoiceDate < "2010-12-03"')

In [None]:
tr1.tail()

In [None]:
!mkdir -p output

In [None]:
tr1.to_csv('./output/tr_target_20091201_20101202.csv', header=False, index=False)
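
The date-based split above can be sketched on a toy frame. This is only an illustration under assumed data: the three rows and the `cutoff` name are hypothetical, and the cutoff is taken as midnight on 2010-12-03 so that the initial training set ends on 2010-12-02.

```python
import pandas as pd

# Toy rows with the same three columns as df2 (values are made up).
toy = pd.DataFrame({
    "Country": ["United Kingdom"] * 3,
    "InvoiceDate": pd.to_datetime([
        "2010-12-02 17:00:00",
        "2010-12-03 09:30:00",
        "2010-12-09 12:00:00",
    ]),
    "sales": [10.0, 20.0, 30.0],
})

# Hypothetical cutoff: everything strictly before this goes into the initial set.
cutoff = pd.Timestamp("2010-12-03")

train = toy[toy["InvoiceDate"] < cutoff]  # initial training data
train_added = toy                         # full range, used for retraining

print(len(train), len(train_added))
```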

# 4.Upload dataset to S3
Create a bucket in S3 and upload the dataset.

## make bucket

In [None]:
import boto3

In [None]:
boto3.__version__

In [None]:
sts = boto3.client('sts')
id_info = sts.get_caller_identity()
print(id_info['Account'])

In [None]:
s3 = boto3.client('s3')

In [None]:
bucket_name = 'demo-forecast-' + id_info['Account']

In [None]:
bucket_name

In [None]:
s3.create_bucket(Bucket=bucket_name)
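
Note that `create_bucket` with only `Bucket` works in `us-east-1`; in other regions S3 expects a `CreateBucketConfiguration` with a `LocationConstraint`. The helper names below (`create_bucket_kwargs`, `create_bucket_in_region`) are hypothetical and only sketch one way to handle both cases.

```python
def create_bucket_kwargs(bucket_name, region):
    """Build keyword arguments for S3 create_bucket for the given region."""
    kwargs = {"Bucket": bucket_name}
    # us-east-1 is the default location and must not be passed as a constraint.
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


def create_bucket_in_region(bucket_name, region):
    """Create the bucket; boto3 is imported here so the helper above stays pure."""
    import boto3

    client = boto3.client("s3", region_name=region)
    return client.create_bucket(**create_bucket_kwargs(bucket_name, region))
```

For example, `create_bucket_in_region(bucket_name, boto3.session.Session().region_name)` would create the bucket in the region of the current session, assuming one is configured.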

## upload dataset

In [None]:
# Use a separate name so the S3 client created earlier is not overwritten.
s3_resource = boto3.resource('s3')
bucket = s3_resource.Bucket(bucket_name)

In [None]:
bucket.upload_file('./output/tr_target_20091201_20101202.csv',
                   'input/tr_target_20091201_20101202.csv')

In [None]:
bucket.upload_file('./output/tr_target_add_20091201_20101209.csv',
                   'input/tr_target_add_20091201_20101209.csv')

## upload manifest file
Create a manifest file for use in Amazon QuickSight and upload it to S3.

In [None]:
import json

In [None]:
manifest_for_qs={
  "fileLocations": [
    {
      "URIs": []
    },
    {
      "URIPrefixes": [
        "s3://" + bucket_name + "/output/"
      ]
    }
  ],
  "globalUploadSettings": {
    "format": "CSV",
    "delimiter": ",",
    "textqualifier": "'",
    "containsHeader": "true"
  }
}

In [None]:
!mkdir -p manifest_for_quicksight

In [None]:
with open('./manifest_for_quicksight/manifest_uk_sales_pred.json', 'w') as f:
    json.dump(manifest_for_qs, f, indent=2, ensure_ascii=False)

In [None]:
bucket.upload_file('./manifest_for_quicksight/manifest_uk_sales_pred.json',
                   'manifest_for_quicksight/manifest_uk_sales_pred.json')
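
As a sketch, the manifest above can also be produced by a small function so that the bucket name and prefix are easy to change. The function name `build_manifest` and its `prefix` parameter are hypothetical; the `output/` default mirrors the prefix used above, which is presumably where the forecast export will be written.

```python
import json


def build_manifest(bucket_name, prefix="output/"):
    """Return a QuickSight-style S3 manifest dict pointing at one prefix."""
    return {
        "fileLocations": [
            {"URIPrefixes": ["s3://" + bucket_name + "/" + prefix]}
        ],
        "globalUploadSettings": {
            "format": "CSV",
            "delimiter": ",",
            "textqualifier": "'",
            "containsHeader": "true",
        },
    }


# Example with a placeholder bucket name (not a real bucket).
manifest = build_manifest("demo-forecast-123456789012")
manifest_json = json.dumps(manifest, indent=2)
print(manifest_json)
```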

# 5.NEXT
Manually run the forecast with Amazon Forecast, export the forecast results to S3, and visualize them in Amazon QuickSight.  
When the visualization is complete, run 2_build_forecast_pipeline.ipynb to build an automatic forecast pipeline.