This project provides an end-to-end solution for the demand forecasting task using LSTNet, a state-of-the-art deep learning model available in GluonTS, together with Amazon SageMaker.
The input data is a multi-variate time-series.
An example is the hourly electricity consumption of 321 users over a period of 29 months. Here is a snapshot of the normalized data:
The notebook provides an example of how to feed your own time-series data to GluonTS. To convert CSV data or other formats to the GluonTS format, please see the customization section; a minimal sketch is also shown below.
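For illustration only, here is a minimal sketch (not part of the solution code) of converting a CSV of multivariate series into a GluonTS `ListDataset`. The file name, column layout, and frequency are assumptions:

```python
import pandas as pd
from gluonts.dataset.common import ListDataset

# Hypothetical CSV: rows are timestamps, one column per user/series.
df = pd.read_csv("electricity.csv", index_col=0, parse_dates=True)

# GluonTS expects the multivariate target with shape (num_series, num_timesteps).
dataset = ListDataset(
    [{"start": df.index[0], "target": df.values.T}],
    freq="1H",             # hourly data, as in the example above
    one_dim_target=False,  # multivariate target
)
```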
The solution produces:
- A trained LSTNet model, and
- A SageMaker endpoint that can predict future (multi-variate) values for a given prediction interval
For example, we can estimate the hourly electricity consumption of 321 users for the coming week.
The solution uses LSTNet, a state-of-the-art deep learning model available in GluonTS.
Running the solution end-to-end costs less than $5 USD. Please make sure you have read the clean-up section below.
Demand forecasting uses historical time-series data to help streamline the supply-demand decision-making process across businesses. Examples include predicting:
- The number of customer representatives to hire for multiple locations in the next month
- Product sales across multiple regions in the next quarter
- Cloud server usage for the next day for a video streaming service
- Electricity consumption for multiple regions over the next week
- Usage of IoT devices and sensors, such as energy consumption
The status quo approaches for time-series forecasting include:
- Auto Regressive Integrated Moving Average (ARIMA) for univariate time-series data and
- Vector Auto-Regression (VAR) for multi-variate time-series data
These methods often require tedious data preprocessing and feature generation prior to model training. One main advantage of deep learning (DL) methods such as LSTNet is that they automate much of this feature generation, incorporating data normalization, lags, multiple time scales, categorical data, and missing-value handling, while offering better predictive power and fast GPU-enabled training and deployment.
Please check out our blog post for more details.
You will need an AWS account to use this solution. Sign up for an account here.
The easiest way to get started is to click the launch button below to create the AWS CloudFormation stack required for this solution:
| AWS Region | Region code | AWS CloudFormation |
|---|---|---|
| US West (Oregon) | us-west-2 | |
| US East (N. Virginia) | us-east-1 | |
| US East (Ohio) | us-east-2 | |
Then, acknowledge adding the default AWS IAM policy or provide your own policy.
- Click Create Stack (you can leave the pre-specified Stack name, S3 Bucket Name, and SageMaker Notebook Instance as they are)
- Once the stack has been created, go to the Outputs tab and click on the NotebookInstance link to go directly to the created notebook instance
- To see the demo, open demo.ipynb and follow the instructions
- Finally, open the deep-demand-forecast.ipynb notebook and follow the instructions inside the notebook
Alternatively, you can clone this repository, then navigate to AWS CloudFormation in your account and use the provided CloudFormation template to create the AWS resources needed to train and deploy the model using the SageMaker deep-demand-forecast notebook.
- `cloudformation/`
  - `deep-demand-forecast.yaml`: The root CloudFormation nested stack which creates the AWS stack for this solution
  - `deep-demand-forecast-sagemaker-notebook-instance.yaml`: Creates the SageMaker notebook instance
  - `deep-demand-forecast-permissions.yaml`: Manages all the permissions necessary to launch the stack
  - `deep-demand-forecast-endpoint.yaml`: Creates the demo endpoint used in `demo.ipynb`
  - `solution-assistant/`: Deletes the created resources, such as the endpoint and the S3 bucket, during cleanup
- `src/`
  - `preprocess/`
    - `container/`: To build and register the preprocessing job's ECR image
      - `Dockerfile`: Docker container config
      - `build_and_push.sh`: Build-and-push bash script used in `deep-demand-forecast.ipynb`
      - `requirements.txt`: Dependencies for `preprocess.py`
    - `container_build/`: Uses CodeBuild to build the container for ECR
    - `preprocess.py`: Preprocessing script
  - `deep_demand_forecast/`: Contains the training and inference code
    - `train.py`: SageMaker training code
    - `inference.py`: SageMaker inference code
    - `data.py`: GluonTS data preparation
    - `metrics.py`: A training metric
    - `monitor.py`: Prepares results for visualization
    - `utils.py`: Helper functions
    - `requirements.txt`: Dependencies for the SageMaker MXNet Estimator
- `demo.ipynb`: Demo notebook to quickly get some predictions from the demo endpoint
- `deep-demand-forecast.ipynb`: The main notebook; see below
The notebook trains an LSTNet estimator on the electricity consumption data, a multivariate time-series dataset capturing the electricity consumption (in kW) at 15-minute frequency from 2011-01-01 to 2014-05-26. We compare model performance by visualizing the metrics MASE vs. sMAPE.
Finally, we deploy an endpoint for the trained model and interactively compare its performance by plotting the training data, the test data, and the predictions.
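As a rough sketch of what this workflow looks like in plain GluonTS (the hyperparameters below are illustrative assumptions, not the values used in `train.py`, and `train_dataset`/`test_dataset` are GluonTS datasets built as shown earlier):

```python
from gluonts.model.lstnet import LSTNetEstimator
from gluonts.mx.trainer import Trainer  # `gluonts.trainer` in older GluonTS releases
from gluonts.evaluation import MultivariateEvaluator
from gluonts.evaluation.backtest import make_evaluation_predictions

# Illustrative hyperparameters; the solution's actual values live in train.py.
estimator = LSTNetEstimator(
    freq="1H",
    prediction_length=24 * 7,  # forecast one week ahead
    context_length=24 * 7,
    num_series=321,
    skip_size=24,
    ar_window=24,
    channels=96,
    trainer=Trainer(epochs=10),
)
predictor = estimator.train(train_dataset)

# Backtest on the held-out data and compute MASE / sMAPE.
forecasts, targets = make_evaluation_predictions(test_dataset, predictor=predictor)
evaluator = MultivariateEvaluator(quantiles=(0.1, 0.5, 0.9))
agg_metrics, item_metrics = evaluator(targets, forecasts, num_series=321)
print({k: v for k, v in agg_metrics.items() if "MASE" in k or "sMAPE" in k})
```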
Here is the architecture for the end-to-end training and deployment process:
- The input data is located in an Amazon S3 bucket
- The provided SageMaker notebook reads the input data and launches the stages below
- A preprocessing step normalizes the input data. We use a SageMaker Processing job designed as a microservice, which allows users to build and register their own Docker image via Amazon ECR and execute the job using Amazon SageMaker
- An LSTNet model is trained on the preprocessed data and evaluated using Amazon SageMaker. If desired, the trained model can be deployed to a SageMaker endpoint (a rough sketch follows this list)
- The SageMaker endpoint created in the previous step is an HTTPS endpoint capable of producing predictions
- The training and the deployed model are monitored via Amazon CloudWatch
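As a minimal sketch of the training-and-deployment step, assuming the SageMaker Python SDK v2; the instance types, framework version, and hyperparameter names/values here are assumptions, and the real configuration is set in `deep-demand-forecast.ipynb`:

```python
import sagemaker
from sagemaker.mxnet import MXNet

role = sagemaker.get_execution_role()

# Illustrative MXNet estimator wrapping the solution's training script;
# hyperparameters below are placeholders, not the solution's defaults.
estimator = MXNet(
    entry_point="train.py",
    source_dir="src/deep_demand_forecast",
    role=role,
    instance_type="ml.p3.2xlarge",
    instance_count=1,
    framework_version="1.6.0",
    py_version="py3",
    hyperparameters={"epochs": 10, "prediction_length": 168},
)

# Train on the preprocessed data in S3 (bucket/prefix are placeholders).
estimator.fit({"train": "s3://<bucket>/preprocessed/train"})

# Optionally deploy the trained model behind a real-time HTTPS endpoint.
predictor = estimator.deploy(initial_instance_count=1, instance_type="ml.m5.xlarge")
```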
Here is the architecture for inference:
- The input data is located in an Amazon S3 bucket
- From the SageMaker notebook, normalize the new input data using the statistics of the training data
- Send requests to the SageMaker endpoint (a rough sketch of invoking the endpoint follows this list)
- Receive the predictions
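For illustration, invoking the endpoint from outside the notebook might look like the following. The endpoint name and the request payload schema are assumptions; the actual request/response format is defined by `src/deep_demand_forecast/inference.py`:

```python
import json
import boto3

runtime = boto3.client("sagemaker-runtime")

# Hypothetical payload: a start timestamp plus the normalized multivariate history.
payload = {
    "start": "2014-05-19 00:00:00",
    "target": normalized_history.tolist(),  # `normalized_history`: your (num_series, T) array
}

response = runtime.invoke_endpoint(
    EndpointName="deep-demand-forecast-endpoint",  # assumed name; use the one your stack created
    ContentType="application/json",
    Body=json.dumps(payload),
)
predictions = json.loads(response["Body"].read())
```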
When you've finished with this solution, make sure that you delete all unwanted AWS resources. AWS CloudFormation can be used to automatically delete all standard resources that have been created by the solution and notebook. Go to the AWS CloudFormation Console, and delete the parent stack. Choosing to delete the parent stack will automatically delete the nested stacks.
Caution: You need to manually delete any extra resources that you may have created in these notebooks. Some examples include extra Amazon S3 buckets (in addition to the solution's default bucket), extra Amazon SageMaker endpoints (using a custom name), and extra Amazon ECR repositories.
To use your own data, please take a look at:
- The extensive GluonTS tutorials
- The GluonTS dataset API
- Amazon SageMaker Getting Started
- Amazon SageMaker Developer Guide
- Amazon SageMaker Python SDK Documentation
- AWS CloudFormation User Guide
This project is licensed under the Apache-2.0 License.