m5-prediction-accuracy

Objective

This repository demonstrates how to use AWS SageMaker to perform machine learning tasks. The dataset is Walmart sales data. We use machine learning algorithms to predict the unit sales of each item in each store for the next 28 days.

Structure

Preparation

Open a SageMaker notebook instance. For reference, the code in this repository was run on an ml.c5.2xlarge instance with a 5 GB EBS volume.

Data analysis and preprocessing

This notebook shows how to do data analysis and preprocessing on a SageMaker notebook instance. The required packages are pre-installed, so the notebook can be executed directly without provisioning a machine.
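As an illustration of the kind of preprocessing involved, here is a minimal sketch in plain Python. It assumes the sales data is in a wide layout with one column per day named `d_1`, `d_2`, and so on; that layout, the function names, and the choice of features are assumptions for illustration and are not taken from the notebook. Because the forecast horizon is 28 days, the lag defaults to 28 so that every feature is already known when the forecast is made.

```python
from statistics import mean

HORIZON = 28  # forecast length stated in the Objective


def wide_to_long(row, day_prefix="d_"):
    """Turn one wide record (one column per day) into one record per day."""
    def is_day(key):
        return key.startswith(day_prefix) and key[len(day_prefix):].isdigit()

    # Identifier columns (item, store, ...) are copied onto every daily record.
    keys = {k: v for k, v in row.items() if not is_day(k)}
    days = sorted(
        (int(k[len(day_prefix):]), int(v)) for k, v in row.items() if is_day(k)
    )
    return [dict(keys, day=d, sales=s) for d, s in days]


def add_lag_features(records, lag=HORIZON, window=7):
    """Add a lagged value and a lagged rolling mean (None if history is short)."""
    sales = [r["sales"] for r in records]
    out = []
    for i, record in enumerate(records):
        j = i - lag                      # index of the lagged observation
        start = j - window + 1           # first index of the rolling window
        lagged = sales[j] if j >= 0 else None
        rolling = mean(sales[start:j + 1]) if start >= 0 else None
        out.append(dict(record, lag_sales=lagged, lag_rolling_mean=rolling))
    return out


if __name__ == "__main__":
    row = {"item_id": "ITEM_A", "store_id": "STORE_1", "d_1": "3", "d_2": "0", "d_3": "5"}
    for rec in add_lag_features(wide_to_long(row), lag=1, window=2):
        print(rec)
```

On a real dataset the same reshaping is usually done with pandas; the plain-Python version is used here only to keep the sketch dependency-free.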

Performing training and prediction by a SageMaker built-in algorithm - xgboost

SageMaker provides many built-in algorithms that can be used directly; a list is available for reference. All algorithms are packaged as Docker images and hosted on ECR (Elastic Container Registry). In this notebook, we use xgboost 1.0-1 to perform training and inference.
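The following is a minimal sketch of how the built-in xgboost 1.0-1 image can be looked up and wrapped in an estimator with the SageMaker Python SDK (v2). The hyperparameter values, the instance type, and the function names are assumptions for illustration, not the settings used in the notebook. The SDK is imported inside the function so that the hyperparameter helper can be inspected without it installed.

```python
def xgboost_hyperparameters(num_round=200):
    """Illustrative hyperparameters; values are assumptions, not tuned settings."""
    return {
        "objective": "reg:squarederror",  # regression on unit sales
        "num_round": str(num_round),      # required by the built-in algorithm
        "max_depth": "6",
        "eta": "0.1",
        "subsample": "0.8",
    }


def build_xgboost_estimator(role, region, output_path, instance_type="ml.m5.xlarge"):
    """Create an Estimator backed by the built-in xgboost 1.0-1 container."""
    # Imported lazily: requires the `sagemaker` package (SDK v2).
    from sagemaker import image_uris
    from sagemaker.estimator import Estimator

    # Resolve the ECR URI of the managed Docker image for this region.
    image_uri = image_uris.retrieve(framework="xgboost", region=region, version="1.0-1")
    estimator = Estimator(
        image_uri=image_uri,
        role=role,
        instance_count=1,
        instance_type=instance_type,  # assumed instance type
        output_path=output_path,      # S3 prefix for the model artifact
    )
    estimator.set_hyperparameters(**xgboost_hyperparameters())
    return estimator


# Typical usage inside a notebook (needs AWS credentials and data in S3):
#   from sagemaker.inputs import TrainingInput
#   estimator = build_xgboost_estimator(role, region, "s3://<bucket>/<prefix>/output")
#   estimator.fit({"train": TrainingInput("s3://<bucket>/<prefix>/train", content_type="text/csv")})
#   predictor = estimator.deploy(initial_instance_count=1, instance_type="ml.m5.xlarge")
```

For CSV input, the built-in xgboost algorithm expects the label in the first column and no header row.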

Performing training by your own algorithm

On SageMaker, you can also define your own algorithms. This notebook demonstrates how to bring your own container.
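To make the container contract concrete, here is a minimal sketch of a training entry point. SageMaker mounts hyperparameters at `/opt/ml/input/config/hyperparameters.json`, each input channel under `/opt/ml/input/data/<channel>`, and uploads whatever is written to `/opt/ml/model`; a message written to `/opt/ml/output/failure` is reported as the failure reason. The "model" here is a placeholder that stores the mean of the first CSV column; it stands in for a real algorithm and is not the one used in this repository. The `train` channel name and the label-in-first-column layout are also assumptions.

```python
import json
import pickle
from pathlib import Path


def train(prefix="/opt/ml"):
    """Read channel data and hyperparameters, fit a placeholder model, save it."""
    prefix = Path(prefix)
    hp_path = prefix / "input" / "config" / "hyperparameters.json"
    channel = prefix / "input" / "data" / "train"   # assumed channel name
    model_dir = prefix / "model"

    try:
        # SageMaker passes every hyperparameter value as a string.
        hyperparameters = json.loads(hp_path.read_text()) if hp_path.exists() else {}

        targets = []
        for csv_file in sorted(channel.glob("*.csv")):
            for line in csv_file.read_text().splitlines():
                if line.strip():
                    # Assumption: the label is the first column, no header row.
                    targets.append(float(line.split(",")[0]))
        if not targets:
            raise ValueError(f"no training rows found under {channel}")

        # Placeholder model: predicts the mean target for every input.
        model = {"mean_target": sum(targets) / len(targets),
                 "hyperparameters": hyperparameters}

        model_dir.mkdir(parents=True, exist_ok=True)
        with open(model_dir / "model.pkl", "wb") as f:
            pickle.dump(model, f)
        return model
    except Exception as exc:
        # The content of this file is surfaced as the job's failure reason.
        output_dir = prefix / "output"
        output_dir.mkdir(parents=True, exist_ok=True)
        (output_dir / "failure").write_text(str(exc))
        raise


if __name__ == "__main__":
    train()
```

The `prefix` argument exists only so the script can be exercised locally against a temporary directory before the image is built and pushed to ECR.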

SageMaker Experiment and Debugger

Efficient communication between data scientists is important. SageMaker Experiments lets the team share experiment information transparently. Moreover, experiment results can be easily reproduced: the input and output artifacts are kept in AWS S3, and the hyperparameters, machine types, and algorithms used are recorded as well.
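A minimal sketch of how a training job can be attached to an experiment is shown below. The `experiment_config` dictionary with the keys `ExperimentName`, `TrialName`, and `TrialComponentDisplayName` is what an estimator's `fit` call accepts. The experiment and trial names are hypothetical, and the sketch assumes the `sagemaker-experiments` package (`smexperiments`) is installed; it is imported lazily so the config helper works without it.

```python
def experiment_config(experiment_name, trial_name, display_name="Training"):
    """Build the dictionary passed to estimator.fit(experiment_config=...)."""
    return {
        "ExperimentName": experiment_name,
        "TrialName": trial_name,
        "TrialComponentDisplayName": display_name,
    }


def create_experiment_and_trial(experiment_name, trial_name, description=""):
    """Create an experiment and one trial in it (needs AWS credentials)."""
    # Imported lazily: requires the `sagemaker-experiments` package.
    from smexperiments.experiment import Experiment
    from smexperiments.trial import Trial

    experiment = Experiment.create(
        experiment_name=experiment_name,
        description=description,
    )
    trial = Trial.create(
        trial_name=trial_name,
        experiment_name=experiment.experiment_name,
    )
    return experiment, trial


# Typical usage inside a notebook (names below are hypothetical):
#   create_experiment_and_trial("m5-xgboost", "m5-xgboost-depth-6")
#   estimator.fit(inputs, experiment_config=experiment_config("m5-xgboost", "m5-xgboost-depth-6"))
```

Creating one trial per hyperparameter setting keeps runs comparable, because the artifacts and parameters of each job are recorded under its own trial.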

It is also important to have a mechanism to monitor experiments and detect problems early. For further troubleshooting, critical metrics and/or tensors need to be recorded. SageMaker Debugger provides a mechanism for the team to monitor training jobs, and this notebook demonstrates how to use SageMaker Experiments and Debugger.
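As a sketch of what Debugger configuration can look like with the SageMaker Python SDK (v2), the code below builds a hook that saves selected tensor collections to S3 and a built-in rule that flags a loss that stops decreasing. The collection names, the save interval, and the choice of rule are assumptions for illustration, not the notebook's settings. The SDK is imported lazily so the plain-data helper can be inspected without it.

```python
def debugger_collections(save_interval=10):
    """Collections to record; names and interval are illustrative assumptions."""
    return [
        {"name": "metrics", "save_interval": str(save_interval)},
        {"name": "feature_importance", "save_interval": str(save_interval)},
    ]


def build_debugger_settings(s3_output_path, save_interval=10):
    """Return (hook_config, rules) to pass to an Estimator."""
    # Imported lazily: requires the `sagemaker` package (SDK v2).
    from sagemaker.debugger import (
        CollectionConfig,
        DebuggerHookConfig,
        Rule,
        rule_configs,
    )

    hook_config = DebuggerHookConfig(
        s3_output_path=s3_output_path,  # where recorded tensors are stored
        collection_configs=[
            CollectionConfig(
                name=c["name"],
                parameters={"save_interval": c["save_interval"]},
            )
            for c in debugger_collections(save_interval)
        ],
    )
    # Built-in rule that raises an issue when the loss stops improving.
    rules = [Rule.sagemaker(rule_configs.loss_not_decreasing())]
    return hook_config, rules


# Typical usage inside a notebook:
#   hook_config, rules = build_debugger_settings("s3://<bucket>/<prefix>/debug")
#   estimator = Estimator(..., debugger_hook_config=hook_config, rules=rules)
```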

Folders and files

light_gbm_sagemaker
.gitignore
01-data-analysis-and-preparation.ipynb
02-training-built-in-and-byoc.ipynb
03-training-bring-your-own-container.ipynb
04-experiment-and-debugger.ipynb
05-autopilot.ipynb
06-multi-variat-rnn-prediction.ipynb
README.md

catwhiskers/m5-prediction-accuracy
