### How to train any GluonTS model on any dataset on SageMaker:

***GluonTS currently lacks an experimentation framework that is easy to use, supports execution on SageMaker, and allows easy configuration of reproducible experiments.***<br/>
***For this reason, the GluonTS SageMaker SDK was created on top of the Amazon SageMaker Python SDK.***<br/>
***In this how-to tutorial, we train a SimpleFeedForwardEstimator on the m4_hourly dataset on AWS SageMaker using the GluonTSFramework, and then evaluate its performance.***

In [None]:
import boto3
import sagemaker
import gluonts
from gluonts.sagemaker.estimator import GluonTSFramework
from gluonts.model.simple_feedforward import SimpleFeedForwardEstimator
from gluonts.trainer import Trainer

First, define the S3 location where all files generated during the experiment (model artifacts, result files, and any custom scripts and dependencies used by the experiment) will be saved.

In [None]:
experiment_dir = "<your_s3_bucket>"
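
Because `experiment_dir` is later passed as `output_path` and `code_location`, it presumably has to be a full S3 URI (`s3://<bucket>/<prefix>`) rather than a bare bucket name. A minimal sketch of a sanity check follows; `check_s3_uri` is a hypothetical helper written for this tutorial, not part of GluonTS or the SageMaker SDK.

```python
from urllib.parse import urlparse


def check_s3_uri(uri: str) -> str:
    """Return the URI without a trailing slash, or raise if it is not an S3 URI.

    Hypothetical helper: it only checks the shape of the string and does not
    contact AWS, so it cannot tell whether the bucket exists.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"expected s3://<bucket>/<prefix>, got {uri!r}")
    return uri.rstrip("/")
```

Calling `experiment_dir = check_s3_uri(experiment_dir)` right after the assignment would catch a missing `s3://` prefix before any job is submitted.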

Since we want to run experiments on SageMaker, we need to create a SageMaker session with our AWS credentials.<br/>
Here we pass a profile name (for example "default", see [boto3](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/quickstart.html#using-boto-3)) and a region (for example "us-west-2"); the S3 bucket specified above has to be in this region.<br/>
We also need to provide an AWS IAM role (see [IAM](https://docs.aws.amazon.com/IAM/latest/UserGuide/introduction.html)) with which to access the resources on our account.

In [None]:
my_region = "<your_region>"
boto_session = boto3.session.Session(
    profile_name="<default or your_profile>",
    region_name=my_region,
)
sagemaker_session = sagemaker.session.Session(boto_session=boto_session)
role = "<your_aws_iam_role>"

For now, we also have to specify our own image hosted on ECR (see [ECR](https://docs.aws.amazon.com/AmazonECR/latest/userguide/docker-basics.html)) in which the experiment will run.<br/>
You can use one of the provided Dockerfiles to create an appropriate image.

In [None]:
docker_image = "<your_ecr_docker_image_path>"

We give our training job a base name that reflects the purpose of the experiments we are about to run.

In [None]:
base_job_description = "<your_experiment_007>"
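
To my knowledge, SageMaker training job names may only contain alphanumeric characters and hyphens and are limited to 63 characters, so a base name with underscores or spaces is likely to be rejected. A minimal sketch of a clean-up step follows; `sanitize_job_name` and its `max_length` default of 40 (chosen to leave room for a suffix appended to the base name) are assumptions made for illustration.

```python
import re


def sanitize_job_name(name: str, max_length: int = 40) -> str:
    """Turn an arbitrary label into a hyphen-separated alphanumeric name.

    Hypothetical helper: every run of characters outside [a-zA-Z0-9] becomes
    a single hyphen, and leading or trailing hyphens are dropped.
    """
    cleaned = re.sub(r"[^a-zA-Z0-9]+", "-", name).strip("-")
    # Truncate, then make sure the cut did not leave a trailing hyphen.
    cleaned = cleaned[:max_length].rstrip("-")
    if not cleaned:
        raise ValueError(f"no usable characters in job name {name!r}")
    return cleaned
```

For example, `sanitize_job_name("my_experiment_007")` yields `my-experiment-007`, which can then be assigned to `base_job_description`.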

Next, we create the experimentation framework for the training job.<br/>
At this point we also have to decide on the instance type to run our experiments on.

In [None]:
my_experiment = GluonTSFramework(
    sagemaker_session=sagemaker_session,
    role=role,
    image_name=docker_image,
    base_job_name=base_job_description,
    train_instance_type="ml.c5.xlarge",  # CPU instance. If you use a GPU image, use a GPU instance here.
    output_path=experiment_dir,  # optional.
    code_location=experiment_dir,  # optional.
)

In GluonTSFramework you can also pass "dependencies=[my_specific_gluonts_version_path]" as a parameter to run your experiments with a specific GluonTS version. For this you currently need a Docker image that has all the corresponding requirements installed but not GluonTS itself, because dependencies are only appended to sys.path. This limitation is expected to be fixed soon.

Now we define the estimator we want to train, which can be any GluonEstimator with any hyperparameters.

In [None]:
my_estimator = SimpleFeedForwardEstimator(
    prediction_length=48,
    freq="H",
    trainer=Trainer(ctx="cpu"),  # optional
)

Finally, we call the *train* method to train our estimator; we only need to specify the dataset and the estimator.<br/>
The dataset can be either a built-in one provided by GluonTS (see *gluonts.dataset.repository.datasets.dataset_recipes.keys()*) or any dataset in the GluonTS dataset format located on S3:<br/>
>dataset<br/>
>  ├-> train<br/>
>  |   └> data.json<br/>
>  ├-> test<br/>
>  |   └> data.json<br/>
>  └> metadata.json<br/>
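
The layout above can be produced locally before uploading it to S3. The sketch below assumes that each `data.json` is a JSON Lines file whose entries carry a `start` timestamp and a `target` list, and that `metadata.json` holds `freq` and `prediction_length`; these field names are an assumption here and should be checked against the GluonTS dataset documentation. `write_dataset` is a hypothetical helper, and the series in the usage example are synthetic toy values.

```python
import json
import tempfile
from pathlib import Path


def write_dataset(root, train_series, test_series, freq, prediction_length):
    """Write train/data.json, test/data.json and metadata.json under root."""
    root = Path(root)
    for split, series in (("train", train_series), ("test", test_series)):
        split_dir = root / split
        split_dir.mkdir(parents=True, exist_ok=True)
        # One JSON object per line (JSON Lines), one line per time series.
        with open(split_dir / "data.json", "w") as f:
            for entry in series:
                f.write(json.dumps(entry) + "\n")
    with open(root / "metadata.json", "w") as f:
        json.dump({"freq": freq, "prediction_length": prediction_length}, f)
    return root


# Toy usage: one synthetic hourly series; the test split extends the
# training series by one prediction window.
train = [{"start": "2020-01-01 00:00:00", "target": [float(i) for i in range(96)]}]
test = [{"start": "2020-01-01 00:00:00", "target": [float(i) for i in range(144)]}]
dataset_root = write_dataset(tempfile.mkdtemp(), train, test, freq="H", prediction_length=48)
```

The resulting directory can then be copied to the S3 path that is passed as `dataset` to the *train* method.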

In [None]:
agg_metrics, item_metrics, job_name = my_experiment.train(dataset="m4_hourly", estimator=my_estimator)
# agg_metrics, item_metrics, job_name = my_experiment.train(dataset="s3://<your_s3_dataset_path>", estimator=my_estimator)

Now we can inspect the training progress and the monitored metrics (such as resource consumption or epoch loss) in the SageMaker console under "Training/Training jobs" here:

In [None]:
print(f"https://{my_region}.console.aws.amazon.com/sagemaker/home?region={my_region}#/jobs/{job_name}")

Or look at the results right here once the training job has finished:

In [None]:
agg_metrics

Or head to our bucket to download the model artifacts, which will be located here:

In [None]:
print(f"{experiment_dir}/{job_name}/")
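
To see what was actually written under that prefix without opening the console, the location can be split into bucket and key prefix and listed with an S3 client. This is a minimal sketch: `split_s3_uri` and `list_job_artifacts` are hypothetical helpers, it assumes `experiment_dir` is an `s3://` URI, and a single `list_objects_v2` call returns at most 1000 keys.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split s3://bucket/some/prefix into ("bucket", "some/prefix")."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"expected s3://<bucket>/<prefix>, got {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def list_job_artifacts(s3_client, experiment_dir, job_name):
    """Return the object keys stored under <experiment_dir>/<job_name>/."""
    bucket, prefix = split_s3_uri(f"{experiment_dir.rstrip('/')}/{job_name}/")
    # "Contents" is absent from the response when nothing matches the prefix.
    response = s3_client.list_objects_v2(Bucket=bucket, Prefix=prefix)
    return [obj["Key"] for obj in response.get("Contents", [])]
```

With the session created earlier, this could be called as `list_job_artifacts(boto_session.client("s3"), experiment_dir, job_name)`.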