### How to run a custom script on SageMaker

***GluonTS currently lacks an experimentation framework that is easy to use, supports execution on SageMaker, and allows easy configuration of reproducible experiments.***<br/>
***For this reason, the GluonTS SageMaker SDK was created, building on the Amazon SageMaker Python SDK.***<br/>
***In this how-to tutorial, we write a script that trains a SimpleFeedForwardEstimator on the m4_hourly dataset stored in our S3 bucket, run it on AWS SageMaker using the GluonTSFramework, and then evaluate its performance.***

In [None]:
import boto3
import sagemaker
import gluonts
from gluonts.nursery.sagemaker_sdk.estimator import GluonTSFramework
from gluonts.model.simple_feedforward import SimpleFeedForwardEstimator
from gluonts.trainer import Trainer
import tempfile
from pathlib import Path

First, you should define where all the files generated during the experiment (model artifacts, result files, other custom scripts and dependencies used for the experiment) will be saved.

In [None]:
experiment_dir = "<your_s3_bucket>"
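
Because this location is later passed as `output_path` and `code_location`, it is expected to be an S3 URI of the form `s3://bucket/prefix`. The following is a minimal sketch of a sanity check that can be run before submitting a job; `split_s3_uri` is a hypothetical helper and the bucket name is a made-up example, neither is part of GluonTS or the SageMaker SDK.

```python
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> tuple:
    """Split an S3 URI into (bucket, key_prefix); raise ValueError if malformed."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(
            f"expected a URI of the form s3://bucket/prefix, got {uri!r}"
        )
    return parsed.netloc, parsed.path.strip("/")


# Hypothetical bucket and prefix, for illustration only.
bucket, prefix = split_s3_uri("s3://my-example-bucket/gluonts/experiments")
print(bucket, prefix)
```

Checking the URI locally catches an unfilled placeholder before any AWS call is made.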

Since we want to run experiments on SageMaker, we need to create a SageMaker session with our AWS credentials.<br/>
Here we use a named profile such as "default" (see [boto3](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/quickstart.html#using-boto-3)) and a region such as "us-west-2"; the region must be the one in which our S3 bucket is located.<br/>
We also need to provide an AWS IAM role (see [IAM](https://docs.aws.amazon.com/IAM/latest/UserGuide/introduction.html)) with which to access the resources in our account.

In [None]:
my_region = "<your_region>"
boto_session = boto3.session.Session(
    profile_name="<default or your_profile>", region_name=my_region
)
sagemaker_session = sagemaker.session.Session(boto_session=boto_session)
role = '<your_aws_iam_role>'

For now, we also have to specify our own image hosted on ECR (see [ECR](https://docs.aws.amazon.com/AmazonECR/latest/userguide/docker-basics.html)) in which to run our experiment.<br/>
You can use one of the provided Dockerfiles to create an appropriate image.

In [None]:
docker_image = "<your_ecr_docker_image_path>"

We give our training job a base name that reflects the purpose of the experiments we are about to run.

In [None]:
base_job_description = "<your_experiment_007>"

Since we want to train on a custom dataset, we have to specify its location,
which for now has to be on S3. We name the corresponding input channel "my_dataset";
this name matters when we write our custom script.

In [None]:
my_s3_dataset = "<s3_location_of_your_dataset/m4_hourly/>"
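
The custom script later loads this location with separate `metadata`, `train`, and `test` paths. The sketch below writes a tiny dataset in the directory layout this implies, using only the standard library. The file names (`metadata.json`, `data.json`), the JSON Lines format, and the field names (`freq`, `prediction_length`, `start`, `target`) are assumptions based on how GluonTS usually stores its built-in datasets, and `write_toy_dataset` is a hypothetical helper; verify them against the GluonTS version you pin.

```python
import json
import tempfile
from pathlib import Path


def write_toy_dataset(root: Path, prediction_length: int = 48) -> None:
    """Write a tiny dataset using the assumed metadata/train/test layout."""
    (root / "train").mkdir(parents=True, exist_ok=True)
    (root / "test").mkdir(parents=True, exist_ok=True)

    # Assumed metadata fields: sampling frequency and forecast horizon.
    metadata = {"freq": "H", "prediction_length": prediction_length}
    (root / "metadata.json").write_text(json.dumps(metadata))

    # One JSON object per line; each object holds one time series.
    history = [float(i % 24) for i in range(96)]
    future = [float(i % 24) for i in range(96, 96 + prediction_length)]
    train_entry = {"start": "2020-01-01 00:00:00", "target": history}
    # The test series extends the training series by the forecast horizon.
    test_entry = {"start": "2020-01-01 00:00:00", "target": history + future}

    (root / "train" / "data.json").write_text(json.dumps(train_entry) + "\n")
    (root / "test" / "data.json").write_text(json.dumps(test_entry) + "\n")


toy_root = Path(tempfile.mkdtemp()) / "toy_hourly"
write_toy_dataset(toy_root)
print(sorted(p.relative_to(toy_root).as_posix() for p in toy_root.rglob("*.json")))
```

A directory like this would then be uploaded to S3, and its S3 prefix used as `my_s3_dataset`.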

In [None]:
my_inputs = {'my_dataset': sagemaker.session.s3_input(my_s3_dataset, content_type='application/json')} # at least one required
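
The channel name chosen here determines the environment variable through which the training container exposes the local copy of the data: SageMaker sets `SM_CHANNEL_<NAME>` with the channel name in upper case, which is why the custom script reads `SM_CHANNEL_MY_DATASET`. A minimal sketch of that mapping, where `channel_env_var` is a hypothetical helper used only for illustration:

```python
def channel_env_var(channel_name: str) -> str:
    """Return the environment variable name for a SageMaker input channel."""
    return "SM_CHANNEL_" + channel_name.upper()


print(channel_env_var("my_dataset"))  # SM_CHANNEL_MY_DATASET
```

If you rename the channel in `my_inputs`, the variable read in the entry point script has to change accordingly.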

Additionally, we can specify any dependencies. For now, we only pin a specific GluonTS version using git+ and a commit hash. Be careful to also include the dependencies of that GluonTS version, either in the Docker image that you use or in requirements.txt.

In [None]:
requirements_dot_txt_file_name = "requirements.txt"
requirements_dot_txt_file_content = """
git+https://github.com/awslabs/gluon-ts.git@b9ee9cbc9d6212040fd4be21a460a048e7188306
"""

Finally, we can define the custom script that we want to run on SageMaker: in this case, we train a SimpleFeedForwardEstimator on our dataset located in S3 and evaluate it.

In [None]:
entrypoint_dot_py_file_name = "my_entrypoint.py"
entrypoint_dot_py_file_content = """
# Standard library imports
import argparse
import os
import json
import logging
from pathlib import Path

# First-party imports
from gluonts.dataset import common
from gluonts.dataset.repository import datasets # for the built in gluonts dataset
from gluonts.evaluation import Evaluator, backtest
from gluonts.model.simple_feedforward import SimpleFeedForwardEstimator
from gluonts.trainer import Trainer

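# Logging: print logs analogously to SageMaker.
# basicConfig makes sure INFO-level messages are emitted if no handler is configured yet.
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)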

def run(arguments):
    logger.info("Starting - started running custom script.")

    # We will add a hyperparameter called "num_samples" later.
    # Note that hyperparameters are read from a dict called sm_hps.
    samples = int(arguments.sm_hps["num_samples"])

    # define estimator
    my_estimator = SimpleFeedForwardEstimator(
                        prediction_length=48,
                        freq="H",
                        trainer=Trainer(ctx="cpu") # optional
                    )

    # load custom dataset in gluonts format
    s3_dataset_dir = Path(arguments.my_dataset)
    dataset = common.load_datasets(
        metadata=s3_dataset_dir,
        train=s3_dataset_dir / "train",
        test=s3_dataset_dir / "test",
    )

    # train our model
    predictor = my_estimator.train(dataset.train)
    forecast_it, ts_it = backtest.make_evaluation_predictions(
        dataset=dataset.test,
        predictor=predictor,
        num_samples=samples,
    )
    
    # evaluate our model
    evaluator = Evaluator()
    agg_metrics, item_metrics = evaluator(
        ts_it, forecast_it, num_series=len(dataset.test)
    )
    
    # anything saved to output_data_dir will be copied back to s3
    output_dir = Path(arguments.output_data_dir)
    with open(output_dir / "agg_metrics.json", "w") as f:
        json.dump(agg_metrics, f)
    
    # model has special folder, which will be zipped and copied back to s3
    model_output_dir = Path(arguments.model_dir) 
    predictor.serialize(model_output_dir)
    
    # log the metrics
    logger.info(str(agg_metrics))
    logger.info(str(ts_it))

    return


if __name__ == "__main__":
    parser = argparse.ArgumentParser()

    # load hyperparameters via SM_HPS environment variable
    parser.add_argument(
        "--sm_hps", type=json.loads, default=os.environ["SM_HPS"]
    )

    # save your model here to deploy it to an endpoint later with deploy()
    parser.add_argument(
        "--model_dir", type=str, default=os.environ["SM_MODEL_DIR"]
    )
    # specified inputs (input channels) are saved here
    parser.add_argument(
        "--input_dir", type=str, default=os.environ["SM_INPUT_DIR"]
    )
    # contents of this folder will be written back to s3
    parser.add_argument(
        "--output_data_dir", type=str, default=os.environ["SM_OUTPUT_DATA_DIR"]
    )

    # because we add my_dataset in inputs:
    parser.add_argument('--my_dataset', type=str, default=os.environ['SM_CHANNEL_MY_DATASET']) 

    args, _ = parser.parse_known_args()

    run(args)
"""

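The argument parsing in the script relies on environment variables that SageMaker sets inside the training container. To check that part locally before submitting a job, the variables can be set by hand. The sketch below rebuilds the same parser and feeds it made-up values; the temporary directories and the `num_samples` value are placeholders, and `build_parser` is a hypothetical helper that mirrors the script rather than anything provided by the SDK.

```python
import argparse
import json
import os
import tempfile


def build_parser() -> argparse.ArgumentParser:
    """Rebuild the entry point's parser; defaults come from SM_* variables."""
    parser = argparse.ArgumentParser()
    # argparse applies `type` to string defaults, so the JSON is decoded to a dict.
    parser.add_argument("--sm_hps", type=json.loads, default=os.environ["SM_HPS"])
    parser.add_argument("--model_dir", type=str, default=os.environ["SM_MODEL_DIR"])
    parser.add_argument(
        "--output_data_dir", type=str, default=os.environ["SM_OUTPUT_DATA_DIR"]
    )
    parser.add_argument(
        "--my_dataset", type=str, default=os.environ["SM_CHANNEL_MY_DATASET"]
    )
    return parser


# Placeholder values standing in for what the container would provide.
os.environ["SM_HPS"] = json.dumps({"num_samples": 150})
os.environ["SM_MODEL_DIR"] = tempfile.mkdtemp()
os.environ["SM_OUTPUT_DATA_DIR"] = tempfile.mkdtemp()
os.environ["SM_CHANNEL_MY_DATASET"] = tempfile.mkdtemp()

# An empty argument list forces every value to come from the environment.
args, _ = build_parser().parse_known_args([])
print(int(args.sm_hps["num_samples"]))
```

The defaults are read when the parser is built, so the environment variables must be set before `build_parser()` is called.
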
Next we create the temporary files. Ideally you would just have a "requirements.txt" and "entrypoint.py" in your directory.

In [None]:
# only using temporary directory for demonstration
temp_dir = Path(tempfile.mkdtemp())

# create the requirements.txt file
with open(temp_dir / requirements_dot_txt_file_name, "w") as req_file: # has to be called requirements.txt
    req_file.write(requirements_dot_txt_file_content)
my_requirements_txt_file = str(temp_dir / requirements_dot_txt_file_name)

# create the entrypoint.py file
with open(temp_dir / entrypoint_dot_py_file_name, "w") as entry_file: # the entry point script
    entry_file.write(entrypoint_dot_py_file_content)
entrypoint_dot_py_file = str(temp_dir / entrypoint_dot_py_file_name)

Now that we have everything defined, we can finally run our experiment. Note that here we have to define the hyperparameters we want to use in our script.

In [None]:
my_experiment, my_job_name = GluonTSFramework.run(
                    entry_point=entrypoint_dot_py_file,
                    inputs = my_inputs,
                    sagemaker_session=sagemaker_session,
                    role=role,
                    image_name=docker_image,  
                    base_job_name=base_job_description,
                    train_instance_type="ml.c5.xlarge", # CPU instance. If you use a GPU image, use a GPU instance here.
                    output_path=experiment_dir, # optional.
                    code_location=experiment_dir, # optional.
                    dependencies=[my_requirements_txt_file], # or the currently imported one [Path(gluonts.__path__[0])]
                    hyperparameters={"num_samples":150} # optional
                )


Now we could head to our bucket to download the model artifacts and anything else we saved into the output dir, which will be located here:

In [None]:
print(f"{experiment_dir}/{my_job_name}/")
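
Once the files are downloaded, the metrics written by the script can be read back. The sketch below assumes that the contents of the output data directory arrive as a gzipped tar archive (SageMaker typically names it `output.tar.gz`) with `agg_metrics.json` at its top level; check the member name in your own archive, since it may carry a leading `./`. `read_metrics` is a hypothetical helper, and the archive built here is a stand-in with made-up numbers so that the example runs without AWS access.

```python
import io
import json
import tarfile
import tempfile
from pathlib import Path


def read_metrics(archive_path: Path, member: str = "agg_metrics.json") -> dict:
    """Load a JSON metrics file directly from a .tar.gz archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        fileobj = tar.extractfile(member)  # raises KeyError if member is missing
        if fileobj is None:
            raise FileNotFoundError(f"{member} is not a regular file")
        return json.load(fileobj)


# Build a stand-in archive with made-up metric values, for illustration only.
archive = Path(tempfile.mkdtemp()) / "output.tar.gz"
payload = json.dumps({"MASE": 1.0, "sMAPE": 0.5}).encode("utf-8")
with tarfile.open(archive, "w:gz") as tar:
    info = tarfile.TarInfo(name="agg_metrics.json")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))

metrics = read_metrics(archive)
print(sorted(metrics))
```

Reading the member straight from the archive avoids unpacking every output file when only the aggregate metrics are needed.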