**<div style='font-size:200%'>Minimalistic example for gluonts entrypoint script</div>**

In [1]:
%matplotlib inline
%load_ext autoreload
%autoreload 2
%config InlineBackend.figure_format = 'retina'

import boto3
import sagemaker as sm
from sagemaker.mxnet.estimator import MXNet

def get_sm_execution_role(region='us-east-1'):
    # cf - https://github.com/aws/sagemaker-python-sdk/issues/300
    client = boto3.client('iam', region_name=region)
    response_roles = client.list_roles(
        PathPrefix='/',
        # Marker='string',
        MaxItems=999
    )
    for role in response_roles['Roles']:
        if role['RoleName'].startswith('AmazonSageMaker-ExecutionRole-'):
            #print('Resolved SageMaker IAM Role to: ' + str(role))
            return role['Arn']
    raise Exception('Could not resolve what should be the SageMaker role to be used')

# A few standard SageMaker stanzas
role: str = get_sm_execution_role()
sess = sm.Session()
region: str = sess.boto_session.region_name
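
The lookup above reads a single `list_roles` response. As a minimal sketch, the same prefix match can walk every page through a boto3 paginator. The names `find_role_arn` and `FakeIamClient` are hypothetical, and the stub client and placeholder ARNs exist only so that the example runs without AWS credentials.

```python
from typing import Any, Optional


def find_role_arn(iam_client: Any, name_prefix: str = "AmazonSageMaker-ExecutionRole-") -> Optional[str]:
    """Return the ARN of the first role whose name starts with name_prefix, or None."""
    paginator = iam_client.get_paginator("list_roles")
    for page in paginator.paginate(PathPrefix="/"):
        for role in page["Roles"]:
            if role["RoleName"].startswith(name_prefix):
                return role["Arn"]
    return None


class FakeIamClient:
    """Offline stand-in for boto3.client('iam'); serves hard-coded pages."""

    def __init__(self, pages):
        self._pages = pages

    def get_paginator(self, operation_name):
        assert operation_name == "list_roles"
        return self

    def paginate(self, **kwargs):
        return iter(self._pages)


# Placeholder roles and account id, for illustration only.
pages = [
    {"Roles": [{"RoleName": "SomeOtherRole",
                "Arn": "arn:aws:iam::123456789012:role/SomeOtherRole"}]},
    {"Roles": [{"RoleName": "AmazonSageMaker-ExecutionRole-example",
                "Arn": "arn:aws:iam::123456789012:role/AmazonSageMaker-ExecutionRole-example"}]},
]
arn = find_role_arn(FakeIamClient(pages))
```

Against a real account, the call would presumably be `find_role_arn(boto3.client('iam'))`.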

# Global config

In [12]:
# Ensure no trailing '/'.
bucket = 'vm-hello-world'
prefix = 'gluonts'

%set_env BUCKET=$bucket
%set_env PREFIX=$prefix

env: BUCKET=vm-hello-world
env: PREFIX=gluonts


# Prepare data

Upload the sample data to the S3 bucket.
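
For orientation, GluonTS file datasets commonly store one time series per line as a JSON object with `start` and `target` fields, plus optional fields such as `feat_static_cat`. The sketch below writes a file in that layout. The helper `write_jsonl` is a hypothetical name, and the values are placeholders; they are not taken from `../refdata/`, whose actual contents are not shown in this notebook.

```python
import json
import tempfile
from pathlib import Path


def write_jsonl(path: Path, records: list) -> None:
    """Write one JSON object per line (JSON Lines)."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")


# Hypothetical series: the values are placeholders, not the real sample data.
records = [
    {"start": "2020-01-01", "target": [1.0, 2.0, 3.0, 4.0], "feat_static_cat": [0]},
    {"start": "2020-01-01", "target": [2.0, 4.0, 6.0, 8.0], "feat_static_cat": [1]},
]

# Write to a temporary directory, then read the file back to check the layout.
root = Path(tempfile.mkdtemp())
write_jsonl(root / "train" / "train.jsonl", records)
lines = (root / "train" / "train.jsonl").read_text().splitlines()
loaded = [json.loads(line) for line in lines]
```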

In [3]:
!aws s3 cp --recursive ../refdata/ s3://$BUCKET/$PREFIX/ --storage-class ONEZONE_IA
!aws s3 ls --recursive s3://$BUCKET/$PREFIX/

upload: ../refdata/test/test.jsonl to s3://vm-hello-world/gluonts/test/test.jsonl
upload: ../refdata/metadata.json to s3://vm-hello-world/gluonts/metadata.json
upload: ../refdata/train/train.jsonl to s3://vm-hello-world/gluonts/train/train.jsonl
2020-04-22 12:57:39          0 gluonts/
2020-04-22 14:29:42        412 gluonts/metadata.json
2020-04-22 14:29:42        230 gluonts/test/test.jsonl
2020-04-22 14:29:42        230 gluonts/train/train.jsonl


# Training job
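
The training cell in this section passes estimator kwargs as hyperparameters: primitive values (including the strings "True", "False", "None" and JSON-encoded lists), and custom classes given as a dotted path with sub-kwargs such as `trainer.epochs`. The following is a minimal sketch of how such values could be decoded. It is not the code in `entrypoint.py`, and the helper names `parse_primitive` and `split_nested_kwargs` are hypothetical.

```python
import json
from typing import Any, Dict


def parse_primitive(value: Any) -> Any:
    """Decode a hyperparameter string into a Python value where possible."""
    if not isinstance(value, str):
        return value
    literals = {"True": True, "False": False, "None": None}
    if value in literals:
        return literals[value]
    try:
        # Handles numbers, lists and dictionaries, e.g. '[5]' becomes [5].
        return json.loads(value)
    except json.JSONDecodeError:
        # Anything else (such as a dotted class path) stays a string.
        return value


def split_nested_kwargs(hps: Dict[str, Any], name: str) -> Dict[str, Any]:
    """Collect 'name.key' entries, e.g. 'trainer.epochs', as kwargs for `name`."""
    prefix = name + "."
    return {
        key[len(prefix):]: parse_primitive(value)
        for key, value in hps.items()
        if key.startswith(prefix)
    }


# A subset of the hyperparameters used in this notebook.
hps = {
    "use_feat_static_cat": "True",
    "cardinality": "[5]",
    "prediction_length": 2,
    "trainer": "gluonts.trainer.Trainer",
    "trainer.epochs": 2,
}
decoded = {k: parse_primitive(v) for k, v in hps.items() if "." not in k}
trainer_kwargs = split_nested_kwargs(hps, "trainer")
```

In this sketch, `decoded` holds the top-level estimator kwargs, while `trainer_kwargs` holds the arguments that would be passed to the class named by `decoded["trainer"]`.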

In [6]:
# Equivalent to: python entrypoint.py --s3-dataset s3_dir --distr_output gluonts.distribution.gaussian.GaussianOutput --use_feat_static_cat True --cardinality '[5]' --prediction_length 2 --trainer gluonts.trainer.Trainer --trainer.epochs 2
mxnet_estimator = MXNet(
                    entry_point='entrypoint.py',
                    source_dir='../src',
                    role=role,
                    train_instance_count=1,
                    train_instance_type='ml.m5.large',
                    framework_version='1.4.1',
                    hyperparameters={
                        # Let's start with non-algorithm hyperparameters
                        'plot_transparent': 0,   # Whether plot should be transparent or white background
                        'num_samples': 1000,     # Number of samples during backtesting.

                        # Here, you specify the algorithm to use, such as DeepAR, DeepFactor, DeepState, Transformer,
                        # etc. See the gluonts.model packages for the list of available algorithms.
                        #
                        # If 'algo' is not specified, then defaults to 'gluonts.model.deepar.DeepAREstimator'.
                        'algo': 'gluonts.model.deepar.DeepAREstimator',

                        # The remaining entries are kwargs to the chosen estimator. For example, for DeepAR, consult
                        # the documentation for gluonts.model.deepar.DeepAREstimator.
                        #
                        # There are two types of kwargs hyperparameters:
                        # - Primitive Python types (incl. dictionaries & lists that can be deserialized from JSON).
                        #   Note that the strings "True", "False", and "None" automatically become True, False, and
                        #   None, respectively.
                        # - Custom classes, notably Trainer and distribution output.
                        #   Note that time_feat is currently unsupported.
                        
                        # Kwargs: Primitive python types.
                        'use_feat_static_cat': 'True',
                        'cardinality': '[5]',
                        'prediction_length': 2,

                        # Kwargs: custom classes.
                        # Currently, this is implemented as a whitelist; the kwarg time_feat is notably missing.

                        # Equivalent to DeepAREstimator(..., distr_output=GaussianOutput(), ...)
                        'distr_output': 'gluonts.distribution.gaussian.GaussianOutput',

                        # Equivalent to DeepAREstimator(..., trainer=Trainer(epochs=2), ...)
                        'trainer': 'gluonts.trainer.Trainer',
                        'trainer.epochs': 2,
                    },
                    py_version='py3')

mxnet_estimator.fit({'s3_dataset': f's3://{bucket}/{prefix}'})

2020-04-22 06:42:32 Starting - Starting the training job...
2020-04-22 06:42:34 Starting - Launching requested ML instances......
2020-04-22 06:43:36 Starting - Preparing the instances for training...
2020-04-22 06:44:17 Downloading - Downloading input data...
2020-04-22 06:44:58 Training - Training image download completed. Training in progress..
2020-04-22 06:44:59,601 sagemaker-containers INFO     Imported framework sagemaker_mxnet_container.training
2020-04-22 06:44:59,604 sagemaker-containers INFO     No GPUs detected (normal if no gpus installed)
2020-04-22 06:44:59,616 sagemaker_mxnet_container.training INFO     MXNet training environment: {'SM_HOSTS': '["algo-1"]', 'SM_NETWORK_INTERFACE_NAME': 'eth0', 'SM_HPS': '{"algo":"gluonts.model.deepar.DeepAREstimator","cardinality":"[5]","distr_output":"gluonts.distribution.gaussian.GaussianOutput","num_samples":1000,"plot_transparent":0,"prediction_length":2,"trainer":"gluonts.trainer.Trainer","trainer.epochs":2,"u

In [10]:
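# Query the training job once and derive both artifact locations from the response.
job_desc = mxnet_estimator.latest_training_job.describe()
model_s3 = job_desc['ModelArtifacts']['S3ModelArtifacts']
# Build the URI with an f-string: S3 keys always use '/', and `os` is not imported in this notebook.
output_s3 = f"{job_desc['OutputDataConfig']['S3OutputPath'].rstrip('/')}/{mxnet_estimator.latest_training_job.job_name}/output/output.tar.gz"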
model_s3, output_s3
%set_env MODEL_S3=$model_s3
%set_env OUTPUT_S3=$output_s3

env: MODEL_S3=s3://sagemaker-ap-southeast-1-484597657167/mxnet-training-2020-04-22-06-42-43-224/output/model.tar.gz
env: OUTPUT_S3=s3://sagemaker-ap-southeast-1-484597657167/mxnet-training-2020-04-22-06-42-43-224/output/output.tar.gz


# Observe training results

As with any SageMaker training job, the entrypoint script generates two artifacts in S3: `model.tar.gz` and `output.tar.gz`.

The `model.tar.gz` contains the persisted model that can be used later on for inference.

The `output.tar.gz` contains the following:
- individual plot of each test timeseries
- montage of plots of all test timeseries
- backtest evaluation metrics.
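
These archives can also be inspected from Python. The sketch below uses only the standard library. The helper names `list_members` and `read_json_member` are hypothetical, and the archive is an in-memory stand-in with a placeholder metric, not the real `output.tar.gz`.

```python
import io
import json
import tarfile


def list_members(archive_bytes: bytes) -> list:
    """Return the member names of a gzip-compressed tar archive held in memory."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as tar:
        return tar.getnames()


def read_json_member(archive_bytes: bytes, name: str) -> dict:
    """Parse one JSON file from the archive without unpacking it to disk."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as tar:
        member = tar.extractfile(name)
        if member is None:
            raise ValueError(f"{name} is not a regular file in the archive")
        return json.load(member)


# Build a tiny stand-in archive so the sketch runs without S3 access.
# The metric name and value are placeholders, not results from the training job.
payload = json.dumps({"placeholder_metric": 0.0}).encode()
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
    info = tarfile.TarInfo("agg_metrics.json")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))

names = list_members(buffer.getvalue())
metrics = read_json_member(buffer.getvalue(), "agg_metrics.json")
```

For the real archive, the bytes would first have to be downloaded from the `output.tar.gz` location in S3.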

In [11]:
%%bash
echo -e "\nModel artifacts $MODEL_S3:"
aws s3 cp $MODEL_S3 - | tar -tzvf -

echo -e "\nOutput $OUTPUT_S3:"
aws s3 cp $OUTPUT_S3 - | tar -tzvf -


Model artifacts s3://sagemaker-ap-southeast-1-484597657167/mxnet-training-2020-04-22-06-42-43-224/output/model.tar.gz:
-rw-r--r--  0 0      0         476 Apr 22 14:45 parameters.json
-rw-r--r--  0 0      0         674 Apr 22 14:45 prediction_net-network.json
-rw-r--r--  0 0      0          38 Apr 22 14:45 version.json
-rw-r--r--  0 0      0        2778 Apr 22 14:45 input_transform.json
-rw-r--r--  0 0      0          51 Apr 22 14:45 type.txt
-rw-r--r--  0 0      0      105665 Apr 22 14:45 prediction_net-0000.params

Output s3://sagemaker-ap-southeast-1-484597657167/mxnet-training-2020-04-22-06-42-43-224/output/output.tar.gz:
-rw-r--r--  0 0      0        1000 Apr 22 14:45 item_metrics.csv
-rw-r--r--  0 0      0        1332 Apr 22 14:45 agg_metrics.json
drwxr-xr-x  0 0      0           0 Apr 22 14:45 plots/
-rw-r--r--  0 0      0       61004 Apr 22 14:45 plots/plots.png
drwxr-xr-x  0 0      0           0 Apr 22 14:45 plots/single/
-rw-r--r--  0 0      0      122084 Apr 22 14:45 plots/s