**<div style='font-size:200%'>Minimalistic example for gluonts entrypoint script</div>**

In [None]:
%matplotlib inline
%load_ext autoreload
%autoreload 2
%config InlineBackend.figure_format = 'retina'

import json
import os
from time import gmtime, strftime

import boto3
import sagemaker as sm
from sagemaker.mxnet.estimator import MXNet
from sagemaker.tuner import CategoricalParameter, ContinuousParameter, HyperparameterTuner, IntegerParameter

# A few standard SageMaker stanzas
role: str = get_sm_execution_role()
sess = sm.Session()
region: str = sess.boto_session.region_name

# Global config

In [None]:
# Input data for training. Ensure these match the bucket, prefix, and dataset_name used in 01-create-dataset.ipynb.
bucket = 'BUCKET'
prefix = 'PREFIX/DATASET_NAME'   # Ensure no trailing '/'.
data_channels = {'s3_dataset': f's3://{bucket}/{prefix}'}
print(data_channels)

# Model hyperparameters. Here we demonstrate that:
# - the dataset embeds its recommended forecast length (=12) in its metadata.
# - However, each train job is free to override the forecast length. We override to 6 weeks.
fcast_length = 6

source_dir = '../src/entrypoint/'

%set_env BUCKET=$bucket
%set_env PREFIX=$prefix

# Metric emitted by each training job. The entrypoint script may emit even more metrics, but
# we choose to capture only a few.
metric=[
    {"Name": "train:loss", "Regex": r"Epoch\[\d+\] Evaluation metric 'epoch_loss'=(\S+)"},
    {"Name": "test:loss", "Regex": r"Epoch\[\d+\] Evaluation metric 'validation_epoch_loss'=(\S+)"},
    {"Name": "train:learning_rate", "Regex": r"Epoch\[\d+\] Learning rate is (\S+)"},
    {"Name": "test:abs_error", "Regex": r"gluonts\[metric-abs_error\]: (\S+)"},
    {"Name": "test:rmse", "Regex": r"gluonts\[metric-RMSE\]: (\S+)"},
    {"Name": "test:wmape", "Regex": r"gluonts\[metric-wMAPE\]: (\S+)"},
]


# Read cardinality of static feature

In [None]:
# If this cell fails, upgrade s3fs.
with Path2(f's3://{bucket}/{prefix}/metadata/metadata.json').open('rb') as f:
    cardinality = [int(json.load(f)['feat_static_cat'][0]['cardinality'])]
cardinality

Reading the recommended forecast length from the dataset's metadata in `f's3://{bucket}/{prefix}/metadata/metadata.json'` is left as an exercise.
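As a starting point for that exercise, here is a minimal sketch that parses a metadata document from an in-memory string instead of S3. The sample JSON, and in particular the `prediction_length` key, is an assumption modeled on the cell above; check your own `metadata.json` for the actual field names.

```python
import json

# Hypothetical metadata content; the real file lives in S3 and may hold other fields.
sample_metadata = '''
{
    "freq": "W",
    "prediction_length": 12,
    "feat_static_cat": [{"name": "feat_static_cat_0", "cardinality": "5"}]
}
'''

meta = json.loads(sample_metadata)

# Same lookup as the cell above: cardinality of the first static categorical feature.
cardinality = [int(meta['feat_static_cat'][0]['cardinality'])]

# Recommended forecast length embedded in the dataset (assumed key name).
recommended_fcast_length = int(meta['prediction_length'])

print(cardinality, recommended_fcast_length)
```

To apply the same lookup to the real file, replace `json.loads(sample_metadata)` with a `json.load(f)` call on the file handle opened in the cell above.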

# Training job

The same entrypoint script supports the various estimators in gluonts. You select the estimator by passing the hyperparameter `algo` to the entrypoint script. The next cell demonstrates the DeepAR estimator `gluonts.model.deepar.DeepAREstimator`.

Feel free to experiment with another estimator e.g., `gluonts.model.deepstate.DeepStateEstimator`.
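Because `algo` is a fully qualified class name, a script can turn it into a class with a dynamic import. The sketch below shows one common way to do that; it is not taken from the entrypoint script, and the helper name `resolve_class` is hypothetical.

```python
import importlib


def resolve_class(dotted_path: str):
    """Import and return the object named by a dotted path such as 'pkg.module.ClassName'."""
    module_name, _, class_name = dotted_path.rpartition('.')
    if not module_name:
        raise ValueError(f'Expected a fully qualified name, got {dotted_path!r}')
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# Demonstrated with a standard-library class so the snippet runs without gluonts installed.
cls = resolve_class('collections.OrderedDict')
print(cls)
```

With gluonts installed, the same helper should accept `'gluonts.model.deepar.DeepAREstimator'` or any other importable estimator path.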

Note that you must consult each estimator's documentation for its hyperparameters, i.e., its `kwargs`.
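SageMaker passes hyperparameters as flat key/value pairs, so keys such as `trainer.epochs` have to be regrouped before they can become keyword arguments. The following is a rough sketch of that idea under our own assumptions; the helpers `coerce` and `group_kwargs` are hypothetical and do not reproduce the entrypoint script's actual parsing.

```python
import json
from typing import Any, Dict


def coerce(value: Any) -> Any:
    """Turn 'True'/'False'/'None' and JSON-like strings into Python values; leave the rest as-is."""
    if not isinstance(value, str):
        return value
    literals = {'True': True, 'False': False, 'None': None}
    if value in literals:
        return literals[value]
    try:
        return json.loads(value)
    except ValueError:
        # Not valid JSON (e.g., a dotted class path), so keep the raw string.
        return value


def group_kwargs(flat: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Split 'parent.child' keys into per-parent dictionaries; plain keys stay in 'kwargs'."""
    top: Dict[str, Any] = {}
    nested: Dict[str, Dict[str, Any]] = {}
    for key, value in flat.items():
        parent, dot, child = key.partition('.')
        if dot:
            nested.setdefault(parent, {})[child] = coerce(value)
        else:
            top[key] = coerce(value)
    return {'kwargs': top, 'nested': nested}


hp = {
    'use_feat_static_cat': 'True',
    'cardinality': '[5]',
    'prediction_length': 2,
    'trainer.epochs': '10',
    'trainer.learning_rate': '1e-3',
}
result = group_kwargs(hp)
print(result)
```

In this sketch, `result['nested']['trainer']` holds the values that would be passed on to a `Trainer(...)` constructor.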

In [None]:
# Equivalent to:
#
# python entrypoint.py \
#     --s3-dataset <local_path for direct invocation> \
#     --plot_transparent 0 \
#     --num_samples 1000 \
#     --algo 'gluonts.model.deepar.DeepAREstimator' \
#     --use_feat_static_cat True \
#     --cardinality <json string in cardinality variable> \
#     --prediction_length <fcast_length> \
#     --distr_output gluonts.distribution.gaussian.GaussianOutput \
#     --trainer gluonts.trainer.Trainer \
#     --trainer.epochs 10
mxnet_estimator = MXNet(
                    entry_point='entrypoint.py',
                    source_dir=source_dir,
                    framework_version='1.6.0',
                    py_version='py3',
                    role=role,
                    train_instance_count=1,
                    train_instance_type='ml.m5.large',
                    sagemaker_session=sess,
                    hyperparameters={
                        # Let's start with non-algorithm hyperparameters
                        'plot_transparent': 0,   # Whether plot should be transparent or white background
                        'num_samples': 1000,     # Number of samples during backtesting.

                        # Here, you specify the algorithm to use, such as DeepAR, DeepFactor, DeepState, Transformer,
                        # etc. See the gluonts.model packages for the list of available algorithms.
                        #
                        # If 'algo' is not specified, it defaults to 'gluonts.model.deepar.DeepAREstimator'.
                        'algo': 'gluonts.model.deepar.DeepAREstimator',

                        # The remaining entries are kwargs to the chosen estimator. For example, for DeepAR, consult
                        # the documentation for gluonts.model.deepar.DeepAREstimator.
                        #
                        # There are two types of kwargs hyperparameters:
                        # - Primitive Python types (incl. dictionaries & lists that can be deserialized from JSON).
                        #   Note that the strings "True", "False", and "None" automatically become True, False, and
                        #   None, respectively.
                        # - Custom classes, notably Trainer and distribution output.
                        #   Note that time_feat is unsupported at this point in time.
                        
                        # Kwargs: Primitive python types.
                        'use_feat_static_cat': 'True',
                        'cardinality': cardinality,
                        'prediction_length': fcast_length,

                        # Equivalent to DeepAREstimator(..., distr_output=GaussianOutput(), ...)
                        'distr_output.__class__': 'gluonts.distribution.gaussian.GaussianOutput',

                        # Equivalent to DeepAREstimator(..., trainer=Trainer(epochs=10), ...)
                        'trainer.__class__': 'gluonts.trainer.Trainer',
                        'trainer.epochs': 10,
                    },

                    # Metric emitted by each training job. The entrypoint script may emit even more metrics, but
                    # we choose to capture only a few.
                    metric_definitions=[
                        {"Name": "train:loss", "Regex": r"Epoch\[\d+\] Evaluation metric 'epoch_loss'=(\S+)"},
                        {"Name": "train:learning_rate", "Regex": r"Epoch\[\d+\] Learning rate is (\S+)"},
                        {"Name": "test:abs_error", "Regex": r"gluonts\[metric-abs_error\]: (\S+)"},
                        {"Name": "test:rmse", "Regex": r"gluonts\[metric-RMSE\]: (\S+)"},
                        {"Name": "test:wmape", "Regex": r"gluonts\[metric-wMAPE\]: (\S+)"},
                    ],
)

mxnet_estimator.fit(data_channels)

In [None]:
job_desc = mxnet_estimator.latest_training_job.describe()
job_name = mxnet_estimator.latest_training_job.job_name
model_s3 = job_desc['ModelArtifacts']['S3ModelArtifacts']
# S3 keys always use '/', so build the path with an f-string rather than os.path.join.
output_s3 = f"{job_desc['OutputDataConfig']['S3OutputPath'].rstrip('/')}/{job_name}/output/output.tar.gz"
print(model_s3, output_s3, sep='\n')
%set_env MODEL_S3=$model_s3
%set_env OUTPUT_S3=$output_s3

<div style="color:green;font-weight:bold;font-size:250%">IMPORTANT:</div>

Please note the resulting `model_s3` and `output_s3` in the above cell.
- You'll need the `model_s3` in the next notebook to perform Batch Transform.
- You can download, extract and inspect the content of `output_s3` to observe the training output: metrics, and the forecast plots.

# Observe training results

As in any SageMaker training job, the entrypoint script generates two artifacts in S3: `model.tar.gz` and `output.tar.gz`.

The `model.tar.gz` contains the persisted model that can be used later on for inference.

The `output.tar.gz` contains the following:
- individual plot of each test timeseries
- montage of plots of all test timeseries
- backtest evaluation metrics.
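If you prefer to inspect an archive from Python rather than from the shell, the standard `tarfile` module can list a `.tar.gz` held in memory. The sketch below builds a tiny stand-in archive so that it runs without AWS access; the member names are invented for illustration and do not reflect the real contents of `output.tar.gz`.

```python
import io
import tarfile


def list_tar_gz(data: bytes):
    """Return the member names of a gzip-compressed tar archive held in memory."""
    with tarfile.open(fileobj=io.BytesIO(data), mode='r:gz') as tar:
        return tar.getnames()


# Build a small stand-in archive; these file names are made up.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode='w:gz') as tar:
    for name, payload in [('agg_metrics.json', b'{}'), ('plots/ts-000.png', b'')]:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))

print(list_tar_gz(buf.getvalue()))
```

For a real artifact, you would pass in the bytes downloaded from S3, for example the result of `boto3.client('s3').get_object(Bucket=..., Key=...)['Body'].read()`.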

In [None]:
%%bash
echo -e "\nModel artifacts $MODEL_S3:"
aws s3 cp $MODEL_S3 - | tar -tzvf -

echo -e "\nOutput $OUTPUT_S3:"
aws s3 cp $OUTPUT_S3 - | tar -tzvf -

# Optional: HPO

We provide a few more examples of how to run the entrypoint script through SageMaker HPO.
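Each tuning job optimizes an objective metric that SageMaker extracts from the training logs with the regexes given in the metric definitions. The sketch below applies two of this notebook's regexes to invented log lines, so you can check a pattern locally before launching a job; the helper `extract_metrics` and the sample log lines are ours, not SageMaker output.

```python
import re

metric_definitions = [
    {"Name": "train:loss", "Regex": r"Epoch\[\d+\] Evaluation metric 'epoch_loss'=(\S+)"},
    {"Name": "test:wmape", "Regex": r"gluonts\[metric-wMAPE\]: (\S+)"},
]

# Invented log lines shaped to match the regexes above.
log_lines = [
    "Epoch[0] Evaluation metric 'epoch_loss'=3.25",
    "Epoch[1] Evaluation metric 'epoch_loss'=2.75",
    "gluonts[metric-wMAPE]: 0.125",
]


def extract_metrics(lines, definitions):
    """Collect every captured value per metric name, in log order."""
    found = {d["Name"]: [] for d in definitions}
    for line in lines:
        for d in definitions:
            match = re.search(d["Regex"], line)
            if match:
                found[d["Name"]].append(float(match.group(1)))
    return found


metrics = extract_metrics(log_lines, metric_definitions)
print(metrics)
```

A metric whose list stays empty here would also yield no data points in a real job, which is worth catching early when that metric is the tuning objective.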

In [None]:
def create_tuning_job(objective_metric_name, estimator_hp, tuner_hp, metric, role, sess):
    # FIXME: do not use global variable source_dir
    estimator = MXNet(entry_point='entrypoint.py',
                      source_dir=source_dir,
                      framework_version='1.6.0',
                      py_version='py3',
                      role=role,
                      train_instance_count=1,
                      train_instance_type='ml.m5.large',
                      sagemaker_session=sess,
                      hyperparameters=estimator_hp,
                      metric_definitions=metric,
    )

    tuner = HyperparameterTuner(
                estimator,
                objective_metric_name,
                tuner_hp,
                metric,   # Also needed for custom algo. (i.e., entrypoint script).
                objective_type='Minimize',
                max_jobs=4,   # Hardcoded (for testing).
                max_parallel_jobs=1)
    return tuner

def get_ts():
    return strftime("%y%m%d-%H%M%S", gmtime())

## Create a tuner for DeepAR

\[SFG17\] Salinas, David, Valentin Flunkert, and Jan Gasthaus. “DeepAR: Probabilistic forecasting with autoregressive recurrent networks.” arXiv preprint arXiv:1704.04110 (2017).

In [None]:
# Equivalent to:
# python entrypoint.py --s3-dataset s3_dir --distr_output gluonts.distribution.gaussian.GaussianOutput \
#     --use_feat_static_cat True --cardinality '[5]' --prediction_length 2 \
#     --trainer gluonts.trainer.Trainer --trainer.epochs 10
tuner_deepar = create_tuning_job(
    objective_metric_name='test:wmape',

    # Fixed hyperparameters, i.e., same for all training jobs.
    estimator_hp={
        # Let's start with non-algorithm hyperparameters
        'plot_transparent': 0,   # Whether plot should be transparent or white background
        'num_samples': 1000,     # Number of samples during backtesting.

        # Here, you specify the algorithm to use, such as DeepAR, DeepFactor, DeepState, Transformer,
        # etc. See the gluonts.model packages for the list of available algorithms.
        #
        # If 'algo' is not specified, it defaults to 'gluonts.model.deepar.DeepAREstimator'.
        'algo': 'gluonts.model.deepar.DeepAREstimator',

        # The remaining entries are kwargs to the chosen estimator. For example, for DeepAR, consult the
        # documentation for gluonts.model.deepar.DeepAREstimator.
        #
        # There are two types of kwargs hyperparameters:
        # - Primitive Python types (incl. dictionaries & lists that can be deserialized from JSON).
        #   Note that the strings "True", "False", and "None" automatically become True, False, and
        #   None, respectively.
        # - Custom classes, notably Trainer and distribution output.
        #   Note that time_feat is unsupported at this point in time.

        # Kwargs: Primitive python types.
        'use_feat_static_cat': 'True',
        'cardinality': '[5]',
        'prediction_length': 2,

        # Kwargs: custom classes.
        # Currently, this is implemented as a whitelist; the kwarg time_feat is notably missing.

        # Equivalent to DeepAREstimator(..., distr_output=GaussianOutput(), ...)
        'distr_output': 'gluonts.distribution.gaussian.GaussianOutput',

        # Equivalent to DeepAREstimator(..., trainer=Trainer(epochs=10), ...)
        'trainer': 'gluonts.trainer.Trainer',
        'trainer.epochs': 10,
    },

    # Tunable hyperparameters, i.e., may vary across training jobs.
    tuner_hp={
        "num_cells": IntegerParameter(30, 200),
        "num_layers": IntegerParameter(1, 8),
        "trainer.learning_rate": ContinuousParameter(1e-5, 1e-1, scaling_type='Logarithmic'),
    },

    # Metric emitted by each training job. The entrypoint script may emit even more metrics, but
    # we choose to capture only a few.
    metric=[
        {"Name": "train:loss", "Regex": r"Epoch\[\d+\] Evaluation metric 'epoch_loss'=(\S+)"},
        {"Name": "train:learning_rate", "Regex": r"Epoch\[\d+\] Learning rate is (\S+)"},
        {"Name": "test:abs_error", "Regex": r"gluonts\[metric-abs_error\]: (\S+)"},
        {"Name": "test:rmse", "Regex": r"gluonts\[metric-RMSE\]: (\S+)"},
        {"Name": "test:wmape", "Regex": r"gluonts\[metric-wMAPE\]: (\S+)"},
    ],

    role=role,
    sess=sess,
)

## Create a tuner for DeepState

\[RSG+18\] Rangapuram, Syama Sundar, et al. “Deep state space models for time series forecasting.” Advances in Neural Information Processing Systems. 2018.

In [None]:
# Equivalent to:
# python entrypoint.py --s3_dataset s3_dir --algo gluonts.model.deepstate.DeepStateEstimator \
#    --trainer gluonts.trainer.Trainer --trainer.epochs 10 --use_feat_static_cat True --cardinality '[5]' \
#    --noise_std_bounds gluonts.distribution.lds.ParameterBounds --noise_std_bounds.lower 1e-5 --noise_std_bounds.upper 1e-1
tuner_deepstate = create_tuning_job(
    objective_metric_name='test:wmape',

    # Fixed hyperparameters, i.e., same for all training jobs.
    estimator_hp={
        # Let's start with non-algorithm hyperparameters
        'plot_transparent': 0,   # Whether plot should be transparent or white background
        'num_samples': 1000,     # Number of samples during backtesting.

        # Here, you specify the algorithm to use, such as DeepAR, DeepFactor, DeepState, Transformer,
        # etc. See the gluonts.model packages for the list of available algorithms.
        #
        # If 'algo' is not specified, it defaults to 'gluonts.model.deepar.DeepAREstimator'.
        'algo': 'gluonts.model.deepstate.DeepStateEstimator',

        # The remaining entries are kwargs to the chosen estimator. For example, for DeepState, consult the
        # documentation for gluonts.model.deepstate.DeepStateEstimator.
        #
        # There are two types of kwargs hyperparameters:
        # - Primitive Python types (incl. dictionaries & lists that can be deserialized from JSON).
        #   Note that the strings "True", "False", and "None" automatically become True, False, and
        #   None, respectively.
        # - Custom classes, notably Trainer and distribution output.
        #   Note that time_feat is unsupported at this point in time.

        # Kwargs: Primitive python types.
        'use_feat_static_cat': 'True',
        'cardinality': '[5]',
        'prediction_length': 2,

        # Kwargs: custom classes.
        # Currently, this is implemented as a whitelist; the kwarg time_feat is notably missing.

        # Equivalent to DeepStateEstimator(..., noise_std_bounds=ParameterBounds(lower=1e-5, upper=1e-1), ...)
        # The estimator's default bounds are ParameterBounds(lower=1e-06, upper=1.0), so these hyperparameters
        # are optional. We specify them here to show how you can customize the bounds. Please consult the
        # DeepState documentation for all the bounds it supports.
        "noise_std_bounds": "gluonts.distribution.lds.ParameterBounds",
        "noise_std_bounds.lower": "1e-5",
        "noise_std_bounds.upper": "1e-1",

        # Equivalent to DeepStateEstimator(..., trainer=Trainer(epochs=10), ...)
        'trainer': 'gluonts.trainer.Trainer',
        'trainer.epochs': 10,
    },

    # Tunable hyperparameters, i.e., may vary across training jobs.
    tuner_hp={
        "num_cells": IntegerParameter(30, 200),
        "num_layers": IntegerParameter(1, 8),
        "trainer.learning_rate": ContinuousParameter(1e-5, 1e-1, scaling_type='Logarithmic'),
    },

    # Metric emitted by each training job. The entrypoint script may emit even more metrics, but
    # we choose to capture only a few.
    metric=[
        {"Name": "train:loss", "Regex": r"Epoch\[\d+\] Evaluation metric 'epoch_loss'=(\S+)"},
        {"Name": "train:learning_rate", "Regex": r"Epoch\[\d+\] Learning rate is (\S+)"},
        {"Name": "test:abs_error", "Regex": r"gluonts\[metric-abs_error\]: (\S+)"},
        {"Name": "test:rmse", "Regex": r"gluonts\[metric-RMSE\]: (\S+)"},
        {"Name": "test:wmape", "Regex": r"gluonts\[metric-wMAPE\]: (\S+)"},
    ],

    role=role,
    sess=sess,
)

## Start each tuner

In [None]:
tuner_deepar.fit(data_channels, job_name='gtsdeepar-'+get_ts(), include_cls_metadata=False)
tuner_deepstate.fit(data_channels, job_name='gtsdeepstate-'+get_ts(), include_cls_metadata=False)

# Show tuning names
for tuner in [tuner_deepar, tuner_deepstate]:
    print(tuner.latest_tuning_job.job_name)

Wait until all tuning jobs finish (check from the console), then proceed with the cells below.

**<div style="color:firebrick">If you restart this notebook kernel, you need to manually attach to the existing tuning jobs.
Sample code is shown below, but be aware that you must specify the tuning jobs to attach to yourself!</div>**

In [None]:
# After a kernel restart, re-attach to your tuning jobs by name, e.g.:
# completed_tuners = [HyperparameterTuner.attach(name, sagemaker_session=sess) for name in ['<tuning-job-name>']]
# for tuner in completed_tuners:
for tuner in [tuner_deepar, tuner_deepstate]:
    tuning_job_name = tuner.latest_tuning_job.name
    # The SDK does not expose the tuning status directly, so query the low-level SageMaker client.
    sm_client = tuner.sagemaker_session.boto_session.client('sagemaker')
    desc = sm_client.describe_hyper_parameter_tuning_job(HyperParameterTuningJobName=tuning_job_name)
    status = 'NOT_DONE' if int(desc['ObjectiveStatusCounters']['Pending']) > 0 else 'DONE'

    try:
        best_training_job = tuner.best_training_job()
    except Exception:
        # Exception: Best training job not available for tuning job: gtsdeepar-200423-064644
        best_training_job = None

    print('\n',
        status,
        tuning_job_name,
        best_training_job,
        #sess.sagemaker_client.describe_training_job(TrainingJobName=tuner.best_training_job())['HyperParameters']['likelihood'],
        tuner.objective_metric_name,
        tuner.analytics().dataframe()['FinalObjectiveValue'].min(),
        sep='\n',
    )