# Time Series Forecast on Twitter Volume Using GluonTS with SageMaker Automatic Model Tuner for Hyperparameter Tuning



Having explored GluonTS in the previous section (descriptive statistics), we now want to quickly run a time-series forecast using SageMaker. In this example we use the Twitter volume dataset and first create a baseline (a seasonal naive estimator). Afterwards we create and train a DeepAR model and compare it to the baseline.


In the following cell we build a training dataset ending on April 5th, 2015 and a test dataset that will be used to forecast the hour following midnight on April 15th, 2015. GluonTS requires the full time series to be in the test dataset, so both the test and the training data start on February 26th, 2015. GluonTS then cuts the last `n` elements from the test dataset in order to predict them, where `n` equals the prediction length.
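The "cut off the last `n` elements" idea can be illustrated without GluonTS. The following is a minimal sketch in plain Python; the helper name `split_last_n` and the toy numbers are assumptions made for illustration and are not part of GluonTS or of the dataset.

```python
from typing import List, Sequence, Tuple


def split_last_n(values: Sequence[float], prediction_length: int) -> Tuple[List[float], List[float]]:
    """Return (history, target), where target holds the last `prediction_length` points."""
    if not 0 < prediction_length < len(values):
        raise ValueError("prediction_length must be between 1 and len(values) - 1")
    history = list(values[:-prediction_length])
    target = list(values[-prediction_length:])
    return history, target


# Toy series standing in for the tweet counts (made-up numbers).
series = [57, 43, 55, 64, 93, 80, 61, 49]
history, target = split_last_n(series, prediction_length=3)
print(history)  # [57, 43, 55, 64, 93]
print(target)   # [80, 61, 49]
```

The model only sees `history` when forecasting, and the forecast is then scored against `target`.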


### DeepAR Hyperparameters 

Apart from `prediction_length`, the time frequency and the number of epochs, we have not specified any other hyperparameters. DeepAR has many hyperparameters, and in this section we will use the SageMaker automatic model tuner to find a good set for our model. Here is a short list of some hyperparameters and their default values in GluonTS DeepAR:

| Hyperparameters          | Value                     |
|--------------------------|---------------------------|
| epochs                   | 100                       |
| context_length           | prediction_length         |
| batch size               | 32                        |
| learning rate            | $1e-3$                    |
| LSTM layers              | 2                         |
| LSTM nodes               | 40                        |
| likelihood               | StudentTOutput()          |


### Likelihood Models

We also need to choose a likelihood model that matches the data. For count data, for example, we choose a negative-binomial or Student-T likelihood. Other likelihood models can also be used readily, as long as samples from the distribution can be obtained cheaply and the log-likelihood and its gradients with respect to the parameters can be evaluated. For example:

- **Gaussian:** Use for real-valued data.
- **Beta:** Use for real-valued targets between 0 and 1 inclusive.
- **Negative-binomial:** Use for count data (non-negative integers).
- **Student-T:** An alternative for real-valued data that works well for bursty data.
- **Deterministic-L1:** A loss function that does not estimate uncertainty and only learns a point forecast.
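As a rough illustration of how the list above maps data properties to a likelihood, here is a small heuristic sketch. The function `suggest_likelihood` is hypothetical and not part of GluonTS or SageMaker; it only encodes the rules of thumb from the list and returns a plain string.

```python
from typing import Sequence


def suggest_likelihood(values: Sequence[float]) -> str:
    """Hypothetical helper: map basic properties of the target to a likelihood name."""
    if len(values) == 0:
        raise ValueError("values must not be empty")
    all_integers = all(float(v).is_integer() for v in values)
    non_negative = all(v >= 0 for v in values)
    if all_integers and non_negative:
        return "negative-binomial"   # count data
    if all(0 <= v <= 1 for v in values):
        return "beta"                # real-valued targets in [0, 1]
    return "gaussian or student-t"   # general real-valued data


print(suggest_likelihood([57, 43, 55, 64]))    # negative-binomial
print(suggest_likelihood([0.12, 0.5, 0.93]))   # beta
print(suggest_likelihood([-1.5, 2.25, 10.0]))  # gaussian or student-t
```

In practice the choice is usually validated empirically, for instance by comparing forecast accuracy across likelihoods.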

Refer to the [documentation](https://gluon-ts.mxnet.io/api/gluonts/gluonts.model.deepar.html) for a full description of the available parameters. In this notebook you will learn how to train your GluonTS model on Amazon SageMaker and how to tune it with the automatic model tuner.

In [1]:
import pandas as pd
import gluonts
import numpy as np
import matplotlib.pyplot as plt
import pathlib
import json
import boto3
import s3fs
import csv
import sagemaker

### Upload data to Amazon S3
In order to run the model training with Amazon SageMaker, we need to upload our training and test data to Amazon S3. In the following code cell, we look up the SageMaker default bucket that the data will be uploaded to.

In [2]:
sagemaker_session = sagemaker.Session()
s3_bucket = sagemaker_session.default_bucket()

s3_train_data_path = "s3://{}/gluonts/train".format(s3_bucket)
s3_test_data_path = "s3://{}/gluonts/test".format(s3_bucket)

print("Data will be uploaded to: ", s3_bucket)

Data will be uploaded to:  sagemaker-us-west-2-976939723775


Now we download the file and split it into training and test data. Afterwards we write each split to a CSV file.

In [3]:
url = "https://raw.githubusercontent.com/numenta/NAB/master/data/realTweets/Twitter_volume_AMZN.csv"
df = pd.read_csv(filepath_or_buffer=url, header=0, index_col=0)

train = df[: "2015-04-05 00:00:00"]
train.to_csv("train.csv")

test = df[: "2015-04-15 00:00:00"]
test.to_csv("test.csv")

The following function will create a `train` and `test` folder in the S3 bucket and upload the csv files.

In [4]:
s3 = boto3.resource('s3')
def copy_to_s3(local_file, s3_path, override=False):
    assert s3_path.startswith('s3://')
    split = s3_path.split('/')
    bucket = split[2]
    path = '/'.join(split[3:])
    buk = s3.Bucket(bucket)
    
    if len(list(buk.objects.filter(Prefix=path))) > 0:
        if not override:
            print('File {} already exists.\nSet override to upload anyway.\n'.format(s3_path))
            return
        else:
            print('Overwriting existing file')
    with open(local_file, 'rb') as data:
        print('Uploading file to {}'.format(s3_path))
        buk.put_object(Key=path, Body=data)
        
copy_to_s3("train.csv", s3_train_data_path + "/train.csv")
copy_to_s3("test.csv", s3_test_data_path + "/test.csv")

File s3://sagemaker-us-west-2-976939723775/gluonts/train/train.csv already exists.
Set override to upload anyway.

File s3://sagemaker-us-west-2-976939723775/gluonts/test/test.csv already exists.
Set override to upload anyway.
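The upload function splits the S3 path by hand. As an alternative sketch, the same bucket/key split can be done with the standard library's `urllib.parse`; the helper name `split_s3_uri` and the bucket name `my-bucket` are placeholders chosen for illustration.

```python
from urllib.parse import urlparse


def split_s3_uri(s3_uri: str) -> tuple:
    """Split 's3://bucket/some/key' into ('bucket', 'some/key')."""
    parsed = urlparse(s3_uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError("not an S3 URI: {}".format(s3_uri))
    return parsed.netloc, parsed.path.lstrip("/")


# "my-bucket" is a placeholder bucket name.
bucket, key = split_s3_uri("s3://my-bucket/gluonts/train/train.csv")
print(bucket)  # my-bucket
print(key)     # gluonts/train/train.csv
```

Raising a `ValueError` instead of using `assert` keeps the check active even when Python runs with assertions disabled.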



Let's have a look at what we just wrote to S3. With `s3fs` we can inspect the files in the bucket.

In [5]:
s3filesystem = s3fs.S3FileSystem()
with s3filesystem.open(s3_train_data_path + "/train.csv", 'rb') as fp:
    print(fp.readline().decode("utf-8")[:100] + "...")

timestamp,value
...


### Train DeepAR model with Amazon SageMaker

Since SageMaker will automatically spin up instances for us, we need to provide a role. 

In [6]:
import sagemaker
from sagemaker.mxnet import MXNet

sagemaker_session = sagemaker.Session()
#role = sagemaker.get_execution_role()
role = 'arn:aws:iam::976939723775:role/service-role/AmazonSageMaker-ExecutionRole-20210317T133000'

Now we define the MXNet estimator. An [estimator](https://sagemaker.readthedocs.io/en/stable/estimators.html) is a higher-level interface for defining a SageMaker training job. It takes several parameters, such as the [training](entry_point/train.py) script that defines our DeepAR model. We specify the instance type on which the training should run; here we choose `ml.m5.xlarge`, which is a CPU instance. We need to provide the role so that SageMaker can spin up the instance for us. We also specify the framework and Python versions for MXNet. Finally we provide a dictionary of hyperparameters that will be parsed in the [training](entry_point/train.py) script to configure our model. During hyperparameter tuning, SageMaker will adjust the hyperparameters passed into our training job.

In [21]:
# In sagemaker>=2 the arguments are named `instance_type` and `instance_count`
# (formerly `train_instance_type` and `train_instance_count`).
mxnet_estimator = MXNet(entry_point='twitter_volume_sagemaker_train.py',
                        source_dir='./',
                        role=role,
                        instance_type='ml.m5.xlarge',
                        instance_count=1,
                        framework_version='1.8.0', 
                        py_version='py37',
                        hyperparameters={
                             'epochs': 1, 
                             'prediction_length':12,
                             'num_layers':2, 
                             'dropout_rate': 0.2,
                         })


We are ready to start the training job. Once we call `fit`, SageMaker will spin up an `ml.m5.xlarge` instance, download the MXNet docker image, download the train and test data from Amazon S3 and execute the `train` function from our `train.py` file. 

While the model is training you may want to have a look at the [train.py](entry_point/train.py) file. The file follows a certain structure and has the following functions:
- `train`: defines the training procedure as we defined it in [lab 3](../notebooks/twitter_volume_forecast.ipynb). In our case it creates the ListDataset and the DeepAR estimator, and performs the training. It also performs the evaluation and prints the MSE metric, which is necessary for the hyperparameter tuning later on.
- `model_fn`: used for inference. Once the model is trained we can deploy it, and this function will load the trained model.
- `transform_fn`: used for inference. If we send requests to the endpoint, the data will by default be encoded as a JSON string. We decode the data from JSON into a Pandas data frame. We then create the ListDataset and perform inference. The forecasts are sent back as a JSON string.
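To make the link between the estimator's `hyperparameters` dictionary and the training script more concrete, here is a minimal sketch of how a script-mode entry point can read them. SageMaker passes hyperparameters as command-line arguments and exposes the data channels and model directory through environment variables such as `SM_CHANNEL_TRAIN`, `SM_CHANNEL_TEST` and `SM_MODEL_DIR`. The function below is an assumption about how such parsing could look, not a copy of the actual `train.py`; the fallback paths are placeholders for local runs.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse hyperparameters and data locations for a script-mode training job."""
    parser = argparse.ArgumentParser()
    # Hyperparameters arrive as command-line arguments, e.g. "--epochs 1".
    parser.add_argument("--epochs", type=int, default=1)
    parser.add_argument("--prediction_length", type=int, default=12)
    parser.add_argument("--num_layers", type=int, default=2)
    parser.add_argument("--dropout_rate", type=float, default=0.2)
    # Channels and model directory come from environment variables on SageMaker;
    # the fallback paths are placeholders for running the script locally.
    parser.add_argument("--train", default=os.environ.get("SM_CHANNEL_TRAIN", "data/train"))
    parser.add_argument("--test", default=os.environ.get("SM_CHANNEL_TEST", "data/test"))
    parser.add_argument("--model_dir", default=os.environ.get("SM_MODEL_DIR", "model"))
    return parser.parse_args(argv)


args = parse_args(["--epochs", "5", "--dropout_rate", "0.3"])
print(args.epochs, args.prediction_length, args.dropout_rate)  # 5 12 0.3
```

Because every hyperparameter is an ordinary command-line argument, the tuner can vary them between training jobs without any change to the script.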

In [22]:
mxnet_estimator.fit({"train": s3_train_data_path, "test": s3_test_data_path}, wait=True)


In [23]:
mxnet_estimator_endpoint = mxnet_estimator.deploy(instance_type="ml.m5.xlarge", initial_instance_count=1)

----!

### SageMaker Automatic Model Tuner

Now that we are able to run our DeepAR model with SageMaker, we can start tuning its hyperparameters. In the following section we define the `HyperparameterTuner`, which searches over the following hyperparameters:
- `epochs`: number of training epochs. If this value is too large we may overfit the training data, which means the model achieves good performance on the training dataset but poor performance on the test dataset.
- `prediction_length`: how many time steps the model should predict.
- `num_layers`: number of RNN layers.
- `dropout_rate`: dropout helps to regularize the training by randomly switching off neurons.

You can find more information about DeepAR parameters [here](https://gluon-ts.mxnet.io/api/gluonts/gluonts.model.deepar.html) 

Next we have to indicate the metric we want to optimize. We have to make sure that our training job prints this metric. [train.py](entry_point/train.py) prints the MSE value of the evaluated test dataset. These printouts appear in CloudWatch, and the automatic model tuner retrieves them by using the regular expression given in `Regex`. 
Next we set `max_jobs` and `max_parallel_jobs`. Here we run 10 jobs in total, with up to 5 jobs running in parallel.

In [24]:
from sagemaker.tuner import HyperparameterTuner, ContinuousParameter, IntegerParameter 

tuner = HyperparameterTuner(estimator=mxnet_estimator,  
                               objective_metric_name='loss',
                               hyperparameter_ranges={
                                    'epochs': IntegerParameter(5,20),
                                    'prediction_length':IntegerParameter(5,20),
                                    'num_layers': IntegerParameter(1, 5),
                                    'dropout_rate': ContinuousParameter(0, 0.5) },
                               metric_definitions=[{'Name': 'loss', 'Regex': "MSE: ([0-9\\.]+)"}],
                               max_jobs=10,
                               max_parallel_jobs=5,
                               objective_type='Minimize')
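The regular expression in `metric_definitions` can be tried out locally before starting a tuning job. The sketch below applies the same pattern to a log line; the log line itself is hypothetical, since the exact wording printed by the training script may differ.

```python
import re

# Same pattern as in `metric_definitions`; the capture group is the metric value.
pattern = re.compile(r"MSE: ([0-9\.]+)")

# Hypothetical log line; the real training script may print something different.
log_line = "Evaluation finished. MSE: 99.30044555664062"

match = pattern.search(log_line)
mse = float(match.group(1)) if match else None
print(mse)  # 99.30044555664062
```

Note that this pattern only captures digits and dots, so a value printed in scientific notation (for example `1.2e-05`) or with a minus sign would be captured incompletely or not at all.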

`tuner.fit` will start the automatic model tuner. You can now go to the SageMaker console and check the training jobs, or proceed to the next cells to get some real-time results from the jobs. 

The search space grows exponentially with the number of hyperparameters. Assuming 5 parameters where each one has 10 discrete options, we end up with $10^5$ possible combinations. Clearly we do not want to run $10^5$ jobs. By default, the automatic model tuner uses Bayesian optimization, which combines exploration and exploitation. That means after each training job it decides whether to jump into a new area of the search space (explore) or to further exploit the local search space. You can find more information [here](https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning-how-it-works.html).
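The arithmetic behind the search-space size can be checked in a few lines. This sketch counts the combinations for the example in the text and for the three integer ranges defined in the tuner above; the continuous `dropout_rate` is left out because it has no finite number of options.

```python
# Example from the text: 5 hyperparameters with 10 discrete options each.
print(10 ** 5)  # 100000

# Integer ranges from the tuner definition (both bounds inclusive).
integer_ranges = {
    "epochs": (5, 20),
    "prediction_length": (5, 20),
    "num_layers": (1, 5),
}
options = {name: hi - lo + 1 for name, (lo, hi) in integer_ranges.items()}

grid_size = 1
for count in options.values():
    grid_size *= count

print(options)    # {'epochs': 16, 'prediction_length': 16, 'num_layers': 5}
print(grid_size)  # 1280
```

With `max_jobs=10`, the tuner evaluates only a handful of points out of these 1280 integer combinations, which is why a guided search strategy matters.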

In [25]:
tuner.fit({'train': s3_train_data_path, "test": s3_test_data_path}, wait=True)


We can track the status of the hyperparameter tuning job by running the following code. The job name is read from the tuner; alternatively, you can look it up in the SageMaker console.

In [26]:
tuning_job_name = tuner.latest_tuning_job.job_name

Now we retrieve information about the training jobs from SageMaker and we can see how many have already completed.

In [27]:
sage_client = boto3.Session().client('sagemaker')

# run this cell to check current status of hyperparameter tuning job
tuning_job_result = sage_client.describe_hyper_parameter_tuning_job(HyperParameterTuningJobName=tuning_job_name)

status = tuning_job_result['HyperParameterTuningJobStatus']
if status != 'Completed':
    print('Reminder: the tuning job has not been completed.')
    
job_count = tuning_job_result['TrainingJobStatusCounters']['Completed']
print("%d training jobs have completed" % job_count)
    
is_minimize = (tuning_job_result['HyperParameterTuningJobConfig']['HyperParameterTuningJobObjective']['Type'] != 'Maximize')
objective_name = tuning_job_result['HyperParameterTuningJobConfig']['HyperParameterTuningJobObjective']['MetricName']

10 training jobs have completed


In the following cell, we retrieve information about training jobs that have already finished. We will plot their hyperparameters versus objective metric.

In [28]:
import pandas as pd

job_analytics = sagemaker.HyperparameterTuningJobAnalytics(tuning_job_name)

full_df = job_analytics.dataframe()

if len(full_df) > 0:
    df = full_df[full_df['FinalObjectiveValue'] > -float('inf')]
    if len(df) > 0:
        df = df.sort_values('FinalObjectiveValue', ascending=is_minimize)
        print("Number of training jobs with valid objective: %d" % len(df))
        print({"lowest":min(df['FinalObjectiveValue']),"highest": max(df['FinalObjectiveValue'])})
        pd.set_option('display.max_colwidth', None)  # Don't truncate TrainingJobName
    else:
        print("No training jobs have reported valid results yet.")
        
df

Number of training jobs with valid objective: 10
{'lowest': 99.30044555664062, 'highest': 260.81640625}


|   | dropout_rate | epochs | num_layers | prediction_length | TrainingJobName | TrainingJobStatus | FinalObjectiveValue | TrainingStartTime | TrainingEndTime | TrainingElapsedTimeSeconds |
|---|--------------|--------|------------|-------------------|-----------------|-------------------|---------------------|-------------------|-----------------|----------------------------|
| 1 | 0.327333 | 5.0 | 1.0 | 8.0 | mxnet-training-220513-2347-009-6252d57f | Completed | 99.300446 | 2022-05-13 23:53:27+00:00 | 2022-05-13 23:55:04+00:00 | 97.0 |
| 4 | 0.44068 | 5.0 | 2.0 | 8.0 | mxnet-training-220513-2347-006-9807ab22 | Completed | 115.611267 | 2022-05-13 23:52:12+00:00 | 2022-05-13 23:53:54+00:00 | 102.0 |
| 9 | 0.428982 | 7.0 | 2.0 | 8.0 | mxnet-training-220513-2347-001-b6f55a39 | Completed | 119.13652 | 2022-05-13 23:48:36+00:00 | 2022-05-13 23:50:24+00:00 | 108.0 |
| 7 | 0.316092 | 12.0 | 2.0 | 8.0 | mxnet-training-220513-2347-003-084935ef | Completed | 127.570602 | 2022-05-13 23:48:30+00:00 | 2022-05-13 23:50:37+00:00 | 127.0 |
| 0 | 0.234458 | 11.0 | 2.0 | 6.0 | mxnet-training-220513-2347-010-abfda3cf | Completed | 160.339462 | 2022-05-13 23:53:51+00:00 | 2022-05-13 23:55:48+00:00 | 117.0 |
| 3 | 0.444465 | 6.0 | 4.0 | 15.0 | mxnet-training-220513-2347-007-f5982575 | Completed | 193.192047 | 2022-05-13 23:52:08+00:00 | 2022-05-13 23:54:21+00:00 | 133.0 |
| 6 | 0.346133 | 14.0 | 5.0 | 16.0 | mxnet-training-220513-2347-004-ee49274a | Completed | 198.110168 | 2022-05-13 23:48:38+00:00 | 2022-05-13 23:52:21+00:00 | 223.0 |
| 5 | 0.269285 | 17.0 | 2.0 | 20.0 | mxnet-training-220513-2347-005-dbe994a2 | Completed | 205.476288 | 2022-05-13 23:48:50+00:00 | 2022-05-13 23:51:52+00:00 | 182.0 |
| 2 | 0.093406 | 6.0 | 2.0 | 16.0 | mxnet-training-220513-2347-008-ea238d0c | Completed | 207.456787 | 2022-05-13 23:53:15+00:00 | 2022-05-13 23:55:08+00:00 | 113.0 |
| 8 | 0.423817 | 19.0 | 3.0 | 12.0 | mxnet-training-220513-2347-002-e7c1f4c3 | Completed | 260.816406 | 2022-05-13 23:48:32+00:00 | 2022-05-13 23:51:44+00:00 | 192.0 |
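Because `prediction_length` is itself one of the tuned hyperparameters, the jobs above are evaluated over different forecast horizons, so their MSE values may not be a strictly like-for-like comparison. One quick way to look at this, sketched below with the values copied from the results above, is to group the objective by `prediction_length`. The grouping is only an illustration and not part of the SageMaker tuner.

```python
from collections import defaultdict

# (prediction_length, FinalObjectiveValue) pairs copied from the results above.
jobs = [
    (8, 99.300446), (8, 115.611267), (8, 119.13652), (8, 127.570602),
    (6, 160.339462), (15, 193.192047), (16, 198.110168), (20, 205.476288),
    (16, 207.456787), (12, 260.816406),
]

# Keep the lowest MSE seen for each prediction length.
best_by_length = defaultdict(lambda: float("inf"))
for prediction_length, mse in jobs:
    best_by_length[prediction_length] = min(best_by_length[prediction_length], mse)

for prediction_length in sorted(best_by_length):
    print(prediction_length, best_by_length[prediction_length])
```

If the forecast horizon is fixed by the use case, it may be preferable to keep `prediction_length` constant and tune only the remaining hyperparameters.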


Once the hyperparameter tuning job has finished we will plot all results. 

In [29]:
import bokeh
import bokeh.io
bokeh.io.output_notebook()
from bokeh.plotting import figure, show
from bokeh.models import HoverTool

ranges = job_analytics.tuning_ranges
figures = []

class HoverHelper():

    def __init__(self, tuning_analytics):
        self.tuner = tuning_analytics

    def hovertool(self):
        tooltips = [
            ("FinalObjectiveValue", "@FinalObjectiveValue"),
            ("TrainingJobName", "@TrainingJobName"),
        ]
        for k in self.tuner.tuning_ranges.keys():
            tooltips.append( (k, "@{%s}" % k) )

        ht = HoverTool(tooltips=tooltips)
        return ht

    def tools(self, standard_tools='pan,crosshair,wheel_zoom,zoom_in,zoom_out,undo,reset'):
        return [self.hovertool(), standard_tools]

hover = HoverHelper(job_analytics)

for hp_name, hp_range in ranges.items():
    categorical_args = {}
    if hp_range.get('Values'):
        # This is marked as categorical.  Check if all options are actually numbers.
        def is_num(x):
            try:
                float(x)
                return 1
            except (TypeError, ValueError):
                return 0           
        vals = hp_range['Values']
        if sum([is_num(x) for x in vals]) == len(vals):
            # Bokeh has issues plotting a "categorical" range that's actually numeric, so plot as numeric
            print("Hyperparameter %s is tuned as categorical, but all values are numeric" % hp_name)
        else:
            # Set up extra options for plotting categoricals.  A bit tricky when they're actually numbers.
            categorical_args['x_range'] = vals

    # Now plot it
    p = figure(plot_width=500, plot_height=500, 
               title="Objective vs %s" % hp_name,
               tools=hover.tools(),
               x_axis_label=hp_name, y_axis_label=objective_name,
               **categorical_args)
    p.circle(source=df, x=hp_name, y='FinalObjectiveValue')
    figures.append(p)
show(bokeh.layouts.Column(*figures))

Running hyperparameter tuning jobs may take a while, so in the meantime feel free to check out [this notebook](deepar_datails.ipynb), which gives more in-depth details about DeepAR.

Now that we have found a model with good hyperparameters we can deploy it. Note: This endpoint will take approximately 5-8 minutes to launch. 

In [30]:
tuned_endpoint = tuner.deploy(instance_type="ml.m5.xlarge", initial_instance_count=1)


2022-05-13 23:55:04 Starting - Preparing the instances for training
2022-05-13 23:55:04 Downloading - Downloading input data
2022-05-13 23:55:04 Training - Training image download completed. Training in progress.
2022-05-13 23:55:04 Uploading - Uploading generated training model
2022-05-13 23:55:04 Completed - Training job completed
----!

Now we can send some test data to the endpoints, but first we convert `test.value` and `test.index` to lists and add them to a dictionary. SageMaker will encode them as a JSON string when they are sent to the endpoint. Let's compare the predictions of the original and the tuned model:

In [35]:
input_data = {'value': test.value.tolist(), 'timestamp': test.index.tolist() }
result_original = mxnet_estimator_endpoint.predict(input_data)
result_tuned = tuned_endpoint.predict(input_data)
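The JSON encoding of the request can be reproduced locally with the standard library. The sketch below uses a small made-up payload with the same keys as `input_data`; the values and timestamps are illustrative only, and the decoding step is an assumption about what an inference handler could do with the request body.

```python
import json

# Small made-up payload with the same keys as `input_data`.
payload = {
    "value": [57, 43, 55],
    "timestamp": ["2015-02-26 21:42:53", "2015-02-26 21:47:53", "2015-02-26 21:52:53"],
}

body = json.dumps(payload)   # the string that travels to the endpoint
decoded = json.loads(body)   # what an inference handler could reconstruct

print(type(body).__name__)   # str
print(decoded == payload)    # True
```

Timestamps are sent as strings, so the receiving side has to parse them back into datetime values before building a time series.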

When we call `endpoint.predict()`, SageMaker will execute the `transform_fn` in the [train.py](entry_point/train.py) file. As discussed above, this function decodes the JSON string into a Pandas data frame. Afterwards it creates the `ListDataset` and performs inference. The endpoint then returns the forecasts. Let's have a look at the result:

In [36]:
result_original

{'predictions': [55.783302307128906,
  51.409305572509766,
  63.80619812011719,
  63.91878890991211,
  55.82491683959961,
  44.65625762939453,
  49.701324462890625,
  81.8489761352539,
  71.547607421875,
  45.27735900878906,
  46.00373458862305,
  82.60206604003906]}

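In [37]:
result_tuned

{'predictions': [48.059783935546875,
  55.23781204223633,
  68.90005493164062,
  56.68130111694336,
  52.78828811645508,
  55.34917449951172,
  52.07051467895508,
  49.926612854003906]}

The tuned endpoint returns 8 values instead of 12 because the best training job found by the tuner used a `prediction_length` of 8, so the two forecasts cover different horizons.

In this notebook you have learnt how to build and train a DeepAR model with GluonTS, and how to tune and deploy it with Amazon SageMaker.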

### Delete the endpoint
Remember to delete your Amazon SageMaker endpoint once it is no longer needed.

In [None]:
mxnet_estimator_endpoint.delete_endpoint()
tuned_endpoint.delete_endpoint()

# Challenge
Now it is your turn to find even better hyperparameters for the model. Go to the [documentation](https://gluon-ts.mxnet.io/api/gluonts/gluonts.model.deepar.html) and try out other hyperparameters.