# <span style="color:orange">GluonTS</span> Deep Learning for Forecasting

# 1. What is GluonTS?

GluonTS is a Python toolkit for probabilistic time series modeling, built around Apache MXNet (incubating).

GluonTS is especially suited to datasets that contain multiple time series. It provides utilities for loading and iterating over time series datasets, state-of-the-art models ready to be trained, and building blocks for defining your own models and quickly experimenting with different solutions.

The toolkit targets scientists and engineers who want to tweak algorithms or build and experiment with their own models.

* Web page: https://gluon-ts.mxnet.io/
* GitHub repo: https://github.com/awslabs/gluon-ts

## 1.1 Exercise 0

Execute the following cells:

In [None]:
! pip install --upgrade --quiet gluonts

In [None]:
import gluonts
gluonts.__version__

## 1.2 Package overview

Main modules:
* `gluonts.dataset` -- abstractions and utilities to manipulate datasets
* `gluonts.model` -- pre-implemented models, plus abstractions and utilities to help you define your own
* `gluonts.evaluation` -- model evaluation tools, i.e., to compute accuracy metrics and compare models

Other useful modules:
* `gluonts.distribution` -- several types of parametric probability distributions, to help you define probabilistic models
* `gluonts.kernels` -- kernel functions, to support developing e.g. Gaussian-processes-based models
* `gluonts.trainer` -- provides a default `Trainer` class, exposing several training options

# 2. Quick start guide

In [None]:
# Third-party imports
%matplotlib inline
import mxnet as mx
from mxnet import gluon
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import json
from pprint import pprint

## 2.1 Download a dataset

GluonTS comes with a number of publicly available datasets.

In [None]:
from gluonts.dataset.repository.datasets import get_dataset, dataset_recipes
from gluonts.dataset.util import to_pandas

In [None]:
print(f"Available datasets: {list(dataset_recipes.keys())}")

To download one of the built-in datasets, simply call `get_dataset` with one of the above names. GluonTS can re-use the saved dataset so that it does not need to be downloaded again: simply set `regenerate=False`.

In [None]:
dataset = get_dataset("m4_hourly", regenerate=False)

In general, the datasets provided by GluonTS are objects that consist of three main members:

- `dataset.train` is an iterable collection of data entries used for training. Each entry corresponds to one time series.
- `dataset.test` is an iterable collection of data entries used for inference. The test dataset is an extended version of the train dataset that contains a window at the end of each time series that was not seen during training. This window has length equal to the recommended prediction length.
- `dataset.metadata` contains metadata of the dataset such as the frequency of the time series, a recommended prediction horizon, associated features, etc.
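
The relationship between a train entry and its test entry can be sketched with plain Python lists. This is a toy illustration rather than GluonTS code: the series values, the start date, and `prediction_length` below are made up for the example.

```python
# Toy illustration (not GluonTS code): a test entry is the train entry
# extended by one unseen window of length `prediction_length`.
prediction_length = 3  # made-up value for the example

full_series = [5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0]

test_entry_demo = {"start": "2019-01-01 00:00:00", "target": full_series}
train_entry_demo = {
    "start": test_entry_demo["start"],
    "target": full_series[:-prediction_length],  # last window is held out
}

# the part of the test series that training never saw
unseen_window = test_entry_demo["target"][len(train_entry_demo["target"]):]
print(len(train_entry_demo["target"]), len(test_entry_demo["target"]), unseen_window)
```

The unseen window is exactly what a model is later asked to forecast during evaluation.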

In [None]:
len(dataset.train)

What are these datasets made of?

In [None]:
# get the first time series in the training set
train_entry = next(iter(dataset.train))
pprint(train_entry)

In [None]:
# get the first time series in the test set
test_entry = next(iter(dataset.test))

In [None]:
dataset.metadata

In [None]:
# convert the timeseries to pandas series objects to make plotting easier
test_series = to_pandas(test_entry)
train_series = to_pandas(train_entry)

fig, ax = plt.subplots(2, 1, sharex=True, sharey=True, figsize=(10, 7))

train_series.plot(ax=ax[0])
ax[0].grid(which="both")
ax[0].legend(["train series"], loc="upper left")

test_series.plot(ax=ax[1])
ax[1].axvline(train_series.index[-1], color='r') # end of train dataset
ax[1].grid(which="both")
ax[1].legend(["test series", "end of train series"], loc="upper left")

plt.show()

In [None]:
print(f"Length of forecasting window in test dataset: {len(test_series) - len(train_series)}")
print(f"Recommended prediction horizon: {dataset.metadata.prediction_length}")
print(f"Frequency of the time series: {dataset.metadata.freq}")

## 2.2 Train an existing model 

GluonTS comes with a number of pre-built models. All the user needs to do is configure some hyperparameters. The existing models focus on (but are not limited to) probabilistic forecasting. Probabilistic forecasts are predictions in the form of a probability distribution, rather than simply a single point estimate.

We will begin with GluonTS's pre-built feedforward neural network estimator, a simple but powerful forecasting model. We will use this model to demonstrate the process of training a model, producing forecasts, and evaluating the results.

GluonTS's built-in feedforward neural network (`SimpleFeedForwardEstimator`) accepts an input window of length `context_length` and predicts the distribution of the subsequent `prediction_length` values. In GluonTS parlance, the feedforward neural network model is an example of an `Estimator`. An `Estimator` object represents a forecasting model together with details such as its coefficients, weights, etc.

In general, each estimator (pre-built or custom) is configured by a number of hyperparameters that can be either common (but not binding) among all estimators (e.g., the `prediction_length`) or specific for the particular estimator (e.g., number of layers for a neural network or the stride in a CNN).

Finally, each estimator is configured by a `Trainer`, which defines how the model will be trained, i.e., the number of epochs, the learning rate, etc.

In [None]:
from gluonts.model.simple_feedforward import SimpleFeedForwardEstimator
from gluonts.trainer import Trainer

The code block below creates the feedforward network estimator. There are many more available options than the ones we have used here, which are described in the documentation. The `SimpleFeedForwardEstimator` documentation is available at:   
https://gluon-ts.mxnet.io/api/gluonts/gluonts.model.simple_feedforward.html.

In [None]:
estimator = SimpleFeedForwardEstimator(
    # required
    prediction_length=dataset.metadata.prediction_length,
    freq=dataset.metadata.freq,
    # optional
    num_hidden_dimensions=[10],    
    context_length=100, 
    trainer=Trainer(
        ctx="cpu", 
        epochs=10, 
        learning_rate=1e-3,
        batch_size=32,
        num_batches_per_epoch=100,
    )
)
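
As a rough guide to what these `Trainer` settings imply, the arithmetic below estimates the training budget. It assumes that each epoch processes exactly `num_batches_per_epoch` batches of `batch_size` training windows, which is our reading of the parameter names rather than something stated in this notebook.

```python
# Back-of-the-envelope training budget for the Trainer settings above.
# Assumption: one gradient update per batch, `num_batches_per_epoch` batches per epoch.
epochs = 10
num_batches_per_epoch = 100
batch_size = 32

gradient_updates = epochs * num_batches_per_epoch
training_windows = gradient_updates * batch_size
print(gradient_updates, training_windows)
```

Under that assumption, increasing `epochs` or `num_batches_per_epoch` lengthens training in the same proportion.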

After specifying our estimator with all the necessary hyperparameters, we can train it using our training dataset `dataset.train` by invoking the `train` method of the estimator. The training algorithm returns a fitted model (or a `Predictor` in GluonTS parlance) that can be used to construct forecasts.

In [None]:
predictor = estimator.train(dataset.train)

## 2.3 Evaluate the model

With a predictor in hand, we can now predict the last window of each series in `dataset.test` and evaluate our model's performance.

GluonTS comes with the `make_evaluation_predictions` function that helps in the process of prediction and model evaluation. Roughly, this function performs the following steps:

- Removes the final window of length `prediction_length` from each series in `dataset.test`, i.e., the window that we want to predict
- The predictor uses the remaining data to predict (in the form of sample paths) the "future" window that was just removed
- The function outputs the forecast sample paths and the `dataset.test` series (as Python generator objects)
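
The steps above can be sketched in plain Python. This is a simplified illustration, not the GluonTS implementation: `naive_sample_paths` is a hypothetical stand-in for a trained predictor, and the toy series is made up.

```python
import random

def naive_sample_paths(history, prediction_length, num_samples, rng):
    """Hypothetical stand-in for a trained predictor: repeat the last
    observed value and add Gaussian noise to obtain sample paths."""
    last = history[-1]
    return [
        [last + rng.gauss(0.0, 1.0) for _ in range(prediction_length)]
        for _ in range(num_samples)
    ]

def sketch_evaluation_predictions(test_targets, prediction_length, num_samples=100, seed=0):
    rng = random.Random(seed)
    forecasts, series = [], []
    for target in test_targets:
        # 1. hide the final window from the predictor
        history = target[:-prediction_length]
        # 2. forecast the hidden window as sample paths
        forecasts.append(naive_sample_paths(history, prediction_length, num_samples, rng))
        # 3. keep the full series so forecasts can be compared with the truth
        series.append(target)
    return forecasts, series

forecasts_demo, series_demo = sketch_evaluation_predictions(
    test_targets=[[1.0, 2.0, 3.0, 4.0, 5.0, 6.0]], prediction_length=2, num_samples=4
)
print(len(forecasts_demo[0]), len(forecasts_demo[0][0]), series_demo[0])
```

Each forecast here is a list of `num_samples` paths of length `prediction_length`, mirroring the `(num_samples, prediction_length)` layout used later in this section.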

In [None]:
from gluonts.evaluation.backtest import make_evaluation_predictions

In [None]:
forecast_it, ts_it = make_evaluation_predictions(
    dataset=dataset.test,  # test dataset
    predictor=predictor,  # predictor
    num_eval_samples=100,  # number of sample paths we want for evaluation
)

In [None]:
# convert the generators to lists
forecasts = list(forecast_it)
tss = list(ts_it)

We can examine the first element of these lists (that corresponds to the first time series of the dataset). Let's start with the list containing the time series, i.e., `tss`. We expect the first entry of `tss` to contain the (target of the) first time series of `dataset.test`.

In [None]:
type(tss[0])

In [None]:
tss[0].plot()

The entries in the `forecasts` list are a bit more complex. They are objects that contain all the sample paths in the form of a `numpy.ndarray` with dimension `(num_samples, prediction_length)`, the start date of the forecast, the frequency of the time series, etc. We can access all of this information by simply invoking the corresponding attribute of the forecast object.

In [None]:
print(f"Number of sample paths: {forecasts[0].num_samples}")
print(f"Dimension of samples: {forecasts[0].samples.shape}")
print(f"Start date of the forecast window: {forecasts[0].start_date}")
print(f"Frequency of the time series: {forecasts[0].freq}")

`Forecast` objects have a `plot` method that can summarize the forecast paths as the mean, prediction intervals, etc. The prediction intervals are shaded in different colors as a "fan chart".

In [None]:
forecasts[0].plot(color='g')

This way you can of course plot predictions alongside actual data:

In [None]:
tss[0][-150:].plot(figsize=(10,5))  # plot the time series
forecasts[0].plot(color='g')  # plot the predicted distribution
plt.grid(axis='both', which='both')

We can also do calculations to summarize the sample paths, such as computing the mean or a quantile for each of the 48 time steps in the forecast window.

In [None]:
print(f"Mean of the forecast:\n {forecasts[0].mean}")
print(f"0.1-quantile of the forecast:\n {forecasts[0].quantile(0.1)}")
print(f"0.9-quantile of the forecast:\n {forecasts[0].quantile(0.9)}")
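
For intuition, a similar summary can be computed directly from an array of sample paths with NumPy. The samples below are randomly generated stand-ins, and we assume the forecast is sample-based; the forecast object's own `quantile` method may select or interpolate quantiles slightly differently from `np.quantile`.

```python
import numpy as np

rng = np.random.default_rng(0)
# made-up sample paths with shape (num_samples, prediction_length)
samples = rng.normal(loc=10.0, scale=2.0, size=(100, 48))

# summarize across the sample axis: one value per time step
mean_path = samples.mean(axis=0)
q10 = np.quantile(samples, 0.1, axis=0)
q90 = np.quantile(samples, 0.9, axis=0)

print(mean_path.shape, q10.shape, q90.shape)
```

The band between `q10` and `q90` is an 80% prediction interval for each time step, which is the kind of interval the fan chart shades.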

We can also evaluate the quality of our forecasts numerically. In GluonTS, the `Evaluator` class can compute aggregate performance metrics, as well as metrics per time series (which can be useful for analyzing performance across heterogeneous time series).

In [None]:
from gluonts.evaluation import Evaluator

In [None]:
evaluator = Evaluator(quantiles=[0.1, 0.5, 0.9])
agg_metrics, item_metrics = evaluator(iter(tss), iter(forecasts), num_series=len(dataset.test))

Aggregate metrics aggregate both across time-steps and across time series.

In [None]:
pprint(agg_metrics)

Individual metrics are aggregated only across time-steps.

In [None]:
item_metrics.head()
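
To make one probabilistic metric concrete, the sketch below computes the quantile (pinball) loss of a single series, summed over time steps. The function and the numbers are our own illustration; the exact scaling and naming of the quantile losses reported by `Evaluator` may differ.

```python
def quantile_loss(y_true, y_pred, tau):
    """Pinball loss for one quantile level `tau`, summed over time steps."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # under-prediction is weighted by tau, over-prediction by (1 - tau)
        total += max(tau * diff, (tau - 1.0) * diff)
    return total

actual = [10.0, 12.0, 11.0]
predicted_q90 = [12.0, 12.0, 10.0]  # made-up 0.9-quantile forecast
print(quantile_loss(actual, predicted_q90, tau=0.9))
```

For a high quantile such as 0.9, predicting too low is penalized more heavily than predicting too high, which pushes the forecast upward.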

In [None]:
item_metrics.plot(x='MSIS', y='MASE', kind='scatter')
plt.grid(which="both")
plt.show()

**Summary**: 

## 2.4 Exercise 1

Use a different model:
- Replace the `SimpleFeedForwardEstimator` with a `DeepAREstimator` 
- Configure (at least) the required hyperparameters (use the same `Trainer` as in `SimpleFeedForwardEstimator`)
- Set `solved = True` and check the result

`DeepAREstimator` documentation:  
https://gluon-ts.mxnet.io/api/gluonts/gluonts.model.deepar.html

In [None]:
# place your code here
from gluonts.model.deepar import DeepAREstimator

estimator = None

In [None]:
def evaluate_solution(predictor, test_ds, plot_idx=0):
    forecast_it, ts_it = make_evaluation_predictions(
        dataset=test_ds,  # test dataset
        predictor=predictor,  # predictor
        num_eval_samples=100,  # number of sample paths we want for evaluation
    )

    forecasts = list(forecast_it)
    tss = list(ts_it)
    
    tss[plot_idx][-150:].plot(figsize=(10,5))  # plot the time series
    forecasts[plot_idx].plot(color='g')  # plot the predicted distribution
    plt.grid(axis='both', which='both')

In [None]:
solved = False

if solved:
    predictor = estimator.train(dataset.train)
    evaluate_solution(predictor, dataset.test)

## 2.5 Exercise 2

Try to underfit `DeepAREstimator`:
- Add a large `dropout_rate` 
- Decrease significantly the network size (`num_layers` and `num_cells`)
- Decrease `context_length`
- Set `solved = True` and check the result

Try to plot more time series: use the `plot_idx` argument of `evaluate_solution`

In [None]:
# place your code here
estimator = None

In [None]:
solved = False

if solved:
    predictor = estimator.train(dataset.train)
    evaluate_solution(predictor, dataset.test)

# 3. Custom datasets and models

## 3.1 Datasets 

In general, a dataset should satisfy some minimum format requirements to be compatible with GluonTS. In particular:
- it should be an iterable collection of data entries (dictionaries) 
- each entry corresponds to one time series
- each entry should have at least a `target` field, which contains the actual values of the time series, and a `start` field, which denotes the starting date of the time series
- there are optional fields that define possible features
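
These minimum requirements can be expressed as a small check. `check_entries` is a hypothetical helper written for this illustration, not part of GluonTS, and the toy entries are made up.

```python
def check_entries(entries):
    """Check the minimum format described above: every entry is a dict
    with a `target` and a `start` field. Hypothetical helper, not part of GluonTS."""
    for i, entry in enumerate(entries):
        missing = {"target", "start"} - set(entry)
        if missing:
            raise ValueError(f"entry {i} is missing fields: {sorted(missing)}")
    return True

toy_dataset = [
    {"start": "2019-01-01 00:00:00", "target": [1.0, 2.0, 3.0]},
    # optional fields, such as static categorical features, are allowed
    {"start": "2019-01-01 00:00:00", "target": [4.0, 5.0], "feat_static_cat": [1]},
]
print(check_entries(toy_dataset))
```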

Let's revisit the `m4_hourly` dataset that we have already downloaded. We can examine the first entry of the dataset and see what its underlying structure is.

In [None]:
train_entry = next(iter(dataset.train))

In [None]:
type(train_entry)

In [None]:
train_entry.keys()

In [None]:
# start field: starting date of the time series
train_entry['start']

In [None]:
# target field: contains the time series values
train_entry['target'][:10]

In [None]:
# feat_static_cat: contains time independent (static) categorical features
train_entry['feat_static_cat']

The datasets provided by GluonTS are already in the appropriate format and they can be used without any further processing steps. However, custom datasets need to be converted. Fortunately, this is an easy task, since the only requirements are that the dataset is iterable and that each entry has a `target` and a `start` field.

For example, suppose your dataset is in the form of a `numpy.array` where the initial time stamp is given as a string:

In [None]:
num_series = 100  
period = 24
num_steps = 10 * period  
prediction_length = 24
freq = "1H"

pattern = np.sin(np.tile(np.linspace(-np.pi, np.pi, period), int(num_steps / period)))
noise = np.random.normal(loc=1, scale=0.3, size=(num_series, num_steps))

target = pattern + noise

start = "01-01-2019 00:00:00"

In [None]:
plt.plot(target[0])
plt.grid(which="both")
plt.show()

Now we can split the dataset and make it available to GluonTS with just two `ListDataset` calls:

In [None]:
from gluonts.dataset.common import ListDataset

# train dataset: cut the last window of length "prediction_length", add "target" and "start" fields
train_ds = ListDataset(
    data_iter=[{'target': x, 'start': start} for x in target[:, :-prediction_length]],
    freq=freq
)

# test dataset: use the whole dataset, add "target" and "start" fields
test_ds = ListDataset(
    data_iter=[{'target': x, 'start': start} for x in target],
    freq=freq
)

That's it! Now we can simply use `train_ds` and `test_ds` instead of `dataset.train` and `dataset.test`.

## 3.2 Exercise 3

Consider the following publicly available time series
* https://raw.githubusercontent.com/numenta/NAB/master/data/realTraffic/occupancy_6005.csv
* https://raw.githubusercontent.com/numenta/NAB/master/data/realTraffic/occupancy_t4013.csv

Try to:
1. Read in the data (hint: you can use `pandas.read_csv`)
2. Create a `ListDataset` composed of both time series (what is the frequency of the data?)
3. Slice out a training portion (you can choose where in time to do that)
4. Train a model with `prediction_length=12`
5. Plot the forecasts & evaluation metrics

## 3.3 Probabilistic forecasting with a feedforward neural network 

For creating our own forecast model we need the following basic components:
- a training network
- a prediction network
- an estimator that specifies any data processing and uses the networks

### Training network

The training network can be arbitrarily complex but it should follow some basic rules:
- It should have a `hybrid_forward` method that defines what should happen when the network is called    
- Its `hybrid_forward` should return a **loss** based on the prediction and the true values. The loss for probabilistic forecasting is usually the negative log-probability of the chosen distribution

#### How can our model learn a distribution?

In order to learn a distribution we need to learn its parameters. For example, in the simple case where we assume a Gaussian distribution, we need to learn the mean and the variance that fully specify the distribution.
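
For the Gaussian case, the quantity that training minimizes can be written down directly. The sketch below evaluates the negative log-density of one observation for given parameters; the function name and the numbers are ours, for illustration only.

```python
import math

def gaussian_nll(y, mu, sigma):
    """Negative log-density of y under a Gaussian N(mu, sigma^2)."""
    return 0.5 * math.log(2.0 * math.pi * sigma ** 2) + (y - mu) ** 2 / (2.0 * sigma ** 2)

# the loss is lower when the mean is close to the observation
print(gaussian_nll(y=1.0, mu=1.0, sigma=1.0))
print(gaussian_nll(y=1.0, mu=3.0, sigma=1.0))
```

Minimizing this loss over many observations drives the predicted mean toward the data and the predicted standard deviation toward the spread of the errors.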

Each distribution that is available in GluonTS is defined by the corresponding `Distribution` class (e.g., `Gaussian`). This class defines, among other things, the parameters of the distribution, its (log-)likelihood, and a sampling method (given the parameters).

However, it is not straightforward how to connect a model with such a distribution and learn its parameters. For this, each distribution comes with a `DistributionOutput` class (e.g., `GaussianOutput`) that makes this connection possible. 

The main use of `DistributionOutput` is to take the output tensor of the model and use its last dimension as features that are mapped to the parameters of the distribution. For example, if the output of the network is a tensor of dimension `(a,b)`, each of the `a` rows holds `b` features, which are projected to the distribution parameters to create `a` different `Distribution` objects.
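
A NumPy sketch of this idea for a Gaussian is shown below. It assumes one linear map per parameter and a softplus to keep the standard deviation positive; this is a typical design chosen for illustration, not a description of the internals of `GaussianOutput`, and the weights are random stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = 4, 3                          # a distributions, b features each
features = rng.normal(size=(a, b))   # stand-in for the network output

# hypothetical projection weights: one linear map per Gaussian parameter
w_mu = rng.normal(size=b)
w_sigma = rng.normal(size=b)

mu = features @ w_mu                          # shape (a,)
sigma = np.log1p(np.exp(features @ w_sigma))  # softplus keeps sigma positive

print(mu.shape, sigma.shape, bool(np.all(sigma > 0)))
```

Each pair `(mu[i], sigma[i])` then parameterizes one of the `a` Gaussian distributions.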

#### Design choices

- Create a simple training network that defines a neural network which takes as input a window of length `context_length` and outputs the subsequent window of dimension `prediction_length`
- We need to output a `Distribution` for each time step, i.e., `prediction_length` distribution objects. Therefore the network should output `prediction_length * num_features` parameters (where `num_features` can be a hyperparameter)
- The `DistributionOutput` should take as input a tensor of shape `(prediction_length, num_features)` to create `prediction_length` distributions. Therefore, we need to reshape the network output to `(prediction_length, num_features)` 
- We can choose to fit Gaussian distributions, i.e., use the `GaussianOutput`
- The `hybrid_forward` method of the training network returns the negative log-probability as loss

Note that in all the tensors that we handle, there is an initial dimension that refers to the batch, e.g., the actual output dimension of our network will be `(batch_size, prediction_length * num_features)`. 
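
The reshape step can be checked on a small array. The sketch below is a NumPy analogue of the MXNet call `reshape(0, self.prediction_length, -1)` used in the networks of this section, where `0` keeps the batch axis unchanged; the sizes are made up.

```python
import numpy as np

batch_size, prediction_length, num_features = 2, 3, 4

# stand-in for the flat network output: (batch_size, prediction_length * num_features)
flat = np.arange(batch_size * prediction_length * num_features).reshape(
    batch_size, prediction_length * num_features
)

# keep the batch axis and split the rest into (prediction_length, num_features)
cube = flat.reshape(batch_size, prediction_length, -1)

print(flat.shape, cube.shape)
print(cube[0, 1])  # features of the second time step of the first series
```

Consecutive blocks of `num_features` values in the flat output become the feature vectors of consecutive time steps.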

In [None]:
from gluonts.distribution.distribution_output import DistributionOutput
from gluonts.distribution.gaussian import GaussianOutput

In [None]:
class MyProbNetwork(gluon.HybridBlock):
    def __init__(self, 
                 prediction_length, 
                 distr_output, 
                 num_cells, 
                 num_sample_paths=100, 
                 **kwargs
    ) -> None:
        super().__init__(**kwargs)
        self.prediction_length = prediction_length
        self.distr_output = distr_output
        self.num_cells = num_cells
        self.num_sample_paths = num_sample_paths
        self.proj_distr_args = distr_output.get_args_proj()

        with self.name_scope():
            # Set up a 2-layer neural network whose output will be projected to the distribution parameters
            self.nn = mx.gluon.nn.HybridSequential()
            self.nn.add(mx.gluon.nn.Dense(units=self.num_cells, activation='relu'))
            self.nn.add(mx.gluon.nn.Dense(units=self.prediction_length * self.num_cells, activation='relu'))

In [None]:
class MyProbTrainNetwork(MyProbNetwork):
    def hybrid_forward(self, F, past_target, future_target):
        # compute network output
        net_output = self.nn(past_target)

        # (batch, prediction_length * nn_features)  ->  (batch, prediction_length, nn_features)
        net_output = net_output.reshape(0, self.prediction_length, -1)

        # project network output to distribution parameters domain
        distr_args = self.proj_distr_args(net_output)

        # compute distribution
        distr = self.distr_output.distribution(distr_args)

        # negative log-likelihood
        loss = distr.loss(future_target)
        return loss

### Prediction network

The prediction network should have the same architecture as the training network, so that the trained parameters can be copied over. In addition, it should comply with the following rule:
- The prediction network's `hybrid_forward` should return the predictions 

#### Design choices

- We want the prediction network to output sample paths for each time series. To achieve this we can repeat each time series as many times as the number of sample paths and do a standard forecast for each of them
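
The repeat-then-reshape bookkeeping can be verified with NumPy, whose `np.repeat` repeats rows consecutively along an axis. The values below are made up, and the "samples" are simple copies standing in for draws from the predicted distribution.

```python
import numpy as np

num_sample_paths, prediction_length = 3, 2
past_target = np.array([[1.0, 2.0], [10.0, 20.0]])  # (batch_size=2, past_length=2)

# each row is repeated consecutively: rows 0, 0, 0, 1, 1, 1
repeated = np.repeat(past_target, repeats=num_sample_paths, axis=0)

# stand-in for sampling: one value per repeated row and time step
fake_samples = repeated[:, :prediction_length].copy()

# because repeats are consecutive, this reshape groups sample paths by series
grouped = fake_samples.reshape(-1, num_sample_paths, prediction_length)
print(repeated.shape, grouped.shape)
print(grouped[1])
```

All rows of `grouped[1]` come from the second series, confirming that the final reshape yields `(batch_size, num_sample_paths, prediction_length)` without mixing series.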

In [None]:
class MyProbPredNetwork(MyProbTrainNetwork):
    # The prediction network only receives past_target and returns predictions
    def hybrid_forward(self, F, past_target):
        # repeat past target: from (batch_size, past_target_length)  
        # to (batch_size * num_sample_paths, past_target_length)
        repeated_past_target = past_target.repeat(
            repeats=self.num_sample_paths, axis=0
        )
        
        # compute network output
        net_output = self.nn(repeated_past_target)

        # from (batch * num_sample_paths, prediction_length * nn_features)  
        # to (batch * num_sample_paths, prediction_length, nn_features)
        net_output = net_output.reshape(0, self.prediction_length, -1)
       
        # project network output to distribution parameters domain
        distr_args = self.proj_distr_args(net_output)

        # compute distribution
        distr = self.distr_output.distribution(distr_args)

        # get (batch_size * num_sample_paths, prediction_length) samples
        samples = distr.sample()
        
        # reshape from (batch_size * num_sample_paths, prediction_length) to 
        # (batch_size, num_sample_paths, prediction_length)
        return samples.reshape(shape=(-1, self.num_sample_paths, self.prediction_length))

### Estimator

The estimator should comply with the following structure:
- It should include a `create_transformation` method that defines all the possible feature transformations and how the data is split during training
- It should include a `create_training_network` method that returns the training network configured with any necessary hyperparameters
- It should include a `create_predictor` method that creates the prediction network, and returns a `Predictor` object 

A `Predictor` wraps the prediction network and defines its `predict` method. This method takes the test dataset, passes it through the prediction network, and yields the predictions.
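
The division of labor between an estimator and a predictor can be sketched without any deep learning at all. `ToyEstimator` and `ToyPredictor` below are hypothetical classes written for this illustration; they only mimic the pattern, with the global mean acting as the "model".

```python
class ToyPredictor:
    """Hypothetical wrapper: holds a fitted model and exposes `predict`."""

    def __init__(self, level, prediction_length):
        self.level = level
        self.prediction_length = prediction_length

    def predict(self, dataset):
        # yield one forecast per series, lazily, like a generator
        for _ in dataset:
            yield [self.level] * self.prediction_length


class ToyEstimator:
    """Hypothetical estimator: `train` fits the model and returns a predictor."""

    def __init__(self, prediction_length):
        self.prediction_length = prediction_length

    def train(self, dataset):
        values = [v for entry in dataset for v in entry["target"]]
        level = sum(values) / len(values)  # the "model" is just the global mean
        return ToyPredictor(level, self.prediction_length)


toy_data = [{"target": [1.0, 2.0, 3.0]}, {"target": [5.0, 5.0]}]
toy_predictor = ToyEstimator(prediction_length=2).train(toy_data)
print(list(toy_predictor.predict(toy_data)))
```

The estimator holds the hyperparameters and knows how to fit, while the predictor holds the fitted state and knows how to forecast.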

In [None]:
from gluonts.model.estimator import GluonEstimator
from gluonts.model.predictor import Predictor, RepresentableBlockPredictor
from gluonts.core.component import validated
from gluonts.trainer import Trainer
from gluonts.support.util import copy_parameters
from gluonts.transform import ExpectedNumInstanceSampler, Transformation, InstanceSplitter, FieldName
from mxnet.gluon import HybridBlock

In [None]:
class MyProbEstimator(GluonEstimator):
    @validated()
    def __init__(
            self,
            prediction_length: int,
            context_length: int,
            freq: str,
            distr_output: DistributionOutput,
            num_cells: int,
            num_sample_paths: int = 100,
            trainer: Trainer = Trainer()
    ) -> None:
        super().__init__(trainer=trainer)
        self.prediction_length = prediction_length
        self.context_length = context_length
        self.freq = freq
        self.distr_output = distr_output
        self.num_cells = num_cells
        self.num_sample_paths = num_sample_paths

    def create_transformation(self):
        return InstanceSplitter(
            target_field=FieldName.TARGET,
            is_pad_field=FieldName.IS_PAD,
            start_field=FieldName.START,
            forecast_start_field=FieldName.FORECAST_START,
            train_sampler=ExpectedNumInstanceSampler(num_instances=1),
            past_length=self.context_length,
            future_length=self.prediction_length,
        )

    def create_training_network(self) -> MyProbTrainNetwork:
        return MyProbTrainNetwork(
            prediction_length=self.prediction_length,
            distr_output=self.distr_output,
            num_cells=self.num_cells,
            num_sample_paths=self.num_sample_paths
        )

    def create_predictor(
            self, transformation: Transformation, trained_network: HybridBlock
    ) -> Predictor:
        prediction_network = MyProbPredNetwork(
            prediction_length=self.prediction_length,
            distr_output=self.distr_output,
            num_cells=self.num_cells,
            num_sample_paths=self.num_sample_paths
        )

        copy_parameters(trained_network, prediction_network)

        return RepresentableBlockPredictor(
            input_transform=transformation,
            prediction_net=prediction_network,
            batch_size=self.trainer.batch_size,
            freq=self.freq,
            prediction_length=self.prediction_length,
            ctx=self.trainer.ctx,
        )

In [None]:
estimator = MyProbEstimator(
    prediction_length=prediction_length,
    freq=freq,
    context_length=2*prediction_length,
    distr_output=GaussianOutput(),
    num_cells=40,
    num_sample_paths=100,
    trainer=Trainer(
        ctx="cpu", 
        epochs=5, 
        learning_rate=1e-3, 
        hybridize=False,
        batch_size=32,
        num_batches_per_epoch=100,
    ),
)

The estimator can be trained using our training dataset `train_ds` just by invoking its `train` method. Training returns a predictor that can be used to produce forecasts.

In [None]:
predictor = estimator.train(train_ds)

In [None]:
forecast_it, ts_it = make_evaluation_predictions(
    dataset=test_ds,  # test dataset
    predictor=predictor,  # predictor
    num_eval_samples=100,  # number of sample paths we want for evaluation
)

In [None]:
forecast_entry = list(forecast_it)[0]
ts_entry = list(ts_it)[0]

## 3.4 Exercise 4

Convert the probabilistic feedforward model to a point forecast model:
- The output of the network should be `prediction_length` parameters
- We do not need a distribution
- We need to define a loss between the predictions and the true target for the training network
- The prediction network should output directly the predictions
- Set `solved = True` and check the result

In [None]:
class MyPointNetwork(gluon.HybridBlock):
    pass

class MyPointTrainNetwork(MyPointNetwork):    
    pass

class MyPointPredNetwork(MyPointTrainNetwork):
    pass

class MyPointEstimator(GluonEstimator):
    pass

In [None]:
estimator = None

In [None]:
solved = False

if solved:
    predictor = estimator.train(train_ds)
    evaluate_solution(predictor, test_ds)

## <span style="color:red">WARNING:</span> YOU ARE TOO CLOSE TO THE EXERCISE SOLUTIONS

# Appendix

## A.1 Solution to Exercise 1

In [None]:
from gluonts.model.deepar import DeepAREstimator

estimator = DeepAREstimator(
    # required
    prediction_length=dataset.metadata.prediction_length,
    freq=dataset.metadata.freq,
    # optional
    context_length=100,
    num_layers=2,
    num_cells=20,
    trainer=Trainer(
        ctx="cpu", 
        epochs=5, 
        learning_rate=1e-3, 
        batch_size=32,
        num_batches_per_epoch=100
    )
)

## A.2 Solution to Exercise 2

In [None]:
estimator = DeepAREstimator(
    # required
    prediction_length=dataset.metadata.prediction_length,
    freq=dataset.metadata.freq,
    # optional
    context_length=10,
    num_layers=1,
    num_cells=5,
    dropout_rate=0.9,
    trainer=Trainer(
        ctx="cpu", 
        epochs=3, 
        learning_rate=1e-3, 
        batch_size=32,
        num_batches_per_epoch=100
    )
)

## A.3 Solution to Exercise 3

In [None]:
import pandas as pd

df1 = pd.read_csv(
    "https://raw.githubusercontent.com/numenta/NAB/master/data/realTraffic/occupancy_6005.csv",
    index_col=0
)
df2 = pd.read_csv(
    "https://raw.githubusercontent.com/numenta/NAB/master/data/realTraffic/occupancy_t4013.csv",
    index_col=0
)

# we resample the dataframes to fill-in missing time points with NaN
df1.index = pd.to_datetime(df1.index)
df1 = df1.resample("5min").mean()

df2.index = pd.to_datetime(df2.index)
df2 = df2.resample("5min").mean()

dataset = ListDataset(
    data_iter=[
        {"start": df1.index[0], "target": df1.value.tolist()},
        {"start": df2.index[0], "target": df2.value.tolist()}
    ],
    freq="5min"
)

dataset_train = ListDataset(
    data_iter=[
        {"start": df1.index[0], "target": df1.value.tolist()[:-12]},
        {"start": df2.index[0], "target": df2.value.tolist()[:-12]}
    ],
    freq="5min"
)

from gluonts.model.deepar import DeepAREstimator
from gluonts.trainer import Trainer

estimator = DeepAREstimator(
    prediction_length=12, freq="5min", trainer=Trainer(epochs=20)
)

predictor = estimator.train(training_data=dataset_train)

In [None]:
forecast_it, ts_it = make_evaluation_predictions(
    dataset=dataset,  # test dataset
    predictor=predictor,  # predictor
    num_eval_samples=100,  # number of sample paths we want for evaluation
)

forecasts = list(forecast_it)
tss = list(ts_it)

evaluator = Evaluator(quantiles=[0.1, 0.5, 0.9])
agg_metrics, item_metrics = evaluator(iter(tss), iter(forecasts), num_series=len(dataset))

## A.4 Solution to Exercise 4

In [None]:
class MyPointNetwork(gluon.HybridBlock):
    def __init__(self, prediction_length, num_cells, **kwargs):
        super().__init__(**kwargs)
        self.prediction_length = prediction_length
        self.num_cells = num_cells
    
        with self.name_scope():
            # Set up a 3 layer neural network that directly predicts the target values
            self.nn = mx.gluon.nn.HybridSequential()
            self.nn.add(mx.gluon.nn.Dense(units=self.num_cells, activation='relu'))
            self.nn.add(mx.gluon.nn.Dense(units=self.num_cells, activation='relu'))
            self.nn.add(mx.gluon.nn.Dense(units=self.prediction_length, activation='softrelu'))

class MyPointTrainNetwork(MyPointNetwork):    
    def hybrid_forward(self, F, past_target, future_target):
        prediction = self.nn(past_target)
        # calculate L1 loss with the future_target to learn the median
        return (prediction - future_target).abs().mean(axis=-1)


class MyPointPredNetwork(MyPointTrainNetwork):
    # The prediction network only receives past_target and returns predictions
    def hybrid_forward(self, F, past_target):
        prediction = self.nn(past_target)
        return prediction.expand_dims(axis=1)
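
The comment about learning the median with an L1 loss can be checked numerically. The sketch below searches a grid of constant predictions for the one with the smallest mean absolute error; the observations are made up and include one large outlier.

```python
def mean_abs_error(c, observations):
    """Mean absolute error of the constant prediction `c`."""
    return sum(abs(v - c) for v in observations) / len(observations)

observations = [1.0, 2.0, 3.0, 4.0, 100.0]  # one large outlier
candidates = [x / 10.0 for x in range(0, 1001)]  # grid from 0.0 to 100.0

best = min(candidates, key=lambda c: mean_abs_error(c, observations))
mean_value = sum(observations) / len(observations)
print(best, mean_value)
```

The minimizer is the median of the observations rather than their mean, so a model trained with this loss is less sensitive to outliers than one trained with a squared error.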

In [None]:
class MyPointEstimator(GluonEstimator):
    @validated()
    def __init__(
        self,
        prediction_length: int,
        context_length: int,
        freq: str,
        num_cells: int,
        trainer: Trainer = Trainer()
    ) -> None:
        super().__init__(trainer=trainer)
        self.prediction_length = prediction_length
        self.context_length = context_length
        self.freq = freq
        self.num_cells = num_cells
            
    def create_transformation(self):
        return InstanceSplitter(
                    target_field=FieldName.TARGET,
                    is_pad_field=FieldName.IS_PAD,
                    start_field=FieldName.START,
                    forecast_start_field=FieldName.FORECAST_START,
                    train_sampler=ExpectedNumInstanceSampler(num_instances=1),
                    past_length=self.context_length,
                    future_length=self.prediction_length,
                )
    
    def create_training_network(self) -> MyPointTrainNetwork:
        return MyPointTrainNetwork(
            prediction_length=self.prediction_length,
            num_cells = self.num_cells
        )

    def create_predictor(
        self, transformation: Transformation, trained_network: HybridBlock
    ) -> Predictor:
        prediction_network = MyPointPredNetwork(
            prediction_length=self.prediction_length,
            num_cells=self.num_cells
        )

        copy_parameters(trained_network, prediction_network)

        return RepresentableBlockPredictor(
            input_transform=transformation,
            prediction_net=prediction_network,
            batch_size=self.trainer.batch_size,
            freq=self.freq,
            prediction_length=self.prediction_length,
            ctx=self.trainer.ctx,
        )

In [None]:
estimator = MyPointEstimator(
    prediction_length=prediction_length,
    freq=freq,
    context_length=2*prediction_length,
    num_cells=40,
    trainer=Trainer(
        ctx="cpu", 
        epochs=5, 
        learning_rate=1e-3, 
        hybridize=False, 
        num_batches_per_epoch=100
    )
)