Copyright (c) Microsoft Corporation. All rights reserved.

Licensed under the MIT License.

# Automated Machine Learning
**Demand Forecasting Using TCN**

## Contents
1. [Introduction](#Introduction)
1. [Setup](#Setup)
1. [Data](#Data)
1. [Train TCN](#TrainTCN)
1. [Train Baseline](#TrainBaseline)
1. [Test Set Inference](#TestSetInference)
1. [Test Set Evaluation](#TestSetEvaluation)
1. [Generate Forecast](#GenerateForecast)
1. [Schedule Inference Pipelines](#ScheduleInference)

## 1. Introduction

The objective of this notebook is to illustrate how to use the AutoML deep learning model, temporal convolutional network (TCN), for demand forecasting tasks. It walks you through all stages of the model evaluation and production process, starting with data ingestion and concluding with scheduling inference runs. For more information on the TCN model in AutoML, refer to this [publication](https://learn.microsoft.com/en-us/azure/machine-learning/concept-automl-forecasting-deep-learning).

We use a subset of UCI electricity data ([link](https://archive.ics.uci.edu/ml/datasets/ElectricityLoadDiagrams20112014#)) with the objective of predicting electricity demand per consumer 24 hours ahead. The data was preprocessed using the [data prep notebook](https://github.com/Azure/azureml-examples/blob/main/v1/python-sdk/tutorials/automl-with-azureml/forecasting-data-preparation/auto-ml-forecasting-data-preparation.ipynb). Please refer to it for an illustration of how to download the data from the source, aggregate it to an hourly frequency, convert it from wide to long format, and upload it to the Datastore. Here, we will work with the already uploaded data.

A problem description such as "generate accurate forecasts 24 hours ahead" sounds like a relatively straightforward task. However, there are quite a few steps a user needs to take before the model is put in production. A user needs to prepare the data, partition it into appropriate sets, select the best model, evaluate it against a baseline, and monitor the model in real life to collect enough observations on how it would perform had it been put in production. Some of these steps are time consuming, and some require expertise in writing code. The steps shown in this notebook follow the typical thought process one goes through before a model is put in production.

Make sure you have executed the [configuration](https://github.com/Azure/MachineLearningNotebooks/blob/master/configuration.ipynb) before running this notebook.

<!-- Both training and inferencing is done using pipelines.

Relies on the following scripts:

- `helper_scripts.py`: Inference script that is executed on the remote compute
- `register_model.py`: needed for the training stage. Registers model in the workspace.
- `inference_script_naive.py`: Inference script for the Naive model
- `inference_script_tcn.py`: Inference script for the TCN model
-->

## 2. Setup

In [None]:
import json
import logging
import os

from matplotlib import pyplot as plt
import pandas as pd

import azureml.core
from azureml.core.experiment import Experiment
from azureml.core.workspace import Workspace
from azureml.train.automl import AutoMLConfig

This sample notebook may use features that are not available in previous versions of the Azure ML SDK.

In [None]:
print("This notebook was created using version 1.47.0 of the Azure ML SDK")
print("You are currently using version", azureml.core.VERSION, "of the Azure ML SDK")

Accessing the Azure ML workspace requires authentication with Azure.

The default authentication is an interactive authentication using the default tenant. Executing the `ws = Workspace.from_config()` line in the cell below will prompt for authentication the first time it is run.

If you have multiple Azure tenants, you can specify the tenant by replacing the `ws = Workspace.from_config()` line in the cell below with the following:
```
from azureml.core.authentication import InteractiveLoginAuthentication
auth = InteractiveLoginAuthentication(tenant_id = 'mytenantid')
ws = Workspace.from_config(auth = auth)
```
If you need to run in an environment where interactive login is not possible, you can use Service Principal authentication by replacing the `ws = Workspace.from_config()` line in the cell below with the following:
```
from azureml.core.authentication import ServicePrincipalAuthentication
auth = ServicePrincipalAuthentication('mytenantid', 'myappid', 'mypassword')
ws = Workspace.from_config(auth = auth)
```
For more details, see [this link](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/manage-azureml-service/authentication-in-azureml/authentication-in-azureml.ipynb).

In [None]:
import datetime
import uuid

ws = Workspace.from_config()
datastore = ws.get_default_datastore()

# Choose a name for the run history container in the workspace.
experiment_name = "forecasting-pipeline-tcn-" + datetime.datetime.now().strftime(
    "%Y%m%d"
)
experiment = Experiment(ws, experiment_name)

output = {}
output["Subscription ID"] = ws.subscription_id
output["Workspace"] = ws.name
output["Resource Group"] = ws.resource_group
output["Location"] = ws.location
output["Run History Name"] = experiment_name
pd.set_option("display.max_colwidth", None)
outputDf = pd.DataFrame(data=output, index=[""])
outputDf.T

### 2.1. Compute 

#### Create or Attach existing AmlCompute

You will need to create a compute target for your AutoML run. In this tutorial, you will create AmlCompute as your training compute resource.

> Note that if you have an AzureML Data Scientist role, you will not have permission to create compute resources. Talk to your workspace or IT admin to create the compute targets described in this section, if they do not already exist.


To run deep learning models we recommend using GPU compute. Here, we use a 12 node cluster of the `Standard_NC8as_T4_v3` [series](https://learn.microsoft.com/en-us/azure/virtual-machines/nct4-v3-series) for illustration purposes. You will need to adjust the compute type and the number of nodes based on your needs, which can be driven by the speed needed for model selection, the data size, etc.

#### Creation of AmlCompute takes approximately 5 minutes. 
If the AmlCompute with that name is already in your workspace, this code will skip the creation process.
As with other Azure services, there are limits on certain resources (e.g. AmlCompute) associated with the Azure Machine Learning service. Please read [this article](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-manage-quotas) on the default limits and how to request more quota.

In [None]:
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.compute_target import ComputeTargetException

# Choose a name for your GPU cluster
amlcompute_cluster_name = "demand-fcst-gpu-cluster"

# Verify that cluster does not exist already
try:
    compute_target = ComputeTarget(workspace=ws, name=amlcompute_cluster_name)
    print("Found existing cluster, use it.")
except ComputeTargetException:
    compute_config = AmlCompute.provisioning_configuration(
        vm_size="Standard_NC8as_T4_v3", max_nodes=12, vm_priority="lowpriority"
    )
    compute_target = ComputeTarget.create(ws, amlcompute_cluster_name, compute_config)
compute_target.wait_for_completion(show_output=True)

## 3. Data
If you ran the data preparation notebook [link placeholder]() and want to use the registered data, skip section 3.1 and, instead, uncomment and execute the code in section 3.2. If, on the other hand, you did not run the notebook and want to use the data that we pre-processed and saved in the public blob, execute the code in section 3.1.

### 3.1 Loading and registering the data from public blob store

Run the code in this section only if you want to use the data that is already available in the blobstore. If you want to use your own data that is already registered in your workspace, skip this section and proceed to run the commented out code in section 3.2.

The following code registers a datastore `automl_fcst_tcn` in your workspace and links the data from the container `automl-sample-notebook-data`.

In [None]:
from azureml.core import Datastore

# Please change the following to point to your own blob container and pass in account_key
blob_datastore_name = "automl_fcst_tcn"
container_name = "automl-sample-notebook-data"
account_name = "automlsamplenotebookdata"

print(f'Creating datastore "{blob_datastore_name}" in your workspace ...\n---')
demand_tcn_datastore = Datastore.register_azure_blob_container(
    workspace=ws,
    datastore_name=blob_datastore_name,
    container_name=container_name,
    account_name=account_name,
    create_if_not_exists=True,
)

The following code registers datasets from the `automl-sample-notebook-data` container in the datastore we just created. Once the datasets are registered, we will be able to use them in our experiments.

In [None]:
from azureml.core import Dataset

print("Registering datasets in your workspace ...\n---")

FOLDER_PREFIX_NAME = "uci_electro_small_public_tcn"

target_path_train = f"{FOLDER_PREFIX_NAME}_train"
target_path_valid = f"{FOLDER_PREFIX_NAME}_valid"
target_path_test = f"{FOLDER_PREFIX_NAME}_test"
target_path_inference = f"{FOLDER_PREFIX_NAME}_infer"

train_dataset = Dataset.Tabular.from_delimited_files(
    path=demand_tcn_datastore.path(target_path_train + "/"),
    validate=False,
    infer_column_types=True,
).register(workspace=ws, name=target_path_train, create_new_version=True)

valid_dataset = Dataset.Tabular.from_delimited_files(
    path=demand_tcn_datastore.path(target_path_valid + "/"),
    validate=False,
    infer_column_types=True,
).register(workspace=ws, name=target_path_valid, create_new_version=True)

test_dataset = Dataset.Tabular.from_delimited_files(
    path=demand_tcn_datastore.path(target_path_test + "/"),
    validate=False,
    infer_column_types=True,
).register(workspace=ws, name=target_path_test, create_new_version=True)

inference_dataset = Dataset.Tabular.from_delimited_files(
    path=demand_tcn_datastore.path(target_path_inference + "/"),
    validate=False,
    infer_column_types=True,
).register(workspace=ws, name=target_path_inference, create_new_version=True)

### 3.2 Using data that is registered in your workspace

If you ran the [data prep notebook link placeholder]() notebook, the partitioned data is already uploaded and registered in your workspace. Uncomment the following code, change `DATASET_PREFIX_NAME` to match the value in the data preparation notebook, and run the code.

In [None]:
# from azureml.data.dataset_factory import TabularDatasetFactory
# from azureml.core.dataset import Dataset

# DATASET_PREFIX_NAME = "uci_electro_small_tcn"
# print(f'Dataset prefix name: {DATASET_PREFIX_NAME}\n---\nLoading train, validation, test and inference sets ...\n---')

# target_path_train = f"{DATASET_PREFIX_NAME}_train"
# target_path_valid = f"{DATASET_PREFIX_NAME}_valid"
# target_path_test = f"{DATASET_PREFIX_NAME}_test"
# target_path_inference = f"{DATASET_PREFIX_NAME}_inference"

# train_dataset = Dataset.get_by_name(ws, name=target_path_train)
# valid_dataset = Dataset.get_by_name(ws, name=target_path_valid)
# test_dataset = Dataset.get_by_name(ws, name=target_path_test)
# inference_dataset = Dataset.get_by_name(ws, name=target_path_inference)

### 3.3 Test and inference sets

Note that we have *test* and *inference* sets. The difference between the two is the presence of the target column. The test set contains the target column and is used to evaluate model performance using [rolling forecast](https://learn.microsoft.com/en-us/azure/machine-learning/v1/how-to-auto-train-forecast-v1#evaluating-model-accuracy-with-a-rolling-forecast). On the other hand, the target column is not present in the inference set to illustrate how to generate an actual forecast.

In [None]:
print("The first few rows of the test set ...\n---")
print(test_dataset.to_pandas_dataframe().head())

In [None]:
print("The first few rows of the inference set ...\n---")
print(inference_dataset.to_pandas_dataframe().head())

Let's set up what we know about the dataset.

- **Target column** is what we want to forecast. In our case it is electricity consumption per customer measured in kilowatt hours (kWh).
- **Time column** is the time axis along which to predict.
- **Time series identifier columns** are the columns listed in `time_series_id_column_names`; their values identify each unique time series. In our case all unique time series are identified by a single column, `customer_id`. However, it is quite common to have multiple columns identifying unique time series. See the [link](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-auto-train-forecast#configuration-settings) for a more detailed explanation on this topic.

In [None]:
target_column_name = "usage"
time_column_name = "datetime"
GRAIN_COL = "customer_id"
time_series_id_column_names = [GRAIN_COL]

Next, we download training data from the Datastore to make sure it looks as expected.

In [None]:
train_df = train_dataset.to_pandas_dataframe()

nseries = train_df.groupby(time_series_id_column_names).ngroups
print(
    f"Data contains {nseries} individual time-series:\n{list(train_df[GRAIN_COL].unique())}\n---"
)
print("Printing the first few rows of the training data ...\n---")
train_df.head()

## 4. Train TCN

In this section we will train and select the best TCN model as well as the baseline model. The baseline model will be used as a reference point to understand the TCN's accuracy. The goal of forecasting is to have the most accurate predictions as measured by some accuracy metric. What is considered an accurate prediction is fairly subjective. Take, for example, the MAPE (mean absolute percentage error) metric. A perfect forecast results in a MAPE value of zero, which is not achievable with real business data. For this reason it is imperative to have a baseline model to compare TCN results against. Doing this adds objectivity to the model acceptance criteria.
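
To make the role of a baseline concrete, here is a minimal sketch of comparing two forecasts by MAPE. The usage values and both forecasts are made-up numbers for illustration only, and the `mape` helper is our own, not part of the Azure ML SDK.

```python
def mape(actuals, forecasts):
    """Mean absolute percentage error, in percent. Assumes no actual value is zero."""
    errors = [abs(a - f) / abs(a) for a, f in zip(actuals, forecasts)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical hourly usage (kWh) and two competing forecasts.
actuals = [100.0, 120.0, 80.0, 100.0]
model_forecast = [90.0, 126.0, 84.0, 100.0]
baseline_forecast = [80.0, 100.0, 120.0, 80.0]

model_mape = mape(actuals, model_forecast)
baseline_mape = mape(actuals, baseline_forecast)

print(f"Model MAPE:    {model_mape:.1f}%")
print(f"Baseline MAPE: {baseline_mape:.1f}%")
```

A model MAPE of a few percent is hard to judge in isolation. Seeing that the baseline scores several times worse on the same data is what makes the number meaningful.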

The baseline model can be the model that is currently in production. Oftentimes, the baseline is set to be a Naive forecast, which we will use in this notebook. The choice of the baseline is also specific to the data. For example, if there is a clear trend in the data one may not want to use a Naive model. Instead, one can use an ARIMA model. Please see this [document](https://learn.microsoft.com/en-us/azure/machine-learning/v1/how-to-configure-auto-train-v1#supported-models) for a list of AutoML models one can choose from to use as a baseline model.
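
As a rough illustration of why a trend can make the Naive model a poor baseline, the sketch below carries the last observed value forward and compares it with a trending series. The `naive_forecast` helper and the numbers are assumptions for illustration; this is not the AutoML implementation.

```python
def naive_forecast(history, horizon):
    """Repeat the last observed value for every step of the horizon."""
    if not history:
        raise ValueError("history must contain at least one observation")
    return [history[-1]] * horizon


# Hypothetical series with a steady upward trend of +2 per period.
history = [10, 12, 14, 16]
future_actuals = [18, 20, 22]

forecast = naive_forecast(history, horizon=len(future_actuals))
errors = [actual - predicted for actual, predicted in zip(future_actuals, forecast)]

print("Forecast:", forecast)
print("Errors:  ", errors)
```

Because the forecast stays flat while the series keeps rising, the error grows with every step ahead, which is the situation in which a trend-aware baseline is the fairer comparison.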

The following two parameters allow us to re-use training runs for the TCN and baseline models, respectively. This can be helpful if you need to experiment with the post-training steps, since it avoids training a new model, which can be computationally expensive.

In [None]:
IS_TCN_MODEL_TRAINED = False
IS_BASE_MODEL_TRAINED = False

### 4.1 Train AutoML model

#### 4.1.1 Set up training parameters
We need to provide the `ForecastingParameters` and `AutoMLConfig` objects. For the forecasting task we also need to define several settings including the name of the time column, the maximum forecast horizon, and the time series identifier column name(s).

#### Forecasting Parameters
To define forecasting parameters for your experiment training, you can leverage the `ForecastingParameters` class. The table below details the forecasting parameters we will be passing into our experiment.


|Property|Description|
|-|-|
|**time_column_name**|The name of the time column in the data.|
|**forecast_horizon**|The forecast horizon is how many periods forward you would like to forecast. This integer horizon is in units of the timeseries frequency (e.g. daily, weekly).|
|**time_series_id_column_names**|The column names used to uniquely identify the time series in data that has multiple rows with the same timestamp. If the time series identifiers are not defined, the data set is assumed to be one time series.|
|**freq**|Forecast frequency. This optional parameter represents the period for which the forecast is desired, for example, daily, weekly, yearly, etc. Use this parameter for the correction of time series containing irregular data points or for padding of short time series. The frequency needs to be a pandas offset alias. Please refer to [pandas documentation](https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#dateoffset-objects) for more information.|


#### AutoMLConfig arguments
|Property|Description|
|-|-|
| **task**                           | forecasting |
| **primary_metric**                 | This is the metric that you want to optimize. Forecasting supports the following primary metrics:<ul><li>`normalized_root_mean_squared_error`</li><li>`normalized_mean_absolute_error`</li><li>`spearman_correlation`</li><li>`r2_score`</li></ul> We recommend using either the normalized root mean squared error or the normalized mean absolute error as the primary metric because they measure forecast accuracy. See the [link](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-automl-forecasting-faq#how-do-i-choose-the-primary-metric) for a more detailed discussion on this topic. |
| **experiment_timeout_hours**       | Maximum amount of time in hours that each experiment can take before it terminates. This is optional but provides customers with greater control on exit criteria. When setting this criterion, we advise taking into account the desired number of iterations and choosing a timeout long enough for that many iterations to complete.|
| **iterations**                     | Number of models to train. This is optional but provides customers with greater control on exit criteria. For TCN models we recommend at least 50 iterations to choose the best architecture. For our experiment we set the number of iterations to 100; however, because the experiment timeout is 1 hour on the specified compute cluster, we will not obtain 100 completions. |
| **label_column_name**              | The name of the target column we are trying to predict. |
| **enable_early_stopping**          | Flag to enable early termination if the primary metric is no longer improving. |
| **max_concurrent_iterations**      | Number of TCN models that are estimated simultaneously. This number should be set to the number of nodes in your cluster. In our case, we have a 12 node cluster and set this value to 12. |
| **enable_dnn**                     | Enable Forecasting DNNs. The default value is `False`. |
| **allowed_models**                 | List of models we want to consider. Since we are only interested in the deep learning models, we list only the `TCNForecaster`.|

In [None]:
from azureml.automl.core.forecasting_parameters import ForecastingParameters

forecast_horizon = 24

forecasting_parameters = ForecastingParameters(
    time_column_name=time_column_name,
    forecast_horizon=forecast_horizon,
    time_series_id_column_names=time_series_id_column_names,
    freq="H",
)

automl_config = AutoMLConfig(
    task="forecasting",
    debug_log="tcn_main.log",
    primary_metric="normalized_root_mean_squared_error",
    experiment_timeout_hours=1,
    iterations=100,
    training_data=train_dataset,
    validation_data=valid_dataset,
    label_column_name=target_column_name,
    compute_target=compute_target,
    enable_early_stopping=True,
    verbosity=logging.INFO,
    max_concurrent_iterations=12,
    max_cores_per_iteration=-1,
    enable_dnn=True,
    allowed_models=["TCNForecaster"],
    forecasting_parameters=forecasting_parameters,
)
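
To make the `forecast_horizon=24` and `freq="H"` settings concrete: the horizon is counted in units of the frequency, so 24 hourly periods span one day after the last observation. The sketch below lists the timestamps such a forecast would cover. The `forecast_timestamps` helper and the last observed timestamp are hypothetical and only serve the illustration.

```python
from datetime import datetime, timedelta


def forecast_timestamps(last_observed, horizon, step=timedelta(hours=1)):
    """Return the timestamps covered by a forecast of `horizon` periods."""
    return [last_observed + step * (i + 1) for i in range(horizon)]


# Hypothetical timestamp of the last observation in the training data.
last_observed = datetime(2014, 9, 30, 23)
stamps = forecast_timestamps(last_observed, horizon=24)

print("First forecast timestamp:", stamps[0])
print("Last forecast timestamp: ", stamps[-1])
```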

### 4.2 Construct pipeline steps

The objective of the next block of code is to create an AzureML pipeline step that encapsulates the AutoML run. For more details see this [link](https://learn.microsoft.com/en-us/python/api/azureml-pipeline-steps/azureml.pipeline.steps.automlstep?view=azure-ml-py). You do not have to change anything here, so run it as is.

In [None]:
from azureml.pipeline.core import PipelineData, TrainingOutput
from azureml.pipeline.steps import AutoMLStep, PythonScriptStep
from azureml.pipeline.core import Pipeline, PipelineParameter

metrics_output_name = "metrics_output"
best_model_output_name = "best_model_output"
model_file_name = "model_file"
metrics_data_name = "metrics_data"

metrics_data = PipelineData(
    name=metrics_data_name,
    datastore=datastore,
    pipeline_output_name=metrics_output_name,
    training_output=TrainingOutput(type="Metrics"),
)
model_data = PipelineData(
    name=model_file_name,
    datastore=datastore,
    pipeline_output_name=best_model_output_name,
    training_output=TrainingOutput(type="Model"),
)

automl_step = AutoMLStep(
    name="automl_module",
    automl_config=automl_config,
    outputs=[metrics_data, model_data],
    allow_reuse=False,
)

### 4.3 Register model step

#### 4.3.1 Run Configuration and Environment
To have a pipeline step run, we first need an environment to run the jobs. The environment can be built using the following code and you do not have to change anything here.

In [None]:
from azureml.core.runconfig import CondaDependencies, RunConfiguration

# create a new RunConfig object
conda_run_config = RunConfiguration(framework="python")

# Set compute target to AmlCompute
conda_run_config.target = compute_target

conda_run_config.docker.use_docker = True

cd = CondaDependencies.create(
    pip_packages=[
        "azureml-sdk[automl]",
        "applicationinsights",
        "azureml-opendatasets",
        "azureml-defaults",
    ],
    conda_packages=["numpy==1.19.5"],
    pin_sdk_version=False,
)
conda_run_config.environment.python.conda_dependencies = cd

print("run config is ready")

#### 4.3.2 Step to register the model.
The following code generates a step to register the model in your workspace from the previous step. Once you have the environment created, you are ready to define your pipeline's steps. There are many built-in steps available via the Azure Machine Learning SDK, as you can see on the [reference documentation](https://docs.microsoft.com/en-us/python/api/azureml-pipeline-steps/azureml.pipeline.steps?view=azure-ml-py) for the `azureml.pipeline.steps` package. The most flexible class is *PythonScriptStep*, which runs a Python script.

The argument values specify the inputs and outputs of the step. In the example below, the data is the training dataset whose name is stored in `target_path_train`. The script `register_model.py` registers the best model from the training run in the workspace. This step will run on the machine defined by the `compute_target` parameter using the configuration `conda_run_config`.

Please note that in order to use the already completed training pipeline for the TCN to generate predictions, we need to know the model name under which the best model is registered in the workspace. To avoid confusion and registering a different model with the same name, we give the model a unique name that is based on date and time of the day. As a result, you will need to record the model name that is printed in the console when the `IS_TCN_MODEL_TRAINED` parameter is `False` in order to use it later.

In [None]:
# The model name with which to register the trained model in the workspace.
if not IS_TCN_MODEL_TRAINED:
    model_name_str = "uci-small-tcn-model-" + datetime.datetime.now().strftime(
        "%Y%m%d%H%M"
    )
    model_name = PipelineParameter("model_name", default_value=model_name_str)
    print(
        f"A new model will be registered under the name: {model_name_str}\n\
    Please record this in order to use in the subsequent inference runs.\n---"
    )
else:
    model_name_str = (
        "<Replace with the output of model_name_str during the model training stage.>"
    )
    model_name = PipelineParameter("model_name", default_value=model_name_str)
    print(f"Using trained model name: {model_name_str}\n---")

register_model_step = PythonScriptStep(
    script_name="register_model.py",
    name="register_model",
    source_directory="scripts",
    allow_reuse=False,
    arguments=[
        "--model_name",
        model_name,
        "--model_path",
        model_data,
        "--ds_name",
        target_path_train,
    ],
    inputs=[model_data],
    compute_target=compute_target,
    runconfig=conda_run_config,
)

In [None]:
print(f"Model name: {model_name_str}\n---")

#### 4.3.3 Build pipeline

Next, we build the pipeline and kick off a run which will select the best TCN model and register it in the workspace.

In [None]:
training_pipeline = Pipeline(
    description="training_pipeline_tcn",
    workspace=ws,
    steps=[automl_step, register_model_step],
)

In [None]:
print(f"Is the TCN model trained? {IS_TCN_MODEL_TRAINED}\n---")

In [None]:
if not IS_TCN_MODEL_TRAINED:
    print("Training new AutoML model ...\n---")
    training_pipelin_run = experiment.submit(training_pipeline)
    training_pipelin_run.wait_for_completion(show_output=False)
    IS_TCN_MODEL_TRAINED = True
else:
    from azureml.train.automl.run import AutoMLRun
    from azureml.pipeline.core.run import PipelineRun

    PIPELINE_RUN_ID = (
        "<Replace with the PipelineRunId used for training the best TCN model>"
    )
    training_pipelin_run = PipelineRun(experiment=experiment, run_id=PIPELINE_RUN_ID)
    print(f"Using previously trained model. Pipeline run ID: {PIPELINE_RUN_ID}\n---")

## 5. Train the baseline model

We will use the Naive model as our baseline. To train it, we kick off another AutoML experiment with the following settings. Please note that we disabled DNN (`enable_dnn=False`), added the Naive model to the allowed models list, and set the number of iterations to 1 since we are only interested in training one specific model. To reduce the training time for non-deep learning models, we set the number of cross validations to 2. Read the following [document](https://learn.microsoft.com/en-us/azure/machine-learning/v1/how-to-auto-train-forecast-v1#training-and-validation-data) for more information on this topic. Note that we did not use the cross validation settings for the TCN model training because such models typically require validation sets that are substantially longer than a small multiple of the forecast horizon.
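
For intuition on what two cross validations mean for time series data, the sketch below builds rolling-origin folds in which each validation window is one forecast horizon long and later folds train on less history. This is a common scheme written under our own assumptions (the `rolling_origin_folds` helper, a step equal to the horizon, 100 observations); AutoML's exact fold construction may differ in its details.

```python
def rolling_origin_folds(n_obs, horizon, n_folds, step=None):
    """Return (train_indices, validation_indices) pairs, most recent fold first."""
    step = horizon if step is None else step
    folds = []
    for k in range(n_folds):
        valid_end = n_obs - k * step
        valid_start = valid_end - horizon
        if valid_start <= 0:
            raise ValueError("not enough observations for the requested folds")
        # Training data always precedes the validation window in time.
        folds.append((range(0, valid_start), range(valid_start, valid_end)))
    return folds


# Hypothetical series of 100 hourly observations, horizon of 24, two folds.
folds = rolling_origin_folds(n_obs=100, horizon=24, n_folds=2)
for train_idx, valid_idx in folds:
    print(f"train: 0..{train_idx[-1]}, validate: {valid_idx[0]}..{valid_idx[-1]}")
```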

The only `AutoMLConfig` settings you might consider changing are the `experiment_timeout_hours` and `allowed_models`. You might want to increase the experiment timeout if your data has lots of unique time series. The allowed model list can be modified to reflect a different choice of the baseline model and can be selected from the supported [forecasting models](https://learn.microsoft.com/en-us/python/api/azureml-train-automl-client/azureml.train.automl.constants.supportedmodels.forecasting) and [regression models](https://learn.microsoft.com/en-us/python/api/azureml-train-automl-client/azureml.train.automl.constants.supportedmodels.regression).

In [None]:
automl_config = AutoMLConfig(
    task="forecasting",
    debug_log="baseline.log",
    primary_metric="normalized_root_mean_squared_error",
    experiment_timeout_hours=1,
    iterations=1,
    training_data=train_dataset,
    label_column_name=target_column_name,
    compute_target=compute_target,
    enable_early_stopping=True,
    n_cross_validations=2,
    verbosity=logging.INFO,
    max_cores_per_iteration=-1,
    enable_dnn=False,
    allowed_models=["Naive"],
    forecasting_parameters=forecasting_parameters,
)

In [None]:
if not IS_BASE_MODEL_TRAINED:
    remote_base_run = experiment.submit(automl_config, show_output=False)
    remote_base_run.wait_for_completion(show_output=False)
    IS_BASE_MODEL_TRAINED = True
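else:
    # Import here so this branch also works when the TCN training cell
    # took its "train a new model" path and never imported AutoMLRun.
    from azureml.train.automl.run import AutoMLRun

    BASE_RUN_ID = "<Replace with the run ID used for training the baseline model>"
    # During the initial training run, copy the run ID so it can be reused later if needed.
    remote_base_run = AutoMLRun(experiment=experiment, run_id=BASE_RUN_ID)
    print(f"Using previously trained model. Run ID: {BASE_RUN_ID}\n---")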

## 6. Test set inference 

### 6.1 Inference the best TCN model

We create an output folder which will be used to save the output of our experiments.

In [None]:
# create an output folder
OUTPUT_DIR = os.path.join(os.getcwd(), "forecast_output")
os.makedirs(OUTPUT_DIR, exist_ok=True)

#### 6.1.1 Get Inference Pipeline Environment
To trigger an inference pipeline run, we first need a running environment that contains all the appropriate packages for the model unpickling. This environment can be obtained either from the training run or from the `yml` file that comes with the model. In this section we use the environment from the training run.

In [None]:
from scripts.register_model import get_best_automl_run

best_run = get_best_automl_run(training_pipelin_run)
inference_env = best_run.get_environment()

In [None]:
from azureml.core.runconfig import RunConfiguration

run_config = RunConfiguration()
run_config.environment = inference_env

#### 6.1.2 Build and submit the inference pipeline

The inference pipeline will create two different outputs: 1) a tabular dataset that contains the predictions and 2) an `OutputFileDatasetConfig` that can be used by subsequent pipeline steps. The `inference_script_tcn.py` script performs a rolling evaluation on the test set with a step size of 24 hours. This generates the same results as if the model had been put in production and generated a 24-hour forecast once a day using the most recent data context. This is an efficient way of testing historical model performance at a tiny fraction of the compute and time cost. For a more detailed explanation of rolling evaluation, see this [document](https://learn.microsoft.com/en-us/azure/machine-learning/v1/how-to-auto-train-forecast-v1#evaluating-model-accuracy-with-a-rolling-forecast).
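
The windowing behind a rolling evaluation can be sketched in a few lines. The example below only computes which test-set positions each forecast covers; it does not call the model, and the `rolling_windows` helper and the 72-hour test length are assumptions made for illustration.

```python
def rolling_windows(n_test, horizon, step):
    """Return (start, end) index pairs, end exclusive, one per forecast origin."""
    windows = []
    origin = 0
    while origin < n_test:
        # The last window is truncated if fewer than `horizon` points remain.
        windows.append((origin, min(origin + horizon, n_test)))
        origin += step
    return windows


# Hypothetical test set of 72 hourly observations with a 24-hour horizon.
daily_windows = rolling_windows(n_test=72, horizon=24, step=24)
overlapping_windows = rolling_windows(n_test=72, horizon=24, step=12)

print("step=24:", daily_windows)
print("step=12:", overlapping_windows)
```

When the step size equals the forecast horizon, every test observation is forecast exactly once. A smaller step produces overlapping windows, so the same timestamp is forecast from several origins.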

<!-- The test_step uses the following arguments:<ul><li>`spearman_correlation`</li><li>`normalized_root_mean_squared_error`</li><li>`r2_score`</li><li>`normalized_mean_absolute_error`</li></ul> -->

The `PythonScriptStep` uses the following arguments:<ul>
    <li> `model_name`: the name under which the model was registered in the workspace. </li>
    <li> `ouput_dataset_name`: name of the dataset under which the test set predictions will be registered in the datastore</li>
    <li> `test_dataset_name`: name of the test dataset </li>
    <li> `target_column_name`: target column name</li>
    <li> `time_column_name`: time column name</li>
    <li> `output_path`: output data that is obtained from prediction step. </li>
    <li> `run_rolling_evaluation`: Boolean parameter. Set to True to run rolling evaluation. Otherwise, runs a recursive forecast for the entire test set. If the length of the test set is larger than the forecast horizon the model was trained for, the recursive forecast method recursively applies the regular forecaster to generate context so that we can forecast further into the future. For more information, see the "Recursive forecasting" section in [this notebook](https://github.com/Azure/azureml-examples/blob/main/v1/python-sdk/tutorials/automl-with-azureml/forecasting-forecast-function/auto-ml-forecasting-function.ipynb).</li>
    <li> `rolling_evaluation_step_size`: Optional parameter that instructs the rolling forecaster how many periods to step forward. See this [document](https://learn.microsoft.com/en-us/azure/machine-learning/v1/how-to-auto-train-forecast-v1#evaluating-model-accuracy-with-a-rolling-forecast) for a more detailed explanation. </li></ul>
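
The recursive forecast mentioned for `run_rolling_evaluation` can be sketched as follows. The `recursive_forecast` helper and the toy forecaster are stand-ins invented for this illustration and are not the AutoML API; the point is only that each block of predictions is appended to the context used for the next block.

```python
def recursive_forecast(forecaster, history, horizon, total_steps):
    """Forecast `total_steps` ahead using a forecaster limited to `horizon` steps."""
    context = list(history)
    predictions = []
    while len(predictions) < total_steps:
        block = forecaster(context, horizon)
        block = block[: total_steps - len(predictions)]
        predictions.extend(block)
        # Predictions, not actuals, become the context for the next block.
        context.extend(block)
    return predictions


def toy_forecaster(context, horizon):
    """Hypothetical stand-in model: continues the series by +1 per step."""
    return [context[-1] + i + 1 for i in range(horizon)]


predictions = recursive_forecast(toy_forecaster, history=[1, 2, 3], horizon=2, total_steps=5)
print(predictions)
```

Since later blocks are conditioned on earlier predictions rather than on observed values, errors can compound over long spans. A rolling evaluation instead refreshes the context with actuals at every origin.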
    
The only parameters you may consider changing in this section are `run_rolling_evaluation` and `rolling_evaluation_step_size`. The rest should be left as is.

In [None]:
from azureml.data import OutputFileDatasetConfig

output_data = OutputFileDatasetConfig(name="prediction_result")

output_ds_name = "uci-tcn-test-output"

test_step = PythonScriptStep(
    name="infer-results",
    source_directory="scripts",
    script_name="inference_script_tcn.py",
    arguments=[
        "--model_name",
        model_name_str,
        "--ouput_dataset_name",
        output_ds_name,
        "--test_dataset_name",
        test_dataset.name,
        "--target_column_name",
        target_column_name,
        "--time_column_name",
        time_column_name,
        "--output_path",
        output_data,
        "--run_rolling_evaluation",
        True,
        "--rolling_evaluation_step_size",
        forecast_horizon,
    ],
    compute_target=compute_target,
    allow_reuse=False,
    runconfig=run_config,
)

In [None]:
test_pipeline = Pipeline(ws, [test_step])
test_run = experiment.submit(test_pipeline)

In [None]:
test_run.wait_for_completion(show_output=False)

### 6.1.3 Get the predicted data

We download the test set predictions and actuals and store them in the `backtest_tcn` object, which in turn is saved to the output folder. We chose the name *backtest* because the rolling evaluation on the test set is equivalent to monitoring the model's performance in real life, had it been put in production for the duration of the test set and generated a forecast every 24 hours using the most recent data.

In [None]:
from azureml.core import Dataset

backtest_tcn = Dataset.get_by_name(ws, output_ds_name)
backtest_tcn = backtest_tcn.to_pandas_dataframe()
backtest_tcn.to_csv(os.path.join(OUTPUT_DIR, "predictions-tcn.csv"), index=False)
backtest_tcn.tail(5)

## 6.2 Inference the baseline model

Next, we perform a rolling evaluation on the test set for the baseline model. To do this, we use the `run_remote_inference` method, which downloads the pickle file of the model into the temporary folder `forecast_naive` and copies the `inference_script_naive.py` file to it. This folder is then uploaded to the compute cluster, where inference is performed. The `inference_script_naive.py` script performs a rolling evaluation on the test set, similar to what we have done for the TCN model. Upon completion of this step, we delete the newly created `forecast_naive` folder.

In [None]:
test_experiment_base = Experiment(ws, experiment_name + "_inference_base")

baseline_run = remote_base_run.get_best_child()
baseline_model_name = baseline_run.properties["model_name"]

In [None]:
import shutil
from scripts.helper_scripts import run_remote_inference

if True:
    remote_base_run_test = run_remote_inference(
        test_experiment=test_experiment_base,
        compute_target=compute_target,
        train_run=baseline_run,
        test_dataset=test_dataset,
        target_column_name=target_column_name,
        rolling_evaluation_step_size=forecast_horizon,
        inference_folder="./forecast_naive",
    )
    remote_base_run_test.wait_for_completion(show_output=False)

    # download the forecast file to the local machine
    remote_base_run_test.download_file(
        "outputs/predictions.csv", os.path.join(OUTPUT_DIR, "predictions-base.csv")
    )

    # delete downloaded scripts
    shutil.rmtree("./forecast_naive")
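
In the cell above, the cleanup step only runs if the remote run and the download both succeed. The sketch below shows one way to guarantee cleanup of a staging folder with `try`/`finally`, using only the standard library. The helper `stage_and_run` and the stand-in `fake_submit` are hypothetical names for this illustration; they are not part of `helper_scripts` or the Azure ML SDK.

```python
import os
import shutil
import tempfile


def stage_and_run(script_path, work, staging_root=None):
    """Copy `script_path` into a fresh staging folder, call `work(folder)`, then clean up."""
    folder = tempfile.mkdtemp(prefix="forecast_naive_", dir=staging_root)
    try:
        shutil.copy(script_path, folder)
        return work(folder)
    finally:
        # Runs whether `work` succeeded or raised, so no stale folder is left behind.
        shutil.rmtree(folder, ignore_errors=True)


# Demonstration with a placeholder script and a stand-in for the remote submission.
with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "inference_script_naive.py")
    with open(script, "w") as f:
        f.write("print('placeholder')\n")
    seen = {}

    def fake_submit(folder):
        # Record the staging folder and report which files were staged.
        seen["folder"] = folder
        return sorted(os.listdir(folder))

    staged_files = stage_and_run(script, fake_submit, staging_root=tmp)

print(staged_files)  # ['inference_script_naive.py']
print(os.path.exists(seen["folder"]))  # False
```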

## 7. Test set model evaluation

In this section we will evaluate the test set performance for the best TCN model and compare it with the baseline. We will generate time series plots for forecasts and actuals, calculate accuracy metrics and plot the evolution of metrics for each model over time. All output from this section will be stored in the `forecast_output` folder and can be referenced any time you need it.

### 7.1 Load test set results

Here, we will import test set results for the TCN and the baseline experiments.

In [None]:
backtest_tcn = pd.read_csv(
    os.path.join(OUTPUT_DIR, "predictions-tcn.csv"), parse_dates=[time_column_name]
)
backtest_base = pd.read_csv(
    os.path.join(OUTPUT_DIR, "predictions-base.csv"), parse_dates=[time_column_name]
)

Next, we combine outputs into a single dataframe which will be used for plotting and scoring.

In [None]:
backtest = backtest_tcn.merge(
    backtest_base.drop(target_column_name, axis=1),
    on=["customer_id", "datetime"],
    how="inner",
    suffixes=["", "_base"],
)
backtest.head()

In [None]:
print(
    f"N model: {backtest_tcn.shape[0]}. N baseline: {backtest_base.shape[0]}. N merged: {backtest.shape[0]}"
)
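
If the merged row count is lower than either input, the inner join has dropped rows that lack a match. The sketch below shows, on toy data, how an outer merge with `indicator=True` can reveal which rows are unmatched. The column name `usage` and the values are assumptions for illustration only; `customer_id`, `datetime` and `predicted` mirror the columns used above.

```python
import pandas as pd

# Toy predictions for one customer.
tcn = pd.DataFrame(
    {
        "customer_id": ["MT_001"] * 3,
        "datetime": pd.to_datetime(
            ["2014-01-01 00:00", "2014-01-01 01:00", "2014-01-01 02:00"]
        ),
        "usage": [10.0, 12.0, 11.0],  # hypothetical target column
        "predicted": [9.5, 12.5, 11.0],
    }
)
# The baseline output is missing the last hour on purpose.
base = pd.DataFrame(
    {
        "customer_id": ["MT_001"] * 2,
        "datetime": pd.to_datetime(["2014-01-01 00:00", "2014-01-01 01:00"]),
        "usage": [10.0, 12.0],
        "predicted": [10.0, 10.0],
    }
)

merged = tcn.merge(
    base.drop("usage", axis=1),
    on=["customer_id", "datetime"],
    how="outer",
    suffixes=("", "_base"),
    indicator=True,  # adds a "_merge" column: both / left_only / right_only
)
unmatched = merged[merged["_merge"] != "both"]
print(unmatched[["customer_id", "datetime", "_merge"]])
```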

The data we are working with has an hourly frequency, and we plan to generate forecasts every 24 hours. If the model were put in production, forecasts would be generated and the model's performance monitored every 24 hours, so we mimic that scoring process on the test set by computing daily accuracy metrics. To do this, we create a date column ("ymd"). If you want to score the output at another frequency, say weekly, change the frequency argument of `to_period` to the desired frequency.

In [None]:
PERIOD_COLUMN = "ymd"

backtest[PERIOD_COLUMN] = backtest[time_column_name].dt.to_period(
    "D"
)  # year-month-day to be used for daily metrics computation
backtest.head()
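
As a small illustration of the frequency argument, the sketch below applies `to_period` with daily (`"D"`) and weekly (`"W"`) frequencies to a few toy timestamps. The timestamps are made up for this example.

```python
import pandas as pd

timestamps = pd.Series(
    pd.to_datetime(["2014-09-01 23:00", "2014-09-02 00:00", "2014-09-08 05:00"])
)

daily = timestamps.dt.to_period("D")   # one period per calendar day
weekly = timestamps.dt.to_period("W")  # one period per calendar week

print(daily.astype(str).tolist())  # ['2014-09-01', '2014-09-02', '2014-09-08']
print(weekly.nunique())            # 2: the first two timestamps share a week
```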

### 7.2 Generate time series plots

Here, we generate forecast versus actuals plots on the test set for both the best TCN model and the baseline. Since we use rolling evaluation with a step size of 24 hours, this mimics putting both models in production and monitoring their behavior for the duration of the test set. This step allows users to make informed decisions about model performance before incurring the costs of productionizing the model and monitoring its performance in real life.

In [None]:
from scripts.helper_scripts import _draw_one_plot
from matplotlib import pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages

plot_filename = "forecast_vs_actual.pdf"

pdf = PdfPages(os.path.join(os.getcwd(), OUTPUT_DIR, plot_filename))
for _, one_forecast in backtest.groupby(GRAIN_COL):
    one_forecast[time_column_name] = pd.to_datetime(one_forecast[time_column_name])
    one_forecast.sort_values(time_column_name, inplace=True)
    _draw_one_plot(
        one_forecast,
        time_column_name,
        target_column_name,
        [GRAIN_COL],
        [target_column_name, "predicted", "predicted_base"],
        pdf,
        plot_predictions=True,
    )
pdf.close()

### 7.3 Calculate metrics
Here, we calculate the metric of interest for each day. For illustration purposes, we use root mean squared error (RMSE) as the metric of choice; however, the `compute_all_metrics` method calculates all primary and secondary metrics for AutoML runs, so you can choose any other metric it computes. Please refer to the <u>*Regression/forecasting metrics*</u> section in this [document](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-understand-automated-ml#regressionforecasting-metrics) for the list of available metrics. We then look at the distribution of this metric for each time series in our dataset. The descriptive statistics of such a metric can be more informative than a single number, such as the mean, for each time series.

In [None]:
from scripts.helper_scripts import compute_all_metrics

DESIRED_METRIC_NAME = "root_mean_squared_error"

metrics_per_grain_day = compute_all_metrics(
    fcst_df=backtest,
    actual_col=target_column_name,
    fcst_col="predicted",
    ts_id_colnames=[GRAIN_COL, PERIOD_COLUMN],
)
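metrics_per_grain_day = metrics_per_grain_day.query(
    f'metric_name == "{DESIRED_METRIC_NAME}"'
).copy()  # copy so that the column assignment below does not act on a view
# `n` must be passed by keyword in recent pandas versions
metrics_per_grain_day[[GRAIN_COL, PERIOD_COLUMN]] = metrics_per_grain_day[
    "time_series_id"
].str.split("|", n=1, expand=True)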
metrics_per_grain_day.to_csv(
    os.path.join(OUTPUT_DIR, "metrics-automl.csv"), index=False
)
metrics_per_grain_day.groupby(GRAIN_COL)["metric"].describe()
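
As a sanity check on what a daily RMSE represents, the sketch below computes it directly with pandas on toy data. It follows the textbook definition of RMSE and is not the implementation of `compute_all_metrics`; the column names `actual` and `predicted` and all values are assumptions for this example.

```python
import pandas as pd

toy = pd.DataFrame(
    {
        "customer_id": ["A", "A", "A", "A"],
        "ymd": ["2014-09-01", "2014-09-01", "2014-09-02", "2014-09-02"],
        "actual": [10.0, 14.0, 8.0, 8.0],
        "predicted": [13.0, 10.0, 8.0, 8.0],
    }
)

# RMSE = square root of the mean squared error within each (series, day) group.
toy["squared_error"] = (toy["actual"] - toy["predicted"]) ** 2
rmse_per_day = (
    toy.groupby(["customer_id", "ymd"])["squared_error"]
    .mean()
    .pow(0.5)
    .rename("rmse")
    .reset_index()
)
print(rmse_per_day)
```

For the first day the errors are -3 and 4, so the mean squared error is 12.5 and the RMSE is about 3.54; the second day is forecast perfectly and has an RMSE of 0.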

In [None]:
# Uncomment the following line to see all metrics we computed
# print(f'List of available metrics: {metrics_per_grain_day["metric_name"].unique()}\n---')

In [None]:
# baseline metrics
metrics_per_grain_day_base = compute_all_metrics(
    fcst_df=backtest,
    actual_col=target_column_name,
    fcst_col="predicted_base",
    ts_id_colnames=[GRAIN_COL, PERIOD_COLUMN],
)
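metrics_per_grain_day_base = metrics_per_grain_day_base.query(
    f'metric_name == "{DESIRED_METRIC_NAME}"'
).copy()  # copy so that the column assignment below does not act on a view
# split the baseline's own `time_series_id` column; `n` must be passed by keyword
metrics_per_grain_day_base[[GRAIN_COL, PERIOD_COLUMN]] = metrics_per_grain_day_base[
    "time_series_id"
].str.split("|", n=1, expand=True)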
metrics_per_grain_day_base.to_csv(
    os.path.join(OUTPUT_DIR, "metrics-base.csv"), index=False
)
metrics_per_grain_day_base.groupby(GRAIN_COL)["metric"].describe()

### 7.4 Visualize metrics

In this section we plot metric evolution over time for the TCN and the baseline models.

In [None]:
metrics_df = metrics_per_grain_day.drop("time_series_id", axis=1).merge(
    metrics_per_grain_day_base.drop("time_series_id", axis=1),
    on=["metric_name", GRAIN_COL, PERIOD_COLUMN],
    how="inner",
    suffixes=["", "_base"],
)
metrics_df

In [None]:
grain = [GRAIN_COL]
plot_filename = "metrics_plot.pdf"

pdf = PdfPages(os.path.join(os.getcwd(), OUTPUT_DIR, plot_filename))
for _, one_forecast in metrics_df.groupby(grain):
    one_forecast[PERIOD_COLUMN] = pd.to_datetime(one_forecast[PERIOD_COLUMN])
    one_forecast.sort_values(PERIOD_COLUMN, inplace=True)
    _draw_one_plot(
        one_forecast,
        PERIOD_COLUMN,
        target_column_name,
        grain,
        ["metric", "metric_base"],
        pdf,
        plot_predictions=True,
    )
pdf.close()

In [None]:
from IPython.display import IFrame

IFrame(os.path.join("./forecast_output/metrics_plot.pdf"), width=800, height=300)
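
Alongside the plots, a simple numeric summary can help compare the two models. The sketch below computes, per time series, the share of days on which the model's error is lower than the baseline's. The data are made up; the column names `metric` and `metric_base` mirror those in `metrics_df`, and the "win rate" summary is an illustrative choice rather than a metric produced by AutoML.

```python
import pandas as pd

toy_metrics = pd.DataFrame(
    {
        "customer_id": ["A", "A", "A", "B", "B", "B"],
        "ymd": ["2014-09-01", "2014-09-02", "2014-09-03"] * 2,
        "metric": [3.0, 2.0, 5.0, 1.0, 4.0, 2.0],
        "metric_base": [4.0, 2.5, 4.0, 3.0, 5.0, 2.5],
    }
)

# For an error metric such as RMSE, lower is better.
toy_metrics["model_wins"] = toy_metrics["metric"] < toy_metrics["metric_base"]
win_rate = toy_metrics.groupby("customer_id")["model_wins"].mean()
print(win_rate)
```

In this toy data, the model beats the baseline on two of three days for series A and on all three days for series B.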

## 8. Inference TCN

In this step, we generate an actual forecast by providing an inference set that does not contain actual values. This illustrates how to generate production forecasts in real life. The code in this section is nearly identical to the code in section 6.1, with one exception: we set the `run_rolling_evaluation` argument to `False`.

### 8.1 Build and submit the inference pipeline

In [None]:
from azureml.core.runconfig import RunConfiguration

run_config = RunConfiguration()
run_config.environment = inference_env

In [None]:
output_data = OutputFileDatasetConfig(name="prediction_result")

output_ds_name = "uci-tcn-inference-output"

inference_step = PythonScriptStep(
    name="infer-results",
    source_directory="scripts",
    script_name="inference_script_tcn.py",
    arguments=[
        "--model_name",
        model_name_str,
        "--ouput_dataset_name",
        output_ds_name,
        "--test_dataset_name",
        inference_dataset.name,
        "--target_column_name",
        target_column_name,
        "--time_column_name",
        time_column_name,
        "--output_path",
        output_data,
        "--run_rolling_evaluation",
        False,
    ],
    compute_target=compute_target,
    allow_reuse=False,
    runconfig=run_config,
)

In [None]:
inference_pipeline = Pipeline(ws, [inference_step])
inference_run = experiment.submit(inference_pipeline)

In [None]:
inference_run.wait_for_completion(show_output=False)

### 8.2 Get the predicted data

In [None]:
from azureml.core import Dataset

inference_ds = Dataset.get_by_name(ws, output_ds_name)
inference_df = inference_ds.to_pandas_dataframe()
inference_df.tail(5)

## 9. Schedule Pipeline

This section shows how to schedule a pipeline to generate predictions periodically. For more information about pipeline schedules and pipeline endpoints, see this [notebook](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-setup-schedule-for-a-published-pipeline.ipynb).

In [None]:
inference_published_pipeline = inference_pipeline.publish(
    name="UCI Inference", description="UCI Inference"
)
print("Newly published pipeline id: {}".format(inference_published_pipeline.id))

If `inference_dataset` is refreshed every 24 hours and we want a new forecast every 24 hours (the `forecast_horizon`), we can schedule the pipeline to run every day at 11 pm to get daily inference results. You can refresh the dataset periodically as new data becomes available, which creates a newer version of it: the target column should contain actual values at the beginning, which serve as context data, followed by NaNs for the periods to be predicted. The inference pipeline picks up this context to improve forecast accuracy. See the <u><i>Forecasting away from training data</i></u> section in this [notebook](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/automated-machine-learning/forecasting-forecast-function/auto-ml-forecasting-function.ipynb).
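
The layout described above, context values followed by NaNs, can be sketched on toy data as follows. The column name `usage`, the customer id, the 48-hour context length and the constant values are assumptions for illustration; only the 24-hour horizon comes from this notebook.

```python
import numpy as np
import pandas as pd

forecast_horizon = 24  # hours to be predicted
context_hours = 48     # assumed amount of recent history supplied as context

timestamps = pd.date_range(
    "2014-09-01",
    periods=context_hours + forecast_horizon,
    freq=pd.Timedelta(hours=1),
)

toy_inference = pd.DataFrame(
    {
        "customer_id": "MT_001",
        "datetime": timestamps,
        # Known actuals for the context window, NaN for the hours to be forecast.
        "usage": [100.0] * context_hours + [np.nan] * forecast_horizon,
    }
)

print(toy_inference["usage"].isna().sum())  # 24
print(toy_inference.tail(3))
```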

In [None]:
from azureml.pipeline.core.schedule import ScheduleRecurrence, Schedule

recurrence = ScheduleRecurrence(
    frequency="Day", interval=1, hours=[23], minutes=[0]  # Runs every day at 11:00 pm
)

schedule = Schedule.create(
    workspace=ws,
    name="tcn_inference_schedule",
    pipeline_id=inference_published_pipeline.id,
    experiment_name="schedule-run-tcn-uci-electro",
    recurrence=recurrence,
    wait_for_provisioning=True,
    description="Schedule Run",
)

# You may want to make sure that the schedule is provisioned properly
# before making any further changes to the schedule

print("Created schedule with id: {}\n---".format(schedule.id))
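
To reason about when a daily 11 pm schedule would next trigger, the standard-library sketch below computes the next run time from a given moment. It only illustrates the timing of a daily recurrence; it is not how the Azure ML scheduler is implemented, it ignores time zones, and the function name `next_daily_run` is hypothetical.

```python
from datetime import datetime, timedelta


def next_daily_run(now, hour=23, minute=0):
    """Return the first time strictly after `now` that falls on hour:minute."""
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        # Today's slot has already passed (or is right now), so use tomorrow's.
        candidate += timedelta(days=1)
    return candidate


print(next_daily_run(datetime(2014, 9, 1, 10, 0)))   # 2014-09-01 23:00:00
print(next_daily_run(datetime(2014, 9, 1, 23, 30)))  # 2014-09-02 23:00:00
```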

### 9.1 [Optional] Disable schedule

In [None]:
schedule.disable()