# Dense model
It is the end of May 2020.

Your models worked. You managed to develop accurate short-term predictors in a matter of days.

Now, you are asked to explain how you did it. And you decide to do it the hard way. You will show your colleagues a demo and explain it step by step.

You enter the room, look at the audience, and recognize familiar faces: Gabriele and Matteo, the ML engineers; Marta and Gabriele, your fellow data scientists; and Paolo, the new guy in the group.

As always, you start from the problem statement.

<div class="alert alert-block alert-info">
<b>Problem Statement</b> 
    
Given Italian daily power load data from 2006 to day $t$, predict the load on day $t+1$.
</div>

Your first idea was an autoregressive model. 

You wanted to be flexible, so you opted for a dense neural network. It is by no means the state of the art, but it was just a first shot.
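
In symbols, and assuming the seven daily lags used later in this notebook, the autoregressive setup can be sketched as follows. Here $y_t$ is the load on day $t$ and $f_\theta$ stands for the dense network with weights $\theta$; the notation is only illustrative and is not taken from the training script.

$$\hat{y}_{t+1} = f_\theta\left(y_t, y_{t-1}, \dots, y_{t-6}\right)$$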

You start to explain how you trained the model.

# Setup
Again, you update the default packages in SageMaker Studio.

If this is the first time you run this notebook, please restart the kernel after running the install cell below: this ensures that the newly installed libraries can actually be imported.

In [None]:
# To read data from S3
! pip install pandas s3fs --upgrade

In [None]:
# 'ml.m5.xlarge' is included in the AWS Free Tier
INSTANCE_TYPE = 'ml.m5.xlarge'

In [None]:
import os

import boto3
import sagemaker
import numpy as np
import pandas as pd

from sklearn.model_selection import train_test_split
from sagemaker.tensorflow import TensorFlow

In [None]:
# Configuring the default size for matplotlib plots
import matplotlib.pyplot as plt

plt.rcParams["figure.figsize"] = (20, 6)

# Data gathering
You gather the data as you did for the Fourier regression.

In [None]:
boto_session = boto3.Session()
sagemaker_session = sagemaker.Session()
sagemaker_client = boto_session.client("sagemaker")
sagemaker_bucket = sagemaker_session.default_bucket()
main_prefix = "amld22-workshop-sagemaker"

raw_data_s3_path = "s3://public-workshop/normalized_data/processed/2006_2022_data.parquet"
raw_df = pd.read_parquet(raw_data_s3_path)
resampled_df = raw_df.resample('D').sum()

# Data preparation

You consider two datasets:
- data until the end of 2019: the same data that has been used to train the long-term models
- data until the end of May 2020: to show the behaviour of models during the pandemic period

In [None]:
data_df = resampled_df[:'2019-12-31 23:59'].copy()
covid_df = resampled_df[:'2020-05-31 23:59'].copy()

covid_len = covid_df.shape[0] - data_df.shape[0]
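
Since `resample('D')` produces one row per calendar day, `covid_len` should match the number of days from 1 January to 31 May 2020, assuming the raw data covers the whole period. A quick standalone check (the variable names here are illustrative):

```python
import pandas as pd

# Daily index covering the test window (1 January to 31 May 2020, inclusive)
test_days = pd.date_range("2020-01-01", "2020-05-31", freq="D")

# 31 + 29 + 31 + 30 + 31 days: 2020 is a leap year
expected_covid_len = len(test_days)
print(expected_covid_len)
```

If `covid_len` differs from this number, the raw data probably ends before 31 May 2020 or starts after the end of 2019.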

You use the data until the end of 2019 as the **training** set, and the data from January to the end of May 2020 as the **test** set.   
To prepare the dataset for the neural network, you write a utility function to transform the load series into a dataframe of lagged features.

In [None]:
# To transform a series of datapoints into a dataframe that contains the lagged features
def build_lagged_df(series: pd.Series, n_lags: int) -> pd.DataFrame:
    df = pd.DataFrame({series.name: series})
    for i in range(1, n_lags + 1):
        df[f'{series.name}_{i}'] = series.shift(i)
    df = df.dropna()
    return df
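
To see what the utility produces, here is a small sketch on a made-up five-point series with two lags. The function is repeated so that the snippet runs on its own; the toy values are invented.

```python
import pandas as pd

def build_lagged_df(series, n_lags):
    # Same logic as the utility above, repeated so the snippet is self-contained
    df = pd.DataFrame({series.name: series})
    for i in range(1, n_lags + 1):
        df[f"{series.name}_{i}"] = series.shift(i)
    return df.dropna()

# Made-up series: each row of the result pairs a value with its two predecessors
toy = pd.Series([10.0, 20.0, 30.0, 40.0, 50.0], name="Load")
lagged = build_lagged_df(toy, n_lags=2)
print(lagged)
```

The first `n_lags` rows are dropped because their lagged values are undefined, so the result has three rows: the target column `Load` plus the features `Load_1` and `Load_2`.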

In [None]:
n_lags = 7

# We rescale the dataset using the Max value within the training set
covid_max = data_df.Load.max()
dense_df = build_lagged_df(covid_df.Load / covid_max, n_lags=n_lags)

x_train_dense_scaled, x_test_dense_scaled, y_train_dense_scaled, y_test_dense_scaled = train_test_split(
    dense_df.drop(columns=['Load']),
    dense_df.Load,
    test_size=covid_len,
    shuffle=False
)

print(f"Train set length: {x_train_dense_scaled.shape[0]} | Test set length: {x_test_dense_scaled.shape[0]}")

train_df = dense_df.copy().loc[x_train_dense_scaled.index]
train_df.head()
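
The scaling step can be illustrated on made-up numbers. In this sketch the constant is computed on the training portion only, so scaled test values may exceed 1, and multiplying by the same constant restores the original units. All names and values are illustrative.

```python
import pandas as pd

# Made-up daily loads: the first four values play the role of the training period
load = pd.Series([80.0, 100.0, 90.0, 95.0, 110.0, 70.0], name="Load")
train, test = load.iloc[:4], load.iloc[4:]

# The scaling constant comes from the training period only
train_max = train.max()
train_scaled = train / train_max
test_scaled = test / train_max

# Multiplying by the same constant restores the original units
test_restored = test_scaled * train_max
print(train_scaled.max(), test_scaled.max())
```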

As in the Fourier regression, you upload the data to S3.

This is needed to train the TensorFlow estimator in framework mode.

In [None]:
s3_train_path = f's3://{sagemaker_bucket}/{main_prefix}/data/modelling/dense-model/train.parquet'
train_df.to_parquet(s3_train_path)

# TensorFlow model fit

In order to minimize the friction in using TensorFlow, you choose a managed container and go for framework mode.

In SageMaker framework mode, you specify the entry point `dense_model.py` as the Python script to use within the TensorFlow container.

The script defines the neural net's structure and trains it on the training set according to the hyperparameters passed in this notebook.
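
In framework mode, the hyperparameters reach the entry point as command-line arguments, and the container exposes the data and model paths through environment variables such as `SM_CHANNEL_TRAINING` and `SM_MODEL_DIR`. The actual `dense_model.py` is not shown here, so the following is only a sketch of how such a script might read its inputs; the helper name `parse_args`, the `--sm_model_dir` and `--train` arguments, and the default values are assumptions.

```python
import argparse
import os

def parse_args(argv=None):
    """Read hyperparameters and SageMaker paths (hypothetical helper)."""
    parser = argparse.ArgumentParser()
    # Hyperparameters: the names match the keys passed to the estimator
    parser.add_argument("--num_of_epochs", type=int, default=300)
    parser.add_argument("--batch_size", type=int, default=64)
    parser.add_argument("--learning_rate", type=float, default=1e-4)
    parser.add_argument("--version_number", type=str, default="0000001")
    # Paths that SageMaker exposes through environment variables in the container
    parser.add_argument("--sm_model_dir", type=str,
                        default=os.environ.get("SM_MODEL_DIR", "/opt/ml/model"))
    parser.add_argument("--train", type=str,
                        default=os.environ.get("SM_CHANNEL_TRAINING",
                                               "/opt/ml/input/data/training"))
    # Ignore any extra arguments that the container may add
    args, _unknown = parser.parse_known_args(argv)
    return args

# Simulate the arguments that a training job could pass to the script
args = parse_args(["--num_of_epochs", "5", "--learning_rate", "0.001"])
print(args.num_of_epochs, args.batch_size, args.learning_rate)
```

The `--train` default follows the channel name: the `'training'` key used in `fit()` maps to the `SM_CHANNEL_TRAINING` variable.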

In [None]:
# Hyperparameters
NUM_OF_EPOCHS = 300
BATCH_SIZE = 64
LEARNING_RATE = 0.0001

dense_estimator = TensorFlow(
    entry_point='dense_model.py',
    role=sagemaker.get_execution_role(),
    instance_count=1,
    instance_type=INSTANCE_TYPE,
    framework_version="2.4.1",
    py_version="py37",
    hyperparameters={
        "num_of_epochs": NUM_OF_EPOCHS,
        "batch_size": BATCH_SIZE,
        "learning_rate": LEARNING_RATE,
        "version_number": "0000001"
    }
)
dense_estimator.fit({'training': s3_train_path})

# Deployment
Using SageMaker's hosting facilities, you deploy the model to a managed endpoint.

At inference time, the model trained by the `dense_model.py` script is loaded and asked to predict on new data.

Unlike in the Fourier regression, you do not rely on the Feature Store, so there are no limitations on the number of predicted samples.

In [None]:
dense_predictor = dense_estimator.deploy(
    initial_instance_count=1, 
    instance_type=INSTANCE_TYPE,
    serializer=sagemaker.serializers.CSVSerializer(),
    deserializer=sagemaker.deserializers.JSONDeserializer()
)

# Prediction
Finally, you use the deployed model to perform some predictions. 

You use the predictor object returned by `.deploy()`, but your fellow engineers will be able to call the API using any REST client.

As you selected the CSV serializer, you can pass a numpy array directly to the `.predict()` method.

In [None]:
prediction = dense_predictor.predict(x_test_dense_scaled.to_numpy())
y_pred = pd.Series([y[0] for y in prediction['predictions']], index=x_test_dense_scaled.index)
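
For colleagues who call the endpoint without the SageMaker SDK, the exchange can be sketched as follows. This approximates the CSV payload and the JSON response structure used above; it is not the SDK's own implementation. The helpers `to_csv_payload`, `parse_predictions`, and `invoke` are hypothetical, and the sample numbers are invented.

```python
import json

def to_csv_payload(rows):
    # One line per sample, comma-separated features
    return "\n".join(",".join(str(v) for v in row) for row in rows)

def parse_predictions(body):
    # The JSON response holds one single-element list per sample
    return [p[0] for p in json.loads(body)["predictions"]]

def invoke(endpoint_name, payload):
    # Not executed here: it needs AWS credentials and a live endpoint
    import boto3
    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name, ContentType="text/csv", Body=payload
    )
    return parse_predictions(response["Body"].read())

# Two made-up samples with seven scaled lags each
payload = to_csv_payload([[0.8, 0.7, 0.9, 0.85, 0.8, 0.75, 0.7],
                          [0.7, 0.9, 0.85, 0.8, 0.75, 0.7, 0.8]])
# A mocked response with the same structure as above; the numbers are invented
mock_body = '{"predictions": [[0.81], [0.79]]}'
print(parse_predictions(mock_body))
```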

In [None]:
def mean_absolute_percentage_error(y_true, y_pred):
    # MAPE normalizes the absolute error by the actual values
    return np.mean(np.abs(y_true - y_pred) / np.abs(y_true))
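
As a sanity check on the metric, here is a toy computation of the MAPE in its conventional form, where each absolute error is divided by the corresponding actual value. The function name `mape` and the numbers are illustrative.

```python
import numpy as np

def mape(y_true, y_pred):
    # Conventional form: absolute errors relative to the actual values
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean(np.abs(y_true - y_pred) / np.abs(y_true))

# Made-up values: the relative errors are 10/100, 20/200 and 0/400
toy_mape = mape([100.0, 200.0, 400.0], [110.0, 180.0, 400.0])
print(f"{100 * toy_mape:.2f} %")
```

Note that the metric is undefined when an actual value is zero, which is not a concern for a national daily load.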

In [None]:
dense_mape = mean_absolute_percentage_error(y_test_dense_scaled * covid_max, y_pred * covid_max)

plt.title(f"Dense model | MAPE: {100 * dense_mape:.2f} %")
plt.plot(y_test_dense_scaled * covid_max, label='Actual')
plt.plot(y_pred * covid_max, label='Predicted')
plt.legend()
plt.grid(alpha=0.4)
plt.show()

In [None]:
covid_prediction_df = pd.DataFrame({'actual': y_test_dense_scaled * covid_max, 'predicted': y_pred * covid_max})
rolling_mape_df = pd.DataFrame(
    {'rolling_mape': map(lambda w: mean_absolute_percentage_error(w["actual"], w["predicted"]),
                         covid_prediction_df.rolling(7, method="table"))},
    index=covid_prediction_df.index
)
plt.plot(rolling_mape_df.rolling_mape)
plt.title('Rolling MAPE (7-day window)')
plt.grid(alpha=0.4)
plt.show()
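
Because a MAPE is the mean of per-day percentage errors, a rolling MAPE can also be obtained by averaging those errors over a moving window. Below is a sketch on made-up numbers, with a two-day window and errors normalized by the actual values; both are illustrative choices.

```python
import pandas as pd

# Made-up actual and predicted loads
toy_df = pd.DataFrame({
    "actual":    [100.0, 200.0, 100.0, 200.0],
    "predicted": [110.0, 180.0, 100.0, 220.0],
})

# Per-day absolute percentage error, then a moving average over the window
ape = (toy_df["actual"] - toy_df["predicted"]).abs() / toy_df["actual"].abs()
rolling_mape = ape.rolling(2, min_periods=1).mean()
print(rolling_mape.tolist())
```

With `min_periods=1`, the first value is computed on a partial window, so the result has one entry per day.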

The rolling MAPE is way better than before. You realize that the scale is different: now the maximum is 0.1, whereas with the Fourier regression it was 0.3.

# Conclusions
Yes, it is a completely different beast with respect to the Fourier regression.

No, it is not a state-of-the-art model.

But it did the job. 

The MAPE on the COVID period is much lower than before, and you are satisfied with these results. You get a lot of questions about more complex time-series-focused models, but this will be a topic for another day.

In fact, you have wondered about the issue for quite a while. Would a more complex model, better suited to time series, be able to improve on these results? You read in the AWS documentation that one of their built-in algorithms is tailored to time series. It is more suited to sets of series with similar structure, but maybe it can also help with this problem.

Giving it a try looks just so easy...

# Cleanup
If you’re ready to be done with this notebook, please run the cell below with `CLEANUP = True`. 

This will remove the model and the hosted endpoint you created, to avoid any charges from a stray instance being left on.

In [None]:
CLEANUP = True
if CLEANUP:
    dense_predictor.delete_model()
    dense_predictor.delete_endpoint()