## Time-Series Forecasting

This notebook covers the experimentation for choosing the forecasting model architecture to be used in the proposed design.

In [3]:
!nvidia-smi

Thu May  4 21:30:40 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.12    Driver Version: 525.85.12    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla V100-SXM2...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   33C    P0    23W / 300W |      0MiB / 16384MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

In [4]:
#Necessary imports

import numpy as np
import pandas as pd

import matplotlib.pyplot as plt
import joblib
import re

from darts import TimeSeries
from darts.dataprocessing.transformers import Scaler
from darts.metrics import mape, rmse
from darts.utils.timeseries_generation import datetime_attribute_timeseries
from darts.models import NBEATSModel, NHiTSModel, TransformerModel, TFTModel


In [6]:
df = pd.read_csv("data/master_data.zip")

df["timestamp"] = pd.to_datetime(df["timestamp"], unit='s')

df["Year"] = df["timestamp"].dt.year
df["Month"] = df["timestamp"].dt.month

movie_ids = df["movieId"].unique()
movie_ids.sort()

### Generate Time-Series

- Historical demands of movies have been increased by 1 to prevent possible errors in calculations.
- Movies that have less than 48 time-steps are excluded for this experimentation to ensure a minimum validation split of 25%. 
- For a faster implementation of this experiment, only 100 selected movies are considered as a subset of samples.


Covariates are generated for Month and Year values from the target time-series index.

In [7]:
train_series = []
validation_series = []

covariates_train = []
covariates_full = []

excluded_movie_ids = []

MINIMUM_MONTHS = 48

for i in movie_ids[:100]:
    ts_movie = df.loc[df[df["movieId"]==i].index, :]
    ts_movie = ts_movie.set_index("timestamp")
    ts_movie = ts_movie.groupby(pd.Grouper(freq='M'))["userId"].count()
    ts_movie.name = "RatingCounts"
    ts_movie = ts_movie + 1 #For preventing errors in calculations
    ts_movie = pd.DataFrame(ts_movie).reset_index()
    
    if len(ts_movie) < MINIMUM_MONTHS:
        excluded_movie_ids.append(i)
    
    else:
        ts = TimeSeries.from_dataframe(ts_movie, "timestamp", "RatingCounts")
        # Set aside the last 12 months as validation series
        train, val = ts[:-12], ts[-12:]
        train_series.append(train)
        validation_series.append(val)

        covs = datetime_attribute_timeseries(ts, attribute="year", one_hot=False)
        covs = covs.stack(datetime_attribute_timeseries(ts, attribute="month", one_hot=False))
        covs = covs.astype(np.float32)
        train_cov= covs[:-12]
        covariates_train.append(train_cov)
        covariates_full.append(covs)


Both the target values and covariates are scaled.

In [8]:
#scale
target_scaler = Scaler()
train_series_scaled = target_scaler.fit_transform(train_series, n_jobs=-1)

covariate_scaler = Scaler()
covariate_scaler.fit(covariates_train, n_jobs=-1)
covariates_scaled = covariate_scaler.transform(covariates_full, n_jobs=-1)

### Experimentation

Same models will be experimented with OUTPUT_CHUNK_LENGTH as 12 and 1.

In [9]:
INPUT_CHUNK_LENGTH = 12
OUTPUT_CHUNK_LENGTH = 12
N_EPOCHS = 50
PREDICTION_LENGTH = 12


def eval_model(model_name):
  if model_name == TFTModel:
    model = model_name(input_chunk_length=12, output_chunk_length=1, n_epochs=1, add_encoders={"cyclic": {"future": ["month"]}}, pl_trainer_kwargs={"accelerator": "gpu","devices": [0]})
  else:
    model = model_name(input_chunk_length=INPUT_CHUNK_LENGTH, output_chunk_length=OUTPUT_CHUNK_LENGTH, n_epochs=N_EPOCHS, pl_trainer_kwargs={"accelerator": "gpu","devices": [0]})
  model_name = re.findall(r"'(.*?)'", str(model_name))[0].split(".")[-1]
  print(f"\n{model_name} training started\n")
  model.fit(train_series_scaled, past_covariates=covariates_scaled)
  forecast = model.predict(series=train_series, past_covariates=covariates_scaled, n=PREDICTION_LENGTH)
  forecast_inverse = target_scaler.inverse_transform(forecast, n_jobs=-1)
  mape_metric = mape(validation_series, forecast_inverse, inter_reduction=np.mean, n_jobs=-1)
  rmse_metric = rmse(validation_series, forecast_inverse, inter_reduction=np.mean, n_jobs=-1)
  print(f"\n{model_name} training completed\n")
  output = {"Model": model_name, "MAPE": mape_metric, "RMSE": rmse_metric}
  print(output)
  return output


In [10]:
result_metrics = []

for model in [NBEATSModel, NHiTSModel, TransformerModel, TFTModel]:
    metrics = eval_model(model)
    result_metrics.append(metrics)


NBEATSModel training started



INFO:pytorch_lightning.utilities.rank_zero:GPU available: True (cuda), used: True
INFO:pytorch_lightning.utilities.rank_zero:TPU available: False, using: 0 TPU cores
INFO:pytorch_lightning.utilities.rank_zero:IPU available: False, using: 0 IPUs
INFO:pytorch_lightning.utilities.rank_zero:HPU available: False, using: 0 HPUs
INFO:pytorch_lightning.accelerators.cuda:LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
INFO:pytorch_lightning.callbacks.model_summary:
  | Name          | Type             | Params
---------------------------------------------------
0 | criterion     | MSELoss          | 0     
1 | train_metrics | MetricCollection | 0     
2 | val_metrics   | MetricCollection | 0     
3 | stacks        | ModuleList       | 6.3 M 
---------------------------------------------------
6.3 M     Trainable params
1.5 K     Non-trainable params
6.3 M     Total params
25.182    Total estimated model params size (MB)


Training: 0it [00:00, ?it/s]

INFO:pytorch_lightning.utilities.rank_zero:`Trainer.fit` stopped: `max_epochs=50` reached.
INFO:pytorch_lightning.utilities.rank_zero:GPU available: True (cuda), used: True
INFO:pytorch_lightning.utilities.rank_zero:TPU available: False, using: 0 TPU cores
INFO:pytorch_lightning.utilities.rank_zero:IPU available: False, using: 0 IPUs
INFO:pytorch_lightning.utilities.rank_zero:HPU available: False, using: 0 HPUs
INFO:pytorch_lightning.accelerators.cuda:LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Predicting: 0it [00:00, ?it/s]

INFO:pytorch_lightning.utilities.rank_zero:GPU available: True (cuda), used: True
INFO:pytorch_lightning.utilities.rank_zero:TPU available: False, using: 0 TPU cores
INFO:pytorch_lightning.utilities.rank_zero:IPU available: False, using: 0 IPUs
INFO:pytorch_lightning.utilities.rank_zero:HPU available: False, using: 0 HPUs
INFO:pytorch_lightning.accelerators.cuda:LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
INFO:pytorch_lightning.callbacks.model_summary:
  | Name          | Type             | Params
---------------------------------------------------
0 | criterion     | MSELoss          | 0     
1 | train_metrics | MetricCollection | 0     
2 | val_metrics   | MetricCollection | 0     
3 | stacks        | ModuleList       | 881 K 
---------------------------------------------------
863 K     Trainable params
18.5 K    Non-trainable params
881 K     Total params
3.527     Total estimated model params size (MB)



NBEATSModel training completed

{'Model': 'NBEATSModel', 'MAPE': 3894.783881980342, 'RMSE': 447.05551660534684}

NHiTSModel training started



Training: 0it [00:00, ?it/s]

INFO:pytorch_lightning.utilities.rank_zero:`Trainer.fit` stopped: `max_epochs=50` reached.
INFO:pytorch_lightning.utilities.rank_zero:GPU available: True (cuda), used: True
INFO:pytorch_lightning.utilities.rank_zero:TPU available: False, using: 0 TPU cores
INFO:pytorch_lightning.utilities.rank_zero:IPU available: False, using: 0 IPUs
INFO:pytorch_lightning.utilities.rank_zero:HPU available: False, using: 0 HPUs
INFO:pytorch_lightning.accelerators.cuda:LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Predicting: 0it [00:00, ?it/s]

INFO:pytorch_lightning.utilities.rank_zero:GPU available: True (cuda), used: True
INFO:pytorch_lightning.utilities.rank_zero:TPU available: False, using: 0 TPU cores
INFO:pytorch_lightning.utilities.rank_zero:IPU available: False, using: 0 IPUs
INFO:pytorch_lightning.utilities.rank_zero:HPU available: False, using: 0 HPUs
INFO:pytorch_lightning.accelerators.cuda:LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
INFO:pytorch_lightning.callbacks.model_summary:
  | Name                | Type                | Params
------------------------------------------------------------
0 | criterion           | MSELoss             | 0     
1 | train_metrics       | MetricCollection    | 0     
2 | val_metrics         | MetricCollection    | 0     
3 | encoder             | Linear              | 256   
4 | positional_encoding | _PositionalEncoding | 0     
5 | transformer         | Transformer         | 548 K 
6 | decoder             | Linear              | 780   
--------------------------------------------


NHiTSModel training completed

{'Model': 'NHiTSModel', 'MAPE': 5791.892015394546, 'RMSE': 699.8061683221412}

TransformerModel training started



Training: 0it [00:00, ?it/s]

INFO:pytorch_lightning.utilities.rank_zero:`Trainer.fit` stopped: `max_epochs=50` reached.
INFO:pytorch_lightning.utilities.rank_zero:GPU available: True (cuda), used: True
INFO:pytorch_lightning.utilities.rank_zero:TPU available: False, using: 0 TPU cores
INFO:pytorch_lightning.utilities.rank_zero:IPU available: False, using: 0 IPUs
INFO:pytorch_lightning.utilities.rank_zero:HPU available: False, using: 0 HPUs
INFO:pytorch_lightning.accelerators.cuda:LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Predicting: 0it [00:00, ?it/s]


TransformerModel training completed

{'Model': 'TransformerModel', 'MAPE': 1198.3637678197156, 'RMSE': 49.69864888493327}

TFTModel training started



INFO:pytorch_lightning.utilities.rank_zero:GPU available: True (cuda), used: True
INFO:pytorch_lightning.utilities.rank_zero:TPU available: False, using: 0 TPU cores
INFO:pytorch_lightning.utilities.rank_zero:IPU available: False, using: 0 IPUs
INFO:pytorch_lightning.utilities.rank_zero:HPU available: False, using: 0 HPUs
INFO:pytorch_lightning.accelerators.cuda:LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
INFO:pytorch_lightning.callbacks.model_summary:
   | Name                              | Type                             | Params
----------------------------------------------------------------------------------------
0  | train_metrics                     | MetricCollection                 | 0     
1  | val_metrics                       | MetricCollection                 | 0     
2  | input_embeddings                  | _MultiEmbedding                  | 0     
3  | static_covariates_vsn             | _VariableSelectionNetwork        | 0     
4  | encoder_vsn                       | 

Training: 0it [00:00, ?it/s]

INFO:pytorch_lightning.utilities.rank_zero:`Trainer.fit` stopped: `max_epochs=1` reached.
INFO:pytorch_lightning.utilities.rank_zero:GPU available: True (cuda), used: True
INFO:pytorch_lightning.utilities.rank_zero:TPU available: False, using: 0 TPU cores
INFO:pytorch_lightning.utilities.rank_zero:IPU available: False, using: 0 IPUs
INFO:pytorch_lightning.utilities.rank_zero:HPU available: False, using: 0 HPUs
INFO:pytorch_lightning.accelerators.cuda:LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Predicting: 0it [00:00, ?it/s]


TFTModel training completed

{'Model': 'TFTModel', 'MAPE': 673.7083422351322, 'RMSE': 29.822771334213748}


In [11]:
pd.DataFrame(result_metrics)

Unnamed: 0,Model,MAPE,RMSE
0,NBEATSModel,3894.783882,447.055517
1,NHiTSModel,5791.892015,699.806168
2,TransformerModel,1198.363768,49.698649
3,TFTModel,673.708342,29.822771


In [14]:
INPUT_CHUNK_LENGTH = 12
OUTPUT_CHUNK_LENGTH = 1
N_EPOCHS = 50
PREDICTION_LENGTH = 12


def eval_model(model_name):
  if model_name == TFTModel:
    model = model_name(input_chunk_length=12, output_chunk_length=1, n_epochs=1, add_encoders={"cyclic": {"future": ["month"]}}, pl_trainer_kwargs={"accelerator": "gpu","devices": [0]})
  else:
    model = model_name(input_chunk_length=INPUT_CHUNK_LENGTH, output_chunk_length=OUTPUT_CHUNK_LENGTH, n_epochs=N_EPOCHS, pl_trainer_kwargs={"accelerator": "gpu","devices": [0]})
  model_name = re.findall(r"'(.*?)'", str(model_name))[0].split(".")[-1]
  print(f"\n{model_name} training started\n")
  model.fit(train_series_scaled, past_covariates=covariates_scaled)
  forecast = model.predict(series=train_series, past_covariates=covariates_scaled, n=PREDICTION_LENGTH)
  forecast_inverse = target_scaler.inverse_transform(forecast, n_jobs=-1)
  mape_metric = mape(validation_series, forecast_inverse, inter_reduction=np.mean, n_jobs=-1)
  rmse_metric = rmse(validation_series, forecast_inverse, inter_reduction=np.mean, n_jobs=-1)
  print(f"\n{model_name} training completed\n")
  output = {"Model": model_name, "MAPE": mape_metric, "RMSE": rmse_metric}
  print(output)
  return output


In [15]:
result_metrics = []

for model in [NBEATSModel, NHiTSModel, TransformerModel, TFTModel]:
    metrics = eval_model(model)
    result_metrics.append(metrics)

INFO:pytorch_lightning.utilities.rank_zero:GPU available: True (cuda), used: True
INFO:pytorch_lightning.utilities.rank_zero:TPU available: False, using: 0 TPU cores
INFO:pytorch_lightning.utilities.rank_zero:IPU available: False, using: 0 IPUs
INFO:pytorch_lightning.utilities.rank_zero:HPU available: False, using: 0 HPUs
INFO:pytorch_lightning.accelerators.cuda:LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
INFO:pytorch_lightning.callbacks.model_summary:
  | Name          | Type             | Params
---------------------------------------------------
0 | criterion     | MSELoss          | 0     
1 | train_metrics | MetricCollection | 0     
2 | val_metrics   | MetricCollection | 0     
3 | stacks        | ModuleList       | 6.3 M 
---------------------------------------------------
6.3 M     Trainable params
1.5 K     Non-trainable params
6.3 M     Total params
25.158    Total estimated model params size (MB)



NBEATSModel training started



Training: 0it [00:00, ?it/s]

INFO:pytorch_lightning.utilities.rank_zero:`Trainer.fit` stopped: `max_epochs=50` reached.
INFO:pytorch_lightning.utilities.rank_zero:GPU available: True (cuda), used: True
INFO:pytorch_lightning.utilities.rank_zero:TPU available: False, using: 0 TPU cores
INFO:pytorch_lightning.utilities.rank_zero:IPU available: False, using: 0 IPUs
INFO:pytorch_lightning.utilities.rank_zero:HPU available: False, using: 0 HPUs
INFO:pytorch_lightning.accelerators.cuda:LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Predicting: 0it [00:00, ?it/s]

INFO:pytorch_lightning.utilities.rank_zero:GPU available: True (cuda), used: True
INFO:pytorch_lightning.utilities.rank_zero:TPU available: False, using: 0 TPU cores
INFO:pytorch_lightning.utilities.rank_zero:IPU available: False, using: 0 IPUs
INFO:pytorch_lightning.utilities.rank_zero:HPU available: False, using: 0 HPUs
INFO:pytorch_lightning.accelerators.cuda:LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
INFO:pytorch_lightning.callbacks.model_summary:
  | Name          | Type             | Params
---------------------------------------------------
0 | criterion     | MSELoss          | 0     
1 | train_metrics | MetricCollection | 0     
2 | val_metrics   | MetricCollection | 0     
3 | stacks        | ModuleList       | 880 K 
---------------------------------------------------
861 K     Trainable params
18.5 K    Non-trainable params
880 K     Total params
3.521     Total estimated model params size (MB)



NBEATSModel training completed

{'Model': 'NBEATSModel', 'MAPE': 1631.0576759641897, 'RMSE': 154.41073948783165}

NHiTSModel training started



Training: 0it [00:00, ?it/s]

INFO:pytorch_lightning.utilities.rank_zero:`Trainer.fit` stopped: `max_epochs=50` reached.
INFO:pytorch_lightning.utilities.rank_zero:GPU available: True (cuda), used: True
INFO:pytorch_lightning.utilities.rank_zero:TPU available: False, using: 0 TPU cores
INFO:pytorch_lightning.utilities.rank_zero:IPU available: False, using: 0 IPUs
INFO:pytorch_lightning.utilities.rank_zero:HPU available: False, using: 0 HPUs
INFO:pytorch_lightning.accelerators.cuda:LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Predicting: 0it [00:00, ?it/s]

INFO:pytorch_lightning.utilities.rank_zero:GPU available: True (cuda), used: True
INFO:pytorch_lightning.utilities.rank_zero:TPU available: False, using: 0 TPU cores
INFO:pytorch_lightning.utilities.rank_zero:IPU available: False, using: 0 IPUs
INFO:pytorch_lightning.utilities.rank_zero:HPU available: False, using: 0 HPUs
INFO:pytorch_lightning.accelerators.cuda:LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
INFO:pytorch_lightning.callbacks.model_summary:
  | Name                | Type                | Params
------------------------------------------------------------
0 | criterion           | MSELoss             | 0     
1 | train_metrics       | MetricCollection    | 0     
2 | val_metrics         | MetricCollection    | 0     
3 | encoder             | Linear              | 256   
4 | positional_encoding | _PositionalEncoding | 0     
5 | transformer         | Transformer         | 548 K 
6 | decoder             | Linear              | 65    
--------------------------------------------


NHiTSModel training completed

{'Model': 'NHiTSModel', 'MAPE': 3204.7343295541923, 'RMSE': 274.91182670476263}

TransformerModel training started



Training: 0it [00:00, ?it/s]

INFO:pytorch_lightning.utilities.rank_zero:`Trainer.fit` stopped: `max_epochs=50` reached.
INFO:pytorch_lightning.utilities.rank_zero:GPU available: True (cuda), used: True
INFO:pytorch_lightning.utilities.rank_zero:TPU available: False, using: 0 TPU cores
INFO:pytorch_lightning.utilities.rank_zero:IPU available: False, using: 0 IPUs
INFO:pytorch_lightning.utilities.rank_zero:HPU available: False, using: 0 HPUs
INFO:pytorch_lightning.accelerators.cuda:LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Predicting: 0it [00:00, ?it/s]


TransformerModel training completed

{'Model': 'TransformerModel', 'MAPE': 862.7424192720205, 'RMSE': 49.084594808727495}

TFTModel training started



INFO:pytorch_lightning.utilities.rank_zero:GPU available: True (cuda), used: True
INFO:pytorch_lightning.utilities.rank_zero:TPU available: False, using: 0 TPU cores
INFO:pytorch_lightning.utilities.rank_zero:IPU available: False, using: 0 IPUs
INFO:pytorch_lightning.utilities.rank_zero:HPU available: False, using: 0 HPUs
INFO:pytorch_lightning.accelerators.cuda:LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
INFO:pytorch_lightning.callbacks.model_summary:
   | Name                              | Type                             | Params
----------------------------------------------------------------------------------------
0  | train_metrics                     | MetricCollection                 | 0     
1  | val_metrics                       | MetricCollection                 | 0     
2  | input_embeddings                  | _MultiEmbedding                  | 0     
3  | static_covariates_vsn             | _VariableSelectionNetwork        | 0     
4  | encoder_vsn                       | 

Training: 0it [00:00, ?it/s]

INFO:pytorch_lightning.utilities.rank_zero:`Trainer.fit` stopped: `max_epochs=1` reached.
INFO:pytorch_lightning.utilities.rank_zero:GPU available: True (cuda), used: True
INFO:pytorch_lightning.utilities.rank_zero:TPU available: False, using: 0 TPU cores
INFO:pytorch_lightning.utilities.rank_zero:IPU available: False, using: 0 IPUs
INFO:pytorch_lightning.utilities.rank_zero:HPU available: False, using: 0 HPUs
INFO:pytorch_lightning.accelerators.cuda:LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Predicting: 0it [00:00, ?it/s]


TFTModel training completed

{'Model': 'TFTModel', 'MAPE': 422.0986961743866, 'RMSE': 20.988074529396435}


In [16]:
pd.DataFrame(result_metrics)

Unnamed: 0,Model,MAPE,RMSE
0,NBEATSModel,1631.057676,154.410739
1,NHiTSModel,3204.73433,274.911827
2,TransformerModel,862.742419,49.084595
3,TFTModel,422.098696,20.988075
