## **Running the models using the 'modelling' package**

For running the models, we take the following steps:

> * Prepare packages, setup, data
> * Load model
> * Define hyperparameters
> * Train the model
> * Evaluate the model

Throughout the notebook, print statements mark progress so that errors occurring on Habrok are easier to locate.

#### **Prepare packages, setup, data**

In [None]:
print("Starting script...")

print("Importing modelling package...")
from modelling import *
from modelling import GRU
from modelling import HGRU

#! Check which imports can be removed!
print("Importing libs...")
import os
import datetime
import numpy as np
import pandas as pd
import torch as tc
import torch.nn as nn
from torch.optim import Adam
from torch.optim.lr_scheduler import ReduceLROnPlateau
from torch.utils.data import (
    DataLoader,
    SubsetRandomSampler,
    SequentialSampler
)
import seaborn as sns
import matplotlib as mpl
import matplotlib.pyplot as plt


Starting script...
Importing modelling package...

Running __init__.py for data pipeline
Modelling package initialized

Importing libs...


Use GPU when available

In [2]:
use_cuda = tc.cuda.is_available()
device = tc.device("cuda" if use_cuda else "cpu")
print("Device: ", device)

Device:  cpu


"Global" variables

In [None]:
HABROK = bool(0)                       # True if running on Habrok or external server
if HABROK:
    print("Successfully imported libraries into env")
    USER = 'habrok'
else:
    USER = 'tinus'

if USER == 'tinus':
    os.chdir(r"c:\Users\vwold\Documents\Bachelor\ICML_paper\forecasting_smog_DL\forecasting_smog_DL\src")
    MODEL_PATH = os.path.join(os.getcwd(), "results/models") #! TODO get these two out and to somewhere else, or gone in general
    MINMAX_PATH = "../data/data_combined/contaminant_minmax.csv"
elif HABROK:
    os.chdir(r"/home1/s4372948/thesis/modelling/")
    MODEL_PATH = os.path.join(os.getcwd(), "results/models")
    MINMAX_PATH = "../data/data_combined/contaminant_minmax.csv"

tc.manual_seed(34)
mpl.rcParams['figure.figsize'] = (7, 3)

N_HOURS_U = 72
N_HOURS_Y = 24
N_HOURS_STEP = 24
CONTAMINANTS = ['NO2', 'O3', 'PM10', 'PM25']
COMPONENTS = ['NO2', 'O3', 'PM10', 'PM25', 'SQ', 'WD', 'Wvh', 'dewP', 'p', 'temp']

Load in data and create PyTorch *Datasets*

In [4]:
# Load in data and create PyTorch Datasets. To tune
# which exact .csv files get extracted, change the
# lists in the get_dataframes() definition

train_input_frames = get_dataframes('train', 'u')
train_output_frames = get_dataframes('train', 'y')

val_input_frames = get_dataframes('val', 'u')
val_output_frames = get_dataframes('val', 'y')

test_input_frames = get_dataframes('test', 'u')
test_output_frames = get_dataframes('test', 'y')

print("Successfully loaded data") if HABROK else None

In [5]:
train_dataset = TimeSeriesDataset(
    train_input_frames,  # list of input training dataframes
    train_output_frames, # list of output training dataframes
    5,                   # number of dataframes put in for both
                         # (basically len(train_input_frames) and
                         # len(train_output_frames) must be equal)
    N_HOURS_U,           # number of hours of input data
    N_HOURS_Y,           # number of hours of output data
    N_HOURS_STEP,        # number of hours between each input/output pair
)
val_dataset = TimeSeriesDataset(
    val_input_frames,    # etc.
    val_output_frames,
    3,
    N_HOURS_U,
    N_HOURS_Y,
    N_HOURS_STEP,
)
test_dataset = TimeSeriesDataset(
    test_input_frames,
    test_output_frames,
    3,
    N_HOURS_U,
    N_HOURS_Y,
    N_HOURS_STEP,
)
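
To make the windowing parameters concrete, the sketch below counts how many input/output pairs one continuous hourly series would yield. It assumes that each pair uses `N_HOURS_U` hours of input immediately followed by `N_HOURS_Y` hours of output, and that consecutive pairs start `N_HOURS_STEP` hours apart. The helpers `count_pairs` and `pair_bounds` are hypothetical and only illustrate the idea; the actual indexing inside `TimeSeriesDataset` may differ.

```python
def count_pairs(n_rows: int, n_hours_u: int, n_hours_y: int, n_hours_step: int) -> int:
    """Number of (input, output) windows that fit in n_rows hourly rows."""
    window = n_hours_u + n_hours_y          # rows consumed by one pair
    if n_rows < window:
        return 0
    return (n_rows - window) // n_hours_step + 1


def pair_bounds(idx: int, n_hours_u: int, n_hours_y: int, n_hours_step: int):
    """Half-open row ranges (start, stop) of the idx-th input and output window."""
    start = idx * n_hours_step
    u_range = (start, start + n_hours_u)
    y_range = (u_range[1], u_range[1] + n_hours_y)
    return u_range, y_range


# Hypothetical series of 30 days of hourly data (720 rows)
n_pairs = count_pairs(30 * 24, 72, 24, 24)
print(n_pairs)                               # 27
print(pair_bounds(0, 72, 24, 24))            # ((0, 72), (72, 96))
print(pair_bounds(n_pairs - 1, 72, 24, 24))  # ((624, 696), (696, 720))
```

With a step of 24 hours and an input length of 72 hours, consecutive input windows overlap by 48 hours under these assumptions, while the 24-hour output windows do not overlap.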

#### **Define model**

#### **Define evaluation and plotting functions**

#### **Define training functions**

#### **Hyperparameter tuning**

Define an ordinary grid search through the given hyperparameter space

Define hyperparameters and searchable distributions

In [11]:
# Here, all (hyper)parameters are defined. The hyperparameters are defined in
# a dictionary, which is then passed to the model and the training functions.
# The grid search is performed by generating all possible combinations of the
# hyperparameters defined in the hp_space dictionary, and then performing k-fold cross
# validation on each of these configurations. The best configuration is then returned.
# When the search is finished, comment out the hp_space dictionary and save the best found
# hyperparameters in the hp dictionary, and train the final model with these.

#! TODO add the new constructors of the Model classes to hp_space and also to the model initialisation

hp = {
    'n_hours_u' : N_HOURS_U,
    'n_hours_y' : N_HOURS_Y,

    'model_class' : HGRU,
    'input_units' : train_dataset.__n_features_in__(),
    'hidden_layers' : 4,
    'hidden_units' : 64,
    'branches' : 4,
    'output_units' : train_dataset.__n_features_out__(),

    'Optimizer' : Adam,
    'lr_shared' : 1e-3,
    'scheduler' : ReduceLROnPlateau,
    'scheduler_kwargs' : {'mode' : 'min', 'factor' : 0.1,
                          'patience' : 3, 'cooldown' : 8, 'verbose' : True},
    'w_decay' : 1e-7,
    'loss_fn' : nn.MSELoss(),

    'epochs' : 5000,
    'early_stopper' : EarlyStopper,
    'patience' : 20,
    'batch_sz' : 16,
    'k_folds' : 5,
}                                   # The lr for the branched layer(s) is calculated
                                    # based on the "power ratio" between the branched
                                    # part of the network and the shared layer, which
                                    # is *assumed* to be proportional to n_hidden_layers
hp['lr_branch'] = hp['lr_shared'] * hp['hidden_layers']

# hp_space = {
#     'hidden_layers' : [2, 3, 4, 5, 6, 7, 8],
#     'hidden_units' : [16, 32, 48, 64, 80, 96, 112, 128],
#     'w_decay' : [1e-4, 1e-5, 1e-6, 1e-7, 1e-8],
# }
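
As a rough illustration of what "generating all possible combinations" means for the commented-out `hp_space`, the sketch below enumerates the grid with `itertools.product`. The generator `iter_configs` is a hypothetical helper, not the `grid_search` function from the `modelling` package, whose internals are not shown here.

```python
from itertools import product

# Same search space as the commented-out hp_space above
hp_space = {
    'hidden_layers': [2, 3, 4, 5, 6, 7, 8],
    'hidden_units': [16, 32, 48, 64, 80, 96, 112, 128],
    'w_decay': [1e-4, 1e-5, 1e-6, 1e-7, 1e-8],
}


def iter_configs(space):
    """Yield one dict per point in the Cartesian product of the value lists."""
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))


configs = list(iter_configs(hp_space))
print(len(configs))   # 7 * 8 * 5 = 280
print(configs[0])     # {'hidden_layers': 2, 'hidden_units': 16, 'w_decay': 0.0001}
```

If every configuration is evaluated with `k_folds = 5`, this grid amounts to 280 * 5 = 1400 training runs, which is why the search is meant to run on Habrok rather than locally.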

#### **Training the model**

Train the model by performing a grid search and saving the optimally configured model

In [12]:
print("Starting training now") if HABROK else None

In [None]:
current_time = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
stdout_location = f'results/grid_search_exe_s/exe_of_HGRU_at_{current_time}.txt'
# train_dataset_full = ConcatDataset([train_dataset, val_dataset])
#                                     If HABROK, print to external file, else print to stdout
# with PrintManager(stdout_location, 'a', HABROK):
#     print(f"Grid search execution of HGRU at {current_time}\n")
#                                     # Train on the full training set
#     model, best_hp, val_loss = grid_search(hp, hp_space, train_dataset_full, True)
#                                     # Externally save the best model
#     tc.save(model.state_dict(), f"{MODEL_PATH}/model_HGRU.pth")

#     hp = update_dict(hp, best_hp)   # Update the hp dictionary with the best hyperparameters
#     print_dict_vertically(best_hp)

Lay out model architecture with optimal hyperparameters

In [14]:
with PrintManager(stdout_location, 'a', HABROK):
    print("\nPrinting model:")
    model = HGRU(hp['n_hours_u'],
                 hp['n_hours_y'],
                 hp['input_units'],
                 hp['hidden_layers'],
                 hp['hidden_units'], 
                 hp['branches'],
                 hp['output_units'])
    print(model)


Printing model:
HGRU(
  (input_layer): GRU(10, 64, batch_first=True)
  (shared_layer): GRU(64, 64, batch_first=True)
  (branches): ModuleList(
    (0-3): 4 x Branch(
      (layers): ModuleList(
        (0): GRU(64, 16, batch_first=True)
        (1): GRU(16, 16, num_layers=3, batch_first=True)
        (2): Linear(in_features=16, out_features=1, bias=True)
      )
    )
  )
)


Train model on complete training dataset (= train + validation)

In [None]:
train_loader = DataLoader(train_dataset, batch_size = hp['batch_sz'], shuffle = True)
val_loader = DataLoader(val_dataset, batch_size = hp['batch_sz'], shuffle = False) 
                                            
                                        # Train the final model on the full training set,
                                        # save the final model, and save the losses for plotting
with PrintManager(stdout_location, 'a', HABROK):
    print("\nTraining on full training set...")
    model_final, train_losses, test_losses, shared_losses, branch_losses = \
        train_hierarchical(hp, train_loader, val_loader, True)
    tc.save(model_final.state_dict(), f'{MODEL_PATH}/model_HGRU.pth')

df_losses = pd.DataFrame({'L_train': train_losses, 'L_test': test_losses})
df_losses.to_csv(f'{os.path.join(os.getcwd(), "results/final_losses")}/losses_HGRU_at_{current_time}.csv', 
                 sep = ';', decimal = '.', encoding = 'utf-8')


Training on full training set...
An exception of type <class 'KeyboardInterrupt'> occurred with value 
Traceback: <traceback object at 0x00000184E38F4C00>


KeyboardInterrupt: 
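
With `epochs` set to 5000, training is presumably ended in practice by the early stopper configured through `hp['early_stopper']` and `hp['patience']`. The sketch below shows one common patience-based rule: stop once the validation loss has not improved for `patience` consecutive epochs. The class `PatienceStopper` is hypothetical and is not the package's `EarlyStopper`, whose interface is not shown in this notebook.

```python
class PatienceStopper:
    """Signal a stop after `patience` consecutive epochs without improvement."""

    def __init__(self, patience: int, min_delta: float = 0.0):
        self.patience = patience
        self.min_delta = min_delta      # minimum decrease that counts as improvement
        self.best = float('inf')
        self.bad_epochs = 0

    def should_stop(self, val_loss: float) -> bool:
        if val_loss < self.best - self.min_delta:
            self.best = val_loss        # new best: reset the counter
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Made-up validation losses, for illustration only
losses = [1.0, 0.8, 0.81, 0.79, 0.80, 0.80, 0.80, 0.5]
stopper = PatienceStopper(patience=3)
stopped_at = next(
    (epoch for epoch, loss in enumerate(losses) if stopper.should_stop(loss)),
    None,
)
print(stopped_at)     # 6
print(stopper.best)   # 0.79
```

In this toy run the later value 0.5 is never reached, which illustrates the trade-off behind the patience setting: a larger value tolerates longer plateaus at the cost of more epochs.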

#### **Testing the model**

In [None]:
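# Use the same constructor arguments as when the model was built for training
model_final = HGRU(hp['n_hours_u'], hp['n_hours_y'],
                   hp['input_units'], hp['hidden_layers'], hp['hidden_units'],
                   hp['branches'], hp['output_units'])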
model_final.load_state_dict(tc.load(f"{MODEL_PATH}/model_HGRU.pth"))
print(model_final)

FileNotFoundError: [Errno 2] No such file or directory: 'c:\\Users\\vwold\\Documents\\Bachelor\\ICML_paper\\forecasting_smog_DL\\forecasting_smog_DL\\src\\models\\model_MBGRU.pth'

In [None]:
test_loader = DataLoader(test_dataset, batch_size = hp['batch_sz'], shuffle = False) 
test_error = test_hierarchical(model_final, nn.MSELoss(), test_loader)

with PrintManager(stdout_location, 'a', HABROK):
    print()
    print("Testing MSE:", test_error)

In [None]:
print(test_hierarchical(model_final, nn.MSELoss(), train_loader))
print(test_hierarchical(model_final, nn.MSELoss(), val_loader))
print(test_hierarchical(model_final, nn.MSELoss(), test_loader))

print("\nMSE Training set:")
print_dict_vertically(
    test_hierarchical_separately(model_final, nn.MSELoss(), train_loader, True, MINMAX_PATH)
)
print("\nMSE Validation set:")
print_dict_vertically(
    test_hierarchical_separately(model_final, nn.MSELoss(), val_loader, True, MINMAX_PATH)
)
print("\nMSE Test set:")
print_dict_vertically(
    test_hierarchical_separately(model_final, nn.MSELoss(), test_loader, True, MINMAX_PATH)
)

In [None]:
print("\nRMSE Training set:")
print_dict_vertically_root(
    test_hierarchical_separately(model_final, nn.MSELoss(), train_loader, True, MINMAX_PATH)
)
print("\nRMSE Validation set:")
print_dict_vertically_root(
    test_hierarchical_separately(model_final, nn.MSELoss(), val_loader, True, MINMAX_PATH)
)
print("\nRMSE Test set:")
print_dict_vertically_root(
    test_hierarchical_separately(model_final, nn.MSELoss(), test_loader, True, MINMAX_PATH)
)
np.sqrt(test_hierarchical(model_final, nn.MSELoss(), test_loader, True, MINMAX_PATH))
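
The RMSE values above are the square roots of the per-contaminant MSE values. The sketch below shows that conversion on a dictionary, and how an RMSE computed on min-max normalised data maps back to the original units. It assumes the normalisation x_norm = (x - min) / (max - min); whether `test_hierarchical_separately` denormalises in exactly this way is not shown in this notebook. The helpers `rmse_from_mse` and `rescale_rmse` and all numbers are hypothetical.

```python
import math


def rmse_from_mse(mse_by_contaminant):
    """Take the square root of every per-contaminant MSE."""
    return {name: math.sqrt(mse) for name, mse in mse_by_contaminant.items()}


def rescale_rmse(rmse_norm: float, vmin: float, vmax: float) -> float:
    """Map an RMSE on min-max normalised values back to original units.

    Errors are differences, so the offset vmin cancels and only the
    scale factor (vmax - vmin) remains.
    """
    return rmse_norm * (vmax - vmin)


# Made-up values, for illustration only
example_mse = {'NO2': 0.04, 'O3': 0.09}
example_rmse = rmse_from_mse(example_mse)
print(example_rmse)

# Hypothetical range of 0 to 150 in the original units
print(rescale_rmse(example_rmse['NO2'], 0.0, 150.0))
```

Because the square root is taken per contaminant, the RMSE of the whole test set is generally not the mean of the per-contaminant RMSE values; it is the square root of the overall MSE, as in the final `np.sqrt(...)` call of the cell above.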

In [None]:
pair = 5
plot_pred_vs_gt(model_final, test_dataset, pair, 'NO2', N_HOURS_Y)
plot_pred_vs_gt(model_final, test_dataset, pair, 'O3', N_HOURS_Y)
plot_pred_vs_gt(model_final, test_dataset, pair, 'PM10', N_HOURS_Y)
plot_pred_vs_gt(model_final, test_dataset, pair, 'PM25', N_HOURS_Y)