# Neural Networks project
## Short-Term Load Forecasting using Bi-directional Sequential Models and Feature Engineering for Small Datasets


**Name**: *\<Ali Nosouhi Dehnavi\>*

**Matricola**: *\<1950716\>*



In [1]:
!pip install torchmetrics

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting torchmetrics
  Downloading torchmetrics-0.11.0-py3-none-any.whl (512 kB)
Installing collected packages: torchmetrics
Successfully installed torchmetrics-0.11.0


In [2]:
import os
import pandas as pd
import numpy as np
from sklearn.preprocessing import OneHotEncoder
import torch
from torch import nn, optim, as_tensor
from torch.utils.data import Dataset, DataLoader,Subset
import torch.nn.functional as F
from torch.nn.init import *
import tensorflow as tf
from torch.autograd import Variable
from matplotlib.dates import DateFormatter
import matplotlib
matplotlib.use('agg')
import datetime
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import tqdm
plt.style.use('ggplot')
from torchmetrics import MeanAbsolutePercentageError 


# Objective

The purpose of this project is to forecast future load consumption given a dataset of load consumption history. Because electricity consumption is volatile and depends on many parameters, such as time of day, holidays, and weather, the prediction is not an easy task. One effective idea is to extract hand-crafted features and use them alongside the raw data, which led to better accuracy than using the raw data alone. In the paper, a new architecture consisting of two parallel paths of recurrent neural networks is proposed.

 <figure>
 <center>
 <img src='https://i.postimg.cc/8cSVDg1W/Architechure.png' width="1000" 
      height="500"/>
 <figcaption>Proposed architecture: two parallel recurrent layers</figcaption></center>
 </figure>




# Data pre-processing 



1. First, we read a CSV file from the PERCON dataset and convert it from 1-minute to 30-minute intervals by averaging 30 successive samples.
2. Since MAPE (Mean Absolute Percentage Error), used in the evaluation phase, is sensitive to close-to-zero values, we add a small offset of 0.1 kW to avoid unrealistic results.
3. We add extra columns for the basic features **I** in [0, 47], **D** in [0, 6], **H** in {0, 1} and the derived features **Avg**, **Std**, **Avg_w**, **Std_w**, which are fed to the two branches of the network in parallel.
4. For simplicity, we take **window=2** for the experiment (this choice resulted in better accuracy in the paper).
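
Steps 1 to 3 can also be sketched with pandas resampling instead of an explicit loop. This is a minimal sketch, not the notebook's implementation: the helper name `to_half_hourly` and the synthetic one-day frame are assumptions made for illustration, and it assumes the `Date_Time` and `Usage_kW` columns used in the cell that follows.

```python
import numpy as np
import pandas as pd

def to_half_hourly(df: pd.DataFrame, offset_kw: float = 0.1) -> pd.DataFrame:
    """Average 1-minute readings into 30-minute bins and add the basic features."""
    out = (
        df.set_index("Date_Time")["Usage_kW"]
        .add(offset_kw)          # small offset keeps MAPE well-behaved near zero
        .resample("30min")
        .mean()
        .rename("E")
        .to_frame()
    )
    out["I"] = out.index.hour * 2 + out.index.minute // 30  # slot of the day, 0..47
    out["D"] = out.index.dayofweek                          # Monday=0 .. Sunday=6
    out["H"] = (out["D"] > 4).astype(int)                   # 1 on Saturday and Sunday
    return out.reset_index()

# Synthetic example (hypothetical data): one Saturday with a constant 1 kW load.
minutes = pd.date_range("2019-04-27", periods=24 * 60, freq="min")
demo = pd.DataFrame({"Date_Time": minutes, "Usage_kW": np.ones(len(minutes))})
half_hourly = to_half_hourly(demo)
print(half_hourly.head())
```

When every minute is present, the resampled bins hold the same 30 samples that the loop-based version averages; with gaps in the data the two approaches can differ.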








In [3]:
from google.colab import drive
drive.mount('/content/drive') 

Mounted at /content/drive


In [4]:
src_dir ='/content/drive/My Drive/NN project/data/sampled'  # source directory
dest_dir='/content/drive/My Drive/NN project/data/processed'  # destination directory after preprocessing 
window=2
for file in os.listdir(src_dir):
  data = pd.read_csv(os.path.join(src_dir, file))
  data['Date_Time'] = pd.to_datetime(data['Date_Time'], infer_datetime_format = True) #from  READING_DATETIME column in dataset(PERCON) to date-time

  data['Usage_kW']=data['Usage_kW']+0.1   # add a small bias so MAPE is not dominated by near-zero readings
  data_30min=pd.DataFrame(0, index=np.arange(0,len(data)/30, dtype=int), columns=data.columns) # create a data frame with the same columns as the original data

  j=0
  for k in  range(0, len(data)):
    ts = data.loc[k, 'Date_Time']
     
    if (ts.minute == 0 or ts.minute == 30):
        data_30min.loc[j,'Usage_kW']=data.loc[k:k+29, 'Usage_kW'].mean()
        data_30min.loc[j,'Date_Time']=data.loc[k,'Date_Time']
        j=j+1
  
  #data=data_30min 
  # expand the data frame with new null features with column labels I D H Avg Std Avg_w Std_w
  data_30min['I'], data_30min['D'], data_30min['H'],data_30min['Avg'],data_30min['Std'],data_30min['Avg_w'],data_30min['Std_w'] = [0, 0, 0, 0, 0, 0, 0]

  # build an array with row dimension=the number of days  and columns dim=48(reading intervals per day) 
  arr = np.zeros((int(np.ceil(len(data_30min)/48)), 48))   
   

   #####################     Basic features     ################################

   # For each day, column 'I' runs from 0 to 47 (one slot per half hour).
   # If a reading is missing, the corresponding element arr[day, I] is left at 0.
   # On the first and last day, which do not cover 00:00 to 24:00 completely,
   # data_30min.loc[i, 'I'] is still derived from the timestamp (ts.hour, ts.minute).
  for i in  range(0, len(data_30min)): 
    ts = data_30min.loc[i, 'Date_Time']    
   
    if (ts.minute == 0):
        I = (ts.hour * 2) + 0
    elif (ts.minute == 30):
        I = (ts.hour * 2) + 1
        
    data_30min.loc[i, 'I'] = I
        
    # weekday in column 'D' runs from 0 (Monday) to 6 (Sunday)
    weekday = ts.weekday()
    data_30min.loc[i, 'D'] = weekday
    
    # Holidays column 'H' is set to 1 for saturdays(D=5) and sundays(D=6)
    if (weekday > 4):
        data_30min.loc[i, 'H'] = 1
        
#####################     Derived features     ################################
    #current day number  
    day = int(i/48)
    #compute the avg and std of the same time slot over the previous days (limited to window length [K in paper])
    if(day >= window):
      data_30min.loc[i, 'Avg'] = np.mean(arr[day-window:day, I])
      data_30min.loc[i, 'Std'] = np.std(arr[day-window:day, I])

    arr[day, I] = data_30min.loc[i, 'Usage_kW'] # store the 30-minute reading (not the 1-minute row) for this day and slot

    #compute avg and std over the current and preceding time steps (window length)
    if(i >= window - 1):
      window_values = data_30min.loc[i+1-window: i, 'Usage_kW'].values
      data_30min.loc[i, 'Avg_w'] = np.mean(window_values)
      data_30min.loc[i, 'Std_w'] = np.std(window_values)    
  
  
  data_30min.rename(columns={'Usage_kW': 'E'}, inplace=True)
  data_30min = data_30min[['Date_Time', 'E', 'I', 'D', 'H', 'Avg', 'Std', 'Avg_w', 'Std_w']]
  data=data_30min
  data.to_csv(os.path.join(dest_dir, file)) # write new data frame in destination directory
      

**After pre-processing, the new dataset contains the basic and derived features that are used in the proposed architecture**:


 <figure>
 <center>
 <img src='https://i.postimg.cc/433L0z0p/data-preprocessing.png' width="2000" 
      height="400"/>
 <figcaption>Data after pre-processing, with basic and derived features</figcaption></center>
 </figure>
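
The derived columns can also be computed with rolling windows. The sketch below is an illustration under stated assumptions rather than the notebook's code: `add_derived` is a hypothetical helper, the input is assumed to contain complete days with columns `E` and `I`, and rows without enough history are set to 0 as in the loop-based version.

```python
import numpy as np
import pandas as pd

def add_derived(df: pd.DataFrame, window: int = 2) -> pd.DataFrame:
    """Add Avg, Std, Avg_w and Std_w to a frame that has columns E and I."""
    out = df.copy()
    # Avg_w / Std_w: the current reading and the (window - 1) readings before it.
    recent = out["E"].rolling(window)
    out["Avg_w"] = recent.mean().fillna(0.0)
    out["Std_w"] = recent.std(ddof=0).fillna(0.0)
    # Avg / Std: the same slot I over the previous `window` days (current day excluded).
    by_slot = out.groupby("I")["E"]
    out["Avg"] = by_slot.transform(
        lambda s: s.shift(1).rolling(window).mean()).fillna(0.0)
    out["Std"] = by_slot.transform(
        lambda s: s.shift(1).rolling(window).std(ddof=0)).fillna(0.0)
    return out

# Hypothetical data: three full days with a constant load of 1, 2 and 3 kW.
demo = pd.DataFrame({
    "E": np.repeat([1.0, 2.0, 3.0], 48),
    "I": np.tile(np.arange(48), 3),
})
derived = add_derived(demo)
print(derived.iloc[[0, 48, 96]])
```

`ddof=0` is used so the standard deviation matches `np.std`, which is the population standard deviation.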




**Data frame to tensor**: the processed data frame is converted to a tensor so that it can be used by PyTorch models.

In [5]:
data_notime = data[[ 'E', 'I', 'D', 'H', 'Avg', 'Std', 'Avg_w', 'Std_w']]
data_ten=torch.tensor(data_notime.values,dtype=torch.float32)

# Data Generation 


Given the cleaned dataset from the previous section, we can now produce $x_{train}$, $x_{train-derived}$, $y_{train}$, $x_{test}$, $x_{test-derived}$, $y_{test}$.

1. First, we min-max normalize the reading column and the derived features.
2. Then we one-hot encode the basic features and build the sequential input windows for the training and test sets.
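
The two steps can be illustrated on a toy tensor. This is a minimal sketch: the helper names `min_max`, `encode_basic` and `sliding_windows` and the toy values are assumptions for illustration, and the number of classes is passed explicitly so the encoded width is always 1 + 7 + 2 + 48 = 58.

```python
import torch
import torch.nn.functional as F

def min_max(x: torch.Tensor) -> torch.Tensor:
    """Scale a 1-D tensor to the range [0, 1]."""
    return (x - x.min()) / (x.max() - x.min())

def encode_basic(E, I, D, H):
    """Return an (n, 58) matrix: scaled load, one-hot D (7), H (2) and I (48)."""
    return torch.cat(
        (
            min_max(E).reshape(-1, 1),
            F.one_hot(D.long(), num_classes=7).float(),
            F.one_hot(H.long(), num_classes=2).float(),
            F.one_hot(I.long(), num_classes=48).float(),
        ),
        dim=1,
    )

def sliding_windows(features: torch.Tensor, window: int):
    """Pair each block of `window` rows with the load of the row that follows it."""
    xs = torch.stack([features[i:i + window] for i in range(len(features) - window)])
    ys = features[window:, 0]
    return xs, ys

# Toy data (hypothetical values): six consecutive half-hour slots on a Sunday.
E = torch.tensor([0.5, 1.0, 1.5, 2.0, 2.5, 3.0])
I = torch.arange(6)
D = torch.full((6,), 6)
H = torch.ones(6)
basic = encode_basic(E, I, D, H)
x, y = sliding_windows(basic, window=2)
print(basic.shape, x.shape, y.shape)
```

Fixing `num_classes` keeps the feature width stable even if a split happens not to contain every slot or weekday.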



In [40]:
split=[0.9,0.1] #0.9 0.1

In [41]:
def get_data(data,data_type, window,split):
    
    mRange = int(len(data)*split[0])
    dtrain = data[:mRange]
    dtest = data[mRange:]
        
    if(data_type=='train'):
      data = dtrain

    else:
      data = dtest

    #Basic features
    E=data[:,0]
    I=data[:,1]
    D=data[:,2]
    H= data[:,3]

    E_minmax = [torch.min(E),torch.max(E)] 

    E=(E-torch.min(E))/(torch.max(E)-torch.min(E))
    E = E.reshape(-1, 1)

    #I=[0....47]: after one-hot encoding it is a binary array with one row per sample
    #and 48 columns (num_classes is fixed so the width does not depend on the values present)
    I=F.one_hot(I.long(), num_classes=48)
    #D=[0....6]: after one-hot encoding it is a binary array with one row per sample
    #and 7 columns
    D=F.one_hot(D.long(), num_classes=7)
    #H=[0,1]: after one-hot encoding it is a binary array with one row per sample
    #and 2 columns
    H=F.one_hot(H.long(), num_classes=2)
    #dim=1 concatenates along columns: 1 + 7 + 2 + 48 = 58 features
    basic_data = torch.cat((E, D, H, I),1)

    #Derived features
    # Avg and Std are cloned so the in-place shift below does not modify the caller's tensor
    Avg = data[:,4].clone()
    Std = data[:,5].clone()
    Avg_w = data[:,6]
    Std_w = data[:,7]
   
    # Shift Avg and Std back by one step: row t then holds the statistics computed for
    # step t+1, so the last row of each input window carries the same-time-of-day
    # statistics of the step being predicted.
    Avg[:-1] = Avg[1:].clone()
    Std[:-1] = Std[1:].clone()
    
    
    Avg=(Avg-torch.min(Avg))/(torch.max(Avg)-torch.min(Avg))
    Avg = Avg.reshape(-1, 1)

    Std=(Std-torch.min(Std))/(torch.max(Std)-torch.min(Std))
    Std = Std.reshape(-1, 1)

    Avg_w=(Avg_w-torch.min(Avg_w))/(torch.max(Avg_w)-torch.min(Avg_w))
    Avg_w = Avg_w.reshape(-1, 1)

    Std_w=(Std_w-torch.min(Std_w))/(torch.max(Std_w)-torch.min(Std_w))
    Std_w = Std_w.reshape(-1, 1)


    derived_feature = torch.cat((E, Avg_w, Std_w, Avg, Std),1)
    
    seq_len = window + 1
    seq_basic = []
    seq_derived = []

    for i in range(len(data) - seq_len):
      seq_basic.append(basic_data[i: i + seq_len])
      seq_derived.append(derived_feature[i: i + window])
   
    seq_basic=torch.stack(seq_basic) #(n, 3, 58) 
    seq_derived=torch.stack(seq_derived) #(10782, 2, 5)

    x_data = seq_basic[:, :-1] #(n, 2, 58)
    y_data = seq_basic[:, -1][:, 0] #(n,1) reading values E

    
    y_data = (y_data* (E_minmax[1] - E_minmax[0])) + E_minmax[0]  
    y_data=y_data.reshape(-1,1)
       
    return x_data, y_data, seq_derived   #x_data, y_data, seq_derived



# Model

The model is defined in PyTorch.
For each experiment type, 'basic' and 'derived' (the proposed method),
six recurrent models are defined:
**LSTM, GRU, RNN, BLSTM, BGRU, BRNN**
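
To make the tensor shapes of the two-branch ('derived') design concrete, here is a minimal sketch for the bidirectional GRU case. The class `TwoBranchGRU` is a hypothetical simplification, not the notebook's `recmodel`; it assumes inputs of shape (batch, window, features) and therefore sets `batch_first=True` (PyTorch recurrent layers expect (seq_len, batch, features) otherwise).

```python
import torch
from torch import nn

class TwoBranchGRU(nn.Module):
    """Hypothetical minimal two-branch model used only to illustrate shapes."""

    def __init__(self, basic_dim=58, derived_dim=5, hidden_dim=20, bidirectional=True):
        super().__init__()
        directions = 2 if bidirectional else 1
        self.basic_rnn = nn.GRU(basic_dim, hidden_dim, num_layers=2,
                                batch_first=True, bidirectional=bidirectional)
        self.derived_rnn = nn.GRU(derived_dim, hidden_dim, num_layers=2,
                                  batch_first=True, bidirectional=bidirectional)
        # Both branches contribute directions * hidden_dim features.
        self.head = nn.Sequential(
            nn.Linear(2 * directions * hidden_dim, 20),
            nn.ReLU(),
            nn.Linear(20, 1),
        )

    def forward(self, xb, xd):
        out_b, _ = self.basic_rnn(xb)      # (batch, window, directions * hidden_dim)
        out_d, _ = self.derived_rnn(xd)    # (batch, window, directions * hidden_dim)
        # Keep the last time step of each branch and concatenate them.
        last = torch.cat((out_b[:, -1, :], out_d[:, -1, :]), dim=1)
        return self.head(last)             # (batch, 1)

model = TwoBranchGRU()
pred = model(torch.randn(32, 2, 58), torch.randn(32, 2, 5))
print(pred.shape)
```

With bidirectional layers each branch outputs 2 x 20 features, which is why the fully connected layer after concatenation takes `hidden_dim*4` inputs.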


In [8]:
class recmodel(nn.Module):

    def __init__(self, arcitecture_name, model_name):
        super(recmodel, self).__init__()
        self.arcitecture_name=arcitecture_name
        self.input_dim1=58
        self.input_dim2=5
        self.hidden_dim =20
        self.hn1 = Variable(torch.randn(2, 2, 20))
        self.hn2 = Variable(torch.randn(2, 2, 20))
        
        
        if arcitecture_name=='basic':
          if model_name == "LSTM":  
            self.lstm = nn.LSTM(input_size=self.input_dim1, hidden_size=self.hidden_dim,num_layers=2, batch_first=False)
            self.fc1 =  nn.Linear(self.hidden_dim, 20)
             
          elif model_name == "RNN":
            self.lstm = nn.RNN(input_size=self.input_dim1, hidden_size=self.hidden_dim,num_layers=2,batch_first=False)
            self.fc1 =  nn.Linear(self.hidden_dim, 20)
            
          elif model_name == "GRU":
            self.lstm = nn.GRU(input_size=self.input_dim1, hidden_size=self.hidden_dim,num_layers=2,batch_first=False)
            self.fc1 =  nn.Linear(self.hidden_dim, 20)
            
          elif model_name == "BLSTM":  
            self.lstm = nn.LSTM(input_size=self.input_dim1, hidden_size=self.hidden_dim,num_layers=2,batch_first=False,bidirectional=True)
            self.fc1 =  nn.Linear(self.hidden_dim*2, 20) 
            
          elif model_name == "BRNN":
            self.lstm = nn.RNN(input_size=self.input_dim1, hidden_size=self.hidden_dim,num_layers=2,batch_first=False,bidirectional=True)
            self.fc1 =  nn.Linear(self.hidden_dim*2, 20)
             
          elif model_name == "BGRU":
            self.lstm = nn.GRU(input_size=self.input_dim1, hidden_size=self.hidden_dim,num_layers=2,batch_first=False,bidirectional=True)
            self.fc1 =  nn.Linear(self.hidden_dim*2, 20)  
            
          self.fc3 =  nn.Linear(20,1)  

        elif arcitecture_name=='derived':
          
          if model_name == "RNN":
            self.lstm1 = nn.RNN(input_size=self.input_dim1, hidden_size=self.hidden_dim,num_layers=2,batch_first=False)
            self.lstm2 = nn.RNN(input_size=self.input_dim2, hidden_size=self.hidden_dim,num_layers=2,batch_first=False)
            self.fc2 =  nn.Linear(self.hidden_dim*2, 20) 
            self.fc3 =  nn.Linear(20,1)
          elif model_name == "LSTM":  
            self.lstm1 = nn.LSTM(input_size=self.input_dim1, hidden_size=self.hidden_dim,num_layers=2,batch_first=False)
            self.lstm2 = nn.LSTM(input_size=self.input_dim2, hidden_size=self.hidden_dim,num_layers=2,batch_first=False)
            self.fc2 =  nn.Linear(self.hidden_dim*2, 20)
            self.fc3 =  nn.Linear(20,1)
          elif model_name == "GRU":
            self.lstm1 = nn.GRU(input_size=self.input_dim1, hidden_size=self.hidden_dim,num_layers=2,batch_first=False)
            self.lstm2 = nn.GRU(input_size=self.input_dim2, hidden_size=self.hidden_dim,num_layers=2,batch_first=False)
            self.fc2 =  nn.Linear(self.hidden_dim*2, 20)
            self.fc3 =  nn.Linear(20,1)
          elif model_name == "BLSTM":  
            self.lstm1 = nn.LSTM(input_size=self.input_dim1, hidden_size=self.hidden_dim,num_layers=2,batch_first=False,bidirectional=True)
            self.lstm2 = nn.LSTM(input_size=self.input_dim2, hidden_size=self.hidden_dim,num_layers=2,batch_first=False,bidirectional=True)
            self.fc2 =  nn.Linear(self.hidden_dim*4, 20)
            self.fc3 =  nn.Linear(20,1)
          elif model_name == "BRNN":
            self.lstm1 = nn.RNN(input_size=self.input_dim1, hidden_size=self.hidden_dim,num_layers=2,batch_first=False,bidirectional=True)
            self.lstm2 = nn.RNN(input_size=self.input_dim2, hidden_size=self.hidden_dim,num_layers=2,batch_first=False,bidirectional=True)
            self.fc2 =  nn.Linear(self.hidden_dim*4, 20)
            self.fc3 =  nn.Linear(20,1)
          elif model_name == "BGRU":
            self.lstm1 = nn.GRU(input_size=self.input_dim1, hidden_size=self.hidden_dim,num_layers=2,batch_first=False,bidirectional=True)
            self.lstm2 = nn.GRU(input_size=self.input_dim2, hidden_size=self.hidden_dim,num_layers=2,batch_first=False,bidirectional=True)
            self.fc2 =  nn.Linear(self.hidden_dim*4, 20)
            self.fc3 =  nn.Linear(20,1)

        
        self.relu= nn.ReLU()
        self.drop=nn.Dropout(0.2)
              


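    def forward(self,xb,xd):
        # The recurrent layers are created with batch_first=False, so they expect
        # (seq_len, batch, features); the DataLoader yields (batch, seq_len, features).
        xb = xb.transpose(0, 1)

        if self.arcitecture_name=='basic':
          x,self.hn1=self.lstm(xb)
          x=self.drop(x[-1])   # output of the last time step: (batch, hidden*directions)
          x=self.fc1(x)
        
        elif  self.arcitecture_name=='derived':
          xd = xd.transpose(0, 1)
          xb,self.hn1=self.lstm1(xb) 
          xb=self.drop(xb)
          xd,self.hn2=self.lstm2(xd) 
          xd=self.drop(xd)
          x=torch.cat((xb[-1],xd[-1]),1)   # last time step of both branches
          x=self.fc2(x)

        
        x=self.relu(x)
        x=self.fc3(x)
        return x 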



# Utils

This class saves the best model during training. If the current epoch's
validation loss is lower than the lowest loss seen so far, the model state is saved.
In this notebook it is called with the training loss, since no validation set is used.

In [9]:
class SaveBestModel:
    def __init__(
        self, best_valid_loss=float('inf')
    ):
        self.best_valid_loss = best_valid_loss
        
    def __call__(
        self, current_valid_loss, 
        epoch, model, optimizer, criterion ,weight_file
    ):
        if current_valid_loss < self.best_valid_loss:
            self.best_valid_loss = current_valid_loss
            print(f"\nBest validation loss: {self.best_valid_loss}")
            print(f"\nSaving best model for epoch: {epoch+1}\n")
            torch.save({
                'epoch': epoch+1,
                'model_state_dict': model.state_dict(),
                'optimizer_state_dict': optimizer.state_dict(),
                'loss': criterion,
                }, weight_file ) #'outputs/best_model.pth'



**Train** function: it receives the model, the training and test datasets, and the file paths used to save and load the model parameters and the loss plot.

In [92]:
def train(model, train_data, test_data, weights_file, plots_file):  

  train_loader = torch.utils.data.DataLoader(train_data, batch_size=32, shuffle=False)
  test_loader  = torch.utils.data.DataLoader(test_data, batch_size=32, shuffle=False)    
    
  loss_function = MeanAbsolutePercentageError().to(device)
  optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
    
  
  lr_scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, patience=10, verbose = True)   # only stepped in the disabled validation block below, so the learning rate stays fixed
  best_model=SaveBestModel()    
  training_loss = []
  test_loss_best=[]
  epochs =40
  model.to(device)
 
  for epoch in range(epochs):
    model.train()
    for (xb,xd,y) in tqdm.tqdm(train_loader):
          xb,xd,y = xb.to(device),xd.to(device), y.to(device)
          optimizer.zero_grad() # reset the gradients from the previous step
          outputs = model(xb,xd) # forward pass
          loss_tr=loss_function(outputs, y) # MAPE between predictions and targets
          loss_tr.backward() # backpropagation: compute the gradients of the loss
          optimizer.step() # update the parameters using the gradients
    # No validation set is used, so the checkpoint is selected on the last batch's training loss
    best_model(loss_tr.item(),epoch,model,optimizer,loss_function,weights_file)        
    training_loss.append(loss_tr.item())
    if epoch % 5 == 0:
           print("Epoch: %d, loss: %1.5f" % (epoch, loss_tr.item())) 
           
  #save_model(epoch, model, optimizer, loss_function,weights_file)
    #print(f'train mape at epoch {epoch}: {loss_tr}')
    """ 
    model.eval()
    #with torch.no_grad():
    for (xb,xd,y) in tqdm.tqdm(test_loader):
          xb,xd,y = xb.to(device),xd.to(device), y.to(device)
          outputs = model(xb,xd) #forward pass
          loss_val=loss_function(outputs, y) ## obtain the loss function
    validation_loss.append(loss_val.item())
    lr_scheduler.step(loss_val)#after validation      
    print(f'epoch {epoch} -->  train loss :{loss_tr} , val loss : {loss_val}')     
  #save_model(epoch, model, optimizer, loss_function,weights_file)
    best_model(loss_val,epoch,model,optimizer,loss_function,weights_file)
    """
 
  #  Create count of the number of epochs
  epoch_count = range(1, len(training_loss) + 1)

  # Visualize loss history
  plt.clf()
  plt.plot(epoch_count, training_loss, 'r--')
  #plt.plot(epoch_count, validation_loss, 'b-')
  plt.legend(['Training Loss']) #, ['test Loss']
  plt.xlabel('Epoch')
  plt.ylabel('Loss')
  plt.savefig(plots_file)

  # Evaluate with the best weights saved during training
  best_model_cp = torch.load(weights_file)
  model.load_state_dict(best_model_cp['model_state_dict'])     
  model.eval()
 
  with torch.no_grad():  # gradients are not needed for evaluation
    for (xb,xd,y) in tqdm.tqdm(test_loader):
      xb,xd,y = xb.to(device),xd.to(device), y.to(device)
      outputs = model(xb,xd) #forward pass
      loss=loss_function(outputs, y)
      test_loss_best.append(loss.item())

  mape=(torch.tensor(test_loss_best,dtype=torch.float32)).mean() 
  return mape.numpy()
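
Because `train` optimizes and reports MAPE, very small true values can dominate the score, which is the reason the pre-processing adds 0.1 kW to every reading. A quick numeric check with made-up readings illustrates the effect; the `mape` helper below is a plain NumPy stand-in written for this example, not the `torchmetrics` implementation.

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction (not multiplied by 100)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_pred - y_true) / np.abs(y_true)))

y_true = np.array([0.001, 1.0, 2.0])   # hypothetical readings in kW
y_pred = y_true + 0.05                 # the same absolute error on every reading

raw = mape(y_true, y_pred)                   # dominated by the 0.001 kW reading
shifted = mape(y_true + 0.1, y_pred + 0.1)   # after adding the 0.1 kW offset
print(f"raw MAPE: {raw:.3f}, MAPE with offset: {shifted:.3f}")
```

The absolute errors are identical in both cases; only the denominator of the near-zero reading changes.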

In [89]:
#initialization

experiment_dir = '/content/drive/My Drive/NN project/experiments'
data_dir = '/content/drive/My Drive/NN project/data/processed/House1.csv'

experiment_type = "derived"  #['basic','derived']
loss_function = "mape"
windows = [2] #[2,6,12] 
model_names = ['LSTM', 'RNN', 'GRU', 'BLSTM', 'BRNN', 'BGRU'] #['LSTM','BLSTM']
resume = False
dropout = 0.2
verbose = 0

#For generating graphs
labels =  ['Actual', 'LSTM', 'RNN', 'GRU', 'BLSTM', 'BRNN', 'BGRU']  #['Actual', 'LSTM' 'BLSTM']
graph_data = {}
graph_data[0] = ['*', 'black', 'solid'] #Actual
graph_data[1] = ['+', 'green', 'dashed']#LSTM
graph_data[2] = ['.', 'yellow', 'dashed']#RNN
graph_data[3] = ['^', 'cyan', 'dashdot']#GRU
graph_data[4] = ['o', 'orange', 'dashdot']#BLSTM
graph_data[5] = ['-', 'red', 'dotted']#BRNN
graph_data[6] = ['x', 'blue', 'dotted']#BGRU


info = f"Performed experiment with following parameters: {experiment_type} features, {loss_function} loss function, dropout {dropout}, \
Time_stamps {windows}, models {model_names}, resume {resume} on {datetime.datetime.now()}"


# Training

**Initialization** :


Create directories to save plots and results

In [13]:
device = ('cuda' if torch.cuda.is_available() else 'cpu')
print(device)

cuda


In [12]:
weights_dir = os.path.join(experiment_dir , 'weights')
plots_dir = os.path.join(experiment_dir , 'plots')
results_dir = os.path.join(experiment_dir , 'results')
graphs_dir = os.path.join(experiment_dir , 'graphs')

if(not os.path.isdir(weights_dir)):
    os.makedirs(weights_dir)

if(not os.path.isdir(plots_dir)):
    os.makedirs(plots_dir)

if(not os.path.isdir(results_dir)):
    os.makedirs(results_dir)

if(not os.path.isdir(graphs_dir)):
    os.makedirs(graphs_dir)


with open(os.path.join(experiment_dir, "info.txt"), "w") as f:
  f.write(info)
  


In [None]:
for window in windows:
    #generate data and dataset with train,val,test split
    train_x, train_y, train_d =get_data(data_ten,'train', window,split) 
    test_x, test_y, test_d = get_data(data_ten,'test', window,split)

    results = pd.DataFrame(columns=['MODEL_NAME', 'MAPE'])   
      
    for model_name in model_names:

        train_data=torch.utils.data.TensorDataset(train_x,train_d,train_y)
        test_data=torch.utils.data.TensorDataset(test_x,test_d,test_y)
        
        if(experiment_type == "derived"):
            model=recmodel('derived',model_name)

        elif(experiment_type == "basic"):
            model=recmodel('basic',model_name)
          
        else: 
            raise ValueError(f'{experiment_type} not defined')

        weight_file = os.path.join(weights_dir , str(window) + '_' + model_name + '.pth')  #'.h5'
        plot_file = os.path.join(plots_dir , str(window) + '_' + model_name + '.png')
       
        print(f"training-->{model_name} with experiment type:{experiment_type} and window:{window}")      
        mape = train(model, train_data, test_data, weight_file, plot_file)
        print(f"MAPE on {model_name} is {mape}")

        # DataFrame.append was removed in pandas 2.0; pd.concat works across versions
        results = pd.concat([results, pd.DataFrame([{'MODEL_NAME': model_name, 'MAPE': float(mape)}])], ignore_index=True)

    results.to_csv(os.path.join(results_dir , str(window) +".csv"))
   

In [31]:
def create_graphs(data, labels, graph_data, dates, graph_file):
    formatter = DateFormatter('%Y-%m-%d')
    fig, ax = plt.subplots(figsize=(18,8))

    for i in range(len(data)):
      ax.plot(dates, data[i], graph_data[i][0], color=graph_data[i][1], linestyle=graph_data[i][2])

    plt.ylabel ('KW per 30 minutes', size = 20)
    plt.xticks(size = 20)
    plt.yticks(size = 20)

    plt.gcf().axes[0].xaxis.set_major_formatter(formatter)

    for label in plt.gcf().axes[0].xaxis.get_ticklabels()[::2]:
      label.set_visible(False)
    
    ax.legend(labels, loc='upper center', bbox_to_anchor=(0.5, 1.13),
            ncol=3, fancybox=True, shadow=True, fontsize=18)
    
    plt.savefig(graph_file, format='pdf', bbox_inches='tight')
    plt.show()

In [94]:
dates = mdates.drange(datetime.datetime(2019, 4, 28, 0, 0), datetime.datetime(2019, 5, 1, 0, 0),  datetime.timedelta(minutes=30))

for window in windows:
  
    test_x = test_x[144:288]  
    test_d = test_d[144:288] 
    test_y = test_y[144:288] 

    data = []
    data.append(test_y)

    for model_name in model_names:
    
        if(experiment_type == "derived"):
              model=recmodel('derived',model_name)
              

        elif(experiment_type == "basic"):
              model=recmodel('basic',model_name)

        weight_file = os.path.join(weights_dir , str(window) + '_' + model_name + '.pth') #.h5
        model.eval()
        model_cp = torch.load(weight_file)
        model.load_state_dict(model_cp['model_state_dict'])
        
        #model.load_weights(weight_file)
        with torch.no_grad():
          data.append(model(test_x,test_d))
    create_graphs(data, labels, graph_data, dates, os.path.join(graphs_dir, str(window)+".pdf"))

# Results

**The plots folder in the experiment directory contains the training loss versus epoch plots.** 
Due to the volatile nature of the data, training is done on the whole training set without splitting it into training and validation sets.
After trying different ways of splitting the training data, training without a validation set resulted in a better MAPE.
**For example, for RNN with window=2:**
 

 <figure>
 <center>
 <img src='https://i.postimg.cc/wMjn9Xt5/2-RNN.png' width="800" 
      height="500"/>
 <figcaption>Training loss</figcaption></center>
 </figure>




The graphs folder in the experiment directory contains the actual and predicted values for a portion of the test set, from 2019-04-28
to 2019-05-01.

 <figure>
 <center>
 <img src='https://i.postimg.cc/85PyqLnh/2.png' width="1200" 
      height="500"/>
 <figcaption>Actual and predicted load for a portion of the test set</figcaption></center>
 </figure>





