#### Create and Train an RNN
to predict the value of the voltage rise over the next period

Prediction is done in [VoltageRiseNum_Pred](VoltageRiseNum_Pred.ipynb)

We consider 

\begin{equation}
Z(k) = \begin{bmatrix} X(k-1)\\ \vdots \\ X(k-6) \end{bmatrix}, \;\; \text{where} \; \; 
X(k-i) = \begin{bmatrix} P_{load}(k-i) \\ P_{BT}(k-i) \\ P0013(k-i) \\ P0018(k-i) \\ P0100(k-i)\\ max\_vm\_pu(k-i) \end{bmatrix}
\end{equation}

to predict 

\begin{equation} Pred = \begin{bmatrix} P_{load}(k)\\  max\_vm\_pu(k) \end{bmatrix} \end{equation}



I predict columns [0, 5], i.e. ['Cons', 'Voltage_rise'], even though the only column of interest is column 5. 
This is because the RNN does not learn well when it predicts a single feature. 
Column 5 is the one considered among all because it has the smoothest curve within the backward-looking 
window considered, and hence yields the best accuracy.
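
The mapping from the six previous periods to the two predicted columns can be sketched with NumPy alone. This is a minimal illustration, not the notebook's implementation: the array `data` is made up, and the rows of each window are stored in chronological order (oldest first), whereas the equation above lists X(k-1) first.

```python
import numpy as np

window = 6           # number of past periods in Z(k)
target_cols = [0, 5]  # columns to predict: P_load and max_vm_pu

# Hypothetical data: 10 periods, 6 features; row k holds X(k)
data = np.arange(10 * 6, dtype=float).reshape(10, 6)

# Z[j] stacks the 6 rows that precede period k = j + window
Z = np.stack([data[k - window:k] for k in range(window, len(data))])

# Pred[j] holds the two target columns at period k = j + window
Pred = data[window:][:, target_cols]

print(Z.shape)     # (4, 6, 6)
print(Pred.shape)  # (4, 2)
```

Each sample therefore pairs a (6 periods x 6 features) window with a 2-element target taken from the period that immediately follows the window.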

---

#### Import modules to be used

In [1]:
# Import modules
import pandas as pd
import pandapower as pp
import matplotlib.pyplot as plt
import numpy as np
from tqdm import tqdm # Progress bar
import seaborn as sbn
import pickle, sys, importlib, time
import os
from pickle import load
import tensorflow as tf
import joblib

#### Import Module for ML

In [2]:
# Import packages for ML
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder
from tensorflow.keras.preprocessing.sequence import TimeseriesGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM
from tensorflow.keras.callbacks import EarlyStopping

#### Import my own modules

In [3]:
# import psutil
function_folder = '../Modules/' 
# Add the function folder to the path if it is not already there
if function_folder not in sys.path:
    sys.path.append(function_folder)

import oriFunctions as oriFc
from oriVariables import (network_folder, defAuth_hvBus_vRiseMax, defAuth_hvBus_vRiseMin, excel_folder, train_split_date)

#### Import the cleaned data file for training the RNN

In [4]:
df_data = joblib.load(network_folder+'simulationResults/cleanedData.pkl')

# Keep only the part of the data up to '2021 12 31'
df_data = df_data[df_data.index<='2021 12 31']

In [5]:
# Extract only the relevant testing set, since the training set covers the first part of the data
df_final = df_data[df_data.index < train_split_date]

# Extract only the daylight period, i.e. from 07:00 to 19:00
# The daylight period is defined as running from 07:00 to 19:00, with 19:00 excluded.
h_start_end = ('06:50','18:50') # For the persistence model, the previous period, i.e. 06:50,
                                # is needed to compute the first instant, i.e. 07:00
per_index = df_final.index
per_daylight = (pd.Series(index=per_index.to_timestamp(), dtype=object)
                .between_time(*h_start_end) ).index.to_period('10T')

# Extract only daylight hours 
df_final = df_final.loc[per_daylight]


per_index = df_final.index
per_index2 = ( pd.Series(index=per_index.to_timestamp(), dtype=object
                           ).between_time('07:00','18:50') ).index.to_period('10T')
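
The daylight filter above goes through a timestamp index because `between_time` works on a `DatetimeIndex`. The sketch below reproduces the same round trip on a made-up one-day frame (`df` and its `value` column are hypothetical). It uses the `'10min'` alias instead of `'10T'`, since recent pandas versions deprecate the `'T'` alias.

```python
import pandas as pd

# Hypothetical data: one day sampled every 10 minutes, indexed by period
idx = pd.period_range('2021-01-01 00:00', '2021-01-01 23:50', freq='10min')
df = pd.DataFrame({'value': range(len(idx))}, index=idx)

# Convert periods to timestamps, select by time of day, convert back to periods
daylight = (pd.Series(index=df.index.to_timestamp(), dtype=object)
            .between_time('07:00', '18:50')).index.to_period('10min')

# Keep only the rows whose period falls in the daylight window
df_day = df.loc[daylight]

print(len(df_day))                      # 72 periods of 10 minutes
print(df_day.index[0], df_day.index[-1])
```

Both bounds of `between_time` are included by default, which is why the window ends at 18:50 rather than 19:00.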

#### Import the voltage rise from [Voltage_rise](VoltageRiseBinary.ipynb)

In [6]:
# Import the voltage rise computed in VoltageRiseBinary.ipynb
numAndBin_vRise = joblib.load(network_folder+'simulationResults/Binary_Voltage_Rise.pkl')
df_final['Volt_Rise_bin'] = numAndBin_vRise['Volt_Rise_Bin']

### Set variables for numerical voltage rise prediction

In [7]:
# Extract only the relevant testing set, since the training set covers the first part of the data
df_final = df_data[df_data.index<'2021 06 01']
per_index = df_final.index
per_index2 = ( pd.Series(index=per_index.to_timestamp(), dtype=object
                        ).between_time('07:00','18:50') ).index.to_period('10T')
df_final['Volt_Rise_num'] = numAndBin_vRise['known']

In [8]:
# Separate training and testing set
df_train = df_final[df_final.index<'2021 06 01']

# I'm using the whole dataset to train the RNN to improve performance, since I've already
# tried with the validation set and got an accuracy of 94%

# Define scaler
numerical_scaler2 = MinMaxScaler()
numerical_scaler_out = MinMaxScaler()

numerical_scaler2.fit(df_train);
numerical_scaler_out.fit(df_train.iloc[:,[0,5]])

train_scaled = numerical_scaler2.transform(df_train)
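
Two scalers are fitted above: one on all columns and one on columns [0, 5] only. A likely use of the second one, assumed here from its name rather than stated in this notebook, is to map the network's two scaled outputs back to physical units. The sketch below checks on made-up data that this works, because `MinMaxScaler` scales each column independently.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical data: 50 periods, 6 features
rng = np.random.default_rng(0)
data = rng.uniform(0.0, 10.0, size=(50, 6))

scaler_all = MinMaxScaler().fit(data)             # fitted on every column
scaler_out = MinMaxScaler().fit(data[:, [0, 5]])  # fitted on the target columns only

# Scaled targets, as they would be fed to (or produced by) the network
scaled_targets = scaler_all.transform(data)[:, [0, 5]]

# The output scaler alone is enough to recover the original target values
recovered = scaler_out.inverse_transform(scaled_targets)
print(np.allclose(recovered, data[:, [0, 5]]))
```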

##### Define Timeseries generators


In [9]:
gen_length = 6 # 6 periods of 10 minutes, i.e. 1 hour

batchSize = 24*7*2 # 336 samples; (gen_length//6) to convert to hours * 24 hours * 7 days
# I predict columns [0, 5], i.e. ['Cons', 'Voltage_rise'], even though the only column of interest is column 5. 
# This is because the RNN does not learn well when it predicts a single feature. 
# Column 5 is the one considered among all because it has the smoothest curve within the backward-looking 
# window considered, and hence yields the best accuracy.
train_generator = TimeseriesGenerator(train_scaled, train_scaled[:,[0,5]], 
                                      length = gen_length, 
                                      batch_size= batchSize )

# n_features = train_generator[0][0][0].shape[1]  # Define total number of features
n_features_inputs = 6  # Total number of input features
n_features_outputs = 2  # Total number of features to predict
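
As a rough check of what the generator yields, the number of samples and batches can be worked out by hand. The sketch assumes the usual `TimeseriesGenerator` behaviour with default stride and sampling rate, where each sample uses `length` past rows and so the first `length` rows have no target. The helper `generator_layout` and the row count of 10,000 are hypothetical.

```python
import math

def generator_layout(n_rows, length, batch_size):
    """Return (samples, batches, size of last batch) for a windowed generator."""
    n_samples = n_rows - length                      # first `length` rows have no full window
    n_batches = math.ceil(n_samples / batch_size)    # last batch may be partial
    last_batch = n_samples - (n_batches - 1) * batch_size
    return n_samples, n_batches, last_batch

# Hypothetical training set of 10,000 rows with the settings used above
n_samples, n_batches, last_batch = generator_layout(
    n_rows=10_000, length=6, batch_size=24 * 7 * 2)

print(n_samples, n_batches, last_batch)  # 9994 30 250
```

With these settings a full batch of inputs has shape (336, 6, 6) and the matching targets have shape (336, 2).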

#### Define RNN

In [10]:
num_vRise_RNN = Sequential()

num_vRise_RNN.add( LSTM(units=128, activation='relu', input_shape=(gen_length,n_features_inputs)) )
# num_vRise_RNN.add( LSTM(units=128, activation='relu' ) )
num_vRise_RNN.add(Dense(units=n_features_outputs, activation='relu'))

num_vRise_RNN.compile(optimizer='adam', loss='mse')
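
The size of this network can be estimated from the standard parameter-count formulas for LSTM and dense layers. The helper names below are hypothetical and the figures are derived by hand, not copied from a `model.summary()` output.

```python
def lstm_params(units, input_dim):
    # 4 gates, each with input weights, recurrent weights and a bias vector
    return 4 * (units * (input_dim + units) + units)

def dense_params(units, input_dim):
    # One weight per input-output pair plus one bias per output
    return units * input_dim + units

lstm_count = lstm_params(units=128, input_dim=6)
dense_count = dense_params(units=2, input_dim=128)
total = lstm_count + dense_count

print(lstm_count, dense_count, total)  # 69120 258 69378
```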

##### Define early stopping mechanism

In [11]:
early_stop = EarlyStopping(monitor='loss', patience=20, mode='min')
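
The effect of `patience` can be illustrated with a simplified re-implementation of the stopping rule. This sketch is not the Keras code: the function `early_stop_epoch` and the loss values are made up, and options such as `min_delta` are ignored.

```python
def early_stop_epoch(losses, patience):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float('inf')
    wait = 0  # epochs elapsed since the last improvement
    for epoch, loss in enumerate(losses, start=1):
        if loss < best:      # mode='min': a lower loss is an improvement
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# The loss improves until epoch 2, then stalls for 3 epochs
print(early_stop_epoch([1.0, 0.5, 0.6, 0.7, 0.8], patience=3))  # 5
```

With `patience=20`, training stops only after the training loss has failed to improve for 20 consecutive epochs.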

#### Train RNN

In [27]:
num_vRise_RNN.fit(train_generator, 
                  epochs=100, 
                  callbacks=[early_stop], 
                  )

Epoch 1/100
...
Epoch 78

<keras.callbacks.History at 0x1f2b08943a0>

#### Save results

In [28]:
num_vRise_RNN.save(f'{network_folder}RNN/StLaurent_num_vRise_model')

INFO:tensorflow:Assets written to: pickle_files/RNN/StLaurent_num_vRise_model\assets


In [29]:
joblib.dump(numerical_scaler_out,f'{network_folder}RNN/StLaurent_num_vRise_scalerPred.plk')
joblib.dump(numerical_scaler2,f'{network_folder}RNN/StLaurent_num_vRise_scaler.plk')

['pickle_files/RNN/StLaurent_num_vRise_scaler.plk']
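
The list printed above is the return value of `joblib.dump`, which returns the file names it wrote. A scaler saved this way can be restored with `joblib.load`. The sketch below shows the round trip on a made-up scaler in a temporary directory. The file name `example_scaler.plk` is hypothetical.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical scaler fitted on a tiny two-column array
data = np.array([[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]])
scaler = MinMaxScaler().fit(data)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'example_scaler.plk')
    written = joblib.dump(scaler, path)   # returns the list of files written
    restored = joblib.load(path)          # rebuilds the fitted scaler

# The restored scaler applies the same transformation as the original
same = np.allclose(restored.transform(data), scaler.transform(data))
print(same)
```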