## Notebook description

This notebook consists of the code for loading the cleaned data and generation of the required model for the project. Notebook takes the following 2 files:
1. 2020_clean_data_by_minute_for_prediction.csv
2. 2021_clean_data_by_minute_for_prediction.csv

In the data provided there might be some data points which might be abnormal. So, a deep cleaning of the data is performed with use of Isolation Forest algorithm. The data that is categorized as normal by the algorithm will be then used to train the multi layer perceptron.

**Note:** This notebook if executed as a whole will consume huge amount of time as we are trying with different number of hidden nodes in the multi layer perceptron.

In [1]:
# Loading the required libraries
import pandas as pd
import os
import re
from datetime import datetime
import seaborn as sns
import matplotlib.pyplot as plt
import itertools
import shutil
import numpy as np

import tensorflow
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from keras.utils import np_utils

from sklearn.model_selection import train_test_split

from sklearn.metrics import mean_absolute_error, mean_squared_error

from sklearn.ensemble import IsolationForest

import warnings
warnings.filterwarnings('ignore')

### Data Loading and Splitting

In [2]:
data_2020 = pd.read_csv("Data_Clean/2020_clean_data_by_minute_for_prediction.csv")
data_2020.head()

Unnamed: 0,Date,Timestamp,Inverter,Energy,Total_Energy,Inv_Temp,Wms_Temp,Wms_Irr
0,2020-03-15 12:36:00,1584255960000,Inv01,1.0,12332.0,34.6,36.7,944
1,2020-03-15 12:51:00,1584256860000,Inv01,1.0,12333.0,41.6,35.4,949
2,2020-03-15 13:06:00,1584257760000,Inv01,1.0,12334.0,44.9,34.5,949
3,2020-03-15 13:27:00,1584259020000,Inv01,1.0,12335.0,47.7,34.7,930
4,2020-03-15 13:40:00,1584259800000,Inv01,1.0,12335.0,48.8,35.4,905


In [3]:
data_2021 = pd.read_csv("Data_Clean/2021_clean_data_by_minute_for_prediction.csv")
data_2021.head()

Unnamed: 0,Date,Timestamp,Inverter,Energy,Total_Energy,Inv_Temp,Wms_Temp,Wms_Irr
0,2021-01-01 07:25:00,1609466100000,Inv01,1.0,69999.0,35.1,17.5,107.0
1,2021-01-01 07:26:00,1609466160000,Inv01,1.0,69999.0,35.2,17.6,109.0
2,2021-01-01 07:27:00,1609466220000,Inv01,1.0,69999.0,35.3,17.7,112.0
3,2021-01-01 07:28:00,1609466280000,Inv01,1.0,70000.0,35.4,17.8,113.0
4,2021-01-01 07:29:00,1609466340000,Inv01,1.0,70000.0,35.6,17.8,114.0


In [4]:
data_2020.Inverter.value_counts()

Inv03    160837
Inv05    151636
Inv04    143015
Inv06    136508
Inv02    134009
Inv01    131150
Inv08    128601
Inv07    124427
Inv10    122718
Inv09    118295
Name: Inverter, dtype: int64

In [5]:
data_2021.Inverter.value_counts()

Inv03    187248
Inv01    179033
Inv06    178322
Inv02    177607
Inv08    177586
Inv07    167870
Inv05    167173
Inv10    161807
Inv04    158423
Inv09    157068
Name: Inverter, dtype: int64

In [6]:
inv01_2020 = data_2020[data_2020.Inverter == 'Inv01']
inv02_2020 = data_2020[data_2020.Inverter == 'Inv02']
inv03_2020 = data_2020[data_2020.Inverter == 'Inv03']
inv04_2020 = data_2020[data_2020.Inverter == 'Inv04']
inv05_2020 = data_2020[data_2020.Inverter == 'Inv05']
inv06_2020 = data_2020[data_2020.Inverter == 'Inv06']
inv07_2020 = data_2020[data_2020.Inverter == 'Inv07']
inv08_2020 = data_2020[data_2020.Inverter == 'Inv08']
inv09_2020 = data_2020[data_2020.Inverter == 'Inv09']
inv10_2020 = data_2020[data_2020.Inverter == 'Inv10']

inv01_2021 = data_2021[data_2021.Inverter == 'Inv01']
inv02_2021 = data_2021[data_2021.Inverter == 'Inv02']
inv03_2021 = data_2021[data_2021.Inverter == 'Inv03']
inv04_2021 = data_2021[data_2021.Inverter == 'Inv04']
inv05_2021 = data_2021[data_2021.Inverter == 'Inv05']
inv06_2021 = data_2021[data_2021.Inverter == 'Inv06']
inv07_2021 = data_2021[data_2021.Inverter == 'Inv07']
inv08_2021 = data_2021[data_2021.Inverter == 'Inv08']
inv09_2021 = data_2021[data_2021.Inverter == 'Inv09']
inv10_2021 = data_2021[data_2021.Inverter == 'Inv10']

In [7]:
features_reqd = ['Date', 'Timestamp', 'Inv_Temp', 'Wms_Temp', 'Wms_Irr', 'Energy']

In [8]:
# This will be combining the data from 2020 and 2021 on an inverter basis.
combined_inv01 = inv01_2020[features_reqd].append(inv01_2021[features_reqd])
combined_inv02 = inv02_2020[features_reqd].append(inv02_2021[features_reqd])
combined_inv03 = inv03_2020[features_reqd].append(inv03_2021[features_reqd])
combined_inv04 = inv04_2020[features_reqd].append(inv04_2021[features_reqd])
combined_inv05 = inv05_2020[features_reqd].append(inv05_2021[features_reqd])
combined_inv06 = inv06_2020[features_reqd].append(inv06_2021[features_reqd])
combined_inv07 = inv07_2020[features_reqd].append(inv07_2021[features_reqd])
combined_inv08 = inv08_2020[features_reqd].append(inv08_2021[features_reqd])
combined_inv09 = inv09_2020[features_reqd].append(inv09_2021[features_reqd])
combined_inv10 = inv10_2020[features_reqd].append(inv10_2021[features_reqd])

### Deep cleaning by  Isolation Forest

Isolation Forest is an unsupervised machine learning algorithm used for anomaly detection. It was proposed by Liu et al. in 2008. The algorithm is based on the concept of isolating anomalies or outliers in a dataset.

The goal of the Isolation Forest algorithm is to separate anomalous data points from normal data points by constructing isolation trees. An isolation tree is a binary tree where each internal node represents a splitting rule on a particular feature, and each leaf node represents an outlier/anomaly. The height of the tree represents the number of splits required to isolate an instance.

The algorithm works as follows:

1. Randomly select a feature from the dataset and randomly select a split value between the minimum and maximum values of that  feature.

2. Split the data based on the selected feature and split value, creating two new child nodes.

3. Repeat steps 1 and 2 recursively for each child node until a predefined stopping criterion is met. This criterion could be a maximum tree depth or a minimum number of samples in a leaf node.

4. Repeat steps 1-3 to construct multiple isolation trees.

5. To detect anomalies, a new data point is passed down each isolation tree. The number of splits required to isolate the data point is recorded as the path length.

6. Finally, an anomaly score is calculated for each data point based on the average path length across all isolation trees. The lower the score, the more likely it is to be an anomaly.

Source: Towards Data Science

In [None]:
# Model formation
random_state = np.random.RandomState(42)
model=IsolationForest(n_estimators=100,max_samples='auto',random_state=random_state, warm_start=True, bootstrap = True)

model_inv01 = model.fit(combined_inv01[features_reqd[2:]])
model_inv02 = model.fit(combined_inv02[features_reqd[2:]])
model_inv03 = model.fit(combined_inv03[features_reqd[2:]])
model_inv04 = model.fit(combined_inv04[features_reqd[2:]])
model_inv05 = model.fit(combined_inv05[features_reqd[2:]])
model_inv06 = model.fit(combined_inv06[features_reqd[2:]])
model_inv07 = model.fit(combined_inv07[features_reqd[2:]])
model_inv08 = model.fit(combined_inv08[features_reqd[2:]])
model_inv09 = model.fit(combined_inv09[features_reqd[2:]])
model_inv10 = model.fit(combined_inv10[features_reqd[2:]])

In [10]:
# Assigning anomaly scores and splitting data into normal and abnormal data
combined_inv01.loc[:, 'scores'] = model_inv01.decision_function(combined_inv01[features_reqd[2:]]);
combined_inv01.loc[:, 'anomaly_score'] = model_inv01.predict(combined_inv01[features_reqd[2:]]);

combined_inv02.loc[:, 'scores'] = model_inv02.decision_function(combined_inv02[features_reqd[2:]]);
combined_inv02.loc[:, 'anomaly_score'] = model_inv02.predict(combined_inv02[features_reqd[2:]]);

combined_inv03.loc[:, 'scores'] = model_inv03.decision_function(combined_inv03[features_reqd[2:]]);
combined_inv03.loc[:, 'anomaly_score'] = model_inv03.predict(combined_inv03[features_reqd[2:]]);

combined_inv04.loc[:, 'scores'] = model_inv04.decision_function(combined_inv04[features_reqd[2:]]);
combined_inv04.loc[:, 'anomaly_score'] = model_inv04.predict(combined_inv04[features_reqd[2:]]);

combined_inv05.loc[:, 'scores'] = model_inv05.decision_function(combined_inv05[features_reqd[2:]]);
combined_inv05.loc[:, 'anomaly_score'] = model_inv05.predict(combined_inv05[features_reqd[2:]]);

combined_inv06.loc[:, 'scores'] = model_inv06.decision_function(combined_inv06[features_reqd[2:]]);
combined_inv06.loc[:, 'anomaly_score'] = model_inv06.predict(combined_inv06[features_reqd[2:]]);

combined_inv07.loc[:, 'scores'] = model_inv07.decision_function(combined_inv07[features_reqd[2:]]);
combined_inv07.loc[:, 'anomaly_score'] = model_inv07.predict(combined_inv07[features_reqd[2:]]);

combined_inv08.loc[:, 'scores'] = model_inv08.decision_function(combined_inv08[features_reqd[2:]]);
combined_inv08.loc[:, 'anomaly_score'] = model_inv08.predict(combined_inv08[features_reqd[2:]]);

combined_inv09.loc[:, 'scores'] = model_inv09.decision_function(combined_inv09[features_reqd[2:]]);
combined_inv09.loc[:, 'anomaly_score'] = model_inv09.predict(combined_inv09[features_reqd[2:]]);

combined_inv10.loc[:, 'scores'] = model_inv10.decision_function(combined_inv10[features_reqd[2:]]);
combined_inv10.loc[:, 'anomaly_score'] = model_inv10.predict(combined_inv10[features_reqd[2:]]);

## Modelling - Multi Layer Perceptron

In [11]:
# The function is used to form the model layers
def build_and_compile_model(norm, nodes):
    model = Sequential([
      norm,
      Dense(nodes, activation='relu'),
      Dense(nodes, activation='relu'),
      Dense(1)
      ])

    model.compile(loss='mse',
                optimizer=tensorflow.keras.optimizers.Adam(0.001))
    return model

In [12]:
# The method will return all data except December 2021 data and return training and test data sets.
def form_training_network_data(df):
    condition_for_training_data = (df.anomaly_score == 1) & (df.Date < "2021-12-01")
    data = df[condition_for_training_data]
    X_train, X_test, y_train, y_test = train_test_split(data[features_reqd[2:-1]], data[features_reqd[-1]], test_size=0.2, random_state=42)
    
    return X_train, X_test, y_train, y_test

In [13]:
# Model fitting function
def form_model(X_train, y_train, nodes):
    normalizer = tensorflow.keras.layers.Normalization(axis=-1)
    normalizer.adapt(np.array(X_train))

    dnn_model = build_and_compile_model(normalizer, nodes)
    
    history = dnn_model.fit(
    X_train,
    y_train,
    validation_split=0.2,
    verbose=0, epochs=100)
    
    return dnn_model, history

In [14]:
def model_evaluation(y_test, predictions):
    return np.sqrt(mean_squared_error(y_test, predictions))

In [15]:
# The method will return the predictions dataframe for December 2021
def form_predictions_df(df, dnn_model, date):
    condition_for_prediction_data = (df.Date >= date)
    prediction_df = df[condition_for_prediction_data]
    predictions = dnn_model.predict(prediction_df[features_reqd[2:-1]])
    prediction_df.loc[:, 'predictions'] = predictions
    return prediction_df

**Inv01**

In [16]:
X_train, X_test, y_train, y_test = form_training_network_data(combined_inv01)

In [17]:
node_configs = [2,3,4,5,64,100]

In [18]:
# Running the model with different number of nodes in the hidden layers.
for no_nodes in node_configs:
    dnn_model_1, history_1 = form_model(X_train, y_train, no_nodes)
    
    dnn_model_1.save(f'dnn_model_inv01_{no_nodes}')
    
    predictions_1 = dnn_model_1.predict(X_test)
    print(model_evaluation(y_test, predictions_1), no_nodes)

0.6877360713224737 2
0.6841030742393894 3
0.6814357144588011 4
0.6775061396279386 5
0.6716465185624793 64
0.6696423764205707 100


In [19]:
prediction_df_1 = form_predictions_df(combined_inv01, dnn_model_1, "2021-12-01")
prediction_df_1.head()



Unnamed: 0,Date,Timestamp,Inv_Temp,Wms_Temp,Wms_Irr,Energy,scores,anomaly_score,predictions
166200,2021-12-01 08:06:00,1638326160000,39.5,20.8,82.0,1.0,-0.061232,-1,0.956806
166201,2021-12-01 08:23:00,1638327180000,41.0,21.1,94.0,1.0,-0.029794,-1,0.840254
166202,2021-12-01 08:24:00,1638327240000,41.1,21.1,118.0,1.0,-0.009246,-1,1.003176
166203,2021-12-01 08:25:00,1638327300000,41.1,21.2,118.0,1.0,-0.0058,-1,1.002867
166204,2021-12-01 08:28:00,1638327480000,41.4,21.3,72.0,1.0,-0.045867,-1,0.623461


**Inv02**

In [20]:
X_train, X_test, y_train, y_test = form_training_network_data(combined_inv02)

In [21]:
for no_nodes in node_configs:
    dnn_model_2, history_2 = form_model(X_train, y_train, no_nodes)
    
    dnn_model_2.save(f'dnn_model_inv02_{no_nodes}')
    
    predictions_2 = dnn_model_2.predict(X_test)
    print(model_evaluation(y_test, predictions_2), no_nodes)

0.6371759849669975 2
0.6323990605621763 3
0.6284449705550209 4
0.6264844695010285 5
0.6209776971423934 64
0.6211329384851064 100


In [22]:
prediction_df_2 = form_predictions_df(combined_inv02, dnn_model_2, "2021-12-01")
prediction_df_2.head()



Unnamed: 0,Date,Timestamp,Inv_Temp,Wms_Temp,Wms_Irr,Energy,scores,anomaly_score,predictions
344021,2021-12-01 08:23:00,1638327180000,40.6,21.1,94.0,1.0,-0.030087,-1,0.994046
344022,2021-12-01 08:26:00,1638327360000,40.8,21.2,102.0,1.0,-0.023318,-1,0.994046
344023,2021-12-01 08:27:00,1638327420000,40.9,21.2,92.0,1.0,-0.027888,-1,0.994046
344024,2021-12-01 08:32:00,1638327720000,41.3,21.4,130.0,1.0,0.006846,1,1.035302
344025,2021-12-01 08:35:00,1638327900000,41.5,21.6,121.0,1.0,0.006893,1,1.007751


**Inv03**

In [23]:
X_train, X_test, y_train, y_test = form_training_network_data(combined_inv03)

In [24]:
for no_nodes in node_configs:
    dnn_model_3, history_3 = form_model(X_train, y_train, no_nodes)
    
    dnn_model_3.save(f'dnn_model_inv03_{no_nodes}')
    
    predictions_3 = dnn_model_3.predict(X_test)
    print(model_evaluation(y_test, predictions_3), no_nodes)

0.7869978767679406 2
0.7657672546692795 3
0.7620722466932481 4
0.7608277000428099 5
0.7394790421489582 64
0.7452246718695853 100


In [25]:
prediction_df_3 = form_predictions_df(combined_inv03, dnn_model_3, "2021-12-01")
prediction_df_3.head()



Unnamed: 0,Date,Timestamp,Inv_Temp,Wms_Temp,Wms_Irr,Energy,scores,anomaly_score,predictions
530622,2021-12-01 08:07:00,1638326220000,38.6,20.9,71.0,1.0,-0.085205,-1,1.011927
530623,2021-12-01 08:10:00,1638326400000,39.0,20.9,65.0,1.0,-0.085648,-1,0.974076
530624,2021-12-01 08:22:00,1638327120000,40.1,21.0,83.0,1.0,-0.049415,-1,0.99501
530625,2021-12-01 08:25:00,1638327300000,40.3,21.2,118.0,2.0,-0.02397,-1,1.203434
530626,2021-12-01 08:26:00,1638327360000,40.5,21.2,102.0,1.0,-0.022488,-1,1.076953


**Inv04**

In [26]:
X_train, X_test, y_train, y_test = form_training_network_data(combined_inv04)

In [27]:
for no_nodes in node_configs:
    dnn_model_4, history_4 = form_model(X_train, y_train, no_nodes)
    
    dnn_model_4.save(f'dnn_model_inv04_{no_nodes}')
    
    predictions_4 = dnn_model_4.predict(X_test)
    print(model_evaluation(y_test, predictions_4), no_nodes)

0.4685741993015077 2
0.45756907039596983 3
0.4552739920074057 4
0.45311070533851 5
0.4428278973564009 64
0.4425415062001865 100


In [28]:
prediction_df_4 = form_predictions_df(combined_inv04, dnn_model_4, "2021-12-01")
prediction_df_4.head()



Unnamed: 0,Date,Timestamp,Inv_Temp,Wms_Temp,Wms_Irr,Energy,scores,anomaly_score,predictions
691246,2021-12-01 08:26:00,1638327360000,29.8,21.2,102.0,1.0,-0.122813,-1,3.627649
691247,2021-12-01 08:33:00,1638327780000,30.3,21.5,137.0,1.0,-0.10938,-1,3.587825
691248,2021-12-01 08:50:00,1638328800000,31.6,21.8,188.0,1.0,-0.105395,-1,3.443815
691249,2021-12-01 08:51:00,1638328860000,31.6,21.9,179.0,1.0,-0.100489,-1,3.415919
691250,2021-12-01 08:52:00,1638328920000,31.7,21.9,139.0,1.0,-0.103032,-1,3.314875


**Inv05**

In [29]:
X_train, X_test, y_train, y_test = form_training_network_data(combined_inv05)

In [30]:
for no_nodes in node_configs:
    dnn_model_5, history_5 = form_model(X_train, y_train, no_nodes)
    
    dnn_model_5.save(f'dnn_model_inv05_{no_nodes}')
    
    predictions_5 = dnn_model_5.predict(X_test)
    print(model_evaluation(y_test, predictions_5), no_nodes)

0.5444949709588698 2
0.5402157358356859 3
0.533690516535569 4
0.5351256991789134 5
0.5288533078589913 64
0.5271398014215911 100


In [31]:
prediction_df_5 = form_predictions_df(combined_inv05, dnn_model_5, "2021-12-01")
prediction_df_5.head()



Unnamed: 0,Date,Timestamp,Inv_Temp,Wms_Temp,Wms_Irr,Energy,scores,anomaly_score,predictions
857663,2021-12-01 08:26:00,1638327360000,30.0,21.2,102.0,1.0,-0.122813,-1,3.662943
857664,2021-12-01 08:34:00,1638327840000,30.6,21.5,141.0,1.0,-0.108968,-1,3.636024
857665,2021-12-01 08:37:00,1638328020000,30.7,21.6,110.0,1.0,-0.117431,-1,3.499018
857666,2021-12-01 08:40:00,1638328200000,30.8,21.7,145.0,1.0,-0.105316,-1,3.594235
857667,2021-12-01 08:45:00,1638328500000,31.1,21.7,97.0,1.0,-0.116522,-1,3.345


**Inv06**

In [32]:
X_train, X_test, y_train, y_test = form_training_network_data(combined_inv06)

In [33]:
for no_nodes in node_configs:
    dnn_model_6, history_6 = form_model(X_train, y_train, no_nodes)
    
    dnn_model_6.save(f'dnn_model_inv06_{no_nodes}')
    
    predictions_6 = dnn_model_6.predict(X_test)
    print(model_evaluation(y_test, predictions_6), no_nodes)

0.7749038399432349 2
0.757743703859589 3
0.7582501103627637 4
0.7542214450022767 5
0.7431968192694227 64
0.7428694978906903 100


In [34]:
prediction_df_6 = form_predictions_df(combined_inv06, dnn_model_6, "2021-12-01")
prediction_df_6.head()



Unnamed: 0,Date,Timestamp,Inv_Temp,Wms_Temp,Wms_Irr,Energy,scores,anomaly_score,predictions
1034649,2021-12-01 08:03:00,1638325980000,37.1,20.8,56.0,1.0,-0.108582,-1,0.768859
1034650,2021-12-01 08:06:00,1638326160000,37.4,20.9,84.0,1.0,-0.089192,-1,0.856321
1034651,2021-12-01 08:07:00,1638326220000,37.5,20.9,76.0,1.0,-0.096553,-1,0.825291
1034652,2021-12-01 08:10:00,1638326400000,37.8,21.0,64.0,1.0,-0.096242,-1,0.802544
1034653,2021-12-01 08:21:00,1638327060000,38.8,21.1,83.0,1.0,-0.064208,-1,0.866322


**Inv07**

In [35]:
X_train, X_test, y_train, y_test = form_training_network_data(combined_inv07)

In [36]:
for no_nodes in node_configs:
    dnn_model_7, history_7 = form_model(X_train, y_train, no_nodes)
    
    dnn_model_7.save(f'dnn_model_inv07_{no_nodes}')
    
    predictions_7 = dnn_model_7.predict(X_test)
    print(model_evaluation(y_test, predictions_7), no_nodes)

0.7839741311274718 2
0.7804899135021969 3
0.7748685896040572 4
0.7670924712145456 5
0.7581027426430959 64
0.7599441452598354 100


In [37]:
prediction_df_7 = form_predictions_df(combined_inv07, dnn_model_7, "2021-12-01")
prediction_df_7.head()



Unnamed: 0,Date,Timestamp,Inv_Temp,Wms_Temp,Wms_Irr,Energy,scores,anomaly_score,predictions
1203560,2021-12-01 08:01:00,1638325860000,38.2,20.7,53.0,1.0,-0.093918,-1,0.668857
1203561,2021-12-01 08:06:00,1638326160000,38.6,20.9,84.0,1.0,-0.068441,-1,0.933486
1203562,2021-12-01 08:07:00,1638326220000,38.8,20.9,76.0,1.0,-0.077912,-1,0.85661
1203563,2021-12-01 08:12:00,1638326520000,39.2,21.0,54.0,1.0,-0.084147,-1,0.651536
1203564,2021-12-01 08:22:00,1638327120000,40.2,21.1,81.0,1.0,-0.047657,-1,0.860498


**Inv08**

In [38]:
X_train, X_test, y_train, y_test = form_training_network_data(combined_inv08)

In [39]:
for no_nodes in node_configs:
    dnn_model_8, history_8 = form_model(X_train, y_train, no_nodes)
    
    dnn_model_8.save(f'dnn_model_inv08_{no_nodes}')
    
    predictions_8 = dnn_model_8.predict(X_test)
    print(model_evaluation(y_test, predictions_8), no_nodes)

0.7996222478609107 2
0.7962944483803671 3
0.8008851684233249 4
0.7821568501118294 5
0.7739826737185125 64
0.7729459818617402 100


In [40]:
prediction_df_8 = form_predictions_df(combined_inv08, dnn_model_8, "2021-12-01")
prediction_df_8.head()



Unnamed: 0,Date,Timestamp,Inv_Temp,Wms_Temp,Wms_Irr,Energy,scores,anomaly_score,predictions
1380415,2021-12-01 08:06:00,1638326160000,38.4,20.9,84.0,1.0,-0.075863,-1,1.155724
1380416,2021-12-01 08:07:00,1638326220000,38.5,20.9,76.0,1.0,-0.080344,-1,1.117223
1380417,2021-12-01 08:10:00,1638326400000,38.7,21.0,64.0,1.0,-0.085721,-1,1.060503
1380418,2021-12-01 08:18:00,1638326880000,39.5,21.0,47.0,1.0,-0.079076,-1,0.922363
1380419,2021-12-01 08:23:00,1638327180000,39.9,21.1,86.0,1.0,-0.050357,-1,1.02116


**Inv09**

In [41]:
X_train, X_test, y_train, y_test = form_training_network_data(combined_inv09)

In [42]:
for no_nodes in node_configs:
    dnn_model_9, history_9 = form_model(X_train, y_train, no_nodes)
    
    dnn_model_9.save(f'dnn_model_inv09_{no_nodes}')
    
    predictions_9 = dnn_model_9.predict(X_test)
    print(model_evaluation(y_test, predictions_9), no_nodes)

0.7032052225664727 2
0.70019254381695 3
0.6938565823343591 4
0.6916931955097713 5
0.6815640840069878 64
0.6841969569222991 100


In [43]:
prediction_df_9 = form_predictions_df(combined_inv05, dnn_model_9, "2021-12-01")
prediction_df_9.head()



Unnamed: 0,Date,Timestamp,Inv_Temp,Wms_Temp,Wms_Irr,Energy,scores,anomaly_score,predictions
857663,2021-12-01 08:26:00,1638327360000,30.0,21.2,102.0,1.0,-0.122813,-1,3.300672
857664,2021-12-01 08:34:00,1638327840000,30.6,21.5,141.0,1.0,-0.108968,-1,3.31581
857665,2021-12-01 08:37:00,1638328020000,30.7,21.6,110.0,1.0,-0.117431,-1,3.188075
857666,2021-12-01 08:40:00,1638328200000,30.8,21.7,145.0,1.0,-0.105316,-1,3.289207
857667,2021-12-01 08:45:00,1638328500000,31.1,21.7,97.0,1.0,-0.116522,-1,3.063368


**Inv10**

In [44]:
X_train, X_test, y_train, y_test = form_training_network_data(combined_inv10)

In [45]:
for no_nodes in node_configs:
    dnn_model_10, history_10 = form_model(X_train, y_train, no_nodes)
    
    dnn_model_10.save(f'dnn_model_inv10_{no_nodes}')
    
    predictions_10 = dnn_model_10.predict(X_test)
    print(model_evaluation(y_test, predictions_10), no_nodes)

0.7389917578501607 2
0.7393221244117427 3
0.7327010216455614 4
0.7326059988170662 5
0.7257144868444739 64
0.7251742755107441 100


In [46]:
prediction_df_10 = form_predictions_df(combined_inv10, dnn_model_10, "2021-12-01")
prediction_df_10.head()



Unnamed: 0,Date,Timestamp,Inv_Temp,Wms_Temp,Wms_Irr,Energy,scores,anomaly_score,predictions
1700246,2021-12-01 08:24:00,1638327240000,29.6,21.2,114.0,1.0,-0.121106,-1,4.299479
1700247,2021-12-01 08:27:00,1638327420000,29.6,21.3,91.0,1.0,-0.123282,-1,4.210784
1700248,2021-12-01 08:34:00,1638327840000,30.2,21.7,128.0,1.0,-0.108368,-1,4.155117
1700249,2021-12-01 08:39:00,1638328140000,30.6,21.7,137.0,1.0,-0.105726,-1,4.077256
1700250,2021-12-01 08:40:00,1638328200000,30.6,21.8,121.0,1.0,-0.110755,-1,4.013768
