<a href="https://colab.research.google.com/github/Chirag314/Homogeneous-ensemble-energydata/blob/main/Homogeneous_ensemble_energydata.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

###This notebook is made from exercises from book Ensemble Machine Learning Cookbook.

In the case of ensemble models, each base classifier must have some degree of diversity within itself. This diversity can be obtained in one of the following manners:

By using different subsets of training data through various resampling methods or randomization of the training data
By using different learning hyperparameters for different base learners
By using different learning algorithms 
In the case of ensemble models, where different algorithms are used for the base learners, the ensemble is called a heterogeneous ensemble method. If the same algorithm is used for all the base learners on different distributions of the training set, the ensemble is called a homogeneous ensemble. 

In [1]:
#import required libraries

import os
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from tensorflow import keras
from keras.models import Sequential
from keras.layers import Dense

In [2]:
# Read data from github. Use raw format and copy url# Note normal url and raw url will be different.
import pandas as pd
pd.options.display.max_rows=None
pd.options.display.max_columns=None
url = 'https://raw.githubusercontent.com/PacktPublishing/Ensemble-Machine-Learning-Cookbook/master/Chapter09/energydata.csv'
df_energydata= pd.read_csv(url)
#df = pd.read_csv(url)
print(df_energydata.head(5))

            date  Appliances  lights     T1       RH_1    T2       RH_2  \
0  1/11/16 17:00          60      30  19.89  47.596667  19.2  44.790000   
1  1/11/16 17:10          60      30  19.89  46.693333  19.2  44.722500   
2  1/11/16 17:20          50      30  19.89  46.300000  19.2  44.626667   
3  1/11/16 17:30          50      40  19.89  46.066667  19.2  44.590000   
4  1/11/16 17:40          60      40  19.89  46.333333  19.2  44.530000   

      T3       RH_3         T4       RH_4         T5   RH_5        T6  \
0  19.79  44.730000  19.000000  45.566667  17.166667  55.20  7.026667   
1  19.79  44.790000  19.000000  45.992500  17.166667  55.20  6.833333   
2  19.79  44.933333  18.926667  45.890000  17.166667  55.09  6.560000   
3  19.79  45.000000  18.890000  45.723333  17.166667  55.09  6.433333   
4  19.79  45.000000  18.890000  45.530000  17.200000  55.09  6.366667   

        RH_6         T7       RH_7    T8       RH_8         T9   RH_9  T_out  \
0  84.256667  17.200000  41.62

In [3]:
#Check missing values
df_energydata.isnull().sum()
#It seems there are no missing data in any featues


date           0
Appliances     0
lights         0
T1             0
RH_1           0
T2             0
RH_2           0
T3             0
RH_3           0
T4             0
RH_4           0
T5             0
RH_5           0
T6             0
RH_6           0
T7             0
RH_7           0
T8             0
RH_8           0
T9             0
RH_9           0
T_out          0
Press_mm_hg    0
RH_out         0
Windspeed      0
Visibility     0
Tdewpoint      0
rv1            0
rv2            0
dtype: int64

In [4]:
#Separate the test subset to apply the models in order to make predictions
df_traindata,df_testdata=train_test_split(df_energydata,test_size=0.3)

In [5]:
#Check the shape of the train and test subsets:
print(df_traindata.shape)
print(df_testdata.shape)

(13814, 29)
(5921, 29)


In [6]:
#Take the test subset and split it into target and feature variables
X_test=df_testdata.iloc[:,3:27]
Y_test=df_testdata.iloc[:,28]

In [7]:
#Validate the preceding split by checking the shape of X_test and Y_test
print(X_test.shape)
print(Y_test.shape)

(5921, 24)
(5921,)


In [8]:
#Let's create multiple neural network models using Keras. We use For...Loop to build multiple models
ensemble=20
frac=0.7

predictions_total=np.zeros(5921,dtype=float)

for i in range(ensemble):
  print("number of iteration :",i)
  print("prediction_total",predictions_total)

  #Sample randomly the train data
  Traindata=df_traindata.sample(frac=frac)
  X_train=Traindata.iloc[:,3:27]
  Y_train=Traindata.iloc[:,28]

  model=Sequential()
   # Adding the input layer and the first hidden layer
  model.add(Dense(units=16,kernel_initializer='normal',activation='relu'))
  model.add(Dense(units=24,kernel_initializer='normal',activation='relu'))
  model.add(Dense(units = 32, kernel_initializer = 'normal', activation = 'relu'))
  model.add(Dense(units = 1, kernel_initializer = 'normal', activation = 'relu'))

  #compiling the ANN
  adam=keras.optimizers.Adam(lr=0.001,beta_1=0.9,beta_2=0.9,epsilon=None,decay=0.0)
  model.compile(loss='mse',optimizer=adam,metrics=['mean_squared_error'])

  #Fitting on training set
  model.fit(X_train, Y_train,batch_size=16,epochs=25)

  #Predict the values
  model_predictions=model.predict(X_test)
  model_predictions=model_predictions.flatten()
  print("Test MSE for individual model: ",mean_squared_error(Y_test,model_predictions))
  print("")
  print(model_predictions)
  print("")

predictions_total=np.add(predictions_total, model_predictions)



number of iteration : 0
prediction_total [0. 0. 0. ... 0. 0. 0.]


  super(Adam, self).__init__(name, **kwargs)


Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25
Test MSE for individual model:  836.9763385502636

[0. 0. 0. ... 0. 0. 0.]

number of iteration : 1
prediction_total [0. 0. 0. ... 0. 0. 0.]


  super(Adam, self).__init__(name, **kwargs)


Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25
Test MSE for individual model:  836.9763385502636

[0. 0. 0. ... 0. 0. 0.]

number of iteration : 2
prediction_total [0. 0. 0. ... 0. 0. 0.]


  super(Adam, self).__init__(name, **kwargs)


Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25
Test MSE for individual model:  836.9763385502636

[0. 0. 0. ... 0. 0. 0.]

number of iteration : 3
prediction_total [0. 0. 0. ... 0. 0. 0.]


  super(Adam, self).__init__(name, **kwargs)


Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25
Test MSE for individual model:  210.00485779829677

[23.83239  24.170807 23.71089  ... 24.047583 24.180254 24.092232]

number of iteration : 4
prediction_total [0. 0. 0. ... 0. 0. 0.]


  super(Adam, self).__init__(name, **kwargs)


Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25
Test MSE for individual model:  211.2335782426322

[23.654163 23.485935 23.49876  ... 23.626942 23.440626 23.639618]

number of iteration : 5
prediction_total [0. 0. 0. ... 0. 0. 0.]


  super(Adam, self).__init__(name, **kwargs)


Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25
Test MSE for individual model:  208.84021474114022

[25.417936 25.679611 25.497574 ... 25.647476 25.44191  25.378365]

number of iteration : 6
prediction_total [0. 0. 0. ... 0. 0. 0.]


  super(Adam, self).__init__(name, **kwargs)


Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25
Test MSE for individual model:  209.36668146028376

[25.706144 25.73814  26.192846 ... 26.216892 25.856136 26.030573]

number of iteration : 7
prediction_total [0. 0. 0. ... 0. 0. 0.]


  super(Adam, self).__init__(name, **kwargs)


Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25
Test MSE for individual model:  836.9763385502636

[0. 0. 0. ... 0. 0. 0.]

number of iteration : 8
prediction_total [0. 0. 0. ... 0. 0. 0.]


  super(Adam, self).__init__(name, **kwargs)


Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25
Test MSE for individual model:  209.25468996138807

[24.320217 24.686432 24.503359 ... 24.562792 24.396818 24.261383]

number of iteration : 9
prediction_total [0. 0. 0. ... 0. 0. 0.]


  super(Adam, self).__init__(name, **kwargs)


Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25
Test MSE for individual model:  209.64991902706765

[26.021946 25.856943 26.214796 ... 26.224009 25.751293 25.771387]

number of iteration : 10
prediction_total [0. 0. 0. ... 0. 0. 0.]


  super(Adam, self).__init__(name, **kwargs)


Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25
Test MSE for individual model:  209.21116846670228

[24.542082 24.612036 24.47388  ... 24.592241 24.411184 24.330912]

number of iteration : 11
prediction_total [0. 0. 0. ... 0. 0. 0.]


  super(Adam, self).__init__(name, **kwargs)


Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25
Test MSE for individual model:  836.9763385502636

[0. 0. 0. ... 0. 0. 0.]

number of iteration : 12
prediction_total [0. 0. 0. ... 0. 0. 0.]


  super(Adam, self).__init__(name, **kwargs)


Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25
Test MSE for individual model:  210.88323991781047

[23.67442  23.888689 23.565075 ... 23.726633 23.691141 23.500113]

number of iteration : 13
prediction_total [0. 0. 0. ... 0. 0. 0.]


  super(Adam, self).__init__(name, **kwargs)


Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25
Test MSE for individual model:  836.9763385502636

[0. 0. 0. ... 0. 0. 0.]

number of iteration : 14
prediction_total [0. 0. 0. ... 0. 0. 0.]


  super(Adam, self).__init__(name, **kwargs)


Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25
Test MSE for individual model:  836.9763385502636

[0. 0. 0. ... 0. 0. 0.]

number of iteration : 15
prediction_total [0. 0. 0. ... 0. 0. 0.]


  super(Adam, self).__init__(name, **kwargs)


Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25
Test MSE for individual model:  836.9763385502636

[0. 0. 0. ... 0. 0. 0.]

number of iteration : 16
prediction_total [0. 0. 0. ... 0. 0. 0.]


  super(Adam, self).__init__(name, **kwargs)


Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25
Test MSE for individual model:  211.71499590665863

[23.502918 23.46439  23.219273 ... 23.543373 23.41851  23.26132 ]

number of iteration : 17
prediction_total [0. 0. 0. ... 0. 0. 0.]


  super(Adam, self).__init__(name, **kwargs)


Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25
Test MSE for individual model:  209.4452103789244

[24.489233 24.446594 24.04379  ... 24.390451 24.279915 24.040594]

number of iteration : 18
prediction_total [0. 0. 0. ... 0. 0. 0.]


  super(Adam, self).__init__(name, **kwargs)


Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25
Test MSE for individual model:  836.9763385502636

[0. 0. 0. ... 0. 0. 0.]

number of iteration : 19
prediction_total [0. 0. 0. ... 0. 0. 0.]


  super(Adam, self).__init__(name, **kwargs)


Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25
Test MSE for individual model:  836.9763385502636

[0. 0. 0. ... 0. 0. 0.]



In [9]:
#Take the summation of the predicted values and divide them by the number of iterations to get the average predicted values. We use the average predicted values to calculate the mean-squared error (MSE) for our ensemble
predictions_total=predictions_total/ensemble
print("MSE after ensemble:",mean_squared_error(np.array(Y_test),predictions_total))

MSE after ensemble: 836.9763385502636
