<a href="https://colab.research.google.com/github/A-Wadhwani/ME498-Project/blob/main/04_LGBM_Catboost_FNN_Weighted_Average.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This is an ensemble of three models:
- LightGBM
- Feedforward Neural Network
- Catboost

The models are trained indivdually, and then the coefficients for a weighted average:

$Y_{true} = aY_{FNN} +  bY_{Catboost} +  cY_{LightGBM}$

Are created by simple Linear Regression.

In [6]:
# Open drive

from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [7]:
# Installing catboost
!pip install catboost



In [67]:
import numpy as np
import matplotlib.pyplot as plt
import datetime
import seaborn as sns
import tensorflow as tfb
import pandas as pd
import random
import time
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from datetime import datetime
from sklearn.metrics import r2_score, mean_squared_log_error, mean_squared_error

import lightgbm as lgb
import catboost as cb
print("TensorFlow version: ",tf.__version__)  #print the version of tensorflow

TensorFlow version:  2.4.1


In [9]:
import tensorflow as tf
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))  

Num GPUs Available:  1


In [10]:
# To build a Feedforward Neural Network
from tensorflow.python.keras.layers import Dense
from tensorflow.keras.layers import Dropout
from tensorflow.python.keras.models import Sequential
from tensorflow.python.keras.wrappers.scikit_learn import KerasRegressor
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras import Sequential 
from tensorflow.keras.layers import Dense, Activation, Dropout
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras import regularizers
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
from tensorflow.keras import regularizers

In [11]:
# Reading training and testing dataset files
path = "drive/My Drive/ASHRAE_DATA/"
x_train = pd.read_csv(path + "x_train.csv", index_col=0)
x_test = pd.read_csv(path + "x_test.csv", index_col=0)
y_train = pd.read_csv(path + "y_train.csv", index_col=0)
y_test = pd.read_csv(path + "y_test.csv", index_col=0)

  mask |= (ar1 == a)


In [12]:
x_train.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 1024067 entries, 13765924 to 158735
Data columns (total 18 columns):
 #   Column              Non-Null Count    Dtype  
---  ------              --------------    -----  
 0   site_id             1024067 non-null  int64  
 1   air_temperature     1024067 non-null  float64
 2   dew_temperature     1024067 non-null  float64
 3   sea_level_pressure  1024067 non-null  float64
 4   wind_direction      1024067 non-null  float64
 5   wind_speed          1024067 non-null  float64
 6   building_id         1024067 non-null  int64  
 7   primary_use         1024067 non-null  int64  
 8   square_feet         1024067 non-null  int64  
 9   year_built          1024067 non-null  float64
 10  floor_count         1024067 non-null  float64
 11  Year                1024067 non-null  int64  
 12  Month               1024067 non-null  int64  
 13  Day_of_Month        1024067 non-null  int64  
 14  Day_of_Year         1024067 non-null  int64  
 15  Day_of_We

In [13]:
# Scaling data
from sklearn.preprocessing import StandardScaler
x_scaler = StandardScaler() 
y_scaler = StandardScaler()

x_train_scaled = x_scaler.fit_transform(x_train)
y_train_scaled = y_scaler.fit_transform(y_train)

x_test_scaled = x_scaler.fit_transform(x_test)
y_test_scaled = y_scaler.transform(y_test)

## Feedforward Neural Network Training

I have included Dropouts to prevent overfitting. I have also included a reduce learning rate feature to make the model decrease learning rate if the MSE does not decrease. I have also included Early Stopping to halt training when the change in MSE is not large.

For hyperparameters, I found that larger batch sizes were necessary for good accuracy. I tested batch sizes of [32, 64, 128, 256, 1024, 2048, 4096] and found 1024 to be the perfect balance between speed and accuracy.


In [22]:
# Creating Feedforward Neural Network Model
fnn_model = Sequential()
fnn_model.add(Dense(256, input_shape=(18, ), activation='relu', name='inputs'))
fnn_model.add(Dense(128, activation='relu', name='dense_4'))
fnn_model.add(Dropout(0.3))
fnn_model.add(Dense(64, activation='relu', name='dense_5'))
fnn_model.add(Dropout(0.3))
fnn_model.add(Dense(32, activation='relu', name='dense_6'))
fnn_model.add(Dropout(0.3))
fnn_model.add(Dense(16, activation='relu', name='dense_7'))
fnn_model.add(Dense(1, activation='linear', name='dense_output'))

fnn_model.summary()

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
inputs (Dense)               (None, 256)               4864      
_________________________________________________________________
dense_4 (Dense)              (None, 128)               32896     
_________________________________________________________________
dropout_6 (Dropout)          (None, 128)               0         
_________________________________________________________________
dense_5 (Dense)              (None, 64)                8256      
_________________________________________________________________
dropout_7 (Dropout)          (None, 64)                0         
_________________________________________________________________
dense_6 (Dense)              (None, 32)                2080      
_________________________________________________________________
dropout_8 (Dropout)          (None, 32)               

In [23]:
opt = keras.optimizers.Adam(learning_rate = 0.01)
fnn_model.compile(loss='mse', optimizer=opt, metrics=['mse', 'mae'])

#Reduce Learning rate on Plateau 
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=10, verbose = 1)

#Earlystopping callback
early_stop = EarlyStopping(monitor ='val_loss', min_delta= 1e-3, patience = 50, verbose = 1, restore_best_weights=True)

start_time = time.time()
history = fnn_model.fit(x_train_scaled, y_train_scaled, callbacks = [early_stop, reduce_lr], 
                    validation_data=(x_test_scaled, y_test_scaled), epochs=400, batch_size=1024, verbose=1)
end_time = time.time()
print("Training Time: ", end_time-start_time, " seconds")

Epoch 1/400
Epoch 2/400
Epoch 3/400
Epoch 4/400
Epoch 5/400
Epoch 6/400
Epoch 7/400
Epoch 8/400
Epoch 9/400
Epoch 10/400
Epoch 11/400
Epoch 12/400
Epoch 13/400
Epoch 14/400
Epoch 15/400
Epoch 16/400
Epoch 17/400
Epoch 18/400
Epoch 19/400

Epoch 00019: ReduceLROnPlateau reducing learning rate to 0.0019999999552965165.
Epoch 20/400
Epoch 21/400
Epoch 22/400
Epoch 23/400
Epoch 24/400
Epoch 25/400
Epoch 26/400
Epoch 27/400
Epoch 28/400
Epoch 29/400
Epoch 30/400
Epoch 31/400
Epoch 32/400
Epoch 33/400
Epoch 34/400
Epoch 35/400
Epoch 36/400
Epoch 37/400
Epoch 38/400
Epoch 39/400
Epoch 40/400
Epoch 41/400
Epoch 42/400
Epoch 43/400
Epoch 44/400
Epoch 45/400
Epoch 46/400
Epoch 47/400
Epoch 48/400
Epoch 49/400
Epoch 50/400
Epoch 51/400
Epoch 52/400
Epoch 53/400
Epoch 54/400
Epoch 55/400
Epoch 56/400
Epoch 57/400
Epoch 58/400
Epoch 59/400
Epoch 60/400
Epoch 61/400
Epoch 62/400
Epoch 63/400
Epoch 64/400
Epoch 65/400
Epoch 66/400
Epoch 67/400
Epoch 68/400
Epoch 69/400
Epoch 70/400
Epoch 71/400
Epoch

In [24]:
# Make predictions
start_time = time.time()
y_predict_scaled = fnn_model.predict(x_test_scaled)
end_time = time.time()
print("Prediction Time: ", end_time-start_time, " seconds")
y_predict = y_scaler.inverse_transform(y_predict_scaled)

Prediction Time:  10.483282566070557  seconds


In [25]:
# Evaluate accuracy
print("MSE: ", mean_squared_error(y_test, y_predict))
print("R^2 Score: ", r2_score(y_test, y_predict))

MSE:  4811.329506747989
R^2 Score:  0.9427561225784716


In [28]:
fnn_model.save("FNNModel.h5")

# Catboost Model

In [110]:
from catboost import CatBoostRegressor

params = {
        'n_estimators': 5000,
        'learning_rate': 0.25,
        'eval_metric': 'RMSE',
        'loss_function': 'RMSE',
        'metric_period': 10,
        'task_type': 'GPU',
        'depth': 12,
}

cb_model = CatBoostRegressor(**params)
start_time = time.time()
cb_model.fit(x_train, y_train, eval_set=(x_test, y_test), use_best_model=True, verbose=100, early_stopping_rounds=100)
end_time = time.time()
print("Training Time: ", end_time-start_time, " seconds")

0:	learn: 229.0094245	test: 227.8561283	best: 227.8561283 (0)	total: 55.4ms	remaining: 4m 37s
100:	learn: 44.2079179	test: 45.2581648	best: 45.2581648 (100)	total: 2.19s	remaining: 1m 45s
200:	learn: 38.1181296	test: 40.0920984	best: 40.0920984 (200)	total: 4.25s	remaining: 1m 41s
300:	learn: 35.2105472	test: 38.0893134	best: 38.0893134 (300)	total: 6.43s	remaining: 1m 40s
400:	learn: 32.8232941	test: 36.5448243	best: 36.5448243 (400)	total: 8.96s	remaining: 1m 42s
500:	learn: 31.2608691	test: 35.5294649	best: 35.5294560 (499)	total: 11.5s	remaining: 1m 43s
600:	learn: 29.5973921	test: 34.3762726	best: 34.3762726 (600)	total: 14.3s	remaining: 1m 44s
700:	learn: 27.7732251	test: 33.2109054	best: 33.2109054 (700)	total: 17.5s	remaining: 1m 47s
800:	learn: 26.4468379	test: 32.4726187	best: 32.4726187 (800)	total: 20.6s	remaining: 1m 47s
900:	learn: 25.1268010	test: 31.6605174	best: 31.6605174 (900)	total: 23.7s	remaining: 1m 47s
1000:	learn: 23.9704538	test: 31.1018402	best: 31.1018402 (1

In [111]:
start_time = time.time()
y_predict = cb_model.predict(x_test)
end_time = time.time()
print("Prediction Time: ", end_time-start_time, " seconds")
print("MSE: ", mean_squared_error(y_test, y_predict))
print("R^2 Score: ", r2_score(y_test, y_predict))

Prediction Time:  9.947292804718018  seconds
MSE:  820.4632600056462
R^2 Score:  0.9902383534075645


In [113]:
cb_model.save_model("CBMModel.h5", format='cbm') # Save model

## Finding Weighted Average


In [116]:
# Loading LightGBM model:

lgbm_model = lgb.Booster(model_file='lightGBMModel.mod')

In [117]:
# Make predictions on training dataset for each model
lgb_pred = lgbm_model.predict(x_train)
fnn_pred = y_scaler.inverse_transform(fnn_model.predict(x_train_scaled))
cat_pred = cb_model.predict(x_train)

In [118]:
train_avg = pd.DataFrame()
train_avg['LightGBM'] = lgb_pred
train_avg['FNN'] = fnn_pred
train_avg['Catboost'] = cat_pred

train_avg.head()

Unnamed: 0,LightGBM,FNN,Catboost
0,270.681268,223.084335,262.350119
1,77.138039,54.236919,63.095787
2,-1.167184,18.62109,-2.708787
3,63.309174,50.970737,46.821493
4,134.556585,99.457054,129.633678


In [119]:
weighted_average = LinearRegression()

weighted_average.fit(train_avg, y_train)
print('Coefficients: ', weighted_average.coef_)
print('Intercept: %.2f' % weighted_average.intercept_)

Coefficients:  [[ 1.01798112 -0.03438864  0.01284118]]
Intercept: 0.41


In [120]:
# Evaluating Weighted Average Model
lgb_pred1 = lgbm_model.predict(x_test)
fnn_pred1 = y_scaler.inverse_transform(fnn_model.predict(x_test_scaled))
cat_pred1 = cb_model.predict(x_test)

test_avg = pd.DataFrame()
test_avg['LightGBM'] = lgb_pred1
test_avg['FNN'] = fnn_pred1
test_avg['Catboost'] = cat_pred1

test_avg.head()

Unnamed: 0,LightGBM,FNN,Catboost
0,54.867965,69.785774,65.84355
1,16.016331,18.62109,20.58438
2,283.091721,301.380249,288.583216
3,6.381409,18.62109,1.938315
4,-0.580406,18.62109,2.328023


In [121]:
y_predict = weighted_average.predict(test_avg)
print("MSE: ", mean_squared_error(y_test, y_predict))
print("R^2 Score: ", r2_score(y_test, y_predict))

MSE:  474.3699966150309
R^2 Score:  0.9943560760283418
