# **Heating load and cooling load prediction of buildings (energy efficiency) as a function of building parameters**

## Source:

The dataset was created by Angeliki Xifara (angxifara '@' gmail.com, Civil/Structural Engineer) and was processed by Athanasios Tsanas (tsanasthanasis '@' gmail.com, Oxford Centre for Industrial and Applied Mathematics, University of Oxford, UK).

### Data Set Information:

We perform energy analysis using 12 different building shapes simulated in Ecotect. The buildings differ with respect to the glazing area, the glazing area distribution, and the orientation, amongst other parameters. We simulate various settings as functions of the aforementioned characteristics to obtain 768 building shapes. The dataset comprises 768 samples and 8 features, and the aim is to predict two real-valued responses. It can also be treated as a multi-class classification problem if each response is rounded to the nearest integer.
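As a rough illustration of the classification framing mentioned above, the sketch below rounds a few responses to integer class labels and counts the resulting classes. The sample values are hypothetical and are not taken from the dataset.

```python
from collections import Counter

# Hypothetical heating-load values, for illustration only (not from the dataset).
heating_load = [15.55, 20.84, 21.46, 32.26, 15.98, 20.71]

# Round each continuous response to the nearest integer to get a class label.
# Note: Python's round() sends exact .5 ties to the nearest even integer.
class_labels = [round(v) for v in heating_load]

# Count how many samples fall into each integer class.
class_counts = Counter(class_labels)

print(class_labels)
print(class_counts)
```

Rounding can produce many sparsely populated classes, so the class counts are worth inspecting before training a classifier on them.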

### Attribute Information:

The dataset contains eight attributes (or features, denoted by X1...X8) and two responses (or outcomes, denoted by y1 and y2). The aim is to use the eight features to predict each of the two responses.

Specifically:

X1 Relative Compactness

X2 Surface Area

X3 Wall Area

X4 Roof Area

X5 Overall Height

X6 Orientation

X7 Glazing Area

X8 Glazing Area Distribution

y1 Heating Load

y2 Cooling Load
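The codes listed above can be kept as a small lookup table so that plots and reports show readable names. This is only a sketch: the helper `describe_columns` is a hypothetical name, and the response columns are written as `Y1` and `Y2` to match the column names used by the notebook's code.

```python
# Mapping from the short column codes to the attribute names listed above.
COLUMN_NAMES = {
    "X1": "Relative Compactness",
    "X2": "Surface Area",
    "X3": "Wall Area",
    "X4": "Roof Area",
    "X5": "Overall Height",
    "X6": "Orientation",
    "X7": "Glazing Area",
    "X8": "Glazing Area Distribution",
    "Y1": "Heating Load",
    "Y2": "Cooling Load",
}


def describe_columns(columns):
    """Return the descriptive name for each code; unknown codes pass through unchanged."""
    return [COLUMN_NAMES.get(code, code) for code in columns]


print(describe_columns(["X1", "X7", "Y2"]))
```

With pandas, the same dictionary could be passed to `DataFrame.rename(columns=COLUMN_NAMES)` on a copy of the data, leaving the original codes in place for the rest of the notebook.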


### Importing the libraries

In [None]:
import tensorflow as tf
import tensorflow.keras as ks
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model
from sklearn.model_selection import train_test_split

## Creating functions to preprocess and visualize the data

In [None]:
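def format_output(data):
  # Pops the two targets out of the DataFrame in place, leaving only the features.
  y1 = np.array(data.pop('Y1'))
  y2 = np.array(data.pop('Y2'))
  return y1, y2

def normalization(x):
  # Standardizes each feature with the training-set mean and std (global train_stats).
  return (x - train_stats['mean']) / train_stats['std']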

def plot_diff(y_true, y_preds, title=''):
  plt.figure(figsize=(5, 5))
  plt.scatter(y_true, y_preds)
  plt.xlabel('True values')
  plt.ylabel('Predicted values')
  plt.axis('square')
  plt.xlim(plt.xlim())
  plt.ylim(plt.ylim())
  plt.plot([-100, 100], [-100, 100])
  plt.title(title)
  plt.show()

def plot_metrics(metric_name, title='', ylim=5):
  plt.plot(history.history[metric_name], color='blue', label=metric_name)
  plt.plot(history.history['val_'+metric_name], color='red', label='val_'+metric_name)
  plt.ylim(0, ylim)
  plt.title(title)
  plt.legend()
  plt.show()

In [None]:
# URL = 'https://archive.ics.uci.edu/ml/machine-learning-databases/00242/ENB2012_data.xlsx'

df = pd.read_csv('../input/eergy-efficiency-dataset/ENB2012_data.csv')
print(df.shape)
df.sample(4)

In [None]:
df.isna().sum()

In [None]:
train, test = train_test_split(df, test_size=0.2)
train_stats = train.describe()

train_stats.pop('Y1')
train_stats.pop('Y2')

train_stats = train_stats.T

train_y = format_output(train)
test_y = format_output(test)

norm_train_x = normalization(train)
norm_test_x = normalization(test)

norm_train_x.sample(2)

In [None]:
train.info()

In [None]:
train_stats
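The `normalization` helper standardizes every feature with the mean and standard deviation of the training split, and the same statistics are reused for the test split. The sketch below shows that idea for a single feature in plain Python. The values and the `zscore` helper are hypothetical, and `statistics.stdev` is used because it is the sample standard deviation, which is also what `DataFrame.describe()` reports.

```python
import statistics

# Hypothetical values for a single feature, for illustration only.
train_values = [514.5, 563.5, 612.5, 661.5, 710.5]
test_values = [588.0, 686.0]

# Statistics come from the training split only, so no test information leaks in.
mean = statistics.mean(train_values)
std = statistics.stdev(train_values)  # sample standard deviation (n - 1 denominator)


def zscore(values):
    """Standardize values with the training mean and standard deviation."""
    return [(v - mean) / std for v in values]


norm_train = zscore(train_values)
norm_test = zscore(test_values)

print(norm_train)
print(norm_test)
```

After this transformation the training values have zero mean and unit standard deviation, while the test values generally do not, since they are scaled with statistics they did not contribute to. A feature that is constant in the training split would have a standard deviation of zero and needs separate handling.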

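**To predict the heating load and the cooling load, we'll use a Functional API model with a single input, five hidden dense layers, and two outputs. The first output branches off after the third dense layer; the second output follows two further dense layers.**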

In [None]:
input_layer = Input(shape=(len(train.columns), ))

dense_1 = Dense(units=128, activation='relu', name='Dense_1')(input_layer)
dense_2 = Dense(units=128, activation='relu', name='Dense_2')(dense_1)
dense_3 = Dense(units=256, activation='relu', name='Dense_3')(dense_2)

output_1 = Dense(units=1, name='output_1')(dense_3)

dense_4 = Dense(units=64, activation='relu', name='dense_4')(dense_3)
dense_5 = Dense(units=128, activation='relu', name='dense_5')(dense_4)

output_2 = Dense(units=1, name='output_2')(dense_5)

model = Model(inputs=input_layer, outputs=[output_1, output_2])

model.summary()

In [None]:
ks.utils.plot_model(model)

In [None]:
sgd = ks.optimizers.SGD(learning_rate=0.001)  # `lr` is a deprecated alias for `learning_rate`

model.compile(optimizer=sgd,
              loss={'output_1': 'mse', 'output_2': 'mse'},
              metrics={'output_1': ks.metrics.RootMeanSquaredError(),
                       'output_2': ks.metrics.RootMeanSquaredError()})

In [None]:
history = model.fit(x=norm_train_x, y=train_y, epochs=700, batch_size=10, verbose=0,
                    validation_data=(norm_test_x, test_y))

In [None]:
loss, y1_loss, y2_loss, y1_rmse, y2_rmse = model.evaluate(x=norm_test_x, y=test_y)
print(f'Loss = {loss}\nY1_Loss = {y1_loss}\nY2_Loss = {y2_loss}\ny1_rmse = {y1_rmse}\ny2_rmse = {y2_rmse}')
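The five numbers printed above are related. For a multi-output Keras model the overall loss is the sum of the per-output losses (assuming the default loss weights of 1), and each RMSE should be close to the square root of the corresponding MSE loss. The sketch below reproduces that arithmetic in plain Python on hypothetical targets and predictions; the `mse` helper is written here for illustration and is not a Keras function.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error between two equally sized sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical targets and predictions, for illustration only.
y1_true, y1_pred = [15.0, 20.0, 30.0], [16.0, 19.0, 32.0]
y2_true, y2_pred = [18.0, 25.0, 33.0], [18.0, 27.0, 30.0]

y1_mse = mse(y1_true, y1_pred)
y2_mse = mse(y2_true, y2_pred)

# Overall loss: sum of the per-output losses (default loss weights of 1 assumed).
total_loss = y1_mse + y2_mse

# RMSE is the square root of the MSE, so it is in the same units as the load itself.
y1_rmse = math.sqrt(y1_mse)
y2_rmse = math.sqrt(y2_mse)

print(total_loss, y1_rmse, y2_rmse)
```

Because RMSE is expressed in the units of the response, it is usually the easier figure to compare against the range of heating and cooling loads.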

In [None]:
y_preds = model.predict(norm_test_x)
plt.style.use('ggplot')
plot_diff(test_y[0], y_preds[0], title='plot_difference for y1')
plot_diff(test_y[1], y_preds[1], title='plot_difference for y2')

In [None]:
plot_metrics('output_1_root_mean_squared_error', title='performance of the model for Y1', ylim=5)
plot_metrics('output_2_root_mean_squared_error', title='performance of the model for Y2', ylim=5)