# Artificial Neural Network for predicting heating load
First, we import the packages and load the data.

In [9]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import mean_squared_error
from pandas.plotting import scatter_matrix
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import StandardScaler

data = pd.read_csv('heating_load.csv')

### Splitting and transforming data
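The data is split into a training set and a validation set. The training set is used to search for the best ANN architecture, and the held-out validation set is used to check how well that architecture performs on data it has not seen.
The features are also standardized (zero mean, unit variance, using statistics from the training set only) and the targets are converted to 1-D arrays, which is the input format scikit-learn's MLPRegressor expects.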

In [10]:
# split data into train (which will be 5-fold cross-validated to determine the best model) and validation
train, val = train_test_split(data, test_size=0.2, random_state=1)
train_X = train.loc[:, '# surface area (m^2)':'roof area  (m^2)']
train_Y = train.loc[:, ['heating load (BTU)']]
val_X = val.loc[:, '# surface area (m^2)':'roof area  (m^2)']
val_Y = val.loc[:, ['heating load (BTU)']]
scaler = StandardScaler()
train_X = scaler.fit_transform(train_X)
val_X = scaler.transform(val_X)
train_Y = train_Y.to_numpy().squeeze()
val_Y = val_Y.to_numpy().squeeze()
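
The scaling step above can be illustrated with a minimal sketch. It uses made-up synthetic features rather than `heating_load.csv`, and all `demo_` names are hypothetical. The point is that the scaler's mean and standard deviation come from the training rows only and are then reused unchanged on the validation rows.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a feature table (hypothetical values, not the real data)
rng = np.random.default_rng(1)
demo_X = rng.normal(loc=[500.0, 3.5], scale=[80.0, 1.0], size=(200, 2))

demo_train_X, demo_val_X = train_test_split(demo_X, test_size=0.2, random_state=1)

demo_scaler = StandardScaler()
demo_train_scaled = demo_scaler.fit_transform(demo_train_X)  # statistics learned from train only
demo_val_scaled = demo_scaler.transform(demo_val_X)          # same statistics reused, no refit

print(demo_train_scaled.mean(axis=0))  # numerically zero by construction
print(demo_train_scaled.std(axis=0))   # one by construction
print(demo_val_scaled.mean(axis=0))    # near zero, but generally not exactly zero
```

Fitting the scaler on the validation rows as well would let information from the held-out set leak into preprocessing, which is why only `transform` is called on it.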

### Preparing to tune the ANN model
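We create `param_grid`, a dictionary listing every hyperparameter value we want to try for the ANN: all two-layer architectures with 10 to 19 neurons per layer, one three-layer architecture, two activation functions, and two batch sizes.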

In [11]:
hidden_layers = [(20, 20, 20)] # trying one with three just to see if there is any large difference
for i in range(10, 20):
    for j in range(10, 20):
        hidden_layers.append((i,j))

param_grid = {
    "hidden_layer_sizes": hidden_layers,
    "activation": ["relu", "tanh"],
    "batch_size": [32, 64], 
    "learning_rate": ["constant"]
}
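
To get a feel for the cost of the search, the grid size can be counted before fitting anything. The sketch below rebuilds the same grid and uses scikit-learn's `ParameterGrid` to enumerate it. The fold count of 5 is an assumption based on the `GridSearchCV` default when `cv` is not passed.

```python
from sklearn.model_selection import ParameterGrid

# Rebuild the grid exactly as defined in the cell above
hidden_layers = [(20, 20, 20)]
for i in range(10, 20):
    for j in range(10, 20):
        hidden_layers.append((i, j))

param_grid = {
    "hidden_layer_sizes": hidden_layers,
    "activation": ["relu", "tanh"],
    "batch_size": [32, 64],
    "learning_rate": ["constant"],
}

n_candidates = len(ParameterGrid(param_grid))  # 101 architectures * 2 * 2 * 1
n_folds = 5                                    # assumed: GridSearchCV default for cv
print(n_candidates)             # 404
print(n_candidates * n_folds)   # 2020 model fits, plus one final refit
```

With roughly two thousand network fits, the search can take a while. Passing `n_jobs=-1` to `GridSearchCV` is one way to spread the fits across CPU cores.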

### Tuning the ANN
GridSearchCV evaluates every combination in `param_grid` with 5-fold cross-validation (its default) and selects the combination with the lowest mean MSE across folds. It then refits that combination on the full training set.

In [12]:
ann_reg = MLPRegressor(max_iter=1000)
grid_search = GridSearchCV(ann_reg, param_grid, scoring="neg_mean_squared_error", return_train_score=True)
grid_search.fit(train_X, train_Y)



GridSearchCV(estimator=MLPRegressor(max_iter=1000),
             param_grid={'activation': ['relu', 'tanh'], 'batch_size': [32, 64],
                         'hidden_layer_sizes': [(20, 20, 20), (10, 10),
                                                (10, 11), (10, 12), (10, 13),
                                                (10, 14), (10, 15), (10, 16),
                                                (10, 17), (10, 18), (10, 19),
                                                (11, 10), (11, 11), (11, 12),
                                                (11, 13), (11, 14), (11, 15),
                                                (11, 16), (11, 17), (11, 18),
                                                (11, 19), (12, 10), (12, 11),
                                                (12, 12), (12, 13), (12, 14),
                                                (12, 15), (12, 16), (12, 17),
                                                (12, 18), ...],
                         'lea

In [15]:
print(grid_search.best_params_)

{'activation': 'tanh', 'batch_size': 32, 'hidden_layer_sizes': (18, 10), 'learning_rate': 'constant'}


In [16]:
print(grid_search.best_estimator_)

MLPRegressor(activation='tanh', batch_size=32, hidden_layer_sizes=(18, 10),
             max_iter=1000)
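
Beyond the single best configuration, the full ranking is available in `cv_results_`. The following is a minimal sketch on a small synthetic problem (the data, the two-candidate grid, and all `demo_` names are hypothetical, chosen so the example runs quickly). It converts the negated MSE scores into an approximate RMSE per candidate.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPRegressor

# Small synthetic regression problem (stand-in for the heating load data)
demo_X, demo_y = make_regression(n_samples=150, n_features=4, noise=5.0, random_state=0)

demo_search = GridSearchCV(
    MLPRegressor(max_iter=200, random_state=0),  # may emit convergence warnings
    {"hidden_layer_sizes": [(5,), (10,)]},
    scoring="neg_mean_squared_error",
    cv=3,
)
demo_search.fit(demo_X, demo_y)

results = pd.DataFrame(demo_search.cv_results_)
# Scores are negated MSE, so flip the sign before taking the square root.
# Note: this is the root of the mean fold MSE, not the mean of per-fold RMSEs.
results["mean_cv_rmse_approx"] = np.sqrt(-results["mean_test_score"])
print(results[["params", "mean_cv_rmse_approx", "rank_test_score"]]
      .sort_values("rank_test_score"))
```

Applied to `grid_search` from this notebook, the same table would show how close the runner-up architectures are to the selected one, which helps judge whether the chosen layer sizes matter much.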


### Validating the optimal ANN
Now that the best ANN has been found, it is evaluated on the held-out validation set.

In [8]:
best_ann = grid_search.best_estimator_
pred_val = best_ann.predict(val_X)
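# RMSE on the validation set; np.sqrt is used because the `squared` argument
# of mean_squared_error has been removed in recent scikit-learn releases
print(np.sqrt(mean_squared_error(val_Y, pred_val)))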

2.5502987812186295
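
The number printed above is a root mean squared error, which is in the same units as the heating load. As a small worked example with made-up values (not taken from the dataset), the metric can be computed by hand and checked against scikit-learn:

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Made-up targets and predictions, for illustration only
y_true = np.array([10.0, 20.0, 30.0, 40.0])
y_pred = np.array([12.0, 18.0, 33.0, 39.0])

# Squared errors are 4, 4, 9, 1, so the MSE is 18 / 4 = 4.5
rmse_manual = np.sqrt(np.mean((y_true - y_pred) ** 2))
rmse_sklearn = np.sqrt(mean_squared_error(y_true, y_pred))

print(rmse_manual)   # sqrt(4.5), about 2.12
print(rmse_sklearn)  # same value
```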


This RMSE is lower than that of the linear regression, even when the linear model is given engineered features. Because an ANN is largely a black box, the difference is hard to explain explicitly. Essentially, the hidden layers learn their own nonlinear combinations of the inputs, which lets the network capture relationships in the data and use them to predict the heating load well.
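
To make the idea of hidden layers as learned features concrete, the sketch below reproduces a fitted network's prediction by hand from its weight matrices. It uses a tiny synthetic problem and a hypothetical `demo_ann`, not the tuned model from this notebook. Each hidden layer output is a `tanh` of a weighted sum of the previous layer, and the final prediction is a plain linear combination of the last hidden layer.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.neural_network import MLPRegressor

# Tiny synthetic problem (stand-in data, for illustration only)
X_demo, y_demo = make_regression(n_samples=100, n_features=3, noise=1.0, random_state=0)

demo_ann = MLPRegressor(hidden_layer_sizes=(4, 3), activation="tanh",
                        max_iter=200, random_state=0)
demo_ann.fit(X_demo, y_demo)  # may emit a convergence warning; irrelevant here

# Forward pass by hand: every hidden layer builds new nonlinear features
hidden = X_demo
for W, b in zip(demo_ann.coefs_[:-1], demo_ann.intercepts_[:-1]):
    hidden = np.tanh(hidden @ W + b)

# The output layer of a regressor is linear in the last hidden features
manual_pred = (hidden @ demo_ann.coefs_[-1] + demo_ann.intercepts_[-1]).ravel()

print(hidden.shape)  # (100, 3): three learned features per sample
print(np.allclose(manual_pred, demo_ann.predict(X_demo)))
```

Seen this way, the network is a linear regression on features it constructs itself, which is one way to read the comparison with the hand-engineered linear model.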