# Neural Networks

#### This notebook covers our team's Neural Network Implementation.

We tried five architecture and hyperparameter combinations in total. We train on 'train_clean.csv', which is the training dataset obtained after applying all the EDA steps described in 'EDA.ipynb'.
At the end of the notebook, we compare the performance of the five models and predict on the test set using the best neural network model.
 

In [8]:
import pandas as pd
import numpy as np


In [9]:
df_train = pd.read_csv('../../datasets/final/train_clean.csv')

In [10]:
df_train.head()

Unnamed: 0,rent_approval_date,flat_type,floor_area_sqm,lease_commence_date,latitude,longitude,monthly_rent,distance_to_nearest_existing_mrt,distance_to_nearest_planned_mrt,distance_to_nearest_school,...,town_pasir ris,town_punggol,town_queenstown,town_sembawang,town_sengkang,town_serangoon,town_tampines,town_toa payoh,town_woodlands,town_yishun
0,0.26674,0.25,0.18232,0.320755,1.344518,103.73863,1600,0.2719,0.067848,0.143355,...,False,False,False,False,False,False,False,False,False,False
1,0.532382,0.5,0.320442,0.226415,1.330186,103.938717,2250,0.353866,0.092239,0.277286,...,False,False,False,False,False,False,False,False,False,False
2,0.700329,0.25,0.18232,0.09434,1.332242,103.845643,1900,0.074831,0.391439,0.187977,...,False,False,False,False,False,False,False,True,False,False
3,0.232711,1.0,0.635359,0.509434,1.370239,103.962894,2850,0.619229,0.050944,0.256304,...,True,False,False,False,False,False,False,False,False,False
4,0.734358,0.25,0.187845,0.113208,1.320502,103.863341,2100,0.062221,0.297298,0.112373,...,False,False,False,False,False,False,False,False,False,False


Preparing and Splitting the dataset into train and val

In [11]:
X_train = df_train.drop('monthly_rent', axis=1)
y_train = df_train['monthly_rent']

In [12]:
X_train = X_train.astype('float32')
y_train = y_train.astype('float32')

In [13]:
from sklearn.model_selection import train_test_split

# Splitting the data into training (80%) and validation (20%)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.2, random_state=42)


#### Model-1: 
This is a simple architecture that narrows from 128 to 32 neurons, widens again to 128 in the mid-section, and then narrows to 32 before the single output neuron. The input dimension is 46, which is the number of features in 'train_clean'. The model is trained for 100 epochs.

In [71]:
import numpy as np
import tensorflow as tf
from tensorflow import keras
from keras import layers
from sklearn.model_selection import train_test_split
from tqdm.auto import tqdm

model = keras.Sequential()

model.add(layers.Input(shape=(46,)))

model.add(layers.Dense(128, activation='relu'))
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(32, activation='relu'))
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(128, activation='relu'))
model.add(layers.Dense(32, activation='relu'))

model.add(layers.Dense(1))

model.compile(optimizer='adam', loss='mean_squared_error')

batch_size = 32
epochs = 100

total_batches = len(X_train) // batch_size

for epoch in range(epochs):
    epoch_losses = []

    for i in range(0, len(X_train), batch_size):
        batch_X = X_train[i:i + batch_size]
        batch_y = y_train[i:i + batch_size]
        loss = model.train_on_batch(batch_X, batch_y)
        epoch_losses.append(loss)

    
    # Each batch loss is an MSE, so the square root of their mean approximates the epoch RMSE
    rmse = np.sqrt(np.mean(epoch_losses))

    print(f'Epoch {epoch + 1}/{epochs}, RMSE: {rmse:.4f}')




Epoch 1/100, RMSE: 785.8766
Epoch 2/100, RMSE: 522.9919
Epoch 3/100, RMSE: 514.7362
Epoch 4/100, RMSE: 512.5783
Epoch 5/100, RMSE: 511.5088
Epoch 6/100, RMSE: 510.8309
Epoch 7/100, RMSE: 510.3884
Epoch 8/100, RMSE: 510.0048
Epoch 9/100, RMSE: 509.6690
Epoch 10/100, RMSE: 509.4381
Epoch 11/100, RMSE: 509.1405
Epoch 12/100, RMSE: 508.9160
Epoch 13/100, RMSE: 508.7526
Epoch 14/100, RMSE: 508.4435
Epoch 15/100, RMSE: 508.2274
Epoch 16/100, RMSE: 507.9902
Epoch 17/100, RMSE: 507.8317
Epoch 18/100, RMSE: 507.5518
Epoch 19/100, RMSE: 507.2876
Epoch 20/100, RMSE: 507.0728
Epoch 21/100, RMSE: 506.9119
Epoch 22/100, RMSE: 506.6514
Epoch 23/100, RMSE: 506.4364
Epoch 24/100, RMSE: 506.2107
Epoch 25/100, RMSE: 505.9876
Epoch 26/100, RMSE: 505.8299
Epoch 27/100, RMSE: 505.5454
Epoch 28/100, RMSE: 505.4079
Epoch 29/100, RMSE: 505.2676
Epoch 30/100, RMSE: 504.9849
Epoch 31/100, RMSE: 504.8386
Epoch 32/100, RMSE: 504.6833
Epoch 33/100, RMSE: 504.6223
Epoch 34/100, RMSE: 504.4175
Epoch 35/100, RMSE: 504

In [None]:
# Saving the model
model.save("nn_regression_model_1.h5")

In [27]:
predictions = model.predict(X_val)

predictions = predictions.reshape(-1)  # flatten (n_samples, 1) to 1-D without hard-coding the row count

rmse_val = np.sqrt(np.mean((predictions - y_val) ** 2))
print(f'RMSE on Validation Set: {rmse_val:.4f}')
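
The training loop above reports the square root of the plain mean of the per-batch losses (taken here to be per-batch MSE values), while the validation cell computes RMSE over all rows at once. The two conventions agree only when every batch has the same size; a shorter final batch gets extra weight in the unweighted mean. The sketch below illustrates this in plain Python with made-up residuals. The helper names `mse`, `batches`, `rmse_unweighted` and `rmse_weighted` are illustrative and not part of the notebook.

```python
import math

def mse(errors):
    """Mean squared error of a list of residuals."""
    return sum(e * e for e in errors) / len(errors)

def batches(values, batch_size):
    """Split a list into consecutive batches; the last one may be shorter."""
    return [values[i:i + batch_size] for i in range(0, len(values), batch_size)]

def rmse_unweighted(errors, batch_size):
    """sqrt of the plain mean of per-batch MSEs (the training-loop convention)."""
    losses = [mse(b) for b in batches(errors, batch_size)]
    return math.sqrt(sum(losses) / len(losses))

def rmse_weighted(errors, batch_size):
    """sqrt of per-batch MSEs weighted by batch size; equals the overall RMSE."""
    parts = batches(errors, batch_size)
    total = sum(mse(b) * len(b) for b in parts)
    return math.sqrt(total / len(errors))

# Made-up residuals (prediction minus target), purely for illustration.
residuals = [1.0, 1.0, 1.0, 1.0, 3.0]

overall = math.sqrt(mse(residuals))
print(f"overall RMSE:          {overall:.4f}")
print(f"unweighted batch RMSE: {rmse_unweighted(residuals, 2):.4f}")
print(f"weighted batch RMSE:   {rmse_weighted(residuals, 2):.4f}")
```

With a batch size of 32 and tens of thousands of rows the gap is small, but it is one reason the training and validation numbers are not strictly comparable.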

#### Model-2: 
This is a deeper model in which the layer width is doubled at each layer up to 512 and then narrowed back down to 8. A learning rate scheduler is used here. The model is trained for 100 epochs.

In [None]:
import numpy as np
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras import layers  # needed for layers.Input below
from keras.layers import Dense, Dropout
from keras.optimizers import Adam
from keras.regularizers import l2
from keras.callbacks import LearningRateScheduler

from keras.callbacks import ReduceLROnPlateau
from keras import backend as K

# Root Mean Squared Error (RMSE) custom loss function
def rmse(y_true, y_pred):
    return K.sqrt(K.mean(K.square(y_pred - y_true)))

def lr_schedule(epoch):
    initial_learning_rate = 0.001
    decay_factor = 0.96
    decay_steps = 3
    lr = initial_learning_rate * (decay_factor ** (epoch // decay_steps))
    return lr

model = Sequential()
model.add(layers.Input(shape=(46,)))
model.add(Dense(64, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(128, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(256, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(512, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(512, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(256, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(128, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(64, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(16, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(8, activation='relu', kernel_regularizer=l2(0.001)))

model.add(Dense(1))

initial_learning_rate = 0.001
optimizer = Adam(learning_rate=initial_learning_rate)
model.compile(optimizer=optimizer, loss=rmse)

batch_size = 32
epochs = 100

lr_callback = LearningRateScheduler(lr_schedule)

history = model.fit(X_train, y_train, batch_size=batch_size, epochs=epochs, validation_data=(X_val, y_val), callbacks=[lr_callback])

model.save("nn_regression_model_2.h5")


In [39]:
# Making predictions on the validation set
predictions = model.predict(X_val)
a,b = predictions.shape
predictions = predictions.reshape(a,)

# Calculating RMSE on the validation set
rmse_val = np.sqrt(np.mean((predictions - y_val) ** 2))
print(f'RMSE on Validation Set: {rmse_val:.4f}')

RMSE on Validation Set: 492.2489
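
The step-decay schedule passed to `LearningRateScheduler` for Model-2 multiplies the learning rate by 0.96 once every 3 epochs. A small standalone sketch, copying `lr_schedule` from the cell above, shows how slowly the rate falls over the 100 epochs; the `epochs_to_show` list is just an illustrative choice.

```python
def lr_schedule(epoch):
    """Step decay: multiply the initial rate by 0.96 once every 3 epochs."""
    initial_learning_rate = 0.001
    decay_factor = 0.96
    decay_steps = 3
    return initial_learning_rate * (decay_factor ** (epoch // decay_steps))

# Epoch indices are 0-based, as Keras passes them to the scheduler.
epochs_to_show = [0, 2, 3, 30, 99]
for epoch in epochs_to_show:
    print(f"epoch {epoch:>3}: lr = {lr_schedule(epoch):.6f}")
```

Under this schedule the rate at the last of 100 epochs is 0.96^33 times the initial value, which is still roughly a quarter of it.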


#### Model-3: 
This model is deeper than Model-2. The number of neurons is increased up to 1024 and then brought back down. A learning rate scheduler is employed here too, with a decay factor of 0.96. A 'batch_size' of 64 is used and the model is trained for 100 epochs.

In [None]:
import numpy as np
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras import layers  # needed for layers.Input below
from keras.layers import Dense, Dropout
from keras.optimizers import Adam
from keras.regularizers import l2
from keras.callbacks import LearningRateScheduler

from keras.callbacks import ReduceLROnPlateau
from keras import backend as K

def rmse(y_true, y_pred):
    return K.sqrt(K.mean(K.square(y_pred - y_true)))

def lr_schedule(epoch):
    initial_learning_rate = 0.001
    decay_factor = 0.96
    decay_steps = 3
    lr = initial_learning_rate * (decay_factor ** (epoch // decay_steps))
    return lr



model = Sequential()
model.add(layers.Input(shape=(46,)))
model.add(Dense(64, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(128, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(256, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(512, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(1024, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(1024, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(512, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(256, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(128, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(64, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(16, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(8, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(4, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(2, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(1))

initial_learning_rate = 0.001  # defined here so the cell does not depend on the Model-2 cell
optimizer = Adam(learning_rate=initial_learning_rate)
model.compile(optimizer=optimizer, loss=rmse)

batch_size = 64
epochs = 100

lr_callback = LearningRateScheduler(lr_schedule)

history = model.fit(X_train, y_train, batch_size=batch_size, epochs=epochs, validation_data=(X_val, y_val), callbacks=[lr_callback])


model.save("nn_regression_model_3.h5")


In [43]:
# Making predictions on the validation set
predictions = model.predict(X_val)
a,b = predictions.shape
predictions = predictions.reshape(a,)

# Calculating RMSE on the validation set
rmse_val = np.sqrt(np.mean((predictions - y_val) ** 2))
print(f'RMSE on Validation Set: {rmse_val:.4f}')

RMSE on Validation Set: 493.9776


#### Model-4: 
Increasing the number of neurons did not yield better results, so we trimmed the model down and capped the widest layers at 256 neurons. A 'batch_size' of 32 is used and the model is trained for 200 epochs.
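
To make the size difference concrete, the trainable parameters of a stack of `Dense` layers can be counted directly: each layer contributes `inputs * units` weights plus `units` biases. The sketch below applies this to the layer widths of Model-3 and Model-4 as listed in their cells; `dense_param_count` is a hypothetical helper, not part of the notebook or of Keras.

```python
def dense_param_count(input_dim, widths):
    """Count weights and biases of fully connected layers stacked in order."""
    total = 0
    fan_in = input_dim
    for units in widths:
        total += fan_in * units + units  # weight matrix plus bias vector
        fan_in = units
    return total

INPUT_DIM = 46  # number of input features used in the notebook

# Layer widths copied from the Model-3 and Model-4 cells, output neuron included.
model_3 = [64, 128, 256, 512, 1024, 1024, 512, 256, 128, 64, 16, 8, 4, 2, 1]
model_4 = [64, 128, 256, 256, 128, 64, 16, 8, 4, 2, 1]

for name, widths in [("Model-3", model_3), ("Model-4", model_4)]:
    print(f"{name}: {dense_param_count(INPUT_DIM, widths):,} parameters")
```

By this count Model-4 has about 0.15 million parameters against about 2.4 million for Model-3, which can be cross-checked with `model.summary()`.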

In [None]:
import numpy as np
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras import layers  # needed for layers.Input below
from keras.layers import Dense, Dropout
from keras.optimizers import Adam
from keras.regularizers import l2
from keras.callbacks import LearningRateScheduler

from keras.callbacks import ReduceLROnPlateau
from keras import backend as K

def rmse(y_true, y_pred):
    return K.sqrt(K.mean(K.square(y_pred - y_true)))

def lr_schedule(epoch):
    initial_learning_rate = 0.001
    decay_factor = 0.96
    decay_steps = 3
    lr = initial_learning_rate * (decay_factor ** (epoch // decay_steps))
    return lr


model = Sequential()
model.add(layers.Input(shape=(46,)))
model.add(Dense(64, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(128, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(256, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(256, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(128, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(64, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(16, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(8, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(4, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(2, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(1))

initial_learning_rate = 0.001  # defined here so the cell does not depend on the Model-2 cell
optimizer = Adam(learning_rate=initial_learning_rate)
model.compile(optimizer=optimizer, loss=rmse)

batch_size = 32
epochs = 200

lr_callback = LearningRateScheduler(lr_schedule)

history = model.fit(X_train, y_train, batch_size=batch_size, epochs=epochs, validation_data=(X_val, y_val), callbacks=[lr_callback])

model.save("nn_regression_model_4.h5")


In [49]:
# Making predictions on the validation set
predictions = model.predict(X_val)
a,b = predictions.shape
predictions = predictions.reshape(a,)

# Calculating RMSE on the validation set
rmse_val = np.sqrt(np.mean((predictions - y_val) ** 2))
print(f'RMSE on Validation Set: {rmse_val:.4f}')

RMSE on Validation Set: 491.8362


#### Model-5: 
This is an even more trimmed down model, more similar to Model-1. The model is trained for 200 epochs.

In [14]:
import numpy as np
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import Adam
from keras.regularizers import l2
from keras.callbacks import LearningRateScheduler
from keras.metrics import RootMeanSquaredError

from keras.callbacks import ReduceLROnPlateau
from keras import backend as K

from tensorflow import keras
from keras import layers

def lr_schedule(epoch):
    initial_learning_rate = 0.001
    decay_factor = 0.96
    decay_steps = 3
    lr = initial_learning_rate * (decay_factor ** (epoch // decay_steps))
    return lr


model = Sequential()
model.add(layers.Input(shape=(46,)))
model.add(Dense(64, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(128, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(128, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(64, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(46, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(16, activation='relu', kernel_regularizer=l2(0.001)))
model.add(Dense(1))

initial_learning_rate = 0.001

optimizer = Adam(learning_rate=initial_learning_rate)
model.compile(optimizer=optimizer, loss='mse', metrics=[RootMeanSquaredError()])

batch_size = 32
epochs = 200

lr_callback = LearningRateScheduler(lr_schedule)

history = model.fit(X_train, y_train, batch_size=batch_size, epochs=epochs, validation_data=(X_val, y_val), callbacks=[lr_callback])



In [15]:
model.save("nn_regression_model_5_final.h5")



In [16]:
# Make predictions on the validation set
predictions = model.predict(X_val)
a,b = predictions.shape
predictions = predictions.reshape(a,)
# Calculate RMSE on the validation set
rmse_val = np.sqrt(np.mean((predictions - y_val) ** 2))
print(f'RMSE on Validation Set: {rmse_val:.4f}')

RMSE on Validation Set: 483.6177
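
Because Model-5 is compiled with `RootMeanSquaredError` as a metric, the `history` object returned by `fit` should contain a per-epoch validation RMSE, by default under the key `val_root_mean_squared_error`. A minimal sketch of finding the best epoch from such a dictionary follows; the numbers are placeholders, not values from this run, and `best_epoch` is a hypothetical helper.

```python
def best_epoch(history_dict, key="val_root_mean_squared_error"):
    """Return (1-based epoch, value) with the lowest validation RMSE."""
    values = history_dict[key]
    index = min(range(len(values)), key=lambda i: values[i])
    return index + 1, values[index]

# Placeholder dictionary shaped like `history.history`; NOT real results.
example_history = {
    "root_mean_squared_error": [3.0, 2.0, 1.5, 1.2],
    "val_root_mean_squared_error": [3.1, 2.4, 2.6, 2.9],
}

epoch, value = best_epoch(example_history)
print(f"lowest validation RMSE {value} at epoch {epoch}")
```

In the notebook this would be called as `best_epoch(history.history)` after training.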


### Model performance comparison:
RMSE of each model on the validation set:

- Model-1: 495.3260
- Model-2: 492.2489
- Model-3: 493.9776
- Model-4: 491.8362
- Model-5: 483.6177

Only Model-5 gets below a validation RMSE of 490 (483.6177); the other four networks stay between 491 and 496. We suspect that the volume of available data is not enough for the networks to learn effectively, which would explain why their performance is not as good as that of the traditional and tree-based regressors.
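
The ranking can be reproduced from the validation RMSE values listed above. The short sketch below picks the lowest one and reports how far each model is from Model-1 in relative terms; the dictionary name `val_rmse` is only illustrative.

```python
# Validation RMSE values as reported in the comparison above.
val_rmse = {
    "Model-1": 495.3260,
    "Model-2": 492.2489,
    "Model-3": 493.9776,
    "Model-4": 491.8362,
    "Model-5": 483.6177,
}

best_name = min(val_rmse, key=val_rmse.get)
baseline = val_rmse["Model-1"]

# List the models from best to worst with their relative gain over Model-1.
for name, value in sorted(val_rmse.items(), key=lambda item: item[1]):
    change = 100.0 * (baseline - value) / baseline
    print(f"{name}: RMSE {value:.4f} ({change:.2f}% below Model-1)")

print(f"Best model: {best_name}")
```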

### Model-5 has the lowest validation RMSE, so we predict on the test dataset using the saved Model-5.

In [85]:
X_test = pd.read_csv('../../datasets/final/test_clean.csv')
X_test.head()


Unnamed: 0,rent_approval_date,flat_type,floor_area_sqm,lease_commence_date,latitude,longitude,distance_to_nearest_existing_mrt,distance_to_nearest_planned_mrt,distance_to_nearest_school,distance_to_nearest_mall,...,town_pasir ris,town_punggol,town_queenstown,town_sembawang,town_sengkang,town_serangoon,town_tampines,town_toa payoh,town_woodlands,town_yishun
0,0.801317,0.75,0.480663,0.339623,1.358411,103.891722,0.321608,0.092001,0.052465,0.3173,...,0,0,0,0,0,0,0,0,0,0
1,0.667398,0.5,0.364641,0.622642,1.446343,103.820817,0.111405,0.9332,0.049352,0.109394,...,0,0,0,1,0,0,0,0,0,0
2,1.0,0.5,0.314917,0.264151,1.305719,103.762168,0.435355,0.074979,0.489875,0.213565,...,0,0,0,0,0,0,0,0,0,0
3,0.232711,0.25,0.220994,0.377358,1.344832,103.730778,0.133972,0.109898,0.506897,0.685062,...,0,0,0,0,0,0,0,0,0,0
4,0.465423,0.75,0.480663,0.320755,1.345437,103.735241,0.169311,0.079866,0.329833,0.627168,...,0,0,0,0,0,0,0,0,0,0


In [18]:
# Final model (Model-5): predict on the test set
from keras.models import load_model

X_test = pd.read_csv('../../datasets/final/test_clean.csv')

# Load the saved model and predict on the test features
model = load_model("nn_regression_model_5_final.h5")
predictions = model.predict(X_test)

# Flatten the (n_samples, 1) output to a 1-D array
predictions_test = predictions.reshape(-1)
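
Since the network expects the same input columns, in the same order, as the training features, it may be worth comparing column names before calling `predict`. The sketch below shows one way to do so on plain lists of names; `check_feature_alignment` and the shortened column lists are hypothetical, and in the notebook the inputs would be the training feature columns and `X_test.columns`.

```python
def check_feature_alignment(train_columns, test_columns):
    """Return columns missing from test, extra in test, and whether order matches."""
    train_columns = list(train_columns)
    test_columns = list(test_columns)
    missing = [c for c in train_columns if c not in test_columns]
    extra = [c for c in test_columns if c not in train_columns]
    same_order = train_columns == test_columns
    return missing, extra, same_order

# Hypothetical, shortened column lists used only to exercise the helper.
train_cols = ["flat_type", "floor_area_sqm", "latitude", "longitude"]
test_cols = ["flat_type", "latitude", "floor_area_sqm", "longitude"]

missing, extra, same_order = check_feature_alignment(train_cols, test_cols)
print("missing from test:", missing)
print("extra in test:", extra)
print("same order:", same_order)
```

If the names match but the order differs, reindexing the test frame to the training column order before prediction would be one possible fix.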



In [19]:
from utils.data_utils import save_test_predictions_in_kaggle_format

save_test_predictions_in_kaggle_format(predictions_test, 'neural_network_best_model', True)

Unnamed: 0,Id,Predicted
0,0,3074.443604
1,1,2702.907227
2,2,3535.118896
3,3,1915.916992
4,4,2759.219971
...,...,...
29995,29995,2898.543457
29996,29996,3007.866211
29997,29997,2702.607910
29998,29998,3363.416504
