# **CS412 - Machine Learning - 2022**
### [Berna Yıldıran 26431]
## Assignment #2
100 pts


## Goal

The goal of this homework is two-fold:

*   Gain experience with neural network approaches
*   Gain experience with the Keras library

## Dataset
You are going to use a house price dataset that we prepared for you, which contains four independent variables (predictors) and one target variable. The task is to predict the target variable (house price) from the predictors (house attributes).


Download the data from SuCourse. Reserve 10% of the training data for validation and use the rest for development (learning your models). The official test data we provide (1,200 samples) should be used only for testing at the end, not for model selection.

## Task 
Build a regressor with a neural network that has only one hidden layer, using the Keras library function calls to predict house prices in the provided dataset.
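
To make the architecture concrete, here is a minimal, library-free sketch of the forward pass of a one-hidden-layer regressor: a ReLU hidden layer followed by a single linear output unit. The helper names `relu` and `forward` and all weight values are hypothetical, chosen only for illustration; in the assignment itself Keras builds and trains these layers.

```python
def relu(z):
    """Rectified linear unit: max(0, z)."""
    return z if z > 0.0 else 0.0


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a one-hidden-layer regressor (hypothetical helper).

    x        : list of input features
    w_hidden : one list of weights per hidden unit
    b_hidden : one bias per hidden unit
    w_out    : one output weight per hidden unit
    b_out    : output bias
    """
    hidden = [
        relu(sum(w * xi for w, xi in zip(weights, x)) + bias)
        for weights, bias in zip(w_hidden, b_hidden)
    ]
    # Linear output unit: the network returns a single real-valued prediction.
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Illustrative weights only: 2 inputs, 2 hidden units, 1 output.
prediction = forward(
    x=[1.0, 2.0],
    w_hidden=[[0.5, -1.0], [1.0, 1.0]],
    b_hidden=[0.0, -1.0],
    w_out=[2.0, 0.5],
    b_out=0.25,
)
print(prediction)
```

With these toy weights the first hidden unit is switched off by the ReLU and only the second one contributes to the prediction.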

Your code should follow the given skeleton and try the indicated parameters.

## Preprocessing and Meta-parameters
You should try 10, 50, and 100 as the hidden node count.

You should decide on the learning rate (step size). You can try values such as 0.001, 0.01, and 0.1, but you may need to increase it if learning is very slow or decrease it if you see the loss increase!
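
The effect of the step size can be seen on a toy problem. The sketch below runs plain gradient descent on f(w) = w² for the three suggested learning rates plus one deliberately oversized value (1.5, an assumption added here to show divergence). It is only an illustration: the function `gradient_descent` is hypothetical, and an adaptive optimizer such as Adam will not behave identically.

```python
def gradient_descent(learning_rate, steps=20, w=1.0):
    """Minimize f(w) = w**2 with plain gradient descent; return the loss after each step."""
    losses = [w ** 2]
    for _ in range(steps):
        gradient = 2.0 * w                 # derivative of w**2
        w = w - learning_rate * gradient   # one update step
        losses.append(w ** 2)
    return losses


# 0.001, 0.01 and 0.1 come from the text; 1.5 is an illustrative "too large" value.
histories = {lr: gradient_descent(lr) for lr in (0.001, 0.01, 0.1, 1.5)}
for lr, losses in histories.items():
    print(f"lr={lr}: first loss={losses[0]:.4f}, last loss={losses[-1]:.4g}")
```

On this toy problem the smallest rate barely moves in 20 steps, 0.1 converges quickly, and 1.5 makes the loss grow at every step, which is the symptom that calls for a smaller learning rate.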

You can use either sigmoid or ReLU activations for the hidden nodes (indicate which with your results). You should know what to use for the output layer activation, the input and output layer sizes, and the suitable loss function. 
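
As a hint on the loss function, the sketch below compares mean squared error and binary cross-entropy for a single continuous target. It assumes a target value of 0.5 purely for illustration; the helper names are hypothetical. Both losses are smallest when the prediction equals the target, but only the squared error reaches zero there, which makes it the easier quantity to read as a regression error.

```python
import math


def squared_error(y, p):
    """Squared error between target y and prediction p."""
    return (y - p) ** 2


def binary_cross_entropy(y, p):
    """Binary cross-entropy for a target y in [0, 1] and a prediction p in (0, 1)."""
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))


y = 0.5  # illustrative target strictly between 0 and 1

perfect_mse = squared_error(y, y)          # prediction equals the target
perfect_bce = binary_cross_entropy(y, y)   # same perfect prediction
worse_bce = binary_cross_entropy(y, 0.4)   # a slightly wrong prediction

print(perfect_mse, perfect_bce, worse_bce)
```

For a target of 0.5 the cross-entropy of a perfect prediction is ln 2 (about 0.693), so a cross-entropy value that plateaus well above zero does not by itself mean the fit is poor.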

## Software: 

Keras is a library that we will use mainly for deep learning, but it also covers basic neural network functionality.

You may find the necessary function references here: 

http://scikit-learn.org/stable/supervised_learning.html
https://keras.io/api/

When you search for Dense, for instance, you should easily find the relevant function and an explanation of its parameters.

## Submission: 

Fill this notebook. Write the report section at the end.

You should prepare a separate PDF document as your homework (named hw2-CS412-yourname.pdf). It should contain the report (Part 8 of the notebook) for easy viewing and include a link to your notebook. Use the link obtained from the share link at the top right, and **be sure to share with Sabancı University first**; otherwise there will be access problems. Also, do not forget to add your answers for Questions 2 and 3 on the assignment document.

## 1) Initialize

*   First make a copy of the notebook given to you as a starter.

*   Make sure you choose Connect from the upper right.


## 2) Load training dataset

* Upload the datasets (train.csv, test.csv) provided on SuCourse to your Google Drive and read them using Google Drive's mount functions. 
You may find the necessary functions here: 
https://colab.research.google.com/notebooks/io.ipynb

In [None]:
from google.colab import drive
drive.mount('/content/drive/') 
# click on the url that pops up and give the necessary authorizations

Drive already mounted at /content/drive/; to attempt to forcibly remount, call drive.mount("/content/drive/", force_remount=True).




*   Set your notebook's working directory to the path where the datasets are uploaded (cd is the Linux command for changing directory).
*   You may need to use cd drive/MyDrive depending on the path to the datasets on your Google Drive. (Do not comment out the code in the cells when using Linux commands.)






In [None]:
PATH = "/content/drive/My Drive/Colab Notebooks/CS412 - ML/Assignment#2/"

* List the files in the current directory.

In [None]:
ls

[0m[01;34mdrive[0m/  [01;34msample_data[0m/


## 3) Understanding the dataset (5 pts)

There are a lot of functions that can be used to learn more about this dataset:

- What is the shape of the training set (num of samples X number of attributes) **[shape function can be used]**

- Display attribute names **[columns function can be used]**

- Display the first 5 rows from training dataset **[head or sample functions can be used]**

..

In [None]:
# import the necessary libraries

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

train_df = pd.read_csv(PATH+"train.csv")

# show first 10 elements of the training data
train_df.head(10)

Unnamed: 0,sqmtrs,nrooms,view,crime_rate,price
0,251,5,west,low,925701.721399
1,211,3,west,high,622237.482636
2,128,5,east,low,694998.182376
3,178,3,east,high,564689.015926
4,231,3,west,low,811222.970379
5,253,5,north,high,766250.032506
6,101,1,north,low,512749.401548
7,242,1,north,high,637010.760148
8,174,5,west,high,638136.374869
9,328,2,south,high,787704.988273


In [None]:
# print the shape of data
print("Data dimensionality of the training set is: ", train_df.shape, "\n")

# also give some statistics about the data like mean, standard deviation etc.
print("Mean of the train set is:\n", train_df.mean(numeric_only=True), "\n")
print("Standard Deviation of the train set is:\n", train_df.std(numeric_only=True))

Data dimensionality of the training set is:  (4800, 5) 

Mean of the train set is:
 sqmtrs       225.033542
nrooms         2.983958
price     725756.960758
dtype: float64 

Standard Deviation of the train set is:
 sqmtrs        71.851436
nrooms         1.421251
price     151041.121658
dtype: float64


## 4) Preprocessing Steps (10 pts)

As some of the features (predictive variables) on this dataset are categorical (non-numeric) you need to do some preprocessing for those features.

You can use as many **dummy or indicator variables** as there are categories within one feature. You can also look at pandas' get_dummies or keras.utils.to_categorical functions.
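
The idea of one indicator variable per category can be sketched without any library. The helper `one_hot` below is hypothetical and only illustrates the encoding; the sample values are the first six entries of the view column shown earlier. pandas' get_dummies, mentioned above, produces the same kind of 0/1 columns directly on a DataFrame.

```python
def one_hot(values):
    """Return (categories, rows), where each row holds one 0/1 indicator per category."""
    categories = sorted(set(values))
    rows = [
        [1 if value == category else 0 for category in categories]
        for value in values
    ]
    return categories, rows


# First six entries of the 'view' column from the training data preview.
views = ["west", "west", "east", "east", "west", "north"]
categories, encoded = one_hot(views)

print(categories)
for view, row in zip(views, encoded):
    print(view, row)
```

Each row contains exactly one 1, in the position of its category, so no artificial ordering is imposed on the categories.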

In neural networks, scaling the features is important because their ranges affect the net input of a neuron as a whole. You should use the **MinMax scaler** from sklearn for this task, which scales the variables between 0 and 1 by default. (Remember that the mean squared error loss tends to be extremely large with unscaled features.)
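
Min-max scaling maps each value x to (x - min) / (max - min). The sketch below applies that formula to the first five sqmtrs values from the preview, using a hypothetical helper `min_max_scale`. Because the minimum and maximum are taken from these five values only, the results differ from what a scaler fitted on the whole training set would give.

```python
def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lowest, highest = min(values), max(values)
    span = highest - lowest
    if span == 0:
        # A constant column has no spread; map it to 0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(value - lowest) / span for value in values]


sqmtrs = [251, 211, 128, 178, 231]  # first five 'sqmtrs' values from the preview
scaled = min_max_scale(sqmtrs)

for raw, value in zip(sqmtrs, scaled):
    print(raw, round(value, 4))
```

In practice the minimum and maximum are usually learned from the training split and then reused unchanged for the validation and test data, so that all three are on the same scale.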


In [None]:
# encode the categorical variables --> view and crime_rate
# find categories of view
Categories_View = {}
for i in train_df['view']:
  Categories_View[i] = ""

print("Categories in View: ", Categories_View.keys())

# ------------------------------------------------------------------------------

# find categories of crime_rate
Categories_Crime_Rate = {}
for i in train_df['crime_rate']:
  Categories_Crime_Rate[i] = ""

print("Categories in Crime Rate: ", Categories_Crime_Rate.keys())

# ------------------------------------------------------------------------------

#train_df['view'] = train_df['view'].map(lambda x: target_mapping_view[x])
train_df['west'] = train_df['view'].map(lambda x: 1 if x == 'west' else 0)
train_df['east'] = train_df['view'].map(lambda x: 1 if x == 'east' else 0)
train_df['north'] = train_df['view'].map(lambda x: 1 if x == 'north' else 0)
train_df['south'] = train_df['view'].map(lambda x: 1 if x == 'south' else 0)
train_df.drop("view", inplace=True, axis=1)

# ------------------------------------------------------------------------------

# encode categories crime_rate
# low --> 0
# high --> 1

target_mapping_crime_rate = {'low': 0,
                             'high': 1}

train_df['crime_rate'] = train_df['crime_rate'].map(lambda x: target_mapping_crime_rate[x])

Categories in View:  dict_keys(['west', 'east', 'north', 'south'])
Categories in Crime Rate:  dict_keys(['low', 'high'])


In [None]:
# dataframe after turning categorical values into numeric
train_df.head(10)

Unnamed: 0,sqmtrs,nrooms,crime_rate,price,west,east,north,south
0,251,5,0,925701.721399,1,0,0,0
1,211,3,1,622237.482636,1,0,0,0
2,128,5,0,694998.182376,0,1,0,0
3,178,3,1,564689.015926,0,1,0,0
4,231,3,0,811222.970379,1,0,0,0
5,253,5,1,766250.032506,0,0,1,0
6,101,1,0,512749.401548,0,0,1,0
7,242,1,1,637010.760148,0,0,1,0
8,174,5,1,638136.374869,1,0,0,0
9,328,2,1,787704.988273,0,0,0,1


In [None]:
# scale the features between 0-1
from sklearn.preprocessing import MinMaxScaler
from sklearn import preprocessing
from sklearn.model_selection import train_test_split

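# fit the scaler on every column (features and target) and rescale to [0, 1]
msc = MinMaxScaler()
scaled = msc.fit_transform(train_df)

train_df = pd.DataFrame(scaled, columns=train_df.columns, index=train_df.index)

# refit on the four non-indicator columns; they already span [0, 1], so their values
# do not change, but msc now holds the parameters of these already-scaled columns
train_df[["sqmtrs", "nrooms", "crime_rate", "price"]] = msc.fit_transform(train_df[["sqmtrs", "nrooms", "crime_rate", "price"]])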

train_df.head(10)

Unnamed: 0,sqmtrs,nrooms,crime_rate,price,west,east,north,south
0,0.606426,1.0,0.0,0.791034,1.0,0.0,0.0,0.0
1,0.445783,0.5,1.0,0.369303,1.0,0.0,0.0,0.0
2,0.11245,1.0,0.0,0.470421,0.0,1.0,0.0,0.0
3,0.313253,0.5,1.0,0.289327,0.0,1.0,0.0,0.0
4,0.526104,0.5,0.0,0.631941,1.0,0.0,0.0,0.0
5,0.614458,1.0,1.0,0.569441,0.0,0.0,1.0,0.0
6,0.004016,0.0,0.0,0.217145,0.0,0.0,1.0,0.0
7,0.570281,0.0,1.0,0.389834,0.0,0.0,1.0,0.0
8,0.297189,1.0,1.0,0.391398,1.0,0.0,0.0,0.0
9,0.915663,0.25,1.0,0.599257,0.0,0.0,0.0,1.0


Don't forget to split the training data to obtain a validation set. **Use random_state=42**
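
What a seeded hold-out split does can be sketched with the standard library alone. The helper `holdout_split` below is hypothetical: it shuffles the row indices with a fixed seed and sets aside 10% for validation. It will not reproduce the exact rows chosen by sklearn's train_test_split, since the two use different random number generators; it only shows why a fixed seed makes the split repeatable.

```python
import random


def holdout_split(n_samples, validation_fraction=0.1, seed=42):
    """Shuffle sample indices with a fixed seed and split them into train/validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)   # same seed, same permutation on every run
    n_validation = round(n_samples * validation_fraction)
    return indices[n_validation:], indices[:n_validation]


train_idx, val_idx = holdout_split(4800)   # 4800 rows, as in the training set
print(len(train_idx), len(val_idx))
```

For 4800 rows this gives 4320 training indices and 480 validation indices, and calling the function again with the same seed returns the same partition.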

In [None]:
# split 90-10

X_all = train_df.drop('price', axis=1)
y_all = train_df[['price']]  # keep the target as a one-column DataFrame

X_train, X_val, y_train, y_val = train_test_split(X_all, y_all, test_size=0.1, random_state=42)

In [None]:
# print the shape of data
print("Data dimensionality of training set after preprocessing: ", train_df.shape, "\n")

print ("Train X Shape: ", X_train.shape)
print ("Train Y Shape: ", y_train.shape)

# also give some statistics about the data like mean, standard deviation etc.
print("\nMean of the training set after preprocessing:\n", train_df.mean(numeric_only=True), "\n")
print("Standard Deviation of the training set after preprocessing:\n", train_df.std(numeric_only=True))

Data dimensionality of training set after preprocessing:  (4800, 8) 

Train X Shape:  (4320, 7)
Train Y Shape:  (4320, 1)

Mean of the training set after preprocessing:
 sqmtrs        0.502143
nrooms        0.495990
crime_rate    0.501250
price         0.513167
west          0.235833
east          0.260833
north         0.250833
south         0.252500
dtype: float64 

Standard Deviation of the training set after preprocessing:
 sqmtrs        0.288560
nrooms        0.355313
crime_rate    0.500051
price         0.209905
west          0.424563
east          0.439135
north         0.433538
south         0.434492
dtype: float64


In [None]:
validation_df = pd.concat([X_val, y_val], axis=1)

# print the shape of data
print("Data dimensionality of validation set is: ", validation_df.shape, "\n")

print ("Val X Shape: ", X_val.shape)
print ("Val Y Shape: ", y_val.shape)

# also give some statistics about the data like mean, standard deviation etc.
print("\nMean of the validation set is:\n", validation_df.mean(numeric_only=True), "\n")
print("Standard Deviation of the validation set is:\n", validation_df.std(numeric_only=True))

Data dimensionality of validation set is:  (480, 8) 

Val X Shape:  (480, 7)
Val Y Shape:  (480, 1)

Mean of the validation set is:
 sqmtrs        0.491022
nrooms        0.502083
crime_rate    0.518750
west          0.229167
east          0.266667
north         0.250000
south         0.254167
price         0.503941
dtype: float64 

Standard Deviation of the validation set is:
 sqmtrs        0.291518
nrooms        0.359040
crime_rate    0.500170
west          0.420735
east          0.442678
north         0.433464
south         0.435846
price         0.220180
dtype: float64


## 5) Train neural networks on development data and do model selection using the validation data (55 pts)


* Train a neural network with **one hidden layer**, trying 3 different values for the number of neurons in that hidden layer (25, 50, and 100). You will need to choose the optimizer and the loss function for this model correctly. Use a batch_size of 64 and train each model for 30 epochs. 

* Train another neural network with two hidden layers with meta-parameters of your choice. Again, use batch_size as 64 and train the model for 30 epochs. 

* **Bonus (5 pts)** Train a KNN or a Decision Tree model with your own choice of meta parameters to predict the house prices.
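
The batch size and epoch count determine how many weight updates each model receives. The arithmetic below is a small sketch that assumes the 4320 training rows reported after the 90/10 split; the resulting number of steps per epoch corresponds to the step counter printed at the start of each log line during training.

```python
import math

n_train = 4320     # training rows after the 90/10 split
batch_size = 64
epochs = 30

# The final batch of an epoch may hold fewer than batch_size rows.
steps_per_epoch = math.ceil(n_train / batch_size)
last_batch_size = n_train - (steps_per_epoch - 1) * batch_size
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, last_batch_size, total_updates)
```

Here every epoch consists of 68 batches, the last of which holds 32 rows, for 2040 updates over 30 epochs.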


### One Layer Neural Network

In [None]:
import pandas as pd
import numpy as np
import tensorflow as tf
from sklearn.model_selection import train_test_split

# Keras pieces used by the models below, all taken from tensorflow.keras
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Input, Dropout
from tensorflow.keras.callbacks import ModelCheckpoint
from tensorflow.keras.optimizers import Adam

#### One-Layered Model Generation Function

In [None]:
# train one-hidden layered neural networks
# define your model architecture

# Define One-Hidden Layer Model
def One_Hidden_Layer_Model(hidden_layer_size, _learning_rate):
  print("Model // Hidden Layer Size ", hidden_layer_size)
  model = Sequential()
  model.add(Input((X_train.shape[1],)))
  model.add(Dense(hidden_layer_size, activation='relu'))
  # sigmoid keeps the prediction in [0, 1], the range of the MinMax-scaled price
  model.add(Dense(1, activation='sigmoid'))

  # compile the model with the Adam optimizer
  # note: binary cross-entropy is the training loss here and MSE is only tracked as a metric;
  # mean squared error is the more common loss for regression
  model.compile(loss='binary_crossentropy', optimizer=Adam(learning_rate=_learning_rate), metrics=['mse'])

  # fit the model on training data
  model.fit(x=X_train, y=y_train, batch_size=64, epochs=30, verbose=2)

  # evaluate on the held-out validation data
  model.evaluate(x=X_val, y=y_val)
  
  print("\n-----------------------------------------------------------------------")
  model.summary()  # prints the summary itself and returns None
  print("-----------------------------------------------------------------------\n")
  return model

In [None]:
# Parameters
hidden_layer_size_1  = [25, 50, 100]
learning_rates_1 = [0.1, 0.01, 0.001]

#### Model with Hidden Layer 25

Learning Rate = 0.1

In [None]:
# Hidden Layer Size = 25 / Learning Rate = 0.1
NN_Model_1L_25_1 = One_Hidden_Layer_Model(hidden_layer_size_1[0], learning_rates_1[0])

Model // Hidden Layer Size  25
Epoch 1/30
68/68 - 1s - loss: 0.6199 - mse: 0.0098 - 1s/epoch - 17ms/step
Epoch 2/30
68/68 - 0s - loss: 0.5991 - mse: 6.1421e-04 - 181ms/epoch - 3ms/step
Epoch 3/30
68/68 - 0s - loss: 0.5991 - mse: 6.3092e-04 - 150ms/epoch - 2ms/step
Epoch 4/30
68/68 - 0s - loss: 0.5992 - mse: 6.6892e-04 - 253ms/epoch - 4ms/step
Epoch 5/30
68/68 - 0s - loss: 0.5993 - mse: 7.1881e-04 - 142ms/epoch - 2ms/step
Epoch 6/30
68/68 - 0s - loss: 0.5992 - mse: 7.0760e-04 - 218ms/epoch - 3ms/step
Epoch 7/30
68/68 - 0s - loss: 0.5991 - mse: 6.8908e-04 - 159ms/epoch - 2ms/step
Epoch 8/30
68/68 - 0s - loss: 0.5990 - mse: 6.5018e-04 - 131ms/epoch - 2ms/step
Epoch 9/30
68/68 - 0s - loss: 0.5993 - mse: 7.3514e-04 - 118ms/epoch - 2ms/step
Epoch 10/30
68/68 - 0s - loss: 0.5991 - mse: 6.8615e-04 - 134ms/epoch - 2ms/step
Epoch 11/30
68/68 - 0s - loss: 0.5989 - mse: 6.0477e-04 - 140ms/epoch - 2ms/step
Epoch 12/30
68/68 - 0s - loss: 0.5989 - mse: 6.1659e-04 - 151ms/epoch - 2ms/step
Epoch 13/30


Learning Rate = 0.01

In [None]:
# Hidden Layer Size = 25 / Learning Rate = 0.01
NN_Model_1L_25_01 = One_Hidden_Layer_Model(hidden_layer_size_1[0], learning_rates_1[1])

Model // Hidden Layer Size  25
Epoch 1/30
68/68 - 1s - loss: 0.6147 - mse: 0.0073 - 1s/epoch - 19ms/step
Epoch 2/30
68/68 - 0s - loss: 0.5994 - mse: 6.7845e-04 - 201ms/epoch - 3ms/step
Epoch 3/30
68/68 - 0s - loss: 0.5992 - mse: 6.2338e-04 - 209ms/epoch - 3ms/step
Epoch 4/30
68/68 - 0s - loss: 0.5992 - mse: 6.1801e-04 - 186ms/epoch - 3ms/step
Epoch 5/30
68/68 - 0s - loss: 0.5991 - mse: 6.0022e-04 - 158ms/epoch - 2ms/step
Epoch 6/30
68/68 - 0s - loss: 0.5989 - mse: 4.9044e-04 - 178ms/epoch - 3ms/step
Epoch 7/30
68/68 - 0s - loss: 0.5987 - mse: 4.4413e-04 - 153ms/epoch - 2ms/step
Epoch 8/30
68/68 - 0s - loss: 0.5986 - mse: 4.0569e-04 - 162ms/epoch - 2ms/step
Epoch 9/30
68/68 - 0s - loss: 0.5984 - mse: 3.4685e-04 - 154ms/epoch - 2ms/step
Epoch 10/30
68/68 - 0s - loss: 0.5983 - mse: 3.1050e-04 - 244ms/epoch - 4ms/step
Epoch 11/30
68/68 - 0s - loss: 0.5982 - mse: 2.6941e-04 - 197ms/epoch - 3ms/step
Epoch 12/30
68/68 - 0s - loss: 0.5982 - mse: 2.6462e-04 - 140ms/epoch - 2ms/step
Epoch 13/30


Learning Rate = 0.001

In [None]:
# Hidden Layer Size = 25 / Learning Rate = 0.001
NN_Model_1L_25_001 = One_Hidden_Layer_Model(hidden_layer_size_1[0], learning_rates_1[2])

Model // Hidden Layer Size  25
Epoch 1/30
68/68 - 1s - loss: 0.6612 - mse: 0.0279 - 958ms/epoch - 14ms/step
Epoch 2/30
68/68 - 0s - loss: 0.6306 - mse: 0.0134 - 146ms/epoch - 2ms/step
Epoch 3/30
68/68 - 0s - loss: 0.6108 - mse: 0.0048 - 176ms/epoch - 3ms/step
Epoch 4/30
68/68 - 0s - loss: 0.6025 - mse: 0.0016 - 155ms/epoch - 2ms/step
Epoch 5/30
68/68 - 0s - loss: 0.6001 - mse: 8.1320e-04 - 170ms/epoch - 2ms/step
Epoch 6/30
68/68 - 0s - loss: 0.5997 - mse: 7.1679e-04 - 184ms/epoch - 3ms/step
Epoch 7/30
68/68 - 0s - loss: 0.5995 - mse: 6.7107e-04 - 149ms/epoch - 2ms/step
Epoch 8/30
68/68 - 0s - loss: 0.5994 - mse: 6.4147e-04 - 147ms/epoch - 2ms/step
Epoch 9/30
68/68 - 0s - loss: 0.5993 - mse: 6.1761e-04 - 181ms/epoch - 3ms/step
Epoch 10/30
68/68 - 0s - loss: 0.5992 - mse: 5.8664e-04 - 137ms/epoch - 2ms/step
Epoch 11/30
68/68 - 0s - loss: 0.5992 - mse: 5.7054e-04 - 128ms/epoch - 2ms/step
Epoch 12/30
68/68 - 0s - loss: 0.5991 - mse: 5.5653e-04 - 166ms/epoch - 2ms/step
Epoch 13/30
68/68 - 0

#### Model with Hidden Layer 50

Learning Rate = 0.1

In [None]:
# Hidden Layer Size = 50 / Learning Rate = 0.1
NN_Model_1L_50_1 = One_Hidden_Layer_Model(hidden_layer_size_1[1], learning_rates_1[0])

Model // Hidden Layer Size  50
Epoch 1/30
68/68 - 1s - loss: 0.6138 - mse: 0.0069 - 1s/epoch - 16ms/step
Epoch 2/30
68/68 - 0s - loss: 0.5981 - mse: 2.8864e-04 - 165ms/epoch - 2ms/step
Epoch 3/30
68/68 - 0s - loss: 0.5977 - mse: 1.7342e-04 - 209ms/epoch - 3ms/step
Epoch 4/30
68/68 - 0s - loss: 0.5976 - mse: 1.6091e-04 - 146ms/epoch - 2ms/step
Epoch 5/30
68/68 - 0s - loss: 0.5975 - mse: 1.3594e-04 - 141ms/epoch - 2ms/step
Epoch 6/30
68/68 - 0s - loss: 0.5975 - mse: 1.5238e-04 - 121ms/epoch - 2ms/step
Epoch 7/30
68/68 - 0s - loss: 0.5975 - mse: 1.3085e-04 - 124ms/epoch - 2ms/step
Epoch 8/30
68/68 - 0s - loss: 0.5975 - mse: 1.5334e-04 - 138ms/epoch - 2ms/step
Epoch 9/30
68/68 - 0s - loss: 0.5975 - mse: 1.4313e-04 - 139ms/epoch - 2ms/step
Epoch 10/30
68/68 - 0s - loss: 0.5975 - mse: 1.6484e-04 - 142ms/epoch - 2ms/step
Epoch 11/30
68/68 - 0s - loss: 0.5976 - mse: 1.8577e-04 - 126ms/epoch - 2ms/step
Epoch 12/30
68/68 - 0s - loss: 0.5976 - mse: 1.7522e-04 - 129ms/epoch - 2ms/step
Epoch 13/30


Learning Rate = 0.01

In [None]:
# Hidden Layer Size = 50 / Learning Rate = 0.01
NN_Model_1L_50_01 = One_Hidden_Layer_Model(hidden_layer_size_1[1], learning_rates_1[1])

Model // Hidden Layer Size  50
Epoch 1/30
68/68 - 1s - loss: 0.6167 - mse: 0.0083 - 515ms/epoch - 8ms/step
Epoch 2/30
68/68 - 0s - loss: 0.5991 - mse: 5.5991e-04 - 115ms/epoch - 2ms/step
Epoch 3/30
68/68 - 0s - loss: 0.5989 - mse: 4.8227e-04 - 88ms/epoch - 1ms/step
Epoch 4/30
68/68 - 0s - loss: 0.5987 - mse: 4.1115e-04 - 89ms/epoch - 1ms/step
Epoch 5/30
68/68 - 0s - loss: 0.5985 - mse: 3.5914e-04 - 85ms/epoch - 1ms/step
Epoch 6/30
68/68 - 0s - loss: 0.5984 - mse: 3.3908e-04 - 105ms/epoch - 2ms/step
Epoch 7/30
68/68 - 0s - loss: 0.5983 - mse: 2.9941e-04 - 89ms/epoch - 1ms/step
Epoch 8/30
68/68 - 0s - loss: 0.5981 - mse: 2.3593e-04 - 84ms/epoch - 1ms/step
Epoch 9/30
68/68 - 0s - loss: 0.5979 - mse: 2.0040e-04 - 87ms/epoch - 1ms/step
Epoch 10/30
68/68 - 0s - loss: 0.5978 - mse: 1.8907e-04 - 86ms/epoch - 1ms/step
Epoch 11/30
68/68 - 0s - loss: 0.5978 - mse: 1.7302e-04 - 98ms/epoch - 1ms/step
Epoch 12/30
68/68 - 0s - loss: 0.5976 - mse: 1.3779e-04 - 91ms/epoch - 1ms/step
Epoch 13/30
68/68 -

Learning Rate = 0.001

In [None]:
# Hidden Layer Size = 50 / Learning Rate = 0.001
NN_Model_1L_50_001 = One_Hidden_Layer_Model(hidden_layer_size_1[1], learning_rates_1[2])

Model // Hidden Layer Size  50
Epoch 1/30
68/68 - 1s - loss: 0.6592 - mse: 0.0271 - 528ms/epoch - 8ms/step
Epoch 2/30
68/68 - 0s - loss: 0.6230 - mse: 0.0102 - 105ms/epoch - 2ms/step
Epoch 3/30
68/68 - 0s - loss: 0.6041 - mse: 0.0022 - 94ms/epoch - 1ms/step
Epoch 4/30
68/68 - 0s - loss: 0.5998 - mse: 7.2671e-04 - 86ms/epoch - 1ms/step
Epoch 5/30
68/68 - 0s - loss: 0.5994 - mse: 6.3881e-04 - 88ms/epoch - 1ms/step
Epoch 6/30
68/68 - 0s - loss: 0.5993 - mse: 5.8492e-04 - 87ms/epoch - 1ms/step
Epoch 7/30
68/68 - 0s - loss: 0.5992 - mse: 5.5375e-04 - 87ms/epoch - 1ms/step
Epoch 8/30
68/68 - 0s - loss: 0.5991 - mse: 5.2586e-04 - 86ms/epoch - 1ms/step
Epoch 9/30
68/68 - 0s - loss: 0.5990 - mse: 5.0608e-04 - 104ms/epoch - 2ms/step
Epoch 10/30
68/68 - 0s - loss: 0.5990 - mse: 4.8395e-04 - 82ms/epoch - 1ms/step
Epoch 11/30
68/68 - 0s - loss: 0.5989 - mse: 4.6913e-04 - 84ms/epoch - 1ms/step
Epoch 12/30
68/68 - 0s - loss: 0.5989 - mse: 4.5182e-04 - 86ms/epoch - 1ms/step
Epoch 13/30
68/68 - 0s - lo

#### Model with Hidden Layer 100

Learning Rate = 0.1

In [None]:
# Hidden Layer Size = 100 / Learning Rate = 0.1
# (the output recorded below came from a run that passed learning_rates_1[2], i.e. 0.001)
NN_Model_1L_100_1 = One_Hidden_Layer_Model(hidden_layer_size_1[2], learning_rates_1[0])

Model // Hidden Layer Size  100
Epoch 1/30
68/68 - 1s - loss: 0.6613 - mse: 0.0281 - 523ms/epoch - 8ms/step
Epoch 2/30
68/68 - 0s - loss: 0.6157 - mse: 0.0071 - 91ms/epoch - 1ms/step
Epoch 3/30
68/68 - 0s - loss: 0.6007 - mse: 0.0010 - 109ms/epoch - 2ms/step
Epoch 4/30
68/68 - 0s - loss: 0.5995 - mse: 6.5293e-04 - 92ms/epoch - 1ms/step
Epoch 5/30
68/68 - 0s - loss: 0.5993 - mse: 5.7987e-04 - 88ms/epoch - 1ms/step
Epoch 6/30
68/68 - 0s - loss: 0.5991 - mse: 5.4580e-04 - 94ms/epoch - 1ms/step
Epoch 7/30
68/68 - 0s - loss: 0.5990 - mse: 5.0661e-04 - 97ms/epoch - 1ms/step
Epoch 8/30
68/68 - 0s - loss: 0.5990 - mse: 4.8442e-04 - 104ms/epoch - 2ms/step
Epoch 9/30
68/68 - 0s - loss: 0.5989 - mse: 4.5718e-04 - 88ms/epoch - 1ms/step
Epoch 10/30
68/68 - 0s - loss: 0.5988 - mse: 4.3135e-04 - 88ms/epoch - 1ms/step
Epoch 11/30
68/68 - 0s - loss: 0.5987 - mse: 3.9721e-04 - 90ms/epoch - 1ms/step
Epoch 12/30
68/68 - 0s - loss: 0.5986 - mse: 3.7478e-04 - 98ms/epoch - 1ms/step
Epoch 13/30
68/68 - 0s - l

Learning Rate = 0.01

In [None]:
# Hidden Layer Size = 100 / Learning Rate = 0.01
# (the output recorded below came from a run that passed learning_rates_1[2], i.e. 0.001)
NN_Model_1L_100_01 = One_Hidden_Layer_Model(hidden_layer_size_1[2], learning_rates_1[1])

Model // Hidden Layer Size  100
Epoch 1/30
68/68 - 0s - loss: 0.6688 - mse: 0.0318 - 495ms/epoch - 7ms/step
Epoch 2/30
68/68 - 0s - loss: 0.6190 - mse: 0.0085 - 89ms/epoch - 1ms/step
Epoch 3/30
68/68 - 0s - loss: 0.6015 - mse: 0.0013 - 94ms/epoch - 1ms/step
Epoch 4/30
68/68 - 0s - loss: 0.5996 - mse: 7.0725e-04 - 89ms/epoch - 1ms/step
Epoch 5/30
68/68 - 0s - loss: 0.5994 - mse: 6.2339e-04 - 103ms/epoch - 2ms/step
Epoch 6/30
68/68 - 0s - loss: 0.5992 - mse: 5.7228e-04 - 102ms/epoch - 1ms/step
Epoch 7/30
68/68 - 0s - loss: 0.5991 - mse: 5.2992e-04 - 94ms/epoch - 1ms/step
Epoch 8/30
68/68 - 0s - loss: 0.5990 - mse: 5.0149e-04 - 90ms/epoch - 1ms/step
Epoch 9/30
68/68 - 0s - loss: 0.5989 - mse: 4.8646e-04 - 92ms/epoch - 1ms/step
Epoch 10/30
68/68 - 0s - loss: 0.5989 - mse: 4.5011e-04 - 94ms/epoch - 1ms/step
Epoch 11/30
68/68 - 0s - loss: 0.5988 - mse: 4.2993e-04 - 91ms/epoch - 1ms/step
Epoch 12/30
68/68 - 0s - loss: 0.5987 - mse: 4.0749e-04 - 96ms/epoch - 1ms/step
Epoch 13/30
68/68 - 0s - l

Learning Rate = 0.001

In [None]:
# Hidden Layer Size = 100 / Learning Rate = 0.001
NN_Model_1L_100_001 = One_Hidden_Layer_Model(hidden_layer_size_1[2], learning_rates_1[2])

Model // Hidden Layer Size  100
Epoch 1/30
68/68 - 1s - loss: 0.6556 - mse: 0.0254 - 509ms/epoch - 7ms/step
Epoch 2/30
68/68 - 0s - loss: 0.6147 - mse: 0.0067 - 96ms/epoch - 1ms/step
Epoch 3/30
68/68 - 0s - loss: 0.6014 - mse: 0.0013 - 91ms/epoch - 1ms/step
Epoch 4/30
68/68 - 0s - loss: 0.5997 - mse: 7.3837e-04 - 92ms/epoch - 1ms/step
Epoch 5/30
68/68 - 0s - loss: 0.5995 - mse: 6.7718e-04 - 94ms/epoch - 1ms/step
Epoch 6/30
68/68 - 0s - loss: 0.5994 - mse: 6.3749e-04 - 93ms/epoch - 1ms/step
Epoch 7/30
68/68 - 0s - loss: 0.5993 - mse: 5.9248e-04 - 95ms/epoch - 1ms/step
Epoch 8/30
68/68 - 0s - loss: 0.5992 - mse: 5.6407e-04 - 86ms/epoch - 1ms/step
Epoch 9/30
68/68 - 0s - loss: 0.5991 - mse: 5.4064e-04 - 100ms/epoch - 1ms/step
Epoch 10/30
68/68 - 0s - loss: 0.5990 - mse: 5.1514e-04 - 95ms/epoch - 1ms/step
Epoch 11/30
68/68 - 0s - loss: 0.5990 - mse: 4.9187e-04 - 94ms/epoch - 1ms/step
Epoch 12/30
68/68 - 0s - loss: 0.5989 - mse: 4.7012e-04 - 90ms/epoch - 1ms/step
Epoch 13/30
68/68 - 0s - lo

### Two Layer Neural Network

#### Two-Layered Model Generation Function

In [None]:
# train a two-hidden layered neural network
# Define Two-Hidden Layer Model
def Two_Hidden_Layer_Model(hidden_layer_first, hidden_layer_second, _learning_rate):
  print("Model // Hidden Layer Sizes ", hidden_layer_first, " and ", hidden_layer_second)
  model = Sequential()
  model.add(Input((X_train.shape[1],)))
  model.add(Dense(hidden_layer_first, activation='relu', name='hidden_1'))
  model.add(Dense(hidden_layer_second, activation='relu', name='hidden_2'))
  # sigmoid keeps the prediction in [0, 1], the range of the MinMax-scaled price
  model.add(Dense(1, activation='sigmoid', name='output_layer'))

  # compile the model with the Adam optimizer
  # note: binary cross-entropy is the training loss here and MSE is only tracked as a metric;
  # mean squared error is the more common loss for regression
  model.compile(loss='binary_crossentropy', optimizer=Adam(learning_rate=_learning_rate), metrics=['mse'])

  # fit the model on training data
  model.fit(x=X_train, y=y_train, batch_size=64, epochs=30, verbose=2)

  # evaluate on the held-out validation data
  print("\nEvaluation:")
  model.evaluate(x=X_val, y=y_val)
  
  print("\n-----------------------------------------------------------------------")
  model.summary()  # prints the summary itself and returns None
  print("-----------------------------------------------------------------------\n")
  return model

In [None]:
# Parameters
hidden_layer_size_2  = [[200, 100], [100, 50], [50, 25]]
learning_rates_2 = [0.1, 0.01, 0.001]

#### Model with Hidden Layer 200 / 100

Learning Rate = 0.1

In [None]:
# Hidden Layer Sizes // 200 / 100 //// Learning Rate = 0.1
NN_Model_2L_200_100_1 = Two_Hidden_Layer_Model(hidden_layer_size_2[0][0], hidden_layer_size_2[0][1], learning_rates_2[0])

Model // Hidden Layer Sizes  200  and  100
Epoch 1/30
68/68 - 1s - loss: 0.7414 - mse: 0.0259 - 588ms/epoch - 9ms/step
Epoch 2/30
68/68 - 0s - loss: 0.5991 - mse: 6.1670e-04 - 116ms/epoch - 2ms/step
Epoch 3/30
68/68 - 0s - loss: 0.5985 - mse: 4.6978e-04 - 119ms/epoch - 2ms/step
Epoch 4/30
68/68 - 0s - loss: 0.5982 - mse: 3.6340e-04 - 117ms/epoch - 2ms/step
Epoch 5/30
68/68 - 0s - loss: 0.5980 - mse: 3.1411e-04 - 120ms/epoch - 2ms/step
Epoch 6/30
68/68 - 0s - loss: 0.5979 - mse: 2.9505e-04 - 117ms/epoch - 2ms/step
Epoch 7/30
68/68 - 0s - loss: 0.5976 - mse: 1.7581e-04 - 119ms/epoch - 2ms/step
Epoch 8/30
68/68 - 0s - loss: 0.5977 - mse: 2.2261e-04 - 115ms/epoch - 2ms/step
Epoch 9/30
68/68 - 0s - loss: 0.5978 - mse: 2.5770e-04 - 121ms/epoch - 2ms/step
Epoch 10/30
68/68 - 0s - loss: 0.5976 - mse: 1.9541e-04 - 115ms/epoch - 2ms/step
Epoch 11/30
68/68 - 0s - loss: 0.5974 - mse: 1.2073e-04 - 122ms/epoch - 2ms/step
Epoch 12/30
68/68 - 0s - loss: 0.5977 - mse: 2.2629e-04 - 118ms/epoch - 2ms/ste

Learning Rate = 0.01

In [None]:
# Hidden Layer Sizes // 200 / 100 //// Learning Rate = 0.01
NN_Model_2L_200_100_01 = Two_Hidden_Layer_Model(hidden_layer_size_2[0][0], hidden_layer_size_2[0][1], learning_rates_2[1])

Model // Hidden Layer Sizes  200  and  100
Epoch 1/30
68/68 - 1s - loss: 0.6057 - mse: 0.0035 - 595ms/epoch - 9ms/step
Epoch 2/30
68/68 - 0s - loss: 0.5977 - mse: 1.7809e-04 - 127ms/epoch - 2ms/step
Epoch 3/30
68/68 - 0s - loss: 0.5975 - mse: 1.4297e-04 - 118ms/epoch - 2ms/step
Epoch 4/30
68/68 - 0s - loss: 0.5975 - mse: 1.2509e-04 - 121ms/epoch - 2ms/step
Epoch 5/30
68/68 - 0s - loss: 0.5975 - mse: 1.4759e-04 - 115ms/epoch - 2ms/step
Epoch 6/30
68/68 - 0s - loss: 0.5975 - mse: 1.5518e-04 - 116ms/epoch - 2ms/step
Epoch 7/30
68/68 - 0s - loss: 0.5975 - mse: 1.3360e-04 - 133ms/epoch - 2ms/step
Epoch 8/30
68/68 - 0s - loss: 0.5975 - mse: 1.4278e-04 - 114ms/epoch - 2ms/step
Epoch 9/30
68/68 - 0s - loss: 0.5976 - mse: 1.9725e-04 - 119ms/epoch - 2ms/step
Epoch 10/30
68/68 - 0s - loss: 0.5974 - mse: 1.2509e-04 - 127ms/epoch - 2ms/step
Epoch 11/30
68/68 - 0s - loss: 0.5975 - mse: 1.4678e-04 - 119ms/epoch - 2ms/step
Epoch 12/30
68/68 - 0s - loss: 0.5974 - mse: 1.0329e-04 - 115ms/epoch - 2ms/ste

Learning Rate = 0.001

In [None]:
# Hidden Layer Sizes // 200 / 100 //// Learning Rate = 0.001
NN_Model_2L_200_100_001 = Two_Hidden_Layer_Model(hidden_layer_size_2[0][0], hidden_layer_size_2[0][1], learning_rates_2[2])

Model // Hidden Layer Sizes  200  and  100
Epoch 1/30
68/68 - 1s - loss: 0.6217 - mse: 0.0104 - 574ms/epoch - 8ms/step
Epoch 2/30
68/68 - 0s - loss: 0.5993 - mse: 6.1497e-04 - 140ms/epoch - 2ms/step
Epoch 3/30
68/68 - 0s - loss: 0.5987 - mse: 4.0883e-04 - 138ms/epoch - 2ms/step
Epoch 4/30
68/68 - 0s - loss: 0.5983 - mse: 3.0280e-04 - 123ms/epoch - 2ms/step
Epoch 5/30
68/68 - 0s - loss: 0.5981 - mse: 2.2886e-04 - 122ms/epoch - 2ms/step
Epoch 6/30
68/68 - 0s - loss: 0.5979 - mse: 2.0031e-04 - 118ms/epoch - 2ms/step
Epoch 7/30
68/68 - 0s - loss: 0.5978 - mse: 1.5973e-04 - 125ms/epoch - 2ms/step
Epoch 8/30
68/68 - 0s - loss: 0.5977 - mse: 1.5550e-04 - 126ms/epoch - 2ms/step
Epoch 9/30
68/68 - 0s - loss: 0.5976 - mse: 1.3976e-04 - 118ms/epoch - 2ms/step
Epoch 10/30
68/68 - 0s - loss: 0.5976 - mse: 1.2316e-04 - 122ms/epoch - 2ms/step
Epoch 11/30
68/68 - 0s - loss: 0.5975 - mse: 1.0604e-04 - 133ms/epoch - 2ms/step
Epoch 12/30
68/68 - 0s - loss: 0.5975 - mse: 1.0334e-04 - 124ms/epoch - 2ms/ste

#### Model with Hidden Layer 100 / 50

Learning Rate = 0.1

In [None]:
# Hidden Layer Sizes // 100 / 50 //// Learning Rate = 0.1
NN_Model_2L_100_50_1 = Two_Hidden_Layer_Model(hidden_layer_size_2[1][0], hidden_layer_size_2[1][1], learning_rates_2[0])

Model // Hidden Layer Sizes  100  and  50
Epoch 1/30
68/68 - 1s - loss: 0.6748 - mse: 0.0240 - 568ms/epoch - 8ms/step
Epoch 2/30
68/68 - 0s - loss: 0.6009 - mse: 0.0011 - 114ms/epoch - 2ms/step
Epoch 3/30
68/68 - 0s - loss: 0.6006 - mse: 0.0011 - 102ms/epoch - 2ms/step
Epoch 4/30
68/68 - 0s - loss: 0.5999 - mse: 8.5776e-04 - 98ms/epoch - 1ms/step
Epoch 5/30
68/68 - 0s - loss: 0.5997 - mse: 8.1018e-04 - 102ms/epoch - 2ms/step
Epoch 6/30
68/68 - 0s - loss: 0.5996 - mse: 8.0258e-04 - 108ms/epoch - 2ms/step
Epoch 7/30
68/68 - 0s - loss: 0.6001 - mse: 9.9774e-04 - 96ms/epoch - 1ms/step
Epoch 8/30
68/68 - 0s - loss: 0.5999 - mse: 9.1997e-04 - 108ms/epoch - 2ms/step
Epoch 9/30
68/68 - 0s - loss: 0.6003 - mse: 0.0011 - 99ms/epoch - 1ms/step
Epoch 10/30
68/68 - 0s - loss: 0.5998 - mse: 8.6839e-04 - 105ms/epoch - 2ms/step
Epoch 11/30
68/68 - 0s - loss: 0.5996 - mse: 8.0745e-04 - 97ms/epoch - 1ms/step
Epoch 12/30
68/68 - 0s - loss: 0.5997 - mse: 8.4345e-04 - 108ms/epoch - 2ms/step
Epoch 13/30
68/

Learning Rate = 0.01

In [None]:
# Hidden Layer Sizes // 100 / 50 //// Learning Rate = 0.01
NN_Model_2L_100_50_01 = Two_Hidden_Layer_Model(hidden_layer_size_2[1][0], hidden_layer_size_2[1][1], learning_rates_2[1])

Model // Hidden Layer Sizes  100  and  50
Epoch 1/30
68/68 - 1s - loss: 0.6063 - mse: 0.0037 - 548ms/epoch - 8ms/step
Epoch 2/30
68/68 - 0s - loss: 0.5982 - mse: 3.3441e-04 - 101ms/epoch - 1ms/step
Epoch 3/30
68/68 - 0s - loss: 0.5977 - mse: 1.7563e-04 - 108ms/epoch - 2ms/step
Epoch 4/30
68/68 - 0s - loss: 0.5975 - mse: 1.4753e-04 - 105ms/epoch - 2ms/step
Epoch 5/30
68/68 - 0s - loss: 0.5975 - mse: 1.3158e-04 - 95ms/epoch - 1ms/step
Epoch 6/30
68/68 - 0s - loss: 0.5975 - mse: 1.6499e-04 - 110ms/epoch - 2ms/step
Epoch 7/30
68/68 - 0s - loss: 0.5976 - mse: 2.1571e-04 - 107ms/epoch - 2ms/step
Epoch 8/30
68/68 - 0s - loss: 0.5973 - mse: 9.6831e-05 - 111ms/epoch - 2ms/step
Epoch 9/30
68/68 - 0s - loss: 0.5974 - mse: 1.1263e-04 - 97ms/epoch - 1ms/step
Epoch 10/30
68/68 - 0s - loss: 0.5974 - mse: 1.0579e-04 - 100ms/epoch - 1ms/step
Epoch 11/30
68/68 - 0s - loss: 0.5973 - mse: 8.4668e-05 - 94ms/epoch - 1ms/step
Epoch 12/30
68/68 - 0s - loss: 0.5975 - mse: 1.4745e-04 - 100ms/epoch - 1ms/step
Ep

Learning Rate = 0.001

In [None]:
# Hidden Layer Sizes // 100 / 50 //// Learning Rate = 0.001
NN_Model_2L_100_50_001 = Two_Hidden_Layer_Model(hidden_layer_size_2[1][0], hidden_layer_size_2[1][1], learning_rates_2[2])

Model // Hidden Layer Sizes  100  and  50
Epoch 1/30
68/68 - 1s - loss: 0.6423 - mse: 0.0195 - 553ms/epoch - 8ms/step
Epoch 2/30
68/68 - 0s - loss: 0.6001 - mse: 8.7548e-04 - 108ms/epoch - 2ms/step
Epoch 3/30
68/68 - 0s - loss: 0.5992 - mse: 5.7741e-04 - 101ms/epoch - 1ms/step
Epoch 4/30
68/68 - 0s - loss: 0.5988 - mse: 4.4301e-04 - 104ms/epoch - 2ms/step
Epoch 5/30
68/68 - 0s - loss: 0.5985 - mse: 3.7106e-04 - 103ms/epoch - 2ms/step
Epoch 6/30
68/68 - 0s - loss: 0.5984 - mse: 3.1008e-04 - 105ms/epoch - 2ms/step
Epoch 7/30
68/68 - 0s - loss: 0.5982 - mse: 2.5872e-04 - 94ms/epoch - 1ms/step
Epoch 8/30
68/68 - 0s - loss: 0.5981 - mse: 2.3814e-04 - 97ms/epoch - 1ms/step
Epoch 9/30
68/68 - 0s - loss: 0.5980 - mse: 2.0621e-04 - 104ms/epoch - 2ms/step
Epoch 10/30
68/68 - 0s - loss: 0.5978 - mse: 1.7588e-04 - 107ms/epoch - 2ms/step
Epoch 11/30
68/68 - 0s - loss: 0.5978 - mse: 1.6209e-04 - 98ms/epoch - 1ms/step
Epoch 12/30
68/68 - 0s - loss: 0.5977 - mse: 1.4511e-04 - 102ms/epoch - 2ms/step
Ep

####Model with Hidden Layer 50 / 25

Learning Rate = 0.1

In [None]:
# Hidden Layer Sizes // 50 / 25 //// Learning Rate = 0.1
NN_Model_2L_50_25_1 = Two_Hidden_Layer_Model(hidden_layer_size_2[2][0], hidden_layer_size_2[2][1], learning_rates_2[0])

Model // Hidden Layer Sizes  50  and  25
Epoch 1/30
68/68 - 1s - loss: 0.6464 - mse: 0.0172 - 542ms/epoch - 8ms/step
Epoch 2/30
68/68 - 0s - loss: 0.6002 - mse: 9.9747e-04 - 107ms/epoch - 2ms/step
Epoch 3/30
68/68 - 0s - loss: 0.5999 - mse: 8.5953e-04 - 110ms/epoch - 2ms/step
Epoch 4/30
68/68 - 0s - loss: 0.6005 - mse: 0.0011 - 93ms/epoch - 1ms/step
Epoch 5/30
68/68 - 0s - loss: 0.6000 - mse: 9.3048e-04 - 103ms/epoch - 2ms/step
Epoch 6/30
68/68 - 0s - loss: 0.6011 - mse: 0.0013 - 94ms/epoch - 1ms/step
Epoch 7/30
68/68 - 0s - loss: 0.5996 - mse: 7.7933e-04 - 93ms/epoch - 1ms/step
Epoch 8/30
68/68 - 0s - loss: 0.5997 - mse: 8.2046e-04 - 90ms/epoch - 1ms/step
Epoch 9/30
68/68 - 0s - loss: 0.5999 - mse: 9.0125e-04 - 93ms/epoch - 1ms/step
Epoch 10/30
68/68 - 0s - loss: 0.5999 - mse: 8.8408e-04 - 94ms/epoch - 1ms/step
Epoch 11/30
68/68 - 0s - loss: 0.5997 - mse: 8.2547e-04 - 97ms/epoch - 1ms/step
Epoch 12/30
68/68 - 0s - loss: 0.6000 - mse: 9.3690e-04 - 111ms/epoch - 2ms/step
Epoch 13/30
68/

Learning Rate = 0.01

In [None]:
# Hidden Layer Sizes // 50 / 25 //// Learning Rate = 0.01
NN_Model_2L_50_25_01 = Two_Hidden_Layer_Model(hidden_layer_size_2[2][0], hidden_layer_size_2[2][1], learning_rates_2[1])

Model // Hidden Layer Sizes  50  and  25
Epoch 1/30
68/68 - 1s - loss: 0.6099 - mse: 0.0053 - 766ms/epoch - 11ms/step
Epoch 2/30
68/68 - 0s - loss: 0.5990 - mse: 5.4191e-04 - 96ms/epoch - 1ms/step
Epoch 3/30
68/68 - 0s - loss: 0.5986 - mse: 4.2177e-04 - 101ms/epoch - 1ms/step
Epoch 4/30
68/68 - 0s - loss: 0.5981 - mse: 2.7590e-04 - 97ms/epoch - 1ms/step
Epoch 5/30
68/68 - 0s - loss: 0.5979 - mse: 2.3310e-04 - 108ms/epoch - 2ms/step
Epoch 6/30
68/68 - 0s - loss: 0.5978 - mse: 2.0474e-04 - 92ms/epoch - 1ms/step
Epoch 7/30
68/68 - 0s - loss: 0.5977 - mse: 1.5468e-04 - 95ms/epoch - 1ms/step
Epoch 8/30
68/68 - 0s - loss: 0.5976 - mse: 1.5154e-04 - 91ms/epoch - 1ms/step
Epoch 9/30
68/68 - 0s - loss: 0.5976 - mse: 1.3823e-04 - 103ms/epoch - 2ms/step
Epoch 10/30
68/68 - 0s - loss: 0.5976 - mse: 1.5838e-04 - 93ms/epoch - 1ms/step
Epoch 11/30
68/68 - 0s - loss: 0.5976 - mse: 1.4480e-04 - 96ms/epoch - 1ms/step
Epoch 12/30
68/68 - 0s - loss: 0.5976 - mse: 1.6841e-04 - 99ms/epoch - 1ms/step
Epoch 1

Learning Rate = 0.001

In [None]:
# Hidden Layer Sizes // 50 / 25 //// Learning Rate = 0.001
NN_Model_2L_50_25_001 = Two_Hidden_Layer_Model(hidden_layer_size_2[2][0], hidden_layer_size_2[2][1], learning_rates_2[2])

Model // Hidden Layer Sizes  50  and  25
Epoch 1/30
68/68 - 1s - loss: 0.6597 - mse: 0.0274 - 543ms/epoch - 8ms/step
Epoch 2/30
68/68 - 0s - loss: 0.6045 - mse: 0.0025 - 102ms/epoch - 1ms/step
Epoch 3/30
68/68 - 0s - loss: 0.5996 - mse: 7.1458e-04 - 94ms/epoch - 1ms/step
Epoch 4/30
68/68 - 0s - loss: 0.5993 - mse: 5.8166e-04 - 94ms/epoch - 1ms/step
Epoch 5/30
68/68 - 0s - loss: 0.5990 - mse: 5.1743e-04 - 100ms/epoch - 1ms/step
Epoch 6/30
68/68 - 0s - loss: 0.5989 - mse: 4.6538e-04 - 106ms/epoch - 2ms/step
Epoch 7/30
68/68 - 0s - loss: 0.5988 - mse: 4.3993e-04 - 91ms/epoch - 1ms/step
Epoch 8/30
68/68 - 0s - loss: 0.5986 - mse: 3.8114e-04 - 91ms/epoch - 1ms/step
Epoch 9/30
68/68 - 0s - loss: 0.5985 - mse: 3.3836e-04 - 92ms/epoch - 1ms/step
Epoch 10/30
68/68 - 0s - loss: 0.5984 - mse: 3.0853e-04 - 98ms/epoch - 1ms/step
Epoch 11/30
68/68 - 0s - loss: 0.5983 - mse: 2.8701e-04 - 95ms/epoch - 1ms/step
Epoch 12/30
68/68 - 0s - loss: 0.5982 - mse: 2.7392e-04 - 94ms/epoch - 1ms/step
Epoch 13/30


###Decision Tree

In [None]:
# Train decision tree regressor
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error

# Candidate tree depths to compare with 5-fold cross-validation
param_grid = {'max_depth': [3, 5, 7, 9, 11]}
reg = DecisionTreeRegressor()

# GridSearchCV clones and fits the estimator itself, so no separate reg.fit() call is needed
grid = GridSearchCV(reg,
                    param_grid,
                    cv=5,
                    verbose=1,
                    refit=True)

grid_search = grid.fit(X_train, y_train)

# With no `scoring` argument, best_score_ is the mean cross-validated R^2, not an MSE
print(grid_search.best_score_)
print(grid_search.best_estimator_)

Fitting 5 folds for each of 5 candidates, totalling 25 fits
0.9972202311538011
DecisionTreeRegressor(max_depth=11)


In [None]:
# Refit the Decision Tree Regressor on the training set with the best max_depth from the grid search
reg_best = DecisionTreeRegressor(max_depth=11, random_state=42)
reg_best.fit(X_train, y_train)

DecisionTreeRegressor(max_depth=11, random_state=42)
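
Since the assignment reports mean squared errors, the grid search can also be scored with MSE directly, and the refitted estimator can be taken from the search object instead of retyping `max_depth`. The following is only a minimal sketch under assumptions: `X_toy` and `y_toy` are synthetic stand-ins generated with `make_regression`, not the house price data, and the names `search`, `best_depth`, `cv_mse` and `tree_best` are hypothetical.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (assumption): 7 features, like the encoded house data
X_toy, y_toy = make_regression(n_samples=200, n_features=7, noise=0.1, random_state=0)

search = GridSearchCV(
    DecisionTreeRegressor(random_state=42),
    {"max_depth": [3, 5, 7, 9, 11]},
    cv=5,
    scoring="neg_mean_squared_error",  # scikit-learn maximizes scores, hence the negated MSE
    refit=True,
)
search.fit(X_toy, y_toy)

best_depth = search.best_params_["max_depth"]
cv_mse = -search.best_score_        # flip the sign back to a plain MSE
tree_best = search.best_estimator_  # already refit on all of X_toy with best_depth

print("Best max_depth:", best_depth)
print("Cross-validated MSE:", cv_mse)
```

With `refit=True`, `best_estimator_` is already trained on the full data passed to `fit`, so a second manual fit is optional.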

## 6) Test your trained regressors on the Validation set (10 pts)
Test your trained regressors on the validation set and print the mean squared errors.


###Neural Network Models on Validation Set

In [None]:
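# Evaluate trained models on the validation set
from sklearn.metrics import mean_squared_error

def validation_mse(model):
    # Predict on the held-out validation features and compare with the true targets
    predictions = model.predict(X_val)
    mse = mean_squared_error(y_val, predictions)
    return mse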

In [None]:
print("Results of the Models on Validation Set")
print("-------------------------------------------------------------------------")
print("\nOne Layer Neural Network Models")
print("Hidden Layer = 25")
print("MSE of 1 Hidden Layer (25) // Learning Rate (0.1) NN Model: ", validation_mse(NN_Model_1L_25_1))
print("MSE of 1 Hidden Layer (25) // Learning Rate (0.01) NN Model: ", validation_mse(NN_Model_1L_25_01))
print("MSE of 1 Hidden Layer (25) // Learning Rate (0.001) NN Model: ", validation_mse(NN_Model_1L_25_001))

print("\nHidden Layer = 50")
print("MSE of 1 Hidden Layer (50) // Learning Rate (0.1) NN Model: ", validation_mse(NN_Model_1L_50_1))
print("MSE of 1 Hidden Layer (50) // Learning Rate (0.01) NN Model: ", validation_mse(NN_Model_1L_50_01))
print("MSE of 1 Hidden Layer (50) // Learning Rate (0.001) NN Model: ", validation_mse(NN_Model_1L_50_001))

print("\nHidden Layer = 100")
print("MSE of 1 Hidden Layer (100) // Learning Rate (0.1) NN Model: ", validation_mse(NN_Model_1L_100_1))
print("MSE of 1 Hidden Layer (100) // Learning Rate (0.01) NN Model: ", validation_mse(NN_Model_1L_100_01))
print("MSE of 1 Hidden Layer (100) // Learning Rate (0.001) NN Model: ", validation_mse(NN_Model_1L_100_001))

print("\n-------------------------------------------------------------------------")

print("\nTwo Layer Neural Network Models")
print("Hidden Layer = 200/100")
print("MSE of 2 Hidden Layer (200/100) // Learning Rate (0.1) NN Model: ", validation_mse(NN_Model_2L_200_100_1))
print("MSE of 2 Hidden Layer (200/100) // Learning Rate (0.01) NN Model: ", validation_mse(NN_Model_2L_200_100_01))
print("MSE of 2 Hidden Layer (200/100) // Learning Rate (0.001) NN Model: ", validation_mse(NN_Model_2L_200_100_001))

print("\nHidden Layer = 100/50")
print("MSE of 2 Hidden Layer (100/50) // Learning Rate (0.1) NN Model: ", validation_mse(NN_Model_2L_100_50_1))
print("MSE of 2 Hidden Layer (100/50) // Learning Rate (0.01) NN Model: ", validation_mse(NN_Model_2L_100_50_01))
print("MSE of 2 Hidden Layer (100/50) // Learning Rate (0.001) NN Model: ", validation_mse(NN_Model_2L_100_50_001))

print("\nHidden Layer = 50/25")
print("MSE of 2 Hidden Layer (50/25) // Learning Rate (0.1) NN Model: ", validation_mse(NN_Model_2L_50_25_1))
print("MSE of 2 Hidden Layer (50/25) // Learning Rate (0.01) NN Model: ", validation_mse(NN_Model_2L_50_25_01))
print("MSE of 2 Hidden Layer (50/25) // Learning Rate (0.001) NN Model: ", validation_mse(NN_Model_2L_50_25_001))

Results of the Models on Validation Set
-------------------------------------------------------------------------

One Layer Neural Network Models
Hidden Layer = 25
MSE of 1 Hidden Layer (25) // Learning Rate (0.1) NN Model:  0.0004907938130150637
MSE of 1 Hidden Layer (25) // Learning Rate (0.01) NN Model:  0.0001551278205485198
MSE of 1 Hidden Layer (25) // Learning Rate (0.001) NN Model:  0.00038025497940817534

Hidden Layer = 50
MSE of 1 Hidden Layer (50) // Learning Rate (0.1) NN Model:  0.00030226629212054914
MSE of 1 Hidden Layer (50) // Learning Rate (0.01) NN Model:  9.589929885016583e-05
MSE of 1 Hidden Layer (50) // Learning Rate (0.001) NN Model:  0.0002011220552063705

Hidden Layer = 100
MSE of 1 Hidden Layer (100) // Learning Rate (0.1) NN Model:  0.0001482838743029519
MSE of 1 Hidden Layer (100) // Learning Rate (0.01) NN Model:  0.00017097537826158443
MSE of 1 Hidden Layer (100) // Learning Rate (0.001) NN Model:  0.00019941555051612586

--------------------------------
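
The report asks for the validation errors in a table, and the long run of `print` calls above can be replaced by a loop over a dictionary of models. This is a rough sketch under assumptions: `ConstantModel` is a hypothetical stand-in for a trained Keras model (anything with a `predict` method works), and `mse_table`, `toy_models`, `X_toy` and `y_toy` are illustrative names and data, not part of the notebook.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import mean_squared_error


class ConstantModel:
    """Hypothetical stand-in for a trained model: always predicts one value."""

    def __init__(self, value):
        self.value = value

    def predict(self, X):
        return np.full((len(X), 1), self.value)


def mse_table(models, X, y):
    """Build a hidden-size by learning-rate table of MSE values."""
    rows = []
    for (hidden, lr), model in models.items():
        mse = mean_squared_error(y, model.predict(X))
        rows.append({"hidden": hidden, "learning_rate": lr, "mse": mse})
    return pd.DataFrame(rows).pivot(index="hidden", columns="learning_rate", values="mse")


# Toy data (assumption): 4 samples, 7 features, targets already scaled to [0, 1]
X_toy = np.zeros((4, 7))
y_toy = np.array([[0.0], [0.0], [1.0], [1.0]])

toy_models = {
    ("25", 0.1): ConstantModel(0.5),
    ("25", 0.01): ConstantModel(0.0),
    ("50", 0.1): ConstantModel(1.0),
    ("50", 0.01): ConstantModel(0.5),
}

val_table = mse_table(toy_models, X_toy, y_toy)
print(val_table)
```

In the notebook, the dictionary would map keys such as `("25", 0.1)` to the trained models (for example `NN_Model_1L_25_1`), with `X_val` and `y_val` passed in place of the toy arrays.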

###Decision Tree Regressor on Validation Set

In [None]:
# test decision tree regressor on validation
y_pred_dec_reg_val = reg_best.predict(X_val)
mse_dec_reg_val = mean_squared_error(y_val, y_pred_dec_reg_val)
print("Mse of Decision Tree Regressor: ", mse_dec_reg_val)

Mse of Decision Tree Regressor:  0.00010938510541337879


## 7) Test your regressor on Test set (10 pts)

- Load test data
- Apply same pre-processing as training data (encoding categorical variables, scaling)
- Predict the labels of testing data **using the best model that you have selected according to your validation results** and report the mean squared error. 
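
The selection step in the last bullet can be sketched as picking the entry with the lowest validation MSE and evaluating only that model on the test set. The values below are copied from the validation output printed earlier; the two-hidden-layer results are cut off in that output, so they are not included and the selection shown here is limited to the visible models. The dictionary `val_mse` and its key labels are hypothetical names.

```python
# Validation MSEs as printed in the validation results above
# (one-hidden-layer models and the decision tree only; two-layer results are not visible)
val_mse = {
    "1 hidden layer (25), lr=0.1": 0.0004907938130150637,
    "1 hidden layer (25), lr=0.01": 0.0001551278205485198,
    "1 hidden layer (25), lr=0.001": 0.00038025497940817534,
    "1 hidden layer (50), lr=0.1": 0.00030226629212054914,
    "1 hidden layer (50), lr=0.01": 9.589929885016583e-05,
    "1 hidden layer (50), lr=0.001": 0.0002011220552063705,
    "1 hidden layer (100), lr=0.1": 0.0001482838743029519,
    "1 hidden layer (100), lr=0.01": 0.00017097537826158443,
    "1 hidden layer (100), lr=0.001": 0.00019941555051612586,
    "decision tree (max_depth=11)": 0.00010938510541337879,
}

# Model selection uses the validation set only; the test set is touched once, afterwards
best_name = min(val_mse, key=val_mse.get)
print("Lowest validation MSE:", best_name, val_mse[best_name])
```

Among these visible entries the minimum is the 50-neuron model with learning rate 0.01; whether a two-layer model does better cannot be read from the truncated output.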

###Preprocess the Test Set

In [None]:
# Preprocessing the Test Set
test_df = pd.read_csv(PATH+"test.csv")

In [None]:
test_df.head(5)

Unnamed: 0,sqmtrs,nrooms,view,crime_rate,price
0,349,3,south,high,836553.5
1,169,1,west,high,512741.6
2,233,3,south,high,663880.6
3,340,4,north,low,1000086.0
4,199,2,east,low,745015.1


In [None]:
print("Data dimensionality of test set is: ", test_df.shape, "\n")
# also give some statistics about the data like mean, standard deviation etc.
print("Mean of the test set is:\n", test_df.mean(numeric_only=True), "\n")
print("Standard Deviation of the test set is:\n", test_df.std(numeric_only=True))

Data dimensionality of test set is:  (1200, 5) 

Mean of the test set is:
 sqmtrs       221.747500
nrooms         2.953333
price     718718.940900
dtype: float64 

Standard Deviation of the test set is:
 sqmtrs        72.212760
nrooms         1.421093
price     148792.625998
dtype: float64


In [None]:
# Encode the categorical variables --> view and crime_rate
# Encoding the view
test_df['west'] = test_df['view'].map(lambda x: 1 if x == 'west' else 0)
test_df['east'] = test_df['view'].map(lambda x: 1 if x == 'east' else 0)
test_df['north'] = test_df['view'].map(lambda x: 1 if x == 'north' else 0)
test_df['south'] = test_df['view'].map(lambda x: 1 if x == 'south' else 0)
test_df.drop("view", inplace=True, axis=1)
# ------------------------------------------------------------------------------

# Encoding the crime_rate
# low --> 0
# high --> 1

target_mapping_crime_rate = {'low': 0,
                             'high': 1}

test_df['crime_rate'] = test_df['crime_rate'].map(lambda x: target_mapping_crime_rate[x])

In [None]:
test_df.head(5)

Unnamed: 0,sqmtrs,nrooms,crime_rate,price,west,east,north,south
0,349,3,1,836553.5,0,0,0,1
1,169,1,1,512741.6,1,0,0,0
2,233,3,1,663880.6,0,0,0,1
3,340,4,0,1000086.0,0,0,1,0
4,199,2,0,745015.1,0,1,0,0
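
Because the test set must receive the same encoding as the training set, the mapping above can be wrapped in one function and called on both frames. This is a minimal sketch, assuming the four view values and two crime-rate levels seen in this notebook are the complete category sets; `encode_features`, `VIEWS`, `CRIME_MAP` and the two-row `sample` frame are hypothetical.

```python
import pandas as pd

VIEWS = ["west", "east", "north", "south"]  # assumed to be all possible view values
CRIME_MAP = {"low": 0, "high": 1}           # low --> 0, high --> 1


def encode_features(df):
    """Return a copy with `view` one-hot encoded and `crime_rate` mapped to 0/1."""
    out = df.copy()
    for direction in VIEWS:
        # One indicator column per direction, in a fixed order
        out[direction] = (out["view"] == direction).astype(int)
    out = out.drop(columns="view")
    out["crime_rate"] = out["crime_rate"].map(CRIME_MAP)
    return out


# Two rows shaped like the first rows of the test set shown above
sample = pd.DataFrame({
    "sqmtrs": [349, 169],
    "nrooms": [3, 1],
    "view": ["south", "west"],
    "crime_rate": ["high", "high"],
    "price": [836553.5, 512741.6],
})
encoded = encode_features(sample)
print(encoded)
```

Looping over a fixed list of categories keeps the column order identical across splits, even if one split happens to lack a category.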


In [None]:
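# scale the features between 0-1
# Note: this fits a new scaler on the test set, so the test data is scaled by its own
# min/max rather than by the ranges learned from the training data.
msc = MinMaxScaler()

test_df[["sqmtrs", "nrooms", "crime_rate", "price"]] = msc.fit_transform(test_df[["sqmtrs", "nrooms", "crime_rate", "price"]])

test_df.head(5)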

Unnamed: 0,sqmtrs,nrooms,crime_rate,price,west,east,north,south
0,1.0,0.5,1.0,0.690441,0,0,0,1
1,0.277108,0.0,1.0,0.229008,1,0,0,0
2,0.534137,0.5,1.0,0.444381,0,0,0,1
3,0.963855,0.75,0.0,0.923476,0,0,1,0
4,0.39759,0.25,0.0,0.559998,0,1,0,0
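
"Apply same pre-processing as training data" is usually read as reusing the scaler that was fitted on the training set and calling only `transform` on the test set. The sketch below illustrates that alternative on made-up numbers; it is not what the cell above does. `train_toy`, `test_toy` and `SCALED_COLS` are hypothetical, and the values are not taken from the dataset.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

SCALED_COLS = ["sqmtrs", "nrooms", "crime_rate", "price"]

# Made-up stand-ins for the training and test frames (assumption)
train_toy = pd.DataFrame({
    "sqmtrs": [100.0, 200.0, 300.0],
    "nrooms": [1.0, 2.0, 5.0],
    "crime_rate": [0.0, 1.0, 0.0],
    "price": [400000.0, 600000.0, 900000.0],
})
test_toy = pd.DataFrame({
    "sqmtrs": [150.0, 250.0],
    "nrooms": [2.0, 3.0],
    "crime_rate": [1.0, 0.0],
    "price": [500000.0, 800000.0],
})

scaler = MinMaxScaler()
train_scaled = train_toy.copy()
test_scaled = test_toy.copy()

# Learn min/max from the training data only ...
train_scaled[SCALED_COLS] = scaler.fit_transform(train_toy[SCALED_COLS])
# ... and reuse those same ranges for the test data
test_scaled[SCALED_COLS] = scaler.transform(test_toy[SCALED_COLS])

print(test_scaled)
```

With this approach a test value outside the training range maps outside [0, 1], which is expected and keeps the inputs on the scale the model was trained on.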


In [None]:
X_test = test_df.drop('price', axis=1)
y_test = test_df[['price']]

In [None]:
# print the shape of data
print("Data dimensionality of test set after preprocessing: ", test_df.shape, "\n")
print("Test X Shape: ", X_test.shape)
print("Test Y Shape: ", y_test.shape)
# also give some statistics about the data like mean, standard deviation etc.
print("\nMean of the test set after preprocessing:\n", test_df.mean(numeric_only=True), "\n")
print("Standard Deviation of the test set after preprocessing:\n", test_df.std(numeric_only=True))

Data dimensionality of test set after preprocessing:  (1200, 8) 

Test X Shape:  (1200, 7)
Test Y Shape:  (1200, 1)

Mean of the test set after preprocessing:
 sqmtrs        0.488946
nrooms        0.488333
crime_rate    0.503333
price         0.522526
west          0.271667
east          0.254167
north         0.237500
south         0.236667
dtype: float64 

Standard Deviation of the test set after preprocessing:
 sqmtrs        0.290011
nrooms        0.355273
crime_rate    0.500197
price         0.212030
west          0.445004
east          0.435573
north         0.425729
south         0.425213
dtype: float64


###Neural Network Models on Test Set

In [None]:
# Test Results
from sklearn.metrics import mean_squared_error

def test_mse(model):
    predictions = model.predict(X_test)
    mse = mean_squared_error(y_test, predictions)
    return mse

In [None]:
print("Results of the Models on Test Set")
print("-------------------------------------------------------------------------")
print("\nOne Layer Neural Network Models")
print("Hidden Layer = 25")
print("MSE of 1 Hidden Layer (25) // Learning Rate (0.1) NN Model: ", test_mse(NN_Model_1L_25_1))
print("MSE of 1 Hidden Layer (25) // Learning Rate (0.01) NN Model: ", test_mse(NN_Model_1L_25_01))
print("MSE of 1 Hidden Layer (25) // Learning Rate (0.001) NN Model: ", test_mse(NN_Model_1L_25_001))

print("\nHidden Layer = 50")
print("MSE of 1 Hidden Layer (50) // Learning Rate (0.1) NN Model: ", test_mse(NN_Model_1L_50_1))
print("MSE of 1 Hidden Layer (50) // Learning Rate (0.01) NN Model: ", test_mse(NN_Model_1L_50_01))
print("MSE of 1 Hidden Layer (50) // Learning Rate (0.001) NN Model: ", test_mse(NN_Model_1L_50_001))

print("\nHidden Layer = 100")
print("MSE of 1 Hidden Layer (100) // Learning Rate (0.1) NN Model: ", test_mse(NN_Model_1L_100_1))
print("MSE of 1 Hidden Layer (100) // Learning Rate (0.01) NN Model: ", test_mse(NN_Model_1L_100_01))
print("MSE of 1 Hidden Layer (100) // Learning Rate (0.001) NN Model: ", test_mse(NN_Model_1L_100_001))

print("\n-------------------------------------------------------------------------")

print("\nTwo Layer Neural Network Models")
print("Hidden Layer = 200/100")
print("MSE of 2 Hidden Layer (200/100) // Learning Rate (0.1) NN Model: ", test_mse(NN_Model_2L_200_100_1))
print("MSE of 2 Hidden Layer (200/100) // Learning Rate (0.01) NN Model: ", test_mse(NN_Model_2L_200_100_01))
print("MSE of 2 Hidden Layer (200/100) // Learning Rate (0.001) NN Model: ", test_mse(NN_Model_2L_200_100_001))

print("\nHidden Layer = 100/50")
print("MSE of 2 Hidden Layer (100/50) // Learning Rate (0.1) NN Model: ", test_mse(NN_Model_2L_100_50_1))
print("MSE of 2 Hidden Layer (100/50) // Learning Rate (0.01) NN Model: ", test_mse(NN_Model_2L_100_50_01))
print("MSE of 2 Hidden Layer (100/50) // Learning Rate (0.001) NN Model: ", test_mse(NN_Model_2L_100_50_001))

print("\nHidden Layer = 50/25")
print("MSE of 2 Hidden Layer (50/25) // Learning Rate (0.1) NN Model: ", test_mse(NN_Model_2L_50_25_1))
print("MSE of 2 Hidden Layer (50/25) // Learning Rate (0.01) NN Model: ", test_mse(NN_Model_2L_50_25_01))
print("MSE of 2 Hidden Layer (50/25) // Learning Rate (0.001) NN Model: ", test_mse(NN_Model_2L_50_25_001))

Results of the Models on Test Set
-------------------------------------------------------------------------

One Layer Neural Network Models
Hidden Layer = 25
MSE of 1 Hidden Layer (25) // Learning Rate (0.1) NN Model:  0.0002696485240577584
MSE of 1 Hidden Layer (25) // Learning Rate (0.01) NN Model:  0.0007265900846522647
MSE of 1 Hidden Layer (25) // Learning Rate (0.001) NN Model:  0.0008907072581241067

Hidden Layer = 50
MSE of 1 Hidden Layer (50) // Learning Rate (0.1) NN Model:  0.0012779545091943362
MSE of 1 Hidden Layer (50) // Learning Rate (0.01) NN Model:  0.00039422992549674907
MSE of 1 Hidden Layer (50) // Learning Rate (0.001) NN Model:  0.0005274032983677439

Hidden Layer = 100
MSE of 1 Hidden Layer (100) // Learning Rate (0.1) NN Model:  0.0005113874032822377
MSE of 1 Hidden Layer (100) // Learning Rate (0.01) NN Model:  0.00048466450723667724
MSE of 1 Hidden Layer (100) // Learning Rate (0.001) NN Model:  0.0007264763477385583

----------------------------------------

###Decision Tree Regressor on Test Set

In [None]:
# test decision tree regressor on the test set
y_pred_dec_reg_test = reg_best.predict(X_test)
mse_dec_reg_test = mean_squared_error(y_test, y_pred_dec_reg_test)
print("Mse of Decision Tree Regressor: ", mse_dec_reg_test)

Mse of Decision Tree Regressor:  0.0005134563178346381
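
The MSE values above are on the min-max scaled price, so they are hard to read as money. Since scaling divides every price by `price_max - price_min`, the errors can be mapped back by taking the square root and multiplying by that range. The helper below is a small sketch; `rmse_in_price_units` is a hypothetical name and the numbers in the example call are illustrative, not results from this notebook.

```python
import math


def rmse_in_price_units(mse_scaled, price_min, price_max):
    """Convert an MSE measured on min-max scaled prices to an RMSE in original units.

    scaled = (price - price_min) / (price_max - price_min), so each error shrinks
    by the price range and the MSE shrinks by the range squared.
    """
    return math.sqrt(mse_scaled) * (price_max - price_min)


# Illustrative numbers only (assumption): scaled MSE of 0.0004 and prices
# ranging from 300,000 to 1,000,000
example = rmse_in_price_units(0.0004, 300_000.0, 1_000_000.0)
print(round(example, 2))
```

The minimum and maximum must be the ones used by the scaler that produced the scaled prices, available as `data_min_` and `data_max_` on a fitted `MinMaxScaler`.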


## 8) Report Your Results (10 pts)

**Notebook should be RUN:** As training and testing may take a long time, we may just look at your notebook results without running the code again; so make sure **each cell is run**, so outputs are there.

**Report:** Write a **1-2 page summary** of your approach to this problem **as indicated below**. 

**Must include statements such as those below:**
**(Remove the text in parentheses, below, and include your own report)**

( Include the problem definition: 1-2 lines )

 (Talk about train/val/test sets, size and how split. )

 (Talk about feature extraction or preprocessing.)

**Add your observations as follows** (keep the questions for easy grading/context) in the report part of your notebook.

**Observations**

- Try a few learning rates for N=25 hidden neurons and train for the indicated number of epochs. Comment on what happens when the learning rate is large or small. What is a good number/range for the learning rate?
Your answer here….

- Use that learning rate and vary the number of hidden neurons over the given values, training for the indicated number of epochs. Give the validation mean squared errors for the different approaches and meta-parameters tried **in a table** and state which one you selected as your model. How many hidden neurons give the best model? 
Your answer here….

- State what your test results are with the chosen approach and meta-parameters: e.g. "We have obtained the best results on the validation set with the ..........approach using a value of ...... for .... parameter. The result of this model on the test data is a mean squared error of ....."

- How slow is learning? Any other problems?
Your answer here….

- Any other observations (not obligatory)

 You can add additional visualizations as separate pages if you want; think of them as an appendix, keeping the summary to 1-2 pages.

