# AAI612: Deep Learning & its Applications

*Notebook 5.4: Predicting Housing Prices*

<a href="https://colab.research.google.com/github/OmarMlaeb/AAI612_Malaeb/blob/master/Week%205/Notebook5.4.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
"""
The MIT License (MIT)
Copyright (c) 2021 NVIDIA
Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
the Software, and to permit persons to whom the Software is furnished to do so,
subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
"""



## Part I: Create the model

This example uses a neural network to solve a regression problem: predicting house prices from the Boston Housing dataset. The dataset is included in Keras and can be loaded through `keras.datasets.boston_housing`:

In [2]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
import numpy as np

# Load the data; it is standardized in the next cell.
boston_housing = keras.datasets.boston_housing
(raw_x_train, y_train), (raw_x_test, y_test) = boston_housing.load_data()

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/boston_housing.npz


We standardize both the training and test data using the mean and standard deviation of the training data. The parameter axis=0 ensures that the mean and standard deviation are computed for each input variable separately, so each result is a vector with one entry per variable rather than a single value. That is, the standardized value of the nitric oxides concentration is not affected by the values of the per capita crime rate or any of the other variables.

In [3]:
x_mean = np.mean(raw_x_train, axis=0)
x_stddev = np.std(raw_x_train, axis=0)
x_train = (raw_x_train - x_mean) / x_stddev
x_test = (raw_x_test - x_mean) / x_stddev
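
To make the effect of `axis=0` concrete, here is a minimal sketch on a made-up array with two features on very different scales. The names `toy_train` and `toy_test` and their values are illustrative assumptions, not part of the dataset.

```python
import numpy as np

# Hypothetical toy data: 3 samples, 2 features on very different scales.
toy_train = np.array([[1.0, 100.0],
                      [2.0, 200.0],
                      [3.0, 300.0]])
toy_test = np.array([[2.0, 400.0]])

# axis=0 reduces over the rows (samples), giving one statistic per column (feature).
toy_mean = np.mean(toy_train, axis=0)
toy_stddev = np.std(toy_train, axis=0)

# Both splits are scaled with the training statistics only.
toy_train_std = (toy_train - toy_mean) / toy_stddev
toy_test_std = (toy_test - toy_mean) / toy_stddev

print(toy_mean.shape)              # one mean per feature
print(toy_train_std.mean(axis=0))  # approximately 0 for each feature
print(toy_train_std.std(axis=0))   # approximately 1 for each feature
print(toy_test_std)                # test values need not be centered at 0
```

Reusing the training statistics for the test split keeps information about the test data out of the preprocessing step.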

Create the model by first instantiating a model object without any layers, then add the layers one by one with the member method `add()`.
Next, complete the missing code in the hidden layers below, using 64 ReLU neurons per layer. Take care with the first layer: its input shape must match the number of features in the dataset. The output layer consists of a single neuron with a linear activation function.

In [4]:
# Create the model (training happens in the next cell).
model = Sequential()
model.add(Dense(64, activation='relu', input_shape=[x_train.shape[1]]))
model.add(Dense(64, activation='relu'))
model.add(Dense(1, activation='linear'))
# model.add(Dense(64, activation='#Fill', input_shape=[#Fill]))
# model.add(Dense(64, activation='#Fill'))
# model.add(Dense(1, activation='#Fill'))

Use MSE as the loss function and `Adam` as the optimizer, and tell the compile method that we are interested in the mean absolute error metric. Print a summary of the model with `model.summary()`, then start training. Experiment with different batch sizes and numbers of epochs, and record the results of these experiments.
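
As a quick illustration of how the loss and the metric differ, the sketch below computes both by hand with NumPy. The target and prediction values are made up for the example.

```python
import numpy as np

# Hypothetical targets and predictions, for illustration only.
y_true = np.array([10.0, 20.0, 30.0])
y_pred = np.array([12.0, 20.0, 24.0])

errors = y_pred - y_true
mse = np.mean(errors ** 2)     # squares each error, so the largest miss dominates
mae = np.mean(np.abs(errors))  # average miss, in the same units as the target

print("MSE:", mse)
print("MAE:", mae)
```

MSE penalizes large errors more heavily, which makes it a common training loss, while MAE is easier to read because it is in the same units as the target.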

In [5]:
BATCH_SIZE = 16
EPOCHS = 500

# model.compile(loss='#Fill', optimizer='#Fill', metrics =['mean_absolute_error'])
model.compile(loss='mse', optimizer='adam', metrics=['mean_absolute_error'])
model.summary()
history = model.fit(x_train, y_train, validation_data=(x_test, y_test), epochs=EPOCHS, batch_size=BATCH_SIZE, verbose=2, shuffle=True)

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense (Dense)               (None, 64)                896       
                                                                 
 dense_1 (Dense)             (None, 64)                4160      
                                                                 
 dense_2 (Dense)             (None, 1)                 65        
                                                                 
Total params: 5,121
Trainable params: 5,121
Non-trainable params: 0
_________________________________________________________________
Epoch 1/500
26/26 - 5s - loss: 536.0314 - mean_absolute_error: 21.3769 - val_loss: 502.5948 - val_mean_absolute_error: 20.6925 - 5s/epoch - 176ms/step
Epoch 2/500
26/26 - 0s - loss: 407.0558 - mean_absolute_error: 18.2799 - val_loss: 327.3828 - val_mean_absolute_error: 16.2475 - 100ms/epoch - 4ms/step
Epoch 3/500
26/
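
The parameter counts in the summary can be checked by hand: a dense layer has one weight per input-unit pair plus one bias per unit. The sketch below reproduces the numbers. The helper `dense_params` is a hypothetical name, and the 13 input features are inferred from the first layer's count, 896 = (13 + 1) * 64.

```python
def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit in a fully connected layer."""
    return n_inputs * n_units + n_units

n_features = 13            # assumption: inferred from 896 = (13 + 1) * 64
layer_units = [64, 64, 1]  # the two hidden layers and the output layer

counts = []
n_inputs = n_features
for n_units in layer_units:
    counts.append(dense_params(n_inputs, n_units))
    n_inputs = n_units     # this layer's outputs feed the next layer

print(counts)       # per-layer parameter counts
print(sum(counts))  # total trainable parameters
```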

After the training is done, we use our model to predict the price for the entire test set.

In [6]:
predictions = model.predict(x_test)



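Knowing the expected answers, let us check how far off the predictions are. Because the weights are randomly initialized, the predictions may not match the expected values exactly unless the random seeds are fixed: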

In [7]:
expected = [7.939793, 18.455063, 20.115505, 32.14037]
for i in range(0, 4):
    # Compare the absolute difference so that predictions that are too small also fail.
    assert abs(predictions[i, 0] - expected[i]) < 0.0001, "predicted value is too large"

AssertionError: predicted value is too large
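
A failed assertion only says that some prediction is off. A sketch such as the following reports the absolute error per sample instead. The `predictions` array here is a made-up stand-in for the first rows of `model.predict(x_test)`, which has shape `(n, 1)`.

```python
import numpy as np

expected = np.array([7.939793, 18.455063, 20.115505, 32.14037])
# Hypothetical stand-in for model.predict(x_test)[:4], shape (4, 1).
predictions = np.array([[7.5], [18.6], [20.1], [33.0]])

# Two-sided check: the absolute difference must stay below the tolerance.
abs_errors = np.abs(predictions[:4, 0] - expected)
within_tolerance = abs_errors < 0.0001

for value, target, err, ok in zip(predictions[:4, 0], expected,
                                  abs_errors, within_tolerance):
    print(f"predicted {value:.4f}, expected {target:.4f}, "
          f"abs error {err:.4f}, ok={ok}")
```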

## Part II: Tuning

Improve the above results by modifying the network to include more layers with more neurons. Also, apply L1 and L2 regularization using different weight decay parameters: start with $\lambda=0.1$, then try $\lambda=0.2$ and $\lambda=0.3$.
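
To see what the weight decay parameter does, the sketch below computes the combined L1 and L2 penalty that is added to the loss for a given weight matrix. The helper `l1_l2_penalty` and the weight values are illustrative assumptions. The example uses the same $\lambda$ for both terms, which is a simplification.

```python
import numpy as np

def l1_l2_penalty(weights, l1, l2):
    """Penalty added to the loss: l1 * sum(|w|) + l2 * sum(w^2)."""
    return l1 * np.sum(np.abs(weights)) + l2 * np.sum(np.square(weights))

# Hypothetical weight matrix, for illustration only.
w = np.array([[0.5, -1.0],
              [2.0, 0.0]])

penalties = {lam: l1_l2_penalty(w, l1=lam, l2=lam) for lam in (0.1, 0.2, 0.3)}
for lam, penalty in penalties.items():
    print(f"lambda={lam}: penalty={penalty:.3f}")
```

The penalty grows linearly with $\lambda$, so a larger value pushes the optimizer harder toward small weights.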

In [8]:
from tensorflow.keras.layers import Dropout
from tensorflow.keras.regularizers import l1_l2

# Create the model.
# Note: a single positional argument to l1_l2 sets only the L1 factor;
# pass l2=... as well to set the L2 factor explicitly.
model = Sequential()
model.add(Dense(128, activation='relu', input_shape=[x_train.shape[1]], kernel_regularizer=l1_l2(0.1)))
model.add(Dropout(0.2))
model.add(Dense(128, activation='relu', kernel_regularizer=l1_l2(0.1)))
model.add(Dropout(0.2))
model.add(Dense(64, activation='relu', kernel_regularizer=l1_l2(0.1)))
model.add(Dense(1, activation='linear'))

# Compile and train.
model.compile(loss='mse', optimizer='adam', metrics=['mean_absolute_error'])
model.summary()
history = model.fit(x_train, y_train, validation_data=(x_test, y_test), epochs=EPOCHS, batch_size=BATCH_SIZE, verbose=2, shuffle=True)

predictions_model1 = model.predict(x_test)

Model: "sequential_4"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_3 (Dense)             (None, 128)               1792      
                                                                 
 dropout (Dropout)           (None, 128)               0         
                                                                 
 dense_4 (Dense)             (None, 128)               16512     
                                                                 
 dropout_1 (Dropout)         (None, 128)               0         
                                                                 
 dense_5 (Dense)             (None, 64)                8256      
                                                                 
 dense_6 (Dense)             (None, 1)                 65        
                                                                 
Total params: 26,625
Trainable params: 26,625
Non-trai

In [9]:
# Create the model.
model = Sequential()
model.add(Dense(128, activation='relu', input_shape=[x_train.shape[1]], kernel_regularizer=l1_l2(0.2)))
model.add(Dropout(0.2))
model.add(Dense(128, activation='relu', kernel_regularizer=l1_l2(0.2)))
model.add(Dropout(0.2))
model.add(Dense(64, activation='relu', kernel_regularizer=l1_l2(0.2)))
model.add(Dense(1, activation='linear'))

model.compile(loss='mse', optimizer='adam', metrics=['mean_absolute_error'])
model.summary()
history = model.fit(x_train, y_train, validation_data=(x_test, y_test), epochs=EPOCHS, batch_size=BATCH_SIZE, verbose=2, shuffle=True)

predictions_model2 = model.predict(x_test)

Model: "sequential_6"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_8 (Dense)             (None, 128)               1792      
                                                                 
 dropout_2 (Dropout)         (None, 128)               0         
                                                                 
 dense_9 (Dense)             (None, 128)               16512     
                                                                 
 dropout_3 (Dropout)         (None, 128)               0         
                                                                 
 dense_10 (Dense)            (None, 64)                8256      
                                                                 
 dense_11 (Dense)            (None, 1)                 65        
                                                                 
Total params: 26,625
Trainable params: 26,625
Non-trai

In [10]:
# Create the model.
model = Sequential()
model.add(Dense(128, activation='relu', input_shape=[x_train.shape[1]], kernel_regularizer=l1_l2(0.3)))
model.add(Dropout(0.3))
model.add(Dense(128, activation='relu', kernel_regularizer=l1_l2(0.3)))
model.add(Dropout(0.3))
model.add(Dense(64, activation='relu', kernel_regularizer=l1_l2(0.3)))
model.add(Dense(1, activation='linear'))

model.compile(loss='mse', optimizer='adam', metrics=['mean_absolute_error'])
model.summary()
history = model.fit(x_train, y_train, validation_data=(x_test, y_test), epochs=EPOCHS, batch_size=BATCH_SIZE, verbose=2, shuffle=True)

predictions_model3 = model.predict(x_test)

Model: "sequential_8"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_12 (Dense)            (None, 128)               1792      
                                                                 
 dropout_4 (Dropout)         (None, 128)               0         
                                                                 
 dense_13 (Dense)            (None, 128)               16512     
                                                                 
 dropout_5 (Dropout)         (None, 128)               0         
                                                                 
 dense_14 (Dense)            (None, 64)                8256      
                                                                 
 dense_15 (Dense)            (None, 1)                 65        
                                                                 
Total params: 26,625
Trainable params: 26,625
Non-trai

Retry the above using dropout regularization.  What do you notice?
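
For intuition about what a dropout layer does during training, here is a minimal NumPy sketch of inverted dropout: each activation is zeroed with probability `rate` and the survivors are scaled by `1 / (1 - rate)`. The function name `dropout_train` is hypothetical, and this is an illustration rather than the Keras implementation.

```python
import numpy as np

def dropout_train(activations, rate, rng):
    """Zero each activation with probability `rate`; rescale the rest by 1 / (1 - rate)."""
    keep_mask = rng.random(activations.shape) >= rate
    return activations * keep_mask / (1.0 - rate)

rng = np.random.default_rng(0)
activations = np.ones((2, 5))
dropped = dropout_train(activations, rate=0.2, rng=rng)
print(dropped)  # each entry is either 0.0 or 1.25
```

The rescaling keeps the expected activation unchanged, so the layer can be skipped at prediction time.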

Repeat the above by trying multiple parameter combinations:
- Using a combination of 64 to 128 neurons
- Using Dropout of 0.2 and 0.3
- Using L2 = 0.1 and 0.2

For each case, print the first 4 predictions and record the results.
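
One way to keep track of the combinations listed above is to enumerate them up front. The sketch below only builds the list of configurations. The dictionary keys are hypothetical names, and the model-building and training step for each configuration is left out.

```python
from itertools import product

neuron_options = [64, 128]
dropout_options = [0.2, 0.3]
l2_options = [0.1, 0.2]

# One dictionary per combination of the three settings.
configs = [
    {"neurons": n, "dropout": d, "l2": l2}
    for n, d, l2 in product(neuron_options, dropout_options, l2_options)
]

for i, cfg in enumerate(configs, start=1):
    print(i, cfg)
```

With two options per setting this yields eight runs, which is still small enough to record by hand.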

In [16]:
for i in range(0, 4):
    print('Prediction Model1: ', predictions_model1[i, 0],
          ', true value: ', y_test[i])

print('\n')

for i in range(0, 4):
    print('Prediction Model2: ', predictions_model2[i, 0],
          ', true value: ', y_test[i])

print('\n')

for i in range(0, 4):
    print('Prediction Model3: ', predictions_model3[i, 0],
          ', true value: ', y_test[i])

Prediction Model1:  12.534055 , true value:  7.2
Prediction Model1:  19.41805 , true value:  18.8
Prediction Model1:  20.773521 , true value:  19.0
Prediction Model1:  34.657 , true value:  27.0


Prediction Model2:  14.28108 , true value:  7.2
Prediction Model2:  18.795374 , true value:  18.8
Prediction Model2:  21.04365 , true value:  19.0
Prediction Model2:  36.012703 , true value:  27.0


Prediction Model3:  15.09882 , true value:  7.2
Prediction Model3:  18.009422 , true value:  18.8
Prediction Model3:  20.348314 , true value:  19.0
Prediction Model3:  34.37879 , true value:  27.0


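As we can see, the prediction for the lowest-valued of the four examples (true value 7.2) drifts further from the target as the regularization strength (and, in the last model, the dropout rate) increases: 12.5, then 14.3, then 15.1. This suggests that heavier regularization weakens the model's ability to fit the data and pushes it toward underfitting.
Moreover, all three models struggle with the extreme values 7.2 and 27.0, so more tuning is needed, for example adding more neurons or trying batch normalization.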
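
To put a number on the comparison, the sketch below computes the mean absolute error of each model over the four printed samples, using the values copied from the output above. Four samples are far too few for a reliable ranking, so treat the result as a rough indication only.

```python
import numpy as np

# True values and predictions copied from the printed output above.
y_true = np.array([7.2, 18.8, 19.0, 27.0])
recorded = {
    "Model1": np.array([12.534055, 19.41805, 20.773521, 34.657]),
    "Model2": np.array([14.28108, 18.795374, 21.04365, 36.012703]),
    "Model3": np.array([15.09882, 18.009422, 20.348314, 34.37879]),
}

# Mean absolute error per model over these four samples only.
mae = {name: float(np.mean(np.abs(pred - y_true)))
       for name, pred in recorded.items()}

for name, value in mae.items():
    print(f"{name}: MAE over 4 samples = {value:.3f}")
```

For a fairer comparison, the same computation can be run on the full `y_test` and each model's complete prediction array.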