
# AAI612: Deep Learning & its Applications

*Notebook 5.4: Predicting Housing Prices*



In [None]:
"""
The MIT License (MIT)
Copyright (c) 2021 NVIDIA
Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
the Software, and to permit persons to whom the Software is furnished to do so,
subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
"""


## Part I: Create the model

This example uses a neural network to solve a regression problem on the Boston housing dataset. The dataset ships with Keras and can be loaded through `keras.datasets.boston_housing`:

In [1]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
import numpy as np

# Load the data (standardization happens in the next cell).
boston_housing = keras.datasets.boston_housing
(raw_x_train, y_train), (raw_x_test, y_test) = boston_housing.load_data()

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/boston_housing.npz
[1m57026/57026[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 0us/step


We standardize both the training and test data using the mean and standard deviation of the training data. The parameter `axis=0` ensures that the mean and standard deviation are computed for each input variable separately, so each result is a vector with one entry per input variable rather than a single scalar. That is, the standardized value of the nitric oxides concentration is not affected by the values of the per capita crime rate or any of the other variables.

In [2]:
x_mean = np.mean(raw_x_train, axis=0)
x_stddev = np.std(raw_x_train, axis=0)
x_train = (raw_x_train - x_mean) / x_stddev
x_test = (raw_x_test - x_mean) / x_stddev
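
As a quick sanity check on the standardization, the sketch below uses a small synthetic array (an assumption, standing in for `raw_x_train`) to show that with `axis=0` every column ends up with a mean close to 0 and a standard deviation close to 1, independently of the other columns.

```python
import numpy as np

# Synthetic stand-in for raw_x_train: 5 samples, 2 features on very different scales.
raw = np.array([[0.1, 300.0],
                [0.2, 310.0],
                [0.4, 280.0],
                [0.3, 330.0],
                [0.5, 290.0]])

col_mean = np.mean(raw, axis=0)  # one mean per column, shape (2,)
col_std = np.std(raw, axis=0)    # one standard deviation per column, shape (2,)
standardized = (raw - col_mean) / col_std

print(col_mean.shape, col_std.shape)
print(np.mean(standardized, axis=0))  # close to 0 for each column
print(np.std(standardized, axis=0))   # close to 1 for each column
```

The test data will generally not have mean exactly 0 and standard deviation exactly 1 after this step, because it is scaled with the training statistics.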

Create the model either by instantiating an empty `Sequential` object and adding layers one by one with the member method `add()`, or by passing the list of layers to the constructor.
Next, complete the hidden layers using 64 ReLU neurons per layer. Be careful with the first layer, as its input shape must match the number of features in the dataset. The output layer consists of a single neuron with a linear activation function.

In [4]:
from tensorflow.keras.layers import Input

# Create the model.
model = Sequential([
    Input(shape=(x_train.shape[1],)),
    Dense(64, activation='relu'),
    Dense(64, activation='relu'),
    Dense(1, activation='linear')
])



Use MSE as the loss function and the `Adam` optimizer, and tell the `compile` method that we also want to track the mean absolute error metric. Print a summary of the model with `model.summary()`, then start training. Experiment with different batch sizes and numbers of epochs, and record the results of these experiments.

In [5]:
# Compile the model
model.compile(loss='mse', optimizer='adam', metrics=['mean_absolute_error'])

# Print model summary
model.summary()
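
To make the difference between the loss (MSE) and the tracked metric (MAE) concrete, the following sketch computes both by hand with NumPy. The numbers are made up for illustration and are not taken from the dataset. MSE squares each error, so it is expressed in squared target units and punishes large misses more heavily; MAE stays in the units of the target.

```python
import numpy as np

# Illustrative values only (not from the Boston housing data).
y_true = np.array([10.0, 20.0, 30.0])
y_pred = np.array([12.0, 19.0, 24.0])

errors = y_pred - y_true          # [ 2., -1., -6.]
mse = np.mean(errors ** 2)        # (4 + 1 + 36) / 3
mae = np.mean(np.abs(errors))     # (2 + 1 + 6) / 3

print(f"MSE: {mse:.4f}")
print(f"MAE: {mae:.4f}")
```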


In [6]:
# Define training parameters
BATCH_SIZE = 16
EPOCHS = 500

# Train the model
history = model.fit(
    x_train, y_train,
    validation_data=(x_test, y_test),
    epochs=EPOCHS,
    batch_size=BATCH_SIZE,
    verbose=2,
    shuffle=True
)


Epoch 1/500
26/26 - 2s - 63ms/step - loss: 537.4993 - mean_absolute_error: 21.2476 - val_loss: 500.3977 - val_mean_absolute_error: 20.4594
Epoch 2/500
26/26 - 0s - 5ms/step - loss: 398.7477 - mean_absolute_error: 17.8396 - val_loss: 315.6920 - val_mean_absolute_error: 15.6411
Epoch 3/500
26/26 - 0s - 5ms/step - loss: 201.5291 - mean_absolute_error: 11.7582 - val_loss: 119.3491 - val_mean_absolute_error: 9.0904
Epoch 4/500
26/26 - 0s - 7ms/step - loss: 75.0974 - mean_absolute_error: 6.4379 - val_loss: 57.1861 - val_mean_absolute_error: 5.8933
Epoch 5/500
26/26 - 0s - 10ms/step - loss: 42.5447 - mean_absolute_error: 4.7707 - val_loss: 37.8622 - val_mean_absolute_error: 4.7354
Epoch 6/500
26/26 - 0s - 5ms/step - loss: 30.5837 - mean_absolute_error: 3.9813 - val_loss: 30.0911 - val_mean_absolute_error: 4.2605
Epoch 7/500
26/26 - 0s - 5ms/step - loss: 25.4331 - mean_absolute_error: 3.6007 - val_loss: 27.6822 - val_mean_absolute_error: 4.0415
Epoch 8/500
26/26 - 0s - 16ms/step - loss: 23.096
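
The `26/26` at the start of each log line is the number of mini-batches processed per epoch. The sketch below reproduces that count, assuming the default Keras split of the Boston housing data (404 training and 102 test examples) and the default `predict()` batch size of 32; both figures are assumptions that can be checked with `x_train.shape` and `x_test.shape`.

```python
import math

n_train = 404      # assumed size of the default training split
n_test = 102       # assumed size of the default test split
batch_size = 16    # BATCH_SIZE used for training

# The last batch may be smaller than batch_size, hence the ceiling.
steps_per_epoch = math.ceil(n_train / batch_size)
predict_steps = math.ceil(n_test / 32)  # 32 is the assumed predict() default

print(steps_per_epoch)  # 26
print(predict_steps)    # 4
```

Halving the batch size doubles the number of weight updates per epoch, which is worth keeping in mind when comparing runs with different batch sizes over the same number of epochs.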

After the training is done, we use our model to predict the price for the entire test set.

Knowing the answers, let us check how far off the predictions are:

In [11]:
# Make predictions on the test set
predictions = model.predict(x_test)

# Hard-coded reference values for the first four test examples (not read from y_test)
expected = [7.939793, 18.455063, 20.115505, 32.14037]

# Check how far off our predictions are from expected values
for i in range(4):
    print(f"Prediction: {predictions[i, 0]:.4f}, Expected: {expected[i]:.4f}")



4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 9ms/step
Prediction: -0.0430, Expected: 7.9398
Prediction: -0.2078, Expected: 18.4551
Prediction: -0.1778, Expected: 20.1155
Prediction: -0.4309, Expected: 32.1404


In [12]:
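# Take absolute values of the predictions. Note: this only flips the sign and does
# not bring the predictions closer to the expected values. Outputs this close to
# zero may indicate that the model was not trained when predict() ran.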
predictions = np.abs(model.predict(x_test))

# Compare with expected values
for i in range(4):
    print(f"Prediction: {predictions[i, 0]:.4f}, Expected: {expected[i]:.4f}")


4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 9ms/step
Prediction: 0.0430, Expected: 7.9398
Prediction: 0.2078, Expected: 18.4551
Prediction: 0.1778, Expected: 20.1155
Prediction: 0.4309, Expected: 32.1404
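
Taking the absolute value changes only the sign of each prediction, not its magnitude, so it cannot close a gap of this size. The sketch below checks this with the four printed values, copied by hand from the output above, by computing the mean absolute error against the reference values before and after applying `np.abs`.

```python
import numpy as np

# Values copied from the printed output above.
raw_predictions = np.array([-0.0430, -0.2078, -0.1778, -0.4309])
reference = np.array([7.9398, 18.4551, 20.1155, 32.1404])

mae_raw = np.mean(np.abs(raw_predictions - reference))
mae_abs = np.mean(np.abs(np.abs(raw_predictions) - reference))

print(f"MAE with raw predictions:      {mae_raw:.2f}")
print(f"MAE with absolute predictions: {mae_abs:.2f}")
```

Both errors stay far above the validation MAE reported during training, which suggests the problem lies in the state of the model at prediction time rather than in the sign of its output.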


## Part II: Tuning

Improve the above results by modifying the network to include more layers with more neurons. Also apply L1 and L2 regularization with different weight decay parameters: start with $\lambda=0.1$, then try $\lambda=0.2$ and $\lambda=0.3$.

In [13]:
from tensorflow.keras.regularizers import l2
from tensorflow.keras.layers import Dropout

# Create a better model with L2 regularization
model = Sequential([
    Input(shape=(x_train.shape[1],)),  # Input layer
    Dense(128, activation='relu', kernel_regularizer=l2(0.1)),  # Wider first hidden layer with L2 regularization
    Dense(64, activation='relu', kernel_regularizer=l2(0.1)),  # Second hidden layer with L2 regularization
    Dense(1, activation='linear')  # Output layer
])

# Compile the model
model.compile(loss='mse', optimizer='adam', metrics=['mean_absolute_error'])

# Print model summary
model.summary()
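
To see what the weight decay parameter does, the sketch below computes the L1 and L2 penalty terms by hand for a small, made-up weight matrix. It assumes the usual definitions, which to our understanding are also what the Keras `l1` and `l2` regularizers use: $\lambda\sum|w|$ for L1 and $\lambda\sum w^2$ for L2, added to the loss for each regularized layer. The helper names are hypothetical.

```python
import numpy as np

def l1_penalty(weights, lam):
    """L1 penalty: lam times the sum of absolute weights."""
    return lam * np.sum(np.abs(weights))

def l2_penalty(weights, lam):
    """L2 penalty: lam times the sum of squared weights."""
    return lam * np.sum(weights ** 2)

# Made-up weights for illustration only.
w = np.array([[0.5, -1.0],
              [2.0,  0.0]])

for lam in (0.1, 0.2, 0.3):
    print(f"lambda={lam}: L1={l1_penalty(w, lam):.3f}, L2={l2_penalty(w, lam):.3f}")
```

Because the penalty is part of the reported loss, loss values from regularized and unregularized runs are not directly comparable; the mean absolute error is the fairer comparison.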


In [14]:

history = model.fit(x_train, y_train, validation_data=(x_test, y_test), epochs=EPOCHS, batch_size=BATCH_SIZE, verbose=2, shuffle=True)

predictions = model.predict(x_test)

Epoch 1/500
26/26 - 2s - 80ms/step - loss: 548.5414 - mean_absolute_error: 21.2497 - val_loss: 505.2566 - val_mean_absolute_error: 20.2639
Epoch 2/500
26/26 - 0s - 8ms/step - loss: 390.2574 - mean_absolute_error: 17.2266 - val_loss: 287.3491 - val_mean_absolute_error: 14.5443
Epoch 3/500
26/26 - 0s - 10ms/step - loss: 172.4135 - mean_absolute_error: 10.1491 - val_loss: 91.1993 - val_mean_absolute_error: 7.2763
Epoch 4/500
26/26 - 0s - 11ms/step - loss: 66.8534 - mean_absolute_error: 5.4623 - val_loss: 55.2462 - val_mean_absolute_error: 4.9666
Epoch 5/500
26/26 - 0s - 6ms/step - loss: 45.6490 - mean_absolute_error: 4.1571 - val_loss: 45.3201 - val_mean_absolute_error: 4.5084
Epoch 6/500
26/26 - 0s - 13ms/step - loss: 38.8128 - mean_absolute_error: 3.6402 - val_loss: 40.1794 - val_mean_absolute_error: 4.1464
Epoch 7/500
26/26 - 0s - 10ms/step - loss: 35.2912 - mean_absolute_error: 3.4628 - val_loss: 38.3641 - val_mean_absolute_error: 3.9724
Epoch 8/500
26/26 - 0s - 6ms/step - loss: 32.40

Retry the above using dropout regularization.  What do you notice?
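
As background for the dropout experiment, here is a minimal NumPy sketch of what a dropout layer does to activations during training, assuming the common "inverted dropout" formulation: a fraction `rate` of the values is set to zero and the survivors are scaled by `1 / (1 - rate)` so that the expected activation is unchanged. The array and seed are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily
rate = 0.2                      # fraction of activations to drop

activations = np.ones((1000, 64))  # stand-in for the output of a 64-unit layer

# Keep each activation with probability 1 - rate, then rescale the survivors.
keep_mask = rng.random(activations.shape) >= rate
dropped = activations * keep_mask / (1.0 - rate)

print(f"fraction zeroed: {1.0 - keep_mask.mean():.3f}")
print(f"mean activation after dropout: {dropped.mean():.3f}")
```

At prediction time dropout is switched off, so the noise only affects training.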

Repeat the above by trying multiple parameter combinations:
- Using a combination of 64 to 128 neurons
- Using Dropout of 0.2 and 0.3
- Using L2 = 0.1 and 0.2

For each case, print the first four predictions and record the results.
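
One way to keep the experiments organized is to enumerate the combinations up front and fill in a results record per run. The sketch below only builds the grid; the dictionary keys and the `results` structure are hypothetical, and the training step is left as a placeholder comment.

```python
from itertools import product

neuron_options = (64, 128)
dropout_options = (0.2, 0.3)
l2_options = (0.1, 0.2)

# Every combination of the three settings: 2 * 2 * 2 = 8 runs.
configs = [
    {"neurons": n, "dropout": d, "l2": lam}
    for n, d, lam in product(neuron_options, dropout_options, l2_options)
]

results = []
for cfg in configs:
    # Placeholder: build, compile and fit a model for cfg here, then store
    # the final validation MAE and the first four predictions.
    results.append({**cfg, "val_mae": None, "first_four": None})

print(len(configs))
for cfg in configs:
    print(cfg)
```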

In [15]:
for i in range(0, 4):
    print('Prediction: ', predictions[i, 0],
          ', true value: ', y_test[i])

Prediction:  8.816032 , true value:  7.2
Prediction:  19.020927 , true value:  18.8
Prediction:  20.911394 , true value:  19.0
Prediction:  34.05923 , true value:  27.0
