
# Week 12 Template

This template provides code to load the [California housing dataset](https://scikit-learn.org/stable/datasets/real_world.html#california-housing-dataset) from scikit-learn. In this dataset each observation represents a census block group. The features are numeric properties of the block group, such as the median income, median house age, and average number of bedrooms. The target variable is the median house value for that block group (in hundreds of thousands of dollars). Refer to the scikit-learn user guide for details.

For this assignment, you will need to build and train a deep, fully connected neural network in Keras that predicts the median house value from the given features. Note that this is a regression problem.

Your approach should:

* Scale the data and perform preprocessing as you see fit.  You may use scikit-learn for preprocessing.
* Predict unseen observations (validation and test) with a mean absolute percentage error (MAPE) of less than 25\%.
* Use a `ModelCheckpoint` callback during training to save the weights corresponding to the lowest validation MAPE.  (You will need to use the `validation_split` parameter or provide validation data.)
* Load the weights from the best model after training.
* Evaluate the best model against the test dataset

To receive full credit, your notebook must show that evaluation of the model on the test dataset yields an MAPE of 25\% or less.
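
As a point of reference for the 25% threshold, MAPE is the mean of the absolute errors divided by the absolute true values. The snippet below is a minimal sketch of that formula in NumPy; the function name `mape` and the toy values are illustrative and not part of the assignment. It returns a fraction, as scikit-learn's `mean_absolute_percentage_error` does, so it is multiplied by 100 to get a percentage. Unlike scikit-learn's version, it does not guard against true values of zero.

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error, returned as a fraction (0.25 == 25%)."""
    # ravel() flattens an (n, 1) prediction array, which is the shape
    # Keras returns for a single-output model, so the shapes line up.
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    return float(np.mean(np.abs(y_true - y_pred) / np.abs(y_true)))

# Illustrative values only: relative errors are 10%, 25%, and 0%.
y_true_toy = [1.0, 2.0, 4.0]
y_pred_toy = [[1.1], [1.5], [4.0]]

print(f"Toy MAPE: {mape(y_true_toy, y_pred_toy) * 100:.2f}%")  # Toy MAPE: 11.67%
```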


In [1]:
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Input, Dense, Dropout, BatchNormalization
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping
from sklearn.metrics import mean_absolute_percentage_error

In [2]:
california_housing = fetch_california_housing(as_frame=False)
X = california_housing.data
y = california_housing.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.15)

In [3]:
# Standardize each feature to zero mean and unit variance (statistics are fit on the
# training set only); this helps the neural network converge faster
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
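
The cell above fits the scaler on the training data and only applies it to the test data, so no test-set statistics leak into preprocessing. The sketch below shows roughly what that amounts to in plain NumPy, assuming the scaler's default settings (per-feature mean and population standard deviation). The toy arrays and variable names are made up for illustration.

```python
import numpy as np

# Hypothetical toy data: 4 training rows and 2 test rows, 2 features each.
X_train_toy = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]])
X_test_toy = np.array([[2.5, 25.0], [5.0, 50.0]])

# "fit": per-feature statistics come from the training rows only.
train_mean = X_train_toy.mean(axis=0)
train_std = X_train_toy.std(axis=0)  # population std (ddof=0)

# "transform": the same training statistics are applied to both sets.
X_train_std = (X_train_toy - train_mean) / train_std
X_test_std = (X_test_toy - train_mean) / train_std

print(X_train_std.mean(axis=0))  # approximately zero for every feature
print(X_test_std)                # test rows are not forced to mean 0 / std 1
```

The scaled test set generally does not have exactly zero mean or unit variance, which is expected: it is expressed in the training set's units.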

In [4]:
# Define the model architecture
def create_model():
    return Sequential([
        Input(shape=(X_train.shape[1],)),
        Dense(64, activation='relu'),
        BatchNormalization(),
        Dropout(0.2),
        Dense(32, activation='relu'),
        BatchNormalization(),
        Dropout(0.2),
        Dense(16, activation='relu'),
        Dense(1)
    ])

This model is a fully connected neural network for regression with:

- First hidden layer: 64 neurons for initial feature extraction
- BatchNormalization after each of the first two hidden layers to stabilize training
- Dropout (rate 0.2) after each BatchNormalization layer to reduce overfitting
- Progressively smaller hidden layers (32, then 16 neurons) to learn more compact, abstract representations
- ReLU activations in the hidden layers for non-linear transformations
- Final layer: a single neuron with no activation (linear output) for the house value prediction
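
To get a feel for the size of this architecture, the parameter count can be worked out by hand. The sketch below assumes the 8 input features of the California housing dataset and the usual per-layer formulas (weights plus biases for `Dense`; four vectors per `BatchNormalization` layer, two of them non-trainable moving statistics). The helper names are illustrative; `model.summary()` is the authoritative source.

```python
def dense_params(n_in, n_out):
    # One weight per input/output pair plus one bias per output unit.
    return n_in * n_out + n_out

def batchnorm_params(n_units):
    # gamma and beta (trainable) plus moving mean and variance (non-trainable).
    return 4 * n_units

n_features = 8  # the California housing dataset has 8 numeric features

layer_params = [
    ("Dense(64)", dense_params(n_features, 64)),
    ("BatchNormalization", batchnorm_params(64)),
    ("Dense(32)", dense_params(64, 32)),
    ("BatchNormalization", batchnorm_params(32)),
    ("Dense(16)", dense_params(32, 16)),
    ("Dense(1)", dense_params(16, 1)),
]  # Dropout layers have no parameters

total_params = sum(count for _, count in layer_params)
non_trainable_params = 2 * 64 + 2 * 32  # moving statistics of both BN layers
trainable_params = total_params - non_trainable_params

for name, count in layer_params:
    print(f"{name:<20}{count:>6}")
print(f"total={total_params}, trainable={trainable_params}")  # total=3585, trainable=3393
```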


In [5]:
model = create_model()

In [6]:
# compiling model
model.compile(optimizer='adam', loss='mean_squared_error', metrics=['mae'])

In [7]:
# Early-stopping callback: stop training when validation MAE has not improved by at
# least min_delta for 5 consecutive epochs, then restore the best weights
early_stop = EarlyStopping(
    monitor='val_mae',
    patience=5,
    restore_best_weights=True,
    min_delta=0.001
)
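
The interaction between `patience` and `min_delta` can be easy to misread, so here is a simplified pure-Python sketch of the stopping rule for a metric that should decrease. It is an approximation of the callback's behavior, not Keras code; the function `simulate_early_stopping` and the metric values are hypothetical.

```python
def simulate_early_stopping(val_metrics, patience, min_delta):
    """Return (stopped_epoch, best_epoch), both 1-based.

    stopped_epoch is None if training would run through all epochs.
    """
    best = float("inf")
    best_epoch = None
    wait = 0
    for epoch, value in enumerate(val_metrics, start=1):
        if value < best - min_delta:
            # Counts as an improvement only if it beats the best by min_delta.
            best, best_epoch, wait = value, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return None, best_epoch

# Hypothetical validation MAE values. Epoch 3 is lower than epoch 2, but by
# less than min_delta, so it does not reset the patience counter.
hypothetical_val_mae = [0.50, 0.45, 0.4495, 0.46, 0.47]

print(simulate_early_stopping(hypothetical_val_mae, patience=2, min_delta=0.001))
# (4, 2): stops after epoch 4; epoch 2 holds the best value
```

With `restore_best_weights=True`, as configured above, the model would end up with the weights from the best epoch rather than the last one.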

In [8]:
# Checkpoint callback: save the weights only when validation MAE reaches a new minimum
checkpoint_path = 'best_model_weights.weights.h5'
model_checkpoint = ModelCheckpoint(
    filepath=checkpoint_path,
    monitor='val_mae',
    mode='min',
    save_best_only=True,
    save_weights_only=True
)
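
With `save_best_only=True` and `mode='min'`, the checkpoint file is overwritten only on epochs where the monitored value drops below the best value seen so far. The sketch below mimics that rule in plain Python using the `val_mae` values from the first six epochs of the training log in this notebook. The function name `epochs_that_save` is hypothetical, and the logic is a simplified approximation of the callback.

```python
def epochs_that_save(val_metrics):
    """Return the 1-based epochs at which a 'min'-mode checkpoint would be written."""
    best = float("inf")
    saved = []
    for epoch, value in enumerate(val_metrics, start=1):
        if value < best:  # strictly better than every earlier epoch
            best = value
            saved.append(epoch)
    return saved

# val_mae for epochs 1 to 6, copied from the training log.
logged_val_mae = [0.5308, 0.4508, 0.4545, 0.4452, 0.4290, 0.4417]

print(epochs_that_save(logged_val_mae))  # [1, 2, 4, 5]
```

After these six epochs the file on disk would hold the epoch 5 weights, since that is the last epoch that set a new minimum.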

In [9]:
# Training model
model_history = model.fit(
    X_train_scaled, y_train,
    validation_split=0.2,
    epochs=100,
    batch_size=16,
    callbacks=[model_checkpoint, early_stop],
    verbose=1
)

Epoch 1/100
878/878 ━━━━━━━━━━━━━━━━━━━━ 15s 5ms/step - loss: 1.3844 - mae: 0.8639 - val_loss: 0.5632 - val_mae: 0.5308
Epoch 2/100
878/878 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - loss: 0.6220 - mae: 0.5880 - val_loss: 0.4037 - val_mae: 0.4508
Epoch 3/100
878/878 ━━━━━━━━━━━━━━━━━━━━ 3s 4ms/step - loss: 0.5671 - mae: 0.5527 - val_loss: 0.4093 - val_mae: 0.4545
Epoch 4/100
878/878 ━━━━━━━━━━━━━━━━━━━━ 4s 3ms/step - loss: 0.4992 - mae: 0.5188 - val_loss: 0.3900 - val_mae: 0.4452
Epoch 5/100
878/878 ━━━━━━━━━━━━━━━━━━━━ 3s 3ms/step - loss: 0.4693 - mae: 0.4997 - val_loss: 0.3671 - val_mae: 0.4290
Epoch 6/100
878/878 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - loss: 0.4490 - mae: 0.4920 - val_loss: 0.3929 - val_mae: 0.4417
Epoch 7/100
878/878 ━━━━━━━━━━━━━━━━━━━━ 3s 3ms

In [10]:
# Loading best weights
model.load_weights(checkpoint_path)

In [11]:
# Testing on test data
y_pred = model.predict(X_test_scaled)

97/97 ━━━━━━━━━━━━━━━━━━━━ 1s 5ms/step


In [12]:
mape = mean_absolute_percentage_error(y_test, y_pred)
print(f"Test MAPE: {mape * 100:.2f}%")

Test MAPE: 22.08%
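
As a sanity check on the data splits, the step counts shown in the progress bars (878 per training epoch, 97 for prediction) can be reproduced from the split sizes. The sketch below assumes the dataset's 20,640 samples, that `train_test_split` rounds the test size up, that `validation_split` holds out the last 20% of the training rows, and that `model.predict` uses its default batch size of 32. The variable names are illustrative.

```python
import math

n_samples = 20640         # rows in the California housing dataset
test_size = 0.15          # from train_test_split
validation_split = 0.2    # from model.fit
fit_batch_size = 16       # from model.fit
predict_batch_size = 32   # assumed Keras default for model.predict

n_test = math.ceil(test_size * n_samples)
n_train = n_samples - n_test

# Rows actually used for gradient updates after the validation hold-out.
n_fit = math.floor(n_train * (1 - validation_split))
n_val = n_train - n_fit

fit_steps = math.ceil(n_fit / fit_batch_size)
predict_steps = math.ceil(n_test / predict_batch_size)

print(f"train={n_fit}, val={n_val}, test={n_test}")    # train=14035, val=3509, test=3096
print(f"steps per epoch={fit_steps}, predict steps={predict_steps}")  # 878 and 97
```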
