
# California Housing Price Prediction using Neural Networks

### Importing Libraries

This cell imports all the necessary libraries for the task, including libraries for numerical operations (numpy), building and training neural networks (tensorflow, keras), loading a dataset (sklearn.datasets), splitting data (sklearn.model_selection), scaling data (sklearn.preprocessing), and evaluating model performance (sklearn.metrics).

In [35]:
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
# Regression metrics; accuracy_score is a classification metric and is not used here
from sklearn.metrics import mean_squared_error, r2_score

### Loading the Dataset

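This cell loads the California Housing dataset with scikit-learn's fetch_california_housing function and separates the features (X) from the target variable (y). The dataset has 20,640 rows and 8 numeric features; the target is the median house value of a district, expressed in units of $100,000.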

In [36]:
housing = fetch_california_housing()
X, y = housing.data, housing.target

### Splitting the Data

This cell splits the dataset into training and testing sets. 80% of the data is used for training and 20% for testing. A fixed random_state ensures the split is the same each time the code is run.

In [37]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

### Scaling the Features

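This cell scales the features using StandardScaler, which rescales each feature to zero mean and unit variance. Neural networks usually converge faster when all inputs are on a comparable scale. The scaler is fitted only on the training data and then applied to the test data, so no test-set statistics leak into training.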

In [38]:
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
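
As a rough illustration of what `fit_transform` and `transform` do above, the sketch below standardizes one toy feature column with the standard library only. The values and the helper names `fit_standardizer` and `standardize` are hypothetical; the one detail carried over from scikit-learn is that `StandardScaler` divides by the population standard deviation.

```python
import statistics

def fit_standardizer(column):
    """Return the mean and population standard deviation of a training column."""
    return statistics.fmean(column), statistics.pstdev(column)

def standardize(column, mean, std):
    """Apply (x - mean) / std using statistics learned elsewhere."""
    return [(x - mean) / std for x in column]

# Toy values standing in for one feature (hypothetical, not from the dataset).
train_col = [2.0, 4.0, 6.0, 8.0]
test_col = [5.0, 10.0]

mean, std = fit_standardizer(train_col)          # statistics come from the training data only
train_scaled = standardize(train_col, mean, std)
test_scaled = standardize(test_col, mean, std)   # the test data reuses the training statistics

print(train_scaled)
print(test_scaled)
```

The scaled training column has zero mean and unit variance by construction; the scaled test column generally does not, which is expected.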

### Building the Model

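This cell defines the neural network with the Keras Sequential API. It consists of three hidden Dense layers (64, 32, and 16 units) with ReLU activation, followed by a single-unit output layer with linear activation, which is the usual choice for regression. No input shape is declared, so Keras creates the weights the first time the model is called on data.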

In [39]:
model = Sequential([
    Dense(64, activation='relu'),
    Dense(32, activation='relu'),
    Dense(16, activation='relu'),
    Dense(1, activation='linear'),
])
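
To get a feel for the size of this network, the number of trainable parameters can be worked out by hand. The sketch below assumes 8 input features (the feature count of the California Housing dataset); `dense_param_count` is a hypothetical helper, not a Keras function.

```python
def dense_param_count(n_inputs, layer_units):
    """Count weights and biases for a stack of fully connected layers."""
    counts = []
    fan_in = n_inputs
    for units in layer_units:
        counts.append(fan_in * units + units)  # weight matrix plus bias vector
        fan_in = units                         # this layer's output feeds the next layer
    return counts

# Assumption: 8 input features, as in the California Housing dataset.
per_layer = dense_param_count(8, [64, 32, 16, 1])
total_params = sum(per_layer)

print("Parameters per layer:", per_layer)
print("Total parameters:", total_params)
```

Once the model has been built by a first call to `fit`, `model.summary()` should report the same totals if the assumption about the input width holds.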

### Compiling the Model

This cell compiles the model. It uses the Adam optimizer with a learning rate of 1e-3, 'mean_squared_error' as the loss function (appropriate for regression), and 'mae' (Mean Absolute Error) as a metric to monitor during training.

In [40]:
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3), loss='mean_squared_error', metrics=['mae'])

### Defining Callbacks

This cell defines two callbacks. EarlyStopping stops training once the validation loss has not improved for 10 epochs and restores the weights from the best epoch. ReduceLROnPlateau halves the learning rate (factor=0.5) once the validation loss has not improved for 5 epochs.

In [41]:
early_stop = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5)
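
The interaction between the two patience values can be sketched with a small simulation. This is a simplified model of the plateau logic, not the Keras implementation: it ignores options such as `min_delta`, `cooldown`, and `min_lr`, and the function name `simulate_callbacks` and the validation losses are made up for illustration.

```python
def simulate_callbacks(val_losses, lr=1e-3, factor=0.5, lr_patience=5, stop_patience=10):
    """Simplified plateau logic; not the Keras implementation."""
    best = float("inf")
    since_best = 0       # epochs since the best val_loss (early stopping counter)
    since_lr_step = 0    # epochs without improvement since the last LR cut
    history = []
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, since_best, since_lr_step = loss, 0, 0
        else:
            since_best += 1
            since_lr_step += 1
            if since_lr_step >= lr_patience:
                lr *= factor          # cut the learning rate on a plateau
                since_lr_step = 0
        history.append((epoch, loss, lr))
        if since_best >= stop_patience:
            break                     # early stopping triggers
    return best, history

# Hypothetical validation losses: two improving epochs, then a long plateau.
losses = [0.5, 0.4] + [0.45] * 12
best_loss, history = simulate_callbacks(losses)

for epoch, loss, lr in history:
    print(f"epoch {epoch:2d}  val_loss={loss:.2f}  lr={lr:.2e}")
print("best val_loss:", best_loss)
```

With a patience of 5 for the learning-rate cut and 10 for stopping, the rate is cut twice during an unbroken plateau before training ends.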

### Training the Model

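This cell trains the model on the scaled training data for at most 75 epochs with a batch size of 16. The last 20% of the training data is held out for validation, and both callbacks are applied. The log below starts at a learning rate of 6.25e-05 rather than 1e-3, which suggests that the cell was re-run on an already trained model after ReduceLROnPlateau had halved the rate four times. The captured output is truncated partway through epoch 6.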

In [46]:
model.fit(X_train_scaled, y_train, epochs=75, batch_size=16, validation_split=0.2, callbacks=[early_stop, reduce_lr], verbose=1)

Epoch 1/75
826/826 ━━━━━━━━━━━━━━━━━━━━ 3s 3ms/step - loss: 0.2219 - mae: 0.3189 - val_loss: 0.2727 - val_mae: 0.3499 - learning_rate: 6.2500e-05
Epoch 2/75
826/826 ━━━━━━━━━━━━━━━━━━━━ 3s 4ms/step - loss: 0.2223 - mae: 0.3179 - val_loss: 0.2726 - val_mae: 0.3466 - learning_rate: 6.2500e-05
Epoch 3/75
826/826 ━━━━━━━━━━━━━━━━━━━━ 4s 3ms/step - loss: 0.2174 - mae: 0.3158 - val_loss: 0.2723 - val_mae: 0.3487 - learning_rate: 6.2500e-05
Epoch 4/75
826/826 ━━━━━━━━━━━━━━━━━━━━ 3s 3ms/step - loss: 0.2206 - mae: 0.3188 - val_loss: 0.2727 - val_mae: 0.3509 - learning_rate: 6.2500e-05
Epoch 5/75
826/826 ━━━━━━━━━━━━━━━━━━━━ 6s 4ms/step - loss: 0.2216 - mae: 0.3173 - val_loss: 0.2724 - val_mae: 0.3493 - learning_rate: 6.2500e-05
Epoch 6/75
826/826 ━━━━━━━━━━━━━━━━━━━━ 4s 3ms/step - loss: 0.2163

<keras.src.callbacks.history.History at 0x787bb9921a10>
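
The step counts and the learning rate in the logs can be cross-checked with a little arithmetic. The sketch below assumes the dataset has 20,640 rows, that Keras keeps the first 80% of the training rows (rounded down) for fitting, and that `evaluate` and `predict` use their default batch size of 32. The reading that the rate was halved four times before this run is an inference from the log, not something the notebook states.

```python
import math

n_samples = 20640                       # rows in the California Housing dataset
n_test = n_samples * 20 // 100          # test_size=0.2
n_train = n_samples - n_test            # rows passed to model.fit

# Assumption: validation_split=0.2 leaves the first 80% of rows (floored) for fitting.
n_fit = n_train * 80 // 100
steps_train = math.ceil(n_fit / 16)     # batch_size=16 during training

# Assumption: evaluate/predict run with the default batch size of 32.
steps_eval = math.ceil(n_test / 32)

# Starting from 1e-3, four cuts by factor 0.5 give the rate seen in the log.
lr_after_four_cuts = 1e-3 * 0.5 ** 4

print("training rows used for fitting:", n_fit)
print("steps per training epoch:", steps_train)
print("steps for evaluate/predict:", steps_eval)
print("learning rate after four cuts:", lr_after_four_cuts)
```

Under these assumptions the numbers line up with the `826/826` progress counter in the training log, the `129/129` counter in the evaluation and prediction output, and the logged learning rate of 6.2500e-05.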

### Evaluating the Model

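This cell evaluates the trained model on the unseen test data and prints the test loss (Mean Squared Error) and the Mean Absolute Error. Both are measured in the target's units of $100,000, so an MAE of about 0.34 corresponds to an average error of roughly $34,000.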

In [47]:
loss, mae = model.evaluate(X_test_scaled, y_test)
print(f"Test Loss: {loss:.4f}")
print(f"Test MAE: {mae:.4f}")

129/129 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.2604 - mae: 0.3413
Test Loss: 0.2653
Test MAE: 0.3435


### Making Predictions

This cell makes predictions on the scaled test data using the trained model and prints the first 5 predictions along with their corresponding actual values.

In [48]:
y_pred = model.predict(X_test_scaled)
print("Predictions:", y_pred[:5].flatten())
print("Actual values:", y_test[:5].flatten())

[1m129/129[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 1ms/step
Predictions: [0.5238954 1.0314524 5.058307  2.591416  2.6866624]
Actual values: [0.477   0.458   5.00001 2.186   2.78   ]
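
To make the five printed pairs easier to read, the sketch below computes the absolute error of each one and converts it to dollars. The values are copied from the output above; the conversion relies on the target being expressed in units of $100,000.

```python
# Values copied from the printed output above.
predictions = [0.5238954, 1.0314524, 5.058307, 2.591416, 2.6866624]
actuals = [0.477, 0.458, 5.00001, 2.186, 2.78]

UNIT_DOLLARS = 100_000  # the target is expressed in units of $100,000

abs_errors = [abs(p - a) for p, a in zip(predictions, actuals)]
for p, a, e in zip(predictions, actuals, abs_errors):
    print(f"pred={p:.3f}  actual={a:.3f}  abs_error={e:.3f}  (~${e * UNIT_DOLLARS:,.0f})")

# Index of the sample with the largest absolute error among these five.
worst = max(range(len(abs_errors)), key=abs_errors.__getitem__)
print("Largest error at index", worst)
```

Five samples are far too few to judge the model, but they show that individual errors can be well above the average error on the full test set.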


### Calculating Additional Evaluation Metrics

This cell calculates and prints additional evaluation metrics on the test set: Mean Squared Error, Root Mean Squared Error, and the R² score. The RMSE is in the same units as the target ($100,000), and R² is the share of the target's variance that the model explains.

In [50]:
from sklearn.metrics import r2_score, mean_squared_error
mse = mean_squared_error(y_test, y_pred)
rmse = mse**0.5
r2 = r2_score(y_test, y_pred)

print("Mean Squared Error:", mse)
print("Root Mean Squared Error:", rmse)
print("R2 Score:", r2)

Mean Squared Error: 0.2652855257312225
Root Mean Squared Error: 0.5150587594937324
R2 Score: 0.7975551677863629
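
For reference, the definitions behind these metrics can be written out in a few lines of plain Python. This is a minimal sketch on toy values, not a replacement for the scikit-learn functions; the helper names `mse` and `r2` are hypothetical. The last lines check that the reported RMSE is the square root of the reported MSE.

```python
import math

def mse(y_true, y_pred):
    """Mean of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """1 - SS_res / SS_tot: share of the target's variance that is explained."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Toy targets (hypothetical) with two reference predictors.
y_true = [1.0, 2.0, 3.0, 4.0]
perfect = list(y_true)            # predicts every value exactly
baseline = [2.5] * len(y_true)    # always predicts the mean of y_true

print("R2 of perfect predictor:", r2(y_true, perfect))
print("R2 of mean predictor:", r2(y_true, baseline))
print("MSE of mean predictor:", mse(y_true, baseline))

# Values copied from the output above.
reported_mse = 0.2652855257312225
reported_rmse = 0.5150587594937324
print("RMSE consistent with MSE:", math.isclose(math.sqrt(reported_mse), reported_rmse, rel_tol=1e-9))
```

The two reference predictors bracket the usual range of R²: a model that always predicts the mean scores 0, and a perfect model scores 1.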
