# Task 3: Multimodal Housing Price Prediction (Images + Tabular Data)

## Objective
The objective of this task is to predict housing prices by combining visual information from house images with structured tabular features such as square footage, number of bedrooms, bathrooms, age of the house, and a location score.

This task demonstrates a **multimodal machine learning approach**, where features extracted from images using a Convolutional Neural Network (CNN) are fused with traditional tabular data to improve regression performance.

---

## Dataset Description
The dataset consists of:
- **Tabular features**:
  - `sqft`: Total area of the house in square feet
  - `bedrooms`: Number of bedrooms
  - `bathrooms`: Number of bathrooms
  - `age`: Age of the house in years
  - `location_score`: A proxy score representing neighborhood quality
- **Image data**:
  - Synthetic house images generated programmatically, each corresponding to one data row
- **Target variable**:
  - `price`: House price (regression target)

---

## Methodology
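1. A **pretrained MobileNetV2 CNN** (ImageNet weights, classification head removed, global average pooling) is used as a frozen feature extractor, producing a 1280-dimensional embedding per house image.
2. Tabular features are standardized using `StandardScaler`.
3. Image embeddings and tabular features are **concatenated** into a single 1285-dimensional feature vector (5 tabular features + 1280 image features).
4. A fully connected neural network (two hidden ReLU layers of 256 and 128 units) is trained on the combined features with an MSE loss to predict house prices.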
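
Steps 2 and 3 can be sketched with NumPy alone. The arrays below are random stand-ins rather than the real data, and `standardize` is a hypothetical helper that mirrors what `StandardScaler` computes (zero mean and unit variance per column). Only the shapes are meant to match the notebook.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in data (assumption): 30 houses, 5 tabular columns, 1280-d image embeddings.
n_houses = 30
tabular = rng.normal(
    loc=[2000, 3, 2, 30, 5], scale=[800, 1, 1, 10, 2], size=(n_houses, 5)
)
embeddings = rng.normal(size=(n_houses, 1280))


def standardize(x):
    """Hypothetical helper: column-wise zero mean and unit variance."""
    mean = x.mean(axis=0)
    std = x.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    return (x - mean) / std


tabular_scaled = standardize(tabular)

# Fusion by concatenation: tabular columns first, image embedding after.
fused = np.concatenate([tabular_scaled, embeddings], axis=1)
print(fused.shape)  # (30, 1285)
```

Concatenation is the simplest fusion strategy: the downstream network sees one flat vector and has to learn how much weight to give each modality.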

---

## Evaluation Metrics
Model performance is evaluated using:
- **Mean Absolute Error (MAE)**
- **Root Mean Squared Error (RMSE)**

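Both metrics are expressed in the same units as `price`. MAE is the average absolute difference between predicted and actual prices, while RMSE squares the errors before averaging, so it penalizes large misses more heavily than MAE does.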
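
As a small worked example of how the two metrics differ, the sketch below computes both by hand with NumPy. The prices are made up for illustration and are not taken from the dataset.

```python
import numpy as np


def mae(y_true, y_pred):
    """Mean absolute error: average size of the miss, in price units."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))


def rmse(y_true, y_pred):
    """Root mean squared error: squares errors first, so big misses dominate."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Illustrative values only: absolute errors of 10,000, 20,000 and 60,000.
y_true = np.array([300_000.0, 450_000.0, 600_000.0])
y_pred = np.array([310_000.0, 430_000.0, 660_000.0])

print("MAE :", mae(y_true, y_pred))   # 30000.0
print("RMSE:", rmse(y_true, y_pred))  # larger than MAE because of the 60,000 miss
```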


Cell 1:- Imports & Setup

In [5]:
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.preprocessing import StandardScaler

import tensorflow as tf
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input
from tensorflow.keras.preprocessing.image import load_img, img_to_array
from tensorflow.keras.layers import Dense


Cell 2:- Load Tabular Data

In [7]:
df = pd.read_csv("data/housing.csv")
df.head()


Unnamed: 0,sqft,bedrooms,bathrooms,age,location_score,image,price
0,3219,1,1,47,5,house_1.png,671570.71
1,1603,2,2,47,2,house_2.png,338166.04
2,3371,6,1,37,7,house_3.png,810195.33
3,730,1,1,13,4,house_4.png,249545.45
4,2669,5,1,35,4,house_5.png,600407.7


Cell 3:- CNN Feature Extractor (Pretrained MobileNetV2)

In [8]:
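# ImageNet-pretrained MobileNetV2 without its classification head.
# pooling="avg" applies global average pooling, so each image maps to a 1280-d vector.
base_model = MobileNetV2(
    weights="imagenet",
    include_top=False,
    pooling="avg",
    input_shape=(224, 224, 3)
)

# Freeze the backbone: it is used only as a fixed feature extractor.
base_model.trainable = False

def extract_image_features(image_path):
    """Return the 1280-d MobileNetV2 embedding for one image file."""
    img = load_img(image_path, target_size=(224, 224))  # resize to the model's input size
    img = img_to_array(img)                             # PIL image -> float array (224, 224, 3)
    img = np.expand_dims(img, axis=0)                   # add batch dimension -> (1, 224, 224, 3)
    img = preprocess_input(img)                         # scale pixels to [-1, 1] for MobileNetV2
    return base_model.predict(img, verbose=0)[0]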


Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/mobilenet_v2/mobilenet_v2_weights_tf_dim_ordering_tf_kernels_1.0_224_no_top.h5
9406464/9406464 ━━━━━━━━━━━━━━━━━━━━ 4s 0us/step


Cell 4:- Extract Image Features

In [9]:
image_features = []

for img_name in df["image"]:
    path = os.path.join("data/images", img_name)
    image_features.append(extract_image_features(path))

image_features = np.array(image_features)
image_features.shape


(30, 1280)
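
The second dimension, 1280, is the channel count of MobileNetV2's final convolutional output. With `pooling="avg"`, each channel is averaged over the spatial grid, so a 224×224 input that ends as a 7×7×1280 feature map is reduced to a single 1280-dimensional vector. A small NumPy sketch of that pooling step follows, using random arrays as stand-ins for real feature maps:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the backbone's last feature map for one image (assumption: random values).
feature_map = rng.normal(size=(7, 7, 1280))

# Global average pooling: mean over the two spatial axes, one value per channel.
embedding = feature_map.mean(axis=(0, 1))
print(embedding.shape)  # (1280,)

# The same operation over a batch of 30 images gives the matrix shape seen above.
batch_maps = rng.normal(size=(30, 7, 7, 1280))
batch_embeddings = batch_maps.mean(axis=(1, 2))
print(batch_embeddings.shape)  # (30, 1280)
```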

Cell 5:- Tabular Preprocessing + Feature Fusion

In [10]:
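feature_cols = ["sqft", "bedrooms", "bathrooms", "age", "location_score"]
X_tabular = df[feature_cols].values
y = df["price"].values

# Split row indices first so the scaler never sees test rows (avoids data leakage).
idx_train, idx_test = train_test_split(
    np.arange(len(df)), test_size=0.2, random_state=42
)

# Fit the scaler on training rows only, then apply it to both splits.
scaler = StandardScaler()
tab_train = scaler.fit_transform(X_tabular[idx_train])
tab_test = scaler.transform(X_tabular[idx_test])

# Fuse the scaled tabular features with the image embeddings.
X_train = np.concatenate([tab_train, image_features[idx_train]], axis=1)
X_test = np.concatenate([tab_test, image_features[idx_test]], axis=1)
y_train, y_test = y[idx_train], y[idx_test]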


Cell 6:- Regression Model Training

In [11]:
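model = tf.keras.Sequential([
    tf.keras.Input(shape=(X_train.shape[1],)),  # explicit input layer instead of input_shape on Dense
    Dense(256, activation="relu"),
    Dense(128, activation="relu"),
    Dense(1)  # single linear output for regression
])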

model.compile(
    optimizer="adam",
    loss="mse"
)

history = model.fit(
    X_train,
    y_train,
    validation_split=0.2,
    epochs=30,
    batch_size=8,
    verbose=1
)
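
The number of batches per epoch follows from the split sizes. With the 30 rows reported for the feature matrix, `train_test_split(test_size=0.2)` holds out 6 rows. Keras' `validation_split=0.2` then reserves the last 20% of the remaining 24 rows (selected before shuffling), leaving 19 rows for gradient updates, which is 3 batches of up to 8 per epoch. The arithmetic as a quick check (variable names are illustrative):

```python
import math

n_rows = 30              # rows in the dataset, per the (30, 1280) feature matrix
n_test = 6               # 20% held out by train_test_split(test_size=0.2)
n_fit = n_rows - n_test  # 24 rows passed to model.fit

validation_split = 0.2
batch_size = 8

# Keras keeps the first 80% for training and the last 20% for validation.
n_train = math.floor(n_fit * (1.0 - validation_split))
n_val = n_fit - n_train
steps_per_epoch = math.ceil(n_train / batch_size)

print(n_train, n_val, steps_per_epoch)  # 19 5 3
```

With only 5 validation rows, `val_loss` is a very noisy estimate and should be read with caution.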


Epoch 1/30
3/3 ━━━━━━━━━━━━━━━━━━━━ 2s 160ms/step - loss: 301876740096.0000 - val_loss: 251194769408.0000
Epoch 2/30
3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 52ms/step - loss: 301865828352.0000 - val_loss: 251181760512.0000
Epoch 3/30
3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 54ms/step - loss: 301850230784.0000 - val_loss: 251163213824.0000
Epoch 4/30
3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 83ms/step - loss: 301828898816.0000 - val_loss: 251138146304.0000
Epoch 5/30
3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 53ms/step - loss: 301801046016.0000 - val_loss: 251105542144.0000
Epoch 6/30
3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 98ms/step - loss: 301763493888.0000 - val_loss: 251063861248.0000
Epoch 7/30
3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 86ms/step - loss: 301715947520.0000 - val_loss: 251011694592.0000
Epoch 8/30
3/3 ━━━━━━━━━

Cell 7:- Model Evaluation (MAE & RMSE)

In [12]:
y_pred = model.predict(X_test).flatten()

mae = mean_absolute_error(y_test, y_pred)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))

print("MAE:", mae)
print("RMSE:", rmse)


1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 88ms/step
MAE: 552301.7100130208
RMSE: 578765.7531320207
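
Both errors are on the order of the prices themselves (the sample rows range from roughly 250,000 to 810,000), and the training loss barely moves across the epochs shown. This is consistent with the network's outputs staying near zero while the raw targets are in the hundreds of thousands. One common remedy, offered here as a suggestion rather than as part of the original notebook, is to standardize the target with training-set statistics, train on the scaled values, and map predictions back to price units before computing MAE and RMSE. A minimal NumPy sketch of that round trip follows; `fit_target_scaler`, `scale_target`, and `unscale_target` are hypothetical helpers and the prices are made up.

```python
import numpy as np


def fit_target_scaler(y_train):
    """Hypothetical helper: mean and std of the training targets only."""
    y_train = np.asarray(y_train, dtype=float)
    return y_train.mean(), y_train.std()


def scale_target(y, mean, std):
    """Map prices to roughly unit scale for training."""
    return (np.asarray(y, dtype=float) - mean) / std


def unscale_target(y_scaled, mean, std):
    """Map scaled predictions back to price units for evaluation."""
    return np.asarray(y_scaled, dtype=float) * std + mean


# Illustrative training prices (assumption, not the real dataset).
y_train = np.array([250_000.0, 340_000.0, 600_000.0, 670_000.0, 810_000.0])
mean, std = fit_target_scaler(y_train)
y_train_scaled = scale_target(y_train, mean, std)

# A model trained on y_train_scaled predicts in scaled units; these
# hypothetical values stand in for model.predict(X_test).flatten().
pred_scaled = np.array([-0.5, 0.0, 1.0])
pred_price = unscale_target(pred_scaled, mean, std)
print(pred_price)
```

Computing MAE and RMSE on `pred_price` rather than on the scaled values keeps the reported errors in price units, comparable to the figures above.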
