# Housing prices in Hyderabad, India

## Project Objective 🎯

The objective of this project is to develop a regression model to predict housing prices in Hyderabad, India. Using features such as the property's area, location, number of bedrooms, and available amenities, the model will aim to estimate the market value of a property as accurately as possible.

This predictive model will be a valuable tool for:

- Home Buyers and Sellers: To obtain an objective price estimate for a property.
- Real Estate Agents: To assist with property valuation and client advisory.
- Investors: To identify potentially undervalued or overvalued properties in the market.

## 1.1 Loading the training, validation, and test datasets

In [1]:
import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
import sys

sys.path.append('../../src/utils')


# Utilities
from regresion_metrics import evaluate_model_metrics, show_model_equation, get_model_coeficients_dataframe


training_features = pd.read_parquet('../../datasets/processed/housing_prices/hyderabad_house_price_training_features.parquet')
training_labels = pd.read_parquet('../../datasets/processed/housing_prices/hyderabad_house_price_training_labels.parquet')

validation_features = pd.read_parquet('../../datasets/processed/housing_prices/hyderabad_house_price_validation_features.parquet')
validation_labels = pd.read_parquet('../../datasets/processed/housing_prices/hyderabad_house_price_validation_labels.parquet')

test_features = pd.read_parquet('../../datasets/processed/housing_prices/hyderabad_house_price_test_features.parquet')
test_labels = pd.read_parquet('../../datasets/processed/housing_prices/hyderabad_house_price_test_labels.parquet')
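
Before fitting, it can help to confirm that each split is internally consistent. The following is a minimal sketch, not part of the original notebook: `check_split` is a hypothetical helper, and the toy frames (including the `Price` label column) are invented stand-ins for the parquet files loaded above.

```python
import pandas as pd


def check_split(features: pd.DataFrame, labels: pd.DataFrame,
                reference_columns) -> None:
    """Raise AssertionError if a split is inconsistent (hypothetical helper)."""
    # Every feature row needs exactly one label row
    assert len(features) == len(labels), "row count mismatch"
    # Column names and order must match the training features
    assert list(features.columns) == list(reference_columns), "column mismatch"


# Toy frames standing in for the parquet splits (values and 'Price' are assumptions)
toy_train_X = pd.DataFrame({"Area": [0.1, 0.4, 0.9], "Resale": [0, 1, 0]})
toy_train_y = pd.DataFrame({"Price": [4.9, 5.6, 6.4]})
toy_val_X = pd.DataFrame({"Area": [0.3], "Resale": [1]})
toy_val_y = pd.DataFrame({"Price": [5.4]})

check_split(toy_train_X, toy_train_y, toy_train_X.columns)
check_split(toy_val_X, toy_val_y, toy_train_X.columns)
print("splits are consistent")
```

With the real data, the same call would be made once per split, passing `training_features.columns` as the reference.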


## 1.2 Training and prediction with default hyperparameters

In [4]:
# Training
linealRegresionModel = LinearRegression()
linealRegresionModel.fit(training_features, training_labels)

# Predictions for the validation and test sets
validation_predictions = linealRegresionModel.predict(validation_features)
test_predictions = linealRegresionModel.predict(test_features)

# Validation set metrics
validation_metrics = evaluate_model_metrics(linealRegresionModel, validation_features, validation_labels)

# Test set metrics
test_metrics = evaluate_model_metrics(linealRegresionModel, test_features, test_labels)

comparison_df = pd.DataFrame({
    'Validation Set': validation_metrics,
    'Test Set': test_metrics
}).round(4)


print("\n--- Regression Metrics ---")
print(comparison_df)

print("\n--- Regression Model Equation ---")
show_model_equation(linealRegresionModel, training_features)

print("\n--- Coefficients ---")
get_model_coeficients_dataframe(linealRegresionModel, training_features)


--- Regression Metrics ---
      Validation Set  Test Set
MAE           0.1501    0.1605
MSE           0.0393    0.0516
RMSE          0.1982    0.2272
R²            0.8859    0.8750

--- Regression Model Equation ---
y = 5.3275 + 1.3755 x (Area) - 0.0638 x (No. of Bedrooms) + 0.0353 x (Resale) - 0.0647 x (MaintenanceStaff) - 0.0524 x (Gymnasium) + 0.0108 x (SwimmingPool) + 0.1009 x (LandscapedGardens) - 0.0215 x (JoggingTrack) - 0.0596 x (RainWaterHarvesting) + 0.0582 x (IndoorGames) + 0.0484 x (ShoppingMall) - 0.0108 x (Intercom) + 0.0008 x (SportsFacility) + 0.0218 x (ATM) + 0.0374 x (ClubHouse) - 0.0927 x (School) + 0.0141 x (24X7Security) + 0.0185 x (PowerBackup) - 0.0272 x (CarParking) - 0.0780 x (StaffQuarter) - 0.0432 x (Cafeteria) + 0.0599 x (MultipurposeRoom) + 0.0515 x (Hospital) - 0.0192 x (WashingMachine) + 0.0338 x (Gasconnection) + 0.0738 x (AC) - 0.0209 x (Wifi) + 0.0426 x (Children'splayarea) + 0.0181 x (LiftAvailable) + 0.0116 x (BED) - 0.0043 x (VaastuCompliant) - 0.11
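
The helper that prints this equation lives in a local module that is not shown. As an illustration only, an equation string of this shape could be assembled from a fitted model's `intercept_` and `coef_` attributes. `format_equation` is a hypothetical name, and the toy data below is generated from a known formula rather than taken from the Hyderabad dataset.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression


def format_equation(model, feature_names, precision=4):
    """Build 'y = b0 + b1 x (name) ...' from a fitted linear model (hypothetical helper)."""
    # np.ravel handles both 1-D targets and single-column DataFrame targets
    intercept = float(np.ravel(model.intercept_)[0])
    coefs = np.ravel(model.coef_)
    parts = [f"y = {intercept:.{precision}f}"]
    for name, coef in zip(feature_names, coefs):
        sign = "+" if coef >= 0 else "-"
        parts.append(f"{sign} {abs(coef):.{precision}f} x ({name})")
    return " ".join(parts)


# Toy data generated exactly from y = 1 + 2*a - 3*b
X = pd.DataFrame({"a": [0.0, 1.0, 0.0, 1.0, 2.0], "b": [0.0, 0.0, 1.0, 1.0, 1.0]})
y = 1 + 2 * X["a"] - 3 * X["b"]
model = LinearRegression().fit(X, y)
print(format_equation(model, X.columns, precision=2))
```

Flattening with `np.ravel` matters here because fitting on a single-column DataFrame of labels, as the notebook does, yields a two-dimensional `coef_`.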

                                   Coeficiente (m)
Area                                      1.375493
No. of Bedrooms                          -0.063752
Resale                                    0.035315
MaintenanceStaff                         -0.064707
Gymnasium                                -0.052377
...                                            ...
Location_Tarnaka                          0.628071
Location_Tellapur                         0.546346
Location_TellapurOsman Nagar Road         0.651408
Location_Toli Chowki                      0.531638
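
The metrics reported in this section come from `evaluate_model_metrics`, whose source is not included here. A minimal sketch of how the same four quantities could be computed with scikit-learn follows. `sketch_regression_metrics` is a hypothetical stand-in that takes true and predicted values directly, whereas the notebook's helper takes the model, features, and labels. The toy arrays are invented for illustration.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score


def sketch_regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R² as a dict (hypothetical stand-in for the local helper)."""
    mse = mean_squared_error(y_true, y_pred)
    return {
        "MAE": mean_absolute_error(y_true, y_pred),
        "MSE": mse,
        "RMSE": float(np.sqrt(mse)),  # RMSE is the square root of MSE
        "R²": r2_score(y_true, y_pred),
    }


# Toy values, not taken from the Hyderabad dataset
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.5, 5.0, 7.5, 9.0])
metrics = sketch_regression_metrics(y_true, y_pred)
print(metrics)
```

The reported figures are consistent with this relationship: the square roots of the MSE values 0.0393 and 0.0516 are approximately 0.1982 and 0.2272, which agree with the reported RMSE values up to rounding.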


## 1.3 Dimensionality Reduction