# 🧠 Model Development — Housing Prices in India

---

## 🔗 **Notebook Context**

This notebook is the **third stage** of the *Housing Prices in India* project.  
In the previous notebooks, we:
1. Performed **Exploratory Data Analysis (EDA)** to understand the dataset.  
2. Conducted **Data Cleaning and Feature Engineering** to prepare high-quality inputs for modeling.

In this notebook, we build, train, and evaluate three regression models (**Linear Regression**, **Ridge Regression**, and **Lasso Regression**) to predict housing prices in India.  
We then compare their performance, analyze feature importance, and save the trained models for deployment.

---

## 🎯 **Objectives**

1. Load the cleaned dataset  
2. Split data into training and testing sets  
3. Train Linear, Ridge, and Lasso regression models  
4. Evaluate model performance (R², MAE, RMSE)  
5. Save the trained models into the `models/` directory  
6. Summarize key insights and next steps  
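
The objectives above can be sketched end to end as follows. This is a minimal illustration on synthetic data, not the notebook's actual pipeline or results: the feature matrix, the `alpha` values, the 80/20 split, the use of `joblib` for saving, and the temporary output directory (standing in for `models/`) are all assumptions.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the cleaned dataset (assumption: the real features differ).
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3))
y = X @ np.array([3.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)

# Objective 2: hold out 20% of the rows for testing (the ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Objective 3: the three models named above; alpha values are illustrative only.
models = {
    "linear": LinearRegression(),
    "ridge": Ridge(alpha=1.0),
    "lasso": Lasso(alpha=0.01),
}

model_dir = Path(tempfile.mkdtemp())  # hypothetical stand-in for models/
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    # Objective 4: R², MAE, and RMSE (the square root of the mean squared error).
    results[name] = {
        "r2": r2_score(y_test, pred),
        "mae": mean_absolute_error(y_test, pred),
        "rmse": float(np.sqrt(mean_squared_error(y_test, pred))),
    }
    # Objective 5: persist each fitted model to disk.
    joblib.dump(model, model_dir / f"{name}.joblib")

for name, metrics in results.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

A saved model can later be restored with `joblib.load(path)`. Ridge and Lasso are sensitive to feature scale, so with real data it is usually worth standardizing the features before fitting them.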

---

## 📦 **1. Setup & Data Loading**

In [1]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split

In [2]:
df = pd.read_csv('../data/cleaned-feature-engineered-data.csv')
df.head()

| Unnamed: 0 | posted_by | under_construction | rera_approved | num_of_rooms | bhk_or_rk | ready_to_move | resale | longitude | latitude | price | avg_price_per_unit_area | avg_price_per_room | area_per_room |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Owner | No | No | 2 | BHK | Yes | Yes | 12.96991 | 77.59796 | 55.0 | 0.0423 | 27.5 | 650.118204 |
| 1 | Dealer | No | No | 2 | BHK | Yes | Yes | 12.274538 | 76.644605 | 51.0 | 0.04 | 25.5 | 637.5 |
| 2 | Owner | No | No | 2 | BHK | Yes | Yes | 12.778033 | 77.632191 | 43.0 | 0.04608 | 21.5 | 466.579861 |
| 3 | Owner | No | Yes | 2 | BHK | Yes | Yes | 28.6423 | 77.3445 | 62.5 | 0.06721 | 31.25 | 464.960571 |
| 4 | Dealer | Yes | No | 2 | BHK | No | Yes | 22.5922 | 88.484911 | 60.5 | 0.06056 | 30.25 | 499.504623 |
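
Before modeling, a frame like the one previewed above usually needs a little preparation. The sketch below rebuilds the five preview rows by hand and shows one possible approach. It assumes that `price` is the target, that `Unnamed: 0` (if it is a real column rather than just the displayed index) is a leftover row index, and that the string columns should be one-hot encoded. It also sets aside `avg_price_per_unit_area` and `avg_price_per_room`, because in the preview `avg_price_per_room` equals `price / num_of_rooms` (for example 55.0 / 2 = 27.5), so using them as features would leak the target. Whether the notebook actually does any of this is not shown here.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# The five rows from the preview above, entered by hand for illustration.
df = pd.DataFrame({
    "Unnamed: 0": [0, 1, 2, 3, 4],
    "posted_by": ["Owner", "Dealer", "Owner", "Owner", "Dealer"],
    "under_construction": ["No", "No", "No", "No", "Yes"],
    "rera_approved": ["No", "No", "No", "Yes", "No"],
    "num_of_rooms": [2, 2, 2, 2, 2],
    "bhk_or_rk": ["BHK"] * 5,
    "ready_to_move": ["Yes", "Yes", "Yes", "Yes", "No"],
    "resale": ["Yes"] * 5,
    "longitude": [12.96991, 12.274538, 12.778033, 28.6423, 22.5922],
    "latitude": [77.59796, 76.644605, 77.632191, 77.3445, 88.484911],
    "price": [55.0, 51.0, 43.0, 62.5, 60.5],
    "avg_price_per_unit_area": [0.0423, 0.04, 0.04608, 0.06721, 0.06056],
    "avg_price_per_room": [27.5, 25.5, 21.5, 31.25, 30.25],
    "area_per_room": [650.118204, 637.5, 466.579861, 464.960571, 499.504623],
})

# Drop the leftover index column if present (assumption: it carries no signal).
df = df.drop(columns=["Unnamed: 0"], errors="ignore")

# Assumption: `price` is the target; price-derived columns are excluded
# from the features to avoid target leakage.
leaky = ["avg_price_per_unit_area", "avg_price_per_room"]
y = df["price"]
X = df.drop(columns=["price"] + leaky)

# One-hot encode the string columns (Yes/No flags, posted_by, bhk_or_rk).
X = pd.get_dummies(X, drop_first=True)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(X.columns.tolist())
```

One more thing worth checking upstream: the values under `longitude` (roughly 12 to 29) fall in India's latitude range, and those under `latitude` (roughly 76 to 88) fall in its longitude range, so the two labels may be swapped.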
