## Title: Predicting House Prices with Linear Regression

Objective:-
The goal of this project is to build a linear regression model to predict house prices based on various features. Linear regression is a simple yet powerful algorithm for predicting a continuous outcome variable.

Dataset:-
We will use a dataset containing information about houses, including features such as square footage, number of bedrooms, bathrooms, and location. The dataset will be split into training and testing sets.

Steps:

1. Data Preprocessing:-
   - Load and explore the dataset.
   - Handle missing data and outliers.
   - Encode categorical variables if necessary.

2. Feature Selection:-
   - Identify relevant features that could influence house prices.
   - Use correlation analysis to understand relationships between variables.

3. Data Splitting:-
   - Split the dataset into training and testing sets to evaluate the model's performance.

4. Model Training:-
   - Implement a linear regression model using a library like scikit-learn.
   - Train the model on the training set.

5. Model Evaluation:-
   - Evaluate the model's performance on the testing set using metrics like Mean Squared Error (MSE) or R-squared.
   - Visualize the predicted prices against the actual prices.

6. Optimization:-
   - Fine-tune the model by adjusting hyperparameters if necessary.
   - Consider feature engineering to improve model accuracy.

7. Interpretation:-
   - Interpret the coefficients of the linear regression equation to understand the impact of each feature on house prices.

8. Deployment:-
   - If satisfied with the model's performance, deploy it for making predictions on new data.
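
Steps 3 to 5 above can be sketched end to end as follows. This is a minimal illustration on synthetic data, not the notebook's Housing.csv: the feature names mirror the dataset, but the generated values, the 80/20 split ratio, and `random_state=0` are assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the housing data (assumed values, not real prices).
rng = np.random.default_rng(0)
n = 200
demo = pd.DataFrame({
    "area": rng.integers(1500, 10000, size=n),
    "bedrooms": rng.integers(1, 6, size=n),
    "bathrooms": rng.integers(1, 4, size=n),
})
noise = rng.normal(0, 100_000, size=n)
demo["price"] = (
    300 * demo["area"]
    + 150_000 * demo["bedrooms"]
    + 900_000 * demo["bathrooms"]
    + noise
)

# Step 3: hold out 20% of the rows for testing.
features = ["area", "bedrooms", "bathrooms"]
X_train, X_test, y_train, y_test = train_test_split(
    demo[features], demo["price"], test_size=0.2, random_state=0
)

# Step 4: fit on the training rows only.
model = LinearRegression().fit(X_train, y_train)

# Step 5: evaluate on the held-out rows.
y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f"test MSE: {mse:.3e}, test R^2: {r2:.3f}")
```

Scoring on held-out rows gives an estimate of how the model behaves on unseen houses, which in-sample metrics cannot provide.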

In [6]:
import pandas as pd 
import numpy as np 

In [35]:
Housing = pd.read_csv("C://Users//Lenovo//Desktop//Housing.csv")

In [36]:
H = Housing

In [37]:
H

Unnamed: 0,price,area,bedrooms,bathrooms,stories,mainroad,guestroom,basement,hotwaterheating,airconditioning,parking,prefarea,furnishingstatus
0,13300000,7420,4,2,3,yes,no,no,no,yes,2,yes,furnished
1,12250000,8960,4,4,4,yes,no,no,no,yes,3,no,furnished
2,12250000,9960,3,2,2,yes,no,yes,no,no,2,yes,semi-furnished
3,12215000,7500,4,2,2,yes,no,yes,no,yes,3,yes,furnished
4,11410000,7420,4,1,2,yes,yes,yes,no,yes,2,no,furnished
...,...,...,...,...,...,...,...,...,...,...,...,...,...
540,1820000,3000,2,1,1,yes,no,yes,no,no,2,no,unfurnished
541,1767150,2400,3,1,1,no,no,no,no,no,0,no,semi-furnished
542,1750000,3620,2,1,1,yes,no,no,no,no,0,no,unfurnished
543,1750000,2910,3,1,1,no,no,no,no,no,0,no,furnished


In [38]:
H.head()

Unnamed: 0,price,area,bedrooms,bathrooms,stories,mainroad,guestroom,basement,hotwaterheating,airconditioning,parking,prefarea,furnishingstatus
0,13300000,7420,4,2,3,yes,no,no,no,yes,2,yes,furnished
1,12250000,8960,4,4,4,yes,no,no,no,yes,3,no,furnished
2,12250000,9960,3,2,2,yes,no,yes,no,no,2,yes,semi-furnished
3,12215000,7500,4,2,2,yes,no,yes,no,yes,3,yes,furnished
4,11410000,7420,4,1,2,yes,yes,yes,no,yes,2,no,furnished
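
The preview shows that columns such as `mainroad` and `basement` hold yes/no strings and that `furnishingstatus` is a multi-level category. The notebook goes on to use only numeric columns, but if these columns were to be used as features they would first need a numeric encoding. One possible sketch, on a few rows copied from the output above; the 1/0 mapping and one-hot encoding are illustrative choices, not the notebook's method, and `sample` and `encoded` are hypothetical names.

```python
import pandas as pd

# A few rows copied from the preview above (subset of columns).
sample = pd.DataFrame({
    "price": [13300000, 12250000, 12250000, 12215000, 11410000],
    "mainroad": ["yes", "yes", "yes", "yes", "yes"],
    "basement": ["no", "no", "yes", "yes", "yes"],
    "furnishingstatus": [
        "furnished", "furnished", "semi-furnished", "furnished", "furnished",
    ],
})

# Binary yes/no columns -> 1/0.
for col in ["mainroad", "basement"]:
    sample[col] = sample[col].map({"yes": 1, "no": 0})

# Multi-level category -> one indicator column per level present in the data.
encoded = pd.get_dummies(sample, columns=["furnishingstatus"], dtype=int)
print(encoded)
```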


In [51]:
x=H[["area","bedrooms","stories","parking","bathrooms"]].copy()
y=H[["price"]].copy()

In [52]:
x

Unnamed: 0,area,bedrooms,stories,parking,bathrooms
0,7420,4,3,2,2
1,8960,4,4,3,4
2,9960,3,2,2,2
3,7500,4,2,3,2
4,7420,4,2,2,1
...,...,...,...,...,...
540,3000,2,1,2,1
541,2400,3,1,0,1
542,3620,2,1,0,1
543,2910,3,1,0,1


In [53]:
y

Unnamed: 0,price
0,13300000
1,12250000
2,12250000
3,12215000
4,11410000
...,...
540,1820000
541,1767150
542,1750000
543,1750000


In [54]:
x["intercept"] = 1

In [55]:
x = x[["intercept","area","bedrooms","stories","parking","bathrooms"]]

In [56]:
x

Unnamed: 0,intercept,area,bedrooms,stories,parking,bathrooms
0,1,7420,4,3,2,2
1,1,8960,4,4,3,4
2,1,9960,3,2,2,2
3,1,7500,4,2,3,2
4,1,7420,4,2,2,1
...,...,...,...,...,...,...
540,1,3000,2,1,2,1
541,1,2400,3,1,0,1
542,1,3620,2,1,0,1
543,1,2910,3,1,0,1
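
Cells 54 and 55 add the constant column and then reorder the columns so that it comes first. As a possible alternative, `DataFrame.insert` places the new column at position 0 in a single step. A small sketch on three rows copied from the output above (the name `x_demo` is hypothetical):

```python
import pandas as pd

x_demo = pd.DataFrame({
    "area": [7420, 8960, 9960],
    "bedrooms": [4, 4, 3],
    "stories": [3, 4, 2],
    "parking": [2, 3, 2],
    "bathrooms": [2, 4, 2],
})

# Insert a constant column of ones as the first column (modifies x_demo in place).
x_demo.insert(0, "intercept", 1)
print(x_demo.columns.tolist())
```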


In [57]:
x_T = x.T

In [58]:
x_T

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,535,536,537,538,539,540,541,542,543,544
intercept,1,1,1,1,1,1,1,1,1,1,...,1,1,1,1,1,1,1,1,1,1
area,7420,8960,9960,7500,7420,7500,8580,16200,8100,5750,...,3360,3420,1700,3649,2990,3000,2400,3620,2910,3850
bedrooms,4,4,3,4,4,3,4,5,4,3,...,2,5,3,2,2,2,3,2,3,3
stories,3,4,2,2,2,1,4,2,2,4,...,1,2,2,1,1,1,1,1,1,2
parking,2,3,2,3,2,2,2,0,2,1,...,1,0,0,0,1,2,0,0,0,0
bathrooms,2,4,2,2,1,3,3,3,1,2,...,1,1,1,1,1,1,1,1,1,1


In [59]:
# Closed-form ordinary least squares fit via the normal equation
B = np.linalg.inv(x_T @ x) @ x_T @ y
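
The expression in this cell is the closed-form ordinary least squares solution. For reference, writing $X$ for the design matrix `x` (including the column of ones), $y$ for the price column, and $\hat{\beta}$ for `B`, minimizing the squared error leads to:

```latex
\min_{\beta} \; \lVert y - X\beta \rVert^2
\quad\Longrightarrow\quad
X^\top X \, \hat{\beta} = X^\top y
\quad\Longrightarrow\quad
\hat{\beta} = (X^\top X)^{-1} X^\top y
```

The last step assumes that $X^\top X$ is invertible, which holds when the columns of $X$ are linearly independent.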

In [60]:
B

Unnamed: 0,price
0,-145734.5
1,331.1155
2,167809.8
3,547939.8
4,377596.3
5,1133740.0


In [61]:
B.index = x.columns

In [62]:
B

Unnamed: 0,price
intercept,-145734.5
area,331.1155
bedrooms,167809.8
stories,547939.8
parking,377596.3
bathrooms,1133740.0
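
Explicitly inverting `x_T @ x` works here, but it can become numerically fragile when features are strongly correlated or sit on very different scales. A commonly used alternative is a least-squares solver such as `np.linalg.lstsq`, which does not form the inverse. The sketch below compares the two on synthetic data; the generated values and variable names are assumptions for illustration, not the housing data.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100

# Design matrix with a leading column of ones, mimicking the intercept column.
X = np.column_stack([
    np.ones(n),
    rng.uniform(1000, 10000, n),   # area-like feature
    rng.integers(1, 6, n),         # bedrooms-like feature
])
true_beta = np.array([50_000.0, 300.0, 150_000.0])
y = X @ true_beta + rng.normal(0, 10_000, n)

# Normal equation, as in the notebook.
beta_inv = np.linalg.inv(X.T @ X) @ X.T @ y

# Least-squares solver; rcond=None selects NumPy's default cutoff.
beta_lstsq, residuals, rank, _ = np.linalg.lstsq(X, y, rcond=None)

# Largest absolute difference between the two coefficient vectors.
print(np.max(np.abs(beta_inv - beta_lstsq)))
```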


In [63]:
predictions = x @ B

In [64]:
predictions

Unnamed: 0,price
0,7.648874e+06
1,1.135181e+07
2,7.774158e+06
3,7.505020e+06
4,5.967194e+06
...,...
540,3.620104e+06
541,2.834052e+06
542,3.070203e+06
543,3.002921e+06
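
Each prediction is the intercept plus the sum of every feature value times its coefficient. As a worked check, the sketch below recomputes the prediction for row 0 by hand from the coefficients and feature values printed earlier. The coefficients are the rounded values from the display, so the result should agree with the 7.648874e+06 shown for row 0 only up to rounding.

```python
# Coefficients as printed in the notebook (rounded by the display).
coef = {
    "intercept": -145734.5,
    "area": 331.1155,
    "bedrooms": 167809.8,
    "stories": 547939.8,
    "parking": 377596.3,
    "bathrooms": 1133740.0,
}

# Row 0 of x, as printed in the notebook.
row0 = {
    "intercept": 1,
    "area": 7420,
    "bedrooms": 4,
    "stories": 3,
    "parking": 2,
    "bathrooms": 2,
}

# Dot product of the feature row with the coefficient vector.
pred0 = sum(coef[name] * row0[name] for name in coef)
print(f"{pred0:.6e}")
```

The actual price of row 0 is 13,300,000, so the model underestimates this house by roughly 5.65 million.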


In [65]:
SSR = ((y - predictions) ** 2).sum()  # sum of squared residuals

In [66]:
SST = ((y - y.mean()) ** 2).sum()  # total sum of squares around the mean price

In [67]:
SSR

price    8.343997e+14
dtype: float64

In [68]:
SST

price    1.903208e+15
dtype: float64

In [69]:
R2 = 1 - (SSR/SST)

In [70]:
R2

price    0.561583
dtype: float64
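
An R-squared of about 0.56 means the five features account for a little over half of the variance in price. Two related figures can be derived from the same sums: the root mean squared error, which is in price units, and the adjusted R-squared, which penalizes the number of features. The sketch below recomputes them from the printed values. The row count `n = 545` is read off the 0 to 544 column labels of `x_T`, and all figures are in-sample, since the model was fitted on these same rows.

```python
import math

# Values copied from the notebook output.
ssr = 8.343997e14   # sum of squared residuals
sst = 1.903208e15   # total sum of squares
n = 545             # rows, from the 0..544 labels of x_T
p = 5               # features, excluding the intercept

r2 = 1 - ssr / sst
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
rmse = math.sqrt(ssr / n)

print(f"R^2 = {r2:.4f}, adjusted R^2 = {adj_r2:.4f}, RMSE = {rmse:,.0f}")
```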

In [71]:
from sklearn.linear_model import LinearRegression

In [72]:
lr = LinearRegression()

In [75]:
lr.fit(H[["area","bedrooms","stories","parking","bathrooms"]], H[["price"]])

In [76]:
lr.intercept_

array([-145734.48945588])

In [77]:
lr.coef_

array([[3.31115495e+02, 1.67809788e+05, 5.47939810e+05, 3.77596289e+05,
        1.13374016e+06]])
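
The intercept and coefficients from scikit-learn agree with the normal-equation result `B` to the displayed precision. This kind of agreement can also be checked programmatically rather than by eye. A small sketch on synthetic data (the generated values and the names `lr_demo`, `beta`, and `sk_beta` are assumptions for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)
n = 150
features = np.column_stack([rng.uniform(1, 10, n), rng.integers(1, 6, n)])
target = 5.0 + features @ np.array([3.0, 1.5]) + rng.normal(0, 0.5, n)

# Closed form with an explicit column of ones for the intercept.
X = np.column_stack([np.ones(n), features])
beta = np.linalg.solve(X.T @ X, X.T @ target)

# scikit-learn handles the intercept itself by default (fit_intercept=True),
# so it is fitted on the features without the column of ones.
lr_demo = LinearRegression().fit(features, target)
sk_beta = np.concatenate([[lr_demo.intercept_], lr_demo.coef_])

# True when both approaches give the same intercept and coefficients.
print(np.allclose(beta, sk_beta))
```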

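## Conclusion:-
This project applies linear regression to house-price prediction using five numeric features: area, bedrooms, stories, parking, and bathrooms. The normal-equation solution and scikit-learn's LinearRegression give the same intercept and coefficients, and the model explains about 56% of the variance in price (R-squared of 0.56), measured on the same data used for fitting. Insights of this kind could be valuable for real estate professionals, homebuyers, and sellers in understanding the factors influencing house prices.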