# House Price Prediction with Linear Regression

This notebook explores a simple linear regression model to predict house prices based on numerical features. The project focuses on understanding the full machine learning workflow, from data inspection to model evaluation.

**Goal:** Predict house prices from floor area using simple (single-feature) linear regression  
**Tools:** Python, pandas, scikit-learn, matplotlib


### Import Necessary Tools & Libraries

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt  # needed for the plotting cells below
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

### Load the Dataset

In [None]:
df = pd.read_csv("Housing_Price_Data.csv")

### Inspect Data

In [None]:
df.info()

In [None]:
df.head()

### Clean & Select Relevant Data

In [None]:
df = df[['price','area']]
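The cell above only selects columns. If the data contained gaps, a cleaning step could look roughly like the sketch below. It runs on a small made-up frame (`toy_df` and its values are hypothetical and are not taken from `Housing_Price_Data.csv`), so whether the real dataset needs this step at all is an assumption to check with `df.isna().sum()`.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real dataset, with gaps added on purpose
toy_df = pd.DataFrame({
    "price": [250000, np.nan, 410000, 320000],
    "area": [1200, 1500, np.nan, 1800],
    "bedrooms": [2, 3, 4, 3],
})

# Keep only the target and the single feature used in this notebook
toy_selected = toy_df[["price", "area"]]

# Count missing values per column before deciding how to handle them
missing_per_column = toy_selected.isna().sum()
print(missing_per_column)

# Simplest option: drop every row that has a missing price or area
toy_clean = toy_selected.dropna().reset_index(drop=True)
print(toy_clean)
```

Dropping rows is the simplest choice for a two-column frame; imputing values would be an alternative when too many rows would otherwise be lost.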

### Split Train/Test Data

In [None]:
X = df[['area']]   # feature matrix: 2-D, as scikit-learn expects
y = df['price']    # target vector

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 42)

### Visualize the Data

In [None]:
plt.scatter(df.area, df.price, color='red', marker='+')
plt.xlabel('Area (SQ ft)')
plt.ylabel('Price (USD)')
plt.title('Area vs Price')
plt.show()

### Train the Model

In [None]:
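# LinearRegression is imported directly, so no linear_model prefix is needed
reg = LinearRegression()
# Fit on the training split only, keeping the test split unseen for evaluation
reg.fit(X_train, y_train)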

In [None]:
reg.predict(pd.DataFrame({'area': [300]}))
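A prediction like the one above is the fitted line evaluated at a single point: `price = coef_ * area + intercept_`. The sketch below checks this by hand on made-up numbers (`toy` and `toy_reg` are hypothetical names, and the values do not come from the housing dataset).

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical stand-in data (NOT the real Housing_Price_Data.csv)
toy = pd.DataFrame({
    "area": [1000, 1500, 2000, 2500, 3000],
    "price": [200000, 290000, 410000, 500000, 600000],
})

toy_reg = LinearRegression()
toy_reg.fit(toy[["area"]], toy["price"])

slope = toy_reg.coef_[0]         # price change per unit of area
intercept = toy_reg.intercept_   # predicted price at area = 0

# Evaluate the line by hand and compare with predict()
area = 300
manual = slope * area + intercept
predicted = toy_reg.predict(pd.DataFrame({"area": [area]}))[0]

print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(f"manual={manual:.2f}, predict()={predicted:.2f}")
```

In the notebook itself, the same check would read `reg.coef_[0] * 300 + reg.intercept_`.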

### Plot the Fitted Line

In [None]:
plt.xlabel('Area (SQ ft)')
plt.ylabel('Price (USD)')
plt.scatter(df.area, df.price, color='red', marker='+')
plt.title('Area vs Price')
plt.plot(df.area, reg.predict(df[['area']]), color='blue')
plt.show()
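The notebook imports `mean_squared_error` and `r2_score` and creates a test split, so an evaluation step might look like the sketch below. It uses synthetic data generated on the spot (the slope, intercept, and noise level are arbitrary assumptions) and `toy_`-prefixed names so that it does not overwrite the notebook's own variables. The scores it prints say nothing about the real dataset.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: a noisy straight line (all numbers are made up)
rng = np.random.default_rng(42)
toy_area = rng.uniform(500, 4000, size=200)
toy_price = 150.0 * toy_area + 50000.0 + rng.normal(0, 20000, size=200)
toy = pd.DataFrame({"area": toy_area, "price": toy_price})

toy_X_train, toy_X_test, toy_y_train, toy_y_test = train_test_split(
    toy[["area"]], toy["price"], test_size=0.2, random_state=42
)

# Fit on the training split, then score on rows the model has not seen
toy_reg = LinearRegression().fit(toy_X_train, toy_y_train)
toy_y_pred = toy_reg.predict(toy_X_test)

mse = mean_squared_error(toy_y_test, toy_y_pred)
rmse = np.sqrt(mse)  # same units as price, easier to interpret than MSE
r2 = r2_score(toy_y_test, toy_y_pred)

print(f"MSE:  {mse:.2f}")
print(f"RMSE: {rmse:.2f}")
print(f"R^2:  {r2:.3f}")
```

For the real model, the equivalent calls would be `mean_squared_error(y_test, reg.predict(X_test))` and `r2_score(y_test, reg.predict(X_test))`, which are only meaningful if `reg` was fitted on the training split alone.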

### Predict Prices for New Areas

In [None]:
d = pd.read_csv("areas.csv")
d.head(5)

In [None]:
p = reg.predict(d)

In [None]:
d['prices'] = p

In [None]:
# index=False keeps the row index out of the saved file
d.to_csv("prediction.csv", index=False)