# **House Price Prediction with a Random Forest Regressor**

## **Objective**
Explore historical housing data and train a regression model that predicts each home's sale price (`SalePrice`).

## **Data Source**
The dataset is sourced from [Kaggle's House Prices Competition](https://www.kaggle.com/c/house-prices-advanced-regression-techniques).

## **Import Libraries**

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error

## **Import Data**

In [None]:
df = pd.read_csv('house_prices.csv')
df.head()

## **Describe Data**

In [None]:
df.info()
df.describe()

## **Data Visualization**

In [None]:

plt.figure(figsize=(10, 6))
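# numeric_only=True skips text columns, which would otherwise raise an error in pandas 2.0+
sns.heatmap(df.corr(numeric_only=True), cmap='coolwarm', annot=False)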
plt.title('Correlation Heatmap')
plt.show()
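A heatmap over many columns can be hard to read, so it may help to rank features by their correlation with the target instead. The sketch below uses a small invented DataFrame (`toy`) whose column names only mimic the Kaggle set; the values are made up for illustration and are not taken from the real data.

```python
import pandas as pd

# Toy stand-in for the housing data (hypothetical values, illustration only)
toy = pd.DataFrame({
    "GrLivArea": [1000, 1500, 2000, 2500, 3000],
    "OverallQual": [4, 5, 6, 8, 9],
    "YrSold": [2008, 2006, 2010, 2007, 2009],
    "Neighborhood": ["A", "B", "A", "C", "B"],  # text column, excluded from corr
    "SalePrice": [100000, 150000, 210000, 290000, 350000],
})

# Correlation of every numeric feature with SalePrice,
# sorted by absolute strength (strongest first)
corr_with_price = (
    toy.corr(numeric_only=True)["SalePrice"]
    .drop("SalePrice")
    .sort_values(key=abs, ascending=False)
)
print(corr_with_price)
```

On the real `df`, the same expression applied in place of `toy` would give a ranked list that is easier to scan than the full matrix.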


## **Data Preprocessing**

In [None]:

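# Handling missing values: fill numeric columns with their median
# (numeric_only=True is needed because df still contains text columns here)
df.fillna(df.median(numeric_only=True), inplace=True)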

# Encoding categorical variables
df = pd.get_dummies(df, drop_first=True)
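To see what these two preprocessing steps do, here is a minimal sketch on a four-row toy frame. The column names are borrowed from the Kaggle data, but the values are invented. Note that median imputation only touches numeric columns; a missing categorical value is left alone and simply ends up with zeros in every dummy column for that feature.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature dataset with one numeric and one categorical gap
toy = pd.DataFrame({
    "LotFrontage": [60.0, np.nan, 80.0, 100.0],
    "Street": ["Pave", "Grvl", "Pave", np.nan],
})

# The median of the observed LotFrontage values (60, 80, 100) is 80,
# so the missing entry in row 1 becomes 80
toy = toy.fillna(toy.median(numeric_only=True))

# One-hot encode text columns; drop_first=True drops the first level
# ("Grvl") so the remaining column is not redundant
encoded = pd.get_dummies(toy, drop_first=True)
print(encoded)
```

If missing categories should be kept as their own signal, `pd.get_dummies(..., dummy_na=True)` is one option to consider.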


## **Define Target Variable (y) and Feature Variables (X)**

In [None]:

X = df.drop(columns=['SalePrice'])
y = df['SalePrice']


## **Train Test Split**

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

## **Modeling**

In [None]:

model = RandomForestRegressor(random_state=42)
model.fit(X_train, y_train)
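Once a Random Forest is fitted, its `feature_importances_` attribute gives a rough view of which inputs the trees relied on. The sketch below is not run on the housing data; it uses synthetic data with one informative column (`signal`) and one irrelevant column (`noise`), both hypothetical names, to show the pattern.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Synthetic data: the target depends on "signal" only
rng = np.random.default_rng(0)
X_toy = pd.DataFrame({
    "signal": rng.normal(size=200),
    "noise": rng.normal(size=200),
})
y_toy = 3.0 * X_toy["signal"] + rng.normal(scale=0.1, size=200)

rf = RandomForestRegressor(n_estimators=50, random_state=0)
rf.fit(X_toy, y_toy)

# Impurity-based importances, one value per column, summing to 1
importances = pd.Series(rf.feature_importances_, index=X_toy.columns)
print(importances.sort_values(ascending=False))
```

For the notebook's model, the equivalent would be `pd.Series(model.feature_importances_, index=X_train.columns)`.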


## **Model Evaluation**

In [None]:

y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
print(f"Mean Squared Error: {mse}")
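Mean squared error is reported in squared dollars, which is hard to interpret. Taking the square root (RMSE) puts the error back in price units, and the Kaggle competition itself scores submissions by RMSE on the logarithm of the prices. The sketch below computes all three on a few invented numbers; the arrays are hypothetical stand-ins for `y_test` and `y_pred`.

```python
import numpy as np

# Hypothetical true and predicted sale prices
y_true = np.array([200000.0, 150000.0, 300000.0])
y_hat = np.array([210000.0, 140000.0, 330000.0])

# Mean squared error, as in the cell above
mse = np.mean((y_true - y_hat) ** 2)

# Root mean squared error, in the same units as the prices
rmse = np.sqrt(mse)

# RMSE on log prices, so relative errors count equally for cheap and expensive homes
rmse_log = np.sqrt(np.mean((np.log(y_true) - np.log(y_hat)) ** 2))

print(f"MSE: {mse:.2f}")
print(f"RMSE: {rmse:.2f}")
print(f"RMSE (log prices): {rmse_log:.4f}")
```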


## **Prediction**

In [None]:

sample_data = X_test.iloc[0:1]
prediction = model.predict(sample_data)
print(f"Predicted Sale Price: {prediction[0]}")


## **Explanation**
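This project walks through a complete house-price prediction workflow: loading and inspecting the Kaggle data, visualizing feature correlations, filling missing values with column medians, one-hot encoding categorical variables, and training a Random Forest Regressor to predict `SalePrice`. The model is evaluated by mean squared error on a held-out 20% test split.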