# 🚘 Car Price Prediction with Machine Learning

## 🧩 Problem Statement
The price of a car depends on factors such as brand, features, fuel type, transmission, and mileage. In this project, we will build a machine learning model to predict the **selling price** of a car using historical data.


In [None]:
# 📦 Step 1: Import Required Libraries
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
import warnings
warnings.filterwarnings('ignore')

In [None]:
# 📂 Step 2: Load the Dataset
df = pd.read_csv("car data.csv")
df.head()

In [None]:
# 🧹 Step 3: Data Preprocessing
# Create a new column for car age.
# 2020 is a fixed reference year, so ages are relative to 2020 rather than today.
df['Car_Age'] = 2020 - df['Year']
df.drop(['Car_Name', 'Year'], axis=1, inplace=True)

# One-hot encoding for categorical columns
df = pd.get_dummies(df, drop_first=True)
df.head()
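
To make the preprocessing step concrete, here is a minimal sketch on a toy frame. The three rows are made up for illustration and are not taken from `car data.csv`; only the column names mirror the ones used above. It shows which dummy columns `drop_first=True` keeps.

```python
import pandas as pd

# Toy rows (made-up values) that reuse the dataset's column names
toy = pd.DataFrame({
    "Year": [2014, 2017, 2011],
    "Fuel_Type": ["Petrol", "Diesel", "CNG"],
    "Transmission": ["Manual", "Automatic", "Manual"],
})

REFERENCE_YEAR = 2020  # same fixed reference year as in the cell above
toy["Car_Age"] = REFERENCE_YEAR - toy["Year"]
toy = toy.drop(columns=["Year"])

# drop_first=True drops the alphabetically first level of each category
# (CNG and Automatic here), which then acts as the implicit baseline.
encoded = pd.get_dummies(toy, drop_first=True)
print(encoded.columns.tolist())
# ['Car_Age', 'Fuel_Type_Diesel', 'Fuel_Type_Petrol', 'Transmission_Manual']
```

A row with both `Fuel_Type_Diesel` and `Fuel_Type_Petrol` equal to zero therefore represents a CNG car.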

In [None]:
# 📊 Step 4: Data Visualization
plt.figure(figsize=(10, 6))
sns.heatmap(df.corr(), annot=True, cmap='coolwarm')
plt.title('Correlation Heatmap')
plt.show()
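
A heatmap can be hard to read when there are many columns. As a rough complement, the sketch below ranks features by the absolute value of their correlation with the target. The numbers are invented for illustration; in the notebook the same expression would be applied to `df`.

```python
import pandas as pd

# Made-up values; only the column names follow the dataset
toy = pd.DataFrame({
    "Selling_Price": [3.0, 5.0, 7.0, 9.0],
    "Present_Price": [4.0, 6.5, 9.0, 12.0],
    "Car_Age": [8, 6, 4, 2],
})

# Correlation of every feature with the target, strongest first
target_corr = (
    toy.corr()["Selling_Price"]
    .drop("Selling_Price")
    .sort_values(key=abs, ascending=False)
)
print(target_corr)
```

Sorting by absolute value keeps strong negative relationships, such as older cars selling for less, near the top of the list.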

In [None]:
# 🚂 Step 5: Split Data and Train Model
X = df.drop('Selling_Price', axis=1)
y = df['Selling_Price']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = RandomForestRegressor(random_state=42)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
score = r2_score(y_test, y_pred)
print(f"R² Score: {score:.2f}")
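
A single R² value from one split can be noisy, and it says nothing about the size of the errors in price units. The sketch below, run on synthetic data rather than the car dataset, shows one way to add MAE, RMSE, and a 5-fold cross-validated R². The names `X_demo`, `y_demo`, and `demo_model` are placeholders introduced here for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic regression data (an assumption, not the car dataset)
rng = np.random.default_rng(42)
X_demo = rng.uniform(0, 10, size=(200, 3))
y_demo = 2.0 * X_demo[:, 0] - 0.5 * X_demo[:, 1] + rng.normal(0, 0.5, size=200)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)
demo_model = RandomForestRegressor(n_estimators=50, random_state=42)
demo_model.fit(X_tr, y_tr)
pred = demo_model.predict(X_te)

# Error metrics expressed in the units of the target
mae = mean_absolute_error(y_te, pred)
rmse = np.sqrt(mean_squared_error(y_te, pred))

# R² averaged over 5 folds is less sensitive to one lucky split
cv_r2 = cross_val_score(
    RandomForestRegressor(n_estimators=50, random_state=42),
    X_demo, y_demo, cv=5, scoring="r2",
)
print(f"MAE: {mae:.2f}  RMSE: {rmse:.2f}  CV R²: {cv_r2.mean():.2f}")
```

In the notebook, the same calls could be made with `X`, `y`, `y_test`, and `y_pred` in place of the synthetic arrays.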

In [None]:
# 📈 Step 6: Feature Importance
feat_importances = pd.Series(model.feature_importances_, index=X.columns)
feat_importances.nlargest(10).plot(kind='barh')
plt.title("Feature Importance")
plt.show()
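
The impurity-based `feature_importances_` values are computed from the training data and can favor features with many distinct values. One possible cross-check is permutation importance on held-out data, sketched below on synthetic data. The column names `signal` and `noise` are hypothetical and chosen only to make the expected outcome obvious.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic data: the target depends on "signal" only
rng = np.random.default_rng(0)
X_demo = pd.DataFrame({
    "signal": rng.uniform(0, 10, size=300),
    "noise": rng.uniform(0, 10, size=300),
})
y_demo = 3.0 * X_demo["signal"] + rng.normal(0, 0.5, size=300)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0
)
demo_model = RandomForestRegressor(n_estimators=50, random_state=0)
demo_model.fit(X_tr, y_tr)

# Shuffle one column at a time and measure the drop in test R²
result = permutation_importance(
    demo_model, X_te, y_te, n_repeats=5, random_state=0
)
perm_importances = pd.Series(result.importances_mean, index=X_demo.columns)
print(perm_importances.sort_values(ascending=False))
```

If both methods rank the same features highly, the ranking is more likely to reflect real predictive value rather than an artifact of how trees split.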

## ✅ Conclusion
- We trained a Random Forest regressor to predict car selling prices.
- The R² score on the held-out 20% test split indicates how much of the variance in selling price the model explains; values closer to 1 mean a better fit.
- Feature importance analysis reveals that **Present Price**, **Car Age**, and **Kms Driven** are significant contributors.

This notebook can be extended into a web app using **Streamlit** for real-time predictions.
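
Any app built on top of the model has to turn a single user input into a row with exactly the columns the model was trained on. The helper below is a minimal sketch of that step using pandas only. `build_input_row` and the `training_columns` list are hypothetical; in the notebook the column list would come from `X.columns`.

```python
import pandas as pd

def build_input_row(raw: dict, training_columns: list) -> pd.DataFrame:
    """Encode one raw record and align it to the training columns."""
    row = pd.get_dummies(pd.DataFrame([raw]))
    # Dummy columns missing from this record become 0; columns the model
    # never saw (for example a dropped baseline level) are discarded.
    return row.reindex(columns=training_columns, fill_value=0)

# Hypothetical column list; use list(X.columns) in the notebook
training_columns = [
    "Present_Price", "Kms_Driven", "Car_Age",
    "Fuel_Type_Diesel", "Fuel_Type_Petrol", "Transmission_Manual",
]

row = build_input_row(
    {
        "Present_Price": 5.59,
        "Kms_Driven": 27000,
        "Car_Age": 6,
        "Fuel_Type": "Petrol",
        "Transmission": "Manual",
    },
    training_columns,
)
print(row)
```

The resulting one-row frame could then be passed to `model.predict`, whether the input comes from a Streamlit form or any other front end.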