
# Practical Application: What Drives the Price of a Car?

## Overview
This project analyzes used-vehicle listings to identify the factors that influence car prices. Starting from a dataset of more than 426,000 listings, we clean the data, run exploratory data analysis (EDA) with visualizations, and examine correlations among the numeric features. The analysis uses Pandas, NumPy, Matplotlib, and Seaborn.


In [None]:

import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

# Set plot styles (set_theme is the current name for the older sns.set alias)
sns.set_theme(style='whitegrid')
plt.rcParams['figure.figsize'] = (10, 6)


## 1. Load the Dataset

In [None]:

df = pd.read_csv("vehicles.csv")
df.head()


## 2. Data Summary and Cleaning

In [None]:

# Column dtypes and non-null counts
df.info()
# Number of missing values per column
df.isnull().sum()
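
Raw missing-value counts can be hard to compare across columns, so it may help to look at the share of missing values instead. The following is a minimal sketch on a toy table; the helper name `missing_share` and the toy values are assumptions made for illustration and are not part of the original notebook.

```python
import pandas as pd

def missing_share(frame: pd.DataFrame) -> pd.Series:
    """Return the fraction of missing values per column, largest first."""
    return frame.isnull().mean().sort_values(ascending=False)

# Toy stand-in for the listings table (invented values)
toy = pd.DataFrame({
    "price": [15000, 9000, 22000, 8000],
    "condition": ["good", None, None, "fair"],
    "size": [None, None, None, "full-size"],
})

share = missing_share(toy)
print(share)
```

Columns near the top of this ranking are the ones most likely to dominate a blanket `dropna()`.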


In [None]:

# Drop rows with a missing value in any column, then filter out
# implausible prices (<= $100) and model years (before 1900)
df = df.dropna()
df = df[(df['price'] > 100) & (df['year'] >= 1900)]
df.shape
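
Called without arguments, `dropna()` discards a row if any column is missing, which can remove far more listings than intended when some columns are sparsely filled. A gentler variant is sketched below on a toy table: it requires values only in the columns the filters depend on. The function name `clean_listings`, the choice of required columns, and the toy data (including the all-empty `county` column) are assumptions for illustration, not the notebook's actual cleaning step.

```python
import numpy as np
import pandas as pd

def clean_listings(frame: pd.DataFrame, required: list) -> pd.DataFrame:
    """Drop rows missing any required column, then apply the price/year filters."""
    out = frame.dropna(subset=required)
    return out[(out["price"] > 100) & (out["year"] >= 1900)]

# Toy stand-in for the listings table (invented values)
toy = pd.DataFrame({
    "price": [15000, 50, 22000, 8000, np.nan],
    "year": [2015, 2010, 1890, 2008, 2012],
    "odometer": [60000, 120000, 30000, np.nan, 90000],
    "county": [np.nan] * 5,  # a column with no values at all
})

# A blanket dropna() removes every row because `county` is always missing
print(len(toy.dropna()))

# Hypothetical set of required columns
cleaned = clean_listings(toy, required=["price", "year", "odometer"])
print(cleaned.shape)
```

Comparing `df.shape` before and after each step is a quick way to see how many listings a given cleaning rule costs.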


## 3. Descriptive Statistics

In [None]:

df.describe(include='all')


## 4. Visualizations

### 4.1 Price Distribution

In [None]:

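# Plot only listings within the displayed range so the 100 bins are not
# stretched by extreme prices beyond the x-axis limit
price_max = 60000
sns.histplot(df.loc[df['price'] <= price_max, 'price'], bins=100, kde=True)
plt.title("Distribution of Car Prices")
plt.xlabel("Price ($)")
plt.ylabel("Count")
plt.xlim(0, price_max)
plt.show()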
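
The 60,000 cutoff on the x-axis is a fixed choice. One alternative is to derive the plotted range from the data with quantiles. The sketch below shows the idea on an invented price series; the helper `trim_by_quantile` and the 1%/99% thresholds are assumptions, not values taken from the notebook.

```python
import pandas as pd

def trim_by_quantile(values: pd.Series, lower: float = 0.01, upper: float = 0.99) -> pd.Series:
    """Keep only values between the given lower and upper quantiles (inclusive)."""
    lo, hi = values.quantile([lower, upper])
    return values[(values >= lo) & (values <= hi)]

# Invented prices: 1,000 to 100,000 in steps of 1,000, plus one extreme listing
prices = pd.Series(list(range(1000, 101000, 1000)) + [5_000_000])

trimmed = trim_by_quantile(prices)
print(len(prices), len(trimmed), trimmed.max())
```

The trimmed series could then be passed to `sns.histplot` in place of the raw `price` column.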


### 4.2 Price by Fuel Type

In [None]:

sns.boxplot(x="fuel", y="price", data=df)
plt.title("Price by Fuel Type")
plt.ylim(0, 60000)
plt.show()
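
A box plot shows the spread per category, but a small table of group sizes and medians can make the comparison easier to read. Here is a minimal sketch on invented data; the helper `median_price_by` is hypothetical, and the same call would work for `transmission` or `type`.

```python
import pandas as pd

def median_price_by(frame: pd.DataFrame, column: str) -> pd.DataFrame:
    """Count listings and compute the median price per category, highest median first."""
    summary = frame.groupby(column)["price"].agg(["count", "median"])
    return summary.sort_values("median", ascending=False)

# Toy stand-in for the listings table (invented values)
toy = pd.DataFrame({
    "fuel": ["gas", "gas", "gas", "diesel", "diesel", "electric"],
    "price": [10000, 12000, 14000, 30000, 34000, 40000],
})

by_fuel = median_price_by(toy, "fuel")
print(by_fuel)
```

The `count` column is worth checking alongside the median, since a category with very few listings gives an unreliable estimate.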


### 4.3 Price by Transmission

In [None]:

sns.boxplot(x="transmission", y="price", data=df)
plt.title("Price by Transmission Type")
plt.ylim(0, 60000)
plt.show()


### 4.4 Vehicle Type Distribution

In [None]:

sns.countplot(data=df, x="type", order=df['type'].value_counts().index)
plt.title("Vehicle Type Distribution")
plt.xticks(rotation=45)
plt.show()


## 5. Feature Correlation

In [None]:

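# Keep only numeric columns, then compute pairwise Pearson correlations
numeric_cols = df.select_dtypes(include=np.number)
corr = numeric_cols.corr()

# Fix the color scale to [-1, 1] so positive and negative values are comparable
sns.heatmap(corr, annot=True, cmap="coolwarm", vmin=-1, vmax=1)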
plt.title("Correlation Heatmap")
plt.show()
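
Since the question is what drives price, it can be useful to pull out the `price` column of the correlation matrix and rank the other features by the magnitude of their correlation. The sketch below does this on invented data. It uses Spearman (rank) correlation, which is a swapped-in alternative to the Pearson default used by `.corr()` and may suit skewed prices better; the helper name `rank_price_correlations` is hypothetical.

```python
import pandas as pd

def rank_price_correlations(frame: pd.DataFrame, method: str = "spearman") -> pd.Series:
    """Correlate every numeric column with price and sort by absolute strength."""
    corr = frame.select_dtypes(include="number").corr(method=method)
    return corr["price"].drop("price").sort_values(key=lambda s: s.abs(), ascending=False)

# Toy stand-in for the listings table (invented values)
toy = pd.DataFrame({
    "year": [2005, 2010, 2015, 2020],
    "odometer": [150000, 100000, 60000, 20000],
    "price": [4000, 9000, 16000, 30000],
    "fuel": ["gas", "gas", "diesel", "electric"],  # non-numeric, ignored
})

ranked = rank_price_correlations(toy)
print(ranked)
```

A negative coefficient for `odometer` here means that price falls as mileage rises.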



## 6. Key Findings & Recommendations

- **Newer vehicles** and those in **better condition** tend to command higher prices.
- **Electric and diesel** vehicles tend to be priced higher than vehicles with other fuel types.
- **Odometer reading** is strongly negatively correlated with price: lower-mileage vehicles sell for more.
- **Recommendation:** focus on clean, low-mileage SUVs or sedans for better resale value.
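
One way to sanity-check the mileage finding is to bin the odometer readings and compare the median price per bin. The sketch below uses invented data and arbitrary bin edges; the bin boundaries, labels, and variable names are assumptions for illustration only.

```python
import pandas as pd

# Toy stand-in for the listings table (invented values)
toy = pd.DataFrame({
    "odometer": [10000, 30000, 60000, 90000, 130000, 170000],
    "price": [30000, 26000, 18000, 14000, 8000, 5000],
})

# Hypothetical mileage bands; each bin includes its right edge
bands = pd.cut(
    toy["odometer"],
    bins=[0, 50000, 100000, 200000],
    labels=["<=50k", "50k-100k", "100k-200k"],
)

# Median price per mileage band
median_by_band = toy.groupby(bands, observed=True)["price"].median()
print(median_by_band)
```

If the finding holds on the real data, the medians should fall steadily from the lowest-mileage band to the highest.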

---
Prepared by Erfan Maleki
