<a href="https://colab.research.google.com/github/SanketManav9620/Automobile_EDA-/blob/main/automobileEDA.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Project Name**    -



##### **Project Type**    - EDA
##### **Contribution**    - Individual
Sanket Kumar

# **Project Summary -**

This Exploratory Data Analysis (EDA) project aims to understand the structure, trends, and relationships within a comprehensive automobile dataset. The dataset includes various car attributes such as price, engine specifications, fuel type, body style, number of cylinders, and performance metrics like mileage and horsepower. The objective was to prepare the data for analysis, uncover hidden patterns, and derive actionable insights for business decision-making.

We started with data cleaning by identifying and replacing missing or inconsistent values, converting data types, and handling special characters like '?'. Numeric columns with missing data were imputed using median values, while categorical columns were cleaned and transformed for consistency. Special attention was given to transforming textual representations (e.g., “four”, “six”) into numerical formats to enable analysis.

We performed extensive visual exploration using bar plots, box plots, scatter plots, pair plots, and heatmaps. These charts revealed that variables like engine size, horsepower, and curb weight are positively correlated with car price. In contrast, fuel efficiency (measured in city and highway MPG) tends to be higher in lower-priced cars.

Further analysis showed that body style, aspiration type (turbo vs standard), drive wheels, and fuel type significantly affect the vehicle's market value. We also examined brand-specific trends, observing that luxury brands typically dominate the higher price segments.

Based on our findings, we proposed actionable strategies such as segmentation of customers, feature-based pricing, and optimization of product offerings. These insights can directly support pricing strategies, customer targeting, and feature prioritization in product development.

Overall, this project demonstrates the critical role of EDA in transforming raw, inconsistent data into a refined dataset that provides valuable business insights. It sets a strong foundation for future tasks like predictive modeling or deeper market analysis.

# **GitHub Link -**

https://github.com/SanketManav9620/Automobile_EDA-

# **Problem Statement**


The automotive industry constantly seeks to understand what factors influence car pricing and how these factors relate to consumer preferences and performance. With numerous car models varying by brand, engine size, fuel type, and other specifications, it becomes essential to identify which features most significantly impact a car’s market value. The raw dataset contains missing values, inconsistent formats, and categorical variables that require transformation to be useful for analysis.

This project aims to perform exploratory data analysis (EDA) on a car dataset to clean, transform, and uncover meaningful patterns and relationships that can guide business decisions.



#### **Define Your Business Objective?**

The primary business objective is to identify key factors that influence the price of a car and understand how different features such as brand, engine size, horsepower, mileage, fuel type, and body style affect consumer value perception.

By doing this, the goal is to:

*   Enable data-driven pricing strategies.
*   Segment the customer base for better-targeted marketing.
*   Optimize product offerings based on performance and efficiency trends.
*   Provide a clean and structured dataset ready for predictive modeling in future phases.

The ultimate aim is to support automotive businesses in increasing profitability, improving customer satisfaction, and launching competitive products in the market.

# **General Guidelines** : -  

1.   Well-structured, formatted, and commented code is required.
2.   Exception Handling, Production Grade Code & Deployment Ready Code will be a plus. Those students will be awarded some additional credits.
     
     The additional credits will have advantages over other students during Star Student selection.
       
             [ Note: - Deployment Ready Code is defined as, the whole .ipynb notebook should be executable in one go
                       without a single error logged. ]

3.   Each and every logic should have proper comments.
4. You may add as many charts as you want. Make sure the following format is answered for each and every chart.
        

```
# Chart visualization code
```
            

*   Why did you pick the specific chart?
*   What is/are the insight(s) found from the chart?
* Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

5. You have to create at least 20 logical & meaningful charts having important insights.


[ Hints : - Do the Visualization in a structured way while following "UBM" Rule.

U - Univariate Analysis,

B - Bivariate Analysis (Numerical - Categorical, Numerical - Numerical, Categorical - Categorical)

M - Multivariate Analysis
 ]





# ***Let's Begin !***

## ***1. Know Your Data***

### Import Libraries

In [None]:
# Import Libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

### Dataset Loading

In [None]:
# Load Dataset
from google.colab import drive
drive.mount('/content/drive')
path="/content/drive/MyDrive/automobile project/automobile_data.csv"
df=pd.read_csv(path)

### Dataset First View

In [None]:
# Dataset First Look
df.head()

### Dataset Rows & Columns count

In [None]:
# Dataset Rows & Columns count
df.shape

### Dataset Information

In [None]:
# Dataset Info
df.info()

#### Duplicate Values

In [None]:
# Dataset Duplicate Value Count
print("total duplicate rows: ",int(df.duplicated().sum()))

#### Missing Values/Null Values

In [None]:
# Missing Values/Null Values Count
df.replace("?", np.nan, inplace=True)
print(f"total missing values in columns: \n\n{df.isnull().sum()}")

In [None]:
# Visualizing the missing values
plt.figure(figsize=(16, 6))
df.isnull().sum().plot(kind="bar")

plt.title("Missing Values per Column")
plt.ylabel("Count")
plt.show()

### **What did you know about your dataset?**
This dataset contains information about various automobile features, likely aimed at analyzing or predicting car prices. It has 205 rows and 26 columns, including both numerical and categorical variables such as make, fuel type, engine size, horsepower, and price. Several columns originally contained missing values represented by "?", which were replaced with NaN. After this replacement, seven columns show missing data: normalized-losses, num-of-doors, bore, stroke, horsepower, peak-rpm, and price. The numerical columns span very different ranges, and some, such as compression-ratio, show signs of outliers. The dataset is well-suited for exploratory data analysis and for building machine learning models for price prediction, provided missing values are handled and categorical variables are properly encoded.
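
As a minimal sketch of the "?" handling described above, the placeholders can also be converted while the file is read, and the missing values summarized as counts and percentages. The tiny CSV below is made up for illustration and is not the project data; `sample_df` and `missing_summary` are hypothetical names.

```python
import io

import pandas as pd

# Tiny made-up sample shaped like the raw file ('?' marks a missing value)
raw_csv = io.StringIO(
    "make,horsepower,price\n"
    "audi,102,13950\n"
    "bmw,?,16430\n"
    "mazda,68,?\n"
    "toyota,?,5348\n"
)

# na_values="?" turns the placeholders into NaN at load time
sample_df = pd.read_csv(raw_csv, na_values="?")

# Count and percentage of missing values per column
missing_summary = pd.DataFrame({
    "missing_count": sample_df.isnull().sum(),
    "missing_pct": (sample_df.isnull().mean() * 100).round(1),
})
print(missing_summary[missing_summary["missing_count"] > 0])
```

On the real file, the same `na_values="?"` argument could be passed to the existing `pd.read_csv(path)` call.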


## ***2. Understanding Your Variables***

In [None]:
# Dataset Columns
df.columns

In [None]:
# Dataset Describe
print(df.describe())

### Variables Description

| Column Name         | Description                                                                     |
| ------------------- | ------------------------------------------------------------------------------- |
| `symboling`         | **Risk factor rating**: from -3 (safe) to +3 (risky). Higher = more risky car.  |
| `normalized-losses` | Insurance loss rating (normalized). Often missing.                              |
| `make`              | Brand/manufacturer of the car (e.g., BMW, Toyota).                              |
| `fuel-type`         | Fuel type: **gas** or **diesel**.                                               |
| `aspiration`        | Type of engine aspiration: **std** (standard) or **turbo** (turbocharged).      |
| `num-of-doors`      | Number of doors: **two**, **four**, etc. Sometimes missing.                     |
| `body-style`        | Sedan, hatchback, convertible, wagon, etc.                                      |
| `drive-wheels`      | FWD, RWD, or 4WD (front-, rear-, or four-wheel drive).                          |
| `engine-location`   | Where the engine is placed: front or rear.                                      |
| `wheel-base`        | Distance between front and rear wheels (in inches). Affects stability.          |
| `length`            | Car length (in inches).                                                         |
| `width`             | Car width (in inches).                                                          |
| `height`            | Car height (in inches).                                                         |
| `curb-weight`       | Car weight without passengers/cargo.                                            |
| `engine-type`       | Type of engine: dohc, ohc, ohcv, l, etc.                                        |
| `num-of-cylinders`  | Number of engine cylinders: four, six, eight, etc. (as string).                 |
| `engine-size`       | Engine displacement (likely cubic inches, not cc). Bigger = more power.         |
| `fuel-system`       | Fuel system: mpfi, 2bbl, 1bbl, spfi, etc.                                       |
| `bore`              | Diameter of the engine cylinder (in inches). Missing values possible.           |
| `stroke`            | Distance the piston travels inside the cylinder (in inches). Sometimes missing. |
| `compression-ratio` | Ratio of compression before ignition. Higher ratio = more power/fuel efficient. |
| `horsepower`        | Engine output power. Often missing or represented as '?'.                       |
| `peak-rpm`          | RPM at which maximum horsepower is generated.                                   |
| `city-mpg`          | Mileage in city driving (miles per gallon).                                     |
| `highway-mpg`       | Mileage on highways (miles per gallon).                                         |
| `price`             | Price of the car (USD). May contain missing values or '?'.                      |


### Check Unique Values for each variable.

In [None]:
# Check Unique Values for each variable.
a = {x: df[x].unique().tolist() for x in df.columns}
print(a)
print(*[f"no. of unique values in column {x} is {len(a.get(x))}" for x in a],sep='\n')

## 3. ***Data Wrangling***

### Data Wrangling Code

In [None]:
df.replace("?", np.nan, inplace=True)

numeric_cols = ['normalized-losses', 'bore', 'stroke', 'horsepower', 'peak-rpm', 'price']
for col in numeric_cols:
    df[col] = pd.to_numeric(df[col], errors='coerce')
    df[col] = df[col].fillna(df[col].median())

# Handle missing values in 'num-of-doors' by checking if mode() is empty
num_doors_mode = df["num-of-doors"].mode()
if not num_doors_mode.empty:
    df["num-of-doors"] = df["num-of-doors"].fillna(num_doors_mode[0])
else:
    # Handle the case where mode() is empty, perhaps fill with a default value like 'four'
    df["num-of-doors"] = df["num-of-doors"].fillna('four')


for col in df.columns:
    if df[col].dtype == 'object':
        df[col] = df[col].str.lower()

maps = {'four': 4, 'six': 6, 'five': 5, 'three': 3, 'twelve': 12, 'two': 2, 'eight': 8}
df['num-of-doors'] = df['num-of-doors'].map(maps)
df['num-of-cylinders'] = df['num-of-cylinders'].map(maps)

df.drop_duplicates(inplace=True)
df.reset_index(drop=True, inplace=True)
df.info()

### What all manipulations have you done and insights you found?

1. Manipulations Done:

*   Replaced all "?" entries with NaN to handle missing values properly.
*   Converted object-type numerical columns like horsepower, bore, stroke, and price to actual numeric types using `pd.to_numeric()`.
*   Filled missing values in numeric columns with the median of each column.
*   Filled missing values in the num-of-doors column using the mode (most common value).
*   Converted all string-based columns to lowercase for consistency.
*   Mapped textual numbers (e.g., 'four', 'two') to integers in num-of-doors and num-of-cylinders.
*   Removed duplicate rows using `drop_duplicates()`.
*   Reset the DataFrame index after cleaning operations using `reset_index()`.

2. Insights Found:

*   Several key columns had missing values, such as normalized-losses, bore, stroke, and price, which are important for price prediction or analysis.
*   Most categorical columns have a manageable number of unique values, making them suitable for encoding in machine learning.
*   Some numerical variables, such as engine-size, horsepower, and curb-weight, are likely to be important predictors for car price.
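
One caveat with the text-to-number step: `Series.map` returns NaN for any label that is not in the dictionary, so a typo or an unexpected category would silently become a missing value. Below is a minimal sketch of a check, using made-up values (the misspelled 'twleve' is deliberate) and hypothetical variable names.

```python
import pandas as pd

# Made-up values; 'twleve' is a deliberate typo to show the failure mode
cylinders = pd.Series(["four", "six", "twleve", "four"], name="num-of-cylinders")
word_to_number = {"two": 2, "three": 3, "four": 4, "five": 5,
                  "six": 6, "eight": 8, "twelve": 12}

mapped = cylinders.map(word_to_number)

# Labels that were present before mapping but are NaN afterwards
# were not found in the dictionary
unmapped = cylinders[mapped.isnull() & cylinders.notnull()].unique().tolist()
print("Unmapped labels:", unmapped)
```

An empty list here would suggest that every label was covered by the mapping.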

## ***4. Data Visualization, Storytelling & Experimenting with charts : Understand the relationships between variables***

#### Chart - 1

In [None]:
# Chart - 1 visualization code
# Scatterplot: Engine Size vs Price
plt.figure(figsize=(10, 6))
sns.scatterplot(data=df, x='engine-size', y='price', hue='fuel-type')
plt.title('Engine Size vs Price colored by Fuel Type')
plt.xlabel('Engine Size')
plt.ylabel('Price ($)')
plt.show()


##### 1. Why did you pick the specific chart?

I chose this scatterplot because it shows the relationship between two important numeric variables — engine size and car price — and uses fuel-type as hue to compare gas vs diesel cars in the same plot. This helps visually check whether bigger engines mean higher prices and whether fuel type affects this relationship.

##### 2. What is/are the insight(s) found from the chart?

The chart shows a clear positive trend: cars with larger engine sizes generally have higher prices. Also, there is some visible difference in price ranges for gas vs diesel cars — diesel cars tend to cluster at slightly different engine sizes and price points, showing how fuel type may affect pricing too.
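
The trend can also be quantified. The sketch below computes the Pearson correlation between engine size and price within each fuel type. The rows are made up for illustration, so the printed numbers say nothing about the real dataset; `toy` and `corr_by_fuel` are hypothetical names.

```python
import pandas as pd

# Made-up rows for illustration only; the notebook itself works on df
toy = pd.DataFrame({
    "engine-size": [90, 110, 130, 180, 100, 120, 140, 160],
    "price": [6000, 8000, 11000, 20000, 8500, 10500, 13000, 17000],
    "fuel-type": ["gas"] * 4 + ["diesel"] * 4,
})

# Pearson correlation between engine size and price within each fuel type
corr_by_fuel = {
    fuel: group["engine-size"].corr(group["price"])
    for fuel, group in toy.groupby("fuel-type")
}
print(corr_by_fuel)
```

A value close to 1 in both groups would support reading the scatterplot as a positive linear trend for gas and diesel cars alike.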

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

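Yes. Because price rises with engine size, engine size can serve as an anchor for feature-based pricing, and comparing gas and diesel cars at similar engine sizes shows where fuel type shifts the price.

No insight here points directly to negative growth, but a lineup concentrated on large, expensive engines would miss budget-conscious buyers.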

#### Chart - 2

In [None]:

# Bar Plot: Average Price by Body Style
plt.figure(figsize=(8, 5))
sns.barplot(data=df, x='body-style', y='price', palette='pastel')
plt.title('Average Price by Body Style')
plt.xlabel('Body Style')
plt.ylabel('Average Price ($)')
plt.show()


##### 1. Why did you pick the specific chart?

I chose a bar plot because it clearly shows how the target variable (price) varies across body styles, a key categorical feature. It makes average prices easy to compare between categories at a glance, which helps identify the types of cars that tend to be more expensive and shows useful trends for buyers, sellers, and manufacturers.

##### 2. What is/are the insight(s) found from the chart?

The chart shows that convertibles and hardtops have the highest average prices, hatchbacks and wagons are the most affordable, and sedans fall in between. Body style is therefore strongly associated with a car's market value.
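
The bar heights in this chart are group means, so the same comparison can be printed as a table. Here is a small sketch on made-up rows (not the project data) that also reports how many cars sit behind each mean; `price_by_style` is a hypothetical name.

```python
import pandas as pd

# Made-up rows for illustration only
toy = pd.DataFrame({
    "body-style": ["sedan", "sedan", "hatchback", "hatchback", "convertible"],
    "price": [12000, 16000, 8000, 10000, 25000],
})

# Mean price and number of cars per body style, most expensive first
price_by_style = (
    toy.groupby("body-style")["price"]
    .agg(["mean", "count"])
    .sort_values("mean", ascending=False)
)
print(price_by_style)
```

The count column matters because a mean based on only a handful of cars is a weak basis for a pricing decision.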

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes. This insight helps stakeholders understand how body style relates to pricing, which can guide production, marketing, and sales strategies for better profits.

#### Chart - 3

In [None]:
# Chart - 3 visualization code
plt.figure(figsize=(6, 4))
sns.barplot(data=df, x='fuel-type', y='price', estimator='mean', hue='fuel-type', palette='Set2', errorbar=None)
plt.title('Average Car Price by Fuel Type')
plt.xlabel('Fuel Type')
plt.ylabel('Average Price (USD)')
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

This bar chart compares the average price of cars based on their fuel type — a key factor in customer decision-making. Since fuel-type is a simple categorical variable with only a few values (gas, diesel), it is easy to interpret and ideal for bar plots.

##### 2. What is/are the insight(s) found from the chart?

Diesel cars have a higher average price than gas cars.

This could mean diesel variants are more powerful or have better build/features — or simply cost more due to market positioning.

#### Chart - 4

In [None]:
# Chart - 4 visualization code
# Boxplot: Price by Drive Wheels
plt.figure(figsize=(10, 6))
sns.boxplot(data=df, x='drive-wheels', y='price', palette='Set3')
plt.title('Price Distribution by Drive Wheels Type')
plt.xlabel('Drive Wheels')
plt.ylabel('Price ($)')
plt.show()

##### 1. Why did you pick the specific chart?

I chose this boxplot because it shows how car prices vary across different drive wheels types (FWD, RWD, 4WD) and highlights price spread, median price, and outliers in each group.

##### 2. What is/are the insight(s) found from the chart?

Rear-wheel drive (RWD) cars tend to have a higher median price and wider price range compared to front-wheel drive (FWD) cars. 4-wheel drive (4WD) cars also show relatively high prices but fewer extreme outliers.
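
The outliers drawn by a box plot follow, by default, the 1.5 × IQR rule, and the same rule can be applied per group to list the flagged cars. This is a sketch under that assumption, on made-up prices; the column name `is_outlier` is hypothetical.

```python
import pandas as pd

# Made-up prices for illustration only
toy = pd.DataFrame({
    "drive-wheels": ["fwd"] * 6 + ["rwd"] * 6,
    "price": [6000, 7000, 7500, 8000, 9000, 30000,
              12000, 15000, 16000, 18000, 20000, 22000],
})

# Quartiles of price within each drive-wheels group, aligned to the rows
grouped = toy.groupby("drive-wheels")["price"]
q1 = grouped.transform(lambda s: s.quantile(0.25))
q3 = grouped.transform(lambda s: s.quantile(0.75))
iqr = q3 - q1

# A row is flagged when its price lies more than 1.5 * IQR beyond the quartiles
toy["is_outlier"] = (toy["price"] < q1 - 1.5 * iqr) | (toy["price"] > q3 + 1.5 * iqr)
print(toy[toy["is_outlier"]])
```

Listing the flagged rows makes it possible to check whether an outlier is a data error or a genuinely expensive model.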

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Knowing that RWD and 4WD cars tend to sell for higher prices helps a company focus marketing and production on these segments to improve profit margins.

No direct insight leads to negative growth, but overproducing expensive RWD or 4WD models might miss out on the large market for affordable FWD cars, so a balanced product mix is important.

#### Chart - 5

In [None]:
# Chart - 5 visualization code

plt.figure(figsize=(8, 5))
sns.countplot(data=df, x='body-style', hue='body-style', palette='pastel', dodge=False)
plt.title('Count of Cars by Body Style')
plt.xlabel('Body Style')
plt.ylabel('Number of Cars')
plt.legend([], [], frameon=False)  # Hide redundant legend
plt.show()


##### 1. Why did you pick the specific chart?

I chose this chart because a countplot clearly shows the frequency of each body style. It helps identify which car types are most common in the dataset.

##### 2. What is/are the insight(s) found from the chart?

The chart shows that sedans and hatchbacks are the most common body styles, while hardtops and convertibles are rare.
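
The counts behind this chart can also be expressed as shares of the dataset, which is often easier to quote in a report. Below is a minimal sketch on a made-up series (not the project data); `styles` and `share` are hypothetical names.

```python
import pandas as pd

# Made-up body styles for illustration only
styles = pd.Series(
    ["sedan", "sedan", "sedan", "sedan",
     "hatchback", "hatchback", "wagon", "convertible"],
    name="body-style",
)

# Percentage of cars in each body style, most common first
share = (styles.value_counts(normalize=True) * 100).round(1)
print(share)
```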

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

This helps a company know which segments have higher volume and which are niche. They can use this to decide production or marketing focus.

There’s no direct insight that leads to negative growth. But only focusing on high-volume segments might ignore profitable premium segments like convertibles.



#### Chart - 6

In [None]:
plt.figure(figsize=(10, 6))
sns.histplot(data=df, x='horsepower', bins=30, hue='fuel-type', kde=True, palette='Set2')
plt.title('Distribution of Horsepower by Fuel Type')
plt.xlabel('Horsepower')
plt.ylabel('Count')
plt.show()


##### 1. Why did you pick the specific chart?

A histogram shows how horsepower is distributed, which makes common and rare ranges easy to spot. Coloring by fuel type compares gas and diesel cars in the same plot.

##### 2. What is/are the insight(s) found from the chart?

Most cars have mid-level horsepower; few high-power outliers exist.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

It helps decide which horsepower ranges to target. There is no risk of negative growth unless some market segments are ignored.

#### Chart - 7

In [None]:
# Chart - 7 visualization code
plt.figure(figsize=(10, 6))
sns.scatterplot(data=df, x='curb-weight', y='price', hue='body-style')
plt.title('Curb Weight vs Price colored by Body Style')
plt.xlabel('Curb Weight')
plt.ylabel('Price ($)')
plt.show()


##### 1. Why did you pick the specific chart?

A scatterplot checks the link between curb weight and price, with body style as hue to reveal any extra pattern.

##### 2. What is/are the insight(s) found from the chart?

Heavier cars generally cost more. Body styles cluster at different ranges.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

It supports pricing heavier cars higher. No major negative impact is expected.

#### Chart - 8

In [None]:
# Chart - 8 visualization code
plt.figure(figsize=(8, 5))
sns.boxplot(data=df, x='fuel-type', y='price', palette='Set2',hue='fuel-type')
plt.title('Price Distribution by Fuel Type')
plt.xlabel('Fuel Type')
plt.ylabel('Price ($)')
plt.show()


##### 1. Why did you pick the specific chart?

Compares how price range differs by fuel type.

##### 2. What is/are the insight(s) found from the chart?

Diesel cars often cost more than gas cars, or at least show a tighter price range.
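
Whether one group "costs more" or has a "tighter range" can be read off the median and the interquartile range (the line and the box height in a box plot). Here is a sketch on made-up prices; the helper `iqr` and the name `spread` are hypothetical.

```python
import pandas as pd

# Made-up prices for illustration only
toy = pd.DataFrame({
    "fuel-type": ["gas"] * 5 + ["diesel"] * 5,
    "price": [6000, 8000, 10000, 14000, 30000,
              9000, 11000, 13000, 15000, 17000],
})


def iqr(values):
    """Interquartile range: the height of the box in a box plot."""
    return values.quantile(0.75) - values.quantile(0.25)


# Median, spread, and sample size of price per fuel type
spread = toy.groupby("fuel-type")["price"].agg(["median", iqr, "count"])
print(spread)
```

A higher median combined with a smaller IQR would correspond to the "costs more and varies less" reading of the chart.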

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

It supports pricing and supply planning by fuel type. No major downside.

#### Chart - 9

In [None]:
# Chart - 9 visualization code
plt.figure(figsize=(8, 5))
sns.countplot(data=df, x='drive-wheels', hue='drive-wheels', palette='pastel', dodge=False)
plt.title('Count of Cars by Drive Wheels Type')
plt.xlabel('Drive Wheels')
plt.ylabel('Number of Cars')
plt.legend([], [], frameon=False)
plt.show()


##### 1. Why did you pick the specific chart?

Shows popularity of each drive wheel type.

##### 2. What is/are the insight(s) found from the chart?

Front-wheel drive cars dominate the dataset.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Helps plan production focus. No negative growth unless ignoring niche 4WD/RWD demand.

#### Chart - 10

In [None]:
# Chart - 10 visualization code
# Violin Plot: Price by Body Style
plt.figure(figsize=(10, 6))
sns.violinplot(data=df, x='body-style', y='price', palette='muted')
plt.title('Price Distribution by Body Style (Violin Plot)')
plt.xlabel('Body Style')
plt.ylabel('Price ($)')
plt.show()


##### 1. Why did you pick the specific chart?

Shows the detailed price distribution for each body style: its shape, median, and quartiles.


##### 2. What is/are the insight(s) found from the chart?

Some body styles (like convertibles) have higher and wider price ranges.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Helps plan which segments are premium vs budget. No negative growth risk if variety is offered.

#### Chart - 11

In [None]:
# Chart - 11 visualization code
# Pairplot: Price vs Engine Size & Horsepower
sns.pairplot(df, vars=['engine-size', 'horsepower', 'price'], hue='fuel-type', palette='Set1')
plt.suptitle('Pairplot: Engine Size, Horsepower, and Price', y=1.02)
plt.show()


##### 1. Why did you pick the specific chart?

Shows how multiple numeric features relate to each other in one view.

##### 2. What is/are the insight(s) found from the chart?

Engine size and horsepower are strongly related; both relate positively with price.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Helps decide which specs influence price most. No major negative risk.

#### Chart - 12

In [None]:
# Bar Plot: Average Horsepower by Body Style
plt.figure(figsize=(10, 6))
sns.barplot(data=df, x='body-style', y='horsepower', estimator=np.mean, palette='Set3')
plt.title('Average Horsepower by Body Style')
plt.xlabel('Body Style')
plt.ylabel('Average Horsepower')
plt.show()



##### 1. Why did you pick the specific chart?

Compares mean horsepower across different body styles.

##### 2. What is/are the insight(s) found from the chart?

Convertibles and hardtops usually have higher horsepower.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Supports marketing sporty models to power-focused customers. No negative growth.

#### Chart - 13

In [None]:
plt.figure(figsize=(12, 6))
sns.barplot(data=df, x='body-style', y='price', hue='drive-wheels', estimator=np.mean, palette='Set2')
plt.title('Average Price by Body Style and Drive Wheels')
plt.xlabel('Body Style')
plt.ylabel('Average Price ($)')
plt.legend(title='Drive Wheels')
plt.show()


##### 1. Why did you pick the specific chart?

Shows how both body style and drive wheels affect average price — gives more detail than a single category.

##### 2. What is/are the insight(s) found from the chart?

For example, rear-wheel-drive sedans may cost more than front-wheel-drive sedans, and rear-wheel-drive convertibles may sit in the premium range.
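
A two-way table gives the exact numbers behind a grouped bar chart like this one. The sketch below uses made-up rows (not the project data) to build the mean price per combination of body style and drive wheels, plus the number of cars in each cell; `price_table` and `cell_counts` are hypothetical names.

```python
import pandas as pd

# Made-up rows for illustration only
toy = pd.DataFrame({
    "body-style": ["sedan", "sedan", "sedan", "hatchback", "hatchback", "convertible"],
    "drive-wheels": ["fwd", "rwd", "rwd", "fwd", "fwd", "rwd"],
    "price": [9000, 18000, 22000, 7000, 8000, 30000],
})

# Mean price for each body style / drive wheels combination
price_table = toy.pivot_table(index="body-style", columns="drive-wheels",
                              values="price", aggfunc="mean")

# Number of cars behind each mean; combinations with no cars show 0
cell_counts = pd.crosstab(toy["body-style"], toy["drive-wheels"])

print(price_table)
print(cell_counts)
```

Combinations with no cars appear as NaN in the price table, which is a reminder that some bars in the chart may rest on very few observations or none.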

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes. It helps position cars for sporty or premium buyers based on drive type and body style. There is no big negative risk if the lineup stays balanced.

#### Chart - 14 - Correlation Heatmap

In [None]:
# Correlation Heatmap visualization code
plt.figure(figsize=(10, 8))
corr = df[['engine-size', 'horsepower', 'curb-weight', 'price']].corr()
sns.heatmap(corr, annot=True, cmap='coolwarm')
plt.title('Correlation Heatmap')
plt.show()


##### 1. Why did you pick the specific chart?

Shows strength of linear relationships between key numeric variables.

##### 2. What is/are the insight(s) found from the chart?

Price has high correlation with engine size and horsepower.

#### Chart - 15 - Pair Plot

In [None]:
# Pair Plot visualization code
# Select a subset of key numeric columns
cols_to_plot = ['price', 'horsepower', 'engine-size', 'curb-weight', 'city-mpg']

# Drop missing values
sns_df = df[cols_to_plot].dropna()

# Create pair plot
sns.pairplot(sns_df, corner=True, diag_kind='kde', plot_kws={'alpha': 0.6})
plt.suptitle("Pair Plot of Car Attributes", y=1.02)
plt.tight_layout()
plt.show()

##### 1. Why did you pick the specific chart?

I picked the pair plot because it visually compares all combinations of selected numerical features in one grid. It helps identify relationships, trends, and possible outliers. This is especially useful in early stages of analysis when you're exploring how different variables interact.

##### 2. What is/are the insight(s) found from the chart?

price has a strong positive correlation with both horsepower and engine-size. More power → more expensive.

curb-weight also rises with engine-size — bigger engines mean heavier cars.

There's a negative relationship between city-mpg and price, indicating that more fuel-efficient cars are generally cheaper.
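
These visual impressions can be checked numerically by ranking each feature's correlation with price. The following is a minimal sketch on made-up rows, so the values are illustrative only and do not come from the real dataset; `price_corr` is a hypothetical name.

```python
import pandas as pd

# Made-up rows for illustration only
toy = pd.DataFrame({
    "engine-size": [90, 110, 130, 180, 200],
    "horsepower": [70, 85, 110, 150, 180],
    "curb-weight": [2000, 2300, 2500, 3000, 3300],
    "city-mpg": [38, 31, 26, 19, 16],
    "price": [6000, 8000, 11000, 20000, 28000],
})

# Pearson correlation of every column with price, strongest positive first
price_corr = toy.corr()["price"].drop("price").sort_values(ascending=False)
print(price_corr)
```

Features with a negative sign, such as city-mpg in this toy example, move in the opposite direction to price.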

## **5. Solution to Business Objective**

#### What do you suggest the client to achieve Business Objective ?
Explain Briefly.

*   **Segment customers effectively.** Group customers based on key attributes like price, engine size, fuel efficiency, and horsepower to target them with more relevant offerings.
*   **Optimize the product portfolio.** Focus on high-demand, high-margin models. Phase out cars that have high specs but low sales or poor pricing performance.
*   **Implement feature-based pricing strategy.** Since price correlates strongly with features like horsepower and engine size, pricing should reflect performance without undervaluing or overpricing.
*   **Use insights for strategic planning.** Leverage data on make, body-style, and performance to guide marketing, production planning, and inventory management.
*   **Improve data consistency and completeness.** Encourage better data collection practices to reduce missing values and standardize data entry across sources.
*   **Monitor fuel economy trends.** High fuel efficiency is linked with lower prices. Maintain a balance between performance and fuel economy to meet different market needs.
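
As one possible starting point for the segmentation idea, cars can be split into price tiers of equal size with `pd.qcut`. This is a sketch under assumptions: the three tiers and their labels are illustrative choices rather than part of the analysis above, and the rows (including the placeholder make names) are made up.

```python
import pandas as pd

# Made-up rows for illustration only; the make values are placeholders
toy = pd.DataFrame({
    "make": ["a", "b", "c", "d", "e", "f"],
    "price": [5500, 7000, 9500, 13000, 18000, 35000],
})

# Split cars into three equally sized price tiers (tertiles)
toy["price_segment"] = pd.qcut(
    toy["price"], q=3, labels=["budget", "mid-range", "premium"]
)

# Price range and number of cars in each tier
print(toy.groupby("price_segment", observed=True)["price"].agg(["min", "max", "count"]))
```

Each tier could then be profiled by body style, fuel type, or horsepower to describe the cars, and the likely buyers, in that segment.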

# **Conclusion**

In this exploratory data analysis (EDA) project, we examined various features of the automobile dataset to uncover key relationships and insights. By cleaning and preparing the data, handling missing values, and converting categorical values into usable formats, we made the dataset analysis-ready.

Through visualizations like bar plots, heatmaps, boxplots, and pair plots, we identified strong correlations between price and features such as engine size, horsepower, and curb weight. We also observed trends related to fuel efficiency, body style, and brand pricing strategies.

These insights can help businesses segment their customers better, price their products more effectively, and make informed decisions on product development and marketing.

### ***Hurrah! You have successfully completed your EDA Capstone Project !!!***