The following 20 business questions are each answered with Python (pandas) code and a short explanation, using the dataset in `car_price_dataset.csv`:

---

### **1. What is the average price of all cars in the dataset?**
**Python Code:**
```python
import pandas as pd

# Load the dataset
df = pd.read_csv('car_price_dataset.csv')

# Calculate the average price
average_price = df['Price'].mean()
print(f"The average price of all cars is: ${average_price:.2f}")
```
**Explanation:**
- We load the dataset using `pd.read_csv`.
- The `mean()` function calculates the average price of all cars in the `Price` column.
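
One detail worth checking before trusting an average is how many rows actually contributed to it. The sketch below uses a small hypothetical `prices` Series (not taken from `car_price_dataset.csv`) to show that `mean()` skips missing values by default:

```python
import pandas as pd

# Hypothetical prices, not taken from car_price_dataset.csv; one value is missing
prices = pd.Series([10000.0, None, 20000.0], name="Price")

average_price = prices.mean()      # NaN is skipped by default (skipna=True)
n_used = prices.count()            # number of non-missing values
n_missing = prices.isna().sum()    # number of missing values

print(f"Average over {n_used} of {len(prices)} rows: ${average_price:.2f}")
```

If `n_missing` is zero for the real `Price` column, the average covers every car in the dataset.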

---

### **2. Which car brand has the highest average price?**
**Python Code:**
```python
# Group by 'Brand' and calculate the mean price
brand_avg_price = df.groupby('Brand')['Price'].mean()

# Find the brand with the highest average price
highest_avg_brand = brand_avg_price.idxmax()
print(f"The brand with the highest average price is: {highest_avg_brand}")
```
**Explanation:**
- We group the data by `Brand` and calculate the mean price for each brand using `groupby()` and `mean()`.
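- `idxmax()` returns the index label of the largest value, which here is the brand name because the grouped result is indexed by `Brand`. If several brands tie, it returns the first one.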
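
To see how far ahead the top brand is, it can help to look at the whole ranking rather than a single label. This is a minimal sketch on a hypothetical `toy` DataFrame (the brands and prices are made up for illustration):

```python
import pandas as pd

# Hypothetical toy data; not taken from car_price_dataset.csv
toy = pd.DataFrame({
    "Brand": ["Audi", "Audi", "Kia", "Kia", "Ford"],
    "Price": [12000, 14000, 9000, 7000, 10000],
})

# Mean price per brand, indexed by brand name
brand_avg_price = toy.groupby("Brand")["Price"].mean()

# Full ranking from most to least expensive on average
ranking = brand_avg_price.sort_values(ascending=False)

# Index label (brand name) of the largest mean
top_brand = brand_avg_price.idxmax()

print(ranking)
print(f"Top brand: {top_brand}")
```

The first entry of `ranking` is the same brand that `idxmax()` returns.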

---

### **3. What is the most common fuel type in the dataset?**
**Python Code:**
```python
# Find the most common fuel type
most_common_fuel = df['Fuel_Type'].mode()[0]
print(f"The most common fuel type is: {most_common_fuel}")
```
**Explanation:**
- The `mode()` function returns a Series containing the most frequent value(s) in the `Fuel_Type` column; `[0]` selects the first of them.
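
When two values are equally frequent, `mode()` returns both, sorted, so `[0]` silently picks one of them. The sketch below illustrates this with a hypothetical `fuel` Series built to contain a tie:

```python
import pandas as pd

# Hypothetical fuel types with a deliberate tie between Petrol and Diesel
fuel = pd.Series(["Petrol", "Diesel", "Petrol", "Diesel", "Hybrid"], name="Fuel_Type")

modes = fuel.mode()            # all most-frequent values, in sorted order
counts = fuel.value_counts()   # frequency of every value, for context

print(list(modes))
print(counts)
```

Checking `len(modes)` or printing `value_counts()` shows whether the "most common" value is actually unique.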

---

### **4. How many cars have a mileage greater than 200,000?**
**Python Code:**
```python
# Filter cars with mileage > 200,000
high_mileage_cars = df[df['Mileage'] > 200000]
print(f"Number of cars with mileage > 200,000: {len(high_mileage_cars)}")
```
**Explanation:**
- We filter the DataFrame using a condition (`Mileage > 200000`) and count the rows using `len()`.
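
The same count can be obtained without building a filtered DataFrame, because a Boolean mask sums to the number of `True` entries. A minimal sketch on a hypothetical `mileage` Series:

```python
import pandas as pd

# Hypothetical mileage values; not taken from car_price_dataset.csv
mileage = pd.Series([50_000, 210_000, 200_000, 305_000], name="Mileage")

mask = mileage > 200_000    # True where the condition holds (200,000 itself is excluded)
count = int(mask.sum())     # True counts as 1, False as 0
share = mask.mean()         # fraction of rows that satisfy the condition

print(f"{count} cars ({share:.0%}) have mileage > 200,000")
```

The mean of the mask gives the share of matching rows, which is often easier to interpret than the raw count.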

---

### **5. What is the distribution of cars by transmission type?**
**Python Code:**
```python
# Count the number of cars by transmission type
transmission_distribution = df['Transmission'].value_counts()
print("Distribution of cars by transmission type:")
print(transmission_distribution)
```
**Explanation:**
- `value_counts()` counts the occurrences of each unique value in the `Transmission` column.
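
If proportions are more useful than raw counts, `value_counts()` accepts `normalize=True`. The sketch below uses a hypothetical `transmission` Series:

```python
import pandas as pd

# Hypothetical transmission types; not taken from car_price_dataset.csv
transmission = pd.Series(
    ["Manual", "Manual", "Automatic", "Semi-Automatic"], name="Transmission"
)

counts = transmission.value_counts()                 # absolute frequencies
shares = transmission.value_counts(normalize=True)   # relative frequencies, summing to 1

print(counts)
print(shares)
```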

---

### **6. Which car model has the highest price?**
**Python Code:**
```python
# Find the car with the highest price
highest_price_car = df.loc[df['Price'].idxmax()]
print(f"The car with the highest price is: {highest_price_car['Brand']} {highest_price_car['Model']} (${highest_price_car['Price']})")
```
**Explanation:**
- `idxmax()` finds the index of the car with the highest price, and `loc[]` retrieves the corresponding row.

---

### **7. What is the average price of cars by fuel type?**
**Python Code:**
```python
# Group by 'Fuel_Type' and calculate the mean price
avg_price_by_fuel = df.groupby('Fuel_Type')['Price'].mean()
print("Average price by fuel type:")
print(avg_price_by_fuel)
```
**Explanation:**
- We group the data by `Fuel_Type` and calculate the mean price for each fuel type.
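
A group mean is easier to judge when the group size is shown next to it. As a sketch, `agg()` can compute both in one pass; the `toy` DataFrame here is hypothetical:

```python
import pandas as pd

# Hypothetical toy data; not taken from car_price_dataset.csv
toy = pd.DataFrame({
    "Fuel_Type": ["Diesel", "Diesel", "Electric"],
    "Price": [8000, 10000, 12000],
})

# One row per fuel type, with the mean price and the number of cars behind it
summary = toy.groupby("Fuel_Type")["Price"].agg(["mean", "count"])

print(summary)
```

The same pattern applies to any of the grouped averages in this list, such as engine size by brand or price by year.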

---

### **8. How many cars are from the year 2020 or later?**
**Python Code:**
```python
# Filter cars from 2020 or later
recent_cars = df[df['Year'] >= 2020]
print(f"Number of cars from 2020 or later: {len(recent_cars)}")
```
**Explanation:**
- We filter the DataFrame using a condition (`Year >= 2020`) and count the rows using `len()`.

---

### **9. What is the average engine size for each brand?**
**Python Code:**
```python
# Group by 'Brand' and calculate the mean engine size
avg_engine_by_brand = df.groupby('Brand')['Engine_Size'].mean()
print("Average engine size by brand:")
print(avg_engine_by_brand)
```
**Explanation:**
- We group the data by `Brand` and calculate the mean engine size for each brand.

---

### **10. Which car has the lowest mileage?**
**Python Code:**
```python
# Find the car with the lowest mileage
lowest_mileage_car = df.loc[df['Mileage'].idxmin()]
print(f"The car with the lowest mileage is: {lowest_mileage_car['Brand']} {lowest_mileage_car['Model']} ({lowest_mileage_car['Mileage']} miles)")
```
**Explanation:**
- `idxmin()` finds the index of the car with the lowest mileage, and `loc[]` retrieves the corresponding row.

---

### **11. What is the correlation between mileage and price?**
**Python Code:**
```python
# Calculate the correlation between mileage and price
correlation = df['Mileage'].corr(df['Price'])
print(f"Correlation between mileage and price: {correlation:.2f}")
```
**Explanation:**
- The `corr()` function calculates the Pearson correlation coefficient between `Mileage` and `Price`.
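
For reference, the Pearson coefficient is the covariance of the two columns divided by the product of their standard deviations. The sketch below checks this against `corr()` on hypothetical `mileage` and `price` Series:

```python
import pandas as pd

# Hypothetical values; not taken from car_price_dataset.csv
mileage = pd.Series([10_000, 50_000, 120_000, 200_000], name="Mileage")
price = pd.Series([15_000, 12_000, 8_000, 5_000], name="Price")

# Built-in Pearson correlation (the default method of Series.corr)
r_builtin = mileage.corr(price)

# Same quantity from its definition: cov(x, y) / (std(x) * std(y))
r_manual = mileage.cov(price) / (mileage.std() * price.std())

print(f"corr(): {r_builtin:.4f}, manual: {r_manual:.4f}")
```

A value near -1 means price falls almost linearly as mileage rises; a value near 0 means no linear relationship.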

---

### **12. How many cars have more than 2 owners?**
**Python Code:**
```python
# Filter cars with more than 2 owners
cars_more_than_2_owners = df[df['Owner_Count'] > 2]
print(f"Number of cars with more than 2 owners: {len(cars_more_than_2_owners)}")
```
**Explanation:**
- We filter the DataFrame using a condition (`Owner_Count > 2`) and count the rows using `len()`.

---

### **13. What is the average price of cars by year?**
**Python Code:**
```python
# Group by 'Year' and calculate the mean price
avg_price_by_year = df.groupby('Year')['Price'].mean()
print("Average price by year:")
print(avg_price_by_year)
```
**Explanation:**
- We group the data by `Year` and calculate the mean price for each year.

---

### **14. Which car has the highest number of doors?**
**Python Code:**
```python
# Find the car with the highest number of doors
max_doors_car = df.loc[df['Doors'].idxmax()]
print(f"The car with the highest number of doors is: {max_doors_car['Brand']} {max_doors_car['Model']} ({max_doors_car['Doors']} doors)")
```
**Explanation:**
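- `idxmax()` finds the index of the first car with the highest number of doors, and `loc[]` retrieves the corresponding row. Many cars are likely to share the maximum door count, so the result is only the first such row, not a unique car.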
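
When the maximum is shared, comparing the column against its `max()` returns every matching row instead of just the first. A minimal sketch on a hypothetical `toy` DataFrame:

```python
import pandas as pd

# Hypothetical toy data with two cars sharing the maximum door count
toy = pd.DataFrame({
    "Brand": ["Ford", "Volkswagen", "Kia", "Honda"],
    "Model": ["Fiesta", "Golf", "Rio", "Civic"],
    "Doors": [3, 5, 5, 4],
})

# idxmax() stops at the first row holding the maximum
first_max = toy.loc[toy["Doors"].idxmax()]

# A Boolean filter keeps every row holding the maximum
all_max = toy[toy["Doors"] == toy["Doors"].max()]

print(f"First match: {first_max['Brand']} {first_max['Model']}")
print(f"Cars with the maximum door count: {len(all_max)}")
```

The same approach applies to any low-cardinality column, such as `Owner_Count`.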

---

### **15. What is the average mileage for each fuel type?**
**Python Code:**
```python
# Group by 'Fuel_Type' and calculate the mean mileage
avg_mileage_by_fuel = df.groupby('Fuel_Type')['Mileage'].mean()
print("Average mileage by fuel type:")
print(avg_mileage_by_fuel)
```
**Explanation:**
- We group the data by `Fuel_Type` and calculate the mean mileage for each fuel type.

---

### **16. How many cars are from the brand 'Toyota'?**
**Python Code:**
```python
# Filter cars from the brand 'Toyota'
toyota_cars = df[df['Brand'] == 'Toyota']
print(f"Number of Toyota cars: {len(toyota_cars)}")
```
**Explanation:**
- We filter the DataFrame using a condition (`Brand == 'Toyota'`) and count the rows using `len()`.

---

### **17. What is the average price of cars with automatic transmission?**
**Python Code:**
```python
# Filter cars with automatic transmission
automatic_cars = df[df['Transmission'] == 'Automatic']

# Calculate the average price
avg_price_automatic = automatic_cars['Price'].mean()
print(f"The average price of cars with automatic transmission is: ${avg_price_automatic:.2f}")
```
**Explanation:**
- We filter the DataFrame using a condition (`Transmission == 'Automatic'`) and calculate the mean price.
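
The equality filter matches labels exactly, so it is worth listing the labels that actually occur before filtering. The sketch below, on a hypothetical `toy` DataFrame, also shows the filter and column selection combined in a single `loc[]` call:

```python
import pandas as pd

# Hypothetical toy data; not taken from car_price_dataset.csv
toy = pd.DataFrame({
    "Transmission": ["Automatic", "Manual", "Automatic", "Semi-Automatic"],
    "Price": [9000, 7000, 11000, 8000],
})

# Inspect the distinct labels first; 'Semi-Automatic' is a separate category
labels = sorted(toy["Transmission"].unique())

# Select the Price column for rows whose label is exactly 'Automatic'
is_automatic = toy["Transmission"] == "Automatic"
avg_price_automatic = toy.loc[is_automatic, "Price"].mean()

print(labels)
print(f"Average automatic price: ${avg_price_automatic:.2f}")
```

Because the comparison is exact, `Semi-Automatic` cars are not included in the automatic average.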

---

### **18. Which car has the highest owner count?**
**Python Code:**
```python
# Find the car with the highest owner count
max_owner_car = df.loc[df['Owner_Count'].idxmax()]
print(f"The car with the highest owner count is: {max_owner_car['Brand']} {max_owner_car['Model']} ({max_owner_car['Owner_Count']} owners)")
```
**Explanation:**
- `idxmax()` finds the index of the first car with the highest owner count, and `loc[]` retrieves the corresponding row. If several cars share that owner count, only the first of them is returned.

---

### **19. What is the average engine size for cars with diesel fuel type?**
**Python Code:**
```python
# Filter cars with diesel fuel type
diesel_cars = df[df['Fuel_Type'] == 'Diesel']

# Calculate the average engine size
avg_engine_diesel = diesel_cars['Engine_Size'].mean()
print(f"The average engine size for diesel cars is: {avg_engine_diesel:.2f}")
```
**Explanation:**
- We filter the DataFrame using a condition (`Fuel_Type == 'Diesel'`) and calculate the mean engine size.

---

### **20. How many cars have a price greater than $10,000?**
**Python Code:**
```python
# Filter cars with price > $10,000
high_price_cars = df[df['Price'] > 10000]
print(f"Number of cars with price > $10,000: {len(high_price_cars)}")
```
**Explanation:**
- We filter the DataFrame using a condition (`Price > 10000`) and count the rows using `len()`.

---

These questions and code snippets provide insights into the dataset and demonstrate how to manipulate and analyze data using Python and pandas.

In [None]:
import pandas as pd

# Load the dataset
df = pd.read_csv('car_price_dataset.csv')

# Calculate the average price
average_price = df['Price'].mean()
print(f"The average price of all cars is: ${average_price:.2f}")

The average price of all cars is: $8852.96


In [None]:
# Group by 'Brand' and calculate the mean price
brand_avg_price = df.groupby('Brand')['Price'].mean()

# Find the brand with the highest average price
highest_avg_brand = brand_avg_price.idxmax()
print(f"The brand with the highest average price is: {highest_avg_brand}")

The brand with the highest average price is: Chevrolet


In [None]:
# Find the most common fuel type
most_common_fuel = df['Fuel_Type'].mode()[0]
print(f"The most common fuel type is: {most_common_fuel}")

The most common fuel type is: Electric


In [None]:
# Filter cars with mileage > 200,000
high_mileage_cars = df[df['Mileage'] > 200000]
print(f"Number of cars with mileage > 200,000: {len(high_mileage_cars)}")

Number of cars with mileage > 200,000: 3290


In [None]:
# Count the number of cars by transmission type
transmission_distribution = df['Transmission'].value_counts()
print("Distribution of cars by transmission type:")
print(transmission_distribution)

Distribution of cars by transmission type:
Transmission
Manual            3372
Automatic         3317
Semi-Automatic    3311
Name: count, dtype: int64


In [None]:
# Find the car with the highest price
highest_price_car = df.loc[df['Price'].idxmax()]
print(f"The car with the highest price is: {highest_price_car['Brand']} {highest_price_car['Model']} (${highest_price_car['Price']})")

The car with the highest price is: Toyota Corolla ($18301)


In [None]:
# Group by 'Fuel_Type' and calculate the mean price
avg_price_by_fuel = df.groupby('Fuel_Type')['Price'].mean()
print("Average price by fuel type:")
print(avg_price_by_fuel)

Average price by fuel type:
Fuel_Type
Diesel       8117.336385
Electric    10032.220190
Hybrid       9113.030167
Petrol       8070.561826
Name: Price, dtype: float64


In [None]:
# Filter cars from 2020 or later
recent_cars = df[df['Year'] >= 2020]
print(f"Number of cars from 2020 or later: {len(recent_cars)}")

Number of cars from 2020 or later: 1651


In [None]:
# Group by 'Brand' and calculate the mean engine size
avg_engine_by_brand = df.groupby('Brand')['Engine_Size'].mean()
print("Average engine size by brand:")
print(avg_engine_by_brand)

Average engine size by brand:
Brand
Audi          3.032948
BMW           2.976877
Chevrolet     2.993719
Ford          3.044275
Honda         2.925966
Hyundai       2.936683
Kia           3.053791
Mercedes      3.069745
Toyota        2.961443
Volkswagen    3.011078
Name: Engine_Size, dtype: float64


In [None]:
# Find the car with the lowest mileage
lowest_mileage_car = df.loc[df['Mileage'].idxmin()]
print(f"The car with the lowest mileage is: {lowest_mileage_car['Brand']} {lowest_mileage_car['Model']} ({lowest_mileage_car['Mileage']} miles)")

The car with the lowest mileage is: Ford Explorer (25 miles)


In [None]:
# Filter cars with more than 2 owners
cars_more_than_2_owners = df[df['Owner_Count'] > 2]
print(f"Number of cars with more than 2 owners: {len(cars_more_than_2_owners)}")

Number of cars with more than 2 owners: 5944


In [None]:
# Calculate the correlation between mileage and price
correlation = df['Mileage'].corr(df['Price'])
print(f"Correlation between mileage and price: {correlation:.2f}")

Correlation between mileage and price: -0.55


In [None]:
# Group by 'Year' and calculate the mean price
avg_price_by_year = df.groupby('Year')['Price'].mean()
print("Average price by year:")
print(avg_price_by_year)

Average price by year:
Year
2000     5393.735369
2001     5904.064039
2002     5956.751082
2003     6225.834646
2004     6330.725888
2005     6943.037123
2006     7249.961446
2007     7632.513953
2008     7728.418848
2009     8083.917293
2010     8503.419954
2011     8622.855234
2012     9011.446224
2013     9114.208531
2014     9587.455635
2015     9896.774648
2016    10177.600000
2017    10343.757506
2018    10939.192941
2019    11132.722090
2020    11495.284337
2021    11637.813299
2022    12067.690176
2023    12169.470982
Name: Price, dtype: float64


In [None]:
# Find the car with the highest number of doors
max_doors_car = df.loc[df['Doors'].idxmax()]
print(f"The car with the highest number of doors is: {max_doors_car['Brand']} {max_doors_car['Model']} ({max_doors_car['Doors']} doors)")

The car with the highest number of doors is: Volkswagen Golf (5 doors)


In [None]:
# Group by 'Fuel_Type' and calculate the mean mileage
avg_mileage_by_fuel = df.groupby('Fuel_Type')['Mileage'].mean()
print("Average mileage by fuel type:")
print(avg_mileage_by_fuel)

Average mileage by fuel type:
Fuel_Type
Diesel      150261.533041
Electric    151059.307429
Hybrid      145577.587036
Petrol      149917.694606
Name: Mileage, dtype: float64


In [None]:
# Filter cars from the brand 'Toyota'
toyota_cars = df[df['Brand'] == 'Toyota']
print(f"Number of Toyota cars: {len(toyota_cars)}")

Number of Toyota cars: 970


In [None]:
# Filter cars with automatic transmission
automatic_cars = df[df['Transmission'] == 'Automatic']

# Calculate the average price
avg_price_automatic = automatic_cars['Price'].mean()
print(f"The average price of cars with automatic transmission is: ${avg_price_automatic:.2f}")

The average price of cars with automatic transmission is: $9938.25


In [None]:
# Find the car with the highest owner count
max_owner_car = df.loc[df['Owner_Count'].idxmax()]
print(f"The car with the highest owner count is: {max_owner_car['Brand']} {max_owner_car['Model']} ({max_owner_car['Owner_Count']} owners)")

The car with the highest owner count is: Kia Rio (5 owners)


In [None]:
# Filter cars with diesel fuel type
diesel_cars = df[df['Fuel_Type'] == 'Diesel']

# Calculate the average engine size
avg_engine_diesel = diesel_cars['Engine_Size'].mean()
print(f"The average engine size for diesel cars is: {avg_engine_diesel:.2f}")

The average engine size for diesel cars is: 3.01


In [None]:
# Filter cars with price > $10,000
high_price_cars = df[df['Price'] > 10000]
print(f"Number of cars with price > $10,000: {len(high_price_cars)}")

Number of cars with price > $10,000: 3651
