# Auto Analytics: Exploring Car Market Dynamics
This project performs exploratory data analysis (EDA) on a dataset of car listings. It covers data preprocessing, visualization, and analysis to derive insights into the cars' characteristics and trends.

## Table of Contents
- Introduction
- Steps
    - Importing Libraries
    - Loading Data and Data Understanding
    - Data Cleaning
    - Detecting Outliers
    - Visualization
    - Analysis

## Introduction
This project aims to explore and analyze a dataset related to cars, uncovering insights into various attributes such as price, horsepower, and cylinders. By performing EDA, we gain a deeper understanding of the dataset and its underlying patterns.

## Steps
1. Importing Libraries
    - Pandas: Used for data manipulation and analysis.
    - NumPy: Utilized for numerical computations.
    - Seaborn: Employed for data visualization.
    - Matplotlib: Utilized for creating plots and charts.
  
2. Loading Data and Data Understanding
    - The "car45.csv" file is loaded into a pandas DataFrame (car_data in the notebook, abbreviated df in this outline).
    - The first and last five rows (df.head(5), df.tail(5)) are displayed to get a glimpse of the data structure.
    - Basic data information (df.info()) is explored, including data types (df.dtypes) and descriptive statistics (df.describe()).

3. Data Cleaning
    - Irrelevant columns ('Engine Fuel Type', 'Market Category', etc.) are dropped (df.drop(...)).
    - Column names are made more descriptive (df.rename(...)).
    - Duplicate rows are identified (duplicate_rows_df) and removed (df.drop_duplicates()).
    - Missing values (df.isna().sum()) are examined and dropped (df.dropna()).
  
4. Detecting Outliers
    - Boxplots are generated (sns.boxplot(...)) to visualize potential outliers in columns such as price and horsepower.
    - The interquartile range (IQR) is calculated to define outlier thresholds.
    - Rows with values outside the IQR-based thresholds are removed.
      
5. Visualization
    - A bar chart (plt.bar(...)) shows the distribution of car makes.
    - A heatmap (sns.heatmap(...)) visualizes correlations between numerical features.
    - A scatter plot (plt.scatter(...)) explores the relationship between horsepower and price.

6. Analysis
    - Which type of car sells the most?
    - What is the correlation between price and mileage?
    - How many cars are registered?
    - How does the price distribution differ between registered and non-registered cars?
    - How is car price distributed by Engine Value?
    - Which engine type do users prefer most?
    - Establish the correlation between all features using a heatmap.
    - What is the distribution of price?
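The data cleaning step in the outline (drop, rename, de-duplicate, drop missing) can be sketched on a toy frame. The column names are borrowed from the notebook's data, while the values, the choice of which column to drop, and the new names (`year`, `price`) are assumptions made only for illustration.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the real data; the values are made up.
raw = pd.DataFrame(
    {
        "Unnamed: 0": [0, 1, 2, 3],
        "make": ["Toyota", "Toyota", "Honda", "Lexus"],
        "year_of_man": [2007, 2007, 1998, 2013],
        "amount": [3600000, 3600000, 1350000, np.nan],
    }
)

cleaned = (
    raw.drop(columns=["Unnamed: 0"])  # drop a column judged irrelevant (assumed choice)
    .rename(columns={"year_of_man": "year", "amount": "price"})  # hypothetical new names
    .drop_duplicates()  # the two Toyota rows are identical once the id column is gone
    .dropna()  # the Lexus row has no price
    .reset_index(drop=True)
)
print(cleaned)
```

Dropping the identifier-like column before de-duplicating matters here: with it in place, no two rows would compare equal.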

In [4]:
!pip install pandas
!pip install seaborn







### Importing libraries

In [None]:
# Importing libraries
import pandas as pd
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt

### Loading data and understanding data

In [95]:
# Load the listings into a DataFrame and display it
car_data = pd.read_csv("car45.csv")
car_data

| | Unnamed: 0 | car_id | description | amount | region | make | model | year_of_man | color | condition | ... | bought_cond | trim | drive_train | reg_city | seat | num_cylinder | horse_power | body_build | fuel_type | transmission |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 5IQTDBTYmvK1tJwhdvGJfESJ | Lexus ES 350 FWD 2013 Red | 12937500 | Lagos State, Ikeja | Lexus | ES | 2013 | Red | Foreign Used | ... | Imported | 350 FWD | Front Wheel | | 5 | 6.0 | 268.0 | Sedan | Petrol | Automatic |
| 1 | 1 | zpZUGomoVXuKk9UFa8j8moC9 | Land Rover Range Rover 2012 White | 6750000 | Abuja (FCT), Garki 2 | Land Rover | Range Rover | 2012 | White | Nigerian Used | ... | Registered | | | | | | | SUV | Petrol | Automatic |
| 2 | 2 | a6ShZXOX4KtY6IBGJIcF3Cxk | Toyota Sequoia 2018 Black | 50625000 | Lagos State, Lekki | Toyota | Sequoia | 2018 | Black | Foreign Used | ... | Imported | | | | | | | SUV | Petrol | Automatic |
| 3 | 3 | CciPNDN6vhhQQI1FTQHAbfxi | Toyota Corolla 2007 Green | 3600000 | Abuja (FCT), Lugbe District | Toyota | Corolla | 2007 | Green | Nigerian Used | ... | Registered | | | ABUJA | | | | | Petrol | Automatic |
| 4 | 4 | bvwd5LDMx6mIYpVa6Uhi2jqJ | Mercedes-Benz M Class 2005 Silver | 3262500 | Lagos State, Isolo | Mercedes-Benz | M Class | 2005 | Silver | Nigerian Used | ... | Imported | | | ABUJA | | | | | Petrol | Automatic |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2889 | 4348 | sHnZZvambm9Prcwq4IwKPmGi | Mazda CX-7 2011 Beige | 5062500 | Lagos State, Lekki | Mazda | CX-7 | 2011 | Beige | Foreign Used | ... | Imported | | | | | | | SUV | Petrol | Automatic |
| 2890 | 4349 | fsa5b9HiCDnZt2P0Sf9qKLJN | Honda CR-V 1998 Red | 1350000 | Lagos State, Lekki | Honda | CR-V | 1998 | Red | Nigerian Used | ... | Imported | | | Lagos | | | | Crossover | Petrol | Automatic |
| 2891 | 4350 | uE5eXlbAEMfHSMZ0SYahHP8x | Lexus GS 2006 Black | 4725000 | Lagos State, Lekki | Lexus | GS | 2006 | Black | Foreign Used | ... | Imported | | | | | | | | Petrol | Automatic |
| 2892 | 4351 | CzI81oX5BIFITuoFozBtnQLS | Land Rover Range Rover Sport 2008 White | 6930000 | Edo State, Benin City | Land Rover | Range Rover Sport | 2008 | White | Foreign Used | ... | Imported | | | | | | | SUV | Petrol | Automatic |
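The `Unnamed: 0` column in the output above looks like a row index saved by an earlier export, although the source does not confirm this. Under that assumption, the sketch below shows how such a column arises and how `index_col=0` avoids it at load time. The two-row CSV text is hypothetical.

```python
import io

import pandas as pd

# Hypothetical CSV written together with its index: the header of the first
# column is empty, which pandas names "Unnamed: 0" when reading it back.
csv_text = ",make,amount\n0,Lexus,12937500\n1,Toyota,3600000\n"

with_extra = pd.read_csv(io.StringIO(csv_text))
as_index = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(with_extra.columns))  # ['Unnamed: 0', 'make', 'amount']
print(list(as_index.columns))    # ['make', 'amount']
```

Dropping the column after loading (`drop(columns=["Unnamed: 0"])`) leads to the same columns; which route suits the real file depends on whether the old index values are still needed.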


In [96]:
car_data.head()

| | Unnamed: 0 | car_id | description | amount | region | make | model | year_of_man | color | condition | ... | bought_cond | trim | drive_train | reg_city | seat | num_cylinder | horse_power | body_build | fuel_type | transmission |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 5IQTDBTYmvK1tJwhdvGJfESJ | Lexus ES 350 FWD 2013 Red | 12937500 | Lagos State, Ikeja | Lexus | ES | 2013 | Red | Foreign Used | ... | Imported | 350 FWD | Front Wheel | | 5.0 | 6.0 | 268.0 | Sedan | Petrol | Automatic |
| 1 | 1 | zpZUGomoVXuKk9UFa8j8moC9 | Land Rover Range Rover 2012 White | 6750000 | Abuja (FCT), Garki 2 | Land Rover | Range Rover | 2012 | White | Nigerian Used | ... | Registered | | | | | | | SUV | Petrol | Automatic |
| 2 | 2 | a6ShZXOX4KtY6IBGJIcF3Cxk | Toyota Sequoia 2018 Black | 50625000 | Lagos State, Lekki | Toyota | Sequoia | 2018 | Black | Foreign Used | ... | Imported | | | | | | | SUV | Petrol | Automatic |
| 3 | 3 | CciPNDN6vhhQQI1FTQHAbfxi | Toyota Corolla 2007 Green | 3600000 | Abuja (FCT), Lugbe District | Toyota | Corolla | 2007 | Green | Nigerian Used | ... | Registered | | | ABUJA | | | | | Petrol | Automatic |
| 4 | 4 | bvwd5LDMx6mIYpVa6Uhi2jqJ | Mercedes-Benz M Class 2005 Silver | 3262500 | Lagos State, Isolo | Mercedes-Benz | M Class | 2005 | Silver | Nigerian Used | ... | Imported | | | ABUJA | | | | | Petrol | Automatic |


In [97]:
car_data.tail()

| | Unnamed: 0 | car_id | description | amount | region | make | model | year_of_man | color | condition | ... | bought_cond | trim | drive_train | reg_city | seat | num_cylinder | horse_power | body_build | fuel_type | transmission |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2889 | 4348 | sHnZZvambm9Prcwq4IwKPmGi | Mazda CX-7 2011 Beige | 5062500 | Lagos State, Lekki | Mazda | CX-7 | 2011 | Beige | Foreign Used | ... | Imported | | | | | | | SUV | Petrol | Automatic |
| 2890 | 4349 | fsa5b9HiCDnZt2P0Sf9qKLJN | Honda CR-V 1998 Red | 1350000 | Lagos State, Lekki | Honda | CR-V | 1998 | Red | Nigerian Used | ... | Imported | | | Lagos | | | | Crossover | Petrol | Automatic |
| 2891 | 4350 | uE5eXlbAEMfHSMZ0SYahHP8x | Lexus GS 2006 Black | 4725000 | Lagos State, Lekki | Lexus | GS | 2006 | Black | Foreign Used | ... | Imported | | | | | | | | Petrol | Automatic |
| 2892 | 4351 | CzI81oX5BIFITuoFozBtnQLS | Land Rover Range Rover Sport 2008 White | 6930000 | Edo State, Benin City | Land Rover | Range Rover Sport | 2008 | White | Foreign Used | ... | Imported | | | | | | | SUV | Petrol | Automatic |
| 2893 | 4352 | 41KZsjPRmovhziJJQ3gkw37Q | Toyota Corolla Sedan 2005 White | 2600000 | Abuja (FCT), Lugbe District | Toyota | Corolla | 2005 | White | Nigerian Used | ... | Imported | Sedan | Front | ABUJA | 5.0 | 4.0 | 90.0 | Sedan | Petrol | Automatic |


In [98]:
car_data.describe()

| | Unnamed: 0 | amount | year_of_man | mileage | engine_size | num_cylinder | horse_power |
|---|---|---|---|---|---|---|---|
| count | 2894.0 | 2894.0 | 2894.0 | 2829.0 | 2846.0 | 369.0 | 338.0 |
| mean | 2174.766413 | 4986147.0 | 2007.782654 | 243347.5 | 3079.524596 | 5.411924 | 216.816568 |
| std | 1507.677422 | 6177222.0 | 4.565697 | 1416721.0 | 3753.377371 | 1.301533 | 69.989033 |
| min | 0.0 | 577500.0 | 1988.0 | 0.0 | 25.0 | 4.0 | 67.0 |
| 25% | 728.25 | 2166000.0 | 2005.0 | 130059.0 | 2300.0 | 4.0 | 158.0 |
| 50% | 1641.5 | 3203125.0 | 2007.0 | 191398.0 | 3000.0 | 6.0 | 225.0 |
| 75% | 3617.75 | 5250000.0 | 2011.0 | 265988.0 | 3500.0 | 6.0 | 269.75 |
| max | 4352.0 | 98700000.0 | 2023.0 | 74026750.0 | 158713.0 | 8.0 | 477.0 |
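The quartiles reported above are enough to work out IQR-based outlier thresholds by hand. The sketch below does this for `mileage`, using the common 1.5 × IQR rule. The multiplier is an assumption, since the outline only says that the IQR defines the thresholds.

```python
import pandas as pd

# 25% and 75% quantiles of `mileage`, copied from the describe() output.
q1, q3 = 130059.0, 265988.0

iqr = q3 - q1
lower = q1 - 1.5 * iqr  # assumed multiplier of 1.5
upper = q3 + 1.5 * iqr
print(iqr, lower, upper)  # 135929.0 -73834.5 469881.5

# The five summary values of `mileage` (min, quartiles, max) as a tiny sample.
mileage = pd.Series([0.0, 130059.0, 191398.0, 265988.0, 74026750.0])
inliers = mileage[mileage.between(lower, upper)]
print(len(inliers))  # 4
```

Under these thresholds the reported maximum of 74,026,750 falls far above the upper fence, while the negative lower fence never excludes anything for a non-negative quantity like mileage.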


car_data.shape (an attribute, not a method, so it takes no parentheses)

In [99]:
car_data.shape[0]  # number of rows (not displayed: a cell shows only its last expression)
car_data.shape[1]  # number of columns

23

In [100]:
car_data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2894 entries, 0 to 2893
Data columns (total 23 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   Unnamed: 0    2894 non-null   int64  
 1   car_id        2894 non-null   object 
 2   description   2894 non-null   object 
 3   amount        2894 non-null   int64  
 4   region        2894 non-null   object 
 5   make          2894 non-null   object 
 6   model         2894 non-null   object 
 7   year_of_man   2894 non-null   int64  
 8   color         2893 non-null   object 
 9   condition     2894 non-null   object 
 10  mileage       2829 non-null   float64
 11  engine_size   2846 non-null   float64
 12  selling_cond  2877 non-null   object 
 13  bought_cond   2877 non-null   object 
 14  trim          394 non-null    object 
 15  drive_train   361 non-null    object 
 16  reg_city      1669 non-null   object 
 17  seat          345 non-null    object 
 18  num_cylinder  369 non-null  
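The listing above reports `seat` as `object`, even though the previewed rows show numeric-looking values such as 5.0. One way to handle such a column, sketched here on made-up values, is `pd.to_numeric` with `errors="coerce"`, which turns anything unparseable into NaN. Whether the real column contains stray strings is not known from the source.

```python
import pandas as pd

# Made-up values; the real `seat` column may hold other strings.
seat = pd.Series(["5", "5.0", None, "unknown"], dtype="object")

seat_num = pd.to_numeric(seat, errors="coerce")
print(seat_num.dtype)         # float64
print(seat_num.isna().sum())  # 2
```

Comparing the NaN count before and after the conversion shows how many entries were non-numeric text rather than genuinely missing.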

In [101]:
car_data.isnull().sum()

Unnamed: 0         0
car_id             0
description        0
amount             0
region             0
make               0
model              0
year_of_man        0
color              1
condition          0
mileage           65
engine_size       48
selling_cond      17
bought_cond       17
trim            2500
drive_train     2533
reg_city        1225
seat            2549
num_cylinder    2525
horse_power     2556
body_build      1637
fuel_type         18
transmission       1
dtype: int64

In [102]:
car_data.isnull().sum()/len(car_data)

Unnamed: 0      0.000000
car_id          0.000000
description     0.000000
amount          0.000000
region          0.000000
make            0.000000
model           0.000000
year_of_man     0.000000
color           0.000346
condition       0.000000
mileage         0.022460
engine_size     0.016586
selling_cond    0.005874
bought_cond     0.005874
trim            0.863856
drive_train     0.875259
reg_city        0.423290
seat            0.880788
num_cylinder    0.872495
horse_power     0.883207
body_build      0.565653
fuel_type       0.006220
transmission    0.000346
dtype: float64
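The missing-value shares above can be turned into a simple column filter. The sketch below uses a toy frame and an assumed cut-off of 50%. Neither the threshold nor the decision to drop sparse columns comes from the source.

```python
import numpy as np
import pandas as pd

# Toy frame with made-up values and different amounts of missing data.
df = pd.DataFrame(
    {
        "amount": [1, 2, 3, 4, 5],
        "mileage": [10.0, np.nan, 30.0, 40.0, 50.0],             # 20% missing
        "horse_power": [np.nan, np.nan, np.nan, np.nan, 268.0],  # 80% missing
    }
)

# isnull().mean() is equivalent to isnull().sum() / len(df).
missing_share = df.isnull().mean()

threshold = 0.5  # assumed cut-off
sparse_cols = missing_share[missing_share > threshold].index.tolist()
trimmed = df.drop(columns=sparse_cols)

print(sparse_cols)            # ['horse_power']
print(list(trimmed.columns))  # ['amount', 'mileage']
```

Dropping sparse columns before calling `dropna()` keeps far more rows than dropping rows first, because a row is no longer discarded just for lacking a value in a mostly empty column.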

In [103]:
car_data.duplicated().sum()

np.int64(0)
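A count of zero here may partly reflect that every row carries its own identifier: `duplicated()` compares all columns, so per-row values such as `Unnamed: 0` and `car_id` prevent any two rows from matching. The sketch below, on made-up rows, repeats the check with those two columns excluded. Treating them as identifiers is an assumption.

```python
import pandas as pd

# Made-up rows: the first two describe the same car under different ids.
df = pd.DataFrame(
    {
        "Unnamed: 0": [0, 1, 2],
        "car_id": ["a1", "b2", "c3"],
        "description": [
            "Toyota Corolla 2007 Green",
            "Toyota Corolla 2007 Green",
            "Honda CR-V 1998 Red",
        ],
        "amount": [3600000, 3600000, 1350000],
    }
)

all_cols = int(df.duplicated().sum())

# Compare only the columns that describe the car itself.
content_cols = df.columns.difference(["Unnamed: 0", "car_id"]).tolist()
by_content = int(df.duplicated(subset=content_cols).sum())

print(all_cols, by_content)  # 0 1
```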

In [104]:
car_data.columns

Index(['Unnamed: 0', 'car_id', 'description', 'amount', 'region', 'make',
       'model', 'year_of_man', 'color', 'condition', 'mileage', 'engine_size',
       'selling_cond', 'bought_cond', 'trim', 'drive_train', 'reg_city',
       'seat', 'num_cylinder', 'horse_power', 'body_build', 'fuel_type',
       'transmission'],
      dtype='object')


In [105]:
car_data["bought_cond"]

0         Imported
1       Registered
2         Imported
3       Registered
4         Imported
           ...    
2889      Imported
2890      Imported
2891      Imported
2892      Imported
2893      Imported
Name: bought_cond, Length: 2894, dtype: object

In [106]:
car_data["bought_cond"].isnull().sum()

np.int64(17)
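To see the missing entries next to the category counts, `value_counts(dropna=False)` is one option. The sketch below uses a made-up five-value series with the two labels seen in the output above, and relates to the outline's question about how many cars are registered.

```python
import numpy as np
import pandas as pd

# Made-up sample using the two labels that appear in the real column.
bought_cond = pd.Series(
    ["Imported", "Registered", "Imported", np.nan, "Imported"],
    name="bought_cond",
)

# dropna=False keeps NaN as its own row in the tally.
counts = bought_cond.value_counts(dropna=False)
print(counts)

# Share of rows labelled "Registered"; NaN compares as False.
registered_share = (bought_cond == "Registered").mean()
print(registered_share)  # 0.2
```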