# Electric Vehicle Data Analysis

This Jupyter Notebook provides an analysis of the Electric Vehicle Population Data using Python and the pandas library. The dataset contains information about various electric vehicles, including their make, model, electric range, state, city, county and more.

## Table of Contents
1. [Dataset Overview](#dataset-overview)
2. [Data Cleaning](#data-cleaning)
3. [Exploratory Data Analysis](#exploratory-data-analysis)
4. [Conclusion](#conclusion)

## Dataset Overview
- The dataset was read with the pandas `read_csv()` function to facilitate data manipulation and analysis.
- The first few rows were examined with the `head()` method for an initial look at the columns and their values.
- The overall structure, non-null counts, and data types were inspected with the `info()` method.
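
The inspection steps above can be sketched on a small in-memory sample. The three rows below are made up for illustration and are not taken from the dataset file, and `missing_per_column` is a name introduced here. `isna().sum()` is shown as a complement to `info()`, since it reports the missing values per column directly.

```python
import io
import pandas as pd

# Hypothetical three-row sample standing in for the real CSV file.
csv_text = """County,City,State,Postal Code,Model Year,Make
Yakima,Yakima,WA,98902,2014,FIAT
Thurston,Olympia,WA,98513,2017,TESLA
,,CA,,2020,TESLA
"""

sample = pd.read_csv(io.StringIO(csv_text))

# First rows, as head() would show them for the full dataset.
print(sample.head())

# Missing values per column; info() reports the complement (non-null counts).
missing_per_column = sample.isna().sum()
print(missing_per_column)
```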

In [82]:
# Import relevant libraries
import pandas as pd

In [83]:
# Read electric_vehicle dataset using pandas
electric_vehicles = pd.read_csv('Electric_Vehicle_Population_Data.csv')
electric_vehicles.head()

Unnamed: 0,VIN (1-10),County,City,State,Postal Code,Model Year,Make,Model,Electric Vehicle Type,Clean Alternative Fuel Vehicle (CAFV) Eligibility,Electric Range,Base MSRP,Legislative District,DOL Vehicle ID,Vehicle Location,Electric Utility,2020 Census Tract
0,3C3CFFGE4E,Yakima,Yakima,WA,98902.0,2014,FIAT,500,Battery Electric Vehicle (BEV),Clean Alternative Fuel Vehicle Eligible,87,0,14.0,1593721,POINT (-120.524012 46.5973939),PACIFICORP,53077000000.0
1,5YJXCBE40H,Thurston,Olympia,WA,98513.0,2017,TESLA,MODEL X,Battery Electric Vehicle (BEV),Clean Alternative Fuel Vehicle Eligible,200,0,2.0,257167501,POINT (-122.817545 46.98876),PUGET SOUND ENERGY INC,53067010000.0
2,3MW39FS03P,King,Renton,WA,98058.0,2023,BMW,330E,Plug-in Hybrid Electric Vehicle (PHEV),Not eligible due to low battery range,20,0,11.0,224071816,POINT (-122.1298876 47.4451257),PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA),53033030000.0
3,7PDSGABA8P,Snohomish,Bothell,WA,98012.0,2023,RIVIAN,R1S,Battery Electric Vehicle (BEV),Eligibility unknown as battery range has not b...,0,0,21.0,260084653,POINT (-122.1873 47.820245),PUGET SOUND ENERGY INC,53061050000.0
4,5YJ3E1EB8L,King,Kent,WA,98031.0,2020,TESLA,MODEL 3,Battery Electric Vehicle (BEV),Clean Alternative Fuel Vehicle Eligible,322,0,33.0,253771913,POINT (-122.2012521 47.3931814),PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA),53033030000.0


In [84]:
# Check all info of the dataset
electric_vehicles.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 166800 entries, 0 to 166799
Data columns (total 17 columns):
 #   Column                                             Non-Null Count   Dtype  
---  ------                                             --------------   -----  
 0   VIN (1-10)                                         166800 non-null  object 
 1   County                                             166795 non-null  object 
 2   City                                               166795 non-null  object 
 3   State                                              166800 non-null  object 
 4   Postal Code                                        166795 non-null  float64
 5   Model Year                                         166800 non-null  int64  
 6   Make                                               166800 non-null  object 
 7   Model                                              166800 non-null  object 
 8   Electric Vehicle Type                              166800 non-null  object

## Data Cleaning
- The 'Postal Code' column, initially of type float64, was converted to a string so that postal codes are handled as labels rather than as numbers.
- The 'Electric Vehicle Type' column was split into 'Electric Vehicle Type Description' and 'Electric Vehicle Type Code'.
- Unnecessary columns, such as 'Base MSRP' and 'Legislative District', were dropped to focus on relevant features.

In [85]:
# Convert the Postal Code column from float64 to str
electric_vehicles['Postal Code'] = electric_vehicles['Postal Code'].astype('str')
electric_vehicles.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 166800 entries, 0 to 166799
Data columns (total 17 columns):
 #   Column                                             Non-Null Count   Dtype  
---  ------                                             --------------   -----  
 0   VIN (1-10)                                         166800 non-null  object 
 1   County                                             166795 non-null  object 
 2   City                                               166795 non-null  object 
 3   State                                              166800 non-null  object 
 4   Postal Code                                        166800 non-null  object 
 5   Model Year                                         166800 non-null  int64  
 6   Make                                               166800 non-null  object 
 7   Model                                              166800 non-null  object 
 8   Electric Vehicle Type                              166800 non-null  object
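
In the output above, the non-null count for 'Postal Code' rises from 166795 to 166800, which suggests the five missing values were turned into text by `astype('str')`; the later `head()` outputs also still show values such as `98902.0`. A possible alternative, sketched here on a made-up three-value series (`postal_float` and `postal_text` are names introduced for this example), is to pass through the nullable `Int64` dtype before converting to the `string` dtype.

```python
import pandas as pd

# Hypothetical sample: two postal codes and one missing value, stored as float64.
postal_float = pd.Series([98902.0, 98513.0, None], name="Postal Code")

# Plain astype(str) keeps the ".0" suffix of the float representation.
naive = postal_float.astype(str)

# Going through nullable Int64 drops the ".0" and keeps missing values missing.
postal_text = postal_float.astype("Int64").astype("string")

print(naive.tolist())
print(postal_text.tolist())
```

This only works when every non-missing value is a whole number, which is an assumption about the postal codes in this dataset.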

In [86]:
# Check that Electric Vehicle Type takes a fixed set of values by using value_counts()
electric_type_counts = electric_vehicles['Electric Vehicle Type'].value_counts()
electric_type_counts

Electric Vehicle Type
Battery Electric Vehicle (BEV)            130293
Plug-in Hybrid Electric Vehicle (PHEV)     36507
Name: count, dtype: int64

In [87]:
# Split Electric Vehicle Type into separate description and code columns
electric_vehicles[['Electric Vehicle Type Description','Electric Vehicle Type Code']] = electric_vehicles['Electric Vehicle Type'].str.split('(',expand=True)
# Remove the trailing space that the split leaves at the end of the description
electric_vehicles['Electric Vehicle Type Description'] = electric_vehicles['Electric Vehicle Type Description'].str.strip()
# Strip the closing parenthesis from the vehicle type code
electric_vehicles['Electric Vehicle Type Code'] = electric_vehicles['Electric Vehicle Type Code'].str.rstrip(')')
# Check if columns have been split correctly
electric_vehicles.head()

Unnamed: 0,VIN (1-10),County,City,State,Postal Code,Model Year,Make,Model,Electric Vehicle Type,Clean Alternative Fuel Vehicle (CAFV) Eligibility,Electric Range,Base MSRP,Legislative District,DOL Vehicle ID,Vehicle Location,Electric Utility,2020 Census Tract,Electric Vehicle Type Description,Electric Vehicle Type Code
0,3C3CFFGE4E,Yakima,Yakima,WA,98902.0,2014,FIAT,500,Battery Electric Vehicle (BEV),Clean Alternative Fuel Vehicle Eligible,87,0,14.0,1593721,POINT (-120.524012 46.5973939),PACIFICORP,53077000000.0,Battery Electric Vehicle,BEV
1,5YJXCBE40H,Thurston,Olympia,WA,98513.0,2017,TESLA,MODEL X,Battery Electric Vehicle (BEV),Clean Alternative Fuel Vehicle Eligible,200,0,2.0,257167501,POINT (-122.817545 46.98876),PUGET SOUND ENERGY INC,53067010000.0,Battery Electric Vehicle,BEV
2,3MW39FS03P,King,Renton,WA,98058.0,2023,BMW,330E,Plug-in Hybrid Electric Vehicle (PHEV),Not eligible due to low battery range,20,0,11.0,224071816,POINT (-122.1298876 47.4451257),PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA),53033030000.0,Plug-in Hybrid Electric Vehicle,PHEV
3,7PDSGABA8P,Snohomish,Bothell,WA,98012.0,2023,RIVIAN,R1S,Battery Electric Vehicle (BEV),Eligibility unknown as battery range has not b...,0,0,21.0,260084653,POINT (-122.1873 47.820245),PUGET SOUND ENERGY INC,53061050000.0,Battery Electric Vehicle,BEV
4,5YJ3E1EB8L,King,Kent,WA,98031.0,2020,TESLA,MODEL 3,Battery Electric Vehicle (BEV),Clean Alternative Fuel Vehicle Eligible,322,0,33.0,253771913,POINT (-122.2012521 47.3931814),PUGET SOUND ENERGY INC||CITY OF TACOMA - (WA),53033030000.0,Battery Electric Vehicle,BEV
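
The same split can also be written as a single regular-expression extraction. This is an alternative sketch, not the approach used in the notebook; the group names `description` and `code` and the variable names are chosen here for illustration.

```python
import pandas as pd

# The two values that appear in the Electric Vehicle Type column.
vehicle_type = pd.Series(
    ["Battery Electric Vehicle (BEV)", "Plug-in Hybrid Electric Vehicle (PHEV)"],
    name="Electric Vehicle Type",
)

# One named group per target column; \s* absorbs the space before the parenthesis.
pattern = r"^(?P<description>.+?)\s*\((?P<code>[^)]+)\)$"
parts = vehicle_type.str.extract(pattern)

print(parts)
```

Rows that do not match the pattern would come back as missing values in both columns, which makes unexpected formats easy to spot.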


In [88]:
# Convert the Electric Vehicle Type Description and Code columns from object to the category dtype
electric_vehicles['Electric Vehicle Type Description'] = electric_vehicles['Electric Vehicle Type Description'].astype('category')
electric_vehicles['Electric Vehicle Type Code'] = electric_vehicles['Electric Vehicle Type Code'].astype('category')
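
A rough way to see what the category conversion buys is to compare memory use before and after. The sketch below uses a made-up column of 10,000 repeated labels rather than the real data, and the variable names are introduced here.

```python
import pandas as pd

# Hypothetical column: two labels repeated many times, like the type code column.
codes = pd.Series(["BEV", "PHEV"] * 5000)
codes_cat = codes.astype("category")

# deep=True includes the memory held by the string values themselves.
plain_bytes = codes.memory_usage(deep=True)
category_bytes = codes_cat.memory_usage(deep=True)

print(plain_bytes, category_bytes)
print(list(codes_cat.cat.categories))
```

A categorical column stores each distinct label once plus a small integer code per row, so the saving grows with the number of repeated values.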

In [89]:
# Drop columns which are not relevant to analysis
drop_columns = ['Base MSRP','Legislative District','DOL Vehicle ID','Electric Utility','2020 Census Tract']
electric_vehicles = electric_vehicles.drop(labels=drop_columns, axis=1)
electric_vehicles.head()

Unnamed: 0,VIN (1-10),County,City,State,Postal Code,Model Year,Make,Model,Electric Vehicle Type,Clean Alternative Fuel Vehicle (CAFV) Eligibility,Electric Range,Vehicle Location,Electric Vehicle Type Description,Electric Vehicle Type Code
0,3C3CFFGE4E,Yakima,Yakima,WA,98902.0,2014,FIAT,500,Battery Electric Vehicle (BEV),Clean Alternative Fuel Vehicle Eligible,87,POINT (-120.524012 46.5973939),Battery Electric Vehicle,BEV
1,5YJXCBE40H,Thurston,Olympia,WA,98513.0,2017,TESLA,MODEL X,Battery Electric Vehicle (BEV),Clean Alternative Fuel Vehicle Eligible,200,POINT (-122.817545 46.98876),Battery Electric Vehicle,BEV
2,3MW39FS03P,King,Renton,WA,98058.0,2023,BMW,330E,Plug-in Hybrid Electric Vehicle (PHEV),Not eligible due to low battery range,20,POINT (-122.1298876 47.4451257),Plug-in Hybrid Electric Vehicle,PHEV
3,7PDSGABA8P,Snohomish,Bothell,WA,98012.0,2023,RIVIAN,R1S,Battery Electric Vehicle (BEV),Eligibility unknown as battery range has not b...,0,POINT (-122.1873 47.820245),Battery Electric Vehicle,BEV
4,5YJ3E1EB8L,King,Kent,WA,98031.0,2020,TESLA,MODEL 3,Battery Electric Vehicle (BEV),Clean Alternative Fuel Vehicle Eligible,322,POINT (-122.2012521 47.3931814),Battery Electric Vehicle,BEV


## Exploratory Data Analysis
- The dataset was filtered to vehicles marked as Clean Alternative Fuel Vehicle (CAFV) eligible, which were then counted by make.
- Electric vehicles with an electric range greater than or equal to 200 were identified, and their make, model, model year, location, and type were displayed.
- Vehicles were counted per distinct range value at or above 200, and the same analysis was repeated for ranges below 200.
- Battery Electric Vehicles (BEV) and Plug-in Hybrid Electric Vehicles (PHEV) were counted by state to find the states with the highest counts of each type.

In [90]:
# Count Clean Alternative Fuel Vehicle eligible cars by make (value_counts sorts from highest to lowest)
CAFV_vehicles = electric_vehicles[electric_vehicles['Clean Alternative Fuel Vehicle (CAFV) Eligibility'] == 'Clean Alternative Fuel Vehicle Eligible']
CAFV_unique_vehicles = CAFV_vehicles['Make'].value_counts()
CAFV_unique_vehicles


Make
TESLA                   25258
NISSAN                  10827
CHEVROLET                9181
BMW                      3674
KIA                      2919
CHRYSLER                 2878
TOYOTA                   2065
VOLKSWAGEN               1070
VOLVO                    1027
HONDA                     811
FIAT                      801
AUDI                      739
HYUNDAI                   648
FORD                      624
MITSUBISHI                392
SMART                     275
PORSCHE                   217
JAGUAR                    192
LEXUS                     161
POLESTAR                  142
MINI                      120
CADILLAC                   92
MERCEDES-BENZ              91
ALFA ROMEO                 29
DODGE                      28
FISKER                     15
AZURE DYNAMICS              8
LAND ROVER                  7
TH!NK                       5
WHEEGO ELECTRIC CARS        3
Name: count, dtype: int64
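
Counts by make are often easier to compare as shares. The following sketch, on a made-up list of six makes, shows `value_counts(normalize=True)` and `head()` for limiting the output to the top entries; the variable names are introduced here.

```python
import pandas as pd

# Hypothetical sample of makes, not taken from the dataset.
makes = pd.Series(
    ["TESLA", "TESLA", "NISSAN", "TESLA", "CHEVROLET", "NISSAN"], name="Make"
)

counts = makes.value_counts()                 # absolute counts, highest first
shares = makes.value_counts(normalize=True)   # fractions that sum to 1
top_two = counts.head(2)                      # keep only the two most common makes

print(counts)
print(shares)
print(top_two)
```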

In [91]:
# Find list of cars whose electric range is >= 200
cars_high_range = electric_vehicles[electric_vehicles['Electric Range'] >= 200]
cars_high_range = cars_high_range.sort_values(by='Electric Range', ascending=True)
cars_high_range_display = cars_high_range[['Make','Model','Model Year','City','State','Electric Vehicle Type Description','Electric Vehicle Type Code', 'Electric Range']]
cars_high_range_display

Unnamed: 0,Make,Model,Model Year,City,State,Electric Vehicle Type Description,Electric Vehicle Type Code,Electric Range
1,TESLA,MODEL X,2017,Olympia,WA,Battery Electric Vehicle,BEV,200
65678,TESLA,MODEL X,2016,Woodland,WA,Battery Electric Vehicle,BEV,200
127686,TESLA,MODEL X,2016,Colfax,WA,Battery Electric Vehicle,BEV,200
127756,TESLA,MODEL X,2016,Snohomish,WA,Battery Electric Vehicle,BEV,200
127835,TESLA,MODEL X,2016,Redmond,WA,Battery Electric Vehicle,BEV,200
...,...,...,...,...,...,...,...,...
140397,TESLA,MODEL S,2020,Lake Stevens,WA,Battery Electric Vehicle,BEV,337
88238,TESLA,MODEL S,2020,Mill Creek,WA,Battery Electric Vehicle,BEV,337
142565,TESLA,MODEL S,2020,Spokane,WA,Battery Electric Vehicle,BEV,337
95555,TESLA,MODEL S,2020,Maple Valley,WA,Battery Electric Vehicle,BEV,337


In [92]:
# Count cars per electric range value for ranges of 200 or more
cars_high_range_count = cars_high_range['Electric Range'].value_counts().sort_index()
cars_high_range_count

Electric Range
200    1281
203     128
204     516
208    2472
210    1852
215    6272
218      70
220    4103
222     153
233     142
234     192
238    3790
239     846
245      26
249     891
258     220
259    1165
265     124
266    1400
270     275
289     646
291    2335
293     443
308     485
322    1671
330     318
337      74
Name: count, dtype: int64

In [93]:
# Find list of cars whose electric range is < 200
cars_low_range = electric_vehicles[electric_vehicles['Electric Range'] < 200]
cars_low_range = cars_low_range.sort_values(by='Electric Range', ascending=True)
cars_low_range_display = cars_low_range[['Make','Model','Model Year','City','State','Electric Vehicle Type Description','Electric Vehicle Type Code', 'Electric Range']]
cars_low_range_display

Unnamed: 0,Make,Model,Model Year,City,State,Electric Vehicle Type Description,Electric Vehicle Type Code,Electric Range
83757,TESLA,MODEL Y,2023,Woodinville,WA,Battery Electric Vehicle,BEV,0
85273,VOLKSWAGEN,ID.4,2021,Gig Harbor,WA,Battery Electric Vehicle,BEV,0
85271,TESLA,MODEL Y,2023,Bellevue,WA,Battery Electric Vehicle,BEV,0
148918,TESLA,MODEL Y,2022,Brewster,WA,Battery Electric Vehicle,BEV,0
85269,TESLA,MODEL 3,2022,Everett,WA,Battery Electric Vehicle,BEV,0
...,...,...,...,...,...,...,...,...
80115,PORSCHE,TAYCAN,2020,Spokane,WA,Battery Electric Vehicle,BEV,192
133786,PORSCHE,TAYCAN,2020,Hunts Point,WA,Battery Electric Vehicle,BEV,192
14440,PORSCHE,TAYCAN,2021,Bothell,WA,Battery Electric Vehicle,BEV,192
92524,PORSCHE,TAYCAN,2020,Seattle,WA,Battery Electric Vehicle,BEV,192


In [94]:
# Count cars per electric range value for ranges below 200
cars_low_range_count = cars_low_range['Electric Range'].value_counts().sort_index()
cars_low_range_count

Electric Range
0      83517
6        935
8         35
9         21
10       162
       ...  
150     1376
151     1210
153      103
170       28
192       89
Name: count, Length: 75, dtype: int64
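
The count of 83517 vehicles at range 0 dominates the low-range group. The sketch below assumes that a range of 0 means the value was not recorded rather than a true zero; the truncated eligibility text in the first `head()` output ("battery range has not b...") hints at this, but the notebook does not confirm it. The sample values, band edges, and labels are all chosen here for illustration.

```python
import pandas as pd

# Hypothetical sample of electric ranges, including two zero entries.
ranges = pd.DataFrame({"Electric Range": [0, 0, 20, 87, 192, 200, 322]})

# Assumption: 0 marks an unrecorded range, so set those rows aside.
unknown_count = int((ranges["Electric Range"] == 0).sum())
known = ranges[ranges["Electric Range"] > 0]

# Left-closed bands: [1, 100), [100, 200), [200, 400).
bands = pd.cut(
    known["Electric Range"],
    bins=[1, 100, 200, 400],
    right=False,
    labels=["under 100", "100-199", "200 and up"],
)
band_counts = bands.value_counts().sort_index()

print(unknown_count)
print(band_counts)
```

Grouping into bands gives a shorter summary than one count per distinct range value, at the cost of hiding the individual values.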

In [95]:
# Count BEVs per state (value_counts sorts from highest to lowest)
BEV_vehicles = electric_vehicles[electric_vehicles['Electric Vehicle Type Code'] == 'BEV']
state_BEV_counts = BEV_vehicles['State'].value_counts()
state_BEV_counts

State
WA    130055
CA        62
VA        24
MD        19
TX        15
IL        10
CO         9
NC         7
FL         6
GA         6
AZ         6
HI         6
NJ         5
AL         5
NV         5
NY         5
LA         4
DC         4
BC         3
MA         3
MO         3
SC         3
PA         3
KY         2
OH         2
OR         2
ID         2
AR         2
UT         2
IA         1
KS         1
AK         1
CT         1
WY         1
OK         1
NH         1
DE         1
NE         1
MN         1
IN         1
AE         1
AP         1
Name: count, dtype: int64

In [96]:
# Count PHEVs per state (value_counts sorts from highest to lowest)
PHEV_vehicles = electric_vehicles[electric_vehicles['Electric Vehicle Type Code'] == 'PHEV']
state_PHEV_counts = PHEV_vehicles['State'].value_counts()
state_PHEV_counts

State
WA    36385
CA       29
VA       14
MD       13
TX        9
NC        7
CT        6
OR        4
SC        4
FL        4
HI        3
IL        3
NJ        3
CO        3
MI        2
NV        2
NY        2
GA        2
IN        1
AZ        1
KY        1
KS        1
RI        1
UT        1
NE        1
PA        1
OH        1
MO        1
LA        1
MA        1
Name: count, dtype: int64
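
The two per-state counts above can also be produced side by side in one table. This is a sketch using `pd.crosstab` on a made-up six-row frame; only the column names match the notebook, and `by_state` is a name introduced here.

```python
import pandas as pd

# Hypothetical sample with the same column names as the cleaned dataset.
vehicles = pd.DataFrame({
    "State": ["WA", "WA", "WA", "CA", "CA", "VA"],
    "Electric Vehicle Type Code": ["BEV", "PHEV", "BEV", "BEV", "PHEV", "BEV"],
})

# Rows are states, columns are type codes, cells are vehicle counts.
by_state = pd.crosstab(vehicles["State"], vehicles["Electric Vehicle Type Code"])

# Put the state with the most BEVs first.
by_state = by_state.sort_values("BEV", ascending=False)

print(by_state)
```

States that have one type but not the other show a 0 in the missing column instead of being absent from the result.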

## Conclusion
- Among CAFV-eligible vehicles, TESLA leads with 25258 registrations, followed by NISSAN (10827) and CHEVROLET (9181).
- About half of the records (83517 of 166800) report an electric range of 0. Among vehicles with a range of 200 or more, the most common value is 215 (6272 vehicles), and the highest is 337.
- Washington accounts for nearly all records, with 130055 BEVs and 36385 PHEVs; no other state has more than 62 vehicles of either type.

This notebook demonstrates data cleaning and exploratory data analysis using Python and pandas.